aws_glue_crawler resource
Use the aws_glue_crawler InSpec audit resource to test properties of a single AWS Glue crawler.
The AWS::Glue::Crawler resource specifies an AWS Glue crawler.
For additional information, including details on parameters and properties, see the AWS documentation on Glue Crawler.
Syntax
Ensure that a crawler exists.
describe aws_glue_crawler(name: 'CRAWLER_NAME') do
  it { should exist }
end
Parameters
name
(required) The name of the crawler.
Properties
name
- The name of the crawler.
role
- The ARN of an IAM role that’s used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
target
- A collection of targets to crawl.
database_name
- The name of the database in which the crawler’s output is stored.
description
- A description of the crawler.
classifier
- A list of UTF-8 strings that specify the custom classifiers that are associated with the crawler.
recrawl_policy
- A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
schema_change_policy
- The policy that specifies update and delete behaviors for the crawler.
lineage_configuration
- A configuration that specifies whether data lineage is enabled for the crawler.
state
- Whether the crawler is running, or whether a run is pending.
table_prefix
- The prefix added to the names of tables that are created.
schedule
- For scheduled crawlers, the schedule when the crawler runs.
crawl_elapsed_time
- If the crawler is running, contains the total time elapsed since the last crawl began.
creation_time
- The time that the crawler was created.
last_updated
- The time that the crawler was last updated.
last_crawl
- The status of the last crawl, and potentially error information if an error occurred.
version
- The version of the crawler.
configuration
- Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler’s behavior.
crawler_security_configuration
- The name of the SecurityConfiguration structure to be used by this crawler.
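Because the configuration property is described above as a versioned JSON string, it usually has to be parsed before individual settings can be checked. The following is a minimal plain-Ruby sketch of that step. The sample string is a hypothetical value chosen for illustration, not output captured from a real crawler, and the key names in it are assumptions about what a crawler configuration might contain.

```ruby
require 'json'

# Hypothetical sample of the string the `configuration` property might return.
# The keys below are illustrative assumptions, not values from a real crawler.
sample_configuration =
  '{"Version":1.0,"CrawlerOutput":{"Partitions":{"AddOrUpdateBehavior":"InheritFromTable"}}}'

# Parse the JSON string into a Ruby hash.
parsed = JSON.parse(sample_configuration)

# Hash#dig returns nil instead of raising when an intermediate key is missing,
# which is convenient because not every crawler sets every option.
partition_behavior = parsed.dig('CrawlerOutput', 'Partitions', 'AddOrUpdateBehavior')

puts parsed['Version']
puts partition_behavior
```

Inside a control, the string would presumably come from the resource's configuration property rather than from a literal, and the parsed values could then be compared with ordinary matchers.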
Examples
Verify the name of a crawler:
describe aws_glue_crawler(name: 'CRAWLER_NAME') do
  its('name') { should eq 'CRAWLER_NAME' }
end
Verify the database name in the crawler:
describe aws_glue_crawler(name: 'CRAWLER_NAME') do
  its('database_name') { should eq 'CRAWLER_DATABASE_NAME' }
end
Matchers
For a full list of available matchers, see our Universal Matchers page.
This resource has the following special matchers.
exist
Use should to test that the entity exists.
describe aws_glue_crawler(name: 'CRAWLER_NAME') do
  it { should exist }
end
Use should_not to test that the entity does not exist.
describe aws_glue_crawler(name: 'dummy') do
  it { should_not exist }
end
be_available
Use should to check if the crawler is available.
describe aws_glue_crawler(name: 'CRAWLER_NAME') do
  it { should be_available }
end
AWS Permissions
Your AWS principal will need the glue:GetCrawler action with Effect set to Allow.
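As an illustration, an IAM policy granting this access might look like the sketch below. It assumes the resource reads crawler metadata through the AWS Glue GetCrawler API, whose IAM action is glue:GetCrawler. The wildcard Resource is an assumption made for brevity; it can be narrowed to the ARNs of specific crawlers.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "glue:GetCrawler",
      "Resource": "*"
    }
  ]
}
```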