AWS Glue is a fully managed extract, transform, and load (ETL) service that you can use to catalog your data, clean it, enrich it, and move it reliably between data stores.
• Integrated data catalog
• Automatic schema discovery
• Code generation
• Developer endpoints
• Flexible job scheduler
Pricing varies by region
Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)
• The AWS Glue Data Catalog is the persistent metadata store for all your data assets.
• AWS Glue crawlers connect to your source or target data store and progress through a prioritized list of classifiers to determine the schema of your data.
• AWS Glue automatically generates the code to extract, transform, and load your data.
• AWS Glue provides development endpoints where you can edit, debug, and test the code it generates for you.
• AWS Glue jobs can be invoked on a schedule, on demand, or in response to an event.
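As an illustration of the crawler feature listed above, the following is a minimal sketch of how a crawler over an S3 prefix might be defined with boto3. The crawler name, IAM role ARN, database name, and S3 path are hypothetical placeholders, and the helper function names are assumptions made for this example, not part of AWS Glue. Only the request-building function runs without AWS access; the second function needs boto3 and valid credentials.

```python
"""Sketch: define a crawler that catalogs an S3 prefix (all names are placeholders)."""


def build_crawler_request(name, role_arn, database, s3_path, schedule=None):
    """Return keyword arguments for the Glue CreateCrawler API call."""
    request = {
        "Name": name,
        "Role": role_arn,          # IAM role the crawler assumes
        "DatabaseName": database,  # Data Catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }
    if schedule is not None:
        # Glue cron syntax, for example "cron(0 2 * * ? *)" for 02:00 UTC daily
        request["Schedule"] = schedule
    return request


def create_and_start_crawler(request, region_name="us-east-1"):
    """Create the crawler and start one run. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    glue = boto3.client("glue", region_name=region_name)
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# Hypothetical values, for illustration only.
example_request = build_crawler_request(
    name="sales-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    database="sales_db",
    s3_path="s3://example-bucket/sales/",
)
```

Calling `create_and_start_crawler(example_request)` would submit the definition. Keeping the request as a plain dictionary makes it easy to review or version before anything is created in the account.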
• No infrastructure to buy, set up or manage.
• Easy to get started.
• Automatically provisions the environment needed to complete the job.
• Customers pay only for the compute resources consumed while running ETL jobs.
• Data is available for analytics in minutes.
• Provides a flexible scheduler with dependency resolution, job monitoring, and alerting.
AWS Glue is a cost-effective, fully managed ETL (extract, transform, and load) service that is simple and flexible. It makes it easier for your customers to prepare and load their data for analytics. You can create and run an ETL job with just a few clicks in the AWS Management Console: you simply point AWS Glue to the data you have already stored on AWS.
AWS Glue then discovers your data and stores the associated metadata in the AWS Glue Data Catalog. Once cataloged, your data is immediately queryable and available for ETL. AWS Glue generates the code that executes your data transformations and data loading processes. It is useful for building a data warehouse, where it helps you organize, cleanse, validate, and format your data, and it is well suited to transforming and moving AWS Cloud data into your data store.
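To make the idea of cataloged metadata concrete, here is a minimal sketch that summarizes the table definitions returned by the Data Catalog. It assumes the response shape of the boto3 `get_tables` call (`TableList`, `StorageDescriptor`, `Columns`). The database, table, and column names in the sample are invented for illustration, and the helper names are assumptions of this example.

```python
"""Sketch: list the tables and columns a crawler has written to the Data Catalog."""


def summarize_tables(get_tables_response):
    """Map each table name to its (column name, column type) pairs."""
    summary = {}
    for table in get_tables_response.get("TableList", []):
        descriptor = table.get("StorageDescriptor", {})
        columns = descriptor.get("Columns", [])
        summary[table["Name"]] = [(col["Name"], col.get("Type", "")) for col in columns]
    return summary


def fetch_tables(database, region_name="us-east-1"):
    """Fetch the first page of table definitions. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so summarize_tables works without boto3

    glue = boto3.client("glue", region_name=region_name)
    return glue.get_tables(DatabaseName=database)


# Hand-written sample in the shape of a get_tables response (illustrative data only).
sample_response = {
    "TableList": [
        {
            "Name": "orders",
            "DatabaseName": "sales_db",
            "StorageDescriptor": {
                "Location": "s3://example-bucket/sales/orders/",
                "Columns": [
                    {"Name": "order_id", "Type": "bigint"},
                    {"Name": "amount", "Type": "double"},
                ],
            },
        }
    ]
}

catalog_summary = summarize_tables(sample_response)
```

With live credentials, `summarize_tables(fetch_tables("sales_db"))` would produce the same structure for a real database. Large catalogs return results in pages, which this sketch does not follow.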
For regular reporting and analysis, AWS Glue lets you load data from different sources into your data warehouse. It discovers and catalogs metadata about your data in a central catalog, and it can process semi-structured data. AWS Glue can also trigger your ETL jobs on a schedule or in response to an event, starting them automatically so that data moves into your data warehouse without manual intervention. You can chain these triggers to create a dependency flow between your jobs.
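The scheduling and dependency flow described above could be sketched as two trigger definitions: one that starts a job on a schedule and one that starts a second job only after the first succeeds. The trigger names, job names, and helper functions here are hypothetical, and the dictionaries follow the parameter names of the boto3 `create_trigger` call as understood at the time of writing.

```python
"""Sketch: a scheduled trigger plus a conditional trigger that chains two jobs."""


def scheduled_trigger(name, job_name, cron_expression):
    """Request for a trigger that starts job_name on a cron schedule."""
    return {
        "Name": name,
        "Type": "SCHEDULED",
        "Schedule": cron_expression,  # Glue cron syntax, evaluated in UTC
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,
    }


def conditional_trigger(name, upstream_job, downstream_job):
    """Request for a trigger that starts downstream_job after upstream_job succeeds."""
    return {
        "Name": name,
        "Type": "CONDITIONAL",
        "Predicate": {
            "Conditions": [
                {
                    "LogicalOperator": "EQUALS",
                    "JobName": upstream_job,
                    "State": "SUCCEEDED",
                }
            ]
        },
        "Actions": [{"JobName": downstream_job}],
        "StartOnCreation": True,
    }


def register_triggers(triggers, region_name="us-east-1"):
    """Create each trigger in AWS Glue. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders above work without boto3

    glue = boto3.client("glue", region_name=region_name)
    for trigger in triggers:
        glue.create_trigger(**trigger)


# Hypothetical two-step flow: extract nightly, then load once extraction succeeds.
nightly = scheduled_trigger("nightly-extract", "extract-orders", "cron(0 2 * * ? *)")
then_load = conditional_trigger("load-after-extract", "extract-orders", "load-warehouse")
```

Passing `[nightly, then_load]` to `register_triggers` would create both triggers. The jobs themselves (`extract-orders`, `load-warehouse`) are assumed to exist already.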