AWS Glue is a managed extract, transform, and load (ETL) service designed to make it easy for customers to prepare and load data for analytics. With it, users can create and run an ETL job in the AWS Management Console. Users point AWS Glue to data stored on AWS, and AWS Glue discovers data and stores the associated metadata (e.g. table definition and schema) in the AWS Glue Data Catalog. Once cataloged, data is immediately searchable, queryable, and available for ETL.
$0.44
billed per second, 1 minute minimum
AWS Lambda
Score 8.4 out of 10
N/A
AWS Lambda is a serverless computing platform that lets users run code without provisioning or managing servers. With Lambda, users can run code for virtually any type of app or backend service, all with zero administration. It takes care of the requirements to run and scale code with high availability.
$0.0000000021
Per 1 ms
Pricing
Editions & Modules
AWS Glue
per DPU-Hour: $0.44 (billed per second, 1 minute minimum)
AWS Lambda
128 MB: $0.0000000021 per 1 ms
1024 MB: $0.0000000167 per 1 ms
10240 MB: $0.0000001667 per 1 ms
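To make the listed rates easier to compare, here is a minimal Python sketch that turns them into estimated costs. The function names are hypothetical, and the Lambda estimate covers only the per-millisecond compute price shown above; any per-request fees, free-tier allowances, or regional differences are not in this listing and are left out.

```python
import math

# Rates copied from the pricing listing above.
GLUE_PRICE_PER_DPU_HOUR = 0.44
GLUE_MIN_BILLED_SECONDS = 60  # "billed per second, 1 minute minimum"

# Per-millisecond prices keyed by configured memory size in MB.
LAMBDA_PRICE_PER_MS = {
    128: 0.0000000021,
    1024: 0.0000000167,
    10240: 0.0000001667,
}


def glue_job_cost(dpus: int, runtime_seconds: float) -> float:
    """Estimate the cost of one Glue job run (hypothetical helper)."""
    # Round up to whole seconds, then apply the one-minute minimum.
    billed_seconds = max(math.ceil(runtime_seconds), GLUE_MIN_BILLED_SECONDS)
    return dpus * (billed_seconds / 3600) * GLUE_PRICE_PER_DPU_HOUR


def lambda_compute_cost(memory_mb: int, duration_ms: int, invocations: int) -> float:
    """Estimate Lambda compute cost only (hypothetical helper)."""
    price_per_ms = LAMBDA_PRICE_PER_MS[memory_mb]
    return price_per_ms * duration_ms * invocations


if __name__ == "__main__":
    print(f"Glue, 10 DPUs for 30 s: ${glue_job_cost(10, 30):.4f}")
    print(f"Lambda, 1024 MB, 200 ms, 1,000,000 calls: "
          f"${lambda_compute_cost(1024, 200, 1_000_000):.2f}")
```

Because of the one-minute minimum, a 30-second Glue run is billed the same as a 60-second run in this sketch.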
Offerings
Pricing Offerings | AWS Glue | AWS Lambda
Free Trial | No | No
Free/Freemium Version | No | No
Premium Consulting/Integration Services | No | No
Entry-level Setup Fee | No setup fee | No setup fee
Additional Details | - | -
Features
Access Control and Security
[Chart: comparison of Access Control and Security features of AWS Glue and AWS Lambda]
One of AWS Glue's most notable features for creating and transforming data is its Data Catalog. Beyond that, its support, scheduling, and automated schema recognition make it superior to its competitors. It also integrates well with other AWS tools. The main limitation is integration with systems outside the AWS environment: it works smoothly with existing AWS services but not with other products. Another potential limitation is that Glue runs on Apache Spark, so the engineer needs to be conversant with it.
Lambda excels at event-driven, short-lived tasks, such as processing files or building simple APIs. However, it's less ideal for long-running or computationally intensive workloads, or for applications that rely on carrying state between jobs. Cold starts and constant load can easily balloon the costs.
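As an illustration of the event-driven file processing mentioned above, here is a minimal sketch of a Python Lambda handler reacting to an S3 notification. The bucket and key values are made up, and the "processing" is reduced to collecting object locations; a real function would read or transform the objects.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Collect the location of every object named in an S3 event."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications
        # (for example, spaces are sent as '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        processed.append(f"s3://{bucket}/{key}")
    return {"processed": processed}


if __name__ == "__main__":
    # Trimmed-down sample event with hypothetical bucket and key names.
    sample_event = {
        "Records": [
            {
                "s3": {
                    "bucket": {"name": "example-bucket"},
                    "object": {"key": "incoming/daily+report.csv"},
                }
            }
        ]
    }
    print(handler(sample_event, None))
```

The handler keeps no state between invocations, which matches the short-lived, stateless style of work described in the review.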
It is extremely fast, easy, and intuitive. Although it is a suite of services, it takes very little time to get comfortable with it.
As it is a managed service, one need not take care of a lot of underlying details. The identification of data schema, code generation, customization, and orchestration of the different job components allow developers to focus on the core business problem without worrying about infrastructure issues.
It is a pay-as-you-go service, so there is no need to provision any capacity in advance. This makes scheduling much easier.
Developing test cases for Lambda functions can be difficult. For functions that require some sort of input it can be tough to develop the proper payload and event for a test.
For the uninitiated, deploying functions with Infrastructure as Code tools can be a challenging undertaking.
Logging the output of a function feels disjointed from running the function in the console. A tighter integration with operational logging would be appreciated, perhaps being able to view function logs from the Lambda console instead of having to navigate over to CloudWatch.
Sometimes it's difficult to determine the correct permissions needed for Lambda execution from other AWS services.
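On the first point in the list above, one way to ease test-payload construction is a small builder that produces a trimmed-down event, so each test only states the fields it cares about. The sketch below assumes an API Gateway proxy-style event; the handler, the `order_id` field, and the `/orders` path are hypothetical stand-ins for the function under test.

```python
import json


def make_api_event(payload: dict, path: str = "/orders") -> dict:
    """Build a trimmed-down API Gateway proxy-style event for local tests."""
    return {
        "httpMethod": "POST",
        "path": path,
        "headers": {"Content-Type": "application/json"},
        "queryStringParameters": None,
        "body": json.dumps(payload),
        "isBase64Encoded": False,
    }


def handler(event, context):
    """Hypothetical function under test: echoes back the order id."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}
    if not isinstance(body, dict) or "order_id" not in body:
        return {"statusCode": 400, "body": json.dumps({"error": "order_id required"})}
    return {"statusCode": 200, "body": json.dumps({"order_id": body["order_id"]})}


def test_handler_accepts_valid_order():
    response = handler(make_api_event({"order_id": 42}), None)
    assert response["statusCode"] == 200
    assert json.loads(response["body"]) == {"order_id": 42}


def test_handler_rejects_missing_order_id():
    response = handler(make_api_event({}), None)
    assert response["statusCode"] == 400


if __name__ == "__main__":
    test_handler_accepts_valid_order()
    test_handler_rejects_missing_order_id()
    print("all tests passed")
```

The test functions follow the naming pattern pytest discovers, but they also run as plain Python, so no test framework is required.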
While monitoring for large datasets is easy to set up and manage, AWS Glue's complexity can be a barrier for new users. Integration with the AWS ecosystem and managed monitoring: dashboards and monitoring tools for AWS Glue are generally easy to set up and maintain. Automated data pipelines: Glue automates data pipeline creation, making it efficient for certain data integration.
I give it a seven for usability because it's AWS. Their UIs are always clunkier than the competition and their documentation is rather cumbersome. There's SO MUCH to dig through, and even if you find the corresponding info, it's a gamble whether it will actually help. Like I said before, going to Google with a specific problem is likely a better route because AWS is quite ubiquitous and chances are you're not the first to encounter the problem. That being said, using SAM (Serverless Application Model) and its SAM Local environment makes running local instances of your Lambdas in dev environments painless and quite fun. Using Node.js + Lambda + SAM Local + VS Code debugger = AWESOME.
Amazon responds in good time once a ticket has been raised, but we need to raise tickets frequently because very few code samples are available, and they do not cover all scenarios.
Amazon consistently provides comprehensive and easy-to-parse documentation of all AWS features and services. Most development team members find what they need with a quick internet search of the AWS documentation available online. If you need advanced support, though, you might need to engage an AWS engineer, and that could be an unexpected (or unwelcome) expense.
AWS Glue is a fully managed ETL service that automates many ETL tasks. It simplifies ETL through a visual interface and automated code generation.
AWS Lambda is good for short-running functions, ideally in response to events within AWS. Google App Engine is a more robust environment that can run complex code for long periods of time and across more than one hardware instance. Google App Engine allows for both front-end and back-end infrastructure, while AWS Lambda is only for small back-end functions.
We use Glue for our ETL work. Its ease of use with our other AWS services gives us a 100% ROI.
One missing piece was compatibility with other data sources. We found a workaround by using S3 as our only data source, so our dependency on other data sources is also shrinking.
Positive - You only pay when code runs, unlike virtual machines, where you always pay regardless of how much processing power you use.
Positive - Scaling to accommodate larger amounts of demand is much cheaper. Instead of scaling up virtual machines and paying more for them, you are just increasing the number of times your Lambda function runs.
Negative - Debugging, troubleshooting, and developing for Lambda functions takes a bit more time to get used to, and migrating code from virtual machines and normal processes to Lambda functions can take a bit of time.