Amazon EMR

Score 8.9 out of 10

63 Reviews and Ratings

What is Amazon EMR?

Amazon EMR is a cloud-native big data platform for processing vast amounts of data quickly and at scale. Using open-source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi (Incubating), and Presto, coupled with the scalability of Amazon EC2 and the scalable storage of Amazon S3, EMR gives analytical teams the engines and elasticity to run petabyte-scale analysis.

Categories & Use Cases

Hadoop-Related

#1 most frequent: Professional, Scientific, and Technical Services, 22.7% (208 installations of 915)
#2 most frequent: Information, 17.2% (157 installations of 915)
#3 most frequent: Manufacturing, 10.2% (93 installations of 915)
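The percentages above follow directly from the installation counts. A minimal sketch of that arithmetic, using only the counts shown in this listing (the variable and function names are illustrative):

```python
# Installation counts per industry, as listed above (out of 915 total).
TOTAL_INSTALLATIONS = 915
installations = {
    "Professional, Scientific, and Technical Services": 208,
    "Information": 157,
    "Manufacturing": 93,
}


def share_percent(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {
    name: share_percent(count, TOTAL_INSTALLATIONS)
    for name, count in installations.items()
}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```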

Verified User

Employee in Engineering (1001-5000 employees)

Use Cases and Deployment Scope

Amazon EMR (Elastic MapReduce) is heavily used at my organization for most, if not all, data pipeline computations. We started with EMR on EC2 instances, then moved to EMR Serverless, and we are currently completing the transition to EMR on EKS. In general we use it for long-running analysis (SQL queries with a lot of JOINs) and for batch processing overall. From what I've seen, we use it with Spark under the hood.
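For readers unfamiliar with EMR on EKS, a Spark batch job like the ones described above is submitted as a "job run" against a virtual cluster. The following is a rough sketch, not the reviewer's actual setup: it only builds the request payload for the `StartJobRun` API, and the job name, role ARN, S3 path, release label, and Spark settings are all hypothetical placeholders.

```python
import json


def build_eks_job_request(virtual_cluster_id: str, role_arn: str, script_uri: str) -> dict:
    """Build keyword arguments for the EMR on EKS StartJobRun API."""
    return {
        "name": "nightly-batch-joins",  # hypothetical job name
        "virtualClusterId": virtual_cluster_id,
        "executionRoleArn": role_arn,
        "releaseLabel": "emr-6.15.0-latest",  # assumed release; use one available to you
        "jobDriver": {
            "sparkSubmitJobDriver": {
                "entryPoint": script_uri,  # PySpark script stored in S3
                "sparkSubmitParameters": (
                    "--conf spark.executor.instances=4 "
                    "--conf spark.executor.memory=4G"
                ),
            }
        },
    }


request = build_eks_job_request(
    virtual_cluster_id="<virtual-cluster-id>",  # placeholder
    role_arn="arn:aws:iam::123456789012:role/example-emr-job-role",  # placeholder
    script_uri="s3://example-bucket/jobs/batch_joins.py",  # placeholder
)
print(json.dumps(request, indent=2))

# With boto3 installed and AWS credentials configured, the payload could be sent with:
#   import boto3
#   boto3.client("emr-containers").start_job_run(**request)
```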

Pros

EMR on EKS is really flexible and cost-saving
Flexibility on how to run the jobs (and different implementations to choose from)
Online support is available, and the product is regularly updated

Cons

EMR on EKS could be better documented, especially for the "magic" it does under the hood when using Spark
UI can be improved (especially for EMR on EKS)

Return on Investment

Switching most of our EMR on EC2 jobs to EMR on EKS has reduced overall costs by 4% (while maintaining the same level of data freshness)

Other Software Used

Apache Spark, Apache Airflow, Amazon S3 (Simple Storage Service), dbt

José David Rodríguez Gómez

National Director, Network Development, Processes, and After-Sales Customer Service in Engineering at DINISSAN (1001-5000 employees)

Use Cases and Deployment Scope

On-demand transient clusters for big data processing. I like its availability in different cost tiers, which makes it highly flexible for customers of different scales. It can be pre-installed with big data tools like Hive, Spark, Pig, etc. Detailed cluster monitoring helps track several metrics, which in turn helps reduce cost.
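As an illustration of the transient-cluster pattern this review describes, the sketch below builds a request for the EMR `RunJobFlow` API: the cluster comes up with Hive, Spark, and Pig pre-installed, runs one step, and shuts down when no steps remain. It is a sketch under assumptions, not the reviewer's configuration; the cluster name, S3 paths, release label, and instance types are hypothetical, and the IAM role names are the EMR defaults.

```python
def build_transient_cluster_request(name: str, log_uri: str, script_uri: str) -> dict:
    """Build keyword arguments for a cluster that terminates after its steps finish."""
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",  # assumed release label
        "LogUri": log_uri,
        "Applications": [{"Name": "Hive"}, {"Name": "Spark"}, {"Name": "Pig"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "Primary", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            # False makes the cluster transient: it terminates once all steps are done.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "example-spark-step",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # default EC2 instance profile name
        "ServiceRole": "EMR_DefaultRole",      # default EMR service role name
    }


cluster_request = build_transient_cluster_request(
    name="example-transient-cluster",            # placeholder
    log_uri="s3://example-bucket/emr-logs/",     # placeholder
    script_uri="s3://example-bucket/jobs/etl.py",  # placeholder
)

# With boto3 installed and AWS credentials configured, it could be launched with:
#   import boto3
#   boto3.client("emr").run_job_flow(**cluster_request)
```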

Pros

Big data processing.
The resizing feature is good.
Ease of use and creating new clusters.

Cons

The user interface could use a facelift.
Overhead delay in starting clusters.
Big learning curve for someone who hasn't used a program like this before.

Most Important Features

EMR can execute the code using Spark or other cluster types such as Hadoop.
Execution time comes down to a few minutes, compared with a few hours running on either EC2 or other computing servers.
Easy to select between Hadoop-based and Spark-based EMR clusters.

Return on Investment

Reduced times of processing.
The platform is very useful for processing and storing big data.
No need to handle complex configuration of Big data platform.

Nick Waters

Solutions Engineer in Sales at Datameer (10,001+ employees)

Use Cases and Deployment Scope

I use Amazon EMR (Elastic MapReduce) as a scalable platform on which to deploy my client solutions. It allows me to scale our solution elastically in the cloud and to deal with any data size, volume, or complexity. It is very easy to configure and scale, and it is my preferred platform to deploy to.

Pros

Scalability
Costings
Flexibility

Cons

Costs
Auto-scale

Most Important Features

Scaling
Costs
Flexibility

Return on Investment

Costs can spiral out of control if not careful
Customers put off by costings
Competition from GCP

Other Software Used

Microsoft Azure, Google BigQuery, Snowflake

Jonathan Brotto

SAP Specialist in Corporate at Ultident Scientific (51-200 employees)

Use Cases and Deployment Scope

To keep my review simple: it is very convenient that AWS has a MapReduce tool, as it was easy to deploy and test with our cloud setup. Also, with AWS being well known, it is easy to find staff who can set up and use the system and scale our solutions. Definitely an industry leader.

Pros

Scalable
Flexible
Good documentation
Cost effective

Cons

Integration with ERP for SMEs.
Connecting to non-cloud solutions and replicating data for backup.
Better performance metrics for business people such as cost benefits.

Most Important Features

Elasticity
Reliability
Security
Flexibility

Return on Investment

ROI is slower for small businesses
Fewer in-house resources to manage
We can focus more on the business

Alternatives Considered

Apache Hadoop

Other Software Used

SAP Business One, Microsoft SQL Server, Microsoft 365 (formerly Office 365)

Uddipan Mukherjee

Big Data Specialist in Engineering at Autodesk (10,001+ employees)

Use Cases and Deployment Scope

Used as a Spark cluster to run big data ETL processes. Analysts and data scientists use the clusters for ad hoc querying. Raw data is ingested from RDBMS systems, APIs, file systems, etc. We used the elastic feature with different node types to optimize cost. The scope of the use case is a company-wide big data platform.
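One common way to mix node types for cost, sketched here as an assumption since the review does not say which instance types or purchasing options were used, is to keep primary and core nodes on On-Demand capacity and place optional task nodes on Spot. The helper name, instance types, and counts below are hypothetical; the resulting list has the shape expected under `Instances.InstanceGroups` in an EMR `RunJobFlow` request.

```python
def build_instance_groups(core_count: int, task_count: int) -> list:
    """Return EMR instance group definitions mixing On-Demand and Spot capacity."""
    groups = [
        # The primary node coordinates the cluster; keep it on On-Demand capacity.
        {"Name": "Primary", "InstanceRole": "MASTER", "Market": "ON_DEMAND",
         "InstanceType": "m5.xlarge", "InstanceCount": 1},
        # Core nodes hold HDFS data, so losing them to Spot interruptions is costly.
        {"Name": "Core", "InstanceRole": "CORE", "Market": "ON_DEMAND",
         "InstanceType": "r5.2xlarge", "InstanceCount": core_count},
    ]
    if task_count > 0:
        # Task nodes only add compute, so cheaper interruptible capacity is a fit.
        groups.append(
            {"Name": "Task", "InstanceRole": "TASK", "Market": "SPOT",
             "InstanceType": "c5.2xlarge", "InstanceCount": task_count}
        )
    return groups


groups = build_instance_groups(core_count=2, task_count=4)
for group in groups:
    print(group["InstanceRole"], group["Market"], group["InstanceType"], group["InstanceCount"])
```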

Pros

Big data ETL
Data ingestion
Ad hoc query support

Cons

Library management
Storing historical steps
Downloading EMR job logs could be easier

Most Important Features

Steps
Bootstrap
Elastic

Return on Investment

Quick data load
Ingestion quality
Stability

Other Software Used

AWS Glue, Matillion
