Likelihood to Recommend
We use it to move data preparation that takes a few hours on EC2 onto a Spark-based EMR cluster, so the preparation completes in minutes rather than hours. It is easy to use, and you can choose between Hadoop and Spark. Processing time drops from 5-8 hours to 25-30 minutes compared with the EC2 instance, and by more in a few cases.
It offers different strategies for data encoding, which makes the process quite flexible. It is well designed for joint, collaborative work. This information, analyzed in large quantities, is extremely vital for the company when it is read correctly and in a timely way.
Pros
EMR manages cost well: it uses the task node cores to process the data, and these instances are cheaper when the data is stored on S3. It is really cost efficient. There is no need to maintain any libraries to connect to AWS resources. EMR is highly available, secure, and easy to launch. There is not much hassle in launching the cluster (simple and easy). EMR manages the big data frameworks, so the developer need not worry about memory and framework settings. It is all set up at launch time. The bootstrapping feature is great.
Ultra-fast query results. In-memory database. Easy integration with reporting services.
Cons
It would have been better if packages like HBase and Flume were available with Amazon EMR. This would make the product even more helpful in some cases. Products like Cloudera provide the option to move the whole deployment onto a dedicated server and use it at our discretion. This would have been a good option if available with EMR. If EMR could be used with any cloud provider, it would have helped, instead of having to move the data from another cloud service to S3.
Problems can be encountered, particularly in more complex analyses. Categorical variables are often not precise enough.
Usability
I give Amazon EMR this rating because, while it is great at simplifying running big data frameworks, providing the Amazon EMR highlights, product details, and pricing information, and analyzing vast amounts of data, it can run slowly, freeze, and glitch sometimes. So overall Amazon EMR is pretty good to use apart from some basic issues.
Support Rating
There's a vast group of trained and certified (by AWS) professionals ready to work for anyone who needs to implement, configure, or fix EMR. There's also a great amount of documentation that is accessible to anyone who's trying to learn this. And there's always the help of AWS itself. They have people ready to help you analyze your needs and then make a recommendation.
Alternatives Considered
Snowflake is a lot easier to get started with than the other options. Snowflake's data lake building capabilities are far more powerful. Although Amazon EMR isn't our first pick, we've had an excellent experience with EC2 and S3. Because of our current API interfaces, it made more sense for us to continue with Hadoop rather than explore other options.
We selected Kognitio because of the legacy systems that are still running. We also have legacy systems in place that are a good fit for Kognitio. End users gave us good feedback when we started implementing this solution. Our current servers are compatible with Kognitio.
Return on Investment
It was obviously cheaper and more convenient to use, as most of our data processing and pipelines are on AWS. It was fast and readily available with a click, which saved a ton of time compared with having to work around cluster downtime if it were on premises. It saved time on processing chunks of big data that had to be processed in a short period at minimal cost. EMR solved this, as cluster setup and processing were simple, easy, cheap, and fast. It had a negative impact in that submitting test jobs was very difficult, as it lacks a UI for submitting Spark code snippets.
The implementation of the formats that integrate our users with the program is also good. It also improved our control over aspects related to the work environment.