Research Experience with Amazon EMR
June 22, 2016
Score 6 out of 10
Overall Satisfaction with Amazon Elastic MapReduce
As a PhD student, I used Amazon Elastic MapReduce to analyze data for my research. First, it scaled well and showed little performance degradation on large data sets. Second, the web console is intuitive and easy to use. Plenty of resources were available whenever I ran into problems with EMR.
Pros
- The cluster size can be adjusted dynamically, so EMR scales well.
- It also works well with other Amazon Web Services such as Amazon Simple Storage Service, so data can be read from those services and written back to them.
- I tried the in-house hosting at the university where I work, but it brought many complications that required technical support. With Amazon, the support and documentation were good enough to solve such problems faster.
Cons
- It would have been better if packages like HBase and Flume were available with Amazon EMR. This would make the product even more helpful in some cases.
- Products like Cloudera provide the option to move the whole deployment onto a dedicated server and use it at our discretion. This would have been a good option to have with EMR.
- It would have helped if EMR could be used with any cloud provider, instead of requiring data to be moved from another cloud service to S3.
- Positive: Helped process the jobs amazingly fast.
- Positive: Did not have to spend much time learning the system, which saved valuable research time.
- Negative: Not flexible in some scenarios, such as when certain plugins are required or when the project has to be moved in-house.
- Cloudera
EMR provides dynamic cluster sizing, extensive documentation, and integration with other Amazon Web Services, all of which the Cloudera distribution for Hadoop lacked. Some products are hard to learn, but EMR was much easier and saved time that would otherwise have gone into figuring out how to deploy projects in MapReduce.