Overall Satisfaction with Amazon Elastic Compute Cloud (EC2)
Our organization uses EC2 for any computing we don't want to run locally. My team specifically uses it for large data processing jobs. We maintain Docker images of environments that have exactly the languages and dependencies installed that we need for a specific task or set of tasks. From there, the EC2 instance reads data from the data source and writes results to a database or S3.
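A minimal sketch of that read-process-write pattern, assuming the data source is a CSV object in S3 and the output also goes to S3. The function name `run_job`, the `value` column, and the filtering step are hypothetical stand-ins for whatever a real job does; `s3` is expected to behave like a boto3 S3 client (`get_object` / `put_object`).

```python
import csv
import io


def run_job(s3, src_bucket, src_key, dst_bucket, dst_key):
    """Read a CSV from S3, drop rows with an empty 'value' column, write the rest back.

    `s3` is any object exposing get_object/put_object like a boto3 S3 client.
    The 'value' column and the filter are illustrative assumptions.
    """
    # get_object returns a dict whose "Body" is a stream with .read()
    raw = s3.get_object(Bucket=src_bucket, Key=src_key)["Body"].read().decode("utf-8")
    reader = csv.DictReader(io.StringIO(raw))
    rows = [row for row in reader if row.get("value")]

    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=reader.fieldnames or [],
        lineterminator="\n",
        extrasaction="ignore",  # skip stray fields from malformed rows
    )
    writer.writeheader()
    writer.writerows(rows)

    s3.put_object(Bucket=dst_bucket, Key=dst_key, Body=out.getvalue().encode("utf-8"))
    return len(rows)
```

Inside the container this would be called with a real client, for example `run_job(boto3.client("s3"), "input-bucket", "raw/data.csv", "output-bucket", "clean/data.csv")`, where the bucket and key names are placeholders.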
- Flexible: Can get exactly the specs you need, on demand.
- AWS CLI: The EC2 API via the AWS CLI is great for debugging, monitoring, etc.
- Reliable: Rarely have problems or unexpected behavior related to EC2 itself.
- Logging: Getting the correct logs is sometimes difficult.
- Speed: Spinning up a cluster isn't always fast.
- Pricing: The documentation isn't very clear on how billable hours are calculated.
- Positive: Easy to set up, very effective
- Positive: Easy to maintain; doesn't require many engineering hours for maintenance
- Positive: Very flexible, fits well with our current stack
- AWS EMR
For Hadoop/Spark jobs, we use AWS EMR. We evaluated this vs. just using EC2 and installing the necessary software on it ourselves. We went with EMR to ensure consistent builds, although it is slightly more expensive.
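On the AWS CLI point above: the same EC2 API is also reachable from Python through boto3, which can be handy for quick monitoring scripts. The sketch below is only an illustration. The helper name `count_instance_states` is hypothetical, and it assumes a response dict shaped like the output of `describe_instances` (a list of `Reservations`, each holding `Instances` with a `State.Name`).

```python
from collections import Counter


def count_instance_states(response):
    """Count instances per state in a describe_instances-style response dict."""
    states = Counter()
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            states[instance["State"]["Name"]] += 1
    return dict(states)
```

With credentials configured, it could be fed directly from the API, e.g. `count_instance_states(boto3.client("ec2").describe_instances())`. This simple form looks at a single response only and ignores pagination, so large accounts would need to loop over pages.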