Item: Amazon Elastic Compute Cloud (EC2)
Rating: 10
Author: David Choi

Use Cases and Deployment Scope

Our organization uses EC2 for any computing we don't want to run locally. My team specifically uses it for large data processing jobs. We maintain Docker images that contain exactly the language versions and dependencies needed for a specific task or set of tasks. From there, the EC2 instance reads from the data source and writes results to a database or to S3.
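As a rough illustration of the read, process, write pattern described above, here is a minimal Python sketch of the kind of entry point such a container might run. Everything in it is hypothetical: `run_job`, `batched`, and the reader/writer callables are names made up for this example, not our actual code. In a real job the writer would wrap a database insert or an S3 upload.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]


def batched(records: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Yield lists of at most `size` records until the input is exhausted."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def run_job(
    read_records: Callable[[], Iterable[Record]],   # hypothetical: pulls from the data source
    transform: Callable[[Record], Record],          # hypothetical: the per-record processing step
    write_batch: Callable[[List[Record]], None],    # hypothetical: writes to a database or S3
    batch_size: int = 1000,
) -> int:
    """Stream records through `transform` in batches; return how many were written."""
    written = 0
    for chunk in batched(read_records(), batch_size):
        out = [transform(r) for r in chunk]
        write_batch(out)
        written += len(out)
    return written
```

Keeping the reader and writer as plain callables means the same image can be pointed at different sources and sinks, and the job can be exercised locally with in-memory stand-ins before it is run on an instance.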

Pros and Cons

Flexible: Can get exactly the specs you need, on demand.
AWS CLI: The EC2 API via the AWS CLI is great for debugging, monitoring, etc.
Reliable: Rarely have problems or unexpected behavior related to EC2 itself.
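
To give a sense of the AWS CLI point, the sketch below builds two commands that are handy for monitoring and debugging: `aws ec2 describe-instances` with a state filter, and `aws ec2 get-console-output` for a single instance. The Python wrapper functions, the `team` tag, and the instance ID are placeholders invented for this example; only the CLI subcommands and flags themselves are real. The commands are printed rather than executed, so nothing here needs credentials.

```python
import shlex
from typing import List, Optional


def describe_instances_cmd(
    state: str = "running",
    tag_key: Optional[str] = None,
    tag_value: Optional[str] = None,
) -> List[str]:
    """Build an `aws ec2 describe-instances` command filtered by state and, optionally, one tag."""
    cmd = [
        "aws", "ec2", "describe-instances",
        "--filters", f"Name=instance-state-name,Values={state}",
    ]
    if tag_key and tag_value:
        # Additional filters are passed as further arguments after --filters.
        cmd.append(f"Name=tag:{tag_key},Values={tag_value}")
    cmd += [
        # JMESPath query that trims the response down to a few useful columns.
        "--query", "Reservations[].Instances[].[InstanceId,InstanceType,State.Name,LaunchTime]",
        "--output", "table",
    ]
    return cmd


def console_output_cmd(instance_id: str) -> List[str]:
    """Build an `aws ec2 get-console-output` command for one instance."""
    return ["aws", "ec2", "get-console-output", "--instance-id", instance_id, "--output", "text"]


if __name__ == "__main__":
    # Placeholder tag and instance ID; substitute your own before running the printed commands.
    print(shlex.join(describe_instances_cmd(tag_key="team", tag_value="data")))
    print(shlex.join(console_output_cmd("i-0123456789abcdef0")))
```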

Logging: Sometimes getting the correct logs is difficult.
Speed: Spinning up a cluster isn't always fast.
Pricing: The documentation isn't very clear on how billable hours are calculated.
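
On the pricing point, a small calculator helps when sanity-checking a bill. The sketch below is only a rough model under stated assumptions: it supports either per-second billing with a 60-second minimum or whole-hour rounding, and which of these applies to a given instance and operating system should be checked against the current EC2 pricing page. The function name and the $0.10 hourly rate in the usage note are made up for illustration and are not real prices.

```python
import math


def estimate_cost(
    runtime_seconds: float,
    hourly_rate_usd: float,      # assumed on-demand rate; look up the real one for your instance type
    instance_count: int = 1,
    per_second: bool = True,     # assumption: per-second billing; set False for whole-hour rounding
    minimum_seconds: int = 60,   # assumption: minimum charge per launch under per-second billing
) -> float:
    """Estimate the compute charge for a job, ignoring storage, data transfer, and other fees."""
    if per_second:
        billable_hours = max(minimum_seconds, math.ceil(runtime_seconds)) / 3600
    else:
        # Whole-hour billing: any partial hour is rounded up, with at least one hour charged.
        billable_hours = max(1, math.ceil(runtime_seconds / 3600))
    return round(billable_hours * hourly_rate_usd * instance_count, 4)
```

For example, a 90-minute job on four instances at the hypothetical $0.10 rate comes to about $0.60 under the per-second assumption (`estimate_cost(90 * 60, 0.10, 4)`) and about $0.80 if partial hours are rounded up (`per_second=False`). For short jobs on many instances, that gap is where most of the confusion comes from.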

Return on Investment

Positive: Easy to set up, very effective
Positive: Easy to maintain, doesn't require many engineering hours for maintenance
Positive: Very flexible, fits well with our current stack

Alternatives Considered

AWS EMR

For Hadoop/Spark jobs, we use AWS EMR. We evaluated this vs. just using EC2 and installing the necessary software on it ourselves. We went with EMR to ensure consistent builds, although it is slightly more expensive.

Likelihood to Recommend

EC2 is appropriate for:

Long running tasks
Tasks that require additional computing power
Tasks that require variable amounts of computing power
Scheduled tasks
Tasks that require a specific build of a language

It is not as appropriate for:

Doing scheduling itself
Very on-demand tasks (other AWS options are better)
Companies on an extreme budget

AWS EC2 for Data Science
