What users are saying about
25 Ratings
Score 8.2 out of 10
Top Rated
63 Ratings
Score 8.3 out of 10

Likelihood to Recommend

Amazon EMR

Amazon Elastic MapReduce is useful when two conditions are met: first, you plan to use multiple big data tools simultaneously to analyze big data sets; second, you need a tool that simplifies managing those big data tools. If both conditions are met, MapReduce does a great job. The user interface is simple, the program eliminates some programming requirements, and the software makes setting up big data analyses much easier. That said, MapReduce is not a good tool for "small" data analyses, given that other tools do the job more quickly and produce much more professional output. If you're on the fence, try MapReduce alongside competing "small" data tools and see whether you really need big data software.
Thomas Young

Cassandra

Apache Cassandra is a NoSQL database that is well suited to cases where you need high availability, linear scalability, tunable consistency, and high performance across varying workloads. It has worked well for our use cases, and I shared my experience of using it effectively at the last Cassandra summit: http://bit.ly/1Ok56TK. Although it is a NoSQL database, you can tune it to be strongly consistent and use it successfully as such. However, that is not the usual pattern, because you trade off latency. It works well if you require that. If your use case needs a strongly consistent environment with the semantics of a relational database, a data warehouse, or NoSQL with ACID transactions, Apache Cassandra may not be the optimum choice.
Rekha Joshi
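
The "tunable consistency" mentioned in this review can be illustrated with a little arithmetic. The sketch below is not Cassandra code and does not call any driver; the function names are hypothetical. It assumes the commonly cited rule that reads see the latest write when the number of replicas acknowledging a write plus the number consulted on a read exceeds the replication factor.

```python
def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a given consistency level.

    Only three levels are modeled here (ONE, QUORUM, ALL); this is an
    illustrative subset, not the full list a real cluster supports.
    """
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def is_strongly_consistent(write_level: str, read_level: str,
                           replication_factor: int) -> bool:
    """True when the read and write replica sets must overlap (R + W > RF)."""
    writes = replicas_required(write_level, replication_factor)
    reads = replicas_required(read_level, replication_factor)
    return writes + reads > replication_factor


if __name__ == "__main__":
    for w, r in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        print(w, r, is_strongly_consistent(w, r, replication_factor=3))
```

Waiting on more replicas per request is where the latency trade-off the reviewer mentions comes from.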

Pros

Amazon EMR

  • Ease of use and ease of setup
  • Autoscaling functionality
  • Integrated into the AWS environment

Cassandra

  • Continuous availability: as a fully distributed database (no master nodes), we can update nodes with rolling restarts and accommodate minor outages without impacting our customer services.
  • Linear scalability: for every unit of compute that you add, you get an equivalent unit of capacity. The same application can scale from a single developer's laptop to a web-scale service with billions of rows in a table.
  • Amazing performance: if you design your data model correctly, bearing in mind the queries you need to answer, you can get answers in milliseconds.
  • Time-series data: Cassandra excels at recording, processing, and retrieving time-series data. It's a simple matter to version everything and simply record what happens, rather than going back and editing things. Then, you can compute things from the recorded history.
David Prinzing
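
The time-series point above (record what happens instead of editing, then compute from the history) can be sketched as an append-only log. This is a minimal in-memory stand-in written for illustration, not Cassandra code; the `Reading` and `History` names and their fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Reading:
    """One immutable observation; rows are never updated in place."""
    sensor_id: str
    recorded_at: int  # epoch seconds
    value: float


class History:
    """Append-only store: new facts are added, old ones are kept."""

    def __init__(self) -> None:
        self._rows: List[Reading] = []

    def record(self, reading: Reading) -> None:
        self._rows.append(reading)

    def latest(self, sensor_id: str,
               as_of: Optional[int] = None) -> Optional[Reading]:
        """Most recent reading for a sensor, optionally as of a past time."""
        candidates = [
            r for r in self._rows
            if r.sensor_id == sensor_id
            and (as_of is None or r.recorded_at <= as_of)
        ]
        return max(candidates, key=lambda r: r.recorded_at, default=None)


if __name__ == "__main__":
    history = History()
    history.record(Reading("s1", 100, 1.0))
    history.record(Reading("s1", 200, 2.0))
    print(history.latest("s1"))
    print(history.latest("s1", as_of=150))
```

Because nothing is overwritten, the state at any earlier moment can be recomputed from the recorded rows.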

Cons

Amazon EMR

  • Cost overhead is a bit high
  • Limited versions of frameworks that can be used

Cassandra

  • Cassandra is a poor choice for implementing application queues.
  • NoSQL requires thinking differently, and can be challenging for people with strong relational database backgrounds to understand. The CQL language helps with this, but it pays to understand how the engine works under the hood. That said, the benefits outweigh the challenge of the learning curve!
  • Database compactions and anti-entropy repair can be burdensome on a busy cluster. Significant improvements have been made in recent versions, but this remains an operational challenge.
David Prinzing

Likelihood to Renew

Amazon EMR
No score (no answers on this topic yet)

Cassandra 8.0
Based on 11 answers
I've used Cassandra for 4 years now, on 3 major projects (one of them truly web-scale), and I'm deeply satisfied. These days, it's my go-to database. That said, technology moves quickly, and it's good to keep abreast of new developments...
David Prinzing

Alternatives Considered

The alternatives to EMR are mainly Hadoop distributions owned by the 3 companies above. I have not used the other distributions, so it is difficult to comment, but the general tradeoff is that, at the cost of a longer setup time and more infrastructure management, you get more flexible versioning and potentially faster access to newer versions of some frameworks such as Spark.
Cassandra is the only NoSQL database I have extensive experience with. In terms of other open source database solutions, I can say that I like Cassandra as much as traditional Oracle MySQL, and a lot more than PostgreSQL. The decision to use Cassandra was driven by its fast reads and writes, as well as its fault tolerance, which comes from having multiple rings in a cluster that shard data to each other in near real time.

Return on Investment

Amazon EMR

  • It was easy to set up initial versions of Spark on EMR
  • We still use it as our compute platform because it's easy to manage
  • At times we forgot to shut down clusters and were overcharged

Cassandra

  • Cassandra has had a positive effect on our ROI by improving uptime and performance

Pricing Details

Amazon EMR

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details

Cassandra

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details