Apache Cassandra - Why Would You Look Elsewhere?
Overall Satisfaction with Cassandra
- As a Java-based NoSQL database, it has the greatest community and adoption. Coupled with great Apache Hadoop, Apache Spark, and Solr integration and a strong tools ecosystem (unit tests, stress testing), it is an unbeatable combination!
- As a hybrid that combines a masterless architecture, as in Amazon's Dynamo, with a column family data model, as in BigTable, it hits the bull's-eye!
- It has best-in-class performance across read, write, and mixed workloads. It scales linearly, which helps keep latency low and throughput high as nodes are added.
- Its tunable consistency model lets you choose the level of consistency your platform or application needs.
- If configured correctly, there is no downtime and no data loss. These are key criteria in critical domains.
- Apache Cassandra lacks some features that DataStax provides in its Enterprise version, for example, security and advanced tools like OpsCenter. These would be a great addition to open source Apache Cassandra.
- At times we noticed that some versions had issues not known in advance, for example, LostNotificationError on repair of nodes. However, the newer releases have steadily become better and more stable.
- The examples for the DataStax native driver with Cassandra 2.1 could be improved, as they do not cover all the scenarios one would need in production.
- If you prefer to work with an open source project and be hands-on, Apache Cassandra is one of the best. However, if you need a managed, Cassandra-like service where you do not even want to configure, deploy, back up, or restack, a service like DynamoDB would be preferable.
- Cassandra is a JVM-based NoSQL database, so garbage collector tuning is a key aspect. Either use the G1 garbage collector (G1GC) on JDK 8, which handles garbage collection better, or configure the ConcurrentMarkSweep (CMS) garbage collector carefully.
- Highly available services and platforms.
- High performance, low latency, and high throughput across varying workloads.
- When configured, tuned, and monitored correctly, it provides the best user experience!
Apache Cassandra is a NoSQL database that is well suited to cases where you need high availability, linear scalability, tunable consistency, and high performance across varying workloads. It has worked well for our use cases, and I shared my experiences on using it effectively at the last Cassandra Summit! http://bit.ly/1Ok56TK
It is a NoSQL database, yet you can tune it to be strongly consistent and use it successfully as such. That is not the usual pattern, however, because you give up some latency in exchange. It works well if you really require that. If your use case needs a strongly consistent environment with the semantics of a relational database, or a data warehouse, or NoSQL with ACID transactions, Apache Cassandra may not be the optimum choice.
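On tuning for strong consistency: the usual rule of thumb is that a read sees the latest acknowledged write when the replicas consulted on read plus the replicas acknowledged on write exceed the replication factor (R + W > RF). Below is a minimal sketch of that arithmetic, assuming a single datacenter. The function names are hypothetical helpers for illustration, not part of any Cassandra driver.

```python
# Sketch of the R + W > RF rule for a single datacenter.
# replicas_required and is_strongly_consistent are illustrative names,
# not driver APIs.

def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    quorum = replication_factor // 2 + 1  # majority of replicas
    table = {
        "ONE": 1,
        "TWO": 2,
        "THREE": 3,
        "QUORUM": quorum,
        "ALL": replication_factor,
    }
    return table[level]


def is_strongly_consistent(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    for read, write in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        print(read, write, is_strongly_consistent(read, write, 3))
```

With a replication factor of 3, QUORUM reads and QUORUM writes overlap (2 + 2 > 3), while ONE and ONE do not (1 + 1 is not greater than 3). The stronger combinations wait on more replicas per request, which is where the latency cost comes from.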