Cassandra Usage and Needs
September 27, 2017

Ravi Reddy | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User

Overall Satisfaction with Cassandra

We chose Cassandra based on our requirements and on how the application needs its data to be available, mainly for search-driven queries.
  • Cassandra has a lot of APIs readily available for MapReduce-style queries (such as materialized views).
  • Cassandra uses a ring architecture with no master-slave design (unlike HBase). If data is written to one node, it is synced to the other nodes in the ring, whereas HBase has a dedicated master node to orchestrate data across its slaves.
  • Write Speed
  • Multi Data Center Replication
  • Tunable Consistency
  • Integrates well with the JVM ecosystem because it is written in Java
  • Cassandra Query Language (CQL) resembles a subset of SQL, so the learning curve is small
  • No ad-hoc queries: Cassandra's data storage layer is basically a key-value storage system. This means that you must "model" your data around the queries you want to surface, rather than around the structure of the data itself.
  • Aggregation support is very limited: CQL offers only basic aggregate functions (COUNT, MIN, MAX, SUM, AVG), which is far from what SQL databases provide.
  • Not fit for transactional data.
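Tunable consistency, listed among the strengths above, comes down to simple arithmetic on the replication factor. The sketch below is only an illustration of that arithmetic; the function names `quorum` and `is_strongly_consistent` are hypothetical helpers, not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def is_strongly_consistent(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


RF = 3  # assumed replication factor for this example
print(quorum(RF))                                          # 2
print(is_strongly_consistent(quorum(RF), quorum(RF), RF))  # True
print(is_strongly_consistent(1, 1, RF))                    # False
```

Reading and writing at QUORUM trades some latency for reads that always see the latest write; reading and writing at ONE is faster but only eventually consistent.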
Technology selection should be based on need, not on market buzzwords (or whatever a Google search turns up). If your data suits a flat, denormalized layout and needs to be searchable by index and partition keys, then Cassandra is a good fit. Cassandra is a better choice than HBase because it has a lot of APIs readily available for MapReduce-style queries (such as materialized views). Cassandra uses a ring architecture with no master-slave design (unlike HBase). If data is written to one of the nodes, it is synced to the other nodes in the ring, whereas HBase has a dedicated master node to orchestrate data across its slaves.
Whether it makes sense to structure the data in a denormalized way depends on the data model.
Cassandra's data storage layer is basically a key-value storage system. This means that you must model your data around the queries you want to surface, rather than around the structure of the data itself. This can lead to storing the data multiple times in different ways to be able to satisfy the requirements of your application.
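To make the ring idea concrete, here is a minimal sketch of how a masterless ring can decide which nodes hold a given partition key. It is a simplification under stated assumptions: the `Ring` class, the `token` function, the node names, and the 32-bit MD5-based token space are all hypothetical, and real Cassandra uses a different partitioner and virtual nodes.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a position on the ring (assumed 32-bit token space)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class Ring:
    """Toy token ring: every node is a peer, and none acts as a master."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = min(replication_factor, len(nodes))
        # Each node owns the position given by hashing its own name.
        self.entries = sorted((token(node), node) for node in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key):
        """Walk clockwise from the key's token and collect replica nodes."""
        start = bisect.bisect_left(self.tokens, token(partition_key))
        count = len(self.entries)
        return [self.entries[(start + i) % count][1]
                for i in range(self.replication_factor)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("user:42")
print(owners)  # three distinct nodes; any of them can accept the write
```

Because placement is computed from the key alone, any node can work out where the data belongs, which is why no dedicated coordinator is needed.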
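The query-first modeling described above can be sketched with plain dictionaries standing in for tables. This is an illustration only, not Cassandra code: the table names `reviews_by_product` and `reviews_by_reviewer` and the helper `save_review` are hypothetical, and each dictionary key plays the role of a partition key.

```python
from collections import defaultdict

# One "table" per query, each keyed by that query's partition key.
reviews_by_product = defaultdict(list)
reviews_by_reviewer = defaultdict(list)


def save_review(product: str, reviewer: str, score: int) -> None:
    """Write the same review twice, once per query it has to serve."""
    row = {"product": product, "reviewer": reviewer, "score": score}
    reviews_by_product[product].append(dict(row))    # copy for query 1
    reviews_by_reviewer[reviewer].append(dict(row))  # copy for query 2


save_review("cassandra", "alice", 8)
save_review("hbase", "alice", 6)
save_review("cassandra", "bob", 9)

# Each lookup is a single key access, with no join and no scan.
print([r["reviewer"] for r in reviews_by_product["cassandra"]])  # ['alice', 'bob']
print([r["product"] for r in reviews_by_reviewer["alice"]])      # ['cassandra', 'hbase']
```

The cost is visible in `save_review`: every write fans out to as many tables as there are queries, and the application is responsible for keeping the copies in step.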