If you're doing joins across HBase, HDFS, Cassandra, and Redis, then this works. It is not suited to being a be-all and end-all. This is not straightforward magic software that works for all scenarios. You need to examine the use case to see whether Apache Drill fits your needs. About three times out of four, it does.
We've been super happy with Astra DB. It's been extremely well-suited for our vector search needs as described in previous responses. With Astra DB’s high-performance vector search, Maester’s AI dynamically optimizes responses in real time, adapting to new user interactions without requiring costly retraining cycles.
We need to be able to process a lot of data (our biggest clients process hundreds of millions of transactions every month). However, it is not only the amount of data; it is also the unpredictable patterns, with spikes occurring at different points in time. That is something Astra is great at.
Our processing needs to be extremely fast. Some of our clients use our enrichment synchronously, meaning that any delay in processing holds up the whole transaction lifecycle and can have a major impact on the client. Astra is very fast.
A close collaboration with GCP makes our life very easy. All of our technology sits in Google Cloud, so having Astra in there makes it a no-brainer solution for us.
Sometimes we have to press the escalate button on a ticket to get a timely response from the support team. I will say that once a ticket is escalated, action is taken.
They need better documentation on data migration. The three primary methods for migrating large data volumes are bulk, Cassandra Data Migrator, and ZDM (Zero Downtime Migration Utility). Over time I have become very familiar with all three of these methods; however, while working with the Services team and the support team, it seemed like we were breaking new ground. I feel that if the utilities were better documented and included some examples and/or use cases from large data migrations, this process would have been easier. One lesson learned is that you likely need to migrate your application servers to the same cloud provider that hosts Astra; otherwise, the latency is too high for latency-sensitive applications.
If Presto adds more support (i.e., HBase, S3), then it's strongly possible that we'll move from Apache Drill to PrestoDB. However, Apache Drill needs to be easier to configure, especially when it comes to garbage collection tuning. If Apache Drill also supported Spark SQL and Flume, that would make Drill more valuable than PrestoDB.
Their response time is fast. If you contact them outside business hours, they follow up on your case very well. They also set up video calls for debugging if necessary.
Compared to Presto, Drill supports more data stores. Impala has limitations relative to what Drill can support. Apache Phoenix only supports HBase, with no support for Cassandra. We chose Apache Drill because it supports multiple data stores that the other three do not. Presto does not support HBase as of yet, and Impala does not support querying Cassandra.
Astra DB incorporates graph, search, analytics, administration, developer tooling, and monitoring into a single platform. MongoDB is a self-managed infrastructure. Astra DB is a wide-column store and MongoDB is a document store. The best thing is that Astra DB runs on Java while MongoDB runs on C++.
We are well aware of the Cassandra architecture and familiar with the open source tooling that DataStax provides to the industry (K8ssandra / Stargate) to scale Cassandra on Kubernetes.
Having prior knowledge of Cassandra / Kubernetes means we know that under the hood Astra is built on infinitely scalable technologies. We trust that the foundations Astra is built on will scale, so we know Astra will scale.