Likelihood to Recommend
The biggest advantage of Apache Geode is database-like consistency. Applications whose data needs to be in memory, accessible at low latency and, most importantly, written consistently should use Apache Geode. In our application a fair amount of data is static, and we store it in MySQL because it can be easily manipulated there. But since this data is large, reads and writes against the database become expensive, so we started using Redis. Redis does a brilliant job, but with complex data structures and no query-like capability, we have to manage it in code. We are experimenting with Apache Geode and it looks promising: we can now query complex data structures, get the required data quickly, and keep updates consistent.

I find Qubole well suited for getting started with analyzing data in the cloud without being locked in to a specific cloud vendor's tooling, other than the underlying filesystem. Since the data itself is not isolated to any Qubole cluster, it can easily be collected back into a cloud vendor's specific tools for further analysis. I therefore find it complementary to offerings such as Amazon EMR or Google DataProc.

Pros
Super-fast data pull/push.
Provides ACID transactions, so it works like a SQL database.
Provides replication and partitioning, so our data is never lost and extraction is super fast.
NoSQL-like properties.

From a UI perspective, I find Qubole's closest comparison to be Cloudera's HUE; it provides a one-stop shop for all data browsing and querying needs.
Auto scaling groups and auto-terminating clusters provide cost savings on idle resources.
Qubole fits well into the open-source data science market by providing a choice of tools that aren't tied to a specific cloud vendor.

Cons
Needs support for more languages. Out-of-the-box Python and Node.js adapters would be wonderful.
Currently it supports just a key-value store. It would be great if we could also cache documents or time-series data.
Needs more community support and documentation.

Providing an open selection of all cloud provider instance types with no explanation of their ideal use cases causes too much confusion for new users setting up a new cluster. For example, not everyone knows that Amazon's R- and X-series instances are memory optimized, while the C-series is compute optimized and the M-series is general purpose.
I would like to see more ETL tools provided, other than DistCP, that allow one to move data between Hadoop filesystems.
From the cluster administration side, onboarding new users at large companies seems troublesome, especially when trying to create an individual cluster per team within the company. It should also be possible to debug and share code and queries with users on other teams or clusters.

Likelihood to Renew
Personally, I have no issues using Amazon EMR with Hue and Zeppelin, for example, for data science and exploratory analysis. The benefits of using Qubole are that it offers additional tooling that may not be available from other cloud providers without manual installation, and that it offers auto-terminating instances and scaling groups.

Usability
Still experimenting. Initial results are good. We need to figure out whether we can completely replace Redis and, cost-wise, whether it makes sense to keep both or whether a full replacement is feasible.

Support Rating
Never contacted support.

Alternatives Considered
Still experimenting, but it looks promising because it has query capabilities over complex data structures.

Qubole was chosen by upper management over these competing offerings. I find that Databricks has a better Spark offering than Qubole's Zeppelin notebooks.

Return on Investment
Still experimenting, so it is difficult to quote figures.
For small projects or teams it might be overkill, as it still has a certain learning curve.
For medium to large projects with complex data structures that need to be queried with fast output, it definitely works.

We like to say that Qubole has allowed for "data democratization", meaning that each team is responsible for its own set of tooling and use cases rather than being limited by the versions established by products such as Hortonworks HDP or Cloudera CDH.
One negative impact is that users have over-provisioned clusters without realizing it, and end up paying for it. When setting up a new cluster there are too many choices to pick from, and data scientists may not understand the instance types or hardware specs needed for the datasets they operate on.