Likelihood to Recommend

I find Qubole is well suited for getting started analyzing data in the cloud without being locked in to a specific cloud vendor's tooling other than the underlying filesystem. Since the data itself is not isolated to any Qubole cluster, it can easily be collected back into a cloud vendor's specific tools for further analysis. I therefore find it complementary to offerings such as Amazon EMR or Google Dataproc.
I oversee our HR data, and we use Workday as our HR management system. I have a script in place that runs reports on Workday and saves the results as CSVs. I then use stages in Snowflake to load these CSVs into staged tables, and I either insert those staged tables into a final schema or truncate and replace what is there. Once the data is in a schema, I can reference it and build out my data models. In addition to ingesting CSVs, Snowflake can write a CSV file to our Amazon S3 bucket. Ingesting these CSVs, transforming the data, and then delivering it to a destination would have involved much more coding than my current process if we were on any other platform.
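As a rough illustration of the stage-based flow described in this review, the sketch below builds the Snowflake SQL statements for such a pipeline as plain strings. It is not the reviewer's actual script: the function names, the stage and table names, and the file path are all hypothetical, and the unload step assumes an external stage that already points at the S3 bucket.

```python
from pathlib import PurePosixPath


def build_load_statements(csv_path, stage, staging_table, final_table):
    """Return SQL for: upload CSV -> load staging table -> replace final table.

    All object names passed in are hypothetical placeholders.
    """
    file_name = PurePosixPath(csv_path).name
    return [
        # Upload the local CSV report to a named internal stage.
        f"PUT file://{csv_path} @{stage} OVERWRITE = TRUE",
        # Empty the staging table, then load the staged file into it.
        f"TRUNCATE TABLE {staging_table}",
        f"COPY INTO {staging_table} FROM @{stage}/{file_name} "
        "FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)",
        # Truncate-and-replace the final table from the staging table.
        f"TRUNCATE TABLE {final_table}",
        f"INSERT INTO {final_table} SELECT * FROM {staging_table}",
    ]


def build_unload_statement(table, external_stage, prefix):
    """Return SQL that writes a table out as CSV via an external (S3) stage.

    Assumes `external_stage` was created beforehand and points at the bucket.
    """
    return (
        f"COPY INTO @{external_stage}/{prefix}/ FROM {table} "
        "FILE_FORMAT = (TYPE = CSV) HEADER = TRUE OVERWRITE = TRUE"
    )


if __name__ == "__main__":
    for stmt in build_load_statements(
        "/tmp/workday/workers.csv", "hr_stage", "staging.workers", "hr.workers"
    ):
        print(stmt)
    print(build_unload_statement("hr.workers", "s3_stage", "exports"))
```

Each string could then be passed to a cursor's `execute` method from whichever Snowflake client is in use. Keeping the statements as data makes the insert versus truncate-and-replace choice easy to inspect before anything runs.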
Pros

From a UI perspective, I find Qubole's closest comparison to be Cloudera's HUE; it provides a one-stop shop for all data browsing and querying needs.
Auto-scaling groups and auto-terminating clusters provide cost savings for idle resources.
Qubole fits itself well into the open-source data science market by providing a choice of tools that aren't tied to a specific cloud vendor.

Snowflake scales appropriately, allowing you to manage expense for peak and off-peak times for data retrieval and data-centric processing jobs.
Snowflake offers a marketplace solution that allows you to sell and subscribe to different data sources.
Snowflake manages concurrency better in our trials than other premium competitors.
Snowflake has little to no setup and ramp-up time.
Snowflake offers online training for various employee types.

Cons

Providing an open selection of all cloud provider instance types with no explanation of their ideal use cases causes too much confusion for new users setting up a new cluster. For example, not everyone knows that Amazon's R or X-series models are memory optimized, while the C and M-series are for general computation.
I would like to see more ETL tools provided other than DistCp that allow one to move data between Hadoop filesystems.
From the cluster administration side, onboarding of new users for large companies seems troublesome, especially when trying to create an individual cluster per team within the company. Having the ability to debug and share code and queries between users of other teams and clusters should also be possible.

This tool is very technical and proper knowledge is required, so mostly you have to hire an IT team.
I wish videos were available for basic queries, such as how to get started; they would act as a guideline and help beginners a lot.
Likelihood to Renew

Personally, I have no issues using Amazon EMR with Hue and Zeppelin, for example, for data science and exploratory analysis. The benefits of using Qubole are that it offers additional tooling that may not be available in other cloud providers without manual installation, and that it offers auto-terminating instances and scaling groups.
Snowflake is very cost effective, and we also like the fact that we can stop, start, and spin up additional processing engines as we need to. We also like that it's easy to connect our SQL IDEs to Snowflake and write our queries in the environment we are used to.
Usability

The interface is similar to other SQL query systems I've used and is fairly easy to use. My only complaint is the syntax issues. Another thing is that the error messages are not always the easiest to understand, especially when you incorporate temp tables. Some of that is to be expected with any new database.
Support Rating

We have had terrific experiences with Snowflake support. They have drilled into queries and given us tremendous detail and helpful answers. In one case they even figured out how a particular product was interacting with Snowflake, via its queries, and gave us detail to go back to that product's vendor because the Snowflake support team identified a fault in its operation. We got it solved without lots of back-and-forth or finger-pointing because the Snowflake team gave such detailed information.
Alternatives Considered

Qubole was decided on by upper management rather than these competitive offerings. I find that Databricks has a better Spark offering compared to Qubole's Zeppelin notebooks.
I have used one other database management system, at my previous workplace. What Snowflake provides is more user-friendly consoles, suggestions while writing a query, easy connection to various BI platforms for analysis, [and a] more robust system to store a large amount of data. All these functionalities give Snowflake the edge.
Return on Investment

We like to say that Qubole has allowed for "data democratization", meaning that each team is responsible for its own set of tooling and use cases rather than being limited by versions established by products such as Hortonworks HDP or Cloudera CDH.
One negative impact is that users have over-provisioned clusters without realizing it, and end up paying for it. When setting up a new cluster, there are too many choices to pick from, and data scientists may not understand the instance types or hardware specs for the datasets they need to operate on.

Positive impact: we use Snowflake to track our subscription and payment charges, which we use for internal and investor reporting.
Positive impact: query speed 3 times faster than Treasure Data means that analysts can deliver answers to stakeholders more quickly.
Positive impact: recommender systems now source their data from Snowflake rather than Spark clusters, improving development speed, and no longer require maintenance of Spark clusters.