Likelihood to Recommend
The software appears to run more efficiently than other big data tools, such as Hadoop. Given that, Apache Spark is well-suited for querying and trying to make sense of very large data sets. The software offers many advanced machine learning and econometrics tools, although these tools are used only partially because they take too much time to run once the data sets get very large. The software is not well-suited for projects that are not big data in size. The graphics and analytical output are subpar compared to other tools.
Our workload is 100% analytical. We also have to ingest a lot of data each month. SingleStore is a perfect match for our needs because it has fast pipelines for data ingestion and great performance, even in large and complex queries. We need fast response times for our user interface and great performance in our ETL processes, which are rather complicated. SingleStore handles all of this very well.
Pros
Rich APIs for data transformation, making it very easy to transform and prepare data in a distributed environment without worrying about memory issues
Faster execution times compared to Hadoop and Pig Latin
Easy SQL interface to the same data set for people who are comfortable exploring data in a declarative manner
Interoperability between SQL and Scala / Python style of munging data

Returns results of complex queries scanning TBs of data in under a second.
Customer support team answers tickets quickly and provides guidance.
MySQL engine, which allows querying with simple MySQL drivers from different clients.
Query profiling is easy to use and helps with investigating performance.
Cons
Memory management. Very weak on that.
PySpark is not as robust as Scala with Spark.
Spark master HA is needed. Not as HA as it should be.
Locality should not be a necessity, but it does help performance. Would prefer no locality requirement.

We wish the product had better support for High Availability of the aggregator. Currently the indexes generated by the two different aggregators are not in the same sequential space, so our apps carry more of the burden of dealing with HA.
More tools for debugging issues such as high memory usage would be good.
The price was what kept us away from purchasing for the first few years. Now we are able to afford it due to a promotion that offers it at 25% of the list price. Not sure if we'll continue after the promotion offer expires in another 2 years.
Likelihood to Renew
Capacity for computing data in a cluster, and fast speed.
Senior Software Developer (Consultant)
We haven't seen a faster relational database. Period. Which is why we are super happy customers and will for sure renew our license.
Usability
The only thing I dislike about Spark's usability is the learning curve: there are many actions and transformations. However, its wide range of uses for ETL processing, its ease of integration, and its multi-language support make this library a powerhouse for your data science solutions. It has especially aided us with its lightning-fast processing times.
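For readers new to the "actions and transformations" split mentioned above, the following is a minimal sketch of the idea in plain Python. It is a toy model, not Spark's API: the `LazyDataset` class and its methods are hypothetical names chosen for illustration. Transformations (`map`, `filter`) only describe work, and nothing runs until an action (`collect`, `count`) asks for a result.

```python
from typing import Callable, Iterator, List


class LazyDataset:
    """Toy stand-in for a Spark-like dataset (hypothetical, not the Spark API)."""

    def __init__(self, make_iter: Callable[[], Iterator[int]]) -> None:
        # A factory that builds a fresh iterator each time an action runs.
        self._make_iter = make_iter

    @classmethod
    def from_list(cls, items: List[int]) -> "LazyDataset":
        return cls(lambda: iter(items))

    # Transformations: return a new dataset and do no work yet.
    def map(self, fn: Callable[[int], int]) -> "LazyDataset":
        parent = self._make_iter
        return LazyDataset(lambda: (fn(x) for x in parent()))

    def filter(self, pred: Callable[[int], bool]) -> "LazyDataset":
        parent = self._make_iter
        return LazyDataset(lambda: (x for x in parent() if pred(x)))

    # Actions: trigger evaluation of the whole chain.
    def collect(self) -> List[int]:
        return list(self._make_iter())

    def count(self) -> int:
        return sum(1 for _ in self._make_iter())


calls: List[int] = []  # records every time the mapped function really runs


def square(x: int) -> int:
    calls.append(x)
    return x * x


pipeline = LazyDataset.from_list([1, 2, 3, 4]).map(square).filter(lambda v: v % 2 == 0)
calls_before_action = len(calls)  # still zero: only transformations so far
result = pipeline.collect()       # the action runs the chain
calls_after_action = len(calls)
```

In this sketch, `square` is not called while the pipeline is being built, only when `collect` runs. Real Spark applies the same principle at cluster scale, with many more transformations and actions to learn.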
[Until it is] supported on AWS ECS containers, I will reserve a higher rating for SingleStore. Right now it works well on EC2 and serves our current purpose, [but] I would look forward to seeing SingleStore respond to our feature requests in a shorter time period, with high quality and security.
Reliability and Availability
We have not experienced any downtime in the two years that we have been using SingleStore.
Performance
SingleStore lets organizations run transactions and operational analytics together in order to make use of their data and transform their business. SingleStore delivers a database that performs both functions. Before using SingleStore, we had different systems for OLTP queries and for OLAP analyses, and a number of ETL packages to bring data from the OLTP system to the reporting database.
Support Rating
1. It integrates very well with Scala or Python.
2. Its SQL interoperability is very easy to understand.
3. Apache Spark is way faster than other competing technologies.
4. The support from the Apache community is very strong for Spark.
5. Execution times are faster compared to others.
6. There are a large number of forums available for Apache Spark.
7. Code for Apache Spark is readily available and easy to access.
8. Many organizations use Apache Spark, so many solutions are available for existing applications.
Very responsive to trouble tickets - Often, I think, SingleStore's monitoring systems have already alerted the engineers by the time I get around to writing a ticket (about 10 - 20 mins after we see a problem). I feel like things are escalated nicely and SingleStore takes resolving trouble tickets seriously. Also, SingleStore follows up after incidents with a post-mortem and actionable takeaways to improve the product. Very satisfied here.
Implementation Rating
We allowed 2-3 months for a thorough evaluation. We saw pretty quickly that we were likely to pick SingleStore, so we ported some of our stored procedures to SingleStore in order to take a deeper look. Two SingleStore people worked closely with us to ensure that we did not have any blocking problems. It all went remarkably smoothly.
Alternatives Considered
Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing.
Vertica, Snowflake, SQL Server, Azure Data Warehouse, PowerBI, Aerospike, etc. From what I've seen MemSQL is well worth the cost when latency and data freshness needs are high, i.e. you need a lot of queries to run with UI latency (the query itself takes less than a second or so), with very fresh streaming fact and dimensional data. It will be more expensive per "unit of performance" but if you need that performance then it'll get the job done.
On-prem Vertica (note, not Eon) provides more knobs for optimizing a particular data set and set of queries against it, and performs as well or better in single-table fact table queries. It will also scale to data size more cheaply due to its on-disk model. For large queries against large data sets where data freshness isn't as important (and latency either is or isn't), I'd take Vertica (although if you need to do a lot of joins, it will struggle). However, as it is still exclusively columnar, dimension table updates, and re-calcs based on them, can only be tuned to happen so fast (we could do much better than 10 seconds with 10-100 updates per second for raw replication, and Vertica's joins are always slow so re-calcs were worse).
Snowflake suffers similarly to Vertica in the data freshness, replication, and re-calc area; SF also doesn't give as many knobs to turn as Vertica for data set optimization but seems to be better at joins. If you have a lot of queries to run against a lot of data and joins are limited, and you need query latency to be low and consistent but you don't need a ton of freshness, I'd stick with Vertica. If joins matter more, or you can accept notably-but-not-terribly worse performance, then Snowflake is fine and cheaper from what we've seen. (Again, I can't speak to SF vs Vertica Eon.)
We couldn't get SQL Server and ADW to perform as well as the other options, but I'll say we didn't try that hard on those.
Aerospike is amazing as a KV store; however, for OLAP use cases where you want to balance performance against the flexibility of queries against general event (time series) data (i.e. be able to roll up to different grains), KV becomes challenging.
PBI is great if you want an integrated BI tool, but if you want an OLAP solution to build against, with the particular scale or performance needs mentioned above, I'd go with one of these other solutions.
It really can be great for letting non-tech folks build relatively small data sets and quick insights for customers (internal or external), great leverage in that case.
Scalability
We needed more memory on our cluster. SingleStore handled it very smoothly.
Return on Investment
Business leaders are able to make data-driven decisions
Business users are able to access data in near real time now. Before using Spark, they had to wait at least 24 hours for data to be available
The business is able to come up with new product ideas

As the overall performance and functionality were expanded, we are able to deliver our data much faster than before, which increases the demand for data.
Metadata is available in the platform by default, like metadata on the pipelines. Also, the information schema has lots of metadata, making it easy to load our assets to the data catalog.
ScreenShots