Cloudera Data Platform (CDP), launched September 2019, is designed to combine the best of Hortonworks and Cloudera technologies to deliver an enterprise data cloud. CDP includes the Cloudera Data Warehouse and machine learning services as well as a Data Hub service for building custom business applications.
$0.04 per CCU (hourly rate)
SingleStore
Score 8.3 out of 10
N/A
SingleStore aims to enable organizations to scale from one to one million customers, handling SQL, JSON, full text and vector workloads in one unified platform.
I have seen that Cloudera Data Platform is well suited for large batch processes. It works really well for our indication analyses that are performed by the actuaries. I feel that rapid streaming operations may be a situation where additional technology would be needed to provide for a robust solution.
Well-Suited Scenarios:
Real-Time Analytics: Financial trading platforms requiring instant insights.
Operational Dashboards: Retail businesses monitoring live sales.
IoT Data Processing: Smart device monitoring with high data ingestion.
Fraud Detection: Banks detecting suspicious transactions instantly.
Less Appropriate Scenarios:
Archival Storage: Cold data storage with infrequent access.
Low-Volume Workloads: Small-scale apps with minimal data processing needs.
Complex ETL Pipelines: Heavy data transformations without real-time demands.
It does not backport patches; it just releases a new version and stops supporting the old one, so it is difficult to keep up with that pace.
Support engineers lack expertise, but they seem to be improving organically.
Lacks enterprise CDC capability: Change data capture (CDC) is a process that tracks and records changes made to data in a database and then delivers those changes to other systems in real time.
For enterprise-level backup & restore capability, we had to implement our model via Velero snapshot backup.
[Until it is] supported on AWS ECS containers, I will reserve a higher rating for SingleStore. Right now it works well on EC2 and serves our current purpose, [but] I look forward to seeing SingleStore respond to our feature requests in a shorter time, with high quality and security.
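To make the CDC definition in the review above concrete, here is a minimal Python sketch of the idea: detect what changed between two states of a table and deliver those changes to another system. It is only an illustration under loud assumptions. Production CDC tools usually read the database's transaction log rather than diffing snapshots, and every name here (`ChangeEvent`, `capture_changes`, `apply_changes`) is hypothetical and not part of SingleStore or any other product.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


@dataclass(frozen=True)
class ChangeEvent:
    """One recorded change (hypothetical structure, for illustration only)."""
    op: str                 # "insert", "update", or "delete"
    key: Any                # primary key of the affected row
    before: Optional[Row]   # row image before the change, if any
    after: Optional[Row]    # row image after the change, if any


def capture_changes(old: Dict[Any, Row], new: Dict[Any, Row]) -> List[ChangeEvent]:
    """Diff two keyed snapshots of a table into a list of change events.

    Assumption: snapshot diffing stands in for log-based capture here.
    """
    events: List[ChangeEvent] = []
    for key, row in new.items():
        if key not in old:
            events.append(ChangeEvent("insert", key, None, row))
        elif old[key] != row:
            events.append(ChangeEvent("update", key, old[key], row))
    for key, row in old.items():
        if key not in new:
            events.append(ChangeEvent("delete", key, row, None))
    return events


def apply_changes(target: Dict[Any, Row], events: List[ChangeEvent]) -> None:
    """Replay change events on a downstream copy of the table."""
    for event in events:
        if event.op == "delete":
            target.pop(event.key, None)
        else:
            target[event.key] = dict(event.after)


before = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
after = {1: {"name": "Ann"}, 2: {"name": "Robert"}, 3: {"name": "Cy"}}

replica = {key: dict(row) for key, row in before.items()}
apply_changes(replica, capture_changes(before, after))
```

After the replay, `replica` holds the same rows as `after`, which is the property a downstream consumer of a change feed relies on.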
SingleStore performance is excellent, and the concurrency it supports is amazing compared to other DB technologies. It scales out pretty well by adding leaf nodes as the data grows. Pipelines are the strongest feature of SingleStore: they eliminate the need to write manual code to ingest data from Kafka, S3, and Hadoop systems, and they ingest in parallel.
We have utilized Cloudera support quite frequently and are very satisfied with the capability and responsiveness of that team. Often, the new features delivered with the platform give us an opportunity to mature the way we're doing things, and the support team have been valuable in developing those new patterns.
The support team dives deep into our most complex queries and the bizarre issues that sometimes only we encounter, compared with other clients, because of our special workload (thousands of Kafka pipelines plus high query concurrency). The response matches the priority of the request: a P1 gets an immediate return call. Missing features are handled as client requests and added to the roadmap after internal consideration of all client needs and priorities. Bugs are patched quite fast, depending on the impact and on whether a temporary workaround is feasible. There is no issue for which we haven't received a proper answer, resolution, or reasoning.
We allowed 2-3 months for a thorough evaluation. We saw pretty quickly that we were likely to pick SingleStore, so we ported some of our stored procedures to SingleStore in order to take a deeper look. Two SingleStore people worked closely with us to ensure that we did not have any blocking problems. It all went remarkably smoothly.
IBM's Cloud Pak for Data offering has been a moving target and difficult to compare to Cloudera Data Platform. We have implemented our solution on Amazon Web Services, which appears to be supported by IBM at this point, but the migration would be very expensive for us to undertake.
SingleStore is built for fast data ingestion and fast queries against large tables (billions of rows). This is possible because of the columnstore engine that SingleStore uses. SingleStore also supports an in-memory engine. Pipelines are another big advantage. Being able to ingest data from S3 (and load new data as it arrives) without adding any additional services (with just 1 SQL query) is pretty cool. SingleStore is also pushing hard on its vector engine. It is becoming a single storage solution.
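As a rough illustration of the "1 SQL query" the reviewer mentions, the Python sketch below assembles a `CREATE PIPELINE ... LOAD DATA S3` statement as a string. The statement follows the general shape of SingleStore's pipeline syntax as we understand it, but the exact clauses should be checked against the documentation for the version in use. The helper name `build_s3_pipeline_sql`, the pipeline, bucket, and table names, and the credential values are all hypothetical placeholders.

```python
import json
import re
from typing import Dict

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_s3_pipeline_sql(
    pipeline: str,
    bucket_path: str,
    table: str,
    region: str,
    credentials: Dict[str, str],
    delimiter: str = ",",
) -> str:
    """Build the text of a CREATE PIPELINE statement for an S3 source.

    Hypothetical helper: it only formats SQL and never contacts a database.
    """
    # Reject names that are not plain identifiers, since they are
    # interpolated directly into the statement.
    for name in (pipeline, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple SQL identifier: {name!r}")

    config = json.dumps({"region": region})
    creds = json.dumps(credentials)
    return (
        f"CREATE PIPELINE {pipeline} AS\n"
        f"LOAD DATA S3 '{bucket_path}'\n"
        f"CONFIG '{config}'\n"
        f"CREDENTIALS '{creds}'\n"
        f"INTO TABLE {table}\n"
        f"FIELDS TERMINATED BY '{delimiter}';"
    )


sql = build_s3_pipeline_sql(
    pipeline="orders_pipeline",          # hypothetical pipeline name
    bucket_path="my-bucket/orders/",     # hypothetical bucket and prefix
    table="orders",                      # hypothetical target table
    region="us-east-1",
    credentials={
        "aws_access_key_id": "EXAMPLE_KEY_ID",      # placeholder value
        "aws_secret_access_key": "EXAMPLE_SECRET",  # placeholder value
    },
)
print(sql)
```

In practice the pipeline would also need to be started (for example with a separate `START PIPELINE` statement) before it begins loading files, so "one query" is best read as "one definition, no extra ingestion service".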
As the overall performance and functionality were expanded, we are able to deliver our data much faster than before, which increases the demand for data.
Metadata is available in the platform by default, like metadata on the pipelines. Also, the information schema has lots of metadata, making it easy to load our assets to the data catalog.