Microsoft's Azure Data Factory is a service built for all data integration needs and skill levels. It is designed to allow the user to easily construct ETL and ELT processes code-free within the intuitive visual environment, or write one's own code. Visually integrate data sources using more than 80 natively built and maintenance-free connectors at no added cost. Focus on data—the serverless integration service does the rest.
SingleStore
Score 7.8 out of 10
SingleStore aims to enable organizations to scale from one to one million customers, handling SQL, JSON, full text and vector workloads in one unified platform.
I can only compare it with Exasol, which I have used on a similar basis; it manages the Hadoop schema and is very similar to SingleStore. SingleStore has many advantages: being in the cloud, I can increase capacity with just a couple of clicks, and the configuration is super …
The best scenario is the ETL process. The flexibility and connectivity are outstanding. For our environment, SAP data connectivity with Azure Data Factory offers very limited features compared to SAP Data Sphere. Due to the limited modelling capability of the tool, we use Databricks for data modelling and cleaning. The use of multiple tools could have been avoided if Azure Data Factory had modelling capabilities.
Good for: applications needing instant insights on large, streaming datasets; applications processing continuous data streams with low latency; and situations where a multi-cloud, high-availability database is required. When NOT to use: small-scale applications with limited budgets; projects that do not require real-time analytics or distributed scaling; and teams without experience in distributed databases and HTAP architectures.
Granularity of Errors: Sometimes, Azure Data Factory provides error messages that are too generic or vague, making it challenging to pinpoint the exact cause of a pipeline failure. Enhanced error messages with more actionable details would greatly assist us in debugging our pipelines.
Pipeline Design UI: In my experience, the visual interface for designing pipelines, especially when dealing with complex workflows or numerous activities, can become cluttered. I think a more intuitive and scalable design interface would improve usability. In my opinion, features like zoom, better alignment tools, or grouping capabilities could make managing intricate designs more manageable.
Native Support: While Azure Data Factory does support incremental data loads, in my experience the setup can be somewhat manual and complex. I think native and more straightforward support for Change Data Capture, especially from popular databases, would simplify the process of capturing and processing only the changed data, making regular data updates more efficient.
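To illustrate what the reviewer means by a "manual and complex" setup, here is a minimal sketch of the watermark-style incremental load that Azure Data Factory users typically wire up themselves (Lookup the last watermark, Copy only newer rows, advance the watermark). It uses an in-memory SQLite database purely for illustration; the table and column names (src_orders, sink_orders, watermark) are hypothetical, not anything from the review.

```python
# Hypothetical watermark-based incremental load, sketched in plain Python/SQLite
# to mirror the Lookup -> Copy -> update-watermark pattern built in ADF pipelines.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (id INTEGER PRIMARY KEY, amount REAL, modified_at TEXT);
    CREATE TABLE sink_orders (id INTEGER PRIMARY KEY, amount REAL, modified_at TEXT);
    CREATE TABLE watermark (last_modified TEXT);
    INSERT INTO watermark VALUES ('1970-01-01T00:00:00');
    INSERT INTO src_orders VALUES
        (1, 10.0, '2024-01-01T09:00:00'),
        (2, 20.0, '2024-01-02T09:00:00');
""")

def incremental_load(conn):
    # 1. Look up the previous high-water mark.
    (last,) = conn.execute("SELECT last_modified FROM watermark").fetchone()
    # 2. Copy only rows changed since that mark (the Copy activity's source query).
    rows = conn.execute(
        "SELECT id, amount, modified_at FROM src_orders WHERE modified_at > ?",
        (last,),
    ).fetchall()
    conn.executemany("INSERT OR REPLACE INTO sink_orders VALUES (?, ?, ?)", rows)
    # 3. Advance the watermark to the newest change just copied.
    if rows:
        conn.execute(
            "UPDATE watermark SET last_modified = ?",
            (max(r[2] for r in rows),),
        )
    conn.commit()
    return len(rows)

print(incremental_load(conn))  # first run copies both rows
print(incremental_load(conn))  # second run copies nothing new
```

Native CDC support would replace this bookkeeping with change feeds delivered by the source database itself.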
It does not release patches or backports; it just releases a new version and stops support for the old one, so it's difficult to keep up with that pace.
Support engineers lack expertise, but they seem to be improving organically.
Lacks enterprise CDC capability: Change data capture (CDC) is a process that tracks and records changes made to data in a database and then delivers those changes to other systems in real time.
For enterprise-level backup & restore capability, we had to implement our model via Velero snapshot backup.
So far the product has performed as expected. We were noticing some performance issues, but they were largely Synapse related. This has led to a shift from Synapse to Databricks. Overall, this has delayed our analytics platform. Once Databricks becomes fully operational, Azure Data Factory will be critical to our environment and future success.
[Until it is] supported on AWS ECS containers, I will reserve a higher rating for SingleStore. Right now it works well on EC2 and serves our current purpose, [but] I would look forward to seeing SingleStore respond to our feature requests in a shorter time period with high quality and security.
Solutions are based around business needs, and even when implementing such a solution, real-time insights are followed through, showing the updates the business is implementing while informing end users about what is new with the technology.
SingleStore excels in real-time analytics and low-latency transactions, making it ideal for operational analytics and mixed workloads. Snowflake shines in batch analytics and data warehousing with strong scalability for large datasets. SingleStore offers faster data ingestion and query execution for real-time use cases, while Snowflake is better for complex analytical queries on historical data.
We have not had much need to engage with Microsoft on Azure Data Factory, but they have been responsive and helpful when needed. That being said, we have not had a major emergency or outage requiring their intervention. The score of seven reflects that they have done well so far but have not proven out their support for a significant issue.
Support deep dives into our most complex queries and the bizarre issues that sometimes only we hit compared to other clients, given our special workload (thousands of Kafka pipelines plus high concurrency of queries). The response matches the priority of the request; a P1 gets an immediate return call. Missing features are addressed: they become a client request and are added to the roadmap after internal consideration of all client needs and priorities. Bugs are patched quite fast, depending on the impact and feasible temporary workarounds. There has been no issue for which we haven't gotten a proper answer, resolution, or reasoning.
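For readers unfamiliar with the "thousands of Kafka pipelines" the reviewer mentions, here is a hedged sketch of a single SingleStore Pipeline that streams a Kafka topic into a table, issued through the singlestoredb Python client. The connection string, broker, topic, table, and column mappings are placeholders, and the exact pipeline syntax may need adjusting to your SingleStore version.

```python
# Hypothetical example: one SingleStore Pipeline ingesting one Kafka topic.
# All names and the connection string are placeholders, not values from the review.
import singlestoredb as s2

conn = s2.connect("admin:password@svc-example.singlestore.com:3306/analytics")

with conn.cursor() as cur:
    # Target table for the streamed events.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS events (
            id BIGINT,
            payload JSON,
            SHARD KEY (id)
        )
    """)
    # One pipeline per topic; the reviewer's workload runs thousands of these.
    cur.execute("""
        CREATE PIPELINE events_from_kafka AS
        LOAD DATA KAFKA 'kafka-broker:9092/events-topic'
        INTO TABLE events
        FORMAT JSON (id <- id, payload <- payload)
    """)
    cur.execute("START PIPELINE events_from_kafka")
```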
We allowed 2-3 months for a thorough evaluation. We saw pretty quickly that we were likely to pick SingleStore, so we ported some of our stored procedures to SingleStore in order to take a deeper look. Two SingleStore people worked closely with us to ensure that we did not have any blocking problems. It all went remarkably smoothly.
Azure Data Factory helps us automate and schedule jobs per customer demands, triggering ETL when the need arises. Anyone can define a workflow with the Azure Data Factory UI designer tool and easily test the system. It also lets us automate the same workflows with programming languages like Python or automation tools like Ansible. Numerous connectivity options, whether a database or a storage account, help us move data to the cloud or to on-premises systems.
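As a concrete illustration of the Python automation the reviewer describes, the sketch below triggers an existing ADF pipeline run and polls its status with the azure-mgmt-datafactory SDK. The subscription ID, resource group, factory, pipeline name, and parameter are placeholders; this is a minimal sketch, not the reviewer's actual setup.

```python
# Hypothetical example: trigger and monitor an ADF pipeline run from Python.
# Subscription, resource group, factory, pipeline, and parameters are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

credential = DefaultAzureCredential()
client = DataFactoryManagementClient(credential, "<subscription-id>")

# Kick off a run of an already-published pipeline, passing runtime parameters.
run = client.pipelines.create_run(
    resource_group_name="rg-analytics",
    factory_name="adf-prod",
    pipeline_name="CopySalesData",
    parameters={"loadDate": "2024-06-01"},
)
print("Started run:", run.run_id)

# Check the run's current status (Queued, InProgress, Succeeded, Failed, ...).
status = client.pipeline_runs.get("rg-analytics", "adf-prod", run.run_id)
print("Status:", status.status)
```

The same call can be wrapped in an Ansible task or a cron job, which is how schedule-driven ETL triggering like the reviewer's is often automated.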
Greenplum is good at handling very large amounts of data, but concurrency in Greenplum was a major problem. Features available in SingleStore, like Pipelines and the in-memory capabilities, are not available in Greenplum. GemFire did not scale as well as SingleStore. Support for both Greenplum and GemFire was not good; their product teams did not help us much, unlike the SingleStore team, who helped us get our first cluster started very fast.
As the overall performance and functionality were expanded, we have been able to deliver our data much faster than before, which increases the demand for data.
Metadata is available in the platform by default, such as metadata on the pipelines. The information schema also has lots of metadata, making it easy to load our assets into the data catalog.
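To show the kind of information-schema lookups that can feed a data catalog, here is a hedged sketch that reads table and pipeline metadata from SingleStore via the singlestoredb Python client. The connection string and schema name are placeholders, and the available information_schema columns may differ by SingleStore version.

```python
# Hypothetical example: pull catalog metadata from SingleStore's information_schema.
# Connection details and the 'analytics' schema name are placeholders.
import singlestoredb as s2

conn = s2.connect("admin:password@svc-example.singlestore.com:3306/analytics")

with conn.cursor() as cur:
    # Table-level metadata for the catalog (standard information_schema view).
    cur.execute("""
        SELECT table_schema, table_name, table_rows
        FROM information_schema.tables
        WHERE table_schema = 'analytics'
    """)
    for schema, name, rows in cur.fetchall():
        print(f"{schema}.{name}: ~{rows} rows")

    # Pipeline metadata is exposed the same way; column names may vary by version.
    cur.execute("SELECT pipeline_name, state FROM information_schema.pipelines")
    for name, state in cur.fetchall():
        print(f"pipeline {name}: {state}")
```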