Databricks offers the Databricks Lakehouse Platform (formerly the Unified Analytics Platform), a data science platform and Apache Spark cluster manager. The Databricks Unified Data Service provides a platform for data pipelines, data lakes, and data platforms.
$0.07
Per DBU
IBM watsonx.data
Score 8.7 out of 10
N/A
Watsonx.data is presented as an open, hybrid and governed data store that makes it possible for enterprises to scale analytics and AI with a fit-for-purpose data store, built on an open lakehouse architecture, supported by querying, governance and open data formats to access and share data.
N/A
Informatica MDM & 360 Applications
Score 5.4 out of 10
N/A
Informatica MDM is an enterprise master data management solution that competes directly with IBM's InfoSphere and Oracle's Siebel UCM product. Informatica MDM and the company's 360 applications present a multidomain solution with the flexibility to support any master data domain and relationship, whether on-premises, in the cloud, or both.
I cannot say exactly why it was chosen; the business preferred IBM watsonx.data, which suits me too, since it gives me a chance to learn it. I cannot compare this tool with others because it has unique features that Alteryx, Amazon, and Azure don't have. So far the tool is working well for us.
Medium to large data throughput shops will benefit the most from Databricks Spark processing. Smaller shops may find the barrier to entry a bit too high for casual use. The overhead of kicking off a Spark compute job can actually make small workloads take longer, but past a certain point the performance returns cannot be beat.
Real-time transaction processing (both reads and writes) is where DataStax Enterprise shines. It's very fast with linear scalability should more resources be needed. Additional nodes are added very easily. DataStax Enterprise on its own (without Solr or Spark enabled) isn't well suited for long complicated reports. The data model doesn't support joining multiple tables together which is common in BI reporting.
It is a robust software with great management and data model. It can be difficult to learn how to use and deploy the first time but the mapping and features work very well and have optimized our productivity and cost savings. We can customize reports, create rules and integrate with other applications. Overall I recommend it.
DataStax Cassandra provides high availability and good performance for a database. It is built on top of open source Apache Cassandra, so you can always get some understanding of how it works internally and why.
DataStax Cassandra is fairly simple to start using; you can install and set up your cluster and be productive within a day.
DataStax Cassandra provides a lot of good, detailed documentation, and when starting out, the free videos on the DataStax site are very helpful.
The DataStax Enterprise edition of Cassandra provides more tools, good support, and a quick-response SLA for enterprise business support.
This program raises us to a professional level, giving us more versatility to control every medium we work with and to respond correctly in each scenario.
It is essential to be right about the destination and development of my data; Informatica MDM simplifies all of these processes for its users.
Integration complexity with security tools: while watsonx.data is well suited to native tools, integration with third-party security tools requires custom connectors or manual ETL pipelines, which increases setup time.
Unfortunately, this program has a couple of limitations when it comes to insertions; it cannot aggregate and archive data in real time by group.
In terms of automation, the program is limited to performing one task at a time, whereas other systems can run several functions simultaneously.
As an open source technology Cassandra can be readily used with or without any commercial support. DataStax provides value-added services and features, and in the end it is up to individual situations to strike a balance between the desirability of such support/service versus the associated cost.
It is an amazing platform for designing experiments and delivering deep-dive analyses that require highly complex queries. It also lets us share information and insights across the company through shared workspaces while keeping them secure.
In terms of graph generation and interaction, the UI and UX could be improved.
DataStax has a good community built around it and has amazing scalability options. Though the initial setup is a bit costly, in the long run, it makes up for it. It also has powerful monitoring tools and a clean UI.
Some of the best customer and technology support that I have ever experienced in my career. You get what you pay for, and you get the Rolls Royce. It reminds me of SAS's customer support in the 2000s, when the tools were reaching some limits and their engineers wanted to know more about what we were doing, long before "data science" was even a name. Databricks truly embraces the partnership with its customers and helps them with any given challenge.
We have had a few situations where we caused an outage or something has gone wrong and we are able to get a support person to offer live help within minutes. The escalation process is excellent - the best I've seen - and the support team is incredibly strong. Outside of emergencies, the team is very helpful with general questions and working through data model exercises and the subscription I believe still comes with some hours to help get the data model reviewed.
I'm not sure since I never used support. My colleagues never had any issues with it, therefore my rating would be an 8 with a certain range of uncertainty.
The most important differentiating factor for Databricks Lakehouse Platform from these other platforms is support for ACID transactions and the time travel feature. Also, native integration with managed MLflow is a plus. EMR, Cloudera, and Hortonworks are not as optimized when it comes to Spark Job Execution. Other platforms need to be self-managed, which is another huge hassle.
Pinecone and IBM watsonx.data (Milvus in our case) both work great as fully managed, cloud-based vector databases. We selected IBM watsonx.data because it integrates well with watsonx.ai and is a little more beginner friendly than Pinecone, but I think both are great.
Informatica MDM has proven its worth in the organization by driving revenue growth. It saves us a lot of time by filtering out duplicate values and helps solve critical business problems. It is very helpful when we deal with a lot of data. Apart from this, we can populate data into various third-party integrations, which is the most useful case.
I cannot speak to this for two reasons. 1. I am not privy to the financials associated with this implementation or the previous one. 2. We have not hit our 'go-live' for this implementation yet, so we cannot compare its performance to our previous solution.