Apache Spark vs. Microsoft Fabric

Overview
Product	Rating	Most Used By	Product Summary	Starting Price
Apache Spark	Score 8.9 out of 10	N/A	Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.	N/A
Microsoft Fabric	Score 8.4 out of 10	N/A	Microsoft Fabric: A Comprehensive Data Management Solution. Microsoft Fabric is a unified platform designed to optimize data management, enhance AI model development, and empower users across an organization. It focuses on integrating data seamlessly, ensuring governance and security, and providing AI capabilities. It is presented as an all-encompassing data management solution, providing organizations with tools for efficient data integration,…	N/A

Pricing

Apache Spark

Microsoft Fabric

Editions & Modules

No answers on this topic

Offerings

Pricing Offerings
	Apache Spark	Microsoft Fabric
Free Trial	No	Yes
Free/Freemium Version	No	No
Premium Consulting/Integration Services	No	No

Entry-level Setup Fee

No setup fee

Additional Details

—

Use Microsoft Fabric by purchasing Fabric Capacity, a billing unit that enables each Fabric experience. Pay for every data tool in one transparent, simplified pricing model and save time for other business needs. Fabric Capacity is priced uniquely across regions.

Community Pulse
Considered Both Products

Apache Spark

Ananth Gouri, Assistant Professor (chose Apache Spark; incentivized): We used Surprise Kit for one of our other research projects. It is more fine-tuned to recommendation systems and their algorithms. Apache Spark has MLlib for the majority of ML problems, whereas software like Surprise Kit is suitable only for the specific task of recommendations.

Riyaz Khan, Staff Engineer (chose Apache Spark; incentivized): Apache Spark is a fast in-memory computing framework. It is 10 times faster than Apache Hadoop. Earlier we used Apache Hadoop to process data on disk, but we have now shifted to Apache Spark because of its in-memory computation capability. Also in SAP …

Steven Li, Senior Software Developer (Consultant) (chose Apache Spark; incentivized): Other teams used to work on Apache Hadoop, but our team started with Apache Spark directly.

Verified User, Anonymous (chose Apache Spark; incentivized): A few alternatives can do the same transformations and aggregations as Apache Spark, but most of them cannot perform parallel computation. For example, pandas is a really good tool for this but is not parallelized. However, there are some tools that …

Surendranatha Reddy Chappidi, Senior Data Engineer (chose Apache Spark; incentivized): Apache Spark works in distributed mode on a cluster, whereas Informatica and Datastage cannot scale horizontally. We can write custom code in Spark, whereas in Datastage and Informatica we can only choose from the features already provided.

Verified User, Anonymous (chose Apache Spark; incentivized): Apache Spark has much better performance and features than Hive or MapReduce-style solutions. Spark also has many other features for machine learning and streaming.

Chetan Munegowda, Software Engineer (chose Apache Spark; incentivized): Spark is simply awesome to work with on any data set and also has an in-memory database, which makes it very flexible.

Yogesh Mhasde, Technical Manager (chose Apache Spark; incentivized): 1. Apache Spark is almost 100% faster than Hadoop. 2. Apache Spark is more stable than Amazon EMR. 3. The end-to-end distributed machine learning library is more robust in Apache Spark.

Verified User, Anonymous (chose Apache Spark; incentivized): Databricks uses Spark as a foundation and is also a great platform. It brings several add-ons, which we did not feel we needed when we evaluated it and haven't needed since. One interesting plus in our opinion was the engineering support, which is great depending …

Verified User, Anonymous (chose Apache Spark; incentivized): It is easy to learn, read, and maintain. It brings the best of the Ruby on Rails framework from Java that helps to create a web service so easily. Communication is one of the most distinctive features of Apache Spark compared to alternative products. You are able to …

Shiv Shivakumar, Acquisitions Leader (chose Apache Spark; incentivized): We evaluated SAS alongside Apache Spark, but during the proof of concept we found that Apache Spark supported the Hadoop ecosystem and the Hadoop file system much better. It was much faster at that time while having the ability to process data quickly for the …

Carla Borges, Consultor Tecnico - Java Developer and Php Developer (chose Apache Spark; incentivized): I prefer Apache Spark to Hadoop, since in my experience Spark is more usable and comes equipped with simple APIs for Scala, Python, Java, and Spark SQL, and it provides feedback on commands in REPL format. At the same time, Apache Spark seems to have the …

Nitin Pasumarthy, Software Engineer (chose Apache Spark; incentivized): All the above systems work quite well on big data transformations, whereas Spark really shines with its broader API support and its ability to read from and write to multiple data sources. Using Spark one can easily switch between declarative versus imperative versus functional …

Kartik Chavan, Data Analyst (chose Apache Spark; incentivized): Even with Python, MapReduce requires lengthy code. Combining Python with Apache Spark not only shortens the code but also increases the speed of algorithms. Occasionally I use MapReduce, but Apache Spark will replace MapReduce very soon. It has many …

Anson Abraham, Data Czar (chose Apache Spark; incentivized): Compared with MapReduce, it was faster and easier to manage, especially for machine learning, where MapReduce is lacking. Apache Storm was also slower and didn't scale as well as Spark does. Spark's elasticity was easier to apply than that of Storm and MapReduce. managing resources for …

Verified User, Anonymous (chose Apache Spark; incentivized): We specifically chose Spark over MapReduce to make cluster processing faster.

Verified User, Anonymous (chose Apache Spark; incentivized): Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and …

Kamesh Emani, Software Developer Intern (chose Apache Spark; incentivized): Apache Pig and Apache Hive provide most of what Spark provides, but Apache Spark has more features, such as actions and transformations, which are easy to code. Spark uses optimization techniques, as we can select the driver program and manipulate the DAG (Directed Acyclic Graph) Python …

Verified User, Anonymous (chose Apache Spark; incentivized): There are a few newer frameworks for general processing like Flink and Beam, frameworks for streaming like Samza and Storm, and traditional MapReduce. I think Spark is at a sweet spot where it's clearly better than MapReduce for many workflows yet has gotten a good amount of …

Jordan Moore, Staff Consultant (chose Apache Spark; incentivized): Spark has largely replaced my writing of pure Hadoop MapReduce or Apache Pig jobs for processing data. I like the fact that I can alternate between the main programming languages that I know, Java and Python, and use those to learn the Scala API. Spark also can be …

Microsoft Fabric

Verified User, Anonymous (chose Microsoft Fabric; incentivized): Microsoft Fabric integrates data ingestion, engineering, warehousing, and Power BI visualization into one cohesive environment. This "one-stop shop" approach dramatically reduces complexity, minimizes operational overhead, and eliminates the need to integrate disparate tools …

Best Alternatives
	Apache Spark	Microsoft Fabric
Small Businesses	No answers on this topic	No answers on this topic
Medium-sized Companies	Cloudera Manager Score 9.9 out of 10	Snowflake Score 8.7 out of 10
Enterprises	IBM Analytics Engine Score 8.6 out of 10	Snowflake Score 8.7 out of 10

User Ratings
	Apache Spark	Microsoft Fabric
Likelihood to Recommend	9.0 (0 ratings)	8.0 (0 ratings)
Likelihood to Renew	10.0 (0 ratings)	- (0 ratings)
Usability	8.0 (0 ratings)	4.0 (0 ratings)
Support Rating	8.7 (0 ratings)	- (0 ratings)

User Testimonials
	Apache Spark	Microsoft Fabric
Likelihood to Recommend	Apache Spark has rich APIs for regular data transformations, ML workloads, and graph workloads, whereas other systems may not offer such a wide range of support. Choose it when you need to perform big data transformations as offline jobs, and use MongoDB-like distributed database systems for more real-time queries. (Nitin Pasumarthy, Software Engineer; incentivized)	I would highly recommend Microsoft Fabric, especially for medium to large enterprises aiming to build a robust, scalable, and secure data analytics platform. It effectively unifies various data workloads, streamlining data integration and engineering, and particularly enhancing our ability to create and share reliable Power BI dashboards. The deep integration with Azure AD for features like Row-Level Security is a significant advantage for data governance. (Verified User, Anonymous; incentivized)
Pros	1. It performs conventional disk-based processing when data sets are too large to fit into memory, which is very useful because data can always be stored regardless of its size. 2. It has great speed and the ability to join multiple types of databases and run different types of analysis applications, which is very useful because it reduces work times. 3. Apache Spark uses the data storage model of Hadoop and can be integrated with other big data frameworks such as HBase, MongoDB, and Cassandra. This is very useful because it is compatible with the multiple frameworks the company has and thus allows us to unify all our processes. (Carla Borges, Consultor Tecnico - Java Developer and Php Developer; incentivized)	1. Access Control / Security & Governance. 2. Collaborative Work up to Visualization. 3. Data Ingestion. (Verified User, Anonymous; incentivized)
Cons	1. Memory management is very weak. 2. PySpark is not as robust as Spark with Scala. 3. Spark master HA is needed; it is not as highly available as it should be. 4. Locality should not be a necessity, although it does improve performance; we would prefer no locality requirement. (Anson Abraham, Data Czar; incentivized)	1. Clarity and Documentation on Power BI Integration Changes and Best Practices. 2. More Flexible and Advanced Scheduled Report Delivery Options. 3. Enhanced Data Export Control Based on Sensitivity Labels. (Verified User, Anonymous; incentivized)
Likelihood to Renew	Its capacity for computing data in a cluster and its speed. (Steven Li, Senior Software Developer (Consultant))	No answers on this topic
Usability	If the team looking to use Apache Spark is not used to debugging and tweaking job settings to ensure maximum optimization, it can be frustrating. However, the documentation and the online community support can help resolve most issues. Moreover, it is highly configurable and integrates with different tools (e.g., it can be used by dbt Core), which increases the scenarios where it can be used. (Verified User, Anonymous; incentivized)	I've rated Microsoft Fabric's overall usability as a 4, primarily due to its extensive and multifaceted feature set, which can make it challenging to navigate and determine the optimal functionality for a given task. While the breadth of capabilities is a core strength for large enterprises, it often leads to a sense of being "lost" or overwhelmed for teams like ours that do not have highly formalized roles or dedicated specialists for each Fabric "experience" (e.g., Data Engineering, Data Warehousing, Data Science). (Verified User, Anonymous; incentivized)
Support Rating	1. It integrates very well with Scala or Python. 2. Its SQL interoperability is very easy to understand. 3. Apache Spark is far faster than competing technologies. 4. The support from the Apache community for Spark is huge. 5. Execution times are faster compared to others. 6. There are a large number of forums available for Apache Spark. 7. Code for Apache Spark is simple and easy to access. 8. Many organizations use Apache Spark, so many solutions are available for existing applications. (Yogesh Mhasde, Technical Manager)	No answers on this topic
Alternatives Considered	We used Surprise Kit for one of our other research projects. It is more fine-tuned to recommendation systems and their algorithms. Apache Spark has MLlib for the majority of ML problems, whereas software like Surprise Kit is suitable only for the specific task of recommendations. (Ananth Gouri, Assistant Professor; incentivized)	Microsoft Fabric integrates data ingestion, engineering, warehousing, and Power BI visualization into one cohesive environment. This "one-stop shop" approach dramatically reduces complexity, minimizes operational overhead, and eliminates the need to integrate disparate tools and manage data across multiple systems. It provides superior scalability for large datasets, supports open data formats, and offers a much broader suite of data engineering and data science capabilities. In essence, Fabric's integrated ecosystem and streamlined operational management were key differentiators, providing a more cohesive, scalable, and efficient solution for our evolving data strategy than combining specialized tools. (Verified User, Anonymous; incentivized)
Return on Investment	1. Faster turnaround on feature development: we have seen a noticeable improvement in our agile development since using Spark. 2. Easy adoption: having multiple departments use the same underlying technology, even when the use cases are very different, allows for more commonality among applications, which definitely makes the operations team happy. 3. Performance: we have been able to make some applications run over 20x faster since switching to Spark. This has saved us time, headaches, and operating costs. (Verified User, Anonymous; incentivized)	1. Transformation from Monthly Reports to Daily Insights, Accelerating Decision-Making. 2. Streamlined Reporting: Enhanced Data Consistency and 30% Reduction in Reports Provided. 3. Empowered Self-Service Analytics Through Elevated Employee Data Literacy. (Verified User, Anonymous; incentivized)