Apache Spark vs. Oracle Autonomous Data Warehouse

Apache Spark

164 Reviews and Ratings

Oracle Autonomous Data Warehouse

242 Reviews and Ratings

Overview
Product	Rating	Most Used By	Product Summary	Starting Price
Apache Spark	Score 9.0 out of 10	N/A	Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.	N/A
Oracle Autonomous Data Warehouse	Score 8.3 out of 10	N/A	Oracle Autonomous Data Warehouse is optimized for analytic workloads, including data marts, data warehouses, data lakes, and data lakehouses. With Autonomous Data Warehouse, data scientists, business analysts, and nonexperts can discover business insights using data of any size and type. The solution is built for the cloud and optimized using Oracle Exadata.	N/A

Pricing

	Apache Spark	Oracle Autonomous Data Warehouse
Editions & Modules	No answers on this topic	No answers on this topic

Offerings

Pricing Offerings	Apache Spark	Oracle Autonomous Data Warehouse
Free Trial	No	No
Free/Freemium Version	No	No
Premium Consulting/Integration Services	No	No
Entry-level Setup Fee	No setup fee	No setup fee
Additional Details	-	-

Best Alternatives
	Apache Spark	Oracle Autonomous Data Warehouse
Small Businesses	No answers on this topic	Google BigQuery Score 8.8 out of 10
Medium-sized Companies	Cloudera Manager Score 9.9 out of 10	Cloudera Enterprise Data Hub Score 9.0 out of 10
Enterprises	IBM Analytics Engine Score 7.2 out of 10	Oracle Exadata Score 9.8 out of 10

User Ratings
	Apache Spark	Oracle Autonomous Data Warehouse
Likelihood to Recommend	9.0 (24 ratings)	8.9 (32 ratings)
Likelihood to Renew	10.0 (1 rating)	8.0 (1 rating)
Usability	8.0 (4 ratings)	- (0 ratings)
Support Rating	8.7 (4 ratings)	- (0 ratings)
Implementation Rating	- (0 ratings)	9.0 (1 rating)

User Testimonials
	Apache Spark	Oracle Autonomous Data Warehouse
Likelihood to Recommend	Well suited: most local runs of datasets and non-prod systems, where scalability is not a problem at all. Including data from multiple types of data sources is an added advantage. MLlib is a decent built-in library that can be used for most ML tasks. Less appropriate: we had to work on a RecSys where the music dataset was around 300+ GB in size, and we faced memory-based issues and occasionally got memory errors. Also, MLlib does not support advanced analytics or deep-learning frameworks. Understanding the internals of Apache Spark is very difficult for beginners. (Ananth Gouri, Assistant Professor; incentivized review)	I would recommend Oracle Autonomous Data Warehouse to someone looking to fully automate the transfer of data, especially in a warehouse scenario, though I can see the elasticity of the suite on offer and that it is applicable in other scenarios, not just warehouses. (Verified User, Anonymous; incentivized review)
Pros	Rich APIs for data transformation, making it very easy to transform and prepare data in a distributed environment without worrying about memory issues. Faster execution times compared to Hadoop and Pig Latin. Easy SQL interface to the same data set for people who are comfortable exploring data in a declarative manner. Interoperability between SQL and Scala/Python styles of munging data. (Nitin Pasumarthy, Software Engineer; incentivized review)	Very easy and fast to load data into the Oracle Autonomous Data Warehouse. Exceptionally fast retrieval of data when joining a 100-million-row table with a billion-row table; in addition, the size of the database was reduced by a factor of 10 due to how Oracle stores and organises data and indexes. Flexibility to scale CPU up and down on the fly when needed, and to stop it when not needed so you don't get charged when it is not running. It is always patched and always available, and you can add storage dynamically as you need it. (Erik Dvergsnes, Senior Database Administrator)
Cons	Memory management is very weak. PySpark is not as robust as Scala with Spark. Spark master HA is needed; it is not as highly available as it should be. Locality should not be a necessity, though it does help; I would prefer no locality requirement. (Anson Abraham, Data Czar; incentivized review)	It is a very expensive product, though there are good reasons why it is expensive. The product should support more cloud-based services. When we made the decision to buy the product (which was 20 years ago), there was no such thing to consider, but moving to a cloud-based data warehouse may promise more scalability, agility, and cost reduction. A new version of Data Warehouse came out along the way, but it looks a bit behind compared to competitors. Our healthcare data consists of 30% coded data (such as ICD-10 / SNOMED CT), but the rest is narrative (such as clinical notes). Oracle is the best for warehousing standardized data, but not a good choice for unstructured data or a mix of the two. (Verified User, Anonymous; incentivized review)
Likelihood to Renew	Capacity for computing data in a cluster, and fast speed. (Steven Li, Senior Software Developer, Consultant)	Because it is really simple to provision and configure. It does not require continuous attention from the DBA; autonomous features allow the database to perform most regular admin tasks without the need for human intervention. It allows you to integrate multiple data sources into a central data warehouse and exploit the stored information with different analytic and reporting tools. (Lisandro Fernigrini, Database Practice Lead; incentivized review)
Usability	If the team looking to use Apache Spark is not used to debugging and tweaking job settings to ensure maximum optimization, it can be frustrating. However, the documentation and community support on the internet can help resolve most issues. Moreover, it is highly configurable and integrates with different tools (e.g., it can be used by dbt Core), which increases the number of scenarios where it can be used. (Verified User, Anonymous; incentivized review)	No answers on this topic
Support Rating	1. It integrates very well with Scala or Python. 2. Its SQL interoperability is very easy to understand. 3. Apache is way faster than other competing technologies. 4. The support from the Apache community for Spark is huge. 5. Execution times are faster compared to others. 6. There are a large number of forums available for Apache Spark. 7. Code for Apache Spark is readily available and easy to access. 8. Many organizations use Apache Spark, so many solutions are available for existing applications. (Yogesh Mhasde, Technical Manager)	No answers on this topic
Implementation Rating	No answers on this topic	Understanding Oracle Cloud Infrastructure is really simple, and Autonomous databases even more so. Choosing between shared and dedicated infrastructure is one of the few things you need to consider when you start provisioning your Oracle Autonomous Data Warehouse. (Lisandro Fernigrini, Database Practice Lead; incentivized review)
Alternatives Considered	Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing. (Verified User, Anonymous; incentivized review)	As I mentioned, I have also worked with Amazon Redshift, but it is not as versatile as Oracle Autonomous Data Warehouse and does not provide as large a variety of products. Oracle Autonomous Data Warehouse is also more reliable than Amazon Redshift, which is why I chose it. (Prashast Vaish, Decision Scientist; incentivized review)
Return on Investment	Business leaders are able to make data-driven decisions. Business users are now able to access data in near real time; before using Spark, they had to wait at least 24 hours for data to be available. The business is able to come up with new product ideas. (Surendranatha Reddy Chappidi, Senior Data Engineer; incentivized review)	Overall, the business objectives of all of our clients have been met with Oracle Data Warehouse. Users were able to successfully carry out all of the required analysis using the warehouse data. Using a 3-tier architecture with the Oracle Data Warehouse at the back end, the mid-tier has been integrated well. This is a big plus in providing the necessary tools for end users of the data warehouse to carry out their analysis. All of the various BI products (OBIEE, Cognos, etc.) are able to use and exploit the built-in analytic functionality of the Oracle Data Warehouse. (Suresh Muddaveerappa, Senior DBA and Architect; incentivized review)