Apache Spark vs. Presto
Product | Rating | Most Used By | Product Summary | Starting Price |
---|---|---|---|---|
Apache Spark | N/A | N/A | N/A | |
Presto | N/A | | Presto is an open-source SQL query engine designed to run queries on data stored in Hadoop or in traditional databases. Teradata's support for Presto development followed its acquisition of Hadapt and Revelytix. | N/A |
Pricing | Apache Spark | Presto |
---|---|---|
Editions & Modules | No answers on this topic | No answers on this topic |
Entry-level Setup Fee | No setup fee | No setup fee |
Additional Details | - | - |
Highlights

Research Team Insight

Apache Spark and Presto are open-source distributed data processing engines. Both are built for ‘big data’ applications and help analysts and data engineers query large amounts of data quickly. Although they have many similarities, Presto focuses on SQL query jobs, while Apache Spark is designed for applications that require more computational analysis, such as machine learning.

Both Apache Spark and Presto are used mostly by large enterprises, with a significant mid-sized company user base as well. Since both engines are designed for big data processing, they’re often overkill for smaller businesses.

Features

Although Apache Spark and Presto are used for similar applications, each has features that set it apart.

Apache Spark is designed for fast data processing in a variety of contexts, including machine learning, ETL, and ad-hoc querying. It uses an in-memory processing design, so it needs very few disk read/write operations and can process enormous datasets quickly. Developers report that its SQL interface and object-oriented design make it easy to understand and write code for. Users also appreciate its wide variety of APIs for ETL procedures and cluster management. Apache Spark has a large support community and wide industry adoption, so solutions to common problems are easy to find online.

Presto is optimized specifically for SQL, meaning it can exceed Apache Spark’s speed for SQL queries. It queries data in place, without copying or moving it. Presto also uses a flexible, plug-and-play architecture that makes it easy to combine and simultaneously query data from multiple sources, including both SQL and NoSQL databases. It’s suitable for ad-hoc querying, batch ETL jobs, and data analysis for A/B testing.

Limitations

Before adopting Apache Spark or Presto, consider the limitations of each engine.

Apache Spark’s in-memory processing is fast, but it requires plenty of memory, which can quickly get expensive. Some users found that Apache Spark isn’t ideal for real-time analytics, while others found its data security capabilities lacking. Some users also report that its automatic optimization and caching fell short of their needs, so they built that functionality themselves. Finally, Apache Spark may be designed intuitively, but it’s still a complicated tool with a steep learning curve.

Presto’s SQL optimization is also its primary limitation. It’s designed primarily to run SQL queries, while Apache Spark is suitable for a wider range of applications. This also means that Presto is at its best when the data it’s querying is already in SQL databases; although Presto can query and join data from multiple database types, you only get the highest speeds with SQL data. Additionally, Presto requires a lot of setup to run properly, with installation and configuration across many different nodes.

Pricing

Both Apache Spark and Presto are open-source and free.
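To make the idea of querying several sources with one SQL statement more concrete, here is a minimal sketch. It does not use Presto itself: it assumes two independent SQLite databases as stand-ins for two data sources and uses SQLite's `ATTACH DATABASE` to join across them in place. The table names, the `crm` alias, and the `revenue_by_customer` helper are hypothetical and chosen only for illustration.

```python
import sqlite3

# Stand-ins for two independent data sources. In Presto these would be
# separate sources reached through its plug-in architecture; here they are
# simply two separate in-memory SQLite databases (an assumption for the sketch).
conn = sqlite3.connect(":memory:")                  # first source ("main")
conn.execute("ATTACH DATABASE ':memory:' AS crm")   # second, separate source

conn.execute("CREATE TABLE main.orders (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm.customers (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO main.orders VALUES (?, ?)",
    [(1, 20.0), (1, 5.0), (2, 12.5)],
)
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)


def revenue_by_customer(connection):
    """Join rows from both sources in one query, without copying tables between them."""
    query = """
        SELECT c.name, SUM(o.amount) AS total
        FROM main.orders AS o
        JOIN crm.customers AS c ON c.id = o.customer_id
        GROUP BY c.name
        ORDER BY c.name
    """
    return connection.execute(query).fetchall()


result = revenue_by_customer(conn)
print(result)
```

The point of the sketch is the shape of the query: each table is addressed by the source it lives in, and the join happens at query time rather than after an export step. A real Presto deployment would add cluster setup and per-source configuration that this example leaves out.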
Alternatives by Company Size | Apache Spark | Presto |
---|---|---|
Small Businesses | No answers on this topic | SingleStore Score 9.8 out of 10 |
Medium-sized Companies | Cloudera Manager Score 9.7 out of 10 | SingleStore Score 9.8 out of 10 |
Enterprises | IBM Analytics Engine Score 8.8 out of 10 | SingleStore Score 9.8 out of 10 |
Rating | Apache Spark | Presto |
---|---|---|
Likelihood to Recommend | 9.9 (24 ratings) | 7.8 (2 ratings) |
Likelihood to Renew | 10.0 (1 rating) | - (0 ratings) |
Usability | 10.0 (3 ratings) | - (0 ratings) |
Support Rating | 8.7 (4 ratings) | - (0 ratings) |