Vertica's Strengths and Weakness
Praveen Murugesan | TrustRadius Reviewer
November 04, 2016

Score 7 out of 10

Overall Satisfaction with Vertica

Vertica is used by Uber for data analytics use cases. We have a Vertica-based data mart (a subset of business data) for analytics insight and data science across the entire organization. We use it as a complementary solution to Hadoop. We initially started out with Vertica, which worked for our needs, but over the last couple of years we have started leveraging Hadoop in addition to Vertica to help our data efforts at high scale.
  • Extremely fast query performance - Vertica is one of the fastest query engines out there.
  • Scales to TBs - Scales reasonably well up to 10-20 nodes and 10s to 100s of TB of data.
  • Easy to Use - Fairly easy to use; we made quite a bit of headway with just 1 person running it for a while.
  • Petabyte-Scale Data - Vertica just cannot deal with this; it starts to crumble beyond 100s of TB of data.
  • Concurrent Usage - Vertica starts to have significant backpressure as your concurrent users grow quickly. We had trouble scaling past 20-30 users and had to invent our own queuing strategies.
  • Vertical Stack - The storage and compute tiers are in one stack, which doesn't help the cause of scaling. Other systems leverage the advantage of storage and compute being different tiers (e.g., HDFS + Presto).
  • We've been using Vertica to derive a lot of valuable ad-hoc human insights.
  • In the past we used Vertica to run periodic batch jobs that generate production results; we have since moved such use cases to Hadoop.
  • We had a couple of big outages because Vertica was unable to keep up with the load of queries and data (these were mitigated by leveraging Hadoop).
Vertica is great for small, low-complexity queries and has better query performance than the other technologies that I have worked with.
Vertica falls short of Hive with respect to scalability and resource isolation, since Hive exploits Hadoop's resource isolation.
Presto is almost comparable to Vertica (Vertica is slightly faster). However, Presto can scale better.

For someone just starting out with data analytics and warehousing, Vertica is a great tool for a small-scale business. It has amazing performance and can scale up to TBs of data. It works well for any organization that has about 100-500 DAUs of the system. The system doesn't require a lot of ops overhead.

Scaling to PBs of data and 1000s of DAUs is Vertica's weak point. The system is just not designed for large-scale usage and still has a long way to go to improve scalability. Experiments to run the Vertica query engine on top of HDFS seem promising; however, if you already have the Hadoop ecosystem, you are better off going the HDFS + Presto/Impala/SparkSQL route. Then again, if you are in the Hadoop ecosystem, you are probably already investing a lot in ops.