What users are saying about
18 Ratings
trScore algorithm: Learn more at https://www.trustradius.com/static/about-trustradius-scoring
Score 7.3 out of 10
65 Ratings
Score 8.1 out of 10

Likelihood to Recommend

Apache Pig

It is a great option for database pipelining. It is highly effective for working with unstructured datasets. Also, because Apache Pig is a procedural language, unlike SQL, it is easy to learn compared to the alternatives. Still, alternatives like Apache Spark would be my recommendation because of their wide range of advanced libraries, which reduce the extra effort of writing everything from scratch.
Kartik Chavan

Apache Hive

Apache Hive shines for ad-hoc analysis and plugging into BI tools. Its SQL-like syntax makes it easy to use not only for engineers but also for data analysts. In our experience, there are probably more desirable tools to use if you are planning on integrating Hive into your processing pipeline.

Pros

  • Long logic in Java? Apache Pig is a good alternative.
  • Has a lot of great features, including table joins across many databases such as DBMS, Hive, Spark SQL, etc.
  • Faster and easier development compared to regular MapReduce jobs.
Kartik Chavan
  • Apache Hive allows us to write expressive solutions to complex problems thanks to its SQL-like syntax.
  • Relatively easy to set up and start using.
  • Very little ramp-up is needed to start using the actual product: the documentation is very thorough, there is an active community, and the code base is constantly being improved.

Cons

  • Python UDF errors are not interpretable. Developers can struggle for a very long time when they hit these errors.
  • Being in an early stage, it still has a small community to turn to for help.
  • It still needs a lot of improvement. A datetime module for time series, which is a very basic requirement, was added only recently.
Kartik Chavan
  • It's not as ACID compliant as an RDBMS. It's a recently added feature and still needs work.
  • This is not the tool to go for online data processing.
  • It does not support sub-queries.
  • It can't process data in real time.

Likelihood to Renew

No score
No answers on this topic
Apache Hive: 10.0
Based on 1 answer
I do not know of another data warehouse solution that integrates with HDFS as well as Hive does.
Yinghua Hu

Usability

Apache Pig: 10.0
Based on 1 answer
Apache Pig is quick and easy to implement, which makes it quite popular.
Subhadipto Poddar
Apache Hive: 9.0
Based on 1 answer
Hive's support for SQL-like queries improves its usability, since almost every potential user of Hive will have had experience with SQL.
Tom Thomas

Alternatives Considered

I use both Apache Pig and alternatives to it such as Apache Spark and Apache Hive. Apache Pig was one of the best options in the early days of Big Data. Since then, the alternatives have taken over the market and left Apache Pig behind. It is still a better choice than MapReduce, though, and it is a good option for working with unstructured datasets. Moreover, in certain cases, Apache Pig is much faster than Hive and Spark.
Kartik Chavan
Hive was one of the first SQL-on-Hadoop technologies, and it comes bundled with the main Hadoop distributions, HDP and CDH. Since its release it has improved considerably, but selecting the right SQL-on-Hadoop technology requires a good understanding of the strengths and weaknesses of the alternatives.
Jordan Moore

Return on Investment

  • The return on investment is significant considering what it can do compared with traditional analysis techniques. However, with alternatives like Apache Spark and Hive being more efficient, it is hard to stick with Apache Pig.
  • It can handle large datasets pretty easily compared to SQL. But, again, the alternatives are more efficient.
  • When working on unstructured, decentralized datasets, Pig is highly beneficial: it is not a complete departure from SQL, yet it does not drag you into the complexity of MapReduce either.
Kartik Chavan
  • I think productivity has increased for us, as we're now able to store data going as far back as we want.
  • Allows us to perform analytics that we wouldn't be able to do otherwise. For example, customer life cycle mapping is possible with it.
  • ROI in terms of ramp-up time for new employees who don't have a big data background. Since HQL, which is like SQL, is available, analysts with little to no big data exposure can quickly get up to speed and start working.
Sameer Gupta

Pricing Details

Apache Pig

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details

Apache Hive

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details