Apache Pig - lot to improve
April 28, 2021

Anonymous | TrustRadius Reviewer
Score 7 out of 10
Vetted Review
Verified User

Overall Satisfaction with Apache Pig

Apache Pig and its query language (Pig Latin) let us create data
pipelines with ease, and our teams use it heavily. The language is designed to reflect the way data
pipelines are built: it makes it easy to discard extraneous data, supports user-defined
functions (UDFs), and offers a lot of control over the data.
  • Data pipeline and aggregation
  • Log parsing and reporting
  • Combining MapReduce jobs
  • Pig lacks support for the advanced features that Apache Spark provides
  • Largely outdated
  • Debugging in Pig is complex
  • Handling unstructured datasets
  • Collecting, loading, and consolidating data
  • Apache Pig is a first-pass compiler and is at its best when working with a DAG
  • Doesn't support all kinds of SQL-like abstraction
  • Its DML-based scripting requires a lot of training
  • Error handling is not helpful in debugging production issues
Apache Pig might help you get started quickly, and it was one of the best tools years ago, but it lacks important features that the data engineering world needs right now. Pig also has a steeper learning curve because it uses its own language (Pig Latin), whereas Spark can be programmed in Python or Java.

Pig lets you write complex MapReduce jobs without deep knowledge of Java, Python, or Scala. Advanced features such as secondary sorting, optimization algorithms, and predicate push-down are very useful. With Apache Pig, it is easy to aggregate data at scale compared to other tools. It turns common MapReduce tasks into SQL-like queries.
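
To make that last point concrete, here is a minimal sketch in plain Python (not Pig, and not how Pig is implemented) of the map, shuffle, and reduce steps behind a simple per-URL hit count. The log lines, field layout, and function names are all hypothetical, chosen only for illustration. The comments show roughly what the same aggregation might look like as a short Pig Latin script, assuming a tab-separated input file named logs.tsv.

```python
from collections import defaultdict

# Rough Pig Latin equivalent (file name and schema are assumptions):
#   logs    = LOAD 'logs.tsv' AS (ip:chararray, url:chararray, status:int);
#   ok      = FILTER logs BY status == 200;
#   grouped = GROUP ok BY url;
#   hits    = FOREACH grouped GENERATE group AS url, COUNT(ok) AS n;

# Hypothetical tab-separated log lines: ip, url, status
LOG_LINES = [
    "10.0.0.1\t/home\t200",
    "10.0.0.2\t/home\t200",
    "10.0.0.1\t/cart\t500",
    "10.0.0.3\t/cart\t200",
]


def map_phase(lines):
    """Emit a (url, 1) pair for every successful request."""
    for line in lines:
        _ip, url, status = line.split("\t")
        if status == "200":
            yield url, 1


def shuffle_phase(pairs):
    """Group the emitted values by key, as the shuffle step would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values for each key to get a hit count per URL."""
    return {key: sum(values) for key, values in groups.items()}


def hits_per_url(lines):
    """Chain the three phases into one aggregation."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    print(hits_per_url(LOG_LINES))
```

The hand-written version needs three functions and explicit grouping logic, while the Pig Latin in the comments expresses the same intent in four statements, which is the convenience described above.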