Apache Pig - a lot to improve
April 28, 2021
Score 7 out of 10
Vetted Review
Verified User
Overall Satisfaction with Apache Pig
Apache Pig and its query language (Pig Latin) allowed us to create data
pipelines with ease, and they are heavily used by our teams. The language is
designed to reflect the way data pipelines are designed, so it discards
extraneous data, supports user-defined functions (UDFs), and offers a lot of
control over the data flow.
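To give a rough feel for that pipeline shape, here is a minimal Python sketch (not Pig Latin itself) of the load, filter, project, and group steps that a typical Pig script walks through. The sample records, the field names, and the `to_kilobytes` helper are made up for illustration and are not taken from our actual pipelines.

```python
from collections import defaultdict

# Hypothetical log records; in a Pig script this would come from a LOAD statement.
RECORDS = [
    {"user": "alice", "status": 200, "bytes": 512},
    {"user": "bob", "status": 500, "bytes": 0},
    {"user": "alice", "status": 200, "bytes": 2048},
    {"user": "carol", "status": 200, "bytes": 128},
]


def to_kilobytes(num_bytes):
    """Stand-in for a user-defined function (UDF)."""
    return num_bytes / 1024


def run_pipeline(records):
    # FILTER step: keep only successful requests.
    ok = (r for r in records if r["status"] == 200)
    # FOREACH ... GENERATE step: keep only the needed fields and apply the UDF,
    # which is how extraneous data gets discarded early.
    projected = ((r["user"], to_kilobytes(r["bytes"])) for r in ok)
    # GROUP ... SUM step: aggregate kilobytes per user.
    totals = defaultdict(float)
    for user, kb in projected:
        totals[user] += kb
    return dict(totals)


if __name__ == "__main__":
    print(run_pipeline(RECORDS))
```

Each stage only passes along what the next one needs, which is roughly the kind of step-by-step control over the data flow that Pig Latin gives you.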
- Data pipelines and aggregation
- Log parsing and reporting
- Combining MapReduce jobs
- Pig lacks the advanced features that Apache Spark provides
- Quite outdated
- Debugging in Pig is complex
- Handling unstructured datasets
- Performing the tasks of collecting, loading, and consolidating data
- Apache Pig is a single-pass compiler that turns a script into a DAG (directed acyclic graph) of jobs, which is where it is at its best.
- Doesn't support all kinds of SQL-like abstractions
- Its DML-based scripting requires a lot of training
- Error handling is not helpful in debugging production issues
Apache Pig might help to get things started faster at first, and it was one of the best tools years ago, but it lacks important features that are needed in the data engineering world right now. Pig also has a steeper learning curve, since it uses its own language (Pig Latin), compared to Spark, which can be coded in Python or Java.
Do you think Apache Pig delivers good value for the price?
Yes
Are you happy with Apache Pig's feature set?
No
Did Apache Pig live up to sales and marketing promises?
No
Did implementation of Apache Pig go as expected?
No
Would you buy Apache Pig again?
No