Apache Pig vs. Hortonworks Data Platform

Apache Pig

22 Reviews and Ratings

Hortonworks Data Platform

37 Reviews and Ratings

Overview
Product	Rating	Most Used By	Product Summary	Starting Price
Apache Pig	Score 8.4 out of 10	N/A	Apache Pig is a programming tool for creating MapReduce programs used in Hadoop.	$0
Hortonworks Data Platform	Score 5.0 out of 10	N/A	Hortonworks Data Platform (HDP) is an open source framework for distributed storage and processing of large, multi-source data sets. HDP modernizes IT infrastructure and keeps data secure, in the cloud or on-premises, while helping to drive new revenue streams, improve customer experience, and control costs. Hortonworks merged with Cloudera in early 2019.	N/A

Pricing

	Apache Pig	Hortonworks Data Platform
Editions & Modules	No answers on this topic	No answers on this topic

Offerings

Pricing Offerings	Apache Pig	Hortonworks Data Platform
Free Trial	No	No
Free/Freemium Version	Yes	No
Premium Consulting/Integration Services	No	No
Entry-level Setup Fee	No setup fee	No setup fee
Additional Details	-	-

Community Pulse
	Apache Pig	Hortonworks Data Platform
Considered Both Products	Chose Apache Pig: Early on, Apache Pig was a great tool for easily writing distributed processing applications without needing to write a complete Java MapReduce job from scratch, but as time has moved on there are now better alternatives to get results faster for both ad-hoc analysis and for … (Incentivized review by a verified user, Engineer)	No answer on this topic

Best Alternatives
	Apache Pig	Hortonworks Data Platform
Small Businesses	No answers on this topic	No answers on this topic
Medium-sized Companies	Cloudera Manager Score 9.9 out of 10	Cloudera Manager Score 9.9 out of 10
Enterprises	IBM Analytics Engine Score 7.1 out of 10	IBM Analytics Engine Score 7.1 out of 10

User Ratings
	Apache Pig	Hortonworks Data Platform
Likelihood to Recommend	8.2 (9 ratings)	7.0 (9 ratings)
Usability	10.0 (1 rating)	- (0 ratings)
Support Rating	6.0 (1 rating)	- (0 ratings)
Implementation Rating	- (0 ratings)	9.0 (1 rating)

User Testimonials
	Apache Pig	Hortonworks Data Platform
Likelihood to Recommend	Apache Pig is best suited for ETL-based data processes. It performs well when handling and analyzing large amounts of data, and it gives faster results than any other similar tool. It is easy to implement, and any user with some initial training or prior SQL knowledge can work with it. Apache Pig has a large global community. (Incentivized review by an anonymous verified user)	I find HDP easy to use, and it solves most of the problems for people looking to manage their big data. Evaluating the Hortonworks Data Platform is easy, as it is free to download and install in your cluster. The single-node cluster available as Sandbox is also easy to use for POCs. (Incentivized review by Piyush Routray, Senior Software Developer)
Pros	Its performance, ease of use, and simplicity in learning and deployment. Using this tool, we can quickly analyze large amounts of data. It's adequate for map-reducing large datasets and fully abstracted MapReduce. (Incentivized review by Sourov K Chowdhury, Database Software Engineer)	It does a good job of packaging a lot of big data components into bundles and lets you use the ones you are interested in or need. It supports an extensive list of components, which lets us solve many problems. It provides the ability to manage installations and maintenance using Apache Ambari. It helps us use management packs to install and upgrade components easily. It also helps us add and remove components, add and remove hosts, and perform upgrades in a convenient manner. It also provides alerts and notifications and monitors the environment. What they excel in is packaging open source components that are relevant, useful, and complement each other, as well as contributing to enhancing those components. They do a great job in the community of keeping on top of what would be useful to users, fixing bugs, and working with other companies and individuals to make the platform better. (Incentivized review by an anonymous verified user)
Cons	Python UDF errors are not interpretable. Developers struggle for a very long time when they get these errors. Being at an early stage, it still has a small community for help with related matters. It still needs a lot of improvements. Only recently did they add a datetime module for time series, which is a very basic requirement. (Incentivized review by Kartik Chavan, Data Analyst)	Since it doesn't come with proprietary tools for big data management, additional integration is needed (for query handling, search, etc.). It was very straightforward to store clinical data without relations, such as data from the sensors of a medical device. But it has limitations when that data needs to be combined with other clinical data in structured format (e.g. lab results, diagnosis). The overall look and feel of the front-end management tools (e.g. monitoring) is not good. It is not bad, but it doesn't look professional. (Incentivized review by an anonymous verified user)
Usability	Apache Pig is quick and easy to implement, which makes it quite popular. (Incentivized review by Subhadipto Poddar, Research Assistant)	No answers on this topic
Support Rating	The documentation is adequate. I'm not sure how large of an external community there is for support. (Incentivized review by Jordan Moore, Software Consultant)	No answers on this topic
Implementation Rating	No answers on this topic	Try not to change variable names. (Incentivized review by Wonoh Kim, Principal Software Engineer)
Alternatives Considered	Apache Pig might help to start things faster at first, and it was one of the best tools years back, but it lacks important features that are needed in the data engineering world right now. Pig also has a steeper learning curve since it uses a proprietary language, compared to Spark, which can be coded with Python or Java. (Incentivized review by an anonymous verified user)	We chose [Hortonworks Data Platform] because it's free and because [it] was an IBM partner, suggested as a big data platform after the BigInsights platform. You can install it on multiple physical computers without high specs, then use it to learn how to deploy and configure a complete big data cluster. We also installed it in a cloud infrastructure of 5 virtual machines. (Incentivized review by Andrea Bardone, Project List 2018 - 2012)
Return on Investment	Higher learning curve than other similar technologies, so on-boarding new engineers or changing ownership of Apache Pig code tends to be a bit of a headache. Once the language is learned and understood, it can be relatively straightforward to write simple Pig scripts, so development can go relatively quickly with a skilled team. As distributed technologies grow and improve, overall Apache Pig feels left in the dust and is more legacy code to support than something to actively develop with. (Incentivized review by an anonymous verified user)	It is difficult to have a negative impact, because the required investment is not that high. The big open community behind Hortonworks and the related Apache projects makes it easy to put 'the wheel to meet the road' quite quickly. We have seen management meetings where the attendees were impressed by the results achieved with the data lake built on HDP. (Incentivized review by Fernando López Bello, Big Data & Cognitive Computing Practice Leader)