Apache Hive is database/data warehouse software that supports data querying and analysis of large datasets stored in the Hadoop distributed file system (HDFS) and other compatible systems, and is distributed under an open source license.
N/A
Pricing
Apache Hive
Editions & Modules
No answers on this topic
Offerings
Pricing Offerings
Apache Hive
Free Trial
No
Free/Freemium Version
No
Premium Consulting/Integration Services
No
Entry-level Setup Fee
No setup fee
Additional Details
—
More Pricing Information
Community Pulse
Apache Hive
Considered Both Products
Apache Hive
Verified User
Engineer
Chose Apache Hive
To query a huge, distributed dataset, Apache Hive was built by Facebook. Unlike Apache Hive, Apache Spark is an in-memory computation engine, which is why it is significantly quicker than Apache Hive at querying large amounts of data. In contrast to Apache HBase, Apache Hive is …
Community support and ease of use -not deployment.
It enables querying and analyzing large amounts of data stored in HDFS, on the petabyte scale. It has a query language called HQL that transforms SQL queries into MapReduce jobs that run on Hadoop, and it is wonderful for the …
Apache Spark is similar in the sense that it too can be used to query and process large amounts of data through its Dataframe interface. Hive is better for short-term querying while Spark is better for persistent and long-term analysis. Another product is Impala. For our …
We have used a simple but necessary function such as merging certain data tables, which although they may be from different areas, complement each other or are necessary, you can use metadata if what you need is to validate the origin of your information and what impact it has, …
Apache Hadoop is built on top of the Hadoop File system so it gives its best when integrated with Hadoop. Data analysis and query optimization become very easy when used with Hadoop to perform Extract transform load operations. As Hadoop is a big data system and handles large …
We have used the system to migrate data either for new versions or because we will use another operating program, the software helps us to synchronize programs between different operating systems, a history of information can be kept constant, it can be sent to third parties …
Queries are easy to write and interface is similar to SQL so learning overhead is reduced. Multi user and data type support is provided. Can be easily scaled for very large amount of analytics. It is very flexible in terms of using file formats.
Apache Hive is a query language developed by Facebook to query over a large distributed dataset. Apache is a query engine that runs on top of HDFS, so it utilizes the resources of HDFS Hadoop setup, while Apache Spark is an in memory compute engine, and that's why [it is] much …
Besides Hive, I have used Google BigQuery, which is costly but have very high computation speed. Amazon Redshift is the another product, I used in my recent organisation. Both Redshift and BigQuery are managed solution whereas Hive needs to be managed
Hive and Spark have the same parent company hence they share a lot of common features. Hive follows SQL syntax while Spark has support for RDD, DataFrame API. DataFrame API supports both SQL syntax and has custom functions to perform the same functionality. Spark is faster and …
One of the major advantages of using Presto or the main reason why people use Presto (Teradata) is due to that fact it can support multiple data sources - which is lacking as in the case of Apache Hive. But still, most people who come from a Structured data-based background …
Easy to understand, well supported by the community, good documentation. However, it is possible that SAP Business Warehouse could be a good fit, too, even maybe better. I did not have the chance to try it though. We selected Apache Hive because it was far less expensive and …
Hive was one of the first SQL on Hadoop technologies, and it comes bundled with the main Hadoop distributions of HDP and CDH. Since its release, it has gained good improvements, but selecting the right SQL on Hadoop technology requires a good understanding of the strengths and …
For storing bulk amount of data in a tabular manner, and where there's no need need of primary key, or just in case, if redundant data is received, it will not cause a problem. For small amounts of data, it does run MR, so beware. If your intention is to use it as a …
I wasn't part of the evaluation process for Apache Hive. This was already implemented when I joined the company. I have worked with other big data plaftforms and I personally thinks most of them are quite comporable to one another. It really depends on what the company is going …