Skip to main content
TrustRadius
Azure HDInsight

Azure HDInsight

Overview

What is Azure HDInsight?

HDInsight is an implementation of the Apache Hadoop technology stack on the Microsoft Azure cloud platform: It is based on the Hortonworks Hadoop distribution. Microsoft Azure HDInsight includes implementations of Apache Spark, HBase, Storm, Pig, Hive, Sqoop, Oozie, Ambari, etc.…

Read more
Recent Reviews

totally friendly!

9 out of 10
November 09, 2020
Incentivized
Azure HDInsight is being used across the whole organization in 3 countries that work together. The solution is that the organization pay …
Continue reading
Read all reviews

Awards

Products that are considered exceptional by their customers based on a variety of criteria win TrustRadius awards. Learn more about the types of TrustRadius awards to make the best purchase decision. More about TrustRadius Awards

Reviewer Pros & Cons

View all pros & cons
Return to navigation

Product Demos

Demo Azure HDInsight auf Linux

YouTube

Using semi structured data like JSON with Hive on Azure HDInsight

YouTube

Drive higher utilization of Azure HDInsight clusters with Autoscale | Data Exposed

YouTube

Fine-grained security with Apache Ranger on HDInsight Kafka | Azure Friday

YouTube

HDInsight: Fast Interactive Queries with Hive on LLAP | Azure Friday

YouTube
Return to navigation

Product Details

What is Azure HDInsight?

HDInsight is an implementation of the Apache Hadoop technology stack on the Microsoft Azure cloud platform: It is based on the Hortonworks Hadoop distribution.

Microsoft Azure HDInsight includes implementations of Apache Spark, HBase, Storm, Pig, Hive, Sqoop, Oozie, Ambari, etc. It also integrates with with business intelligence (BI) tools such as Power BI, Excel, SQL Server Analysis Services, and SQL Server Reporting Services.

Azure HDInsight Technical Details

Operating SystemsUnspecified
Mobile ApplicationNo
Return to navigation

Comparisons

View all alternatives
Return to navigation

Reviews and Ratings

(37)

Attribute Ratings

Reviews

(1-5 of 5)
Companies can't remove reviews or game the system. Here's why
Score 4 out of 10
Vetted Review
Verified User
Incentivized
Azure HDInsight that's the main ETL and Data Processing platform of our company. It hosts all processes that generated the data we provide for our costumers, managed by ADF pipelines.
  • Integration with Azure Datafactory
  • Integration with Azure Management API
  • Easy to deploy from powershell, API, ADFv2 and other Azure Resources
  • Cheap when creating and deleting dynamically on demand clusters
  • A high level of documentation
  • Some diagnostic tools
  • Instable for months, with freaky problems
  • By default generates huge logs that finally break the cluster
  • Difficult to control its resources escalation
  • Difficult to plan, monitor and limit its costs
  • On-Demand clusters are very unstable and untrustful
  • Cluster creation process fails often
  • Cost is huge and difficult to control / limit
  • Dead weight (and cost) of some products you will never use into the cluster (we just needed Spark and Hadoop)
  • Poor support team
  • Outdated software in the clusters (almost out of support Python versions)
  • Almost impossible to customize and adapt for our specific needs
Well suited:

A tiny-mid sized company with no immediate plans of growing the volume of their data processing, that can afford long response times from support.
Also it helps if you are not prone to put your hands on Linux and Spark configuration. In fact, it can make things go really faster if you also work with the bundle-in Jupyter.
And, if you need to perform some diagnostics and / or administrative tasks, that's full of tools to find an understand the Root Cause. Ideal for non experts.
Less appropriate:
Big Data company, intense on demand cluster creation, mission critical, costs reduction, latest versions of libraries required, sophisticate customizations required.




  • When our business started, it was a great value for a company with no tech experts
  • When business grow and data volumes also, problems started to arise: lots of on-demand clusters creations failed, cluster internal cluster crashed due to logs size (no way to limit them)
  • When business got really big, after having months of outage, we have to start looking for a new platform. It became ineffective in ROI.
Many times you just need spark performing fast and cheap. Azure HDInsight Includes lots of features and not required software. Also its libraries and runtime versions are pritty old. But, what is great Is you don't need to have an expert in your team and things -when work- performs always in the exact same way. Also, as I mentioned, for a starter that's a great ROI.
Inexpert, isolated teams... not good for support an excessively complex platform. Lots of weeks or months for a complex problem troubleshoot. Many time lost stuck on MindTree, before the case was finally escalated with Microsoft!
I had a trial, was cool, but I found pointless to pay for solving issues in your platform. It just makes no sense, if that's something wrong on your size, you must pay, not me.
Additionally, the "architectural consulting" that's not of great help. When you need consulting hours from a Partner, there is a long way of paperwork to do.
Yes
NO, in a case we have to wait almost six months with the error happening once and once, asking for the same information with different engineers. Everything before to asume an Azure Platform issue, that was the true root cause of the reported issue.

And that's just one of the points that irritated more internal users.
No, sadly not. Support was always poor and escalation was many times stopped or almost locked.
November 09, 2020

totally friendly!

Score 9 out of 10
Vetted Review
Verified User
Incentivized
Azure HDInsight is being used across the whole organization in 3 countries that work together. The solution is that the organization pay for what they use.
  • It's scalable.
  • It's from Microsoft so it's friendly to use.
  • Data is available when you need it.
  • Performance with high volume of data.
If you want to save costs and just pay for what you use, I highly recommend it. It will help you also to work with data for your reports and analytics. on the other hand I think it could be the subscription you have but high volume of data make it slow but not so much. anyway I think it's really good because it's from Microsoft which always is friendly to use it as all the suit they have.
  • The impact is so far positive, the organization pays for the whole company in 3 countries.
I think it's easier to use because almost everyone knows how Microsoft works.
I highly recommend it.
Score 10 out of 10
Vetted Review
Verified User
Incentivized
Azure HDInsight is used for managing data processing at TechnipFMC. Being a big company managing the data of thousands of users and customers is always a challenge. We can now access the past data and present data without being dependent on a hard drive or on some other PC.
  • Data is presented without interfering others (IT or other dept).
  • Data is managed properly and is available for retrievable any time.
  • Legacy use of CD/DVD and Pendrive are not required.
  • Sometimes, lower bandwidth faces some issues (Slow speed)
  • Something can be done about lower bandwidth capability as higher bandwidth may not be available at all the places.
  • But improvement will be there in the above capability I am sure as the technology evolves.
Why do you need legacy hard drives and storage and other software to present the data when Azure can manage it efficiently and is always available on-demand?
  • ROI is of course there, as no legacy software for data presentation.
  • No manual intervention for data retrieval.
  • Data is available anywhere as requested.
Microsoft 365 (formerly Office 365), Microsoft Teams, Honeywell Connected Plant, Honeywell Production Control Center for Oil & Gas (PCC), ABB Ability Solution Suite
Overall support is amazing. When we have a problem, IT and local support are able to solve problems faster.
I think this is being used in many of the big companies by default.
Krishn Garg | TrustRadius Reviewer
Score 8 out of 10
Vetted Review
Verified User
Incentivized
HDInsight is used to help the Sales Department with processing analytics and managing large data. It is utilized by the Advertising and Sales Team and it addresses the problems of overall cost spend in google analytics. We definitely use the powerful query to translate high volume data into Excel in our company. The best thing is the high availability of Microsoft. Azure HDInsight makes it very easy to work with large scale data workloads in Hadoop, and it gives good benefits working on the top of the Azure data lakes.
  • Promote things that needed to be on top.
  • Current analytics changes.
  • As it's from Microsoft, it has high availability and support from Microsoft.
  • Easy transformation of high volume data.
  • Performance issue with large volume data.
  • Cost
For the current purpose, it fits best. Save other costs in analytics and transforming high volume data.
  • Meet business objectives.
  • Advertising events so save cost.
  • High availibility
Azure HD Insight is the one only used and it works well and serves its purpose. Azure Blob Storage could be an alternative, but if you really want to work on a large-scale workload, then HD analytics is best with Azure Data Lakes. PowerBI could be useful for small scale data stored on different services such as SharePoint and One Drive.
Well suited in advertising and marketing to serve the business purpose and best fit in the transformation of high volume data.
Azure HDInsight is usable on the top of Azure Data Lake and gives us the benefit of analyzing large scale data workload in Hadoop. Usability and support from Microsoft are outstanding.
Score 9 out of 10
Vetted Review
Verified User
Incentivized
It was used to create and manage Hadoop instance. Extracting undiscovered information from big quantities of unstructured data was easy. The Hadoop cluster was built within minutes.
  • The Hadoop cluster was built within minutes.
  • It is cost-effective to collect and store structured or unstructured data.
  • The flat network storage system technology offers a high-speed connection between nodes and blob storage system.
  • Had issues while using Azure ARM Templates for deployments.
  • The cost structure suited for high end temporary data analytics, rather than a traditional frequently queried data warehouse.
Faster along with Pay for the service you used.
Extracting undiscovered information from big quantities of unstructured data was easy

The following insights were used in monitoring the Hadoop instance:

Disk Usage
Utilization of CPU
Cluster Load
Memory Used
Network Used
  • Less implementation cost.
  • No hosting services cost.
  • Faster data retrieval.
Quick and easy to manage through Azure portal.
Return to navigation