Syncsort Trillium DQ for Big Data (formerly Trillium Quality for Big Data) supports enterprises using a Big Data framework like Hadoop with data quality functions like data integration, data cleansing, standardization and parsing, with prebuilt process flows that can be configured to meet busines...
Best Hadoop-Related Software
TrustMaps are two-dimensional charts that compare products based on satisfaction ratings and research frequency by prospective buyers. Products must have 10 or more ratings to appear on this TrustMap.
Hadoop-Related Software Overview
What is Hadoop Software?
Hadoop is an unusual kind of open-source data store from the Apache Software Foundation. An entire ecosystem of products has evolved around it, to the point where Hadoop-related software has become its own technology category.
The central idea of Hadoop is that data is spread across many inexpensive commodity servers. Although the core technology is open source, vendors such as Cloudera and Hortonworks offer commercial distributions that wrap services around it.
Unlike a traditional database, Hadoop can handle huge volumes of both structured and unstructured data including log files, streaming data, images, audio and video files. All of this data can be put into the Hadoop cluster and accessed, modified and processed in place, eliminating the need to duplicate and structure data in a traditional warehouse.
Once this huge volume of structured and unstructured data has been stored, how do you extract any value from it? Since Hadoop is not a structured database, structured query languages like SQL cannot be run against it directly. Instead, Hadoop has its own data processing and query framework called MapReduce, which developers can use to write programs that retrieve whatever data is needed. However, MapReduce has several constraints that affect performance, and newer products like Apache Spark provide an alternative distributed computing framework that is significantly more efficient. Similarly, products like Hive and Cloudera Impala provide a SQL-like query language, which is much easier for data analysts to learn and use.
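To make the MapReduce idea more concrete, here is a minimal sketch of the programming model in plain Python. It does not use Hadoop or any Hadoop API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample log lines are hypothetical, chosen only for illustration. On a real cluster the framework would run the map and reduce steps in parallel across many servers and perform the grouping step itself.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key (done by the framework on a real cluster)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical log lines standing in for unstructured data in the cluster.
    sample_log = ["error disk full", "warning disk slow", "error network down"]
    print(word_count(sample_log))
```

The same counting job in a SQL-like language such as Hive would typically be a single GROUP BY query, which is one reason analysts tend to find those tools easier to work with.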
Listings (26-35 of 35)
WX2 is the data- and analytics-focused data warehouse appliance solution from UK company Kognitio.
RedPoint Data Management & Quality handles the core requirements of data management including data quality, with data-profiling and general-purpose data-cleansing functionality, including parsing, standardization, matching and cleansing. The platform functions across NoSQL, Hadoop, and tradit...
The Syncfusion Big Data Platform is a Hadoop distribution designed for Windows. Its users can develop on Windows using familiar tools, and deploy on Windows. The vendor says they have taken the advantages of the Hadoop environment – from easy querying across structured and unstructured data to c...
Bitwise offers Hydrograph, a data integration tool that provides ETL functionality on Hadoop and Spark.
Bitwise Hadoop Adaptor for Mainframe Data acquires mainframe data and converts it to Hadoop format for processing.
Imanis Data, headquartered in San Jose, offers an enterprise data management platform supporting data backup and recovery for heterogeneous data and big data sources (Hadoop, Hortonworks, MongoDB, etc.).
IBM Analytics for Apache Spark for Cloud is a service designed to provide the fast in-memory performance of Apache Spark without the hassle of self-managing Spark clusters, relying instead on the convenience of IBM Cloud.