Data Deduplication Tools

Data Deduplication Tools Overview

What are Data Deduplication Tools?

Data deduplication tools are important for backup and restore operations where large quantities of data are backed up at regular intervals. Frequent backups mean repeatedly copying and storing large data sets for recovery purposes. Because much of this data is duplicated from one backup to the next, storing it all repeatedly would quickly lead to unmanageably large storage requirements. Deduplicating these data streams is therefore essential to optimizing backup storage.

Deduplication is achieved by means of a deduplication algorithm that examines an incoming data stream and compares its data segments to data that has been stored previously. However, not all deduplication products work in the same way, so there are several things to consider when looking for a product:

  • Source versus target deduplication: In source deduplication, software running on the server that is the source of the data dedupes it before it is transmitted to the storage device. The advantage of this approach is that a smaller quantity of data is transmitted to the target storage solution, so it uses less bandwidth. However, source deduplication can increase processing time on the source server, which is often an important consideration in virtualized environments where there is a very large quantity of duplicate data. The alternative is target deduplication, where all the data is transmitted to a NAS storage device or tape library and is deduped once it has arrived. This method reduces the storage capacity required for backup data, but does not reduce the amount of data sent across a LAN or WAN during backup.
  • Inline deduplication versus post-processing deduplication: Inline processing means that deduplication happens in real time as the data is being transmitted to storage. In post-processing deduplication, all of the backup data is first written to a disk cache, and the deduplication process starts only afterwards.
  • Global deduplication: Most deduplication processes are designed to remove duplicated data from a single storage device. Global deduplication removes redundant data across the entire data storage infrastructure, which allows administrators to manage the entire backup storage environment efficiently.
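
The source versus target distinction above comes down to where duplicates are detected, and therefore how much data crosses the network. The following is a minimal Python sketch of that difference, not a description of how any listed product works. The `Target` class, the helper functions, and the tiny 4-byte segment size are all hypothetical and chosen only for illustration; the sketch also ignores the bandwidth used by the fingerprint lookups themselves.

```python
import hashlib

CHUNK_SIZE = 4  # illustrative assumption; real products use far larger segments


def chunks(data, size=CHUNK_SIZE):
    """Split data into fixed-size segments (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def fingerprint(chunk):
    """Identify a segment by its SHA-256 digest."""
    return hashlib.sha256(chunk).hexdigest()


class Target:
    """Hypothetical storage target that keeps one copy of each unique segment."""

    def __init__(self):
        self.segments = {}       # fingerprint -> segment bytes
        self.bytes_received = 0  # payload bytes that crossed the network

    def has(self, fp):
        return fp in self.segments

    def receive(self, chunk):
        self.bytes_received += len(chunk)
        self.segments.setdefault(fingerprint(chunk), chunk)


def backup_target_side(data, target):
    """Target deduplication: every segment is sent; the target drops duplicates."""
    for c in chunks(data):
        target.receive(c)


def backup_source_side(data, target):
    """Source deduplication: only segments the target does not hold are sent."""
    for c in chunks(data):
        if not target.has(fingerprint(c)):
            target.receive(c)


data = b"AAAABBBBAAAABBBBCCCC"  # 20 bytes, 5 segments, 3 of them unique

t_target = Target()
backup_target_side(data, t_target)

t_source = Target()
backup_source_side(data, t_source)

print("target-side bytes sent:", t_target.bytes_received)
print("source-side bytes sent:", t_source.bytes_received)
print("unique segments stored:", len(t_target.segments), len(t_source.segments))
```

In this sketch both approaches end up storing the same unique segments; only the source-side variant also reduces what is transmitted, at the cost of hashing every segment on the source server.
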
The benefits of data deduplication are primarily in reducing data storage requirements and hence costs. Deduplication also makes data restore operations more efficient since there is much less data to restore.
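
To make the storage saving concrete, here is a small Python sketch of segment-level deduplication across two backups. It assumes fixed-size segments identified by SHA-256 fingerprints; actual products may segment and fingerprint data differently, and the `DedupStore` class and its method names are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store: unique segments plus a per-backup recipe."""

    def __init__(self, segment_size=8):
        self.segment_size = segment_size
        self.segments = {}  # fingerprint -> segment bytes (each stored once)
        self.recipes = {}   # backup name -> ordered list of fingerprints

    def backup(self, name, data):
        recipe = []
        for i in range(0, len(data), self.segment_size):
            segment = data[i:i + self.segment_size]
            fp = hashlib.sha256(segment).hexdigest()
            # Compare against previously stored data; keep only new segments.
            if fp not in self.segments:
                self.segments[fp] = segment
            recipe.append(fp)
        self.recipes[name] = recipe

    def restore(self, name):
        """Rebuild a backup by concatenating its segments in recipe order."""
        return b"".join(self.segments[fp] for fp in self.recipes[name])

    def stored_bytes(self):
        return sum(len(s) for s in self.segments.values())


store = DedupStore(segment_size=8)
monday = b"header--" + b"payload1" + b"payload2"
tuesday = b"header--" + b"payload1" + b"payload3"  # only one segment changed
store.backup("monday", monday)
store.backup("tuesday", tuesday)

logical = len(monday) + len(tuesday)  # bytes the backups represent
physical = store.stored_bytes()       # bytes actually kept on disk
print(f"logical={logical} physical={physical} ratio={logical / physical:.2f}:1")
```

Here the second backup adds only the one segment that changed, so the store holds four unique segments instead of six, and both backups can still be restored in full from their recipes.
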


Top Rated Data Deduplication Products

These products won a Top Rated award for having excellent customer satisfaction ratings. The list is based purely on reviews; there is no paid placement, and analyst opinions do not influence the rankings.

Data Deduplication Tools TrustMap

TrustMaps are two-dimensional charts that compare products based on trScore and research frequency by prospective buyers. Products must have 10 or more ratings to appear on this TrustMap.

Data Deduplication Products

(1-25 of 33) Sorted by Most Reviews

Druva inSync
94 ratings
156 reviews
Top Rated
Workforce mobility and the rise of cloud services are an essential part of any business, but they create a number of challenges for IT. Data spread across devices and cloud services, unpredictable schedules, and varied network connections all complicate efforts to protect and govern enterprise informa…
Druva Phoenix
28 ratings
50 reviews
Top Rated
As businesses are adopting a “cloud-first” strategy, data protection is one of the first IT functions to migrate to the cloud. However, it can lead to a number of challenges. With a thoughtful cloud strategy, businesses can achieve the best results. According to the vendor, with Druva Phoenix™, bus…
Barracuda Backup
33 ratings
28 reviews
Barracuda Backup is a data recovery, restoration, and deduplication product from Barracuda Networks. It features data center backup support for email protection, network & application security, and general data protection.
Dell EMC Avamar
41 ratings
19 reviews
Dell EMC Avamar is a hardware and software data backup and deduplication product. It provides protection and recovery through a complete software and hardware solution when paired with Dell EMC Data Domain for virtual environments, remote offices, enterprise apps, NAS servers, and desktops/laptops.
PowerProtect DD (formerly Dell EMC Data Domain)
27 ratings
8 reviews
PowerProtect DD (a next-generation appliance replacing Dell EMC Data Domain) is a suite of hardware appliances used for data protection, backup, storage and deduplication. PowerProtect appliance offerings are cloud-enabled and vary by organization size, capable of supporting small business and ente…
NetApp FAS series
29 ratings
7 reviews
NetApp's FAS series is a line of storage array systems for enterprises.
Veritas NetBackup Appliance
18 ratings
4 reviews
Veritas NetBackup Appliance (formerly Symantec NetBackup Appliance) is a storage and deduplication solution.
HPE StoreOnce
4 ratings
4 reviews
HPE StoreOnce is a backup and recovery hardware solution from Hewlett-Packard Enterprise, providing disk-based backup, deduplication, and long-term storage. StoreOnce offerings can support virtual and cloud environments for small business, mid-size organizations, and enterprises.
XtremIO Flash Storage
3 ratings
3 reviews
XtremIO is flash storage from EMC.
Quantum DXi Series
0 ratings
3 reviews
Quantum DXi Series is a deduplication solution from Quantum, a publicly traded company.
Exagrid EX Series
5 ratings
2 reviews
The Exagrid EX Series offers a storage solution with deduplication.
RingLead DMS Duplicate Prevention (Unique Entry)
1 rating
1 review
RingLead DMS Duplicate Prevention (Unique Entry) enforces perimeter protection around B2B databases to stop dirty data in real time, at the source, and consistently maintain and improve the health of data.
IBM ProtecTier
ProtecTier is a data deduplication appliance from IBM.
Nexsan Data Deduplication Appliance
Nexsan, from Imation, offers data deduplication in their storage solutions.
Spectra nTier Deduplication Appliance
Spectra Logic's nTier data storage product line offers data deduplication.
Quest DR Series
Quest DR Series is a backup and recovery solution, featuring data deduplication.
Falconstor VTL
Falconstor Virtual Tape Library (VTL) is a data center backup and recovery solution from Falconstor, a company headquartered in Melville, New York.
Hitachi Protection Platform Backup Appliance
Since the 2014 acquisition, technologies formerly offered by Sepaton (with DeltaStor data deduplication software) are now part of the Hitachi Protection Platform.
Fujitsu Data Deduplication Appliance
Fujitsu offers data deduplication solutions.
Falconstor FDS
Falconstor FDS is a data management / deduplication solution from Falconstor.
Nexsan DeDupe SG
Nexsan DeDupe SG is a disk based backup and data deduplication option, from Imation company Nexsan.
Microsoft DPM Appliance
In early 2015 Microsoft added data deduplication capability to their System Center 2012 R2 Data Protection Manager.
Clear Analytics
5 ratings
0 reviews
Clear Analytics is a business intelligence solution that enables non-technical end users to perform analytics by leveraging existing knowledge of Excel, coupled with a built-in query builder. Some key features include Dynamic Data Refresh, Data Share, and In-Excel Collaboration.
StarDQ
StarDQ is a real time solution for cleansing, de-duping, and enriching enterprise data. By integrating StarDQ Solution, organizations can cleanse, match and unify data across multiple data sources and data domains. According to the vendor, the goal is to ensure that data is a strategic, trustworthy,…
DupeCatcher
DupeCatcher is a free de-duplication tool from the maker of Cloudingo.
