TrustRadius: an HG Insights company

What is Apache Gobblin?

Apache Gobblin is a distributed data integration framework that simplifies common aspects of big data integration, such as data ingestion, replication, organization, and lifecycle management, for both streaming and batch data ecosystems. It is open source and free to use under the Apache License 2.0.

Capabilities
  • Ingestion and export of data from a variety of sources and sinks into and out of the data lake. Gobblin is designed and optimized for ELT patterns, with inline transformations on ingest (the "small t", i.e. lightweight transformations applied as data is ingested).
  • Data Organization within the lake (e.g. compaction, partitioning, deduplication)
  • Lifecycle Management of data within the lake (e.g. data retention)
  • Compliance Management of data across the ecosystem (e.g. fine-grained data deletions)

Technical Details

Mobile Application: No

FAQs

How much does Apache Gobblin cost?
Apache Gobblin costs $0: it is open source and free to use under an Apache 2.0 license.