Item: Treasure Data
Rating: 8
Author: Verified User

Use Cases and Deployment Scope

I am part of a team that has been developing a customer data platform for an international consumer goods producer.
The client decided to adopt Treasure Data among other technologies. Using Treasure Data, we are able to collect data points from different sources and rearrange/propagate them according to the data model designed by the business team.
The propagation of data is executed by scheduled workflows.
There is a native segmentation tool. On top of this, we had to develop workflows for customer deduplication, housekeeping of data points, validation rules, data quality indexes, etc.
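
To give an idea of what validation rules and a data quality index can mean in practice, here is a minimal sketch in Python. It is not the project's actual code: the field names (`customer_id`, `email`), the two rules, and the definition of the index as the share of records passing every rule are all assumptions made for illustration.

```python
def validate(record):
    """Return the names of the (hypothetical) rules this record violates."""
    failures = []
    if not record.get("customer_id"):
        failures.append("missing_customer_id")
    email = record.get("email") or ""
    if "@" not in email:
        failures.append("invalid_email")
    return failures


def quality_index(records):
    """Share of records that pass every rule, between 0.0 and 1.0."""
    if not records:
        return 0.0
    passed = sum(1 for record in records if not validate(record))
    return passed / len(records)


# Illustrative records only; the field names are assumptions.
records = [
    {"customer_id": "c1", "email": "anna@example.com"},
    {"customer_id": "c2", "email": "not-an-email"},
    {"customer_id": None, "email": "marco@example.com"},
    {"customer_id": "c4", "email": "lea@example.com"},
]
index = quality_index(records)
```

In a real setup, checks like these would more likely run as scheduled queries over the tables than as Python over in-memory records.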

Pros and Cons

Simple to set up and to use
Support is fully available
Cloud technology --> almost nothing is on prem

Documentation is not always fully up to date --> better off reaching out to support for topics that are not covered
Small bugs in the graphical user interface
If two people edit the same project simultaneously, whoever saves the workflow last overwrites the other person's changes

Return on Investment

Treasure Data updates can deprecate old functions or change how existing functions behave --> this may have an impact on existing workflows/queries
Because many developers work in the same environment, jobs are queued, as only a limited number of compute cores is available --> to increase it, our client needs to pay for more cores
As the data volume grows, some workflows become too expensive and need to be rethought / made more efficient --> this means redesigning existing workflows and requires ongoing support from Treasure Data, which analyzes the queries and identifies improvements that let the client pay less

Innovative Uses

For GDPR, there is a workflow that automatically sends a PDF report, translated into the client's native language, containing all their information/data
To schedule different workflows, we created a new scheduler instead of using the native scheduling functions
We used deduplication rules running in Python to identify potential duplicate accounts belonging to the same customer
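
As an illustration of the deduplication idea, here is a minimal sketch of one possible rule in Python. It is not the rule set used in the project: matching on a normalized email address and the field names `id` and `email` are assumptions chosen to keep the example short.

```python
from collections import defaultdict


def normalize_email(email):
    """Lowercase and trim an email so trivially different spellings match."""
    return email.strip().lower()


def find_potential_duplicates(customers):
    """Group customer records that share a normalized email.

    `customers` is a list of dicts with the hypothetical keys `id` and `email`.
    Returns one list of ids per group that contains more than one record.
    """
    groups = defaultdict(list)
    for record in customers:
        email = record.get("email")
        if email:  # records without an email cannot be matched by this rule
            groups[normalize_email(email)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Illustrative records only.
customers = [
    {"id": 1, "email": "Anna.Rossi@example.com"},
    {"id": 2, "email": " anna.rossi@example.com "},
    {"id": 3, "email": "marco@example.com"},
    {"id": 4, "email": None},
]
duplicates = find_potential_duplicates(customers)
```

A production rule set would probably combine several signals (name, phone, address) and flag matches for review rather than merge them automatically.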

Likelihood to Recommend

When there are many different data sources, the amount of data is huge, and we need to create segments, Treasure Data is the right tool: it has a growing list of connectors and it supports Hive
May not be the right tool for running simple Python code --> not as easy as SQL code (Presto or Hive).

Good platform to analyze data

Overall Satisfaction with Treasure Data
