Item: Treasure Data
Rating: 9
Author: Giacomo Vannucchi

Use Cases and Deployment Scope

Treasure Data has allowed us to build an end-to-end Big Data pipeline in only a few months, without a dedicated data engineering team. The main business problem we were able to address was how to unlock value from our MySQL database of 150+ million records, updated at a rate of 1 billion events per month. The Data team needed a quick and flexible way to explore and query the data and to train models, with minimum impact on the company's development team.

In addition to this, we have used Treasure Data as an in-house web tracking and analytics solution by leveraging their Google Tag Manager integration. This has allowed us to support the Product and Marketing teams by providing an independent and flexible way to track website usage and marketing campaign performance.

Pros and Cons

Flexible, easy and truly schema-free data landing/ingestion. No need to worry about changes in the data. Treasure Data storage solutions adapt very well and give you a lot of flexibility.
Amazing support! The team is really great and available to answer most questions, usually within an hour of a ticket being opened.
Support for a lot of data integration solutions and continuous development of new solutions.

Some documentation can be improved. It would be good to have more practical examples.
The website homepage and UX are changing all the time. I would like them to be a bit more consistent, simple and minimal. Also, why do I have to log in again every time I open a new tab? It wasn't like this before.
It would be nice if it were easier to work with fields which have been saved as JSON. It would be great to have more support for this. I am not sure whether this is a limitation of Hive and Presto or of Treasure Data.
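
On the JSON point, both query engines do provide extraction functions: Presto has `json_extract_scalar` and Hive has `get_json_object`. The sketch below, in Python, shows one way the repetitive Presto SQL for pulling top-level keys out of a JSON string column could be generated. It is only an illustration under assumptions: the table `events`, the column `payload`, the key names and the helper `json_field_query` are all hypothetical, and it covers flat keys only, not nested paths or arrays.

```python
import re

# Only allow plain identifiers, since the names are interpolated into SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def json_field_query(table, json_column, keys):
    """Build a Presto SELECT that extracts top-level keys from a JSON string column.

    Hypothetical helper: table, column and key names are illustrative only.
    """
    for name in (table, json_column, *keys):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    # json_extract_scalar(json, json_path) returns the value as varchar.
    columns = ",\n".join(
        f"  json_extract_scalar({json_column}, '$.{key}') AS {key}"
        for key in keys
    )
    return f"SELECT\n{columns}\nFROM {table}"


query = json_field_query("events", "payload", ["user_id", "campaign"])
print(query)
```

The generated text can then be pasted into a Presto query; for Hive, the equivalent call would be `get_json_object(payload, '$.user_id')`.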

Return on Investment

I don't have estimates for this

Innovative Uses

Google Tag Manager tag
Recommendation engine project using Hivemall ML libraries
Using Luigi to manage Treasure Data jobs; we ended up adopting Luigi for our MySQL-to-MySQL pipeline as well

Alternatives Considered

This was the first cloud analytics solution I have worked with. I chose it because the development team is using Fluentd internally, and also because at the time the business had decided not to use AWS because of our particular business model (deals against Amazon).

Likelihood to Recommend

A mid-sized startup is an ideal scenario: Treasure Data allows you to build data pipelines without having to maintain infrastructure internally and without needing many dedicated resources, while still offering a lot of flexibility. It is well suited to an environment where many different technologies are in use.

Built an end-to-end Big Data pipeline in a few months without a dedicated engineering team

Overall Satisfaction with Treasure Data