Fast machine learning with H2O
Overall Satisfaction with H2O
H2O was used as an analytical tool with easy-to-access machine learning functionality. The data science team comprises people with different backgrounds and coding abilities. We used H2O as an easy-to-learn, highly accessible tool for beginners in AI. The open source version is good for small projects and trials in data analysis, scoring, clustering, and predictive modeling. It is a really fast tool and also runs on older hardware.
Pros
- Excellent analytical and prediction tool
- In the beginning, using H2O Flow in the web UI enables quick development and sharing of the analytical model
- Readily available algorithms, easy to use in your analytical projects
- Faster than Python's scikit-learn (for supervised machine learning)
- It can be accessed (run) from Python, not only Java etc.
- Well documented and suitable for fast training or self-study
- In the beginning, one can use the clickable Flow interface (web UI) and later move to a Python console. There is then no need to click through H2O Flow
- It can be used as open source
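To illustrate the move from Flow to a Python console mentioned above, here is a minimal sketch using the open source `h2o` Python package. The function names `predictor_columns` and `train_gbm_sketch`, the 80/20 split, the choice of gradient boosting, and the assumption of a categorical target are illustrative choices, not something taken from our project.

```python
def predictor_columns(columns, target):
    """Return every column name except the target (hypothetical helper)."""
    return [name for name in columns if name != target]


def train_gbm_sketch(csv_path, target):
    """Train a small H2O gradient boosting classifier on a CSV file.

    Assumes the h2o package and a Java runtime are installed, and that
    `target` names a categorical column in the file.
    """
    # Imported here so the helper above works without h2o installed.
    import h2o
    from h2o.estimators import H2OGradientBoostingEstimator

    h2o.init()  # starts or connects to a local H2O cluster
    frame = h2o.import_file(csv_path)
    frame[target] = frame[target].asfactor()  # treat the target as categorical

    # Hold out 20% of the rows for validation (illustrative ratio).
    train, valid = frame.split_frame(ratios=[0.8], seed=42)

    model = H2OGradientBoostingEstimator(ntrees=50, seed=42)
    model.train(
        x=predictor_columns(frame.columns, target),
        y=target,
        training_frame=train,
        validation_frame=valid,
    )
    return model
```

Calling `train_gbm_sketch("customers.csv", "churned")` (a hypothetical file and column) would return a trained model whose validation metrics can then be inspected from the console instead of by clicking through Flow.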
Cons
- No weaknesses found yet
- This is not really a drawback, but rather a warning: Driverless AI is not a replacement for a data scientist yet, and will not replace data scientists in the next decade either. The Driverless AI feature delivers reliable results only if the analyst is sure about the meaning of the input data. Data quality is usually a major issue, and no tool can detect the meaning of the data in the input. Data scientists are also required for business interpretation of the findings. So be careful, and do not rely on this feature without a good understanding of what it really does in each step.
- By using H2O, the analyst can focus on the analysis itself rather than spending too much time on coding etc.
- Reuse of algorithms and easy model sharing saves time and money
- An easy learning curve assures low training costs
- Even after moving to a paid version, including Driverless AI, you will still need data scientists and analysts, but maybe not so many!
Both H2O and TensorFlow are open source (though H2O only up to some level). Both include deep learning, but H2O is not focused directly on deep learning, while TensorFlow has a "laser" focus on it. H2O is also more focused on scalability. H2O should be looked at not as a competitor but rather as a complementary tool. The use case is usually not only about the algorithms, but also about the data model, data logistics, and accessibility. H2O is more accessible due to its UI. Also, both can be accessed from Python. The community around TensorFlow seems larger than that of H2O.