Administration of Hadoop cluster - Cloudera, Datameer
Chose Datameer
Datameer was not chosen to the exclusion of other tools. Other tools are still in use and gaining traction because they suit particular tasks and have cost-effective license models. Every organization needs to determine the best approach to the data flow into and out of Hadoop, …
Without requiring subject matter expertise or in-depth technical knowledge, Datameer has enabled me to swiftly convert data and develop insights using data from a variety of sources. Datameer is ETL software that can be used with little to no knowledge of programming. Users with no technical background can construct transformations and sophisticated SQL logic with a few mouse clicks using the intuitive graphical user interface.
For quick daily integrations, Talend is a very good tool, and it makes development short and easy. Citizen developers who are not strong programmers can pick up Talend Open Studio and start using it within weeks. It's well suited for all kinds of data migration between various systems. It is less appropriate for smaller synchronous services where you need to trace the complete transaction and how data moved between them. It's also less appropriate for small data movements, where other tools can be easier to use and manage.
The community is not very up to date, and the forum is not very responsive. More should probably be done to make people aware of the tool, how to use it, and how it can be implemented.
Talend crashes when transforming a lot of data (millions of rows).
Proper training documentation is a must for Talend, and it is currently lagging. Better documentation would help users learn more about Talend and use it effectively.
We have already renewed our Datameer license for a second year. They have come a long way since their initial release, and I am expecting them to be a top BI tool in the coming year. The tool was built by one of the original creators of Hadoop, so it fits on top of a Hadoop cluster like a glove.
There is no licence requirement for Talend Open Studio, so this is not a relevant question. However, if you are asking whether we will use Talend in the future, the answer is yes: we will continue to use it. It's a very powerful free tool that covers all our extract, transform, load needs. We just love Talend for its great functionality and ease of use.
Talend Open Studio is based on Eclipse and is full of redundant procedures to do one thing, like when installing libraries. Sometimes I cannot manually download the libraries that it can't find.
Talend freezes often. When you give a cancel command, it takes several minutes to stop. It also takes a great toll on our PC, which has 16 GB of RAM and an i7 CPU, even when idle. If you are downloading Maven JARs or libraries, you cannot do anything else and have to wait until the task is finished.
Talend Open Studio is free, and we are not using the enterprise version, which comes with a licence and support. So we mostly depend on the open source community for any issues that we face. The documentation is good, and we haven't had to use any support so far. We did evaluate the enterprise version, but so far we are sticking with the free version.
Based on our findings, we are unable to utilize the Apache Hive platform due to the associated costs. We also looked into Informatica, but decided against it because of its expensive pricing and poor integration with other BI products. Those unfamiliar with SQL may nevertheless use this tool to build complex transformation processes after a brief onboarding session.
Informatica has a limited number of components that you can use. This places a heavy limitation on the capabilities of Informatica. On the other hand, Talend allows you to create your own custom components using Java. For businesses that need to perform a wide variety of data operations, it can be quite useful to have the option of creating your own custom components to satisfy business needs.
I delivered projects the client did not believe were possible, and I provided intermediate value by providing visibility to hidden data problems in their systems they could not detect before.
I was able to work on three projects at a time, pausing gracefully in one while switching to another, with minimal effort.