What users are saying about

Apache Spark

97 Ratings
Score 8.6 out of 10

Talend Data Integration

40 Ratings
Score 7.9 out of 10

Likelihood to Recommend

Apache Spark

Apache Spark has rich APIs for regular data transformations, ML workloads, and graph workloads, whereas other systems may not offer such a wide range of support. Choose it when you need to transform big data in offline jobs; use MongoDB-like distributed database systems for more real-time queries.
Nitin Pasumarthy

Talend Data Integration

If your organisation or department works regularly with ETL jobs, or sources data from multiple locations and needs to integrate it for better data compatibility, then Talend Data Integration is a great tool and I would strongly recommend it. This mostly applies to established corporations rather than startups, as the licensing fee is quite high.

Feature Rating Comparison

Data Source Connection

Apache Spark
Talend Data Integration
8.1
Connect to traditional data sources
Apache Spark
Talend Data Integration
9.5
Connect to Big Data and NoSQL
Apache Spark
Talend Data Integration
6.7

Data Transformations

Apache Spark
Talend Data Integration
9.3
Simple transformations
Apache Spark
Talend Data Integration
9.5
Complex transformations
Apache Spark
Talend Data Integration
9.0

Data Modeling

Apache Spark
Talend Data Integration
7.6
Data model creation
Apache Spark
Talend Data Integration
8.6
Metadata management
Apache Spark
Talend Data Integration
7.6
Business rules and workflow
Apache Spark
Talend Data Integration
9.0
Collaboration
Apache Spark
Talend Data Integration
5.7
Testing and debugging
Apache Spark
Talend Data Integration
7.1

Data Governance

Apache Spark
Talend Data Integration
8.8
Integration with data quality tools
Apache Spark
Talend Data Integration
9.5
Integration with MDM tools
Apache Spark
Talend Data Integration
8.1

Pros

  • Rich APIs for data transformation make it very easy to transform and prepare data in a distributed environment without worrying about memory issues
  • Faster execution times compared to Hadoop and Pig Latin
  • Easy SQL interface to the same data set for people who are comfortable exploring data in a declarative manner
  • Interoperability between SQL and Scala / Python styles of munging data
Nitin Pasumarthy
  • JSON parsing: if you work with highly nested JSON objects, Talend can display the structure of the JSON and lets you define the extraction logic in the metadata. The same applies to XML sources.
  • Customization: for almost all ETL work there is a component in Talend. If you have a very specific requirement, you can easily design a custom component that you can code, reuse, and share with others.
  • Talend has a connectivity option for almost all databases (relational or NoSQL) and sources available. It also has generic JDBC / ODBC drivers in case you need them.
  • I like the ease of deployment across environments (DEV and PROD) with the use of context variables.
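To make the JSON point above more concrete, here is a minimal sketch in plain Python of what "extraction logic defined as metadata" can look like. It is not Talend's API or component model; the record, the field names, and the dotted-path syntax are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical nested record; every field name here is made up for illustration.
RAW = (
    '{"order": {"id": 17, "customer": {"name": "Ada", '
    '"address": {"city": "Lyon"}}, '
    '"items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]}}'
)

# Extraction "metadata": output column -> dotted path into the document.
# The path syntax is an assumption of this sketch, not a Talend format.
MAPPING = {
    "order_id": "order.id",
    "customer": "order.customer.name",
    "city": "order.customer.address.city",
    "first_sku": "order.items.0.sku",
}


def extract(doc, path):
    """Walk a dotted path; numeric segments index into lists."""
    node = doc
    for key in path.split("."):
        if isinstance(node, list):
            node = node[int(key)]
        else:
            node = node[key]
    return node


def flatten(raw, mapping):
    """Parse the JSON text and build one flat row from the mapping."""
    doc = json.loads(raw)
    return {column: extract(doc, path) for column, path in mapping.items()}


row = flatten(RAW, MAPPING)
print(row)
```

Keeping the paths in a separate mapping, rather than hard-coding them in the parsing logic, is what lets the same extraction code be reused when the target columns change.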

Cons

  • Documentation could be better, as I usually end up going to other sites and blogs to understand the concepts
  • More APIs should be ported to MLlib, as only a few algorithms are available, at least in the clustering segment
Nitin Pasumarthy
  • The Talend Administration Center (TAC) is a great place to schedule and monitor your jobs, but the interface could probably be improved.

Support

No score
No answers yet
No answers on this topic
Talend Data Integration 9.0
Based on 1 answer
Good support, especially when it relates to the PROD environment. The support team has access to the product development team, and issues are escalated internally to the development team if a bug is encountered. This helps the customer get a quick fix or patch for the problem. I have also seen support show willingness to help develop a custom connector for a newly available cloud-based big data solution.

Alternatives Considered

We specifically chose Spark over MapReduce to make the cluster processing faster.
Compared to Microsoft SQL Server Integration Services (SSIS), Talend gives developers many more tools and much more flexibility for building different ETL processes. For instance, SSIS separates processing from data management, whereas Talend mixes both stages, so you can perform complex processes such as iterating sub-jobs for each data row. It also provides a huge component list compared to SSIS, which allows retrieving data from and saving data to many different sources. The administration features are also broader than what SSIS offers. In other words, SSIS is like a toy compared to Talend Integration's capabilities.
When comparing Talend with Kettle (Pentaho), it is easy to find similarities because they are similar tools. In my experience, I prefer Talend because, in my opinion, it is more focused on data management. Kettle is a component of a wider BI tool, and Pentaho is not focused only on data management. I also found that Talend gives better performance and manages connection sockets better than Kettle.
Josep Coves Barreiro

Return on Investment

  • It has had a very positive impact, as it helps reduce the data processing time and thus helps us achieve our goals much faster.
  • Being easy to use, it allows us to adapt to the tool much faster than with others, which in turn allows us to access various data sources such as Hadoop, Apache Mesos, Kubernetes, independently or in the cloud. This makes it very useful.
  • It was very easy for me to use Apache Spark and learn it since I come from a background of Java and SQL, and it shares those basic principles and uses a very similar logic.
Carla Borges
  • Easy to build complex data transformations.
  • Licensing model isn't that flexible.
  • Memory management for huge volumes of information. You have to modify ETL designs to handle it properly.
Josep Coves Barreiro
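The memory point above is not spelled out in the review. One common way to redesign a job for large volumes is to process rows in fixed-size batches instead of loading everything at once. The sketch below shows that general idea in plain Python; it is not a Talend feature, and the function names, row shape, and batch size are assumptions made for illustration.

```python
from itertools import islice


def batches(rows, size):
    """Yield lists of at most `size` rows without materialising the whole input."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def total_amount(rows, batch_size=1000):
    """Aggregate a hypothetical 'amount' field one batch at a time."""
    total = 0
    for chunk in batches(rows, batch_size):
        total += sum(row["amount"] for row in chunk)
    return total


# A generator stands in for a large source that should not be held in memory.
source = ({"id": i, "amount": i % 7} for i in range(10_000))
result = total_amount(source, batch_size=500)
print(result)
```

Only one batch is held in memory at a time, so peak memory depends on the batch size rather than on the total number of rows.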

Pricing Details

Apache Spark

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details

Talend Data Integration

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee?
No
Additional Pricing Details