What users are saying about

Apache Spark

99 Ratings
Score 8.6 out of 10

SSIS

155 Ratings
Score 7.9 out of 10


Likelihood to Recommend

Apache Spark

Well suited:
1. Data can be integrated from several sources, including click stream, logs, and transactional systems
2. Real-time ingestion through Kafka, Kinesis, and other streaming platforms

SSIS

If you need to move data around or direct the workflow of a process, SSIS can do it. It is a very capable piece of software that I use heavily every day. You do need to be careful because you can over-utilize it for simple things. If you just need to run a piece of SQL every hour to update some values, just use the Agent Scheduler, it's easier. But if you need to automate things in a repeatable and consistent manner, SSIS is a very good product.
Greg Goss

Feature Rating Comparison

Data Source Connection

Overall: Apache Spark no score; SSIS 8.2
Connect to traditional data sources: Apache Spark no score; SSIS 9.1
Connect to Big Data and NoSQL: Apache Spark no score; SSIS 7.3

Data Transformations

Overall: Apache Spark no score; SSIS 8.8
Simple transformations: Apache Spark no score; SSIS 9.4
Complex transformations: Apache Spark no score; SSIS 8.1

Data Modeling

Overall: Apache Spark no score; SSIS 7.4
Data model creation: Apache Spark no score; SSIS 7.1
Metadata management: Apache Spark no score; SSIS 6.6
Business rules and workflow: Apache Spark no score; SSIS 8.4
Collaboration: Apache Spark no score; SSIS 6.9
Testing and debugging: Apache Spark no score; SSIS 7.8

Data Governance

Overall: Apache Spark no score; SSIS 7.7
Integration with data quality tools: Apache Spark no score; SSIS 7.7
Integration with MDM tools: Apache Spark no score; SSIS 7.7

Pros

Apache Spark

  • We used it to make our batch processing faster. Spark is faster at batch processing than MapReduce because of its in-memory computing
  • Spark will run along with other tools in the Hadoop ecosystem including Hive and Pig
  • Spark supports both batch and real-time processing
  • Apache Spark has Machine Learning Algorithms support

SSIS

  • SSIS can query, filter, and transfer data between databases on different servers without establishing explicit trust relationships between those servers.
  • SSIS can be used to refresh a reporting database from a transactional source database, transforming or flattening the data and tables as necessary to facilitate reporting. This can be done incrementally, or by emptying and refilling the reporting database from scratch.
  • SSIS is configured through graphical interfaces that make it relatively easy to see the flow of data including where problems occur.
  • SSIS has a number of tools that allow you to debug SSIS packages and track down problematic data or configurations.
  • SSIS allows you to program Script Tasks in C# and VB allowing extremely powerful functionality including looping and sending consolidated alerts.
  • SSIS allows you to control virtually every part of the SSIS package (connections, variables, etc.) using configuration files so you can have one package that can be used in several different places (such as dev, test, and production environments) only by editing the configuration file that the package uses when the job is scheduled.
Chris Morgan

Cons

Apache Spark

  • Consumes more memory
  • Difficult to address issues around memory utilization
  • Expensive: in-memory processing is costly when we are looking for cost-efficient processing of big data

SSIS

  • Some components do not work very well, including sorting, SCD, etc.
  • Different components can have different syntax or data type definitions.
  • Not enough learning materials for scripting.

Likelihood to Renew

Apache Spark: No score (no answers on this topic)
SSIS: 6.0
Based on 2 answers
A bit outdated compared to competitors, especially in the open source community

Usability

Apache Spark: No score (no answers on this topic)
SSIS: 8.0
Based on 2 answers
Easy to use, however there are functionality limits

Support

Apache Spark: No score (no answers on this topic)
SSIS: 9.0
Based on 3 answers
They are generally responsive and knowledgeable too

Implementation

Apache Spark: No score (no answers on this topic)
SSIS: 10.0
Based on 1 answer
The implementation may be different in each case. It is important to properly analyze all the existing infrastructure to understand the kind of work needed, the type of software used and the compatibility between them, and the features that you want to exploit, so that you understand what is possible and which features require integration with third-party tools.
Luca Campanelli

Alternatives Considered

We specifically chose Spark over MapReduce to make the cluster processing faster
Unfortunately, SSIS is the only ETL tool I have used so far. I used Dundas BI before and didn't like its ETL component, but that is more of a data visualization tool.

Return on Investment

Apache Spark

  • We were able to make batch jobs 20 times faster compared to MapReduce
  • Language support for Scala, Java, and Python makes it easily manageable

SSIS

  • Comes free with SQL Server, so no additional license is needed
  • Easy to learn, and material is available for free, so there is no need for expensive training
Waheed Abualrous, MCP, MCTS, MCE

Pricing Details

Apache Spark

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details

SSIS

General
Free Trial
Free/Freemium Version
Premium Consulting/Integration Services
Entry-level set up fee? No
Additional Pricing Details