Informatica’s Intelligent Cloud Services (IICS) platform is a solution for synchronizing and integrating cloud and on-premises applications. It offers prebuilt connectors and actions between applications and programs, allowing data to be transformed within the platform, as well as case-specific services.
Apache Spark appears to run more efficiently than other big data tools, such as Hadoop. That makes it well-suited for querying and making sense of very large data sets. The software offers many advanced machine learning and econometrics tools, although we use them only partially because they take too much time to run once the data sets get very large. Spark is not well-suited for projects that are not big data in size, and its graphics and analytical output are subpar compared to other tools.
Informatica Cloud is a great tool for use when data must be formatted consistently. Once configured, it is very robust and reliable. It is also well-suited for an organization without a robust IT staff to maintain a full server infrastructure. It offers a cost-effective approach to high-quality data integration for even the largest organizations. Organizations without staff experienced in data analytics may find it challenging to take advantage of the more complex results of this tool.
Once the secure connection is established, it’s quite easy to operate and create new jobs. The controls are simple, and we appreciate that not much complex fine-tuning is required. Navigation is also easy, and we enjoy the ability to open multiple tabs in the browser to work on multiple projects.
The monitoring functionality works well to help track the progress of the jobs, again, without too much complication. In a fast dev environment, speed is essential, and quickly seeing the status and progress of jobs, as well as any errors if the jobs fail, helps us maintain that speed.
The web interface is a lot easier to interact with than the client/on-prem version. Putting much of the heavy lifting of interacting with the tool onto the shoulders of the browser makes it easier to keep multiple sessions open and get in/out quickly without having to VPN into the office.
The only thing I dislike about Spark's usability is the learning curve: there are many actions and transformations to get to know. However, its wide range of uses for ETL processing, its ease of integration, and its multi-language support make this library a powerhouse for your data science solutions. It has especially aided us with its lightning-fast processing times.
1. It integrates very well with Scala or Python. 2. Its SQL interoperability is very easy to understand. 3. Apache Spark is much faster than competing technologies. 4. The support from the Apache community for Spark is very strong. 5. Execution times are faster compared to others. 6. There are a large number of forums available for Apache Spark. 7. Code for Apache Spark is simple and easy to gain access to. 8. Many organizations use Apache Spark, so many solutions are available for existing applications.
I've never had trouble getting in contact with Informatica's support for technical help. I give it a nine because it does pretty well for mid-size to enterprise-scale workflows.
Spark, in comparison to similar technologies, ends up being a one-stop shop. You can achieve so much with this one framework instead of having to stitch and weave together multiple technologies from the Hadoop stack, all while getting incredible performance, minimal boilerplate, and the ability to write your application in the language of your choosing.
First, the wizard is easy to use, which keeps the learning curve for simple ETL tasks gentle. Second, since Informatica is mature, there is a good variety of connectors available. Finally, we have driven some fairly complex ETL solutions using only the cloud.