PowerCenter Review: "A powerful ETL solution which focuses on enterprise scalability, flexibility, and code re-usability"
- Enforces enterprise-wide ETL development standards.
- Provides code re-usability with shared connections and objects.
- Particularly adept at integrating a wide range of disparate data sources (handles flat files especially well).
- Well suited for moving large amounts of data.
- There are too many ways to perform the same or similar functions, which makes it challenging to trace what a workflow is doing and at which point. For example, sessions can be designed as static or re-usable, and an override can occur at the session, at the workflow, or both, which can be counterproductive and confusing when troubleshooting.
- The power of structured design is a double-edged sword: simple tasks for a POC can become cumbersome. For example, if you want to move some data to test a process, you first have to create your sources by importing them, which means an ODBC connection or similar must be configured. You then have to develop your targets and all of the essential building blocks before actual development can begin. While I am on sources and targets: I think of a table definition as just that, and find it counterintuitive to have to design a table as both a source and a target and manage them as different objects. It would be more intuitive to have a single table definition whose source/target properties are determined by where you drag and drop it in the mapping.
- There are no checkpoints or data-viewer-type functions without designing an entire mapping and workflow. If you would simply like to run a job up to a point and check the throughput, an entire mapping needs to be completed, and the workaround is to create a flat file target.
For small projects, or small development teams working mostly with a single data source, expect frustration when trying to test a solution quickly, because the design flow is very structured. The tool is also designed around segregation of duties at a very high level, which can make small development teams counterproductive. Each step in the design process is a separate application, and although the applications are stitched together, the result is not without its problems. To design a simple mapping, for example, you first need a connection established to the source (for example, ODBC), keeping in mind that the tool will automatically name the container according to how you named your connection. You then open the Designer tool, import the connection as a source, optionally check it in, create a target, optionally check that in as well, and design a transformation mapping. To test or run it, you need to open a separate application (Workflow Manager), create a session from your mapping, and then create a workflow for those one or more sessions, at which point you can test it. After running it, you need to open yet another application (Monitor) to see what it is doing and how well. For a developer coming from something like SSIS, this can be daunting and cumbersome when building and testing a simple POC (although, conversely, building an enterprise-scalable ETL solution in SSIS is its own challenge).
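To make the number of moving parts concrete, the chain of objects described above can be pictured as a small data model. This is only an illustrative sketch in Python, not PowerCenter's API: every class, field, and object name below is hypothetical, and the assumption that a session wraps one mapping while a workflow holds one or more sessions is a simplification of the steps in this review.

```python
from dataclasses import dataclass, field


@dataclass
class SourceDefinition:
    name: str
    connection: str  # hypothetical: e.g. the name of an ODBC connection


@dataclass
class TargetDefinition:
    name: str


@dataclass
class Mapping:
    name: str
    source: SourceDefinition
    target: TargetDefinition


@dataclass
class Session:
    name: str
    mapping: Mapping  # assumption: one session runs one mapping


@dataclass
class Workflow:
    name: str
    sessions: list = field(default_factory=list)


def objects_needed(workflow):
    """List every object that must exist before the workflow can be run."""
    names = []
    for session in workflow.sessions:
        m = session.mapping
        names += [m.source.connection, m.source.name, m.target.name,
                  m.name, session.name]
    names.append(workflow.name)
    return names


# Hypothetical "copy one table" proof of concept.
src = SourceDefinition("orders_src", connection="odbc_sales")
tgt = TargetDefinition("orders_tgt")
mapping = Mapping("m_copy_orders", src, tgt)
session = Session("s_copy_orders", mapping)
workflow = Workflow("wf_copy_orders", [session])

print(objects_needed(workflow))
```

Even in this toy model, moving a single table requires six named objects before anything runs, which is the overhead the reviewer is describing for a simple POC.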
- PowerCenter processes input files, performs specified transformations, and maps the input data format to the output data format very quickly. The PowerCenter backend implementation seems to be optimized to process and map structured input records to structured output records and load the records into a database. One of the strengths of PowerCenter is its performance when processing petabytes of structured input data files.
- PowerCenter does not require software development experience or education. After initial hands-on training, the data consultants in our organization (who are statisticians and subject matter experts) were able to implement data ingest and data transformation tasks fairly easily.
- PowerCenter supports multiple DBMS technologies (for example, Oracle, Netezza). This flexibility allows it to be used by multiple departments within our organization.
- One of the challenges of PowerCenter is the lack of integration between the components and functionality it provides. PowerCenter consists of multiple components, such as the repository service, integration service, and metadata service. Considerable time and resources were required to install and configure these components before PowerCenter was available for use.
- In order to connect to various data sources such as a Netezza database or SAS datasets, PowerCenter requires the installation and configuration of separate plug-ins. We spent considerable time troubleshooting and debugging problems while trying to get the various plug-ins integrated with PowerCenter and running as described in the documentation.
- PowerCenter works well with structured data. That is, it is easy to work with input and output data that is pre-defined, fixed, and unchanging. It is much more difficult to work with dynamic data, in which fields are added or removed ad hoc or the data format changes during the data ingest process. We have not been as successful in using PowerCenter for dynamic data.
- One of the challenges of learning PowerCenter is that it is difficult to find documentation or publications that help you learn the various details about PowerCenter software. Unlike SAS Institute, Informatica does not publish books about PowerCenter. The documentation available with PowerCenter is sparse; we have learned many aspects of this technology through trial and error.
PowerCenter is well suited for processing large amounts of data that is structured and pre-defined. It is well suited for large organizations that have the resources to install, configure, and support PowerCenter, and for large organizations that have a large number of data consultants/analysts without a software development/programming background.
PowerCenter is not a good fit for smaller, agile organizations that work with unstructured data and changing/dynamic data.
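The difficulty with dynamic data mentioned in this review can be illustrated outside of PowerCenter. The following is a minimal Python sketch, assuming a hypothetical fixed layout (`EXPECTED_FIELDS`) and made-up sample files; it only shows how a pipeline built around a pre-defined structure might detect that fields were added or removed, not how PowerCenter itself behaves.

```python
import csv
import io

# Hypothetical fixed layout that a pre-defined mapping would expect.
EXPECTED_FIELDS = ["id", "name", "amount"]


def check_header(csv_text, expected=EXPECTED_FIELDS):
    """Compare a file's header row with the expected layout.

    Returns (added, removed): fields present but not expected,
    and fields expected but missing.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    added = [f for f in header if f not in expected]
    removed = [f for f in expected if f not in header]
    return added, removed


# Made-up sample inputs: one matches the layout, one has drifted.
stable = "id,name,amount\n1,Ann,10.5\n"
drifted = "id,name,currency\n1,Ann,USD\n"

print(check_header(stable))   # ([], [])
print(check_header(drifted))  # (['currency'], ['amount'])
```

A check like this only reports the drift; with a fixed mapping, each such change still means revisiting the source and target definitions by hand.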
- It is quite flexible in handling different types of data formats, such as DB objects or tables
- Good performance when processing lots of data in batch
- Easy to learn and use
- The client is quite heavy in terms of size and function
- Needs a better way to upgrade
- Test and deployment automation needs to be improved
- Data migration from multiple sources. It handles any type of source data including RDBMS, flat files, XML, mainframe etc.
- Implementing data migration rules is very easy and efficient, and the rules are reusable.
- Development and maintenance of code is very easy. Design, development, and scheduling of full loads and incremental loads is very easy.
- Provides a lot of features for developers to implement any kind of business rule.
- Not worth using PowerCenter for simple data migration.
- Licensing cost.
- Handling of BLOB and CLOB data types.
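For readers unfamiliar with the full-load and incremental-load distinction mentioned in this review, here is a minimal, tool-independent Python sketch. It assumes each source row carries a hypothetical `updated_at` timestamp and that the last successful load time (the "watermark") is stored somewhere; neither detail comes from the review, and this is not how PowerCenter implements it.

```python
from datetime import datetime


def full_load(source_rows):
    """Full load: take every row from the source, every time."""
    return list(source_rows)


def incremental_load(source_rows, last_loaded_at):
    """Incremental load: take only rows changed since the last run."""
    return [r for r in source_rows if r["updated_at"] > last_loaded_at]


# Made-up source rows with a hypothetical updated_at column.
rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]

watermark = datetime(2024, 1, 4)  # assumed time of the previous load
changed = incremental_load(rows, watermark)

# Advance the watermark so the next run skips rows already loaded.
new_watermark = max((r["updated_at"] for r in changed), default=watermark)

print(len(full_load(rows)))        # 3
print([r["id"] for r in changed])  # [2, 3]
```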
- PowerCenter connects to multiple sources and targets with ease, and at the same time.
- PowerCenter allows the application of complicated business rule logic with ease through a user-friendly interface.
- PowerCenter logic can be captured and shared in mapplets, enabling reusability.
- Documentation of how the different offerings in the Informatica product line work together as a total data platform, and licensing across those offerings. This information is difficult to come by and more often than not requires Informatica Professional Services assistance.
- Copy and paste of elements to provide additional documentation and coding capabilities.
- Interface with industry standards such as Microsoft Excel.
- The tool is excellent at pulling data from source systems, modifying it to fit a target system, and then pushing the data accordingly.
- It handles conversion of datatypes very well.
- PowerCenter is very adept at applying data filters and 'business rules' to source data before pushing the results to the end user.
- As a developer/integrator, I feel there is one area where PowerCenter could use improvement: automated deployments. There aren't really any scripting options to automate back-end deployments of ETL workflows from one environment to the next. Instead, everything has to be done via the GUI (Graphical User Interface). While this is a relatively straightforward process, it can be time-consuming.
- There is no real interface between PowerCenter and programs that manage the encapsulation of passwords for system accounts. All connections to source and target systems have to be updated via the PowerCenter Workflow Manager GUI and entered manually. There is no interface with encryption programs such as MAC VAULT, and as such, admins are required to have access to passwords that information security departments might otherwise not want them to have.
However, if you are looking for a tool to simply look at data from one source, there are other products out there. This is really designed for aggregating data and filtering/manipulating it into useful information.
- Ability to work with different types of sources and targets.
- Version control of the code.
- Ability to integrate with LDAP for security.
- Ease of use.
- Performance with ODBC drivers can be improved.
- Memory utilization is very high during ETL execution.
- Improved scheduling capabilities for the built-in scheduler.