A major difference between PowerCenter and Data Engineering Integration (DEI) is the ability in DEI to push down mapping logic to the Hadoop cluster using the Blaze/Spark engines, which is not present in PowerCenter.
Also, DEI is built on top of services that are also used by Data Quality, such as the Data Integration Service and the Model Repository Service.
Please see the link below for more details -
If you have a specific area in mind, please let us know.
To add on to Krishnan's note: DEI is not just Hadoop-based execution; it can also push down into Databricks, Google Dataproc, and others.
Not only can DEI push down to Spark clusters, it can also process data natively, exactly like PowerCenter. The same transformations that exist in PwC exist in DEI, plus more. DEI can also tap natively into Data Quality (IDQ) and directly embed quality rules without having to export anything anywhere.
DEI is also the newer platform for on-prem ETL within Informatica, so it has newer capabilities like the REST API consumer, which PwC does not have (some will argue you can do it via HTTP, but let's be honest, that's a hack).
Happy to answer any other questions if you have them.
Informatica Product Specialist - Data Engineering Cloud Elastic
Thanks for the response.
So, in general, could we say that DEI is an advanced PowerCenter?
In other words, everything we can do with PWC we can also do with DEI, but not vice versa.
In the meantime, I will look at the documentation link Krishnan posted to better understand the DEI tool and its Hadoop features, so I can ask more detailed questions.
Thanks in advance.
We should never compare PC with DEI. They were designed and developed for different requirements.
Yes, most of the things that work on PC will also work on DEI. But DEI is the best fit only when your use case has sources or targets in the Hadoop ecosystem, and when you want to run your mapping logic on the Spark/Databricks pushdown engine to take advantage of distributed, parallel job processing.
Now it's all much clearer.
Thanks to all for your responses.
OK, I have to chime in here to counter Puneeth's point. While DEI is designed specifically for Spark-based processing, it is not designed only for that.
DEI is built on the Mercury platform, which started as Informatica's Data Quality solution. That solution is a complete ETL tool in itself: IDQ allowed customers to do complete native ETL exactly like PowerCenter. Yes, the engines themselves were slightly different and may have processed things a bit differently in the backend code, but the result was the same.
So, since DEI was built on this framework, it can also do native processing (yes, this requires a license) as well as Spark/Hadoop/Databricks execution.
You may ask why. Well, certain transformations that Informatica supports, Spark does not. Take the Web Service Consumer, for example: this transformation is not supported by Spark, so DEI customers who need to do something with SOAP run those mappings natively.
Also, being able to process natively lets customers use a single solution instead of jumping between DEI and PwC, or between DEI and Informatica Cloud.
There are many reasons and benefits to every side of this question.