1 Reply Latest reply on Aug 17, 2021 8:12 AM by Nico Heinze

    How to load only new data from source to target in  IICS

    Retro Flame Guru




      Please consider the above picture as an example, I am running a mapping from Azure Synapse to Anaplan. In the mapping task I am having a filter SCOPE_ID = 'MN' to only load the monthly data(MN for monthly and DY for daily). Currently I am doing a full load, how can I run the mapping task so that only the new data gets loaded so that its a delta load and not a full load? Greatly appreciate if someone could fill me in detail about the steps involved in this.



        • 1. Re: How to load only new data from source to target in  IICS
          Nico Heinze Guru

          In general that depends on whether you have some "flag" which tells you which records are new since your last load.

          Such a "flag" often can be derived from e.g. some "creation date". This would mean that your initial load would query for all "creation date" values since the beginning of time. After the successful initial load, some persistent value (such as a mapping variable in PowerCenter, no idea whether this exists in IICS, I guess so) must be updated to the start time of the mapping run. Now during your next load, only "creation date" values greater than or equal to this persistent value would be retrieved from the source. After this successful run, the persistent value needs to be updated once more (namely to the start time of this particular mapping run). And so on.


          If, however, there is nothing available like a "creation date" in the source data, then you have to resort to some other mechanism. For example, if you have access to the transaction logs of the source system, then you could use that same approach based on the date/time of each entry in the transaction logs (or some other mechanism, there are several ways to achieve the same).


          Among others, Informatica offers a so-called PowerExchange CDC (Change Data Capture) for several common source DBMS (Oracle, Db2, SQL Server, if memory serves me right). You could use that (of course that costs money) to access the transaction logs of your source system in order to get only new records.


          If there's nothing available like these, then you will have to find some other mechanism. For example, if each source record has a primary key, then you could save the primary key values of all records processed so far in some database table (or file or whatever) during initial processing. Whenever you start a delta load, you could try to source only those records which have primary key values not yet in your target database. If that's physically not possible, then you will have to look up the primary key values of each source record in your target table and filter out everything that's already there. Not nice but that works.


          That was a very rough and short walk through the base approaches. I hope you get the gist of it. Please don't hesitate to ask more questions you might have.