4 Replies Latest reply on May 1, 2021 1:25 PM by Nico Heinze

    Load same number of records in both tables.

    asagpariya_ciber Active Member

      Hi Team,

       

      We have a requirement where we have 2 target tables. One has all columns + PKs. The other table holds only the PKs. The source has around 3000M records.

       

      The requirement is that both tables are loaded with the same number of records at any point in time. If the job fails midway, or we abort it and need to restart, then rather than reloading all records we can join the 2nd table (with only the PKs) to the source and extract only those records which were not yet loaded.
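
      For illustration, the restart extraction we have in mind would look roughly like the query below (table and column names are placeholders only):

        -- Restart: pick up only those source rows whose PK is not yet present
        -- in the PK-only target table. SRC_TABLE, TGT_PK_ONLY and RECORD_ID
        -- are placeholder names.
        SELECT s.*
        FROM   SRC_TABLE s
        WHERE  NOT EXISTS (
                 SELECT 1
                 FROM   TGT_PK_ONLY t
                 WHERE  t.RECORD_ID = s.RECORD_ID
               );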

       

      The issue is that the two target tables are loaded with a different number of records at any given time. When we abort the job, both show different counts, hence it is not working as expected.

       

      Any suggestions?

        • 1. Re: Load same number of records in both tables.
          JanLeendert Wijkhuijs Seasoned Veteran

          Hi,

           

          I'm not sure your proposed way of working will ever work; maybe only if you minimize the commit interval, but that will impact performance in a negative way, which you also don't want with that number of records.
          You are working with 2 tables, so 2 datasets, and basically you expect PowerCenter and the RDBMS to keep the record counts synchronized, while the full records consume much more memory and CPU power per inserted row.

          In this case I would choose to use a view on the full table which exposes only the PKs of the records.
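
          A minimal sketch of what such a view could look like (object names are placeholders; only the full table is loaded, and the PK "table" is then in sync by definition):

            -- Placeholder names; the view simply projects the PK column(s)
            -- of the full target table.
            CREATE VIEW TGT_PK_ONLY_V AS
            SELECT RECORD_ID
            FROM   TGT_FULL;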

          Regards,

          JanLeendert

          • 2. Re: Load same number of records in both tables.
            Nico Heinze Guru

            It is possible under certain assumptions. The key term to look up in the PowerCenter manuals is User-Defined Commit; the respective chapter in the Advanced Workflow Guide is named Commit Points.

             

            The most important point is that you insert a Transaction Control transformation which feeds both target tables. Make sure you issue a TC_COMMIT_AFTER after every e.g. 5,000 records. If you rely on the Target Based Commit size in such a case, you cannot guarantee that the same number of records is written to each target table within each block of data, hence this can't work for you. You need to set the session to User-Defined Commit (hence the Transaction Control).
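
            For illustration only (port names and the interval are placeholders, and the running row count has to be maintained via a variable port in an upstream Expression transformation), the logic could look roughly like this:

              -- Upstream Expression transformation, variable port v_ROW_COUNT
              -- (placeholder name) with this expression, incremented per row:
              v_ROW_COUNT + 1
              -- An output port o_ROW_COUNT simply passes v_ROW_COUNT downstream.

              -- Transaction Control Condition: commit after every 5,000 rows,
              -- otherwise keep the current transaction open:
              IIF(MOD(o_ROW_COUNT, 5000) = 0, TC_COMMIT_AFTER, TC_CONTINUE_TRANSACTION)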

             

            There are a few more technical requirements you have to look out for. Please check the paragraph named Understanding Transaction Control Units. It's not rocket science, but you have to follow the guidelines carefully, otherwise the commit control on both target tables won't work reliably (and that's surely not what you need).

             

            Regards,

            Nico

            • 3. Re: Load same number of records in both tables.
              asagpariya_ciber Active Member

              Hi All,

               

              Based on your replies and some additional Googling, I found a solution. Instead of Target Based Commit, I have used Source Based Commit, and it works.

               

              Since there is no transformation/filter in between, in this scenario "Source Based Commit" works as expected.

              • 4. Re: Load same number of records in both tables.
                Nico Heinze Guru

                Not a safe approach. The point is that the buffers for the two target tables probably differ in size, and that means that, for example, the buffer for the PK table may hold 13,000 records while the buffer for the full table may only be able to hold 135 records. In such a case it can easily happen that no records are written to the PK table at all while e.g. 2,700 records are written to the full table.

                 

                Until now you've been plain lucky that it worked fine for you.

                 

                Regards,

                Nico