4 Replies Latest reply on Jun 8, 2017 12:11 PM by inuser466392

    tuning questions, please help me

    New Member

  I am an Informatica developer currently working on a tuning project. I have run into two problems that I do not understand; after searching Google I still cannot solve them or see what I did wrong, and I need suggestions from experts.

  The first mapping has 7 Oracle tables as sources; three of them have between 300K and 500K records, and the rest are relatively small. When I look at the session log, the aggregator and joiner transformations show the following:

      TE_7212 : Increasing [Index Cache] size for transformation [mplt_FileWatcher.aggr_same_error] from [3878787] to [3913728].

      TE_7212 : Increasing [Data Cache] size for transformation [mplt_FileWatcher.aggr_same_error] from [7757457] to [7757904].

  So I increased the aggregator data cache to 200,000,000 and the index cache to 100,000,000. When I ran the workflow again, the session log showed:

      TE_7212 : Increasing [Index Cache] size for transformation [mplt_FileWatcher.aggr_same_error] from [100000000] to [100002456].

      TE_7212 : Increasing [Data Cache] size for transformation [mplt_FileWatcher.aggr_same_error] from [200000000] to [200001624].

  I do not understand why the Integration Service increases the cache automatically. I raised the cache settings repeatedly, and on every run the Integration Service grows the cache slightly beyond the value I configured, even though I am using the same set of data, so I would expect the cache requirement to stay the same. Does increasing the cache memory really help performance in this case?

  My other tuning question: because of the 500K data volume, the whole session takes around 20 minutes in PROD, and from the session log it seems that writing to the target tables takes most of the time. I am thinking of using session partitioning.

  The Source Qualifier offers only limited partition types: database partitioning, pass-through, and key range. Since the database tables are not partitioned and we have no clear key to split the data on, I think I should use pass-through. But with pass-through partitioning we need to put a filter condition on each partition; otherwise each partition extracts the full set of data, which amounts to (number of partitions) × (total row count). I tried the suggestion of using ROWNUM in the filter condition, but it does not work: only the first partition returns rows, and the other partitions return zero. My questions: how can I use session partitioning in this case, and how can pass-through partitioning help performance if the data has no suitable column to split on?


  Thank you very much in advance for your help.



        • 1. Re: tuning questions, please help me
          Suraj Thygarajan Paramasivam New Member

          Hi Linda,


          First things first: I think this issue is tagged incorrectly (Complex Event Processing, RulePoint). If you tag it with Data Integration or PowerCenter you will get more responses, since this relates to PowerCenter as I understand it.


          Now, coming to your aggregator issue: in PowerCenter the Aggregator is a blocking transformation, which means all records must arrive before the aggregation can be performed. My advice is to first question whether the Aggregator is needed at all, and whether it can be replaced with something else. If it cannot be replaced, see whether you can pre-sort the data at the source and run the Aggregator with sorted input, which will improve performance. A Sorter before the Aggregator might help, but sorting at the source is usually the better option.
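          To illustrate why sorted input helps (the error codes and counts here are invented for the sketch, not from the original mapping): when rows arrive pre-sorted on the group key, each group can be aggregated and emitted as soon as the key changes, so only one group needs to be held in memory at a time instead of every group until end of data.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows (key, value), already sorted on the grouping key --
# the analogue of sorted input feeding a PowerCenter Aggregator.
rows = [("A", 1), ("A", 4), ("B", 2), ("B", 3), ("C", 5)]

# Sorted aggregation: groupby streams one group at a time, so each
# group's total can be flushed the moment the key changes.
totals = {key: sum(n for _, n in grp)
          for key, grp in groupby(rows, key=itemgetter(0))}

print(totals)  # {'A': 5, 'B': 5, 'C': 5}
```

          An unsorted Aggregator, by contrast, must cache an accumulator for every distinct group simultaneously, which is roughly why its data and index caches grow with the number of groups.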


          As for your question on partitioning, I cannot follow it completely, but I suspect you are confusing session partitioning with database partitioning. A session partition with pass-through simply runs multiple threads within the same session, whereas database partitioning relies on the actual partitioning defined on the database.




          • 2. Re: tuning questions, please help me
            Apsar Shaik New Member

            Hi Linda,


            When you have Aggregator, Joiner, or Lookup transformations in your mapping, a better option is to use the Auto setting for the cache sizes. By keeping the property on Auto you let the Integration Service decide at run time, based on the number of source records, how much data/index cache your transformations need.

            Coming to partitioning: you said your bottleneck is writing the data to the target database, so I am a little confused that you are talking about partitioning at the Source Qualifier level. You did not mention what kind of write you are performing: all inserts, all updates, or both? Also, if you are doing updates, check the indexes on the target tables.

            You also did not mention whether you run Informatica on a grid with multiple nodes or on a standalone server; this configuration affects session runs as well. If you have a grid with multiple nodes, running the workflow with the Integration Service configured on the grid will give better results than a standalone server.


            Let me know if you have more questions.



            • 3. Re: tuning questions, please help me
              New Member

              Hi Linda,


              Regarding the second issue: a ROWNUM filter works only with the less-than (<) operator. A ROWNUM filter using greater-than (>) always evaluates to false, which is why your other partitions read zero records.
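              The reason is Oracle's order of evaluation: ROWNUM is assigned to a candidate row, the predicate is tested, and the counter only advances if the row is kept. With `ROWNUM > 1` the counter is stuck at 1 forever. A small Python simulation of that evaluation order (a sketch, not Oracle itself) shows the effect:

```python
def filter_with_rownum(rows, predicate):
    """Mimic Oracle's ROWNUM evaluation: each candidate row is offered
    the next ROWNUM value, the predicate is tested, and the counter
    advances only when the row is actually accepted."""
    out, rownum = [], 1
    for row in rows:
        if predicate(rownum):
            out.append(row)
            rownum += 1          # counter advances only on kept rows
    return out

rows = list(range(10))
print(filter_with_rownum(rows, lambda rn: rn < 4))  # [0, 1, 2]
print(filter_with_rownum(rows, lambda rn: rn > 3))  # [] -- never matches
```

              In real Oracle SQL the usual workaround is to assign ROWNUM in an inner query and filter on the alias in an outer query, so the counter is fully assigned before the range filter is applied.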


              As part of the solution, you can read the data in a single partition from the source and then repartition the rows at an Expression transformation using round-robin partitioning. If you are aggregating the data in the pipeline, do not forget to apply hash auto-keys partitioning before the Aggregator.
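              The two repartitioning schemes can be sketched like this (the row data and partition count are made up for illustration): round-robin spreads rows evenly regardless of content, while hash partitioning guarantees that all rows sharing a group key land in the same partition, which is what a downstream Aggregator requires.

```python
def round_robin(rows, n):
    # Deal rows evenly across n partitions, ignoring their content.
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def hash_partition(rows, n, key):
    # The same key always hashes to the same partition, so each
    # partition sees every row of the groups it owns.
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(key(row)) % n].append(row)
    return parts

rows = [("A", 1), ("B", 2), ("A", 3), ("C", 4), ("B", 5), ("A", 6)]
print(round_robin(rows, 3))        # even spread: 2 rows per partition
print(hash_partition(rows, 3, key=lambda r: r[0]))
```

              Note that under round-robin a group's rows are scattered across partitions, which is fine for a plain write but would break a per-partition aggregation; that is why the hash auto-keys step is needed before aggregating.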


              Hope it helps.

              • 4. Re: tuning questions, please help me

                Sorry to say that the Auto memory setting for caching transformations works quite differently from what has been suggested: the Integration Service decides (based on two values on the session's Config tab, if memory serves me right) how much memory to allocate for the buffer pool, and this memory is then distributed among all transformation instances set to Auto.

                For example, take a Lookup transformation that needs 500 KB to cache 1,500 records from a lookup table. If we set this Lookup to Auto memory, the Integration Service might allocate 250 MB for the index cache and 250 MB for the data cache, a total of 500 MB, whereas with exact settings the Lookup would only need 500 KB. That wastes 499.5 MB of memory, per session partition, per caching transformation.
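                Working through that arithmetic (the 4 partitions × 3 caching transformations at the end are an invented illustration of how the waste multiplies, not figures from the thread):

```python
KB, MB = 1024, 1024 * 1024

needed = 500 * KB                 # exact cache for the 1,500-row lookup
auto_alloc = 250 * MB + 250 * MB  # index + data cache under Auto
waste_one = auto_alloc - needed   # wasted by one transformation instance

# Waste multiplies across partitions and caching transformations,
# e.g. 4 partitions x 3 caching transformations (hypothetical):
total_waste = waste_one * 4 * 3

print(round(waste_one / MB, 2))   # ~499.51 MB per instance
print(round(total_waste / MB, 2))
```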


                In short: always look at the amount of data to be cached, the number of sessions running in parallel at peak times, and the available memory on the server, and then decide whether to use explicit memory settings or Auto.