5 Replies Latest reply on Nov 14, 2017 9:05 AM by ChandrapalReddy Borra

    Unable to get consistent results using substitution Masking in Informatica TDM

    ChandrapalReddy Borra New Member

      I am trying to mask patient names (First Name, Last Name) using substitution masking. The tables sit in different databases (SQL Server, Oracle, Netezza), with different record counts and different column names. I want the same source patient name to get the same masked value across all databases. I am doing in-place masking in some places and in-stream masking in others. All I want is the same masked value for a patient across all QA databases.

       

      >I am using Informatica TDM 10.1

      >I am using the same masking rule for patient names across all projects (for names, I convert to upper case and apply LTRIM and RTRIM so that equivalent inputs are treated the same)

      >The same seed value, 501, in the masking rule. (I don't know exactly what the seed value does.)
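
      To make the normalization step concrete, here is a minimal Python sketch of what upper-casing plus LTRIM and RTRIM does to a few input variants. It is only an illustration outside TDM; `normalize_name` and the sample values are hypothetical.

```python
def normalize_name(raw):
    """Trim leading/trailing spaces and upper-case, like LTRIM + RTRIM + UPPER."""
    return raw.strip(" ").upper()


# Hypothetical variants of the same last name as they might appear in different tables.
variants = ["  Smith", "smith  ", "SMITH"]
normalized = {normalize_name(v) for v in variants}
print(normalized)  # {'SMITH'}
```

      Trimming only removes spaces at the ends, so differences inside the value (for example a double space between two name parts) would still produce different inputs to the masking rule.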

       

      *Note: In a few projects I am able to get consistent masked values when I use one workflow with different sessions to mask different tables that have similar table names and patient name columns.

       

      I would appreciate any answers or questions.

        • 1. Re: Unable to get consistent results using substitution Masking in Informatica TDM
          sunny K Active Member

          Hello,

           

          1. Which masking rule are you using here?

          2. Are you using the standard name dictionary file that comes as part of the installation?

          3. Make sure the Share storage table option is checked under Plan Settings-->Advanced Settings.

          4. The seed value, together with the input value, is used to create a hash value. That hash is matched against the SNO in the substitution.dic file, and the corresponding dictionary value is assigned. Using the same seed ensures that the same input always generates the same hash and pulls the same SNO from the dictionary file, so the masked name is consistent every time.
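
          The idea in point 4 can be sketched in Python as below. This is not TDM's actual algorithm: the SHA-256 hash, the `mask_name` function, and the five-entry `DICTIONARY` are assumptions chosen only to show how a seed plus an input can deterministically select a dictionary row by its serial number.

```python
import hashlib

# Hypothetical stand-in for a substitution dictionary: serial number (SNO) -> name.
DICTIONARY = {1: "ALICE", 2: "BRIAN", 3: "CARLA", 4: "DEREK", 5: "ELENA"}


def mask_name(name, seed, dictionary):
    """Pick a dictionary entry from a hash of (seed, name)."""
    key = f"{seed}:{name}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Reduce the hash to a serial number in the range 1..len(dictionary).
    sno = int.from_bytes(digest[:8], "big") % len(dictionary) + 1
    return dictionary[sno]


first_run = mask_name("SMITH", 501, DICTIONARY)
second_run = mask_name("SMITH", 501, DICTIONARY)
print(first_run == second_run)  # True: same seed + same input -> same masked value
```

          In this sketch the result depends only on the seed, the input value, and the dictionary, so any difference in one of those three between two runs can change the masked value.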

           

          Thanks

          • 2. Re: Unable to get consistent results using substitution Masking in Informatica TDM
            Nico Heinze Guru

            If you use exactly the same masking rule for all these projects with identical settings, then you should be able to get consistent masking results (otherwise masking across several tables of course wouldn't make any sense). However, that requires that the lookup table (=substitution table) is the same for all masking processes.
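
            Why the lookup table has to be identical can be shown with a small Python sketch. It is an illustration only, not TDM code: `substitute`, the CRC32 hash, and both tables are assumptions. The two tables hold the same names in a different order, and the same seed and input then resolve to different masked values.

```python
import zlib


def substitute(name, seed, table):
    """Select a row of the lookup table from a hash of (seed, name)."""
    index = zlib.crc32(f"{seed}:{name}".encode("utf-8")) % len(table)
    return table[index]


# Hypothetical lookup tables: same entries, different row order.
table_a = ["ALICE", "BRIAN", "CARLA", "DEREK"]
table_b = list(reversed(table_a))

masked_a = substitute("SMITH", 501, table_a)
masked_b = substitute("SMITH", 501, table_b)
print(masked_a == masked_b)  # False: same rule and seed, but a different lookup table
```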

             

            Regards,

            Nico

            • 3. Re: Unable to get consistent results using substitution Masking in Informatica TDM
              ChandrapalReddy Borra New Member

              I have confirmed that the Share storage table option is checked under Plan Settings-->Advanced Settings and ran the session in the debugger. There I get the same masked name value for a patient, but the row in the target never gets updated.

              I have a Netezza source and a Netezza target. I am using the Netezza bulk writer.

              I am using the same masking rule, the same seed value (501), and the same dictionary.

              I am using the Update as Update and Insert options in the session properties.

              I will try truncate and load next.

               

               

              Regards

              Chandra

               

              • 4. Re: Unable to get consistent results using substitution Masking in Informatica TDM
                sunny K Active Member

                Hello,

                 

                Netezza works better with in-stream masking. Netezza does not use the keys while updating, even if you define a key in the plan.

                 

                I would suggest that you use in-stream masking.

                 

                Thanks,

                sandeep

                • 5. Re: Unable to get consistent results using substitution Masking in Informatica TDM
                  ChandrapalReddy Borra New Member

                  I have tried in-stream masking on Netezza from Prod to QA. I picked 2 Netezza tables from different databases. The 2 tables may have similar or different column names, but the data is similar (patient demographics). I used the same masking rules and generated a single plan for the two tables. I checked the Share storage table option, and I am able to get consistent masking results across the 2 tables.

                   

                  I then took similar tables in SQL Server, where I have to do in-place masking. I used the same masking rules for all the similar PHI columns, but I am getting masked values that differ from the Netezza ones.

                   

                  EX:

                  In Netezza (In-Stream)(different volumes)

                  Table1: Patient A---->(masked to) Patient B

                  Table2: Patient A---->(masked to) Patient B

                   

                  In SQL Server (IN-Place)(different volumes)(different Project Folder)

                  Table1: Patient A---->(masked to) Patient C

                  Table2: Patient A---->(masked to) Patient C
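
                  A check like the one in this example can be scripted outside TDM. The Python sketch below is only an illustration: `find_inconsistent` and the table labels are hypothetical, and the (original, masked) pairs simply mirror the example above. It reports every original value that ended up with more than one masked value.

```python
from collections import defaultdict


def find_inconsistent(pairs_by_table):
    """Return originals that map to more than one masked value across tables."""
    seen = defaultdict(set)
    for pairs in pairs_by_table.values():
        for original, masked in pairs:
            seen[original].add(masked)
    return {original: values for original, values in seen.items() if len(values) > 1}


# Hypothetical (original, masked) pairs collected from each masked table.
results = {
    "netezza.table1": [("Patient A", "Patient B")],
    "netezza.table2": [("Patient A", "Patient B")],
    "sqlserver.table1": [("Patient A", "Patient C")],
    "sqlserver.table2": [("Patient A", "Patient C")],
}

conflicts = find_inconsistent(results)
print(conflicts)
```

                  With the sample data, `conflicts` contains one entry: Patient A, mapped to both Patient B and Patient C.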

                   

                  I can see that the storage connection is similar for all the generated plans.

                  Same seed value=501 and same masking rules.

                   

                  *Note: [As a last test, I created a subset containing all 4 tables from Netezza and SQL Server, generated a single plan, changed the connection configuration at the session level to in-place or in-stream as needed within the same workflow, and executed it. I got consistent results for all 4 tables.]

                   

                  But I can't use that approach, because I need to be able to mask the tables separately when needed.

                   

                  Thanks

                  Chandra