8 Replies Latest reply on Feb 26, 2018 5:20 PM by Kamal Haria

    Data Masking

    Kamal Haria Active Member

      Hi

       

      I want to know whether, apart from TDM, there is any other way to mask certain elements of an XML file. I need to mask a few elements of the XML file before passing it to another team. I am currently using Informatica IDQ and PowerCenter.

       

      Thanks & Regards

       

      Kamal

        • 1. Re: Data Masking
          Robert Whelan Guru

          Hi

          Do you simply want to block them from viewing certain data, or does it need to be masked in a manner which allows it to be unmasked in subsequent processing?

           

          If the aim is simply to block the data, replacing it using the parser or standardizer would work. If you need to be able to process or unmask the data, there is a Data Masking transformation available in Developer, but it requires a Data Masking option on your license.

          • 2. Re: Data Masking
            Kamal Haria Active Member

            Hi Robert,

             

            I want the data to be masked in a manner which allows it to be unmasked later.

             

            Yes, I have found that Data Masking requires a license in Developer.

             

            Thanks Anyway

             

            Regards

             

            Kamal

            • 3. Re: Data Masking
              Nico Heinze Guru

              PowerCenter offers several functions which you can use to perform data masking "manually". For example, you can replace values from a lookup table. Or you can combine this lookup with a hash value (in order to make it reversible). It just means you have to "build" your masking rules yourself in PowerCenter.

               

              If you have any specific masking requirements which you don't know how to implement, let us know, we'll try to help.

               

              Regards,

              Nico

              • 4. Re: Data Masking
                Kamal Haria Active Member

                Hi Nico

                 

                Thanks for your input and help. Yes, I am currently trying to do this manually by replacing / substituting / appending the values. I hope it works.

                 

                It would be great if you could provide a link on how to do a lookup with a hash value, as I have never tried it before.

                 

                Thanks & Regards

                 

                Kamal

                • 5. Re: Data Masking
                  Nico Heinze Guru

                  I don't have any such link, I can only describe here what I have in mind.

                   

                  Hash values always have the inherent disadvantage that they cannot be unique (no matter what some other people claim, this is as per their design). So it can always happen that two distinct input strings have the same hash value.

                  There are several ways around that. Probably the easiest way is to use two completely distinct hash functions, but, as mentioned above, this is not a 100% safe approach: it is pretty unlikely that two distinct input strings will yield the same MD5 and the same CRC32 hash values, but it is not impossible.

                   

                  So what I would try is to use the built-in PowerCenter Transformation Language functions MD5() and CRC32() to calculate two different hash values for each input value. Then use these two hash values to find a corresponding entry in a lookup table.
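                  To make the idea of "two hash values per input value" concrete, here is a minimal sketch in Python rather than in the PowerCenter expression language. The function name hash_pair is made up for illustration, and encoding the input as UTF-8 is an assumption; the PowerCenter MD5() and CRC32() functions may treat character data differently, so do not expect the values to match byte for byte.

```python
import hashlib
import zlib


def hash_pair(value):
    """Return (MD5 hex digest, CRC32 checksum) for one input string.

    hash_pair is a hypothetical helper name; UTF-8 encoding is an assumption.
    """
    data = value.encode("utf-8")
    md5_hex = hashlib.md5(data).hexdigest()  # 32 hexadecimal characters
    crc = zlib.crc32(data)                   # unsigned 32-bit integer in Python 3
    return md5_hex, crc


if __name__ == "__main__":
    print(hash_pair("hello"))
```

                  The pair is used as a composite key: two inputs are treated as the same value only if both hashes agree.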

                   

                  This means you have to perform this kind of masking in two steps. In the following example I assume you have to mask 320,000 person names (contact addresses) in the XML file.

                  In the first step you read the first and last names from the XML source and calculate MD5() and CRC32() hash values for each input record. Write these two hash values into some lookup table as two separate attributes. Name this table HASH2CONTACT.

                  Now you somehow need a "replacement" lookup table for your input data. In this example, you need some database table with 320,000 "fake" entries. Let's assume this table is named REPLACE_NAMES.

                  Once you have this table at hand, pick up these 320,000 records in any order (the order just has to be unique and repeatable) and assign them to the pairs of hash values in the first table; this means you have to UPDATE each record in the HASH2CONTACT table with the "fake" name from REPLACE_NAMES.
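                  The first step (filling HASH2CONTACT and assigning the fake names) could be sketched like this in Python, with plain dictionaries standing in for the two database tables. The function names, the sorted order of the fake names and the collision check are my own assumptions for illustration, not something PowerCenter does for you.

```python
import hashlib
import zlib


def hash_pair(value):
    """Hypothetical helper: (MD5 hex digest, CRC32) of a UTF-8 string."""
    data = value.encode("utf-8")
    return hashlib.md5(data).hexdigest(), zlib.crc32(data)


def build_hash2contact(real_names, replace_names):
    """Assign one fake name to each distinct (MD5, CRC32) pair.

    real_names plays the role of the names read from the XML source,
    replace_names the role of the REPLACE_NAMES table.
    """
    table = {}   # stands in for HASH2CONTACT: hash pair -> fake name
    owner = {}   # which real name produced each pair, only to detect collisions
    fakes = iter(sorted(set(replace_names)))  # unique, repeatable order
    for name in real_names:
        key = hash_pair(name)
        if key in table:
            if owner[key] != name:
                raise ValueError(f"hash collision: {owner[key]!r} vs {name!r}")
            continue  # same name seen again, mapping already exists
        try:
            table[key] = next(fakes)
        except StopIteration:
            raise ValueError("not enough replacement names") from None
        owner[key] = name
    return table
```

                  Note that the table stores only hash values and fake names. If you need to unmask later, you would have to keep the original value somewhere safe as well; that is a design decision this sketch does not make for you.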

                   

                  Now you are ready to perform the masking.

                  Read your XML source file again (this time the complete records); for each first and last name, you calculate MD5() and CRC32() values and use these values to look up the replacement name in HASH2CONTACT.

                  Forward these replacement data to the target XML file.
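                  The second pass (the actual masking) might look roughly like the following Python sketch. The element names FirstName and LastName are invented for the example, and I assume here that each element is hashed and replaced on its own; if you hash first and last name together, the lookup key has to be built the same way in both passes.

```python
import hashlib
import xml.etree.ElementTree as ET
import zlib


def hash_pair(value):
    """Hypothetical helper: (MD5 hex digest, CRC32) of a UTF-8 string."""
    data = value.encode("utf-8")
    return hashlib.md5(data).hexdigest(), zlib.crc32(data)


def mask_names(xml_text, hash2contact, tags=("FirstName", "LastName")):
    """Replace the text of the given elements via the hash lookup table.

    hash2contact maps (MD5, CRC32) pairs to replacement values; the tag
    names are assumptions, adjust them to your XML schema.
    """
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.tag in tags and elem.text:
            key = hash_pair(elem.text)
            # A KeyError here means the first pass never saw this value.
            elem.text = hash2contact[key]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    source = (
        "<Contacts><Contact><FirstName>Ann</FirstName>"
        "<LastName>Lee</LastName><City>Oslo</City></Contact></Contacts>"
    )
    lookup = {hash_pair("Ann"): "Zed", hash_pair("Lee"): "Doe"}
    print(mask_names(source, lookup))
```

                  All other elements pass through unchanged, which is what you want when only a few elements of the file have to be hidden from the other team.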

                   

                  Regards,

                  Nico

                  • 6. Re: Data Masking
                    Kamal Haria Active Member

                    Hi Nico

                     

                    Thanks a lot.

                     

                    Regards

                     

                    Kamal

                    • 7. Re: Data Masking
                      Nico Heinze Guru

                      Hmm, writing "Thanks a lot" could mean quite a few different things.

                      It could mean, "Thanks a lot, that answered my question."

                      It could also mean, "Thanks a lot, you puzzled me so heavily that I've given up trying to understand what you were writing about."

                      Or anything in between these two extremes.

                       

                      Which one was it?

                       

                      Regards,

                      Nico

                      • 8. Re: Data Masking
                        Kamal Haria Active Member

                        Ha Ha Ha. It means you have answered my question and have given me something to think about: an alternative way to approach this issue.

                         

                        Thanks once again

                         

                        Regards

                         

                        Kamal