5 Replies Latest reply on Nov 24, 2020 9:13 AM by Nico Heinze

    Flat files with different multiple headers and footers

    Subrahmanyam Yenumala New Member

      I have flat file in the below format with multiple headers and footers and I need to Mask/De-identify random columns.

       

      H Col1 Col2 Col3 Col5

      D XX1 XX2 XX1

      F 000001

      H Col2 Col1 Col3 Col4

      D XX1 XX2 XX2

      D XX1 XX2 XX1

      F 000002

      H Col4 Col3 Col1 Col5 Col2

      D XX1 XX2 XX1

      F 000001

       

      And in my target I need all the rows and columns.

      How to handle different multiple headers or unstructured headers for masking or load data into target flat files.

        • 1. Re: Flat files with different multiple headers and footers
          Nico Heinze Guru

          There are several different approaches to such requests, but in order to give good advice we will need more information from you.

          Can you please provide a complete example for the intended process, such as an input file and the expected output? Or several potential outputs? Plus, of course, the "rules" how to identify the values to extract.

           

          Regards,

          Nico

          • 2. Re: Flat files with different multiple headers and footers
            Subrahmanyam Yenumala New Member

            Hi, Here is the sample source and targets I am looking. Having number of different headers format in a single file. But I need to mask only few PHI columns like Names, Address and so on..

             

             

            H SourceCode, MemberID, FName, LName, SSN, Address, City

            D AX, 1234, Sam, Yen, 987654321, 3456 Fremont Dr, Fremont

            F 1

            H SourceCode, LName, FName, SSN, MemberID

            D AX, Yen, Sam, 987654321, 1234

            D ER, Kim, Lanx, 983747848, 2134

            F 2

             

             

             

             

             

             

            H SourceCode, MemberID, FName, LName, SSN, Address, City

            D AX, 1234, Robert, Antony, 123456789, 3456 Fremont Dr, Fremont

            F 1

            H SourceCode, LName, FName, SSN, MemberID

            D AX, Antony, Robert, 123456789, 1234

            D ER, Ram, Kams, 3434398494, 2134

            F 2

            • 3. Re: Flat files with different multiple headers and footers
              Nico Heinze Guru

              Do I understand correctly that you have to mask fields from the H and D records?

               

              In this case probably the easiest approach goes like this:

               

              To start with, you need a flat-file source definition (Delimited style) with one long string field. Set the delimiter character between the fields to \037 (yes, backslash followed by the digits 0, 3, and 7). This defines the delimiter character as ASCII 31, a control character (Unit Separator) which I haven't seen in use since at least the early 1980's (meaning no one will use it today).

               

              Now read the source file. Because the delimiter character does not exist in the file, each line is read as a single long string.

               

              In an Expression transformation, you will need to set up some variable ports.

              First you need to identify whether the current line is a H or D record. If not, simply forward the record, and be done.

              If the current record is a H or a D line, then you will need to cut out the respective name fields, SSN, and so on using the INSTR() function (to get the position of each comma) and the SUBSTR() function (to get the individual field contents).

              Then you can use whatever masking technique you have/want to apply based on these field contents.

               

              I know this is still a pretty rough outline, but I hope I could make clear how I would proceed here.

              If not, please ask for all the details you need.

               

              Regards,

              Nico

              • 4. Re: Flat files with different multiple headers and footers
                Subrahmanyam Yenumala New Member

                Thank you Nico for the response.

                 

                Instead of doing like this. Can we do it CSV->JSON->CSV? Once data moved to Json, then we can use Masking logic and again move to original file (CSV)? Please advice.

                • 5. Re: Flat files with different multiple headers and footers
                  Nico Heinze Guru

                  Out of the box not with PowerCenter , PWC has no built-in REST capabilities. Many people use a Java Transformation to parse and process JSON. That works, I just don't have any experience with this approach.

                   

                  Regards,

                  Nico