    Parse fixed width file containing multiple record layouts

    hayes5736 Active Member

      I am new to DEI and not sure if what I want to do is possible.


      I have a scenario where I receive a fixed-width text file. The file contains a header record, a trailer record, and records in 12 different fixed-width formats.


      All 12 formats have a record type value in the same positions, so I can consistently identify each of the 12 types. From one record type to the next, the number of fields and the field names differ.


      I am wondering if there is a way to parse this file with DEI, as opposed to creating a pre-processing step that splits the file into the 12 record types.
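For readers weighing the pre-processing alternative mentioned above, here is a minimal Python sketch of such a split step. It is not DEI functionality. It assumes, purely for illustration, that the record type sits in the first two characters of each line and that header and trailer rows start with the hypothetical codes "HD" and "TR"; the real positions and codes would come from the file specification.

```python
from collections import defaultdict

# Assumption: the record type occupies characters 0-1 of every line.
TYPE_START, TYPE_END = 0, 2


def split_by_record_type(lines, type_start=TYPE_START, type_end=TYPE_END):
    """Group raw fixed-width lines by the code found at a fixed slice."""
    groups = defaultdict(list)
    for line in lines:
        record = line.rstrip("\r\n")
        if not record:
            continue  # skip blank lines
        groups[record[type_start:type_end]].append(record)
    return dict(groups)


# Hypothetical sample data: a header, two record types, and a trailer.
sample = [
    "HD20240101\n",
    "01" + "SMITH".ljust(10) + "00042\n",
    "02" + "WIDGET".ljust(8) + "0001999\n",
    "01" + "JONES".ljust(10) + "00007\n",
    "TR0000003\n",
]

groups = split_by_record_type(sample)
for record_type in sorted(groups):
    print(record_type, len(groups[record_type]))
```

Each group could then be written to its own file and read with a single-layout flat file definition.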


      I'm coming from a PowerCenter environment and what I've learned so far in DEI is incredible. The functionality in DEI is amazing compared to PowerCenter Designer, so I am really hoping this new tool can solve my problem.


      Thank you.

          Krishnan Sreekandath Seasoned Veteran

          Hello Hayes,


          Can you please tell me whether you are getting the multiple-record file from a mainframe? If yes, then I think it would be easy to use the PWX NRDB reader with a datamap that has all 12 record type layouts plus the header and trailer, and write the output to HDFS/Hive.


          A subsequent job in Hadoop pushdown mode (or Native) can then process that data in HDFS/Hive, if needed. Please note that NRDB sources cannot be used in Hadoop pushdown mode.


          If not, I am afraid there may be no easy way to parse the different record types, either natively or using just the PDO. The logic to identify and separate the 12 record types would have to be built into the mapping itself.


            hayes5736 Active Member

            I am getting the file from an external source, not an internal mainframe source.


            I can consistently identify the record type in the same positions for every row that is not a header or footer.


            Is there a way to dynamically route each record type through a different pipeline and somehow apply a control file to each pipeline?


            I know I can manually code a router transformation with 12 groups. From there, can I somehow apply the record format using some sort of control file? I would prefer not to substring all of the fields for the 12 different formats.
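To make the control-file idea concrete, here is a rough Python sketch of a layout-driven parser. It does not show a DEI feature, only the general technique. The control file format (record_type, field_name, start, length, with 0-based offsets), the field names, and the two-character type code are all assumptions made for the example.

```python
import csv
import io

# Hypothetical control file: one row per field, 0-based start and length.
CONTROL_FILE = """record_type,field_name,start,length
01,record_type,0,2
01,last_name,2,10
01,quantity,12,5
02,record_type,0,2
02,product,2,8
02,price_cents,10,7
"""


def load_layouts(control_text):
    """Build {record_type: [(field_name, start, length), ...]}."""
    layouts = {}
    for row in csv.DictReader(io.StringIO(control_text)):
        layouts.setdefault(row["record_type"], []).append(
            (row["field_name"], int(row["start"]), int(row["length"]))
        )
    return layouts


def parse_record(line, layouts, type_slice=slice(0, 2)):
    """Slice one line using the layout for its record type.

    Returns None when no layout is defined (e.g. header or trailer rows).
    """
    record = line.rstrip("\r\n")
    layout = layouts.get(record[type_slice])
    if layout is None:
        return None
    return {
        name: record[start:start + length].strip()
        for name, start, length in layout
    }


layouts = load_layouts(CONTROL_FILE)
parsed = parse_record("01" + "SMITH".ljust(10) + "00042", layouts)
print(parsed)
```

With this approach, adding or changing a record format means editing the control file rather than hand-coding substring expressions for each of the 12 formats.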


            I'm also wondering if I could use a parser or data parser transformation. These transformations are new to me and I don't understand them yet, so I'm not sure whether they would work for my use case.