4 Replies Latest reply on May 13, 2020 2:27 AM by Ashutosh Jumde

    Source File Format Validations in IDQ

    Ashutosh Jumde Active Member

      Can we perform source file format validations on a daily cadence using the IDQ tool? This includes the following:

      1. Format of the file, e.g. txt, csv, xlsx, etc.
      2. Encoding of the file, i.e. whether it matches the expected encoding
      3. Number of columns in the header, i.e. the first row of the file must match the expected set of columns
      4. Sequence of the columns in the header, as per requirement
      5. Whether the Excel source file contains multiple tabs with the required names
      6. Whether the source file is non-empty, i.e. at least 1 data row is present in the source file

       

      As part of a Data Quality framework for one of our clients, we need to validate the source file format as well. There have been multiple incidents where the format of the file was not correct and caused a failure at the ETL layer.

      Can the IDQ tool be used for such validations as well?

        • 1. Re: Source File Format Validations in IDQ
          user126898 Guru

          Can IDQ do this? Yes.

          Can IDQ do this out of the box? No.

          Can IDQ do this with a great deal of setup and configuration? Yes.

           

          IDQ is more of a field-level validation tool than a file-level one.  You would have to use a lot of scripts to perform the checks you are looking for.

           

          Informatica Data Exchange was designed for this kind of file-level checking.  I would suggest looking at DX, or maybe even the newer IICS platform (Cloud Integration Hub or Cloud Application Integration), to see if they can solve this.

           

           

          thanks,

          Scott

          • 2. Re: Source File Format Validations in IDQ
            Ashutosh Jumde Active Member

            Thanks for the reply Scott,

             

            So there is no out-of-the-box capability in IDQ for such file-level validations. This means IDQ does not have any transformation that can do this.

            And if we want to achieve this, we would need to write custom code (a shell script, a batch script, or code in any other language) and execute it through IDQ.

            Please correct me if I am wrong.

             

            As part of a comprehensive Data Quality framework, we would like to have this capability of validating the source file format.

            You mentioned another product, Informatica Data Exchange. Can that be integrated with the IDQ pipelines/mappings/workflows?

            • 3. Re: Source File Format Validations in IDQ
              user126898 Guru

              In DQ you could do 3, 4 and 6.  There is no out-of-the-box transformation for this, but it can be done with the transformations available.
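
              For comparison, if checks 3, 4 and 6 were scripted rather than built from IDQ transformations, a minimal Python sketch might look like the one below. It assumes a delimited text file with a single header row; the function name, its parameters, and the messages are hypothetical, and nothing here is an IDQ API.

```python
import csv


def check_header_and_rows(path, expected_columns, delimiter=",", encoding="utf-8"):
    """Return a list of problems found; an empty list means the file passed."""
    problems = []
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, None)
        if header is None:
            # A completely empty file has neither header nor data.
            return ["file has no header row"]
        header = [name.strip() for name in header]
        if len(header) != len(expected_columns):
            # Check 3: number of columns in the header.
            problems.append(
                f"expected {len(expected_columns)} columns, found {len(header)}"
            )
        elif header != list(expected_columns):
            # Check 4: same count, but names or sequence differ.
            problems.append("column names or order differ from the expected header")
        # Check 6: at least one non-blank data row after the header.
        if not any(any(cell.strip() for cell in row) for row in reader):
            problems.append("no data rows after the header")
    return problems
```

              Returning a list of problems instead of a single true/false makes it easier to log every reason a file was rejected in one pass.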

               

              1 & 2 could be done via scripts.
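
              As a rough illustration of scripting 1 and 2, here is a minimal Python sketch using only the standard library. The function names, the allowed extensions, and the default encoding are assumptions for the example, not anything IDQ provides.

```python
from pathlib import Path


def check_extension(path, allowed=(".csv", ".txt")):
    """Check 1: the file name ends with one of the allowed extensions."""
    return Path(path).suffix.lower() in allowed


def check_encoding(path, encoding="utf-8"):
    """Check 2: the whole file decodes cleanly with the expected encoding."""
    try:
        with open(path, "r", encoding=encoding, newline="") as handle:
            # Read in chunks so large files are not loaded into memory at once.
            while handle.read(65536):
                pass
    except UnicodeError:
        return False
    return True
```

              Note that a clean decode only shows the bytes are valid in that encoding, not that the file was actually written in it. Single-byte encodings such as Latin-1 accept any byte sequence, so this check is most useful for strict encodings like UTF-8.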

               

              6 you can script, but please note that IDQ cannot read Excel.
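
              Because IDQ cannot read Excel, any check on the workbook itself, such as the required tab names from item 5 of the original list, would have to run outside IDQ. A minimal Python sketch of that check follows. It relies on an .xlsx file being a ZIP archive whose `xl/workbook.xml` lists the sheets, and it assumes the common (transitional) SpreadsheetML namespace; the function names are hypothetical, and old binary .xls files are not handled.

```python
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by most .xlsx files; "strict" OOXML files use a different one.
NS = {"main": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def sheet_names(xlsx_path):
    """Return the tab names of an .xlsx workbook, in workbook order."""
    with zipfile.ZipFile(xlsx_path) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.findall("main:sheets/main:sheet", NS)]


def missing_sheets(xlsx_path, required):
    """Return the required tab names that are not present in the workbook."""
    present = set(sheet_names(xlsx_path))
    return [name for name in required if name not in present]
```

              An empty result from `missing_sheets` means every required tab is present.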

               

               

              Data Exchange is a standalone product and has its own interface.  Normally the flow is: DX processes the file and, if all the checks pass, kicks off the downstream process to load the file.  In this case that would be IDQ.
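
              The same gate-then-load pattern can be sketched in a few lines of Python for anyone scripting the checks themselves. This is only an illustration of the flow, not how DX is implemented; `run_gate`, the check names, and the `on_pass` callback are all hypothetical, and `on_pass` stands in for however the IDQ mapping or workflow is normally launched.

```python
def run_gate(path, checks, on_pass):
    """Run every (name, check) pair against the file; launch downstream only if all pass.

    Each check is a callable that takes the file path and returns True or False.
    Returns the names of the failed checks (empty if the file was accepted).
    """
    failures = [name for name, check in checks if not check(path)]
    if not failures:
        # All file-level checks passed, so hand the file to the downstream load.
        on_pass(path)
    return failures
```

              Running all checks before deciding, rather than stopping at the first failure, gives the file's supplier the complete list of problems in one rejection.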

               

               

              thanks,

              Scott

              • 4. Re: Source File Format Validations in IDQ
                Ashutosh Jumde Active Member

                Thanks Scott.

                This helps a lot.