11 Replies Latest reply on May 26, 2021 5:47 AM by Mani Ar

    How to read multi tagged elements from XML Source

    Mani Ar Seasoned Veteran

      Hi,

       

      I have the following XML source file (I just put a portion of my XML source file here and not the entire content due to size).

       

      I created my denormalized XML source from the raw XML file. I can read every element without any issues except the <link> tag below (shown in red font in the original post).

       

      When I tried the normalized option, I was not able to connect the two sources using a Joiner transformation.

       

      I am not sure what I am doing wrong here. Any help or suggestion would be highly appreciated.

       

      Thanks,

      Mani A

       

      <entry>

          <id>https://vegamour.com/products/4671084527731</id>

          <published>2021-05-18T08:21:05-07:00</published>

          <updated>2021-05-18T08:21:05-07:00</updated>

          <link rel="alternate" type="text/html" href="https://vegamour.com/products/gro-advanced-replenishing-conditioner"/>

          <title>GRO+ Advanced Replenishing Conditioner</title>

          <s:type>Hair</s:type>

          <s:vendor>VEGAMOUR</s:vendor>

          <summary type="html">

            <![CDATA[<table border="0">

        <tr>

          <td width="200"><img width="200" src="https://cdn.shopify.com/s/files/1/1336/0857/products/gro-advanced-conditioner-main_c18ccb4c-9a49-49c9-8570-594ea6f582ff.jpg?v=1621351528"></td>

          <td valign="bottom">

            <p>

       

       

              <strong>Vendor: </strong>VEGAMOUR<br>

              <strong>Type: </strong>Hair<br>

              <strong>Price: </strong>

                  58.00

            </p>

          </td>

        </tr>

        <tr>

          <td colspan="2"></td>

        </tr>

      </table>

      ]]>

          </summary>

        • 1. Re: How to read multi tagged elements from XML Source
          Syed Aziz Guru

          Hello Mani,

           

          If possible, please try posting a Fiddler trace. Here is the meaning of each part of the tag, according to the W3C:

          rel: Set of space-separated tokens.

          alternate: A type of hyperlink that gives alternate representations of the current document.

          type: A valid MIME type string. Gives the MIME type of the linked resource (the destination of the hyperlink).

           

          So basically, the tag (<link rel="alternate" type="text/html" href="...">) gives a reference to an alternate location, probably showing the same site with slight modifications.
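
          For readers who can inspect the feed outside PowerCenter: the URL is stored in the href attribute of <link>, not in its element text, so it has to be read as an attribute. A minimal Python sketch of that idea follows. The enclosing <feed> element and the Atom namespace URI are assumptions (the post shows only a fragment), and link_hrefs is a hypothetical helper name, not part of any Informatica tool.

```python
import xml.etree.ElementTree as ET

# Assumption: the feed uses the standard Atom namespace; the forum post
# only shows a fragment, so the <feed> wrapper here is illustrative.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>https://vegamour.com/products/4671084527731</id>
    <link rel="alternate" type="text/html" href="https://vegamour.com/products/gro-advanced-replenishing-conditioner"/>
    <title>GRO+ Advanced Replenishing Conditioner</title>
  </entry>
</feed>"""


def link_hrefs(xml_text):
    """Return the href attribute of every <link> inside an <entry>."""
    root = ET.fromstring(xml_text)
    # <link .../> has no text content; the URL is an attribute value.
    return [
        link.get("href")
        for link in root.iterfind(".//atom:entry/atom:link", ATOM_NS)
    ]


if __name__ == "__main__":
    for href in link_hrefs(SAMPLE):
        print(href)
```

          In PowerCenter terms, this corresponds to mapping the attribute of the element rather than the element value.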

           

          Best regards,

          Syed

           

          • 2. Re: How to read multi tagged elements from XML Source
            Mani Ar Seasoned Veteran

            Thank you Syed!

             

            I am not sure what a Fiddler trace is; I have never heard of it. I am not a web person, and all I know is Informatica. I just need that href value (i.e. the link) to be read in my mapping.

             

            Thanks,

            Mani A

            • 3. Re: How to read multi tagged elements from XML Source
              Syed Aziz Guru

              Hello Mani,

               

              Fiddler Trace is helpful for HTTP traffic debugging.  Here is the related KB 532750:

              "HOW TO: Capture an HTTP trace from a browser and later analyze it" Support

               

              Best regards,

              Syed

              • 4. Re: How to read multi tagged elements from XML Source
                Mani Ar Seasoned Veteran

                Thank you Syed!

                 

                I have not gone that far yet, because I would expect the XML source mapping to pull this value (https://vegamour.com/products/gro-advanced-replenishing-conditioner) like any other element.

                Informatica is not very friendly when it comes to JSON or complex XML feeds. I have a lot of experience dealing with XML and JSON in Informatica, and I ended up using Python instead.

                 

                It is good for traditional formats and sources such as CSV, TSV, and relational databases, but not for modern file formats.

                 

                Thanks,

                Mani A

                • 5. Re: How to read multi tagged elements from XML Source
                  user126898 Guru

                  Mani,

                  You are correct that PowerCenter is not good at handling XML and JSON, but that is true of PowerCenter, not of Informatica in general. In the Informatica on-prem world, we introduced Data Transformation as part of the data quality/data engineering stack to provide the ability to parse complex file types and industry formats. You could in turn use these parsers inside PowerCenter to enhance the original functionality.

                   

                  Informatica's Cloud Data Integration, which has replaced PowerCenter, now has built-in capabilities to natively parse modern file formats like JSON, Avro, Parquet, and ORC, and it is much easier to use.

                   

                  Thanks,

                  Scott

                  • 6. Re: How to read multi tagged elements from XML Source
                    Mani Ar Seasoned Veteran

                    Thank you Scott!

                     

                    Unfortunately, 90% of our operations are done with the on-prem PowerCenter suite, and we have never had to use the Data Transformation package for our purposes.

                     

                    We tried ICS, and it is very costly for us compared with what PowerCenter offers us right now for the cost.

                     

                    Thanks

                    • 7. Re: How to read multi tagged elements from XML Source
                      user126898 Guru

                      Interesting. I am surprised by your cost comment. Normally people move to IICS and pay about the same for the yearly subscription as they do for the yearly PowerCenter maintenance. It is even better now that Informatica is on the IPU model (pay for what you use), and there is even a migration tool to convert all objects over to IICS.

                       

                      There have been many improvements since you last looked at the cloud offering, given that you called it "ICS".

                      • 8. Re: How to read multi tagged elements from XML Source
                        Nico Heinze Guru

                        Agreeing that IICS has come a long way since the ICS days, no question.

                        Nonetheless, migrating from PowerCenter to IICS can become very difficult. Not all PowerCenter functionality is available in IICS. On average, 90-95% of PowerCenter objects can be migrated to IICS; the other parts need manual intervention.

                        Now, when looking at an "average" customer having 5,000 workflows in PowerCenter which run on a daily or monthly basis or so, that means that on average 250 workflows (5% of 5,000) cannot be migrated to IICS.

                        This means that for these 250 workflows an analysis must be performed of what exactly they do (let's face facts, documentation in most cases is useless or at least incomplete). Then it must be decided whether the missing parts in IICS can be substituted by an easy workaround or whether a redesign of these workflows is necessary. And changes in an infrastructure as fragile as many PowerCenter workflows are (for example, at my current customer there are 1000s of workflows which consist of at least two standard pre- and post-processing worklets and much other stuff) can incur changes in all other parts of the infrastructure as well.

                         

                        From a technical point of view, a conversion rate of 95% is not bad.

                        From a business-process perspective, the remaining 5% can account for 95% of all the trouble and migration costs.

                         

                        It would be better if Informatica decided to provide the missing functionality; this would help convince more customers to move to IICS.

                         

                        And still there's one point which many customers will never negotiate about: IICS needs (please correct me if I'm wrong here!) a permanent internet connection from each IICS machine. And you can trust me, many financial service providers will not allow any server in their PROD environment to connect to the internet. Period. That's a fact.

                        What can those customers do if they are willing to try IICS?

                         

                        And yes, I know that this last question in particular is not your area of discussion, Scott, that's something for Product Management.

                         

                        Regards,

                        Nico

                        • 9. Re: How to read multi tagged elements from XML Source
                          Mani Ar Seasoned Veteran

                          Thank you Nico!

                           

                          I agree with your thoughts and comments. We had a demo this February, and considering the volume of changes and the cost involved, we are very happy with the existing PowerCenter tools. Some of our more modern clients send their data in JSON format, and we need to rely on other tools to convert it into something usable in PowerCenter.

                           

                          All the traditional data formats work great in PowerCenter.

                           

                          Regarding my question:

                           

                          Do you know what I am missing in reading those key elements? All other simple tags are read properly except this one:

                           

                              <link rel="alternate" type="text/html" href="https://vegamour.com/products/gro-advanced-replenishing-conditioner"/>

                           

                          Thanks,

                          Mani A

                          • 10. Re: How to read multi tagged elements from XML Source
                            Nico Heinze Guru

                            Unfortunately, that's not my area of expertise; XML handling is pretty much a mystery to me. Sorry that I can't help here.

                             

                            Regards,

                            Nico

                            • 11. Re: How to read multi tagged elements from XML Source
                              Mani Ar Seasoned Veteran

                              No problem Nico!

                               

                              I found a workaround for this and got it to work as expected.
                              In my source XML, I replaced the string

                               

                                  '<link rel="alternate" type="text/html" href="' with '<link>' and replaced '"/>' with '</link>',

                              so that it appears as a normal tag with the URL as its text.

                              It worked out for me.

                               

                              thanks
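
                              The workaround above can be sketched as a small Python preprocessing step run before the file reaches PowerCenter. This is only an illustration under assumptions: flatten_links and LINK_PATTERN are hypothetical names, and a single regular expression is swapped in for the two separate string replacements so that other self-closing tags ending in "/> are left untouched.

```python
import re

# Matches exactly the self-closing link tag shown in the thread and captures
# the href value. Assumption: attributes always appear in this order.
LINK_PATTERN = re.compile(
    r'<link rel="alternate" type="text/html" href="([^"]*)"\s*/>'
)


def flatten_links(xml_text):
    """Rewrite <link ... href="URL"/> as <link>URL</link>."""
    return LINK_PATTERN.sub(r"<link>\1</link>", xml_text)


if __name__ == "__main__":
    line = (
        '<link rel="alternate" type="text/html" '
        'href="https://vegamour.com/products/gro-advanced-replenishing-conditioner"/>'
    )
    print(flatten_links(line))
```

                              After this step the URL is ordinary element text, which is why the XML source can read it like any other simple tag.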