4 Replies Latest reply on Dec 9, 2019 12:33 AM by Vladislav Ponomarenko

    convert rows to column based on multiple key columns

    inuser494112 New Member

      Hi,

       

      I have table which looks like this

       

      Model    Year      Code      Part

      IP           2018     300        Gasket

      IP           2018     300        Pot

      IP           2018     300        Lid

      IP           2018     301        Pot

      IP           2018     301        Lid

      IP           2018     301       Gasket

      IP           2018     301       Basket

       

      Based on Model, Year and Code, I would need to concatenate the parts. The rows values can change.

       

      Output should look like this:

      Model    Year      Code      Parts

      IP           2018     300        Gasket,Pot,Lid

      IP           2018     301        Pot,Lid,Gasket,Basket

       

      I believe aggregator has to be used and have seen up solutions using 1 key. In this case, i need to convert rows to columns (separated by ',') based on 3 columns. Can someone help me?

        • 1. Re: convert rows to column based on multiple key columns
          Nico Heinze Guru

          Not really difficult.

          You should use an AGG, that's the easiest approach.

          Nevertheless you need to use a "trick". It's not sufficient to simply group by model, year, and code; you also have to keep track of these three values using variable ports, otherwise you'll have trouble to set up either an initial string for each combination of these three values or to append the current "Part" to the string assembled so far.

           

          Here's a short description:

           

          Set up your AGG to group by Model, Year, and Code. This will ensure that for each combination of these values you get exactly one output record.

           

          Set up three variable ports v_prev_model, v_prev_year, and v_prev_code of the same data types as Model, Year, and Code. Make sure these three variable ports are located at the lower end of the AGG's port list.

           

          Assign the values of the input ports MODEL, YEAR, and CODE to these three variable ports.

           

          Above these three ports, create another variable port "v_is_new" of type Integer with this expression:

          MODEL != v_prev_model OR YEAR != v_prev_year OR CODE != v_prev_code

           

          Immediately below v_is_new, create another variable port v_result of type String (make it long enough ) with this expression term:

            IIF( v_is_new, '', v_result || ',') || PART

          This will make sure that whenever a new model, year, or code is encountered the v_result string is initialised to the current PART only; in all other cases (as long as all three values equal their values from the previous record) the current PART is appended to v_result (plus comma).

           

          Finally set up an output port of type String with this expression term:

            v_result

           

           

          Regards,

          Nico

          • 2. Re: convert rows to column based on multiple key columns
            inuser494112 New Member

            Thanks for the solution. I cannot use state variables because it doesn't run on spark mode. Is there any other alternate workaround.?

            • 3. Re: convert rows to column based on multiple key columns
              Nico Heinze Guru

              Oh my, my solution is for PowerCenter. It should work on the Informatica platform as well, but very likely not on any Hadoop machine, only on the Informatica server itself (and some optimisations must be turned off for that mapping, if memory serves me right).

               

              Regards,

              Nico

              • 4. Re: convert rows to column based on multiple key columns
                Vladislav Ponomarenko Guru

                Hi,

                 

                BDM 10.2.2 supported stateful vars now but in Spark Mode only.

                In addition, when I worked with BDM 9.6.x some years ago I used Java TX in Hive execution mode and it's inner variables instead of variable ports of Expression TX.

                 

                Regards, Vlad.