2 Replies Latest reply on May 20, 2020 7:08 AM by Fabiano Seki

    Remove invalid character from Redshift source

    Fabiano Seki New Member

      I have a map with a Redshift source and Hive/S3 target that has some rows with some symbols within the text and I believe that's causing the entire row to be discarded.

       

      I tried to use TRIM(regexp_replace(troublesome_column,'[^a-zA-Z0-9\s]' ,' ')) to remove these symbols and this works on a SELECT statement at a SQL Client. However, when trying to use it wether in the 'SQL Query' in the Advanced tab of the Read Data Object operation or in the filter option in the Query tab, I have inconsistent results and none of them working at all.

       

      Without any treatment, the rows with these symbols get rejected..

       

      Does anyone have gone through this issue or have some ideas on how to solve this?

      Using Big Data Developer 10.2.2, running in spark mode

       

      Example of the value inside of troublesome_column:

       

      Regards,

      Fabiano Seki