It seems that the set of characters that cannot be written to a valid XML file is smaller than the set of characters matched by the regular expression [^[:print:]].
- the TAB character (U+0009) is matched by [^[:print:]], but it can be written to a valid XML file
- the SUB character (U+001A) cannot be written to a valid XML file, and it is matched by [^[:print:]]
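To make the difference concrete, here is a small Python sketch that checks both characters against the Char production of the XML 1.0 specification. The helper names are made up for illustration, and [^[:print:]] is approximated as "anything outside ASCII 0x20-0x7E", which is an assumption: the real behaviour depends on the regex engine and locale.

```python
def is_xml10_char(ch: str) -> bool:
    """True if ch is allowed by the XML 1.0 'Char' production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def matches_non_print(ch: str) -> bool:
    """Assumed stand-in for [^[:print:]]: everything outside ASCII 0x20-0x7E."""
    return not (0x20 <= ord(ch) <= 0x7E)


for ch in ("\t", "\x1a"):
    # Columns: code point, valid in XML 1.0, matched by the approximated class
    print(f"U+{ord(ch):04X}", is_xml10_char(ch), matches_non_print(ch))
# U+0009 True True
# U+001A False True
```

So TAB would be stripped by the regular expression although XML accepts it, while SUB is rejected by both.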
As long as you can identify a "safe set" of characters, there is a way. A little complicated at first glance, but it works like a charm.
Let's assume you have an EXP (Expression transformation) where you set up a variable port v_valid_chars of type String. This variable port must contain all valid characters that can be written to an XML target file. Heavily simplified, this character list contains all letters A-Z and a-z, the digits 0-9, and most of the ASCII special characters (like comma, dot, and so on).
Now you have some input port in_Value which may or may not contain "dirty" characters which you want to remove. Do it this way:
Below v_valid_chars, set up a variable port v_dirty of type String with this expression:
ReplaceChr( 1, in_Value, v_valid_chars, '')
Now set up an output port out_Value of type String with this expression:
ReplaceChr( 1, in_Value, v_dirty, '')
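The first call removes every valid character from the input, so the variable port v_dirty is left holding only the "dirty" characters of the current input string.
The second call then removes exactly those characters from the input, so the output port delivers the input value without the dirty characters.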
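If you want to try the logic outside of the mapping, the two-pass trick can be sketched in Python. This is only an illustration of the idea, not Informatica code: remove_chars is a made-up helper that mimics ReplaceChr with an empty replacement, and the content of V_VALID_CHARS is a deliberately small, hypothetical safe set.

```python
import string


def remove_chars(text: str, chars: str) -> str:
    """Drop every character of `chars` from `text` (case-sensitive),
    similar in spirit to ReplaceChr(1, text, chars, '')."""
    return text.translate({ord(c): None for c in chars})


# Hypothetical safe set: letters, digits, blank and a few special characters.
V_VALID_CHARS = string.ascii_letters + string.digits + " .,;:-_"


def clean(in_value: str) -> str:
    # Pass 1: strip all valid characters; only the dirty ones remain.
    v_dirty = remove_chars(in_value, V_VALID_CHARS)
    # Pass 2: strip exactly those dirty characters from the original input.
    return remove_chars(in_value, v_dirty)


print(clean("Hello,\x1a World\x00 42"))  # Hello, World 42
```

The nice part of this design is that you never have to enumerate the bad characters; you only maintain the list of good ones.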
Once more thanks a million to Kieran for this great trick. You were and are a source of inspiration.