I've been bitten by a possibly related issue recently. I'm trying to do almost the same thing as you - cut down a CLOB column to something that can be stored in a VARCHAR2(4000) - but in my case I have a VARCHAR2(4000 CHAR) column in the database and my connection is set up to use CHAR semantics. Nevertheless, I've found that sending anything over 4000 bytes fails with a "value too large" error, even if it is fewer than 4000 characters.
Significantly smaller columns work correctly, but I still hit the same issue at 3999, 3990, and so on; I suspect the DB client drivers only permit a maximum of 4000 bytes for a VARCHAR2, even when operating with CHAR semantics. Or, given your experience, perhaps only 3999 bytes (maybe it's 4000 total: 3999 bytes of string data plus one terminator).
In my case I had to truncate the string short enough to account for as many multi-byte characters as we would reasonably expect (the database being in UTF-8), which is both unnecessarily restrictive in most cases and fragile in problematic cases. Since you're operating in byte semantics, it looks like you only have to limit it to 3999 - console yourself with the thought that it could be worse!
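For what it's worth, here is a minimal sketch (in Python, assuming UTF-8 like our database) of the byte-budget truncation I mean: cut the string at a byte limit without splitting a multi-byte character, rather than guessing a "safe" character count up front.

```python
def truncate_utf8_bytes(s: str, max_bytes: int) -> str:
    """Truncate s so its UTF-8 encoding fits in max_bytes,
    without splitting a multi-byte character at the cut."""
    encoded = s.encode("utf-8")
    if len(encoded) <= max_bytes:
        return s
    # Decoding the byte prefix with errors="ignore" drops any
    # partial character left dangling at the cut point.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")

# "é" is 2 bytes in UTF-8, so 2001 of them is 4002 bytes.
s = "\u00e9" * 2001
t = truncate_utf8_bytes(s, 3999)
assert len(t.encode("utf-8")) <= 3999  # fits the byte budget
assert len(t) == 1999                  # cut lands on a character boundary
```

Something like this sidesteps the "assume the worst-case bytes-per-character" guesswork, though of course it has to run where you can see the actual bytes, not inside the mapping tool.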
Hi both, I've had this issue in the past whilst importing from Oracle. Leave the source as CLOB, look in the sq_qual and choose the 'text 4000' datatype. Next, add an expression and substring the field in chunks of 4000.
You'll need to test the actual maximum amount of data stored in that field (rather than the column's theoretical maximum) with max(length(field_clob)), in order to know how many substr calls you will need.
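To illustrate the arithmetic: once max(length(field_clob)) tells you the longest value, the number of pieces is just a ceiling division, and each piece is a SUBSTR at the next 4000-character offset. A quick sketch (Python, with hypothetical names; the generated expressions are plain Oracle SUBSTR calls):

```python
import math

CHUNK = 4000  # size of each target VARCHAR2 piece

def chunk_count(max_clob_length: int, chunk: int = CHUNK) -> int:
    """How many substr pieces are needed to cover the longest CLOB."""
    return math.ceil(max_clob_length / chunk)

def substr_expressions(field: str, max_clob_length: int, chunk: int = CHUNK):
    """Generate the SUBSTR(field, start, length) expressions, one per piece.
    Oracle SUBSTR positions are 1-based."""
    return [
        f"SUBSTR({field}, {i * chunk + 1}, {chunk})"
        for i in range(chunk_count(max_clob_length, chunk))
    ]

# e.g. if max(length(field_clob)) came back as 9500, three pieces cover it:
exprs = substr_expressions("field_clob", 9500)
assert len(exprs) == 3
assert exprs[0] == "SUBSTR(field_clob, 1, 4000)"
assert exprs[2] == "SUBSTR(field_clob, 8001, 4000)"
```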
Thanks, but I believe you misunderstand the problem. The issue is that under some circumstances, sending a 4000 byte (resp. character) string to an Oracle VARCHAR2(4000) column fails with a "value too large" error, even when the string is confirmed to fit in 4000 bytes (resp. characters).
In other words, cutting up the CLOB isn't the problem; it's storing the pieces.
Thanks for the replies, everyone. And sorry about hijacking this thread; this probably should have been a separate conversation.
I've considered doing that, and I probably would have tried it had I not found the "reduce the target column by one" workaround. In our case, we would have "sub-stringed" once and forgotten about the rest (we are purposefully truncating). However, as Sacha said, this wasn't the original problem, and I'm willing to bet that if I tried that approach, we'd still find situations where Informatica is sending too much data to the target.
Since my last post we've spent several additional hours testing just about everything we could think of. Running a large set of test data, we found repeatable situations where Informatica would send one (and only one) extra character to the target. We believe one of two things is going on:
- We are unaware of some setting that would properly sync up the byte vs. char discrepancy (if that is what's going on here)
- We have encountered a bug where, in certain situations, Informatica does not properly truncate data to the length/byte limit defined in the source qualifier and target object.
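If it is the byte-vs-char discrepancy, it's easy to construct a string that passes a character-length check but blows a byte limit, which is exactly the "one extra character" shape we're seeing. A quick demonstration (Python, assuming UTF-8):

```python
LIMIT = 4000

# One curly apostrophe (3 bytes in UTF-8) is enough to push a string that
# passes a character-semantics check past a byte-based limit.
s = "a" * 3998 + "\u2019"  # 3999 characters, 4001 bytes

assert len(s) <= LIMIT                 # passes a char-length check
assert len(s.encode("utf-8")) > LIMIT  # but exceeds the byte limit
```

If the tool truncates by character count while the driver or column enforces bytes, strings like this would fail intermittently, depending on the data, which matches the repeatable-but-data-dependent behaviour we observed.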