Please Explain Character Encoding

Question asked by DarrellFlenniken9941 on Mar 9, 2017
Latest reply on Sep 30, 2018 by Adam Arrowsmith

I'm trying to figure out how character encoding works in Boomi. I have a process that only works if I set the Atom's -Dfile option to UTF-8. Unfortunately, that breaks other processes that rely on the default, windows-1252. The file that I'm processing is an XML file that contains ISO-Latin-1 characters.


After removing the -Dfile setting, I tried two things: converting the encoding to UTF-8 with a process shape after the file is read, and setting the encoding option on a map profile to UTF-8. Neither of these works. Downstream I get errors indicating illegal characters in other shapes (e.g. a process shape that attempts to combine XML documents).


Is there a complete, thorough, and perhaps instructive guide to how Boomi handles character encoding? The user guide is very terse. There's a lot of information on the forum about character encoding, but it's disjointed, not comprehensive, and not instructive.


What magic is the -Dfile option doing? How is it different from using explicit conversion with a process shape? Is Java behind the scenes messing with me?
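For readers following along, here is a minimal sketch of the general mechanism that could produce these symptoms. It assumes that -Dfile refers to the JVM's `file.encoding` system property, which sets the default charset used when bytes are turned into text without an explicit encoding. It is not a description of Boomi's internals, and the sample text is hypothetical. Python is used only to show the byte-level effect.

```python
# Hypothetical sample: the text "café" as it would be stored in a UTF-8 file.
text = "café"
utf8_bytes = text.encode("utf-8")            # b'caf\xc3\xa9'

# Reading those bytes with a windows-1252 default yields mojibake:
# the two bytes of "é" become two separate characters.
misread = utf8_bytes.decode("cp1252")

# A "convert to UTF-8" step applied afterwards only encodes the
# already-wrong characters; it cannot recover the original text.
converted_late = misread.encode("utf-8")

# Decoding with the correct charset at read time preserves the text.
read_correctly = utf8_bytes.decode("utf-8")

# The reverse mismatch: a lone ISO-8859-1 byte for "é" is not valid UTF-8.
latin1_bytes = text.encode("iso-8859-1")     # b'caf\xe9'
try:
    latin1_bytes.decode("utf-8")
    latin1_as_utf8_ok = True
except UnicodeDecodeError:
    latin1_as_utf8_ok = False
```

If something like this is happening, it would explain why a conversion applied after the read does not help: by then the bytes have already been interpreted with the wrong charset.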


To add to my misery, I have to check the lengths of XML elements that contain ISO-Latin-1 characters and load the XML document into a SQL Server [2008 R2] XML type column. I made that work, but that was a journey down hack street.
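As an illustration of why length checks are sensitive to encoding, the sketch below compares character count with UTF-8 byte count for one element. The document, the element name `name`, and the limit `MAX_CHARS` are hypothetical and are not taken from the actual process.

```python
import xml.etree.ElementTree as ET

# Hypothetical document and length limit, for illustration only.
xml_text = '<?xml version="1.0" encoding="UTF-8"?><row><name>José Muñoz</name></row>'
xml_bytes = xml_text.encode("utf-8")
MAX_CHARS = 10

# Parsing from bytes lets the parser honor the declared encoding.
root = ET.fromstring(xml_bytes)
name = root.findtext("name")

char_len = len(name)                    # length in characters
utf8_len = len(name.encode("utf-8"))    # length in bytes once encoded as UTF-8
fits = char_len <= MAX_CHARS
```

Here the character count and the byte count differ because "é" and "ñ" each take two bytes in UTF-8, so a check that counts bytes, or that counts characters of mis-decoded text, can reject values that are actually within the limit.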


Bottom line: everything works if -Dfile is set to UTF-8. Everything goes sideways if it's left at the default.