Please Explain Character Encoding

Question asked by DarrellFlenniken9941 on Mar 9, 2017
I'm trying to figure out how character encoding works in Boomi. I have a process that only works if I set the atom -Dfile option to UTF-8. Unfortunately, that breaks other processes that rely on the default, windows-1252. The file that I'm processing is an XML file that contains ISO-Latin-1 characters.


After removing the -Dfile setting, I tried converting the encoding to UTF-8 with a process shape after the file is read, and I tried setting the encoding option on a map profile to UTF-8. Neither of these works. Downstream I get errors indicating illegal characters in other shapes (e.g. a process shape that attempts to combine XML documents).


Is there a complete and thorough, and perhaps instructive, guide to how Boomi handles character encoding? The user guide is very terse. There's a lot of information on the forum on character encoding, but it's disjointed, not comprehensive, and not instructive.


What magic is the -Dfile option doing? How is it different from using explicit conversion with a process shape? Is Java behind the scenes messing with me?
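
To make the question concrete, here is a minimal plain-Java sketch of the behavior I suspect is involved. It assumes the -Dfile option refers to the JVM's file.encoding system property, which typically determines the charset Java uses whenever code decodes bytes without naming one. The class and method names are made up for illustration, and whether any Boomi shape actually decodes this way is a guess on my part.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not Boomi code.
public class DefaultCharsetSketch {

    // Decode raw bytes with an explicitly named charset
    // instead of relying on the JVM default.
    public static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute, encoded as ISO-8859-1:
        // the accented letter is the single byte 0xE9.
        byte[] latin1Bytes = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        // The charset used when no charset is named (assumed to be what
        // the startup option changes).
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        // Decoding with the charset the bytes were written in round-trips cleanly.
        String asLatin1 = decode(latin1Bytes, StandardCharsets.ISO_8859_1);
        System.out.println("Latin-1 round trip ok: " + asLatin1.equals("caf\u00e9"));

        // Decoding the same bytes as UTF-8 fails on 0xE9, which is not a
        // complete UTF-8 sequence; Java substitutes U+FFFD for it.
        String asUtf8 = decode(latin1Bytes, StandardCharsets.UTF_8);
        System.out.println("UTF-8 decode damaged: " + asUtf8.contains("\uFFFD"));
    }
}
```

If something like this is happening, a later conversion step cannot repair text that was already decoded with the wrong charset, which might explain why converting after the read behaves differently from changing the default.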


To add to my misery, I have to check the lengths of XML elements that contain ISO-Latin characters and load the XML document into a SQL Server [2008 R2] XML type column. I made that work, but that's a journey down hack street.
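
Part of what makes the length check awkward is that "length" depends on what is counted. The sketch below is plain Java with hypothetical names, not my actual Boomi workaround; it only shows that the same element text has different sizes when measured in characters versus encoded bytes. Which measure a given length rule needs is an assumption to verify.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not Boomi code.
public class ElementLengthSketch {

    // Number of Unicode code points in the text.
    public static int charCount(String text) {
        return text.codePointCount(0, text.length());
    }

    // Number of bytes the text occupies in the given charset.
    public static int byteCount(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "Muller" with a u-umlaut (U+00FC) as the second letter.
        String value = "M\u00fcller";

        System.out.println("characters:      " + charCount(value));
        System.out.println("ISO-8859-1 bytes: " + byteCount(value, StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8 bytes:      " + byteCount(value, StandardCharsets.UTF_8));
    }
}
```

The accented letter is one byte in ISO-8859-1 but two bytes in UTF-8, so a byte-based check gives a different answer depending on the encoding in effect.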


Bottom line, everything works if -Dfile is set to UTF-8. Everything is sideways if it's the default.