XML error: Invalid null character in text to output

Document created by mike_c_frazier on Dec 10, 2012. Last modified by Adam Arrowsmith on Jun 23, 2017 (version 2).

Issue

A process returns the following error when attempting to parse an XML document:

 

First document failure: Unable to store data, error copying stream.; Caused by: Failed generating xml document; Caused by: Invalid null character in text to output

Cause

Although this error can have several causes, a common one is that the XML document is encoded in a character set that the atom does not recognize.
The atom has its own encoding, most commonly UTF-8 or windows-1252 (ANSI).

 

If a document is in another encoding, it could produce the above error. This can be verified by viewing the document data and searching for unusual characters.
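
One way a mismatched encoding leads to this particular error: UTF-16 stores each ASCII character as two bytes, one of which is 0x00. If such data is read as UTF-8 or as a single-byte encoding, every 0x00 byte becomes a null character, which XML does not allow. The Java sketch below checks raw bytes for a byte order mark and for null bytes. The class and method names are hypothetical and are not part of the atom, and only the UTF-8 and UTF-16 marks are recognized, so treat the result as a hint rather than a definitive answer.

```java
import java.nio.charset.StandardCharsets;

public class NullCharCheck {

    /** Returns the index of the first 0x00 byte, or -1 if there is none. */
    public static int firstNullByte(byte[] data) {
        for (int i = 0; i < data.length; i++) {
            if (data[i] == 0) {
                return i;
            }
        }
        return -1;
    }

    /** Guesses an encoding from a leading byte order mark; returns null if no known mark is found. */
    public static String bomHint(byte[] d) {
        if (d.length >= 3 && (d[0] & 0xFF) == 0xEF && (d[1] & 0xFF) == 0xBB && (d[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (d.length >= 2 && (d[0] & 0xFF) == 0xFF && (d[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        if (d.length >= 2 && (d[0] & 0xFF) == 0xFE && (d[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical sample: the text "<a/>" stored as little-endian UTF-16 with a byte order mark.
        byte[] sample = {(byte) 0xFF, (byte) 0xFE, '<', 0, 'a', 0, '/', 0, '>', 0};

        System.out.println("BOM hint: " + bomHint(sample));
        System.out.println("First null byte at index: " + firstNullByte(sample));

        // Reading the same bytes as UTF-8 leaves null characters in the text.
        String misread = new String(sample, StandardCharsets.UTF_8);
        System.out.println("Null character after misreading: " + (misread.indexOf('\0') >= 0));
    }
}
```

If the check reports a UTF-16 mark or null bytes in a document that should be plain text, the source is probably not in the encoding the atom expects.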

 

Solution

Try to identify the encoding of the document. If it comes from an external source, contact the owner of that source or research which encoding the source uses.
Once the encoding is identified, add a Data Process Decode step and set its Character Set to the document's encoding.

 

The following link lists the character encodings supported by Java: Supported Encodings. Depending on the source data, you should be able to decode these character sets by specifying the Canonical Name in the Character Set field of the Data Process Decode step.
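
Before entering a name in the Character Set field, it can help to confirm that a Java runtime resolves it. The sketch below is a hypothetical standalone helper, not part of the Data Process step. It asks the local JVM whether each name is supported. The set of available charsets depends on the JVM, so results on a workstation may differ from those on the atom.

```java
import java.nio.charset.Charset;

public class CharsetLookup {

    /** Returns true if this JVM can resolve the given charset name or alias. */
    public static boolean isKnown(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for null or syntactically illegal names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Example names only; replace them with the name given by the document's source.
        String[] names = {"UTF-8", "windows-1252", "UnicodeLittle", "no-such-charset-xyz"};
        for (String name : names) {
            if (isKnown(name)) {
                // Charset.name() reports the name this JVM uses internally for the charset.
                System.out.println(name + " is supported, resolved as " + Charset.forName(name).name());
            } else {
                System.out.println(name + " is not supported on this JVM");
            }
        }
    }
}
```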

 

For example, if the source file is encoded as little-endian UTF-16 with a byte order mark, specify "UnicodeLittle" as the Character Set in the Data Process step. By default, it will decode from this character set to UTF-8.
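
The same conversion can be sketched in plain Java to show what the decode amounts to: interpret the bytes with the source character set, then write them back out as UTF-8. This is an illustration only, with hypothetical class and method names. It does not show how the Data Process step is implemented, and it assumes the JVM resolves the "UnicodeLittle" name.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Transcode {

    /** Decodes the bytes with the named source charset and re-encodes them as UTF-8. */
    public static byte[] toUtf8(byte[] source, String sourceCharset) {
        String text = new String(source, Charset.forName(sourceCharset));
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample: the text "<a/>" stored as little-endian UTF-16 with a byte order mark.
        byte[] sample = {(byte) 0xFF, (byte) 0xFE, '<', 0, 'a', 0, '/', 0, '>', 0};

        byte[] utf8 = toUtf8(sample, "UnicodeLittle");

        // The converted bytes contain no 0x00 values and no byte order mark.
        System.out.println(new String(utf8, StandardCharsets.UTF_8));
        System.out.println("Length in bytes: " + utf8.length);
    }
}
```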

 

See also How to design a process to handle non UTF-8 characters.
