How to search and replace <?xml version='1.0' encoding='UTF-8'?> (XML declaration)

Document created by mike_c_frazier (Employee) on Apr 4, 2013. Last modified by Adam Arrowsmith on May 25, 2016.
Version 2
This article describes how to remove the XML declaration (for example, <?xml version='1.0' encoding='UTF-8'?>) from document data.

 

Note: This is typically only required for scenarios requiring advanced manual XML manipulation.

Use a Data Process step with a Search/Replace processing step and the following regular expression configuration:

 

Text to Find:  <\?xml.*\?>

Replace With: (no value/blank)

 


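Note: The question mark character is a reserved metacharacter in Java regular expressions and must be escaped with a backslash (\). The .* notation is a wildcard that matches any characters up to the closing ?> of the declaration. Because .* is greedy and does not cross line breaks by default, it matches up to the last ?> on the same line.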
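To check the expression outside of a process, it can be tried in plain Java. The following is a minimal sketch, not the Search/Replace step's actual implementation; the class name XmlDeclarationStripper and the method strip are hypothetical names chosen for illustration. Note that inside a Java string literal each backslash must itself be doubled.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class XmlDeclarationStripper {
    // Same expression as the "Text to Find" value: <\?xml.*\?>
    private static final Pattern XML_DECLARATION = Pattern.compile("<\\?xml.*\\?>");

    // Replaces every match with an empty string (the "Replace With" value).
    static String strip(String xml) {
        return XML_DECLARATION.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "<?xml version='1.0' encoding='UTF-8'?><root><child/></root>";
        System.out.println(strip(input)); // prints <root><child/></root>
    }
}
```

If the same line could contain another ?> after the declaration (for example, a processing instruction), the reluctant form <\?xml.*?\?> stops at the first ?> instead of the last.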

 

The full list of metacharacters is:

<([{\^-=$!|]})?*+.>

See http://docs.oracle.com/javase/tutorial/essential/regex/literals.html for more information.
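When the text to find is a fixed string rather than a pattern, standard Java offers Pattern.quote as an alternative to escaping each metacharacter by hand. The sketch below assumes the replacement is done in your own Java code; whether the Search/Replace step accepts the \Q...\E quoting that Pattern.quote produces has not been verified here. The class name LiteralRemover and the method removeLiteral are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class LiteralRemover {
    // Pattern.quote wraps the literal so metacharacters such as ? and . match themselves.
    static String removeLiteral(String text, String literal) {
        return text.replaceAll(Pattern.quote(literal), "");
    }

    public static void main(String[] args) {
        String declaration = "<?xml version='1.0' encoding='UTF-8'?>";
        System.out.println(removeLiteral(declaration + "<root/>", declaration)); // prints <root/>
    }
}
```

Unlike the wildcard expression, this removes only an exact match of the given declaration text, so a declaration with different attributes or quoting is left in place.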
