How to use a regular expression to match across lines

Document created by rich_patterson on Oct 22, 2014. Last modified by Adam Arrowsmith on Apr 3, 2017.

Use Case

You need to find or match a string value in a document, but the document contains multiple lines and the text you are looking for may span more than one of them. In that case the regular expression dot-star syntax ".*" does not find a match.

 

Two common scenarios are:

  • Using a Decision shape to check if a value exists anywhere in a document. In other words, comparing the Current Data to some value.
  • Using a Data Process Search/Replace step to replace any occurrences of a given string.

 

Solution

By default, the dot "." character matches any character EXCEPT new lines. However, you can set an embedded flag expression (a.k.a. mode modifier) at the beginning of the expression to change the matching behavior.

 

The flag "(?s)" enables the "dot-all" mode for the remainder of the expression. With this set, the dot "." character will match any character INCLUDING new lines.
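
The difference can be seen in a short sketch. It uses Python's "re" module, which accepts the same "(?s)" inline flag; the sample text and variable names are hypothetical and only illustrate the behavior, not any particular product's configuration.

```python
import re

# Hypothetical two-line document used only for illustration.
document = "first line START\nEND last line"

# Without a flag, "." stops at the newline, so the pattern cannot span lines.
without_flag = re.search(r"START.*END", document)

# With (?s) at the start of the expression, "." also matches the newline.
with_flag = re.search(r"(?s)START.*END", document)

print(without_flag)                # None: no match across the line break
print(repr(with_flag.group(0)))    # 'START\nEND'
```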

 

For example, if you wanted to find and remove all occurrences of the TransactionType element from an XML document, use the expression:

 

(?s)<TransactionType>.*?<\/TransactionType>
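
As a rough illustration of the expression above, the following Python sketch applies it to a small, made-up XML document in which one TransactionType element spans several lines. The sample data and variable names are assumptions for demonstration only. It also shows why the lazy quantifier ".*?" matters: with a greedy ".*" in dot-all mode, the match runs from the first opening tag to the last closing tag and removes everything in between.

```python
import re

# Hypothetical sample XML; the first element's content spans several lines.
xml = (
    "<Order>\n"
    "  <TransactionType>\n"
    "    SALE\n"
    "  </TransactionType>\n"
    "  <Amount>10</Amount>\n"
    "  <TransactionType>REFUND</TransactionType>\n"
    "</Order>\n"
)

# The expression from the article: dot-all mode plus a lazy quantifier,
# so each element is matched and removed individually.
lazy_pattern = r"(?s)<TransactionType>.*?<\/TransactionType>"
cleaned = re.sub(lazy_pattern, "", xml)

# For comparison only: a greedy ".*" spans from the first opening tag to the
# last closing tag and also removes the Amount element between them.
greedy_pattern = r"(?s)<TransactionType>.*<\/TransactionType>"
greedy_cleaned = re.sub(greedy_pattern, "", xml)

print(cleaned)
print(greedy_cleaned)
```

In the lazy case the Amount element is kept and both TransactionType elements are removed; in the greedy case the Amount element is lost as well.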

 
