Insert a Byte Order Mark (BOM) in front of a .csv file

Document created by mike_c_frazier (Employee) on Jan 7, 2013. Last modified by Adam Arrowsmith on Apr 20, 2018.

Use Case

This script inserts a UTF-8 byte order mark (BOM) at the beginning of each document. With the BOM in place, MS Excel should recognize the file as UTF-8 whether it is opened with File > Open or through a Windows file association.

 

Some applications, such as MS Excel versions prior to 2010, may not detect UTF-8 encoding unless the file starts with a byte order mark (BOM). The behavior can also differ depending on whether the file is opened with File > Open or through a Windows file association.
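
To make the BOM concrete: for UTF-8 it is the three bytes EF BB BF at the very start of the file. The following is a minimal standalone Java sketch, not part of the original script, that checks whether a byte array already starts with those bytes. The class name BomCheck and the method name hasUtf8Bom are hypothetical and chosen only for illustration.

```java
final class BomCheck {
    // The UTF-8 byte order mark is the byte sequence EF BB BF.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Returns true if the data begins with the three UTF-8 BOM bytes.
    static boolean hasUtf8Bom(byte[] data) {
        if (data == null || data.length < UTF8_BOM.length) {
            return false;
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            if (data[i] != UTF8_BOM[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A check like this can help confirm, outside the process, whether a generated .csv file actually carries the BOM before it is opened in Excel.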

 

Implementation

Add a Data Process shape with a Custom Scripting step and replace the default script with the script below.

 

Note: This script assumes the document data is UTF-8 encoded. If it is not, convert it to UTF-8 with a Character Encode step in the Data Process shape before the script runs.
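
If it is unclear whether sample data is really UTF-8, one way to check it outside the process is a strict decode. The sketch below is an assumption-laden illustration in plain Java, not part of the original script: the class name Utf8Check and the method name isValidUtf8 are hypothetical. It relies on the standard CharsetDecoder configured to report malformed input instead of silently replacing it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Check {
    // Returns true if the bytes decode cleanly as UTF-8.
    static boolean isValidUtf8(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            // decode() throws on the first malformed byte sequence.
            decoder.decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

Note that pure ASCII data always passes this check, and a passing result does not prove the data was intended as UTF-8. A failing result, however, is a strong sign that the Character Encode step is needed.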

 

Script

import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.Properties;

// UTF-8 byte order mark: 0xEF 0xBB 0xBF
byte[] bom = [(byte)239, (byte)187, (byte)191];

// Process every document passed into the Data Process shape
for (int i = 0; i < dataContext.getDataCount(); i++) {

  InputStream is = dataContext.getStream(i);
  Properties props = dataContext.getProperties(i);

  // Prepend the BOM to the document stream and pass it on with its original properties
  dataContext.storeStream(new SequenceInputStream(new ByteArrayInputStream(bom), is), props);
}
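
The same stream-chaining idea can be tried outside the platform. The following is a minimal standalone sketch in plain Java, assuming no dataContext is available: the class name BomPrepender and both withBom methods are hypothetical names introduced here for illustration. It uses the same SequenceInputStream approach as the script, plus a byte-array convenience method for quick checks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;

final class BomPrepender {
    // UTF-8 BOM: EF BB BF (decimal 239, 187, 191, as in the script above).
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Returns a stream that yields the BOM first, then the original content.
    static InputStream withBom(InputStream in) {
        return new SequenceInputStream(new ByteArrayInputStream(UTF8_BOM), in);
    }

    // Convenience wrapper for in-memory data; reads the chained stream fully.
    static byte[] withBom(byte[] data) {
        try (InputStream in = withBom(new ByteArrayInputStream(data));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Chaining streams this way avoids copying the document into memory just to add three bytes, which is presumably why the script takes the same approach. Like the script, this sketch adds the BOM unconditionally, so data that already starts with a BOM would end up with two.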