How to Remove First Line of a Document using Groovy

Document created by dprostko on Jun 13, 2011. Last modified by Adam Arrowsmith on Jan 17, 2017.
Version 4

Use Case

You need to strip the first line (e.g. the column headers) from a flat file or CSV document to simplify later steps in the process.

Approach

Use a Data Process shape with a Custom Scripting step. Replace the default script with the one below.

Implementation

Script

// Strips the first line from each document and passes the remaining lines through.
// Note: output line endings are normalized to the platform line separator,
// and the platform default character encoding is used for reading and writing.
String newline = System.getProperty("line.separator");

for (int i = 0; i < dataContext.getDataCount(); i++) {
  InputStream is = dataContext.getStream(i);
  Properties props = dataContext.getProperties(i);

  BufferedReader reader = new BufferedReader(new InputStreamReader(is));
  StringBuilder outData = new StringBuilder();

  // Read and discard the first line (e.g. the column headers)
  reader.readLine();

  // Copy every remaining line to the output
  String line;
  while ((line = reader.readLine()) != null) {
    outData.append(line);
    outData.append(newline);
  }

  // Store the result as the new document, keeping the original properties
  is = new ByteArrayInputStream(outData.toString().getBytes());
  dataContext.storeStream(is, props);
}
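
Variation (sketch only)

The script above rebuilds the document line by line, so the output always uses the platform line separator and ends with a trailing newline. If the original line endings need to be preserved, one possible alternative is to consume bytes up to the first line feed and pass the rest of the stream through untouched. The sketch below is a standalone illustration that runs outside of a process; skipFirstLine is a hypothetical helper name, not part of the dataContext API, and it assumes lines end in LF or CRLF (a CR-only file would not be handled).

```groovy
// Hypothetical helper (not part of the Boomi API): consumes bytes up to and
// including the first line feed, leaving the stream positioned at line two.
InputStream skipFirstLine(InputStream input) {
    int b
    while ((b = input.read()) != -1) {
        if (b == 0x0A) {   // LF terminates both "\n" and "\r\n" lines
            break
        }
    }
    return input
}

// Standalone check with an in-memory CSV. Inside a Data Process shape the
// stream would presumably come from dataContext.getStream(i) instead.
def sample = "id,name\r\n1,Ann\r\n2,Bob".getBytes("UTF-8")
def rest = skipFirstLine(new ByteArrayInputStream(sample))
println rest.getText("UTF-8")
```

Because the remaining bytes are never decoded, this approach does not depend on the character encoding and does not hold the whole document in a string. Whether the partially read stream can be handed straight to dataContext.storeStream in a given environment is an assumption that should be tested before relying on it.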