How to ensure a file is completely written before being read

Document created by Adam Arrowsmith (Employee) on Nov 6, 2015. Last modified by Adam Arrowsmith (Employee) on Jul 22, 2016. Version 3.

This article describes a technique for renaming a file after it has been completely written to ensure a subsequent process or client does not attempt to read a partial file. This approach involves custom scripting.

 

 

Use Case

Some scenarios require writing a file to disk for a destination application or subsequent integration process to read. When execution frequencies are low, process schedules can simply be staggered to ensure the file is safely written before the next process or application attempts to read it. However, when execution frequencies are higher, that approach becomes impractical.

 

Approach

The solution is to write the file using the Disk connector with an "intermediate" name and then immediately rename the file using a Groovy script so that it matches the naming convention the destination application or process is looking for.
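
The write-then-rename pattern described above can be sketched outside the platform in plain Java, using the same java.nio.file calls that the Groovy script in this article relies on. This is a minimal sketch under stated assumptions, not the process itself: the class name StagedFileWriter, the method writeThenStage, and the sample file name are hypothetical, and the atomic move assumes both names live in the same directory on a file system that supports atomic renames.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper illustrating write-then-rename; not part of any product API.
class StagedFileWriter {
    static final String STAGE_SUFFIX = ".stage";

    // Writes the content under the intermediate name, then renames the file so
    // that a reader filtering on "*.stage" only sees completely written files.
    static Path writeThenStage(Path dir, String fileName, byte[] content) throws IOException {
        Path intermediate = dir.resolve(fileName);
        Files.write(intermediate, content);

        Path staged = intermediate.resolveSibling(fileName + STAGE_SUFFIX);
        // Assumption: source and target are in the same directory, so the
        // file system can perform the rename as a single atomic operation.
        return Files.move(intermediate, staged, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("stage-demo");
        byte[] content = "id,qty\n1,5\n".getBytes(StandardCharsets.UTF_8);
        Path staged = writeThenStage(dir, "orders.csv", content);
        System.out.println("Staged file: " + staged.getFileName());
    }
}
```

Because the rename happens only after the write has finished, the final name never refers to a partially written file.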

 

Implementation

In this scenario, a process is writing a file to disk using the Disk connector. A second process is configured to get files that match the file filter "*.stage".
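
To illustrate why the "*.stage" filter protects the reader, the following plain Java sketch lists only files whose names match that filter. It is an analogy for the second process's file filter, not how the Disk connector is implemented; the class name StagedFileReader, the method listStagedFiles, and the sample file names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical reader-side sketch; stands in for a process using the "*.stage" file filter.
class StagedFileReader {
    // Returns the sorted names of files in dir that match the "*.stage" glob.
    static List<String> listStagedFiles(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.stage")) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("reader-demo");
        Files.createFile(dir.resolve("orders.csv"));         // intermediate name, possibly still being written
        Files.createFile(dir.resolve("invoices.csv.stage")); // completely written and renamed
        System.out.println(listStagedFiles(dir)); // prints [invoices.csv.stage]
    }
}
```

A file that still carries its intermediate name is invisible to this listing, so it cannot be picked up before the rename.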

 


 

  1. Set Properties shape - Set the standard Disk File Name and Directory document properties.
    • The File Name should NOT include the ".stage" suffix.
    • These properties must be set BEFORE the Branch shape so that they are available on both paths.
  2. Branch shape - A Branch shape is needed to execute the script after the Disk connector, because a Disk connector Send action does not return documents, so no additional shapes are executed on that path.
    1. Path 1: Disk Connector shape - Write the document and end. The Disk document properties that were set above will override the Disk connector Operation configuration.
    2. Path 2: Data Process shape - Add a Custom Script step and replace its entire contents with the Groovy script below. This script assumes the standard Disk document properties have been set and renames the specified file in the same directory by appending the ".stage" suffix.

 

Groovy script (Java 7+)

 

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import com.boomi.util.StringUtil;

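// Suffix appended once the file has been completely written.
String STAGE_SUFFIX = ".stage";

for ( int i = 0;  i < dataContext.getDataCount(); i++ ) {
  InputStream is = dataContext.getStream(i);
  Properties props = dataContext.getProperties(i);

  // Standard Disk document properties set earlier in the process.
  String dirName = props.getProperty("connector.dynamic.disk.directory");
  String fileName = props.getProperty("connector.dynamic.disk.filename");

  if ( StringUtil.isBlank(dirName) || StringUtil.isBlank(fileName) ) {
     throw new RuntimeException("Disk Directory and File Name doc props must be set.");
  }

  // Rename the file within the same directory by appending the suffix.
  // Note: per the Files.move documentation, other options are ignored when
  // ATOMIC_MOVE is specified, so whether an existing target file is replaced
  // is implementation specific.
  Path source = Paths.get(dirName, fileName);
  Files.move(source, source.resolveSibling(fileName + STAGE_SUFFIX), StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);

  // Pass the document through unchanged.
  dataContext.storeStream(is, props);
}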

 

Considerations

  • To reuse the script, place the Data Process shape in a subprocess (Start shape=Data Passthrough) and reference it with a Process Call shape.