Design Pattern: How to Execute Different Process Paths in Parallel

Document created by Adam Arrowsmith Employee on May 20, 2016
Version 1

This article describes an advanced technique for assigning documents to specific Flow Control threads so that different paths can be executed with different documents at the same time. This pattern applies to a specific scenario and is not widely used.

 

 

Use Case

You want to do DIFFERENT (but probably similar) things with DIFFERENT documents at the same time. This differs from normal Flow Control parallel threading, which does the SAME thing with DIFFERENT documents at the same time.

 

The most common context for this pattern is a process that needs to maximize throughput when loading records into an application that does not allow concurrent sessions for a given user. In this case, the Flow Control step with multi-threading alone cannot be used to execute the same connector step in parallel, because every thread would use the same user credentials. To overcome this, different user credentials must be used simultaneously.

 

Because application user credentials are associated one-to-one with connection components, the process needs multiple connections and therefore multiple connector steps that execute in parallel.

 

One specific usage of this pattern is when loading large numbers of records into NetSuite, with or without the SuiteCloud Plus license.

 

Approach

  1. With custom scripting, implement the same document grouping algorithm used by the Flow Control step to identify which documents will be assigned to each thread.
  2. “Tag” the documents that belong to the same thread with the same dynamic document property value (e.g. a simple counter: “1”, “2”, “3”, etc.).
  3. Use a Flow Control step to spawn as many threads as there are connections.
  4. Within each thread, route by the dynamic document property to a different connection.

 

Grouping Algorithm Example

Assume there are 10 documents to be sent via three different connections. This means there will be three threads. The documents would be grouped as follows:

 

Document  Dynamic Document Property  Flow Control Thread  Connection
1         1                          1                    User 1
2         1                          1                    User 1
3         1                          1                    User 1
4         1                          1                    User 1
5         2                          2                    User 2
6         2                          2                    User 2
7         2                          2                    User 2
8         3                          3                    User 3
9         3                          3                    User 3
10        3                          3                    User 3
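
The grouping in the table can be checked outside the integration runtime with a small standalone sketch. It is plain Java rather than a Boomi script, and it restates the arithmetic of the custom script shown later in this article: the first (document count modulo thread count) groups receive one extra document. The class and method names (ChunkAssignmentSketch, assignChunks) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any product API.
final class ChunkAssignmentSketch {

    // Returns a list in which element i is the 1-based chunk number
    // assigned to document i + 1. Earlier chunks are filled first.
    static List<Integer> assignChunks(int docCount, int chunkCount) {
        if (docCount < 0 || chunkCount < 1) {
            throw new IllegalArgumentException("docCount must be >= 0 and chunkCount >= 1");
        }
        int baseSize = docCount / chunkCount;      // minimum documents per chunk
        int largerChunks = docCount % chunkCount;  // chunks that get one extra document
        List<Integer> assignments = new ArrayList<>(docCount);
        for (int chunk = 1; chunk <= chunkCount; chunk++) {
            int size = baseSize + (chunk <= largerChunks ? 1 : 0);
            for (int n = 0; n < size; n++) {
                assignments.add(chunk);
            }
        }
        return assignments;
    }

    public static void main(String[] args) {
        // 10 documents over 3 threads, as in the table above
        System.out.println(assignChunks(10, 3)); // prints [1, 1, 1, 1, 2, 2, 2, 3, 3, 3]
    }
}
```

With 10 documents and 3 threads this yields groups of 4, 3 and 3 documents, matching the table. When there are fewer documents than threads, the trailing groups are simply empty.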

 

Assumptions

  • All documents in a given thread should go to the same connection.
  • Any document can go to any connection. The grouping is arbitrary.

 

Implementation

 

Process Overview

 

 

Notes:

  • The custom script assigns dynamic document property values to each document using the same algorithm the Flow Control step uses to divide documents into threads. This script has a variable to set the number of threads to be created.
  • The number of Flow Control Threads = number of different paths/connections.
  • Route based on the dynamic document value.

 

Groovy Custom Script

 

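import java.util.Properties;
import java.io.InputStream;

// CONFIGURE: set this value to the number of Flow Control threads
int chunkCount = 3;

// Size of the larger chunks (document count divided by chunkCount, rounded up)
int docsPerChunk = (dataContext.getDataCount() + (chunkCount - 1)) / chunkCount;
// Number of chunks that receive the larger size (0 when the documents divide evenly)
int remDocs = dataContext.getDataCount() % chunkCount;

int docsInCurrentChunk = 0;
int currentChunk = 1;

for( int i = 0; i < dataContext.getDataCount(); i++ ) {
  InputStream is = dataContext.getStream(i);
  Properties props = dataContext.getProperties(i);

  // Tag the document with the number of the chunk (thread) it belongs to
  props.setProperty("document.dynamic.userdefined.CHUNK_ASSIGN", String.valueOf(currentChunk));
  docsInCurrentChunk++;

  // Current chunk is full: move on to the next one
  if (docsInCurrentChunk == docsPerChunk) {
    docsInCurrentChunk = 0;
    currentChunk++;

    // Once all larger chunks are filled, the remaining chunks hold one document fewer
    if(--remDocs == 0) {
      docsPerChunk--;
    }
  }

  // Pass the document on unchanged, with the updated properties
  dataContext.storeStream(is, props);
}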

 

Notes

  • Set chunkCount to the number of Flow Control threads.
  • The name of the dynamic document property to route by is CHUNK_ASSIGN.
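
To illustrate the effect of routing on CHUNK_ASSIGN, the following standalone Java sketch groups documents by the value the script writes to the property. In the process itself the routing is done by the route step, not by code; the class and method names (RouteByChunkSketch, groupByChunk) and the "unassigned" fallback are hypothetical. Only the property key is taken from the script above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical illustration only; not a Boomi API.
final class RouteByChunkSketch {

    // Property key written by the custom script in this article
    static final String CHUNK_KEY = "document.dynamic.userdefined.CHUNK_ASSIGN";

    // Groups document indexes by their CHUNK_ASSIGN value. Documents without
    // the property are collected under "unassigned" (an assumption of this sketch).
    static Map<String, List<Integer>> groupByChunk(List<Properties> docs) {
        Map<String, List<Integer>> routes = new TreeMap<>();
        for (int i = 0; i < docs.size(); i++) {
            String chunk = docs.get(i).getProperty(CHUNK_KEY, "unassigned");
            routes.computeIfAbsent(chunk, k -> new ArrayList<>()).add(i);
        }
        return routes;
    }
}
```

Each key of the resulting map corresponds to one route, and therefore to one connection. The fallback group is a reminder that a default route is worth having for documents that were never tagged.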

 

Usage Considerations

  • Specifically for NetSuite, note that if the different users have SuiteCloud Plus licenses, you can use subsequent Flow Control steps on each of the connection paths to multi-thread further. Using the example above, that means there could be a maximum of 30 concurrent connections.
  • However, be mindful of the number of concurrent threads running on a given Atom or node. If running on a Cloud or Molecule, consider configuring the initial Flow Control with scope=Processes to divide work across multiple nodes.
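
As a rough check on the figure of 30 above: it is the product of the number of users and the number of concurrent threads each user may run. The sketch below assumes 10 threads per user, a value inferred from the article's numbers (3 connections, 30 in total) rather than stated in it, so confirm the actual limit for your NetSuite account. The class and method names are hypothetical.

```java
// Hypothetical helper for the arithmetic only.
final class ConcurrencySketch {

    // Total concurrent connections when every user runs the same number of threads.
    // Math.multiplyExact throws ArithmeticException on int overflow.
    static int maxConcurrentConnections(int userCount, int threadsPerUser) {
        if (userCount < 0 || threadsPerUser < 0) {
            throw new IllegalArgumentException("counts must not be negative");
        }
        return Math.multiplyExact(userCount, threadsPerUser);
    }

    public static void main(String[] args) {
        // 3 users (connections), assumed 10 threads each
        System.out.println(maxConcurrentConnections(3, 10)); // prints 30
    }
}
```

The same product is the number of threads that may run at once on the Atom or node, which is why the thread count deserves attention before raising either factor.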