I'm testing out the Parallel Processing feature and I'm curious if anyone has determined what an optimal configuration might be for a use case like this.
My process looks like this:
1) Start Shape: Database Query (returns ~750k records)
2) Flow Control: (Parallel Processing for Threads enabled)
3) Map data
4) Upsert to Salesforce
When I ran this process once a day without Parallel Processing and the Flow Control shape, it completed in around 5 1/2 hours.
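For a baseline to compare against, here is a quick sketch of the average throughput those numbers imply. The record count and runtime are the approximate figures from this post, and the function name is just for illustration.

```python
# Approximate figures from the post: ~750k records in around 5 1/2 hours.
RECORDS = 750_000
BASELINE_HOURS = 5.5


def records_per_second(records: int, hours: float) -> float:
    """Average throughput over the whole run, in records per second."""
    return records / (hours * 3600)


baseline_rate = records_per_second(RECORDS, BASELINE_HOURS)
print(f"Baseline throughput: {baseline_rate:.1f} records/sec")  # 37.9 records/sec
```

Any parallel configuration would need to beat roughly 38 records per second end to end to be an improvement.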
I just enabled Parallel Processing and started a run with the values below. I'm waiting to see how long it takes.
- Batch Count set to 0 (default)
- Run as Batches of: 5000
Parallel Processing Options:
- Number of Units: 4
- Unit Scope: Threads
- Use Bulk API: yes
- Batch Count: 5000
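To put these values in perspective, here is a back-of-the-envelope sketch of how the work might divide up. It assumes that 5000 is the number of records per batch, that batches are spread evenly across the 4 threads, and that runtime scales linearly with thread count. None of that is guaranteed by the platform, and the Salesforce side may well be the real bottleneck, so treat the result as a best case rather than a prediction.

```python
import math

# Figures from the post (record count and baseline runtime are approximate).
RECORDS = 750_000
BATCH_SIZE = 5_000      # assumed to mean records per batch
THREADS = 4             # Number of Units, Unit Scope: Threads
BASELINE_HOURS = 5.5    # runtime without Parallel Processing

# How many batches the full result set splits into.
total_batches = math.ceil(RECORDS / BATCH_SIZE)

# Assumption: batches are distributed evenly across threads.
batches_per_thread = math.ceil(total_batches / THREADS)

# Assumption: perfectly linear speedup, which is an optimistic upper bound.
ideal_hours = BASELINE_HOURS / THREADS

print(f"Total batches:       {total_batches}")        # 150
print(f"Batches per thread:  {batches_per_thread}")   # 38
print(f"Best-case runtime:   {ideal_hours:.3f} h")    # 1.375 h
```

If the actual run lands well above the best-case figure, that would suggest the limit is somewhere other than thread count, for example in the database query or in how quickly Salesforce processes each batch.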
Thoughts on a better configuration?