This four-part series discusses various methods and considerations for keeping multiple applications in sync, including one-way, bidirectional, real-time/event-based, and hub-and-spoke.
- Sync Strategies Part 1: One-Way Syncs
- Sync Strategies Part 2: Two-Way Syncs
- Sync Strategies Part 3: Real-Time Syncs
- Sync Strategies Part 4: Syncing Multiple Applications
- Challenge 1: Update Conflicts
- Challenge 2: Circular Updates
- Method 1: Skip Records Last Updated By a Dedicated “Integration User”
- Method 2: Use Application Triggers to Distinguish Update Made by End User vs. Integration
- Final Thoughts
This article builds on the incremental integration goodness discussed in Sync Strategies Part 1: One-Way Syncs.
You might often hear the term bidirectional or two-way sync used casually to describe a need for records to be exchanged between two applications. In many cases, though, this means that some types of records need to go from A to B and other types of records need to go from B to A. Integration developers will recognize this as simply a collection of unrelated one-way syncs. A true two-way sync is when changes to the same record type can be made in either of two applications and must be synced to the other application.
Although practically speaking a two-way sync is in fact two one-way syncs between the same applications, the fact that both applications can update the same records back and forth presents a few unique challenges. This situation more commonly occurs with master data records (e.g. customers, contacts, vendors, etc.) as opposed to transactions (e.g. invoices, payments, etc.) that are typically synced one-way according to business processes.
Challenge 1: Update Conflicts
If a record can be modified in either application, what happens when the same record is modified in both applications between syncs? Which application’s version should “win”? This problem is addressed by choosing one of the applications as the “master”, whose version will always overwrite that of the secondary application.
Again, a two-way sync is really just a pair of one-way syncs working in concert. However, the sequence in which those two syncs should execute depends on the capabilities of the applications themselves.
If the master application has the ability to determine if the submitted update is newer than the version it currently has (typically via an application trigger and/or requiring a previous version number be included in the request), records should be extracted from the secondary application first and “optimistically” synced to the master application. The master will reject/skip any updates that are older than its version. (More on triggers later.)
If the master application is unable to perform logic to reject stale updates, then records should be extracted from the master application first. Any conflicting changes made in the secondary application will be overwritten by the values from the master, then any other records that were only modified in the secondary application will be synced back to the master.
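The "optimistic" case, where the master rejects stale updates, can be sketched as follows. This is only a minimal illustration, assuming the master stores an integer version number per record and that each update request carries the version it was based on. `MasterRecord`, `apply_update`, and the field names are hypothetical and do not come from any particular application's API.

```python
from dataclasses import dataclass, field


@dataclass
class MasterRecord:
    """Hypothetical record held by the master application."""
    record_id: str
    version: int
    values: dict = field(default_factory=dict)


def apply_update(record: MasterRecord, based_on_version: int, new_values: dict) -> bool:
    """Apply an update only if it was based on the master's current version.

    Returns True if the update was applied, False if it was rejected as stale.
    """
    if based_on_version != record.version:
        # The master changed since the secondary last saw it: the master wins.
        return False
    record.values.update(new_values)
    record.version += 1
    return True


master = MasterRecord("cust-1", version=3, values={"name": "Acme"})

# The secondary edited a copy based on version 2, so this update is stale.
stale_applied = apply_update(master, based_on_version=2, new_values={"name": "Acme Old"})

# An update based on the current version 3 is accepted.
fresh_applied = apply_update(master, based_on_version=3, new_values={"name": "Acme Inc."})
```

With a check like this living inside the master, the integration can send secondary changes first without looking anything up beforehand.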
Some may ask: why not just query the destination record and compare its last modified date to that of the record from the source application? While technically possible, one design tenet you should always strive for is to minimize external API calls, both for performance (external calls are slow) and for governance (some applications limit the number of calls you can make in a given period of time). In reality, these sorts of lookups are only practical when dealing with very low volumes.
A Note About Field-Level Vs. Record-Level Conflicts
Sync use case requirements often state “only the fields that have changed should be synced”. This is a perfectly legitimate request; however, syncing changes field by field is not a trivial task. It typically requires master data management software or similar logic that maintains a cached version of the records outside of either application and performs change detection on each field. (We’ll talk about master data management strategies in a future article.)
More often than not, this is overkill, and consequently most integrations operate at the record level: if any field on a record changed, all the values from that record will be synced to the other application. That is the assumption we will make in this article.
But keep this in mind: even though changes are detected at a record level, each application can be the master for a subset of the fields on a record. For example, between a customer relationship management (CRM) application and an accounting application, the CRM could “own” the customer name, shipping info, and other sales-related information, but the accounting system could own the billing address and payment terms. This manifests in the integration by simply not mapping certain fields when performing an update from CRM to accounting or vice versa.
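As a rough sketch of that idea, the mapping for each direction can be limited to the fields the source application owns. The field names and the `map_owned_fields` helper below are made up for illustration; in practice this is expressed in the integration's field mapping rather than in code like this.

```python
# Hypothetical field ownership: each application is master for some fields.
CRM_OWNED = {"name", "shipping_address"}
ACCOUNTING_OWNED = {"billing_address", "payment_terms"}


def map_owned_fields(source_record: dict, owned_fields: set) -> dict:
    """Return only the fields the source application owns; the rest stay unmapped."""
    return {k: v for k, v in source_record.items() if k in owned_fields}


crm_record = {
    "name": "Acme Inc.",
    "shipping_address": "1 Dock Rd",
    "billing_address": "stale value in CRM",
    "payment_terms": "Net 15",
}

# Update payload sent from CRM to accounting: billing fields are not mapped,
# so the accounting system's own values are left untouched.
crm_to_accounting = map_owned_fields(crm_record, CRM_OWNED)
```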
Challenge 2: Circular Updates
The second and trickier challenge when syncing the same records between the same applications is the dilemma of circular updates. Assuming each step of the two-way sync is extracting records incrementally (as it should be!), those records could have been created or recently modified by the integration itself, and the two applications could continually update one another without any “real” changes made by an end user. Consider the following:
- A user makes a change to a record in application A, updating that record’s last modified date.
- The integration looks for recently modified records in application A, finds this record and syncs it to application B, updating the record’s last modified date in application B.
- The integration then looks for recently modified records in application B, finds the record that the previous sync just updated, and syncs it back to application A, updating the record’s last modified date in application A.
- The next time the integration runs, it looks for recently modified changes in application A, finds the same record again, syncs it to application B... thus perpetually syncing this same record back and forth until the end of time.
The theoretical conclusion is that eventually you would be syncing every record every time, which is what we’re trying to avoid with incremental syncs in the first place!
To overcome this dilemma, we need a way to somehow identify and sync record modifications made by a real end user and ignore modifications made by the integration itself. Below are two methods for doing just that.
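The loop described in the steps above can be reproduced with a small simulation. This is a toy model, not a real integration: each application is a dictionary of record ID to last modified time, a counter stands in for timestamps, and `touch` and `naive_sync` are hypothetical names.

```python
import itertools

clock = itertools.count(1)  # simple logical clock standing in for timestamps


def touch(app: dict, record_id: str) -> None:
    """Simulate a save: the application stamps a new last modified time."""
    app[record_id] = next(clock)


def naive_sync(source: dict, dest: dict, watermark: int) -> tuple:
    """Copy every record modified after the watermark; return (count, new watermark)."""
    changed = [rid for rid, modified in source.items() if modified > watermark]
    new_watermark = max([watermark] + [source[rid] for rid in changed])
    for rid in changed:
        touch(dest, rid)  # writing to the destination bumps its last modified time
    return len(changed), new_watermark


app_a, app_b = {"rec-1": 0}, {"rec-1": 0}
wm_a = wm_b = 0

touch(app_a, "rec-1")  # the only real end-user change

synced_per_run = []
for _ in range(3):
    a_to_b, wm_a = naive_sync(app_a, app_b, wm_a)
    b_to_a, wm_b = naive_sync(app_b, app_a, wm_b)
    synced_per_run.append((a_to_b, b_to_a))
```

After a single user edit, every run still syncs the record once in each direction, because each write by the integration looks like a fresh modification to the opposite pass.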
Method 1: Skip Records Last Updated By a Dedicated “Integration User”
This method relies on configuring the integration with its own user credentials and ignoring records that were last modified by that user. When the integration creates or updates a record, the destination application captures the integration’s user name as the last-modified-by user value. This approach would be used in combination with the last modified date to extract records WHERE LastModifiedDate is greater than <last record date> AND LastModifiedBy is not <integration user>. Here <last record date> is the most recent last modified date captured and persisted from the records processed during the previous sync.
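The extraction condition can be sketched as below. This is an in-memory illustration only, assuming records expose `LastModifiedDate` and `LastModifiedBy` values. In a real integration the filter would normally be passed to the application's query API, and the user name `integration.user` is a placeholder.

```python
from datetime import datetime

INTEGRATION_USER = "integration.user"  # hypothetical dedicated user name


def extract_end_user_changes(records: list, last_record_date: datetime) -> list:
    """Keep records modified after the watermark and not last saved by the integration."""
    return [
        r for r in records
        if r["LastModifiedDate"] > last_record_date
        and r["LastModifiedBy"] != INTEGRATION_USER
    ]


previous_watermark = datetime(2024, 1, 1)
records = [
    {"Id": "1", "LastModifiedDate": datetime(2024, 1, 2), "LastModifiedBy": "jsmith"},
    {"Id": "2", "LastModifiedDate": datetime(2024, 1, 3), "LastModifiedBy": INTEGRATION_USER},
    {"Id": "3", "LastModifiedDate": datetime(2023, 12, 30), "LastModifiedBy": "jsmith"},
]

to_sync = extract_end_user_changes(records, previous_watermark)

# Persist the most recent last modified date among the processed records
# for use as <last record date> in the next run.
next_watermark = max((r["LastModifiedDate"] for r in to_sync), default=previous_watermark)
```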
Considerations
- The end application must capture the last modified by user and expose it via API queries.
Advantages
- Simple to implement.
- No end application customizations required.
Disadvantages
- Provisioning a dedicated user for the integration typically consumes a user license and associated cost.
- No ability to detect if an end user has more recently updated the record in the destination application and skip the update if desired.
- Changes to specific fields in the destination application that are not mapped from the source application will not be synced back to the source application (at least not immediately).
Taking it to AtomSphere
- Configure the Start Step’s connector Operation with a query filter for the appropriate “last modified by” field and operator “not equal to”. In the Start Step Parameters tab, enter the user name or internal ID of the dedicated integration user as a Static Value.
Method 2: Use Application Triggers to Distinguish Update Made by End User vs. Integration
This method enhances the Sync Flag approach introduced in Sync Strategies Part 1 (whereby the integration extracts records based on the value of this flag field and then resets the flag at the successful completion of the integration) by incorporating trigger logic in the end applications. Triggers enable you to perform conditional logic within the destination application right before the record is actually saved. This is much more efficient than making a series of API calls during the integration itself. The trigger logic detects the “context” of a given record update, whether from a real end user or from a programmatic/API user, and sets the value of the Sync Flag field accordingly. Then the integration can simply extract all records where Sync Flag=true.
This is similar to the Sync Flag approach used in one-way incremental syncs, but with the added requirement that a record may be synced multiple times. This is often the case for master data records like Customers, Vendors, Items, etc., as opposed to transactional records that are usually synced once and only once. The feasibility and implementation of this approach can vary widely between applications.
Considerations
- Must be able to configure custom fields and triggers in end application(s).
Advantages
- Does not require additional user licensing cost.
- Allows you to determine if an end user has updated a record more recently and proceed differently.
Disadvantages
- More complex solution, with moving parts in both the integration and application layers.
- Requires end application customization.
- Not all applications support user-defined triggers.
Taking it to AtomSphere
- The Connector Operation filters are simplified to only query records where the Sync Flag=true.
- Field mapping must be coordinated to populate custom fields that assist the integration and trigger.
Sync Trigger Example
Let’s take a closer look at how triggers can be used to facilitate the sync. In this example we will assume that we can implement triggers in both applications and that the triggers execute on create and update requests, from both the user interface and API. We will also need to create two custom fields on the given record type in both applications to assist with the sync:
- Sync Flag - boolean (true, false)
- Last Sync Date - date
Here’s the big idea:
- When the integration updates a record, it will always populate Last Sync Date with a new value.
- Capture the most recent Last Modified Date and use it as the Last Sync Date when writing back to the source application at the end of the sync. Using this value instead of the current system time stamp will let the trigger precisely determine if the record was updated by another user while the sync was running.
- However when an end user updates a record, s/he will not modify (or even see) the Last Sync Date field.
- Before a record update is saved, the trigger logic compares the before and after values of the Last Sync Date field. If they are the same (or if the record’s Last Modified Date is greater than the new Last Sync Date), assume an end user made the change and set the SyncFlag=true. If they are different, assume the integration made the change and set the SyncFlag=false.
Or if you prefer pictures, here’s the flow representing the first half of the two-way sync, from App A to App B only (the opposite sync from App B to App A would simply be the reverse of this flow):
Or for those who like pseudo-code, here’s the before-commit trigger logic:
// Note new and/or old LastSyncDate values could be blank
if new.LastSyncDate == old.LastSyncDate
    // LastSyncDate new/old values will be the same if updated by an end user
    SyncFlag = true
else
    // LastSyncDate new/old values will be different if updated by the integration
    if old.LastModifiedDate > new.LastSyncDate
        // Do nothing: something updated the record while the integration was running; don’t change SyncFlag value
    else
        SyncFlag = false
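For readers who want something executable, here is one possible Python rendering of the same before-commit logic. It is a sketch under assumptions: real triggers are written in whatever language the application supports, and the function name `before_commit_sync_flag` and its parameters are invented for this example. Blank dates are represented as `None`.

```python
from datetime import datetime
from typing import Optional


def before_commit_sync_flag(
    old_last_sync: Optional[datetime],
    new_last_sync: Optional[datetime],
    old_last_modified: Optional[datetime],
    current_flag: bool,
) -> bool:
    """Return the SyncFlag value to save, following the trigger rules above."""
    if new_last_sync == old_last_sync:
        # Last Sync Date untouched (both may be blank): an end user made the change.
        return True
    both_known = old_last_modified is not None and new_last_sync is not None
    if both_known and old_last_modified > new_last_sync:
        # The record changed while the integration was running: leave the flag alone.
        return current_flag
    # The integration made the change and nothing newer exists.
    return False


t_prev = datetime(2024, 1, 1, 10, 0)   # Last Sync Date from an earlier run
t_edit = datetime(2024, 1, 1, 10, 5)   # end-user edit picked up by this run
t_late = datetime(2024, 1, 1, 10, 10)  # another edit made while the sync was running

user_edit = before_commit_sync_flag(t_prev, t_prev, t_prev, current_flag=False)
write_back = before_commit_sync_flag(t_prev, t_edit, t_edit, current_flag=True)
concurrent = before_commit_sync_flag(t_prev, t_edit, t_late, current_flag=True)
```

Here `user_edit` raises the flag, `write_back` clears it, and `concurrent` keeps it raised so the later edit is picked up by the next run.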
Note: Some applications may be able to natively distinguish the “context” of the record update, such as via the user interface vs. web services. If the integration is the only entity that interacts with the application via web services, it would be safe to rely on that context and simplify the above trigger logic to always set SyncFlag=false if the update came from the web services. However if other entities use the web services, you would need to treat those other entities like end users and use the conditional logic.
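To see the trigger rules and the integration steps from this example working together, the toy simulation below models two applications and runs the two-way sync several times. Everything here is hypothetical scaffolding: the `App` class, the integer clock standing in for timestamps, and the convention that end users call `save` without a Last Sync Date while the integration always supplies one.

```python
import itertools

clock = itertools.count(1)  # logical clock standing in for timestamps


class App:
    """Toy application with one record type and a before-commit trigger."""

    def __init__(self):
        self.records = {}

    def save(self, rid, value, last_sync=None):
        """Save a record; end users omit last_sync, the integration supplies it."""
        old = self.records.get(rid, {"last_sync": None, "modified": 0, "flag": False})
        new_last_sync = old["last_sync"] if last_sync is None else last_sync
        if new_last_sync == old["last_sync"]:
            flag = True                  # end-user change
        elif old["modified"] > new_last_sync:
            flag = old["flag"]           # changed during the sync; keep the flag
        else:
            flag = False                 # integration change
        self.records[rid] = {"value": value, "last_sync": new_last_sync,
                             "modified": next(clock), "flag": flag}


def sync(source, dest):
    """One-way pass: push flagged records, then write Last Sync Date back to the source."""
    flagged = [(rid, dict(r)) for rid, r in source.records.items() if r["flag"]]
    for rid, rec in flagged:
        dest.save(rid, rec["value"], last_sync=next(clock))
        # Use the extracted Last Modified Date, not the current time, as Last Sync Date.
        source.save(rid, source.records[rid]["value"], last_sync=rec["modified"])
    return len(flagged)


a, b = App(), App()
runs = []

a.save("rec-1", "v1")                    # end user creates the record in A
runs.append((sync(a, b), sync(b, a)))    # first run syncs A to B once
runs.append((sync(a, b), sync(b, a)))    # second run finds nothing to do

b.save("rec-1", "v2")                    # end user edits the record in B
runs.append((sync(a, b), sync(b, a)))    # syncs B to A once
runs.append((sync(a, b), sync(b, a)))    # quiet again
```

Each end-user change is synced exactly once and then both applications go quiet, in contrast to the endless back-and-forth of a purely date-based two-way sync.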
Final Thoughts
Two-way syncs are inherently tricky and require some important design decisions to ensure a successful implementation. You may find the applications you are looking to integrate do not support some of the options discussed in this article; maybe only one of the applications supports triggers, for example. In these situations, be flexible and creatively work with the options available for your applications. It may not be perfect and updates might bounce back and forth once; just make sure you don’t get caught in a circular update loop for ever and ever!
Up next... Sync Strategies Part 3: Real-Time Syncs!