This article provides important concepts and considerations as well as common scenarios for using document caching.
- User Guide Articles and Other Resources
- Common Scenarios
- Additional Considerations
- Common Errors
A document cache is useful when designing complex processes that involve bulk data or data retrieval. Here are a few concepts as a refresher:
- Indexing - This is how documents are stored for data retrieval. A few tips to keep in mind:
- IMPORTANT: Documents must be cached and indexed at the granularity at which you want to retrieve them. For example, if you have a single CSV file containing all your employees but you want to retrieve or look up individual employees by email address, you must split the data into one document per employee before adding it to the cache.
- You must configure at least one index with at least one key.
- The value(s) for the keys must not be blank.
- An index can consist of one or more keys from a profile and/or document properties. For example, you could create a "composite" key using a combination of first name and last name.
- You can configure multiple indices if you want to retrieve the same cached documents using different parameters at different points in your process. For example, if you were caching a set of employee records, you could create separate indices so that you could look up cached records by employee number, name, or email address.
- Document Properties - Any Document Properties associated with a given document are also cached along with the document data itself. When a cached document is retrieved using a Load from Cache shape, the original document properties are restored as well.
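The indexing rules above can be illustrated with a small in-memory model. This is a conceptual sketch in Python, not Boomi code: the `Document` and `DocumentCache` classes, the index names, and the sample data are all hypothetical, and for brevity the index keys are taken from profile data only.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a document: profile data plus document properties."""
    data: dict
    properties: dict = field(default_factory=dict)


class DocumentCache:
    """Toy model of a cache with one or more named indices (not the Boomi API)."""

    def __init__(self, indices):
        # indices maps an index name to the tuple of keys that make it up.
        if not indices or any(not keys for keys in indices.values()):
            raise ValueError("at least one index with at least one key is required")
        self._indices = indices
        self._entries = {name: {} for name in indices}

    def add(self, doc):
        # Compute every index value first so a blank key rejects the whole document.
        computed = {}
        for name, keys in self._indices.items():
            values = tuple(doc.data.get(k) for k in keys)
            if any(v in (None, "") for v in values):
                raise ValueError(f"blank key value for index {name!r}")
            computed[name] = values
        for name, values in computed.items():
            self._entries[name].setdefault(values, []).append(doc)

    def load(self, index, *values):
        # Returns the cached documents, with their properties still attached.
        return list(self._entries[index].get(tuple(values), []))


csv_rows = [
    {"first": "Ada", "last": "Lovelace", "email": "ada@example.com"},
    {"first": "Alan", "last": "Turing", "email": "alan@example.com"},
]

# Two indices over the same documents: a single key and a composite key.
cache = DocumentCache({"by_email": ("email",), "by_name": ("first", "last")})
for row in csv_rows:  # split first: one cached document per employee
    cache.add(Document(row, {"source_file": "employees.csv"}))

hit = cache.load("by_email", "alan@example.com")[0]
```

Here `hit` is the single employee record for that email address, and its `properties` dictionary comes back along with the data, mirroring how document properties are restored on retrieval.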
User Guide Articles and Other Resources
- Everything you wanted to know about Document Caching but were afraid to ask
- How to Use the Document Cache Shape
- All "Document Cache" content
Scenario 1: Joining Data From Multiple Sources
If you need to reference data from multiple sources and want to avoid making multiple calls, a good approach is to add the records from each source to its own cache as they are retrieved. Once the complete data is available in the caches, add the cached data to the source profile in your map. Let's refer to the example below:
In this example, the complete list of doctor profiles and the details of the hospitals where each doctor practices are first retrieved and stored in separate caches, each indexed by "Doctor ID". In the third branch, based on the doctor ID contained in the incoming web service request, data from each cache is joined with the source profile on the "Doctor ID" index and mapped to create the final output. Note that the source profile matches the input data, and the cached data is added at the parent element level. This creates a 'super profile' on the source side of the map. Adding the cached data to the source profile, rather than using a document cache lookup function, allows repeating/looping data elements from the cached documents to be mapped naturally, such as Hospitals in this example.
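The join described in this scenario can be sketched conceptually as follows. This is plain Python rather than a Boomi map, and the function names, field names, and sample records are assumptions made for illustration. Each cache is modeled as a dictionary keyed by doctor ID whose values are lists, so repeating elements such as hospitals carry over naturally.

```python
def build_cache(documents, key):
    """Group documents by the value of their index key (one list per key value)."""
    cache = {}
    for doc in documents:
        cache.setdefault(doc[key], []).append(doc)
    return cache


def join_request(request, profile_cache, hospital_cache):
    """Combine the incoming request with matching entries from both caches."""
    doctor_id = request["doctor_id"]
    profiles = profile_cache.get(doctor_id, [])
    return {
        "doctor_id": doctor_id,
        "name": profiles[0]["name"] if profiles else None,
        # Every cached hospital for this doctor is kept, like a repeating element.
        "hospitals": [h["hospital"] for h in hospital_cache.get(doctor_id, [])],
    }


# Branches 1 and 2 (hypothetical data): retrieve each source once and cache it.
doctor_profiles = [
    {"doctor_id": "D1", "name": "Dr. Rao"},
    {"doctor_id": "D2", "name": "Dr. Kim"},
]
hospital_details = [
    {"doctor_id": "D1", "hospital": "General"},
    {"doctor_id": "D1", "hospital": "St. Mary"},
    {"doctor_id": "D2", "hospital": "City Clinic"},
]
profile_cache = build_cache(doctor_profiles, "doctor_id")
hospital_cache = build_cache(hospital_details, "doctor_id")

# Branch 3: join the web service request against both caches.
result = join_request({"doctor_id": "D1"}, profile_cache, hospital_cache)
```

For doctor `D1`, `result` contains the doctor's name and both hospitals, which is the effect the 'super profile' achieves on the source side of the map.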
Scenario 2: Existence Check Lookup
Once you add data to a document cache, you can use this temporary repository to look up data. The lookup option is available as a parameter type in the Decision shape, the Set Properties shape, connector parameters, and map functions.
One popular use case is to efficiently check for the existence of records in a destination system. Instead of making a separate external API call for each incoming record to determine whether it exists in the destination system, first make a single call to extract and cache all the records, then perform the individual lookups against the "local" cached copy. If the record does not exist in the cache, an insert can be performed; otherwise, an update can be performed. Note the technique of storing the document cache lookup result in a document property: if the record exists, the ID value in the property can be used in the "update" map instead of performing a second lookup against the cache. This example assumes the destination system requires an ID value to perform an update, which is a common practice.
Side note: This approach has practical limitations. For example, if the destination system holds a very large number of records, it may be more efficient to query for each record individually instead.
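The existence check can be modeled in a few lines. This is a conceptual Python sketch under stated assumptions, not Boomi code: `route_records`, the `email` lookup key, and the sample records are hypothetical. The looked-up ID is stored once and reused for the update, mirroring the document property technique described above.

```python
def route_records(incoming, destination_records):
    """Split incoming records into inserts and updates using one cached extract."""
    # One bulk extract from the destination, indexed by the lookup key.
    cache = {rec["email"]: rec["id"] for rec in destination_records}

    inserts, updates = [], []
    for record in incoming:
        # Look up once and keep the result, like storing it in a document property.
        existing_id = cache.get(record["email"])
        if existing_id is None:
            inserts.append(record)
        else:
            # Reuse the stored ID for the update instead of a second lookup.
            updates.append({**record, "id": existing_id})
    return inserts, updates


destination = [
    {"id": "001", "email": "a@example.com"},
    {"id": "002", "email": "b@example.com"},
]
incoming = [
    {"email": "a@example.com", "name": "A"},
    {"email": "c@example.com", "name": "C"},
]
inserts, updates = route_records(incoming, destination)
```

The record for `a@example.com` lands in `updates` with its destination ID attached, while `c@example.com` lands in `inserts`.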
An example of this scenario is available in the Process Library.
Scenario 3: Use as a Temporary "Array"
A single document cache can be used to store multiple sets of document properties (as long as the documents share the same structure). These can later be retrieved and used as appropriate.
The example below uses a single document cache to store different types of errors. These errors are aggregated at the end of the process and returned to the user.
Because document properties are stored as part of the document cache, we can retrieve the try/catch message and the cleanse result message set in branches 1 and 2 respectively. Before writing to the cache in each branch, a dummy dynamic document property is set to a static value (the same value in branches 1 and 2), and this property is used as the index in the cache.
In branch 3, the same static value which was set in the dummy dynamic document property before writing to cache is used for retrieval of the complete data (both cached document as well as document properties) from the cache.
The document property values which we are interested in are the try-catch message and the cleanse result message. We will use a notify shape for this purpose.
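A minimal Python sketch of this pattern follows. It is only a conceptual model: the property names (`DDP_CACHE_KEY`, `try_catch_message`, `cleanse_result_message`), the static value, and the sample messages are hypothetical and do not come from the original process.

```python
CACHE_KEY_PROPERTY = "DDP_CACHE_KEY"  # hypothetical dummy dynamic document property
STATIC_KEY = "ERRORS"                 # same static value in every branch
MESSAGE_PROPERTIES = ("try_catch_message", "cleanse_result_message")


def add_to_cache(cache, document, properties):
    """Cache a document with its properties, indexed by the dummy property."""
    props = dict(properties)
    props[CACHE_KEY_PROPERTY] = STATIC_KEY  # set the dummy property before caching
    cache.setdefault(props[CACHE_KEY_PROPERTY], []).append((document, props))


def collect_messages(cache):
    """Retrieve everything stored under the static key and pull out the messages."""
    messages = []
    for _document, props in cache.get(STATIC_KEY, []):
        for name in MESSAGE_PROPERTIES:
            if name in props:
                messages.append(props[name])
    return messages


cache = {}
# Branch 1: a try/catch failure.
add_to_cache(cache, {"status": "error"}, {"try_catch_message": "Connection timed out"})
# Branch 2: a cleanse failure.
add_to_cache(cache, {"status": "error"}, {"cleanse_result_message": "Field 'zip' is too long"})
# Branch 3: one retrieval by the static key returns both entries.
summary = collect_messages(cache)
```

Because every entry shares the same index value, a single retrieval returns all of them, which is what lets the cache behave like a temporary array.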
Scenario 4: Multiple Lookups
Documents can be added to a cache and looked up multiple times at different stages of the process, based on a particular input, to avoid making the external call multiple times. Consider a case where we have a consolidated list of all orders received from a website that must be sent to both the Order and Invoice departments. However, both departments need additional Product information that is not present in the source data from the website. To avoid making the external query to the product database multiple times (once for orders and once for invoices), we can instead make one call to retrieve all the product details, add them to a document cache, and then perform the lookups against the local cache in the maps.
In this process, Branch 1 extracts the complete information for all possible combinations of product categories and departments. In the Database Operation, be sure to set Batch Count=1 so that each record is returned as a separate document and cached individually. This gives you the desired level of granularity when retrieving data from the cache.
Branches 2 and 3 pass the required data to the Order and Invoice departments. Each needs to look up the value of Product Department based on the category of the ordered item. The Document Cache Lookup function is used within the maps in each branch for this purpose.
Lookup functions work perfectly well for multiple occurrences of line items (unbounded elements). One order can have multiple items and for each item ordered, the lookup function will be able to pick up the corresponding department value based on the item's category.
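The flow above can be sketched conceptually in Python. Everything here is an assumption made for illustration, including the `query_products` stand-in for the database call, the field names, and the output shapes. The point is that the product data is queried once and then looked up per line item in two separate maps.

```python
query_count = 0


def query_products():
    """Hypothetical stand-in for the external product database query."""
    global query_count
    query_count += 1
    return [
        {"category": "Books", "department": "Media"},
        {"category": "Laptops", "department": "Electronics"},
    ]


def lookup_department(product_cache, category):
    """Per-item lookup against the local cache, keyed by category."""
    return product_cache[category]["department"]


def map_order(order, product_cache):
    return {
        "order_id": order["order_id"],
        "lines": [
            {"sku": item["sku"],
             "department": lookup_department(product_cache, item["category"])}
            for item in order["items"]
        ],
    }


def map_invoice(order, product_cache):
    return {
        "invoice_for": order["order_id"],
        "departments": [lookup_department(product_cache, item["category"])
                        for item in order["items"]],
    }


# Branch 1: one query, one cached entry per row (the Batch Count=1 analogue).
product_cache = {row["category"]: row for row in query_products()}

order = {
    "order_id": "SO-1",
    "items": [
        {"sku": "B-10", "category": "Books"},
        {"sku": "L-20", "category": "Laptops"},
    ],
}
# Branches 2 and 3: both maps reuse the same cache.
order_doc = map_order(order, product_cache)
invoice_doc = map_invoice(order, product_cache)
```

Both output documents resolve a department for every line item, while `query_count` stays at 1 because neither map goes back to the external source.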
Tips for Using the Document Cache Lookup Function
Error: Duplicate values for same index
If the same input for the key retrieves multiple entries from the cache, Boomi gives an error "Found n more than 1 document in document cache". This error is specific to document cache lookups performed in a map function or in a parameter value (for example, in a Set Properties or Decision shape). It is perfectly fine for multiple entries to be retrieved from the cache when the cached data is added to the source profile of a map (using Add Cached Data in Source Profile, as in Scenario 1).
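The difference between the two retrieval styles can be modeled as follows. This is a conceptual Python sketch, not Boomi's implementation: `DuplicateCacheEntryError`, `lookup_single`, `load_all`, the error text, and the sample data are all hypothetical.

```python
class DuplicateCacheEntryError(Exception):
    """Hypothetical error raised when a single-value lookup matches several documents."""


def lookup_single(cache, key):
    """Models a lookup function or parameter lookup: at most one match is allowed."""
    matches = cache.get(key, [])
    if len(matches) > 1:
        raise DuplicateCacheEntryError(
            f"{len(matches)} cached documents match key {key!r}"
        )
    return matches[0] if matches else None


def load_all(cache, key):
    """Models adding cached data to a source profile: any number of matches is fine."""
    return list(cache.get(key, []))


cache = {}
for doc in [
    {"category": "Books", "department": "Media"},
    {"category": "Books", "department": "Education"},  # duplicate index value
    {"category": "Laptops", "department": "Electronics"},
]:
    cache.setdefault(doc["category"], []).append(doc)

unique_hit = lookup_single(cache, "Laptops")
all_books = load_all(cache, "Books")

try:
    lookup_single(cache, "Books")
    duplicate_detected = False
except DuplicateCacheEntryError:
    duplicate_detected = True
```

If a lookup must return a single value, one option is to choose index keys that are unique per cached document, for example a composite key.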
Note that records in a cache cannot be deleted or updated.