Transposing and Reverse Transposing a Flat File

Document created by josh_hutchins778263 Employee on Sep 8, 2017Last modified by john_yocum067395 on Sep 13, 2017
Version 21Show Document
  • View in full screen mode

Purpose

The purpose of this document is to demonstrate how we would transpose a Flat File data set and later on in the article, reverse that transposition. I have attached source files titled Part1.csv and Part2.csv to follow along with Part 1 and Part 2 of this article should you be interested in building your own mock up.

 

 

Part 1 - Transpose

 

The Goal

We are attempting to transpose the data below. The data on the top set will be rotated in order to turn it into what is seen below. A use case for doing something like this may be to:

 

 

 

 

Why Do This?

 

There could be a number of reasons and use cases depending on the data structure before and after. Taking in the example above, some reasons to transpose your data could be as follows:

 

- To consolidate the total flow of data (cutting 58% of data used per record set after transposing).

- To change the structure of data as per a developer or endpoint requirement. 

 

The Data

 

This use case of source data is able to work in our favor due to the structure of each column. Why this is is explained below.

#1 - 4 Unique Project fields, these are finite and always occurring for each record set of data. When transposed, these four Project fields will become column headers and no longer used in row data. We will later explain how these 4 Project fields allow us to flatten our data. 

 

#2 - For set of Project fields, there are a set of repeating IDs accompanying them. After transposing, we will be only use 1 id per set of data, cutting it's occurrence by 75 percent within row data.

 

#3 - Free flow data, before being transposed, were values after each Project field. After being transposed, all values for each record are displayed across a single row. 

 

The Process Flow

 

 

1. Message Shape

 

Explained under The Data section of this article .

 

2. FF to XML Map

 

 

Our data comes into the map as a FF with elements of each column within the data. On the XML destination side, we have the following assumptions:

 

- Document_ID is separate of the Column_Name and Column_Value elements because we want to limit the amount of times it occurs and as a result, split our data into 2 files, one containing one occurrence 8675309 and one containing an occurrence of 9998212. We do this by making sure Max Occurs is set to 1.

 

- Transpose_Me contains Column_Name and Column_Value exclusively and sets instance identifier qualifiers = to each Column_Name Project field. Doing this allows us to flatten the data and make it easier to manipulate via the xml profile and profiles thereafter because we can route data by each Column_Name and associate each Column_Name with it's Column_Value in the later profile.

 

- Transpose_Me exists as a "Max Occurs: Unbounded" element because we want to allow all Column_Name Project fields to exist alongside one instance of a Document_ID. If we were to instead set "Max Occurs" to 1, we would have 1 Document_ID to one project field, resulting in a total of 8 documents from our two records. 

 

This first map will output two files, as seen in the screenshot below:

 

Notice that each data sets contain the following:

 

- 1 document ID

- All 4 Column_Name Project fields and associated Column_Value values under a Transpose_Me tag (later to be routed via instance identifier).

 

3. XML to FF

 

When our data passes through the next map, our instances identifiers kick in and route each Column_Name Project field alongside it's associated Column_Value, to the associated instance Identifier. We can now seamlessly map our source data to all destination elements thanks to the instance identifiers. 

 

The second map will output the following two Flat Files:

 

 

The format we want is now complete. Now all that needs to be done is a combine of the two Flat Files.

 

4. Data Process: Combine

 

 

We will be doing a combine shape of our two files. Make sure to select "retain first line as Column Headers"

 

The final output will be as follows: 

 

 

 

 

Part 2 - Reverse - Transpose

 

The Goal

We are now going to take what we did in Part 1 and reverse transpose it for part 2. 

 

 

Why Do This?

 

To show how we can reverse the original transposition and add knowledge on how transposing works within Boomi.

The Data 

 

 

#1 - In our first map, we will be mapping each Project Column to their respective instance identifier Column_Value field in order to repopulate their data structure just after flattening. In the second map, we will be using a profile without any instance Identifiers altogether (as we already have our desired XML format from the previous mapping) in order to map to the Flat File. 

 

#2 - Due to keeping Document_ID a "Max Occurs: 1" and "Looping Option: Unique", We will be once again splitting our data into two files, one for each Unique Document ID encountered. Because we are keeping our Transpose_Me element unbounded in our second map, our XML to FF will loop each document's Document_ID with each row containing a Column_Name, bringing it back to a 1 to Many occurrence in our data. 

 

#3 - This XML value data will be converted back into a row by row value under Column_Value once passed from XML to Flat File.

 

The Process Flow

 

 

1. Message Shape

Explained under The Data section of this Article.

 

2. FF to XML Map

 

 

Our data comes into the map as a FF with elements of each column within the data. On the XML destination side, we have the following assumptions:

 

- Document_ID element is separated from the Column_Name and Column_Value and is set with "Max Occurs: 1" and "Looping Option: Unique".

 

- The Parent Element Transpose_Me and associated instance identifiers we are mapping the Project fields into is set with a  "Max Occurs: Unbounded".

 

- The result of the above will create two documents, each housing a unique Document_ID and all required Project Column_Name fields and associated values. 

 

The resulting two document outputs from this map will look exactly like the first part of this article, as also seen below:

 

 

3. XML to FF

 

Thanks to the FF to XML Map, we no longer need our instance identifiers. The data will map directly to their respective fields and output the two documents below:

 

 

4. Data Process: Combine

 

 

We will be doing a combine shape of our two files. Make sure to select "retain first line as Column Headers"

 

The final output will be as follows: 

 

 

Conclusion

 

This document is not a universal guide for transposing and should only be used as a reference. If you have any questions, please be sure to post down below and I will do my best to assist!

2 people found this helpful

Attachments

Outcomes