I have a fairly simple question: my data is in a CSV file and the columns are not fixed in position. I need to pick comma-delimited data from the file based on the column headers (meaning the column headers can change sequence from run to run).
I had a similar issue some time ago, and ended up with this solution:
I start by trying to map with the normal flat file profile; if that fails inside the Try/Catch, the document goes through the other Map shape. The destination document has a date field which does not exist in the abnormal flat file profile, so that mapping fails and the document is routed to the fallback.
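The Try/Catch routing above can be sketched outside Boomi; here is a minimal Python illustration (the field names `name` and `date` are invented for the example, standing in for the two flat file profiles):

```python
# Hypothetical sketch of the Try/Catch fallback: attempt the normal
# profile first; if a required field is missing, route the record
# through the "abnormal" mapping instead.
def map_normal(record):
    # The normal profile requires a date field (name invented here).
    return {"name": record["name"], "date": record["date"]}

def map_abnormal(record):
    # The abnormal profile has no date field, so default it to empty.
    return {"name": record["name"], "date": ""}

def route(record):
    try:
        return map_normal(record)
    except KeyError:
        # Missing date -> the normal mapping fails, take the fallback.
        return map_abnormal(record)

result = route({"name": "Ankit"})  # no date, so the fallback is used
```

In Boomi the failure surfaces as a map error caught by the Try/Catch shape rather than a `KeyError`, but the control flow is the same.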
Hope it helps.
Thanks for the quick response.
But in my case there is no guarantee of data positioning; any field can be anywhere in the file. Only the number and names of the headers stay the same in every run; the sequence keeps changing.
This cannot work as-is. You need a defined profile/schema so that the incoming data matches the profile and loads successfully. If your incoming data switches header positions, you can still get the data loaded, but you may not get the desired output for each element. This is integration basics.
I am curious to know why your elements keep switching positions. Is there a specific reason for this? Best practice is to ask the source to stick to a defined format.
Thanks for the quick response. This requirement is weird, but it has come to us from the source file owner.
Britto is absolutely right: this is not best practice. I would call it worst practice.
But I couldn't help creating a small process that can handle columns arriving in random order:
Source Flat File Profile:
Destination Flat File Profile:
Map Function PrependAppendXmlTag:
Output = "<"+Colname+">"+Value+"</"+Colname+">";
Output from Map Shape:
Data Process Shape:
Output from Data Process Shape (3 documents)
Output from Data Process Shape:
Output from Message Shape:
Output from Map Shape
The solution is rather primitive but works.
First I "XML-ify" the source flat file profile.
When that is done I can map to the desired flat file profile.
The only prerequisite in this solution is that you know the column names coming in, but it would probably be possible to work around that if you don't know them.
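The "XML-ify, then map" idea can be sketched outside Boomi as well; here is a minimal Python illustration (the column names and sample rows are invented for the example):

```python
import csv
import io
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Sample CSV whose column order differs from run to run
# (column names Name/Age/City are invented for this sketch).
raw_csv = """Age,City,Name
34,Oslo,Leif
28,Delhi,Ankit
"""

# Step 1: "XML-ify" each row -- wrap every value in a tag named
# after its column header, mirroring the PrependAppendXmlTag function.
rows_xml = []
for row in csv.DictReader(io.StringIO(raw_csv)):
    cells = "".join(f"<{col}>{escape(val)}</{col}>" for col, val in row.items())
    rows_xml.append(f"<Row>{cells}</Row>")
xml_doc = "<Data>" + "".join(rows_xml) + "</Data>"

# Step 2: map to the desired flat file profile with a fixed column
# order -- looking cells up by tag name makes the input order irrelevant.
desired_order = ["Name", "Age", "City"]
root = ET.fromstring(xml_doc)
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(desired_order)
for row in root.findall("Row"):
    writer.writerow([row.findtext(col, default="") for col in desired_order])

reordered_csv = out.getvalue()  # columns now in the fixed, desired order
```

The key point matches Leif's process: once each value is addressable by its column name instead of its position, the output mapping no longer cares how the source columns were ordered.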
Thanks a lot Leif,
I will definitely try this out.
Get back to me if you have any questions.
My suggestion would be to convert the raw CSV to XML, then convert it back to a fixed/ordered CSV, and continue with processing.
The XML could be of the following structure :
<Data>
  <Row>
    <element>
      <columnName> mapped to the column header </columnName>
      <value> value </value>
    </element>
  </Row>
</Data>
The node 'Row' would be looped based on number of rows and the element 'element' would be looped based on number of columns in that row.
You would be able to extract the information from XML using qualifiers on column name.
Hope this helps. I haven't tried this out; it is just a thought I had as I stumbled upon this question. Depending on the implementation, a few tweaks could be required.
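The Data/Row/element structure described above can be sketched in Python like this (column names and sample data are invented for the example; the XPath predicate stands in for a qualifier on columnName):

```python
import csv
import io
import xml.etree.ElementTree as ET

# Sample CSV; headers may arrive in any order (names are invented).
raw_csv = """City,Name
Oslo,Leif
Delhi,Ankit
"""

# Build the generic <Data>/<Row>/<element> structure: one <element>
# per cell, holding the column header and the value separately.
data = ET.Element("Data")
for record in csv.DictReader(io.StringIO(raw_csv)):
    row = ET.SubElement(data, "Row")
    for col, val in record.items():
        elem = ET.SubElement(row, "element")
        ET.SubElement(elem, "columnName").text = col
        ET.SubElement(elem, "value").text = val

# Extract a field by column name regardless of its position --
# the [columnName='Name'] predicate acts as the qualifier.
names = [e.findtext("value")
         for e in data.iterfind(".//element[columnName='Name']")]
```

Because every cell carries its own column name, the same qualifier works no matter where that column appeared in the incoming file.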