
How do you handle errors when processing thousands of records in ETL? is there an alternative to try/catch?

Question asked by SatyaKomatineni3761 on Apr 30, 2016
Latest reply on May 5, 2016 by Sjaak Overgaauw

If I have thousands of "records" (say 10,000) in one large file going through a process with various shapes, I start from the assumption that it is inefficient to split that file into 10,000 individual "documents", one per record. This assumption comes from the fact that there may be 10,000 temp files in the temp space between each shape.


Say I do process one record at a time, each as its own document. Then I can use try/catch and capture the failed records.


But if my assumption is valid, then I might want to split the ONE file into, say, 10 "documents" with 1,000 "records" each and pass those 10 documents through the shapes. This is efficient from a temp-file-space perspective, and also for something like a DB connector that can do batch commits.
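To illustrate the batching idea outside of Boomi, here is a minimal plain-Python sketch (all names hypothetical, not a Boomi shape): split a large record list into fixed-size batches so each "document" carries many records.

```python
def chunk(records, batch_size=1000):
    """Yield successive batches of at most batch_size records."""
    for i in range(0, len(records), batch_size):
        yield records[i:i + batch_size]

records = list(range(10_000))          # stand-in for 10,000 parsed records
batches = list(chunk(records))
print(len(batches), len(batches[0]))   # 10 batches of 1,000 records each
```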


But in my try/catch, don't I get the failed document that contains all 1,000 records? Because (again an assumption) try/catch works at the document level and not at the "record" level.


Are there alternatives for catching errors at the "record" level instead of the document level?
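One general pattern that reconciles batch efficiency with record-level error capture (sketched here in plain Python with hypothetical names, not Boomi-specific): attempt the whole batch first, and only on failure fall back to record-at-a-time processing so that just the bad records are captured.

```python
def process_batch(batch, commit, failed):
    """Commit a whole batch; on failure, retry record by record
    so only the genuinely bad records land in `failed`."""
    try:
        commit(batch)                  # fast path: one commit per batch
    except Exception:
        for record in batch:           # slow path: isolate the failures
            try:
                commit([record])
            except Exception:
                failed.append(record)

def commit(records):
    # stand-in for a DB batch commit; rejects negative "records"
    if any(r < 0 for r in records):
        raise ValueError("bad record")

failed = []
process_batch([1, 2, -3, 4], commit, failed)
print(failed)                          # only the bad record, not all four
```

The trade-off is one extra pass over a batch only when that batch contains at least one failure; clean batches still commit in a single call.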
