What's the best practice for sources that don't offer a way to filter data on last update date? For example, I'm currently using the Anaplan connector to store data in MDM. Unfortunately, this Anaplan connector does not have filter capabilities, and the source records do not have a last update date field. The connector sends all records as single CSV files.
This can cause issues that take up a lot of time for data stewards. For example, suppose the data quality of the source data is pretty bad, e.g. fields you use in matching rules are empty. That causes lots of quarantine records with the status "Data quality error", which is logical. But this bad data quality causes lots of work for data stewards, e.g. deleting the same records over and over again. And we are talking about thousands of records coming in each day.
What's the best practice for this? Filtering the data before you upsert it to MDM is an option, but you still have to inform "the business" about data quality issues. Or should you filter/select data in the MDM quarantine and delete the quarantine records every time? Other recommendations are welcome.
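To make the "filter before upsert" option concrete, here is a minimal sketch of what I have in mind. It is not tied to any MDM or Anaplan API. It assumes the full extract is available as CSV text, that each record has a stable key column, and that a fingerprint per key can be persisted between runs. The names `split_extract`, `row_hash`, `key_field` and `match_fields` are hypothetical. Unchanged records are skipped, changed records with empty matching fields go to a reject list that could be reported to the business, and the rest would be upserted.

```python
import csv
import hashlib
import io


def row_hash(row):
    # Stable fingerprint of a record: field names and values in sorted order.
    payload = "\x1f".join(f"{k}={row[k]}" for k in sorted(row))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def split_extract(csv_text, previous_hashes, key_field, match_fields):
    """Split one full extract into (to_upsert, rejected, current_hashes).

    previous_hashes maps record key -> fingerprint from the previous run
    (assumption: you persist this somewhere between runs).
    """
    to_upsert, rejected, current_hashes = [], [], {}
    # restkey keeps surplus columns under a string key so sorting still works.
    reader = csv.DictReader(io.StringIO(csv_text), restkey="_extra")
    for row in reader:
        key = row[key_field]
        digest = row_hash(row)
        current_hashes[key] = digest
        if previous_hashes.get(key) == digest:
            continue  # unchanged since the last run, so skip it
        # Fields used in matching rules must not be empty (hypothetical rule).
        missing = [f for f in match_fields if not (row.get(f) or "").strip()]
        if missing:
            rejected.append({"key": key, "missing": missing})
        else:
            to_upsert.append(row)
    return to_upsert, rejected, current_hashes
```

On the first run you would pass an empty dict as `previous_hashes`; on later runs you pass the `current_hashes` returned by the previous run. Because rejected records are fingerprinted too, an unchanged bad record is reported only once rather than every day, and it shows up again only after the source changes it. Whether that reporting behaviour is what the business wants is an open question.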