Training Data Quality

LinkedIn Post 10: Training Data Quality


Your synthetic comparable algorithm is trained on 10,000 transactions. That sounds robust.


But which transactions? Over what time period? Within what geographic scope? Under what data quality filters?


If the training data is fundamentally different from the subject market, the synthetic output is unreliable. Using coastal resort data to value inland agricultural property? Problematic. Using 2008 financial crisis data in a normal market? Concerning. Using transaction data from only one property type to value a different type? Risky.


Before using synthetic comparables, assess the training data quality. Does it appropriately represent the relevant market? Are significant property types or time periods missing? Were exclusions appropriate?
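
One way to make that assessment concrete is a quick coverage count. The sketch below is illustrative only: the `Transaction` fields, the `assess_coverage` helper, the three-year recency window, and the 30-match minimum are all assumptions chosen for the example, not a standard or any vendor's method. It counts how many training transactions share the subject's region, property type, and time period, and flags any dimension that falls short.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    # Hypothetical record layout; real training data will carry more fields.
    region: str
    property_type: str
    sale_date: date


def assess_coverage(training, subject_region, subject_type, valuation_date,
                    max_age_years=3, min_matches=30):
    """Count training transactions that resemble the subject and flag gaps.

    max_age_years and min_matches are assumed thresholds for illustration.
    """
    def is_recent(t):
        # Age cutoff approximated as 365 days per year.
        age_days = (valuation_date - t.sale_date).days
        return 0 <= age_days <= max_age_years * 365

    same_region = [t for t in training if t.region == subject_region]
    same_type = [t for t in training if t.property_type == subject_type]
    recent = [t for t in training if is_recent(t)]
    all_three = [t for t in same_region
                 if t.property_type == subject_type and is_recent(t)]

    counts = {
        "same_region": len(same_region),
        "same_type": len(same_type),
        "recent": len(recent),
        "all_three": len(all_three),
    }
    # Flag every dimension with fewer matches than the assumed minimum.
    flags = [name for name, n in counts.items() if n < min_matches]
    return counts, flags


# Made-up data: 10,000 transactions, mostly recent coastal resort sales,
# plus inland agricultural sales from the 2008 crisis period.
training = (
    [Transaction("coastal", "resort", date(2023, 6, 1))] * 9000
    + [Transaction("inland", "agricultural", date(2008, 10, 1))] * 1000
)
counts, flags = assess_coverage(
    training, "inland", "agricultural", date(2024, 1, 1)
)
print(counts)
print(flags)
```

In this made-up data set, each dimension looks well covered on its own, yet no transaction matches the subject on region, type, and recency together. That is the gap a headline count of 10,000 hides.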


Document your assessment. It becomes part of your due diligence record.


 
 
 
