Risk scores, label bias, and everything but the kitchen sink

Sci Adv. 2024 Mar 29;10(13):eadi8411. doi: 10.1126/sciadv.adi8411. Epub 2024 Mar 29.

Abstract

In designing risk assessment algorithms, many scholars promote a "kitchen sink" approach, reasoning that more information yields more accurate predictions. We show, however, that this rationale often fails when algorithms are trained to predict a proxy of the true outcome, for instance, predicting arrest as a proxy for criminal behavior. With this "label bias," one should exclude a feature if its correlation with the proxy and its correlation with the true outcome have opposite signs, conditional on the other model features. This criterion is often satisfied when a feature is weakly correlated with the true outcome and when that feature and the true outcome are both direct causes of the proxy outcome. For example, criminal behavior and geography may be weakly correlated and, due to patterns of police deployment, both direct causes of one's arrest record. In that case, excluding geography from a criminal risk assessment will weaken the algorithm's performance in predicting arrest but improve its ability to predict actual crime.
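To make the practical consequence of this criterion concrete, the following is a minimal simulation sketch, not taken from the paper: it assumes a hypothetical data-generating process in which a risk factor x drives criminal behavior, a geography indicator g is nearly unrelated to behavior but strongly affects whether crime is detected, and models are trained on arrest (the proxy). Under these assumptions, adding geography should improve prediction of arrest while degrading prediction of actual crime.

```python
"""
Illustrative simulation of label bias; the data-generating process and
variable names (x, g, crime, arrest) are assumptions, not from the paper.
"""
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 200_000

# x: an individual risk factor that genuinely drives criminal behavior
# g: a geography indicator (1 = heavily policed area), nearly unrelated to behavior
x = rng.normal(size=n)
g = rng.binomial(1, 0.5, size=n)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# True outcome: criminal behavior depends strongly on x, only weakly on geography.
crime = rng.binomial(1, sigmoid(-2.0 + 1.5 * x + 0.1 * g))

# Proxy outcome: arrest requires both criminal behavior and detection,
# and detection depends strongly on geography (police deployment).
detected = rng.binomial(1, sigmoid(-1.0 + 2.0 * g))
arrest = crime * detected

# Split into train/test halves.
train = np.arange(n) < n // 2
test = ~train

features_without_g = x.reshape(-1, 1)
features_with_g = np.column_stack([x, g])

# Both models are trained on the *proxy* label (arrest), as in practice.
m_without_g = LogisticRegression().fit(features_without_g[train], arrest[train])
m_with_g = LogisticRegression().fit(features_with_g[train], arrest[train])

scores_without_g = m_without_g.predict_proba(features_without_g[test])[:, 1]
scores_with_g = m_with_g.predict_proba(features_with_g[test])[:, 1]

# Including geography helps predict the proxy (arrest) ...
print("AUC for arrest, without geography:", roc_auc_score(arrest[test], scores_without_g))
print("AUC for arrest, with geography:   ", roc_auc_score(arrest[test], scores_with_g))

# ... but, under this data-generating process, hurts prediction of the true outcome.
print("AUC for crime,  without geography:", roc_auc_score(crime[test], scores_without_g))
print("AUC for crime,  with geography:   ", roc_auc_score(crime[test], scores_with_g))
```

In this sketch, geography satisfies the exclusion criterion in spirit: conditional on x, it carries almost no information about criminal behavior but a great deal of information about arrest, so a model trained on arrest assigns it a large weight that distorts rankings of true crime risk.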