Advik Shreekumar

Job Market Candidate

Research Fields

Health Economics, Behavioral Economics, Econometrics

Job Market Paper

X-raying Experts: Decomposing Mistakes in Radiology

Medical errors are consequential but difficult to study, usually requiring laborious human review of past cases. I apply algorithmic tools to measure the extent and nature of medical error in one of the most common medical decision settings: radiologists interpreting chest x-rays. I use state-of-the-art natural language processing to extract radiologists' claims about cardiac health from their free-text reports, and compare these claims to algorithmic predictions of the same. I adjudicate between the two using exogenously administered blood tests that directly measure cardiac health. At least 55% of radiologists make mistakes, issuing reports that predictably misrank the severity of patients' cardiac health. Careful choice of algorithmic benchmark shows that these errors reflect, in roughly equal proportion, individual radiologists falling short of best clinical practice (a "human frontier") and a further gap between best practice and algorithmic predictions (a "machine frontier"). Reaching the human frontier would reduce radiologists' false negative rates by 20% and false positive rates by 2%; reaching the machine frontier would reduce false negatives by an additional 12% and false positives by 2%. In contrast to a leading hypothesis in the medical literature, these errors do not reflect radiologists overweighting salient information; rather, radiologists systematically under-react to signals of patient risk. Finally, the mistakes revealed by machine learning do not skew against underrepresented groups.
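
As a rough illustration of the adjudication step (a sketch under assumed data, not the paper's pipeline), the comparison amounts to scoring radiologists' binary claims and a model's predictions against the same blood-test ground truth. All file and column names below are hypothetical.

```python
# Illustrative only: compares two sources of binary cardiac-health claims
# against a blood-test ground truth. Data layout is assumed, not the paper's.
import pandas as pd

def error_rates(pred: pd.Series, truth: pd.Series) -> tuple[float, float]:
    """Return (false negative rate, false positive rate) for binary labels."""
    fn = ((pred == 0) & (truth == 1)).sum() / (truth == 1).sum()
    fp = ((pred == 1) & (truth == 0)).sum() / (truth == 0).sum()
    return fn, fp

# Hypothetical file: one row per exam, with the radiologist's claim, the
# model's prediction, and the blood-test result used as ground truth.
df = pd.read_csv("xray_cases.csv")
for source in ["radiologist_claim", "model_prediction"]:
    fn, fp = error_rates(df[source], df["blood_test_positive"])
    print(f"{source}: FN rate = {fn:.1%}, FP rate = {fp:.1%}")
```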

Publications

When Guidance Changes: Government Stances and Public Beliefs (with Charlie Rafkin and Pierre-Luc Vautrey)

Journal of Public Economics, April 2021

Governments often make early recommendations about issues that remain uncertain. Do governments’ early positions affect how much people believe the latest recommendations? We investigate this question using an incentivized online experiment with 1,900 US respondents in early April 2020. We present all participants with the latest CDC projection about coronavirus death counts. We randomize exposure to information that highlights how President Trump previously downplayed the coronavirus threat. When the President’s inconsistency is salient, participants are less likely to revise their prior beliefs about death counts from the projection. They also report lower trust in the government. These results align with a simple model of signal extraction from government communication, and have implications for the design of changing guidelines in other settings.
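
The mechanism can be illustrated with a textbook Gaussian signal-extraction setup (my stylization; the paper's model may differ): making past inconsistency salient raises the perceived noise in government communication, which lowers the weight placed on the new projection.

```latex
% Stylized signal extraction (illustrative; not the paper's exact model).
% Prior belief: \theta \sim N(\mu_0, \sigma_0^2).
% Government projection: s = \theta + \epsilon, with \epsilon \sim N(0, \sigma_s^2),
% where \sigma_s^2 is larger when the source appears inconsistent.
\[
  \mathbb{E}[\theta \mid s] = (1 - \lambda)\,\mu_0 + \lambda\, s,
  \qquad
  \lambda = \frac{\sigma_0^2}{\sigma_0^2 + \sigma_s^2}.
\]
% A larger \sigma_s^2 shrinks \lambda, so beliefs revise less toward the projection.
```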


Working Papers

Managing Emotions: The Effects of Online Mindfulness Meditation on Mental Health and Economic Behavior (with Pierre-Luc Vautrey)

Mindfulness meditation has gained popularity, fueled by accessible smartphone apps and rising concerns about mental health. While such apps are claimed to affect mental well-being, productivity, and decision making, existing evidence is inconclusive due to limited sample sizes and high attrition. We address these concerns by conducting a large-scale, low-attrition experiment with 2,384 US adults, randomizing access and usage incentives for a popular mindfulness app. App access improves an index of anxiety, depression, and stress by 0.38 standard deviations (SDs) at two weeks and 0.46 SDs at four weeks, with persistent effects three months later. It also improves earnings on a focused proofreading task by 2 percent. However, we find near-zero effects on a standard cognitive test (a Stroop task) and on decisions over risk and information acquisition, where past economics research has indicated that emotions affect choice. This study provides evidence that digital mindfulness improves mental health and can raise productivity, but suggests that these effects neither stem from the cognitive skills captured by traditional measures nor accompany more primitive changes in the information and risk preferences we measure.


Work in Progress

Common Functional Decompositions Can Mis-attribute Differences in Outcomes Between Populations (with Manuel Quintero, William Stephenson, and Tamara Broderick)

We often wish to explain why an outcome differs between two populations. For instance, if one expert decision maker proves more accurate than another, is that due to differences in the cases they handle (i.e., covariates) or in their skill at judging them (i.e., outcomes given covariates)? The Kitagawa-Oaxaca-Blinder (KOB) decomposition is a standard econometric tool that splits a difference in mean outcomes across two populations into terms that depend on covariates and terms that depend on the relationship between covariates and outcomes. However, the KOB decomposition assumes a linear relationship between covariates and outcomes, while the true relationship may be meaningfully nonlinear. Modern machine learning offers many nonlinear functional decompositions of the relationship between outcomes and covariates in a single population, and it seems natural to extend the KOB decomposition using them. We observe that a successful extension should not attribute differences to covariates (or, respectively, to outcomes given covariates) if those are the same in the two populations. Unfortunately, we demonstrate that two common decompositions, the functional ANOVA and Accumulated Local Effects, can misattribute differences to outcomes given covariates, even in simple examples where outcomes given covariates are identical in the two populations. We characterize when the functional ANOVA misattributes and identify a general property that any decomposition should satisfy to avoid misattribution: a decomposition that is independent of its input distribution does not misattribute. We further conjecture that misattribution arises in any reasonable additive decomposition that depends on the distribution of the covariates.
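
For reference, the standard two-fold KOB decomposition that the paper seeks to generalize (textbook form; the notation is mine, not the paper's):

```latex
% Kitagawa-Oaxaca-Blinder in its common two-fold form. With linear fits
% y = x'\hat{\beta}_g + e in populations g \in \{A, B\}:
\[
  \bar{y}_A - \bar{y}_B
    = \underbrace{(\bar{x}_A - \bar{x}_B)' \hat{\beta}_B}_{\text{covariates}}
    + \underbrace{\bar{x}_A' (\hat{\beta}_A - \hat{\beta}_B)}_{\text{outcomes given covariates}}.
\]
% The paper asks how to preserve this attribution when the linear fit is
% replaced by a nonlinear functional decomposition.
```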