The (mis)estimation of neighborhood effects: causal inference for a practicable social epidemiology
Introduction
A “neighborhood effect” is the independent causal effect of a neighborhood (i.e., residential community) on any number of health and/or social outcomes (Jenks & Mayer, 1990; Mayer & Jenks, 1999). Of interest have been (a) so-called contextual effects, which presumably emerge from within-neighborhood social interactions, and (b) so-called integral effects that emerge from toxic dumps, parks, sidewalks, etc. (Ozonoff et al., 1987; Geschwind et al., 1992; Susser & Susser, 1996; see Diez-Roux, 1998), though one need not distinguish between them.1
Epidemiologists have long recognized that people residing in different areas have differing health outcomes (cf. Macintyre, Maciver, & Sooman, 1993; McMichael, 1999; Catalono & Pickett, 2000; Lawson, 2001). Most would agree that spatial variation in morbidity and mortality is somehow associated with the clustering of genetic predispositions, cultural norms, opportunity structures, and/or environmental conditions. By definition, advantaged neighborhoods offer cleaner, safer, and less stressful environments as compared to, say, ghetto areas. It would be shocking to learn that such contexts did not somehow impact health. The question is about magnitude, mechanism, and mutability: How big are the effects, how do they emerge, and how might such information be exploited to improve the public's health?
Social scientists have long suffered an interest in neighborhood effects, which they view as a special case of context effects, the raison d’être of social science. Most know that Durkheim (1951 [1897]) aimed to show that social forces (e.g., norms and values) external to the individual influenced suicide and that Weber (1958 [1905]) aimed to show how religious ideology shaped economic behavior. But many distinguished contemporary social scientists, notably Merton (1949), Lazarsfeld (Lazarsfeld & Menzel, 1961), Blau (Blau, 1960; Blau & Duncan, 1967), Coleman (Coleman, 1958, 1990; Coleman et al., 1966), Sewell (Sewell, 1964; Sewell & Armer, 1966), Blalock (1984), Hauser (1970, 1974), Jenks and Mayer (Jenks & Mayer, 1990; Mayer & Jenks, 1999), Bandura (1986), Sampson (Sampson, 1991; Sampson, Raudenbush, & Earls, 1997), Massey (Massey & Denton, 1993; Massey, Gross, & Shibuya, 1994), Wilson (1987), Manski (1993a, 1993b, 1995, 2000), Arrow (1971, 1994), Achen & Shively (1970, 1976, 1980), Akerlof (1997), Bowles and Gintis (Bowles, 1998; Bowles & Gintis, 1977; Gintis, 2000; Bowles & Gintis, 2002), Shively (Achen & Shively, 1974; Shively, 1987; Achen & Shively, 1995), and King (1997), have ably addressed related questions from an analytic/statistical perspective.
Of special import are the similarities between the epidemiologist's neighborhood and the educational scientist's school for they lead us to methodological work on “school effects” by Raudenbush and colleagues (Raudenbush & Bryk, 1986; Raudenbush & Whillms, 1995; Raudenbush & Sampson, 1999a), among others (e.g., Coleman et al., 1966; Aitkin & Longford, 1986; Goldstein, 1995). The problem for educational science is how to estimate the independent effect of good teachers or school administrators on student achievement. The analogous problem for social epidemiologists is to estimate the independent effect of toxic dumps, locally promulgated smoking policies, or inducible increases in social networking, on a neighborhood's health. Both problems share two fundamental characteristics. First, they are typically analyzed with non-experimental (i.e., observational) data. Second, people/students are nested within neighborhoods/schools, which yields a hierarchical data structure where measurements are taken on both individuals and the groups in which they act. How to address these problems is the central concern of this paper.
From an epidemiological perspective, it is difficult to overstate the importance of studying contexts such as neighborhoods (cf. Cassel, 1976; McMichael, 1999; Susser, 1999; Berkman, Glass, Brissette, & Seeman, 2000; Krieger, 2001). Social forces, above and beyond any individual, have been repeatedly shown to play an important role in how we perceive, measure, and address health and illness (cf. Parsons, 1951; Starr, 1982; Rose, 1985; Clark, Potter, & McKinlay, 1991; Barr, 1995; McKinlay, 1996; Feldman et al., 1997). Even if we could measure more person-level characteristics, including them all in a model for health would violate the principle of parsimony, to say nothing of the theoretical arguments against the atomistic fallacy, biophysical reductionism, or other such “Robinson Crusoe” assumptions (Link & Phelan, 1995; Kaplan, 1996; Susser, 1998). It just may be that “upstream” causes, i.e., those that systematically affect people through their neighborhoods or social groups, are more amenable to interventions designed to improve the public's health.
Focused epidemiologic interest in neighborhood effects dates back to Chadwick's sanitation efforts, circa 1842. Contemporary efforts begin with Cochran, who in the early 1960s developed multivariable regression to help the city of Baltimore estimate the effect of public housing on health-related outcomes (Salsbur, 2001a, 2001b). Motivated by advances in social epidemiology, multilevel theories, and sophisticated statistical models, interest in such questions has recently surged. Pickett and Pearl (2001) reviewed 25 “neighborhood effect” studies published in epidemiology journals since the mid-1980s and found that investigators detected small but consistent “context” effects of group-level socioeconomic status (SES) on health outcomes. Diez-Roux et al. (2001) recently published “Neighborhood of Residence and Incidence of Coronary Heart Disease” in the New England Journal of Medicine, reporting that people living in lower SES neighborhoods had a higher incidence of cardiovascular disease (CVD), independent of their individual-level SES. What is more, the broad notions are now central to our greater discipline: a naive Medline search for “multilevel” or “contextual” in TITLE revealed over 1200 citations; an unrestricted search of the same key words yielded almost 5000 citations. Enthusiasm for such studies is understandable, for they exemplify the effect of social forces, emergent contexts, and social relationships on health, the raison d’être of social epidemiology.
Yet due largely to persistent and complex methodological obstacles, along with a lack of attention to them, the causal effect of neighborhood contexts on health continues to confuse and elude us (see Hook, 2001). There appear to be no multilevel neighborhood effect studies with observational data, including those cited above, that directly confront causal inference.2 What is more, despite over three decades of methodological discussion (cf. Hauser, 1970, 1974; Stipak & Hensler, 1982; Blalock, 1984; Gray, 1989; Swaminathan, 1989; Von Korff, Koepsell, Curry, & Diehr, 1992; Manski, 1993a; DiPrete & Forristal, 1994; Draper, 1995; Raudenbush & Whillms, 1995; Duncan, Connell, & Klebanov, 1997; Blakely & Woodward, 2000; Greenland, 2001, 2002), it is evident that many social epidemiologists are not clear on exactly what multilevel models are or how they may be used to estimate and interpret causal neighborhood effects.
This paper aims to advance our understanding and assist social epidemiologists designing, conducting and/or reviewing multilevel neighborhood effect studies. The first section motivates a model for estimating neighborhood effects. The second section develops, with causal inference in mind, the now common multilevel model for observational data. The third section adopts a critical methodological perspective to show the impossibility of estimating useful neighborhood effects with a regression model, of which the multilevel model is a special case. In order to avoid criticizing without an alternative, the fourth section shows the relationship between multilevel neighborhood effect models and randomized community trial designs, and argues the latter appears to be the best bet for estimating useful neighborhood effects. We conclude by summarizing findings and suggesting that in concert with the development of both a dependency-based methodology and a rigorous theory of social interaction, the community trial be viewed as the canonical experimental design for a social epidemiology seeking to actually improve the public's health.
Methodologists will appreciate the enormous and subtle complexities to be addressed. Such issues are typically handled in isolation and with great precision. But this creates a disparate and technical literature often inaccessible to social epidemiologists. We presume the dearth of tailored and richly annotated translations explains much of the current confusion. Pursuant to remedy, we build our case slowly, take liberties with nomenclature (equations, hats, Greek letters and such), employ a conversational style, and include abundant footnotes and citations. Reaching a broader audience is consistent with our ultimate goal of encouraging a more thoughtful social epidemiological methodology, more likely to provide valid inferences, and thereby the basis for scientific understanding and sound policy recommendations for improving the public's health.
Section snippets
Motivating a causal model for neighborhood effects
This section aims to motivate a causal model for estimating neighborhood effects. The word “cause” is central for it illuminates practical obstacles and potential solutions, and because it links this paper to the important and growing interdisciplinary interest in causal inference (cf. Heckman & Smith, 1995; Manski, 1995; Sobel, 1995; McKim & Turner, 1997; Greenland, Pearl, & Robins, 1999; Kaufman & Poole, 2000; Pearl, 2000; Robins, 2001; Greenland, 2002; Rosenbaum, 2002; Shadish, Cook, &
A causal multilevel model for neighborhood effects
We now conceptually develop the standard multilevel model for estimating neighborhood effects with observational data. The section is important because despite the vast and growing literature cited above, no one appears to have conceptually developed the multilevel model with an eye on causal inference.14
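For orientation, the two-level random-intercept specification commonly used in this literature can be sketched as follows. This is a minimal illustration, not the paper's own development: the notation is assumed, with i indexing persons, j indexing neighborhoods, X a person-level covariate, and Z a neighborhood-level exposure.

```latex
% Assumed notation (not taken from the source):
%   Y_{ij}  health outcome of person i in neighborhood j
%   X_{ij}  person-level covariate (e.g., individual SES)
%   Z_{j}   neighborhood-level exposure (e.g., area SES)
\begin{align}
  Y_{ij}     &= \beta_{0j} + \beta_{1} X_{ij} + \varepsilon_{ij},
             & \varepsilon_{ij} &\sim N(0, \sigma^{2}) \\
  \beta_{0j} &= \gamma_{00} + \gamma_{01} Z_{j} + u_{0j},
             & u_{0j} &\sim N(0, \tau^{2})
\end{align}
```

In such a specification, the coefficient \gamma_{01} is the quantity usually reported as the "neighborhood effect". Whether it supports a causal reading depends on assumptions about how people come to live in their neighborhoods, and those assumptions are not part of the model itself.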
Methodological obstacles
Methodological research is concerned with the logic of causal inference. The objective is to learn what conclusions can and cannot be drawn given a specified combination of assumptions (Manski, 1995). In short, we explore the extent to which an effect parameter can be “identified” through various research designs, where “identified” may be loosely defined as accurately estimated or detected (see Hsiao, 1983; Manski, 1995 for formal definitions).
We now adopt a critical methodological perspective
An alternative: community trials
The preceding raises the nihilistic question: Are multilevel neighborhood effect studies of any use to a practicable social epidemiology? Our answer is “yes, but…”
Estimates of independent neighborhood effects from observational data and multilevel models, as described above, appear as if they will always be wrong, but some might be useful for theory development, preliminary testing, and provisional conclusions when experiments are not possible. There is no question that neighborhood
Conclusion
Ever since Durkheim empirically “demonstrated” that emergent properties of groups influenced the behavior of individuals independent of their background characteristics, social scientists have aimed to estimate them. But the quantification of such effects has proved both elusive and vexing. Exploiting recent theoretical and statistical advances, social epidemiologists recently have joined the quest by examining neighborhood effects with sophisticated multilevel models and observational data.
Acknowledgements
This paper was supported by grant HL61573 from the National Heart, Lung and Blood Institute (NHLBI/NIH). Beyond the helpful recommendations of two anonymous reviewers, the comments and criticisms of several colleagues improved this paper. Thanks to Andre Araujo, Henry Blackburn, Heather R. Britt, Henry A. Feldman, David A. Freedman, Pamela Jo Johnson, Jay S. Kaufman, Ichiro Kawachi, David M. Murray, Stephen W. Raudenbush, Peter H. Rossi, Ruth N. Lopez Turley, and members of the Social Epi
References (309)
Theory as mediating variables: Why aren’t community interventions working as desired? Annals of Epidemiology (1997)
The paucity of effects in community trials: Is secular trend the culprit? Preventive Medicine (1999)
From social integration to health: Durkheim in the new millennium. Social Science & Medicine (2000)
Models for cultural inheritance: group mean and group variation. Theoretical Population Biology (1973)
A brief observational measure for urban neighborhoods. Health Place (2001)
Bringing social structure back into clinical decision making. Social Science & Medicine (1991)
Selecting end point variables for a community intervention trial. Annals of Epidemiology (1997)
Statistical design of REACT (Rapid Early Action for Coronary Treatment), a multisite community trial with continual data collection. Controlled Clinical Trials (1998)
A theoretical and empirical analysis of context: Neighbourhoods, smoking and youth. Social Science & Medicine (2002)
Cross-level inference (1995)
Multiple regression: Testing and interpreting interactions
Statistical modeling issues in school effectiveness studies. Journal of the Royal Statistical Society, Series A
The market for ‘Lemons’: Quality uncertainty and the market mechanism. Quarterly Journal of Economics
The economics of caste and of the rat-race and other woeful tales. Quarterly Journal of Economics
A theory of social custom, of which unemployment may be one consequence. Quarterly Journal of Economics
Social distance and social decisions. Econometrica
Environmental equity: The demographics of dumping. Demography
Environmental equity in superfund. Demographics of the discovery and prioritization of abandoned toxic sites. Evaluation Review
Political and economic evaluation of social effects and externalities
Methodological individualism and social knowledge. American Economic Review
Social foundations of thought and action
The effects of organizational structure on primary care outcomes under managed care. Annals of Internal Medicine
The evolution of norms. American Journal of Sociology
Toward a methodology for mere mortals
Selection biases in sociological data. Social Science Research
Statistical inference for apparent populations (with discussion)
The importance of specifying the underlying biologic model in estimating the probability of causation. Health Physics
A randomised controlled trial of a community intervention to prevent adolescent tobacco use. Tobacco Control
Ecological effects in multi-level studies. Journal of Epidemiology and Community Health
Contextual-effects models: Theoretical and methodological issues. Annual Review of Sociology
Structural effects. American Sociological Review
The American occupational structure
Cluster effect and simultaneity in multilevel models. Health Economics
Endogenous preferences: the cultural consequences of markets and other economic institutions. Journal of Economic Literature
Schooling in capitalist America: Educational reform and the contradictions of economic life
Walrasian economics in retrospect. Quarterly Journal of Economics
The inheritance of inequality. Journal of Economic Perspectives
Contextual analysis: Concepts and statistical techniques
Culture and the evolutionary process
Interaction-based models
Hierarchical linear models
Factors relevant to the validity of experiments in social settings. Psychological Bulletin
Reforms as experiments
A primer on regression artifacts
Experimental and quasi-experimental designs for research
The Pawtucket heart health program: I. An experiment in population-based disease prevention. Rhode Island Medical Journal
The Pawtucket Heart Health Program: Community changes in cardiovascular risk factors and projected disease risk. American Journal of Public Health
The contribution of the social environment to host resistance: The fourth Wade Hampton Frost Lecture. American Journal of Epidemiology
A taxonomy of research concerned with place and health