—from Epidemiological Bulletin, Vol. 24 No. 3, September 2003


Introduction to Multilevel Analysis

In 2002, the Epidemiological Bulletin (Vol. 23, No. 1) published an introduction to social epidemiology and Dr. Nancy Krieger’s “Glossary of social epidemiology.” As mentioned in these articles, social epidemiology recognizes that individual characteristics are not sufficient to explain the distribution of health problems in the population. As a result, it relies on statistical methods that allow several levels of determinants to be included in a single model. These multilevel methods are important tools for health analysis because they extend beyond the study of individual epidemiological factors by simultaneously incorporating variables from different levels (e.g., family, neighborhood, community) that influence the state of health.

In order to introduce the concepts of multilevel analysis, this issue of the Bulletin presents the glossary prepared by Dr. Ana Diez-Roux of Columbia University, which was originally published in the Journal of Epidemiology and Community Health. This glossary will be published in the Bulletin in three parts and will also include an appendix with the English and Spanish term equivalencies.

 

A Glossary for Multilevel Analysis

Ana V. Diez Roux
Divisions of Medicine and Epidemiology, Columbia University
New York, New York, United States

PART I

Multilevel analysis has recently emerged as a useful analytical technique in several fields, including public health and epidemiology. This glossary defines key concepts and terms used in multilevel analysis.

Multilevel analysis, originally developed in the fields of education, sociology, and demography, has received increasing attention in public health and epidemiology over the past few years. This glossary defines key terms and concepts in multilevel analysis. The intent is to provide conceptual explanations of basic concepts, particularly those that are fundamental, that have been used inconsistently, or that lend themselves to confusion. Selected terms and concepts more broadly related to the presence of multiple levels of organisation (such as group level variables and inferential fallacies) are also included. Although the glossary often refers to individuals nested within groups, multilevel analysis is applicable to a broad range of situations involving units at a lower level (or micro units) nested within units at a higher level (or macro units), including, for example, persons nested within studies (as in meta-analysis) and measures over time nested within individuals (as in the analysis of repeat measures). References to terms that have their own specific entry begin with a capital letter.

AGGREGATE DATA
Term used to refer to data or variables for a higher level unit (for example, a group) constructed by combining information for the lower level units of which the higher level unit is composed (for example, individuals within the group). Examples include summaries of the properties of the individuals who make up a group, such as the percentage of persons in a neighbourhood who have completed high school or the mean income of state residents. Implicit in most uses of the term aggregate data is the idea that aggregate variables are merely summaries of the properties of lower level units and not measures of higher level properties themselves (although this is not necessarily true in all cases, see Derived variables).
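As a minimal sketch of how aggregate data are constructed, the Python fragment below combines individual level records into one summary row per group. The records, the neighbourhood labels, and the function name aggregate_by_group are hypothetical and serve only to illustrate the definition above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual level records:
# (neighbourhood, completed high school?, annual income)
individuals = [
    ("A", True, 30000),
    ("A", False, 20000),
    ("A", True, 40000),
    ("B", False, 15000),
    ("B", False, 25000),
    ("B", True, 20000),
    ("B", True, 40000),
]

def aggregate_by_group(records):
    """Combine individual level records into one summary row per group."""
    grouped = defaultdict(list)
    for group, completed, income in records:
        grouped[group].append((completed, income))

    summary = {}
    for group, members in grouped.items():
        n = len(members)
        summary[group] = {
            "n": n,
            # Percentage of group members who completed high school
            "pct_high_school": 100.0 * sum(c for c, _ in members) / n,
            # Mean income of group members
            "mean_income": mean(inc for _, inc in members),
        }
    return summary

aggregates = aggregate_by_group(individuals)
for group, row in sorted(aggregates.items()):
    print(group, row)
```

Each row of aggregates describes a neighbourhood, but every value in it is computed purely from the characteristics of the residents.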

ATOMISTIC FALLACY
The fallacy sometimes present when drawing inferences regarding variability across groups (or the relation between group level variables) based on individual level data, or more generally, the fallacy of drawing inferences regarding variability across units defined at a higher level based on data collected for units at a lower level. The atomistic fallacy arises because associations between two variables at the individual level may differ from associations between analogous variables measured at the group level. For example, a study of individuals may find that increasing individual level income is associated with decreasing coronary heart disease mortality. If it is inferred from these data that at the country level, increasing per capita income is associated with decreasing coronary heart disease mortality, the researcher may be committing the atomistic fallacy (because across countries, increasing per capita income may actually be associated with increasing coronary heart disease mortality). The sources of the atomistic fallacy are similar to those of the Ecological fallacy. In the atomistic fallacy, the conceptual model being tested corresponds to the higher level, but the data are collected for a lower level.(1,2) The atomistic fallacy has sometimes been referred to as the Individualistic fallacy.(3,4)
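One way to see why individual level and group level associations can differ is the standard decomposition of a covariance into within group and between group parts. The notation is an assumption made for this illustration and does not come from the glossary: x_ij and y_ij are two variables measured on individual i in group j, n_j is the size of group j, N is the total number of individuals, and bars denote means.

```latex
\underbrace{\frac{1}{N}\sum_{j}\sum_{i}(x_{ij}-\bar{x})(y_{ij}-\bar{y})}_{\text{total}}
=
\underbrace{\frac{1}{N}\sum_{j}\sum_{i}(x_{ij}-\bar{x}_{j})(y_{ij}-\bar{y}_{j})}_{\text{within groups}}
+
\underbrace{\frac{1}{N}\sum_{j} n_{j}\,(\bar{x}_{j}-\bar{x})(\bar{y}_{j}-\bar{y})}_{\text{between groups}}
```

Because the within group and between group terms can have opposite signs, an association observed among individuals does not by itself determine the association between group means.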

COMPOSITIONAL EFFECTS
When inter-group (or inter-context) differences in an outcome (for example, disease rates) are attributable to differences in group composition (that is, in the characteristics of the individuals of which the groups are composed), they are said to result from compositional effects.(5) On the other hand, when group differences are attributable to the effects of Group level variables or properties, they are said to result from Contextual effects.
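The idea can be illustrated with a deliberately simple Python sketch. The counts below are hypothetical: two neighbourhoods have identical disease rates within each age stratum but different age compositions, so the difference in their crude rates is purely compositional. In practice, individual level characteristics are usually accounted for in a regression model rather than by stratification.

```python
# Hypothetical counts per neighbourhood and age stratum: (cases, persons).
counts = {
    "North": {"young": (10, 1000), "old": (20, 200)},
    "South": {"young": (2, 200), "old": (100, 1000)},
}

def crude_rate(strata):
    """Overall disease rate for a group, ignoring its composition."""
    cases = sum(c for c, _ in strata.values())
    persons = sum(n for _, n in strata.values())
    return cases / persons

def stratum_rates(strata):
    """Disease rate within each stratum of the individual level characteristic."""
    return {name: c / n for name, (c, n) in strata.items()}

crude = {group: crude_rate(strata) for group, strata in counts.items()}
specific = {group: stratum_rates(strata) for group, strata in counts.items()}

print("Crude rates:", crude)
print("Stratum specific rates:", specific)
```

Here the crude rate is higher in South only because South has a larger share of older residents. Had the stratum specific rates also differed between neighbourhoods, a contextual explanation would remain possible.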

CONTEXTUAL ANALYSIS
An analytical approach originally used in sociology to investigate the effect of collective or group characteristics on individual level outcomes.(4,6,7) In contextual analysis, group level predictors (often constructed by aggregating the characteristics of individuals within groups) are included together with individual level variables in standard regressions with individuals as the units of analysis (Contextual effects models). This approach permits the simultaneous examination of how individual level and group level variables are related to individual level outcomes. It thus allows for macro processes that are presumed to have an impact on individuals over and above the effects of individual level variables.(6) The terms “contextual analysis” and Multilevel analysis have sometimes been used synonymously,(8–10) and both approaches are similar in allowing the investigation of how group level (or macro) and individual level (or micro) variables (as well as their interactions) are related to individual level outcomes. However, Multilevel models are more general than the original contextual models in that (1) they allow (and account for) the possibility of residual correlation between individuals within groups; and (2) they allow examination of between group variability and the factors associated with it. In contrast, contextual models often do not account for residual correlation (although they can be modified to do so) and do not allow the examination of inter-group variability or of the factors associated with it (see also Variance components).

CONTEXTUAL EFFECTS
Term generally used to refer to the effects of variables defined at a higher level (usually at the group level) on outcomes defined at a lower level (usually at the individual level) after controlling for relevant individual level (lower level) confounders. The term is most often used to refer to the effect of a Derived group level variable (for example, mean neighbourhood income) on an individual level outcome (such as blood pressure) after controlling for its individual level namesake (for example, individual level income).(6,11) However, “contextual effects” is also sometimes used to refer to the effects of group level variables generally, be they Derived variables or Integral variables, and can apply to any situation involving lower level units nested within higher level units (for example, contextual effects of country characteristics on disease rates for small areas, contextual effects of tissue characteristics on cell biology). Contextual effects are sometimes contrasted with Compositional effects.(5)

CONTEXTUAL EFFECTS MODELS
Regression models with individuals as the units of analysis that include both group level and individual level variables as predictors of individual level outcomes. Traditional contextual effects models are equivalent to multilevel models in which all coefficients are modelled as fixed (that is, no error terms are included in the group level or level 2 equations, see Multilevel models). See Contextual analysis.
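In symbols, a traditional contextual effects model with one individual level predictor and one group level predictor might be written as follows. The notation is an assumption chosen for illustration and may not match the notation used under Multilevel models: Y_ij is the outcome for individual i in group j, I_ij is an individual level variable, C_j is a group level variable, the gammas are fixed coefficients, and epsilon_ij is an individual level error term.

```latex
Y_{ij} = \gamma_{00} + \gamma_{10}\, I_{ij} + \gamma_{01}\, C_{j} + \varepsilon_{ij}
```

The only error term is at the individual level. A multilevel model would add group level error terms (for example, a random intercept for each group j), which is what allows residual correlation within groups to be modelled.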

CONTEXTUAL VARIABLES
See Derived variables and Group level variables.

CROSS LEVEL EFFECTS
Term used to refer to the main effects of higher level variables (for example, group level variables) on outcomes at a lower level (for example, individual level outcomes) as well as to modifications of the effects of lower level (individual level) variables by higher level (group level) variables (see Cross level interaction).(12) Examples include the effect of country level income inequality on individual level self reported health (effect of a higher level variable on outcomes at a lower level), and the presence of stronger associations between individual level income and self reported health in the presence of high country level income inequality (modifications of the effects of lower level variables by higher level variables). The term “ecological effects” has sometimes been used as a synonym for “cross level effects”.(12)

CROSS LEVEL INFERENCE
The drawing of inferences regarding factors associated with variability in the outcome at one level based on data collected at another level (for example, drawing inferences regarding relations between individual level variables based on group level associations, or vice versa). See Ecological fallacy and Atomistic fallacy.

CROSS LEVEL INTERACTION
Refers to the interaction between higher level and lower level variables, that is, to modification of the effects of lower level variables by characteristics of the higher level units to which the lower level units belong (or vice versa).(5,12) For example, if the relation between individual level income and blood pressure differs by neighbourhood characteristics (that is, neighbourhood and individual level variables interact), there is said to be a cross level interaction. In multilevel models, whenever group specific estimates of the effect of a lower level variable are modelled as a function of higher level (group level) variables (as in equation (3) under the entry for Multilevel models), a cross level interaction term appears in the final model (see equation (4) under Multilevel models).
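The income and blood pressure example can be sketched in Python. The data and the neighbourhood labels are hypothetical, and the code is a descriptive illustration rather than a fitted multilevel model: it estimates a separate least-squares slope of blood pressure on income within each neighbourhood and then compares the slopes across levels of a group level variable (a 0/1 deprivation indicator, assumed here).

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: a group level deprivation indicator and, for each
# resident, (income in thousands, systolic blood pressure in mmHg).
neighbourhoods = {
    "low_deprivation": {
        "deprivation": 0,
        "residents": [(20, 128), (40, 126), (60, 124)],
    },
    "high_deprivation": {
        "deprivation": 1,
        "residents": [(20, 140), (40, 132), (60, 124)],
    },
}

# Group specific effect of the individual level variable (income)
slopes = {}
for name, data in neighbourhoods.items():
    incomes = [income for income, _ in data["residents"]]
    pressures = [bp for _, bp in data["residents"]]
    slopes[name] = ols_slope(incomes, pressures)

# How much the income slope changes per unit of the group level variable
slope_difference = slopes["high_deprivation"] - slopes["low_deprivation"]

print(slopes)
print("Change in income slope with deprivation:", slope_difference)
```

A nonzero slope_difference means that the effect of individual level income depends on a neighbourhood characteristic, which is what a cross level interaction term captures in a regression model.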

DERIVED VARIABLES
A type of Group level variable constructed by mathematically summarising the characteristics of individuals in the group (for example, means, proportions, or measures of dispersion, such as the percentage of persons with incomplete high school, the mean income, or the standard deviation of the income distribution).(11,13) Some derived variables have no individual level analogue (for example, standard deviation of the income distribution) and therefore necessarily refer to group level constructs. Others (for example, mean neighbourhood income) do have individual level analogues (for example, individual level income), but may provide information on group level constructs distinct from their individual level namesake. The mean of the dependent variable in the group (for example, proportion infected in a study of the causes of infection) can be thought of as a special type of derived variable.(14) Although derived and Integral variables are sometimes presented as conceptually distinct, they are closely interrelated. Derived variables often operate by shaping certain integral properties of the group. For example, the composition of a group may influence the predominant types of interpersonal contacts, values, and norms or may shape organisations or regulations within the group that affect all members.(15) The terms “analytical variables” and “aggregate variables” have been used as synonyms for “derived variables”. The term “contextual variables” has also been used as a synonym for “derived variables”(14) although it is sometimes used to refer to Group level variables generally.(6,13)

ECOLOGICAL FALLACY
The fallacy sometimes present when drawing inferences at the individual level (that is, regarding relations between individual level variables) based on group level data. The ecological fallacy arises because associations between two variables at the group level (or ecological level) may differ from associations between analogous variables measured at the individual level. These differences between individual level and group level associations were first described for correlation coefficients (16) but may also be present for other measures of association such as regression coefficients.(11,17) More generally, the fallacy may occur whenever data for units at a higher level are used to draw inferences regarding factors associated with variability across units at a lower level, that is, when the conceptual model being tested corresponds to the lower level, but the data are collected for a higher level.(1,2) Suppose a researcher finds that at the country level, increasing per capita income is associated with increasing mortality attributable to traffic accidents. If the researcher infers that at the individual level, increasing personal income is associated with increasing motor vehicle related mortality, he/she may be committing the ecological fallacy, because within countries, motor vehicle related mortality may always be lower in high income than in low income persons. In the case of regression coefficients, the sources of the ecological fallacy include (1) the lack of information on constructs pertaining to a lower level of organisation; and (2) the failure to realise that a variable defined and measured at one level of organisation may tap into a different construct than its namesake at another level.(18)
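The traffic mortality example can be reproduced with a small Python sketch. All numbers are hypothetical and were chosen so that the association across country means is positive while the association within every country is negative. The individual level outcome is a continuous risk score, a simplification assumed here for brevity.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical individuals per country: (personal income, traffic mortality risk)
countries = {
    "X": [(10, 5.0), (20, 4.0), (30, 3.0)],
    "Y": [(30, 9.0), (40, 8.0), (50, 7.0)],
    "Z": [(50, 13.0), (60, 12.0), (70, 11.0)],
}

# Group level (ecological) data: one mean income and one mean risk per country
mean_income = [sum(x for x, _ in people) / len(people) for people in countries.values()]
mean_risk = [sum(y for _, y in people) / len(people) for people in countries.values()]
ecological_slope = ols_slope(mean_income, mean_risk)

# Individual level associations, estimated separately within each country
within_slopes = {
    name: ols_slope([x for x, _ in people], [y for _, y in people])
    for name, people in countries.items()
}

print("Across countries:", ecological_slope)
print("Within countries:", within_slopes)
```

A researcher who saw only the three country level points would observe a positive slope, although higher income goes with lower risk inside every country in this constructed example.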

References
(1) Riley MW. Special problems of sociological analysis. In: Sociological research I: a case approach. New York: Harcourt, Brace, and World, 1963:700–25.
(2) Diez-Roux AV. Bringing context back into epidemiology: variables and fallacies in multilevel analysis. Am J Public Health 1998;88:216–22.
(3) Alker HR. A typology of ecological fallacies. In: Dogan M, Rokkan S, eds. Social ecology. Boston: The MIT Press, 1969:69–86.
(4) Scheuch EK. Social context and individual behavior. In: Dogan M, Rokkan S, eds. Social ecology. Boston: The MIT Press, 1969:133–55.
(5) Duncan C, Jones K, Moon G. Context, composition, and heterogeneity: using multilevel models in health research. Soc Sci Med 1998;46:97–117.
(6) Blalock HM. Contextual-effects models: theoretical and methodological issues. Annu Rev Sociol 1984;10:353–72.
(7) Iversen G. Contextual analysis. Newbury Park, CA: Sage, 1991.
(8) Hermalin A. The multilevel approach: theory and concepts. The methodology for measuring the impact of family planning programs on fertility. Population studies Addendum Manual IX. New York: United Nations, 1986;66:15–31.
(9) van den Eeden P, Huttner HJ. Multi-level research. Curr Sociol 1982;30:1–178.
(10) DiPrete TA, Forristal JD. Multilevel models: methods and substance. Annu Rev Sociol 1994;20:331–57.
(11) Morgenstern H. Ecologic studies in epidemiology: concepts, principles, and methods. Annu Rev Public Health 1995;16:61–81.
(12) Blakely TA, Woodward AJ. Ecological effects in multi-level studies. J Epidemiol Community Health 2000;54:367–74.
(13) Lazarsfeld PF, Menzel H. On the relation between individual and collective properties. In: Etzioni A, ed. A sociological reader on complex organizations. New York: Holt, Rinehart, and Winston, 1971:499–516.
(14) Susser M. The logic in ecological: I. The logic of analysis. Am J Public Health 1994;84:825–9.
(15) Valkonen T. Individual and structural effects in ecological research. In: Dogan M, Rokkan S, eds. Social ecology. Boston: The MIT Press, 1969:53–68.
(16) Robinson WS. Ecological correlations and the behavior of individuals. Am Sociol Rev 1950;15:351–7.
(17) Piantadosi S, Byar DP, Green SN. The ecological fallacy. Am J Epidemiol 1988;127:893–904.
(18) Diez-Roux AV, Schwartz S, Susser E. Ecologic studies and ecologic variables in public health research. In: The Oxford textbook of public health. Volume 2. London: Oxford University Press, 2002:493–508.

Source: Published initially as “A glossary for multilevel analysis” in the Journal of Epidemiology and Community Health, 56:588-594, 2002.



