from Epidemiological Bulletin, Vol. 24 No. 3, September 2003
In 2002, the Epidemiological Bulletin (Vol. 23, No. 1) published an introduction to social epidemiology and Dr. Nancy Krieger's Glossary of social epidemiology. As mentioned in these articles, social epidemiology recognizes that individual characteristics are not sufficient to explain the distribution of health problems in the population. As a result, it relies on statistical methods that allow several levels of determinants to be included in a single model. These so-called multilevel methods are important tools for health analysis because they extend beyond the study of individual epidemiological factors by simultaneously incorporating variables at different levels (e.g. family, neighborhood, community) that influence the state of health. To introduce the concepts of multilevel analysis, this issue of the Bulletin presents the glossary prepared by Dr. Ana Diez-Roux of Columbia University, which was originally published in the Journal of Epidemiology and Community Health. The glossary will be published in the Bulletin in three parts and will also include an appendix with the English and Spanish term equivalencies.
A Glossary for Multilevel Analysis
Ana V. Diez Roux
Divisions of Medicine and Epidemiology, Columbia University
New York, New York, United States
PART I
Multilevel analysis has recently emerged as a useful analytical
technique in several fields, including public health and epidemiology. This
glossary defines key concepts and terms used in multilevel analysis.
Multilevel analysis, originally developed in the fields of education,
sociology, and demography, has received increasing attention in public health
and epidemiology over the past few years. This glossary defines key terms and
concepts in multilevel analysis. The intent is to provide conceptual explanations
of basic concepts, particularly those that are fundamental, that have been used
inconsistently or that lend themselves to confusion. Selected terms and concepts
more broadly related to the presence of multiple levels of organisation (such
as group level variables and inferential fallacies) are also included. Although
the glossary often refers to individuals nested within groups, multilevel analysis
is applicable to a broad range of situations involving units at a lower level
(or micro units) nested within units at a higher level (or macro units) (including
for example, persons nested within studies as in meta-analysis, and measures
over time nested within individuals as in the analysis of repeat measures).
References to terms that have their own specific entry are in italics.
AGGREGATE DATA
Term used to refer to data or variables for a higher level unit (for example,
a group) constructed by combining information for the lower level units of which
the higher level unit is composed (for example, individuals within the group).
Examples of aggregate data include summaries of the properties of individuals
comprising a group, for example, the percentage of persons in a neighbourhood
with complete high school or the mean income of state residents. Implicit in
most uses of the term aggregate data is the idea that aggregate variables are
merely summaries of the properties of lower level units and not measures of
higher level properties themselves (although this is not necessarily true in
all cases, see Derived variables).
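As a minimal sketch of how aggregate data are built from lower level units, the following Python snippet summarises individual records into one row per neighbourhood. The records, field names, and the function `aggregate_by_group` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual level records:
# (neighbourhood, completed high school, income)
individuals = [
    ("A", True, 30000),
    ("A", False, 20000),
    ("A", True, 40000),
    ("B", False, 15000),
    ("B", False, 25000),
]

def aggregate_by_group(records):
    """Combine individual records into one summary row per group."""
    groups = defaultdict(list)
    for group, completed_hs, income in records:
        groups[group].append((completed_hs, income))
    summary = {}
    for group, members in groups.items():
        n = len(members)
        summary[group] = {
            "n": n,
            # Percentage of group members with complete high school
            "pct_high_school": 100.0 * sum(1 for hs, _ in members if hs) / n,
            # Mean income of group members
            "mean_income": mean(income for _, income in members),
        }
    return summary

aggregate = aggregate_by_group(individuals)
```

Each value in `aggregate` describes a neighbourhood, but it is computed entirely from the properties of the individuals who live there.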
ATOMISTIC FALLACY
The fallacy sometimes present when drawing inferences regarding variability
across groups (or the relation between group level variables) based on individual
level data, or more generally, the fallacy of drawing inferences regarding variability
across units defined at a higher level based on data collected for units at
a lower level. The atomistic fallacy arises because associations between two
variables at the individual level may differ from associations between analogous
variables measured at the group level. For example, a study of individuals may
find that increasing individual level income is associated with decreasing coronary
heart disease mortality. If it is inferred from these data that at the country
level, increasing per capita income is associated with decreasing coronary heart
disease mortality, the researcher may be committing the atomistic fallacy (because
across countries, increasing per capita income may actually be associated with
increasing coronary heart disease mortality). The sources of the atomistic fallacy
are similar to those of the Ecological fallacy. In the atomistic fallacy,
the conceptual model being tested corresponds to the higher level, but the data
are collected for a lower level.(1,2) The atomistic fallacy has sometimes been
referred to as the Individualistic fallacy.(3,4)
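The income example can be sketched numerically. The data below are invented so that, within every country, individual risk falls as individual income rises, while across countries mean risk rises with mean income. The helper names `ols_slope` and `make_country` and all numbers are assumptions made for illustration, not results from any study.

```python
from statistics import mean

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def make_country(mean_income):
    """Hypothetical (income, risk score) pairs for three residents."""
    # Within a country, risk falls by 1.0 per unit of individual income;
    # across countries, average risk rises by 0.5 per unit of mean income.
    return [(mean_income + d, 0.5 * mean_income - 1.0 * d) for d in (-2, 0, 2)]

countries = {name: make_country(m) for name, m in [("P", 10), ("Q", 20), ("R", 30)]}

# Individual level association, estimated separately within each country
within_slopes = {
    name: ols_slope([x for x, _ in rows], [y for _, y in rows])
    for name, rows in countries.items()
}

# Country level association, estimated from country means
mean_income = [mean(x for x, _ in rows) for rows in countries.values()]
mean_risk = [mean(y for _, y in rows) for rows in countries.values()]
between_slope = ols_slope(mean_income, mean_risk)
```

In this constructed data set every within-country slope is negative while the country level slope is positive, so extrapolating the individual level finding to countries would give the wrong sign.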
COMPOSITIONAL EFFECTS
When inter-group (or inter-context) differences in an outcome (for example,
disease rates) are attributable to differences in group composition (that is,
in the characteristics of the individuals of which the groups are comprised)
they are said to result from compositional effects.(5) On the other hand, when
group differences are attributable to the effects of Group level variables
or properties, they are said to result from Contextual effects.
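A purely compositional difference can be illustrated with a small calculation. In this sketch two hypothetical neighbourhoods have identical age-specific disease rates but different age compositions, so their crude rates differ. All counts and names are invented for illustration.

```python
# Hypothetical counts: neighbourhood -> age group -> (cases, population)
data = {
    "North": {"young": (10, 1000), "old": (40, 400)},
    "South": {"young": (4, 400), "old": (100, 1000)},
}

def crude_rate(strata):
    """Overall rate, ignoring the individual level characteristic (age)."""
    cases = sum(c for c, _ in strata.values())
    population = sum(p for _, p in strata.values())
    return cases / population

def stratum_rates(strata):
    """Rate within each age group."""
    return {age: c / p for age, (c, p) in strata.items()}

crude = {name: crude_rate(strata) for name, strata in data.items()}
by_age = {name: stratum_rates(strata) for name, strata in data.items()}
```

Here `crude["South"]` is roughly twice `crude["North"]`, yet the age-specific rates in `by_age` are the same in both neighbourhoods. In this constructed case the whole inter-group difference is attributable to composition.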
CONTEXTUAL ANALYSIS
An analytical approach originally used in sociology to investigate the effect
of collective or group characteristics on individual level outcomes.(4,6,7)
In contextual analysis, group level predictors (often constructed by aggregating
the characteristics of individuals within groups) are included together with
individual level variables in standard regressions with individuals as the units
of analysis (Contextual effects models). This approach
permits the simultaneous examination of how individual level and group level
variables are related to individual level outcomes. It thus allows for macro
processes that are presumed to have an impact on individuals over and above
the effects of individual level variables.(6) The terms contextual analysis
and Multilevel analysis have sometimes been used synonymously,(8-10)
and both approaches are similar in allowing the investigation of how group level
(or macro) and individual level (or micro) variables (as well as their interactions)
are related to individual level outcomes. However, Multilevel models are
more general than the original contextual models in that (1) they allow (and
account for) the possibility of residual correlation between individuals within
groups; and (2) they allow examination of between group variability and the
factors associated with it. In contrast, contextual models often do not account
for residual correlation (although they can be modified to do so) and do not
allow the examination of inter-group variability or of the factors associated
with it (see also Variance components).
CONTEXTUAL EFFECTS
Term generally used to refer to the effects of variables defined at a higher
level (usually at the group level) on outcomes defined at a lower level (usually
at the individual level) after controlling for relevant individual level (lower
level) confounders. The term is most often used to refer to the effect of a
Derived group level variable (for example, mean neighbourhood income)
on an individual level outcome (such as blood pressure) after controlling for
its individual level namesake (for example, individual level income).(6,11)
However, contextual effects is also sometimes used to refer to the
effects of group level variables generally, be they Derived variables
or integral variables, and can apply to any situation involving lower
level units nested within higher level units (for example, contextual effects
of country characteristics on disease rates for small areas, contextual effects
of tissue characteristics on cell biology). Contextual effects are sometimes
contrasted with Compositional effects.(5)
CONTEXTUAL EFFECTS MODELS
Regression models with individuals as the units of analysis that include
both group level and individual level variables as predictors of individual
level outcomes. Traditional contextual effects models are equivalent to multilevel
models in which all coefficients are modelled as fixed (that is, no error terms
are included in the group level or level 2 equations, see Multilevel models).
See Contextual analysis.
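A traditional contextual effects model can be sketched as an ordinary least-squares regression with individuals as the units of analysis and two predictors: an individual level variable and its group mean. The data are invented and noise-free, generated with an individual coefficient of -2 and a group coefficient of -1 so that the fit can be checked. The function `fit_two_predictor_ols` is a hypothetical helper, not part of any library.

```python
from statistics import mean

def fit_two_predictor_ols(x1, x2, y):
    """OLS for y = b0 + b1*x1 + b2*x2 via the 2x2 normal equations."""
    m1, m2, my = mean(x1), mean(x2), mean(y)
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    cy = [v - my for v in y]
    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * t for a, t in zip(c1, cy))
    s2y = sum(b * t for b, t in zip(c2, cy))
    det = s11 * s22 - s12 * s12  # zero if the predictors are collinear
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2

# Hypothetical individual incomes by neighbourhood
incomes = {"A": [1, 2, 3], "B": [4, 6, 8], "C": [2, 7, 9]}

individual_income, neighbourhood_mean, outcome = [], [], []
for members in incomes.values():
    group_mean = mean(members)
    for income in members:
        individual_income.append(income)
        neighbourhood_mean.append(group_mean)
        # Assumed rule used to generate the individual level outcome
        outcome.append(150 - 2 * income - 1 * group_mean)

b0, b_individual, b_group = fit_two_predictor_ols(
    individual_income, neighbourhood_mean, outcome
)
```

All three coefficients are fixed, and no group level error term is estimated, which is what distinguishes this sketch from a multilevel model.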
CONTEXTUAL VARIABLES
See Derived variables and Group level variables.
CROSS LEVEL EFFECTS
Term used to refer to the main effects of higher level variables (for example,
group level variables) on outcomes at a lower level (for example, individual
level outcomes) as well as to modifications of the effects of lower level (individual
level) variables by higher level (group level) variables (see Cross
level interaction).(12) Examples include the effect of country level
income inequality on individual level self reported health (effect of a higher
level variable on outcomes at a lower level), and the presence of stronger associations
between individual level income and self reported health in the presence of
high country level income inequality (modifications of the effects of lower
level variables by higher level variables). The term ecological effects
has sometimes been used as a synonym for cross level effects.(12)
CROSS LEVEL INFERENCE
The drawing of inferences regarding factors associated with variability
in the outcome at one level based on data collected at another level (for example,
drawing inferences regarding relations between individual level variables based
on group level associations, or vice versa). See Ecological fallacy and
Atomistic fallacy.
CROSS LEVEL INTERACTION
Refers to the interaction between higher level and lower level variables, that
is, to modification of the effects of lower level variables by characteristics
of the higher level units to which the lower level units belong (or vice versa).(5,12)
For example, if the relation between individual level income and blood pressure
differs by neighbourhood characteristics (that is, neighbourhood and individual
level variables interact), there is said to be a cross level interaction. In
multilevel models whenever group specific estimates of the effect of a lower
level variable are modelled as a function of higher level (group level) variables
(as in equation (3) under the entry for Multilevel models), a cross level
interaction appears in the final model (see equation (4) under Multilevel models).
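The idea of modelling group-specific slopes as a function of a group level variable can be sketched with a simplified two-stage calculation. This is not a full multilevel model fit and does not reproduce the equations in the Multilevel models entry. The deprivation scores, the data-generating rule, and the helper `ols_fit` are assumptions made for illustration.

```python
from statistics import mean

def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical neighbourhoods: name -> group level deprivation score
deprivation = {"N1": 0.0, "N2": 1.0, "N3": 2.0}
incomes = [1.0, 2.0, 3.0, 4.0]  # same individual incomes in each neighbourhood

def blood_pressure(income, dep):
    # Assumed rule: the income slope is -1.0 - 0.5 * deprivation
    return 120.0 + (-1.0 - 0.5 * dep) * income

# Stage 1: slope of blood pressure on income within each neighbourhood
group_slopes = {
    name: ols_fit(incomes, [blood_pressure(x, dep) for x in incomes])[1]
    for name, dep in deprivation.items()
}

# Stage 2: regress the group-specific slopes on the group level variable
names = list(group_slopes)
baseline_slope, cross_level_interaction = ols_fit(
    [deprivation[n] for n in names], [group_slopes[n] for n in names]
)
```

A non-zero `cross_level_interaction` means the relation between individual income and blood pressure differs by neighbourhood deprivation. In a multilevel model the two stages are estimated jointly rather than one after the other.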
DERIVED VARIABLES
A type of Group level variable constructed by mathematically summarising
the characteristics of individuals in the group (for example, means, proportions,
or measures of dispersion, such as percentage of persons with incomplete high
school, mean income, standard deviation of the income distribution).(11,13)
Some derived variables have no individual level analogue (for example, standard
deviation of the income distribution) and therefore necessarily refer to group
level constructs. Others (for example, mean neighbourhood income) do have individual
level analogues (for example, individual level income), but may provide information
on group level constructs, distinct from their individual level namesake. The
mean of the dependent variable in the group (for example, proportion infected
in a study of the causes of infection) can be thought of as a special type of
derived variable.(14) Although derived and Integral variables are sometimes
presented as conceptually distinct, they are closely interrelated. Derived variables
often operate by shaping certain integral properties of the group. For example,
the composition of a group may influence the predominant types of interpersonal
contacts, values, and norms or may shape organisations or regulations within
the group that affect all members.(15) The terms analytical variables
and aggregate variables have been used as synonyms for derived
variables. The term contextual variables has also been used
as a synonym for derived variables(14) although it is sometimes
used to refer to Group level variables generally.(6,13)
ECOLOGICAL FALLACY
The fallacy sometimes present when drawing inferences at the individual
level (that is, regarding relations between individual level variables) based
on group level data. The ecological fallacy arises because associations between
two variables at the group level (or ecological level) may differ from associations
between analogous variables measured at the individual level. These differences
between individual level and group level associations were first described for
correlation coefficients (16) but may also be present for other measures of
association such as regression coefficients.(11,17) More generally, the fallacy
may occur whenever data for units at a higher level are used to draw inferences
regarding factors associated with variability across units at a lower level, that
is, when the conceptual model being tested corresponds to the lower level, but
the data are collected for a higher level.(1,2) Suppose a researcher finds that
at the country level, increasing per capita income is associated with increasing
mortality attributable to traffic accidents. If he or she infers that at the individual
level, increasing personal income is associated with increasing motor vehicle
related mortality, he or she may be committing the ecological fallacy, because within
countries, motor vehicle related mortality may always be lower in high income
than in low income persons. In the case of regression coefficients, the sources
of the ecological fallacy include (1) the lack of information on constructs
pertaining to a lower level of organisation; and (2) the failure to realise
that a variable defined and measured at one level of organisation may tap into
a different construct than its namesake at another level.(18)
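Because the discrepancy was first described for correlation coefficients, the traffic mortality example can be sketched with correlations. The data are invented so that the country level correlation is strongly positive while the individual level correlation within each country is strongly negative. The helpers `pearson` and `make_country` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def make_country(mean_income):
    """Hypothetical (income, traffic mortality risk) pairs for five residents."""
    # Across countries, mean risk rises with mean income;
    # within a country, richer individuals have lower risk.
    return [(mean_income + d, 0.3 * mean_income - 0.8 * d) for d in (-2, -1, 0, 1, 2)]

countries = {name: make_country(m) for name, m in [("P", 10), ("Q", 20), ("R", 30)]}

# Ecological (country level) correlation, computed from country means
ecological_r = pearson(
    [mean(x for x, _ in rows) for rows in countries.values()],
    [mean(y for _, y in rows) for rows in countries.values()],
)

# Individual level correlation, computed within each country
individual_r = {
    name: pearson([x for x, _ in rows], [y for _, y in rows])
    for name, rows in countries.items()
}
```

Inferring the sign of `individual_r` from `ecological_r` would be wrong for every country in this constructed example.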
References
(1) Riley MW. Special problems of sociological analysis. In: Sociological research
I: a case approach. New York: Harcourt, Brace, and World, 1963:700-25.
(2) Diez-Roux AV. Bringing context back into epidemiology: variables and fallacies
in multilevel analysis. Am J Public Health 1998;88:216-22.
(3) Alker HR. A typology of ecological fallacies. In: Dogan M, Rokkam S, eds.
Social ecology. Boston: The MIT Press, 1969:69-86.
(4) Scheuch EK. Social context and individual behavior. In: Dogan M, Rokkam
S, eds. Social ecology. Boston: The MIT Press, 1969:133-55.
(5) Duncan C, Jones K, Moon G. Context, composition, and heterogeneity: using
multilevel models in health research. Soc Sci Med 1992;46:97-117.
(6) Blalock HM. Contextual-effects models: theoretical and methodological issues.
Ann Rev Sociol 1984;10:353-72.
(7) Iversen G. Contextual analysis. Newbury Park, CA: Sage, 1991.
(8) Hermalin A. The multilevel approach: theory and concepts. The methodology
for measuring the impact of family planning programs on fertility. Population
studies Addendum Manual IX. New York: United Nations, 1986;66:15-31.
(9) van den Eeden P, Huttner HJ. Multi-level research. Curr Sociol 1982;30:1-178.
(10) DiPrete TA, Forristal JD. Multilevel models: methods and substance. Annu
Rev Sociol 1994;20:331-57.
(11) Morgenstern H. Ecologic studies in epidemiology: concepts, principles,
and methods. Annu Rev Public Health 1995;16:61-81.
(12) Blakely TA, Woodward AJ. Ecological effects in multi-level studies. J Epidemiol
Community Health 2000;54:367-74.
(13) Lazarsfeld PF, Menzel H. On the relation between individual and collective
properties. In: Etzioni A, ed. A sociological reader on complex organizations.
New York: Holt, Rinehart, and Winston, 1971:499-516.
(14) Susser M. The logic in ecological: I. The logic of analysis. Am J Public
Health 1994;84:825-9.
(15) Valkonen T. Individual and structural effects in ecological research. In:
Dogan M, Rokkam S, eds. Social ecology. Boston: The MIT Press, 1969:53-68.
(16) Robinson WS. Ecological correlations and the behavior of individuals. Am
Sociol Rev 1950;15:351-7.
(17) Piantadosi S, Byar DP, Green SN. The ecological fallacy. Am J Epidemiol
1988;127:893-904.
(18) Diez-Roux AV, Schwartz S, Susser E. Ecologic studies and ecologic variables
in public health research. In: The Oxford textbook of public health. Volume
2. London: Oxford University Press, 2002:493-508.
Source: Published initially as "A glossary for multilevel analysis"
in the Journal of Epidemiology and Community Health, 56:588-594, 2002.


