A meta-analytic review of obesity prevention programs for children and adolescents: The skinny on interventions that work. (2006)
Venue: Psychological Bulletin
Citations: 75 (4 self)
BibTeX
@ARTICLE{Stice06ameta-analytic,
author = {Eric Stice and Heather Shaw and C. Nathan Marti},
title = {A meta-analytic review of obesity prevention programs for children and adolescents: The skinny on interventions that work.},
journal = {Psychological Bulletin},
year = {2006},
pages = {667--691}
}
Abstract
This meta-analytic review summarizes obesity prevention programs and their effects and investigates participant, intervention, delivery, and design features associated with larger effects. A literature search identified 64 prevention programs seeking to produce weight gain prevention effects, of which 21% produced significant prevention effects that were typically pre-to-post effects. Larger effects emerged for programs that targeted children and adolescents (vs. preadolescents) and females, programs that were relatively brief, programs that solely targeted weight control versus other health behaviors (e.g., smoking), programs evaluated in pilot trials, and programs wherein participants must have self-selected into the intervention. Other factors, including mandated improvements in diet and exercise, sedentary behavior reduction, delivery by trained interventionists, and parental involvement, were not associated with significantly larger effects.

Keywords: obesity, prevention, meta-analysis, moderators

Obesity in adulthood results in an increased risk for future death from all causes, coronary heart disease, atherosclerotic cerebrovascular disease, and colorectal cancer, as well as serious medical problems including hyperlipidemia, hypertension, gallbladder disease, and diabetes mellitus.

Unfortunately, successful treatments for obesity have been elusive. For adults, the current treatment of choice only results in about a 10% reduction in body weight, and virtually all patients regain this weight within a few years of treatment. Obesity treatments for children and adolescents have yielded similar effects, though behavioral family-based interventions have produced more persistent weight loss effects.

Studies have evaluated four major types of interventions that were expected to produce weight gain prevention effects.
These include (a) multifocus cardiovascular disease prevention programs that targeted obesity along with other risk factors for cardiovascular disease (e.g., hypertension and smoking), (b) prevention programs that focused solely on the prevention of obesity or weight gain, (c) interventions designed to solely increase physical activity, and (d) eating disorder prevention programs that promoted use of healthy weight-management skills.

Although numerous evaluations of weight gain prevention programs have been conducted, their results have not been comprehensively reviewed and analyzed with meta-analytic procedures. Several excellent narrative reviews exist (e.g.,

We are very grateful to Amy Greenwold, Krista Heim, and David Huh for their assistance with the literature search and article preparation. Correspondence

Vol. 132, No. 5, 667-691 0033-2909/06/$12.00 DOI: 10.1037/0033-2909 667

to systematically consider the moderators associated with interventions that produced the largest effects. The third aim is to discuss promising directions for future research in light of the findings from completed trials.

Putative Moderators of Intervention Effects
A unique feature of meta-analyses is that they permit empirical examination of factors associated with variation in effect sizes. Elucidating factors that moderate prevention program effects is informative because it highlights aspects of the participants, intervention, program delivery, and research design that are associated with stronger intervention effects. This information should increase the yield of future prevention efforts by identifying the conditions under which optimal prevention effects occur. In addition, this information might identify particular subgroups of individuals for whom alternative obesity prevention programs need to be developed. Analyses of moderators of intervention effects should also advance general theories regarding effective routes to alter maladaptive health behaviors and attitudes.
Accordingly, we investigated several potential moderators of intervention effects that were selected on the basis of theory, prior findings, and previous literature reviews.

Participant Features

Participant Age
Researchers have hypothesized that obesity prevention programs are more effective when they are delivered to middle school or high school students versus grade school students.

Participant Gender
Results from prior trials suggest that obesity prevention programs that promoted a healthier lower calorie diet

Participant Ethnicity
There is also reason to believe that ethnicity might moderate obesity prevention effects. On the one hand, there is evidence that Black and Hispanic individuals show elevated rates of overweight and obesity as well as greater increases in weight over development, relative to other ethnic groups (e.g.,

Risk Status of Participants
More generally, we have hypothesized

Intervention Features

Intervention Duration
Previous meta-analyses of prevention programs for other problem behaviors have suggested that longer duration multisession interventions produced superior effects relative to very brief interventions.

Parental Involvement
It has also been suggested that parental involvement leads to more favorable results in obesity prevention, as the family is thought to be key to developing a psychosocial environment that is conducive to healthy eating and physical activity.

Psychoeducational Content
Because research has suggested that psychoeducational content is ineffective in producing behavioral change

Dietary Improvement
One implication from the energy balance model of obesity is that a reduction in fat and sugar intake and an increase in fruit and vegetable intake will decrease the risk for future weight gain.

Increased Activity
Another implication from the energy balance model of obesity is that increased physical activity will decrease risk for future weight gain.

Reduced Sedentary Behavior
A third implication of the energy balance model of
obesity is that interventions that reduce sedentary behavior, such as TV viewing and video game use, should also decrease risk for future weight gain. Indeed, it has been theorized that more effective obesity prevention programs focused on reducing sedentary behavior.

Number of Behavior Targets
Our review of the literature suggested that the number of health behaviors targeted in an intervention was inversely related to the magnitude of intervention effects for obesity. Specifically, it appeared that interventions that attempted to change a broad array of health behaviors, such as body weight, blood pressure, cholesterol, and smoking, were less effective than programs that focused solely on body weight. Our clinical experience from designing and evaluating prevention programs also suggests that interventions focusing on a few concepts are more effective than those focusing on a broader array of concepts. It may be that the greater the complexity of the message relayed by the intervention, the more difficult it is for participants to process, store, and retrieve information presented in the programs. Consistent with this general impression, a review of school-based cardiovascular disease prevention trials concluded that broad-based programs targeting multiple health behaviors aimed at reducing risks for cardiovascular disease have not been effective for reducing obesity in children.

OBESITY PREVENTION PROGRAMS

Delivery Features

Teachers Versus Professional Interventionists
Researchers have suggested that obesity prevention programs are more effective when delivered by dedicated interventionists versus classroom teachers.

Didactic Versus Interactive Format
Meta-analytic reviews of substance abuse

Design Features

Pilot Study
Our review of the prevention and treatment literature for obesity and eating disorders suggested that larger intervention effects were often observed for pilot trials of a new intervention relative to large demonstration trials.
Such a pattern of effects might occur because interventionists are more passionate about new prevention programs or because demonstration trials are more methodologically rigorous and are therefore more immune to experimenter effects (e.g., because they more often use blinded assessors and minimal intervention control conditions). Thus, we hypothesized that intervention effects would be significantly larger for pilot evaluations of new interventions.

Recruitment Method
Our experience suggests that intervention effects are often larger when prevention programs are delivered solely to participants who have actively self-selected into trials in response to recruitment efforts, such as media advertisements, relative to when prevention programs are offered to all individuals in a defined population (e.g., a particular school). Presumably this is because the former strategy recruits individuals who are more motivated to achieve weight gain prevention effects and therefore engage more effectively in the prevention program. Thus, we hypothesized that intervention effects would be significantly larger for self-presenting volunteers than for participants recruited through population-based recruitment efforts.

Random Assignment
We theorized that trials that randomly assigned participants to condition might produce larger intervention effects than trials that used alternative approaches to allocating participants to treatment condition, such as matching. We reasoned that because random assignment is the best approach to generating groups that are equivalent on any potential confounding variables at baseline (with sufficiently large sample sizes), it should therefore minimize the chances that any of these confounding variables are correlated with treatment condition, which should thus maximize the ability to detect intervention effects if they really occur (i.e., randomization maximizes the signal-to-noise ratio reflected in inferential tests of the intervention effects).
Accordingly, we hypothesized that intervention effects may be greater for trials that used random assignment relative to other approaches to assigning participants to condition. However, because the proper analysis of intervention effects involves tests of differential change across conditions, which adjusts for any initial differences at baseline on the outcome, we suspected that this effect might not reach statistical significance. Consistent with this expectation, random assignment did not emerge as a significant moderator of effect sizes in our meta-analysis of eating disorder prevention programs.

Nested Data Modeled Incorrectly
Virtually all parametric inferential tests, such as repeated measures analysis of variance, growth curve, and survival models, used to test for intervention effects within randomized trials assume independence of errors. However, when participants are nested within schools, classes, or group-based interventions, the assumption of independence may not hold.

Potential Artifacts
We also investigated three variables that might produce artifacts for the effect sizes and bias our estimates of effect size moderators, with the goal of including these variables as covariates in the models if necessary. First, our review of the eating disorder prevention field suggested that interventions tend to produce larger effect sizes when they are compared with assessment-only or waitlist control conditions relative to when they are compared with active interventions that are credible and structurally matched to the intervention in terms of contact hours.

Method

Sample of Studies
Following the recommendations of Lipsey and Wilson

Preventive Medicine, Journal of Pediatrics, Health Education Quarterly). Third, we consulted narrative reviews of the obesity prevention field to search for additional citations of relevance. Fourth, the reference sections of all identified articles were examined.
Finally, established obesity prevention researchers were contacted and asked for copies of unpublished articles (under review or in press) describing prevention trials.

Inclusion and Exclusion Criteria
The defining feature of a successful obesity prevention program is that it results in significantly less weight gain or risk for obesity onset than observed in the control group. Thus, we only included trials that used some type of proxy measure of body fat as an outcome. Most trials used the body mass index (BMI = kg/m²) as the primary proxy measure of body fat, but a few studies, particularly older ones, used skinfold thickness. It is important to note that BMI is not a direct measure of body fat. Although this proxy measure tends to show high correlations with the most precise measures of body fat (r = .80-.90), such as dual energy x-ray absorptiometry (DEXA; Dietz & Robinson, 1998), it has been found to show lower agreement with DEXA measures in large epidemiology samples (r = .71; Ellis,

As noted previously, we included trials that were primarily conceptualized as evaluations of obesity prevention programs, as well as trials that evaluated other interventions that were expected to result in less weight gain or risk for obesity onset but that were not primarily conceptualized as obesity prevention programs (e.g., certain physical activity interventions, eating disorder prevention programs, and psychoeducational interventions). A prior meta-analysis indicated that certain eating disorder prevention programs and psychoeducational interventions produced significant weight gain prevention effects.

This meta-analysis focused solely on effect sizes for weight gain prevention effects, as assessed by differential change in body fat measures.
We did not include effect sizes for changes in self-reported dietary intake or physical activity, because numerous trials have found significant intervention effects for self-reported dietary intake and physical activity, but no significant effects for weight change (e.g.,

We focused exclusively on prevention programs that were evaluated in controlled trials. We included trials in which participants were randomly assigned to an intervention; to active interventions that were not focused on weight gain prevention (e.g., a general parent training intervention); or to usual-programming (e.g., standard physical education classes), waitlist, or assessment-only control conditions. We also included trials in which some relevant comparison group was used (e.g., matched controls) in a quasiexperimental design. Random assignment to condition is optimal because it is the best approach to generating comparison groups that are equated on any potential confounding variables at baseline.

We also focused exclusively on studies that tested whether the change in the outcomes over time was significantly greater in the intervention group versus the control group. This could take the form of a Time × Condition interaction in a repeated-measures analysis of variance model, an analysis of covariance model that controlled for initial levels of the outcome variable, or a growth curve model that controlled for initial levels of the outcome (e.g., the effects were conditional upon the intercept value of the dependent variable coded to reflect the level of the outcome at baseline;

We excluded trials that were described as obesity treatment programs by the authors because the purpose of the present report was to provide a meta-analytic review of programs that sought to prevent future weight gain or obesity onset.
Nonetheless, we included evaluations of programs that sought to prevent future weight gain in overweight or obese samples if they were not referred to as treatment programs by the authors. More generally, we did not exclude studies solely because the average BMI of participants fell above conventional cutoffs for overweight or obese (e.g., over 25 or 30 for young adult samples).

We also restricted our focus to trials that targeted children and adolescents because of our interest in determining whether effective interventions have been designed for developing individuals. We believe that obesity prevention programs should be implemented before most individuals will show onset of obesity. However, we used a broad view of adolescence and included trials with a mean participant age of up to 22 years because this captured college-based obesity prevention programs. College-aged individuals are still developing self-regulation skills, particularly with regard to dietary and exercise behaviors. In addition, many developmental psychologists consider adolescence to span from approximately age 12 through age 24 because most individuals in the United States have not settled into adult roles by their early 20s.

Effect Size Estimation Procedures
We calculated effect sizes for tests of differential change in BMI and risk for obesity onset across the intervention and control conditions because virtually all of the prevention trials included BMI as a primary outcome. Although other proxy measures of adiposity were used in several trials, such as skinfold thickness and waist-to-hip ratios, these latter outcomes were operationalized inconsistently and were collected in only a subset of the trials.
We considered averaging the effect sizes from these various adiposity proxy measures, but we noted that the intervention effects for these various outcomes were often contradictory and were concerned that averaging across diverse measures would introduce unnecessary error variance into the analyses. Furthermore, the measurement error is considerably lower for the BMI relative to alternative proxy body fat measures, including waist circumference, triceps skinfold, and subscapular skinfold measures.

The correlation coefficient (r) was selected as the index of effect size because of its similar interpretation across different combinations of interval, ordinal, and nominal variables (Pearson's r, Spearman's rho, and point biserial;

1 If effect sizes were reported in Cohen's (1988) d, we converted them to r with the formula provided on page 20 of

We were able to use the methods described previously to generate effect sizes or estimates of effect sizes for all trials that reported significant intervention effects and for most trials that reported nonsignificant effects. However, for the two trials that reported nonsignificant effects and did not provide any other data with which to estimate the effect size

Operationalization and Coding of Effect Size Moderators

2 It might be noted that only 55% of the trials that did not use random assignment to condition used matching to create the groups, suggesting that the variable reflecting random assignment was not simply a surrogate for matching, which would have complicated the interpretation of the former moderator.

STICE, SHAW, AND MARTI

One aspect of our coding system was constrained by the distribution of a certain moderator across studies.
Specifically, although we were interested in testing whether the intervention effects were significantly larger for females than males, only 33% of the trials that we located reported effect sizes separately for the sexes (and only 21% provided a direct test of whether sex moderated the intervention effects). Accordingly, we tested whether interventions offered solely to females were more effective than those offered solely to males or those offered to both sexes. We took this approach because (a) this variable emerged as a significant predictor of eating disorder prevention program effects

There were also a number of other potential moderators that we were unable to code because insufficient information was provided in the articles and reports. We were unable to code average attendance because only 44% of the studies reported this variable. We were unable to code the socioeconomic status of the sample because parallel information (e.g., average parental income) was reported in only 35% of studies. We were unable to code the method of handling missing data (e.g., listwise deletion [completer analysis], last observation carried forward, full information maximum likelihood estimation imputation) because fewer than 40% of the studies reported this information.

We used a consensus approach to coding the effect size moderators. Eric Stice and Heather Shaw were each responsible for coding certain moderators but consulted with each other when questions regarding the coding of particular studies arose. Although this approach allowed for a refinement of the coding system and served to increase interrater agreement, we did not use the consensus approach on all data points or double code all studies. Thus, we examined intercoder agreement by having Eric Stice and Heather Shaw code all of the moderators for a randomly selected 30% of the trials examined in this meta-analytic review.
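Intercoder agreement on a nominal moderator is commonly summarized with Cohen's kappa, which corrects raw percent agreement for the agreement expected by chance. The following is a minimal sketch of that calculation, assuming each coder assigns exactly one category per program. The function name, category labels, and codes are hypothetical illustrations and are not data or code from this review.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' nominal codes on the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equally long, non-empty code lists")
    n = len(coder_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from each coder's marginal category frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for one nominal moderator (who delivered the program).
codes_a = ["teacher", "interventionist", "teacher",
           "teacher", "interventionist", "teacher"]
codes_b = ["teacher", "interventionist", "teacher",
           "interventionist", "interventionist", "teacher"]
kappa = cohens_kappa(codes_a, codes_b)
print(round(kappa, 3))
```

In this toy example the coders agree on five of six programs, and chance agreement from the marginals is .50, so kappa is lower than the raw agreement rate.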
Results

Descriptive Statistics
The literature search identified 46 trials that met the inclusion criteria, in which 61 different obesity prevention programs were evaluated (12 trials evaluated more than 1 prevention program, and 3 prevention programs were evaluated in 2 trials), resulting in a total of 64 effect sizes for this review. Of these 64 prevention programs, 30 were universal, and 34 were selected. The majority focused on both males and females (n = 48), but 14 focused solely on females, and 2 focused solely on males. The majority of these interventions were school-based programs (84%). A total of 51 of the 64 prevention programs used random assignment to condition, of which 13% were randomized at the participant level, 2% were randomized at the group level, and 85% were randomized at the school level. Brief descriptions of the samples, program content, and intervention effects are provided in

To assess interrater agreement between the two coders responsible for abstracting effect sizes and moderators, we calculated the intraclass correlation coefficient for continuous variables and kappa (κ) coefficients for nominal variables (see

Average Effect Size and Effect Size Heterogeneity
Analyses were conducted on the effect size for change in BMI in the intervention condition versus the control condition. We first converted Pearson's rs to z scores to avoid problematic standard error estimates.

The average effect size across all studies was very small (r = .04) but was significantly larger than zero (z = 2.94, p < .01). The rs for the effect sizes ranged from -.24 to .50.
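The pooling steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: it assumes a fixed-effect model on the Fisher z scale with weights of n - 3, and it uses the common equal-group-size conversion from Cohen's d to r, which may differ from the formula the article cites. The sketch also computes the homogeneity statistic Q that usually accompanies a pooled estimate. All function names and input values are hypothetical.

```python
import math


def d_to_r(d):
    """Convert Cohen's d to r, assuming equal group sizes (an assumption)."""
    return d / math.sqrt(d * d + 4.0)


def pool_correlations(rs, ns):
    """Pool correlations on the Fisher z scale (fixed-effect assumption)."""
    if len(rs) != len(ns) or not rs:
        raise ValueError("need one sample size per correlation")
    zs = [math.atanh(r) for r in rs]   # Fisher r-to-z transform
    ws = [n - 3 for n in ns]           # inverse of var(z) = 1 / (n - 3)
    total_w = sum(ws)
    mean_z = sum(w * z for w, z in zip(ws, zs)) / total_w
    se = 1.0 / math.sqrt(total_w)      # standard error of the weighted mean
    # Q: weighted squared deviations from the mean; compared with a
    # chi-square distribution on k - 1 degrees of freedom.
    q = sum(w * (z - mean_z) ** 2 for w, z in zip(ws, zs))
    return {
        "mean_r": math.tanh(mean_z),   # back-transform to the r metric
        "z_test": mean_z / se,         # test of mean effect against zero
        "Q": q,
        "df": len(rs) - 1,
    }


# Hypothetical effect sizes and sample sizes, not values from the review.
rs = [0.10, -0.05, 0.30, d_to_r(0.2)]
ns = [103, 203, 53, 153]
result = pool_correlations(rs, ns)
print(result)
```

A large Q relative to its degrees of freedom indicates that the effect sizes vary more than sampling error alone would predict, which is the usual motivation for moderator analyses.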
Only 13 of these interventions (1 of which was evaluated in two trials), or 21% of the 61 programs evaluated, found significant positive intervention effects based on an alpha level of .05.

There was significant heterogeneity in effect sizes (Q = 204.41, p < .001), indicating that there was variability across the effect sizes produced by the interventions (i.e., that effects were not equivalent across trials). The heterogeneity in the effects suggests that there may be participant, intervention, delivery, and design features that account for the variability in effect sizes.

Moderator Analyses
Two moderators could not be examined because of severe restrictions in range; because only two studies used credible active control conditions, and because we located only two unpublished reports, we did not consider type of control condition or publication status³ further. Two potential confounding variables were not examined because they did not show significant relations to effect sizes: preliminary univariate analyses indicated that length of follow-up (z = 1.58, p = .11, β = 0.18) and the age range of participants in the trials (z = .80, p = .42, β = 0.10) were not significantly related to effect size magnitude. Within this context, it should be noted that preliminary analyses also indicated that

3 Even though there were only two unpublished trials included in the present meta-analysis, we confirmed that there was no evidence that the unpublished studies had significantly different effect sizes relative to published studies (z = .03, p = .82, β = 0.03).