Maestas and Gaillot (2010) noted that because of the possibility of attrition bias, treatment effects at follow-up should be viewed as only suggestive. Results were reported for the full sample of observations as well as for a subsample restricted to valid observations (invalid response patterns, about 25 percent of the sample, were excluded).
Overall, the teacher reports showed significant treatment effects of the Success for Kids (SFK) program on some of the individual and composite scales at posttest, but most effects had dissipated by the 12-week follow-up. Of the 10 individual clinical scales (aggression, anxiety, attention problems, atypicality, conduct problems, depression, hyperactivity, learning problems, somatization, and withdrawal), the SFK program produced a statistically significant reduction on one (attention problems). When observations with invalid response patterns were excluded, the SFK program was associated with significant reductions on 4 of the 10 clinical scales (attention problems, atypicality, hyperactivity, and withdrawal). In addition, four of the five adaptive scales (adaptability, functional communication, leadership, and study skills) showed statistically significant treatment effects.
On the composite scales, there was a significant effect of the SFK program on adaptive skills (a summary of scales measuring appropriate emotional expression and control; communication skills; and prosocial, organization, and study skills). The treatment effects were statistically significant both when all observations were examined and when only valid observations were examined, meaning the treatment group showed greater gains on these positive outcomes. However, the effects were no longer statistically significant at the 12-week follow-up test.
For the behavioral symptom index (a composite measure of the overall level of problem behavior), there was a statistically significant decline in negative behavior for the treatment group, but only when examining the valid-observations subsample. By the 12-week follow-up, treatment effects had increased and were significant for all observations as well as for valid observations only. The results for the externalizing problems scale (a composite of the scales measuring disruptive behaviors) showed significant treatment effects of the SFK program at posttest, but only for the subsample of valid observations. The effects dissipated and were nonsignificant at the 12-week follow-up. For the internalizing problems scale (a composite of scales measuring overly controlled behaviors), there were no significant treatment effects at the posttest, but there was a significant treatment effect for the full sample of observations at the follow-up.
Finally, for the school problems scale (a composite of scales measuring academic difficulties, including motivation, attention, learning, and cognition), the treatment effects of SFK were significant in both the full sample and the valid-observations subsample at the posttest, but were not significant for either sample at the follow-up.
The study authors also noted doubts about the validity of the child self-report data because of the large number of invalid response patterns, the lower reliability coefficients, and the potential for nonresponse bias. They nonetheless estimated the treatment effects of the SFK program from the available data. For the full sample at posttest, there was just one statistically significant treatment effect, on the atypicality scale (which measures the tendency toward bizarre thoughts or other thoughts and behaviors considered odd). However, the effect was not in the expected direction, implying a detrimental program effect (the treatment group did worse than the control group). In the valid-observations subsample, there was no significant difference between the groups on the atypicality scale; the only statistically significant effect was on self-esteem, where the treatment group showed a greater gain than the control group.
At the 12-week follow-up, only one clinical scale showed statistically significant treatment effects of SFK for the full sample of observations (attitude to teachers), meaning the treatment group showed a greater reduction in the negative outcome than the control group. When examining the valid observations subsample, only one scale showed a significant treatment effect (atypicality), where again the treatment group showed a greater reduction in the negative outcome than the control group.
The Success for Kids (SFK) curriculum was evaluated by Maestas and Gaillot (2010) using a combination of experimental and quasi-experimental research designs. During late 2006, the SFK program was rapidly expanding across afterschool programs in southeast Florida. This allowed the study authors to randomly assign 19 participating afterschool program sites to either immediate implementation of SFK programming or delayed implementation after an approximately 12-week waiting period. This setup resembles a waitlist control group design: during the waiting period, the delayed-implementation sites formed a control group for the immediate-implementation sites.
Three groups of sites implemented the SFK curriculum over the study period. Group 1 and group 2 entered the study in fall 2006: group 1 received SFK immediately, while programming for group 2 was delayed until winter 2006–07. Both groups were tested in fall 2006 and winter 2006–07; during this period, group 1 received its pretest and posttest as a treatment group, and group 2 received its pretest and posttest as the control group. In winter 2006–07, group 2 switched from control-group to treatment-group status, and its control-group posttest simultaneously served as its treatment-group pretest. Group 2’s treatment-group posttest was administered in spring 2007, when group 1 received its follow-up test. Group 3 entered the study in winter 2006–07, first as a control group, and then switched to treatment-group status in spring 2007. Group 3’s control-group pretest was administered in winter 2006–07 and its control-group posttest in spring 2007; that posttest simultaneously served as its treatment-group pretest. Group 3’s treatment-group posttest was administered in summer 2007, when group 2 also received its follow-up test. The study ended in summer 2007, so no follow-up data were collected for group 3.
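The crossover schedule described above can be compact to follow in prose; the sketch below reconstructs it as a simple data structure (a summary aid derived from this description, not a table from the original report). Entries marked with a slash are the shared assessments at crossover.

```python
# Assessment schedule for the three site groups, reconstructed from the
# study description. A "posttest (control) / pretest (treatment)" entry
# is a single assessment serving both roles at the crossover point.
schedule = {
    "fall 2006": {
        "group 1": "pretest (treatment)",
        "group 2": "pretest (control)",
    },
    "winter 2006-07": {
        "group 1": "posttest (treatment)",
        "group 2": "posttest (control) / pretest (treatment)",
        "group 3": "pretest (control)",
    },
    "spring 2007": {
        "group 1": "follow-up",
        "group 2": "posttest (treatment)",
        "group 3": "posttest (control) / pretest (treatment)",
    },
    "summer 2007": {
        "group 2": "follow-up",
        "group 3": "posttest (treatment)",  # study ended; no follow-up
    },
}

for wave, groups in schedule.items():
    for group, assessment in sorted(groups.items()):
        print(f"{wave:15s} {group}: {assessment}")
```

Reading down a column (a group) shows each group's progression; reading across a row (a wave) shows which groups were concurrently serving as treatment and control.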
During analysis of the data, all treatment-group observations and all control-group observations were pooled together, combining the experimental variation (induced by randomization) with quasi-experimental variation. This was done because preliminary analyses revealed that randomization did not sufficiently balance the covariates across treatment and control groups: there were significant group differences despite the random assignment to immediate or delayed programming. This was likely because randomization was carried out over a small number of units (19 program sites) drawn from demographically heterogeneous neighborhoods; such group differences tend to “average out” only as the number of randomization units increases.
To address the issues that arose with the randomization process, the research design was reconceptualized as a longitudinal study, since repeated measures were available. By letting the control sites in groups 2 and 3 also contribute treatment observations, the authors were able to pool all treatment and control observations and use site-level fixed effects, which effectively let group 2 and 3 sites act as control groups for themselves. Children in groups 2 and 3 thus entered the analysis once as control observations and again as treatment observations, which helps minimize observed and unobserved differences between the groups. The experimental variation is still present to a certain degree, because group 1 (which never acted as a control group) is retained in the treatment group.
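To make the pooled fixed-effects logic concrete, here is a minimal illustrative sketch in Python (not the study's actual analysis; the variable names and toy data are hypothetical). Site dummies absorb fixed site-level differences, so the treatment coefficient is identified by within-site contrasts, exactly the "sites acting as their own controls" idea described above.

```python
# Illustrative sketch of a pooled analysis with site-level fixed effects.
# Toy data: three sites whose children are observed once under control and
# once under treatment (as group 2 and 3 sites were). Site baselines differ
# (heterogeneous neighborhoods); the true treatment effect is set to 0.5.
import pandas as pd
import statsmodels.formula.api as smf

site_effect = {"A": 1.0, "B": 2.0, "C": 3.0}  # fixed site-level differences
rows = []
for site, base in site_effect.items():
    for treated in (0, 1):
        for _ in range(3):  # a few children per site and condition
            rows.append({"site": site, "treated": treated,
                         "score_change": base + 0.5 * treated})
df = pd.DataFrame(rows)

# C(site) adds a dummy for each site, absorbing the site baselines;
# the coefficient on `treated` is then the pooled treatment effect.
model = smf.ols("score_change ~ treated + C(site)", data=df).fit()
print(round(model.params["treated"], 3))  # → 0.5
```

Because each site contributes both control and treatment observations, the site dummies soak up the between-site heterogeneity that the small-sample randomization failed to balance.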
All measurements were taken using the Behavior Assessment System for Children, Second Edition (BASC–2), a multimethod, multidimensional system used to evaluate the behavior and self-perceptions of children and young adults. Two report forms were used, one completed by teachers and one by the children themselves. The BASC–2 individual scales are adaptability, aggression, anxiety, attention problems, atypicality, conduct problems, depression, functional communication, hyperactivity, leadership, learning problems, social skills, somatization, study skills, and withdrawal. The composite scales (groupings of individual scales) are adaptive skills, behavioral symptoms index, externalizing problems, internalizing problems, and school problems. The BASC–2 scales may be classified into two types: clinical scales (measuring negative outcomes) and adaptive scales (measuring positive outcomes).
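The clinical/adaptive partition of the 15 teacher-report scales can be written out explicitly, which makes the earlier counts ("10 clinical scales," "five adaptive scales") easy to verify. This grouping follows the lists given in this text; the self-report form adds scales (e.g., attitude to teachers, self-esteem) not shown here.

```python
# BASC-2 teacher-report scales as classified in the text:
# clinical scales measure negative outcomes, adaptive scales positive ones.
clinical = {"aggression", "anxiety", "attention problems", "atypicality",
            "conduct problems", "depression", "hyperactivity",
            "learning problems", "somatization", "withdrawal"}
adaptive = {"adaptability", "functional communication", "leadership",
            "social skills", "study skills"}
composites = {"adaptive skills", "behavioral symptoms index",
              "externalizing problems", "internalizing problems",
              "school problems"}

assert not clinical & adaptive        # the two types do not overlap
print(len(clinical), len(adaptive), len(clinical | adaptive))  # → 10 5 15
```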
There were 19 program sites that participated in the study from various cities across southeast Florida, including Miami, Hialeah Gardens, Davie, Boynton Beach, Lauderhill, Delray Beach, Fort Lauderdale, Palm Beach, Pompano Beach, and Opa Locka. However, the participating sites were a heterogeneous group. For example, the racial and ethnic composition of all program sites was predominantly white, African American, and Latino, but the percentage breakdown of each group differed by city. In Davie, participating children were approximately 90.6 percent white, 4.5 percent African American, and 17.4 percent Latino (racial and ethnic categories were not mutually exclusive, which is why percentages do not total 100). In Pompano Beach, by contrast, students were 54.7 percent white, 41.2 percent African American, and 11.9 percent Latino. Logistic regression of pretests of teachers’ reports showed significant differences between the treatment and control groups. The treatment-group children tended to come from relatively disadvantaged neighborhoods, with higher crime rates and higher percentages of both whites and blacks than of other racial and ethnic groups. However, there were no significant differences between the treatment and control groups in average standardized-test scores, in the gender of study children, or in the availability of other types of programming at the sites. There also was no statistically significant difference between groups on 14 of the 15 BASC–2 behavioral scales.
A total of 737 children were enrolled in the study. Of these, teachers completed questionnaires for 89 percent of enrolled children, while 53 percent of children completed the self-report questionnaire. Attrition rates for teachers participating in the study were 22 percent for the treatment group and 19 percent for the control group, which was not a significant difference. For children, the attrition rates were significantly higher in the treatment group (40 percent) than in the control group (26 percent). Moreover, the attrition rate in the treatment group climbed dramatically between the posttest and the follow-up test (to 57 percent in the teacher sample and 66 percent in the child sample). Since pretest-to-follow-up differences for the treatment group were compared with pretest-to-posttest differences for the control group (no follow-up tests were administered to the control groups), the very different attrition rates for the two groups under comparison indicate that very different attrition processes were likely at play. Regression analysis revealed no differences between the groups in baseline behavioral scores, but there were highly significant differences in neighborhood demographic characteristics and in the availability of alternative programming. The site-level differences can be accounted for with site-level fixed effects; nevertheless, the follow-up results should be interpreted as only suggestive, since the underlying experimental variation was flawed.