The present study analyzed 81 second language instructed vocabulary acquisition (L2 IVA) studies in two phases. In Phase I, we categorized and coded the effect sizes of the studies. Observing that the basic between- and within-subject design dichotomy lacked the sensitivity to capture the heterogeneity of observed effects, we employed a more granular approach. In both between- and within-subject designs, treatment versus comparison contrasts best represented the comparisons of greatest interest in L2 IVA experiments, with median effect sizes (g) of .62 (between-subject) and .25 (counterbalanced within-subject). In Phase II, the aggregated effect sizes observed in Phase I were used in a priori power simulations to suggest approximate sample sizes for common L2 IVA analyses. For conservatively powered between-subject designs, the simulations suggested sample sizes ranging from 292 to 492 participants. Counterbalanced within-subject designs required 95 to 203 participants, depending on the assumed correlation between the repeated measures. The overarching implication of these simulations is that future L2 IVA experiments require larger samples, determined with reference to effect sizes from previous research, and we offer three potential solutions to the problem of obtaining larger samples.