Meta-Psychology, 2022, vol 6, MP.2021.2803
https://doi.org/10.15626/MP.2021.2803
Article type: Original Article
Published under the CC-BY4.0 license
Open data: Yes
Open materials: Yes
Open and reproducible analysis: Yes
Open reviews and editorial process: Yes
Preregistration: Yes
Edited by: Thomas Nordström
Reviewed by: Matthew Miller, František Bartoš
Analysis reproduced by: Lucija Batinović
All supplementary files can be accessed at OSF: https://doi.org/10.17605/OSF.IO/39W6K

Meta-analytic findings of the self-controlled motor learning literature: Underpowered, biased, and lacking evidential value

Brad McKay, School of Human Kinetics, University of Ottawa; Department of Kinesiology, McMaster University
Zachary D. Yantha, School of Human Kinetics, University of Ottawa
Julia Hussien, School of Human Kinetics, University of Ottawa
Michael J. Carter, Department of Kinesiology, McMaster University
Diane M. Ste-Marie, School of Human Kinetics, University of Ottawa

Abstract

The self-controlled motor learning literature consists of experiments that compare a group of learners who are provided with a choice over an aspect of their practice environment to a group who are yoked to those choices. A qualitative review of the literature suggests an unambiguous benefit from self-controlled practice. A meta-analysis was conducted on the effects of self-controlled practice on retention test performance measures with a focus on assessing and potentially correcting for selection bias in the literature, such as publication bias and p-hacking. First, a naïve random effects model was fit to the data and a moderate benefit of self-controlled practice, g = .44 (k = 52, N = 2061, 95% CI [.31, .56]), was found. Second, publication status was added to the model as a potential moderator, revealing a significant difference between published and unpublished findings, with only the former reporting a benefit of self-controlled practice. Third, to investigate and adjust for the impact of selectively reporting statistically significant results, a weight-function model was fit to the data with a one-tailed p-value cutpoint of .025. The weight-function model revealed substantial selection bias and estimated the true average effect of self-controlled practice as g = .107 (95% CI [.047, .18]). P-curve analyses were conducted on the statistically significant results published in the literature and the outcome suggested a lack of evidential value. Fourth, a suite of sensitivity analyses was conducted to evaluate the robustness of these results, all of which converged on trivially small effect estimates. Overall, our results suggest the benefit of self-controlled practice on motor learning is small and not currently distinguishable from zero.

Keywords: Motor learning, Retention, Choice, OPTIMAL theory, Meta-analysis, p-curve, Publication bias

Introduction

Asking learners to control any aspect of their practice environment has come to be known as self-controlled practice in the motor learning literature (Sanli et al., 2013; Wulf & Lewthwaite, 2016). The first published experiments to test self-controlled learning asked learners to control their augmented feedback schedule (Janelle et al., 1997; Janelle et al., 1995). For example, in an experiment by Janelle et al. (1997), participants practiced throwing tennis balls at a target with their non-dominant hand. The practice period occurred over two separate days.
Participants were assigned to one of four experimental groups (n = 12): self-controlled knowledge of performance, yoked-to-self-control, summary knowledge of performance after every five trials, and a knowledge of results only control group. The self-controlled group could request knowledge of performance whenever they wanted it, while each yoked group participant was matched with a self-control group counterpart and received knowledge of performance on the same schedule. The experimenter evaluated the participants' throws, identified the most critical error in their throwing form, and provided knowledge of performance via video feedback, along with directing attention to the error and giving prescriptive feedback. During a delayed-retention test, the accuracy, form, and speed of the throw were assessed. The results indicated that the self-control group threw more accurately and with better form than all other groups on the retention test. The self-control and yoked groups did not significantly differ in throwing speed, but the control group threw faster than the self-control group on the second retention block. The results were interpreted as evidence that the participants provided with choice were able to process information more efficiently than their counterparts who received a fixed schedule of feedback.

Figure 1 shows that the number of experiments comparing self-controlled groups to yoked groups has been increasing since the original experiments by Janelle and his colleagues (1997, 1995). Researchers have experimented with giving learners control over a variety of variables in the practice environment. A qualitative assessment of the literature suggests that self-control is generally beneficial regardless of choice-type (Wulf & Lewthwaite, 2016). For example, self-control has been effective when participants have been provided choice over what can be considered instructionally-relevant variables, such as knowledge of results (Patterson & Carter, 2010), knowledge of performance (Lim et al., 2015), concurrent feedback (Huet et al., 2009), use of an assistive device (Wulf et al., 2001), observation of a skilled model (Lemos et al., 2017), practice schedule (Wu & Magill, 2011), practice volume (Lessa & Chiviacowsky, 2015), and task difficulty (Leiker et al., 2016). Additionally, self-controlled benefits have also been found for instructionally-irrelevant variables, such as the colour of various objects in the practice environment (Wulf et al., 2018), other decorative choices (Iwatsuki et al., 2019), and the choice of what to do after the retention test is complete (Lewthwaite et al., 2015).

Figure 1. Number of self-controlled learning experiments meeting the inclusion criteria by year.

Despite the widespread optimism that self-controlled practice is useful for enhancing motor learning, researchers continue to debate the underlying mechanisms responsible for the effect (M. J. Carter & Ste-Marie, 2017b; Wulf et al., 2018). Beginning with Janelle et al. (1995), both motivational and information processing mechanisms were proposed as possible explanations for self-control benefits.
Researchers have since supported these two mechanisms and, from a motivational perspective, have posited that self-control enhances confidence (Chiviacowsky, Wulf, & Lewthwaite, 2012; Janelle et al., 1995; Wulf & Lewthwaite, 2016) and satisfies the basic psychological need for autonomy (Sanli et al., 2013; Wulf & Lewthwaite, 2016), motivating motor performance and learning enhancement. Most self-controlled learning experiments, however, have involved participants making choices over potentially informative variables, which could act as a confounding variable. Citing this potential motivational/informational confound, Lewthwaite et al. (2015) experimented with providing instructionally-irrelevant choices, such as the colour of the golf balls to putt, the painting to hang on the wall, and what to do following the retention test. Lewthwaite and her colleagues reasoned that information processing explanations could not account for benefits due to these incidental choices, and instead motivational factors were more likely. Consistent with the motivational hypothesis, participants exhibited significantly greater motor learning on a golf putting task (Experiment 1) and on a balance task (Experiment 2). Subsequently, several experiments have reported benefits with instructionally-irrelevant choices (Abdollahipour et al., 2017; Chua et al., 2018; Halperin et al., 2017; Iwatsuki et al., 2019; Wulf et al., 2014; Wulf et al., 2018), further reinforcing this motivational perspective.

A contrasting line of research has been reported by M. J. Carter and his colleagues (2014, 2017a, 2017b) in which informational factors, the second dominant perspective, are given more weight as an explanatory variable. In one experiment by M. J. Carter et al. (2014), self-control participants were provided with choice over receiving knowledge of results, but divided into three experimental groups: those who could make their knowledge of results decision before the trial, after the trial, or both (they would decide before, but could change their mind following the trial). Timing of the choice significantly attenuated the self-control benefit. While the self-after and self-both groups exhibited learning advantages relative to their yoked counterparts, the self-before group displayed no such advantage. The argument proffered by the researchers was that there was more informational value to be gained from knowledge of results requested after a trial than when it had to be requested before the outcome of the trial occurred (also see Chiviacowsky & Wulf, 2005). In another experiment (M. J. Carter & Ste-Marie, 2017a), asking learners to complete an interpolated activity in the interval preceding their choice of whether to receive knowledge of results significantly attenuated the self-control benefit (also see Couvillion et al., 2020; Woodard & Fairbrother, 2020). As a final example, M. J. Carter and Ste-Marie (2017b) compared an instructionally-relevant choice group (i.e., when to receive knowledge of results) to an instructionally-irrelevant choice group (i.e., which video game to play after retention and which colour arm wrap to wear while practicing). Unlike the experiment by Wulf and colleagues (2018), M. J. Carter and Ste-Marie found that instructionally-relevant choices were more effective than task-irrelevant choices.
Overall, they have used these different findings to tie self-controlled learning benefits to information-processing activities of the learner and, in particular, those related to the processing of intrinsic feedback (e.g., M. J. Carter & Ste-Marie, 2017a; Chiviacowsky & Wulf, 2005) and the provided knowledge of results (e.g., Grand et al., 2015).

In the present research, these different viewpoints concerning the mechanisms of self-controlled learning advantages were examined via meta-analysis with choice-type included as a moderator. The logic was that the motivational and informational perspectives would have different predictions. More specifically, from a motivation hypothesis, no moderating effect of choice-type on motor learning would be expected. In contrast, smaller effects for irrelevant-choice types, as compared to relevant-choice types, would be expected from the information-processing perspective.

Beyond this interest in the possible theoretical mechanisms, a more important question addressed was whether there is in fact evidential value for the self-controlled learning benefit. This is of relevance because the current consensus in the field is that self-controlled practice is generally more effective than yoked practice (for reviews see Sanli et al., 2013; Ste-Marie et al., 2019; Wulf & Lewthwaite, 2016). Reflecting this confidence in its benefits for motor learning, researchers have recommended adoption of self-control protocols in varied settings, such as medical training (Brydges et al., 2009; Jowett et al., 2007; Wulf et al., 2010), physiotherapy (Hemayattalab et al., 2013; Wulf, 2007), music pedagogy (Wulf & Mornell, 2008), strength and conditioning (Halperin et al., 2018), and sports training (Janelle et al., 1995; Sigrist et al., 2013).

Problematic, though, is that recent, high-powered experiments with pre-registered analysis plans have failed to observe motor learning or performance benefits with self-control protocols (Grand et al., 2017; McKay & Ste-Marie, 2022; St. Germain et al., 2022; Yantha et al., 2022). Against the backdrop of the so-called replication crisis in psychology (Open Science Collaboration, 2015), there is reason for pause when evaluating the ostensible benefits of self-controlled learning. Further, Lohse et al. (2016) have raised concerns about publication bias, uncorrected multiple comparisons, p-hacking, and other selection effects in the motor learning literature. Therefore, to address the impact of selection effects on estimates of the self-controlled learning effect, a weight function model (E. C. Carter et al., 2019; Hedges & Vevea, 1996; McShane et al., 2016; Vevea & Hedges, 1995; Vevea & Woods, 2005) with a one-tailed p-value cutpoint of .025 was fit to the dataset of effects to provide a pre-registered adjusted estimate of the overall self-controlled learning effect. Even the adjusted estimate is biased if the data generating processes are biased in ways not captured by the assumptions of the model, so further sensitivity analyses were conducted to estimate the average effect of self-control after correcting for selection effects (E. C. Carter et al., 2019; Vevea & Woods, 2005). In parallel, in an effort to investigate the presence of evidential value in the literature, significant results were subjected to a p-curve analysis (Simonsohn et al., 2014b; Simonsohn et al., 2015). The p-curve analysis focuses exclusively on significant results and therefore is not affected by publication bias.
In sum, the objectives of this meta-analysis were to estimate the true average effect of self-controlled learning and evaluate the evidential value of the self-controlled learning literature. Bias resulting from selective publication was addressed with weight function and p-curve models and effect size estimates were adjusted accordingly. A key theoretical question related to the underlying mechanisms of putative self-controlled learning advantages (motivational versus informational influences) was also addressed through moderator analyses, but, to anticipate, inferences will depend on the reliability of the evidence overall. Finally, sensitivity analyses were conducted in addition to pre-registered analyses in an effort to understand the extent that our conclusions depended on the modeling techniques and assumptions adopted.

Methods

Pre-registration

The procedures followed to conduct this meta-analysis were pre-registered and can be viewed at https://osf.io/qbg69. This meta-analysis was retrospective and earlier samples of the literature had been meta-analyzed prior to this pre-registration, albeit with different data collection procedures, scope, and excluding recent experiments. This study adheres to PRISMA reporting guidelines (Page et al., 2021).

Literature Search

The literature search and data extraction were conducted by three authors (BM, ZY, JH) and one research assistant (HS) independently. The goal of the search was to identify all articles that met the inclusion criteria for the meta-analysis. Specifically, randomized experiments were subject to five criteria for inclusion: 1) a self-control group in which participants were asked to make at least one choice during practice, 2) a yoked group that experienced the same practice conditions as the self-controlled group, 3) a delayed 24-hour retention test or test with a longer delay interval, 4) an objective measurement of motor performance, and 5) publication in a peer-reviewed journal or acceptance as part of a Master's or PhD thesis.

The literature search was completed on August 2, 2019. The search commenced at PubMed and Google Scholar with the following query: self-control* OR self-regulat* OR self-direct* OR learner-control* OR learner-regulat* OR learner-direct* OR subject-control* OR subject-regulat* OR subject-direct* OR performer-control* OR performer-regulat* OR performer-direct* AND motor learning*. The query retrieved 9014 hits on PubMed and 98,600 hits on Google Scholar. Each researcher excluded hits based on title alone or title and abstract when necessary, and quit searching the databases at self-selected intervals following extended periods of excluding 100% of search results (ranging between 20 and 30 results pages without identifying a relevant record). Following an initial run of searching databases, each researcher employed their own search strategies, including reviewing the reference sections of reviews and included articles, consulting the OPTIMAL theory website,1 and searching the ProQuest Thesis database.

This literature search process resulted in 160 articles that could not be excluded without consulting the full text of the article. All 160 articles were coded for inclusion or exclusion by two researchers independently. All instances of disagreement between coders were reviewed by three authors (BM, ZY, and JH), and consensus was reached in each case.
Disagreements were infrequent and were often caused by a lack of clarity in the articles (e.g., 100% knowledge of results groups labeled as yoked groups). None of the coding disagreements evolved into conceptual disagreements. Rather, in each case, it was identified that one coder had missed a detail in the full text that changed its inclusion eligibility. Subsequent to this process, a total of 73 articles, which included 78 experiments, met the inclusion criteria (see Table 1).

Dependent Variable Selection

The focus of this meta-analysis was on performance outcomes associated with the goal of the skill. The primary theoretical perspectives offered as an account for self-controlled learning are likewise focused on performance outcomes. For example, the OPTIMAL theory proposes that a learner's movements become coupled with the goal they are trying to achieve when they experience autonomy-support during practice (Wulf & Lewthwaite, 2016). To reflect this focus, a dependent measure priority list was developed that gave higher priority to absolute error measures and less priority to consistency measures, time/work measures, and form scores. Dependent measure priority was ordered as follows: 1) absolute error (and analogous measures: radial error, points in an accuracy measure), 2) root-mean-square error (RMSE), 3) absolute constant error, 4) variable error, 5) movement time (and distance travelled), 6) movement form (expert raters), 7) otherwise unspecified objective performance measure reported first in the research report.2 In the event that multiple measures of motor performance were reported for an experiment, effect sizes were calculated for the highest priority measure reported in the study. In experiments with multiple self-control groups and one yoked group, the self-control groups were combined (Higgins & Green, 2011). If multiple choice-types or sub-populations were included in an experiment, combined and individual effects were calculated for inclusion in moderator analyses.

1 The webpage link that was consulted is no longer available (https://optimalmotorlearning.com/index.php/did-you-know-that/). A new webpage devoted to OPTIMAL theory can be accessed using the following link: https://gwulf.faculty.unlv.edu/optimal-motor-learning/

2 Radial error, accuracy points, and distance travelled were added to the pre-registered dependent measures as they arose during data-extraction. Decisions were made blind to the data by an author not involved in said extraction (BM or DSM).

Many of the self-controlled learning experiments analyzed in this study included multiple dependent measures. However, including multiple measures from the same experiment introduces bias and inflates Type 1 error (Scammacca et al., 2014). Although there are a variety of methods for dealing with multiple measures from the same studies in meta-analysis, we chose to create a priority list and always selected the highest priority dependent measure that was reported. If the highest priority measure was not described in adequate detail to calculate the effect size, the authors were contacted and the data were requested. If the authors could not provide the data for the highest priority dependent measure reported in their study, the experiment was left out of our analysis.
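To make the priority rule concrete, the selection logic can be sketched in a few lines of R. This is an illustration only; the measure labels and the function name below are hypothetical and are not taken from the analysis scripts:

```r
# Hypothetical labels for the dependent measure priority list (highest first)
priority <- c("absolute_error", "rmse", "absolute_constant_error",
              "variable_error", "movement_time", "movement_form",
              "other_objective")

# Return the highest-priority measure reported by an experiment
select_measure <- function(reported) {
  available <- priority[priority %in% reported]
  if (length(available) == 0) return(NA_character_)
  available[1]
}

select_measure(c("variable_error", "rmse"))  # "rmse" outranks variable error
```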
The rationale for selecting the approach we did was based on five considerations. First, our interest was in motor learning as reflected by an enhanced capability to perform a skill. Motor learning studies often report multiple error measures, but they are not equally coupled with performance outcome. Constant error, for example, was not included on the priority list because it is possible to have zero constant error while performing terribly overall. Therefore, we chose to prioritize measures that could be considered to be tightly coupled with performance, like absolute error, RMSE, and absolute constant error. If these measures were not used, measures that are only correlated with performance, such as variable error, movement time, and movement form, were selected. We reasoned this selection strategy would focus the analysis on measures related to improved skill while de-emphasizing other effects. Second, we reasoned that averaging across dependent measures could introduce additional heterogeneity to the analysis by including potentially disparate dependent measures. The third, fourth, and fifth considerations all relate to avoiding bias but differ with regard to the source of the bias and the alternate method that would include such bias. Thus, the third consideration was that imposing a priority list was thought to better avoid biases that could emerge from selecting the most focal measure in a given study, because an unknowable percentage of studies may have defined the focal measure based on the strength of the findings. Fourth, we reasoned that some measures may only get reported if they support the predicted benefit of self-control. Scammacca et al. (2014) reported that effect size estimates were inflated when random dependent measures were selected in a meta-analysis case study, perhaps reflecting a selective reporting bias. Averaging across all reported measures, a fair alternative to our approach, could conceivably pick up some of this reporting bias. Fifth, we ignored lower priority measures with data when higher priority measures lacked data because we reasoned there could be a systematic reason for this pattern: preference for reporting data associated with positive effects. Indeed, there were articles where the only measure reported with sufficient data to calculate an effect size was also the only measure with a significant result (e.g., Wulf et al., 2005).

Data Extraction

The four researchers separated into pairs and half of the included experiments were coded independently by one pair. The other half were coded independently by the other pair. The coding included varied moderators, publication year, and sample size. Hedges' g was also calculated from reported statistics and sample size using the compute.es package (Del Re, 2013) in R (R Core Team, 2021). Effect sizes were calculated from means and standard deviations, test statistics like t and F, or from precisely reported p-values. When covariates were included in the analysis, the correlation coefficient for the covariate-dependent measure relationship was required to calculate accurate effect sizes. Since this information is often not reported, authors were contacted and the information was requested. One effect size was calculated for each of three time points for each experiment: acquisition, retention, and transfer.
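For illustration, the compute.es calls corresponding to these three kinds of input look as follows. The summary statistics shown here are hypothetical placeholders, not values from any included experiment:

```r
library(compute.es)

# From group means and standard deviations
mes(m.1 = 10.2, m.2 = 12.5, sd.1 = 3.1, sd.2 = 3.4, n.1 = 14, n.2 = 14)

# From a reported t statistic
tes(t = 2.10, n.1 = 14, n.2 = 14)

# From a precisely reported p-value
pes(p = .04, n.1 = 14, n.2 = 14, tail = "two")
```

Each call prints a panel of effect size indices, including Hedges' g and its variance, which can then be carried forward into the meta-analytic models.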
The independent data extractions were compared and inconsistent results were highlighted. There was 89% absolute agreement between pairs of coders on 1344 data points. For those with disagreement, one of the researchers from the other coding pair reviewed the relevant experiment to confirm the value to be used in the analysis.3

Several articles failed to report the data necessary to calculate effect sizes at some or all time-points. A total of 39 authors were emailed with requests for missing data and 17 were able to provide data following a minimum one month period following the request. After requesting missing data, 25 experiments were excluded from primary analyses for missing retention data. A total of 52 effects from 51 experiments reported in 46 articles were included in the primary meta-analysis.

3 On one occasion, the third researcher was unable to match either effect calculation, so the involved researchers discussed the issue, determined the source of the inconsistency, and asked a fourth researcher to recalculate the effect size with clear instructions for avoiding confusion. The source of inconsistency was simply a rounding error when combining multiple groups and the fourth researcher was able to corroborate the calculation.

In addition to extracting effect sizes, inferential statistics were scraped from published experiments that reported a statistically significant effect at retention. Two authors (BM and JH) independently completed a p-curve disclosure form consisting of a direct quote of the stated hypotheses for each experiment, the experimental design, and a direct quote of the results indicating a significant result (see Appendix A). There was 94% absolute agreement between the independent forms. Mismatches were resolved with consensus.

Outlier Screening

The meta-analysis R package metafor (Viechtbauer, 2010) was used to screen the data for potentially influential outliers (see analysis script). In order to identify outlier values and exclude them from further analyses, the following nine influence statistics were calculated: a) externally standardized residuals, b) DFFITS values, c) Cook's distances, d) covariance ratios, e) DFBETAS values, f) the estimates of τ² when each study is removed in turn, g) the test statistics for (residual) heterogeneity when each study is removed in turn, h) the diagonal elements of the hat matrix, and i) the weights (in %) given to the observed outcomes during the model fitting. Any experiment with effects identified as extremely influential by any three of the influence metrics was removed from subsequent analyses.

Risk of Bias

All articles were assessed for risk of bias by the lead author using the Cochrane Risk of Bias 1.0 tool (Higgins et al., 2011). Each article was coded as either high risk, unclear (some concerns), or low risk on seven dimensions: sequence generation, allocation concealment, incomplete outcome data, selective outcome reporting, blinding of outcome assessment, blinding of participants and personnel, and other sources of bias.

Pre-specified Analyses

Random Effects Model

A naïve random effects model was fit to the retention effect sizes to estimate the average reported effect of self-controlled learning and to assess heterogeneity in effect sizes between experiments. Heterogeneity was evaluated with the Q statistic and described with I². A mixed-effects model was fit to evaluate whether differences in experimental design or sample characteristics moderated the effect of self-controlled learning.
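A minimal metafor sketch of the screening and modeling steps described above, assuming a data frame dat with hypothetical columns yi (Hedges' g), vi (sampling variance), and published (a moderator):

```r
library(metafor)

# Naive random effects model on the retention effects
res <- rma(yi, vi, data = dat)
res  # prints the average g, Q test for heterogeneity, tau^2, and I^2

# Influence diagnostics: externally standardized residuals, DFFITS,
# Cook's distances, covariance ratios, leave-one-out tau^2 and QE,
# hat values, weights, and DFBETAS
inf <- influence(res)
plot(inf)

# Mixed-effects model with a categorical moderator (e.g., publication status)
rma(yi, vi, mods = ~ published, data = dat)
```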
Moderator Analyses

Moderators were determined based on the authors' collective knowledge of the self-controlled learning literature. We coded for discrete differences in protocols between experiments to investigate whether differing methodologies resulted in different effect size estimates. Further, based on a meta-analysis reporting that the effect of choice on intrinsic motivation can be moderated by whether participants were compensated for completing the study (Patall et al., 2008), we also coded for compensation type. Finally, we investigated whether publication status was a moderator of the effect of self-control as part of our overall approach to examining the impact of publication bias on the self-controlled learning literature. The following six moderators were analyzed separately in mixed-effects models: a) Choice-type: choices were categorized as either instructionally-irrelevant, knowledge of results, knowledge of performance, concurrent feedback, amount of practice, use of assistive device, practice schedule, observational practice, or difficulty of practice; b) Experimental setting: experiments were categorized as either laboratory, applied, or laboratory-applied. We defined a laboratory setting as one where learners are asked to acquire a skill not typically performed in everyday life. We defined an applied setting as one where learners are asked to acquire a skill often performed outside of a laboratory. Finally, we defined a laboratory-applied setting as one where learners are asked to acquire a skill resembling skills often performed outside the laboratory but with researcher-contrived differences; c) Sub-population: the following subgroups were analyzed: adult (18-50 years of age), children/adolescents (under 18 years old), older adult (over 50 years old), and clinical (clinical population defined by the research article); d) Publication status: articles were classified as published or unpublished (e.g., theses); e) Compensation: whether participants were compensated for participating in the experiment was categorized as compensated, not compensated, or not stated; f) Retention delay-interval: coded as 24-hours, 48-hours, or >48-hours.

Table 1
Experiment characteristics and moderator coding.

Authors | Year | Setting | Compensation | Choice-type | Population | Retention | N | Published
Aiken et al. | 2012 | Applied | Not stated | Observation | Adult | 24-hr | 28 | Yes
Alami | 2013 | Lab | Yes | Feedback (KR) | Adult | 24-hr | 22 | No
Ali et al. | 2012 | Lab | Not stated | Feedback (KR) | Adult | 24-hr | 48 | Yes
Andrieux et al. | 2016 | Lab | Not stated | Task difficulty | Adult | 24-hr | 48 | Yes
Andrieux et al. | 2012 | Lab | Not stated | Task difficulty | Adult | 24-hr | 38 | Yes
Arsal | 2004, Expt 1 | Lab | Not stated | Feedback (KR) | Adult | 48-hr | 28 | No
Arsal | 2004, Expt 2 | Lab | Not stated | Feedback (KR) | Adult | 48-hr | 28 | No
Barros | 2010, Blocked | Lab | Not stated | Feedback (KR) | Adult | 24-hr | 48 | No
Barros | 2010, Random | Lab | Not stated | Feedback (KR) | Adult | 24-hr | 48 | No
Barros et al. | 2019, Expt 1 | Lab-Applied | No | Feedback (KR) | Adult | 24-hr | 60 | Yes
Barros et al. | 2019, Expt 2 | Lab | No | Feedback (KR) | Adult | 24-hr | 60 | Yes
Bass | 2015 | Lab | No | Feedback (KR) | Adult | 24-hr | 20 | No
Bass | 2018 | Applied | No | Feedback (KR) | Adult | 24-hr | 60 | No
Brydges et al. | 2009 | Applied | Not stated | Observation | Adult | >48-hr | 48 | Yes
Bund & Wiemeyer | 2004 | Lab-Applied | No | Observation | Adult | 24-hr | 52 | Yes
Carter & Patterson | 2012 | Lab | Not stated | Feedback (KR) | Adult | 24-hr | 20 | Yes
Carter & Patterson | 2012 | Lab | Not stated | Feedback (KR) | Older | 24-hr | 20 | Yes
Carter & Patterson | 2012 | Lab | Not stated | Feedback (KR) | Two | 24-hr | 40 | Yes
Chen et al. | 2002 | Lab | Yes | Feedback (KR) | Adult | 48-hr | 48 | Yes
Chiviacowsky | 2014 | Lab | Not stated | Feedback (KR) | Adult | 24-hr | 28 | Yes
Chiviacowsky & Lessa | 2017 | Lab | Not stated | Feedback (KR) | Older | 48-hr | 22 | Yes
Chiviacowsky & Wulf | 2002 | Lab | Not stated | Feedback (KR) | Adult | 24-hr | 30 | Yes
Chiviacowsky et al. | 2012 | Lab | Not stated | Feedback (KR) | Clinical | 24-hr | 30 | Yes
Chiviacowsky et al. | 2008 | Lab | Not stated | Feedback (KR) | Children | 24-hr | 26 | Yes
Chiviacowsky et al. | 2012 | Lab | Not stated | Assistive device | Clinical | 24-hr | 28 | Yes
Davis | 2009 | Applied | Not stated | Model | Adult | 24-hr | 24 | No
Fagundes et al. | 2013 | Lab-Applied | Not stated | Feedback (KR) | Adult | 48-hr | 52 | Yes
Fairbrother et al. | 2012 | Lab | Not stated | Feedback (KR) | Adult | 24-hr | 48 | Yes
Ferreira et al. | 2019 | Lab | Not stated | Feedback (KR) | Adult | 24-hr | 60 | Yes
Figueiredo et al. | 2018 | Lab | No | Feedback (KR) | Adult | 24-hr | 30 | Yes
Ghorbani | 2019, Expt 2 | Lab-Applied | Not stated | Feedback (KR) | Adult | 24-hr | 36 | Yes
Grand et al. | 2015 | Lab | No | Feedback (KR) | Adult | 24-hr | 36 | Yes
Grand et al. | 2017 | Lab | Yes | Incidental | Adult | >48-hr | 68 | Yes
Hansen et al. | 2011 | Lab | No | Feedback (KR) | Adult | 24-hr | 24 | Yes
Hartman | 2007 | Lab | Not stated | Assistive device | Adult | 24-hr | 18 | Yes
Hemayattalab et al. | 2013 | Lab | Not stated | Feedback (KR) | Clinical | 24-hr | 20 | Yes
Ho | 2016 | Lab | Not stated | Amount of practice | Adult | 24-hr | 120 | No
Holmberg | 2013 | Lab-Applied | No | Feedback (KP) | Adult | 24-hr | 24 | No
Huet et al. | 2009 | Lab-Applied | Not stated | Feedback (Concurrent) | Adult | 24-hr | 20 | Yes
Ikudome et al. | 2019, Expt 1 | Lab-Applied | No | Incidental | Adult | 24-hr | 40 | Yes
Ikudome et al. | 2019, Expt 2 | Lab-Applied | No | Observation | Adult | 24-hr | 40 | Yes
Jalalvan et al. | 2019 | Lab-Applied | Not stated | Task difficulty | Adult | 24-hr | 60 | Yes
Janelle et al. | 1997 | Lab-Applied | Yes | Feedback (KP) | Adult | >48-hr | 48 | Yes
Jones | 2010 | Lab | Yes | Repetition schedule | Adult | 24-hr | 40 | No
Kaefer et al. | 2014 | Lab | No | Feedback (KR) | Adult | 24-hr | 56 | Yes
Keetch & Lee | 2007 | Lab | Yes | Repetition schedule | Adult | 24-hr | 96 | Yes
Kim et al. | 2019 | Lab | Yes | Feedback (KR) | Adult | 24-hr | 42 | Yes
Leiker et al. | 2016 | Lab-Applied | Not stated | Task difficulty | Adult | >48-hr | 60 | Yes
Leiker et al. | 2019 | Lab | Not stated | Task difficulty | Adult | >48-hr | 60 | Yes
Lemos et al. | 2017 | Applied | No | Observation | Children | 24-hr | 24 | Yes
Lessa & Chiviacowsky | 2015 | Applied | Not stated | Amount of practice | Older | 48-hr | 36 | Yes
Lewthwaite et al. | 2015, Expt 1 | Lab-Applied | Not stated | Incidental | Adult | 24-hr | 24 | Yes
Lewthwaite et al. | 2015, Expt 2 | Lab | Not stated | Incidental | Adult | 24-hr | 30 | Yes
Lim et al. | 2015 | Applied | Not stated | Feedback (KP) | Adult | 24-hr | 24 | Yes
Marques & Correa | 2016 | Applied | Not stated | Feedback (KP) | Adult | 48-hr | 70 | Yes
Marques et al. | 2017 | Applied | Not stated | Feedback (KP) | Adult | 24-hr | 30 | Yes
Norouzi et al. | 2016 | Lab | Not stated | Feedback (KR) | Adult | 24-hr | 45 | Yes
Nunes et al. | 2019 | Lab-Applied | No | Feedback (KP) | Older | 24-hr | 40 | Yes
Ostrowski | 2015 | Lab | Not stated | Feedback (KR) | Adult | 24-hr | 80 | No
Patterson & Carter | 2010 | Lab | Yes | Feedback (KR) | Adult | 24-hr | 24 | Yes
Patterson & Lee | 2010 | Lab-Applied | Yes | Task difficulty | Adult | 48-hr | 48 | Yes
Patterson et al. | 2013 | Lab | Yes | Feedback (KR) | Adult | 24-hr | 48 | Yes
Patterson et al. | 2011 | Lab | Yes | Feedback (KR) | Adult | 24-hr | 60 | Yes
Post et al. | 2016 | Lab-Applied | No | Feedback (KP) | Adult | 24-hr | 44 | Yes
Post et al. | 2011 | Applied | No | Amount of practice | Adult | 24-hr | 24 | Yes
Post et al. | 2014 | Applied | Not stated | Amount of practice | Adult | 24-hr | 30 | Yes
Rydberg | 2011 | Applied | Not stated | Repetition schedule | Adult | 24-hr | 16 | No
Sanli & Patterson | 2013 | Lab | No | Repetition schedule | Adult | 24-hr | 24 | Yes
Sanli & Patterson | 2013 | Lab | No | Repetition schedule | Children | 24-hr | 24 | Yes
Ste-Marie et al. | 2013 | Applied | No | Feedback (KP) | Children | 24-hr | 60 | Yes
Tsai & Jwo | 2015 | Lab | Yes | Feedback (KR) | Adult | 24-hr | 36 | Yes
von Lindern | 2017 | Lab | Not stated | Feedback (KR) | Adult | 24-hr | 48 | No
Williams et al. | 2017 | Lab | Yes | Feedback (Concurrent) | Adult | 24-hr | 29 | Yes
Wu & Magill | 2011 | Lab | No | Repetition schedule | Adult | 24-hr | 30 | Yes
Wu | 2007, Expt 1 | Lab-Applied | Yes | Repetition schedule | Adult | 24-hr | 30 | No
Wulf & Adams | 2014 | Lab | No | Repetition schedule | Adult | 24-hr | 20 | Yes
Wulf & Toole | 1999 | Lab-Applied | Yes | Assistive device | Adult | 24-hr | 26 | Yes
Wulf et al. | 2015, Expt 1 | Lab-Applied | No | Repetition schedule | Adult | 24-hr | 68 | Yes
Wulf et al. | 2001 | Lab-Applied | Yes | Assistive device | Adult | 24-hr | 26 | Yes
Wulf et al. | 2018, Expt 1 | Lab-Applied | No | Incidental | Adult | 24-hr | 32 | Yes
Wulf et al. | 2018, Expt 2 | Lab-Applied | No | Incidental | Adult | 48-hr | 28 | Yes
Wulf et al. | 2018, Expt 2 | Lab-Applied | No | Observation | Adult | 48-hr | 28 | Yes
Wulf et al. | 2018, Expt 2 | Lab-Applied | No | Two | Adult | 48-hr | 42 | Yes
Wulf et al. | 2005 | Applied | No | Observation | Adult | >48-hr | 26 | Yes

Note. KR = Knowledge of results; KP = Knowledge of performance.

Adjusting for Selection Effects

Selection bias in the motor learning literature is likely caused by filtering based on the statistical significance of results (Lohse et al., 2016). To assess and adjust for selection effects, the R package weightr (Coburn & Vevea, 2017) was used to fit a Vevea-Hedges weight function model to the retention data (Vevea & Hedges, 1995). The weight-function model estimates the true average effect, heterogeneity, and the probability that a non-significant result survives censorship and is available for analysis. Selection effects are modelled by a step function that divides the effects into two bins at one-tailed p = .025, coinciding with a two-tailed p-value of .05. The probability of a non-significant effect surviving censorship to appear in the model is estimated relative to the probability of observing a study with a significant effect. The selection-adjusted model was compared to the naïve random effects model with a likelihood ratio test. Better fit from an adjusted model suggests selection bias in the literature.

The adjusted estimate from the weight-function model was pre-registered as the primary estimate of the true average effect in this meta-analysis. Please note that while the weight-function model attempts to estimate the true effect of self-controlled learning after correcting for selection biases, the estimated effect cannot be considered definitive. Nevertheless, the adjusted estimate is likely less biased than the naïve random effects estimate (E. C. Carter et al., 2019; Hong & Reed, 2021; Kvarven et al., 2020; Vevea & Hedges, 1995). The difference between the estimates can be informative about the potential impact of selection biases, with larger disparities between models suggesting greater selection effects.
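In code, this model reduces to a single weightr call; a minimal sketch, again assuming the hypothetical dat data frame with yi and vi columns:

```r
library(weightr)

# Vevea-Hedges weight-function model with one cutpoint at one-tailed p = .025
wf <- weightfunct(effect = dat$yi, v = dat$vi, steps = c(0.025, 1))
wf
```

Printing the fitted object reports the unadjusted and adjusted estimates side by side, the estimated relative probability that a non-significant result survives selection, and the likelihood ratio test comparing the two models.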
P-Curve Analysis

To investigate the evidential value of the self-controlled learning literature, the significant positive results at retention reported in peer-reviewed journals were submitted to a p-curve analysis (Simonsohn et al., 2015). To be included in the analysis, articles needed to meet the following criteria: a) be a published article; b) state explicitly that self-controlled learning was expected to be more effective than yoked practice; c) report inferential statistics comparing a self-control group and a yoked group directly on a retention test; d) conclude that the self-control group performed significantly better than the yoked group. If the article included multiple dependent measures showing a significant effect, the dependent measure priority list was used to select the highest priority measure. If only one measure was reported as significant, that effect was included even if the experiment included higher priority measures that were null. This resulted in a slightly different sample of effects from the random effects and weight-function models.

The distribution of significant p-values is a function of the power of the experiments included in the analysis. If a p-curve included only Type 1 errors, the expected distribution would be uniform. As the power of included experiments increases, so too does the amount of right skew in the p-curve, with smaller p-values appearing more frequently than large p-values. The p-curve analysis tests the null hypothesis that there is no evidentiary value by analyzing the amount of right skew in the distribution of p-values. Conversely, if researchers peek at their data and stop collecting when they reach statistical significance, a practice known as p-hacking, the distribution of significant p-values under the null would be left skewed, with p-values near .05 occurring more frequently. Varying mixtures of true effect sizes and intensities of p-hacking produce varying shapes of p-curve; therefore, the observed p-curve was compared to the distribution of p-values expected if the studies were conducted with 33% power. It is unlikely that researchers would continuously conduct experiments that fail >66% of the time whilst studying the self-controlled learning phenomenon. Observing a p-curve significantly "flatter" than what would be expected with 33% power would suggest a lack of evidential value among the significant results (Simonsohn et al., 2014a, 2014b).
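The primary p-curve was built from the disclosure form described above, but an analysis of this general kind can be sketched with the dmetar package (the same package used later for p-curve effect size estimation). The meta-analysis object below is a hypothetical stand-in for the extracted effects:

```r
library(meta)
library(dmetar)

# Pool the extracted effects (yi = Hedges' g, sei = standard error) with metagen,
# then submit the pooled object to dmetar's p-curve routine
m <- metagen(TE = yi, seTE = sei, data = dat, sm = "SMD")
pcurve(m)  # plots the observed p-curve and runs the evidential value tests
```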
Sensitivity Analyses

The primary analyses were followed up with several sensitivity analyses. Sensitivity analyses are used to evaluate the sensitivity of the results to the specific parameters chosen for the original analyses. The self-controlled learning literature, like many areas of behavioural research, was not produced exclusively by registered experiments with pre-specified analysis plans and 100% reporting frequency. The complexity of selection effects at various levels, including editorial decisions, author decisions, analysis decisions, and missing data, renders the accuracy of modeled effects impossible to estimate (E. C. Carter et al., 2019). Producing a range of estimates based on varying assumptions is intended to provide the reader with a broader picture of the uncertainty of the point estimates in the primary analyses.

Bias correction methods vary in their performance depending on the total amount of heterogeneity, the true average effect size, the amount of publication bias, and the intensity of p-hacking in the data (E. C. Carter et al., 2019). To determine which bias correction models perform well in the various plausible conditions for data in this meta-analysis, model performance checks were conducted using the Meta-Showdown Explorer shiny app (http://www.shinyapps.org/apps/metaExplorer/) developed by E. C. Carter and colleagues (2019). Simulated conditions were as follows: medium publication bias (significant results published at 100% frequency, non-significant results published at 20% frequency, wrong direction effects published at 5% frequency); medium questionable research practice (QRP) environment (for a detailed explanation of QRP environments see E. C. Carter et al., 2019); τ = 0, .2, .4; g = 0, .2, .5; and k = 60, with good performance defined as a maximum of .1 upward or downward bias and a maximum mean absolute error of .1, also tested with maximum bias and error values of .15. With good performance defined by a maximum bias in either direction of .1 and maximum absolute error of .1, the weight function model and, to a lesser extent, p-curve models provided coverage across all plausible conditions except the highest heterogeneity condition (τ = .4). With good performance defined as a maximum bias and error of .15, the precision-effect with standard error (PEESE) method provided good performance in all conditions. Therefore, sensitivity analyses were conducted on effect size data via p-curve and PEESE methods.

An additional sensitivity analysis of the estimated power among included studies was conducted with the z-curve (Bartoš & Schimmack, 2020; a minimal fitting sketch appears below). Z-curve, like p-curve, analyzes only statistically significant results and estimates the power of the included studies (called the expected replication rate, ERR). However, unlike p-curve, z-curve is robust to heterogeneity because it fits a finite mixture model of seven distributions, allowing the underlying true effects to vary. Further, z-curve also estimates the power of all studies that have been conducted (called the expected discovery rate, EDR), which can be compared to the observed discovery rate in order to test for the presence of publication bias.

Primary P-Curve

A leave-one-out analysis of p-curve results was conducted to assess the extent to which the primary results depended on the inclusion of one or two extreme results. Results that depend on the inclusion of one or two extreme results should not be considered robust.
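A minimal z-curve sketch, assuming a hypothetical vector of two-sided p-values from the focal self-control versus yoked comparisons:

```r
library(zcurve)

# Fit the finite mixture model to the p-values; only significant results
# contribute to the ERR, while the EDR describes all studies conducted
fit <- zcurve(p = dat$p_two_sided)
summary(fit)  # reports ERR and EDR with confidence intervals
```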
Most studies included in this study were not double-blind, largely due to the inherent difficulties in conducting a double-blind study of self-controlled motor learning. While the risk of bias associated with a lack of double blinding has been de- bated (see Howick, 2008), it is nonetheless notable that double-blinding was rare among the included studies. Outlier removal Two studies were flagged as significantly influential outliers by all nine influence metrics calculated during data screening: Lemos et al. (2017, g = 3.7), and Mar- ques et al. (2017, g = 3.95). No other effect sizes were identified as outliers by any metric. Both outliers were removed from all subsequent analyses. Naïve Random Effects Model The naïve random effects model estimated the av- erage treatment effect of self-controlled practice, g = .44 (k = 52, N = 2061, 95% CI [.31, .56]). However, there was significant variability in the average effect es- timated across experiments, Q(df = 51) = 103.45, p < .0001, τ = .31. It was estimated that 47.9% (I2) of the total variability in effect sizes across experiments was due to true heterogeneity in the underlying effects measured (see Figure 3). Moderator Analyses Six moderators selected for theoretical and/or methodological reasons were tested separately. Five moderators failed to account for a significant amount of heterogeneity: experimental setting (p = .46, R2 = 1%), compensation (p = .99, R2 = 0%), choice-type (p = .71, R2 = 0%), sub-population (p = .74, R2 = 0%), and retention interval (p = .54, R2 = 0%). One moderator, publication status, accounted for a statisti- cally significant amount of heterogeneity, p < .0001, R2 = 48%. Among published experiments, self-controlled practice had a strong benefit, g = .54, 95% CI [.28, .81]. However, among unpublished experiments, self- controlled practice had essentially no effect, g = .003, 95% CI [-.23, 24]. Selection Model The weight-function model combines an effect size model and a selection model (Hedges & Vevea, 1996). The effect size model is equivalent to the naïve random effects model, specifying what the distribution of effect sizes would be in the absence of publication bias or other selection effects. The selection model accounts for the probability a given study survives selection based on its p-value and specifies how the effect size distribution is modified by selection. A weight-function model with a p-value cutpoint of (one-tailed) .025 was fit to the re- tention effect size estimates (see Figure 4). The results of a likelihood ratio test suggest the adjusted model was a significantly better fit to the data than the unadjusted model, χ2(df = 1) = 21.18, p < .0001.4 The adjusted ef- fect size estimate was significantly different from zero, g = .107, p < .001, 95% CI [.05, .17]. According to the adjusted model, non-significant results were 6% as likely to survive selection as significant results. Note that the weightr function failed to estimate the random effects model and the results reported here are based on a fixed-effect estimate. P-Curve The purpose of the p-curve analysis was to investigate the evidential value in the published reports (N = 26) of statistically significant self-controlled learning bene- fits. Visual inspection of Figure 5 reveals a v-shaped distribution with the greatest frequency of p-values in the < .05 bin. The observed p-curve was significantly flatter than would be expected if the experiments had 33% power, p = .0035, indicating an absence of evi- dential value. 
Conversely, the half p-curve (Simonsohn et al., 2015) was significantly right skewed, suggesting the presence of evidential value. Sensitivity analysis, however, revealed that the half curve does not remain significantly right skewed following removal of the most extreme p-value from the sample. The estimated power of the included studies was 5%, 95% CI [5%, 17%].

Figure 5. P-curve analysis of published experiments that were statistically significant at retention. If the included experiments are studying a true null hypothesis, the expected distribution of p-values is uniform, represented by the dotted line. If the experiments are studying a true effect, the expected distribution becomes increasingly right skewed as a function of statistical power. The expected right skewed distribution associated with 33% power is plotted by the dashed line. The observed p-curve is plotted by the solid line and was substantially flatter than the 33% power distribution. The half p-curve analysis included p-values below p = .025 and was significantly right skewed. The right skew did not survive deletion of the most extreme value.

Interim Discussion

The primary results described above suggest that selection effects have caused a seriously distorted record of self-controlled learning. Estimated benefits are less than one third of the naïve estimate, g = .107, 95% CI [.05, .17]. The p-curve analysis failed to detect robust evidence of a self-controlled learning effect. The performance of the weight-function model depends on the specific conditions present in the meta-analysis, although these conditions are unknowable (E. C. Carter et al., 2019). It was necessary to conduct sensitivity analyses with additional bias correction methods to assess the reliability of the selection-adjusted weight-function model estimate. Based on performance checks conducted under a range of plausible conditions, it was determined that sensitivity analyses conducted with a PEESE meta-regression and p-curve effect size estimation would provide good performance coverage across most plausible conditions.
Sensitivity Analyses

Precision-Effect with Standard Error (PEESE) Model

When publication bias is present in a body of evidence, sample size and effect size can be negatively correlated (Stanley & Doucouliagos, 2014). The PEESE model fits a quadratic relationship between effect size and standard error to reflect the intuition that publication bias is stronger for low precision studies than for high precision studies. The rationale is that low precision studies need to overestimate effects to achieve significance and get published, while high precision studies can publish without exaggerated effects, creating greater publication bias among lower precision studies (E. C. Carter et al., 2019; Stanley & Doucouliagos, 2014). A weighted-least-squares regression model was fit with effect size regressed on the square of the standard error, weighted by the inverse of the variance:

$g_i = b_0 + b_1 \, se_i^2 + e_i$ (1)
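Equation 1 corresponds to a short weighted least-squares fit in R; a sketch using the same hypothetical dat, yi, sei, and vi names as above:

```r
# PEESE: regress g on the squared standard error, weighting by inverse variance;
# the intercept b0 is the publication-bias-adjusted estimate
peese <- lm(yi ~ I(sei^2), data = dat, weights = 1 / vi)
coef(summary(peese))["(Intercept)", ]
```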
The PEESE method estimated a non-significant benefit of self-controlled learning after controlling for publication bias, g = .054, 95% CI [-.18, .29], p = .659.

P-Curve Effect Estimation

A p-curve model was fit to the overall retention effect size data, unlike the first primary p-curve, which was fit to the reported significant results. The p-curve is a function of sample size and effect size, and because sample size is known, the effect size that provides the best fit to the observed p-curve can be estimated (Simonsohn et al., 2014a). A p-curve analysis conducted with the R package dmetar (Harrer et al., 2019) was used to estimate the average effect size among the statistically significant effects in the meta-analysis. The model estimated an average effect of g = .035.5 The estimated power of included studies was 7%, 95% CI [5%, 22%]. Unfortunately, p-curve does not perform well in the presence of heterogeneity and these results should be interpreted cautiously.

5 The p-curve of effect sizes was significantly flatter than the expected 33% power curve as well, p = .009.

Z-Curve

A z-curve was fit to the overall retention data and estimated the power of statistically significant studies (ERR) as 12%, 95% CI [3%, 34%]. The power of all studies conducted (EDR) was estimated as 6%, 95% CI [5%, 13%]. The 95% confidence intervals for both the ERR and EDR failed to include the observed discovery rate of 48%, suggesting significant publication bias in the data.

Acquisition and Transfer

In light of the evidence that experiments are apparently selected for positive self-controlled learning effects at retention, pre-planned exploratory estimates of the effect of self-controlled practice on acquisition and transfer performance can no longer be considered reliable. However, given that some have argued that transfer tests are more sensitive measures of motor learning than delayed retention tests (Chiviacowsky & Wulf, 2002; Fairbrother et al., 2012), the transfer test data were analyzed via both naïve random effects and weight function models. The naïve estimate at transfer was g = .52, while the bias corrected estimate was g = .17, p = .24.
Acquisition and Transfer

In light of the evidence that experiments are apparently selected for positive self-controlled learning effects at retention, pre-planned exploratory estimates of the effect of self-controlled practice on acquisition and transfer performance can no longer be considered reliable. However, given that some have argued that transfer tests are more sensitive measures of motor learning than delayed retention tests (Chiviacowsky & Wulf, 2002; Fairbrother et al., 2012), the transfer test data were analyzed via both naïve random effects and weight-function models. The naïve estimate at transfer was g = .52, while the bias-corrected estimate was g = .17, p = .24. As with delayed retention, the selection model provided a better fit to the transfer data than the naïve model, p = .008. The primary takeaway from these analyses is that the reported self-controlled learning effects to date are unreliable.

Discussion

The primary objective of this meta-analysis was to assess the effect of providing choices during the acquisition of a motor skill on delayed retention performance in the general population. A secondary objective was to test between motivational and informational explanations for self-controlled learning benefits by investigating whether choice-type moderates the effect of choice. To this aim, an extensive search for experiments that compared self-controlled practice to a yoked comparison group was conducted. Effect size and moderator data were ascertained from data reported in the research articles or, in some cases, received directly from the authors of the studies. Efforts were taken to ensure that each effect size calculation and moderator code could be reproduced by an independent party. In parallel, the results of published experiments that achieved a hypothesized statistically significant result in favour of self-control were extracted directly from the articles and outlined in a p-curve disclosure form (see Appendix A). Pre-registered primary analyses were applied to the data and results were followed up with a suite of sensitivity analyses.

The naïve random effects model estimated a benefit from self-controlled practice of g = .44. However, the naïve model fails to account for selection effects, such as publication bias and p-hacking, and as such overestimates the true average effect when these selection effects are present (E. C. Carter et al., 2019; Hedges & Vevea, 1996; Stanley & Doucouliagos, 2014). Publication status was a significant moderator of the self-controlled practice effect, accounting for 48% of the total heterogeneity in the model. Published experiments reported an average benefit of g = .54 while unpublished experiments reported no benefit at all on average. It is possible that researchers use statistical significance, typically defined as p < .05 on a two-tailed test, to filter their results for publication. To account for potential selection effects driven by statistical significance, a weight-function model was fit to the retention test effect size data with a one-tailed p-value cutpoint of .025 included in the model (Vevea & Hedges, 1995). The adjusted model provided a significantly better fit to the data than the naïve random effects model. The model estimated the selection-adjusted benefit of self-controlled learning as g = .11, a dramatic departure from the naïve estimate of g = .44.
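For readers who want to see the mechanics of this adjustment, the weightr package listed in the R packages section fits the Vevea-Hedges selection model directly. The sketch below uses placeholder effect sizes and variances, not the meta-analytic dataset.

```r
# Vevea-Hedges weight-function sketch with weightr. The steps argument
# places a single selection boundary at a one-tailed p-value of .025;
# weightfunct() prints the unadjusted and selection-adjusted models and
# a likelihood-ratio test comparing their fit. Toy values only.
library(weightr)

g <- c(0.76, 0.63, -0.14, 0.27, 0.88, 0.08, 0.45, -0.18)
v <- c(0.11, 0.06, 0.06, 0.13, 0.14, 0.02, 0.05, 0.03)  # sampling variances

weightfunct(effect = g, v = v, steps = c(0.025, 1))
```

The single cutpoint at .025 encodes the assumption that studies crossing the conventional two-tailed .05 threshold in the predicted direction are more likely to be published than those that do not.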
Two additional bias correction techniques were conducted to assess the sensitivity of this result to changes in correction methodology. The PEESE method estimated the effect at g = .05, while p-curve estimated g = .04, and neither analysis was able to rule out the null hypothesis.

In parallel to the meta-analysis described above, a p-curve was conducted on the reported significant results. The p-curve used somewhat different inclusion criteria, focusing only on published, statistically significant results suggesting a self-controlled learning benefit. In addition, the p-curve included results reported for any dependent measure in an article, even if the focal measure (of this meta-analysis) was reported as non-significant. Therefore, the p-curve was more inclusive of evidence reported by authors as favouring a self-controlled benefit while ignoring experiments with null effects. The results revealed both significant right skew below p = .025 (two-tailed) and a p-curve that was significantly flatter than a distribution with an expected power of 33%. The evidence of right skew, indicating superiority of self-control relative to yoked conditions, was tenuous and did not survive the deletion of the most extreme result: an experiment that reported a benefit from self-control of g = 2.16 (Wulf & Adams, 2014). The overall p-curve produced an estimate that the true power of the included experiments was 5%, leading to a rejection of the hypothesis that the experiments contained evidential value.

It appears from these analyses that the substantial self-controlled learning literature is, as of now, insufficient to provide evidence that self-controlled practice is more effective than yoked practice. The bias correction techniques applied in this analysis are sensitive to unknown conditions, such as the true average effect size and the amount of true heterogeneity, although efforts were taken to provide coverage across most plausible conditions. The corrected estimates produced by the weight-function model, p-curve, and PEESE methods appeared to converge on trivially small effects. Further, the p-curve of significant results suggested a lack of evidential value. Based on the model performance parameters we tested (E. C. Carter et al., 2019), which allowed up to .15 units of maximum bias or mean absolute error as acceptable performance, our results are consistent with a self-controlled learning benefit ranging from g = -.11 to .26, with a plausible upper 95% confidence limit of g = .33. Thus, this analysis does not rule out the possibility that self-controlled practice provides meaningful motor learning benefits on average. The present literature, however, appears insufficient to establish that a self-control benefit indeed exists.

Turning to the current theoretical debates surrounding the motivational and informational underpinnings of self-controlled learning, these debates now seem moot, or at least premature. The effectiveness of self-control was not moderated by choice-type, suggesting that self-controlled practice may be ineffective regardless of the nature of the choices provided. Indeed, the only factor we tested that moderated the effect of self-controlled practice was publication status.

Future Studies

Given that the current meta-analysis failed to support the widely touted assertion of a substantial self-controlled learning benefit (Sanli et al., 2013; Ste-Marie et al., 2019; Wulf & Lewthwaite, 2016), consideration needs to be given to the design and research practices of future studies. Registered reports provide one possible path forward (Caldwell et al., 2020). A registered report involves submitting a research proposal to a two-phase peer review. The first phase of the review occurs prior to data collection and is assessed based on the proposed methodology, rationale, and potential contribution. If accepted in principle, researchers commit to carrying out the registered experiment and submitting the results in a final article for the second phase of peer review. The final article is peer-reviewed for quality and adherence to the registered plan, but accept-reject decisions at this point are not based on the results.
In theory, this practice should eliminate p-hacking and, for literatures composed entirely of registered reports, publication bias. A number of motor behaviour and/or kinesiology journals have begun adopting registered reports as an option for authors, including Human Movement Science, Frontiers in Movement Science and Sport Psychology, Journal of Sport and Exercise Psychology, Journal of Sports Sciences, and Reports in Sport and Exercise (formerly Registered Reports in Kinesiology).

While registered reports are a potentially fruitful process to begin the accumulation of evidence regarding self-controlled learning, there are practical issues with investigating self-controlled learning that motor learning researchers may find overly burdensome. For example, to have 80% power to detect an effect of g = .26 with a two-cell experimental design, 506 participants are required. If the weight-function adjusted estimate of g = .11 is accurate, N = 2600 is required. More challenging still would be testing between hypothesized motivational and informational mechanisms. For example, if a 2 (choice) X 2 (choice-relevance) experiment were conducted to test whether the instructional relevance of choice fully attenuates its effect, four times as many participants would be required to maintain the same degree of power (Simonsohn, 2015). In contrast, the median sample size among experiments included in this meta-analysis was N = 36, which is typical of motor learning experiments in general (Lohse et al., 2016).
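These sample size figures follow from standard two-sample power arithmetic, sketched below with the pwr package (an assumption; the text does not state which tool was used). Exact totals depend on details such as corrections applied to Hedges' g, so the output approximates rather than reproduces the figures above.

```r
# Approximate per-group n for 80% power at a two-tailed alpha of .05 in
# an independent-groups design; pwr.t.test() returns n per group, so
# the total N is roughly twice the reported n.
library(pwr)

pwr.t.test(d = 0.26, power = 0.80, sig.level = 0.05,
           type = "two.sample")  # ~233 per group, total N ~ 466

pwr.t.test(d = 0.11, power = 0.80, sig.level = 0.05,
           type = "two.sample")  # ~1298 per group, total N ~ 2600
```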
In addition to challenges with establishing that an effect exists, additional challenges will emerge if researchers are interested in generalizing the benefits of self-controlled practice beyond comparisons to a yoked group, as has been the case thus far (Ste-Marie et al., 2019; Wulf & Lewthwaite, 2016). Yoking may allow for inferences to be made about the act of making certain choices, but it may not provide an adequate control group for evaluating best practices in an applied setting (e.g., J. A. C. Barros et al., 2019; Ste-Marie et al., 2019; Yantha et al., 2022). Indeed, given that our estimate suggests the advantage of self-controlled over yoked practice is small, if it exists at all, it seems unlikely that self-control would be more effective than instructor-guided practice. An instructor-guided group could easily be argued to have advantages over a yoked group because of the instructor's ability to adapt choices to the current practice context and to make use of personal experience and expertise. Following this logic, experiments investigating the benefit of self-controlled over instructor-guided practice could conceivably require substantially larger samples than experiments that use yoked comparison groups.

Exploratory Analysis of Pre-Registered Experiments

There have been, to our knowledge, four pre-registered experiments that have compared self-controlled and yoked practice (Grand et al., 2017; McKay & Ste-Marie, 2022; St. Germain et al., 2022; Yantha et al., 2022). Three of these experiments failed to meet our inclusion criteria because they were not published or part of an accepted thesis at the time of the analysis (McKay & Ste-Marie, 2022; St. Germain et al., 2022; Yantha et al., 2022). These pre-registered experiments should provide estimates of the self-control effect unbiased by selection effects and are therefore more useful for estimating the real average effect than attempting to correct biased experiments after the fact (E. C. Carter et al., 2019). A random effects model was used to estimate the average effect of self-control in the four experiments and yielded g = .02, 95% CI [-.17, .21]. These results converge with the bias-corrected estimates around trivially small differences between self-controlled and yoked practice conditions.
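With only four effects, this exploratory model can be expressed in a few lines. The sketch below shows its general form with metafor; the effect sizes and variances are placeholders, not the four studies' actual estimates.

```r
# Random-effects model pooling four pre-registered experiments.
# yi and vi are hypothetical Hedges' g values and sampling variances.
library(metafor)

prereg <- data.frame(
  yi = c(0.10, -0.05, 0.02, 0.01),  # placeholder effect sizes
  vi = c(0.02, 0.01, 0.01, 0.02)    # placeholder sampling variances
)

rma(yi = yi, vi = vi, data = prereg, method = "REML")
```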
Conclusions

We set out to assess the effect of self-controlled practice on motor learning. The published literature on the subject to date appeared unambiguously supportive of a self-control benefit, yet the results of this meta-analysis suggest this may not be the case. If authors, reviewers, and editors select for statistical significance when deciding if experiments get published, the published literature becomes biased (Ioannidis, 2005). Worse still, filtering based on statistical significance may well incentivize researchers to leverage researcher degrees of freedom to achieve a significant result, a practice known as p-hacking, further biasing the literature (Wicherts et al., 2016). An instructive example of the potential impact of selection effects comes from research studying the so-called ego-depletion effect (Baumeister et al., 2007; Hagger et al., 2010). In a typical study, participants are asked to engage in activities that supposedly drain a limited reservoir of willpower, termed ego-depletion, and are subsequently measured on a dependent measure requiring an additional exertion of self-control, such as a Stroop task. The typical finding is that performance suffers on the second task if ego-depletion occurs beforehand. A meta-analysis by Hagger et al. (2010) reported the average effect of ego-depleting interventions on willpower dependent measures was d = .62. There was apparent consensus in the field that willpower relied on a limited resource due to the ostensibly unambiguous evidence in support of the theory (Baumeister & Vohs, 2016). Nevertheless, when bias correction methods were applied in a meta-analysis of the ego-depletion literature, the adjusted estimates often did not differ significantly from zero (E. C. Carter et al., 2015). Subsequently, a pre-registered, multi-lab replication project tested a sample of N = 2141 and reported that the ego-depletion effect was close to zero (Hagger et al., 2016). Thus, a prominent psychological construct substantiated by a large corpus of peer-reviewed evidence was investigated using cutting-edge meta-analytic techniques that corrected for selection bias, and the result was a trivially small estimated effect, an estimate supported by a subsequent large-scale pre-registered replication effort. Notably, both the bias-corrected meta-analysis and the subsequent multi-lab replication efforts have been criticized by ego-depletion theorists (Baumeister & Vohs, 2016; Cunningham & Baumeister, 2016). Others have sharply challenged these critiques (Schimmack, 2020), and while debate continues among social psychologists about the underlying theory at stake (e.g., Dang, 2018), there is consensus that several methods shown to produce positive results in the past are unlikely to replicate in future experiments.

In stark parallel to the ego-depletion literature, the findings of the current research suggest the self-controlled motor learning literature may be similarly biased. As motor learning researchers consider the path forward for self-controlled learning, non-bias related limitations of the extant literature should be addressed. For example, yoked groups fail to isolate putative motivational and informational processes when self-controlling learners make choices pertinent to acquiring a skill (M. J. Carter et al., 2016; M. J. Carter & Ste-Marie, 2017b; Lewthwaite et al., 2015). Further, exclusive reliance on yoked comparison groups limits the generalizability of self-controlled learning to applied settings where the alternative to self-control is typically coach or instructor control (i.e., those with domain-specific knowledge). As motor learning researchers in this area move forward, they are faced with the question of whether this effect is worth the resources required to study it. If that answer is yes, then in addition to being pre-registered and adequately powered, future self-controlled learning experiments should provide insight about either the underlying processes at work or the real-world usefulness of this practice variable.

Author Contact

Corresponding author: Brad McKay, Department of Kinesiology, McMaster University, 1280 Main St W, Hamilton, ON, Canada, L8S 4K1. Permanent e-mail: bradmckay8@gmail.com. Institution e-mail: mckayb9@mcmaster.ca.

Brad McKay 0000-0002-7408-2323
Zachary D. Yantha 0000-0003-1851-7609
Julia Hussien 0000-0001-7434-228X
Michael J. Carter 0000-0002-0675-4271
Diane M. Ste-Marie 0000-0002-4574-9539

Acknowledgements

All authors thank Heather Smith for her help with data extraction.

Conflict of Interest and Funding

The authors declare no conflicts of interest. BM was supported by a Social Sciences and Humanities Research Council of Canada (SSHRC) Canada Graduate Scholarship - Doctoral. MJC was supported by a Natural Sciences and Engineering Research Council (NSERC) of Canada Discovery Grant (RGPIN-2018-05589).

R packages used in this project

We used R (Version 4.0.4; R Core Team, 2021) and the R packages compute.es (Re, 2013), dmetar (Version 0.0.9000; Harrer et al., 2019), kableExtra (Version 1.3.4; Zhu, 2021), meta (Version 4.18.0; Balduzzi et al., 2019), metafor (Version 3.0.2; Viechtbauer, 2010), papaja (Version 0.1.0.9997; Aust & Barth, 2020), RColorBrewer (Neuwirth, 2014), robvis (Version 0.3.0; McGuinness, 2019), tidyverse (Version 1.3.0; Wickham et al., 2019), and weightr (Version 2.0.2; Coburn & Vevea, 2019).

Author Contributions (CRediT Taxonomy)

Conceptualization: BM, ZDY, JH, MJC, DSM
Data curation: BM, MJC
Formal analysis: BM
Funding acquisition: BM, MJC
Investigation: BM, ZDY, JH, MJC
Methodology: BM
Project administration: BM
Software: BM, MJC
Supervision: BM, DSM
Validation: BM, MJC
Visualization: BM, MJC
Writing - original draft: BM, ZDY, JH, MJC, DSM
Writing - review & editing: BM, ZDY, JH, MJC, DSM

Open Science Practices

This article earned the Preregistration+, Open Data, and Open Materials badges for preregistering the hypothesis and analysis before data collection, and for making the data and materials openly available. It has been verified that the analysis reproduced the results presented in the article. The entire editorial process, including the open reviews, is published in the online supplement.

References

References marked with an asterisk (*) indicate studies included in the meta-analysis.

Abdollahipour, R., Palomo Nieto, M., Psotta, R., & Wulf, G. (2017). External focus of attention and autonomy support have additive benefits for motor performance in children. Psychology of Sport and Exercise, 32, 17–24.
*Aiken, C. A., Fairbrother, J. T., & Post, P. G. (2012). The effects of self-controlled video feedback on the learning of the basketball set shot. Frontiers in Psychology, 3, 338.
*Alami, A. (2013). An examination of feedback request strategies when learning a multi-dimensional motor task under self-controlled and yoked conditions (Doctoral dissertation). University of Tennessee, Knoxville.
*Ali, A., Fawver, B., Kim, J., Fairbrother, J., & Janelle, C. M. (2012). Too much of a good thing: Random practice scheduling and self-control of feedback lead to unique but not additive learning benefits. Frontiers in Psychology, 3, 503.
*Andrieux, M., Boutin, A., & Thon, B. (2016). Self-control of task difficulty during early practice promotes motor skill learning. Journal of Motor Behavior, 48(1), 57–65.
*Andrieux, M., Danna, J., & Thon, B. (2012). Self-control of task difficulty during training enhances motor learning of a complex coincidence-anticipation task. Research Quarterly for Exercise and Sport, 83(1), 27–35.
*Arsal, G. (2004). Effects of external and self-controlled feedback schedule on retention of anticipation timing and ball throwing task (Master's thesis). Middle East Technical University.
Aust, F., & Barth, M. (2020). papaja: Create APA manuscripts with R Markdown [R package version 0.1.0.9997]. https://github.com/crsh/papaja
Balduzzi, S., Rücker, G., & Schwarzer, G. (2019). How to perform a meta-analysis with R: A practical tutorial. Evidence-Based Mental Health, 22, 153–160. https://doi.org/10.1136/ebmental-2019-300117
*Barros, J. A. C., Yantha, Z. D., Carter, M. J., Hussien, J., & Ste-Marie, D. M. (2019). Examining the impact of error estimation on the effects of self-controlled feedback. Human Movement Science, 63, 182–198.
*Barros, J. A. (2010). The effects of practice schedule and self-controlled feedback manipulations on the acquisition and retention of motor skills (Doctoral dissertation). University of Tennessee, Knoxville.
Bartoš, F., & Schimmack, U. (2020). Z-curve 2.0: Estimating replication rates and discovery rates.
*Bass, A. D. (2015). An experiment to chronometrically examine the effects of self-controlled feedback on the performance and learning of a sequential timing task (Master's thesis). University of Tennessee, Knoxville.
*Bass, A. D. (2018). The effect of observation on motor learning in a self-controlled feedback protocol (Doctoral dissertation). University of Tennessee, Knoxville.
Baumeister, R. F., & Vohs, K. D. (2016). Strength model of self-regulation as limited resource: Assessment, controversies, update (Chapter 2). In J. M. Olson & M. P. Zanna (Eds.), Advances in experimental social psychology (pp. 67–127). Academic Press.
Baumeister, R. F., Vohs, K. D., & Tice, D. M. (2007). The strength model of self-control. Current Directions in Psychological Science, 16(6), 351–355.
*Brydges, R., Carnahan, H., Safir, O., & Dubrowski, A. (2009). How effective is self-guided learning of clinical technical skills? It's all about process. Medical Education, 43(6), 507–515.
*Bund, A., & Wiemeyer, J. (2004). Self-controlled learning of a complex motor skill: Effects of the learner's preferences on performance and self-efficacy. Journal of Human Movement Studies, 47, 215–236.
Caldwell, A. R., Vigotsky, A. D., Tenan, M. S., Radel, R., Mellor, D. T., Kreutzer, A., Lahart, I. M., Mills, J. P., Boisgontier, M. P., & Consortium for Transparency in Exercise Science (COTES) Collaborators. (2020). Moving sport and exercise science forward: A call for the adoption of more transparent research practices. Sports Medicine, 50(3), 449–459.
Carter, E. C., Kofler, L. M., Forster, D. E., & McCullough, M. E. (2015). A series of meta-analytic tests of the depletion effect: Self-control does not seem to rely on a limited resource. Journal of Experimental Psychology: General, 144(4), 796–815.
Carter, E. C., Schönbrodt, F. D., Gervais, W. M., & Hilgard, J. (2019). Correcting for bias in psychology: A comparison of meta-analytic methods. Advances in Methods and Practices in Psychological Science, 2(2), 115–144.
Carter, M. J., Carlsen, A. N., & Ste-Marie, D. M. (2014). Self-controlled feedback is effective if it is based on the learner's performance: A replication and extension of Chiviacowsky and Wulf (2005). Frontiers in Psychology, 5, 1325.
*Carter, M. J., & Patterson, J. T. (2012). Self-controlled knowledge of results: Age-related differences in motor learning, strategies, and error detection. Human Movement Science, 31(6), 1459–1472.
Carter, M. J., Rathwell, S., & Ste-Marie, D. (2016). Motor skill retention is modulated by strategy choice during self-controlled knowledge of results schedules. Journal of Motor Learning and Development, 4(1), 100–115.
Carter, M. J., & Ste-Marie, D. M. (2017a). An interpolated activity during the knowledge-of-results delay interval eliminates the learning advantages of self-controlled feedback schedules. Psychological Research, 81(2), 399–406.
Carter, M. J., & Ste-Marie, D. M. (2017b). Not all choices are created equal: Task-relevant choices enhance motor learning compared to task-irrelevant choices. Psychonomic Bulletin & Review, 24(6), 1879–1888.
*Chen, D. D., Hendrick, J. L., & Lidor, R. (2002). Enhancing self-controlled learning environments: The use of self-regulated feedback information. Journal of Human Movement Studies, 43(1), 69.
*Chiviacowsky, S. (2014). Self-controlled practice: Autonomy protects perceptions of competence and enhances motor learning. Psychology of Sport and Exercise, 15(5), 505–510.
*Chiviacowsky, S., & Lessa, H. T. (2017). Choices over feedback enhance motor learning in older adults. Journal of Motor Learning and Development, 5(2), 304–318.
*Chiviacowsky, S., & Wulf, G. (2002). Self-controlled feedback: Does it enhance learning because performers get feedback when they need it? Research Quarterly for Exercise and Sport, 73(4), 408–415.
Chiviacowsky, S., & Wulf, G. (2005). Self-controlled feedback is effective if it is based on the learner's performance. Research Quarterly for Exercise and Sport, 76(1), 42–48.
*Chiviacowsky, S., Wulf, G., de Medeiros, F. L., Kaefer, A., & Tani, G. (2008). Learning benefits of self-controlled knowledge of results in 10-year-old children. Research Quarterly for Exercise and Sport, 79(3), 405–410.
Chiviacowsky, S., Wulf, G., & Lewthwaite, R. (2012). Self-controlled learning: The importance of protecting perceptions of competence. Frontiers in Psychology, 3, 458.
*Chiviacowsky, S., Wulf, G., Lewthwaite, R., & Campos, T. (2012). Motor learning benefits of self-controlled practice in persons with Parkinson's disease. Gait & Posture, 35(4), 601–605.
*Chiviacowsky, S., Wulf, G., Machado, C., & Rydberg, N. (2012). Self-controlled feedback enhances learning in adults with Down syndrome. Revista Brasileira de Fisioterapia, 16(3), 191–196.
Chua, L.-K., Wulf, G., & Lewthwaite, R. (2018). Onward and upward: Optimizing motor performance. Human Movement Science, 60, 107–114.
Coburn, K. M., & Vevea, J. L. (2017). weightr: Estimating weight-function models for publication bias [R package version 1.2].
Coburn, K. M., & Vevea, J. L. (2019). weightr: Estimating weight-function models for publication bias [R package version 2.0.2]. https://CRAN.R-project.org/package=weightr
Couvillion, K. F., Bass, A. D., & Fairbrother, J. T. (2020). Increased cognitive load during acquisition of a continuous task eliminates the learning effects of self-controlled knowledge of results. Journal of Sports Sciences, 38(1), 94–99.
Cunningham, M. R., & Baumeister, R. F. (2016). How to make nothing out of something: Analyses of the impact of study sampling and statistical interpretation in misleading meta-analytic conclusions. Frontiers in Psychology, 7, 1639.
Dang, J. (2018). An updated meta-analysis of the ego depletion effect. Psychological Research, 82(4), 645–651.
*Davis, J. (2009). Effects of self-controlled feedback on the squat (Master's thesis). State University of New York College at Cortland.
*Fagundes, J., Chen, D. D., & Laguna, P. (2013). Self-control and frequency of model presentation: Effects on learning a ballet passé relevé. Human Movement Science, 32(4), 847–856.
*Fairbrother, J. T., Laughlin, D. D., & Nguyen, T. V. (2012). Self-controlled feedback facilitates motor learning in both high and low activity individuals. Frontiers in Psychology, 3, 323.
*Ferreira, B. P., Malloy-Diniz, L. F., Parma, J. O., Nogueira, N. G. H. M., Apolinário-Souza, T., Ugrinowitsch, H., & Lage, G. M. (2019). Self-controlled feedback and learner impulsivity in sequential motor learning. Perceptual Motor Skills, 126(1), 157–179.
*Figueiredo, L. S., Ugrinowitsch, H., Freire, A. B., Shea, J. B., & Benda, R. N. (2018). External control of knowledge of results: Learner involvement enhances motor skill transfer. Perceptual Motor Skills, 125(2), 400–416.
*Ghorbani, S. (2019). Motivational effects of enhancing expectancies and autonomy for motor learning: An examination of the OPTIMAL theory. Journal of General Psychology, 146(1), 79–92.
*Grand, K. F., Bruzi, A. T., Dyke, F. B., Godwin, M. M., Leiker, A. M., Thompson, A. G., Buchanan, T. L., & Miller, M. W. (2015). Why self-controlled feedback enhances motor learning: Answers from electroencephalography and indices of motivation. Human Movement Science, 43, 23–32.
*Grand, K. F., Daou, M., Lohse, K. R., & Miller, M. W. (2017). Investigating the mechanisms underlying the effects of an incidental choice on motor learning. Journal of Motor Learning and Development, 5(2), 207–226.
Hagger, M. S., Chatzisarantis, N. L. D., Alberts, H., Anggono, C. O., Batailler, C., Birt, A. R., Brand, R., Brandt, M. J., Brewer, G., Bruyneel, S., Calvillo, D. P., Campbell, W. K., Cannon, P. R., Carlucci, M., Carruth, N. P., Cheung, T., Crowell, A., De Ridder, D. T. D., Dewitte, S., . . . Zwienenberg, M. (2016). A multilab preregistered replication of the ego-depletion effect. Perspectives on Psychological Science, 11(4), 546–573.
Hagger, M. S., Wood, C., Stiff, C., & Chatzisarantis, N. L. D. (2010). Ego depletion and the strength model of self-control: A meta-analysis. Psychological Bulletin, 136(4), 495–525.
Halperin, I., Chapman, D. W., Martin, D. T., Lewthwaite, R., & Wulf, G. (2017). Choices enhance punching performance of competitive kickboxers. Psychological Research, 81(5), 1051–1058.
Halperin, I., Wulf, G., Vigotsky, A. D., Schoenfeld, B. J., & Behm, D. G. (2018). Autonomy: A missing ingredient of a successful program? Strength & Conditioning Journal, 40(4), 18.
Hancock, G. R., Butler, M. S., & Fischman, M. G. (1995). On the problem of two-dimensional error scores: Measures and analyses of accuracy, bias, and consistency. Journal of Motor Behavior, 27(3), 241–250.
*Hansen, S., Pfeiffer, J., & Patterson, J. T. (2011). Self-control of feedback during motor learning: Accounting for the absolute amount of feedback using a yoked group with self-control over feedback. Journal of Motor Behavior, 43(2), 113–119.
Harrer, M., Cuijpers, P., Furukawa, T., & Ebert, D. D. (2019). dmetar: Companion R package for the guide 'Doing Meta-Analysis in R' [R package version 0.0.9000]. http://dmetar.protectlab.org/
*Hartman, J. M. (2007). Self-controlled use of a perceived physical assistance device during a balancing task. Perceptual Motor Skills, 104(3 Pt 1), 1005–1016.
Hedges, L. V., & Vevea, J. L. (1996). Estimating effect size under publication bias: Small sample properties and robustness of a random effects selection model. Journal of Educational and Behavioral Statistics, 21(4), 299–332.
*Hemayattalab, R., Arabameri, E., Pourazar, M., Ardakani, M. D., & Kashefi, M. (2013). Effects of self-controlled feedback on learning of a throwing task in children with spastic hemiplegic cerebral palsy. Research in Developmental Disabilities, 34(9), 2884–2889.
Higgins, J. P., & Green, S. (Eds.). (2011). Cochrane handbook for systematic reviews of interventions (Vol. 4). John Wiley & Sons.
Higgins, J. P., Altman, D. G., Gøtzsche, P. C., Jüni, P., Moher, D., Oxman, A. D., Savović, J., Schulz, K. F., Weeks, L., & Sterne, J. A. (2011). The Cochrane Collaboration's tool for assessing risk of bias in randomised trials. BMJ, 343.
*Ho, R. L. M. (2016). Self-controlled learning and differential goals: Does "too easy" and "too difficult" affect the self-control paradigm? (Doctoral dissertation). California State University, Long Beach.
*Holmberg, B. A. (2013). The "when" and the "what": Effects of self-control of feedback about multiple critical movement features on motor performance and learning (Doctoral dissertation). University of Tennessee, Knoxville.
Hong, S., & Reed, W. R. (2021). Using Monte Carlo experiments to select meta-analytic estimators. Research Synthesis Methods, 12(2), 192–215.
Howick, J. (2008). Against a priori judgements of bad methodology: Questioning double-blinding as a universal methodological virtue of clinical trials.
*Huet, M., Camachon, C., Fernandez, L., Jacobs, D. M., & Montagne, G. (2009). Self-controlled concurrent feedback and the education of attention towards perceptual invariants. Human Movement Science, 28(4), 450–467.
*Ikudome, S., Kou, K., Ogasa, K., Mori, S., & Nakamoto, H. (2019). The effect of choice on motor learning for learners with different levels of intrinsic motivation. Journal of Sport and Exercise Psychology, 41(3), 159–166.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2(8), e124.
Iwatsuki, T., Navalta, J. W., & Wulf, G. (2019). Autonomy enhances running efficiency. Journal of Sports Sciences, 37(6), 685–691.
*Jalalvand, M., Bahram, A., Daneshfar, A., & Arsham, S. (2019). The effect of gradual self-control of task difficulty and feedback on learning golf putting. Research Quarterly for Exercise and Sport, 90(4), 429–439.
*Janelle, C. M., Barba, D. A., Frehlich, S. G., Tennant, L. K., & Cauraugh, J. H. (1997). Maximizing performance feedback effectiveness through videotape replay and a self-controlled learning environment. Research Quarterly for Exercise and Sport, 68(4), 269–279.
Janelle, C. M., Kim, J., & Singer, R. N. (1995). Subject-controlled performance feedback and learning of a closed motor skill. Perceptual Motor Skills, 81(2), 627–634.
*Jones, A. (2010). Effects of amount and type of self-regulation opportunity during skill acquisition on motor learning (Doctoral dissertation). McMaster University.
Jowett, N., LeBlanc, V., Xeroulis, G., MacRae, H., & Dubrowski, A. (2007). Surgical skill acquisition with self-directed practice using computer-based video training. American Journal of Surgery, 193(2), 237–242.
*Kaefer, A., Chiviacowsky, S., Meira, C. d. M., Jr, & Tani, G. (2014). Self-controlled practice enhances motor learning in introverts and extroverts. Research Quarterly for Exercise and Sport, 85(2), 226–233.
*Keetch, K. M., & Lee, T. D. (2007). The effect of self-regulated and experimenter-imposed practice schedules on motor learning for tasks of varying difficulty. Research Quarterly for Exercise and Sport, 78(5), 476–486.
*Kim, Y., Kim, J., Kim, H., Kwon, M., Lee, M., & Park, S. (2019). Neural mechanism underlying self-controlled feedback on motor skill learning. Human Movement Science, 66, 198–208.
Kvarven, A., Strømland, E., & Johannesson, M. (2020). Comparing meta-analyses and preregistered multiple-laboratory replication projects. Nature Human Behaviour, 4(4), 423–434.
*Leiker, A. M., Bruzi, A. T., Miller, M. W., Nelson, M., Wegman, R., & Lohse, K. R. (2016). The effects of autonomous difficulty selection on engagement, motivation, and learning in a motion-controlled video game task. Human Movement Science, 49, 326–335.
*Leiker, A. M., Pathania, A., Miller, M. W., & Lohse, K. R. (2019). Exploring the neurophysiological effects of self-controlled practice in motor skill learning. Journal of Motor Learning and Development, 7(1), 13–34.
*Lemos, A., Wulf, G., Lewthwaite, R., & Chiviacowsky, S. (2017). Autonomy support enhances performance expectancies, positive affect, and motor learning. Psychology of Sport and Exercise, 31, 28–34.
*Lessa, H. T., & Chiviacowsky, S. (2015). Self-controlled practice benefits motor learning in older adults. Human Movement Science, 40, 372–380.
*Lewthwaite, R., Chiviacowsky, S., Drews, R., & Wulf, G. (2015). Choose to move: The motivational impact of autonomy support on motor learning. Psychonomic Bulletin & Review, 22(5), 1383–1388.
*Lim, S., Ali, A., Kim, W., Kim, J., Choi, S., & Radlo, S. J. (2015). Influence of self-controlled feedback on learning a serial motor skill. Perceptual Motor Skills, 120(2), 462–474.
Lohse, K., Buchanan, T., & Miller, M. (2016). Underpowered and overworked: Problems with data analysis in motor learning studies. Journal of Motor Learning and Development, 4(1), 37–58.
*Marques, P. G., & Corrêa, U. C. (2016). The effect of learner's control of self-observation strategies on learning of front crawl. Acta Psychologica, 164, 151–156.
*Marques, P. G., Thon, R. A., Espanhol, J., Tani, G., & Corrêa, U. C. (2017). The intermediate learner's choice of self-as-a-model strategies and the eight-session practice in learning of the front crawl swim. Kinesiology, 49(1).
McGuinness, L. A. (2019). robvis: An R package and web application for visualising risk-of-bias assessments. https://github.com/mcguinlu/robvis
McKay, B., & Ste-Marie, D. M. (2022). Autonomy support via instructionally irrelevant choice not beneficial for motor performance or learning. Research Quarterly for Exercise and Sport, 93, 64–76.
McShane, B. B., Böckenholt, U., & Hansen, K. T. (2016). Adjusting for publication bias in meta-analysis: An evaluation of selection methods and some cautionary notes. Perspectives on Psychological Science, 11(5), 730–749.
Neuwirth, E. (2014). RColorBrewer: ColorBrewer palettes [R package version 1.1-2]. https://CRAN.R-project.org/package=RColorBrewer
*Norouzi, E., Hossini, F. S., & Aghdasi, M. T. (2016). Effect of self-control feedback on the learning of a throwing task with emphasis on decision-making process. Open Science Journal of Psychology, 2(6), 32.
*Nunes, M. E. d. S., Correa, U. C., Souza, M. G. T. X. d., Basso, L., Coelho, D. B., & Santos, S. (2019). No improvement on the learning of golf putting by older persons with self-controlled knowledge of performance. Journal of Aging and Physical Activity, 27(3), 300–308.
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.
*Ostrowski, J. (2015). The influence of shame on the frequency of self-controlled feedback and motor learning (Master's thesis). Southern Illinois University Carbondale.
Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., Shamseer, L., Tetzlaff, J. M., Akl, E. A., Brennan, S. E., et al. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ, 372.
Patall, E. A., Cooper, H., & Robinson, J. C. (2008). The effects of choice on intrinsic motivation and related outcomes: A meta-analysis of research findings. Psychological Bulletin, 134(2), 270–300.
*Patterson, J. T., Carter, M., & Sanli, E. (2011). Decreasing the proportion of self-control trials during the acquisition period does not compromise the learning advantages in a self-controlled context. Research Quarterly for Exercise and Sport, 82(4), 624–633.
*Patterson, J. T., & Carter, M. J. (2010). Learner regulated knowledge of results during the acquisition of multiple timing goals. Human Movement Science, 29(2), 214–227.
*Patterson, J. T., Carter, M. J., & Hansen, S. (2013). Self-controlled KR schedules: Does repetition order matter? Human Movement Science, 32(4), 567–579.
*Patterson, J. T., & Lee, T. D. (2010). Self-regulated frequency of augmented information in skill learning. Canadian Journal of Experimental Psychology, 64(1), 33–40.
*Post, P. G., Aiken, C. A., Laughlin, D. D., & Fairbrother, J. T. (2016). Self-control over combined video feedback and modeling facilitates motor learning. Human Movement Science, 47, 49–59.
*Post, P. G., Fairbrother, J. T., & Barros, J. A. C. (2011). Self-controlled amount of practice benefits learning of a motor skill. Research Quarterly for Exercise and Sport, 82(3), 474–481.
*Post, P. G., Fairbrother, J. T., Barros, J. A. C., & Kulpa, J. D. (2014). Self-controlled practice within a fixed time period facilitates the learning of a basketball set shot. Journal of Motor Learning and Development, 2(1), 9–15.
R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. Vienna, Austria. https://www.R-project.org/
Re, A. C. D. (2013). compute.es: Compute effect sizes. https://cran.r-project.org/package=compute.es
*Rydberg, N. (2011). The effect of self-controlled practice on forearm passing, motivation, and affect in women's volleyball players (Master's thesis). University of Nevada, Las Vegas.
*Sanli, E. A., & Patterson, J. T. (2013). Learning effects of self-controlled practice scheduling for children and adults: Are the advantages different? Perceptual Motor Skills, 116(3), 741–749.
Sanli, E. A., Patterson, J. T., Bray, S. R., & Lee, T. D. (2013). Understanding self-controlled motor learning protocols through the self-determination theory. Frontiers in Psychology, 3, 611.
Scammacca, N., Roberts, G., & Stuebing, K. K. (2014). Meta-analysis with complex research designs: Dealing with dependence from multiple measures and multiple group comparisons. Review of Educational Research, 84(3), 328–364.
Schimmack, U. (2020). A meta-psychological perspective on the decade of replication failures in social psychology. Canadian Psychology, 61(4), 364–376.
Sigrist, R., Rauter, G., Riener, R., & Wolf, P. (2013). Augmented visual, auditory, haptic, and multimodal feedback in motor learning: A review. Psychonomic Bulletin & Review, 20(1), 21–53.
Simonsohn, U. (2015). [17] No-way interactions. The Winnower. https://doi.org/10.15200/winn.142559.90552
Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014a). P-curve and effect size: Correcting for publication bias using only significant results. Perspectives on Psychological Science, 9(6), 666–681.
Simonsohn, U., Nelson, L. D., & Simmons, J. P. (2014b). P-curve: A key to the file-drawer. Journal of Experimental Psychology: General, 143(2), 534–547.
Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2015). Better p-curves: Making p-curve analysis more robust to errors, fraud, and ambitious p-hacking, a reply to Ulrich and Miller (2015). Journal of Experimental Psychology: General, 144(6), 1146–1152.
St. Germain, L., Williams, A., Poskus, A., Balbaa, N., Leshchyshen, O., Lohse, K. R., & Carter, M. J. (2022). Increased perceptions of autonomy through choice fail to enhance motor skill retention. Journal of Experimental Psychology: Human Perception and Performance, 48(4), 370–379.
Stanley, T. D., & Doucouliagos, H. (2014). Meta-regression approximations to reduce publication selection bias. Research Synthesis Methods, 5(1), 60–78.
Stanley, T. D., Jarrell, S. B., & Doucouliagos, H. (2010). Could it be better to discard 90% of the data? A statistical paradox. American Statistician, 64(1), 70–77.
Ste-Marie, D. M., Carter, M. J., & Yantha, Z. D. (2019). Self-controlled learning: Current findings, theoretical perspectives, and future directions. In N. J. Hodges & A. M. Williams (Eds.), Skill acquisition in sport: Research, theory and practice (pp. 119–140). Routledge.
*Ste-Marie, D. M., Vertes, K. A., Law, B., & Rymal, A. M. (2013). Learner-controlled self-observation is advantageous for motor skill acquisition. Frontiers in Psychology, 3, 556.
*Tsai, M.-J., & Jwo, H. (2015). Controlling absolute frequency of feedback in a self-controlled situation enhances motor learning. Perceptual Motor Skills, 121(3), 746–758.
Vevea, J. L., & Hedges, L. V. (1995). A general linear model for estimating effect size in the presence of publication bias. Psychometrika, 60(3), 419–435.
Vevea, J. L., & Woods, C. M. (2005). Publication bias in research synthesis: Sensitivity analysis using a priori weight functions. Psychological Methods, 10(4), 428–443.
Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03
*von Lindern, A. D. (2017). Self-control effect during a reduction of feedback availability (Doctoral dissertation). University of Tennessee, Knoxville.
Wicherts, J. M., Veldkamp, C. L. S., Augusteijn, H. E. M., Bakker, M., van Aert, R. C. M., & van Assen, M. A. L. M. (2016). Degrees of freedom in planning, running, analyzing, and reporting psychological studies: A checklist to avoid p-hacking. Frontiers in Psychology, 7, 1832.
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., Grolemund, G., Hayes, A., Henry, L., Hester, J., Kuhn, M., Pedersen, T. L., Miller, E., Bache, S. M., Müller, K., Ooms, J., Robinson, D., Seidel, D. P., Spinu, V., . . . Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
*Williams, C. K., Tseung, V., & Carnahan, H. (2017). Self-control of haptic assistance for motor learning: Influences of frequency and opinion of utility. Frontiers in Psychology, 8, 2082.
Woodard, K. F., & Fairbrother, J. T. (2020). Cognitive loading during and after continuous task execution alters the effects of self-controlled knowledge of results. Frontiers in Psychology, 11, 1046.
*Wu, W. F. W. (2007). Self-control of learning multiple motor skills (Doctoral dissertation). Louisiana State University.
*Wu, W. F. W., & Magill, R. A. (2011). Allowing learners to choose: Self-controlled practice schedules for learning multiple movement patterns. Research Quarterly for Exercise and Sport, 82(3), 449–457.
*Wulf, G., Clauss, A., Shea, C. H., & Whitacre, C. A. (2001). Benefits of self-control in dyad practice. Research Quarterly for Exercise and Sport, 72(3), 299–303.
*Wulf, G., & Toole, T. (1999). Physical assistance devices in complex motor skill learning: Benefits of a self-controlled practice schedule. Research Quarterly for Exercise and Sport, 70(3), 265–272.
Wulf, G. (2007). Self-controlled practice enhances motor learning: Implications for physiotherapy. Physiotherapy, 93(2), 96–101.
Wulf, G., & Adams, N. (2014). Small choices can enhance balance learning. Human Movement Science, 38, 235–240.
*Wulf, G., Chiviacowsky, S., & Cardozo, P. L. (2014). Additive benefits of autonomy support and enhanced expectancies for motor learning. Human Movement Science, 37, 12–20.
*Wulf, G., Chiviacowsky, S., & Drews, R. (2015). External focus and autonomy support: Two important factors in motor learning have additive benefits. Human Movement Science, 40, 176–184.
*Wulf, G., Iwatsuki, T., Machin, B., Kellogg, J., Copeland, C., & Lewthwaite, R. (2018). Lassoing skill through learner choice. Journal of Motor Behavior, 50(3), 285–292.
Lasso- ing skill through learner choice. Journal of Mo- tor Behavior, 50(3), 285–292. Wulf, G., & Lewthwaite, R. (2016). Optimizing perfor- mance through intrinsic motivation and atten- tion for learning: The OPTIMAL theory of motor learning. Psychonomic Bulletin & Review, 23(5), 1382–1414. Wulf, G., & Mornell, A. (2008). Insights about practice from the perspective of motor learning: A re- view. Music Performance Research, 2, 1–25. *Wulf, G., Raupach, M., & Pfeiffer, F. (2005). Self-controlled observational practice enhances learning. Research Quarterly for Exercise and Sport, 76(1), 107–111. Wulf, G., Shea, C., & Lewthwaite, R. (2010). Motor skill learning and performance: A review of influen- tial factors. Medical Education, 44(1), 75–84. Yantha, Z. D., McKay, B., & Ste-Marie, D. M. (2022). The recommendation for learners to be pro- vided with control over their feedback sched- ule is questioned in a self-controlled learning paradigm. Journal of Sports Sciences, 40(7), 769–782. Zhu, H. (2021). Kableextra: Construct complex table with ’kable’ and pipe syntax [R package version 1.3.4]. https://CRAN.R-project.org/package= kableExtra https://CRAN.R-project.org/package=kableExtra https://CRAN.R-project.org/package=kableExtra 2 6Appendix A: P-Curve Disclosure Form Table 2 Experiment information from papers included in the p-curve analysis. Original paper Quoted text from original paper indicated predicted benefit of self- control relative to yoked practice Design Key statistical result Quoted text from original paper with statistical results Result Andrieux, Danna & Thon (2012) “Thus, we hypothesized that a prac- tice condition in which the learner could set the level of task difficulty would be more beneficial for learn- ing than a condition in which this parameter was imposed.” Two cell Difference in means “A follow up analysis restricted to the first two blocks revealed a sig- nificant difference between groups, F(1, 36) = 4.85, p <.05, partial eta squared = .12. Self-controlled learners were significantly more ac- curate (M AE = 12.73 mm, SE = 1.57) than their yoked counterparts (M AE = 18.1 mm, SE = 1.87) after a 24-hr rest.” F(1, 36) = 4.85 Andrieux, Boutin, & Thon (2016) “Two main reasons led us to expect that self-control of nominal task dif- ficulty would enhance motor skill learning, and especially when intro- duced during early practice rather than during late practice.” Four cell (Full self- control, full yoked, self-control then yoked, yoked then self-control) Difference in means “Planned pairwise comparisons re- vealed that the self-control groups exhibited lower RMSE (SC + SC, SC + YO, and YO + SC groups) than their yoked group counterparts (YO + YO group), F(1, 44) = 14.02, p <.01.” F(1, 44) = 14.02 Brydges, Carnahan, Safir & Dubrowski (2009) “We hypothesised that participants with self-guided access to instruc- tion would learn more than partic- ipants whose access to instruction was externally controlled.” 2 (Control: self, yoked) X 2 (Goals: process, outcome) Difference in means “The self-process group performed better on the retention test than the control-process group (Fig. 1). This effect was significant for time taken, (F[1,23] = 4.33, P <0.05).” F(1,23) = 4.33 Chiviacowsky (2014) “We hypothesized that participants of the self-controlled group would show superior motor learning than yoked participants” Two cell Difference in means “The Self group outperformed the Yoked group. 
Chiviacowsky, Wulf, de Medeiros, Kaefer & Tani (2008). Design: Two cell. Result: F(1, 24) = 4.40.
Prediction: "Therefore, the purpose of the present study was to examine whether the learning benefits of self-controlled KR would generalize to children."
Reported result: "The self-control group had higher accuracy scores than the yoked group. This difference was significant, F(1, 24) = 4.40, p < .05."

Chiviacowsky, Wulf, Lewthwaite, & Campos (2012). Design: Two cell. Result: F(1, 26) = 4.25.
Prediction: "The potential benefits of self-controlled practice have yet to be examined in persons with PD...under the assumption that self-controlled practice would enhance the learning of the task..."
Reported result: "The self-control group was overall more effective than the yoked group. Time in balance was significantly longer for the self-control group, F(1, 26) = 4.25, p < .05."

Chiviacowsky, Wulf, Machado & Rydberg (2012). Design: Two cell. Result: F(1, 28) = 4.72.
Prediction: "We predicted that self-controlled practice, in particular the ability to choose when to receive feedback, would result in more effective learning compared to a practice condition without this opportunity (yoked group)."
Reported result: "The day following practice, a retention test (without feedback) revealed lower AEs for the self-control group than the yoked group (see Figure 2, right). The group difference was significant, with F(1, 28) = 4.72, p < 0.05, eta squared = .14."

Hartman (2007). Design: Two cell. Result: F(1, 17) = 8.29.
Prediction: "The primary aim of this study was to test whether there would exist a learning advantage for a self-controlled group, as opposed to a yoked control group, for learning a dynamic balance task."
Reported result: "To assess the relatively permanent or learning effects of practice with or without a self-controlled use of a balance pole, both groups performed a retention test on Day 3. The group effect was significant, F(1, 17) = 8.29, p < .01, with the Self-control group outperforming the yoked group."

Kaefer, Chiviacowsky, Meira Jr. & Tani (2014). Design: 2 (Control: self, yoked) X 2 (Personality: introvert, extrovert). Result: F(1, 52) = 4.13.
Prediction: "...both self-controlled groups (introverts and extroverts) will achieve a level of activation that facilitates learning through the control of stimulation source (feedback) in comparison with the groups that do not have control over it."
Reported result: "The groups' main effects were detected on the factor "feedback type": Self-controlled groups performed better, F(1, 52) = 4.13, p < .05, compared with externally controlled groups"

Leiker, Bruzi, Miller, Nelson, Wegman & Lohse (2016). Design: Two cell. Result: F(1, 57) = 4.51.
Prediction: "We hypothesized that participants in the self-controlled group would show superior learning (i.e., better performance on retention and transfer tests) compared to the yoked group."
Reported result: "Controlling for pre-test, there was a significant main effect of group, F(1,57) = 4.51, p = 0.04, partial eta squared = 0.07, such that participants in the self-controlled group performed better on the post-test than participants in the yoked group."

Lemos, Wulf, Lewthwaite & Chiviacowsky (2017). Design: Two cell. Result: F(1, 22) = 88.16.
Prediction: "Independent of which factor the learner is given control over – or whether or not this factor is directly related to the task to be learned – the learning benefits appear to be very robust."
Reported result: "On the retention test, choice participants clearly outperformed the control group. The group main effect was significant, F(1, 22) = 88.16, p < 0.01."

Lessa & Chiviacowsky (2015). Design: Two cell. Result: F(1, 34) = 4.87.
Prediction: "...it was hypothesized that older adult participants of the self-group would demonstrate superior motor learning results, presenting faster task times on the speed cup-stacking task, when compared with participants in the yoked control group."
Reported result: "The analysis of the retention test revealed significant differences between groups, F(1,34) = 4.87, p < .05...with participants of the self-control group presenting faster task times compared to yoked participants."

Lewthwaite, Chiviacowsky, Drews & Wulf (2015; Exp. 1). Design: Two cell. Result: F(1, 22) = 7.31.
Prediction: "In the present experiment, the choice learners were given was not related to task performance per se. Therefore, any learning benefits resulting from having, as opposed to not having, a choice would suggest that motivational factors are responsible for those effects."
Reported result: "On the retention test, during which white golf balls were used, the choice group showed significantly higher putting accuracy (36.8) than the yoked group (26.4), F(1, 22) = 7.31, p < .05"

Lewthwaite, Chiviacowsky, Drews & Wulf (2015; Exp. 2). Design: Two cell. Result: F(1, 27) = 7.93.
Prediction: "Given the potential theoretical importance of the finding in Experiment 1, we wanted to replicate it with another task and different type of choice."
Reported result: "On the retention test 1 day later, the choice group demonstrated significantly longer times in balance than the yoked group, F(1, 27) = 7.93, p < .01."

Lim, Ali, Kim, Choi & Radlo (2015). Design: Two cell. Result: F(1, 22) = 18.27.
Prediction: "It was expected that a self-controlled feedback schedule would be more effective for the learning and performance of serial skills for both acquisition and retention phases than a yoked schedule."
Reported result: "In the retention phase, there was a significant main effect for Group (F(1, 22) = 18.27, p < .05). The follow-up test indicated that the Self-controlled feedback group had higher performance (Cohen's d = 6.4) than the Yoked-feedback group during the retention test in both blocks."

Patterson, Carter & Sanli (2011: Comparison 1). Design: 2 (Control: self, yoked) X 3 (Structure: full, all, faded). Result: F(1, 18) = 8.06.
Prediction: "We expected that the structure of this self-controlled practice context would either add to or compromise the existing benefits attributed to a self-controlled practice context."
Reported result: "Specifically, the Self-Self condition demonstrated less |CE| compared to their Yoked-Yoked counterparts. This main effect was significant, F(1, 18) = 8.06, p < .05."

Patterson, Carter & Sanli (2011: Comparison 2). Design: 2 (Control: self, yoked) X 3 (Structure: full, all, faded). Result: F(1, 18) = 4.67.
Prediction: "We expected that the structure of this self-controlled practice context would either add to or compromise the existing benefits attributed to a self-controlled practice context."
Reported result: "The All-Self condition demonstrated less |CE| compared to the All-Yoked condition. This main effect was also statistically significant, F(1, 18) = 4.67, p < .05."

Patterson, Carter & Sanli (2011: Comparison 3). Design: 2 (Control: self, yoked) X 3 (Structure: full, all, faded). Result: F(1, 18) = 5.78.
Prediction: "We expected that the structure of this self-controlled practice context would either add to or compromise the existing benefits attributed to a self-controlled practice context."
Reported result: "The Faded-Self condition demonstrated less |CE| compared to the Faded-Yoked condition, supported by a main effect for group, F(1, 18) = 5.78, p < .05."

Post, Fairbrother, Barros & Kulpa (2014). Design: Two cell. Result: F(1, 29) = 6.08.
Prediction: "It was hypothesized that learners in the SC group would demonstrate superior accuracy and form scores compared with the yoked group during the retention test."
Reported result: "The univariate ANOVA for retention revealed a significant group effect, F(1, 29) = 6.08, p = .020. The SC group had higher Accuracy scores than the YK group"

Ste-Marie, Vertes, Law & Rymal (2013). Design: Two cell. Result: t(58) = 3.21.
Prediction: "We hypothesized that the Learner Controlled group would show superior physical performance of the trampoline skills... compared to the Experimenter Controlled group."
Reported result: "A separate independent samples t-test showed that the Learner Controlled group had significantly higher performance scores compared to the Experimenter Controlled group at retention, t(58) = 3.21, p < .05, d = .753."

Wulf & Adams (2014). Design: 2 (Group: self-control, yoked) X 3 (Exercise: toe touch, head turn, ball pass) X 2 (Leg: left, right) mixed design with repeated measures on the final two factors. Result: F(1, 18) = 25.35.
Prediction: "We asked whether giving performers an incidental choice would also result in more effective learning of exercise routines."
Reported result: "On the retention test... the choice group showed fewer errors than the control group. The main effects of group, F(1,18) = 25.35, p < .001, was significant."

Wulf & Toole (1999). Design: Two cell. Result: F(1, 24) = 4.54.
Prediction: "If the beneficial effects of self-control found in previous studies are more general in nature (i.e., some general mechanism responsible for these effects), learning advantage would also be expected for self-controlled use of physical assistance."
Reported result: "The main effect of Group, F(1,24) = 4.54, p < .05, was significant. Thus, allowing learners to select their own schedule of physical assistance during practice had a clearly beneficial effect on learning."

Wulf, Clauss, Shea & Whitacre (2001). Design: Two cell. Result: F(1, 24) = 4.43.
Prediction: "Importantly, however, if self-control promotes the development of a more efficient movement technique, one should see greater movement efficiency, as indicated by delayed force onsets, in self-control as compared to yoked participants."
Reported result: "Whereas the self-control group demonstrated relative force onsets that, on average, occurred about half the distance between the center of the apparatus and the participant's maximum amplitude, the yoked group's average force onset had already occurred after they had travelled less than 20% of the distance to the maximum amplitude. This group difference was significant, F(1,24) = 4.43, p < .05."
Wulf, Raupach & Pfeiffer (2005)
Hypothesis: “Thus, if the learning advantages of self-controlled practice generalize to observational practice, allowing learners to decide when they want to view a model presentation should result in enhanced retention performance, with regard to movement form and, perhaps, movement accuracy, compared to that of yoked learners.”
Design: Two cell. Effect size: difference in means.
Result: “Overall, the self-control group had higher form scores than the yoked group throughout retention. The main effect of group, F(1, 23) = 5.16, p < .05, was significant.”
Statistic: F(1, 23) = 5.16

Wulf, Iwatsuki, Machin, Kellogg, Copeland, & Lewthwaite (2017; Exp. 1)
Hypothesis: “The purpose of the present experiments was threefold. First, we deemed it important to provide further evidence for the impact of incidental choices on motor skill learning. Given that self-controlled practice benefits for learning have frequently been interpreted from an information-processing perspective (e.g., Carter, Carlson, & Ste-Marie, 2014; Carter & Ste-Marie, 2016), with limited regard for rewarding-motivational explanations, further experimental evidence for learning enhancements through choices not directly related to the task seemed desirable (Experiments 1 and 2).”
Design: Two cell. Effect size: difference in means.
Result: “On the retention test one day later, the choice group demonstrated higher scores than did the control group. The group effect was significant, F(1, 29) = 5.72, p < .05.”
Statistic: F(1, 29) = 5.72

Wulf, Chiviacowsky & Drews (2015)
Hypothesis: “To summarize, we hypothesized that an external focus and autonomy support would have additive benefits for motor learning (i.e., retention and transfer performance), as evidenced by main effects for each factor.”
Design: 2 (Autonomy support: self, yoked) X 2 (Focus: external, internal). Effect size: difference in means.
Result: “On the retention test, the main effect of Autonomy Support was significant, F(1, 64) = 6.98, p < .01.”
Statistic: F(1, 64) = 6.98

Ikudome, Kuo, Ogasa, Mori & Nakamoto (2019; Exp. 2)
Hypothesis: “Previous studies manipulating participants’ choice of variables relevant to the experimental task have indicated that such choices have a positive effect on motor learning due to deeper information processing by the participants. Based on these studies, it is possible that this positive effect would be observed regardless of participants’ levels of intrinsic motivation, because this type of choice would not induce a change in perceived locus of causality from internal to external.”
Design: 2 (Choice: self, yoked) X 2 (Motivation: high, low). Effect size: difference in means.
Result: “An ANCOVA indicated significant main effects of choice, F(1, 39) = 8.93, p = .005.”
Statistic: F(1, 39) = 8.93

Note. KR = Knowledge of results; PD = Parkinson’s Disease; SC = Self-controlled.
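For the two-cell designs tabled above, a standardized mean difference can be recovered from the reported F(1, df) statistic once the group sizes are known, because t = sqrt(F) for a one-degree-of-freedom effect. The sketch below illustrates this conversion only; it is not the analysis code used for this meta-analysis, and the group sizes in the example are placeholders inferred from the error degrees of freedom under an equal-n assumption.

```python
import math

def hedges_g_from_f(f_value: float, n1: int, n2: int) -> float:
    """Recover Hedges' g from a two-group F(1, n1 + n2 - 2) statistic.

    For a one-df group effect, |t| = sqrt(F); Cohen's d follows from
    t and the group sizes, and the small-sample correction factor J
    converts d to Hedges' g.
    """
    t = math.sqrt(f_value)              # |t| for a 1-df effect
    d = t * math.sqrt(1 / n1 + 1 / n2)  # Cohen's d (F alone gives no sign)
    df = n1 + n2 - 2
    j = 1 - 3 / (4 * df - 1)            # Hedges' small-sample correction
    return j * d

# Example: Lessa & Chiviacowsky (2015), F(1, 34) = 4.87; df = 34
# implies N = 36, assumed here to be two groups of 18 (placeholder).
print(round(hedges_g_from_f(4.87, 18, 18), 3))  # ~0.719 before any bias adjustment
```

Because F discards the sign of the group difference, the direction of g must be taken from the reported means or the stated direction of the effect.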
Appendix B: Missing Data

Of the 78 experiments that met the eligibility criteria of this meta-analysis, 25 were excluded because of missing data. Those 25 experiments comprised 13 that reported a statistically significant result and 12 that failed to find a significant self-controlled learning effect.

Among the 13 experiments with missing data that reported a significant self-control benefit, one reported an inappropriate analysis (Hemayattalab et al., 2013),⁶ one reported statistics that do not match the experimental design (Jalalvand et al., 2019),⁷ one reported significant effects on only a partial analysis of the data rather than the overall analysis (Brydges et al., 2009), and one was previously identified as an outlier study by Lohse and colleagues (2016) (M. J. Carter & Patterson, 2012). The meta-analysis may have been strengthened by the exclusion of these results (Stanley et al., 2010). Among the remaining nine experiments reporting a significant effect with missing data, two reported effects collapsed across immediate and delayed retention only (Patterson et al., 2013; Wu & Magill, 2011); two reported a significant effect on a lower priority measure but null effects on a higher priority measure, without sufficient data to calculate the effect size (Wulf et al., 2001; Wulf et al., 2005; both studies were included in the primary p-curve analysis); and five compared three or more groups in an omnibus ANOVA and reported a significant group effect but did not include sufficient data to calculate the effect size for the self-control versus yoked comparison (Chen et al., 2002; Ghorbani, 2019; Huet et al., 2009; Janelle et al., 1997; Norouzi et al., 2016).

⁶ Although data were collected in one dimension using concentric circles, AE and a measure of dispersion were analyzed together in a MANOVA. This measure of dispersion is not an accurate reflection of variability on a two-dimensional task for reasons described by Hancock et al. (1995).

⁷ A subgroup analysis involving two groups (n = 15) was reported with df = 56. The article reports r² effect sizes associated with each test that cannot be reproduced from the reported statistics or best guesses.
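The consistency check described in footnote 7 can be illustrated as follows: for a one-degree-of-freedom test, the reported F and error degrees of freedom jointly determine the implied r² (partial eta-squared) as F / (F + df_error), so a mismatched df changes the effect size the statistics imply. The F value below is hypothetical and chosen only to show the check; it is not taken from Jalalvand et al. (2019).

```python
def r_squared_from_f(f_value: float, df_error: int) -> float:
    """r^2 (partial eta-squared) implied by a one-df F test:
    r^2 = F / (F + df_error)."""
    return f_value / (f_value + df_error)

# Two groups of n = 15 should yield df_error = 28, not the reported 56;
# the same (hypothetical) F implies different r^2 values under each df.
print(round(r_squared_from_f(5.0, 28), 3))  # 0.152 with the design-implied df
print(round(r_squared_from_f(5.0, 56), 3))  # 0.082 with the reported df
```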