Meta-Psychology, 2020, vol 4, MP.2018.872
https://doi.org/10.15626/2018.872
Article type: Commentary
Published under the CC-BY4.0 license
Open data: Yes
Open materials: Yes
Open and reproducible analysis: Yes
Open reviews and editorial process: Yes
Preregistration: N/A
Edited by: Marcel van Assen
Reviewed by: M. van Assen, R. van Aert
Analysis reproduced by: André Kalmendal
All supplementary files can be accessed at OSF: https://doi.org//10.17605/OSF.IO/J2QGS

Coding Errors Lead to Unsupported Conclusions: A critique of Hofmann et al. (2015)

Donald R. Williams
University of California, Davis, USA

Paul-Christian Bürkner
Aalto University, Finland

Abstract

We have detected coding errors in the meta-analysis of Hofmann et al. (2015), who investigated the effect of intranasal oxytocin on psychiatric symptoms. We demonstrate that, after correcting these errors and reanalysing the data, the main conclusions of Hofmann et al. (2015) are no longer supported.

Keywords: meta-analysis, coding errors, intranasal oxytocin

Introduction

Due to converging evidence in animals and healthy human populations, oxytocin has been identified as potentially having therapeutic properties. As such, numerous randomized controlled trials have investigated the efficacy of intranasal oxytocin (IN-OT) in reducing psychiatric symptoms in clinical populations. As results have been mixed, meta-analytic reviews seeking to synthesize the extant literature have been published. One such review was published in Psychiatry Research (Hofmann et al., 2015). The authors concluded that IN-OT significantly improved psychiatric symptoms and found significant effects on depression, anxiety, psychotic symptoms, and general psychopathology. We found several errors in this paper; correcting them resulted in all null results (no significant effect of IN-OT), which suggests that the conclusions of Hofmann et al. (2015) are incorrect.
The current letter therefore has three aims: (1) we will outline several errors and raise questions regarding their analysis; (2) we will perform a meta-analysis using the same primary studies and similar methods; and (3) we will conclude by stating the importance of issuing a correction.

Errors and questions

Effect size directions

While conducting a meta-analysis on a similar topic, we initially noticed that Hofmann et al. (2015) misspecified the direction of one outcome. Specifically, the primary study reported that the placebo group improved whereas the IN-OT group did not (Lee et al., 2013). However, Table 1 of Hofmann et al. (2015) reports a large effect of IN-OT (Hedges’ g = 1.07). Furthermore, all of the outcomes reported in their Table 1 were positive, which indicates IN-OT was superior to placebo in all instances. From the primary studies, however, we extracted the relevant data and found that 6 out of the 16 outcomes used to compute the overall effect should have been negative. As seen in Figure 1 of this letter, all effects to the left of 0 had the wrong direction in Hofmann et al. (2015).

Possible Selection Bias

Decisions made during the research process can influence the presence of an effect (Gelman and Loken, 2014). This is true in meta-analyses, particularly when extracting only one outcome from several possibilities. Anagnostou et al. (2012) reported three outcomes on repetitive behavior. While two of the effects were either minimal (d = 0.13) or in the opposite direction (d = -0.22), Hofmann et al. (2015) selected the largest effect in support of IN-OT's efficacy (d = 0.64). While the Yale-Brown Obsessive Compulsive Scale produced the negative effect in Anagnostou et al. (2012), the same scale produced a positive effect in Epperson et al. (1996) and was selected for inclusion by Hofmann et al. (2015). Dadds et al. (2014) reported two measures of repetitive behavior.
For these outcomes, the placebo group showed improvement from pre- to post-test scores, whereas symptoms actually increased in the IN-OT group. From this study, Hofmann et al. (2015) again selected the outcome that was most favorable to IN-OT. However, this outcome (Child Autism Rating Scale) was not labeled as repetitive behavior in Dadds et al. (2014), while the outcomes that favored the placebo group were considered repetitive behavior. Finally, since multiple outcomes were extracted from some studies, the overall effect of IN-OT on psychiatric symptoms was computed on a subset of effects. Based on the effects reported in Table 1 of Hofmann et al. (2015), the average effect size was larger for the included outcomes (d = 0.83) than the excluded outcomes (d = 0.49).

Misspecified outcomes

We also believe several outcomes were not coded accurately in Hofmann et al. (2015). The majority of outcomes in the psychotic symptoms category were total scores from the Positive and Negative Syndrome Scale (PANSS). Total scores of the PANSS are a combination of negative symptoms, positive symptoms, and general psychopathology (Kay et al., 1987). However, Hofmann et al. (2015) included two outcomes as psychotic symptoms that exclusively measured aspects of negative symptoms in schizophrenia. They also coded two Brief Psychiatric Rating Scale (BPRS) outcomes as general psychopathology. Based on the contents of the scale and other meta-analyses on this topic (Oya et al., 2016), these should have been coded as psychotic symptoms, or the authors should have provided a rationale for the divergent coding. All four of these outcomes were reported as positive which, in addition to the aforementioned errors, likely inflated their meta-analytic estimates.

Meta-analysis

Based on the methods provided in Hofmann et al. (2015), we attempted to replicate their procedures as closely as possible, including the outcomes selected, effect size calculation, and assessment of publication bias.
We then analyzed the data in a manner that was consistent with the extant literature and previous meta-analyses on this topic. All computations were done in R with the metafor package (Viechtbauer, 2010). We used the default settings in metafor, including REML for estimating the between-study variance and the Q-profile method for the corresponding confidence intervals. Two-tailed p-values are reported. Our fully reproducible analysis is available on OSF (https://osf.io/kd3en/).

Methods

The exact method used for effect size calculation is not entirely clear in Hofmann et al. (2015). Accordingly, we computed both Hedges’ g (SMD) exclusively from the post-treatment scores and the standardized mean change with raw score standardization (SMCR), which is a measure of pre- to post-treatment change (assuming a pre-post correlation of r = 0.7) compared between groups (IN-OT vs. placebo). From their methods section, we think that an effect size similar to the SMCR was most likely used. In the present analysis, when a 95% confidence interval (CI) excluded zero, there was evidence for a significant effect at p-value < 0.05.

Replication Attempt. As seen in Figure 1, the overall estimates for psychiatric symptoms were not significant (SMD = 0.22, z = 1.67, p-value = 0.0953, 95% CI = [-0.04, 0.47]; SMCR = 0.17, z = 1.23, p-value = 0.217, 95% CI = [-0.10, 0.43]). Trim and Fill procedures indicated bias in SMD outcomes and, when corrected, the effect was reduced (SMD = 0.07, 95% CI = [-0.18, 0.32]). There was significant between-study variance for the SMCRs (τ² = 0.15, p-value = 0.003), but not for the SMDs (τ² = 0.09, p-value = 0.1149). We then obtained estimates for specific symptoms (Table 1). The meta-analytic estimates for depression, anxiety, repetitive behaviors, and general psychopathology were all non-significant (CIs included zero). While using the outcomes reported in Hofmann et al.
(2015) produced a significant SMD estimate for psychotic symptoms (Table 1), restricting the outcomes to total psychotic symptoms resulted in a loss of statistical significance.

Figure 1. (A) SMCR estimates. (B) SMD estimates. The effect from Averbeck et al. (2012) was computed from a t-statistic on post-treatment scores. Outcomes from MacDonald et al. (2013) were obtained from a figure using WebPlotDigitizer (Rohatgi, 2017). Pedersen et al. (2013) did not report pre-scores. Through email, Dr. Pedersen confirmed that the authors of Hofmann et al. (2015) did not contact them regarding pre-scores. As such, we used change scores (SMCR) from day 2 to day 3, while SMD was calculated from day 3. We used the same outcome for Dadds et al. (2014) as Hofmann et al. (2015). It should be noted, however, that this was based on pre-treatment and follow-up scores (3 months later). Dr. Dadds confirmed that they did not collect post-data. Standard deviations (SD) for Modabbernia et al. (2013)

Table 1
Meta-analytic estimates for specific symptoms

SMCR               ES       SE       z         p-value   95% CI
Anxiety            0.09     0.17     0.5147    0.6067    [-0.24, 0.41]
Depression         0.29     0.27     1.0843    0.2782    [-0.23, 0.81]
Psychopathology    0.10     0.18     0.5664    0.5711    [-0.25, 0.46]
Psychotic          0.31     0.18     1.6814    0.0927    [-0.05, 0.66]
Repetitive         -0.06    0.21     -0.2889   0.7726    [-0.46, 0.35]
τ²                 0.09     0.06               0.0140    [0.01, 0.29]

SMD                ES       SE       z         p-value   95% CI
Anxiety            0.08     0.18     0.4695    0.6387    [-0.26, 0.43]
Depression         0.28     0.27     1.0300    0.3030    [-0.25, 0.81]
Psychopathology    0.14     0.19     0.7301    0.4653    [-0.24, 0.52]
Psychotic          0.41     0.19     2.1278    0.0334    [0.03, 0.80]
Repetitive         0.15     0.22     0.6876    0.4917    [-0.28, 0.58]
τ²                 0.069    0.069              0.0561    [0.00, 0.40]

Note: SMCR: Standardized mean change with raw score standardization. SMD: Standardized mean difference (Hedges’ g). ES: Effect size. τ²: Residual between-study variance after accounting for symptom type.

Conclusion

Although Hofmann et al.
(2015) is not a new article, and was recently retracted (Hofmann et al., 2016), there are several reasons this letter deserves attention. First, while they concluded that IN-OT had robust effects on several psychiatric symptoms, our analysis suggests that all effects were non-significant. Second, IN-OT research has become a very active field and ensuring correctness in the published literature is a mental health priority. Third, there is overwhelming evidence from animal studies supporting the role of oxytocin in psychiatric disorders, especially those characterized by social dysfunction (Lim et al., 2005). By ensuring null results are represented in the literature, researchers might be compelled to improve current methods of delivery or dedicate more resources to developing pharmaceutical drugs that directly activate oxytocin receptors. Accordingly, we hope this letter not only results in a correction but also moves the field forward, which is especially important because of the lack of effective treatments for certain aspects of these disorders.

Author Contact

Corresponding author: Donald R. Williams (email: drwwilliams@ucdavis.edu). Paul-Christian Bürkner (email: paul.buerkner@gmail.com).

Conflict of Interest and Funding

The authors have no conflict of interest to declare. There was no specific funding for this study.

Author Contributions

DRW conducted the analysis and wrote the initial draft of the paper. PCB originally spotted the inconsistencies in Hofmann et al. (2015) and provided proofreading for the analysis and the paper. Author names are ranked in order of contribution.

Open Science Practices

This article earned the Open Data and the Open Materials badge for making the data and materials openly available. It has been verified that the analysis reproduced the results presented in the article. The entire editorial process, including the open reviews, is published in the online supplement.
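
To make the two effect sizes described in the Methods concrete, the following is a minimal Python sketch. It is not the code used for the analysis, which was run in R with metafor and is available on OSF. It assumes common large-sample formulas for Hedges’ g and the SMCR, uses the approximate small-sample correction 1 - 3/(4df - 1), and pools effects with a DerSimonian-Laird estimator rather than REML for simplicity. The function names and all input numbers are hypothetical. Effects are signed so that positive values favor IN-OT on scales where lower scores mean fewer symptoms, which is the coding step at issue in this critique.

```python
import math


def bias_correction(df):
    """Approximate small-sample correction factor (an assumed approximation)."""
    return 1.0 - 3.0 / (4.0 * df - 1.0)


def hedges_g(m_trt, sd_trt, n_trt, m_ctl, sd_ctl, n_ctl):
    """Hedges' g from post-treatment scores, with its sampling variance.

    Assumes lower scores mean fewer symptoms, so the difference is taken as
    control minus treatment: a positive g favors the treatment group.
    """
    df = n_trt + n_ctl - 2
    sd_pooled = math.sqrt(((n_trt - 1) * sd_trt ** 2 + (n_ctl - 1) * sd_ctl ** 2) / df)
    g = bias_correction(df) * (m_ctl - m_trt) / sd_pooled
    var = 1.0 / n_trt + 1.0 / n_ctl + g ** 2 / (2.0 * (n_trt + n_ctl))
    return g, var


def smcr(m_pre, m_post, sd_pre, n, r):
    """Standardized mean change for one group, standardized by the pre-test SD.

    r is the assumed pre-post correlation. Positive values mean symptoms fell.
    """
    y = bias_correction(n - 1) * (m_pre - m_post) / sd_pre
    var = 2.0 * (1.0 - r) / n + y ** 2 / (2.0 * n)
    return y, var


def smcr_between(trt, ctl, r=0.7):
    """Difference in SMCR between groups; trt and ctl are
    (m_pre, m_post, sd_pre, n) tuples. Variances add for independent groups."""
    y_t, v_t = smcr(*trt, r)
    y_c, v_c = smcr(*ctl, r)
    return y_t - y_c, v_t + v_c


def pool_dl(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird estimate of tau^2.

    Returns (estimate, se, ci_low, ci_high, tau2) with a normal-theory 95% CI.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    est = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return est, se, est - 1.96 * se, est + 1.96 * se, tau2


if __name__ == "__main__":
    # Hypothetical summary statistics, not taken from any primary study.
    g, v = hedges_g(m_trt=14.0, sd_trt=5.0, n_trt=20, m_ctl=12.0, sd_ctl=5.0, n_ctl=20)
    print(f"g = {g:.2f}")  # negative: the placebo group ended with fewer symptoms
    d, vd = smcr_between(trt=(15.0, 14.0, 5.0, 20), ctl=(15.0, 12.0, 5.0, 20))
    print(f"SMCR difference = {d:.2f}")
    est, se, lo, hi, tau2 = pool_dl([g, 0.40, 0.10], [v, 0.09, 0.12])
    print(f"pooled = {est:.2f}, 95% CI [{lo:.2f}, {hi:.2f}], excludes zero: {lo > 0 or hi < 0}")
```

Swapping the treatment and control arguments flips the sign of each effect while leaving its variance unchanged, so a direction error of the kind described above changes the pooled estimate without any visible change in precision.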
References

Anagnostou, E., Soorya, L., Chaplin, W., Bartz, J., Halpern, D., Wasserman, S., Wang, A. T., Pepa, L., Tanel, N., Kushki, A., & Hollander, E. (2012). Intranasal oxytocin versus placebo in the treatment of adults with autism spectrum disorders: a randomized controlled trial. Molecular autism, 3(1), 16. https://doi.org/10.1186/2040-2392-3-16

Averbeck, B. B., Bobin, T., Evans, S., & Shergill, S. S. (2012). Emotion recognition and oxytocin in patients with schizophrenia. Psychological Medicine, 42, 259–266. https://doi.org/10.1017/S0033291711001413

Dadds, M. R., MacDonald, E., Cauchi, A., Williams, K., Levy, F., & Brennan, J. (2014). Nasal oxytocin for social deficits in childhood autism: A randomized controlled trial. Journal of Autism and Developmental Disorders, 44(3), 521–531. https://doi.org/10.1007/s10803-013-1899-3

Epperson, C. N., McDougle, C. J., & Price, L. H. (1996). Intranasal oxytocin in obsessive-compulsive disorder. Biological Psychiatry, 40(6), 547–549. https://doi.org/10.1016/0006-3223(96)00120-5

Gelman, A., & Loken, E. (2014). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time. Psychological bulletin, 140(5), 1272–1280. https://doi.org/dx.doi.org/10.1037/a0037714

Hofmann, S. G., Fang, A., & Brager, D. N. (2015). Effect of intranasal oxytocin administration on psychiatric symptoms: A meta-analysis of placebo-controlled studies. Psychiatry Research, 228(3), 708–714. https://doi.org/10.1016/j.psychres.2015.05.039

Hofmann, S. G., Fang, A., & Brager, D. N. (2016). Notice of Retraction and Replacement: Hofmann et al. Effect of intranasal oxytocin administration on psychiatric symptoms: A meta-analysis of placebo-controlled studies. Psychiatry Research. 2015;228:708-714.
Psychiatry Research. https://doi.org/10.1016/j.psychres.2016.10.055

Kay, S. R., Fiszbein, A., & Opler, L. A. (1987). The positive and negative syndrome scale (PANSS) for schizophrenia. Schizophrenia bulletin, 13(2), 261–276.

Lee, M. R., Wehring, H. J., McMahon, R. P., Linthicum, J., Cascella, N., Liu, F., Bellack, A., Buchanan, R. W., Strauss, G. P., Contoreggi, C., & Kelly, D. L. (2013). Effects of adjunctive intranasal oxytocin on olfactory identification and clinical symptoms in schizophrenia: Results from a randomized double blind placebo controlled pilot study. Schizophrenia Research, 145(1-3), 110–115. https://doi.org/10.1016/j.schres.2013.01.001

Lim, M. M., Bielsky, I. F., & Young, L. J. (2005). Neuropeptides and the social brain: Potential rodent models of autism. International Journal of Developmental Neuroscience, 23(2-3 SPEC. ISS.), 235–243. https://doi.org/10.1016/j.ijdevneu.2004.05.006

MacDonald, K., MacDonald, T. M., Brüne, M., Lamb, K., Wilson, M. P., Golshan, S., & Feifel, D. (2013). Oxytocin and psychotherapy: A pilot study of its physiological, behavioral and subjective effects in males with depression. Psychoneuroendocrinology, 38(12), 2831–2843. https://doi.org/10.1016/j.psyneuen.2013.05.014

Modabbernia, A., Rezaei, F., Salehi, B., Jafarinia, M., Ashrafi, M., Tabrizi, M., Hosseini, S. M. R., Tajdini, M., Ghaleiha, A., & Akhondzadeh, S. (2013). Intranasal oxytocin as an adjunct to risperidone in patients with schizophrenia: An 8-week, randomized, double-blind, placebo-controlled study. CNS Drugs, 27(1), 57–65. https://doi.org/10.1007/s40263-012-0022-1

Oya, K., Matsuda, Y., Matsunaga, S., Kishi, T., & Iwata, N. (2016). Efficacy and safety of oxytocin augmentation therapy for schizophrenia: an updated systematic review and meta-analysis of randomized, placebo-controlled trials. European Archives of Psychiatry and Clinical Neuroscience, 266(5), 439–450. https://doi.org/10.1007/s00406-015-0634-9

Pedersen, C.
A., Smedley, K. L., Leserman, J., Jarskog, L. F., Rau, S. W., Kampov-Polevoi, A., Casey, R. L., Fender, T., & Garbutt, J. C. (2013). Intranasal Oxytocin Blocks Alcohol Withdrawal in Human Subjects. Alcoholism: Clinical and Experimental Research, 37(3), 484–489. https://doi.org/10.1111/j.1530-0277.2012.01958.x

Rohatgi, A. (2017). WebPlotDigitizer. http://arohatgi.info/WebPlotDigitizer

Viechtbauer, W. (2010). Conducting Meta-Analyses in R with the metafor Package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03