Meta-Psychology, 2023, vol 7, MP.2022.3270
https://doi.org/10.15626/MP.2022.3270
Article type: Original Article
Published under the CC-BY4.0 license
Open data: Not Applicable
Open materials: Yes
Open and reproducible analysis: Yes
Open reviews and editorial process: Yes
Preregistration: No
Edited by: Rickard Carlsson
Reviewed by: Peder Isager, Matt Williams
Analysis reproduced by: Lucija Batinović
Associated OSF project: https://doi.org/10.17605/OSF.IO/9X7D4

Means to valuable exploration II: How to explore data to modify existing claims and create new ones

Michael Höfler 1,2, Brennan McDonald 1,2, Philipp Kanske 1,2, and Robert Miller 1
1 Faculty of Psychology, Technische Universität Dresden, Dresden, Germany
2 Clinical Psychology and Behavioural Neuroscience, Institute of Clinical Psychology and Psychotherapy, Technische Universität Dresden, Germany

Transparent exploration in science invites novel discoveries by stimulating new or modified claims about hypotheses, models, and theories. In this second article of two consecutive parts, we outline how to explore data patterns that inform such claims. Transparent exploration should be guided by two contrasting goals: comprehensiveness and efficiency. Comprehensiveness calls for a thorough search across all variables and possible analyses, so as not to miss anything that might be hidden in the data. Efficiency adds that new and modified claims should withstand severe testing with new data and give rise to relevant new knowledge. Efficiency aims to reduce false positive claims, which is better achieved if many results are condensed into a few claims. Means for increasing efficiency are methods for filtering local data patterns (e.g., only interpreting associations that pass statistical tests or cross-validation) and for smoothing global data patterns (e.g., reducing associations to relations between a few latent variables). We suggest that researchers should condense their results with filtering and smoothing before publication. Coming up with just a few most promising claims saves resources for confirmation trials and keeps scientific communication lean. This should foster the acceptance of transparent exploration. We end with recommendations derived from the considerations in both parts: an exploratory research agenda and suggestions for stakeholders, such as journal editors, on how to implement more valuable exploration. These include special journal sections or entire journals dedicated to explorative research and a mandatory separate listing of the confirmed and new claims in a paper's abstract.

Keywords: Exploration, Transparency, Smoothing, Filtering, Preregistration, Open Data, Open Analysis, Severe Testing, Replication

Introduction

It has long been recognised that confirmatory and exploratory research are beneficial for each other. Exploratory findings can provide insights for new or improved scientific claims to be tested (Lakatos, 1977; Popper, 1959; Stebbins, 1992), and the failure of a confirmatory trial might suggest exploring for a better claim and a more promising next trial. However, for exploration to inform confirmation well, researchers need to be equipped with an understanding of the aims and means of exploratory analysis in advance. In the first of two consecutive articles (Höfler et al., 2022), we called for a sharp boundary between confirmation and exploration to separate established from new scientific claims about hypotheses, models and theories.
A claim is confirmed if an evidential norm is met, such as p-value (p) < α. Strict adherence to an evidential norm ensures severe testing (Mayo, 2018): a confirmatory test of a claim must be likely to fail if the claim is wrong. Such a risky probe ensures that a claim is supported by meaningful evidence. Unfortunately, adherence is often violated through the use of questionable research practices, by cherry-picking a p < α from numerous different analyses (p-hacking) or a hypothesis that happens to yield such a p (HARKing; Hollenbeck & Wright, 2017). Practices like these constitute intransparent exploration, misused to produce seeming confirmation of a hypothesis by pretending to meet the norm (Höfler et al., 2022). Non-adherence may hide behind non-transparency in how data were analysed and hypotheses generated. Therefore, accepting an analysis as confirmatory requires that adherence be controlled, for example through preregistration (Höfler et al., 2022).

In contrast, transparent or "open exploration" (Thompson et al., 2020) enjoys the freedom to extensively analyse data (Manuti & Giancaspro, 2019) and embraces all "researchers' degrees of freedom" (Dirnagl, 2020; Simonsohn et al., 2020) to modify existing or create new claims about the world. However, by trying different analyses, for instance by using multiple statistical tests, the evidential norm may not be adhered to because α accumulates over several tests (Bender & Lange, 2001). In consequence, a confirmatory trial with new data is required to adhere to the norm. This idea extends to concatenated exploration, an iterative process in which exploration and confirmation repeatedly feed each other, modifying and testing claims, to identify the best possible claims that can be confirmed (Stebbins, 1992). Likewise, empirical science has been described as a process of mapping knowledge back and forth from a claim via study design to data analysis and modification of the claim, with modification guided, for example, by exploratory results (Bogen & Woodward, 1988; Box, 1980; Lakatos, 1977; Mayo, 2018; Popper, 1959; Suppes, 1969). For transparent exploration to evolve, however, researchers need to be equipped with a conceptual understanding and practical skills of exploratory analysis. This should foster researchers' self-efficacy and make them more willing to freely conduct and openly report exploration (Stebbins, 2001). Yet what exploration actually is has rarely been asked in psychology, with a few exceptions (Behrens, 1997; Dirnagl, 2020). Likewise, exploration, recognisable as such, appears hard to find outside the current data mining/big data movement (Adjerid & Kelley, 2018), qualitative investigations (Kassis & Papps, 2020), planned reviews (Moghaddam, 2004) and theses (Sohmer, 2020).

In this second article we outline what we believe are important foundations for conceptualising and conducting transparent exploration. We begin by discussing the goals of comprehensive and efficient exploration. We then describe basic ideas on how to refine existing hypotheses, models and theories and how to create new hypotheses. Based on these foundations, we summarize analytical means to address efficiency through filtering and smoothing explorative results. The paper ends with a small research agenda framework and recommendations for stakeholders who have the means to establish more transparent exploratory research.
Goals of exploration

Exploration as a quantitative quest for novelty

As in part I (Höfler et al., 2022), we refer to exploration in the specific sense of "a toolbox of analytical methods to generate and modify hypotheses, models, and theories". Creating and refining such claims about the world allows for scientific novelty and may be achieved by quantitative analysis. Note that we do not address qualitative analysis here, which may serve the same purpose (Newman & Benz, 1998). We regard quantitative exploration as a quest for data patterns that may give rise to novelty. We exemplify data patterns with associations between variables, but data patterns may also be higher-order relations such as interactions, clusters of individuals or variables that appear similar in a substantive respect, trajectories over time, or other "data regularities" that may point to new insights (Adjerid & Kelley, 2018; Hand, 2007; Nguyen, 2000). A quest for such patterns may be theoretically well informed and thus planned, or may be primarily data-driven, starting with inspection or quantitative analysis of the data and resulting in unusual, unexpected or striking patterns. These may be of direct interest or suggest where and how to explore further.

Comprehensive exploration and the explorative search-space

Perhaps the most straightforward idea of exploration is comprehensiveness. Comprehensiveness embraces the potential to discover any and all patterns in a dataset that would give rise to a hidden truth about nature or challenge prior beliefs (Stebbins, 2001; Swedberg, 2018). Due to feasibility, time, financial and other practical constraints, however, the resources to explore data will always be limited by the inherent difficulties associated with collecting new data or even analysing given data. Nevertheless, we suggest that comprehensiveness should initially guide the planning of exploration. For example, if one's goal is to identify unknown risk factors for mental health problems, all possible variables, analyses, and observational levels, ranging from the biochemical to the level of society (Williams, 2021), should be taken into consideration in the first place. Theoretical arguments and prior empirical results may then suggest where the most important patterns are hidden. For instance, one may collect a data set with hundreds of potential risk factors from different domains (parental mental health, childhood risk factors, nutrition, stressors . . . ) and dozens of mental health outcomes (disorders, disability measures . . . ), and for any factor-outcome combination an association may be found. Alternatively, researchers may decide to focus on exploring a specific domain and a small range of outcomes, for example, diet factors and their relation with affective disorders (Martins et al., 2021). Such considerations ask for boundaries within which to explore. We conceptualize this with the exploratory search-space. The exploratory search-space comprises all data patterns (e.g., associations between variables) that are actually explored among all patterns that could be explored. Choosing an exploratory search-space is akin to placing the lasso of one's practical resources around the area where background knowledge suggests the most novelty.

Figure 1. Schematic illustration of two explorative search-spaces that could be chosen to find data patterns of interest (green dots). A pattern of interest can only be found within the boundaries of a search-space.
Figure 1 illustrates this with a very simple research quest, where 200 data patterns (potential associations) could be explored, out of which 6 are patterns of interest (e.g., true associations) that could later be identified by explorative analysis. The figure shows two possible choices of explorative search-spaces, S1 and S2. The narrow S1 contains 4 out of the 6 patterns of interest and 22 patterns of no interest. S2 is a more comprehensive extension of S1. Exploring S2 requires more resources. Besides, only 1 more pattern of interest could be identified in a later analysis, but 23 more patterns of no interest could be falsely identified (e.g., by randomly yielding p < α).

Efficient exploration

The danger of false-positive results already introduces the second goal of transparent exploration: efficiency. Efficient exploration aims to advance science with new insights while not polluting the literature with a multitude of claims whose subsequent non-confirmation would waste the resources of other researchers, or which would otherwise fail to advance science. Such findings are the cost of enjoying comprehensiveness in data exploration. If one turns over every stone, one will find every hidden coin, but also every piece of junk underneath. With "patterns of interest" (the 6 green dots in Figure 1) we suggest that exploration should aim at identifying patterns that are both (1) true and (2) relevant.

By "true" we mean that a data pattern is not caused by chance and gives rise to a previously unknown claim, requiring substantive explanation. In the Popperian tradition, a true claim must improve predictions about the world that could turn out to be wrong (Box, 1976; Popper, 1959). Thus, (1) aims at finding claims that are likely to pass severe testing with new data (Mayo, 2018). Note that a claim derived from a pattern might be close to the pattern, for example, hypothesising an association if a statistically significant association is found in the data. It may, however, require additional substantive input to form a meaningful statement (Rubin & Donkin, 2022). This is especially the case when a causal hypothesis is derived from the finding of an association (Elhai & Montag, 2020; Glymour et al., 2019).

With respect to (2), exploring for relevant patterns aims to exclude proposals of weak or modest scientific value for the benefit of stronger new or modified claims. This serves science per se but also gives rise to more severe testing. A simple example is the claim that an effect is particularly large, rather than just greater than zero. Not only is this more scientifically informative, but it can also more easily turn out to be wrong. Generally, by relevance we mean any substantive argument that might render a claim scientifically interesting. For instance, causal claims have been argued to be much more relevant than associational claims for informing theories and assessing the potential of interventions (Hernán, 2018; Höfler et al., 2021). Beyond objective dimensions like effect magnitude, practical and clinical significance (Kirk, 1996) or the generalisability of a claim (from a narrow to a more general population), "relevance" is a qualitative term that, we believe, should not be defined in general terms across scientific domains. Perhaps the best general answer to the meaning of relevance is that it must always be renegotiated by the scientific community, even within a domain, because what appears relevant might itself be subject to change. Note that a wrong claim might nevertheless trigger true insights and thus be relevant in that sense (Nosek et al., 2018; Stebbins, 1992, 2006). For example, claims on ego depletion have not been replicated (Lurquin & Miyake, 2017), but have spawned the idea and finding that willpower is not a limited resource (Job et al., 2010).
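The efficiency trade-off between S1 and S2 can be made concrete with a little arithmetic. The following minimal Python sketch is our own illustration, not part of the original figure: it assumes, purely for illustration, that each pattern of no interest is falsely identified with probability α = .05 and each pattern of interest is identified with probability .80 (the power of a later test).

```python
# Expected yield of the two search-spaces from Figure 1, under the
# simplifying assumptions stated above (alpha and power are assumed
# illustrative values, not estimates from any real study).
alpha, power = 0.05, 0.80

spaces = {
    "S1 (narrow)":        {"of_interest": 4, "of_no_interest": 22},
    "S2 (comprehensive)": {"of_interest": 5, "of_no_interest": 45},
}

for name, counts in spaces.items():
    true_hits = power * counts["of_interest"]      # expected patterns of interest found
    false_hits = alpha * counts["of_no_interest"]  # expected false identifications
    print(f"{name}: ~{true_hits:.1f} true vs. ~{false_hits:.1f} false identifications")
```

Under these assumptions, extending S1 to S2 buys less than one additional expected true finding at the price of roughly doubling the expected false ones, which is the efficiency argument in miniature.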
Exploring around existing claims

With this understanding of comprehensiveness and efficiency, we are now equipped to derive some basic ideas on how to actually explore data. These are not intended to be complete, but rather to sketch out some promising directions that might one day form part of a thorough and mathematically formalized elaboration. We begin with explorative quests around existing knowledge before discussing searches for the entirely new. Thus, we proceed from narrow to wider search-spaces, just as science has been hierarchically classified into single hypotheses, models based on multiple hypotheses, and theories for a full explanation of a phenomenon (Gelman et al., 2019).

Exploring along an existing hypothesis

With specific claims (hypotheses) it is easier to infer what is wrong, while falsifying global claims (models, theories) leaves open which components actually require modification. Additionally, a hypothesis might be wrong but might become true (at least make better predictions) if modified. A hypothesis might also be true but not make a strong proposition. Consider again the magnitude of an effect. Commonly, researchers hypothesise that an effect is greater than 0, in which case a confirmative result supports any magnitude greater than zero, including an effect magnitude arbitrarily close to zero. Thus, an effect could be below any threshold of practical (e.g., clinical or public health) significance ("nullism", Greenland, 2017). For a stronger proposition, exploration may aim to identify the highest δ such that the claim "effect > δ" remains true. Mayo (2018) gives instances of how to estimate δ based on severe testing calculations.

In general, we believe that "turning all the knobs" (Hofstadter & Dennett, 1981) is a useful metaphor for thinking about the components of a hypothesis and how changing them may give rise to a better statement about the world. For example, a hypothesis might state that a particular diet has a positive effect on quality of life. This hypothesis might be modified to say that the effect only occurs in a certain domain of life, or that the diet is only effective if its ingredients are changed. Box 1 describes how trying different analytical methods might lead to a better proposition on an effect or an association.

Box 1: Exploring around a hypothesis with "multiverse analyses"

"Specification curves" (Masur & Scharkow, 2020; Simonsohn et al., 2020) and "multiverse analyses" (Del Giudice & Gangestad, 2021; Steegen et al., 2016) try different analytical methods and options and show how a result (p-value, confidence interval) varies across them, that is, how robust it is against the assumptions that a particular analysis makes. Knowledge of what a method is robust against then helps to understand the nature of a relationship under inspection. For instance, there might be clear evidence (p = .001) in ordinary least squares regression for higher quality of life on average if a certain diet is followed versus not followed. The evidence might, however, vanish (p = .450) if "robust linear regression" is used instead, a method that is robust against extreme values and outliers in the residuals (Erceg-Hurn & Mirosevich, 2008; Field & Wilcox, 2017; Huber, 1981; Wilcox, 2012). This may indicate that extreme values dominate the result in ordinary regression if not accounted for. If further data inspection is consistent with that explanation, the initial hypothesis may be refined from a difference in the mean outcome to just a higher probability of extreme values if the diet is followed, that is, from an overall association to an association only in some individuals. Further exploration, for example with "finite mixture models" (Skrondal & Rabe-Hesketh, 2004), might identify who these are.
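The contrast described in Box 1 can be sketched in a few lines of Python with the statsmodels package (the article itself reports no code here; data, effect sizes, and seed below are entirely hypothetical). A few extreme values in the diet group produce a clear OLS effect that largely dissolves under Huber robust regression.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: quality of life with diet (1) vs. without (0),
# where a handful of extreme values in the diet group drives the mean.
rng = np.random.default_rng(1)
n = 200
diet = rng.integers(0, 2, n)
quality = rng.normal(50, 10, n)
extreme = (diet == 1) & (rng.random(n) < 0.05)
quality[extreme] += 60  # a few outliers only under the diet

X = sm.add_constant(diet)
ols = sm.OLS(quality, X).fit()                              # ordinary least squares
rob = sm.RLM(quality, X, M=sm.robust.norms.HuberT()).fit()  # robust regression

print(f"OLS diet effect:    {ols.params[1]:5.2f}, p = {ols.pvalues[1]:.3f}")
print(f"Robust diet effect: {rob.params[1]:5.2f}, p = {rob.pvalues[1]:.3f}")
```

If the robust estimate is much smaller, this is consistent with Box 1's refined hypothesis of a higher probability of extreme values rather than a shifted mean.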
Exploring within a theory's or model's degrees of freedom

When modifying a model or theory, "turning all the knobs" calls for questioning all the single propositions from which the model or theory is built. A theory could be broken down into its component parts, changed where necessary, as described above, and put back together again to form a modified theory. However, it has been criticised that some theories leave knobs unset in the first place, leaving open how they could turn out to be wrong (Bringmann et al., 2022; Scheel, 2021). Underspecification renders them inaccessible to severe testing when tested as a whole, because turning knobs according to the data improves the theory's overall fit to the data (Eronen & Bringmann, 2021; Fiedler, 2017; Gigerenzer, 2010; Lakatos, 1977; Lakens, 2019; Szollosi & Donkin, 2021). Because they are poorly falsifiable, some theories "are not even wrong" (Scheel, 2021). With transparency in exploration, however, filling the gaps becomes an explicit and desirable purpose (Woo et al., 2017). This is rewarded with publications, and with a completed model or theory that makes specific predictions and thus becomes subject to severe confirmative testing, both as a whole and in its completed parts. Knobs that particularly deserve turning are causal claims in theories that have only been tested as if they were associative (Höfler et al., 2022; Höfler et al., 2021). Another hidden source of need for modification is poor measurement with established but questionable instruments (e.g., Schimmack, 2021).

Exploring to create new claims

Local versus global data patterns

Large-scale studies collect data on many factors and outcomes, such as in the epidemiology of mental disorders (Kessler & Merikangas, 2004), let alone the huge data sets from genetic or imaging studies (Pennycook, 2018; Thompson et al., 2020). With such studies one may find countless associations, and the question arises whether to explore them individually or to summarize them in advance (Hand, 2007).

Imagine one assesses 20 nutrition factors in relation to 10 mental health outcomes. Here, local patterns are associations between specific nutritional factors and specific outcomes. If indicative of causal effects, they might have different implications for science or practice: a theory might suppose that different nutritional factors have very different impacts on various aspects of mental health. Accordingly, interventional effects may depend on which factor is changed to affect which outcome. For example, the absence of alcohol consumption might have a different impact on social well-being than a vegan diet has on personal growth. On the other hand, the 20 factors and 10 outcomes could be manifestations of just a few latent variables, which might explain why a certain set of associations can be found. In this case, one may focus on the global pattern of associations, for instance the relation between healthy nutrition and overall mental well-being. Such a focus has been used, for example, to hypothesise about the relationships between psychopathology and neural measures using canonical correlations (Linke et al., 2021). In neuroscience, exploring for global claims has been argued to be more important for insight and prediction (Bzdok & Ioannidis, 2019).
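The global-pattern idea can be sketched with canonical correlation analysis, the method used by Linke et al. (2021) for psychopathology and neural measures. The following Python example (our own synthetic setup with scikit-learn, not the cited study's data) generates 20 factors and 10 outcomes that are linked only through a single shared latent variable.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

# Synthetic data: one latent variable (think "healthy nutrition" vs.
# "overall well-being") generates all factor-outcome associations.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=(n, 1))
factors = latent @ rng.normal(size=(1, 20)) + rng.normal(size=(n, 20))
outcomes = latent @ rng.normal(size=(1, 10)) + rng.normal(size=(n, 10))

cca = CCA(n_components=2).fit(factors, outcomes)
u, v = cca.transform(factors, outcomes)
for k in range(2):
    r = np.corrcoef(u[:, k], v[:, k])[0, 1]
    print(f"Canonical correlation {k + 1}: {r:.2f}")
# A strong first and weak second canonical correlation suggests one
# global pattern rather than 200 separate local associations.
```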
Box 2 illustrates how globally focusing on any association versus locally focusing on specific associations relates to severe testing when statistical tests are used in an explorative manner, and whether one should adjust the α of each test to the number of associations tested.

Box 2: Severe testing of any association versus a particular association if several associations could be found

With 20 factors and 10 outcomes, 200 associations may be tested, each with a level α significance test to separate randomly from non-randomly occurring patterns. Here, α · 200 tests would be expected to yield p < α in the absence of any associations (e.g., Colquhoun, 2014); with α = .05, this equals 10. If one happens to find at least one p < α, the result "any association found" has not been severely probed and hence provides only little evidence for anything truly being there, because there were 200 chances for identifying a pattern. Referring to "any association" puts these tests into the global context of all 200 investigated associations, and from this global perspective, α is inflated (Bender & Lange, 2001). The other possible result, "no association found", would be supported with considerable initial evidence, because it could have been refuted 200 times, especially if the sample is large and thus the β errors of the individual tests are small. The evidential norm, however, may be adjusted for the number of tests: α may be replaced with α/200 in each test (Bonferroni correction). Not doing so has been criticised for undermining trust in some fields of science through spurious results, for example in genome-wide association studies (Jorgensen et al., 2009; Marigorta et al., 2018). The adjustment turns the matter around: now the result "any association" is much more severely probed, but the result "no association" a great deal less severely than before. If background knowledge suggests that local associations are of interest, each association should be tested with a level α test irrespective of the other associations (Bender & Lange, 2001). A statistically significant association has then been probed with a severity of 1 − α, and a statistically non-significant association with a severity of 1 − β (Mayo, 2018).
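The α inflation described in Box 2 is easy to demonstrate by simulation. The Python sketch below (our own illustrative setup: one outcome and 200 unrelated predictors standing in for the 200 factor-outcome tests) counts how often "any association" reaches significance with and without Bonferroni correction.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, n_tests, n_sims, n = 0.05, 200, 2000, 100

any_raw = any_bonf = 0
for _ in range(n_sims):
    x = rng.normal(size=(n, n_tests))  # 200 null predictors
    y = rng.normal(size=n)             # outcome unrelated to all of them
    # Pearson correlations of y with every column of x, vectorized,
    # converted to two-sided p-values via the usual t-statistic.
    xc = (x - x.mean(axis=0)) / x.std(axis=0)
    yc = (y - y.mean()) / y.std()
    r = xc.T @ yc / n
    t = r * np.sqrt((n - 2) / (1 - r**2))
    p = 2 * stats.t.sf(np.abs(t), df=n - 2)
    any_raw += (p < alpha).any()
    any_bonf += (p < alpha / n_tests).any()

print(f"P(any p < alpha):       ~{any_raw / n_sims:.3f}")   # near 1
print(f"P(any p < alpha / 200): ~{any_bonf / n_sims:.3f}")  # near alpha
```

The first probability approaches 1 − (1 − α)^200 ≈ 1, while the Bonferroni-adjusted probability stays near α, mirroring the reversal of severity described in the box.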
Filtering local data patterns

When searching for the new, it may be desirable to choose a large search-space, but a comprehensive exploration carries the risk of many false positive results. This danger is countered by more rigorous filtering, which results in a smaller number of identified patterns. In Figure 2 we illustrate exploration as a process of first choosing an explorative search space as before (Figure 1, search space S1), then filtering the data patterns, and finally creating new claims from the remaining patterns. After choosing search space S1 with a total of 26 patterns, 4 patterns of interest could be identified. Filtering actually identifies 3 patterns, 2 of which are patterns of interest. The efficiency of filtering within S1 can then be described by the proportion of identified patterns of interest among all patterns of interest (2 out of 4) and the proportion of patterns of no interest among all identified patterns (1 out of 3).

Figure 2. The specification of an exploratory search space in step 1 is efficient in that it covers 4 out of 6 patterns of interest, while the vast majority of patterns of no interest are omitted from the outset (Figure 1). In step 2, the 22 patterns of no interest (grey dots) plus the 4 patterns of interest (green dots) within the search space are subjected to filtering, after which 1 pattern of no interest and 2 patterns of interest remain. Finally, claims are derived from these.

Finally, the 3 identified patterns need to be translated into substantive claims. Most simply, and close to the data, one may create 3 separate associational hypotheses. Or, 2 associations may appear substantively similar, like: (1) more chronic stress when crash diets are used, and (2) more chronic stress when diet adherence is exceptionally high. This may give rise to more global hypothesizing: "Extreme attention to healthy nutrition is related to more chronic stress".

What are specific methods to filter, here to move from 26 patterns to perhaps 3? So far in this article, we have solely mentioned the dominant method of statistical tests. Statistical tests are useful to eliminate random patterns, but, with their profound origin as an approach to confirmation and without explicit explorative language, they contribute to the blending of confirmation and exploration. This applies as long as the difference is not made very clear by a statement such as "explorative testing was conducted" (Höfler et al., 2022). Alternatives include confidence intervals, descriptive statistics, data mining and machine learning techniques (Adjerid & Kelley, 2018; Alonso et al., 2018; Romero & Ventura, 2020), Bayesian approaches, and any other method that may happen to be effective.

Individual versus community-driven filtering

The yet more fundamental question when filtering results is who should do it. With individual-driven filtering, as assumed so far, scientists themselves filter their results before coming up with new claims in a publication. The most universal individual filtering method is internal cross-validation (De Rooij & Weeda, 2020; Fleming et al., 2021; Xiong et al., 2020). It can, in principle, be combined with any analytical method. Its key idea is splitting a large data set randomly into n subsets and repeatedly running an exploratory analysis on n − k "training data" subsets while probing its results with the remaining k "test data" subsets (Parvandeh et al., 2020; Xiong et al., 2020). This, however, requires a large total sample size. Most importantly, internal cross-validation grants researchers the freedom to explore beyond a potentially existing plan, or even without a plan, because each pattern found, no matter with which method and with how many analytical options tried, must pass the test data. (This works as long as the entire procedure is not repeated with new randomly created subsets until a striking pattern happens to be seemingly confirmed. This danger is, however, easy to address through transparency about the seed value of the random process that divides the sample into subsamples.)
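One simple variant of this idea can be sketched as follows (a minimal Python illustration with hypothetical data and thresholds; the cited papers describe more elaborate schemes): an association is only kept if it is significant in the training part and again in the held-out part of every fold, with the random split fixed by an explicit seed as recommended above.

```python
import numpy as np
from scipy import stats
from sklearn.model_selection import KFold

rng = np.random.default_rng(2024)
n, n_vars = 400, 50
X = rng.normal(size=(n, n_vars))
y = 0.4 * X[:, 3] + rng.normal(size=n)  # only variable 3 is truly related

kept = set(range(n_vars))
splitter = KFold(n_splits=5, shuffle=True, random_state=2024)  # transparent seed
for train, test in splitter.split(X):
    survivors = set()
    for j in kept:
        _, p_train = stats.pearsonr(X[train, j], y[train])
        _, p_test = stats.pearsonr(X[test, j], y[test])
        if p_train < 0.05 and p_test < 0.05:  # found AND re-found
            survivors.add(j)
    kept &= survivors

print("Variables surviving all folds:", sorted(kept))  # ideally just [3]
```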
By contrast, community-driven filtering relies on the scientific public and is usually implemented through the peer review of publications. Another instance is external cross-validation, where different data sets are used to generate claims and to filter them ("independent replication"; König, 2011). We propose that individually-driven filtering should often precede community-driven filtering, because otherwise rigorously filtered results may receive little attention amidst many published, poorly filtered results.

As an important exception to the previous discussion, each result, for example on diet – mental health associations, might be potentially informative for other researchers. In such cases, particularly with modestly large search-spaces, no filtering at all seems warranted. All associations may then be published in a public repository (Pennycook, 2018; Thompson et al., 2020) so that others are enabled to probe the associations predicted by their causal models (Greenland et al., 2004; Ryan et al., 2019).

Smoothing global data patterns

Background knowledge, however, might suggest that one should focus on global data patterns beforehand instead of analysing local patterns and perhaps aggregating these into global claims later (Hand, 2007). With such knowledge one may decide to summarize observed variables into latent variables before running an analysis, for example by fitting a structural equation model. Or, some entities are known to be more similar than others along a dimension, for example genomic loci along the DNA strand, or brain activations or body cells along their two-dimensional spatial distance. Then it is possible to arrange the observations accordingly and to smooth the data with statistical methods (Farcomeni & Greco, 2016). Smoothing aims to reduce the variation along the dimension, because otherwise every single point along the dimension is subject to individually occurring random error, potentially hiding the overall pattern of interest or "latent structure". Such smoothing serves to "clean up" the data in the first place (Greenland, 2006).

Consider the example of epigenetic responses to stress exposure across genomic loci. One may explore the variation of the response locally, locus by locus, and thus allow it to vary freely. This preserves all the patterns in the data, but many of those will just be noise, the result of random error, and the background knowledge that two gene loci are more associated with an outcome the closer they are spatially is ignored (Jaffe et al., 2012). Figure 3 shows a fictive example, in which the outcome Y, stress response, varies along the genomic loci's relative spatial location X (for illustration one-dimensional and scaled from 0 to 100). The red line displays how Y truly varies across location X according to the function Y = sin(√X) · 10X. We assume that other factors contribute to Y through a normally distributed error with expectation = 0 and standard deviation = 500. For smoothing, we use polynomial splines, a technique of non-parametric regression (Takezawa, 2005) that controls the extent of smoothing through the degree of a polynomial (command twoway lpoly in Stata 15.2; StataCorp, 2017).
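The simulation can be reproduced outside Stata. The following Python sketch mimics it, with global polynomial fits of increasing flexibility standing in for the different degrees of local polynomial smoothing in Figure 3 (degrees and seed are our own choices):

```python
import numpy as np
from numpy.polynomial import Polynomial

# Fictive data as in Figure 3: true structure plus heavy noise.
rng = np.random.default_rng(7)
x = np.linspace(0, 100, 50)
y_true = np.sin(np.sqrt(x)) * 10 * x
y = y_true + rng.normal(0, 500, size=x.size)

for degree, label in [(1, "over-smoothing (linear)"),
                      (6, "adequate smoothing"),
                      (20, "insufficient smoothing")]:
    fit = Polynomial.fit(x, y, deg=degree)(x)
    rmse = np.sqrt(np.mean((fit - y_true) ** 2))
    print(f"degree {degree:2d}, {label:26s}: RMSE to true curve = {rmse:6.1f}")
```

Typically, the moderate fit recovers the true curve best, while both the linear fit (underfitting) and the very flexible fit (chasing noise, overfitting) deviate more, matching panels (b) through (d).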
Figure 3a shows the random pattern that emerges if no smoothing is done and only local patterns are investigated. Here, the patterns are the spikes that represent outcome values. The height of each spike is a separately estimated parameter. These many estimates (here 50) may carry over poorly to new data; that is, overfitting is likely. The spikes might be used to generate a set of individual hypotheses while neglecting the spatial dependency. With luck, such a dependency might emerge with spikes fairly close to the true structure. We suspect, however, that such luck will rarely occur. With moderate smoothing (Figure 3b) we are able to identify a rough course and might hypothesise that the outcome is highest if X ranges between 40 and 70. Stronger smoothing (Figure 3c) results in a fairly good fit to the true function and allows further hypothesising of a local minimum around X = 25. However, if too much smoothing is applied (Figure 3d, linear approximation), underfitting occurs: the latent structure cannot be described with only two parameters, and core features like the peak in the range of 40-70 are overlooked.

Figure 3. The figure illustrates fictive data where an outcome Y varies across genomic loci (X) along a spatial position. The true Y-X relation (red line) equals Y = sin(√X) · 10X, and deviations from it arise from random error in a sample of n = 50 (normally distributed with expectation = 0 and standard deviation = 500). Plot (a) shows the results (blue peaks) if no smoothing is done; plots (b) through (d) apply different levels of smoothing, from insufficient smoothing (b) and adequate smoothing (c) to over-smoothing (d).

Smoothing may reveal a new hypothesis like "stress response has its genetic basis in the range 40-70", or it may only be an intermediate step (Greenland, 2006), with the smoothed structure (the blue curve in the example) being further analysed, e.g., in relation to factors that might influence stress response across location. For example, particularly high peaks in the 40-70 range in individuals with negative childhood events might indicate that genes in this range are activated more strongly in these individuals. Several methods to smooth psychological data are common, albeit not under the label of smoothing. Table 1 summarises some of them and lists the "smoothing parameters" that regulate the degree of smoothing. Much elaboration, however, is required for sound guidance on how to apply such methods for exploration: whether they do the right smoothing to the appropriate extent to efficiently stimulate new claims in a specific research domain. As general advice, the better a field is already understood, the more the data may be smoothed.

Table 1. Some methods for statistical smoothing, their search spaces, and the parameters via which they smooth.

Non-parametric regression. Search space: functions that describe an X-Y association or the associations of several X with Y. Smoothing parameter(s): e.g., the degree of a polynomial (local polynomial smoothing).

Regularisation methods in regression with many predictors (lasso, elastic net regression, etc.). Search space: estimates of regression parameters. Smoothing parameter(s): e.g., the sum of the regression coefficients, besides the intercept (lasso).

Exploratory factor analysis. Search space: latent dimensions and their loadings on observed items. Smoothing parameter(s): number of latent dimensions and choice of rotation method.

Cluster analysis, latent mixture models. Search space: possible clusters of individuals that are homogeneous within but heterogeneous between. Smoothing parameter(s): number of clusters.

Canonical correlation analysis. Search space: linear combinations of factors and outcomes. Smoothing parameter(s): number of latent dimensions behind a set of factors and number of latent dimensions behind a set of outcomes.
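As a sketch of the regularisation row in Table 1, consider scikit-learn's lasso, where the penalty weight (called alpha in the library, not to be confused with the test level α) acts as the smoothing parameter: the stronger the penalty, the more coefficients are shrunk to exactly zero. Data and numbers below are synthetic, for illustration only.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic regression with 30 predictors, only 3 of which matter.
rng = np.random.default_rng(3)
n, n_pred = 200, 30
X = rng.normal(size=(n, n_pred))
beta = np.zeros(n_pred)
beta[:3] = [2.0, -1.5, 1.0]
y = X @ beta + rng.normal(size=n)

for penalty in [0.01, 0.1, 1.0]:  # weak -> strong smoothing
    model = Lasso(alpha=penalty).fit(X, y)
    nonzero = int(np.sum(model.coef_ != 0))
    print(f"penalty {penalty:4.2f}: {nonzero:2d} non-zero coefficients")
```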
Planning exploration and transparency on how one has explored

After outlining the goals of comprehensiveness and efficiency and some basic ideas on how to explore data, we are equipped to discuss the possibility of planning an explorative quest. We suggest that, if the sample size does not allow for internal cross-validation, a well-underpinned plan may render data exploration more efficient in identifying patterns of interest. The argument is that a planned exploration may be more focused and therefore require less analysis. Identified patterns might in turn be supported by more initial evidence (Höfler et al., 2022; Simmons et al., 2011). Note that this is a heuristic argument, because severity depends on which exact analyses are conducted. However, the following strict statement can be made: the severity with which a pattern has been "pre-tested" becomes smaller if additional statistical tests are conducted (α becomes greater with each additional test) or if any additional filtering has been done.

If there is a plan, it should be transparent, that is, made public, to enable researchers to "take credit" for it (Wagenmakers & Dutilh, 2016) when publishing the results that it generates. Also without a plan, we suggest that transparency beyond the obligatory distinction between exploration and confirmation is crucial for scientific communication. Otherwise, intransparency about how data have been explored could hide some exploratory steps. Readers may then be misled about how promising confirmation attempts are. To give an extreme example, an association might appear present or absent, small or large, positive or negative, just by picking a narrowly defined subsample in which a relation might be claimed (e.g., Vul et al., 2009). If many subsamples have been tried, it may be unlikely that the association will be found again with new data. Box 3 summarizes how some measures inform about what has actually been done and how much initial evidence there is.

Box 3: Transparency measures for what and how much exploration has been conducted

• Preregistration
Preregistering the pure intention that exploration is to be carried out counteracts later false assertions of confirmation for results that have actually been obtained by exploration (heard from Eric-Jan Wagenmakers in a 2020 talk). If there is a plan on how to explore, it should also be preregistered, so that one can later show that one had this plan. Changes to a plan might be necessary for various reasons when enjoying the dynamics of digging into data. These can be transparently recorded with an audit trail (version management) system such as Git (Chacon & Straub, 2014).

• Open data
Access to data (Isbell, 2021), preferably to the raw data (Arribas-Bel et al., 2021; Nikiforova, 2020; Wilkinson et al., 2016), allows researchers to reproduce found patterns or the problems in the data that have made changes to a plan necessary.
Researchers can also try their own analyses to see if they come up with the same result (Shahin et al., 2020). If such an analysis (preferably done by independent re-analysers, e.g., Silberzahn et al., 2018) identifies the same pattern, the initial evidence for the claim is larger, because the alternative analysis might have failed to identify it (e.g., an association that is also found when using a statistical method that is more robust against irregularities in the data, such as non-normally distributed residuals; Field and Wilcox, 2017). To ease open access to data, several publishers have recently started to offer purpose-designed, peer-reviewed, and citable journal contribution templates that allow for the publication of data sets (a collection is provided by "Data Journals – Forschungsdaten.org", 2022).

• Open analysis
Open analysis generates transparency about which analyses have actually been done, through access to the complete syntax used and the results it has generated (van Dijk et al., 2021). Together with open data, it serves to reproduce a whole explorative quest. Automatic documentation ensures that no analyses with perhaps unfavourable results are concealed. Powerful software packages that store an entire analytical workflow have been developed for this purpose (Peikert & Brandmaier, 2021; Van Lissa et al., 2020; Wratten et al., 2021), as well as notebooks customised for it (Beg et al., 2021).

Further research agenda for exploration: where to explore, what and how to explore

The following proposals summarize the ideas from the first (Höfler et al., 2022) and this second article on exploration. They are likely to have highly context-dependent answers and are therefore intended for separate consideration across the many fields and research quests of psychology. Beyond them, we invite researchers to probe the conceptions of this paper with their own explorative quests. This opens the probably most promising avenue for refinement.

1. Reconsider established evidential norms for confirmation. More severe tests may be required for a claim to move from a new claim to an established claim. In particular, specify which alternative explanations (e.g., bias) must be probed against.

2. Identify hypotheses, models and theories that might benefit from exploring around them.

3. Identify little understood domains. Specify where key features might be found and with which methods of measuring and analysing.

4. Decide what new data should be collected or what existing data should be explored. What are promising search-spaces? Are global or local claims of greater interest?

5. Identify gaps in theories that should be filled by exploration so that they become complete and severely testable. Also identify poorly probed components of theories that might benefit from explorative quests for modifications.

6. Methodically elaborate on the efficiency of methods for filtering and smoothing. Consider the applicability of exploratory methods used outside psychology, for example those recommended for exploring the huge amount of medical data in the UK Biobank (2022). Formal elaboration of these concepts should be helpful: mathematical rigour probes for stringency and may indicate needs for change and gaps that have to be filled.

7. Use open science measures for transparency on how exploration was carried out and how much initial evidence may exist for identified patterns.
Recommendations to stakeholders

We end with a list of recommendations for stakeholders, including journal editors, peer-reviewers and funding agencies. These three groups have the largest means for change if they cooperate in addressing the following points. We suggest in general that funding agencies should provide financial incentives for explorative quests, public repositories and methodical elaboration. Editors should offer space and define rules that promote transparent exploration of high quality. Reviewers should control these issues. Open review seems preferable, because it creates transparency in the control process. Specific recommendations are:

1. Mandatory separation between tested versus new hypotheses (Gigerenzer, 2018), already listed in the abstract of an article.

2. Create new journal sections for exploration papers and reserve space for these (McIntosh, 2017; Thompson et al., 2020). Perhaps fund entire exploration journals, as the publisher Open Exploration did with its four medical journals (Open Exploration Publishing, 2021).

3. Use editorials to mention gaps in theories (Lakens, 2019) that could be filled by exploration (Woo et al., 2017).

4. "Place exploratory analyses (regardless of the outcome) on citable public repositories" (Pennycook, 2018). Funding agencies are requested to create more space and to fund such studies to inform other researchers (Thompson et al., 2020) with results suitable to test or feed theories (Greenland et al., 2004).

5. The common convention that every publication must have an introduction and a discussion section may be questioned. A purely exploratory publication, for example on a range of somehow plausible potential risk factors for a disease, does not necessarily require an introduction (it would merely list weak justifications and have little space to describe the theoretical background for analysing each of the many investigated factors). The same applies to the discussion section; a deeper discussion may be better placed in a paper format that discusses the results from several studies and their impact on theory building, interventions and public health (Greenland et al., 2004). Publications on only grossly justified observational data with association results (e.g., short-term planned Covid-19 research) appear most useful if they just describe the methods and report the results (Greenland et al., 2004).

Conclusion

Science has been argued to have made its biggest discoveries through chance (Gaughan, 2010; Roberts, 1989), but maybe chance can be prompted by providing scientists with means to valuable exploration. Psychology seems to have a particularly large potential here. Also, scientific communication could highly benefit from considering exploratory findings not as established knowledge, but as pure suggestions on the rocky path from data to truth that invite one to walk on without knowing where one will arrive. Teaching some basic insights, like how valuable exploration and true confirmation benefit from one another, might help, at least in the long run, when those who are now taught are ready to conceptualise their own studies. Probably almost every reader has been taught statistics and methods with a nearly exclusive focus on confirmation. Once a new generation of two-trail scientists emerges, it might come up with powerful ways of cooperative exploration that our generation is incapable of imagining because of our confirmatory priming.
We wish to conclude with the admittedly emotional remark that the necessity of writing these two articles on the value of exploration in science has felt somewhat strange. The self-evidence of this value should be reason enough to engage in strict confirmation and transparent exploration and, in turn, to look forward to a science, we believe, thus enriched.

Author Contact

Michael Höfler, Chemnitzer Straße 46, Clinical Psychology and Behavioural Neuroscience, Institute of Clinical Psychology and Psychotherapy, Technische Universität Dresden, 01187 Dresden, Germany. michael.hoefler@tu-dresden.de, +49 351 463 36921. ORCID: https://orcid.org/0000-0001-7646-8265

Acknowledgements: We thank Annekathrin Rätsch for aid with the references.

Conflict of Interest and Funding

Robert Miller is an employee of Pfizer Pharma GmbH. The authors declare that there were no conflicts of interest with respect to the authorship or the publication of this article. Philipp Kanske is supported by the German Research Foundation (KA4412/2-1, KA4412/4-1, KA4412/5-1, KA4412/9-1, CRC940/C07).

Author Contributions

Michael Höfler had the lead in developing the conceptions and the writing. Brennan McDonald contributed epistemic details and was involved in the writing and wording of the entire manuscript. Philipp Kanske commented on and edited the manuscript. Robert Miller contributed methodological aspects and reviewed and edited the manuscript.

Open Science Practices

This article earned the Open Materials badge for making the materials openly available. It has been verified that the analysis reproduced the results presented in the article. The entire editorial process, including the open reviews, is published in the online supplement.

References

Adjerid, I., & Kelley, K. (2018). Big data in psychology: A framework for research advancement. American Psychologist, 73(7), 899–917. https://doi.org/10.1037/amp0000190

Alonso, S. G., de la Torre-Díez, I., Hamrioui, S., López-Coronado, M., Calvo Barreno, D., Morón Nozaleda, L., & Franco, M. (2018). Data mining algorithms and techniques in mental health: A planned review. Journal of Medical Systems, 42, 161. https://doi.org/10.1007/s10916-018-1018-2

Arribas-Bel, D., Green, M., Rowe, F., & Singleton, A. (2021). Open data products: A framework for creating valuable analysis ready data. Journal of Geographical Systems, 23, 497–514. https://doi.org/10.1007/s10109-021-00363-5

Beg, M., Taka, J., Kluyver, T., Konovalov, A., Ragan-Kelley, M., Thiery, N. M., & Fangohr, H. (2021). Using Jupyter for reproducible scientific workflows. Computing in Science & Engineering, 23(2), 36–46. https://doi.org/10.1109/MCSE.2021.3052101

Behrens, J. T. (1997). Principles and procedures of exploratory data analysis. Psychological Methods, 2(2), 131–160. https://doi.org/10.1037/1082-989X.2.2.131

Bender, R., & Lange, S. (2001). Adjusting for multiple testing — when and how? Journal of Clinical Epidemiology, 54(4), 343–349. https://doi.org/10.1016/s0895-4356(00)00314-0

Bogen, J., & Woodward, J. (1988). Saving the phenomena. Philosophical Review, 97, 303–352.

Box, G. E. P. (1976). Science and statistics. Journal of the American Statistical Association, 71(356), 791–799. https://doi.org/10.1080/01621459.1976.10480949

Box, G. E. P. (1980). Sampling and Bayes inference in scientific modelling and robustness (with discussion and rejoinder). Journal of the Royal Statistical Society, Series A, 143, 383–430.
Bringmann, L. F., Elmer, T., & Eronen, M. I. (2022). Back to basics: The importance of conceptual clarification in psychological science. Current Directions in Psychological Science, 31(4), 340–346. https://doi.org/10.1177/09637214221096485

Bzdok, D., & Ioannidis, J. P. A. (2019). Exploration, inference, and prediction in neuroscience and biomedicine. Trends in Neurosciences, 42(4), 251–262. https://doi.org/10.1016/j.tins.2019.02.001

Chacon, S., & Straub, B. (2014). Pro Git. Apress.

Colquhoun, D. (2014). An investigation of the false discovery rate and the misinterpretation of p-values. Royal Society Open Science, 1, 140216. https://doi.org/10.1098/rsos.140216

Data Journals – Forschungsdaten.org. (2022).

De Rooij, M., & Weeda, W. (2020). Cross-validation: A method every psychologist should know. Advances in Methods and Practices in Psychological Science, 3(2), 248–263. https://doi.org/10.1177/2515245919898466

Del Giudice, M., & Gangestad, S. W. (2021). A traveler's guide to the multiverse: Promises, pitfalls, and a framework for the evaluation of analytic decisions. Advances in Methods and Practices in Psychological Science, 4(1). https://doi.org/10.1177/2515245920954925

Dirnagl, U. (2020). Preregistration of exploratory research: Learning from the golden age of discovery. PLOS Biology, 18(3), e3000690. https://doi.org/10.1371/journal.pbio.3000690

Elhai, J. D., & Montag, C. (2020). The compatibility of theoretical frameworks with machine learning analyses in psychological research. Current Opinion in Psychology, 36, 83–88. https://doi.org/10.1016/j.copsyc.2020.05.002

Erceg-Hurn, D. M., & Mirosevich, V. M. (2008). Modern robust statistical methods: An easy way to maximize the accuracy and power of your research. American Psychologist, 63(7), 591–601. https://doi.org/10.1037/0003-066X.63.7.591

Eronen, M. I., & Bringmann, L. F. (2021). The theory crisis in psychology: How to move forward. Perspectives on Psychological Science, 16(4), 779–788.

Farcomeni, A., & Greco, L. (2016). Robust methods for data reduction. Chapman & Hall/CRC. https://doi.org/10.1201/b18358

Fiedler, K. (2017). What constitutes strong psychological science? The (neglected) role of diagnosticity and a priori theorizing. Perspectives on Psychological Science, 12(1), 46–61. https://doi.org/10.1177/1745691616654458
Field, A. P., & Wilcox, R. R. (2017). Robust statistical methods: A primer for clinical psychology and experimental psychopathology researchers. Behaviour Research and Therapy, 98, 19–38. https://doi.org/10.1016/j.brat.2017.05.013

Fleming, J. I., Wilson, S. E., Hart, S. A., Therrien, W. J., & Cook, B. G. (2021). Open accessibility in education research: Enhancing the credibility, equity, impact, and efficiency of research. Educational Psychologist, 56(2), 110–121. https://doi.org/10.1080/00461520.2021.1897593

Gaughan, R. (2010). Accidental genius: The world's greatest by-chance discoveries. Metro Books.

Gelman, A., Haig, B., Hennig, C., Owen, A., Cousins, R., Young, S., Robert, C., Yanofsky, C., Wagenmakers, E. J., Kenett, R., & Lakeland, D. (2019). Many perspectives on Deborah Mayo's "Statistical inference as severe testing: How to get beyond the statistics wars". Retrieved November 2, 2021, from http://www.stat.columbia.edu/~gelman/research/unpublished/mayo_reviews_2.pdf

Gigerenzer, G. (2010). Personal reflections on theory and psychology. Theory & Psychology, 20(6), 733–743. https://doi.org/10.1177/0959354310378184

Gigerenzer, G. (2018). Statistical rituals: The replication delusion and how we got there. Advances in Methods and Practices in Psychological Science, 1(2), 198–218. https://doi.org/10.1177/2515245918771329

Glymour, C., Zhang, K., & Spirtes, P. (2019). Review of causal discovery methods based on graphical models. Frontiers in Genetics, 10, 524. https://doi.org/10.3389/fgene.2019.00524

Greenland, S. (2006). Smoothing observational data: A philosophy and implementation for the health sciences. International Statistical Review, 74, 31–46. https://doi.org/10.1111/j.1751-5823.2006.tb00159.x

Greenland, S. (2017). Invited commentary: The need for cognitive science in methodology. American Journal of Epidemiology, 186(6), 639–645. https://doi.org/10.1093/aje/kwx259

Greenland, S., Gago-Dominguez, M., & Castelao, J. E. (2004). The value of risk-factor ("black-box") epidemiology. Epidemiology, 15(5), 529–535. https://doi.org/10.1097/01.ede.0000134867.12896.23

Hand, D. J. (2007). Principles of data mining. Drug Safety, 30(7), 621–622. https://doi.org/10.2165/00002018-200730070-00010

Hernán, M. A. (2018). The C-word: Scientific euphemisms do not improve causal inference from observational data. American Journal of Public Health, 108(5), 616–619. https://doi.org/10.2105/AJPH.2018.304337

Höfler, M., Scherbaum, S., Kanske, P., McDonald, B., & Miller, R. (2022). Means to valuable exploration I: The blending of confirmation and exploration and how to resolve it. Meta-Psychology, 2(6). https://doi.org/10.15626/MP.2021.2837

Höfler, M., Trautmann, S., & Kanske, P. (2021). Qualitative approximations to causality: Non-randomizable factors in clinical psychology. Clinical Psychology in Europe, 3(2), e3873. https://doi.org/10.32872/cpe.3873

Hofstadter, D. R., & Dennett, D. C. (1981). The mind's I: Fantasies and reflections on self and soul. Basic Books.

Hollenbeck, J. R., & Wright, P. M. (2017). Harking, sharking, and tharking: Making the case for post hoc analysis of scientific data. Journal of Management, 43(1), 5–18. https://doi.org/10.1177/0149206316679487

Huber, P. J. (1981). Robust statistics. John Wiley & Sons, Inc.

Newman, I., & Benz, C. R. (1998). Qualitative-quantitative research methodology: Exploring the interactive continuum. Southern Illinois University Press.
Isbell, D. R. (2021). Open science, data analysis, and data sharing. Open Science Framework Preprint. https://doi.org/10.31219/osf.io/pdj9y

Jaffe, A. E., Murakami, P., Lee, H., Leek, J. T., Fallin, M. D., Feinberg, A. P., & Irizarry, R. A. (2012). Bump hunting to identify differentially methylated regions in epigenetic epidemiology studies. International Journal of Epidemiology, 41(1), 200–209. https://doi.org/10.1093/ije/dyr238

Job, V., Dweck, C. S., & Walton, G. M. (2010). Ego depletion — is it all in your head? Implicit theories about willpower affect self-regulation. Psychological Science, 21(11), 1686–1693. https://doi.org/10.1177/0956797610384745

Jorgensen, T. J., Ruczinski, I., Kessing, B., Smith, M. W., Shugart, Y. Y., & Alberg, A. J. (2009). Hypothesis-driven candidate gene association studies: Practical design and analytical considerations. American Journal of Epidemiology, 170(8), 986–993. https://doi.org/10.1093/aje/kwp242

Kassis, A., & Papps, F. A. (2020). Integrating complementary and alternative therapies into professional psychological practice: An exploration of practitioners' perceptions of benefits and barriers. Complementary Therapies in Clinical Practice, 41, 101238. https://doi.org/10.1016/j.ctcp.2020.101238

Kessler, R. C., & Merikangas, K. R. (2004). The National Comorbidity Survey Replication (NCS-R): Background and aims. International Journal of Methods in Psychiatric Research, 13(2), 60–68. https://doi.org/10.1002/mpr.166

Kirk, R. E. (1996). Practical significance: A concept whose time has come. Educational and Psychological Measurement, 56(5), 746–759. https://doi.org/10.1177/0013164496056005002

König, I. R. (2011). Validation in genetic association studies. Briefings in Bioinformatics, 12(3), 253–258. https://doi.org/10.1093/bib/bbq074
Lakatos, I. (1977). The methodology of scientific research programmes: Philosophical papers Volume 1. Cambridge University Press.
Lakens, D. (2019). The value of preregistration for psychological science: A conceptual analysis [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/jbh4w
Linke, J. O., Abend, R., Kircanski, K., Clayton, M., Stavish, C., et al. (2021). Shared and anxiety-specific pediatric psychopathology dimensions manifest distributed neural correlates. Biological Psychiatry, 89(6), 579–587. https://doi.org/10.1016/j.biopsych.2020.10.018
Lurquin, J. H., & Miyake, A. (2017). Challenges to ego-depletion research go beyond the replication crisis: A need for tackling the conceptual crisis. Frontiers in Psychology, 8, 568. https://doi.org/10.3389/fpsyg.2017.00568
Manuti, A., & Giancaspro, M. L. (2019). People make the difference: An explorative study on the relationship between organizational practices, employees' resources, and organizational behavior enhancing the psychology of sustainability and sustainable development. Sustainability, 11(5), 1499. https://doi.org/10.3390/su11051499
Marigorta, U. M., Rodríguez, J. A., Gibson, G., & Navarro, A. (2018). Replicability and prediction: Lessons and challenges from GWAS. Trends in Genetics, 34(7), 504–517. https://doi.org/10.1016/j.tig.2018.03.005
Martins, L. B., Braga Tibães, J. R., Sanches, M., Jacka, F., Berk, M., & Teixeira, A. L. (2021). Nutrition-based interventions for mood disorders. Expert Review of Neurotherapeutics, 21(3), 303–315. https://doi.org/10.1080/14737175.2021.1881482
Masur, P. K., & Scharkow, M. (2020). specr: Conducting and visualizing specification curve analyses [R package].
Mayo, D. G. (2018). Statistical inference as severe testing: How to get beyond the statistics wars. Cambridge University Press. https://doi.org/10.1017/9781107286184
McIntosh, R. D. (2017). Exploratory reports: A new article type for Cortex. Cortex, 96, A1–A4. https://doi.org/10.1016/j.cortex.2017.07.014
Moghaddam, F. M. (2004). From 'psychology in literature' to 'psychology is literature': An exploration of boundaries and relationships. Theory & Psychology, 14(4), 505–525. https://doi.org/10.1177/0959354304044922
Nguyen, S. H. (2000). Regularity analysis and its applications in data mining. In L. Polkowski, S. Tsumoto, & T. Y. Lin (Eds.), Rough set methods and applications (pp. 289–378). Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-1840-6_7
Nikiforova, A. (2020). Comparative analysis of national open data portals or whether your portal is ready to bring benefits from open data. IADIS International Conference on ICT, Society and Human Beings.
Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The preregistration revolution. Proceedings of the National Academy of Sciences, 115(11), 2600–2606. https://doi.org/10.1073/pnas.1708274114
Parvandeh, S., Yeh, H. W., Paulus, M. P., & McKinney, B. A. (2020). Consensus features nested cross-validation. Bioinformatics, 36(10), 3093–3098. https://doi.org/10.1093/bioinformatics/btaa046
Peikert, A., & Brandmaier, A. M. (2021). A reproducible data analysis workflow with R Markdown, Git, Make, and Docker. Quantitative and Computational Methods in Behavioral Sciences, 1, e3763. https://doi.org/10.5964/qcmb.3763
Pennycook, G. (2018). You are not your data. Behavioral and Brain Sciences, 41. https://doi.org/10.1017/S0140525X1800081X
Popper, K. (1959). The logic of scientific discovery. Basic Books.
Open Exploration Publishing. (2021). https://www.explorationpub.com [Accessed: 2021-01-13].
Roberts, R. M. (1989). Serendipity: Accidental discoveries in science. John Wiley & Sons.
Romero, C., & Ventura, S. (2020). Educational data mining and learning analytics: An updated survey. WIREs Data Mining and Knowledge Discovery, 10(3). https://doi.org/10.1002/widm.1355
Rubin, M., & Donkin, C. (2022). Exploratory hypothesis tests can be more compelling than confirmatory hypothesis tests. Philosophical Psychology. https://doi.org/10.1080/09515089.2022.2113771
Ryan, O., Bringmann, L. F., & Schuurman, N. K. (2019). The challenge of generating causal hypotheses using network models [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/ryg69
Scheel, A. M. (2021). Why most psychological research findings are not even wrong [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/8w2sd
Schimmack, U. (2021). The Implicit Association Test: A method in search of a construct. Perspectives on Psychological Science, 16(2), 396–414. https://doi.org/10.1177/1745691619863798
Shahin, M. H., Bhattacharya, S., Silva, D., Kim, S., Burton, J., Podichetty, J., Romero, K., & Conrado, D. J. (2020). Open data revolution in clinical research: Opportunities and challenges. Clinical and Translational Science, 13(4), 665–674. https://doi.org/10.1111/cts.12756
Silberzahn, R., Uhlmann, E. L., Martin, D. P., Anselmi, P., Aust, F., Awtrey, E., et al. (2018). Many analysts, one data set: Making transparent how variations in analytic choices affect results. Advances in Methods and Practices in Psychological Science, 1(3), 337–356. https://doi.org/10.1177/2515245917747646
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant.
Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632
Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2020). Specification curve analysis. Nature Human Behaviour. https://doi.org/10.1038/s41562-020-0912-z
Skrondal, A., & Rabe-Hesketh, S. (2004). Generalized latent variable modeling: Multilevel, longitudinal, and structural equation models. Chapman & Hall/CRC.
Sohmer, O. R. (2020). An exploration of the value of cooperative inquiry for transpersonal psychology, education, and research: A theoretical and qualitative inquiry (Doctoral dissertation). California Institute of Integral Studies. https://search.proquest.com/docview/2464456670
Stebbins, R. A. (1992). Concatenated exploration: Notes on a neglected type of longitudinal research. Quality & Quantity, 26, 435–442. https://doi.org/10.1007/BF00170454
Stebbins, R. A. (2001). Exploratory research in the social sciences. Sage Publications. https://doi.org/10.4135/9781412984249
Stebbins, R. A. (2006). Concatenated exploration: Aiding theoretic memory by planning well for the future. Journal of Contemporary Ethnography, 35(5), 483–494. https://doi.org/10.1177/0891241606286989
Steegen, S., Tuerlinckx, F., Gelman, A., & Vanpaemel, W. (2016). Increasing transparency through a multiverse analysis. Perspectives on Psychological Science, 11(5), 702–712. https://doi.org/10.1177/1745691616658637
Suppes, P. (1969). Models of data. In E. Nagel, P. Suppes, & A. Tarski (Eds.), Logic, methodology, and philosophy of science: Proceedings of the 1960 international congress (pp. 252–261). Stanford University Press.
Swedberg, R. (2018). On the uses of exploratory research and exploratory studies in social science [Retrieved October 14, 2020].
Szollosi, A., & Donkin, C. (2021). Arrested theory development: The misguided distinction between exploratory and confirmatory research. Perspectives on Psychological Science, 16, 717–724. https://doi.org/10.1177/1745691620966796
Takezawa, K. (2005). Introduction to nonparametric regression. John Wiley & Sons. https://doi.org/10.1002/0471771457
Thompson, W. H., Wright, J., & Bissett, P. G. (2020). Point of view: Open exploration. eLife, 9. https://doi.org/10.7554/eLife.52157
Van Lissa, C. J., Brandmaier, A. M., Brinkman, L., Lamprecht, A.-L., Peikert, A., Struiksma, M. E., & Vreede, B. (2020). WORCS: A workflow for open reproducible code in science. Data Science, 4(1), 29–49. https://doi.org/10.3233/DS-210031
van Dijk, W., Schatschneider, C., & Hart, S. A. (2021). Open science in education sciences. Journal of Learning Disabilities, 54(2), 139–152. https://doi.org/10.1177/0022219420945267
Vul, E., Harris, C., Winkielman, P., & Pashler, H. (2009). Puzzlingly high correlations in fMRI studies of emotion, personality, and social cognition. Perspectives on Psychological Science, 4(3), 274–290. https://doi.org/10.1111/j.1745-6924.2009.01125.x
Wagenmakers, E.-J., & Dutilh, G. (2016). Seven selfish reasons for preregistration. APS Observer, 29(9). https://www.psychologicalscience.org/observer/seven-selfish-reasons-for-preregistration
Wilcox, R. R. (2012). Introduction to robust estimation and hypothesis testing. Academic Press.
Wilkinson, M. D., Dumontier, M., Aalbersberg, I. J., Appleton, G., Axton, M., Baak, A., et al. (2016). The FAIR guiding principles for scientific data management and stewardship. Scientific Data, 3(1), 160018. https://doi.org/10.1038/sdata.2016.18
Williams, M. N. (2021). Levels of measurement and statistical analyses. Meta-Psychology, 5. https://doi.org/10.15626/MP.2019.1916
Woo, S. E., O'Boyle, E. H., & Spector, P. E. (2017). Best practices in developing, conducting, and evaluating inductive research [Editorial]. Human Resource Management Review, 27(2), 255–264. https://doi.org/10.1016/j.hrmr.2016.08.004
Wratten, L., Wilm, A., & Göke, J. (2021). Reproducible, scalable, and shareable analysis pipelines with bioinformatics workflow managers. Nature Methods, 18, 1161–1168. https://doi.org/10.1038/s41592-021-01254-9
Xiong, Z., Chen, Y., Li, Z., & Zhao, Y. (2020). Evaluating explorative prediction power of machine learning algorithms for materials discovery using k-fold forward cross-validation. Computational Materials Science, 171, 109203. https://doi.org/10.1016/j.commatsci.2019.109203