International Journal of Interactive Mobile Technologies (iJIM) – eISSN: 1865-7923 – Vol. 16, No. 07, 2022

An Experimental Investigation of ‘Drill-and-Practice’ Mobile Apps and Young Children

https://doi.org/10.3991/ijim.v16i07.27893

Christothea Herodotou1(*), Chrysoula Mangafa2, Pinsuda Srisontisuk1
1The Open University, Milton Keynes, UK
2Metropolitan College, Rhodes campus, Greece
christothea.herodotou@open.ac.uk

Abstract: The choice of mobile applications (apps) for learning has relied heavily on customer and teacher reviews, designers’ descriptions, and alignment with existing learning and human-computer interaction theories. There is limited empirical evidence on the educational value of mobile apps as they are used by children. Understanding the impact of mobile apps on young children’s learning is timely given the lack of evidence-based recommendations that could guide parents and teachers in selecting apps for their children. In this paper, we present the results of a series of Randomised Control Trials (RCTs) with 376 children aged 5 to 6 years old who interacted with two maths apps in three schools in the UK. Pre/post-test comparisons revealed learning gains in both the control and intervention groups, suggesting that the selected applications are as good as standard maths practice. Implications for the selection and use of mobile apps are discussed.

Keywords: mobile apps, maths, learning, children, RCTs

1 Introduction

A plethora of mobile applications (apps), often labelled as ‘educational’, targets young children. The mobile and tactile nature of apps facilitates a great degree of independence, enabling young children, or even toddlers and infants, to interact with them easily [1], [2].
Yet, the listing of an app in the education category of an app store does not necessarily mean that the app has educational value [3], or that it has been tested with children and shown to promote learning. Technical constraints such as a lack of resources (e.g., time, money) may inhibit app evaluation [4] or, in other cases, educational technology experts such as instructional designers are not involved in the design process to ensure that effective pedagogical practices and relevant game mechanics are considered. Review ratings by customers, teachers or designers may not be particularly helpful either, as they often overstate information or assess aspects not directly related to the educational quality of an app [5]. App stores have been found to lack the transparency and information needed to assess the quality of apps [6].

In addition to the use of review ratings, a top-down approach structured on existing learning and human-computer interaction theories, such as cooperative learning, human motivation, and usability design [7], has been applied to assess the quality of mobile apps. Amongst the design parameters used are: visual design and how this affects visual attention and perception, adaptability and accessibility, usability and simplicity of interaction, sound effects, verbal communication, organisational design, navigation and screen consistency (e.g., [8]). Evidence suggests that good quality apps support specific learning goals and promote active, engaged, meaningful, and socially interactive learning.
Such apps present features such as explicit instruction, repetitive and cumulative training in learning concepts, immediate feedback, challenge and early reward, and individualized, self-paced learning (self-regulation and control) (e.g., [3], [4]). Although such frameworks can provide valuable insights as to which apps or specific design features facilitate learning and development, they do not engage children in the process of research and evaluation in a participatory manner. Such engagement would help designers improve the design of apps by identifying issues such as whether and what certain children learn from interacting with apps, whether and how educational outcomes align with designers’ intentions, and which design aspects are challenging and potentially hindering learning processes and engagement.

A few studies have examined the impact or effectiveness of mobile apps on children’s learning and development using robust methodological approaches (e.g., [9], [10], [11]). A systematic review of the effects of mobile apps on young children’s learning and development identified 14 studies reporting positive effects on children between 2 and 5 years old, four studies reporting mixed findings and one study with negative findings [9]. The majority of studies focused on language literacy and considerably fewer studies examined topics such as maths and science. Conditions facilitating learning included: (a) interactivity, narration and highlight functions, variety of representations, and varied levels of difficulty, (b) adult support while using an app, (c) age of children, with greater benefits reported for older children, (d) similarity between an app and assessment activities (near transfer), and (e) the use of one device per child especially for struggling students and girls.
A meta-analysis quantified the impact of maths apps on learning by identifying medium-size effects (ES = 0.29), yet noting that this overall effect size masks the true variability in mean effects due to observed heterogeneity amongst examined apps. Factors inflating the overall effect size were the use of researcher-developed as opposed to standardized instruments measuring learning outcomes, measuring small and fixed learning outcomes, such as number recognition, as opposed to, e.g., maths problem-solving skills, and participants’ age, with greater effects for pre-schoolers as opposed to kindergarten to third-grade children [12].

Overall, the number of studies examining the impact of educational apps on early years learning is rather limited and non-conclusive. These studies raise the need for further and systematic research in the field that can enhance our understanding of how certain apps facilitate learning and which children can benefit the most when interacting with them. Further research should take the form of large-scale randomized control trials [10] in order to produce robust insights about the impact of apps on specific groups of children. In terms of the latter, the educational value of an app should not be seen as “one size fits all”, for children with different demographic characteristics, skills and knowledge may benefit differently from using a single app. The abilities of children can influence how an app is used and the degree to which this will help or hinder learning [13]. Such knowledge can help teachers and parents to make evidence-based decisions when selecting apps for children, and inform app designers, resulting in better quality educational apps.

The examination of maths apps has attracted limited interest compared to other domains, such as language literacy [9].
Underachievement of children in maths is a global phenomenon [14], necessitating the development and testing of interventions that can help children reach (and exceed) minimum maths proficiency levels. The use of maths apps can be promising in this respect, especially if future research activities focus on testing specific app characteristics, align theory, design and outcome measures, and assess varied cognitive and skill-based outcomes [15]. This study aims to contribute to this line of work by examining the impact of two mobile apps on maths skills and knowledge and determining whether these apps can bring benefits to certain groups of children as defined by their age, gender and prior maths knowledge. Aligning with existing recommendations [10], it has deployed a Randomised Control Trial (RCT) methodological design.

The two apps under study – Moose Math and Monster Numbers – could be described as “instructive” or “drill-and-practice” apps; they require limited cognitive effort in the form of remembering or recalling previously acquired knowledge [9], [10]. They resemble pen-and-paper activities with the advantage of providing immediate feedback. They train children to automate tasks such as addition and subtraction through ongoing practice and repetition. Such apps are amongst the most popular and top rated in the app market [3], [16]. It is thus worthwhile to examine whether these widely used and positively perceived apps are beneficial for children and their learning.

2 Reviewing existing studies

An increasing number of studies, including systematic reviews, examine or summarize the learning impact of selected mobile applications on young children [9], [10], [17], [11]. In particular for maths apps, positive effects have been observed on early maths learning in typically developing children in the areas of number recognition and naming, and simple addition and subtraction [10].
Prior knowledge and performance are found to be significant moderators of proposed effectiveness [11]. Although these studies are underpinned by a common goal – to determine the effects of mobile apps on early years maths development – they present a great variation in terms of who the children under examination are and what the apps under study look like. This suggests that a closer examination of reported studies is needed to shed some light on who can benefit the most from interacting with apps and which design features or implementations can support or promote these benefits. The skills or expertise of learners may interact with the cognitive load of tasks. For example, novice or less skilled learners may need considerable guidance and complex instructions broken down into steps, whereas more skillful learners may find that this impairs their progress [18].

The few studies available measuring effects of mobile apps on early maths learning present rather mixed findings in relation to which children can benefit the most from interacting with apps. In particular, Schacter and Jo [19] examined low-income children (Mean age = 4,6) who used the tablet-based curriculum app Math Shelf in a classroom setting and identified that the intervention group outperformed the control group. While gender and race had no effects on outcomes, prior maths knowledge had a moderating effect; children with lower pre-test scores on number sense (<50%) benefited nearly twice as much from the intervention as those with higher pre-test scores, showing the value of the app especially for low performing children. Yet, another study testing the same app with children of the same age showed contradictory outcomes.
While the intervention group performed better than the control group, aligning with previous findings, pre-test scores and gender were found to moderate effects. This time the higher performers (>50%) and female children had better post-test scores in number sense [20]. Enhanced learning outcomes in numbers, shapes, space, and measure were also reported in a number of studies with children 4 to 7 years old, who interacted with a set of apps from OneBillion. In one of the reported studies, low achievers 4–5 years old were found to benefit more from the apps than high achievers of a similar age. No impact of socio-economic status and child’s first language was found [21].

Age was shown to moderate effects on STEM learning (including quantity of different sets) in a study where a group of children played a game (Mesozoic Math Adventures) and another group watched the experimenter playing the same game [22]. Younger children (Mean age = 3,6) were found to learn more from watching rather than playing the game, while older children (Mean age = 4,7) learnt equally well from playing or watching the game. These differences were explained by cognitive load, which is likely to increase when playing a game and make it difficult for the younger children to manage. Yet, in another study comparing video versus tablet-based interactions, observed differences held true even after controlling for age. Children (3,7–5,6 years old) who played a tablet-based game about approximate measuring or viewed a video-recorded version of the game demonstrated greater transfer of knowledge than a control group playing a zoo keeping game. Children in the interactive condition (tablet-based) had better outcomes in a near transfer test, whereas children in the video-recording condition were better in the far transfer test. Other co-variates including gender, verbal ability, parent’s education, and household income were not significant [23].
Yet, in other studies, age as well as gender were not associated with post-test performance. A tablet-based maths implementation consisting of 32 different digital games was superior to a respective computer-based one in terms of developing numbering skills, numeral literacy, mastery of number facts, calculation skills and understanding of concepts with 4- and 5-year-olds (Mean age = 5,2) [24]. Also, the game-based app Measure Up! was found to result in greater learning gains in understanding measurement concepts such as height and length, weight, and capacity in the intervention condition (Mean age = 5) than in the control condition [25]. Similarly, a numeracy app was found to improve numerical magnitude knowledge in 6-year-olds, yet a working memory game app did not result in any improvements compared to the control group. A combination of the two game apps was found to improve working memory for at least a month later. No differences in age, gender, ethnicity, race, and home languages were observed between the intervention and control groups, hence these variables were excluded from the analysis [26].

A study that examined a game app closely similar in design to the two apps we examined in this paper showed improvements in the arithmetic fluency of 7-year-olds; it helped children become fluent in adding and subtracting simple sums up to 20. The app gave the children an arithmetic addition or subtraction problem (e.g., 6 + 8 = ?) and a number of possible answers. The speed and correctness of each answer were linked to game performance. Post-test comparisons showed significantly greater gains for the intervention group than for the control group in subtraction using non-symbolic (dots; ::) number representations.
Improvements in non-symbolic problems required students to make a calculation in order to find the answer and therefore the authors concluded that the game improved calculation efficiency rather than retrieval efficiency, as originally expected [27]. Also, to the best of the authors’ knowledge, a single study was found to report equally good pre-post test outcomes between the intervention and control groups (5 and 6 years old) in mathematical abilities, spatial awareness and working memory. This concerned a comparison between a programming app (Bee-bot app), programming with pen-and-paper, and a control group [28]. The lack of significant differences was explained by standard teaching practice in addition and subtraction that may have helped all groups perform equally well in the proposed tasks, a lack of statistical power and a possibility of Type II error.

In terms of the apps used in the aforementioned studies, these feature certain design characteristics. The apps of OneBillion used an in-app virtual teacher to guide children’s learning with instructions and demonstrations [21]. A racing game in which children competed against a virtual enemy helped children calculate certain maths problems correctly, suggesting that non-symbolic arithmetic skills can be improved through simple multiple-choice tasks [27]. The Math Shelf app, structured around games that support short-term maths goals, can be tailored to students’ needs. It assigns content based on an assessment children take which determines where in the curriculum they are [19]. The Mesozoic Math Adventures presented two games in which a character indicated what the children should do, either by asking a question that could be answered by selecting from a number of options or by asking children to test a hypothesis by, for example, arranging objects on the screen [22]. The Measure that Animal app introduces a zookeeper who needs to measure some animals, yet he has forgotten his measuring tape.
Children can select an item from a box and place it on a line to measure the animal. This interactive approach has been designed to scaffold the process of measuring [29].

With the exception of one study, the outcomes of existing studies point to enhanced post-test performance after children aged 4 to 7 years old interacted with certain maths apps. Yet, the effects of prior knowledge, age, and gender on post-test performance are rather blurred. There are mixed findings as to whether maths apps particularly help low or high achievers, and whether older (rather than younger) children and girls are those who benefit the most from interacting with apps. None of the reported studies evidenced significant effects of socio-economic status, ethnicity, child’s first language, verbal ability, parent’s education, and household income. These insights raised the need to explore further the effects of moderating factors in order to determine which children benefit the most from interacting with selected maths apps.

2.1 Learning through “drill-and-practice”

Early years maths curricula are mainly focused on improving skills such as counting, using numbers, and calculating addition and subtraction problems [30], [31]. Number combination problems (e.g., 6 + 4 = ?, 10 – 4 = ?) can be solved by counting, decomposing, or by automatic retrieval of the answer from memory. Children make use of specific strategies that can help them solve number combination problems, often starting with “counting all”, then moving to strategies such as counting starting from the biggest number and decomposing a whole into different combinations of parts. Over time, associations of problems with correct answers become established in memory and children retrieve answers rather than practising strategies to find the correct answer [32].
There are three stages to skills acquisition: cognitive – performance of calculations to produce the correct answer; associative – retrieving the answer from declarative knowledge; and autonomous – no strategy is used and retrieving the answer becomes a reflex [33]. Drill-and-practice is a significant part of learning about number combinations that can lead to the “autonomous” stage of skills acquisition or arithmetic fluency. It is a behaviourist-oriented approach to learning that can result in conducting lower level processes (such as addition or subtraction of small sums) with limited effort. This is a significant skill as it frees up cognitive capacity for solving complex tasks [34]. Teaching the strategies for solving a task coupled with deliberate practice was shown to result in better learning outcomes than teaching without practice [32]. Developmental differences were found in terms of practising number combinations. A computer-based task showing children the strategy to use to solve an addition problem was found to be more beneficial for 3rd graders, while a process-based training (no strategy or scaffolding provided) was more beneficial for 5th graders. It is likely that the older children, possessing the relevant cognitive skills, could develop their own strategies for solving the tasks, and this resulted in them becoming faster at finding the correct answer. In contrast, younger children who were given a strategy to solve the task became more accurate after practice [18].

2.2 The maths apps under study

In this paper, we examined two commercially available mobile apps that have not been researched before, Moose Math by Duck Duck Moose and Monster Numbers by Didactoons.
To select apps, we first reviewed the design features of available maths apps for early years and grouped existing apps in three main categories: (i) apps linked to physical artefacts, (ii) “drill-and-practice” apps with external rewards, and (iii) apps that combined gaming and learning elements, for example a racing maths game. In this study, we chose an exemplary app from each of the second and third categories. The criteria we used to select the specific apps were as follows: (a) free maths apps available in both the Apple and Google stores, (b) apps not used in previous published work, and (c) apps rated at least 3.5/5. The apps under study were “instructive”, that is, supporting learning through “drill-and-practice” [3] and targeting recall of simple addition and subtraction tasks and counting. At the point of writing, Moose Math was rated 4.5/5 (iOS) and 4.4/5 (100,000+ installs) (Google store) and Monster Numbers 4.5/5 (iOS) and 4.0/5 (10,000+ installs). In Moose Math, participating children were asked to interact with specific in-app learning activities that were deemed suitable to their age, in particular the Juice Mixer, Pets, and Pets Bingo.

2.3 The present study

This paper presents evidence from the project mEvaluate: Devising an evaluation framework for the design and use of mobile learning applications in early years’ education, funded by the British Academy Mid-Career Fellowship scheme, the aim of which was to devise an evidence-based evaluation framework for the design and use of mobile apps for maths literacy (see [35]). Project data were collected from a series of RCTs in primary schools in the UK. The following Research Questions (RQs) were addressed in the study:

RQ1: What is the learning impact of the mobile apps Moose Math and Monster Numbers on children aged 5–6 years old?
We hypothesised that the performance of participating children would improve after interacting with the two apps under study. The apps would be an opportunity for practising maths concepts and processes taught in the classroom [32]. They would enable performance of lower level processes with limited effort and help to establish the correct number associations in memory [32], [34]. The added value of using mobile apps, rather than a pen-and-paper equivalent, is that children receive immediate feedback from the apps that can help quick recovery from mistakes and facilitate progress.

RQ2: How do children’s characteristics, in particular age, gender, and previous maths performance, relate to learning using these apps?

Existing studies present rather mixed findings about the impact of age and gender on maths learning with apps (e.g., [19], [22], [29]). Therefore, we hypothesised that these characteristics would not influence post-test performance. In contrast, given the evidence that previous performance is related to post-test outcomes in a positive or negative way [21], [19], we hypothesised that previous maths performance, as measured by pre-tests and the assessment of teachers for each individual child, would moderate effects on post-tests.

3 Methodology

3.1 Context and process of data collection

We ran four RCTs in four self-selected primary schools in the UK, identified through announcements we shared with different teachers’ associations in England. Using an SPSS function, we numbered and randomly allocated students within each class into a control and an intervention group. Ethical approval was gained from the ethics committee of the Open University UK. Parental consent was obtained from the guardians of all children who took part in the study. Teachers were offered Amazon vouchers as a thank you gift for their participation in the study. No incentives were offered to participating children.
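The within-class random allocation described above can be illustrated with a short Python sketch. This is not the SPSS function the study used; the pupil identifiers, class names, the fixed seed and the equal split (with any odd pupil going to the intervention group) are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def allocate_within_class(pupils, seed=0):
    """Randomly split the pupils of each class into two groups.

    `pupils` is a list of (pupil_id, class_name) pairs (hypothetical format).
    Returns a dict mapping pupil_id -> "control" or "intervention".
    """
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    by_class = defaultdict(list)
    for pupil_id, class_name in pupils:
        by_class[class_name].append(pupil_id)

    allocation = {}
    for class_name in sorted(by_class):
        ids = sorted(by_class[class_name])
        rng.shuffle(ids)  # random order within the class
        half = len(ids) // 2  # assumption: an odd pupil joins the intervention group
        for pupil_id in ids[:half]:
            allocation[pupil_id] = "control"
        for pupil_id in ids[half:]:
            allocation[pupil_id] = "intervention"
    return allocation


# Illustrative classes only: 10 Reception pupils and 7 Year 1 pupils
pupils = [(f"R{i}", "Reception") for i in range(1, 11)]
pupils += [(f"Y{i}", "Year 1") for i in range(1, 8)]
allocation = allocate_within_class(pupils, seed=42)
```

Randomising inside each class, rather than across the whole school, keeps the two groups balanced on teacher and year group.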
We treated the first school (20 Year 1 children) as a pilot of the process of implementation and data collection. In particular, we piloted and refined the pre/post-tests designed to measure impact on learning after interacting with the apps, and the instruction documents we shared with teachers that detailed how to implement the study. Also, we monitored and gave feedback to teachers as to how to respond to students’ queries when using the mobile app and when completing the pre/post-tests, to ensure that children received only limited guidance, as further guidance could bias the results of this study.

In terms of the socio-economic status of participating schools, all four schools were public and presented a higher than national average concentration of disadvantaged students (i.e., minority ethnic groups, English as an additional language, free school meals, children in care or adopted, and pupil premium – that is, the governmental grant offered to schools and families of disadvantaged children to minimise the attainment gap). The Index of Multiple Deprivation, that is, the official measure of relative deprivation of small areas in England, classifies the area around School 1 at the 7th decile (the 10th decile is the least deprived small area nationally), and Schools 2, 3 and 4 at the 1st decile, an indication that the areas where those schools are located are amongst the most deprived 10% of small areas nationally. In terms of technological equipment, School 1 had 15 iPads shared across the entire school and Schools 2, 3, and 4 had no mobile devices. In these schools, mobile devices were provided to each child by the authors.
In this paper, we excluded the pilot school and are reporting on the outcomes from three schools (coded thereafter as School 1, School 2, School 3) with a total of N = 376 children as follows: School 1 – one Year 1 and one Reception class (n = 46); School 2 – two Year 1 and two Reception classes (n = 100); School 3 – four Year 1 and four Reception classes (n = 230). The duration of the intervention varied slightly across schools to accommodate the needs and availability of participating teachers: children in School 1 had 8 sessions with the mobile app of 15–20 minutes each, and children in Schools 2 and 3 had 5–7 sessions of 15 minutes on average (two sessions per week). Prior to the start of the intervention, we shared written instructions with participating teachers and briefed them orally as to how they should use the devices. In School 1, the intervention took place during maths teaching. While the intervention group was interacting with the app, the control group was doing standard maths practice. In Schools 2 and 3, teachers organised the study around the school’s needs; therefore the control group was doing standard maths practice in some sessions and practising other subjects in others. The role of the teachers was to moderate or supervise the study and provide limited technical support if needed. This design aligned with existing studies examining the use of technology in classroom settings [21].

The first and last sessions at each school were coordinated by the research team, as a means to show the teachers how to implement the study and also to allocate and collect pre/post-tests. Children in the intervention group sat together and worked individually (one device per child) with the mobile devices. No guidance or help was provided by the teacher or the researchers, unless technical difficulties inhibited a child from interacting with the app.
The research team contacted the teachers once a week as a means to monitor the progress of the study, resolve any issues they were facing, and enhance the fidelity of the implementation.

Pre- and post-tests were designed based on the learning objectives of the apps under study, and were piloted and revised during the piloting phase. The piloting indicated that the tests were relatively lengthy and therefore they were substantially shortened. The tests followed the type of activities children were asked to complete in the app: (a) number recognition, (b) counting, (c) adding, and (d) subtracting numbers. Instructions on how to complete each activity in the tests were written in a separate document and shared with teachers across all classes. When children could not understand the instructions given, an example unrelated to the app was given and explained by the teacher.

3.2 Sample

Table 1 summarizes the sample characteristics across schools. Overall, 376 self-selected children took part in the study. Participating schools had on average similar numbers of male and female children, aged between 5 and 6 years old. In terms of children’s existing performance in maths, the teachers’ assessment showed that in Schools 1 and 3 the majority of children had an average performance, whereas in School 2 a slight majority was above average. Another measure of students’ previous maths knowledge and understanding is their scores in pre-tests, indicating that School 1 had a lower maths average than the other two schools.

Table 1. Sample characteristics
School Name | No. of Students | Gender (%) | Age (months) | Class | Maths Performance* | Pre-test Performance (out of 100)
School 1 | 46 | M = 41, F = 59 | M = 68, SD = 6.4 | 1 x Y1, 1 x Reception | 22% below average, 48% as expected, 30% above average | M = 46, SD = 24
School 2 | 100 | M = 55, F = 45 | M = 70, SD = 7.1 | 2 x Y1, 2 x Reception | 16% below average, 36% as expected, 39% above average (9% missing) | M = 68, SD = 24
School 3 | 230 | M = 120, F = 110 | M = 69, SD = 6.8 | 4 x Y1, 4 x Reception | 18% below average, 51% as expected, 9% above average (22% missing) | M = 65, SD = 34
Note: *As assessed by the class teacher.

3.3 Mobile apps under study

To visualize the interaction pathways and design features of each app, we used the Activity Theory framework for analyzing serious games [36]. Moose Math presents a cyclical interaction pathway that consists of: (a) selecting a learning activity, (b) selecting a reward, (c) completing a learning task correctly, and (d) receiving a reward (see Figure 1 and Table 2). It allows a maximum of three wrong answers to a given task before proceeding to the next one. Instructions are provided in the form of oral help before a learning task starts. Help (oral and visual) is available on demand (see purple bird in Table 3).

Fig. 1. The design of the mobile application Moose Math

Table 2. The game and learning components of Moose Math
Element | Component | Choose Activity | Choose an Item for Your City | Instructions/Help | Complete Task | Rewards
Gaming elements | Actions | Customisation | Choose | Obtain help | Matching | Performance evaluation
Gaming elements | Tools | Activity | Object | Oral or visual instructions | Fruits, bird, drink, oral and visual messages | Rewards
Gaming elements | Goals | Choose activity | Decorate | Learn interface | Solve task | Maximise performance
Learning elements | Actions | - | - | Observe | Repetition; recover from errors | -
Learning elements | Tools | - | - | Tips/help | Challenge | -
Learning elements | Goals | - | - | Provide guidance | Counting or addition or subtraction | -

Table 3.
Instructions and help provided by Moose Math Automatic, Oral Instructions Optional Instructions (Help-on-Demand) Oral Help Visual Help Drag ingredients from the fridge to the blender. Put 3 oranges into the blender. 1st wrong attempt: Let’s try again. Put two more in the blender. I am always here if you get stuck. 2nd wrong attempt: Let’s try again. Put 2 more Or Remove one 3rd wrong attempt: The app moves to the next task. Correct answer: Looks delicious, choose a cup. Press the cup or tab the recipe book to make another drink. Similarly, Monster Numbers presents a cyclical interaction pathway with sepa- rate learning and gaming tasks. The successful completion of a learning task follows a gaming session (racing game). There is no limitation as to the number of wrong attempts made neither in the learning nor in the gaming task (unlimited repetition of activity) (Figure 2). Instructions are both visual and written (but not oral) and presented before the start of a gaming or learning task. In learning tasks only, these can be skipped by pressing the start button (Table 4). 126 http://www.i-jim.org Paper—An Experimental Investigation of ‘Drill-and-Practice’ Mobile Apps and Young Children Fig. 2. The game flow of the “Monster Numbers” app 3.4 Process of data analysis Aligning with [37], we used a multiple linear regression analysis; independent vari- ables or predictors were the pre-test scores, the condition (intervention versus control), gender, age, and previous maths performance. We transformed all pre/post-test scores to percentages to allow for easier interpretation (see [37]). The analysis considered for only complete cases of children (listwise selection), that is cases where both pre and post-test values were available. Three pre-test and six post-test cases were missing and excluded from the analysis. 
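The analysis steps described above (rescaling to percentages, listwise deletion, and a regression of post-test on pre-test and condition) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' analysis script: the sample size, the test length `max_score`, and the helper names `to_percent`, `complete_cases` and `fit_ols` are all assumptions made for the example, and the coefficients it prints are unstandardized, whereas the paper reports standardized β values.

```python
import numpy as np


def to_percent(raw_scores, max_score):
    """Rescale raw test scores to percentages (0-100)."""
    return 100.0 * np.asarray(raw_scores, dtype=float) / max_score


def complete_cases(*columns):
    """Listwise deletion: keep only rows with no missing (NaN) value."""
    data = np.column_stack(columns).astype(float)
    return data[~np.isnan(data).any(axis=1)]


def fit_ols(y, predictors):
    """Ordinary least squares with an intercept; returns (coefficients, R^2)."""
    X = np.column_stack([np.ones(len(y)), predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    r_squared = 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)
    return coef, r_squared


# Synthetic, illustrative data only; these are NOT the study's data.
rng = np.random.default_rng(0)
n_children, max_score = 60, 20  # hypothetical sample size and test length
pre_raw = rng.integers(0, max_score + 1, size=n_children).astype(float)
condition = rng.integers(0, 2, size=n_children).astype(float)  # 0 = control, 1 = app
post_raw = np.clip(0.6 * pre_raw + 6 + rng.normal(0, 2, n_children), 0, max_score)
pre_raw[:3] = np.nan    # pretend three children missed the pre-test
post_raw[-6:] = np.nan  # pretend six children missed the post-test

data = complete_cases(
    to_percent(pre_raw, max_score), to_percent(post_raw, max_score), condition
)
pre, post, cond = data.T
coef, r_squared = fit_ols(post, np.column_stack([pre, cond]))
print(f"complete cases: {len(data)}")
print(f"intercept={coef[0]:.2f}, pre-test B={coef[1]:.2f}, condition B={coef[2]:.2f}")
print(f"R^2={r_squared:.2f}")
```

The second step of the reported models adds gender, age and previous maths performance as further columns of the predictor matrix; they are left out here only to keep the sketch short.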
In all three datasets (one per school), no values over .70 were observed in the correlation matrix, the P-P plots and scatterplots of standardized residuals showed a linear relationship, and Cook’s distance was not greater than 1, meeting the assumptions for running a regression.

Moose Math app: This app was tested in Schools 1 and 2. We ran a separate regression analysis for each participating school. We first inspected the distribution of the dependent variable (post-test scores) within each group (control, intervention) and within each school dataset. In School 1, the skewness and kurtosis measures and their standard errors, the histograms, normal Q-Q plots and box plots, and the Kolmogorov-Smirnov test of normality (control p = .200; intervention p = .200) showed that the data were approximately normally distributed. Levene’s test verified the equality of variances between the control and intervention groups (p = .65). In School 2, the data were found to be non-normally distributed. The Kolmogorov-Smirnov test of normality was significant for both the control (p = .009) and intervention groups (p = .001). Nevertheless, we performed a regression analysis given that the sample size was ‘sufficiently large’ (over 80 participants), which is considered appropriate for running a parametric test [38].

Monster Numbers app: This app was tested in School 3. We first inspected the distribution of the dependent variable (post-test scores) within each group (control, intervention). The skewness and kurtosis measures and their standard errors, the histograms, normal Q-Q plots and box plots, and the Kolmogorov-Smirnov test of normality (control p < .001, intervention p < .001) showed that the data were not normally distributed. Levene’s test verified the equality of variances between the control and intervention groups (p = .793).
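For readers who want to see what some of these distribution checks compute, the sketch below implements the bias-adjusted skewness and excess kurtosis with their standard errors, and the mean-centred Levene W statistic, using only the Python standard library. The function names and the two score lists are hypothetical; the sketch does not compute p-values (these need the F distribution) and does not cover the Kolmogorov-Smirnov test, so it should be read as an illustration of the quantities involved rather than as the procedure used in the study.

```python
import math
from statistics import mean, stdev


def skewness_with_se(values):
    """Bias-adjusted sample skewness (G1) and its standard error."""
    n = len(values)
    m, s = mean(values), stdev(values)
    g1 = n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)
    se = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return g1, se


def excess_kurtosis_with_se(values):
    """Bias-adjusted excess kurtosis (G2) and its standard error."""
    n = len(values)
    m, s = mean(values), stdev(values)
    fourth = sum(((x - m) / s) ** 4 for x in values)
    g2 = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
          - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return g2, se


def levene_w(*groups):
    """Levene's W statistic (mean-centred) for equality of variances."""
    abs_dev = [[abs(x - mean(g)) for x in g] for g in groups]
    n_total = sum(len(z) for z in abs_dev)
    k = len(abs_dev)
    grand_mean = sum(sum(z) for z in abs_dev) / n_total
    between = sum(len(z) * (mean(z) - grand_mean) ** 2 for z in abs_dev)
    within = sum((v - mean(z)) ** 2 for z in abs_dev for v in z)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical post-test percentages, for illustration only.
control = [40, 55, 60, 65, 70, 75, 80, 90]
intervention = [35, 50, 55, 65, 70, 80, 85, 95]

for name, scores in (("control", control), ("intervention", intervention)):
    skew, se_skew = skewness_with_se(scores)
    kurt, se_kurt = excess_kurtosis_with_se(scores)
    print(f"{name}: skewness z = {skew / se_skew:.2f}, kurtosis z = {kurt / se_kurt:.2f}")
print(f"Levene W = {levene_w(control, intervention):.3f}")
```

A common rule of thumb treats skewness and kurtosis values within roughly two standard errors of zero as compatible with approximate normality; that threshold is a convention, not something stated in the paper.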
Despite the non-normal distribution, we performed a linear regression analysis given that the sample size was ‘sufficiently large’, as above [38].

Table 4. Instructions and forms of feedback in the “Monster Numbers” app

| | Instructions | Forms of Feedback | Rewards |
|---|---|---|---|
| Gaming Levels | In-game instructions given visually and as text. When instructions are given, the running speed of the main character decreases. Children have no option of opting out of instructions. Instructions do not repeat when a level is failed. | Ongoing feedback is shown through score, coins collected, and potion collected. End-of-level evaluation based on performance. | Rewards given as parts of a spacecraft. Rating out of 3 is given. |
| Math Levels | Instructions given at the beginning of a level as text and visual representation. Children can skip the instructions by pressing play. Instructions repeat when a level is failed. | Ongoing feedback shown through progression bar, lives remaining, and timer. End-of-level evaluation based on performance. | Rewards given as collectable coins. Rating out of 3 is given. |

4 Results

School 1 (Moose Math): The regression model explained 26% of the variance in the dependent variable (post-test scores) (R² = .26, F(2, 40) = 6.99, p < .01), with only one significant predictor. Pre-test scores significantly predicted post-test scores (β = .45, p = .003) (see Table 5), while the condition (control versus intervention group) was not statistically significant (β = –.12, p = .401, NS). After entering demographic variables, the model remained significant (R² = .29, F(5, 37) = 3.1, p < .01). The only variable predicting post-test scores was pre-test performance (β = .39, p = .025), indicating that the greater the pre-test performance, the better the post-test scores were. In particular, a one-point increase in pre-test scores corresponds to a 0.39 increase in post-test performance. Participating children had significantly better scores in post-tests, over and above the condition they were in, gender, age and previous maths performance.

School 2 (Moose Math): As with the previous school, the regression model had only one significant predictor and explained 51% of the variance in post-test scores (R² = .51, F(2, 92) = 48, p = .001). Pre-test scores significantly predicted post-test scores (β = .68, p = .001) (see Table 5), while the condition (control versus intervention group) was not statistically significant (β = .13, p = .09, NS). After entering demographic variables, the model remained significant (R² = .53, F(5, 81) = 18.45, p = .001). The only variable predicting post-test scores was pre-test performance (β = .66, p = .001). In particular, a one-point increase in pre-test scores corresponds to a 0.66 increase in post-test performance. No other significant differences were found.

School 3 (Monster Numbers): The regression model had only one significant predictor and explained 60% of the variance in post-test scores (R² = .596, F(2, 104) = 28.63, p < .001). Pre-test scores significantly predicted post-test scores (β = .59, p < .001) (see Table 5), while the condition (control versus intervention group) was not statistically significant (β = .002, p = .98, NS). After entering demographic variables, the model remained significant, explaining 54% of the variance in the dependent variable (R² = .54, F(5, 76) = 6.21, p < .001). The only variable predicting post-test scores was pre-test performance (β = .46, p = .001), suggesting that children had significantly better scores in post-tests, over and above the condition they were in (β = .01, p = .91, NS), gender (β = .005, p = .96, NS), age (β = .05, p = .64, NS) and previous maths performance (β = .14, p = .21, NS). In particular, a one-point increase in pre-test scores corresponds to a 0.46 increase in post-test performance.

Table 5. Means and standard deviations in the control and intervention groups

| School | Mobile Application | Control Pre-Test | Control Post-Test | Intervention Pre-Test | Intervention Post-Test |
|---|---|---|---|---|---|
| School 1 | Moose Math | M = 52, SD = 23 | M = 68, SD = 24 | M = 39, SD = 23 | M = 56, SD = 27 |
| School 2 | Moose Math | M = 64, SD = 26 | M = 69, SD = 24 | M = 71, SD = 22 | M = 78, SD = 17 |
| School 3 | Monster Numbers | M = 62, SD = 34 | M = 62, SD = 34 | M = 67, SD = 34 | M = 63, SD = 35 |

5 Discussion

In this paper, we conducted three RCTs with 376 children aged 5 and 6 years old to capture the impact of two popular and highly rated “drill-and-practice” mobile maths apps at three primary schools located in relatively deprived areas of the UK. In contrast to the majority of existing studies reporting positive learning gains from using maths apps (e.g., [39], [10]), this study identified no significant differences between the app and non-app conditions. Participating children were found to have better learning outcomes in post-tests by the end of the intervention over and above the condition they were in, suggesting that both conditions (interacting with a maths app and standard teaching practice) were equally beneficial in helping children complete basic maths tasks such as counting, addition and subtraction of small numbers.

In contrast to our initial hypothesis for RQ1, the intervention group did not present improved learning outcomes compared to the control group. This finding aligns with a few studies that reported non-significant gains post intervention for the app condition [40], [41], as well as with insights suggesting inflated effect sizes in studies examining constrained maths skills, as such skills have a ceiling effect, are mastered by most children and are influenced more by direct teaching [42].
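As a descriptive companion to the regression results, the group means reported in Table 5 can be turned into simple pre-to-post differences. The short sketch below does only that arithmetic; the dictionary layout and the helper name `mean_gain` are choices made for this example, and a difference of group means is not a substitute for the regression models reported in the Results, which adjust for pre-test scores.

```python
# Group means (percent correct) copied from Table 5: (pre-test mean, post-test mean).
table5_means = {
    ("School 1", "control"): (52, 68),
    ("School 1", "intervention"): (39, 56),
    ("School 2", "control"): (64, 69),
    ("School 2", "intervention"): (71, 78),
    ("School 3", "control"): (62, 62),
    ("School 3", "intervention"): (67, 63),
}


def mean_gain(pre_mean, post_mean):
    """Difference between post-test and pre-test group means, in percentage points."""
    return post_mean - pre_mean


gains = {key: mean_gain(*means) for key, means in table5_means.items()}

for (school, group), gain in gains.items():
    print(f"{school:9s} {group:13s} {gain:+d} points")
```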
The increased performance of children in post-tests in both conditions could also be explained by an overall progress in understanding early maths concepts over the period of the proposed intervention. Counting, addition and subtraction are core topics in early years maths instruction, hence systematic classroom practice may have had a positive impact on the performance of students as a whole.

Aligning with our hypothesis for RQ2, the only factor explaining post-test performance in both the control and intervention groups was pre-test scores. The greater the children’s performance in pre-tests, the better their post-test outcomes were. There was no effect of age, gender and previous maths performance (as assessed by teachers), suggesting that these factors are unrelated to post-test changes. These findings confirm studies showing that the effectiveness of mobile apps for learning is often related to prior knowledge and performance [11]. Children who were more skillful or knowledgeable might have performed better in the tests than other children either because they had developed the strategies needed to solve the tasks in hand or because they were at the “autonomous” stage of calculations, in which they could recall answers from memory with no effort [33]. In contrast, the low-performing children might have performed less well due to a lack of additional guidance or explanations (either from a teacher or the app) that could help them manage the cognitive load and cope with the tasks successfully [18].

Reflecting on the delivery of the intervention, there was variation in the activities that children in the control group were engaged with across sessions and schools. For example, practising addition or subtraction using pen and paper, or receiving instruction on how to solve such problems, may have benefited the control group and helped them perform as well as the intervention group.
Also, for the intervention group, the medium used to deliver the pre/post-tests was different from the medium used to practise maths concepts. Generalisation may not happen when children are instructed using a mobile device but the assessment is completed in a different format such as pen and paper. Studies on computer-based maths instruction showed that students who practised only on a computer performed better in a computer-based assessment than in a pen-and-paper one, whereas those who practised with paper and pencil had similar outcomes in the computer-based and the pen-and-paper assessments [43]. These factors may have disadvantaged the intervention group and benefited the control group, which was used to practising with pen and paper. On the other hand, the design of the pre/post-tests followed the structure and content of the activities presented in the two mobile apps. In other words, they were closely aligned with the content of the in-app learning experience to facilitate near transfer [29]. Researcher-developed, as opposed to standardised, instruments were shown to inflate effect sizes for app conditions [12], suggesting that the tests may have favoured the intervention group. The combination of the above factors could explain why both groups improved after the intervention.

Another factor that may explain the lack of superior post-test performance in the intervention group is the design of the two apps under study and their focus on “drill-and-practice” of already acquired knowledge. The selected apps had no elements of explicit or direct teaching or structured instruction, an app feature that has been shown to relate to enhanced learning outcomes [44]. Such features could show children the strategies to use to solve tasks, such as how to add up quantities, and help children understand and recover from mistakes in a constructive way.
Given the young age of participating children (5 and 6 years old), they are more likely to be at the cognitive stage of calculations [33], that is, practising strategies to find the correct answer rather than drawing established number associations from memory [32]. Aligning with existing studies [18], “drill-and-practice” apps may be more beneficial for older children who are transitioning to the “autonomous stage” of skills acquisition or reaching arithmetic fluency. For these children, a “drill-and-practice” app could help them become faster in finding the correct answers, a skill needed for solving more complex problems.

In addition, the delivery of feedback in the apps under study may have inhibited learning and recovery from mistakes. Studies have shown that specific (and not all) types of feedback can result in enhanced learning outcomes ([19], [45]). In this study, the Moose Math app provided verbal and emotional feedback (e.g., “Let’s try again” or “Looks delicious”; see Table 3), that is, feedback at a ‘self-level’ referring to personal evaluations and affect in the form of reinforcement [46]. In contrast, help-on-demand provided feedback at a ‘task-level’, that is, instructions about how to proceed. These instructions were verbal, written, and graphical. Yet, the most beneficial form of feedback has been shown to be elaborative feedback, that is, feedback that explains why an answer is correct or wrong and offers cues and suggestions as to how to modify a response [47]. In Moose Math, elaborative feedback is provided through the help-on-demand button (see Table 3, purple bird), yet not in the task feedback, suggesting that the latter could be enhanced by explaining why an answer is correct or wrong, or by providing personalised feedback that responds to specific actions on the screen.
Examining the role of feedback in Moose Math using screen recordings, Herodotou [35] has shown that feedback is perceived differently by children of the same age, with some children being unable to recover from errors after accessing oral and visual help.

6 Limitations

In an effort to increase the ecological validity of the study and improve the fidelity of the implementation, we produced and piloted implementation protocols with instructions on how teachers should interact with children and how children should interact with the apps. Also, we had weekly email communication with teachers to discuss progress and any issues related to the implementation. This is a rather common approach to conducting research with technology in educational contexts (e.g., [44]), where teachers receive training on how to facilitate the study while the researchers are not present in all implementation sessions. Yet, we cannot rule out slight variations in the implementation by individual teachers that may have had an impact on outcomes. For example, participating children used the apps inside the classroom. In some of the sessions (as reported by some teachers), children in the control group may have been in the same physical environment carrying out other maths-related activities. This may have posed a threat to internal validity due to contamination.

Also, there was variation in the length of the intervention across participating classes and schools, and in the activities the control group was engaged with. In particular, we originally planned the study to span four consecutive weeks with three sessions of 20 min in each week (a total of 12 sessions). Yet, due to teachers’ workload and last-minute school priorities, eight sessions of 15–20 min each ran in School 1, and 5–7 sessions of 15 min each ran in Schools 2 and 3.
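The gap between planned and delivered exposure can be made concrete with a short worked calculation from the session counts and durations stated above. The helper name `total_minutes` and the representation of ranges as (low, high) tuples are choices made for this sketch only.

```python
def total_minutes(sessions, minutes_per_session):
    """Total exposure as a (low, high) range, given (low, high) ranges for each input."""
    low = sessions[0] * minutes_per_session[0]
    high = sessions[1] * minutes_per_session[1]
    return low, high


# Planned: four weeks x three sessions x 20 min.
planned_low, planned_high = total_minutes((12, 12), (20, 20))

# Delivered, as reported for each school.
delivered = {
    "School 1": total_minutes((8, 8), (15, 20)),
    "Schools 2 and 3": total_minutes((5, 7), (15, 15)),
}

for school, (low, high) in delivered.items():
    share_low = 100 * low / planned_low
    share_high = 100 * high / planned_high
    print(f"{school}: {low}-{high} min of {planned_low} planned "
          f"({share_low:.0f}%-{share_high:.0f}%)")
```

On these figures, School 1 received between half and two thirds of the planned 240 minutes, and Schools 2 and 3 received under half.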
The shorter duration of the intervention may have had an impact on the performance of the intervention group and could explain the lack of superior outcomes compared to the control group, which are often cited in the literature. In addition, in Schools 2 and 3, children in the control group were not always engaged with standard maths practice. Despite the instructions given to teachers, there were sessions when children were studying other, non-maths topics. This suggests that, in some cases, the control group may have had less exposure to maths content and teaching than the intervention group.

7 Conclusions

Although “drill-and-practice” maths apps are quite popular in the app market, highly rated and frequently downloaded, few studies have attempted to examine their impact on learning. In this paper, we conducted three experimental studies with a total of 376 children aged 5–6 years old from deprived areas in the UK, in an effort to assess the impact of such apps on early maths learning. Insights showed that the app condition was equivalent to standard teaching practice, suggesting that popular apps, such as Moose Math and Monster Numbers, could help children practise basic maths tasks such as counting, addition and subtraction, yet they were not superior to, and were not shown to have an added value over, standard maths practice.

Considering the development of early maths skills, and in particular the transition of children through different stages before they can calculate with no effort, it is suggested that teachers and app designers consider the skills and knowledge children have already developed before using or recommending the use of “drill-and-practice” apps. Children who have developed an understanding of the strategies needed to calculate and are starting to become more autonomous in performing such tasks are those more likely to benefit from these apps.
Such apps can help them develop calculation efficiency (completing tasks quickly) or establish number associations in memory, a skill needed for reducing cognitive load and enabling the solution of complex maths problems.

App designers should be cautious with the age recommendations they make for such apps (e.g., suitable for children 3–7 years old), as children up to 6 years old may not benefit from interacting with them. Apps with instructive or teaching features, including more elaborated feedback and scaffolding, might be more beneficial for these ages, as they can help children develop the strategies needed to calculate effectively. In this respect, the role of instructional designers or experts in educational technology should be given considerable weight in the process of design; they could provide valuable insights as to how children develop [35], which features or mechanisms are more appropriate for their age, and how to embed these in the app design to enable active and personalised learning experiences. Partnerships between app designers and educational experts should be promoted to ensure that educational apps consider pedagogical principles and have been tested with children prior to their release to the market [48]. Such evaluations could contribute to the development of an evidence base that could guide parents and teachers when choosing and using apps with children. Also, the design of apps should move from a “one size fits all” approach to more tailored and personalised approaches, using for example machine learning techniques, that consider children’s individual learning needs, including prior experiences with maths and how these may relate to app use and understanding, thus presenting each child with a dynamic and tailored learning experience.

8 Acknowledgment

This work was funded by the British Academy under Grant MD170009.
The described research has received approval from the Open University ethics committee.

9 References

[1] D. Holloway, L. Green, and S. Livingstone, “Zero to eight: young children and their internet use,” London, 2013.
[2] L. Plowman, O. Stevenson, C. Stephen, and J. McPake, “Preschool children’s learning with technology at home,” Computers & Education, vol. 59, no. 1, pp. 30–37, 2012. https://doi.org/10.1016/j.compedu.2011.11.014
[3] K. Highfield and K. Goodwin, “Apps for mathematics learning: a review of ‘educational’ apps from the iTunes app store,” in 36th annual conference of the Mathematics Education Research Group of Australia, 2013, vol. 0, pp. 378–385.
[4] K. Hirsh-Pasek, J. M. Zosh, R. Michnick, J. H. Gray, and M. B. Robb, Putting education in “educational” app: lessons From the science of learning. 2015. https://doi.org/10.1177/1529100615569721
[5] S. Papadakis, M. Kalogiannakis, and N. Zaranis, “Designing and creating an educational app rubric for preschool teachers,” Education and Information Technologies, vol. 22, no. 6, pp. 3147–3165, 2017. https://doi.org/10.1007/s10639-017-9579-0
[6] A. K. Dubé, G. Kacmaz, R. Wen, S. S. Alam, and C. Xu, “Identifying quality educational apps: Lessons from ‘top’ mathematics apps in the Apple App store,” pp. 5389–5404, 2020. https://doi.org/10.1007/s10639-020-10234-z
[7] C.-Y. Lee and T. S. Cherner, “A comprehensive evaluation rubric for assessing instructional apps,” Journal of Information Technology Education: Research, vol. 14, no. 1, pp. 21–53, 2015. https://doi.org/10.1007/s12215-009-0007-1
[8] L. C. Lanna and M. G. Oró, “An analysis of the interaction design of the best educational apps for children aged zero to eight,” Comunicar. Media Education Research Journal, vol. 24, no. 1, 2016. https://doi.org/10.3916/C46-2016-08
[9] C. Herodotou, “Young children and tablets: a systematic review of effects on learning and development,” Journal of Computer Assisted Learning, vol. 34, no. 1, 2018. https://doi.org/10.1111/jcal.12220
[10] S. F. Griffith, M. B. Hagan, P. Heymann, B. H. He, and D. M. Bagner, “Apps as learning tools: a systematic review,” vol. 145, no. 1, 2020. https://doi.org/10.1542/peds.2019-1579
[11] S. Verbruggen, F. Depaepe, and J. Torbeyns, “Effectiveness of educational technology in early mathematics education: a systematic literature review,” International Journal of Child-Computer Interaction, vol. 27, p. 100220, 2021. https://doi.org/10.1016/j.ijcci.2020.100220
[12] J. Kim, J. Gilbert, Q. Yu, and C. Gale, “Measures matter: a meta-analysis of the effects of educational apps on preschool to grade 3 children’s literacy and math skills,” AERA Open, vol. 7, no. 1, pp. 1–19, 2021. https://doi.org/10.1177/23328584211004183
[13] S. I. Tucker, P. S. Moyer-Packenham, A. Westenskow, and K. E. Jordan, “The complexity of the affordance–ability relationship when second-grade children interact with mathematics virtual manipulative apps,” Technology, Knowledge and Learning, vol. 21, no. 3, pp. 341–360, 2016. https://doi.org/10.1007/s10758-016-9276-x
[14] UNESCO, “More than one-half of children and adolescents are not learning world wide,” 2017.
[15] J. M. Zydney and Z. Warner, “Mobile apps for science learning: review of research,” Computers and Education, vol. 94, pp. 1–17, 2016. https://doi.org/10.1016/j.compedu.2015.11.001
[16] L. Kolâs, H. Nordseth, and R. Munkvold, “Learning with educational apps: a qualitative study of the most popular free apps in Norway,” in 2016 15th International Conference on Information Technology Based Higher Education and Training (ITHET), 2016, pp. 1–8. https://doi.org/10.1109/ITHET.2016.7760701
[17] B. Haßler, L. Major, and S. Hennessy, “Tablet use in schools: a critical review of the evidence for learning outcomes,” pp. 139–156, 2016. https://doi.org/10.1111/jcal.12123
[18] S. Caviola, G. Gerotto, and I. C. Mammarella, “Computer-based training for improving mental calculation in third- and fifth-graders,” ACTPSY, vol. 171, pp. 118–127, 2016. https://doi.org/10.1016/j.actpsy.2016.10.005
[19] J. Schacter and B. Jo, “Improving low-income preschoolers mathematics achievement with Math Shelf, a preschool tablet computer curriculum,” Computers in Human Behavior, vol. 55, pp. 223–229, 2016. https://doi.org/10.1016/j.chb.2015.09.013
[20] J. Schacter and B. Jo, “Improving preschoolers’ mathematics achievement with tablets: a randomized controlled trial,” Mathematics Education Research Journal, vol. 29, no. 3, 2017. https://doi.org/10.1007/s13394-017-0203-9
[21] L. A. Outhwaite, A. Gulliford, and N. J. Pitchford, “Closing the gap: efficacy of a tablet intervention to support the development of early mathematical skills in UK primary school children,” Computers and Education, vol. 108, pp. 43–58, 2017. https://doi.org/10.1016/j.compedu.2017.01.011
[22] E. L. Schroeder and H. L. Kirkorian, “When Seeing is better than doing: preschoolers’ transfer of stem skills using touchscreen games,” vol. 7, no. September, pp. 1–12, 2016. https://doi.org/10.3389/fpsyg.2016.01377
[23] F. Aladé, A. R. Lauricella, L. Beaudoin-Ryan, and E. Wartella, “Measuring with Murray: Touchscreen technology and preschoolers’ STEM learning,” Computers in Human Behavior, vol. 62, pp. 433–441, 2016. https://doi.org/10.1016/j.chb.2016.03.080
[24] S. Papadakis, M. Kalogiannakis, and N. Zaranis, “Comparing tablets and PCs in teaching mathematics: an attempt to improve mathematics competence in early childhood education,” Preschool and Primary Education, vol. 4, no. 2, pp. 241–253, 2016. https://doi.org/10.12681/ppej.8779
[25] K. Schenke et al., “Does ‘Measure Up!’ measure up? Evaluation of an iPad app to teach preschoolers measurement concepts,” Computers & Education, vol. 146, no. October 2019, p. 103749, 2020. https://doi.org/10.1016/j.compedu.2019.103749
[26] G. B. Ramani, E. N. Daubert, G. C. Lin, S. Kamarsu, A. Wodzinski, and S. M. Jaeggi, “Racing dragons and remembering aliens: benefits of playing number and working memory games on kindergartners’ numerical knowledge,” no. September 2019, pp. 1–17, 2020. https://doi.org/10.1111/desc.12908
[27] F. van der Ven, E. Segers, A. Takashima, and L. Verhoeven, “Effects of a tablet game intervention on simple addition and subtraction fluency in first graders,” Computers in Human Behavior, vol. 72, pp. 200–207, 2017. https://doi.org/10.1016/j.chb.2017.02.031
[28] D. Messer, L. Thomas, A. Holliman, and N. Kucirkova, “Evaluating the effectiveness of an educational programming intervention on children’s mathematics skills, spatial awareness and working memory,” Education and Information Technologies, pp. 1–10, 2018. https://doi.org/10.1007/s10639-018-9747-x
[29] F. Alade, A. R. Lauricella, L. Beaudoin-Ryan, and E. Wartella, “Measuring with Murray: Touchscreen technology and preschoolers’ STEM learning,” Computers in Human Behavior, vol. 62, pp. 433–441, 2016. https://doi.org/10.1016/j.chb.2016.03.080
[30] California Department of Education, “Mathematics Framework for California Public Schools: Kindergarten Through Grade Twelve,” 2015.
[31] EYFS (Early Years Foundation Stage), “Statutory framework for the early years foundation stage,” 2017.
[32] L. S. Fuchs et al., “The effects of strategic counting instruction, with and without deliberate practice, on number combination skill among students with mathematics difficulties,” Learning and Individual Differences, vol. 20, no. 2, pp. 89–100, 2010. https://doi.org/10.1016/j.lindif.2009.09.003
[33] C. Tenison and J. R. Anderson, “Modeling the distinct phases of skill acquisition,” Journal of Experimental Psychology: Learning, Memory, and Cognition, 2015. https://doi.org/10.1037/xlm0000204
[34] E. Lehtinen, M. Hannula, S. Jake, and M. Hans, “Cultivating mathematical skills: from drill-and-practice to deliberate practice,” ZDM, vol. 49, no. 4, pp. 625–636, 2017. https://doi.org/10.1007/s11858-017-0856-6
[35] C. Herodotou, “MAD learn: an evidence-based affordance framework to assessing learning apps,” Proceedings of 2021 7th International Conference of the Immersive Learning Research Network, iLRN 2021, 2021. https://doi.org/10.23919/iLRN52045.2021.9459325
[36] M. B. Carvalho et al., “An activity theory-based model for serious games analysis and conceptual design,” Computers & Education, vol. 87, pp. 166–181, 2015. https://doi.org/10.1016/j.compedu.2015.03.023
[37] P. Connolly, A. Biggart, S. Miller, L. O’Hare, and A. Thurston, “Using randomised controlled trials in education,” London: SAGE Publications, 2017. https://doi.org/10.4135/9781473920385
[38] K. L. Sainani, “Dealing with non-normal data,” PM and R, vol. 4, no. 12, pp. 1001–1005, 2012. https://doi.org/10.1016/j.pmrj.2012.10.013
[39] C. Herodotou, “Mobile games and science learning: a comparative study of 4 and 5 years old playing the game angry birds,” British Journal of Educational Technology, 2017. https://doi.org/10.1111/bjet.12546
[40] C. Mattoon, A. Bates, R. Shifflet, N. Latham, and S. Ennis, “Examining computational skills in prekindergarteners: The effects of traditional and digital manipulatives in a prekindergarten classroom,” Early Childhood Research and Practice, vol. 17, no. 1, 2015.
[41] D. Messer, L. Thomas, A. Holliman, and N. Kucirkova, “Evaluating the effectiveness of an educational programming intervention on children’s mathematics skills, spatial awareness and working memory,” pp. 2879–2888, 2018. https://doi.org/10.1007/s10639-018-9747-x
[42] J. Kim, J. Gilbert, Q. Yu, and C. Gale, “Measures matter: a meta-analysis of the effects of educational apps on preschool to grade 3 children’s literacy and math skills,” AERA Open, vol. 7, 2021. https://doi.org/10.1177/23328584211004183
[43] G. J. Duhon, S. H. House, and T. A. Stinnett, “Evaluating the generalization of math fact fluency gains across paper and computer performance modalities,” Journal of School Psychology, vol. 50, no. 3, pp. 335–345, 2012. https://doi.org/10.1016/j.jsp.2012.01.003
[44] L. A. Outhwaite, M. Faulder, A. Gulliford, and N. J. Pitchford, “Raising early achievement in math with interactive apps: a randomized control trial,” Journal of Educational Psychology, 2018. https://doi.org/10.1037/edu0000286
[45] M. Zhang, R. Trussell, B. Gallegos, and R. Asam, “Using math apps for improving student learning: an exploratory study in an ...,” TechTrends, vol. 59, no. 2, 2015. https://doi.org/10.1007/s11528-015-0837-y
[46] J. Hattie and H. Timperley, “The power of feedback,” Review of Educational Research, vol. 44, no. 1, pp. 16–17, 2007. https://doi.org/10.1111/j.1365-2923.2009.03542.x
[47] V. J. Shute, “Focus on formative feedback,” Review of Educational Research, vol. 78, no. 1, pp. 153–189, 2008. https://doi.org/10.3102/0034654307313795
[48] C. Mangafa, L. Moody, A. Woodcock, and A. Woolner, “The Design of Guidelines for Teachers and Parents in the Use of iPads to Support Children with Autism in the Development of Joint Attention Skills,” in Marcus A. (eds), Design, User Experience, and Usability: Novel User Experiences. DUXU 2016. Lecture Notes in Computer Science, vol 9747. Springer, Cham, 2016. https://doi.org/10.1007/978-3-319-40355-7_17

10 Authors

Dr Christothea Herodotou is an Associate Professor at the Institute of Educational Technology (IET), The Open University, Walton Hall, Milton Keynes, MK7 6AA. She was the Principal Investigator of the mEvaluate Project.
Dr Chrysoula Mangafa is Deputy Director of Academic Affairs at Metropolitan College, Rhodes campus, Greece. She was the postdoctoral research associate of the mEvaluate Project.

Dr Pinsuda Srisontisuk supported the process of data collection at schools while she was studying for a PhD at the Institute of Educational Technology (IET), The Open University, Walton Hall, Milton Keynes, MK7 6AA.

Article submitted 2021-10-27. Resubmitted 2022-01-03. Final acceptance 2022-01-22. Final version published as submitted by the authors.