Intention Reconsideration in Wumpus World And Intentional Inference in Adolescents

Carlos Pelta
Complutense University of Madrid, Madrid, Spain
Campus de Somosaguas, Ctra. de Húmera, s/n, 28223, Pozuelo de Alarcón
cpelta@psi-ucm.es

Abstract
Adolescence is usually described as a period of sensation seeking, great impulsivity, and risky behavior. How can this circumstance affect the intentional inference of adolescents who have to reconsider their intentions after performing a task? In this article we show that adolescents who had seen a very cautious agent acting in a Classic Wumpus World (CWW) environment, and who had quite reliably predicted its movements in a later round, adopted a much riskier prediction of the agent's movements when they had to reconsider some of their original intentions in a slightly modified Wumpus World environment (the Modified Classic Wumpus World, or MCWW, version). This result could support the view of adolescents as optimistic improvisers in their decision-making processes.

Keywords: Adolescence, Wumpus World, Intention Reconsideration, Intentional Inference

1. Introduction
Adolescence is a period of great impulsiveness and risk-taking (for a general cognitive perspective not restricted to adolescence, see Kahneman & Lovallo, 1993). This behavior can be adaptive insofar as it encourages adolescents to explore new situations and to acquire key skills for adulthood. But it can also have serious consequences such as accidents, suicidal behavior or unwanted relationships. Among other pieces of neurological evidence (Wadman, 2018), it seems that adolescence is a period of increased sensitivity to rewards (Romer et al. 2016). Several studies have shown an increased activation of the ventral striatum in adolescence compared to adulthood (see Galván, 2010; Galván, 2016). In Geier et al.
(2010), adolescents between 13 and 17 years old were compared with adults aged 18 to 30 in a series of rewarded trials versus neutral trials. The adolescents showed a percentage of correct answers comparable to that of the adults, while the imaging tests indicated increased activation in the ventral striatum during response preparation and reward anticipation. The engagement of the ventral striatum in adolescents immediately before the response, together with somewhat faster access to reward processing than in adults, could explain their vulnerability to impulsive decision-making, in the sense that they may devote less time than adults to evaluating and appropriately interpreting rewards in goal-oriented behavior (see Luna et al., 2013). In the same direction, Galván et al. (2007) emphasize that adolescents show exaggerated activity of the nucleus accumbens in response to rewards in comparison with children and adults. Specifically, in adolescents less risky behavior was associated with the anticipation of negative consequences, and the inverse correlation also held (Galván, 2017). In turn, Small et al. (1993) had already shown that adolescents who do not engage in risky behavior anticipate significantly more costs for such behaviors than their riskier peers (see Chick and Reyna, 2012 and Wilhelms et al., 2015, in the context of fuzzy-trace theory). The authors show that when the magnitude of the reward is high, adolescents who rely on verbatim thinking (a mental representation of the literal stimulus) and who, therefore, literally analyze the expected value of a reward, are more likely to take health risks. The dual systems model (see Shulman et al., 2015) characterizes adolescence as a time of high socio-emotional reactivity (relative to previous and subsequent periods) and of cognitive control that is still maturing.
The model posits that it is the confluence of the developmental patterns of the socioemotional and cognitive control systems (relatively high responsiveness to reward combined with relatively weak self-regulation) that renders adolescents particularly vulnerable to risk-taking. Shulman & Cauffman (2014), using a scale that surveyed involvement in a wide range of risk-taking behaviors, found that weak impulse control and sensation-seeking were associated with self-reported engagement in risky behaviors.

BRAIN – Broad Research in Artificial Intelligence and Neuroscience, Volume 9, Special Issue on Educational Psychology (April, 2018), ISSN 2067-8957

In recent decades, there has been a rich synergy between Psychology and Artificial Intelligence that has led to the proliferation of artificial simulation environments that have tried to validate some interesting hypotheses of Cognitive Psychology. Kinny & Georgeff (1991) conducted a comparative experimental study of two behavioral strategies of artificial agents in their interaction with their environment. These two authors used a simulation environment called Tileworld. It is a two-dimensional environment in which there are agents, tiles, obstacles, and holes. An obstacle is an immovable object. The holes have to be covered with tiles by the agent. An agent scores points by covering holes with tiles, and its aim is to score as many points as possible. It is an inherently dynamic domain. Beginning with a state of the environment generated at random, the environment changes over time in discrete steps, with holes appearing and disappearing. The holes appear randomly and have a certain life expectancy, unless they disappear due to actions of the agent. The interval between the appearance of successive holes is called hole gestation time. In the experiment of Kinny and Georgeff, the tiles were omitted. An agent scored points simply by occupying the holes.
The agent had perfect knowledge of the state of the world and only generated plans to visit a single hole at a time. The dynamism in the domain was represented by the rapidity of change in the environment and was manipulated by adjusting the clock speeds of the environment and the agent. The effectiveness of the agent was represented by the sum of the values of the holes occupied, divided by the sum of the scores of all the holes that appear in the environment during a trial. The results of the simulations showed that a cautious agent surpassed an agent of bold behavior in highly dynamic environments. Castelfranchi et al. (2006) have designed three types of agents that interact in different types of worlds. First of all, cautious agents avoid accidents in an unsafe environment unlike the rash agents that get the worst results in damage avoidance. The third type of agents or adaptive agents obtain results comparatively similar to those of the cautious agents. The cautious agents are agents that anticipate risks and choose, among the actions, the safest. They choose those actions and plans with the lowest expected risk value, while the rash agents prefer to choose between the quickest actions or plans and try to reach their objectives sooner. To find out the human psychological ability to infer the intentions of a programmed artificial system, we have chosen the environment Wumpus World (see Russell & Norvig, 1994) and we have modified it slightly in such a way that, in a first round, a group of adolescent students observed the behavior of a very cautious agent, that is, an agent who is continually checking if there is any danger signal to be avoided. The agent prefers a safer mode of action although this may be slower. It is a Classic Wumpus World environment with an agent programmed in a very cautious way. Subjects observed step by step the behavior of the cautious agent during a round of the game. 
Next, the subjects had to predict as reliably as possible what movements the agent would make by moving it along the board. Later, a version of Wumpus World with slight modifications forced the subjects to reconsider the initial intentions and the actions of the artificial agent programmed with a very cautious behavior. After witnessing a round played by the agent itself, the human participants had to move the artificial agent on the board step by step using the cursor, with its position appearing on the game screen, trying to predict what the movements of the artificial agent would be, taking into account what they had seen before. The participants always tried to infer the intentions, predicting how the artificial agent would behave on the basis of the first round of the game, or exhibition round. As the artificial agent moved around the board, the subjects pressed the step button, and both the position of the agent inferred by the subjects and the real position it occupied according to its very cautious programming appeared on the screen. Thus it was possible to compare the programmed movements of the artificial agent with the movements inferred by the human agents, in this case a group of adolescents, and to verify whether the adolescents had been more or less cautious than the programmed artificial agent, drawing the appropriate conclusions about their process of inferring the agent's intentions in the Wumpus World environment.

2. Classic Wumpus World and our modified version
We now describe the environment of the Classic Wumpus World or CWW version and the modification (MCWW) that we have proposed for the experiment, which implies a reconsideration of intentions. Reconsidering intentions is a case of metacognition, understood as metareasoning (see Russell & Wefald, 1991). The reconsideration of intentions assumes that "(...)
an agent has devised a (partial) plan of action for a particular environment, as it appeared to the agent at some time t. But then, at some later time t' > t, perhaps in the course of executing the plan, the agent's view on the world changes" (see van Zee & Icard, 2015), perhaps also because the context of the environment varies.

The Classic Wumpus World environment is deterministic, non-accessible (the agent only has a partial perception of its environment), static (apart from the agent, objects do not move) and discrete. The Classic Wumpus World environment (from here on, CWW) consists of a grid of 64 squares, 6 pits, a bar of gold and a Wumpus. The intelligent agent's goal is to traverse the grid and grab the gold. If the agent lands in a square where there is a pit or a Wumpus, the agent perishes. In squares adjacent to pits the agent will perceive a "breeze", squares adjacent to the Wumpus have a "stench" and the square with gold has a "glitter". If the agent runs into a wall it perceives a "bump". The agent has the following 5 percepts introduced by Russell and Norvig: stench implies that the Wumpus is nearby; breeze implies that a pit is nearby; glitter implies that the gold is here; bump implies running into a wall; finally, scream implies that the agent has shot the Wumpus with an arrow.

Our agent is programmed to be very cautious and to always start in the upper left corner of the grid, on the square that will be labeled (0,0). Its actions will be: move north; move south; move west; move east; move back; grab; shoot and climb (assumed after the agent grabs the gold). The fundamental goal of the agent corresponds to Intention 1 and consists of reaching the gold, for which it must be positioned facing the gold in an adjacent square. To reach the gold, the agent may first attempt to kill the Wumpus (Intention 2) by shooting an arrow (which is not represented in the environment). The agent can only hit the Wumpus when it is directly in front of it (never diagonally).
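The percept rules just described, together with the cautious retreat on breeze or stench that characterizes the agent, can be sketched in a few lines of Python. This is only an illustration and not the author's MCWW program: the function names, the positions of the pits, the Wumpus and the gold, and the reading of "adjacent" as orthogonally adjacent are all assumptions made for the example.

```python
from typing import Set, Tuple

Square = Tuple[int, int]
SIZE = 8  # 8 x 8 board = 64 squares, as in the CWW description


def neighbours(sq: Square) -> Set[Square]:
    """Orthogonally adjacent squares inside the board (assumed adjacency)."""
    x, y = sq
    candidates = {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}
    return {(a, b) for a, b in candidates if 0 <= a < SIZE and 0 <= b < SIZE}


def percepts(sq: Square, pits: Set[Square], wumpus: Square, gold: Square) -> Set[str]:
    """Percepts received on square sq: breeze, stench and/or glitter."""
    found = set()
    if neighbours(sq) & pits:
        found.add("breeze")
    if wumpus in neighbours(sq):
        found.add("stench")
    if sq == gold:
        found.add("glitter")
    return found


def cautious_next(previous: Square, current: Square, seen: Set[str]) -> Square:
    """Very cautious rule: move back whenever a breeze or stench is perceived."""
    if {"breeze", "stench"} & seen:
        return previous
    return current


# Hypothetical board used only for this illustration.
PITS = {(2, 0), (5, 5)}
WUMPUS = (3, 3)
GOLD = (6, 6)

seen_now = percepts((1, 0), PITS, WUMPUS, GOLD)
print(seen_now, cautious_next((0, 0), (1, 0), seen_now))
```

On this hypothetical board the agent perceives a breeze on (1,0), because a pit lies on (2,0), and the cautious rule sends it back to the cave square (0,0).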
The agent has only one arrow and can miss but, if successful, it kills the Wumpus. When dying, the Wumpus screams. The agent dies if it enters the square occupied by the Wumpus or if it falls into a square in which there is a pit. If the gold is found, the agent returns to its cave. It may happen that it returns to its cave having fulfilled neither or only one of the two intentions.

In our experiment, our agent will be very cautious because it has the following beliefs (see To, 2010): (a) if it cannot leave its cave, it will not even try; (b) it always avoids danger (it always moves back if it lands on a square where there is a breeze or stench); (c) it does not perish at the hands of the Wumpus (if there is no path free of breeze or stench, the agent prefers perishing in a pit to facing death by the Wumpus, and will choose the square with a breeze); (d) sometimes, the agent will retreat without grabbing the gold and (e) sometimes it will give up a fight against the Wumpus without randomly shooting the arrow.

In our modified version, the following different characteristics are given:
1. The five initial percepts are reduced to only three: breeze, glitter, and scream. The agent perceives the death of the Wumpus even though the Wumpus does not scream when it dies.
2. The Wumpus does not scream because it is killed by the agent. It screams randomly because it is hungry, once it has woken up. Our agent may be forced to reconsider its Intention 2, showing even more cautious behavior.
3. The agent can kill the Wumpus (and fulfill Intention 2) only if it has counted exactly two appearances of the Wumpus awake. In any case, the agent always avoids the presence of the Wumpus. It is a priority for the agent to save its life before trying to fulfill its intentions to get the gold (Intention 1) or to kill the Wumpus (Intention 2).
4.
The novelty of a map, that is, an 8x8 board added to the world, is introduced in the MCWW design, as opposed to the CWW version. This map shows on the screen the agent's knowledge base, that is, the inferences it makes as it moves and receives the different percepts on the board. In this way, it is easier for the experimental subjects to predict the possible movements of the agent.

As an example, observe a sequence of screens of the board of our MCWW version in which the agent kills the Wumpus (the goal of Intention 2). See Figures 1-3:

Figure 1. Screen showing the agent in the cave.
Figure 2. Screen showing the agent hunting the Wumpus.
Figure 3. Screen showing the agent finding the gold.

3. Psychology of intentions
The concept of intention is very difficult to define (see González Marqués & Pelta, 2010). In any case, it is a mental state that can be associated with a series of particular mental states. According to Pacherie & Haggard (2010), intentions possess two distinctive characteristics: they are accessible to consciousness and they maintain some connection with a subsequent action. A series of neuroscientific studies have sought to decode the brain processes that predict the specific content of a subsequent action (see Haggard & Eimer, 1999 and Soon et al., 2008). Immediately before the initiation of the action, a motor content would be generated once the specific situation and the context of action are established. We call this kind of motor execution an "immediate intention" (see Pacherie & Haggard, 2010). From a neural point of view, immediate intentions are conscious experiences tending towards action, generated by the motor systems of the medial frontal cortex. Psychologically, immediate intentions are predictive in the sense that they precede actions. But they also have an episodic, rather than abstract and semantic, quality.
However, human beings also have the capacity to be guided by plans and to apply temporally very distant intentions. These intentions are called prospective intentions. Bratman (1987) speaks of intentions directed towards the future and highlights the commitment to action that is characteristic of all intentions. In this commitment to action, there is a volitional dimension and a dimension centered on reasoning. In the volitional dimension, intentions are characterized by controlling behavior against ordinary desires, for being pro-attitudes. In the rational dimension, the intentions have a planning value. But between their initial formation and their eventual execution, the intentions possess a characteristic stability or inertia. In the absence of new relevant information, the intentions will resist reconsideration. But although non-reconsideration is the default option, as Bratman underlines, future-oriented intentions can not be irreversible since such irreversibility would be irrational once things can change. We are epistemically limited creatures, with limited temporal and cognitive resources to handle information, to deliberate about options, to extract consequences, etc. Thanks to the fact that we are capable of forming prospective intentions, we can withstand the pressure of time and the environment that forces us to a careless deliberation. And even if the intentions resist the reconsideration, they can be revoked. In this article, we propose an experiment in which a series of adolescent students have to predict the movements of an artificial agent in a Classic Wumpus World (CWW) environment and the movements of this agent in a slightly Modified Classic Wumpus World environment (MCWW). The agent has been programmed in a very cautious way in her movements. 
We compare whether the movements predicted by the subjects for the agent match the real movements made by the agent and whether the failures produced in predicting such movements are mostly cautious or bold, analyzing the consequences.

4. Methodology
The participants were 34 healthy adolescent students (16 males, 18 females, mean age = 17.17, S.D. = 0.25) taking Psychology as an optional subject in the second year of A-level at a high school in Madrid. Beforehand, the students received a 55-minute class at the computer explaining the Classic version of Wumpus World (CWW), together with an instruction sheet. Then they had to use a Modified version of the Classic Wumpus World (MCWW) with the characteristics described above. Each student used a Dell desktop computer with an AMD Athlon 64 microprocessor. The experiment studied the fidelity with which the subjects inferred the intentions of the artificial agent both in a CWW environment and in an MCWW environment. Firstly, participants observed step by step how the cautious programmed agent moved along the board of the CWW environment until the end of the round. The positions that the agent occupied up to the final movement were displayed on the screen. Next, each student started with a random board and participated in a round of the game, trying to predict the movements that the artificial agent would make. Below, in Table 1, we can see an abbreviated list of the movements predicted by an experimental subject in the CWW condition (in bold, the movements predicted by the subject versus the real movements made by the programmed agent are marked):

Table 1. Movements predicted by an experimental subject in the CWW condition.
Step 1:  Position 1,0        Position 1,0
Step 2:  Position 2,0        Position 2,0
Step 3:  Position 3,0        Position 3,0
Step 4:  Position 4,0        Position 4,0
Step 5:  Position 5,0        Position 5,0
Step 6:  Position 6,0        Position 6,0
Step 7:  Position 6,1        Position 7,0
Step 8:  Position 7,1        Position 7,1
Step 9:  Position 7,2        Position 6,1
Step 10: Position 6,2        Position 6,2
Step 11: Position 5,2        Position 5,2
Step 12: Position 4,2        Position 4,2
Step 13: Position 3,2        Position 3,2
Step 14: Position 2,2        Position 2,2
Step 15: Position 1,2        Position 1,2
Step 16: Position 1,1        Position 1,1
Step 17: Position 1,0        Position 1,0
Step 18: Position 0,0 Cave   Position 0,0

Afterwards, each student observed a round of the game in the MCWW condition and participated in a new round trying to infer the movements that the programmed agent would make. Apart from statistically comparing the correct and non-correct predictions of the subjects in the inference of the intentions of the artificial agent in both versions of the environment, a distinction was made between the failures caused by cautious movements and those caused by bold movements. The movements farthest from the safe squares were considered bold movements, that is, those farthest from the agent's cave and closest to the cave of the Wumpus and to the squares occupied by pits.

5. Results
We used the SPSS 18.0 statistical program and obtained the percentage of correct and non-correct predictions regarding the movements made by the artificial agent, as well as the percentage difference between cautious movements and bold movements predicted by the subjects. We have included other descriptive statistics and the corresponding confidence intervals. To determine the significance of the differences between the CWW condition and the MCWW condition, we applied several ANOVAs to correct and non-correct predictions and, within the non-correct predictions, to the difference between cautious predictions and bold predictions.
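The step-by-step comparison illustrated in Table 1 can be sketched as follows. This is a minimal illustration, not the scoring code used in the experiment: it assumes that the first position column of Table 1 holds the subject's prediction and the second the agent's real position (the count of hits does not depend on that choice), and the function name is hypothetical.

```python
from typing import List, Tuple

Square = Tuple[int, int]

# Positions transcribed from Table 1 (steps 1 to 18).
# Assumption: first column = predicted by the subject, second = real agent position.
PREDICTED: List[Square] = [
    (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (6, 1), (7, 1), (7, 2),
    (6, 2), (5, 2), (4, 2), (3, 2), (2, 2), (1, 2), (1, 1), (1, 0), (0, 0),
]
ACTUAL: List[Square] = [
    (1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (7, 1), (6, 1),
    (6, 2), (5, 2), (4, 2), (3, 2), (2, 2), (1, 2), (1, 1), (1, 0), (0, 0),
]


def score_round(predicted: List[Square], actual: List[Square]):
    """Return the number of hits and the 1-based steps where the prediction failed."""
    if len(predicted) != len(actual):
        raise ValueError("both sequences must cover the same steps")
    failed_steps = [
        step
        for step, (p, a) in enumerate(zip(predicted, actual), start=1)
        if p != a
    ]
    hits = len(predicted) - len(failed_steps)
    return hits, failed_steps


hits, failed_steps = score_round(PREDICTED, ACTUAL)
print(f"hits: {hits}, failures: {len(failed_steps)} at steps {failed_steps}")
```

For this abbreviated round the two sequences differ only at steps 7 and 9, so 16 of the 18 listed movements count as correct predictions.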
Firstly, we look at the difference between correct and non-correct predictions. With respect to the correct predictions, in the MCWW condition, the total sample was 34 [N = 34]. There were 186 hits (20%) out of the total movements. The sample had a mean of 23.40 (SD = 9.33), 95% CI [4.80, 6.14]. In the CWW condition, the total sample was 34 [N = 34] and there were 517 hits (63%) out of the total movements. The sample had a mean of 15.50 (SD = 8.57), 95% CI [12.51, 18.49]. In the ANOVA comparing correct predictions between MCWW and CWW, we found a significant difference in favor of CWW, as shown in Table 2.

Table 2. Correct predictions ANOVA in conditions MCWW and CWW.
                      SS        df    MS        F
Between Conditions    11.05     1     .50       3.80
Within Conditions     5.94      66    .13
Total                 17.00     67
**p < .05

As for the non-correct predictions, with respect to the total movements and for [N = 34], in the MCWW condition there were 738 failures out of a total of 924 movements (80%). In contrast, in the CWW condition, 304 failures were made out of the total of 821 movements (37%). In the MCWW condition, the mean was 21.71 (SD = 11.19), 95% CI [17.80, 25.61]. In the CWW condition, the mean was 8.94 (SD = 4.29), 95% CI [7.44, 10.44]. In the ANOVA comparing MCWW and CWW with respect to the failures, we found a significant difference in favor of MCWW (see Table 3).

Table 3. Non-correct predictions ANOVA in conditions MCWW and CWW.
                      SS        df    MS        F
Between Conditions    2769.94   1     2769.94   38.54
Within Conditions     4742.94   66    71.86
Total                 7512.88   67
**p < .05

Within the total of failures we have distinguished between cautious and non-cautious (bold) failures (the difference between the two types of failures was indicated above). In the case of cautious or conservative failures, for [N = 34], the mean was 7.70 (SD = .88), 95% CI [7.38, 8.01].
With regard to bold failures, the mean was 14.36 (SD = 10.01), 95% CI [10.81, 17.91].

Now let us analyze what kind of failures occurred in each of the versions of Wumpus World. In the MCWW condition, the cautious failures were 258 out of a total of 738 wrong movements (35%) and, for [N = 34], the mean was 7.59 (SD = 2.59), 95% CI [6.68, 8.49]. The bold failures were 480 out of a total of 738 wrong movements (65%) and the mean was 13.82 (SD = 9.33), 95% CI [10.57, 17.08]. See Figure 4:

Figure 4. Failures percentage in the MCWW condition.

In the ANOVA that compares the two types of failures in the MCWW condition, we found a significant difference in favor of the group of bold failures versus the group of cautious failures, as can be seen in Table 4.

Table 4. Cautious failures versus bold failures ANOVA in the MCWW condition.
                      SS        df    MS        F
Between Conditions    660.94    1     660.94    14.07
Within Conditions     3099.17   66    46.95
Total                 3760.11   67
**p < .05

In the CWW condition, for [N = 34], the cautious failures were 139 out of a total of 304 wrong movements (46%) and the mean was 4.09 (SD = 2.22), 95% CI [3.31, 4.86]. On the other hand, the bold failures were 165 out of a total of 304 wrong movements (54%) and the mean was 4.85 (SD = 1.48), 95% CI [4.34, 5.37]. See Figure 5:

Figure 5. Failures percentage in the CWW condition.

In the ANOVA comparing the two types of failures in the CWW condition, there was no significant difference between the group of bold failures and the group of cautious failures.

To conclude, we give an account of the comparative results between the MCWW and CWW conditions with regard to bold failures. In the MCWW condition there were 480 bold failures and in the CWW condition there were 165 bold failures.
In the first case, for [N = 34], the mean was 14.12 (SD = 9.84), 95% CI [10.68, 17.55]. In the second case, for [N = 34], the mean was 4.91 (SD = 3.43), 95% CI [3.71, 6.11]. In the ANOVA comparing the bold failures in both conditions, a significant difference was found in favor of the MCWW condition (see Table 5).

Table 5. Bold failures ANOVA in the MCWW and CWW conditions.
                      SS        df    MS        F
Between Conditions    1440.72   1     1440.72   104.69
Within Conditions     908.26    66    13.76
Total                 2348.98   67
**p < .05

6. Discussion
It is very indicative that, in the Classic version of Wumpus World or CWW condition, the total number of movements was lower than in the modified or MCWW condition. For example, when we count the movements generated by the subjects in both conditions, always for [N = 34], the subjects executed 821 total movements in the CWW condition compared to the 924 movements made in the MCWW condition (somewhat more than 10% more). The difference in the number of failures produced in the two conditions is even more significant. In the CWW condition, there were only 304 failures (37% of total movements) compared to the 738 failures produced by the subjects in the MCWW condition (80% of total movements). In parallel, as regards the correct predictions in the two conditions, the number of hits by the subjects in the CWW condition was very high, 517 (63% of the total number of movements), compared to only 186 hits by the participants in the MCWW condition (20% of the total movements). This result is very logical, given that in the Classic version or CWW condition the subjects did not hesitate, as they did not have to reconsider intentions and had more certainty to stick to.
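The percentages quoted above and the F ratios of Tables 3 to 5 can be rechecked with simple arithmetic. The sketch below is only a plausibility check and not the SPSS analysis itself: it assumes the standard ANOVA definitions MS = SS / df and F = MS_between / MS_within, and the function names are illustrative.

```python
def percentage(part: int, total: int) -> int:
    """Share of `part` in `total`, rounded to the nearest whole percent."""
    return round(100 * part / total)


def f_ratio(ss_between: float, df_between: int, ss_within: float, df_within: int) -> float:
    """F statistic from sums of squares: (SS_b / df_b) / (SS_w / df_w)."""
    return (ss_between / df_between) / (ss_within / df_within)


# Counts reported in the text (hits and failures over total movements).
print(percentage(186, 924), percentage(738, 924))  # MCWW: hits, failures
print(percentage(517, 821), percentage(304, 821))  # CWW: hits, failures
print(percentage(258, 738), percentage(480, 738))  # MCWW: cautious, bold failures
print(percentage(139, 304), percentage(165, 304))  # CWW: cautious, bold failures

# Sums of squares and degrees of freedom as printed in Tables 3, 4 and 5.
tables = {
    "Table 3": (2769.94, 1, 4742.94, 66),
    "Table 4": (660.94, 1, 3099.17, 66),
    "Table 5": (1440.72, 1, 908.26, 66),
}
for name, values in tables.items():
    print(name, round(f_ratio(*values), 2))
```

The recomputed F values can differ from the printed ones in the last decimal place, since the tables report rounded or truncated figures.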
In short, we have verified that there was a notable statistical difference between the two conditions in the percentages of correct and non-correct predictions, as well as, within the failures, between those arising from bold versus cautious movements. In the MCWW condition, we have found a higher percentage of non-correct predictions than correct predictions and a higher percentage of failures produced by bold movements than by cautious movements. Even with the presence of a map incorporated into the agent's inference board, subjects have been much less cautious in anticipating the movements to be made by the artificial agent. In the ANOVAs we have found that such differences were significant.

In the terms of Pacherie and Haggard (2010), it seems as if our group of adolescents had behaved like optimistic improvisers when reconsidering intentions: improvisers because they do not seem to think much about how the cautious artificial agent will act, and optimistic because, significantly, their failures involve taking risks rather than being the product of cautious decisions. This should not surprise us given what we discussed in the introduction about impulsive decision-making by adolescents but, in any case, it shows how their behavior seems to move away from what Pacherie and Haggard called "neurotic planners", that is, subjects who make extensive use of mental time travel to imaginatively combine and recombine possible situations and strategies for enacting their intentions.

7. Conclusion
In this article, we have tried to demonstrate how a sample of adolescents was unable to predict the behavior of a cautious artificial agent in the Wumpus World environment of Artificial Intelligence when a series of initially given intentions had to be reconsidered.
Numerous studies have shown how adolescents usually opt for risky decisions in multiple daily tasks (Duell et al. 2016; Blakemore 2017; Do et al. 2017), but until now this had not been verified in a task that brings Psychology into contact with Artificial Intelligence in the field of the inference of intentions. The conclusion is that adolescents committed more failures, and among these failures more of the bold type than of the cautious type, in their predictions of the movements of the artificial agent in the Modified or MCWW version than in the Classic or CWW version. These results would support the view of adolescents as optimistic improvisers (see Pacherie & Haggard, 2010) in their decision-making processes. We have analyzed the ability of adolescents to predict the behavior of an artificial agent programmed in a cautious manner, concluding that they prefer bold choices when they have to reconsider intentions. But there is a wide field in the study of adolescent behavior using AI tools. For example, it seems proven that adolescent behavior is more impulsive and emotional under "hot" or arousing conditions (Figner et al., 2009; Somerville et al., 2011). A next step would be the design of computer games with which to analyze the decision-making behavior of adolescents, introducing emotions and extreme conditions in those games.

References
Blakemore, S-J. (2017). Avoiding social risk in adolescence. In press in Current Directions in Psychological Science. Retrieved preprint from https://sites.google.com/site/blake- morelab/recent_publications.
Bratman, M. (1987). Intention, plans and practical reasoning. Stanford: CSLI.
Castelfranchi, C., Falcone, R., & Piunti, M. (2006). Agents with anticipatory behaviors: to be cautious in a risky environment, Proceedings ECAI (Frontiers in AI and Applications), 693-694.
Chick, C.F., & Reyna, V.F. (2012). A fuzzy-trace theory of adolescent risk-taking: Beyond self-control and sensation seeking. In V.F. Reyna, S. Chapman, M. Dougherty, & J. Confrey (Eds.), The adolescent brain: Learning, reasoning, and decision making. Washington DC: APA, 379-428.
Do, K.T., Moreira, J.F., & Telzer, E.H. (2017). But is helping you worth the risk? Defining prosocial risk taking in adolescence, Developmental Cognitive Neuroscience, 25, 260-271. Retrieved from https://www.sciencedirect.com/science/article/pii/S1878929 316300561.
Duell, N., Steinberg, L., Chein, J., Al-Hassan, S.M., Bacchini, D., Lei, C. et al. (2016). Interaction of reward seeking and self-regulation in the prediction of risk taking: a cross-national test of the dual systems model, Developmental Psychology, 52(10), 1593-1605. Retrieved from https://dx.doi.org/10.1037/dev 00 00152.
Figner, B., Mackinlay, R.J., Wilkening, F., & Weber, E.U. (2009). Affective and deliberative processes in risky choice: age differences in risk taking in the Columbia card task, Journal of Experimental Psychology: Learning, Memory, and Cognition, 35(3), 709-730.
Galván, A. (2010). Adolescent development of the reward system, Frontiers in Human Neuroscience, 4(6), 1-9.
Galván, A. (2016). Insights about adolescent behavior, plasticity, and policy from neuroscience research, Neuron, 83(2), 262-265. Retrieved from https://www.cell.com/neuron/pdf/S 0896-6273(14)00549-2.pdf.
Galván, A. (2017). The neuroscience of adolescence. Cambridge: C.U.P.
Galván, A., Hare, T., Voss, H., Glover, G., & Casey, B.J. (2007). Risk-taking and the adolescent brain: who is at risk?, Developmental Science, 10(2), F8-F14.
Geier, C.F., Terwilliger, R., Teslovich, T., Velanova, K., & Luna, B. (2010). Immaturities in reward processing and its influence on inhibitory control in adolescence, Cerebral Cortex, 20, 1613-1629.
González Marqués, J., & Pelta, C. (2010). Implementation intentions and artificial agents, Int. Journal of Psychology and Psychological Therapy, 10(1), 41-53.
Haggard, P., & Eimer, M. (1999). On the relation between brain potentials and the awareness of voluntary movements, Experimental Brain Research, 126(1), 128-133.
Kahneman, D., & Lovallo, D. (1993). Timid choices and bold forecasts: a cognitive perspective on risk taking, Management Science, 39(1), 17-31.
Kinny, D., & Georgeff, M. (1991). Commitment and effectiveness of situated agents, Proc. of the Twelfth International Joint Conference on Artificial Intelligence (IJCAI91), Sydney, Australia.
Luna, B., Paulsen, D.J., Padmanabhan, A., & Geier, C. (2013). The teenage brain: cognitive control and motivation, Current Directions in Psychological Science, 22(2), 94-100.
Pacherie, E., & Haggard, P. (2010). What are intentions? In L. Nadel & W. Sinnott-Armstrong (Eds.), Conscious will and responsibility. A tribute to Benjamin Libet. Oxford: OUP, 70-84.
Romer, D., Reyna, V., & Satterthwaite, T.D. (2016). Beyond stereotypes of adolescent risk taking: placing the adolescent brain in developmental context, Developmental Cognitive Neuroscience, 27(2017), 19-34. Retrieved from http://dx.doi.org/10.1016/j.dcn.2017.07.007.
Russell, S., & Norvig, P. (2006). Artificial intelligence: a modern approach. NJ: Prentice-Hall.
Russell, S., & Wefald, E. (1991). Principles of metareasoning, Artificial Intelligence, 49(1-3), 361-395.
Small, S.A., Silverberg, S.B., & Kerns, D. (1993). Adolescents' perceptions of the costs and benefits of engaging in health-compromising behaviors, Journal of Youth and Adolescence, 22(1), 73-87.
Shulman, E.P., & Cauffman, E. (2014). Deciding in the dark: age differences in intuitive risk judgment, Developmental Psychology, 50(1), 167.
Shulman, E.P., Smith, A.R., Silva, K., Icenogle, G., Duell, N., Chein, J., & Steinberg, L. (2016). The dual systems model: review, reappraisal and reaffirmation, Developmental Cognitive Neuroscience, 17, 103-117.
Somerville, L.H., Hare, T., & Casey, B.J. (2011). Frontostriatal maturation predicts cognitive control failure to appetitive cues in adolescents, Journal of Cognitive Neuroscience, 23(9), 2123-2134. Retrieved from https://dx.doi.org/10.1162/jocn.2010.21572.
Soon, C.S., Brass, M., Heinze, H.J., & Haynes, J.D. (2008). Unconscious determinants of free decisions in the human brain, Nature Neuroscience, 11(5), 543-545.
To, E. (2010). Final Project Course CSC457. DePaul University. Retrieved from http://shrike.Depaul.edu/~yto/csc457/newProFinalProject_ericto.htm.
Van Zee, M., & Icard, T. (2015). Intention reconsideration as metareasoning, BORM NIPS Workshop, 1-7.
Wadman, M. (2018). Watching the teen brain grow, Science, 359(6371). Retrieved from https://Science.sciencemag.org/content/359/6371/13.
Wilhelms, E., Corbin, J.C., & Reyna, V.F. (2015). Gist memory in reasoning and decision making: age, experience and expertise. In A. Feeney & V.F. Reyna (Eds.), Reasoning as memory. New York: Psychology Press.

Carlos PELTA received his PhD in Psychology (2015) from Complutense University of Madrid. His current research interests include different aspects of Artificial Intelligence applied to Psychology, such as the design of a computer system for teaching Psychology (PSICO-A). He has (co-)authored more than 30 papers and conference presentations, is a member of many international societies, and has participated in several research projects.

See code: MCWW (MODIFIED CLASSIC WUMPUS WORLD) - author: Carlos Pelta, on the website of the journal, attached to this article.