Acuity: Journal of English Language Pedagogy, Literature, and Culture. Vol.7 No. 1 2022 https://jurnal.unai.edu/index.php/acuity
Sentiment and Sentence Similarity as Predictors of Integrated and Independent L2 Writing Performance
Kutay Uzun, Trakya University, Department of English Language Teaching, Turkey, kutayuzun@trakya.edu.tr
Ömer Gökhan Ulum, Mersin University, Department of English Language Teaching, Turkey
DOI: 10.35974/acuity.v7i2.2529
Abstract
This study aimed to utilize sentiment and sentence similarity analyses, two Natural Language Processing techniques, to see if and how well they could predict L2 Writing Performance in integrated and independent task conditions. The data sources were an integrated L2 writing corpus of 185 literary analysis essays and an independent L2 writing corpus of 500 argumentative essays, both of which were compiled in higher education contexts. Both essay groups were scored between 0 and 100. Two Python libraries, TextBlob and spaCy, were used to generate sentiment and sentence similarity data. Using sentiment (polarity and subjectivity) and sentence similarity variables, regression models were built and 95% prediction intervals were compared for integrated and independent corpora. The results showed that integrated L2 writing performance could be predicted by subjectivity and sentence similarity. However, only subjectivity predicted independent L2 writing performance. The prediction interval of subjectivity for the independent writing model was found to be narrower than the same interval for integrated writing. The results show that the sentiment and sentence similarity analysis algorithms can be used to generate complementary data to improve more complex multivariate L2 writing performance prediction models.
Keywords: EFL Writing Performance, Independent Writing, Integrated Writing, Sentiment Analysis, Sentence Similarity, Task Type
INTRODUCTION
Natural language processing (NLP), which deals with the computational analysis of human languages for both comprehension and production (Crystal, 2008), has been an ever-growing field of research since the 1940s. Since then, it has been used in fields ranging from computer science to political science for purposes such as machine translation, speech recognition, part-of-speech tagging, sentiment analysis, language production (e.g. chat bots), topic modelling and automated question-answer systems. Despite the wide use of NLP in various fields, including educational science (e.g. Crossley, Paquette, Dascalu, McNamara & Baker, 2016), foreign language writing research makes limited use of state-of-the-art NLP applications. Most studies which utilize NLP seem to benefit from automated feedback/essay evaluation (e.g. Parra & Calero, 2019) and the computation of cohesion (e.g. Jung, Crossley & McNamara, 2019) or complexity indices (e.g. Casal & Lee, 2019), with a few exceptions such as DeCoursey and Hamad (2019), Hall and Sheyholislami (2013) and Wang (2020), who investigate sentiments in learner reflections, written feedback and syntactic complexity.
Corresponding Author: Kutay Uzun, Trakya Universitesi, Kosova Yerleskesi, Eğitim Fakultesi, Oda No:G-06. email: kutayuzun@trakya.edu.tr
Emotions have been shown to influence second language acquisition (MacIntyre & Gregersen, 2012), vocabulary acquisition (Miller, Fox, Moser & Godfroid, 2018) and performance in foreign language tests and lexical decision tasks (Dewaele & Alfawzan, 2018).
Nonetheless, L2 writing seems to have fallen behind other aspects of language learning in terms of emotion research despite extensive studies on anxiety, a negative emotion, or related constructs such as motivation or attitude. Although these constructs have been studied for decades and fruitful discussions have emerged as a result, most of those studies rely on psychometric scales for the measurement of emotions (e.g. Cheng, 2004; Han & Hiver, 2018); therefore, they are not able to account for the instantaneous variations of those emotions. Moreover, the reflection of emotion or a related construct within the learner text is yet to be explored except for Wang’s (2020) study.
Another problematic area within L2 writing research is cohesion, or the general connectedness of the parts of a text. Traditionally, cohesion is investigated through explicit cues such as conjunctions or personal/demonstrative pronouns. However, cohesion can also be achieved implicitly and this cannot be tracked by traditional means of cohesion assessment. For this reason, certain computationally available constructs such as type-token ratios, synonym overlap, connective frequency and semantic similarity within (and across, if necessary) texts should be used to assess cohesion (Crossley, Kyle & Dascalu, 2018). However, due to the limited number of studies regarding each of these constructs, further research is still needed to see how they interact with other constructs regarding L2 writing.
In addition to the necessity to study emotion and cohesion in computational terms, an important distinction in L2 writing lies within the difference between integrated and independent writing tasks, which are inherently different from one another. Integrated writing requires learner-writers to utilize primary and/or secondary sources of information for the completion of the task (Weigle & Parker, 2012).
On the contrary, independent writing is exclusively based on the learner-writer’s personal experiences and available linguistic resources without necessitating any use of sources. As such, it differs from integrated writing in lexical, syntactic and lexicogrammatical terms (Kyle, 2020). The coverage of academic skills in integrated writing, unlike its independent counterpart, is among the major differences between the two task types (Kyle, 2020). Related to this, integrated writing pieces include more specific lexis, longer words and a lower level of clausal complexity (Cumming et al., 2006; Kyle & Crossley, 2016). Biber, Gray and Staples (2016) also confirm more extensive use of clauses in independent writing and conclude that integrated writing is better marked by nouns, nominals, noun phrases and phrasal complexity. Guo, Crossley and McNamara (2013) also confirm the differences between integrated and independent writing by identifying content word familiarity, content word frequency, third-person singular verbs, base verbs and sentence similarity as predictors of integrated writing scores. On the other hand, independent writing scores were predicted by noun hypernymy, conditional connectives and average syllables per word in their study.
Considering the limited use of NLP technology in foreign language research and the role of emotions in language performance, the number and scope of the studies dealing with these concepts can be expanded. However, such an expansion should also consider the substantial differences between integrated and independent writing tasks. Therefore, this study aims to contribute to this expansion by searching for the potential connections among L2 writing performance (L2WP), sentiment and sentence similarity as manifested within English as a Foreign Language (EFL) learners’ texts, while comparing how these constructs interact with integrated and independent task performance.
Sentiment Analysis and L2 Writing Performance
Sentiment is defined as an individual’s emotions, opinions, evaluations or beliefs manifested as language (Wiebe, Wilson, Bruce, Bell & Martin, 2004). Therefore, sentiment analysis (SA) is the systematic analysis of those constructs using NLP methods (Liu, 2010). The analysis of sentiments gives information about the polarity of emotions or opinions as positive, negative or neutral in the form of an index (Munezero, Montero, Sutinen & Pajunen, 2014).
Sentiment analysis typically involves pre-processing and matching or classification stages to produce results. The pre-processing stage involves the removal of stop words (e.g. function words) and symbols and checking the subjectivity of the text. Then, polarity is computed based on a pre-labelled lexicon or machine learning classification algorithms which classify texts using polarity models (Kumar & Teeja, 2012). However, the removal of stop words in the pre-processing stage may not make a significant change in the accuracy of sentiment computation (Jianqiang & Xiaolin, 2017) or may even reduce its accuracy (Ghosal, Das & Bhattacharjee, 2015).
Numerous pre-labelled lexicons for sentiment analysis are available in the literature (Liu, 2010). For instance, Linguistic Inquiry and Word Count, The General Inquirer, Hu and Liu’s lexicon, The Affective Norms for English Words, SentiWordNet and SenticNet, which can also utilize machine learning algorithms such as Naive Bayes to automate labelling, are widely used lexicons for sentiment analysis. These lexicons keep large lists of words and their sentiment orientations as classes (e.g. sad: negative, happy: positive) or indices (e.g.
great: 3.1, tragedy: -3.4) and sentiment analysis algorithms compare texts to those lists to compute sentiment scores (Hutto & Gilbert, 2014).
Although it is possible to run sentiment analysis with many programming languages, the Python-based TextBlob and VADER libraries are the simplest ones to use (Kulkarni & Shivananda, 2019). Both libraries are based on the Natural Language Toolkit (NLTK), which is a high-powered Python package for language processing that is widely used in research and industry (Bird, Loper & Klein, 2009). TextBlob produces polarity and subjectivity scores for sentiment analysis. The polarity score is between -1 and 1, -1 indicating total negativity and 1 indicating total positivity. A subjectivity score of 1 indicates total subjectivity while 0 indicates total objectivity (Loria, 2020). A library specifically developed for social media analysis, VADER produces separate positivity, neutrality and negativity scores between 0 and 1. Also, it normalizes these scores into a compound score between -1 and 1, -1 indicating total negativity and 1 indicating total positivity. For analysis, VADER can also use capitalization, punctuation and emoticons (e.g. “This is GOOD!!!” gives a higher positivity score than “This is good.”) (Hutto & Gilbert, 2014). Both libraries are widely used in computer science with limited use in other fields such as finance (e.g. Ranjan & Sood, 2019) or education (e.g. Peñafiel, Vásquez, Vásquez, Zaldumbide & Luján-Mora, 2018).
Being related to motivation and self-regulation, emotion is considered an individual difference in L2 writing (Kormos, 2012). In line with this, most emotion-related L2 writing research focuses on anxiety (e.g. Cheng, 2004), attitude (e.g. Yoon & Hirvela, 2004) or motivation (e.g. Lo & Hyland, 2007). Indeed, many studies such as Graham, Berninger and Abbott (2012), Guo (2018) and Graham, Harris, Kiuhara and Fishman (2017) confirm that anxiety, attitude and motivation predict writing performance.
Nonetheless, most studies on L2 writing rely solely on psychometric scales to measure emotional constructs; therefore, they cannot track or explain the momentary fluctuations in those emotions, which may affect written production partially or completely. Furthermore, if and how emotions are reflected in the written production itself is mostly left unclear.
Given its potential for computer science, education and even clinical psychology (Provoost, Ruwaard, van Breda, Riper & Bosse, 2019), sentiment analysis can provide information for L2 writing researchers and practitioners regarding how emotions, stances or evaluations are reflected in texts. One such study utilizing SA in L2WP research is that of Wang (2020), which analyses 2620 college-level essays written by Chinese learners of English and reaches the following conclusions:
- Emotions as manifested in texts are influenced by the emotionality of writing topics.
- Textual polarity and syntactic complexity are related.
- Positive and negative emotions cause higher cognitive load and hinder L2WP.
- Optimal performance is achieved through textual neutrality.
To our knowledge, Wang’s (2020) study is the only one in the current literature which uses SA in relation to L2WP and it is limited to the syntactic complexity of texts written by Chinese learners of EFL. Findings parallel to Wang’s findings in L2 writing can be found in studies which test different skills using non-NLP methods. For instance, the effect of emotions on cognitive load and language performance has been confirmed for L2 listening (Chen & Chang, 2009), reading (Azamnouri, Pishghadam & Meidani, 2020) and vocabulary (Guo, Zou & Peng, 2018).
Moreover, lack of objectivity, which is a standard in academic writing (Fulwiler, 2002; Richards & Miller, 2005) and also has cultural roots (Hinkel, 1999; Hwang & Lee, 2008), has been shown to result in lower essay scores among non-native writers of English since it results in an infrequency of proper evidence or justification for claims (Carlson, 1988 as cited in Hinkel, 1999). However, sentiment as measured via sentiment analysis is not a component in these studies and there seems to be no research in the literature regarding the construct and L2WP except for Wang’s study, which does not provide comparative results for integrated and independent writing.
Semantic Sentence Similarity and L2 Writing Performance
Semantic similarity is a comparative measure of semantic relatedness which evaluates semantic interactions among language units. In the process, taxonomic relationships and commonality are also considered on a hierarchical basis with corpus-based or knowledge-based methods (Harispe, Ranwez, Janaqi & Montmain, 2015; Turney & Pantel, 2010). Corpus-based methods extract contextual information from different corpora and use this information to measure semantic relatedness. Knowledge-based methods rely on WordNets, large lexical databases that also keep associations among words, to compute sentence similarity through the hierarchical relations among words. Corpus-based methods are considered more suitable to account for all semantic relations while knowledge-based methods better serve the purpose of encoding hierarchical relations (Araque, Zhu & Iglesias, 2019). Both methods can be used separately or in combination at the word, sentence, paragraph or document level.
Python libraries such as TextBlob (Loria, 2020), NLTK (Bird et al., 2009) or spaCy (Honnibal & Montani, 2017) can be used for similarity computations with only a few lines of code. These libraries produce scores between 0 and 1 where 0 indicates no similarity and 1 indicates sameness.
For instance, the sentences “We should put an end to wars.” and “Let’s finish wars.” produce a similarity score of .87 using spaCy, indicating high similarity. Among NLP libraries, spaCy has been shown to be among the most accurate ones and the fastest one (Honnibal & Johnson, 2015).
Crossley et al. (2018) suggest sentence similarity as an indicator of discourse cohesion. Cohesion refers to the connectedness of texts through surface elements, such as connectives or reference words, which make their meaning more accessible to readers (Bailey, 2011). It is considered to be an integral part of understanding how readers are guided by discourse features towards text comprehension (Baştürkmen & von Randow, 2014). Cohesion can be achieved grammatically through conjunctions, references, substitutions or ellipses, or lexically through collocations and reiterations (Grabe & Kaplan, 2014; Halliday & Matthiessen, 2014). Numerous studies indicate a relationship between cohesion and L2WP (e.g. Crossley et al., 2018; Crossley, Kyle & McNamara, 2016; McArthur, Jennings & Philippakos, 2019; Yang & Sun, 2012).
Despite the established relationship between cohesion and L2WP, Crossley et al. (2018) warn that the traditional measures of cohesion through overt elements (e.g. use of conjunctions) may be insufficient since cohesion can be achieved explicitly or implicitly (Sanders & Maat, 2006) and, in the latter case, its evaluation becomes more difficult. For this reason, they propose an NLTK-based tool, TAACO, which assesses local (i.e. sentence-level) and global (i.e. paragraph-level) cohesion through connectives, type-token ratios, lexical overlap and sentence similarity to reveal underlying semantic relations among textual elements which constitute discourse cohesion.
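To make two of the index types named above more concrete, the following minimal Python sketch computes a type-token ratio and a lexical overlap score between adjacent sentences. It is an illustration only and does not reproduce TAACO's implementation: the function names, the regular-expression tokenizer and the choice of Jaccard overlap are all assumptions made for this example.

```python
import re
from statistics import fmean


def tokenize(text):
    """Lowercase the text and return its word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Number of distinct word forms divided by the number of tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def adjacent_lexical_overlap(sentences):
    """Mean Jaccard overlap of word sets between neighbouring sentences.

    This is one simple way to approximate local (sentence-level) cohesion;
    it is a hypothetical stand-in, not the index computed by TAACO.
    """
    overlaps = []
    for first, second in zip(sentences, sentences[1:]):
        a, b = set(tokenize(first)), set(tokenize(second))
        union = a | b
        overlaps.append(len(a & b) / len(union) if union else 0.0)
    return fmean(overlaps) if overlaps else 0.0


if __name__ == "__main__":
    essay = ["The cat sat.", "The cat ran.", "Dogs bark."]
    print(type_token_ratio(" ".join(essay)))
    print(adjacent_lexical_overlap(essay))
```

Surface indices of this kind capture only explicit repetition, which is why embedding-based sentence similarity is proposed as a complement for implicit cohesion.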
A part of cohesion, sentence similarity has been shown to be related to L2WP. For instance, Crossley and McNamara (2012) reveal a negative correlation between sentence similarity and essay score. In another study, they also find that sentence similarity predicts textual coherence (Crossley & McNamara, 2011). Guo, Crossley and McNamara (2013), Kyle (2020) and Plakans and Gebril (2017) conclude that sentence similarity can predict essay score in integrated tasks. In the light of these findings, sentence similarity is used in automated essay scoring (Roscoe, Crossley, Snow, Varner & McNamara, 2014) and feedback systems (e.g. Lee, Wong, Cheung & Lee, 2009). Nonetheless, Guo et al. (2013) seems to be the only study in the literature that provides comparative results for the predictive strength of sentence similarity in integrated and independent writing. Therefore, more research is thought to be beneficial to understand how sentence similarity interacts with integrated or independent essay quality in different contexts or genres.
Purpose and Research Questions
Considering the absence of a study searching for a link between sentiment and L2WP and the scarcity of those which link sentence similarity and L2WP, this study aims to contribute to the literature by showing if and how sentiment and sentence similarity can predict L2WP while comparing their predictive strengths in integrated and independent writing. The research questions are as follows:
RQ1. Do EFL writers’ sentiments as manifested in their essays predict their L2WP?
RQ2. Do the prediction intervals of the sentiment model differ in integrated and independent writing?
RQ3. Do sentence similarity scores of EFL writers predict their L2WP?
RQ4. Do the prediction intervals of the sentence similarity model differ in integrated and independent writing?
METHODS
Due to the computational nature of NLP operations (Crystal, 2008), a quantitative design was preferred.
Sentiments, semantic sentence similarities and L2WP were treated numerically.
The Corpora
The corpus of integrated writing samples included 185 literary analysis essays (LAEs) previously collected and scored in Author (2019) (n = 125) and Author (in review) (n = 60). It had 61871 words, giving an average of 334.44 words per essay. The essays typically included four to seven paragraphs, responding to an essay question directed towards how a particular theme is handled in a given literary work. As such, the LAEs required writers to make use of primary and secondary sources for completion. The LAEs were scored using the Genre-based Literary Analysis Essay Scoring Rubric (GLAESR). GLAESR is an analytical rubric that is used to score each rhetorical move in an LAE (stating the background, stating the thesis, presenting arguments, supporting arguments, concluding arguments, consolidating the thesis, stating personal opinion) and produce a total score between 0 and 100 (Author, 2019). In both Author (2019) and Author (in review), scoring demonstrated interrater reliability as confirmed by Spearman’s correlation coefficients.
For independent writing samples, 500 EFL essays from the International Corpus Network of Asian Learners of English were used (Ishikawa, 2018). The corpus as used in the study consisted of 114996 words with an average of 229.99 words per essay. The essays were reliably scored between 0 and 100 using the ESL Composition Profile, which is an analytical rubric that is used to score writing samples according to content, organization, vocabulary, language use and mechanics (Jacobs, Zinkgraf, Wormouth, Hartfiel & Hughey, 1981).
Post-hoc power analysis with G*Power (Faul, Erdfelder, Lang & Buchner, 2007) indicated that the sizes of the corpora were sufficient to achieve 100% statistical power for medium effects in all models. Both corpora were compiled in higher education contexts.
Data Collection
The data set for the study included the sentiment, sentence similarity and essay scores as provided in the corpora. To avoid computing errors, the authors first manually ensured that there was a space after each punctuation mark in the corpus, and each essay was stored as a .txt file with UTF-8 encoding. TextBlob was used for sentiment analysis (Loria, 2020); sentiments were obtained by having an algorithm (APPENDIX A) iterate through all files in the corpus directories and compute the polarity and subjectivity scores for each essay.
For sentence similarity, spaCy was used with its largest model of the English language (en_core_web_lg) (Honnibal & Montani, 2017). To compute a mean sentence similarity value for each essay, an algorithm (APPENDIX B) was written by the authors. The algorithm worked as follows:
1. An essay was read.
2. The sentences in the essay were separated and stored in a list (i.e. tokenization).
3. Each sentence in the essay was compared to all the others in the same essay.
4. The result of each comparison (0.00-1.00) was stored in a list using the following criteria to avoid duplicate comparisons:
   a. The sentence similarity score had to be less than 1.00.
   b. The sentence similarity score (15 digits after the decimal point) must not already have been in the list.
5. The mean sentence similarity score was produced for the essay from the sentence similarity scores in the list using NumPy (Oliphant, 2006).
6. The mean sentence similarity score for the essay was stored in a dictionary as “Filename: Sentence similarity Score”.
7. The process was repeated for the next essay in the corpus directory.
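The two per-essay computations described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code in Appendix A or Appendix B: all function names are hypothetical, the scoring functions are passed in as parameters so that the logic can be run without the NLP libraries installed, and the standard-library `statistics.fmean` is used here in place of NumPy. The helpers `textblob_sentiment` and `make_spacy_similarity` show how TextBlob and spaCy could plausibly be plugged in.

```python
from pathlib import Path
from statistics import fmean


def score_corpus(directory, score_text):
    """Apply score_text to every UTF-8 .txt essay in a corpus directory."""
    results = {}
    for path in sorted(Path(directory).glob("*.txt")):
        results[path.name] = score_text(path.read_text(encoding="utf-8"))
    return results


def mean_sentence_similarity(sentences, similarity):
    """Mean pairwise similarity following steps 3 to 5 of the description."""
    scores, seen = [], set()
    for i, first in enumerate(sentences):
        for j, second in enumerate(sentences):
            if i == j:
                continue
            # Criterion b compares scores at 15 decimal places.
            score = round(similarity(first, second), 15)
            # Criterion a: drop scores of 1.00; criterion b: drop repeats.
            if score >= 1.0 or score in seen:
                continue
            seen.add(score)
            scores.append(score)
    return fmean(scores) if scores else float("nan")


def textblob_sentiment(text):
    """Return (polarity, subjectivity) for a text; requires TextBlob."""
    from textblob import TextBlob  # imported lazily, optional dependency
    sentiment = TextBlob(text).sentiment
    return sentiment.polarity, sentiment.subjectivity


def make_spacy_similarity(model_name="en_core_web_lg"):
    """Build a sentence-pair similarity function; requires spaCy and the model."""
    import spacy  # imported lazily, optional dependency
    nlp = spacy.load(model_name)
    return lambda first, second: nlp(first).similarity(nlp(second))


# Hypothetical usage, assuming both libraries and a "corpus/" folder exist:
#   sentiments = score_corpus("corpus", textblob_sentiment)
#   similarity = make_spacy_similarity()
#   mean_sentence_similarity(["We should put an end to wars.",
#                             "Let's finish wars."], similarity)
```

One consequence of the value-based duplicate filter is worth noting: two different sentence pairs that happen to receive exactly the same score are counted once, and identical sentences are excluded altogether. Iterating over unordered pairs with `itertools.combinations` would avoid both effects, but that is a different procedure from the one described in the numbered steps.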
Data Analysis
The algorithms for the computation of sentiments and sentence similarity were run on Jupyter Notebook (Kluyver et al., 2016). Linear regression analyses were run using JASP v0.12.2 (JASP Team, 2020) to see if sentiment and sentence similarity predicted essay scores since the residual distributions in both models were normal (see Table 1), collinearity statistics were not problematic (see Table 2) and there was no heteroscedasticity (Larson-Hall, 2010). For sentiment, a multivariate model which included both polarity and subjectivity as predictor variables was tested. Sentence similarity was tested in a univariate model.
Table 1. Skewness and Kurtosis Values for Model Residuals
Corpus       Model                Skewness  SE     Kurtosis  SE
Integrated   Sentiment            -0.408    0.179  -0.354    0.355
             Sentence similarity  -0.360    0.179  -0.260    0.355
Independent  Sentiment            -0.489    0.109  1.049     0.218
             Sentence similarity  -0.404    0.109  0.844     0.218
Table 2. Tolerance and Variance Inflation Factor (VIF) Values
Corpus       Model                Variable             Tolerance  VIF
Integrated   Sentiment            Polarity             0.967      1.034
                                  Subjectivity         0.967      1.034
             Sentence similarity  Sentence similarity  N/A        N/A
Independent  Sentiment            Polarity             0.993      1.008
                                  Subjectivity         0.993      1.008
             Sentence similarity  Sentence similarity  N/A        N/A
Prediction strengths of the models were investigated through their 95% prediction intervals, which provide estimated ranges of actual essay scores with 95% confidence. The difference between the lower and upper bounds in each interval was calculated as the width of the interval, smaller numbers indicating narrower and more precise ranges.
RESULTS
Research Question 1
The first research question aimed to see if sentiment could predict integrated and independent essay scores. The descriptive results are given below in Table 3.
Table 3.
Polarity, Subjectivity and Essay Scores
Corpus       Variable      M      SD     Min    Max
Integrated   Polarity      0.08   0.15   -0.29  0.43
             Subjectivity  0.51   0.09   0.26   0.74
             Essay Score   55.01  17.98  8.00   97.00
Independent  Polarity      0.12   0.13   -0.31  0.54
             Subjectivity  0.52   0.08   0.29   0.89
             Essay Score   62.77  14.21  7.90   95.00
As seen in Table 3, neither integrated nor independent writing samples were visibly polarized, with mean polarity scores close to 0 in both corpora. The subjectivity values in both corpora were also around the midpoint of 0.50. Regression results for the integrated writing sentiment model are tabulated in Table 4.
Table 4. Regression Results for the Sentiment Model (Integrated)
            SS         df   MS        F      p
Regression  2613.257   2    1306.629  4.180  .017
Residual    56890.721  182  312.586
Total       59503.978  184
R = .210, R² = .044, Adjusted R² = .033, RMSE = 17.68
As shown in the table, the multivariate sentiment model which included polarity and subjectivity scores as the predictors of integrated writing essay score was significant, explaining 4.4% of the variance (R² = .04, F(2, 182) = 4.18, p < .05). The coefficients for the sentiment model are given below in Table 5.
Table 5. Coefficients for the Sentiment Model (Integrated)
Variable      B        SE B    β       t       p
Constant      74.816   8.141           9.190   < .001
Polarity      6.941    8.699   0.059   0.798   .426
Subjectivity  -39.713  15.338  -0.191  -2.589  .010
Analyses of the coefficients showed that polarity was not a significant predictor of essay score in the model (t = 0.80, p > .05). However, subjectivity was seen to be a significant negative predictor of integrated essay score (t = -2.59, p = .01). Regression results for the independent writing sentiment model are tabulated in Table 6.
Table 6.
Regression Results for the Sentiment Model (Independent)
            SS          df   MS       F      p
Regression  1541.240    2    770.620  3.863  .022
Residual    99153.164   497  199.503
Total       100694.404  499
R = .124, R² = .015, Adjusted R² = .011, RMSE = 14.12
The regression analysis showed that the sentiment model could significantly predict independent essay score, explaining 1.5% of the variance (R² = .02, F(2, 497) = 3.86, p < .05). The coefficients related to the model are presented below in Table 7.
Table 7. Coefficients for the Sentiment Model (Independent)
Variable      B        SE B   β       t       p
Constant      40.486   4.035          17.467  < .001
Polarity      9.397    5.061  0.083   1.857   .064
Subjectivity  -17.164  7.729  -0.099  -2.221  .027
Coefficient analysis showed that polarity was not a significant predictor of independent essay score (t = 1.86, p > .05). On the other hand, subjectivity was found to be a significant negative predictor of independent essay score (t = -2.22, p < .05).
Research Question 2
The second research question aimed to compare the 95% prediction intervals of the sentiment models for integrated and independent writing. The comparison is tabulated below in Table 8.
Table 8. 95% Prediction Intervals for the Sentiment Model
Essay Score  M      SD    Min    Max    U          Z       p       r
Integrated   70.33  0.38  69.96  71.94  125250.00  20.113  < .001  0.77
Independent  55.67  0.14  55.56  57.18
As shown in the table, the mean width of the 95% prediction intervals for the independent essay scores was 14.66 points smaller than that for the integrated essay scores. The difference was statistically significant with a very large effect (Z = 20.11, p < .001).
Research Question 3
The third research question aimed to see if sentence similarity could predict essay score in integrated and independent writing. The descriptive results are presented below in Table 9.
Table 9.
Sentence Similarities and Essay Scores
Corpus       Variable             M      SD     Min   Max
Integrated   Sentence Similarity  0.82   0.02   0.79  0.87
             Essay Score          55.01  17.98  8.00  97.00
Independent  Sentence Similarity  0.88   0.01   0.81  0.89
             Essay Score          62.77  14.21  7.90  95.00
Considering that the maximum sentence similarity score could be 1.00, it was seen that the mean sentence similarity score was quite high in the data set for both groups, with a difference of 0.06. Regression results for the integrated writing sentence similarity model are given below in Table 10.
Table 10. Regression Results for the Sentence Similarity Model (Integrated)
            SS         df   MS        F      p
Regression  1482.015   1    1482.015  4.674  .032
Residual    58021.964  183  317.060
Total       59503.978  184
R = .158, R² = .025, Adjusted R² = .020, RMSE = 17.81
As seen in the table, sentence similarity could significantly predict essay score in integrated writing, explaining 2.5% of the variance (R² = .03, F(1, 183) = 4.67, p < .05). The coefficients for the model are presented in Table 11.
Table 11. Coefficients for the Sentence Similarity Model (Integrated)
Variable             B        SE B    β      t       p
Constant             -54.531  50.684         -1.076  < .001
Sentence Similarity  133.072  61.550  0.158  2.162   .032
In the coefficient analysis, it was seen that sentence similarity could predict integrated essay score with a constant of -54.53 and an unstandardized coefficient (B) of 133.07 (t = 2.16, p < .05). The regression results for the independent sentence similarity model are given below in Table 12.
Table 12.
Regression Results for the Sentence Similarity Model (Independent)
            SS          df   MS       F      p
Regression  756.355     1    756.355  3.769  .053
Residual    99938.049   498  200.679
Total       100694.404  499
R = .087, R² = .008, Adjusted R² = .006, RMSE = 14.17
Analysis revealed that sentence similarity could not significantly predict independent essay score (F(1, 498) = 3.77, p > .05).
Research Question 4
The fourth research question aimed to compare the 95% prediction intervals related to the sentence similarity models of integrated and independent writing. However, no comparison could be made since the variable could not significantly predict independent essay score. The widths of the 95% prediction intervals for the integrated essay scores in the data set were found to have a mean of 70.64 (SD = 0.14) with a minimum of 70.45 and a maximum of 71.18 points.
DISCUSSION
The study aimed to find out if sentiment and sentence similarity, computed via NLP methods, could predict integrated and independent L2WP. The results showed that the polarity component of sentiment could not predict L2WP in either task type; however, subjectivity was a significant negative predictor of both integrated and independent L2WP with a very small effect. The comparison of 95% prediction intervals showed that subjectivity as a negative predictor could predict L2WP in a much narrower range in independent writing. The second major finding obtained in the study was that mean sentence similarity could predict integrated L2WP significantly with a very small effect. The variable could not predict independent L2WP.
The differences in integrated and independent writing as observed in the analyses confirmed Biber et al. (2016), Cumming et al. (2006), Kyle (2020) and Kyle and Crossley (2016), who also indicated varying features of the two task/L2WP types.
Apparently, learner-writers undergo different thinking and written production processes during integrated and independent writing, and this results in visible differences in language use, manifested in constructs such as word familiarity, verb use, subjectivity and sentence similarity.

Regarding sentiment, it is known that emotions, stances or personal evaluations are among the individual differences in L2 writing (Kormos, 2012), and these constructs seem to be reflected in texts written by learners, making a difference in their L2WP. In the present study, subjectivity was found to be a negative predictor of both integrated and independent writing, signalling that more subjective essays received lower scores. This finding can be considered parallel to that of Wang (2020), although the latter is limited to syntactic complexity. In both studies, and regardless of task type in the present study, textual objectivity seemed to result in increased performance. The reason why higher objectivity results in better performance in both integrated and independent writing can be related to the objectivity standard in essay writing (Fulwiler, 2002; Richards & Miller, 2005) as well as to an increased cognitive load due to the emotionality observed in learner texts. As suggested by Carlson (1988) and Hinkel (1999), a lack of objectivity in writing may indicate weaknesses in crucial concepts such as evidence or justification in texts. Considering that both the integrated and independent corpora consisted of expository/argumentative writing tasks, evidence and justification were required components in all essays. Successful justification of claims, with or without source texts, naturally requires an objective outlook which would allow learner-writers to present their arguments from multiple perspectives.
In that respect, a high level of subjectivity may be signalling a lack of these justifications, resulting in lower essay scores in both integrated and independent writing. Moreover, positive and negative emotions increase cognitive load, as concluded by Wang (2020). Defined in relation to working memory (Cooper, 1998), cognitive load is a crucial factor in L2WP because L2 writing, by itself, can overload working memory due to the intensity of the mental processes involved, resulting in poor performance and frequent errors (Nawal, 2018). In addition to the natural cognitive load of L2 writing, the added load due to the emotionality manifested as subjectivity in texts may have further impeded working memory, resulting in lower scores in both corpora.

Subjectivity as a negative predictor demonstrated higher prediction precision in independent writing than in integrated writing. Although the data set used in this study is not sufficient to explore the reasons behind this difference, a plausible explanation may be that the source-based requirements of the literary analysis essay more readily push learners towards a certain level of objectivity, while independent writing may be more flexible in that regard, allowing the learner-writer to approach the objectivity issue more liberally while writing an essay based on life experiences and opinions. This may, therefore, result in a larger negative effect of subjectivity on essay scores, since its excess has been documented to result in lower scores in early studies as well (e.g. Carlson, 1988). However, a cross-comparison of integrated and independent writing samples in terms of objectivity and lexicogrammatical features is necessary for a more assertive conclusion.

The results revealed sentence similarity as a positive predictor of integrated L2WP. However, the construct was not a significant predictor of independent L2WP. This finding corroborated that of Guo et al. (2013), who reported the same result.
In their study, Guo et al. explain the differences through the life experience and personal opinion-based nature of independent writing and the source-based nature of integrated writing, which allows learner-writers to use the sources as models. Moreover, sentence similarity is a component of textual cohesion. Considering this, the necessity to integrate sources to produce a whole in integrated writing may be pushing writers to write more cohesive essays, which is also the case in expository writing (Crossley, 2013; Guo et al., 2013). Considering that the integrated writing corpus used in this study consisted exclusively of expository literary analysis essays, the same reason may explain why sentence similarity was a significant predictor of integrated writing performance but not of independent writing performance. As such, this finding was also in line with Kyle (2020), Plakans and Gebril (2017) and Crossley and McNamara (2012), the last of which indicated no relationship between sentence similarity and independent writing performance.

CONCLUSION

The results of the study show that the subjectivity component of sentiment analysis can predict both integrated and independent L2 writing performance. In both task conditions, subjectivity serves as a negative predictor, indicating that more objective texts receive higher scores. The results also indicate that sentence similarity predicts only integrated L2 writing performance, while it does not seem to be related to independent writing. As such, the findings bear importance as to the use of sentiment analysis in L2 writing performance research and confirm the previously proposed use of sentence similarity analysis within the same domain.
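To make the size of the sentence similarity effect concrete, the figures reported in Tables 9 to 11 can be plugged into the fitted line for the integrated corpus. The following is a minimal sketch rather than part of the study's analysis: the function names are hypothetical, the critical value of t is hard-coded as an approximation for 183 degrees of freedom, and the sum of squared deviations of sentence similarity is recovered from the reported standard error of B, which assumes an ordinary simple linear regression.

```python
import math

# Values copied from Tables 9-11 (integrated corpus).
N_ESSAYS = 185
INTERCEPT = -54.531      # constant, Table 11
SLOPE = 133.072          # B for sentence similarity, Table 11
SE_SLOPE = 61.550        # standard error of B, Table 11
MS_RESIDUAL = 317.060    # residual mean square, Table 10
MEAN_SIMILARITY = 0.82   # mean sentence similarity, Table 9

# Assumption: approximate two-sided 95% critical value of t with 183 df.
T_CRITICAL = 1.973


def predict_score(similarity):
    """Point prediction of essay score from the fitted line."""
    return INTERCEPT + SLOPE * similarity


def interval_width(similarity):
    """Approximate width of the 95% prediction interval at a similarity value."""
    s = math.sqrt(MS_RESIDUAL)
    # Sum of squared deviations of x, recovered from SE(B) = s / sqrt(Sxx).
    sxx = (s / SE_SLOPE) ** 2
    leverage = 1 / N_ESSAYS + (similarity - MEAN_SIMILARITY) ** 2 / sxx
    return 2 * T_CRITICAL * s * math.sqrt(1 + leverage)


if __name__ == "__main__":
    for x in (0.79, 0.82, 0.87):
        print(f"similarity={x:.2f}  "
              f"predicted={predict_score(x):.1f}  "
              f"width={interval_width(x):.1f}")
```

Under these assumptions, the interval around a predicted score spans roughly 70 points at the mean similarity value, which is in line with the minimum of 70.45 reported under Research Question 4.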
Bearing the findings in mind, consciousness-raising interventions can be developed and applied by teachers and researchers to improve objectivity and integratedness in learner writing. Although effect sizes of the prediction equations were quite small in this study, the results revealed the contribution of these constructs to L2WP. The small effect sizes of the regression models should be treated with caution since prediction intervals in all models were rather wide in both integrated and independent corpora. In this regard, it is not recommended to attempt score predictions based solely on these variables. Instead, the variables should be seen as complementary to more complex multivariate prediction models.

Apart from sentiment and sentence similarity in particular, the results also confirm NLP in general as a beneficial tool for researchers of language learning/teaching as well as practitioners. Using NLP tools for the analysis of learner language seems to provide insights that may not be accessible through more traditional forms of data collection. Both automated and manual forms of written corrective feedback or assessment can benefit from the indices produced by these tools.

As shown in the literature and this study, task type influences how different variables interact with L2WP. In that respect, different genres should be tested using the same methodology for comparison purposes. Moreover, the data set used in this study cannot explain why objectivity can produce a narrower prediction interval for independent writing than for integrated writing. For a thorough explanation, the lexicogrammatical features of highly objective and highly subjective texts should be compared in integrated and independent task conditions.

REFERENCES

Araque, O., Zhu, G., & Iglesias, C. A. (2019). A semantic similarity-based perspective of affect lexicons for sentiment analysis. Knowledge-Based Systems, 165, 346-359.
Azamnouri, N., Pishghadam, R., & Meidani, E. N. (2020). The role of emotioncy in cognitive load and sentence comprehension of language learners. Issues in Language Teaching, 9(1), 29-55. https://doi.org/10.22054/ilt.2020.51543.485
Bailey, S. (2011). Academic writing: A handbook for international students (3rd ed.). Abingdon/New York, NY: Routledge.
Baştürkmen, H., & von Randow, J. (2014). Guiding the reader (or not) to re-create coherence: Observations on postgraduate student writing in an academic argumentative writing task. Journal of English for Academic Purposes, 16, 14-22.
Biber, D., Gray, B., & Staples, S. (2016). Predicting patterns of grammatical complexity across language exam task types and proficiency levels. Applied Linguistics, 37(5), 639-668.
Bird, S., Loper, E., & Klein, E. (2009). Natural language processing with Python. Sebastopol: O'Reilly Media Inc.
Carlson, S. (1988). Cultural differences in writing and reasoning skills. In A. C. Purves (Ed.), Writing across languages and cultures: Issues in contrastive rhetoric (pp. 109-137). Newbury Park, CA: Sage.
Casal, J. E., & Lee, J. J. (2019). Syntactic complexity and writing quality in assessed first-year L2 writing. Journal of Second Language Writing, 44, 51-62.
Chen, I., & Chang, C. (2009). Cognitive Load Theory: An Empirical Study of Anxiety and Task Performance in Language Learning. Electronic Journal of Research in Educational Psychology, 7(18), 729-746. http://dx.doi.org/10.25115/ejrep.v7i18.1369
Cheng, Y. S. (2004). A measure of second language writing anxiety: Scale development and preliminary validation. Journal of Second Language Writing, 13(4), 313-335. https://doi.org/10.1016/j.jslw.2004.07.001
Connor, U. (1996). Contrastive rhetoric: Cross-cultural aspects of second language writing. Cambridge, England: CUP.
Cooper, G. (1998, December). Research into cognitive load theory and instructional design at UNSW. Sydney, Australia: University of New South Wales. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.470.3428&rep=rep1&type=pdf
Crossley, S. & McNamara, D. (2011). Text coherence and judgments of essay quality: models of quality and coherence. In L. Carlson, C. Hoelscher, & T. F. Shipley (Eds.), Proceedings of the 29th annual conference of the cognitive science society (pp. 1236-1241). Austin, TX: Cognitive Science Society.
Crossley, S. A. (2013). Advancing research in second language writing through computational tools and machine learning techniques: A research agenda. Language Teaching, 46(2), 256-271.
Crossley, S. A., & McNamara, D. S. (2012). Predicting second language writing proficiency: The roles of cohesion and linguistic sophistication. Journal of Research in Reading, 35(2), 115-135.
Crossley, S. A., Kyle, K., & McNamara, D. S. (2016). The development and use of cohesive devices in L2 writing and their relations to judgments of essay quality. Journal of Second Language Writing, 32, 1-16. https://doi.org/10.1016/j.jslw.2016.01.003
Crossley, S., Paquette, L., Dascalu, M., McNamara, D. S., & Baker, R. S. (2016). Combining click-stream data with NLP tools to better understand MOOC completion. In Proceedings of the sixth international conference on learning analytics & knowledge (pp. 6-14). ACM. https://doi.org/10.1145/2883851.2883931
Crystal, D. (2008). Dictionary of linguistics and phonetics (6th ed.). Oxford: Blackwell.
Cumming, A., Kantor, R., Baba, K., Eouanzoui, K., Erdosy, U., & Jamse, M. (2005). Analysis of discourse features and verification of scoring levels for independent and integrated prototype written tasks for the new TOEFL®. ETS Research Report Series, 2005(1), i-77.
DeCoursey, C. A., & Hamad, A. N. (2019). Emotions across the essay: What second-language writers feel across four weeks’ writing a research essay. English Studies at NBU, 5(1), 114-134. https://doi.org/10.33919/esnbu.19.1.6
Dewaele, J. M., & Alfawzan, M. (2018). Does the effect of enjoyment outweigh that of anxiety in foreign language performance? Studies in Second Language Learning and Teaching, 8(1). https://doi.org/10.14746/ssllt.2018.8.1.2
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39, 175-191.
Fulwiler, T. (2002). College writing: A personal approach to academic writing (3rd ed.). Portsmouth, NH: Heinemann Boynton/Cook.
Ghosal, T., Das, S. K., & Bhattacharjee, S. (2015). Sentiment analysis on (Bengali horoscope) corpus. In 2015 Annual IEEE India Conference (INDICON) (pp. 1-6), New Delhi, India. https://doi.org/10.1109/INDICON.2015.7443551
Grabe, W., & Kaplan, R. B. (2014). Theory and Practice of Writing. Abingdon/New York, NY: Routledge.
Graham, S., Berninger, V., & Abbott, R. (2012). Are attitudes toward writing and reading separable constructs? A study with primary grade children. Reading & Writing Quarterly, 28(1), 51-69.
Graham, S., Harris, K. R., Kiuhara, S. A., & Fishman, E. J. (2017). The relationship among strategic writing behavior, writing motivation, and writing performance with young, developing writers. The Elementary School Journal, 118(1), 82-104.
Guo, J. D. (2018). Effect of EFL writing self-concept and self-efficacy on writing performance: Mediating role of writing anxiety. Foreign Language Research, 2, 69-74.
Guo, J., Zou, T., & Peng, D. (2018). Dynamic influence of emotional states on novel word learning. Frontiers in Psychology, 9, 1-12. https://doi.org/10.3389/fpsyg.2018.00537
Guo, L., Crossley, S. A., & McNamara, D. S. (2013). Predicting human judgments of essay quality in both integrated and independent second language writing samples: A comparison study. Assessing Writing, 18(3), 218-238. https://doi.org/10.1016/j.asw.2013.05.002
Hall, C., & Sheyholislami, J. (2013). Using appraisal theory to understand rater values: An examination of rater comments on ESL test essays. Journal of Writing Assessment, 6(1), 1-17.
Halliday, M. A., & Matthiessen, C. M. (2014). Halliday's introduction to functional grammar. Oxford: Routledge.
Han, J., & Hiver, P. (2018). Genre-based L2 writing instruction and writing-specific psychological factors: The dynamics of change. Journal of Second Language Writing, 40(1), 44-59. https://doi.org/10.1016/j.jslw.2018.03.001
Harispe S., Ranwez S., Janaqi S., & Montmain J. (2015). Semantic similarity from natural language and ontology analysis. Synthesis Lectures on Human Language Technologies, 8, 1-254. https://doi.org/10.2200/S00639ED1V01Y201504HLT027
Hinkel, E. (1999). Objectivity and credibility in L1 and L2 academic writing. Culture in second language teaching and learning. Cambridge: Cambridge University Press.
Honnibal, M., & Johnson, M. (2015). An improved non-monotonic transition system for dependency parsing. In L. Màrquez, C. Callison-Burch, & J. Su (Eds.), Proceedings of the 2015 conference on empirical methods in natural language processing (pp. 1373-1378). Lisbon, Portugal: Association for Computational Linguistics.
Honnibal, M., & Montani, I. (2017). spaCy 2: Natural language understanding with Bloom embeddings, convolutional neural networks and incremental parsing. https://spacy.io/
Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI.
Hwang, S. & Lee, M. (2008). Syntactic and referential markers ensuring objectivity in EFL essay writing. English Teaching, 63(4), 29-47.
Ishikawa, S. (2018). The ICNALE edited essays; A dataset for analysis of L2 English learner essays based on a new integrative viewpoint. English Corpus Studies, 25, 117-130.
Jacobs, H.L., Zinkgraf, S.A., Wormuth, D.R., Hartfiel, V.F., & Hughey, J.B. (1981). Testing ESL composition: A practical approach. Rowley, MA: Newbury House.
JASP Team (2020). JASP (Version 0.12.2) [Computer software]. Retrieved from https://jasp-stats.org/
Jianqiang, Z., & Xiaolin, G. (2017). Comparison Research on Text Pre-processing Methods on Twitter Sentiment Analysis. IEEE Access, 5, 2870-2879. https://doi.org/10.1109/ACCESS.2017.2672677
Jung, Y., Crossley, S., & McNamara, D. (2019). Predicting Second Language Writing Proficiency in Learner Texts Using Computational Tools. The Journal of Asia TEFL, 16(1), 37-52. https://dx.doi.org/10.18823/asiatefl.2019.16.1.3.37
Kluyver, T., Ragan-Kelley, B., Pérez, F., Granger, B. E., Bussonnier, M., Frederic, J., ... & Ivanov, P. (2016). Jupyter Notebooks-a publishing format for reproducible computational workflows. In F. Loizides & B. Schmidt, Positioning and power in academic publishing: Players, agents and agendas (pp. 87-90). Amsterdam: IOS Press.
Kormos, J. (2012). The role of individual differences in L2 writing. Journal of Second Language Writing, 21, 390-403. https://doi.org/10.1016/j.jslw.2012.09.003
Kulkarni, A., & Shivananda, A. (2019). Natural Language Processing Recipes. Unlocking Text Data with Machine Learning and Deep Learning using Python. Berkeley, CA: Apress. https://doi.org/10.1007/978-1-4842-4267-4
Kumar, A., & Sebastian, T. M. (2012). Sentiment analysis: A perspective on its past, present and future. International Journal of Intelligent Systems and Applications, 4(10), 1-14.
Kyle, K. (2020). The relationship between features of source text use and integrated writing quality. Assessing Writing, 45, 100467. https://doi.org/10.1016/j.asw.2020.100467
Kyle, K., & Crossley, S. (2016). The relationship between lexical sophistication and independent and source-based writing. Journal of Second Language Writing, 34, 12-24.
Larson-Hall, J. (2010). A guide to doing statistics in second language research using SPSS. New York: Routledge.
Lee, C., Wong, K. C., Cheung, W. K., & Lee, F. S. (2009). Web-based essay critiquing system and EFL students' writing: A quantitative and qualitative investigation. Computer Assisted Language Learning, 22(1), 57-72.
Liu, B. (2010). Sentiment analysis and subjectivity. In N. Indurkhya & F. J. Damerau (Eds.), Handbook of Natural Language Processing (2nd ed.) (pp. 627-666). Boca Raton, FL: Chapman & Hall/CRC.
Lo, J., & Hyland, F. (2007). Enhancing students’ engagement and motivation in writing: The case of primary students in Hong Kong. Journal of Second Language Writing, 16, 219-237. https://doi.org/10.1016/j.jslw.2007.06.002
Loria, S. (2020). TextBlob Documentation (Release 0.16.0). Retrieved from https://buildmedia.readthedocs.org/media/pdf/textblob/latest/textblob.pdf
MacIntyre, P. D., & Gregersen, T. (2012a). Emotions that facilitate language learning: The positive-broadening power of the imagination. Studies in Second Language Learning and Teaching, 2, 193-213. https://doi.org/10.14746/ssllt.2012.2.2.4
McArthur, C.A., Jennings, A. & Philippakos, Z.A. (2019). Which linguistic features predict quality of argumentative writing for college basic writers, and how do those features change with instruction? Reading and Writing, 32, 1553-1574. https://doi.org/10.1007/s11145-018-9853-6
Miller, Z. F., Fox, J., Moser, J. S., & Godfroid, A. (2018). Playing with fire: Effects of negative mood induction and working memory on vocabulary acquisition. Cognition and Emotion, 32, 1105-1113. https://doi.org/10.1080/02699931.2017.1362374
Munezero, M., Montero, C. S., Sutinen, E., & Pajunen, J. (2014). Are They Different? Affect, Feeling, Emotion, Sentiment, and Opinion Detection in Text. IEEE Transactions on Affective Computing, 5(2), 101-111. https://doi.org/10.1109/TAFFC.2014.2317187
Nawal, A. F. (2018). Cognitive load theory in the context of second language academic writing. Higher Education Pedagogies, 3(1), 385-402. https://doi.org/10.1080/23752696.2018.1513812
Oliphant, T. E. (2006). A guide to NumPy. USA: Trelgol Publishing.
Parra G., L., & Calero S., X. (2019). Automated writing evaluation tools in the improvement of the writing skill. International Journal of Instruction, 12(2), 209-226. https://doi.org/10.29333/iji.2019.12214a
Peñafiel, M., Vásquez, S., Vásquez, D., Zaldumbide J., & Luján-Mora, S. (2018). Data Mining and Opinion Mining: A Tool in Educational Context. In Proceedings of the 2018 International Conference on Mathematics and Statistics (ICoMS 2018) (pp. 74-78). New York, NY: Association for Computing Machinery. https://doi.org/10.1145/3274250.3274263
Plakans, L., & Gebril, A. (2017). Exploring the relationship of organization and connection with scores in integrated writing assessment. Assessing Writing, 31, 98-112. https://doi.org/10.1016/j.asw.2016.08.005
Provoost, S., Ruwaard, J., van Breda, W., Riper, H., & Bosse, T. (2019). Validating automated sentiment analysis of online cognitive behavioral therapy patient texts: An exploratory study [Provisional PDF]. Frontiers in Psychology, 10. https://doi.org/10.3389/fpsyg.2019.01065
Ranjan, S., & Sood, S. (2019). Investor community sentiment analysis for predicting stock price trends. International Journal of Management, Technology and Engineering, 9(5), 6012-6020.
Richards, J. C., & Miller, S. K. (2005). Doing academic writing in education: Connecting the personal and the professional. Mahwah, NJ: Erlbaum.
Roscoe, R. D., Crossley, S. A., Snow, E. L., Varner, L. K., & McNamara, D. S. (2014). Writing quality, knowledge, and comprehension correlates of human and automated essay scoring. In W. Eberle & C. Boonthum-Denecke (Eds.), Proceedings of the 27th International Florida Artificial Intelligence Research Society (FLAIRS) Conference (pp. 393-398). Palo Alto, CA: AAAI Press.
Sanders, T., & Maat, H. P. (2006). Cohesion and coherence: Linguistic approaches. Reading, 99, 440-466.
Turney, P. D., & Pantel, P. (2010). From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37, 141-188.
Wang, Y. (2020). Emotion and syntactic complexity in L2 writing: A corpus-based study on Chinese college-level students’ English writing. The Asian Journal of Applied Linguistics, 7(1), 1-17.
Weigle, C. S., & Parker, K. (2012). Source text borrowing in an integrated reading/writing assessment. Journal of Second Language Writing, 21(2), 118-133. https://doi.org/10.1016/j.jslw.2012.03.004
Wiebe, J., Wilson, T., Bruce, R., Bell, M., & Martin, M. (2004). Learning subjective language. Computational Linguistics, 30(3), 277-308.
Yang, W., & Sun, Y. (2012). The use of cohesive devices in argumentative writing by Chinese EFL learners at different proficiency levels. Linguistics and Education, 23(1), 31-48.
Yoon, H., & Hirvela, A. (2004). ESL student attitudes toward corpus use in L2 writing. Journal of Second Language Writing, 13, 257-283. https://doi.org/10.1016/j.jslw.2004.06.002

APPENDIX A. Sentiment Analysis Algorithm

import os
import glob
from textblob import TextBlob

# Researchers can use the same algorithm by simply changing the file path below.
os.chdir(r'C:\Corpus_Directory')
corpus = glob.glob('*.txt')

for filename in corpus:
    # Read one essay; the file is closed automatically after the block.
    with open(filename, encoding='utf-8') as f:
        content = f.read()
    text = TextBlob(content)
    # sentiment holds two values: polarity and subjectivity.
    sentiment_score = text.sentiment
    print(filename, sentiment_score)

APPENDIX B. Sentence Similarity Algorithm

import os
import glob
import spacy
import numpy as np

nlp = spacy.load("en_core_web_lg")

# Researchers can use the same algorithm by simply changing the file path below.
os.chdir(r'C:\Corpus_Directory')
corpus = glob.glob('*.txt')
similarity_results = {}

for filename in corpus:
    with open(filename, encoding='utf-8') as f:
        content = f.read()
    doc = nlp(content)
    sentences = list(doc.sents)
    # Start a fresh list for every essay so that each mean is per essay.
    similarity_list = []
    for sentence1 in sentences:
        for sentence2 in sentences:
            similarity = sentence1.similarity(sentence2)
            # Skip self-comparisons (similarity of 1.0) and values already
            # recorded, which include the mirrored pair (sentence2, sentence1).
            if similarity < 1.0 and similarity not in similarity_list:
                similarity_list.append(similarity)
    # Store the essay's mean sentence similarity under its file name.
    similarity_results[filename] = np.mean(similarity_list, dtype=np.float64)
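The pairwise averaging in Appendix B can also be written over unordered sentence pairs, so that self-comparisons and mirrored pairs never arise. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses toy vectors in place of spaCy output, the helper names are hypothetical, and it relies on spaCy's default similarity estimate being the cosine of averaged word vectors. Unlike the filter in Appendix B, it keeps distinct pairs that happen to share the same similarity value, so results may differ slightly.

```python
import math
from itertools import combinations
from statistics import fmean


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # choice made here: treat an all-zero vector as dissimilar
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(sentence_vectors):
    """Mean cosine similarity over every unordered pair of sentences in one essay."""
    pairs = list(combinations(sentence_vectors, 2))
    if not pairs:
        return float("nan")  # fewer than two sentences: the mean is undefined
    return fmean(cosine(u, v) for u, v in pairs)


if __name__ == "__main__":
    # Toy 2-D "sentence vectors"; real ones would come from a word-vector model.
    essay = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    print(round(mean_pairwise_similarity(essay), 4))
```

With a loaded spaCy pipeline, the vectors could be taken from `sent.vector` for each sentence in `doc.sents`, and the function would be called once per essay so that every essay receives its own mean.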