International Journal on Advances in ICT for Emerging Regions 2019 12 (1): September 2019

Classifying Sentences in Court Case Transcripts using Discourse and Argumentative Properties

G. Rathnayake #1, T. Rupasinghe #2, N. De Silva #3, M. Warushavithana #4, V. Gamage #5, M. Perera #6, A.S. Perera #7

Abstract: Court case transcripts, which describe the proceedings of previous legal cases, contain information of significant importance to legal officials. Automatic information extraction from court case transcripts is therefore an important task for facilitating processes in the legal domain. A sentence is a fundamental textual unit of any text document, so analyzing the properties of sentences can be of immense value for information extraction from machine-readable text. This paper demonstrates how the properties of sentences can be used to extract valuable information from court case transcripts. As the first task, sentence pairs were classified based on the type of relationship observed between the two sentences. To that end, we defined the relationship types that can be observed between sentences in court case transcripts. A system combining a machine learning model and a rule-based approach was used to classify pairs of sentences according to the relationship type. The second classification task determined whether a given sentence provides a legal argument or not. The results obtained through the proposed methodologies were evaluated using human judges. To the best of our knowledge, this is the first study where discourse relationships between sentences have been used to determine relationships among sentences in legal court case transcripts. Similarly, this study provides novel and effective approaches to identify argumentative sentences in court case transcripts.
Keywords: discourse relations, natural language processing, machine learning, support vector machine

I. INTRODUCTION

Case Law can be described as a part of common law, consisting of judgments given by higher (appellate) courts in interpreting the statutes (or the provisions of a constitution) applicable in cases brought before them [1]. To make use of case law, lawyers and other legal officials have to go through related court cases manually to find relevant information. This task requires a significant amount of effort and time. Therefore, automatic extraction of information from legal court case transcripts would generate numerous benefits for people working in the legal domain. From this point onwards, we are referring to the court case

In the process of extracting information from legal court cases, it is important to identify how arguments and facts are related to one another. The objective of this study is to automatically determine the relationships between sentences found in documents related to previous court cases of the United States Supreme Court. Transcripts of U.S. court cases were obtained from FindLaw, following a method similar to numerous other artificial intelligence applications in the legal domain [2]–[6].

A sentence in a court case may provide details on arguments or facts related to a particular legal situation. Some sentences elaborate on the details provided in the previous sentence. A sentence may also have no relationship with the details in the previous sentence and instead provide details about a completely new topic. Another type of relationship is observed when a sentence contradicts the details provided in the previous sentence. Determining these relationships among sentences is vital to identifying the information flow within a court case.
To that end, it is important to consider the way in which clauses, phrases, and text are related to each other. Identifying relationships between sentences arguably makes Information Extraction from court cases more systematic, because it provides a better picture of the information flow of a particular court case. To achieve this objective, we used a discourse relations-based approach to determine the relationships between sentences in legal documents.

Several theories related to discourse structures have been proposed in recent years. Cross-document Structure Theory (CST) [7], Penn Discourse Tree Bank (PDTB) [8], Rhetorical Structure Theory (RST) [9], [10] and Discourse Graph Bank [11] can be considered prominent discourse structures. The main difference between these discourse structures is that each defines its relation types in a different manner, mainly because different discourse structures are intended for different purposes. In this study, we have based the discourse structure on the one proposed by CST.

A sentence in a court case transcript can contain different types of details such as descriptions of a scenario, legal arguments, legal facts or legal conditions. The main objective of identifying relationships between sentences is to determine which sentences are connected together within a single flow. A weak relation, or no relation, between two sentences probably implies that the two sentences provide details on different topics. Consider the sentence pair taken from Lee v. United States [12] shown in Example 1. Sentence 1.2 elaborates on the details provided by sentence 1.1 to give a more comprehensive idea of the topic discussed in sentence 1.1. These two sentences are connected to each other within the same flow of information.
This can be considered an Elaboration relationship, which is a relation type described in CST. Now, consider the sentence pair in Example 2, which was also taken from Lee v. United States [12]. In this example, it is evident that the two sentences have the Follow Up relationship as defined in CST. Still, these two sentences are connected together within the same information flow in a court case. There are also situations where sentences show characteristics common to multiple discourse relations. Therefore, several discourse relations can be grouped together based on their properties to make the process of determining relationships between sentences in court case transcripts more systematic.

The two sentences in Example 3 were also taken from Lee v. United States [12]. Sentence 3.2 follows sentence 3.1. A significant connection between these two sentences cannot be observed. Sentence 3.2 starts a new flow by deviating from the topic discussed in sentence 3.1.

Manuscript received on 25 Feb. 2019. Recommended by Prof. G.K.A. Dias on 13 June 2019. This paper is an extended version of the paper “Identifying Relationships Among Sentences in Court Case Transcripts Using Discourse Relations” presented at the ICTer 2018. G. Rathnayake, T. Rupasinghe, N. De Silva, M. Warushavithana, V. Gamage and A.S. Perera are from the Department of Computer Science & Engineering, University of Moratuwa, Sri Lanka (gathika.14@cse.mrt.ac.lk, thejanrupasinghe.14@cse.mrt.ac.lk, nisansaDdS@cse.mrt.ac.lk, menuka.14@cse.mrt.ac.lk, viraj.14@cse.mrt.ac.lk, shehan@cse.mrt.ac.lk). M. Perera is from University of London International Programmes, University of London (madhaviperera58@gmail.com).
These observations, obtained by analyzing court cases, emphasize the importance of identifying relationships between sentences. To identify the relationships among sentences, we first defined the relationship types that are important for information extraction from court cases. Next, for each relationship type defined, we identified the relevant CST relations [7]. Finally, we developed a system that predicts the relationship between two given sentences of a court case transcript by combining a machine learning model and a rule-based component.

Identifying sentences which provide legal arguments is another vital task in legal information extraction based on sentence properties. Identifying such arguments from previous court cases can greatly benefit legal officials when handling a new legal scenario. To have a clear picture of argumentative sentences, it is vital to understand the structure of a US court case transcript. The following major sections can be observed.

1) Summary of the case
2) Opinion of the Court
3) Concurring Opinions
4) Dissenting Opinions

A court case transcript begins with a Summary of the Case, which presents an overview of the case, the main argument, and the decision of the court. The Opinion of the Court section then sets out the decision of the majority of the judges with the facts and arguments supporting that decision. If there are Concurring and Dissenting Opinions, they are presented after the Opinion of the Court. A Concurring Opinion section is present when one or more judges agree with the decision of the court but state different or additional reasons for the decision. A Dissenting Opinion section is present when one or more judges disagree with the Opinion of the Court and set out the reasons for the disagreement.
The description in the court case transcript contains valuable statements presented in the court by the major parties involved in the legal scenario. Some of these statements are in the form of legal arguments. Other statements provide background information which can be considered mere facts, mainly intended to support a legal argument. Such facts can be considered non-arguments. Furthermore, the decisions of the court can also be considered non-arguments. The following example, containing statements taken from Lee v. United States [12], illustrates the difference between argumentative and non-argumentative sentences.

Example 4:
● Argument: Lee contends that he can make this showing because he never would have accepted a guilty plea had he known the result would be deportation.
● Fact: Petitioner Jae Lee moved to the United States from South Korea with his parents when he was 13.
● Court’s Decision: The District Court, however, denied relief, and the Sixth Circuit affirmed

Therefore, distinguishing argumentative sentences from non-argumentative sentences is a task of significant importance. In this study, a rule-based approach based on linguistic features was used to determine whether a given sentence provides a legal argument or not.

Section II provides an overview of the related work done on identifying relationships among sentences and legal information extraction. Section III describes the methodology which was followed when implementing the proposed systems. Section IV describes the approaches we took to evaluate the proposed methodologies. The results obtained by evaluating the system are analyzed in Section IV. Finally, we conclude our discussion in Section V.

Example 1:
● Sentence 1.1: The Government makes two errors in urging the adoption of a per se rule that a defendant with no viable defense cannot show prejudice from the denial of his right to trial.
● Sentence 1.2: First, it forgets that categorical rules are ill suited to an inquiry that demands a “case-by-case examination” of the “totality of the evidence”.

Example 2:
● Sentence 2.1: Courts should not upset a plea solely because of post hoc assertions from a defendant about how he would have pleaded but for his attorney’s deficiencies.
● Sentence 2.2: Rather, they should look to contemporaneous evidence to substantiate a defendant’s expressed preferences.

Example 3:
● Sentence 3.1: The question is whether Lee can show he was prejudiced by that erroneous advice.
● Sentence 3.2: A claim of ineffective assistance of counsel will often involve a claim of attorney error “during the course of a legal proceeding”–for example, that counsel failed to raise an objection at trial or to present an argument on appeal.

II. BACKGROUND

Information Extraction from machine-readable text is an integral aspect of applying artificial intelligence and computer science to various domains. The processes related to Information Extraction create new challenges each time they are applied to a new domain, due to the domain-specific nature of the text and documents. The legal domain is a challenging domain for Natural Language Processing, mainly due to the nature of legal documents, which employ a vocabulary of mixed origin ranging from Latin to English [5]. This challenging nature has stimulated the emergence of legal domain-specific work in areas such as information extraction [3], information organization [2], [4] and sentiment analysis [13]. As a major task in this study, we attempt to identify the relationships among sentences in court case transcripts.
Understanding how pieces of information in machine-readable texts are related to each other has always been a challenge in Natural Language Processing. Determining the way in which two textual units are connected is helpful in applications such as text classification, text summarization, understanding context, and evaluating answers provided for a question. Analyzing discourse relationships, or rhetorical relationships, between sentences is an effective approach to understanding how two textual units are connected with each other.

Discourse relations have been applied in different application domains related to NLP. [14] describes a CST-based [7] text summarization approach which involves mechanisms such as identifying and removing redundancy in a text by analyzing discourse relations among sentences. [15] compares and evaluates different methods of text summarization which are based on RST [10]. In another study [16], text summarization has been carried out by ranking sentences based on the number of discourse relations existing between sentences. [17]–[19] are other studies where discourse analysis has been used for text summarization. These studies suggest that discourse relationships are useful for identifying information that discusses the same topic or entity and for capturing information redundancy. Analysis of discourse relations has also been used for question answering systems [20], [21] and for natural language generation [22].

In the study [23], discourse relations existing between sentences are used to generate clusters of similar sentences from document sets. This study shows that a pair of sentences can show properties of multiple relation types which are defined in CST [7]. To facilitate the text clustering process, discourse relations have been redefined in this study by categorizing overlapping or closely related CST relations together.
In [24], the discourse relationships which are defined in [23] have been used for text summarization based on text clustering. The studies [23], [24] emphasize how discourse relationships can be defined according to the purpose and objective of the study in order to enhance effectiveness.

Regarding the application of discourse relations in the legal domain, [25] discusses the potential of discourse analysis for extracting information from legal texts. [26] describes a classifier which determines the rhetorical status of a sentence from a corpus of legal judgments. In this study, the rhetorical annotation scheme is defined for legal judgments. The study [27] provides details on the summarization of legal texts using rhetorical annotation schemes. The studies [26], [27] focus mainly on the rhetorical status of a sentence, but not on the relationships between sentences. An approach for detecting arguments in legal text using lexical, syntactic, semantic and discourse properties of the text is described in [28]. In contrast to these studies, our study is intended to identify relationships among sentences in court case transcripts by analyzing discourse relationships between sentences. Identifying relationships among sentences will be useful in determining the flow of information within a court case.

Extracting argumentative sentences from court case transcripts is another significant task in information extraction from legal documents. Several studies have addressed the automatic extraction of arguments from legal texts. The study [29] by Wyner et al. presents extensive background research on the literature of argumentation and argument extraction, with an analysis of various argument corpora. Araucaria [30], [31] is a database of arguments from various sources and a tool for diagramming and representing arguments.
In another study [30], Reed and Rowe, introducing the Araucaria tool, point out that arguments can be represented graphically as a tree in which premises branch off from conclusions. Arguments in AraucariaDB are manually annotated and marked up in an XML-based format, AML (Argument Markup Language). The study by Wyner et al. [29] also presents how legal arguments can be extracted using a Context-Free Grammar. It describes legal argument construction patterns for identifying premises and conclusions, which the authors derived by analyzing legal cases from the ECHR (European Court of Human Rights).

Studies [28], [32] on the automatic detection of legal arguments were also carried out on ECHR cases. Moens et al. describe argument detection as a sentence classification problem between arguments and non-arguments [28]. There, a classifier is trained on a set of manually annotated arguments, considering sentences in isolation. They evaluated different feature sets involving lexical, syntactic, semantic and discourse properties of the texts. In the study [32], Mochales and Moens point out that arguments are always formed by premises and conclusions. They therefore treated argument extraction as a sentence classification problem among premises, conclusions, and non-arguments. Furthermore, they improved the feature set used in [28] by including features that refer to content in previous sentences.

All of these studies on argument extraction have used ECHR cases as their corpus. To the best of our knowledge, no research has been carried out on argument extraction from US court case transcripts. The argument patterns identified in the study [29] are very rigid and have been identified specifically for ECHR cases. The reporting structures in US court case transcripts are significantly different from ECHR case reports. Therefore, the rules described in the study [29] are not directly applicable for extracting argumentative sentences from US court cases.
Also, to the best of our knowledge, there is no existing annotated corpus which contains argumentative sentences extracted from US court cases. Consequently, the machine learning approaches described in previous studies on argument identification in ECHR cases [28], [32] cannot be used. Therefore, novel ways are needed to identify arguments and non-arguments in US court case transcripts.

III. METHODOLOGY

A. Defining Discourse Relationships observed in Court Cases

Five major relationship types were defined by examining the nature of relationships that can be observed between sentences in court case transcripts.

● Elaboration - One sentence adds more details to the information provided in the preceding sentence, or one sentence develops further on the topic discussed in the previous sentence.
● Redundancy - Two sentences provide the same information without any difference or additional information.
● Citation - A sentence provides references relevant to the details provided in the previous sentence.
● Shift in View - Two sentences provide conflicting information or different opinions on the same topic or entity.
● No Relation - No relationship can be observed between the two sentences. One sentence discusses a topic which is different from the topic discussed in the other sentence.

After defining these relationships, we adopted the rhetorical relations provided by CST [7] to align with our definitions as shown in Table I.
TABLE I
ADOPTING CST RELATIONS

Definition | CST Relationships
Elaboration | Paraphrase, Modality, Subsumption, Elaboration, Indirect Speech, Follow-up, Overlap, Fulfillment, Description, Historical Background, Reader Profile, Attribution
Redundancy | Identity
Citation | Citation
Shift in View | Change of Perspective, Contradiction
No Relation | -

It is very rare for the same sentence to appear more than once among nearby sentences in court case transcripts. However, we have included Redundancy as a relationship type in order to identify redundant information in cases where the two sentences in a sentence pair are the same.

B. Expanding the Dataset

A machine learning model was developed to determine the relationship between two sentences in court cases. We used the publicly available dataset of CST bank [33] to train the model. The dataset obtained from CST bank contains sentence pairs which are annotated according to the CST relation types. Since we have a labeled dataset [33], we used supervised learning to develop the machine learning model. A Support Vector Machine (SVM) was used, as SVMs have shown promising results in previous studies where discourse relations have been used to identify relationships between sentences [23], [24]. Table II provides details on the number of sentence pairs in the data set for each relationship type.

TABLE II
NUMBER OF SENTENCE PAIRS FOR EACH RELATIONSHIP TYPE

CST Relationship | Number of Sentence Pairs
Identity | 99
Equivalence | 101
Subsumption | 590
Contradiction | 48
Historical Background | 245
Modality | 17
Attribution | 134
Summary | 11
Follow-up | 159
Indirect Speech | 4
Elaboration | 305
Fulfillment | 10
Description | 244
Overlap (Partial Equivalence) | 429

Examining the CST relationship types available in the dataset, as shown in Table II, reveals no relationship type which indicates that there is no relationship between two sentences.
However, No Relation is a fundamental relation type that can be observed between two sentences in court case transcripts. Therefore, we expanded the data set by manually annotating 50 pairs of sentences between which no relationship can be found. This new class was named No Relation. The 50 annotated sentence pairs were obtained from previous court case transcripts.

A sentence pair is made up of a source sentence and a target sentence. The source sentence is compared with the target sentence when determining the relationship that is present in the sentence pair. For example, if the source sentence contains all the information in the target sentence together with some additional information, the sentence pair is said to have the Subsumption relationship. Similarly, if the source sentence elaborates the target sentence, the sentence pair is said to have the Elaboration relationship.

C. Determining the relationship between sentences using SVM Model

To train the SVM model with annotated data, features were defined based on the properties that can be observed in a pair of sentences. Before calculating the word-related features, we removed stop words from the sentences to eliminate the effect of less significant words. Also, coreference resolution was performed on a given pair of sentences using Stanford CoreNLP CorefAnnotator (“coref”) [34] in order to make feature calculations more effective. The two sentences for Example 4 are also taken from Lee v. United States [12].

Example 4:
● Sentence 4.1 (Target): Petitioner Jae Lee moved to the United States from South Korea with his parents when he was 13.
● Sentence 4.2 (Source): In the 35 years he has spent in this country, he has never returned to South Korea, nor has he become a U. S. citizen, living instead as a lawful permanent resident.

Here, “Petitioner Jae Lee” in the target sentence is referred to using the pronouns “he” and “his” in both sentences. As all these words refer to the same person, the system replaces “he” and “his” with their representative mention “Petitioner Jae Lee”. The sentences in Example 4 are then changed as shown below.

Example 4 (updated):
● Sentence 4.1 (Target): Petitioner Jae Lee moved to the United States from South Korea with Petitioner Jae Lee parents when Petitioner Jae Lee was 13.
● Sentence 4.2 (Source): In the 35 years Petitioner Jae Lee has spent in this country, Petitioner Jae Lee has never returned to South Korea, nor has Petitioner Jae Lee become a U. S. citizen, living instead as a lawful permanent resident.

Resolving co-references makes the calculation of the Noun Similarity, Verb Similarity, Adjective Similarity, Subject Overlap Ratio, Object Overlap Ratio, Subject Noun Overlap Ratio and Semantic Similarity features between two sentences more effective. All the features are calculated and normalized such that their values fall into the [0,1] range.

We have defined 9 feature categories based on the properties that can be observed in a pair of sentences. The following 5 feature categories were adopted mainly from [23], though we have made changes in implementation such as the use of co-reference resolution.

1) Cosine Similarities

The following cosine similarity values are calculated for a given sentence pair:
● Word Similarity
● Noun Similarity
● Verb Similarity
● Adjective Similarity

The following equation is used to calculate the above-mentioned cosine similarities.

$$\text{Cosine Similarity} = \frac{\sum_{i=1}^{n} FV_{S,i} \cdot FV_{T,i}}{\sqrt{\sum_{i=1}^{n} (FV_{S,i})^2} \cdot \sqrt{\sum_{i=1}^{n} (FV_{T,i})^2}}$$

Here $FV_{S,i}$ and $FV_{T,i}$ represent the frequency vectors of the source sentence and the target sentence respectively. Stanford CoreNLP POS Tagger (“pos”) [30] is used to identify nouns, verbs and adjectives in sentences.
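The cosine similarity above can be illustrated with a minimal Python sketch. It assumes the sentences have already been tokenized, stop-word filtered and (for the noun, verb and adjective variants) restricted to the relevant part of speech; the function name `cosine_similarity` and the use of raw term counts as the frequency vectors are illustrative assumptions, not details confirmed by the paper.

```python
import math
from collections import Counter


def cosine_similarity(source_tokens, target_tokens):
    """Cosine similarity between the term-frequency vectors of two token lists.

    Assumption: tokens are already normalized (e.g. lowercased, stop words removed).
    """
    fv_s = Counter(source_tokens)  # frequency vector of the source sentence
    fv_t = Counter(target_tokens)  # frequency vector of the target sentence

    # Only words present in both sentences contribute to the dot product.
    dot = sum(fv_s[w] * fv_t[w] for w in fv_s.keys() & fv_t.keys())
    norm_s = math.sqrt(sum(c * c for c in fv_s.values()))
    norm_t = math.sqrt(sum(c * c for c in fv_t.values()))

    # An empty sentence has no direction; treat the similarity as 0.
    if norm_s == 0 or norm_t == 0:
        return 0.0
    return dot / (norm_s * norm_t)
```

For instance, for the token lists `["lee", "plea", "plea"]` and `["lee", "plea"]` the dot product is 3 and the norms are √5 and √2, so the similarity is 3/√10 ≈ 0.949. Because term counts are non-negative, the value always falls in the [0,1] range required of the features.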
In calculating the Noun Similarity feature, singular and plural nouns, proper nouns, personal pronouns and possessive pronouns are considered. Both superlative and comparative adjectives are considered when calculating the Adjective Similarity. The system ignores verbs that are lemmatized into “be”, “do”, “has” verbs when calculating the Verb Similarity feature, as priority should be given to effective verbs in sentences.

2) Word Overlap Ratios

Two ratios are considered based on the word overlapping property. One ratio is measured in relation to the target sentence. The other ratio is measured in relation to the source sentence. These ratios provide an indication of the equivalence of two sentences. For example, in a relationship like Subsumption, the source sentence usually contains all the words in the target sentence. This property is also useful in determining relations such as Identity and Overlap (Partial Equivalence), which are based on the equivalence of two sentences.

$$WOR(T) = \frac{Comm(T, S)}{Distinct(T)}$$

$$WOR(S) = \frac{Comm(T, S)}{Distinct(S)}$$

WOR(S) and WOR(T) represent the word overlap ratios measured in relation to the source and target sentences respectively. Distinct(S) and Distinct(T) represent the number of distinct words in the source sentence and the target sentence respectively. The number of distinct common words between the two sentences is given by Comm(T, S).

3) Grammatical Relationship Overlap Ratios

Three ratios which represent the grammatical relationship between target and source sentences are considered.

● Subject Overlap Ratio

$$SubjOverlap = \frac{Comm(Subj(S), Subj(T))}{Subj(S)}$$

● Object Overlap Ratio

$$ObjOverlap = \frac{Comm(Obj(S), Obj(T))}{Obj(S)}$$

● Subject Noun Overlap Ratio

$$SubjNounOverlap = \frac{Comm(Subj(S), Noun(T))}{Subj(S)}$$

All these features are calculated with respect to the source sentence. Subj, Obj and Noun represent the number of subjects, objects and nouns respectively. Comm gives the number of common elements.
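The overlap ratios defined above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the inputs are lists of already normalized tokens (or already extracted subjects, objects and nouns), the counts are taken over distinct items, and a zero denominator yields 0. The function names `word_overlap_ratios` and `overlap_ratio` are hypothetical and not taken from the paper.

```python
def word_overlap_ratios(source_tokens, target_tokens):
    """Return (WOR(S), WOR(T)) for two token lists."""
    distinct_s = set(source_tokens)
    distinct_t = set(target_tokens)
    common = len(distinct_s & distinct_t)  # Comm(T, S)

    # Assumption: an empty sentence gives a ratio of 0 rather than an error.
    wor_s = common / len(distinct_s) if distinct_s else 0.0
    wor_t = common / len(distinct_t) if distinct_t else 0.0
    return wor_s, wor_t


def overlap_ratio(source_items, target_items):
    """Shared form of the grammatical overlap ratios, relative to the source.

    Pass subjects/subjects for SubjOverlap, objects/objects for ObjOverlap,
    or source subjects/target nouns for SubjNounOverlap.
    """
    source_set = set(source_items)
    if not source_set:
        return 0.0
    return len(source_set & set(target_items)) / len(source_set)
```

With a source of `["court", "denied", "relief", "appeal"]` and a target of `["court", "denied", "relief"]`, WOR(T) is 1.0 and WOR(S) is 0.75, which matches the Subsumption pattern described in the text, where the source contains every word of the target plus additional material.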
Stanford CoreNLP DependencyParseAnnotator (“depparse”) [36] is used here to identify subjects and objects. All the subject types, including nominal subject, clausal subject, their passive forms and controlling subjects, are taken into account in calculating the number of subjects. Direct and indirect objects are considered when calculating the number of objects. All subject and object types are taken from the Stanford typed dependencies manual [37].

4) Longest Common Substring Ratio

Longest Common Substring is the maximum length word sequence which is common to both sentences. When the number of characters in the longest common substring is taken as n(LCS) and the number of characters in the source sentence is taken as n(S), the Longest Common Substring Ratio (LCSR) can be calculated as

$$LCSR = \frac{n(LCS)}{n(S)}$$

This value indicates the part of the target sentence which is present in the source sentence as a fraction. Thus, it is useful especially in determining discourse relations such as Overlap, Attribution and Paraphrase.

5) Number of Entities

The ratio between the numbers of named entities in the two sentences can be used as a measurement of the relationship between them.

$$NERatio = \frac{NE(S)}{Max(NE(S), NE(T))}$$

NE represents the number of named entities in a given sentence. Stanford CoreNLP Named Entity Recognizer (NER) [33] was used to identify named entities which belong to 7 types: PERSON, ORGANIZATION, LOCATION, MONEY, PERCENT, DATE and TIME.

In addition to the features mentioned above, the following features have been introduced to the system.

1) Semantic Similarity between Sentences

This feature is useful in determining the closeness between two sentences. It provides the semantic closeness between two sentences.
A method described in [39] is adopted when calculating the semantic similarity between two sentences. The semantic similarity score for a pair of words is calculated using WordNet::Similarity [40].

$$Score = Average\left(\sum_{i=1}^{n} NounScore + \sum_{i=1}^{n} VerbScore\right)$$

2) Transition Words and Phrases

The presence of a transition word or a transition phrase at the start of a sentence indicates a high probability of a strong relationship with the previous sentence. For example, sentences beginning with transition words like “And” or “Thus” usually elaborate on the previous sentence. Phrases like “To make that” or “In addition” at the beginning of a sentence also imply that the sentence is elaborating on the details provided in the previous sentence. Considering these linguistic properties, two Boolean features were defined.

i. Elaboration Transition

If the first word of the source sentence is a transition word which implies elaboration, such as “and”, “thus” or “therefore”, or if a transition phrase is found within the first six words of the source sentence, this feature outputs 1. If both conditions are false, the feature returns 0. Two lists containing 59 transition words and 91 transition phrases which imply elaboration were maintained. Though it is difficult to include every transition phrase in the English language which implies an elaboration relationship, a sentence beginning with one of these phrases is very likely to elaborate on the previous sentence.

ii. Follow-up Transition

If the source sentence begins with words like “however” or “although”, or with phrases like “in contrast” or “on the contrary”, which imply that the source sentence is following up the target sentence, this feature outputs 1. Otherwise, the feature outputs 0.

3) Length Difference Ratio

This feature considers the difference of lengths between the source sentence and the target sentence.
When length(S) and length(T) represent the number of words in the source sentence and the target sentence respectively, the Length Difference Ratio (LDR) is calculated as shown below.

$$\mathrm{LDR} = 0.5 + \frac{\mathrm{length}(S) - \mathrm{length}(T)}{2 \times \mathrm{Max}(\mathrm{length}(S), \mathrm{length}(T))}$$

The value lies between 0 and 1, equals 0.5 when both sentences have the same length, and exceeds 0.5 when the source sentence is longer. In a relationship like Subsumption, the source sentence has to be longer than the target sentence. In an Identity relationship, both sentences are usually of the same length. These properties can be identified using this feature.

4) Attribution

This feature checks whether one sentence describes a detail of another sentence in a more descriptive manner. Within this feature, we check whether a word or phrase in one sentence is cited in the other sentence. This is also a Boolean feature. The source sentence and target sentence for Example 5 were obtained from Turner v. United States [41]:

Example 5:
● Sentence 5.1 (Target): Such evidence is ‘material’ . . . when there is a reasonable probability that, had the evidence been disclosed, the result of the proceeding would have been different.
● Sentence 5.2 (Source): A ‘reasonable probability’ of a different result is one in which the suppressed evidence ‘undermines confidence in the outcome of the trial.’

It can be seen that the source sentence defines or provides more details on what is meant by “reasonable probability” in the target sentence. Such properties can be identified using this feature.

D. Determining Explicit Citation Relationships in Court Case Transcripts

In legal court case documents, several standard ways are used to indicate where a particular fact or condition was obtained from. The target sentence and source sentence in Example 6 are obtained from Lee v. United States [12].

Example 6:
● Sentence 6.1 (Target): The decision whether to plead guilty also involves assessing the respective consequences of a conviction after trial and by plea.
● Sentence 6.2 (Source): See INS v. St. Cyr, 533 U. S. 289, 322-323 (2001).
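The surface form of Sentence 6.2 hints at how such a pair might be flagged automatically. The snippet below is a minimal sketch of one possible rule, not the rule set used in this study (which is not enumerated in the text). It assumes that a citation sentence opens with a signal such as “See” and contains a United States Reports reference of the form “533 U. S. 289”. The names SIGNAL, US_REPORTS and looks_like_citation are illustrative.

```python
import re

# ASSUMPTION: a citation sentence starts with a signal word such as
# "See", "See also" or "Cf." (hypothetical list, for illustration only).
SIGNAL = re.compile(r"^(?:See(?:\s+also)?|Cf\.)\s", re.IGNORECASE)

# ASSUMPTION: references to the United States Reports look like
# "533 U. S. 289" or "528 U. S., at 483".
US_REPORTS = re.compile(r"\b\d{1,4}\s+U\.\s?S\.,?\s*(?:at\s+)?\d+")


def looks_like_citation(sentence: str) -> bool:
    """Return True if the sentence matches the single illustrative rule."""
    text = sentence.strip()
    return bool(SIGNAL.match(text) and US_REPORTS.search(text))


target = ("The decision whether to plead guilty also involves assessing the "
          "respective consequences of a conviction after trial and by plea.")
source = "See INS v. St. Cyr, 533 U. S. 289, 322-323 (2001)."

print(looks_like_citation(source))  # the source sentence of Example 6
print(looks_like_citation(target))  # the target sentence of Example 6
```

Under this rule the source sentence of Example 6 is flagged and the target sentence is not. A practical detector would need further patterns, since citations in court case transcripts take many other forms.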
The two sentences given in Example 6 are adjacent to each other. It can be clearly seen that the source sentence provides a citation for the target sentence. This is only one of the many ways of providing citations in court case transcripts. After observing different ways of providing citations in court case transcripts, a rule-based mechanism to detect such citations was developed. If this rule-based system detects a citation relationship, the pair of sentences is assigned the Citation relationship. Such a pair of sentences is not passed to the SVM model for further processing. From this point onward, this system, which is intended to identify relationships among sentences, will be referred to as the Sentence Relationship Identifier.

E. Extracting Argumentative Sentences from Court Case Transcripts

Two major approaches, developed in consultation with a legal expert, are followed to extract arguments.

1) Linguistically identifying arguments using verbs: At first, words such as argue, agree, conclude, rejected, contest, contend, consider, testify, concede, claim and affirm were considered to identify legal arguments. As examples, consider the following sentences taken from Lee v. United States [12]:

(i) A claim of ineffective assistance of counsel will often involve a claim of attorney error “during the course of a legal proceeding”–for example, that counsel failed to raise an objection at trial or to present an argument on appeal.
(ii) Lee contends that he can make this showing because he never would have accepted a guilty plea had he known the result would be deportation.
(iii) In post conviction proceedings, they argued that seven specific pieces of withheld evidence were both favorable to the defense and material to their guilt under Brady v. Maryland, 373 U.
S. 83.
(iv) The D. C. Superior Court rejected petitioners’ Brady claims, finding that the withheld evidence was not material. The D. C. Court of Appeals affirmed.

Here, sentences (i) and (iv) cannot be considered as arguments. Sentence (i) represents an opinion and sentence (iv) states the decision of the court. By observing the selected sentences, we refined the word list and decided to consider only verbs to identify arguments. The lemmatized forms of the verbs in a sentence are extracted using the Stanford PoS Tagger [35] and then compared with the predefined list of verbs to check whether the sentence presents an argument.

2) Citation-based argument extraction: In a legal case, there are statements with citations which link to previous cases in which the judgments have already been finalized. Those statements come under the category of case law. If a statement has a citation, it means that the statement is taken from a previous case and that the same statement applies to the current legal case as well. Therefore, lawyers can present the same argument in other legal cases to prove their point. Consider the following examples, again taken from Lee v. United States [12]:

• The decision whether to plead guilty also involves assessing the respective consequences of a conviction after trial and by plea. See INS v. St. Cyr, 533 U. S. 289, 322-323.
• But in this case counsel’s “deficient performance arguably led not to a judicial proceeding of disputed reliability, but rather to the forfeiture of a proceeding itself.” Flores-Ortega, 528 U. S., at 483.

The first statement links to INS v. St. Cyr, 533 U. S., which means that it is taken from the cited case. The same statement can be presented as an argument in any other case if it is appropriate under the conditions of that case. We have taken a rule-based approach to identify these kinds of arguments.

IV. RESULTS

A.
Results obtained from Sentence Relationship Identifier

In order to determine the effectiveness of our system, it is important to carry out evaluations using legal court case transcripts, as this is the domain in which the system is intended to be used. Court case transcripts related to the United States Supreme Court were obtained from Findlaw. The transcripts were then preprocessed to remove unnecessary data and text. Court case titles and section titles are examples of details which were removed during preprocessing, as they are irrelevant when determining relationships between sentences.

The relationship types of sentence pairs were assigned using the system. First, the pairs were checked for the Citation relationship using the rule-based approach. The relationship types of the sentence pairs where a Citation relationship could not be detected by the rule-based approach were determined using the Support Vector Machine model. The results obtained using the system for the sentence pairs extracted from the court case transcripts were then stored in a database.

From those sentence pairs, 200 were selected to be annotated by human judges. Before the selection, the sentence pairs were shuffled to eliminate potential bias towards a particular court case. Shuffling helped to ensure that the sentence pairs to be annotated by human judges came from different court case transcripts. The selected 200 pairs were then grouped into clusters of five sentence pairs. Each cluster was annotated by two human judges who were trained to identify the relationships between sentence pairs as defined in this study.

As expected, the Redundancy relationship could not be observed within the sentence pairs which were annotated by the human judges.
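The labelling and sampling steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the study: is_citation and svm_predict are hypothetical callables standing in for the rule-based citation detector and the trained SVM model, and the fixed seed is there only to make the example repeatable.

```python
import random
from typing import Callable, List, Sequence, Tuple

Pair = Tuple[str, str]  # (target sentence, source sentence)


def label_pair(pair: Pair,
               is_citation: Callable[[Pair], bool],
               svm_predict: Callable[[Pair], str]) -> str:
    """Apply the rule-based citation check first; fall back to the model."""
    if is_citation(pair):
        return "Citation"
    return svm_predict(pair)


def sample_clusters(pairs: Sequence[Pair],
                    n_pairs: int = 200,
                    cluster_size: int = 5,
                    seed: int = 0) -> List[List[Pair]]:
    """Shuffle all pairs, keep n_pairs of them, and group them into clusters."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)  # mixes pairs coming from different transcripts
    selected = shuffled[:n_pairs]
    return [selected[i:i + cluster_size]
            for i in range(0, len(selected), cluster_size)]
```

With the default arguments and at least 200 input pairs, sample_clusters returns 40 clusters of five pairs, each of which would then be handed to two annotators.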
From the 200 sentence pairs that were observed, our system did not predict the Redundancy relationship for any sentence pair. Similarly, human judges did not assign the Redundancy relationship to any sentence pair.

The confusion matrix generated from the results is given in Table III. The details provided in the matrix are based only on the sentence pairs for which both human judges agreed on the same relationship type. The reasoning behind this approach is to eliminate sentence pairs whose relationship type is ambiguous. The same approach was used to obtain the results presented in Table IV. In contrast, Table V contains results obtained by considering sentence pairs where at least one of the two judges who annotated the pair assigned a particular relationship type. The Recall results given in Table IV are of particular importance, as all the sentence pairs contained in that result set are annotated with a relationship type agreed upon by two human judges. The Precision results provided in Table V indicate the probability of at least one human judge agreeing with the system’s predictions for each relationship type.

The evaluation results in Table IV and Table V show that the system works well when identifying the Elaboration, No Relation and Citation relationship types, where F-measure values are above 75% in all cases. The Shift in View relationship type was not assigned by the system to any of the 200 sentence pairs which were considered in the evaluation.

Human vs Human correlation and Human vs System correlation in identifying these relationship types were also analyzed. First, we calculated these correlations without considering the relationship type, using the following approach.
For a given sentence pair P, m(P) is the value assigned to the pair, and n is the number of sentence pairs.

1) Human vs Human Correlation (Cor(H,H))

When both human judges agree on a single relationship type for the pair P, we assign m(P) = 1. Otherwise, we assign m(P) = 0.

$$\mathrm{Cor}(H, H) = \frac{\sum_{P=1}^{n} m(P)}{n}$$

2) Human vs System Correlation (Cor(H,S))

When both human judges agree with the relationship type predicted by the system for the sentence pair P, we assign m(P) = 1. If only one human judge agrees with the relationship type predicted by the system for P, we assign m(P) = 0.5. If both human judges disagree with the relationship type predicted by the system for P, we assign m(P) = 0.

$$\mathrm{Cor}(H, S) = \frac{\sum_{P=1}^{n} m(P)}{n}$$

TABLE III
CONFUSION MATRIX
(rows: actual type; columns: predicted type; cell values are row percentages, ∑ gives numbers of sentence pairs)

Actual \ Predicted   Elaboration   No Relation   Citation   Shift in View   ∑
Elaboration          93.9          6.1           0.0        0.0             99
No Relation          11.9          88.1          0.0        0.0             42
Citation             0.0           4.8           95.2       0.0             21
Shift in View        100           0.0           0.0        0.0             3
∑                    101           44            20         0               165

TABLE IV
RESULTS COMPARISON OF PAIRS WHERE BOTH JUDGES AGREE

Discourse Class   Precision   Recall   F-Measure
Elaboration       0.921       0.939    0.930
No Relation       0.841       0.881    0.861
Citation          1.000       0.952    0.975
Shift in View     -           0        -

The following results could be observed after calculating the correlations:
● The correlation between a human judge and another human judge = 0.805
● The correlation between a human judge and the system = 0.813

TABLE V
RESULTS COMPARISON OF PAIRS WHERE AT LEAST ONE JUDGE AGREES

Discourse Class   Precision   Recall   F-Measure
Elaboration       0.930       0.902    0.916
No Relation       0.846       0.677    0.752
Citation          1.000       0.910    0.953
Shift in View     -           0        -

When analyzing these two correlations, it can be seen that our system performs at a level close to human capability. The results obtained by calculating Human vs. Human and Human vs. System correlations for each relationship type are given in Table VI.
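As a consistency check, the per-class scores in Table IV can be recomputed from Table III, and the two overall correlations can be computed directly from their definitions. The sketch below assumes that the cell values in Table III are row percentages; the integer counts in COUNTS are reconstructed by multiplying each percentage by its row total and rounding, and do not appear in the tables themselves. The names LABELS, COUNTS, per_class_scores and overall_correlations are illustrative.

```python
from typing import Dict, List, Optional, Sequence, Tuple

LABELS = ["Elaboration", "No Relation", "Citation", "Shift in View"]

# Rows: type agreed by both judges; columns: type predicted by the system.
# ASSUMPTION: counts reconstructed from Table III (row percentage x row total).
COUNTS = [
    [93, 6, 0, 0],
    [5, 37, 0, 0],
    [0, 1, 20, 0],
    [3, 0, 0, 0],
]

Scores = Tuple[Optional[float], Optional[float], Optional[float]]


def per_class_scores(counts: List[List[int]]) -> Dict[str, Scores]:
    """Precision, recall and F-measure per class; None where undefined."""
    scores: Dict[str, Scores] = {}
    for k, label in enumerate(LABELS):
        tp = counts[k][k]
        actual = sum(counts[k])                    # row total
        predicted = sum(row[k] for row in counts)  # column total
        precision = tp / predicted if predicted else None
        recall = tp / actual if actual else None
        if precision is None or recall is None or precision + recall == 0:
            f_measure = None
        else:
            f_measure = 2 * precision * recall / (precision + recall)
        scores[label] = (precision, recall, f_measure)
    return scores


def overall_correlations(judge_a: Sequence[str],
                         judge_b: Sequence[str],
                         system: Sequence[str]) -> Tuple[float, float]:
    """Cor(H,H) and Cor(H,S) as the mean of m(P) over all pairs."""
    n = len(system)
    hh = sum(1.0 for a, b in zip(judge_a, judge_b) if a == b) / n
    # m(P) is 1, 0.5 or 0 depending on how many judges match the system.
    hs = sum(((a == s) + (b == s)) / 2
             for a, b, s in zip(judge_a, judge_b, system)) / n
    return hh, hs
```

Rounded to three decimals, the recomputed precision and recall values agree with Table IV. F-measures computed from unrounded precision and recall can differ in the last digit (for example 0.860 rather than 0.861 for No Relation), which would be consistent with Table IV having been derived from already rounded values.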
The following approach was used to calculate these two correlations for each relationship type. Consider the relationship type R. Let,
● S = the set containing all the sentence pairs which are predicted by the system as having the relationship type R
● U = the set containing all the sentence pairs which were annotated by at least one human judge as having the relationship type R
● V = the set containing all the sentence pairs which were annotated by both human judges as having the relationship type R

Corr(H,H) represents Human vs Human correlation and Corr(H,S) represents Human vs System correlation. For a given set A, n(A) indicates the number of elements in set A.

$$\mathrm{Corr}(H, H) = \frac{n(V)}{n(U)}$$

$$\mathrm{Corr}(H, S) = \frac{n(S \cap U)}{n(S \cup U)}$$

The results obtained using this approach are provided in Table VI.

TABLE VI
CORRELATIONS BY TYPE

Discourse Class   Human-Human   Human-System   Human-System / Human-Human
Elaboration       0.750         0.843          1.124
No Relation       0.646         0.603          0.933
Citation          1.000         0.955          0.955
Shift in View     0.188         0.000          0.000

The results in Table VI suggest that the system performs at a level close to human capability when identifying relationships such as Elaboration, No Relation and Citation. Enhancing the system’s ability to identify the Shift in View relationship is one of the major future challenges. At the same time, the Human vs Human correlation for the Shift in View relationship type is 0.188. This indicates that human judges also face ambiguity when identifying Shift in View relationships between sentences.

Either an Elaboration or a Shift in View relationship occurs when two sentences discuss the same topic or entity. A Shift in View relationship occurs rather than Elaboration when the two sentences provide different views or conflicting facts on the same topic or entity. The No Relation relationship can be observed when two sentences are no longer discussing the same topic or entity. In other words, the No Relation relationship suggests that there is a shift in the information flow. As shown in Table III, the sentence pairs with the Shift in View relationship are always predicted by the system as having the Elaboration relationship. These results show that in most cases the system is able to identify whether or not the sentences are discussing the same topic.

B. Results obtained from Argumentative Sentence Detection Approaches

Individual sentences were extracted from court cases to evaluate the proposed approaches for detecting argumentative sentences. The sentences detected as argumentative by each of the two approaches were annotated by the human judges. The precision of each approach was then calculated by comparing its output with the human annotations. Table VII presents the results.

TABLE VII
RESULTS COMPARISON OF APPROACHES USED TO DETECT ARGUMENTATIVE SENTENCES

Approach                   No. of Detected Sentences   Precision
Argumentative verb based   77                          64.93%
Citation based             93                          90.32%

As shown in Table VII, the citation-based approach has higher precision than the argumentative verb-based approach. However, both approaches work at a satisfactory level, with over 60% precision, suggesting the effectiveness of the proposed approaches in detecting legal arguments.

V. CONCLUSIONS

The methods and experiments presented in this journal paper on legal information extraction based on sentence classification are extensions of our conference paper [42] on identifying relationships among sentences in court case transcripts using discourse relations.
Linguistic rule-based approaches that can be used to identify sentences which provide legal arguments are presented exclusively in this journal paper. Demonstrating how sentence classification can facilitate information extraction from legal court case transcripts can be considered the primary research contribution of this study. This study presents the way in which discourse relationships between sentences can be used to identify the relationships among sentences in United States court case transcripts. Five discourse relationship types were defined in this study in order to automatically identify the flow of information within a court case transcript. This study describes how a machine learning model and a rule-based system can be combined to enhance the accuracy of identifying relationships between sentences in court case transcripts. Features based on the properties that can be observed between sentences have been introduced to enhance the accuracy of the machine learning model. The study also proposes novel approaches that can be used to extract argumentative sentences from court case transcripts. The approaches to classify argumentative and non-argumentative sentences have the potential to support automatic extraction of legal arguments from United States court case transcripts. The proposed system to identify relationships among sentences can be successfully applied to identify the sentences which develop the same discussion topic or entity. In addition, it is capable of identifying situations in court cases where the discussion topic changes. The system is highly successful in the identification of legal citations. The empirical results also demonstrate the effectiveness of the approaches which were used to identify sentences which provide legal arguments.
These outcomes demonstrate that the information extraction mechanisms proposed in this study have promising potential to be applied in tasks related to systematic information extraction from court case transcripts. One such task is the identification of supporting facts and citations which are related to a particular legal argument. Another is the identification of changes in discussion topics within a court case. Despite the usefulness and applicability of the proposed approaches, the outcomes of this study have also demonstrated that the proposed mechanisms are not sufficient for detecting situations where two sentences provide different opinions on the same discussion topic. Enhancing this capability of the system can be considered the major future work.

REFERENCES

[1] WebFinance Inc, “What is case law? definition and meaning - businessdictionary.com,” http://www.businessdictionary.com/definition/case-law.html, (Accessed on 05/17/2018).
[2] V. Jayawardana, D. Lakmal, N. de Silva, A. S. Perera, K. Sugathadasa, B. Ayesha, and M. Perera, “Word vector embeddings and domain specific semantic based semi-supervised ontology instance population,” International Journal on Advances in ICT for Emerging Regions, vol. 10, no. 1, p. 1, 2017.
[3] K. Sugathadasa, B. Ayesha, N. de Silva, A. S. Perera, V. Jayawardana, D. Lakmal, and M. Perera, “Synergistic union of word2vec and lexicon for domain specific semantic similarity,” in Industrial and Information Systems (ICIIS), 2017 IEEE International Conference on. IEEE, 2017, pp. 1–6.
[4] V. Jayawardana, D. Lakmal, N. de Silva, A. S. Perera, K. Sugathadasa, and B. Ayesha, “Deriving a representative vector for ontology classes with instance word vector embeddings,” in Innovative Computing Technology (INTECH), 2017 Seventh International Conference on. IEEE, 2017, pp. 79–84.
[5] K. Sugathadasa, B. Ayesha, N. de Silva, A. S. Perera, V. Jayawardana, D. Lakmal, and M. Perera, “Legal document retrieval using document vector embeddings and deep learning,” arXiv preprint arXiv:1805.10685, 2018.
[6] V. Jayawardana, D. Lakmal, N. de Silva, A. S. Perera, K. Sugathadasa, B. Ayesha, and M. Perera, “Semi-supervised instance population of an ontology using word vector embedding,” in Advances in ICT for Emerging Regions (ICTer), 2017 Seventeenth International Conference on. IEEE, 2017, pp. 1–7.
[7] D. R. Radev, “A common theory of information fusion from multiple text sources step one: cross-document structure,” in Proceedings of the 1st SIGdial workshop on Discourse and dialogue-Volume 10. Association for Computational Linguistics, 2000, pp. 74–83.
[8] P. Rashmi, D. Nihkil, L. Alan, M. Eleni, R. Livio, J. Aravind, W. Bonnie et al., “The penn discourse treebank 2.0,” in Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC08), Marrakech, Morocco, May. European Language Resources Association (ELRA), 2008.
[9] W. C. Mann and S. A. Thompson, Rhetorical structure theory: A theory of text organization. University of Southern California, Information Sciences Institute, 1987.
[10] L. Carlson, M. E. Okurowski, and D. Marcu, RST discourse treebank. Linguistic Data Consortium, University of Pennsylvania, 2002.
[11] F. Wolf, E. Gibson, A. Fisher, and M. Knight, “Discourse graphbank,” Linguistic Data Consortium, Philadelphia, 2004.
[12] “Lee v. United States,” in US, vol. 432, no. No. 76-5187. Supreme Court, 1977, p. 23.
[13] V. Gamage, M. Warushavithana, N. de Silva, A. S. Perera, G. Ratnayaka, and T. Rupasinghe, “Fast approach to build an automatic sentiment annotator for legal domain using transfer learning,” arXiv preprint arXiv:1810.01912, 2018.
[14] Z. Zhang, S. Blair-Goldensohn, and D. R. Radev, “Towards cst-enhanced summarization,” in AAAI/IAAI, 2002, pp. 439–446.
[15] V. Uzeda, T. Pardo, and M. Nunes, “A comprehensive summary informativeness evaluation for rst-based summarization methods,” International Journal of Computer Information Systems and Industrial Management Applications (IJCISIM) ISSN, pp. 2150–7988, 2009.
[16] M. L. d. R. Castro Jorge and T. A. S. Pardo, “Experiments with cst-based multidocument summarization,” in Proceedings of the 2010 Workshop on Graph-based Methods for Natural Language Processing. Association for Computational Linguistics, 2010, pp. 74–82.
[17] D. Marcu, “From discourse structures to text summaries,” Intelligent Scalable Text Summarization, 1997.
[18] D. R. Radev, H. Jing, M. Stys, and D. Tam, “Centroid-based summarization of multiple documents,” Information Processing & Management, vol. 40, no. 6, pp. 919–938, 2004.
[19] A. Louis, A. Joshi, and A. Nenkova, “Discourse indicators for content selection in summarization,” in Proceedings of the 11th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Association for Computational Linguistics, 2010, pp. 147–156.
[20] K. C. Litkowski, “Cl research experiments in trec-10 question answering,” no. 250. National Institute of Standards & Technology, 2002, pp. 122–131.
[21] S. Verberne, L. W. J. Boves, N. H. J. Oostdijk, and P. A. J. M. Coppen, “Discourse-based answering of why-questions,” Traitement Automatique des Langues, vol. 47, pp. 21–41, 2007.
[22] P. Piwek and S. Stoyanchev, “Generating expository dialogue from monologue: motivation, corpus and preliminary rules,” in Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 2010, pp. 333–336.
[23] N. A. H. Zahri, F. Fukumoto, and S. Matsuyoshi, “Exploiting discourse relations between sentences for text clustering,” in 24th International Conference on Computational Linguistics, 2012, p. 17.
[24] N. A. H. Zahri, F. Fukumoto, M. Suguru, and O. B. Lynn, “Exploiting rhetorical relations to multiple documents text summarization,”
[25] B. Hachey and C. Grover, “A rhetorical status classifier for legal text summarisation,” Text Summarization Branches Out, 2004.
[26] B. Hachey and C. Grover, “Extractive summarisation of legal texts,” Artificial Intelligence and Law, vol. 14, no. 4, pp. 305–345, 2006.
[27] C. Reed and G. Rowe, “Araucaria: Software for argument analysis, diagramming and representation,” International Journal on Artificial Intelligence Tools, vol. 13, no. 04, pp. 961–979, 2004.
[28] G. R. Chris Reed, Raquel Mochales Palau and M.-F. Moens, “Language resources for studying argument,” in Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08), B. M. J. M. J. O. S. P. D. T. Nicoletta Calzolari (Conference Chair), Khalid Choukri, Ed. Marrakech, Morocco: European Language Resources Association (ELRA), May 2008, http://www.lrecconf.org/proceedings/lrec2008/.
[29] International Journal of Network Security & Its Applications, vol. 7, no. 2, p. 1, 2015.
[30] M.-F. Moens, C. Uyttendaele, and J. Dumortier, “Information extraction from legal texts: the potential of discourse analysis,” International Journal of Human-Computer Studies, vol. 51, no. 6, pp. 1155–1171, 1999.
[31]
[32] R. Mochales-Palau and M. Moens, “Study on sentence relations in the automatic detection of argumentation in legal cases,” Frontiers in Artificial Intelligence and Applications, vol. 165, p. 89, 2007.
[33] D. Radev, J. Otterbacher, and Z. Zhang, “CSTBank: Cross-document Structure Theory Bank,” http://tangra.si.umich.edu/clair/CSTBank, 2003.
[34] K. Clark and C. D. Manning, “Entity-centric coreference resolution with model stacking,” in Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), vol. 1, 2015, pp. 1405–1415.
[35] K. Toutanova, D. Klein, C. D. Manning, and Y. Singer, “Feature-rich part-of-speech tagging with a cyclic dependency network,” in Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology-Volume 1. Association for Computational Linguistics, 2003, pp. 173–180.
[36] D. Chen and C. Manning, “A fast and accurate dependency parser using neural networks,” in Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), 2014, pp. 740–750.
[37] M.-C. De Marneffe and C. D. Manning, “Stanford typed dependencies manual,” Technical report, Stanford University, Tech. Rep., 2008.
[38] J. R. Finkel, T. Grenager, and C. Manning, “Incorporating non-local information into information extraction systems by gibbs sampling,” in Proceedings of the 43rd annual meeting on association for computational linguistics. Association for Computational Linguistics, 2005, pp. 363–370.
[39] M. A. Tayal, M. Raghuwanshi, and L. Malik, “Word net based method for determining semantic sentence similarity through various word senses,” in Proceedings of the 11th International Conference on Natural Language Processing, 2014, pp. 139–145.
[40] T. Pedersen, S. Patwardhan, and J. Michelizzi, “Wordnet:: Similarity: measuring the relatedness of concepts,” in Demonstration papers at HLT-NAACL 2004. Association for Computational Linguistics, 2004, pp. 38–41.
[41] “Turner v. United States,” in US, vol. 396, no. No. 190. Supreme Court, 1970, p. 398.
[42] G. Ratnayaka, T. Rupasinghe, N. de Silva, M. Warushavithana, V. Gamage, and A. S. Perera, “Identifying relationships among sentences in court case transcripts using discourse relations,” in Advances in ICT for Emerging Regions (ICTer), 2018 Eighteenth International Conference on. IEEE, 2018, pp. 13–20.