Multimodality in discussion sessions: corpus compilation and pedagogical use


Language Value 
http://www.e-revistes.uji.es/languagevalue 

December 2010, Volume 2, Number 1 pp. 1-26 
ISSN 1989-7103 

Articles are copyrighted by their respective authors

Multimodality in discussion sessions: corpus compilation and 
pedagogical use1 

Mercedes Querol-Julián 

Universitat Jaume I, Spain 
querolm@ang.uji.es 

ABSTRACT

Discussion sessions of conference paper presentations are spontaneous and unpredictable, in contrast to 
the prepared lecture that precedes them. These can be challenging, especially for novice presenters whose 
worst fear is to fail to understand the second meaning of a question or comment, and who know it is not 
only the quality of the research that is judged but also their prestige and worth. Additionally, spoken 
academic genres have traditionally been explored by focusing on the transcription of speech and 
disregarding the multimodal nature of spoken discourse. This study offers a comprehensive account of the 
design of a multimodal corpus of discussion sessions, where audio, video, transcriptions and annotations 
are time-synchronised. This multilayer analysis provides examples (not only of linguistic utterances of 
rhetorical moves and multimodal evaluation, but also of how they are actually expressed 
paralinguistically and kinetically), which can be used in the classroom and to design learning-teaching 
materials. 

Keywords: English for Academic Purposes, discussion sessions, multimodal corpora, multilayer 
annotation, research-based pedagogical materials 

I. INTRODUCTION 

The study of academic spoken research genres has received the attention of scholars in 

the last decade. They have focused primarily on conference paper presentations 

(Ventola et al. 2002) and particularly on lectures, where the outcomes of the research 

are presented. To date, however, discussion sessions (hereafter DSs) that follow 

lectures, and that round off conference paper presentations (CPs), have not received 

much attention. Yet it is in this face-to-face forum that the scientific community

can question, criticize and praise, or share knowledge and experience with presenters,

who have to know how to respond and react to discussants’ comments and questions in

a clear and effective way. DSs are therefore inherently evaluative, as shown by Wulff

et al. (2009). These scholars identify considerable differences between the language 

used in the lecture and in the discussion session, which is characterised by patterns of 

evaluative language. 


Discourse analysis of academic spoken research genres has in general adopted the 

traditional exploration of written genres, paying attention almost exclusively (Hood and 

Forey’s (2005) work is one exception) to the transcription of speech. However, the 

complex multimodal nature of spoken discourse cannot be captured in a verbatim 

transcription of audio recordings; sometimes analysts also make prosodic or phonetic 

transcriptions and take notes of contextual aspects. Spoken discourse can roughly be 

described as the co-expression of verbal modes and non-verbal modes; hence, verbatim 

transcriptions and even transcriptions of paralanguage (prosodic or phonetic) are only a 

partial representation of the original event (Thompson 2005). The process of recording

spoken data can be more problematic when we want to capture non-verbal features, 

such as the visual. Video recording of the events allows the analyst to explore verbal-

visual (visible bodily motion, kinesics) or multimodal functions of linguistic patterns. 

Therefore, the analysis of speech events cannot be performed on the same basis as 

written discourse since they use different modes of expression. The difficulty arises 

because oral communication is multimodal: it is embodied and combines both verbal

and non-verbal elements (Adolph and Carter 2007). In addition, most of the work on

kinesics, and on paralanguage, is done in conversation analysis, an area of interpersonal

interaction widely explored by scholars who generally come from a range of disciplinary

backgrounds such as anthropology, psychology, psychiatry, and sociolinguistics.

Gesture is one of the kinesic features that has received most attention. The most 

influential approaches to the study of gesture are those by Efron (1941), Ekman and 

Friesen (1969), Kendon (2004) and McNeill (1992). These works see gesture as an 

activity of major importance to the understanding of the speaker’s speech, which has a 

significant social meaning. 

This paper is part of a study that aimed at making a cross-disciplinary analysis of the 

presenter’s expression of evaluation in the DSs of two CPs in Linguistics and 

Chemistry. I set out to investigate evaluation in spoken academic discourse beyond the 

traditional linguistic approach. Thus, a multimodal approach, drawn mainly from 

conversation analysis studies, was followed to foreground KINESICS and

PARALANGUAGE that CO-OCCUR with the LINGUISTIC EXPRESSION OF EVALUATION. 

The theoretical framework of the study, which underpinned the design of the corpus,

was embedded in techniques of genre analysis (Swales 1990) and


discourse analysis, including the theoretical orientations of systemic functional 

linguistics (Halliday 1985), conversation analysis (Schegloff and Sacks 1973),

pragmatics (Brown and Levinson 1987), and multimodal discourse analysis (Kress and 

van Leeuwen 2001). In addition, corpus linguistic techniques made the

application of the multimodal approach feasible. I used computer techniques for

automated analytical procedures and qualitative techniques for the interpretation of the 

corpora. More precisely, I collected a video corpus, took part in the process of 

transcription, and annotated it. I used a multilayer annotation tool to time-synchronise

transcriptions (verbatim or orthographic, paralinguistic, and kinesic) and annotations 

(semantic evaluation and generic moves). Without this tool, it would not have been 

feasible to analyse evaluation on the comprehensive multimodal level as was done in 

the study. Nonetheless, a qualitative interpretation of the data was necessary to 

foreground the salient features that define evaluation in DSs. 

The interpretation of findings and the multilayer annotation enabled me to see the 

potential of this material for pedagogical purposes. The multimodal annotated corpus 

that I introduce in this paper can provide real examples of the rhetorical moves in which 

the interaction is organised to express specific communicative purposes, and the 

linguistic and multimodal expression of evaluation that articulates the rhetoric of the 

interaction. These multimodal instances can be retrieved to be used in the classroom and 

in the design of learning-teaching materials. Students will be provided not only with isolated

linguistic utterances but also with how these are expressed during the interaction, enabling

them to identify changes in paralinguistic features and kinesic features (gesture, head

movement, facial expression, and gaze). This would be a significant contribution to the 

virtually non-existent pedagogical materials based on multimodal corpora research to 

learn-teach academic spoken genres. Currently, there is only one work (Ruiz-Madrid 

and Querol-Julián 2008) that devotes a few activities to discussion sessions, and whose

design was based on the study of natural language from a multimodal approach.

The paper is structured in three sections. First, the design of the corpus is presented. I 

describe the data and give a detailed account of the steps followed to get the corpus 

ready for the analysis. Then, I suggest some pedagogical applications of the multimodal 

corpus in the design of activities and the use of the corpus in the classroom. 


II. CORPUS DESIGN

The corpus was designed and compiled within the framework of a major project, the 

compilation of the Multimodal Academic and Spoken language Corpus (MASC) 

(Fortanet-Gómez and Querol-Julián 2010). MASC is a multidisciplinary collection of 

Spanish and English spoken academic events at university (i.e. lectures, seminars, guest 

lectures, students’ presentations, dissertation defences, plenary lectures, and conference 

paper presentations), collected by the research group GRAPE (Group for Research on

Academic and Professional English, http://www.grape.uji.es/) at the Universitat Jaume I. The multimodal nature

of MASC is given by the five different types of data, gathered during the video

recording of the events: slides, transcripts, handouts, and video and/or audio recordings.

There are several aspects that need to be considered when designing a spoken corpus, 

such as the size, variety of language, level of proficiency, text types, and genre among 

others (Campoy and Luzón 2007). Prioritizing one aspect over another depends on the 

purpose of the research that is going to be conducted on the corpus. Hence, the aim of 

the analysis determines the compilation of the corpus, how the corpus is collected, 

transcribed, and annotated. The criteria followed in the design of the corpus used in the 

study were based on the main objective of MASC, the multimodal discourse analysis of 

academic spoken genres (the criteria will be described below). Additionally, a cross-

disciplinary approach was adopted in the study which has also determined the design of 

the corpus. 

In this respect, a contrastive study should compare items that are comparable; to put it 

in other words, the two corpora of Linguistics and Chemistry should have similarities to 

make the comparison possible. A close look at the factors that may influence the

rhetoric and the performance (linguistically and non-linguistically) of the DSs of CPs 

might help to shed light on the tertium comparationis of the two corpora. I have 

identified six different aspects that may affect INTERPERSONAL MEANING in discussion 

and therefore might influence the expression of evaluation: the purpose of the

conference, the relationship among the participants, cultural and personal features, 

environmental factors, others’ turns, and the discipline. These factors, however, do not 

operate individually but function as a whole. First, the PURPOSE OF THE CONFERENCES 

was to create a site for bringing together specialists in a field of research to share 


investigation results and to open a forum for discussion. In the discussion sessions, as 

well as in the lectures, the major concerns of the speakers in both conferences were to 

present their views and to persuade the audience of the relevance and value of their 

research. Concerning the RELATIONSHIP AMONG THE PARTICIPANTS, both were small 

focused conferences, with no parallel sessions; thus, the audience size was similar in all 

the presentations, around 50 people. Small conferences may help presenters to establish 

a good rapport with the audience. Some participants in the conference in Linguistics, as 

well as the organisers of the conference in Chemistry were interviewed to find out the 

relationship between the participants and its possible influence on the discussion 

sessions. They maintained that most of the participants already knew each other before 

the conference, as they were international communities of experts with specific and 

common research interests. The use of first names as terms of address lends linguistic

support to this claim. They also noted that the DS in CPs could be considered the

most stressful stage. The main reason they gave was that after presenting their research, 

presenters are fully exposed to an audience of experts (in these conferences most of 

them were senior researchers), who for approximately 20 minutes have been

evaluating the presentation and comparing it with their previous knowledge and

experience. Presenters should be ready to respond to tricky questions and challenging

comments. Easy questions and nice comments obviously do not pose major problems;

the difficulty lies in the uncertainty of the audience’s reaction. In view of this, the

relationship among the participants can play a crucial role in creating a relaxed

atmosphere for discussion. The main characters of the discussion are the presenter and 

the discussant; consequently, the relationship between them is likely to be the most

influential one in how they formulate their questions, comments, and responses. However, the

discussion opened between them is not an isolated exchange. The relationship that the 

presenter and the discussant have with the rest of the participants may also constrain 

their performance. Of major interest to the contrastive study, however, is that the 

informants argued that the rhetoric and performance of the discussion did not differ 

from those adopted in other conferences on the same academic discipline. 

So far, I have shown that the purpose of the meetings and the relationship among the 

participants of these specialised conferences seem to be the same. However, there are 

other factors that may influence these comparable corpora of DSs which are variables 


rather than constants. In this respect, CULTURAL AND PERSONAL FEATURES may affect 

discussants’ questions and comments, and presenters’ responses. However, I am neither 

a biographer nor interested in adopting an ethnographic approach to go into what could 

be a fascinating analysis. My final objective in the study was to develop a new

methodology of analysis from a multimodal perspective; that is the reason why I 

primarily focused on the linguistic and non-linguistic features of the speech, not putting 

much emphasis on the cultural and personal backgrounds of the speakers. On the other 

hand, DSs are organised around a dialogic exchange structure where discussant’s and 

presenter’s turns follow each other or overlap. Certainly, the OTHERS’ TURN, its meaning 

and how it is performed, will constrain the response to the questions and comments. 

This is the way the discussion is constructed. Turns are central in the exchange 

structure, since it is by turn taking that participants take part in the discussion. 

Nonetheless, as stated above, the factors that may affect discussion do not act

individually; rather, their spheres of influence overlap. How others’ turns are performed

depends on the rest of the factors already noted: the purpose of the conference, the 

relationship among the participants, cultural and personal features, ENVIRONMENTAL

FACTORS (such as problems with microphones), and the discipline. Regarding the 

DISCIPLINE, cross-disciplinary differences have been a common topic of analysis from 

different perspectives in the studies of evaluation in academic written genres (Hyland 

2000, 2004). As regards spoken academic genres, whereas a considerable number of 

studies have focused on the description and interpretation of a genre in a particular 

discipline (Flowerdew 1992, Olsen and Huckin 1991), not much work has been done to 

bring to the fore either differences between two or more disciplines or disciplinary

differences concerning evaluation. An exception is the work of Poos and Simpson 

(2002) who explore the use of hedging in a corpus of academic spoken English. These 

scholars found disciplinary differences; however, no attention has yet been paid to

evaluation in discussion sessions of conference paper presentations, nor has a multimodal

approach been adopted for the study of this interpersonal meaning in academic

spoken genres. 

The tertium comparationis of the two corpora is essential to conduct a scientific 

contrastive study. Nonetheless, although the factors discussed above might influence

the expression of evaluation, they are beyond the corpus designer’s control, since they 


are inherent to the event and the people that take part in it. There are other aspects, 

however, that can be controlled in the design of comparable corpora such as corpus size. 

The size of the present corpus has been determined by the approach adopted in the 

analysis, the multilayered exploration of evaluation. This type of analysis requires small

corpora that make a qualitative examination feasible. The purpose of the study was

to describe evaluation in both disciplines, rather than to make generalisations about

linguistic and non-linguistic patterns, for which a larger corpus would be required.

II.1. Corpus description 

As noted above, two corpora of CPs, lectures and discussion sessions, of two different 

academic disciplines were collected for the study. The Chemistry conference brought 

together leading scientists from all over the world, and a total of 36 papers were

presented across a range of areas on the science of isotopes. By contrast, all

contributions to the Linguistics conference, 24 in total, dealt with the topics of genre 

analysis and discourse analysis. Participants were international experts in the field of 

applied linguistics. For the investigation, however, only the discussion sessions were of 

interest; thus, a subcorpus of ten DSs from each conference was selected. Two criteria

were considered in the selection of these DSs. The first criterion was the number of 

presenters. Only one speaker should have presented the paper, and thus he or she should 

be the only one responsible for responding to the audience’s questions and comments. A

preliminary analysis showed that when there is more than one presenter, speakers share

responsibilities, in the sense that presenters can give and seek their colleagues’

support and even negotiate who is going to respond, using verbal and non-verbal

language. Thus, turn-taking organization and rhetoric would be more complex. It is not 

only the interpersonal meaning between presenter and discussant/s that would come into 

play, but also the interpersonal meaning between presenters. The second criterion 

adopted in the selection was the number of turns. A turn is counted when a participant 

in the discussion (chair, presenter, or discussant) takes the floor. This criterion can give 

a tentative idea of the level of interaction in the discussion, which should be as similar 

as possible in both disciplines. In the end, the Linguistics DSs corpus consists of nearly

12,000 words, 71 minutes, and 39 dialogic exchanges, whereas the Chemistry DSs

corpus amounts to a total of nearly 8,500 words, 59 minutes, and 34 dialogic exchanges.

The analysis of the corpus of DSs was done at the macrostructure level. This analysis

identified patterns of dialogic exchanges in the two disciplines.

Accordingly, two sub-corpora of dialogic exchanges were selected for the study of 

evaluation and the generic structure (moves). Sinclair et al. (1972) define exchange as 

the basic unit of the interaction, because it consists of the contribution of at least two 

participants. In the study, I have followed this definition and categorised what I have 

called DIALOGIC EXCHANGES. These types of exchanges refer to the dialogue held 

between discussant and presenter to make comments and questions, and to respond to 

them. The definition of this type of exchange is necessary to distinguish it from

other types of interaction where participants aim at organising the discussion rather than 

at engaging in a dialogue. Additionally, the concept of DIALOGIC PATTERN is used to go 

beyond the concept of adjacency pair postulated by Schegloff and Sacks (1973), where 

a question is followed by an answer, to embrace more complex structures; for example, 

a discussant’s comment is followed by a question, which is answered by the presenter, rather

than the adjacency pair question – response. 

The criterion followed for the selection of the dialogic exchanges that form the sub-

corpora was that they share similar dialogic patterns. Results show that only four and three dialogic

exchange patterns were recurrent in Linguistics and Chemistry respectively, and only

those performed in two turns were common in both disciplines: Comment – Comment,

Question – Response, and Comment + Question – Response. It is also

worth noting that these three patterns are the most frequent “openers” of longer

exchange patterns in the corpora with more than two turns. These data indicate that

participants in the discussion sessions in the small corpora analysed commonly follow

these three dialogic exchange patterns (63% of the exchanges in Linguistics and 71% in

Chemistry) to open discussion. The sub-corpora of dialogic exchanges consisted of

four exchanges of each pattern from each discipline. The Linguistics sub-corpus

comprised a total of about 2,300 words and 15 minutes, and the Chemistry sub-corpus

around 2,000 words and 14.30 minutes.
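
As a rough illustration only, and not part of the original study, the identification of dialogic exchange patterns and of their share as openers could be sketched as follows. The encoding of turns as (speaker, act) pairs, the function names, and the toy data are all assumptions made for the example; the sketch adopts one possible reading of the percentages above, in which an exchange counts when its first two turns match one of the three patterns.

```python
from collections import Counter

# Hypothetical encoding: each exchange is a list of (speaker, act) turns, where
# speaker is "D" (discussant) or "P" (presenter) and act is a "+"-joined label.
def pattern_of(exchange):
    """Return the dialogic pattern of an exchange, e.g. 'Question – Response'."""
    return " – ".join(act.replace("+", " + ") for _speaker, act in exchange)

def opener_of(exchange):
    """Return the pattern formed by the first two turns (the 'opener')."""
    return pattern_of(exchange[:2])

def opener_share(exchanges, openers):
    """Percentage of exchanges whose first two turns match one of `openers`."""
    hits = sum(1 for ex in exchanges if opener_of(ex) in openers)
    return 100 * hits / len(exchanges)

# The three two-turn patterns named in the text.
TWO_TURN_PATTERNS = {
    "Comment – Comment",
    "Question – Response",
    "Comment + Question – Response",
}

# Invented toy data for illustration; NOT taken from the corpus.
toy = [
    [("D", "Question"), ("P", "Response")],
    [("D", "Comment"), ("P", "Comment")],
    [("D", "Comment+Question"), ("P", "Response"), ("D", "Comment")],
    [("D", "Question"), ("P", "Response"), ("D", "Question"), ("P", "Response")],
    [("D", "Comment"), ("P", "Response")],  # an opener outside the three patterns
]

counts = Counter(opener_of(ex) for ex in toy)
share = opener_share(toy, TWO_TURN_PATTERNS)
```

Here `counts` tallies how often each opener occurs and `share` gives the percentage of toy exchanges opened by one of the three patterns; with real data, the exchanges would come from the macrostructure annotation of the DSs.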


II.2. Getting the corpus ready 

The corpora were compiled in three stages: data collection, transcription, and 

annotation. The several types of transcriptions and annotations were done in the 

following order: first, a verbatim transcription of the corpus of CPs (lectures and DSs); 

then the annotation of the generic structure (moves) and the semantic evaluation of the 

corpus of dialogic exchanges of DSs; and finally, the transcription of kinesic and 

paralinguistic features that co-express with the semantic evaluation already annotated. 

In the following sections, I give an account of the process of collecting, transcribing, and

annotating data, as well as of the multilayer annotation of the corpus. Figure 1

gives a synoptic view of the design of the corpus, which makes it possible to apply

a multimodal approach to the exploration of evaluation in DSs and which is described

throughout the section.
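
The idea of time-synchronised layers can be illustrated with a minimal sketch. It is not the annotation tool used in the study: the `Annotation` record, the tier names, and the example values are hypothetical, and the only logic shown is how kinesic and paralinguistic annotations that overlap in time with an annotated stretch of semantic evaluation might be retrieved.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    tier: str     # hypothetical tier name, e.g. "evaluation", "kinesics"
    start: float  # seconds from the start of the clip
    end: float
    value: str

def overlaps(a, b):
    """True when the two annotations share any stretch of time."""
    return a.start < b.end and b.start < a.end

def co_occurring(annotations, anchor_tier="evaluation"):
    """Map each anchor-tier annotation to the other-tier annotations overlapping it."""
    anchors = [a for a in annotations if a.tier == anchor_tier]
    others = [a for a in annotations if a.tier != anchor_tier]
    return {anchor: [o for o in others if overlaps(anchor, o)] for anchor in anchors}

# Invented example values; NOT taken from the corpus.
clip = [
    Annotation("evaluation", 2.0, 3.5, "positive appraisal"),
    Annotation("kinesics", 1.8, 2.6, "head nod"),
    Annotation("paralanguage", 3.0, 3.4, "louder voice"),
    Annotation("kinesics", 5.0, 6.0, "gaze at slides"),
]

result = co_occurring(clip)
```

In this toy clip, the head nod and the louder voice overlap the evaluative stretch, whereas the later gaze shift does not, so only the first two are returned for it.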

II.2.1. Collecting the data 

The first stage in the compilation of a corpus is the collection of the data. Before

collecting the data, however, we need presenters to give their

permission to be video recorded. As noted above, the corpus is part of a larger project,

MASC. The procedure we follow to collect the data in MASC is first to contact the

organisers of the events. In many cases, the organisers give us the go-ahead to email the 

speakers. But it can also happen that the organisers become mediators. In both cases, we 

write a formal email explaining the project they are going to be involved in. We only 

tape those speakers who give a positive reply to our request. In addition, the data are 

initially compiled for research purposes; however, participants also sign a consent form 

when part of the data is going to be published. 


Figure 1. Design of a multimodal corpus of DSs. [Flowchart. Box labels: Contact organisers and presenters; Video & audio recording of CPs; Researcher takes notes; Video & audio edition; Verbatim transcription of CPs; Multidisciplinary team work; Checking & edition; Corpus of lectures; Corpus of DSs; Annotation of DSs macrostructure; Corpus of dialogic exchanges; Verbatim transcription; Audio recording; Video recording; Annotation of generic structure (moves); Annotation of semantic evaluation; Kinesic transcription; Paralinguistic transcription; Multilayer time synchronisation.]

For the present study, the original corpus (lectures and discussion sessions) was video

recorded and the organisers of both conferences played the role of mediators. However,

sometimes the use of go-betweens entails a risk. An example of the difficulties that may

appear when researchers do not contact the speakers directly is what happened in

the Chemistry conference. The organisers informed us that we only had permission to 

tape 11 out of the 36 presentations and discussion sessions; however, when the 

conference was over some of the speakers complained about not having been video 

recorded. A major obstacle in compiling data for a multidisciplinary specialised academic

spoken corpus is gaining access to areas of knowledge other than our own, since

neither the organisers nor the participants in the event are familiar with the methodology 

we use. In those cases, it is essential that once the organisers green-light our project we 

try to personally contact speakers to avoid misunderstandings. 

Several aspects should be taken into account before and during the recording to 

guarantee the quality of the data. Special mention should be made of those aspects 

related to the physical context and the speakers’ performance. Before setting up the 

camera one should consider the size of the room, as well as the distribution of tables, 

computer/OHP, aisles, window/s and door/s. On the one hand, the intrusion of the 

camera should cause as little trouble as possible to the presenters, in the sense that they

should not feel threatened by it; otherwise their behaviour could change. The smaller the

room, the more difficult it is to create a comfortable environment and at the same time 

focus on the speaker. Moreover, the camera should neither prevent the audience from 

seeing the speaker, nor distract them from the presentation and discussion. On the other 

hand, a video recording can become a valuable source of data for the analysis, and for 

the design of pedagogical materials, if the quality of the image and the sound is good. 

Lighting conditions are essential for the quality of the image, an aspect that has to be

negotiated with the organisers of the event beforehand. Regarding the sound, external 

microphones may help to improve it. The speakers’ performance should also be taken 

into account when setting up the camera to be able to focus on them all the time. 

Presenters may be sitting or standing up, but they can also move around. Accordingly, it

is extremely important to take care over this issue; otherwise we could lose

relevant data for a multimodal approach. 

The conference paper presentations that shape the data for the study were video 

recorded with a mini-DV digital video camera and an external unidirectional 

microphone plugged into the camera. One of the advantages of unidirectional

microphones is that they seem to reduce ambient noise and to capture the sound of the 


image that is in focus. In the corpus, presenters were in focus during the presentations 

and the discussion sessions. In the conference on Linguistics we were able to use two 

cameras which also allowed us to record the audience. This is an important difference in 

the data collection that has determined that only the presenters’ performance should be 

the centre of the contrastive analysis. The external microphones helped to get an 

acceptable sound quality of the presenters’ speech. However, the sound quality of the 

discussants was lower, which sometimes made the transcription hard. In the Chemistry 

conference, this was because, although the camera was set up in the middle of the room,

among the discussants, the presenter was always the one in focus. In the Linguistics

conference, the second camera was set up at the front of the room to focus on the 

audience; however, the quality of the audio recordings of those discussants sitting at the 

back was also reduced. Regarding the image, quality was good in the Linguistics 

conference, but in the conference on Chemistry it was a bit dark because, during the 

presentation and discussion session, lights were off for the sake of an excellent slide show

and only light coming in from back windows illuminated the room. Negotiations with the

organisers of the conference over lighting conditions proved fruitless. Unfortunately, this reduced

the quality of the video recordings, which affected the analysis of kinesics, particularly

of facial expression and gaze. In addition, in Linguistics, for a few seconds in four

exchanges the presenter was not in focus. These problems can be attributed principally

to our inexperience in collecting a multimodal corpus at that time; this was the first

contribution to MASC, and therefore we were not so sensitive to those particular

aspects of the recording and the consequences for this type of research.

The next step in the collection of data is the editing of the recordings. I used the video

editing software Avid Liquid 7.0 to create .avi files. This format allowed me to 

manipulate the data creating audio files (.wav) to improve quality with the audio editor 

available in the program. In addition, after the analysis of the macrostructure of the DSs, 

I created the sub-corpora of dialogic exchanges making audio and video clips from the 

original recordings of the whole events. The format of these clips enabled me to export 

them to the multimodal annotation tool. 
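
For readers who wish to reproduce this clipping step without the Avid Liquid software used in the study, a comparable result could probably be obtained with the freely available ffmpeg command-line tool. The sketch below only builds the command; the file names and timings are hypothetical, and the choice of ffmpeg is an assumption of the example, not the procedure followed here.

```python
def clip_command(source, start, duration, target):
    """Build an ffmpeg command that cuts an audio-only WAV clip out of a recording."""
    return [
        "ffmpeg",
        "-ss", f"{start:.2f}",    # seek to the start of the exchange (seconds)
        "-i", source,             # original recording of the whole event
        "-t", f"{duration:.2f}",  # length of the clip (seconds)
        "-vn",                    # drop the video stream, keep audio only
        "-acodec", "pcm_s16le",   # uncompressed 16-bit PCM, i.e. plain WAV
        target,
    ]

# Hypothetical file names and timings, for illustration only.
cmd = clip_command("linguistics_ds03.avi", 125.0, 48.5, "linguistics_ds03_ex02.wav")
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would execute it on a machine where ffmpeg is installed; omitting `-vn` and `-acodec`, and giving a video file name as target, would keep the video stream for a video clip instead.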

The collection of data involved the audio and video recording, but also the collection of

contextual information. We observed how the paper presentations and the discussion

sessions were performed and, during the observation, recorded on a form

different aspects such as the type of event and communicative act (e.g. title, field of

knowledge, duration); the speaker (academic status, nationality, mother tongue, age, and

sex); the room (type of room, plus a sketch of the distribution of participants, recording

devices, furniture and props); the audience (type and number); the speaker resources

(PPP, OHP, handouts, microphones, etc.); the speaker/s’ performance (mode of

presentation: explaining, reading or both; and posture adopted: moving, sitting or

standing up); the discussion (whether there is discussion or not; when it takes place, during or after the

presentation; and audience’s turns: number, language, and sex); the recording (time and

equipment); and any incident that occurs during the communicative act. The observation

aims at covering aspects that one cannot capture with the camera or the microphone and

that may help to understand the communicative act.

II.2.2. Transcribing the data

Once the audio and video recordings were edited, the next step was to transcribe what 

was said, that is, to create a verbatim transcription. The transcription was done for the

corpus of CPs (lectures and discussion sessions) as a collaboration between

GRAPE and the English Language Institute (ELI) at the University of Michigan.

Transcriptions followed the established MICASE conventions, where some contextual

data were also represented (i.e. XML tags and symbols were used to annotate

potentially relevant features like speaker identity, speaker turns, speech overlap,

laughter, backchannels and pauses2). Transcribers were native speakers of English who

had previously been trained. The process was completed by checking and editing the

transcriptions, a task accomplished by a multidisciplinary team, since the help

of an expert in the field was necessary to check the Chemistry transcripts. The

transcripts of the conference in Linguistics were transferred to the ELI and gathered in a

single corpus named the John Swales Conference Corpus (JSCC), a project that

aims to complement MICASE. Like those of MICASE, the transcripts of the JSCC are publicly

available at the ELI corpora website3.

The other two types of transcription, kinesic and paralinguistic, were done exclusively

for the analysis of evaluation in the corpora of dialogic exchanges, at the points where

linguistic evaluation is expressed. Therefore, they were done after the orthographic

transcription and the annotation of semantic evaluation. Changes in kinesics and

paralanguage that co-occur with semantic evaluation were identified and the data were

registered in the corpus with the help of the multimodal annotation tool ELAN (see the

detailed description in Section II.2.4).

The scope of the analysis of kinesics covered changes in ARMS AND HANDS GESTURES,

FACIAL EXPRESSION, GAZE DIRECTION, and HEAD MOVEMENT. Transcription of kinesics

was a laborious job since the identification of the co-expression with linguistic

evaluation was only possible by slowing down the video recording repeatedly to reveal

any change, any micro expression (Ekman and Friesen 1969), not only of the face but of

any of the kinesic aspects considered in the study, that is not observable at normal

playback speed. For example, in one of the exchanges in Linguistics the presenter used

the expression “how it’s often taught” in her response to a discussant’s question, where 

the evaluative adverb “often” co-expressed with a kinesic feature of raising eyebrows 

that lasted 114 milliseconds. That would be difficult to capture without the annotation

tool. In Chemistry, it was not always possible to determine the exact direction of

eye gaze. As a result, assumptions had to be made on the basis of body and head

orientation. On the other hand, the transcription of gestures was broad, in the sense that

in the study I was not interested in the gestures themselves, but in how they co-expressed

with evaluative semantics. For this reason, I did not precisely identify the three

phases of prototypical gestures, i.e. preparation, stroke, and retraction4 (Kendon 1980).

Nonetheless, a preliminary study showed preparation and stroke commonly co-occur 

with linguistic evaluation. 

Regarding paralanguage, as the starting point of the analysis was semantic evaluation, 

its examination was limited to changes in the pronunciation of discrete words. This 

approach narrowed the transcription to changes in the speaker’s VOICE QUALITY, i.e. 

LOUDNESS, and VOICE QUALIFIER, i.e. SYLLABIC DURATION (after Poyatos 2002). LOUDNESS

was identified by comparison with the surrounding speech. The sound

waveforms available in ELAN were essential at this stage, since the waveform reaches its

highest peaks when loudness goes up and its lowest peaks when it goes down. Figure 2

shows a sample of the identification of loudness-up in ELAN in a fragment of a clip in

Chemistry, where the maximum amplitude of the waveform of the evaluative word

“problems” corroborates the phonetic perception of the stressed noun.

Figure 2. Sample view of identification of paralanguage voice quality. 
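
The comparison of a word's amplitude with that of its surroundings can be sketched in a few lines of Python. This is only an illustration, not the procedure followed in the study, where loudness was judged perceptually and checked visually against the ELAN waveform. The function names, the sample values and the ratio threshold of 1.5 are all assumptions made for the example.

```python
def peak(samples):
    """Return the maximum absolute amplitude of a sequence of samples."""
    return max(abs(s) for s in samples)


def is_louder(word, before, after, ratio=1.5):
    """Flag a word as 'loudness-up' when its peak amplitude exceeds the
    peak of the surrounding speech by `ratio` (an assumed threshold)."""
    surroundings = max(peak(before), peak(after))
    return peak(word) >= ratio * surroundings


# Invented amplitude values, for illustration only.
before = [0.10, -0.22, 0.18, -0.15]
word = [0.35, -0.61, 0.58, -0.40]
after = [0.12, -0.20, 0.16, -0.09]

print(is_louder(word, before, after))
```

A relative threshold is used here because, as in the study, what matters is the contrast with the surrounding speech rather than any absolute level.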

As for VOICE QUALIFIER, changes in the SYLLABIC DURATION refer to whether the word 

is pronounced faster or slower than expected in the discourse, that is, in comparison 

with the pronunciation of surrounding words. Figure 3 shows a sample of the identification

of long syllabic duration in a portion of a Linguistics exchange. Comparing the durations

within the evaluative utterance “tends to be more broad” shows that the adjective

“broad” carries the paralinguistic feature of long duration. Whereas the verb

“tends to be” is pronounced in 582 ms and “more” in 222 ms, the adjective, despite being a

monosyllabic word like “more”, lasts 594 ms, a duration even longer than that of

“tends to be”.

Figure 3. Sample view of identification of paralanguage voice qualifier. 
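
The same comparison can be expressed as a small calculation. The sketch below uses the three durations reported above; the syllable counts, the function names and the factor of 1.5 are assumptions added for the example and were not part of the annotation procedure, which relied on the time values displayed in ELAN.

```python
# (text, duration in ms as reported above, syllable count added for the sketch)
segments = [
    ("tends to be", 582, 3),
    ("more", 222, 1),
    ("broad", 594, 1),
]


def ms_per_syllable(duration_ms, syllables):
    """Average duration of one syllable in milliseconds."""
    return duration_ms / syllables


def long_duration(target, context, factor=1.5):
    """True if the target's per-syllable duration exceeds the mean
    per-syllable duration of the context by `factor` (assumed threshold)."""
    rates = [ms_per_syllable(d, s) for _, d, s in context]
    mean_rate = sum(rates) / len(rates)
    _, duration, syllables = target
    return ms_per_syllable(duration, syllables) > factor * mean_rate


# Compare "broad" with the two preceding segments of the utterance.
print(long_duration(segments[2], segments[:2]))
```

Normalising by syllable makes the contrast explicit: “tends to be” averages 194 ms per syllable and “more” 222 ms, against 594 ms for “broad”.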

In addition, I have also included in the analysis the transcription of LAUGHTER, a type of 

differentiator or of VOICE QUALIFICATOR. I have considered the speakers’ instances of 

individual laughter in contrast to episodes of general laughter, because I understand 

them as the expression of the speakers’ attitude towards what they are saying. I cannot

ignore the fact that this is a non-linguistic vocal effect which shows emotional

reactions. Other paralinguistic aspects, such as intonation, would appear in holistic 

analysis rather than in the exploration of paralanguage of discrete items, as done in the 

study. 

II.2.3. Annotating the data

Annotation differs from transcription in its content. Rather than capturing overtly 

observable aspects, annotation focuses on more abstract relationships. Annotation, like

the collection and the transcription of the data, is determined by the purpose of the

study. In view of that, a pragmatic or functional annotation was done on the verbal 

language to examine the structure of the discussion session and the linguistic evaluation. 

Regarding the annotation of the structure, it is important to observe that the analysis 

conducted was corpus-driven. Therefore, none of the tags used in the annotation was

pre-selected before the analysis; all were drawn from the findings. The macrostructure of the

corpus of DSs was annotated to shed some light on the flow of the discussions, to see 

how turn-taking operates in DSs of specialised CPs. Three different types of tags were 

used for this aim: the identification of the PARTICIPANTS (speaker and addressee), the 

TYPE OF TURN and its POSITION in the discussion. All three were assembled in the

following string, which identifies each turn taken, including overlapping turns:

speaker : type of turn _ position of the turn ~ addressee 

Regarding the identification of the PARTICIPANTS, although, as noted above, the

identity of the speakers was already captured in the verbatim transcription, I have

adapted MICASE conventions to identify the role the participants play in the

interaction5. That is, rather than identifying the participants by the order in which they speak (S1,

S2, etc.), I identified them by the primary role they play as: CHAIR (CH), PRESENTER (P), 

DISCUSSANT (D), or AUDIENCE (AUD). Besides, discussants were also assigned a 

number that shows the order in which they speak. I maintained the tags for unknown

speaker/s (SU) and for two or more speakers (SS). Moreover, the name used for the tag was

“participants” rather than “speakers” (as in MICASE), since I aimed to identify a

further functional level: whether they were speakers or addressees. As regards the TYPE OF

TURN, the function that each turn had in the DS was tagged as: COMMENT (C), QUESTION 

(Q), and RESPONSE (R). The third tag identifies the POSITION OF THE DISCOURSAL TURN 

in the discussion. The dialogue between discussant and presenter can occur in two turns 

or in several turns. In order to trace the complexity of the sequence, I annotated

whether the discussant’s or presenter’s turn STARTS the exchange (S) or whether

it is a FOLLOW-UP turn (FU). Follow-up turns were also numbered. When there was

no follow-up, turns were tagged only as start turns, even though they both started and

finished the exchange.

The following example, taken from a Linguistics dialogic exchange, illustrates how the 

exchange between the first discussant in the DS and presenter was annotated in the 

corpus. The discussant starts her turn by asking the presenter a question

<D1:Q_S~P> and the presenter responds <P:R_S~D1>. However, the discussant does

not consider the interaction finished after the presenter’s response and goes on with a

follow-up question <D1:Q_FU1~P>, which the presenter also answers

<P:R_FU1~D1>, first with an attempt in overlap and then in his own turn.

D1:Q_S~P: um, (were these others) that worked in these (fields) were guest editors or were 

they all the official editors 

P:R_S~D1: um, both both kinds. uh um and the_ in in linguistic and in meds- in medical uh 

journals yes 

D1:Q_FU1~P: cuz i just wondered if they might get kind of a different, um, well different 

kind of type of editorial from a guest editor, who doesn’t usually get the floor 

<P:R_FU1~D1><OVERLAP> absolutely, mm </OVERLAP> and might use the 

opportunity to say things uh_ you know, put forward their views and... 

P:R_FU1~D1: yep, yep. certainly, there’s lot of variation from one journal to another, so 

that they seem to have their <SU-m><OVERLAP> in-house style </OVERLAP> in-house 

customs and perceptions of the genre, but also according to the the author. […] 
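
Because every turn tag follows the same string, the tags can be split into their components automatically. The following Python sketch is an illustration under stated assumptions and not a tool used in the study: the regular expression, the `Turn` type and the `parse_tag` function are hypothetical, and the pattern assumes that any participant code (CH, P, D plus a number, AUD, SU, SS) may appear as speaker or addressee.

```python
import re
from typing import NamedTuple

# speaker : type of turn _ position of the turn ~ addressee
TAG = re.compile(
    r"(?P<speaker>CH|P|D\d+|AUD|SU|SS)"
    r":(?P<turn>[CQR])"                # Comment, Question or Response
    r"_(?P<position>S|FU\d+)"          # Start or numbered Follow-Up
    r"~(?P<addressee>CH|P|D\d+|AUD|SU|SS)"
)


class Turn(NamedTuple):
    speaker: str
    turn: str
    position: str
    addressee: str


def parse_tag(tag: str) -> Turn:
    """Split a turn tag such as 'D1:Q_FU1~P' into its four components."""
    match = TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not a valid turn tag: {tag!r}")
    return Turn(**match.groupdict())


# The four tags of the Linguistics exchange shown above.
tags = ["D1:Q_S~P", "P:R_S~D1", "D1:Q_FU1~P", "P:R_FU1~D1"]
turns = [parse_tag(t) for t in tags]

# The sequence of turn types gives the dialogic pattern of the exchange.
pattern = "-".join(t.turn for t in turns)
print(pattern)  # Q-R-Q-R
```

Reading off the sequence of turn types in this way yields the kind of dialogic pattern (here question, response, follow-up question, response) on which the selection of exchanges was based.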

The annotation of the corpus of DSs made it possible to identify, among other aspects, the

sequence of the dialogues held in the exchanges (i.e. a question is followed by a 

response, a comment is followed by a comment and the like). This analysis has 

determined the selection of the recurrent patterns of the dialogues that make up the sub-

corpora of dialogic exchanges to conduct the analysis of evaluation. The two sub-

corpora (of Linguistics and Chemistry) were also functionally annotated in terms of the 

moves that shape the dialogic patterns and also in terms of linguistic evaluation. The 

generic structure of the exchanges was annotated to confirm the hypothesis that it is 

evaluation, both linguistic and non-linguistic, that articulates it. The tags used to mark 

the moves were also driven by the corpus. Conversely, the annotation of linguistic 

evaluation follows an abridged version of the appraisal model postulated by Martin and 

White (2005). I considered it interesting for the cross-disciplinary study to tag whether

the SEMANTIC RESOURCES express one or more than one of the three domains of

evaluation in the model: ATTITUDE, ENGAGEMENT, and GRADUATION6.

In the next section I describe how these annotations and the transcriptions were 

incorporated into the corpus to carry out the analysis. Before moving on to the description of

how the multimodal annotated corpus was created, I would like to note the importance 

of tagging not only by the examination of the verbatim transcription but, even at this 

stage, by the consideration of the whole performance, that is, audio and video 

recordings. The multimodal approach might help the analyst to make a more accurate 

interpretation of the original event, closer to reality. It is important to bear in mind that, 

in the interaction, participants interpret their interlocutors’ speech on the basis of what 

they hear, the content and the way it is said (that is, linguistics and paralanguage), and 

what they see (kinesics, visual aids, and any physical interaction with the surroundings). 

I therefore consider that studying certain aspects of interpersonal meaning in spoken

discourse (like those examined in the study) exclusively on the basis of

verbatim transcripts could lead to inaccurate analysis, because a significant

part of the modes of expression that speakers use is disregarded.

II.2.4. Creating a multimodal annotated corpus

As described in previous sections, the study conducted with the corpus analysed the 

data from two approaches. First, I focused on the macrostructure of DSs from a top-

down approach. At this level, the analysis was conducted on the corpus of DSs. Then, I 

explored moves and multimodal evaluation in the sub-corpora of exchanges. The

examination of moves similarly followed a top-down approach, but the exploration of 

multimodal evaluation followed a bottom-up approach. At this level of analysis the use 

of a multimodal annotation tool made the work easier, since it was necessary to time-

synchronise the different levels of transcriptions (verbatim or orthographic, kinesic, and 

paralinguistic), annotations (moves and evaluative semantics), and the audio and video 

data. I used the multimodal annotation tool ELAN7 (EUDICO Linguistic Annotator) 

(Wittenburg et al. 2006) to accomplish this task. This tool enabled me to create as many 

layers or tiers (as the program calls them) as needed for the different types of 

transcriptions and annotations. I used ten tiers in this corpus: two for verbatim

transcriptions (discussant’s and presenter’s), two for linguistic evaluation (discussant’s

and presenter’s), one for moves, one for paralanguage, and four for kinesics (gesture,

head movement, gaze, and facial expression).

Figure 4. Sample view of multimodal annotation in ELAN.

[Figure labels: orthographic transcription; annotation of linguistic evaluation; annotation of generic moves; paralinguistic transcription; kinesic transcription; video viewer; time position viewer; waveform viewer; annotation density.]

Figure 4 shows a sample of multimodal annotation view in ELAN of a portion of a

Chemistry exchange. I have enlarged in the figure the four viewers that work in ELAN:

video, waveform, annotation density, and time position. All viewers are synchronised 

and thus displayed at the same point(s) in time. The first stage was to enter the plain

verbatim transcriptions and synchronise them with the audio and video data. Sound

waveforms were a useful aid at this point. Then, I annotated the moves and the linguistic

evaluation of presenter and discussant. Finally, the transcriptions of kinesics and

paralanguage were done on the basis of the semantic evaluation. Once all the data

had been entered, I could start the analysis with the aid of a search tool also available in

the program. Manual extraction of data was necessary in the qualitative approach of the

study. 
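
The idea of time-synchronised tiers can be made concrete with a small data structure. The Python sketch below is an illustration only and does not reproduce ELAN's own file format or search functions: the `Annotation` type, the tier names, the time values and the labels are all invented for the example. It shows how co-expression can be understood as overlap in time between an evaluative item on one tier and annotations on the kinesic and paralinguistic tiers.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    tier: str       # tier name (hypothetical labels)
    start_ms: int   # start time in milliseconds
    end_ms: int     # end time in milliseconds
    value: str      # tag or transcription text


def overlaps(a: Annotation, b: Annotation) -> bool:
    """True if the two time intervals overlap."""
    return a.start_ms < b.end_ms and b.start_ms < a.end_ms


def co_expressed(evaluation, annotations, tiers):
    """Annotations on the given tiers that overlap the evaluative item."""
    return [a for a in annotations
            if a.tier in tiers and overlaps(a, evaluation)]


# Invented time values and labels, for illustration only.
evaluation = Annotation("presenter evaluation", 1200, 1650, "ATTITUDE")
others = [
    Annotation("facial expression", 1300, 1414, "raising eyebrows"),
    Annotation("gesture", 1700, 2100, "CPU"),
    Annotation("paralanguage", 1250, 1600, "loudness-up"),
]

nonverbal_tiers = {"gesture", "head movement", "gaze",
                   "facial expression", "paralanguage"}
for a in co_expressed(evaluation, others, nonverbal_tiers):
    print(a.tier, a.value)
```

In this invented case the facial expression and the paralinguistic feature fall within the time span of the evaluative item, whereas the gesture starts after it and is therefore not counted as co-expressed.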

III. PEDAGOGICAL APPLICATIONS

As noted, the compilation of the corpus described in the previous section was done to 

study presenters’ multimodal expression of evaluation in DSs of two academic 

disciplines. Although the results of the study can find applications in English

for Academic Purposes courses that focus on communicative skills, the multimodal

annotated corpus itself can also be used as a pedagogical tool in the classroom, and as a

valuable source of instances for creating teaching and learning material that helps to

understand this academic research genre and the interpersonal feature that characterises it.

In this section, I make some suggestions about the pedagogical potential of the annotated

corpus, which, owing to the newness of the research, I have not yet had the opportunity to

put into practice.

ELAN offers many possibilities to retrieve multimodal data, which can be used in the 

classroom or in the design of activities. There are two ways to access the annotated

corpus. The first is to focus on the analysis of a single dialogic exchange and all the

aspects transcribed and annotated in it. That is, it could be interesting to show students

instances of:

- semantic evaluation 

- semantic evaluation + audio 

- semantic evaluation + audio + video 

- semantic evaluation and co-expression with kinesic and/or paralinguistic features 

- generic moves 

Figure 5 illustrates the exploration of a dialogic exchange from Chemistry. You can 

select from the list of the ten tiers the feature that you are interested in. In the example, I 

have selected “gesture” as one of the kinesic features. Once the selection is made, you

have access to a list of all the instances of gestures that co-express with semantic

evaluation. In the dialogic exchange below there are 13 instances. For the annotation, I have

used different tags to simplify the reference to the gestures. In the example, I have selected

“CPU”, which stands for “closing palms up”. Clicking on it (gesture Nr 2) gives

access to the video, audio, and annotation density where that gesture is performed.

Figure 5. Sample view of the exploration of a dialogic exchange in ELAN. 

The other way to retrieve data is to use the search tool. This allows me to focus on

one annotation (this is the general term used in the program, but it embraces both

annotations and transcriptions) and find all the instances of it that appear in the corpus. In

Figure 6, I illustrate the example of the move “OPT” (“opening the turn”), which is used in

the two corpora 14 times (6 in Chemistry and 8 in Linguistics). If I click on instance Nr

6, ELAN opens a new window to display the video, audio, and annotation density

viewer where this move is expressed in the exchange.

Figure 6. Sample view of searching an annotation in ELAN. 
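
The logic of such a search can be sketched outside ELAN as a simple filter over annotation records. The Python fragment below is a hypothetical illustration, not ELAN's search tool: the record layout, the `search` function and the toy data (three “OPT” moves rather than the fourteen found in the corpora) are assumptions made for the example.

```python
from collections import Counter

# Toy records: (sub-corpus, exchange id, tier, start in ms, value).
# All values are invented for illustration.
corpus = [
    ("Chemistry", "ex01", "moves", 0, "OPT"),
    ("Chemistry", "ex01", "gesture", 850, "CPU"),
    ("Chemistry", "ex02", "moves", 0, "OPT"),
    ("Linguistics", "ex01", "moves", 120, "OPT"),
]


def search(records, tier, value):
    """Return every record on `tier` whose annotation equals `value`."""
    return [r for r in records if r[2] == tier and r[4] == value]


hits = search(corpus, "moves", "OPT")
per_corpus = Counter(r[0] for r in hits)

print(len(hits))         # 3
print(dict(per_corpus))  # {'Chemistry': 2, 'Linguistics': 1}
```

Each hit keeps its exchange identifier and start time, which is the information needed to jump back to the corresponding point in the audio and video.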

The potential of these small corpora is significant. To mention a few figures, 521

evaluative utterances have been annotated (373 expressed by presenters and 188 by

discussants), in which the three appraisal categories have been identified

(attitude, graduation, and engagement). In addition, 276 kinesic features and 56

paralinguistic features were co-expressed with presenters’ semantic evaluation and

transcribed. Regarding the generic structure, 90 moves were annotated.

In this paper, I have described the aspects that need to be considered when compiling a

corpus of an interactive spoken academic genre for the study of evaluation. As shown, the use of

multimodal corpora represents a major breakthrough in the field of corpus linguistics

and academic spoken discourse analysis, since taking into account the multimodal

nature of oral communication provides a more comprehensive picture of the events. The

corpus linguistics techniques used here open a new line of research to explore academic 

spoken discourse and to provide multimodal material for teaching and learning English 

for Academic Purposes. 

Notes 

1 The work described in this paper was supported by Universitat Jaume I (Grant CONT/2010/08). 

2 For a detailed documentation of the MICASE transcription conventions, cf. the MICASE manual at

<http://micase.elicorpora.info/micase-statistics-and-transcription-conventions/micase-transcription-and-mark-up-convent> 6 November 2010.

3 <http://www.elicorpora.info/> 6 November 2010. 

4 The phase of the movement that is closer to the apex, the main part of the gesture, is called stroke. The 

phase of movement leading to the stroke is named the preparation. And the phase of movement that 

follows the stroke is referred to as the recovery or retraction. 

5 MICASE transcription conventions identify speakers as: speaker IDs assigned in the order they first 

speak (S1, S2, etc); unknown speaker, without and with gender identified (SU); probable but not definite 

identity of speaker (SU-1); two or more speakers, in unison (SS). 

6 The attitudinal system has to do with ‘evaluating’. Engagement has to do with the negotiation of other 

voices in the text apart from the authorial voice. The third dimension in the appraisal model is graduation. 

A distinctive feature of attitudes is that they can be gradable. 

7 <http://www.lat-mpi.eu/tools/elan/> 6 November 2010. 

REFERENCES 

Adolphs, S. and Carter, R. 2007. “Beyond the word. New challenges in analysing

corpora of spoken English”. European Journal of English Studies, 11 (2), 133-146.

Brown, P. and Levinson, S. 1987. Politeness: Some Universals in Language Usage. 

Cambridge: Cambridge University Press. 

Campoy, M. C. and Luzón, M. J. (Eds.). 2007. Spoken Corpora in Applied 

Linguistics. Bern: Peter Lang. 

Efron, D. 1941. Gesture and Environment. Morningside Heights: King’s Crown Press.

Ekman, P. and Friesen, W.V. 1969. “The repertoire of nonverbal behavioral 

categories: Origins, usage, and coding”. Semiotica, 1, 49-98. 

Flowerdew, J. 1992. “The language of definitions in science lectures”. Applied 

Linguistics, 13, 202-221. 

Fortanet-Gómez, I. and Querol-Julián, M. 2010. “The video corpus as a multimodal 

tool for teaching”. In Campoy, M. C., B. Bellés and Ll. Gea (Eds.) Corpus-based 

Approaches to English Language Teaching Corpus and Discourse. London & 

New York: Continuum, 261-270. 

Halliday, M.A.K. 1985. An Introduction to Functional Grammar. London: Arnold. 

Hood, S. and Forey, G. 2005. “Introducing a conference paper: Getting interpersonal 

with your audience”. Journal of English for Academic Purposes, 4, 291-306. 

Hyland, K. 2000. Disciplinary Discourses: Social Interactions in Academic Writing. 

London: Longman. 

Hyland, K. 2004. “Engagements and disciplinarity: The other side of evaluation”. In 

Del Lungo Camiciotti, G. and E. Tognini Bonelli (Eds.). Academic Discourse.

New Insights into Evaluation. Bern: Peter Lang, 13-30. 

Kendon, A. 1980. “Gesticulation and speech: Two aspects of the process of utterance”. In

Key, M. (Ed.). The Relationship of Verbal and Non-verbal Communication. The 

Hague: Mouton, 207-227. 

Kendon, A. 2004. Gesture. Visible Action as Utterance. Cambridge: Cambridge 

University Press. 

Kress, G. and van Leeuwen, T. 2001. Multimodal Discourse. The Modes and Media of 

Contemporary Communication. London: Edward Arnold. 

Martin, J.R. and White, P. 2005. The Language of Evaluation: Appraisal in English. 

London: Palgrave Macmillan. 

McNeill, D. 1992. Hand and Mind: What Gestures Reveal about Thought. Chicago & 

London: The University of Chicago Press. 

Olsen, L. and Huckin, T. 1991. “Point-driven understanding in engineering lecture

comprehension”. English for Specific Purposes, 9, 33-47. 

Poos, D. and Simpson, R.C. 2002. “Cross-disciplinary comparisons of hedging: some 

findings from the Michigan Corpus of Academic Spoken English”. In Reppen, 

R., S. Fitzmaurice and D. Biber (Eds.). Using Corpora to Explore Linguistic 

Variation. Philadelphia: John Benjamins, 3–21. 

Poyatos, F. 2002. Nonverbal Communication across Disciplines. Volume II. 

Paralanguage, Kinesics, Silence, Personal and Environmental Interaction. 

Amsterdam: John Benjamins. 

Ruiz-Madrid, N. and Querol-Julián, M. 2008. GRAPE Online Activities for 

Academic English. 6 November 2010 <http://www.grape.uji.es/activities/ 

pagina%201/index.html> 

Schegloff, E.A. and Sacks, H. 1973. “Opening up closings”. Semiotica, 8, 289-327. 

Sinclair, J., Forsyth, I.M., Coulthard, R.M. and Ashby, M. 1972. The English Used by

Teachers and Pupils. Final report to SSRC. University of Birmingham. 

Swales, J.M. 1990. Genre Analysis: English in Academic and Research Settings. 

Cambridge: Cambridge University Press. 

Thompson, P. 2005. “Spoken language corpora”. In Wynne, M. (Ed.). Developing 

Linguistic Corpora: A Guide to Good Practice. Oxford: Oxbow Books, 59-70. 6 

November 2010 <http://www.ahds.ac.uk/creating/guides/linguistic-

corpora/index.htm> 

Ventola, E., Shalom, C. and Thompson, S. (Eds.). 2002. The Language of

Conferencing. Frankfurt: Peter Lang.

Wittenburg, P., Brugman, H., Russel, A., Klassmann, A. and Sloetjes, H. 2006. 

“ELAN: A professional framework for multimodality research”. Proceedings of 

Language Resources and Evaluation Conference. 6 November 2010 

<http://www.mpi.nl/publications/escidoc-60436/@@popup> 

Wulff, S., Swales, J.M. and Keller, K. 2009. “‘We have seven minutes for questions’: 

The discussion sessions from a specialized conference”. English for Specific 

Purposes, 28, 79-92. 

Received September 2010 

Cite this article as: 

Querol-Julián, M. 2010. “Multimodality in discussion sessions: corpus compilation and pedagogical
use”. Language Value, 2 (1), 1-26. Jaume I University ePress: Castelló, Spain.
http://www.e-revistes.uji.es/languagevalue.

ISSN 1989-7103 

Articles are copyrighted by their respective authors 
