
Studies in Second Language Learning and Teaching
Department of English Studies, Faculty of Pedagogy and Fine Arts, Adam Mickiewicz University, Kalisz

SSLLT 11 (3). 2021. 445-472
http://dx.doi.org/10.14746/ssllt.2021.11.3.7

http://pressto.amu.edu.pl/index.php/ssllt

Chinese secondary school teachers’ conceptions of
L2 assessment: A mixed-methods study

Maggie Ma
The Hang Seng University of Hong Kong, China

https://orcid.org/0000-0002-9805-5100
maggiema@hsu.edu.hk

Gavin Bui
The Hang Seng University of Hong Kong, China

https://orcid.org/0000-0002-1567-9074
gavinbui@hsu.edu.hk

Abstract
Teacher conceptions of assessment influence their implementation of learn-
ing-focused assessment initiatives as advocated in many educational policy
documents. This mixed-methods study investigated Chinese secondary school
teachers’ conceptions of L2 assessment in the context of an exam-oriented
educational system which emphasizes English grammar, vocabulary and read-
ing comprehension skills. For the quantitative part of the study, survey data
were collected to gauge the conceptions of assessment held by 66 senior sec-
ondary EFL teachers from six schools in Eastern China. For the qualitative part,
case studies of two teachers from schools with different rankings were con-
ducted. Quantitative results showed that the teacher participants as a group
agreed most with the view that assessment is to help learning. However, there
was a strong association between two factors, that is, the assessment as ac-
curate for examination and teacher/school control factor, and the assessment
as accurate for student development factor. The strong association indicated
that it may be less likely for the group of teachers to adopt the formative
assessment initiatives emphasizing student development as promoted in
the English curriculum reform. Qualitative findings further revealed individ-
ual differences in the two case study teachers’ conceptions and practices of
assessment as well as the interplay among meso-level (e.g., school factor),
micro-level (e.g., student factor), and macro-level (e.g., sociocultural and pol-
icy contexts) factors in shaping the teachers’ different conceptions and prac-
tices of assessment. A situated approach has been proposed to enhance
teachers’ assessment literacy.

Keywords: Chinese EFL teachers; teachers’ conceptions of assessment; assess-
ment practices

1. Introduction

Assessment plays an important role in students’ learning. In recent years,
many countries, including China, have witnessed the promotion of formative
assessment (Berry & Adamson, 2011; Kennedy & Lee, 2008), which originated in
England in response to the negative influence of high-stakes national testing
(Stobart, 2006). The success of assessment innovations such as formative assessment
relies heavily on teachers, who are the key agents in educational assessment (Xu &
Brown, 2016). In particular, teacher beliefs regarding assessment may influence
how they respond to learning-focused assessment and the success of its
implementation (Brown et al., 2011). If teachers do not believe in a proposed
assessment innovation, this may constitute an obstacle to its success and call for
extensive assessment training. In countries with an exam-oriented educational
system, it is thus crucial to understand teachers’ views of assessment, both for the
success of policy initiatives and for teachers’ professional development.

This paper explores Chinese secondary EFL teachers’ conceptions of as-
sessment, defined as “a teachers’ understanding of the nature and purpose of
how students’ learning is examined, tested, evaluated or assessed” (Brown &
Gao, 2015, p. 4). Conceptions are examined because they exert a major influence on
how teachers perceive, respond to and interact with their teaching environment
(Marton, 1981). Acknowledging that teacher conceptions of assessment are
ecologically rational, previous research has investigated these conceptions in
different contexts and resorted to macro-level factors (i.e., social and cultural
factors) for an explanation (e.g., Brown et al., 2011; Brown & Michaelides, 2011;
Teng & Bui, 2020). However, little research has examined the influence
of meso-level (e.g., school factors) and micro-level (e.g., characteristics of
individual teachers) factors on teacher conceptions of assessment and their in-
terplay with macro-level factors, particularly in the case of nationally advocated
formative assessment innovation in exam-oriented educational contexts. Given
that different levels of factors may shape teachers’ assessment knowledge, beliefs,
and practices (Fulmer et al., 2015), it is important to explore how these factors
affect teacher conceptions of assessment to shed light on the successful imple-
mentation of formative assessment and assessment training. To address the re-
search gap, this study adopted a mixed-methods approach to examining Chinese
secondary EFL teachers’ conceptions of assessment and different layers of factors
that shaped such conceptions at a time when the recent English language curriculum
reform has foregrounded the importance of formative assessment in the context of
an exam-oriented educational system, which emphasizes English grammar, vocab-
ulary and reading comprehension skills (Hao & Otani, 2016). The findings of the
research may provide insights into the facilitation of the implementation of Eng-
lish education assessment initiatives and EFL teachers’ professional development.

2. Literature review

2.1. Teachers’ conceptions of assessment

Teachers hold beliefs about particular things (Pajares, 1992) and use their beliefs
to filter new information, frame problem spaces, and guide actions (Fives & Buehl,
2012). In the context of assessment, teachers’ beliefs about the nature and pur-
poses of assessment, that is, their conceptions of assessment, may influence their
assessment practices and create a lens through which they respond to curriculum
and assessment reforms. For example, in societies with an exam-oriented educa-
tional system, teachers may hold the belief that a powerful way to improve stu-
dent learning is to examine them, and they may be less likely to adopt formative
assessment initiatives in educational reforms (Brown et al., 2011).

Research on teachers’ conceptions of assessment, conducted extensively
by Brown and his colleagues, has identified four major purposes of assessment
based on the Teacher Conceptions of Assessment (TcoA) inventory (Brown, 2004,
2011; Brown et al., 2011; Brown & Michaelides, 2011). These purposes include:
(1) assessment as improvement of teaching and learning (improvement); (2) as-
sessment as making schools and teachers accountable for their effectiveness
(school accountability); (3) assessment as making students accountable for their
learning (student accountability); and (4) assessment as fundamentally irrele-
vant to the work and life of teachers and students (irrelevance). The first three
are categorized as “purposes” while the last one is termed an “anti-purpose.”
When the school and student accountability views of assessment are grouped
together, it seems that there are two major purposes of assessment in society,
that is, accountability and improvement (Brown & Gao, 2015). This illustrates the
dual functions of assessment and the potential tension that may arise from these
two functions (Brown et al., 2011). On the one hand, assessments are utilized to
evaluate the effectiveness of teachers and schools and to certify the learning of stu-
dents (i.e., the measuring and evaluative functions of assessment), but on the other
hand, assessments are employed to inform different stakeholders (e.g., parents,
teachers, students, governments, administrators) of learning progress and to en-
hance teaching and learning (i.e., the formative function of assessment).

Survey research using the TcoA has been conducted to explore teacher
conceptions of assessment. Teachers strongly endorsed the notion of using as-
sessment to improve teaching and learning. For example, secondary school
teachers in New Zealand and teachers in Cyprus agreed most strongly with the
view that assessment is used to improve learning (Brown, 2011; Brown & Mich-
aelides, 2011). While they still agreed with using assessment to evaluate stu-
dents, they viewed assessment as evaluating schools in a relatively negative light
(Brown, 2011; Brown & Michaelides, 2011). Teachers rejected the conception
that assessment is irrelevant. Assessment is important no matter whether it is
used for improving teaching and learning or for evaluation (Brown & Gao, 2015).
Research has also shown that for primary and secondary school teachers in New
Zealand, there was a negative correlation between improvement and irrele-
vance, and a weak correlation between improvement and using assessment to
evaluate students (Brown, 2004, 2011). New Zealand primary school teachers
tended to associate improvement with school accountability and to moderately
relate student accountability with irrelevance (Brown, 2004). In short, the afore-
mentioned studies explored both the strength of agreement for the main con-
ceptions of assessment held by teacher participants and the interrelation be-
tween them, which provided insights into teachers’ conceptions of assessment.
The current study also investigated these two issues related to Chinese EFL
teachers’ conceptions of assessment.

2.2. Chinese teachers’ conceptions of assessment

The TcoA inventory has been applied to gauge Chinese teachers’ conceptions of
assessment. Since the four-factor framework could not capture the various con-
ceptions held by Chinese teachers, Brown et al. (2011) created a TcoA inventory
for Chinese contexts (C-TcoA) based on data collected from 1,014 primary and
secondary school teachers in Hong Kong and 898 primary and secondary school
teachers in Guangzhou. Three major interrelated factors have been identified based
on teacher responses to a 6-point positively packed agreement rating scale (i.e.,
two negative and four positive rating points for each survey item to elicit variance
in response to socially accepted statements, including strongly disagree, mostly
disagree, slightly agree, moderately agree, mostly agree, and strongly agree).
These three major factors include improvement, accountability, and irrelevance.
The improvement factor encompasses three sub-factors, that is, assessment is for
student development, assessment is for helping students learn, and assessment
results should be accurate. The accountability factor also consists of three sub-
factors, that is, taking into account measurement errors in assessment use, using
assessment to control teachers and evaluate schools, and utilizing examination as
assessment. The irrelevance factor refers to the negative aspects of assessment.
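
The positively packed rating scale described above can be illustrated with a small sketch. The six labels are taken from the text; the numeric scoring (1 to 6), the function name, and the example responses are assumptions made purely for illustration and are not drawn from Brown et al. (2011).

```python
from statistics import mean

# Labels as listed in the text, ordered from most negative to most positive.
# Two points are negative ("disagree") and four are positive ("agree").
SCALE = [
    "strongly disagree",
    "mostly disagree",
    "slightly agree",
    "moderately agree",
    "mostly agree",
    "strongly agree",
]

# Assumed scoring: 1 for the most negative label up to 6 for the most positive.
SCORES = {label: score for score, label in enumerate(SCALE, start=1)}


def factor_mean(responses):
    """Mean score of one teacher's responses to the items of a single factor."""
    return mean(SCORES[response] for response in responses)


# Hypothetical responses of one teacher to three items of one factor.
example = ["slightly agree", "mostly agree", "strongly agree"]
print(factor_mean(example))
```

Because four of the six points express agreement, responses to socially accepted statements can still spread across several positive values instead of clustering on a single "agree" option.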

Brown and Gao (2015) proposed a model of Chinese conceptions of as-
sessment based on collaborative research between them and graduate student
theses written under the supervision of Gao. The model includes six major con-
ceptions, ranging from a more external management and control perspective to
a more individualistic developmental view of assessment, in addition to a more
negative view of assessment. These conceptions include management and in-
spection (i.e., using assessment to inspect and control schools, teachers, and
students for better teaching and achievement); institutional targets (i.e., using
assessment to check if students have achieved pre-set learning standards as in-
stantiated in public examinations); facilitation and diagnosis (i.e., using assess-
ment to provide valid information for the diagnosis and facilitation of teaching
effectiveness); ability development (i.e., using assessment to increase students’
motivation and learning abilities); personal quality (i.e., using assessment to en-
hance the overall quality of students); and negativity (i.e., assessment exerts a
negative influence on teaching and learning).

Research on the C-TcoA has shown that Chinese teachers agreed most with
the conception that assessment is needed for improvement (Brown et al., 2011;
Chen & Brown, 2015). In Chen and Brown’s (2015) study involving 1,500 Chinese
teachers from primary, middle, and high schools, “assessment is for student
development” was the second most endorsed view, after “assessment as teacher improvement.”
A strong positive association was identified between assessment as improvement
and assessment as accountability (Brown et al., 2011), indicating that teachers con-
sidered that examining students facilitated their learning. In Brown et al.’s (2011)
study, a positive correlation was also found between assessment for accountability
and irrelevance. In Chen and Brown’s (2015) study, a moderately strong connection
was found between school accountability and student development.

Despite the research on Chinese teachers’ conceptions of assessment men-
tioned earlier (Brown et al., 2011; Chen & Brown, 2015), there is limited research
on Chinese EFL (English as a foreign language) teachers’ conceptions of assessment
in the context of nationally mandated formative assessment innovation. Using the
C-TcoA and assessment practices inventory (Zhang & Burry-Stock, 2003), Gan et al.
(2018) probed into 107 Chinese secondary EFL teachers’ conceptions and practices
of assessment. Four main conceptions of assessment were identified, including
“help learning,” “student development,” “teacher/student accountability,” and
“examination and school accountability.” Like the teachers in other studies
(Brown, 2011; Brown & Michaelides, 2011), the Chinese EFL teachers agreed
most with the view that assessment helps students improve their learning. The
second most endorsed view was “assessment as examination and school account-
ability.” A moderately strong correlation was identified between the “help learn-
ing” factor and the “student development” factor, and between the “teacher/stu-
dent accountability” factor and the “examination and school accountability” fac-
tor. The “teacher/student accountability” factor was found to be weakly corre-
lated to the “help learning” factor and the “student development” factor, respec-
tively. A weak correlation was also identified between the “examination and
school accountability” factor and the “student development” factor, while a me-
dium level of correlation was found between the “examination and school ac-
countability” factor and the “help learning” factor.

Gan et al.’s (2018) research also examined Chinese secondary EFL teachers’
assessment practices. The teachers reported using different assessment practices
frequently, including aligning teaching and assessment (e.g., matching assessment
with instruction), using assessments for improvement (e.g., using assessment re-
sults when planning teaching), using traditional assessments (e.g., using multiple
choice questions to assess students), sharing assessment criteria (e.g., communi-
cating assessment criteria to students in advance), and providing oral feedback.
However, the teachers seemed to only occasionally use student-centered assess-
ments, such as self or peer assessment, a phenomenon also identified in other EFL
contexts (e.g., Bui & Kong, 2019). The most frequently adopted assessment prac-
tice, aligning teaching and assessment, was associated with both the “help learn-
ing” factor and the “student development” factor, but not the “teacher/student
accountability” factor, indicating that the teacher participants implemented
assessment-for-learning principles to some extent. Student-centered assessments were
the only type of assessment that had no systematic relationship with the four main
conceptions of assessment identified in Gan et al.’s (2018) study.

2.3. Factors affecting Chinese teachers’ conceptions of assessment

Previous research utilizing C-TcoA has explained the teacher participants’ con-
ceptions of assessment through the influence of sociocultural and policy con-
texts. Chinese sociocultural values attach great importance to performance in
public examinations, which informs decision-making regarding the selection of
students for opportunities for better education (He et al., 2011). Public examina-
tion results are used to evaluate not only students, but also teachers and schools
(Brown et al., 2011). At the same time, a person’s academic achievement is also
associated with beliefs about personal worth and virtue (China Civilization Cen-
tre, 2007). Therefore, helping students achieve higher scores in public examina-
tions not only contributes to their knowledge and performance, but also makes
them better people (Brown et al., 2011). At the policy level, the current curricu-
lum reform in China emphasizes an assessment reform, advocating the use of
formative assessment in English language education to promote students’ ho-
listic development (Chinese Ministry of Education, 2017). According to Brown
and Gao (2015), the assessment context seems to pull teachers towards differ-
ent ends, that is, summative assessment emphasizing performance, and forma-
tive assessment emphasizing learning improvement.

Research has also shown that teacher characteristics (i.e., sex and teaching
experience) may influence Chinese teachers’ conceptions of assessment. For exam-
ple, probably because more males assume the role of school leaders in Chinese
schools (Brown & Gao, 2015), male teachers agreed more strongly with the man-
agement and inspection conception and the institutional targets conception (South
China Normal University Team, 2010). Highly experienced teachers were found to
agree more strongly with the management and inspection conception and the in-
stitutional targets conception, and to agree less with the personal quality concep-
tion and the facilitation and diagnosis conception (Brown & Gao, 2015).

Work environments constitute another source of influence. Teachers in
senior secondary schools, who face the greatest pressure to prepare students
to perform well in public examinations, agreed most with the irrelevance, man-
agement and inspection, as well as institutional targets conceptions, but agreed
least with the personal quality conception (Wang, 2010). Teachers in the final
year of senior secondary school agreed most with the personal quality conception,
and those in higher ranking/banding schools agreed more with the personal quality
conception as well (Shang, 2007).

As can be seen from the literature review, research employing the TcoA
has mainly adopted a quantitative approach to investigating conceptions of as-
sessment held by teachers in different regions and countries (e.g., Brown, 2004,
2011; Brown & Michaelides, 2011; Chen & Brown, 2015; Gan et al., 2018), with
the results being explained by sociocultural and policy contexts. Quantitative
studies on factors affecting Chinese teachers’ conceptions of assessment have
also focused on particular categories of factors such as teacher characteristics
and work environments (Shang, 2007; South China Normal University Team,
2010; Wang, 2010). The aforementioned research has contributed greatly to the
understanding of teachers’ views of assessment and factors affecting them.
However, quantitative research can only reveal a general picture of teachers’
conceptions of assessment without providing an in-depth understanding of the
interaction among global and local factors in shaping individual teachers’ views
and related practices of assessment. From an ecological perspective, teachers’ as-
sessment views and practices are influenced by three distinct but interacting lev-
els of contextual factors, including macro-level factors (e.g., national and cultural
influences), meso-level factors (e.g., school factors and expectations of parents
and the immediate community), and micro-level factors (e.g., factors related to
the classroom, students, and teachers), among which meso-level factors deserve
more attention (Fulmer et al., 2015). To understand teachers’ conceptions and
practices of assessment in detail and in context, it seems that qualitative data
should be utilized as well. This study utilized both quantitative and qualitative data
for a more refined and contextualized understanding of Chinese teachers’ con-
ceptions of assessment in the context of the recent English language curriculum
reform, which emphasizes formative assessment initiatives. If teachers do not en-
dorse the view that assessment can be used to promote teaching and learning, as
advocated in the education reform, then the proposed new form of assessment
is unlikely to be successful. Sustainable assessment training programs are also
needed to keep in-service teachers informed of assessment principles (Xu &
Brown, 2017). However, attempting to change teachers’ behaviors only (e.g., in-
creasing formative assessment practices) without taking into consideration their
existing beliefs is likely to fail (Brown & Gao, 2015). It is thus crucial to understand
how Chinese EFL teachers conceive of assessment and factors affecting their con-
ceptions both for the success of policy initiatives and teachers’ professional de-
velopment. Inspired by the research gaps identified in the literature review, this
paper seeks to answer the following research questions:

RQ1. What were the overall conceptions of assessment among the Chi-
nese EFL teachers in the study, and what, if any, relations emerged
among those conceptions?

RQ2. What was the impact of teaching experience and school banding on
the teacher participants’ conceptions of assessment?

RQ3. What were the individual teacher participants’ conceptions and
practices of assessment and what were the factors affecting them?

3. Methods

3.1. Research design

This study adopted a mixed-methods approach that involved both quantitative
and qualitative data. An explanatory sequential mixed methods design (Cre-
swell, 2014) was utilized. Quantitative data were collected first, followed by a
qualitative phase of the study. The quantitative results informed the selection
of participants in the qualitative phase, with the qualitative data expected to
provide more depth and insights into the quantitative results of the study. To
answer RQ1, the 31-item Chinese teachers’ conceptions of assessment (C-TcoA)
questionnaire was used to collect quantitative data to obtain a general picture
of the teacher participants’ views of assessment. As previous research identified
the interrelationship among Chinese teachers’ different conceptions of assess-
ment (Brown et al., 2011; Gan et al., 2018), this study also aimed to examine
whether the teacher participants’ various views of assessment were potentially
interrelated. To answer RQ2, the same set of quantitative data was utilized to
ascertain the potential influence of teaching experience and school banding on
the participants’ conceptions of assessment, given that research has identified
the influence of teacher characteristics (i.e., sex and teaching experience) and
work environment (e.g., school banding) (Brown & Gao, 2015; Shang, 2007;
Wang, 2010). We thus focused particularly on the two variables of teaching ex-
perience and school banding to identify their potential influence. Due to the
very small number of male teachers in the study (i.e., 7 out of 66), the influence
of sex on teacher conceptions of assessment was not investigated. Although the
answer to RQ2 can shed light on the potential influence of micro-level factors
(i.e., teaching experience as one teacher factor) and of meso-level factors (i.e.,
school banding as one school factor) on teachers’ conceptions of assessment,
in-depth qualitative data were needed to add to the quantitative data by exem-
plifying the potential interaction among macro-level, meso-level, and micro-
level factors. Therefore, based on the findings of the first two research questions
(i.e., the influence of school banding on the teacher participants’ conceptions
of using assessment to promote learning; see the section on results), two
teachers from schools with different bandings were selected. Case studies of
these teachers were conducted for RQ3 to understand their conceptions and
practices of assessment in context and the different layers of shaping influences
on them. In short, the mixed-methods approach allowed the investigation of a
general tendency among a particular group of teachers and a contextualized un-
derstanding of individual teachers’ assessment conceptions and practices.

3.2. Participants

For the quantitative part of the study, a purposive sample of 66 Chinese EFL
teachers from six senior secondary schools in a city in Eastern China participated
in the C-TcoA survey. These six schools were purposively selected based on two
criteria. First, the schools represented different school bandings, including mu-
nicipal-level key schools, district-level key schools, and general high schools. Sec-
ondary schools in China are categorized into those that enjoy higher banding or
reputation (i.e., key schools) and those that are not as reputable (i.e., non-key
schools or general high schools) (Yu et al., 2016). Among the key schools, there is
also a distinction between municipal-level key schools and district-level key
schools, with the former being more prestigious than the latter. Second, the
schools were known to the researchers. In this study, schools known to the
researchers tended to be more supportive of the research project than
schools recruited through random sampling would likely have been. Random sampling
may be a relatively ineffective sampling strategy in Chinese school contexts (Brown
et al., 2011). Table 1 shows the background information of the teacher participants.

Table 1 Background information of the teacher participants

Participants’ background            Number (%)
Sex
    Female                          59 (89.4%)
    Male                             7 (10.6%)
Educational background
    Bachelor’s degree               41 (62.1%)
    Master’s degree                 22 (33.3%)
    Not given                        3 (4.6%)
Teaching experience
    1-4 years                       15 (22.7%)
    5-18 years                      18 (27.3%)
    19-23 years                     13 (19.7%)
    Over 24 years                   13 (19.7%)
    Not given                        7 (10.6%)
School banding
    General high schools            21 (31.8%)
    District-level key schools      19 (28.8%)
    Municipal-level key schools     25 (37.9%)
    Not given                        1 (1.5%)

The qualitative part of the study involved case studies of two purposefully
selected teacher participants. A strength of case study is its capacity to provide an
in-depth and contextualized understanding of contemporary real-life phenomena
(Creswell, 2013). The teachers were chosen based on the following criteria: (1)
they worked in schools with different bandings; (2) they were enthusiastic about
and supportive of the research. As the quantitative analysis revealed that school
banding (i.e., municipal-level key school vs. district-level key school) exerted an
influence on teachers’ conception of using assessment to promote learning (see
the section on results), school banding was used as one of the criteria for case
selection. Teacher A, a female teacher with 29 years of teaching experience, came
from a municipal-level key school. Teacher B, a female teacher with 15 years of
teaching experience at the time of study, came from a district-level key school.



3.3. Data collection and analysis

The quantitative data were mainly collected through the 31-item Chinese teach-
ers’ conceptions of assessment (C-TcoA) questionnaire (Brown et al., 2011),
which helped to gauge the EFL teacher participants’ conceptions of assessment.
The C-TcoA elicited teachers’ self-ratings for the following conceptions of assess-
ment: (1) assessment helps teaching and learning; (2) assessment promotes stu-
dents’ development; (3) assessments are accurate; (4) assessment involves ex-
aminations; (5) measurement errors should be taken into consideration in as-
sessment use; (6) assessment is used to control teachers and evaluate schools;
and (7) assessments are irrelevant.

Confirmatory factor analysis was employed to determine if the EFL
teacher participants’ responses fitted the factor model identified by Brown et
al. (2011) (χ²/df = 1.70, RMSEA = 0.10, RMR = 0.11, CFI = 0.94). As the RMSEA (see
footnote 1) and RMR exceeded the cut-offs of .08 and .05, respectively, exploratory
factor analysis (EFA) was utilized to develop an alternative model. Prior to
performing EFA, the suitability of the data for factor analysis was assessed. The
Kaiser-Meyer-Olkin value was .68 and Bartlett’s test of sphericity reached
statistical significance (approximate χ² = 725.27, df = 231, p < .001), supporting
the factorability of the correlation matrix. Varimax rotation was used for EFA.
After EFA, inter-factor correlations were calculated to explore the potential
relationships among the factors. As the data were not normally distributed, the
Kruskal-Wallis test was used to examine the influence of (a) teaching experience
(1-4 years, N = 15; 5-18 years, N = 18; 19-23 years, N = 13; over 24 years, N = 13)
and (b) school banding (general high school, N = 21; district-level key school,
N = 19; municipal-level key school, N = 25). A Bonferroni correction was applied
given that we ran two Kruskal-Wallis tests. Therefore, the threshold for the p value
was set at 0.05/2 = 0.025.
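
The Kruskal-Wallis step with a Bonferroni-adjusted threshold can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis script: the scores are made-up numbers, all function names are hypothetical, and the p value relies on the chi-square approximation, implemented here only for 2 or 3 degrees of freedom (three or four groups, as in the school banding and teaching experience comparisons).

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, df) for a list of groups, using the usual tie correction."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction, len(groups) - 1


def chi2_sf(x, df):
    """Upper-tail chi-square probability (closed forms for df = 2 and 3 only)."""
    if df == 2:
        return math.exp(-x / 2)
    if df == 3:
        return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    raise ValueError("this sketch only covers df = 2 or df = 3")


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


# Made-up factor scores for three hypothetical school bandings.
scores = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h, df = kruskal_wallis(scores)
p = chi2_sf(h, df)
threshold = bonferroni_threshold(0.05, 2)  # two tests were run, so 0.025
print(h, df, p, p < threshold)
```

With these made-up scores, H is 7.2 and p is about .027, which would pass an uncorrected .05 threshold but not the corrected threshold of .025. This shows why the correction matters when the same data set is tested twice.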

For the qualitative part of the study, two semi-structured interviews were
conducted with two purposefully selected teachers to obtain a contextualized
understanding of their conceptions and practices of assessment. The interviews were
conducted in Chinese, the teachers’ native language, but they were allowed to switch
between Chinese and English whenever necessary for the sake of a clear expression
of meaning. Each interview was audio recorded and lasted for about 45 minutes.

1 We decided to follow the guidelines endorsed in Brown (2015). That is, RMSEA values less than
0.05 suggest a good model fit; RMSEA values less than 0.08 suggest adequate model fit; RMSEAs
in the range of 0.08-0.1 suggest a mediocre fit; and models with RMSEA value ≥ 0.1 should be
rejected. Therefore, the RMSEA value of 0.10 in this study suggests an unsatisfactory model fit.
The full results of RMSEA with the 90% CI statistics will be provided upon reader request.

To analyze the interview data, we employed a qualitative data analysis scheme
including data reduction, data display, and conclusion drawing and verification (Miles
et al., 2014). The interview data were transcribed verbatim and checked for
accuracy. Data reduction was performed by treating a paragraph as a unit of
coding and focusing on information reflecting the interviewees’ conceptions and
practices of assessment and factors affecting them. We used Brown and Gao’s
(2015) model of Chinese teachers’ conceptions of assessment (i.e., manage-
ment and inspection, institutional targets, facilitation and diagnosis, ability de-
velopment, personal quality, and negativity) to code information related to con-
ceptions of assessment. For example, the code “institutional targets” was as-
signed to the following data: “In my school, we mainly use tests to measure stu-
dents’ performance. The final grade is based on the average of students’ test
results.” Regarding the coding of assessment practices, we utilized the six types
of classroom assessment practices adopted by Chinese EFL teachers (Gan et al.,
2018) as an analytical framework, which included aligning teaching and assess-
ment, using assessments for improvement, using traditional assessments, shar-
ing assessment criteria, providing oral feedback, and student-centered assess-
ments. For instance, the code “using traditional assessments” was assigned to
the following data: “Tests are conducted weekly, monthly, mid- and final-term.
After  test-taking  drills  and  my  explanation  of  the  answers  to  the  test,  there  is
not much time left.” We also coded information regarding the factors affecting
the participants’ conceptions or practices of assessment. For example, the code
“influence of college entrance examination” was assigned to the following data:
“If the college entrance examination is still used and if the English test paper is
still so difficult, it is quite impossible to change the current situation.” During
data analysis, we also remained open to new codes. The relationships between
different codes were examined to develop emerging themes, such as the influ-
ence of college entrance examination on the use of traditional assessments.
Case narratives were also developed for the teachers. Cross-case comparisons
were conducted, with similarities and differences between cases identified and
analyzed using matrixes. Conclusions about the teacher participants’ concep-
tions and practices of assessment as well as factors affecting them were drawn
and verified through member-checking.
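
The coding scheme described above can be pictured as a small codebook with three code families. The following is only an illustrative sketch: the dictionary layout, the function names `add_code` and `code_segment`, and the emergent code "school banding" are hypothetical and are not taken from the authors' analysis procedure. The code labels and the interview excerpts are those quoted in the paragraph above.

```python
from collections import Counter

# Code families named in the text; the "factors" family starts with the one
# example given there and is meant to grow as new codes emerge.
CODEBOOK = {
    "conceptions": [  # Brown and Gao's (2015) model
        "management and inspection",
        "institutional targets",
        "facilitation and diagnosis",
        "ability development",
        "personal quality",
        "negativity",
    ],
    "practices": [  # Gan et al.'s (2018) six types of classroom assessment practices
        "aligning teaching and assessment",
        "using assessments for improvement",
        "using traditional assessments",
        "sharing assessment criteria",
        "providing oral feedback",
        "student-centered assessments",
    ],
    "factors": ["influence of college entrance examination"],
}


def add_code(family, code):
    """Register an emergent code in a family if it is not already present."""
    if code not in CODEBOOK[family]:
        CODEBOOK[family].append(code)


def code_segment(text, family, code):
    """Attach one code to one unit of coding (a paragraph of transcript)."""
    if code not in CODEBOOK[family]:
        raise ValueError(f"unknown code {code!r} in family {family!r}")
    return {"text": text, "family": family, "code": code}


segments = [
    code_segment(
        "In my school, we mainly use tests to measure students' performance.",
        "conceptions",
        "institutional targets",
    ),
    code_segment(
        "Tests are conducted weekly, monthly, mid- and final-term.",
        "practices",
        "using traditional assessments",
    ),
    code_segment(
        "If the college entrance examination is still used and if the English "
        "test paper is still so difficult, it is quite impossible to change "
        "the current situation.",
        "factors",
        "influence of college entrance examination",
    ),
]

# Frequency of each code across the coded segments.
code_counts = Counter(segment["code"] for segment in segments)
```

Keeping the families separate mirrors the distinction drawn in the text between conceptions, practices, and the factors affecting them, while `add_code` reflects the stated openness to new codes.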

To ensure the reliability and trustworthiness of data analysis, the two au-
thors independently coded all the qualitative data and the inter-coder reliability
reached 85%. They then discussed and resolved disagreements in coding. After a
second round of coding, the inter-coder reliability reached 92%. Member-check
interviews were also conducted to elicit the teachers’ opinions on our interpre-
tations of interview data.
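
The text does not state which index of inter-coder reliability was used. Assuming simple percent agreement (the share of coding units that both coders labeled identically), a minimal sketch looks as follows. The function name and the two lists of codes are hypothetical and were constructed only so that the result equals 85%.

```python
def percent_agreement(codes_a, codes_b):
    """Percentage of coding units to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same units")
    if not codes_a:
        raise ValueError("at least one coding unit is required")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical first-round codes for 20 paragraphs (not the study's data).
coder_a = ["institutional targets"] * 10 + ["using traditional assessments"] * 10
coder_b = list(coder_a)
# The second coder disagrees on three of the 20 paragraphs.
coder_b[0] = "management and inspection"
coder_b[5] = "management and inspection"
coder_b[12] = "providing oral feedback"

first_round = percent_agreement(coder_a, coder_b)  # 17 of 20 units match
```

Percent agreement does not correct for chance agreement; a chance-corrected index would be an alternative if the category distribution were very uneven.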


4. Results

4.1. Teachers’ conceptions of assessment: A general picture

RQ1 addressed the Chinese EFL teachers’ conceptions of assessment and the interre-
lationship, if any, among the assessment conceptions. The revised C-TcoA model con-
tained five inter-correlated factors (Table 2). Factor 1 (i.e., help learning), comprising 3
items, showed that assessment helps students to learn. Factor 2 (i.e., student/teacher
accountability), containing 4 items, showed that teachers and students should be held
accountable for teaching and learning. Factor 3 (i.e., assessment as accurate for student
development), containing 5 items, identified assessment for student development.
Factor 4 (i.e., assessment as accurate for examination and teacher/school control),
containing 6 items, showed that assessment is used to prepare students for
examinations and to control teachers and schools. Factor 5 (i.e., irrelevance), comprising
4 items, showed that assessment is irrelevant. Two of the factors identified by Brown
et al. (2011) (i.e., help learning and irrelevance) were confirmed in the study.

Table 2 C-TcoA factors, items, and factor loadings based on exploratory factor analysis

Scale and items Factor loading
Help learning

1. Assessment helps students improve their learning. .89
2. Assessment determines if students meet qualification standards. .88
3. Assessment information modifies ongoing teaching of students. .86

Student/teacher accountability
22. Assessment sets the schedule or timetable for classes. .62
23. Assessment helps students gain good scores in examinations. .82
24. Assessment selects students for future education or employment opportunities. .80
25. Assessment results contribute to teachers’ appraisals. .71

Assessment as accurate for student development
4. Assessment results are sufficiently accurate. .51
9. Assessment helps students succeed in authentic/real-world experiences. .74
10. Assessment is used to provoke students to be interested in learning. .77
11. Assessment cultivates students’ positive attitudes towards life. .67
13. Assessment stimulates students to think. .67

Assessment as accurate for examinations and teacher/school control
8. Assessment results can be depended on. .56
14. Assessment is assigning a grade or level to student work. .67
19. Assessment teaches examination-taking techniques. .68
26. Assessment helps students avoid failures on examinations. .61
6. Assessment is used by school leaders to police what teachers do. .68
30. Assessment is an accurate indicator of a school’s quality. .45

Irrelevance
12. Assessment results are filed and ignored. .61
15. Assessment is an imprecise process. .71
18. Assessment interferes with teaching. .68
27. Assessment forces teachers to teach in a way against their beliefs. .75


Table 3 C-TcoA factor means, SDs, and Cronbach’s α

Factors | Number of items | Scale example | Cronbach’s α | M | SD
1. Help learning | 3 | Assessment helps students improve their learning. | .90 | 4.93 | 1.27
2. Student/teacher accountability | 4 | Assessment selects students for future education or employment opportunities. | .82 | 3.66 | 1.04
3. Assessment as accurate for student development | 5 | Assessment cultivates students’ positive attitudes towards life. | .81 | 4.20 | 0.85
4. Assessment as accurate for examination and teacher/school control | 6 | Assessment teaches examination-taking techniques. | .76 | 3.80 | 0.87
5. Irrelevance | 4 | Assessment forces teachers to teach in a way against their beliefs. | .71 | 3.12 | 1.13

Table 3 shows the mean score for each factor. The teacher participants tended
to agree most with the conception that assessment is used to help learning. There was
moderate agreement with the idea that assessment is for student development on
condition that it is accurate. The teacher participants also tended to moderately agree
that as long as assessment is accurate, it may be used to prepare students for exams
and to control teacher/school and that students and teachers should be held account-
able for assessment. The teachers slightly agreed that assessment is irrelevant.
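
For readers unfamiliar with the reliability coefficient reported in Table 3, the sketch below shows how Cronbach’s α is conventionally computed from respondent-by-item ratings: α = k/(k − 1) × (1 − Σ item variances / variance of total scores). The function name and the ratings are hypothetical; the study’s raw responses are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of k item ratings.

    Uses sample variances (n - 1 denominator) for items and for total scores.
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    if any(len(row) != k for row in item_scores):
        raise ValueError("every respondent must rate every item")
    item_columns = list(zip(*item_scores))  # one tuple of ratings per item
    item_variance_sum = sum(variance(column) for column in item_columns)
    total_variance = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Made-up ratings of five respondents on a three-item scale.
demo_ratings = [
    [6, 6, 5],
    [5, 5, 5],
    [3, 4, 3],
    [4, 4, 4],
    [2, 3, 2],
]
demo_alpha = cronbach_alpha(demo_ratings)
```

The coefficient rises as the items of a factor vary together across respondents, which is why it is read as an index of internal consistency for each scale.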

Table 4 The inter-correlation between EFL teachers’ assessment conception factors

Teacher assessment conceptions | 1 | 2 | 3 | 4 | 5
1. Help learning | 1 | | | |
2. Student/teacher accountability | -.12** | 1 | | |
3. Assessment as accurate for student development | .36** | .28** | 1 | |
4. Assessment as accurate for examination and teacher/school control | -.02** | .48** | .55** | 1 |
5. Irrelevance | -.083** | .45** | .05** | .27* | 1
*p < .05, **p < .01

As indicated by Table 4, there was a high inter-factor correlation between
the “assessment as accurate for examination and teacher/school control” factor
and the “assessment as accurate for student development” factor (r = .55).
There were medium correlations between the “assessment as accurate for examination
and teacher/school control” factor and the “student/teacher accountability”
factor (r = .48), between the “student/teacher accountability” factor and
the “irrelevance” factor (r = .45), and between the “help learning” factor and the
“assessment as accurate for student development” factor (r = .36).

RQ2 investigated the influence of teaching experience and school banding
on the teacher participants’ conceptions of assessment. Regarding the influence
of teaching experience, no statistically significant differences were found
across the four groups of teachers with different years of teaching experience.


Concerning the influence of school banding, a Kruskal-Wallis test revealed a sta-
tistically significant difference in the “help learning” factor across teachers from three
types of schools with different bandings (municipal-level key schools, N = 25; district-level
key schools, N = 19; general high schools, N = 21), χ² (2, N = 65) = 8.124, p = .017.
The teachers from municipal-level key schools and general high schools both recorded
median values of 6. The teachers from district-level key schools recorded a median
value of 4. Mann-Whitney U tests further revealed a significant difference between
the teachers from municipal-level key schools (Md = 6, N = 25) and those from district-
level key schools (Md = 4, N = 19), U = 128.5, z = -2.70, p = .007, r = .41. In other words,
teachers from municipal-level key schools seemed to agree more strongly than those
from district-level key schools that assessment is for enhancing student learning.
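
The reported statistics can be checked for internal consistency with a few lines of arithmetic. The sketch below rests on three assumptions that the text does not spell out: that the Kruskal-Wallis statistic is referred to a chi-square distribution with 2 degrees of freedom, that the Mann-Whitney p value is two-tailed and based on the normal approximation, and that the effect size follows the common convention r = |z| / √N with N = 25 + 19. The function names are hypothetical.

```python
import math


def chi_square_upper_tail_df2(statistic):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom.

    For df = 2 the survival function has the closed form exp(-x / 2).
    """
    return math.exp(-statistic / 2.0)


def two_tailed_p_from_z(z):
    """Two-tailed p value for a standard normal z score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def effect_size_r(z, n_total):
    """Effect size r = |z| / sqrt(N) for a two-group rank comparison."""
    return abs(z) / math.sqrt(n_total)


# Values as reported in the paragraph above.
p_kruskal_wallis = chi_square_upper_tail_df2(8.124)
p_mann_whitney = two_tailed_p_from_z(-2.70)
r_mann_whitney = effect_size_r(-2.70, 25 + 19)
```

Under these assumptions the recomputed values round to p = .017, p = .007, and r = .41, matching the figures reported for the comparison between municipal-level and district-level key schools.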

4.2. Teachers’ conceptions and practices of assessment: Two cases

RQ3 probed into two individual teachers’ conceptions and practices of assess-
ment and factors affecting them. Interviews with the two teachers revealed in-
dividual differences in assessment conceptions and practices despite similari-
ties. The two teachers’ conceptions and practices of assessment are reported
first, followed by a summary of factors affecting them.

Both teacher participants acknowledged that assessment may serve multi-
ple purposes, but each highlighted different priorities. For example, Teacher A
stated: “In my school, we mainly use tests to measure students’ performance. The
final grade is based on the average of students’ test results.” This quote reflected
the conception that assessment is used as a mechanism to evaluate students. She
added: “Assessment is mainly about giving tests to students, especially Senior
Three students. As our school is a high-banding school, our school leaders want
students to achieve high scores in external examinations, and teachers are forced
to teach to the test. We don’t have time to think about better ways to teach and
to assess.” This quote indicates that the teacher conceived assessment not only
as administering tests to prepare students for external examinations such as the
college entrance examination, but also as a mechanism by the school and school
leaders  to  constrain  what  teachers  do  to  raise  students’  examination  scores,  as
can be seen from the use of the phrase “forced to teach to the test.”

Teacher A expressed a sense of exhaustion by comparing the past and cur-
rent situation: “In the past I could still decide what to teach in my class and I en-
joyed teaching quite a lot, but in recent years the college entrance examination
for  the  English  subject  has  become  more  and  more  difficult,  and  I  start  to  feel
exhausted  and  I  just  want  to  retire.  The  examination  has  constrained  what  we
have to teach.” It seemed that Teacher A became less motivated to teach because
the college entrance examination constrained what she could teach in class.


Concerning the most frequently used assessment practices, Teacher A
found it difficult to rank the different types of assessment practices
identified in Gan et al. (2018) because tests dominated her English class,
while student-centered assessment such as peer- or self-assessment was seldom
used. She mentioned: “Tests are conducted weekly, monthly, mid- and final-term.
After test-taking drills and my explanation of the answers to the test, there is
not much time left.” Although she was aware that peer- and self-assessment was
promoted in the new senior secondary English language curriculum, she talked
about the difficulty in implementing change: “If the college entrance examination
is still used and if the English test paper is still so difficult, it is quite
impossible to change the current situation.” The quote indicated that, from
Teacher A’s perspective, the current examination system leaves limited space for
using formative assessment practices such as peer- or self-assessment.

In short, Teacher A regarded assessment as giving students, especially
Senior Three students, tests to measure their performance and preparing them
for the college entrance examination to achieve high scores and to fulfill school
leaders’ expectations. Her case suggested the influence of a macro-level factor
(i.e., the college entrance examination), a meso-level factor (i.e., a high banding
school with high expectations from school leaders), and a micro-level factor (i.e.,
Senior Three students in a high banding school). Notably, although not explicitly
mentioned by Teacher A, the students in her school were high achieving stu-
dents compared with those from district-level key schools and general high
schools (a point mentioned by Teacher B). They were thus expected to perform
excellently in the college entrance examination.

Unlike Teacher A, Teacher B talked about the formative assessment
initiatives in the English education reform and highlighted the use of
assessment for promoting learning and student development. To her, assessment
meant the kind of classroom tasks students do and receive feedback on. She
stated:  “We  create  tasks  for  students  to  do  in  class,  such  as  a  group  task  for
students to discuss themes in a piece of reading. I may provide feedback on dif-
ferent dimensions of the task such as verbal delivery, correctness of ideas, task
fulfillment, and so on. I talk about the strengths and weaknesses, but more
feedback is usually given to the weak group.” Teacher B added: “We also have a
combination of teacher-, self- and peer-assessment. For example, we may ask one
group of students to peer assess another group. Although most of the time stu-
dents only give marks, the more capable ones can provide comments too.”
These quotes suggested that the teacher conceived of the purpose of assess-
ment as eliciting evidence that is subject to different sources of feedback, that
is, the formative dimension of assessment.


Teacher B also commented on the affective aspect of teacher feedback:
“Positive and accurate feedback can stimulate our students’ interest in learning,
which is an essential student quality. Encouragement and guidance help stu-
dents make progress not only in their academic study, but also in their life.” This
suggested that the teacher considered assessment to promote students’ devel-
opment through positive and to-the-point teacher feedback. She explained:
“The students need a teacher who can guide not only their academic study, but
also their views of the world and life.”

Regarding  the  most  frequently  used  assessment  practices,  teacher  oral
feedback and student-centered assessment (e.g., peer- and self-assessment) were
regarded  as  the  top  two  most  frequently  used  practices  in  Teacher  B’s  English
classes. Using traditional assessment methods such as tests was ranked as the
least used type. Teacher B explained: “School leaders in reputable schools may
have high expectations on their teachers regarding the admission of students into
prestigious universities, and this may give teachers great pressure to prepare stu-
dents for external examinations. They are in a cycle of giving students tests and
then explaining test answers. In our school, the most important task is to raise our
students’ interest in English and foster positive learning attitudes, particularly in
the first two years of senior high school. This is because our students are not as
good as those in reputable schools.” Teacher B explained that although she came
from a district-level key school, the students in her school were similar to those
from general high schools in terms of academic performance.

Overall, Teacher B regarded assessment as a means of promoting student
learning and development. In particular, she underscored the importance of
providing feedback on students’ task performance and using it to encourage and
guide her students, particularly for Senior One and Two students. Despite the
fact that she worked in a district-level key school, her students resembled those
from general high schools academically. Therefore, her top priority seemed to
be the use of feedback to motivate and promote students’ learning during their
Senior One and Two studies, with the awareness that her practices were consistent
with the formative view of assessment as advocated in the English curriculum
reform. Teacher B’s case reflected the influence of meso-level (i.e., school
banding), micro-level (i.e., average-performing students studying in Senior One
and Two in a less prestigious school), and macro-level factors (i.e., the formative
assessment initiatives in the English education reform) on her views of assessment,
although the other macro-level factor (i.e., the college entrance examination)
remained the same for her school.

Table 5 summarizes the two teacher participants’ conceptions of assess-
ment with reference to Brown and Gao’s (2015) framework.


Table 5 A comparison between the two teachers’ conceptions of assessment

Brown and Gao’s (2015) framework | Teacher A | Teacher B
Management and inspection | ✓ |
Institutional targets | ✓ |
Facilitation and diagnosis | | ✓
Ability development | | ✓
Personal quality | | ✓
Negativity | |

5. Discussion

This study has sought to answer three research questions related to Chinese
secondary EFL teachers’ conceptions of assessment. Regarding RQ1, the study
has identified five major conceptions of assessment among the Chinese EFL
teachers based on the Chinese teachers’ conceptions of assessment inventory
(Brown et al., 2011). The “help learning” factor referred to using assessment to
improve learning and teaching and determine if students meet qualification
standards. The “assessment as accurate for student development” factor indi-
cated that as long as assessment results are sufficiently accurate, assessment
helps students succeed in real-life experiences, stimulates their thinking and in-
terest in learning, and cultivates their positive attitudes toward life. The “assess-
ment as accurate for examination and teacher/school control” factor suggested
that as long as assessment results are reliable, it can be used to prepare students
for examinations, control what teachers do, and indicate a school’s quality. The
“student/teacher accountability” factor suggested that assessment selects stu-
dents for future education or employment opportunities and assessment results
contribute to teachers’ appraisals. The “irrelevance” factor meant that assess-
ment is an imprecise process, interferes with teaching, forces teachers to teach
in a way against their beliefs, and assessment results are ignored. The “help
learning” factor and the “student/teacher accountability” factor were con-
sistent with Gan et al.’s (2018) research on Chinese EFL teachers. The “assess-
ment as accurate for student development” factor and the “assessment as ac-
curate for examination and teacher/school control” factor were different from
their study. This group of teacher participants bundled the notion that assess-
ment is accurate and reliable with both “student development and examina-
tion” and “teacher/school control.” It seemed that to the teacher participants,
judgments about student development as well as examination preparation and
the control of teacher/school depend on whether assessment is accurate and
reliable. The “irrelevance” factor identified in the study was not found in Gan et
al.’s (2018) study. In the study, the most endorsed conception was that assess-
ment is used to help learning. In this sense, this group of teachers held similar
views to those in previous research investigating Chinese secondary EFL teach-
ers (Gan et al., 2018), New Zealand secondary school teachers (Brown, 2011),
and Cypriot teachers (Brown & Michaelides, 2011). However, the teacher par-
ticipants were different from the Chinese teachers in Brown et al.’s (2011) re-
search where the same inventory was used.

There was strong inter-correlation between the “assessment as accurate
for examinations and teacher/school control” factor and the “assessment as ac-
curate for student development” factor (r = .55). In other words, as long as as-
sessment is accurate, using assessment to prepare students for examinations
and to control teachers/schools may also facilitate students’ development. Such
an association can probably be explained by the Chinese idea that excellent as-
sessment results reflect a more valuable person (Brown et al., 2011). In the Chi-
nese  context,  one  who  achieves  good  scores  in  examinations  is  regarded  as  a
good person because examination results indicate the quality and worth of the
individual (China Civilization Centre, 2007).

There was medium correlation between the “assessment as accurate for
examinations and teacher/school control” factor and the “student/teacher ac-
countability” factor (r = .48) in the teachers’ conceptions of assessment. This
indicated that those teachers who regarded assessment as a mechanism to eval-
uate teachers and students also considered it to be a way to prepare students
for examinations and to control teachers and schools on condition that it is
accurate. Chinese society attaches great importance to public examination
results because they are utilized to select students and evaluate teachers and
schools (Brown et al., 2011). Therefore, schools, teachers, and learners face
great pressure to ensure that students perform well in external high-stakes ex-
aminations. More often than not, drilling test-taking skills is employed for that
purpose. For example, as mentioned by Teacher A, her lesson was dominated
by the practice of test-taking skills because she was under school pressure to
produce high-achievers in the English test of the college entrance examination.

There was also medium correlation between the “student/teacher ac-
countability” factor and the “irrelevance” factor (r = .45). This suggested that
when it is connected to student/teacher accountability, assessment is likely to
be irrelevant. While this finding was not reported in Gan et al.’s (2018) study, it
was somewhat similar to the finding in Brown’s (2004) research on New Zealand
primary school teachers. It should be noted that only student accountability was
moderately related to irrelevance in Brown’s (2004) study, while in this study
both teacher and student accountability were associated with irrelevance. The
teacher participants questioned the validity of assessment as teacher and stu-
dent accountability probably because they were less convinced that public ex-
amination results alone can account for either students’ quality of learning or
teachers’ quality of teaching. For example, as mentioned by Teacher B:
“examination results cannot fully reflect teaching or learning quality.”

A medium-strength correlation was also found between the “help learn-
ing” factor and the “assessment as accurate for student development” factor (r
= .36). The finding indicated that assessment, perceived to contribute to learn-
ing, is also considered to facilitate student development if it is accurate. Teacher
beliefs may be subject to the influence of historical, social, cultural, and policy
contexts (Brown et al., 2019). Chinese teachers adhere to the cultural value that
being a teacher involves educating students in not only the academic dimension,
but also attitudinal and behavioral dimensions. This cultural value is reflected
by the meaning of “cultivating” in Chinese (Gao & Watkins, 2001) and the
Chinese expression “Jiao Shu Yu Ren,” which means imparting knowledge and
educating students to be good people in society. Just as Teacher B pointed out:
“The students need a teacher who can guide not only their academic study, but
also their views of the world and life.” The current educational policy in China
emphasizing students’ holistic development, including linguistic development,
cultural awareness, moral development, and thinking and learning skills (Chi-
nese Ministry of Education, 2017), may be another reason for the connection
between the “help learning” conception and the “assessment as accurate for
student development” conception.

Regarding RQ2, this study has identified the influence of school banding on
teachers’ conception of assessment as helping with learning. Teachers from munic-
ipal-level key schools agreed more strongly with the idea that assessment is to pro-
mote learning compared with those from district-level key schools. While previous
research showed that Chinese teachers in high-status/banding secondary schools
agreed more with personal quality factors (Shang, 2007), this study further revealed
that work environment such as school banding may influence Chinese teachers’
conceptions of assessment related to using assessment to enhance learning. Such
an influence indicated the need to take into consideration the meso-level factor of
school environment (i.e., school banding) in relation to the implementation of form-
ative assessment initiatives and teacher assessment training.

To sum up, the quantitative data revealed a general picture of the Chinese
secondary EFL teachers’ conceptions of assessment. Macro-level factors (soci-
ocultural and policy contexts) were used to explain the connection between
their different conceptions of assessment. The quantitative data also demon-
strated the impact of one meso-level factor (i.e., school banding) on the teach-
ers’ conceptions of assessment.

Regarding RQ3, the qualitative data further identified the differences in two
individual teachers’ conceptions and practices of assessment. It seemed that the
conceptions of Teachers A and B represented opposite points in the continuum
describing Chinese teachers’ thinking of assessment (Brown & Gao, 2015). That
is, Teacher A’s views indicated the management and inspection (e.g., using as-
sessment to control teachers so as to urge better achievement) and institutional
target (e.g., using assessment to measure students’ performance and to prepare
them for examinations) parts of the continuum. Teacher B’s views, on the other
hand, suggested the facilitation and diagnosis (e.g., providing oral feedback on
students’ performance), ability development (e.g., using positive teacher
feedback to motivate students), and personal quality (e.g., using teacher feedback
to guide students’ views of the world and life) parts of the continuum. In gen-
eral, Teacher A’s and Teacher B’s conceptions of assessment reflected the sum-
mative (e.g., summative examination and judgment of learner outcomes) and
formative (e.g., feedback provision, improved learning and learning motivation)
dimensions of assessment, respectively. In accordance with the different con-
ceptions of assessment, the two teachers prioritized either summative or form-
ative assessment practices in their English classes.

The aforementioned differences can largely be attributed to the role of a
meso-level factor (i.e., school factor) and related to it, a micro-level factor (i.e.,
student factors) in mediating the influence of macro-level factors (sociocultural
and policy contexts) to shape teachers’ different conceptions of assessment to-
wards either the summative or formative end of the continuum. The assessment
context in China may push teachers towards two different ends of the assess-
ment continuum (i.e., the summative or formative ends) (Brown & Gao, 2015).
As high-stakes tests may stimulate intensive test preparation in the classroom
(Qi, 2004), Teacher A’s assessment conceptions and practices can be said to be
derived from the washback effect of the college entrance examination. How-
ever, in the study it was the interplay of various contextual factors that contrib-
uted to her conceptions and practices of assessment. Teacher A’s school context
(i.e., reputable school, school leaders’ high expectations of teachers and stu-
dents) and the high achieving Senior Three students studying in it reinforced
summative views of assessment predominant in sociocultural values (i.e., the
importance of the college entrance examination). Teacher B’s school context
(i.e., a school with a lesser reputation, less pressure from leaders) and its aver-
age-performing Senior One and Two students seemed to be more conducive to
fostering her learning-focused views of assessment as advocated in the English
curriculum reform document (Chinese Ministry of Education, 2017), despite the
importance of the college entrance examination.

According to Fulmer et al. (2015), meso-level factors and their connection
with macro- or micro-level factors are worth attention in research on teachers’
assessment conceptions, knowledge, or practices. As demonstrated by the
quantitative part of the study, a meso-level factor (i.e., school banding) exerted
an influence on Chinese secondary school EFL teachers’ conceptions of assessment.
The qualitative part of the study further identified the role of a meso-level
factor (e.g., school banding) and a micro-level factor (e.g., the kind of students in
schools with different bandings) in mediating macro-level factors (e.g., the college
entrance examination). The qualitative findings showed the interaction among the
meso-level, micro-level and macro-level factors in explaining individual Chinese
secondary EFL teachers’ conceptions of assessment. Notably, while the quantita-
tive data showed that teachers in municipal-level key schools agreed more than
those in district-level key schools that assessment is for promoting learning, the
qualitative data showed a different pattern in the two individual teachers’ concep-
tions. This contrast between the quantitative and qualitative findings was probably
due to the fact that the former reflected the general tendency of teachers as
groups (i.e., groups of teachers from municipal-level or district-level key schools),
while the latter revealed the conceptions of assessment held by teachers as indi-
viduals because of the interplay among macro-, meso-, and micro-level factors.
Such a contrast highlighted the importance of using qualitative data to add to
quantitative data for an in-depth understanding of teachers’ conceptions of assess-
ment, which is subject to various layers of contextual factors.

Concerning the implications of the study, the Chinese secondary EFL teach-
ers as a group associated examination and teacher/school control with student
development,  which  makes  it  less  likely  for  the  teachers  to  adopt  formative  as-
sessment initiatives that aim to foster students’ holistic development as man-
dated by the English curriculum reform (Chinese Ministry of Education, 2017). As
pointed out by Brown et al. (2011), if a relevant accountability authority places
much less emphasis on employing high-stakes examinations to evaluate students,
then changes in teacher beliefs and practices are much more likely. This point has
also been echoed by Teacher A. In China (e.g., Zhejiang and Jiangsu Provinces)
there has been a recent attempt to reform the college entrance examination by
including more criteria for university admission (e.g., personal growth portfolios)
in addition to examination scores (Gan et al., 2018). However, at the current stage,
public examinations still dominate the educational context, and there may be dif-
ficulties for the Chinese EFL teachers in the study to embrace formative assess-
ment emphasizing students’ holistic development.

Although the teachers in the municipal-level key schools as a group
tended to endorse the view that assessment is used to enhance learning,
Teacher A’s case indicated that in the same high banding school there may be
individual teachers like her who believed less in the idea of using assessment for
learning due to the interactional impact of meso-level factors (e.g., school band-
ing), micro-level factors (e.g., student factors) and macro-level factors (e.g., the
college entrance examination). Her case suggested that a situated approach
should be adopted to introduce changes into the assessment beliefs and prac-
tices of teachers such as her. Such an approach is a complex endeavor which
involves the consideration of the three layers of factors as mentioned earlier.
For example, although limited changes can be made to macro-level factors (e.g.,
the college entrance examination) currently, meso-level factors can be manipu-
lated to influence the assessment conceptions and practices of teachers like
Teacher A. As the opportunities for reflective practices and participation in
learning communities represent two main ways of teacher learning to enhance
teachers’ assessment literacy (Xu & Brown, 2016), school leaders may establish
a community of practice (Wenger-Trayner & Wenger-Trayner, 2015) comprising
leaders and teachers who share the same visions regarding the learning pur-
poses of assessment. Such a community may then promote a formative view of
assessment to teachers such as Teacher A and gradually involve them in partici-
pating reflectively in the community of practice. Notably, in an attempt to create
such a facilitative school environment, school leaders themselves need to first
reflect on their views of assessment and obtain more knowledge about forma-
tive assessment. Since the aforementioned meso-level factor will also interact
with micro-level factors (e.g., the summative views of assessment already held
by Teacher A), it is important to promote a form of formative assessment that
teachers may find contextually appropriate (e.g., formative use of summative
assessment in Teacher A’s case) to influence their conceptions of assessment
towards the formative end of the continuum.

Compared with their counterparts in municipal-level key schools, the teach-
ers in the district-level key schools overall endorsed less the view of using assess-
ment for learning purposes. However, Teacher B’s case suggested that a formative
view of assessment can be fostered due to an interplay of macro-, meso- and mi-
cro-level factors. In schools such as the one where Teacher B worked, a situated
approach to shaping teachers’ assessment conceptions and practices can also be
adopted. Despite the fact that few changes can be made to the macro-level fac-
tors (e.g., the college entrance examination), at the school level (i.e., meso-level),
a community of practice involving teachers such as Teacher B as key members can
be built and opportunities should be given to these teachers to share with their
colleagues the formative views and practices of assessment, with the aim of in-
volving the reflective participation of more teachers in the community. To improve
the effectiveness of such sharing activities, it is important to pay attention to not
only the key members’ assessment conceptions, but also their assessment
knowledge (i.e., micro-level factors). In this way, adequate suggestions on differ-
ent types of contextually appropriate formative assessment can be provided to
different kinds of teachers according to their micro-level factors (e.g., those teach-
ing Senior One and Two versus those teaching Senior Three).



In this sense, teachers such as Teacher B need to further enhance their
knowledge of formative assessment, despite their formative view of assessment
and awareness of its cognitive and affective benefits. For example, Teacher B
believed that formative assessment was reserved for average-performing students
like those in her school, who needed more teacher scaffolding and encouragement,
and that high-achieving students in Teacher A’s school did not need it.
Formative assessment is powerful in improving weak students’ performance
(Black & Wiliam, 1998), but this does not mean that it should be reserved only for
average or weak students. In addition, Teacher B seemed to attach less importance
to using assessment results to inform instruction, despite her use of
teacher oral feedback and student-centered assessment practices. This lack of
connection between assessment and instruction has also been identified in Lam’s
(2019) research on Hong Kong secondary English teachers. Teacher B’s case
showed that demonstrating formative conceptions of assessment does not nec-
essarily mean that the teacher has sophisticated and sufficient knowledge of
formative assessment. If teachers like her have to play a key role in sharing their
formative conceptions and practices of assessment and encouraging colleagues
to participate in the community of practice, it is necessary to ensure that they
possess appropriate conceptions as well as knowledge of formative assessment.

6. Conclusion

This study has sought to explore Chinese secondary EFL teachers’ conceptions of
assessment and the influences shaping them, based on both quantitative and qualitative
data. As a group, the teacher participants agreed most strongly with the
view that assessment is used to promote learning. However, the strong association
they made between the “assessment as accurate for examination and
teacher/school control” factor and the “assessment as accurate for student devel-
opment” factor suggested that the formative assessment initiatives focusing on
students’ holistic development as promoted in the English curriculum reform are
less likely to be adopted by the teachers as a group at the current stage. The quan-
titative analysis also identified the influence of one meso-level factor (i.e., school
banding) on the teachers’ conception of assessment as helping with learning. Qual-
itative data further demonstrated how a meso-level factor (e.g., school factors
such as school banding) and a micro-level factor (e.g., student factors) interacted
with each other to mediate the macro-level factor (e.g., the college entrance ex-
amination) in shaping Teacher A’s and Teacher B’s conceptions of assessment, rep-
resenting the summative and formative dimensions of assessment, respectively.

This study has demonstrated the importance of utilizing both quantitative and
qualitative data to provide the general pattern and contextualized understanding of
Chinese secondary EFL teachers’ conceptions of assessment. In particular, the
qualitative data added to the quantitative data by demonstrating the situated
nature of teacher conceptions of assessment, which are subject to the interac-
tion of various contextual factors. Accordingly, a situated approach paying spe-
cial attention to the interacting impact of meso-level (i.e., school factor) and
micro-level factors (e.g., teacher and student factors) should be adopted to
shape the teachers’ views and knowledge of assessment and to facilitate the
implementation of formative assessment as advocated in English curriculum re-
form in China. This study only involved a purposive sample of 66 teachers from six
secondary schools in Eastern China, so its findings can only be generalized to sim-
ilar contexts. Nevertheless, the investigation has shown the importance of con-
sidering the interplay of macro-, meso- and micro-level factors in exploring teach-
ers’ conceptions of assessment through a mixed-methods approach and pro-
posed a situated approach to developing teachers’ assessment literacy. Future re-
search may involve a more representative sample with the use of both perception
and classroom observation data to explore EFL teachers’ conceptions of assess-
ment. Research may also investigate effective ways to implement formative as-
sessment at the classroom and school levels based on a situated approach.



References

Berry, R., & Adamson, B. (Eds.). (2011). Assessment reform in education: Policy
and practice (Vol. 14). Springer Science & Business Media.

Black, P., & Wiliam, D. (1998, October). Inside the black box: Raising standards
through classroom assessment. Phi Delta Kappan, 80(2), 139-149.
https://doi.org/10.1177/003172171009200119

Brown, G. T. L. (2004). Teachers’ conceptions of assessment: Implications for policy
and professional development. Assessment in Education: Principles, Policy &
Practice, 11, 301-318. https://doi.org/10.1080/0969594042000304609

Brown, G. T. L. (2011). Teachers’ conceptions of assessment: Comparing primary
and secondary teachers in New Zealand. Assessment Matters, 3, 45-70.
https://doi.org/10.18296/am.0097

Brown, G. T., & Gao, L. (2015). Chinese teachers’ conceptions of assessment for
and of learning: Six competing and complementary purposes. Cogent Edu-
cation, 2(1), 993836. https://doi.org/10.1080/2331186x.2014.993836

Brown, G. T., Gebril, A., & Michaelides, M. P. (2019). Teachers’ conceptions of
assessment: A global phenomenon or a global localism. Frontiers in Education,
4(16). https://doi.org/10.3389/feduc.2019.00016

Brown, G. T. L., Hui, S. K. F., Yu, F. W. M., & Kennedy, K. J. (2011). Teachers’ con-
ceptions of assessment in Chinese contexts: A tripartite model of account-
ability, improvement, and irrelevance. International Journal of Educa-
tional Research, 50, 307-320. https://doi.org/10.1016/j.ijer.2011.10.003

Brown, G. T. L., & Michaelides, M. P. (2011). Ecological rationality in teachers’
conceptions of assessment across samples from Cyprus and New Zealand.
European Journal of Psychology of Education, 26(3), 319-337.
https://doi.org/10.1007/s10212-010-0052-3

Brown, T. A. (2015). Confirmatory factor analysis for applied research. Guilford
Publications.

Bui, G., & Kong, A. (2019). Metacognitive instruction for peer review interaction
in L2 writing. Journal of Writing Research, 11(2), 357-392.
https://doi.org/10.17239/jowr-2019.11.02.05

Chen, J., & Brown, G. T. (2016). Tensions between knowledge transmission and
student-focused teaching approaches to assessment purposes: Helping
students improve through transmission. Teachers and Teaching, 22(3),
350-367. https://doi.org/10.1080/13540602.2015.1058592

China Civilization Centre. (2007). China: Five thousand years of history and civi-
lization. City University of Hong Kong Press.

Chinese Ministry of Education. (2017). English curriculum standards for senior
high schools. People’s Education Press.



Creswell, J. W. (2013). Qualitative inquiry and research design: Choosing among
five approaches. Sage.

Creswell, J. W. (2014). Research design: Qualitative, quantitative, and mixed
methods approaches. Sage Publications.

Fives, H., & Buehl, M. M. (2012). Spring cleaning for the “messy” construct of
teachers’ beliefs: What are they? Which have been examined? What can
they tell us? In K. R. Harris, S. Graham, & T. Urdan (Eds.), APA educational
psychology handbook: Individual differences and cultural and contextual
factors (Vol. 2, pp. 471-499). American Psychological Association.

Fulmer, G. W., Lee, I. C., & Tan, K. H. (2015). Multi-level model of contextual factors
and teachers’ assessment practices: An integrative review of research.
Assessment in Education: Principles, Policy & Practice, 22(4), 475-494.
https://doi.org/10.1080/0969594x.2015.1017445

Gan, Z., Leong, S. S., Su, Y., & He, J. (2018). Understanding Chinese EFL teachers’
conceptions and practices of assessment: Implications for teacher assess-
ment literacy development. Australian Review of Applied Linguistics, 41(1),
4-27. https://doi.org/10.1075/aral.17077.gan

Gao, L., & Watkins, D. (2001). Identifying and assessing the conceptions of teaching
of secondary school physics teachers in China. British Journal of Educational
Psychology, 71(3), 443-469. https://doi.org/10.1348/000709901158613

Hao, J., & Otani, M. (2016). English education in high schools in China: Its current
status and problems. Memoirs of the Faculty of Education of Shimane Uni-
versity (Educational Science), 50, 65-73.

He, Y., Levin, B. B., & Li, Y. (2011). Comparing the content and sources of the pedagog-
ical beliefs of Chinese and American pre-service teachers. Journal of Education
for Teaching, 37, 155-171. https://doi.org/10.1080/02607476.2011.558270

Kennedy, K. J., & Lee, J. (2008). Changing schools in Asia: Schools for the
knowledge society. Routledge.

Lam, R. (2019). Teacher assessment literacy: Surveying knowledge, conceptions
and practices of classroom-based writing assessment in Hong Kong. Sys-
tem, 81, 78-89. https://doi.org/10.1016/j.system.2019.01.006

Marton, F. (1981). Phenomenography – describing conceptions of the world
around us. Instructional Science, 10(2), 177-200.

Miles, M. B., Huberman, A. M., & Saldaña, J. (2014). Qualitative data analysis: A meth-
ods sourcebook. Sage Publications.

Pajares, M. F. (1992). Teachers’ beliefs and educational research: Cleaning up a
messy construct. Review of Educational Research, 62, 307-332.
https://doi.org/10.3102/00346543062003307



Qi, L. (2004). Has a high-stakes test produced the intended changes? In L. Cheng,
Y. Watanabe, & A. Curtis (Eds.), Washback in language testing: Research
contexts and methods (pp. 147-170). Lawrence Erlbaum.

South China Normal University Team. (2010, July). Teachers’ conceptions of as-
sessment: Developing models for teachers in China [Paper presentation].
The International Test Commission Conference 2010, The Chinese Univer-
sity of Hong Kong, Shatin. http://www.itc2010hk.com/

Shang, H. (2007). Research on the middle school teachers’ conceptions of learn-
ing assessment (Unpublished master’s thesis). South China Normal Uni-
versity, Guangzhou.

Stobart, G. (2006). The validity of formative assessment. In J. Gardner (Ed.), As-
sessment and learning (pp. 133-146). Sage Publications.

Teng, F., & Bui, G. (2020). Thai university students studying in China: Identity,
imagined communities, and communities of practice. Applied Linguistics
Review, 11(2), 341-368. https://doi.org/10.1515/applirev-2017-0109

Wang, P. (2010). Research on the Chinese teachers’ conceptions and practice of
assessment (Unpublished doctoral dissertation). South China Normal Uni-
versity, Guangzhou (in Chinese).

Wenger-Trayner, E., & Wenger-Trayner, B. (2015). Introduction to communities
of practice: A brief overview of the concept and its uses.
https://wenger-trayner.com/introduction-to-communities-of-practice/

Xu, Y., & Brown, G. T. (2016). Teacher assessment literacy in practice: A
reconceptualization. Teaching and Teacher Education, 58, 149-162.
https://doi.org/10.1016/j.tate.2016.05.010

Xu, Y., & Brown, G. T. (2017). University English teacher assessment literacy: A survey-test
report from China. Papers in Language Testing and Assessment, 6(1),
133-158.

Yu, C., Wei, F., Li, L., Morrissey, P., & Chen, N. (2016). Social attitudes in contem-
porary China. Routledge.

Zhang, Z., & Burry-Stock, J. A. (2003). Classroom assessment practices and teachers’
self-perceived assessment skills. Applied Measurement in Education, 16(4),
323-342. https://doi.org/10.1207/s15324818ame1604_4