Measuring the impact of rater negotiation in writing performance assessment

Jonathan Trace, Gerriet Janssen, Valerie Meier

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

Previous research in second language writing has shown that when scoring performance assessments, even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require adjudication by a third rater. However, from an assessment validation standpoint, questions remain about the impact of negotiation on the scoring inference of a validation argument (Kane, 2006, 2012). Thus, this mixed-methods study evaluates the impact of score negotiation on scoring consistency in second language writing assessment, as well as negotiation’s potential contributions to raters’ understanding of test constructs and the local curriculum. Many-faceted Rasch measurement (MFRM) was used to analyze scores (n = 524) from the writing section of an EAP placement exam and to quantify how negotiation affected rater severity, self-consistency, and bias toward individual categories and test takers. Semi-structured interviews with raters (n = 3) documented their perspectives about how negotiation affects scoring and teaching. In this study, negotiation did not change rater severity, though it greatly reduced measures of rater bias. Furthermore, rater comments indicated that negotiation supports a nuanced understanding of the rubric categories and increases positive washback on teaching practices.

Original language: English
Pages (from-to): 3-22
Number of pages: 20
Journal: Language Testing
Volume: 34
Issue number: 1
DOIs: 10.1177/0265532215594830
Publication status: Published - 2017 Jan 1
Externally published: Yes


Keywords

  • Many-faceted Rasch measurement
  • negotiation
  • performance assessment
  • rubric
  • validation
  • writing

ASJC Scopus subject areas

  • Language and Linguistics
  • Social Sciences (miscellaneous)
  • Linguistics and Language

Cite this

Trace, Jonathan; Janssen, Gerriet; Meier, Valerie. Measuring the impact of rater negotiation in writing performance assessment. In: Language Testing, Vol. 34, No. 1, 01.01.2017, pp. 3-22.
@article{742919f3041546748a1eafe6fc2d13d9,
title = "Measuring the impact of rater negotiation in writing performance assessment",
abstract = "Previous research in second language writing has shown that when scoring performance assessments, even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require adjudication by a third rater. However, from an assessment validation standpoint, questions remain about the impact of negotiation on the scoring inference of a validation argument (Kane, 2006, 2012). Thus, this mixed-methods study evaluates the impact of score negotiation on scoring consistency in second language writing assessment, as well as negotiation’s potential contributions to raters’ understanding of test constructs and the local curriculum. Many-faceted Rasch measurement (MFRM) was used to analyze scores (n = 524) from the writing section of an EAP placement exam and to quantify how negotiation affected rater severity, self-consistency, and bias toward individual categories and test takers. Semi-structured interviews with raters (n = 3) documented their perspectives about how negotiation affects scoring and teaching. In this study, negotiation did not change rater severity, though it greatly reduced measures of rater bias. Furthermore, rater comments indicated that negotiation supports a nuanced understanding of the rubric categories and increases positive washback on teaching practices.",
keywords = "Many-faceted Rasch measurement, negotiation, performance assessment, rubric, validation, writing",
author = "Jonathan Trace and Gerriet Janssen and Valerie Meier",
year = "2017",
month = "1",
day = "1",
doi = "10.1177/0265532215594830",
language = "English",
volume = "34",
pages = "3--22",
journal = "Language Testing",
issn = "0265-5322",
publisher = "SAGE Publications Ltd",
number = "1",
}

TY - JOUR

T1 - Measuring the impact of rater negotiation in writing performance assessment

AU - Trace, Jonathan

AU - Janssen, Gerriet

AU - Meier, Valerie

PY - 2017/1/1

Y1 - 2017/1/1

N2 - Previous research in second language writing has shown that when scoring performance assessments, even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require adjudication by a third rater. However, from an assessment validation standpoint, questions remain about the impact of negotiation on the scoring inference of a validation argument (Kane, 2006, 2012). Thus, this mixed-methods study evaluates the impact of score negotiation on scoring consistency in second language writing assessment, as well as negotiation’s potential contributions to raters’ understanding of test constructs and the local curriculum. Many-faceted Rasch measurement (MFRM) was used to analyze scores (n = 524) from the writing section of an EAP placement exam and to quantify how negotiation affected rater severity, self-consistency, and bias toward individual categories and test takers. Semi-structured interviews with raters (n = 3) documented their perspectives about how negotiation affects scoring and teaching. In this study, negotiation did not change rater severity, though it greatly reduced measures of rater bias. Furthermore, rater comments indicated that negotiation supports a nuanced understanding of the rubric categories and increases positive washback on teaching practices.

AB - Previous research in second language writing has shown that when scoring performance assessments, even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require adjudication by a third rater. However, from an assessment validation standpoint, questions remain about the impact of negotiation on the scoring inference of a validation argument (Kane, 2006, 2012). Thus, this mixed-methods study evaluates the impact of score negotiation on scoring consistency in second language writing assessment, as well as negotiation’s potential contributions to raters’ understanding of test constructs and the local curriculum. Many-faceted Rasch measurement (MFRM) was used to analyze scores (n = 524) from the writing section of an EAP placement exam and to quantify how negotiation affected rater severity, self-consistency, and bias toward individual categories and test takers. Semi-structured interviews with raters (n = 3) documented their perspectives about how negotiation affects scoring and teaching. In this study, negotiation did not change rater severity, though it greatly reduced measures of rater bias. Furthermore, rater comments indicated that negotiation supports a nuanced understanding of the rubric categories and increases positive washback on teaching practices.

KW - Many-faceted Rasch measurement

KW - negotiation

KW - performance assessment

KW - rubric

KW - validation

KW - writing

UR - http://www.scopus.com/inward/record.url?scp=85008262998&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85008262998&partnerID=8YFLogxK

U2 - 10.1177/0265532215594830

DO - 10.1177/0265532215594830

M3 - Article

AN - SCOPUS:85008262998

VL - 34

SP - 3

EP - 22

JO - Language Testing

JF - Language Testing

SN - 0265-5322

IS - 1

ER -