Evaluating Rater-Mediated Assessments with Rasch Measurement Theory and Mokken Scaling Open Access

Wind, Stefanie Anne (2014)

Permanent URL: https://etd.library.emory.edu/concern/etds/2f75r8811?locale=en

Published

Abstract

Models based on Rasch Measurement Theory (Rasch, 1960/1980) are frequently used to explore the quality of ratings assigned in large-scale rater-mediated educational assessments (Engelhard, 2013; Wolfe, 2009) because they meet the requirements for invariant measurement. In contrast, the utility of nonparametric models that also meet these requirements has not been explored for monitoring rating quality. Because they are less restrictive, nonparametric models may provide useful information to inform the interpretation and use of rater-assigned scores. The purpose of this study is to describe, illustrate, and extend current indices of rating quality with concepts from Mokken scaling. The major methods used to address the guiding questions for this study include a literature review, illustrative data analyses, and the application of parametric and nonparametric models to data from large-scale rater-mediated assessments. Mokken-based analyses are conducted using the mokken package for the R statistical software program (van der Ark, 2013; R Development Core Team, 2013). Rasch-based analyses are conducted using the Facets program (Linacre, 2010).

Major findings suggest that Mokken scale analysis provides diagnostic information that supplements indices of measurement quality based on Rasch Measurement Theory. Further, findings suggest that parametric and nonparametric indicators provide related, but slightly different, information about measurement quality in the context of rater-mediated assessments. The diagnostic information provided by the Mokken-based indicators illustrated in this study is especially promising for assessment development, including rater training and the development of scoring rubrics. In response to the increased emphasis on the use of evidence to guide policy and practice in education (Cooper, Levin, & Campbell, 2009; Huff, Steinberg, & Matts, 2010; Mislevy, Steinberg, Breyer, Almond, & Johnson, 2002), the use of assessments that require constructed responses (e.g., essays and portfolios) is increasing, as in the next-generation assessments included in the Race to the Top initiative (U.S. Department of Education, 2010). Within the framework of invariant measurement, this study proposes and applies a coherent set of indicators of rating quality based on measurement models with useful properties that can be used in practice to inform the development, interpretation, and use of rater-mediated assessments.

Chapter One: Introduction

Theoretical Framework

Statement of the Problem

Purpose of the Study

Research Questions

Definitions

Overview of Dissertation

Chapter Two: Review of Literature

What are the major underlying measurement issues related to rater-mediated assessments?

How have these measurement issues been traditionally addressed in previous research?

1. What are the major indices of rater agreement?

2. What are the major indices of rater errors and systematic biases?

3. What are the major indices of rater accuracy?

Summary

Chapter Three: Illustration of Modern Rating Quality Indices based on Rasch Measurement Theory

What is Item Response Theory?

What is Rasch Measurement Theory?

Rasch Measurement Theory for Dichotomous Data

Rasch Measurement Theory for Polytomous Ratings

Using Rasch Measurement Theory to Examine the Quality of Ratings

Model I: Many-Facet Rasch Model for Rater Invariance

Model II: Many-Facet Rasch Model for Rater Accuracy

Summary

Chapter Four: Illustration of Modern Rating Quality Indices based on Mokken Scaling

What is Mokken Scaling?

Mokken Scaling for Dichotomous Data

Mokken Scaling for Polytomous Ratings

Using Mokken Scaling to Examine the Quality of Ratings

Model III: Monotone Homogeneity for Ratings (MH-R) Model

Model IV: Double Monotonicity for Ratings (DM-R) Model

Summary

Chapter Five: Examining Rating Scales using Rasch and Mokken Models for Rater-Mediated Assessments

Introduction

Purpose

Research Questions

Procedures

Data Analysis

Results

Conclusions

Chapter Six: Discussion and Conclusions

Research Question 1: What are the major underlying measurement issues related to rating quality?

Research Question 2: How have these measurement issues been traditionally addressed in previous research?

Research Question 3: How has Rasch measurement theory been used to examine the quality of ratings?

Research Question 4: How can Mokken scaling be used to examine the quality of ratings?

Research Question 5: What is the relationship between Rasch- and Mokken-based indices of rating quality?

Limitations

Implications for Research, Theory, Policy, and Practice

References

About this Dissertation

Rights statement

Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.

School	Laney Graduate School
Department	Educational Studies
Degree	PhD
Submission	Dissertation
Language	English
Research Field	Psychology, Psychometrics; Education, Tests and Measurements
Keyword	Rasch; nonparametric item response theory; writing assessment; polytomous item response theory; rater-mediated; Mokken scaling; performance assessment
Committee Chair / Thesis Advisor	Engelhard, George, Emory University
Committee Members	Jensen, Robert J, Emory University; Cheong, Yuk Fai, Emory University

Primary PDF

Title	Date Uploaded
Evaluating Rater-Mediated Assessments with Rasch Measurement Theory and Mokken Scaling	2018-08-28 14:57:56 -0400