Jul 21

The Inaccuracy of Tests

Sebastian Bergmann Some Rights Reserved

I’m taking an online course, Formative Assessment and Standards-Based Grading, from the Marzano Research Laboratory, and it’s reinforcing what I’ve believed about how subjective it is to grade and score kids on their learning. As teachers we strive to assess our students’ learning as accurately as possible. The truth is that an A in my class is not the same as an A in someone else’s class. I’m not even certain that a “Meets this Standard” in my class is the same as a “Meets this Standard” in a similar class. Letter grades or standards-based grades are snapshots of a complicated process and change all the time. That being said, I do prefer standards-based grading for many reasons, as long as the “grading” period doesn’t happen often. I’d much rather give my students feedback most of the time to help them learn than cloud that feedback with letters or numbers that reduce their learning.

Here’s a great example from the formative assessment course I’m taking:

Let’s take this one type of test –
Questions 1-10 are basic, simple recall questions, probably multiple choice or fill in the blank, about content that was explicitly taught.

Questions 11-14 are more complex questions, probably short answer, still about content that was explicitly taught.

Questions 15-16 are complex questions, probably short answer, that ask students to apply what they learned. Content was NOT explicitly taught.

So let’s say the exact same test is given to students in three different classes as a common assessment. I took the test and got all of the first 10 questions correct, half of the second set of questions correct, and none of the last two questions correct.

Now consider how my test would be scored in each of the three different classes:
In class 1 the teacher weights the first set of questions, 1-10, at 20%, the second set of questions, 11-14, at 20%, and the third set of questions, 15-16, at 60%.

So in class 1 I got 20% + 10% + 0% = 30%, which is an F.

In class 2 the teacher weights the first set of questions, 1-10, at 60%, the second set of questions, 11-14, at 30%, and the third set of questions, 15-16, at 10%.

So in class 2, with the same test and the same questions correct, I got 60% + 15% + 0% = 75%, which is a C.

In class 3 the teacher weights the first set of questions, 1-10, at 80%, the second set of questions, 11-14, at 20%, and the third set of questions, 15-16, at 0% because that teacher feels he or she cannot hold students accountable for content he or she did NOT explicitly teach.

So in class 3 I got 80% + 10% = 90%, which is an A-!

How can I take the same test and, depending on the class, or more specifically on the way the question types are weighted, get anywhere from an F to an A-?!
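If you want to check the arithmetic yourself, here is a minimal sketch in Python. The group names and the `weighted_score` function are my own labels for illustration, not something from the course. I'm also assuming each class's weights add up to 100%, so class 1's third weight is entered as 60; since I got none of those questions right, that weight doesn't change my score either way.

```python
# Fraction of questions I answered correctly in each group of the test.
fraction_correct = {
    "recall": 10 / 10,      # questions 1-10: all correct
    "complex": 2 / 4,       # questions 11-14: half correct
    "application": 0 / 2,   # questions 15-16: none correct
}

# Percentage weight each teacher gives each group.
# Assumption: weights total 100, so class 1's last weight is taken as 60.
class_weights = {
    "class 1": {"recall": 20, "complex": 20, "application": 60},
    "class 2": {"recall": 60, "complex": 30, "application": 10},
    "class 3": {"recall": 80, "complex": 20, "application": 0},
}


def weighted_score(weights, fractions):
    """Return the percentage score: each group's weight times the fraction correct."""
    return sum(weights[group] * fractions[group] for group in weights)


scores = {
    name: weighted_score(weights, fraction_correct)
    for name, weights in class_weights.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.0f}%")
# class 1: 30%
# class 2: 75%
# class 3: 90%
```

The answers never change from class to class. Only the weights do, and that alone moves the score by 60 percentage points.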

“There’s measurement error in any test no matter how well we design them, there’s going to be measurement error. Let me give you an example. One of the readability analyses, there’s a lot of different ones you can use to find out what’s the readability level on any kind of a passage. One of those in particular has a 1.5, what would be considered a year and a half measurement error. What if I’m a First Grade teacher, I want that readability to be spot on, don’t I? So if I happen to see that it says 2.0, that’s the readability, 2.0, and let’s pretend that it only has one measurement error, one grade level span at measurement error, that means that 2.0 could be as low as First Grade, could be as high as Third Grade. That’s what we call measurement error. There’s error inherent in every kind of measurement, which is why you don’t want to base big decisions only on a couple of assessments, and certainly not only on just one assessment, you want to use multiple pieces of evidence.”
Marzano Research Laboratory Vice President Dr. Tammy Heflebower, from the Formative Assessment and Standards-Based Grading course.

We need to figure out what we want out of our assessments. If it’s to help students learn, then feedback is probably the best way to go. If it’s to report to students and their families what they have learned, or what they’ve shown that they have learned, then using rubrics and student self-assessment with teacher input (standards-based grading) is useful. If you’re stuck having to reduce student learning to letter grades, then try having conversations with students about what their “final grade” should be. Usually what ends up happening here is some elaborate math to convert standards-based grades to letter grades or percentages. It sucks that we have to do that just because that’s the way it’s been done.


Permanent link to this article: http://www.educatoral.com/wordpress/2014/07/21/the-inaccuracy-of-tests/


    • Dave Webb on July 21, 2014 at 12:24 pm

    Students need to learn from a variety of teachers so that they are not just learning the explicit content, but learn to adapt learning itself to different environments. The subjectivity you are concerned about is part of this, and is not a bad thing; remember, what is expected of people in the “real” world is ever-changing, not guaranteed by some rubric. Students need to learn to cope with this. Any discussion of subjectivity vs fairness is largely irrelevant with respect to this topic. If only specified, explicit content can be taught, and rigid rubrics are required for evaluation, then why are teachers even needed? Big business wins, and replaces the human aspect of teaching and learning with technology. Don’t get caught up in that agenda.

  1. I totally agree that students need to learn from a variety of teachers and adapt to different learning environments. The subjectivity that is natural is not a bad thing but I hear administrators question why our students are getting good grades in their classes yet not passing standardized tests. We all know the answers to that question but the fact that a one-day-a-year test can hold so much weight is ridiculous.

    The agenda that we are caught up in by default is the big corporate agenda to use the common core standards to compare students in every city across the U.S. to each other and to evaluate their teachers using their common core generated tests.

    I’m actually fortunate that I can stay out of that agenda because WA state allows its teachers to choose whether or not standardized test scores are to be used as part of their evaluation to show evidence of student growth.
