School Choice Popularity Growing Steadily in Wisconsin
School choice is a growing movement in Wisconsin as parents increasingly take advantage of... Read More
Researchers from the University of Akron, Ohio, found that the nine robo-grading products available on the market assigned the same scores as human graders.
Although Scantron answer sheets have been part and parcel of test-taking since at least the 1960s, there were no serious attempts to design software for grading student essays until recently. No doubt many overwhelmed teachers would have welcomed the wide adoption of such tools at any time, but the need for them became more acute in 2005, when essay writing became a mandatory part of the redesigned SAT. Now many states are reworking their academic programs to conform to the Common Core curriculum, which means that students in all grades will be submitting more written assignments than ever. With shrinking education budgets leading to staffing cuts, computer-aided grading would greatly reduce the burden on the instructors who remain.
With that in mind, two professors from the College of Education at the University of Akron, Ohio, Morgan and Mark Shermis, decided to put several essay-grading software packages available on the market to a rigorous test by having them grade 16,000 essays that had previously been assigned grades by teachers. The results, announced during this year’s National Council on Measurement in Education meeting held in Vancouver, Canada, showed that at least some of the programs produced marks very similar to the ones given by humans.
Grading software from nine manufacturers, which together cover 97 per cent of the US market, was used in the test. During calibration, each system looked for correlations between the human-assigned scores and factors associated with good essays, such as strong vocabulary and good grammar. After training, the software marked another set of essays without access to the human-given grades.
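The calibrate-then-score procedure described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not any vendor's actual method: the two toy features (distinct-word count as a vocabulary proxy, share of capitalized sentences as a grammar proxy), the function names `features`, `calibrate` and `score`, and the sample essays and scores are all hypothetical, and ordinary least squares stands in for whatever statistical model each product really uses.

```python
import re

def features(essay):
    """Return [intercept, distinct word count, share of capitalized sentences]."""
    words = re.findall(r"[a-z']+", essay.lower())
    sentences = [s.strip() for s in re.split(r"[.!?]+", essay) if s.strip()]
    capitalized = sum(s[0].isupper() for s in sentences) / max(len(sentences), 1)
    return [1.0, float(len(set(words))), capitalized]

def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def calibrate(essays, human_scores):
    """Fit least-squares weights linking essay features to human scores."""
    rows = [features(e) for e in essays]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * s for r, s in zip(rows, human_scores)) for i in range(k)]
    return solve(xtx, xty)

def score(weights, essay):
    """Score an unseen essay using only its features, not a human grade."""
    return sum(w * f for w, f in zip(weights, features(essay)))

# Hypothetical training set: essays paired with made-up human scores.
training_essays = [
    "Dogs bark.",
    "the cat sat. the cat ran.",
    "Birds sing loudly. they fly south early.",
    "Rain fell. Wind blew. Rivers rose quickly.",
]
human_scores = [3.0, 2.0, 4.0, 5.0]

weights = calibrate(training_essays, human_scores)
print(score(weights, "Snow melts slowly. ice cracks."))
```

A model like this rewards whatever surface features it was trained on, so its agreement with human scores says nothing by itself about whether it measures writing quality.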
According to Shermis, the grades assigned by the computer programs were statistically identical to those given by human teachers, which indicates that such software has progressed a great deal since development first began. When he heard of the Akron team’s findings, Les Perelman, who teaches writing at the Massachusetts Institute of Technology, said he wasn’t surprised, but not because he considers himself a great supporter of computer grading. On the contrary, as far as he’s concerned, these kinds of programs are “reinventing the wheel,” replicating technology that is already on the market and installed on nearly every computer in the country and the world: Microsoft Word.
The ubiquitous word processing program is “a much better product than anything that’s going to be developed by this competition,” he says. Its grammar checker is fairly sophisticated, but can be fooled. For instance, if a student types, “The car was parked by the side of the road,” Word suggests, “(The) side of the road parked the car.”
Perelman worries that the bid to develop machine readers will, in the end, train humans to read more like machines. “It will get good agreement (between humans and machines) but not necessarily good writing.”
Still, with wide adoption of the Common Core standards scheduled for 2014, and with it an imminent increase in the number of writing assignments students will have to complete and teachers will have to grade, many feel they cannot afford to turn up their noses at a tool that will help them cope. Jeff Pence, who teaches English at the Dean Rusk Middle School in Canton, GA, and who uses essay-scoring software to grade the papers of his 120 students, admits that while he is not blind to the tools’ shortcomings, neither is he unaware of the shortcomings of overwhelmed human graders. So far this year, with the aid of the program, he has been able to collect and grade 25 written assignments from each of his students, sometimes returning them the next day, whereas hand-grading even a single batch would previously have taken him nearly two weeks.
“I know, as does every teacher out there, that on that 63rd essay, I am nowhere near as consistent, accurate or thorough as I was on the first three.”
Thursday
April 26th, 2012
Comments
From what I read on this subject, these graders basically just check syntax and grammar and can’t meaningfully analyze context. I understand that teachers need help in this regard, but natural language processing hasn’t reached the level yet where these are going to be truly useful to grade high-school level work.
OpenEd, where I’m CEO, managed the demonstration. Dr. Mark Shermis, UAkron, wrote the report with Ben Hammer, Kaggle.com–where the demonstration and the open competition are hosted. Jaison Morgan designed the competition.
The 9 engines tested operationalize 40-50 variables. The top 3-4 engines match or beat human graders across most of 8 tested data sets. Some sets were scored on 6 traits, others used holistic scoring. Here’s a summary of an NPR interview Shermis did this week:
http://gettingsmart.com/blog/2012/04/better-tests-more-writing-deeper-learning/
Kevin, I think you underestimate how useful grammar and syntax checking is for writers in earlier grades. A lot of people’s writing is abysmal nowadays because they don’t get nearly enough practice. I remember college freshman English and most of my classmates couldn’t put together a grammatically sound written report even after several rounds of drafting.
Please include Lincoln’s “Gettysburg Address” when evaluating the grading programs. Will any of the grading programs provide a statement like “The opening sentence needs to be stronger”? My daughter’s Spanish Composition teacher wrote that on one of her papers.
Is that supposed to be valuable criticism? Cause it doesn’t sound like it. “opening sentence needs to be stronger?” What the heck is that supposed to even mean?
Maybe the teacher’s comments need to be clearer. I suggest your daughter write that on her report when she submits it again.
“First impressions are so important. How many times have you heard that? It is true that the first impression—whether it’s a first meeting with a person or the first sentence of a paper—sets the stage for a lasting opinion.
The introductory paragraph of any paper, long or short, should start with a sentence that piques the interest of your readers.
In a typical essay, that first sentence leads into two or three sentences that provide details about your subject or your process. All of these sentences build up to your thesis statement. ”
Seeing how you haven’t been in this teacher’s classroom, maybe, just maybe, the teacher has used this language while teaching the above concept. So maybe it makes sense to your daughter and not you because you weren’t there. Just maybe. Or Joe’s right, this incompetent teacher needs to be fired with the rest of them.
Actually, she was one of the best teachers that my daughter ever had. Her Spanish writing skills improved significantly under this teacher. The word “stronger” applies to the selection of words used to express an idea. The comments were on an “A” paper and aided the development of her writing style.
That criticism is useful. “First sentence needs to be stronger” is not.
It is useful if used in a context the student understands. Apparently, from what the parent has said, it helped.
So the criticism was about???