If you fully implemented Pearson’s essay-grading program, which relies on something called “Latent Semantic Analysis,” you could lose large numbers of passionate, energetic English teachers, whose jobs today consist largely of assessing student literacy. Ask any English teacher about “grading papers,” and s/he’ll probably groan. Well, that might be about to change. The “Intelligent Essay Assessor” delivers low-cost, content-specific writing assessments immediately and with a high degree of accuracy.
Good-bye, unionized professionals! Hello, budget surpluses! I wonder why this game changer isn’t getting more news coverage.
At base, the “Intelligent Essay Assessor” relies on real human essay readers who evaluate real student writing. But that is merely to gauge the metrics, train the machine, ready the digital raptor in the program.
The way it works:
- You choose a big essay test (big in terms of number of examinees), such as a final exam or a standard prompt used for proficiency assessment, like an ACT essay question. You send 200 real student essay responses to Pearson, which then subjects these samples to rubric-equipped human essay assessors.
- The human results are then translated into an algorithm in the KAT engine (which uses the aforementioned “Latent Semantic Analysis”), and the engine takes it from there at no additional cost. Millions of kids to test? No problem for KAT.
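The train-then-score loop in the two steps above can be sketched generically. What follows is a minimal illustration of Latent Semantic Analysis with a toy term-document matrix, not Pearson’s proprietary KAT engine: every word, count, score, and the nearest-neighbor scoring rule are invented for illustration.

```python
import numpy as np

# Toy term-document matrix: rows are terms, columns are human-scored
# training essays. A real deployment would build this from the ~200
# graded samples; all numbers here are made up.
X = np.array([
    [2, 0, 1],   # "photosynthesis"
    [1, 1, 0],   # "chlorophyll"
    [0, 2, 1],   # "sunlight"
    [0, 1, 2],   # "energy"
], dtype=float)
human_scores = [5, 3, 4]  # hypothetical human grades for the three training essays

# Core of LSA: a truncated singular value decomposition keeping k latent
# "semantic" dimensions, so essays using related (not identical) words
# land near each other in the reduced space.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
Uk, sk = U[:, :k], s[:k]

def to_latent(term_counts):
    """Project a raw term-count vector into the k-dimensional latent space."""
    return term_counts @ Uk / sk

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def score_new_essay(term_counts):
    """Give a new essay the human score of its most similar training essay."""
    v = to_latent(term_counts)
    sims = [cosine(v, to_latent(X[:, j])) for j in range(X.shape[1])]
    return human_scores[sims.index(max(sims))]

new_essay = np.array([2.0, 1.0, 0.0, 0.0])  # term counts for an unseen essay
predicted = score_new_essay(new_essay)       # matches training essay 1, so 5
```

Once the SVD is fitted, scoring another essay is just a projection and a few dot products, which is why the marginal cost per additional examinee is close to zero.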
Q: How is the computer trained to score student essays?
A: Partner companies wishing to offer an essay scoring service collect 100–200 student papers written in response to a given prompt. These papers are then scored by human graders and sent to Pearson, where the papers and their scores are used to train the computer to score new student essays in response to the prompt. The KAT engine learns to score the different score points based on the human-scored papers. The engine can be trained to provide holistic as well as analytic or trait scores.

Q: How does the human scoring work?

A: Human graders assess a paper’s overall quality using a specific rubric. For analytic scoring, they examine a paper for important traits. Each essay is scored by two graders for the holistic score and again by two graders for each of the analytic traits. If the two graders diverge by more than one point on any score, a third grader scores the paper to settle the discrepancy.

Q: How does the computer recognize a good essay?

A: The Intelligent Essay Assessor uses the KAT engine to assess the content of an essay, as well as more mechanical aspects of writing. When a student submits an essay for scoring, the system immediately measures the meaning of the essay. It then compares the essay to the training essays, looking for similarities, and assigns a holistic score in part by placing the essay in a category with the most similar training essays. Analytic scoring occurs in much the same way. For each trait, the system assesses that trait in the student essay, compares it to the training essays, and then categorizes the trait in question.

Q: How does IEA scoring compare to the way teachers grade writing?

A: IEA’s approach mirrors the way teachers grade essays. For example, when teachers evaluate a student’s essay, they look for characteristics that identify an essay as an A or C paper. Their expectations are likely based on their previous experience as graders and on the criteria for the assignment in question. In other words, teachers search for a match between the essay itself and the criteria for a particular grade or score. The Intelligent Essay Assessor is trained to mimic this process.

Q: How does IEA score essays with highly unusual writing styles?

A: An essay with a highly unusual writing style or construction may receive an advisory message along with a score. If an essay is off-topic, written in a language other than English, too brief, too repetitive, a written refusal to write, or otherwise incomprehensible, the student will receive an advisory and no score. These advisory messages ask the student to discuss the essay and all feedback with his or her teacher to ensure an appropriate evaluation of the writing.
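The advisory conditions that don’t require semantic analysis, “too brief” and “too repetitive,” can be approximated with simple heuristics. This is a hedged sketch only: the word-count and repetition thresholds are invented, and the off-topic and non-English checks, which would need a trained model, are omitted.

```python
def advisory_flags(essay: str, min_words: int = 50, max_repeat_ratio: float = 0.3):
    """Return advisory flags for essays that should not receive a score.

    The thresholds are invented stand-ins for two of the advisory
    conditions described above; detecting off-topic or non-English
    writing would require a trained model and is not attempted here.
    """
    words = essay.lower().split()
    flags = []
    if len(words) < min_words:
        flags.append("too brief")
    # Flag essays where a single word dominates the text.
    if words and max(words.count(w) for w in set(words)) / len(words) > max_repeat_ratio:
        flags.append("too repetitive")
    return flags

# A short submission is flagged rather than scored:
# advisory_flags("This essay is short but it does have varied vocabulary here.")
# returns ["too brief"]
```

In a real pipeline, any non-empty flag list would route the essay to the student’s teacher instead of the scorer, matching the advisory behavior the FAQ describes.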
If fully implemented in a test-mad world, the Pearson model might make the better title for the English teacher “Intelligent Essay Assessment facilitator,” if that’s all the job became.
But if the essay grading were handed to a machine, how much freer an English teacher’s energy would be to design newer, better writing lessons and to work one-on-one with struggling writers! Then the job might better be titled “Writing instructor” or “Literacy coach.” I rather like that.
Perhaps Pearson’s product would be a good thing for English teachers who persist beyond the disruptive reorganization (by economic and technological forces) that seems likely in the years to come. How much time and energy we have given to essay assessment! What a saving grace to have it lifted, freeing us for a broader, more energetic focus on literacy learning!