Advances in the Field of Automated Essay Evaluation

Kaja Zupanc, Zoran Bosnić


Automated essay evaluation represents a practical solution to the time-consuming, labor-intensive and expensive activity of manually grading students' essays. During the last 50 years, many challenges have arisen in the field, including seeking ways to evaluate semantic content, providing automated feedback, and determining the validity and reliability of grades. In this paper we provide a comparison of 21 state-of-the-art approaches for automated essay evaluation and highlight their weaknesses and the open challenges in the field. We conclude that the field has developed to the point where the systems represent a useful complement (not a replacement) to human scoring.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.