Cross-Lingual Approaches for Text Difficulty Classification in Non-English Languages
References
Balyan, R., McCarthy, K. S., & McNamara, D. S. (2020). Applying natural language processing and hierarchical machine learning approaches to text difficulty classification. International Journal of Artificial Intelligence in Education, 30(3), 337–370. https://doi.org/10.1007/s40593-020-00201-7
Schwarm, S. E., & Ostendorf, M. (2005). Reading level assessment using support vector machines and statistical language models. Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05), 523–530. https://doi.org/10.3115/1219840.1219905
Heilman, M., Collins-Thompson, K., & Eskenazi, M. (2008). An analysis of statistical models and features for reading difficulty prediction. Proceedings of the Third Workshop on Innovative Use of NLP for Building Educational Applications, 71–79.
De Clercq, O., & Hoste, V. (2016). All mixed up? Finding the optimal feature set for general readability prediction and its application to English and Dutch. Computational Linguistics, 42(3), 457–490. https://doi.org/10.1162/COLI_a_00255
Aluisio, S., Specia, L., Gasperin, C., & Scarton, C. (2010). Readability assessment for text simplification. Proceedings of the NAACL HLT 2010 Fifth Workshop on Innovative Use of NLP for Building Educational Applications, 1–9.
Vajjala, S., & Meurers, D. (2012). On improving the accuracy of readability classification using insights from second language acquisition. Proceedings of the Seventh Workshop on Building Educational Applications Using NLP, 163–173.
Sinha, M., Dasgupta, T., & Basu, A. (2014). Text readability in Hindi: A comparative study of feature performances using support vectors. Proceedings of the 11th International Conference on Natural Language Processing, 223–231.
Madrazo Azpiazu, I., & Pera, M. S. (2020). Is cross-lingual readability assessment possible? Journal of the Association for Information Science and Technology, 71(6), 644–656. https://doi.org/10.1002/asi.24293
Weiss, Z., Chen, X., & Meurers, D. (2021). Using broad linguistic complexity modeling for cross-lingual readability assessment. Proceedings of the 10th Workshop on NLP for Computer Assisted Language Learning, 38–54.
Khallaf, N., & Sharoff, S. (2021). Automatic difficulty classification of Arabic sentences. https://doi.org/10.48550/arXiv.2103.04386
Li, W., Wang, Z., & Wu, Y. (2022). A unified neural network model for readability assessment with feature projection and length-balanced loss. https://doi.org/10.48550/arXiv.2210.10305
Mohtaj, S., Naderi, B., Möller, S., Maschhur, F., Wu, C., & Reinhard, M. (2022). A transfer learning based model for text readability assessment in German. https://doi.org/10.48550/arXiv.2207.06265
Ivanov, V. V. (2022). Sentence-level complexity in Russian: An evaluation of BERT and graph neural networks. Frontiers in Artificial Intelligence, 5, 1008411. https://doi.org/10.3389/frai.2022.1008411
Varnamkhasti, M. M. (2024). Persian readability classification using DeepWalk and tree-based ensemble methods. Natural Language Processing Journal, 9, 100116. https://doi.org/10.1016/j.nlp.2024.100116
Van Ngo, D., & Parmentier, Y. (2023). Towards sentence-level text readability assessment for French. Second Workshop on Text Simplification, Accessibility and Readability (TSAR@RANLP2023).
Pickelmann, F., Färber, M., & Jatowt, A. (2023). Ablesbarkeitsmesser: A system for assessing the readability of German text. European Conference on Information Retrieval, 288–293. https://doi.org/10.1007/978-3-031-28241-6_28
Leal, S. E., Duran, M. S., Scarton, C. E., Hartmann, N. S., & Aluísio, S. M. (2024). NILC-Metrix: Assessing the complexity of written and spoken language in Brazilian Portuguese. Language Resources and Evaluation, 58(1), 73–110. https://doi.org/10.1007/s10579-023-09693-w
Ribeiro, E., Mamede, N., & Baptista, J. (2024). Text readability assessment in European Portuguese: A comparison of classification and regression approaches. Proceedings of the 16th International Conference on Computational Processing of Portuguese, 551–557.
Naous, T., Ryan, M. J., Lavrouk, A., Chandra, M., & Xu, W. (2023). ReadMe++: Benchmarking multilingual language models for multi-domain readability assessment. https://doi.org/10.48550/arXiv.2305.14463
Naderi, B., Mohtaj, S., Ensikat, K., & Möller, S. (2019). Subjective assessment of text complexity: A dataset for German language. https://doi.org/10.48550/arXiv.1904.07733
Lu, D., Qiu, X., & Cai, Y. (2020). Sentence-level readability assessment for L2 Chinese learning. In Chinese Lexical Semantics: 20th Workshop, CLSW 2019, Revised Selected Papers (pp. 381–392). https://doi.org/10.1007/978-3-030-38189-9_40
Yang, S., Sun, R., & Wan, X. (2023). A new dataset and empirical study for sentence simplification in Chinese. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 8306–8321. https://doi.org/10.48550/arXiv.2306.04188
Mohammadi, H., & Khasteh, S. H. (2020). A machine learning approach to Persian text readability assessment using a crowdsourced dataset. 2020 28th Iranian Conference on Electrical Engineering (ICEE), 1–7. https://doi.org/10.1109/ICEE50131.2020.9260933
Martinez, A. R. (2012). Part-of-speech tagging. Wiley Interdisciplinary Reviews: Computational Statistics, 4(1), 107–113. https://doi.org/10.1002/wics.195
Hamilton, W. L. (2020). Graph representation learning. Morgan & Claypool Publishers. https://doi.org/10.2200/S01045ED1V01Y202009AIM046
Grover, A., & Leskovec, J. (2016). node2vec: Scalable feature learning for networks. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 855–864. https://doi.org/10.1145/2939672.2939754
Cunningham, P., & Delany, S. J. (2021). K-nearest neighbour classifiers—a tutorial. ACM Computing Surveys, 54(6), 1–25. https://doi.org/10.1145/3459665
Kramer, O. (2013). K-nearest neighbors. In Dimensionality Reduction with Unsupervised Nearest Neighbors (pp. 13–23). Springer. https://doi.org/10.1007/978-3-642-38652-7_2
Yao, L., Mao, C., & Luo, Y. (2019). Graph convolutional networks for text classification. Proceedings of the AAAI Conference on Artificial Intelligence, 33, 7370–7377. https://doi.org/10.1609/aaai.v33i01.33017370
DOI: https://doi.org/10.31449/inf.v46i21.8968
This work is licensed under a Creative Commons Attribution 3.0 License.








