Efficient Transformer Based Sentiment Classification Models

Leeja Mathew, Bindu V R


Recently, transformer models have gained significance as a state-of-the-art technique for text-based sentiment prediction. The attention mechanism of the transformer model speeds up training by allowing dependencies to be modelled regardless of their distance in the input or output sequences. There are two types of transformer models: transformer base models and transformer large models. Since large transformer models need better hardware and more training time, in this work we propose new, simpler models, or weak learners, with lower training time for sentiment classification. These models train faster without compromising classification accuracy. The proposed Efficient Transformer-based Sentiment Classification (ETSC) models are built by reducing the configuration of large models to a minimum, shuffling the dataset randomly, and experimenting with various percentages of training data. Early stopping and a smaller batch size during training improve the accuracy of the proposed models. The proposed models exhibit promising performance in comparison with existing transformer-based sentiment classification models in terms of speed and accuracy.
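
The data preparation and stopping rule mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `shuffle_and_subsample`, the class `EarlyStopper`, and the parameters `train_fraction`, `seed`, `patience` and `min_delta` are hypothetical names chosen for illustration, and the abstract does not state the actual fractions or stopping criteria used.

```python
import random


def shuffle_and_subsample(rows, train_fraction, seed=0):
    """Shuffle labelled rows and split off the first `train_fraction` for training.

    Assumption: `rows` is any sequence of (text, label) pairs.
    """
    if not 0.0 < train_fraction <= 1.0:
        raise ValueError("train_fraction must be in (0, 1]")
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible experiments
    cut = max(1, round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


class EarlyStopper:
    """Signal a stop once validation loss fails to improve `patience` times in a row."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_checks = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            self.best = val_loss           # improvement: reset the counter
            self.bad_checks = 0
        else:
            self.bad_checks += 1           # no improvement at this check
        return self.bad_checks >= self.patience
```

In a real experiment the training subset would be passed to a transformer fine-tuning loop, with `should_stop` called after each evaluation step; repeating the run for several values of `train_fraction` corresponds to experimenting with various percentages of training data.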

DOI: https://doi.org/10.31449/inf.v46i8.4332

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.