Improving Big Data Recommendation System Performance using NLP techniques with multi attributes

Hoger K. Omar; Mondher Frikha; Alaa Khalil Jumaa

doi:10.31449/inf.v48i5.5255

Abstract

Due to the wide availability of big data, institutions and companies are concentrating on developing highly effective recommender systems for their users. Traditional recommender systems use standard information such as users, items, and ratings. However, this data may not be sufficient for precise results. Including additional information, such as textual data, can improve accuracy. When dealing with large volumes of text, Natural Language Processing (NLP) techniques are essential for effective analysis. This paper therefore proposes a novel big data recommender system that enhances collaborative filtering (CF) results by leveraging NLP techniques and handling multiple attributes. The study constructs two big data recommendation models, both using the Alternating Least Squares (ALS) machine learning algorithm within Apache Spark. The first model does not incorporate NLP techniques, while the second applies the proposed NLP techniques to users' review comments. A dataset of more than 3 million ratings and reviews, 3.1 GB in size, was gathered from the Amazon website. The results demonstrate a significant improvement after incorporating the suggested NLP-based techniques with multiple attributes.
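To make the ALS step named in the abstract concrete, the following is a minimal, standard-library-only sketch of alternating least squares on (user, item, rating) triples. It is not the authors' code and not Spark's implementation: the function names, the toy ratings, the rank, and the regularization value are all illustrative assumptions, and the abstract does not say how review text is turned into model input, so the sketch takes numeric ratings as given.

```python
import random

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def update(fixed, observed_by_key, rank, reg):
    """Recompute one side's factors while the other side is held fixed.

    For each key this solves (sum v v^T + reg * I) x = sum r v,
    the regularized least-squares problem over that key's ratings.
    """
    out = {}
    for key, observed in observed_by_key.items():
        a = [[reg if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for other, r in observed:
            v = fixed[other]
            for i in range(rank):
                b[i] += r * v[i]
                for j in range(rank):
                    a[i][j] += v[i] * v[j]
        out[key] = solve(a, b)
    return out

def als(ratings, rank=2, reg=0.1, iters=20, seed=0):
    """Alternate between solving for user factors and item factors."""
    rng = random.Random(seed)
    by_user, by_item = {}, {}
    for u, i, r in ratings:
        by_user.setdefault(u, []).append((i, r))
        by_item.setdefault(i, []).append((u, r))
    items = {i: [rng.random() for _ in range(rank)] for i in by_item}
    users = {}
    for _ in range(iters):
        users = update(items, by_user, rank, reg)
        items = update(users, by_item, rank, reg)
    return users, items

def predict(users, items, u, i):
    """Predicted rating is the dot product of the two factor vectors."""
    return sum(x * y for x, y in zip(users[u], items[i]))

def loss(users, items, ratings, reg):
    """Regularized squared error that each ALS half-step minimizes."""
    err = sum((predict(users, items, u, i) - r) ** 2 for u, i, r in ratings)
    vecs = list(users.values()) + list(items.values())
    return err + reg * sum(x * x for vec in vecs for x in vec)

def rmse(users, items, ratings):
    """Root mean squared error over the given ratings."""
    sq = sum((predict(users, items, u, i) - r) ** 2 for u, i, r in ratings)
    return (sq / len(ratings)) ** 0.5

# Hypothetical toy data: (user id, item id, star rating).
ratings = [
    ("u1", "i1", 5.0), ("u1", "i2", 4.0), ("u2", "i1", 4.0), ("u2", "i3", 1.0),
    ("u3", "i2", 5.0), ("u3", "i3", 2.0), ("u4", "i1", 1.0), ("u4", "i3", 5.0),
]
users, items = als(ratings)
print("training RMSE:", rmse(users, items, ratings))
```

Because each half-step solves its least-squares problem exactly, the regularized loss cannot increase from one iteration to the next. At the scale described in the abstract, a distributed implementation such as the one in Apache Spark would be used instead of a single-process loop like this.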

Published

02/26/2024

How to Cite

Omar, H. K., Frikha, M., & Jumaa, A. K. (2024). Improving Big Data Recommendation System Performance using NLP techniques with multi attributes. Informatica, 48(5). https://doi.org/10.31449/inf.v48i5.5255

Issue

Vol. 48 No. 5 (2024): Online-only issue

Section

Online-only

License

Authors retain copyright in their work. By submitting to and publishing with Informatica, authors grant the publisher (Slovene Society Informatika) the non-exclusive right to publish, reproduce, and distribute the article and to identify itself as the original publisher.

All articles are published under the Creative Commons Attribution license CC BY 3.0. Under this license, others may share and adapt the work for any purpose, provided appropriate credit is given and changes (if any) are indicated.

Authors may deposit and share the submitted version, accepted manuscript, and published version, provided the original publication in Informatica is properly cited.
