Improving Big Data Recommendation System Performance using NLP techniques with multi attributes
Abstract
With big data now widely available, institutions and companies are concentrating on building highly effective recommender systems for their users. Traditional recommender systems rely on standard information such as users, items, and ratings, but this data may not be sufficient for precise results. Accuracy can be improved by adding further information, such as textual data, to the recommendation system, and analyzing large volumes of text effectively requires Natural Language Processing (NLP) techniques. This paper therefore proposes a novel big data recommender system that enhances collaborative filtering (CF) results by leveraging NLP techniques and handling multiple attributes. The study builds two big data recommendation models, both using the Alternating Least Squares (ALS) machine learning algorithm in Apache Spark. The first model does not incorporate NLP techniques, while the second applies the proposed NLP techniques to users' review comments. The dataset, gathered from the Amazon website, contains more than 3 million ratings and reviews and is 3.1 GB in size. The results show a significant improvement after incorporating the proposed NLP-based techniques with multiple attributes.
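The contrast between the two models can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not say how review text enters the model, so blending a sentiment score into the star rating is an assumption made here for illustration only. The word lists, the `weight` parameter, and the function names `review_sentiment`, `adjusted_rating`, and `als_rank1` are all hypothetical, and the rank-1, pure-Python ALS loop stands in for Spark's distributed ALS.

```python
import re

# Hypothetical toy lexicon (assumption); a real system would use a trained NLP model.
POSITIVE = {"great", "excellent", "love", "good", "perfect"}
NEGATIVE = {"bad", "poor", "broken", "terrible", "disappointing"}


def review_sentiment(text):
    """Return a crude sentiment score in [-1, 1] based on lexicon word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def adjusted_rating(stars, text, weight=0.5):
    """Shift a star rating by the review sentiment, clipped to [1, 5].

    This blending rule is an assumption, not the method described in the paper.
    """
    return min(5.0, max(1.0, stars + weight * review_sentiment(text)))


def als_rank1(ratings, n_iters=20, reg=0.1):
    """Rank-1 alternating least squares on a dict {(user, item): rating}."""
    users = sorted({u for u, _ in ratings})
    items = sorted({i for _, i in ratings})
    p = {u: 1.0 for u in users}  # one latent factor per user
    q = {i: 1.0 for i in items}  # one latent factor per item
    for _ in range(n_iters):
        # Hold item factors fixed; each user factor has a closed-form ridge solution.
        for u in users:
            num = sum(r * q[i] for (uu, i), r in ratings.items() if uu == u)
            den = reg + sum(q[i] ** 2 for (uu, i) in ratings if uu == u)
            p[u] = num / den
        # Hold user factors fixed; solve each item factor the same way.
        for i in items:
            num = sum(r * p[u] for (u, ii), r in ratings.items() if ii == i)
            den = reg + sum(p[u] ** 2 for (u, ii) in ratings if ii == i)
            q[i] = num / den
    return p, q
```

Under these assumptions, the first model would train on the raw star ratings, while the second would train on ratings passed through `adjusted_rating` together with each review's text. The predicted score for a user and item pair is the product `p[user] * q[item]`.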
DOI: https://doi.org/10.31449/inf.v48i5.5255
This work is licensed under a Creative Commons Attribution 3.0 License.