Predicting Football Player Transfer Values Using Bagging and Hybrid Machine Learning Approaches

Biao Geng

Abstract


Accurately assessing a football player's market value is essential for informed decision-making by clubs, agents, and investors during player transfers, contract negotiations, and strategic investment planning. In this context, machine learning (ML) algorithms offer a robust framework for analyzing historical data, performance indicators, and market dynamics to produce realistic valuations. These data-driven methods help identify undervalued players and flag overpriced ones, thereby improving the efficiency of transfer market operations. The dataset employed in this research includes a comprehensive set of player-related features such as age, weight, weak foot rating, preferred foot, and international reputation, among others. Together, these attributes form a detailed profile of each player's capabilities and market relevance. The objective of this study is to develop reliable and accurate predictive models that estimate player market values using advanced machine learning techniques, improving upon traditional, subjective valuation approaches. Two regression-based ensemble models were explored: Bagging Decision Tree Regression (Bg_DT) and Bagging Support Vector Regression (Bg_SVR). To further enhance model performance, two optimization algorithms, Motion-encoded Particle Swarm Optimization (Motion-encoded PSO) and the Red Deer Algorithm (RDA), were applied for hyperparameter tuning. Among the evaluated models, the Bagging Decision Tree optimized with Motion-encoded PSO (Bg_DT-Motion-encoded PSO) demonstrated superior performance. It achieved the lowest Root Mean Squared Error (RMSE) and the highest coefficient of determination (R²) across both the validation and testing phases. Specifically, the Bg_DT-Motion-encoded PSO model yielded an RMSE of 533×10⁵ and an R² of 0.962 during validation, indicating strong predictive accuracy and generalization capability.
These findings underscore the effectiveness of ensemble learning techniques, particularly Bagging Decision Trees, used in conjunction with metaheuristic optimizers such as Motion-encoded PSO, for accurately estimating football player market values.
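To make the bagging idea and the two reported metrics concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single hypothetical feature (`ages`) and synthetic target values (`values`), uses depth-1 regression trees (stumps) in place of full decision trees, and omits the Motion-encoded PSO and RDA tuning steps entirely. All function names are illustrative.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_stump(xs, ys):
    """Fit a depth-1 regression tree: choose the threshold with the lowest squared error."""
    best = None
    for thr in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < thr]
        right = [y for x, y in zip(xs, ys) if x >= thr]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left) + sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, left_mean, right_mean)
    if best is None:
        # Every x in this sample is identical, so fall back to predicting the mean.
        mean_y = sum(ys) / len(ys)
        return (0.0, mean_y, mean_y)
    return best[1:]


def predict_stump(stump, x):
    thr, left_mean, right_mean = stump
    return left_mean if x < thr else right_mean


def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement) of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_bagging(models, x):
    """Bagged prediction for regression: the average of the base learners' outputs."""
    return sum(predict_stump(m, x) for m in models) / len(models)


# Synthetic, hypothetical data: value drops sharply after age 28, plus small deterministic noise.
ages = list(range(18, 38))
values = [(10.0 if a < 28 else 2.0) + ((a * 7) % 5 - 2) * 0.2 for a in ages]

models = fit_bagging(ages, values)
predictions = [predict_bagging(models, a) for a in ages]
print("RMSE:", rmse(values, predictions))
print("R2:", r2_score(values, predictions))
```

In practice a library implementation of bagged trees would replace the hand-written stumps, and the scores would be computed on held-out validation and test splits rather than on the training data as done here.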







DOI: https://doi.org/10.31449/inf.v49i22.7715

This work is licensed under a Creative Commons Attribution 3.0 License.