Automated Financial Statement Auditing via YOLOv5s Object Detection and NLP-Based Semantic Analysis
Abstract
Driven by globalization and digitalization, financial statements have grown rapidly in both volume and complexity, exposing the limitations of traditional auditing methods in efficiency and accuracy. Studies that combine object detection with text analysis for financial auditing remain scarce; this paper explores that combination and proposes an intelligent financial statement audit system. The system integrates YOLOv5s-based financial image recognition with natural language processing to recognize and interpret financial information quickly and accurately. Specifically, the framework couples computer vision with NLP for financial report analysis: a YOLOv5s detector fine-tuned on a domain-specific dataset of 15,000 annotated financial statement images reaches 96.4% detection accuracy when parsing complex tabular structures, while a hybrid NLP architecture, BERT for semantic role labeling combined with a BiLSTM with attention mechanisms, extracts financial indicators and risk factors, trained on a corpus of 50,000 financial reports with an 85/15 train-test split. Experimental results show that the system achieves 98% recognition accuracy on large-scale financial statement data, 15% higher than traditional methods, and runs 3 times faster, significantly shortening the audit cycle and reducing audit costs. The system also detects anomalous data automatically, helping auditors quickly locate potential financial risks and providing strong support for decision-making.
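To make the detection stage concrete, below is a minimal sketch of running a fine-tuned YOLOv5s model on a scanned statement page through the Ultralytics torch.hub entry point. The checkpoint path weights/finstat_yolov5s.pt, the image file name, and the confidence threshold are illustrative assumptions; the paper's actual training configuration and class definitions are not reproduced here.

# Illustrative sketch: load a YOLOv5s model fine-tuned on financial
# statement images and detect table/figure regions on one page.
# Checkpoint path and class semantics are hypothetical placeholders.
import torch

# The 'custom' hub variant accepts a locally trained weights file.
model = torch.hub.load('ultralytics/yolov5', 'custom',
                       path='weights/finstat_yolov5s.pt')  # hypothetical checkpoint
model.conf = 0.5  # confidence threshold for reported detections (assumed)

# Run inference on one scanned statement page.
results = model('statement_page.png')

# Each detection row: x1, y1, x2, y2, confidence, class index.
for *box, conf, cls in results.xyxy[0].tolist():
    print(f'class={int(cls)} conf={conf:.2f} box={[round(v, 1) for v in box]}')

The detected regions would then be cropped and passed to the text pipeline; that hand-off is implied by the abstract but not specified.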
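For the text stage, the following is a minimal sketch of the hybrid model the abstract describes, assuming a PyTorch and Hugging Face implementation: BERT encodes the report text, a BiLSTM with additive attention pools the token states, and a linear head scores risk labels. The pretrained model name, hidden size, label count, and pooling scheme are our assumptions, not details from the paper.

# Sketch of a BERT + BiLSTM-with-attention classifier for financial
# risk labeling. Sizes and the base model are assumed for illustration.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertBiLSTMAttention(nn.Module):
    def __init__(self, bert_name='bert-base-uncased', hidden=256, num_labels=2):
        super().__init__()
        self.bert = AutoModel.from_pretrained(bert_name)
        self.lstm = nn.LSTM(self.bert.config.hidden_size, hidden,
                            batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)   # additive attention scorer
        self.head = nn.Linear(2 * hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        # Contextual token states from BERT: (batch, seq_len, 768).
        states = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        out, _ = self.lstm(states)                        # (B, T, 2H)
        scores = self.attn(out).squeeze(-1)               # (B, T)
        scores = scores.masked_fill(attention_mask == 0, -1e9)  # ignore padding
        weights = torch.softmax(scores, dim=-1)           # attention over tokens
        pooled = (weights.unsqueeze(-1) * out).sum(dim=1) # weighted sum pooling
        return self.head(pooled)

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
batch = tokenizer(['Accounts receivable grew 240% year over year.'],
                  return_tensors='pt', padding=True, truncation=True)
model = BertBiLSTMAttention()
logits = model(batch['input_ids'], batch['attention_mask'])  # risk-label scores

The attention weights give a per-token salience signal, which is one plausible way such a system could point auditors at the phrases driving an anomaly flag.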
Full Text: PDF
DOI: https://doi.org/10.31449/inf.v49i11.8999
This work is licensed under a Creative Commons Attribution 3.0 License.