Transfer Learning-based Speech Emotion Recognition: A TCA-JSL Approach for Chinese and English Datasets
Abstract
Speech emotion recognition (SER) has important application value in many scenarios and is a focus of current research. This paper develops a transfer component analysis-joint subspace learning (TCA-JSL) algorithm, based on transfer learning, for Chinese and English SER. The algorithm uses TCA to reduce the dimensionality of speech emotion features, applies the JSL algorithm to generate categorical emotion features, and recognizes the different emotion types with a support vector machine. Experiments were carried out on the Chinese speech emotion database CASIA and the English speech emotion databases eNTERFACE and SAVEE. For CASIA→eNTERFACE, the TCA-JSL algorithm achieved a P value of 0.4950 ± 0.0152 and an R value of 0.3542 ± 0.0163; for eNTERFACE→CASIA, the P value was 0.4533 ± 0.0151 and the R value was 0.3511 ± 0.0161. In both cases the difference from the joint distribution adaptive regression (JDAR) method was statistically significant (p < 0.05). For CASIA→SAVEE, the P value was 0.4521 ± 0.0176 and the R value was 0.3544 ± 0.0161; for SAVEE→CASIA, the P value was 0.4987 ± 0.0175 and the R value was 0.3511 ± 0.0158. Again, the difference from the JDAR method was statistically significant (p < 0.05). The results indicate that the TCA-JSL algorithm is effective for cross-corpus Chinese and English SER and can be applied to practical tasks.
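The abstract does not give implementation details, so the following is only a minimal sketch of the TCA dimension-reduction step in its commonly published form, assuming a linear kernel. The function name `tca`, the parameters `dim` and `mu`, and the random arrays standing in for source and target speech features are all hypothetical and are not taken from the paper; the JSL and support vector machine stages are not shown.

```python
import numpy as np

def tca(Xs, Xt, dim=2, mu=1.0):
    """Sketch of transfer component analysis with a linear kernel.

    Xs, Xt: source and target feature matrices (rows are samples).
    dim:    number of transfer components to keep (assumed parameter).
    mu:     regularization weight (assumed parameter).
    Returns the projected source and target features.
    """
    ns, nt = len(Xs), len(Xt)
    n = ns + nt
    X = np.vstack([Xs, Xt])
    K = X @ X.T  # linear kernel over all source and target samples

    # MMD coefficient matrix: +1/ns for source rows, -1/nt for target rows.
    e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])
    L = np.outer(e, e)

    # Centering matrix, used to preserve variance in the embedded space.
    H = np.eye(n) - np.full((n, n), 1.0 / n)

    # Leading eigenvectors of (K L K + mu I)^{-1} K H K give the projection.
    A = K @ L @ K + mu * np.eye(n)
    B = K @ H @ K
    vals, vecs = np.linalg.eig(np.linalg.solve(A, B))
    order = np.argsort(-vals.real)[:dim]
    W = vecs[:, order].real

    Z = K @ W  # embedded features for all samples
    return Z[:ns], Z[ns:]

# Synthetic stand-ins for source and target features (not real speech data).
rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(40, 10))
Xt = rng.normal(0.5, 1.0, size=(30, 10))
Zs, Zt = tca(Xs, Xt, dim=3)
print(Zs.shape, Zt.shape)  # (40, 3) (30, 3)
```

In a pipeline like the one described, the projected source features would be used to train a classifier that is then evaluated on the projected target features.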
DOI: https://doi.org/10.31449/inf.v49i13.7640

This work is licensed under a Creative Commons Attribution 3.0 License.