A Whale Acoustic Signal Recognition Method Based on MFCC-VGGish Fusion Features

王绪岩; 夏涛; 吕婧; 张学雷

doi:10.12362/j.issn.1671-6647.20250730001

Abstract: Acoustic signals are essential to communication and navigation among cetaceans. To improve the accuracy of whale acoustic signal recognition, this paper proposes a multimodal analysis method that fuses Mel-frequency cepstral coefficient (MFCC) features with VGGish deep features. From audio samples of four whale species, 13-dimensional MFCC and 128-dimensional VGGish features are extracted and fused through a dynamic weighting mechanism; the feature space is then optimized with mutual-information-based feature selection and linear discriminant analysis (LDA). With support vector machine (SVM) and random forest (RF) classifiers, tuned by hyperparameter optimization under five-fold cross-validation, the fused features reach test-set accuracies of 99.28% (SVM) and 99.17% (RF). Their average is about 3 percentage points higher than that of the single-feature baselines, recall exceeds 99% for both classifiers, and the fused features are more robust than MFCC or VGGish alone across signal-to-noise ratios. Ablation experiments further confirm that dynamic weighting, feature selection, and dimensionality reduction act synergistically. Although VGGish feature extraction raises the computational cost, the resulting gains suggest that the method has potential for practical application. The results show that complementary fusion of shallow (MFCC) and deep (VGGish) features is effective for whale acoustic signal recognition and offer a new technical approach for building high-sensitivity, non-invasive marine life monitoring systems.
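The abstract does not specify how the dynamic weighting is computed, so the following is only a minimal sketch of one plausible form of weighted fusion, not the authors' implementation. It assumes that each feature block (13-dimensional MFCC, 128-dimensional VGGish) is standardized per dimension and then scaled by a weight proportional to a per-block score, such as that block's validation accuracy. The function names `zscore_columns` and `fuse`, the example scores, and the random stand-in data are all hypothetical.

```python
import random
from statistics import fmean, pstdev


def zscore_columns(rows):
    """Standardize every column to zero mean and unit (population) variance."""
    cols = list(zip(*rows))
    means = [fmean(c) for c in cols]
    stds = [pstdev(c) or 1.0 for c in cols]  # constant columns are left unscaled
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def fuse(mfcc_rows, vggish_rows, mfcc_score, vggish_score):
    """Concatenate the two standardized blocks, each scaled by its normalized score.

    ASSUMPTION: the weights are the scores normalized to sum to 1. The paper's
    actual dynamic weighting rule is not given in the abstract.
    """
    total = mfcc_score + vggish_score
    w_mfcc, w_vgg = mfcc_score / total, vggish_score / total
    mfcc_z = zscore_columns(mfcc_rows)
    vgg_z = zscore_columns(vggish_rows)
    return [
        [w_mfcc * v for v in m] + [w_vgg * v for v in g]
        for m, g in zip(mfcc_z, vgg_z)
    ]


# Random stand-in data: 8 clips with 13-dim MFCC and 128-dim VGGish vectors.
rng = random.Random(0)
mfcc = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(8)]
vggish = [[rng.gauss(0.0, 1.0) for _ in range(128)] for _ in range(8)]

# Hypothetical per-block validation accuracies used as weighting scores.
fused = fuse(mfcc, vggish, mfcc_score=0.96, vggish_score=0.97)
print(len(fused), len(fused[0]))
```

In a full pipeline, the fused 141-dimensional vectors would then go through mutual-information feature selection and LDA before being passed to the SVM or RF classifier; those stages are omitted here.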
