MATLAB code for speaker gender recognition based on MFCC and SVM (includes 16 speech files)

Size: 10.46 MB

File type: .zip

Published: 2023-10-06
Language: MATLAB (listed under the site's C/C++ category)
Tags: mfcc, svm, gender recognition

Resource description

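This project builds a small speech corpus (8 male and 8 female speakers), implements an mfcc function to extract MFCC features from each recording, then trains and tests an SVM on those features to recognize the speaker's gender, and provides a GUI to demonstrate the functionality. The reported accuracy is 93.75%. The code is still fairly simple and remains to be improved.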
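The train-and-test step described above can be sketched roughly as follows. This is not the archive's own training code, which is not shown on this page. It assumes MATLAB's Statistics and Machine Learning Toolbox (`fitcsvm`, `crossval`, `kfoldLoss`), uses synthetic vectors in place of real per-file MFCC features, and picks a feature length of 13 and leave-one-out evaluation purely for illustration. How the reported 93.75% was actually measured is not stated; the last line only shows that the figure matches 15 correct decisions out of 16.

```matlab
% Sketch only: synthetic stand-ins for per-file MFCC feature vectors.
rng(0);                                   % reproducible random numbers
numPerClass = 8;                          % 8 male + 8 female, as in the description
numCoeffs   = 13;                         % assumed feature length (hypothetical)

% One row per recording. In a real run each row would be derived from the
% mfcc() output for one .wav file, for example mean(CC, 2).' (an assumption).
maleFeats   = randn(numPerClass, numCoeffs) - 2;
femaleFeats = randn(numPerClass, numCoeffs) + 2;
X = [maleFeats; femaleFeats];
y = [zeros(numPerClass, 1); ones(numPerClass, 1)];   % 0 = male, 1 = female

% Linear SVM evaluated with leave-one-out cross-validation.
mdl = fitcsvm(X, y, 'KernelFunction', 'linear', 'Standardize', true);
cv  = crossval(mdl, 'Leaveout', 'on');
acc = 1 - kfoldLoss(cv);                  % fraction of held-out rows classified correctly

fprintf('Leave-one-out accuracy on synthetic data: %.2f%%\n', 100 * acc);
fprintf('15 of 16 correct corresponds to %.2f%%\n', 100 * 15 / 16);
```

With only 16 recordings, each single misclassification changes the accuracy by 6.25 percentage points, so a figure like this is best read as a rough indication.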

Code snippet and file information

function [ CC, FBE, frames ] = mfcc( speech, fs, Tw, Ts, alpha, window, R, M, N, L )
% MFCC Mel frequency cepstral coefficient feature extraction.
%
%   MFCC(S,FS,TW,TS,ALPHA,WINDOW,R,M,N,L) returns mel frequency
%   cepstral coefficients (MFCCs) computed from speech signal given
%   in vector S and sampled at FS (Hz). The speech signal is first
%   preemphasised using a first order FIR filter with preemphasis
%   coefficient ALPHA. The preemphasised speech signal is subjected
%   to the short-time Fourier transform analysis with frame durations
%   of TW (ms), frame shifts of TS (ms) and analysis window function
%   given as a function handle in WINDOW. This is followed by magnitude
%   spectrum computation, followed by filterbank design with M triangular
%   filters uniformly spaced on the mel scale between lower and upper
%   frequency limits given in R (Hz). The filterbank is applied to
%   the magnitude spectrum values to produce filterbank energies (FBEs)
%   (M per frame). Log-compressed FBEs are then decorrelated using the
%   discrete cosine transform to produce cepstral coefficients. Final
%   step applies sinusoidal lifter to produce liftered MFCCs that
%   closely match those produced by HTK [1].
%
%   [CC,FBE,FRAMES]=MFCC(...) also returns FBEs and windowed frames,
%   with feature vectors and frames as columns.
%
%   This framework is based on Dan Ellis' rastamat routines [2]. The
%   emphasis is placed on closely matching MFCCs produced by HTK [1]
%   (refer to p.337 of [1] for HTK's defaults) with simplicity and
%   compactness as main considerations, but at a cost of reduced
%   flexibility. This routine is meant to be easy to extend and as
%   a starting point for work with cepstral coefficients in MATLAB.
%   The triangular filterbank equations are given in [3].
%
%   Inputs
%           S is the input speech signal (as vector)
%
%           FS is the sampling frequency (Hz)
%
%           TW is the analysis frame duration (ms)
%
%           TS is the analysis frame shift (ms)
%
%           ALPHA is the preemphasis coefficient
%
%           WINDOW is an analysis window function handle
%
%           R is the frequency range (Hz) for filterbank analysis
%
%           M is the number of filterbank channels
%
%           N is the number of cepstral coefficients
%             (including the 0th coefficient)
%
%           L is the liftering parameter
%
%   Outputs
%           CC is a matrix of mel frequency cepstral coefficients
%              (MFCCs) with feature vectors as columns
%
%           FBE is a matrix of filterbank energies
%               with feature vectors as columns
%
%           FRAMES is a matrix of windowed frames
%                  (one frame per column)
%
%   Example
%           Tw = 25;           % analysis frame duration (ms)
%           Ts = 10;           % analysis frame shift (ms)
%           alpha = 0.97;      % preemphasis coefficient
%           R = [ 300 3700 ];  % frequency range to
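
A few of the building blocks named in the docstring (frame sizes in samples, first-order preemphasis, mel warping, and the sinusoidal lifter) can be illustrated in isolation. The sketch below is not taken from the archive's mfcc.m, which is truncated here. The sampling rate `fs`, the values chosen for `N` and `L`, and the names `hz2mel`, `mel2hz` and `lifter` are assumptions for illustration, and the mel and lifter formulas are the commonly used HTK-style forms.

```matlab
% Minimal sketch of individual MFCC building blocks (illustrative values).
fs    = 16000;                 % assumed sampling rate (Hz)
Tw    = 25;                    % analysis frame duration (ms)
Ts    = 10;                    % analysis frame shift (ms)
alpha = 0.97;                  % preemphasis coefficient
N     = 13;                    % assumed number of cepstral coefficients
L     = 22;                    % assumed liftering parameter

% Frame duration and frame shift converted from milliseconds to samples.
Nw = round(1e-3 * Tw * fs);
Ns = round(1e-3 * Ts * fs);

% First-order FIR preemphasis: y(n) = x(n) - alpha * x(n-1).
x = [1; 2; 3; 4];
y = filter([1 -alpha], 1, x);

% Mel warping and its inverse (a commonly used form).
hz2mel = @(hz)  1127 * log(1 + hz / 700);
mel2hz = @(mel) 700 * (exp(mel / 1127) - 1);

% Sinusoidal lifter weights for coefficients 0..N-1 (HTK-style form).
lifter = 1 + 0.5 * L * sin(pi * (0:N-1) / L);

fprintf('Frame length: %d samples, frame shift: %d samples\n', Nw, Ns);
```

The lifter weight for the 0th coefficient is exactly 1, so liftering in this form leaves that coefficient unchanged and scales the higher-order ones.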

  Attribute      Size  Date       Time   Name
----------- ---------  ---------- -----  ----
        Dir         0  2018-04-27 21:41  MFCC_2\
       File      7190  2018-04-21 21:05  MFCC_2\mfcc.m
       File      4797  2018-04-21 21:19  MFCC_2\trifbank.m
       File     10707  2018-04-23 16:30  MFCC_2\untitled.fig
       File      9006  2018-04-23 16:30  MFCC_2\untitled.m
        Dir         0  2018-04-27 21:41  MFCC_2\Validation_test_set\
       File    914076  2018-04-21 22:19  MFCC_2\Validation_test_set\f1.wav
       File    983706  2018-04-21 22:23  MFCC_2\Validation_test_set\f2.wav
       File    998454  2018-04-21 22:25  MFCC_2\Validation_test_set\f3.wav
       File    919192  2018-04-21 22:27  MFCC_2\Validation_test_set\f4.wav
       File   1036958  2018-04-21 22:29  MFCC_2\Validation_test_set\f5.wav
       File   1190964  2018-04-21 22:31  MFCC_2\Validation_test_set\f6.wav
       File   1052724  2018-04-21 22:34  MFCC_2\Validation_test_set\f7.wav
       File    980224  2018-04-21 22:36  MFCC_2\Validation_test_set\f8.wav
       File    887246  2018-04-21 22:21  MFCC_2\Validation_test_set\m1.wav
       File    790168  2018-04-21 22:24  MFCC_2\Validation_test_set\m2.wav
       File   1046376  2018-04-21 22:26  MFCC_2\Validation_test_set\m3.wav
       File    929228  2018-04-21 22:28  MFCC_2\Validation_test_set\m4.wav
       File   1033064  2018-04-21 22:30  MFCC_2\Validation_test_set\m5.wav
       File   1061118  2018-04-21 22:33  MFCC_2\Validation_test_set\m6.wav
       File    952576  2018-04-21 22:35  MFCC_2\Validation_test_set\m7.wav
       File    907726  2018-04-21 22:36  MFCC_2\Validation_test_set\m8.wav
       File      6993  2018-04-21 21:19  MFCC_2\vec2frames.m
