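Resource Overview
Accurate emotion recognition from speech is important for applications such as intelligent healthcare, intelligent entertainment, and other smart services. Owing to the complexity of the Chinese language, high-accuracy emotion recognition for Chinese speech is challenging. This work explores how to improve the accuracy of speech emotion recognition, covering both speech-signal feature extraction and emotion classification methods. Five features are extracted from the speech samples: mel-frequency cepstral coefficients (MFCC), pitch, formants, short-time zero-crossing rate, and short-time energy.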
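Two of the five features, short-time energy and short-time zero-crossing rate, are simple enough to illustrate directly. The following is a minimal sketch and not the code shipped in this package: the synthetic test tone, the sampling rate, and the 25 ms / 10 ms framing are assumptions chosen only for illustration.

```matlab
% Minimal sketch: short-time energy and zero-crossing rate per frame.
% Signal, sampling rate, and frame sizes are illustrative assumptions.
fs = 8000;                          % sampling frequency (Hz), assumed
t  = (0:fs-1).' / fs;               % one second of sample times
x  = sin(2*pi*200*t);               % synthetic 200 Hz tone standing in for speech

frameLen   = round(0.025 * fs);     % 25 ms frame -> 200 samples
frameShift = round(0.010 * fs);     % 10 ms shift -> 80 samples
numFrames  = 1 + floor((length(x) - frameLen) / frameShift);

energy = zeros(numFrames, 1);
zcr    = zeros(numFrames, 1);
for k = 1:numFrames
    idx   = (k-1)*frameShift + (1:frameLen);   % sample indices of frame k
    frame = x(idx);
    energy(k) = sum(frame.^2);                 % short-time energy
    signBits  = double(frame >= 0);            % 1 for non-negative samples
    zcr(k)    = sum(abs(diff(signBits))) / (frameLen - 1);  % fraction of sign changes
end
```

In a real pipeline `x` would be a recorded utterance, and `energy` and `zcr` would be combined with the other features before classification.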
Code Snippet and File Information
function [ CC, FBE, frames ] = mfcc( speech, fs, Tw, Ts, alpha, window, R, M, N, L )
% MFCC Mel frequency cepstral coefficient feature extraction.
%
% MFCC(S,FS,TW,TS,ALPHA,WINDOW,R,M,N,L) returns mel frequency
% cepstral coefficients (MFCCs) computed from speech signal given
% in vector S and sampled at FS (Hz). The speech signal is first
% preemphasised using a first order FIR filter with preemphasis
% coefficient ALPHA. The preemphasised speech signal is subjected
% to the short-time Fourier transform analysis with frame durations
% of TW (ms), frame shifts of TS (ms) and analysis window function
% given as a function handle in WINDOW. This is followed by magnitude
% spectrum computation, followed by filterbank design with M triangular
% filters uniformly spaced on the mel scale between lower and upper
% frequency limits given in R (Hz). The filterbank is applied to
% the magnitude spectrum values to produce filterbank energies (FBEs)
% (M per frame). Log-compressed FBEs are then decorrelated using the
% discrete cosine transform to produce cepstral coefficients. Final
% step applies sinusoidal lifter to produce liftered MFCCs that
% closely match those produced by HTK [1].
%
% [CC,FBE,FRAMES]=MFCC(...) also returns FBEs and windowed frames
% with feature vectors and frames as columns.
%
% This framework is based on Dan Ellis' rastamat routines [2]. The
% emphasis is placed on closely matching MFCCs produced by HTK [1]
% (refer to p.337 of [1] for HTK's defaults), with simplicity and
% compactness as main considerations, but at a cost of reduced
% flexibility. This routine is meant to be easy to extend and as
% a starting point for work with cepstral coefficients in MATLAB.
% The triangular filterbank equations are given in [3].
%
% Inputs
% S is the input speech signal (as vector)
%
% FS is the sampling frequency (Hz)
%
% TW is the analysis frame duration (ms)
%
% TS is the analysis frame shift (ms)
%
% ALPHA is the preemphasis coefficient
%
% WINDOW is an analysis window function handle
%
% R is the frequency range (Hz) for filterbank analysis
%
% M is the number of filterbank channels
%
% N is the number of cepstral coefficients
% (including the 0th coefficient)
%
% L is the liftering parameter
%
% Outputs
% CC is a matrix of mel frequency cepstral coefficients
% (MFCCs) with feature vectors as columns
%
% FBE is a matrix of filterbank energies
% with feature vectors as columns
%
% FRAMES is a matrix of windowed frames
% (one frame per column)
%
% Example
% Tw = 25; % analysis frame duration (ms)
% Ts = 10; % analysis frame shift (ms)
% alpha = 0.97; % preemphasis coefficient
% R = [ 300 3700 ]; % frequency range to
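
The snippet above is cut off, so the rest of the routine is not shown here. As a rough illustration of two steps named in the header comment (first-order FIR preemphasis, and filter edges spaced uniformly on the mel scale), the sketch below may help. It is not the packaged mfcc.m: the mel mapping is a commonly used formula assumed here, `M = 20` and the toy input are arbitrary, and the helper names `hz2mel` and `mel2hz` are hypothetical.

```matlab
% Sketch of two steps from the header comment: preemphasis and mel-spaced edges.
alpha = 0.97;                       % preemphasis coefficient (value from the example)
R     = [300 3700];                 % frequency range in Hz (value from the example)
M     = 20;                         % number of filterbank channels, assumed here

x = [1; 1; 1; 1];                   % toy input signal, illustrative only
y = filter([1 -alpha], 1, x);       % first-order FIR: y(n) = x(n) - alpha*x(n-1)

hz2mel = @(hz)  1127 * log(1 + hz/700);       % common mel mapping, assumed
mel2hz = @(mel) 700 * (exp(mel/1127) - 1);    % inverse of the mapping above

% M triangular filters need M+2 edge frequencies, uniform on the mel scale.
edgesMel = linspace(hz2mel(R(1)), hz2mel(R(2)), M + 2);
edgesHz  = mel2hz(edgesMel);        % edges in Hz, denser at low frequencies
```

Because the edges are uniform in mel rather than in Hz, the resulting filters are narrow at low frequencies and wider at high frequencies.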
Type  Size   Date        Time   Name
----  -----  ----------  -----  ----
File   7190  2018-04-21  21:05  mfcc.m
File   4797  2018-04-21  21:19  trifbank.m
File  10707  2018-04-23  16:30  unti
File   9006  2018-04-23  16:30  unti
File   6993  2018-04-21  21:19  vec2fr
File  65587  2019-05-28  20:16  语音情感识别代码matlab.rtf