A best measure for speaker recognition

Asked Jun 06 '11 at 09:51

Active Mar 02 '18 at 09:19


I have a set $E_{1}$ of finite cardinality $n$, consisting of rectangular matrices that contain the useful MFCC coefficients generated from $n$ speech signals. Similarly, I have a set $E_{2}$ of the same cardinality as $E_{1}$: a collection of finite-dimensional vectors containing the LPC coefficients of the same speech signals that were used to form $E_{1}$. Together, $D=\{ E_{1},E_{2} \}$ forms the database for the speaker recognition system.

When a test signal is given, its MFCC matrix $M_{i}$ and LPC vector $L_{i}$ are generated, and the closest members $M_{j} \in E_{1}$ (for MFCC) and $L_{j} \in E_{2}$ (for LPC) are found using a distance function $d$. The test features $M_{i}$ and $L_{i}$ need not coincide exactly with any member of $E_{1}$ and $E_{2}$, respectively; how close they are depends on the acoustic environment during the test phase.

What distance function is used in the literature?
If it is the $L_{2}$ norm, is there a better measure that is more "sensitive", so that I can reduce the possibility of misclassification?
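To make the lookup concrete, here is a minimal sketch of the nearest-member search for the LPC case, assuming $d$ is the plain $L_{2}$ (Euclidean) distance and that all LPC vectors have the same length. The names `l2_distance` and `nearest_member` and the numbers in `E2` are made up for illustration; they are not taken from any particular toolkit.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length coefficient vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_member(query, database, d=l2_distance):
    """Return (j, d(query, database[j])) for the closest database entry."""
    best_j, best_dist = -1, math.inf
    for j, entry in enumerate(database):
        dist = d(query, entry)
        if dist < best_dist:
            best_j, best_dist = j, dist
    return best_j, best_dist

# Hypothetical LPC vectors standing in for E_2 (illustrative values only).
E2 = [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]]
L_i = [0.1, 0.9]  # hypothetical LPC vector of the test signal

j, dist = nearest_member(L_i, E2)
print(j, dist)
```

Because `d` is passed in as a parameter, any other measure can be swapped in without changing the search itself.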

edited Jun 14 '13 at 14:23

jonsca


asked Jun 06 '11 at 09:51

Dinesh

Usually, distances such as the log-spectral distance are used. My suggestion, however, is to use a DTW (dynamic time warping) type of algorithm for this scenario. – talk2speech Mar 02 '18 at 09:19
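For reference, a minimal sketch of the textbook DTW recurrence applied to two MFCC sequences. It assumes each MFCC matrix is stored as a list of frames (one coefficient vector per frame) and uses the Euclidean distance between frames as the local cost; both choices, and the names `frame_distance` and `dtw_distance`, are illustrative assumptions. DTW is attractive here because utterances of different durations give matrices with different numbers of frames, where an element-wise $L_{2}$ norm is not directly defined.

```python
import math

def frame_distance(u, v):
    """Euclidean distance between two frames (coefficient vectors)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def dtw_distance(A, B, d=frame_distance):
    """Accumulated DTW cost between frame sequences A and B.

    A and B are lists of frames and may differ in length.
    """
    n, m = len(A), len(B)
    if n == 0 or m == 0:
        raise ValueError("both sequences must contain at least one frame")
    # cost[i][j] = cheapest alignment cost of A[:i] with B[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = d(A[i - 1], B[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # A advances, B frame is repeated
                cost[i][j - 1],      # B advances, A frame is repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# Toy one-coefficient "MFCC" sequences (illustrative values only).
A = [[0.0], [1.0], [2.0]]
B = [[0.0], [1.0], [1.0], [2.0]]  # same shape as A, one frame stretched
C = [[0.0], [3.0]]

print(dtw_distance(A, B), dtw_distance(A, C))
```

This is the unconstrained O(nm) version; practical systems often add a warping-window constraint and normalise the cost by path length, which are left out here.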


0 Answers