I have a set $E_{1}$ of finite cardinality $n$ whose members are rectangular matrices containing the useful MFCC coefficients generated from $n$ speech signals. Similarly, I have a set $E_{2}$ of the same cardinality whose members are finite-dimensional vectors containing the LPC coefficients of the same speech signals that were used to form $E_{1}$. Together, $D=\{ E_{1},E_{2} \}$ forms the database for the speaker recognition system.
When a test signal is given, its MFCC $M_{i}$ and LPC $L_{i}$ are generated, and the closest members $M_{j} \in E_{1}$ for MFCC and $L_{j} \in E_{2}$ for LPC are found using a distance function $d$. The match need not be exact: $M_{i}$ and $L_{i}$ will generally not coincide with $M_{j}$ and $L_{j}$, and how far apart they are depends on the acoustic environment during the test phase.
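To make the matching step concrete, here is a minimal sketch of the nearest-member search for the LPC case, assuming $d$ is the Euclidean ($L_{2}$) distance and that all LPC vectors have the same length. The names `euclidean`, `closest_index`, `E2` and `L_test`, and the toy numbers, are made up for illustration only. The MFCC case would need a $d$ that works on matrices.

```python
import math

def euclidean(a, b):
    """L2 distance between two equal-length vectors (assumed choice of d)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_index(query, database, d=euclidean):
    """Return the index j of the database member minimising d(query, member)."""
    return min(range(len(database)), key=lambda j: d(query, database[j]))

# Hypothetical toy database E2: one LPC vector per enrolled speech signal.
E2 = [
    [1.0, -0.5, 0.2],
    [0.9, -0.1, 0.4],
    [0.2, 0.3, -0.7],
]

# Hypothetical LPC vector from a test signal; it matches no member exactly.
L_test = [0.95, -0.15, 0.35]

j = closest_index(L_test, E2)
print(j, E2[j])
```

Because `d` is passed in as a parameter, any other distance function can be swapped in without changing the search itself.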
- What distance function $d$ is used in the literature?
- If it is the $L_{2}$ norm, is there a better, more "sensitive" measure that would reduce the possibility of misclassification?