Highest Voted 'speech-recognition' Questions - Signal Processing Stack Exchange

18

votes

3 answers

Human speech noise filter

Does anyone know of a filter to attenuate non-speech? I am writing speech recognition software and would like to filter out everything but human speech. This would include background noise, noise produced by a crappy microphone, or even background…

asked Jul 30 '12 at 16:43

rurouniwallace

403
1
4
14

12

votes

1 answer

Determining how similar audio is to human speech

While looking for an answer to this problem, I found this board and decided to cross-post this question of mine from Stack Overflow. I am searching for a method of determining the similarity between an audio segment and a human voice, which is…

audio algorithms speech-recognition

asked Jun 16 '12 at 02:31

Jeff Gortmaker

223
1
5

10

votes

1 answer

How does noise reduction for speech recognition differ from noise reduction that is supposed to make speech more "intelligible" for humans?

This is a question that has interested me for some time now, mainly because I'm working on noise reduction for an existing speech recognition system myself. Most papers on noise reduction techniques seem to focus on how to make speech more…

noise speech-recognition speech-processing noise-cancellation

asked Jul 14 '17 at 14:17

marlonfl

103
5

10

votes

1 answer

Designing a feature vector for discriminating between different sonic waveforms

Consider the 4 following waveform signals: signal1 = [4.1880 11.5270 55.8612 110.6730 146.2967 145.4113 104.1815 60.1679 14.3949 -53.7558 -72.6384 -88.0250 -98.4607] signal2 = [ -39.6966 44.8127 95.0896 145.4097 144.5878 …

computer-vision frequency-spectrum autocorrelation speech-recognition

asked Jun 11 '12 at 14:42

Andy

1,647
1
16
26

9

votes

1 answer

How to segment phone call audio into silence/non silence?

My problem is that I don't know the energy of the background noise, so I can't just threshold the energy. The processing is done in real time, and I have about 500 ms to decide. Ideally, I'd want quiet consonants to be considered non-silence.

audio speech-recognition

asked Oct 26 '11 at 08:58

Michael Litvin

372
2
7

9

votes

3 answers

How does Siri recognize me saying "Hey Siri"?

I am trying to understand how my iPhone can continually listen for me saying Hey Siri, Alexa, Hey Cortana or Okay Google without quickly draining my battery. I imagined two kinds of algorithm. One that records slices of time, such as 10 ms wide…

sound speech-recognition voice

asked Mar 02 '17 at 21:08

nowox

191
5

8

votes

2 answers

What does a "vector" in a hidden Markov model mean?

I know that a Hidden Markov Model (HMM) is used in speech recognition and understand it to some degree. However, what I don't know is how the input (speech) is "transformed" into a vector which is later used in the HMM. How do you get a vector from a sound…

speech-recognition

asked Aug 16 '11 at 20:35

StupidOne

199
1
6

8

votes

1 answer

What's the correct graphical interpretation of a series of MFCC vectors?

I'm studying speech recognition, in particular the use of MFCCs for feature extraction. All examples I've found online tend to graph a series of MFCCs extracted from a particular utterance as follows (graph generated by me from the software I'm…

speech-recognition mfcc visualization feature-extraction

asked Mar 30 '17 at 23:10

jotadepicas

193
1
8

7

votes

1 answer

Distinguish vowels from consonants

I have a speech processing problem: I need to determine the phonemes and identify vowels and consonants. Has anyone worked on this? Please advise which work on the subject is worth reading.

speech-recognition

asked Mar 02 '13 at 12:31

ekruten

93
1
4

7

votes

3 answers

Why does the excitation signal appear, separated, at high quefrencies in the cepstrum?

So, I've just begun a speech and language processing course and have found the explanation of the process of getting the cepstrum of a signal and its properties a little confusing. The following is a description of my current understanding and an…

speech-recognition speech voice cepstral-analysis

asked Jan 27 '13 at 10:01

Sam

171
3

7

votes

1 answer

How does this equation correspond to smoothing?

Please help me understand smoothing of data. This is a follow-up to my previous question posted here, especially the top answer by Junuxx, where he says a way of smoothing a function $f(x)$ is: $$ f'[t] = 0.1 f[t-1] + 0.8 f[t] + 0.1 f[t+1] $$ here we…

speech-recognition smoothing speech

asked Oct 18 '12 at 00:51

user13267

501
1
5
20
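The three-point weighted average quoted in the question above can be tried out directly. The following is only a minimal sketch, not the method from the linked answer: the function name `smooth3` is made up for illustration, and leaving the first and last samples unchanged is an assumption, since the excerpt does not say how the endpoints are handled.

```python
def smooth3(f, weights=(0.1, 0.8, 0.1)):
    """Apply f'[t] = 0.1*f[t-1] + 0.8*f[t] + 0.1*f[t+1] to a list of samples.

    Assumption: the first and last samples are copied unchanged,
    because they lack a left or right neighbour.
    """
    if len(f) < 3:
        return list(f)
    w_prev, w_cur, w_next = weights
    out = list(f)  # copy so the input is not modified
    for t in range(1, len(f) - 1):
        out[t] = w_prev * f[t - 1] + w_cur * f[t] + w_next * f[t + 1]
    return out


if __name__ == "__main__":
    spike = [0.0, 0.0, 10.0, 0.0, 0.0]
    # The spike is reduced and part of it leaks into its two neighbours.
    print(smooth3(spike))
```

Because the three weights sum to 1, a constant signal passes through unchanged, while an isolated spike is attenuated and spread over its neighbours, which is the sense in which the formula smooths.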

6

votes

1 answer

Hidden Markov Model for Speech Recognition. HMM Number of States

This is a question that came to mind as a result of a previous question Hidden Markov Models - Distinct Observation Symbols and subsequent answer from @pichenettes. One approach to speech recognition is to use Hidden Markov Models (HMM) to identify…

speech-recognition

asked Feb 20 '13 at 14:48

user2718

2,176
10
10

5

votes

0 answers

Zero-padding of MFCC coefficients

I am trying to implement speech recognition using the backpropagation algorithm, and I have been following this paper. I have followed it all the way, except that it tells me to zero-pad the coefficients when there is an empty slot, because the MFCC…

matlab speech-recognition mfcc

asked Apr 24 '13 at 12:43

motiur

394
1
4
15

5

votes

2 answers

Dynamic Time Warping - Comparing Values

OK, so I'm trying to compare two different speech signals and I have run into a problem. Here goes: I have split the signal into blocks, and I have computed the MFCC coefficients of each block. I then use a DTW algorithm to compare the (inputted)…

speech-recognition mfcc

asked Feb 05 '13 at 02:48

Phorce

455
1
6
17
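The comparison described in the Dynamic Time Warping question above (one feature vector per block, then DTW between two sequences) can be sketched as follows. This is a textbook DTW with a Euclidean local cost, not the asker's code; the name `dtw_distance` and the toy one-dimensional "feature vectors" are assumptions standing in for real MFCC frames.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of equal-length feature vectors.

    Assumptions: Euclidean distance as the local cost, and the basic step
    pattern (insertion, deletion, match) with no windowing or normalisation.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


if __name__ == "__main__":
    template = [[0.0], [1.0], [2.0]]
    stretched = [[0.0], [0.0], [1.0], [2.0], [2.0]]  # same shape, spoken slower
    print(dtw_distance(template, stretched))
```

The raw accumulated cost grows with sequence length, so when comparing an input against several templates of different durations it is common to normalise it, for example by the combined length of the two sequences.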

5

votes

1 answer

What are i-vectors and x-vectors in the context of Speech Recognition?

I have read that i-vectors and x-vectors are widely used in speaker recognition tasks, but I don't get the difference between them or how exactly they work. Can someone explain it, starting from the basics and moving on to the more technical details? I came across the following…

speech-processing speech-recognition speech

asked Jun 24 '19 at 01:58

mausamsion

151
1
1
4

Questions tagged [speech-recognition]