Questions tagged [speech-synthesis]
18 questions
13
votes
1 answer
Speech synthesis requiring very little CPU performance?
Back in the days of 1 MHz 8-bit CPU personal computers (Apple II, Atari 800, et al.), there were software programs that could do comprehensible arbitrary text-to-speech synthesis on those PCs. What published speech synthesis algorithms might be…
hotpaw2
- 33,409
- 7
- 40
- 88
4
votes
2 answers
Mel Cepstral Distortion
I am working on a speech synthesis model and I am looking to evaluate my synthesized speech. I found that most people use the Mel Cepstral Distortion (MCD), which can be calculated by the…
MrHat
- 81
- 1
- 7
3
votes
3 answers
How to simulate different kinds of noise in a speech signal?
What are the different kinds of noise in a speech signal? How can I simulate them in MATLAB to add to a clean speech signal?
K V Vijay Girish
- 41
- 1
- 6
2
votes
0 answers
How to use linear predictive coding to compress voice diphone samples?
I'm working on an experimental diphone / unit selection speech synthesizer for my native language, which lacks a good speech synthesizer for blind people.
The problem is that the recorded unit library can get huge (hundreds of megabytes, as seen in…
JustAMartin
- 121
- 3
2
votes
0 answers
What is actually transmitted following LPC of a speech frame?
For each frame, what's sent over to the receiver for decoding? The coefficients, pitch in some bits, voiced/unvoiced classification in another bit?
Another broad question to develop intuition about LPC and CELP for speech encoding.
By using…
panthyon
- 1,083
- 11
- 24
1
vote
1 answer
Computation of parameter filter to match a given frequency response
I'm looking for a practical way to compute the parameters ($a_1$, $a_2$) of a digital filter to match a certain frequency response. I'm studying the 12-pole filter of the vintage component SP0256, which is used in speech synthesis, using bandpass…
Robert Dawson
- 13
- 4
1
vote
0 answers
Bandwidth-enhanced sinusoidal model with oscillator banks
I am currently implementing the bandwidth-enhanced sinusoidal model for an additive synthesizer. This model allows noisy sounds to be produced accurately by adding a noise component to each sinusoid.
But I am having some trouble understanding the technical…
Onirom
- 113
- 5
1
vote
1 answer
Effect of redundant training data in HMM-based speech recognizer/synthesizer?
How are redundant training data handled during the training stage?
For example, assume we have one observation for phone $\theta$ in the training set.
Then the training (for a monophone) is done with:
$$\lambda_{max}^\theta = \text{arg}…
stock username
- 49
- 7
1
vote
1 answer
Why are the observation features of an HMM-based recognition/synthesis system modeled by a Gaussian distribution?
Why are the observation features (namely MFCCs) of an HMM-based recognition/synthesis system modeled by a Gaussian distribution?
Even the state duration is modeled by a Gaussian in this paper:
K. Tokuda et al., "Speech synthesis based on hidden…
stock username
- 49
- 7
1
vote
1 answer
Removing vocoder effect from audio file
There are many tutorials and guides explaining how to create a "robotic" voice by using a vocoder or a mix of filters that change pitch, tempo and maybe add a slight echoed delay. However, is it possible to do the opposite? How would you take a…
Cerin
- 588
- 5
- 9
1
vote
1 answer
When framing audio samples, why is a frame shift needed in addition to the frame size?
When framing audio samples for audio feature extraction, why is a frame shift needed in addition to the frame size?
E.g. frame size = 20 ms,
frame shift = 10 ms.
Rather than shifting/overlapping, why can't we use continuous frames then of…
Surendra
- 25
- 1
- 4
1
vote
2 answers
PSOLA: how to calculate synthesized marks
I have successfully extracted all pitch markers from waveforms. I used autocorrelation to find my pitch periods. The markers can be observed in the figure below:
I want to change my signal (pitch or speed) using PSOLA. I need to find the synthesized…
user2721828
- 43
- 5
0
votes
1 answer
Sample text to collect all possible English biphones for Text-To-Speech
Since the list of phonemes in English is fixed, it should be possible to come up with a sample text(s) to collect all possible biphones for text-to-speech synthesis.
Does anyone have a sample text for this purpose?
alvas
- 51
- 5
0
votes
1 answer
How does a digital piano work?
I know it somehow combines different harmonics to synthesize a note, but I don't know exactly how this happens. Does a digital piano use PWM to synthesize a note?
Ardawan
- 1
0
votes
2 answers
Conceptually confused by LPC for speech: Do we synthesize by the inverse filter (FIR)?
I've been following the process of using LPC to analyse the speech and then synthesize the speech by swapping out coefficients over time. I'm purely concerned with pitched vowels at this point.
lpc(speech_segment) gives us $a_{p}$ coefficients of an…
Aditya TB
- 99
- 7