Comparing a clean, noisy and enhanced signal

Question

I was doing some experiments analyzing a few audio samples and I'm stuck on this. Suppose we have a clean audio signal ($y_{clean}$), a noisy version of it ($y_{noisy}$), and an enhanced version ($y_{enhanced}$) obtained by applying some speech enhancement algorithm to $y_{noisy}$.

What would be the right way to quantify the relative increase/decrease in the quality of $y_{enhanced}$ (compared to $y_{clean}$) in, say, particular sections of the audio? I'm not interested in SNR (or is it wrong to ignore it?). I was thinking of something like computing the spectrogram of both signals and then the distance between them, say using the $L_\infty$ or $L_2$ norm. I don't know if that's even the correct way to do this. Any help would be much appreciated!

(PS: I'm from a CS background, so I don't have much knowledge of signal processing. Apologies if this question seems too basic.)

Thanks.
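The spectrogram-distance idea from the question can be sketched roughly as follows. This is only an illustration under several assumptions that are not in the original post: NumPy is used, the spectrograms are log-magnitude STFTs with a Hann window, the helper names `magnitude_spectrogram` and `framewise_distance` are made up for this example, and the three signals are synthetic stand-ins rather than real speech. The distance is computed per frame, so it can be inspected for particular sections of the audio.

```python
import numpy as np

def magnitude_spectrogram(y, frame_len=512, hop=256):
    """Magnitude STFT with a Hann window; rows are frames, columns are frequency bins."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(y) - frame_len) // hop
    frames = np.stack([y[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def framewise_distance(y_ref, y_test, order=2, **stft_args):
    """Per-frame distance between log-magnitude spectrograms (order=2 or np.inf)."""
    eps = 1e-10  # avoids log(0) in silent bins
    s_ref = np.log(magnitude_spectrogram(y_ref, **stft_args) + eps)
    s_test = np.log(magnitude_spectrogram(y_test, **stft_args) + eps)
    return np.linalg.norm(s_ref - s_test, ord=order, axis=1)

# Synthetic stand-ins for the three signals (assumptions, not real data).
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
y_clean = np.sin(2 * np.pi * 440 * t)
noise = 0.3 * rng.standard_normal(fs)
y_noisy = y_clean + noise
y_enhanced = y_clean + 0.1 * noise  # pretend most of the noise was removed

d_noisy = framewise_distance(y_clean, y_noisy)        # L2 distance per frame
d_enhanced = framewise_distance(y_clean, y_enhanced)
improvement = d_noisy - d_enhanced  # positive where enhanced is closer to clean
```

Passing `order=np.inf` gives the $L_\infty$ variant. Whether such a distance tracks perceived quality is a separate question; the log compression and frame sizes here are arbitrary choices.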

applesoup · Answer 1 · 2021-04-10T15:53:16.467

The answer to your question depends on how quality is defined in your specific application. Since you're working with audio signals, it is often beneficial to use the perceived quality to assess the effectiveness of an audio enhancement algorithm.

How to Determine the Perceived Quality

Methods to determine the perceived quality of an audio signal can be categorized into

subjective and
objective

methods. Subjective methods rely on human listeners who rate the audio signal according to defined criteria. Objective methods use algorithms that estimate the perceived quality from some measure, often one that models parts of the human auditory system.

Listening Tests

Usually, the most accurate way to determine the perceived quality is to perform a listening test and ask human listeners to rate the quality of different signals. Listening test methods include, e.g., MUltiple Stimuli with Hidden Reference and Anchor (MUSHRA) and pairwise comparison.

Objective Measures

A simple objective measure is the signal-to-noise ratio (SNR). In many cases, though, the SNR correlates only loosely with the actual perceived quality. Therefore, more involved objective measures have been created. Well-known ones are

the Perceptual Evaluation of Speech Quality (PESQ, for speech signals) and
the Perceptual Evaluation of Audio Quality (PEAQ, for audio/music signals)

algorithms.

score 0 · Accepted Answer · answered Apr 03 '21 at 10:46


The best quantitative (using signal measurements) way to compare an information-carrying signal contaminated with noise and an information-carrying signal contaminated with less noise is to compare the SNRs of the two signals.
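When the clean reference is available, as in the question, one possible way to do this comparison is sketched below. It assumes NumPy, time-aligned signals of equal length, and that everything in a signal other than $y_{clean}$ counts as noise; the function names and the synthetic test signals are hypothetical. The segment-wise variant addresses the "particular sections" part of the question.

```python
import numpy as np

def snr_db(reference, estimate):
    """SNR in dB, treating (estimate - reference) as the noise."""
    eps = 1e-12  # guards against division by zero and log(0)
    residual = estimate - reference
    return 10 * np.log10((np.sum(reference**2) + eps) /
                         (np.sum(residual**2) + eps))

def segmental_snr_db(reference, estimate, seg_len=320):
    """SNR in dB for each non-overlapping segment of seg_len samples."""
    n_seg = len(reference) // seg_len
    return np.array([
        snr_db(reference[i * seg_len:(i + 1) * seg_len],
               estimate[i * seg_len:(i + 1) * seg_len])
        for i in range(n_seg)
    ])

# Synthetic stand-ins for the three signals (assumptions, not real data).
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
y_clean = np.sin(2 * np.pi * 440 * t)
noise = 0.3 * rng.standard_normal(fs)
y_noisy = y_clean + noise
y_enhanced = y_clean + noise / 3  # pretend the noise amplitude was cut to a third

# Overall and per-segment SNR improvement of the enhanced signal.
snr_gain = snr_db(y_clean, y_enhanced) - snr_db(y_clean, y_noisy)
seg_gain = (segmental_snr_db(y_clean, y_enhanced)
            - segmental_snr_db(y_clean, y_noisy))
```

With real recordings, segments in which the clean signal is nearly silent produce extreme values, so practical segmental-SNR implementations typically clamp or skip such segments.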

answered Apr 03 '21 at 10:46

Richard Lyons


But how am I supposed to calculate the SNR of $y_{enhanced}$? Although it's possible for $y_{noisy}$ (since it's equal to $y_{clean}$ + noise), I can't simply do $y_{enhanced} - y_{clean}$ to get the noise level for it, can I? I'm getting weird results when I do this. – Debasish Das Apr 10 '21 at 11:16
In my experience, what's *best* crucially depends on the details of a specific problem. The best solution for one problem in no way needs to be the best solution for another, potentially even very similar, problem. In this specific case, i.e., the comparison of audio quality, the SNR in many cases only yields a very coarse indication of the *quality*. – applesoup Apr 10 '21 at 11:58
@Debasish, I have a small section in my DSP book covering the topic "Estimating SNR in the Frequency Domain". If you send me a private e-mail, at: , I will send you a copy of that section of my book. – Richard Lyons Apr 12 '21 at 07:43
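The thread does not say what caused the weird results mentioned in the comments. One possible cause (an assumption, not something stated above) is that the enhancement algorithm changes the overall gain, in which case the plain difference $y_{enhanced} - y_{clean}$ contains a scaled copy of the clean signal in addition to the residual noise. A scale-invariant SNR first fits the best gain by least squares and only then measures the residual. The sketch below assumes NumPy and time-aligned signals; the function names and test signals are hypothetical, and a delay introduced by the algorithm would need to be compensated separately.

```python
import numpy as np

def naive_snr_db(reference, estimate):
    """SNR in dB, treating (estimate - reference) as the noise."""
    eps = 1e-12
    residual = estimate - reference
    return 10 * np.log10((np.sum(reference**2) + eps) /
                         (np.sum(residual**2) + eps))

def scale_invariant_snr_db(reference, estimate):
    """SNR in dB after removing the best-fitting gain between the signals."""
    eps = 1e-12
    reference = reference - np.mean(reference)
    estimate = estimate - np.mean(estimate)
    # Least-squares gain that maps the reference onto the estimate.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    residual = estimate - target
    return 10 * np.log10((np.sum(target**2) + eps) /
                         (np.sum(residual**2) + eps))

# Synthetic example (assumption): the "enhanced" signal has little noise
# left, but its level is halved relative to the clean signal.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
y_clean = np.sin(2 * np.pi * 440 * t)
noise = 0.3 * rng.standard_normal(fs)
y_hat = y_clean + 0.1 * noise
y_enhanced = 0.5 * y_hat

naive = naive_snr_db(y_clean, y_enhanced)            # penalized by the gain change
invariant = scale_invariant_snr_db(y_clean, y_enhanced)  # ignores the gain change
```

In this example the naive value is dominated by the level mismatch rather than by the remaining noise, while the scale-invariant value is the same for `y_hat` and `y_enhanced`.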
