why should I scale the fft using 1/N?

Question

I am writing a report, and my advisor asked me to explain why I scale the fft by a factor 1/N (where N is the length of the array).

I used to use the scaling convention of multipling the fft by the time increment (dt), this convention was good for me, because it ensures the check of the Parseval theorem. Unfortunately I had discussion with one of my advisors, because, since this convention does not give you the right amplitude, he thinks it is wrong. As I have read online, there are not right convention or wrong convention. If I use the factor 1/N the amplitude is checked, if I use the factor dt then the parseval identity is preserved. I have 2 question now:

Why, doing the fft, I cannot have both: amplitude checked and energy checked?
I have already demonstrated (in my report) that if I scale the fft by a factor 1/N, I obtain the right amplitude, since the first value of my fft is equal to the time average of my function. Now I would like to show with formulas why this convention gives me the right amplitude? I have already searched online and here on the forum, but I did not find any good answer that explain every passage.

I would suggest taking the FFT of sinusoids of various frequencies, and showing that the resulting FFT has a single bin with the same amplitude as the input sinusoid. Then instead of going by some dubious authority on the Internet, you're actually giving a mathematical proof. — TimWescott, Jan 03 '20 at 17:12
I have already did this. They asked me to show it using the theory step by step. I know that it is not safe going by "internet authorities", but I did not find books or paper that explain in deep this. What I want to obtain is an explanation that is more than "it works", I want to understand "why it works" , in other words I want to know which is the theory behind. — Luca Mirtanini, Jan 03 '20 at 18:11
It's hard to tell, but could they be asking you to re-prove the Fourier transform? In that case that's what you need to look for. You can represent an N-point DFT as multiplying the input signal, in the form of a vector, by an N by N orthonormal matrix, whose eigenvalues all have magnitude 1 and whose eigenvectors are (if I remember correctly!) all sinusoids or constant. Perhaps go back to your advisor - nicely - for clarification on just what you need to prove. — TimWescott, Jan 03 '20 at 18:18
No. My question is different. The matlab guide https://it.mathworks.com/help/matlab/ref/fft.html indirectly suggests to use the normalisation 1/N. My question is why the fft needs to be divided by N? Why in this case the Parseval theorem is not verified? — Luca Mirtanini, Jan 03 '20 at 18:38
@LucaMirtanini different people normalize their FFT differently. Matlab's docs don't apply to "the FFT", but to "the Matlab `fft` function". There's no way other than sitting down, writing down the DFT formula **you** are using and finding the right factor that makes power in frequency and time domain equivalent. — Marcus Müller, Jan 03 '20 at 18:53
@MarcusMüller can you explain why I cannot do a demonstration of why the factor 1/N is necessary? — Luca Mirtanini, Jan 03 '20 at 19:22
Because a demonstration shows a single case. A proof proves all cases. — Marcus Müller, Jan 03 '20 at 19:23
"*Now I would like to show with formulas why this convection gives me the right amplitude*" — Marcus Müller, Jan 03 '20 at 19:23
also, honestly, you're overthinking this: Tim's first comment was spot on: had you done 2 hours ago, you would now have 1:45 h of having derived the right factor! If you are having a hard time doing what's recommended in that first comment, do please add your calculations to show where you're having a hard time. — Marcus Müller, Jan 03 '20 at 19:25
Are you applying any window to your signal before taking the FFT? — jojek, Jan 03 '20 at 19:28
@MarcusMüller I answered that I have already done what tim suggested. But it was not what I was asking. — Luca Mirtanini, Jan 03 '20 at 19:40

Laurent Duval · Accepted Answer · 2020-01-04T18:48:04.257

3

OK, let us go for a 2-point DFT. It should be noted that, depending on the software used, scaling can be different, and ought to be checked. The standard unscaled version does multiply the input vector by:

$$\left[\begin{matrix} 1 & 1\\1 & -1\end{matrix}\right]$$

We have two more options: preserve the average, thus we need:

$$\frac{1}{{2}}\left[\begin{matrix} 1 & 1\\1 & -1\end{matrix}\right]$$

as $\frac{1}{{2}}\left[\begin{matrix} 1 & 1\\1 & -1\end{matrix}\right]\left[\begin{matrix} x_1\\x_2\end{matrix}\right]$ has the same average as $\left[\begin{matrix} x_1\\x_2\end{matrix}\right]$, or

$$\frac{1}{\sqrt{2}}\left[\begin{matrix} 1 & 1\\1 & -1\end{matrix}\right]$$

to preserve energy (thus orthogonality) in both domains. More generally, you have three standard options:

avoid scaling
scale by $\frac{1}{\sqrt{N}}$
or by $\frac{1}{{N}}$,

depending on the invariance purpose. But you can scale columns with other quantities, as needed for your computational purpose. The following code aims at showing that the $1/\sqrt{N}$ scaling could work (if I am not too tired) to preserve Parseval-Plancherel or Rayleigh's energy theorem.

nSample = 2^1 ; % One can change the 2^* to other powers
data = randn(nSample,1) ; 
dataFFT = fft(data)/sqrt(nSample); 
ratioParseval = (norm(data)-norm(dataFFT))/norm(data)

edited Jan 04 '20 at 18:48

answered Jan 03 '20 at 20:23

Laurent Duval

28,803
3
26
88

2

Thank you Laurent for your very detailed answer. When you say "preserve the average", I don't understand because if I do: A=[1,1;1,-1] B=[x1;x2]; C=mean(1/2*A*B) ; D=mean(B); C is not equal to D. So what do you mean with the preserving the average? – Luca Mirtanini Jan 04 '20 at 16:18
2

Oh I have been too fast. Only the first line of $1/2[1,\,1]^T[x_1,\,x_2]$ corresponds to the mean, but the all matrix was needed to keep the Fourier transform, and the second line, related to the derivative, has the same scaling issue. – Laurent Duval Jan 04 '20 at 16:23
3

What does it mean that "the all matrix was need to keep the Fourier transform? If the scale 1/sqrt(N) preserve the energy, why using this scale the Parseval Theorem is not verified? (thank you, your answer is helping a lot) – Luca Mirtanini Jan 04 '20 at 17:42
1

Seems to work: `nSample = 2^1 ; data = randn(nSample,1) ; dataFFT = fft(data)/sqrt(nSample); ratioParseval = (norm(data)-norm(dataFFT))/norm(data)` – Laurent Duval Jan 04 '20 at 18:25
1

Ok doing this: %Energy in the time domain x=sum(data.^2); %Energy in the freq domain X=sum(dataFFT.^2) , I satisfy the Parseval theorem. But x shouldnt be multiplied by dt and the X shouldn't be multiplied by df ? – Luca Mirtanini Jan 04 '20 at 19:00
1

As long as you are doing FFT or DFT, the $dx$ and the $df$ are hidden under the carpet (equal to unity, more of less), because actual time sampling (and units) is not carried in most discrete formulations. Long discussion ahead:) – Laurent Duval Jan 04 '20 at 20:12

score 2 · Answer 2 · answered Jan 04 '20 at 19:02

Your $dt$ has an implicit $1/N$ in it:

$$ dt \frac{time}{sample} = \frac{ T_{DFT} }{ N } \cdot \frac{ \frac{time}{frame} }{ \frac{samples}{frame} } $$

That's why it works.

My preference is strongly to use a $1/N$ normalization. The principal reason is that it makes the magnitudes of the bin values of pure whole integer tones independent of how many sample points you chose to fill a given duration.

It also turns the DFT into an average instead of a sum. See this: DFT Graphical Interpretation: Centroids of Weighted Roots of Unity

Technically there is no energy in the DFT itself. That's a usage. The sum of squares (energy) is preserved across the transform when a $1/\sqrt{N}$ normalization is used.

Not normalizating can be more efficient to calculate, which is why it makes sense that most FFT library functions don't normalize. It's just a rescaling, after all.

What is the point of looking at the $dt$ I can’t see the connection — OuttaSpaceTime, Jan 28 '22 at 21:48

score 1 · Answer 3 · edited Feb 06 '20 at 12:06

FFT is a fast way to compute DFT. Hence the scale factor $1/N$ belongs to the DFT (specifically the inverse DFT in MATLAB ifft() function).

As Marcus has already pointed out; it's arbitrary to put the scale factor either into the forward or to the inverse DFT.

However, the concept of energy equivalence in time and frequency domains (i.e., norm be preserved by the transform) requires that the scale factor be symmetrically distributed into both forward and inverse transforms. i.e;

$$ X[k] = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} x[n] e^{-j\frac{2\pi}{N} n k} \tag{1} $$

$$ x[n] = {\sqrt{N}} \sum_{k=0}^{N-1} X[k] e^{j\frac{2\pi}{N} n k} \tag{2} $$

otherwise you won't have a unitary transform.

MATLAB utilization of the fft() and ifft() funtions to obtain the symmetric DFT pairs (Parseval checks) will be the following:

N = 16;               % sequence length
x = randn(1,N);       % time-domain signal

X = sqrt(1/N)*fft(x,N);  % forward DFT 
xi = sqrt(N)*ifft(X,N);  % inverse DFT

% Check Parseval. 
sum(x.^2)           % Energy in time domain
sum(abs(X).^2)      % Energy in freq domain

Note that a linear transform (linear mapping) is shown like:

$$y = A x \tag{3}$$ where $x$ and $y$ are $N \times 1 $ vectors and $A$ is an $N \times N$ transformation matrix.

The energy of the transformed vector $y$ can be given as :

$$ \mathcal{E_y} = ||y||^2 = y^H y = (Ax)^H (Ax) = x^H(A^H A) x \tag{4}$$

If we desire a transform which would preserve energy in both domains; i.e., $\mathcal{E_y} = \mathcal{E_x}$, then we look for the equality

$$ ||y||^2 = ||x||^2 \tag{5} $$

which implies from Eq.4 that we have :

$$ A^H A = I \tag{6}$$

In other words the linear transformation matrix $A$ has the property that $$A^{-1} = A^H \tag{7}$$ Such matrices are called unitary matrices (aka orthonormal) and such transforms are called unitary transforms.

For DFT being a unitary transform, you need to have the symmetric scaling as in Eqs 1 and 2. Note that if you use the asymmetric scaling, then you will still have an orthogonal transform, but not unitary (orthonormal).

Hi, thank you for this explanation. I have some doubts: 1) The index H means"transposed"? 2) The symbol || || means modulus? 3) So, if I have well understood, in Matlab I need the factor 1/N in order to have a symetric scaling and have a orthonormal transform isn't it? 4) Why if I use the factor 1/N, the Parseval theorem is not checked? — Luca Mirtanini, Jan 04 '20 at 15:03
Hi. 1-) H stands for **Hermitian** transpose; $x^H = (x^*)^T$ . 2-) ||.|| is the **norm** of a vector. Euclidean norm of x is $$||x||^2 = (\sum_{k=1}^N |x_k|^2)^{1/2} $$ 3-) No. MATLAB already uses $1/N$ scaling in the **inverse** FFT. If you want to make MATLAB fft function symmetric, you should use `X = sqrt(1/N)*fft(x,N)' ,`X = sqrt(N)*ifft(x,N)' . 4-) Yes if you use 1/N with MATLAB parseval won't check as explained in 3. Use the scaling in 3 with MATLAB to get the parseval's check. Note DFT is always **orthogonal** but symmetric scaling makes it **unitary**,hence **orthonormal**. — Fat32, Jan 04 '20 at 18:50

why should I scale the fft using 1/N?

3 Answers3

Linked