
Consider a survey of $n$ firms. This survey includes, among other variables, the average wage of workers in firm $i$ ($x_i$) and the number of workers in the firm ($L_i$). Both are random variables.

I want to study the properties of the average wage in this economy. This average wage is defined as

$$ \chi = \sum_i \delta_i x_i $$

where $\delta_i$ is the share of firm $i$'s workers in all workers in the sample. That is, $\delta_i=\frac{L_i}{\sum_j L_j}$. These weights add up to one.
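
As a minimal sketch of this definition, the following computes $\chi$ for three made-up firms. The wage and size values are illustrative assumptions, not data from the survey.

```python
import numpy as np

# Hypothetical data for three firms (illustrative values only).
x = np.array([10.0, 20.0, 30.0])   # average wage in each firm
L = np.array([1.0, 1.0, 2.0])      # number of workers in each firm

delta = L / L.sum()                # employment shares; they sum to one
chi = np.sum(delta * x)            # worker-weighted average wage

# np.average with weights=L computes the same quantity directly.
chi_direct = np.average(x, weights=L)

print(delta, chi, chi_direct)
```

Here the largest firm also pays the highest wage, so $\chi$ comes out above the unweighted mean of the three wages.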

As part of my analysis of this weighted average, I am interested in two decompositions that came to mind:

  1. Mathematical decomposition:

    Define the arithmetic (i.e. unweighted) average as $\bar x$. We can write:

    $$ \chi = \bar x \left(\frac{\sum_i \delta_i x_i}{\bar x}\right) $$

    In words, we can think of $\chi$ as a decomposition between the arithmetic mean and a measure of "dispersion around that mean". In effect, if $x_i=c$ (no dispersion), $\chi = \bar x = c$. So whether $\chi$ is above or below $\bar x$ tells us something about the dispersion of $x_i$.

  2. Statistical decomposition:

    Assuming the sample is iid, the sample equivalent of the population moment $E(\delta_i x_i)$ is

    $$\hat E(\delta_i x_i) = \dfrac{\sum_i \delta_i x_i}{n} = \dfrac{\chi}{n}$$

    But we know the properties of the covariance:

    $$ E(\delta_i x_i) = E(\delta_i)E(x_i) + cov(\delta_i,x_i) $$

    The sample equivalent of this identity is:

    $$ \hat E(\delta_i x_i) = \hat E(\delta_i) \hat E(x_i) + \hat{cov}(\delta_i,x_i) $$

    Using the above, and noting that $n\hat E(\delta_i)=1$ (because of the definition of weights), we get:

    $$ \chi \approx \hat E(x_i) + n \ \hat{cov}(\delta_i,x_i) $$

    This is also a form of mean and dispersion decomposition, since in the case of no dispersion ($x_i=x$), the covariance term is zero.
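
The step from the sample covariance identity to the last display can be spelled out as follows. This is a sketch that assumes the sample covariance uses the $1/n$ divisor, i.e. $\hat{cov}(\delta_i,x_i) = \frac{1}{n}\sum_i (\delta_i - \frac{1}{n})(x_i - \bar x)$:

$$\begin{aligned}
\frac{\chi}{n} = \frac{1}{n}\sum_i \delta_i x_i &= \left(\frac{1}{n}\sum_i \delta_i\right)\left(\frac{1}{n}\sum_i x_i\right) + \hat{cov}(\delta_i,x_i) \\
&= \frac{1}{n}\,\bar x + \hat{cov}(\delta_i,x_i)
\end{aligned}$$

Multiplying both sides by $n$ gives $\chi = \bar x + n \, \hat{cov}(\delta_i,x_i)$. Under that divisor the relation holds exactly in the sample, so the $\approx$ is only needed when the $\hat E$ and $\hat{cov}$ terms are read as estimates of population moments.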

Now, the two above seem to me very straightforward, and almost obvious. As such, I imagine these decompositions have been studied in the literature already. Maybe they have a name, and a whole set of properties around them.

Thus, what I want is to find out about these decompositions. However, I cannot find a literature related to them anywhere. From what I can gather, this is unrelated to the mean-variance decomposition literature, Blinder-Oaxaca decomposition, standardisation, and so on.

My questions are: are these common decompositions? Do they have a name? Can you refer me to literature where I can read more about them?

luchonacho
  • Most of what you deduce is incorrect because the $\delta_i$ are functions of the data. It's hard to believe that is "not relevant." Among other things, this implies expressions for variances and covariances are wrong. In effect, by not telling us what the $\delta_i$ are, you are stating that $\chi$ could be practically any function of the data--and that makes this question far too broad to be answerable. Perhaps you could, therefore, edit this post to explain what the $\delta_i$ are? – whuber May 08 '18 at 11:56
  • @whuber Why are the expressions wrong? E.g. in what sense is the covariance formula wrong? My dataset is multivariate, and $\delta_i$ is a function of other variables. Explaining that would make the post much more complex. And I don't know exactly why is that needed. Say you want to understand a weighted average wage, where the weight is the number of employees in firms. That is not my case but still, I think the above applies. – luchonacho May 08 '18 at 12:20
  • You treat the $\delta_i$ as if they were constants--and thereby ignore their variance and covariances with the $x_i$--but that contradicts your assertion that they are functions of the data. – whuber May 08 '18 at 13:56
  • @MartijnWeterings Indeed. Corrected the notation and added note that weights add up to one. I still don't see why there is a problem. Imagine a DGP with two random variables, from where I obtain an iid sample of size $n$, and get the data $\delta_i$ and $x_i$. The formula still holds, whichever is (if there is) the causal relationship between $\delta_i$ and $x_i$. – luchonacho May 08 '18 at 14:19
  • @whuber Why am I ignoring their covariance? The equation has $cov(\delta_i,x_i)$. Can you please tell me where is the error in the E() = E()E() +cov() formula? – luchonacho May 08 '18 at 14:20
  • Your edits help, thank you. If I'm following your argument correctly, then you seem to have omitted a factor of $n$ in that formula. – whuber May 08 '18 at 14:28
  • I get your formulas now, and can connect your two decompositions by writing: $$\begin{aligned} \chi &= \bar{x} \left( \sum_{i} \delta_i \frac{x_i}{\bar{x}} \right) \\ &= \bar{x} \left( \sum_{i} \left(\delta_i-\frac{1}{n}\right) \left(\frac{x_i}{\bar{x}}-1\right) - \sum_{i} \frac{1}{n} + \sum_{i} \frac{1}{n}\frac{x_i}{\bar{x}} + \sum_{i} \delta_i \right) \\ &= \bar{x} + \bar{x} \sum_{i} \left(\delta_i-\frac{1}{n}\right) \left(\frac{x_i}{\bar{x}}-1\right) \\ &= \bar{x} + n \, cov(\delta_i, x_i) \end{aligned}$$ but I don't really get what you want to do with it. What wheel are you inventing? – Sextus Empiricus May 08 '18 at 15:38
  • You mention that $\chi$ is a measure of dispersion, but $\chi - \bar{x} = n \, cov(\delta_i,x_i)$ is rather a measure of the covariance between $\delta_i$ and $x_i$ which happens to be related to the dispersion of $x_i$ by $$cov(X,Y) = \sigma_X \sigma_Y corr(X,Y) $$ – Sextus Empiricus May 08 '18 at 15:44
  • @MartijnWeterings 1) Yes, you have got the same formula as me. 2) I wonder if this decomposition has a name, and perhaps further properties attached to it. I do not want to do double work. 3) I agree with you that $\chi$ measures covariance more than dispersion. Intuitively, any mean-preserving spread of $x$ (i.e. more dispersion) which remains orthogonal to $\delta_i$ does not change $\chi$. 4) The problem with the correlation formula is that, afaik, cov and not correlation is a "deep parameter" of the DGP (e.g. when writing down a joint normal distribution, the matrix has cov on it). – luchonacho May 09 '18 at 15:50
  • @luchonacho, you have given this question a relatively high bounty, but I wonder whether you could get more attention by changing the positioning of the question. Your question is about reinventing the wheel, but I wonder what wheel you are actually trying to invent. You explain in depth some mathematical formula (2 versions), but you go much less into the background, intuition, reasoning, motivation, etc. I wonder what is the point, what is so new/special/innovative about this, why is this a 'big' reinvention of a wheel (and what wheel)? What's the deal? What's the hurdle? What's the point? – Sextus Empiricus May 11 '18 at 11:27
  • @luchonacho can you add a comment 'ping me' when you have changed it? The question sort of intrigues me but I am a bit lost about the meaning. – Sextus Empiricus May 11 '18 at 11:58
  • @MartijnWeterings I hope it reads better now. – luchonacho May 12 '18 at 08:14

1 Answer

Since $\delta_i = L_i / \sum_{j=1}^n L_j$, you can rewrite your quantity of interest directly in terms of the wages and sizes of firms (I am going to add a subscript for dependence on $n$):

$$\chi_n = \sum_{i=1}^n \delta_i x_i = \frac{\sum_{i=1}^n L_i x_i}{\sum_{i=1}^n L_i}.$$

The statistic $\chi_n$ is the average-worker-wage in the first $n$ firms (where the average is by worker, not by firm). In contrast, the unweighted average $\bar{x}_n = \tfrac{1}{n} \sum_{i=1}^n x_i$ is the average-firm-wage in the first $n$ firms. (The latter can also be considered to be the average-worker-wage that would occur if all firms had the same number of workers.)

The statistic $\chi_n$ is a function of the average wages $x_1, ..., x_n$ and the firm sizes $L_1, ..., L_n$. Its sampling properties, including its expected value and other statistical aspects, are determined by the joint distribution of these underlying variables.


Decomposition 1 (multiplicative decomposition): What you have figured out in your first decomposition is that the average-worker-wage is related to the average-firm-wage by the multiplicative equation:

$$\chi_n = \bar{x}_n \cdot M_n \quad \quad \quad \quad M_n \equiv \frac{\sum_{i=1}^n \delta_i x_i}{\bar{x}_n} = \frac{n \sum_{i=1}^n L_i x_i}{(\sum_{i=1}^n L_i)(\sum_{i=1}^n x_i)}.$$

The multiplicative factor $M_n$ converts the average-firm-wage to the average-worker-wage; it is a measure of the spread of workers across firms, relative to the wage of those firms. This quantity takes on a large value (higher than one) if workers are concentrated in high-wage firms, and takes on a low value (lower than one) if workers are concentrated in low-wage firms. In the special case where all firms have the same size we have $L_1 = \cdots = L_n$, which gives $M_n = 1$ and $\chi_n = \bar{x}_n$.
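
A small numerical sketch of these three cases, using made-up wages and firm sizes. The helper name `concentration_factor` is a hypothetical label chosen for this illustration.

```python
import numpy as np

def concentration_factor(x, L):
    """Return M_n = chi_n / xbar_n for firm wages x and firm sizes L."""
    x = np.asarray(x, dtype=float)
    L = np.asarray(L, dtype=float)
    chi = np.sum(L * x) / np.sum(L)   # average-worker-wage
    return chi / x.mean()             # ratio to the average-firm-wage

wages = [10.0, 20.0, 30.0]            # illustrative values only

m_equal = concentration_factor(wages, [5, 5, 5])   # all firms the same size
m_high = concentration_factor(wages, [1, 1, 8])    # workers mostly in the high-wage firm
m_low = concentration_factor(wages, [8, 1, 1])     # workers mostly in the low-wage firm

print(m_equal, m_high, m_low)
```

With equal firm sizes the factor is one. Moving workers toward the high-wage firm pushes it above one, and moving them toward the low-wage firm pushes it below one.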

The only real use of this decomposition is to treat the multiplicative factor as a measure of how workers are concentrated across firms with different wages, if that is something of interest. Since the multiplier is more complex than either of the individual averages (and is derived from their ratio), it is unlikely to be useful beyond this. I am not aware of any particular literature on this decomposition, but you might try looking at the economic literature on macroeconomic aggregates of wages and worker concentration. In any case, whether or not there is literature on this statistic, its interpretation is obvious enough.


Decomposition 2 (additive decomposition): It is hard to make sense of your work here, since there is a lot of conflation of expectation with averaging, etc. In any case, the decomposition you are looking for is resolved by the comment by Martijn Weterings above. The sample covariance between the values $x_i$ and $\delta_i = L_i / \sum_{i=1}^n L_i$ (without Bessel's correction) is:

$$\begin{aligned}
\mathbb{Cov}(\boldsymbol{x}_n, \boldsymbol{\delta}_n)
&= \frac{1}{n} \sum_{i=1}^n (x_i - \bar{x}_n) (\delta_i - \bar{\delta}_n) \\[8pt]
&= \frac{1}{n} \sum_{i=1}^n (x_i - \bar{x}_n) (\delta_i - \tfrac{1}{n}) \\[8pt]
&= \frac{\bar{x}_n}{n} \sum_{i=1}^n (\tfrac{x_i}{\bar{x}_n} - 1) (\delta_i - \tfrac{1}{n}) \\[8pt]
&= \frac{\bar{x}_n}{n} \Big( \sum_{i=1}^n \tfrac{x_i \delta_i}{\bar{x}_n} - \tfrac{1}{n} \sum_{i=1}^n \tfrac{x_i}{\bar{x}_n} - \sum_{i=1}^n \delta_i + \sum_{i=1}^n \tfrac{1}{n} \Big) \\[8pt]
&= \frac{\bar{x}_n}{n} ( M_n - 1 - 1 + 1 ) \\[8pt]
&= \frac{\bar{x}_n}{n} ( M_n - 1 ) \\[8pt]
&= \frac{1}{n} ( \chi_n - \bar{x}_n ).
\end{aligned}$$

Hence, you have the additive decomposition:

$$\chi_n = \bar{x}_n + n \cdot \mathbb{Cov}(\boldsymbol{x}_n, \boldsymbol{\delta}_n).$$
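
The identity can be checked numerically. The sketch below uses simulated data (the distributions, seed, and sample size are arbitrary assumptions) and also shows what changes if the covariance is computed with Bessel's correction.

```python
import numpy as np

rng = np.random.default_rng(0)                    # arbitrary seed, simulated data
n = 50
x = rng.lognormal(mean=3.0, sigma=0.5, size=n)    # hypothetical firm wages
L = rng.integers(1, 200, size=n).astype(float)    # hypothetical firm sizes

delta = L / L.sum()
chi = np.sum(delta * x)      # average-worker-wage
xbar = x.mean()              # average-firm-wage
gap = chi - xbar

# bias=True uses the 1/n divisor (no Bessel's correction), as in the derivation.
cov_n = np.cov(x, delta, bias=True)[0, 1]

# The default divisor is n - 1, so the multiplier becomes n - 1 instead of n.
cov_n1 = np.cov(x, delta)[0, 1]

print(gap, n * cov_n, (n - 1) * cov_n1)
```

All three printed values agree up to floating-point error, so the choice of divisor only changes the multiplier in front of the covariance, not the decomposition itself.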

The additive term in this latter decomposition adjusts the average-firm-wage upward or downward to yield the average-worker-wage; this is also a measure of worker concentration in high-wage firms, but this time the measure is relative to the average-firm-wage $\bar{x}_n$, and it is on an additive scale. This term is positive if $M_n > 1$ and negative if $M_n < 1$ (and zero if $M_n = 1$), and it is fully determined by the scale $\bar{x}_n$ and the previous multiplicative measure of worker-concentration.

Again, I am not aware of any particular literature on this decomposition. It is just one way that you can express the average-worker-wage using an additive metric representing worker-concentration in high-wage firms.

Ben