There is a good, long explanation of PCA at Making sense of principal component analysis, eigenvectors & eigenvalues, so here I will only show why the eigenvector with the largest eigenvalue maximizes the variance of linear combinations.
Let $\Sigma$ be the variance-covariance matrix of the random variables $X_1, X_2, \dotsc, X_n$, which we take to be the components of the random vector $X$. That is,
$$ \DeclareMathOperator{\V}{\mathbb{V}} \DeclareMathOperator{\C}{\mathbb{C}} \C(X_i,X_j)=\Sigma_{ij}
$$
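For concreteness, here is a minimal NumPy sketch (my own illustration, with a made-up mixing matrix `A`) of what this matrix looks like when estimated from samples of $X$:

```python
import numpy as np

# Draw samples of X = A Z for standard normal Z, then estimate Sigma.
# The true covariance of X is A A^T.
rng = np.random.default_rng(0)
A = np.array([[2.0, 0.5, 0.0],
              [0.0, 1.0, 0.3],
              [0.0, 0.0, 0.5]])
Z = rng.standard_normal((10_000, 3))
X = Z @ A.T                            # rows are observations of X

Sigma = np.cov(X, rowvar=False)        # Sigma[i, j] estimates Cov(X_i, X_j)
assert np.allclose(Sigma, Sigma.T)     # symmetry: Cov(X_i, X_j) = Cov(X_j, X_i)
print(np.round(Sigma, 2))              # close to the true A A^T
```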
Since $\Sigma$ is a symmetric matrix, it has a spectral decomposition, that is,
$$
\Sigma = U \Lambda U^T = \sum_{i=1}^n \lambda_i^2 u_i u_i^T
$$ where $U$ is orthogonal with columns $u_i$, and $\Lambda$ is diagonal with entries $\lambda_i^2$, which are nonnegative because $\Sigma$ is positive semidefinite. Now let $a$ be an $n$-vector with $a^T a = 1$ (without this constraint the variance is unbounded), defining the linear combination $a^T X$. We want to choose $a$ so as to maximize the variance of this linear combination:
$$
\V\left( a^T X\right) = a^T \V\left( X \right) a = a^T \Sigma a \\
= a^T \left( \sum_{i=1}^n \lambda_i^2 u_i u_i^T \right) a \\
= \sum_{i=1}^n \lambda_i^2 (a^T u_i) (u_i^T a) = \sum_{i=1}^n \lambda_i^2 \left( a^T u_i \right)^2
$$
The vectors $u_i$ are mutually orthonormal, so the weights $w_i = (a^T u_i)^2$ are nonnegative and sum to one: since $U$ is orthogonal, $\sum_{i=1}^n (a^T u_i)^2 = a^T U U^T a = a^T a = 1$. The sum above is therefore a weighted average of the $n$ values $\lambda_i^2$, and it is clearly maximized by giving all the weight to the largest of them, say $\lambda_1^2$. That happens if we choose $a = u_1$, the eigenvector corresponding to the largest eigenvalue; by orthonormality, the weight on every other term is then zero. That finishes the proof.
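If you want to see the result numerically, here is a small sketch (my own illustration, using NumPy and an arbitrary made-up $\Sigma$): it reconstructs $\Sigma$ from its spectral decomposition and checks that no random unit vector gives a larger variance than the top eigenvector:

```python
import numpy as np

# An arbitrary symmetric positive definite covariance matrix.
Sigma = np.array([[4.0, 1.0, 0.5],
                  [1.0, 3.0, 0.2],
                  [0.5, 0.2, 1.0]])

# eigh returns eigenvalues of a symmetric matrix in ascending order,
# so the last column of U is the eigenvector with the largest eigenvalue.
eigvals, U = np.linalg.eigh(Sigma)
assert np.allclose(U @ np.diag(eigvals) @ U.T, Sigma)  # Sigma = U Lambda U^T

a_star = U[:, -1]                     # eigenvector for the largest eigenvalue
var_star = a_star @ Sigma @ a_star    # variance of a*^T X

# No random unit vector should beat it (up to floating-point noise).
rng = np.random.default_rng(0)
for _ in range(10_000):
    a = rng.standard_normal(3)
    a /= np.linalg.norm(a)            # enforce the constraint a^T a = 1
    assert a @ Sigma @ a <= var_star + 1e-12

print(var_star, eigvals[-1])          # both equal the largest eigenvalue
```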