I'm sure there's a ton of material on these exact questions on the interwebs.
One way to look at PCA is as follows. Let's say you have a set of observations: $X(i)$. Now each $X(i)$ is a vector itself, i.e. it consists of $n$ variables $X_1(i),X_2(i),\dots,X_n(i)$.
Sometimes these $n$ components are highly correlated with each other. For instance, imagine measuring the weight, height, chest size, skull size etc. of the population of a town. So maybe, instead of dealing with $n$ different size measures, you want to have just one. You could run PCA on the matrix $X_j(i)$ and obtain the first principal component. PCA returns a score matrix of the same dimensions as $X$; take its first column, $s_1(i)$. This will most likely be your size measure, because it captures the co-movement of all the size measures in one number. The coefficient matrix will be $n$-by-$n$; take its first column (or row, depending on the software), which gives you the vector of coefficients, i.e. the weights applied to each size measure, such as weight and height, to obtain the first principal component.
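To make the score and coefficient matrices concrete, here is a rough NumPy sketch. Everything in it is my own assumption for illustration: the data are synthetic (a made-up latent `size` factor plus noise), the variable names are hypothetical, and I standardize the columns first because measures like weight and height come in different units. PCA is computed through an SVD of the standardized data, which is one common way to do it, not necessarily what your software does internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the town data (assumption): m people, n size measures,
# all driven by one hidden "size" factor plus independent noise.
m, n = 200, 4
size = rng.normal(size=m)
true_loadings = np.array([1.0, 0.8, 0.9, 0.5])  # made-up values
X = np.outer(size, true_loadings) + 0.3 * rng.normal(size=(m, n))

# Standardize each column (assumption: the measures are in different units).
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via SVD: the rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
coeff = Vt.T         # n-by-n coefficient matrix, one column per component
scores = Z @ coeff   # score matrix, same dimensions as X

# The sign of a component is arbitrary; flip so the first one has positive weights.
if coeff[:, 0].sum() < 0:
    coeff[:, 0] *= -1
    scores[:, 0] *= -1

s1 = scores[:, 0]    # first column of the scores: the single "size" measure
w1 = coeff[:, 0]     # weights of each original measure in the first component
explained = S**2 / np.sum(S**2)  # share of total variance per component

print("weights of first component:", np.round(w1, 3))
print("share of variance explained by it:", round(float(explained[0]), 3))
```

In this toy setup `s1` should track the hidden `size` factor closely, and `w1` tells you how much each standardized measure contributes to it. With real data you would check the explained share before trusting a single component as your size measure.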