PCA is simply a procedure that can be applied to any continuous data. The procedure isn't rendered invalid if these conditions are violated, so they aren't assumptions in that sense. But they do suggest cases where PCA will or won't be useful.
High-dimensional data concentrated near hyperplanes
We could say that, if $D$-dimensional data are concentrated near a $d$-dimensional hyperplane (where $d < D$), then they can be well approximated using $d$ principal components. This ignores complications like outliers, which can throw PCA off even when the remaining points are concentrated near the hyperplane.
This follows from two facts: 1) PCA reduces the dimensionality by projecting the points into a $d$-dimensional linear subspace (i.e. onto a $d$-dimensional hyperplane), and 2) among all such subspaces, PCA finds the one that minimizes the squared reconstruction error. This is an alternative way of formulating the optimization problem, and is equivalent to maximizing the variance. The reconstruction error for a point $x_i$ is the distance between $x_i$ and its projection onto the hyperplane. Therefore, if the points are concentrated near a $d$-dimensional hyperplane, PCA can find that hyperplane, and the reconstruction error using $d$ components will be small.
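As a sketch of why the two formulations coincide (assuming the data are centered, and writing $V$ for a $D \times d$ matrix whose orthonormal columns span the subspace; this notation is mine, not from the slides): the reconstruction of $x_i$ is $V V^\top x_i$, and by the Pythagorean theorem

$$\sum_i \left\| x_i - V V^\top x_i \right\|^2 = \sum_i \| x_i \|^2 - \sum_i \left\| V^\top x_i \right\|^2.$$

The first sum on the right doesn't depend on $V$, so minimizing the reconstruction error is the same as maximizing $\sum_i \| V^\top x_i \|^2$, the variance retained by the projection.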
If the data are not concentrated near a $d$-dimensional hyperplane, then no such hyperplane can provide a low reconstruction error, and $d$ principal components cannot provide a good approximation.
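A quick numerical sketch of both cases, assuming numpy and scikit-learn are available (the dimensions, noise level, and seed are arbitrary choices of mine, not anything prescribed by PCA):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n, D, d = 1000, 10, 2

# Points concentrated near a d-dimensional hyperplane in D dimensions:
# a random rank-d linear structure plus a little isotropic noise.
A = rng.normal(size=(d, D))
near_plane = rng.normal(size=(n, d)) @ A + 0.01 * rng.normal(size=(n, D))

# Points not concentrated near any low-dimensional hyperplane.
isotropic = rng.normal(size=(n, D))

def mean_sq_reconstruction_error(X, n_components):
    pca = PCA(n_components=n_components).fit(X)
    X_hat = pca.inverse_transform(pca.transform(X))
    return np.mean(np.sum((X - X_hat) ** 2, axis=1))

print(mean_sq_reconstruction_error(near_plane, d))  # small: on the order of the added noise
print(mean_sq_reconstruction_error(isotropic, d))   # large: most of the variance is discarded
```

The first error is tiny because a $d$-dimensional hyperplane passes close to nearly all of the points; the second is large because an isotropic cloud has no such hyperplane.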
High-dimensional data concentrated near ellipsoids
This condition doesn't really determine how well PCA will work in general. Here are a couple of counterexamples.
1) Consider a set of points drawn from some arbitrarily shaped distribution on a $d$-dimensional plane, then mapped linearly into a higher-dimensional space. PCA will be able to perfectly reconstruct these points using $d$ components, despite the data being shaped nothing like an ellipsoid (see the sketch at the end of this section).
2) A sphere is an ellipsoid. PCA won't work well if the data have a spherical distribution, since the variance is the same in every direction and no small set of components captures most of it. Yes, that's reading things literally, and perhaps the slides meant something like the following:
If $D$-dimensional data have an ellipsoidal distribution that's elongated along $d$ dimensions and close to flat along the others, then the data can be well approximated using $d$ principal components. This follows from the fact that such a distribution is concentrated near a $d$-dimensional hyperplane.
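Here's the sketch of counterexample 1 mentioned above, again assuming numpy and scikit-learn (the uniform square is just one arbitrary, non-ellipsoidal shape):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n, d, D = 1000, 2, 10

# An arbitrarily shaped distribution on a d-dimensional plane
# (uniform on a square, which is nothing like an ellipsoid)...
Z = rng.uniform(-1, 1, size=(n, d))

# ...mapped linearly into a higher-dimensional space.
A = rng.normal(size=(d, D))
X = Z @ A

# d principal components reconstruct the points essentially exactly,
# because X lies in a d-dimensional subspace of R^D.
pca = PCA(n_components=d).fit(X)
X_hat = pca.inverse_transform(pca.transform(X))
print(np.max(np.abs(X - X_hat)))  # ~1e-15: perfect up to floating point
```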