Consider an $I\times J$ contingency table $C$ with elements $C_{ij}$ and total number of observations $n=\sum_i\sum_j C_{ij}$.
Simple $K$-dimensional correspondence analysis models this as
$$
\frac{C_{ij}}{n} = \alpha_i\beta_j\left(1 + \sum_{k=1}^{K}\mu_{ik}\sigma_k\nu_{jk}\right)
$$
You can think of this as a particular geometrical decomposition of a table of proportions, as described in the Stata documentation linked to by @dimitriy-v-masterov. This equation is the final equation in that document, even though it is probably the most useful one to start with.
Personally, I prefer to think of CA as a least squares approximation to the $K$-dimensional log-multiplicative 'association' model of $C$:
\begin{align}
C_{ij} \sim &~ \text{Poisson}(\mu_{ij})\\
\log\mu_{ij} = &~ \alpha_i^* + \beta_j^* + \sum_{k=1}^{K}\mu_{ik}^*\sigma_k^*\nu_{jk}^*.
\end{align}
This makes it a bit clearer that the goal of both models is to create an interpretable low-dimensional model of the table's association structure, that is, the variation in counts that would not be expected under independence. Increasing $K$ moves the models' complexity from independence ($K=0$) to saturation ($K=\min(I,J)-1$).
In both models, $\alpha$ and $\beta$ ensure the margin counts are captured, whereas the terms of the sum exist to model the association structure. In CA the terms in the sum are essentially the first $K$ singular vectors and singular values from an SVD of the standardized residuals from an independence model of $C$.
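To make the SVD description concrete, here is a minimal NumPy sketch of the first model. It assumes the usual CA convention of dividing the residuals from independence by the square root of the expected proportions before the SVD, and it assumes every row and column margin is positive. The function name `simple_ca` and the example counts are made up for illustration.

```python
import numpy as np

def simple_ca(C, K):
    """Rank-K correspondence analysis of a contingency table via SVD (a sketch)."""
    C = np.asarray(C, dtype=float)
    P = C / C.sum()                      # proportions C_ij / n
    r = P.sum(axis=1)                    # row margins, playing the role of alpha_i
    c = P.sum(axis=0)                    # column margins, playing the role of beta_j
    E = np.outer(r, c)                   # fit under independence
    S = (P - E) / np.sqrt(E)             # standardized residuals from independence
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    mu = U[:, :K] / np.sqrt(r)[:, None]  # row scores mu_ik
    nu = Vt[:K].T / np.sqrt(c)[:, None]  # column scores nu_jk
    sigma = s[:K]                        # singular values sigma_k
    fitted = E * (1 + (mu * sigma) @ nu.T)
    return r, c, mu, sigma, nu, fitted

# Arbitrary illustrative counts, not real data.
C = np.array([[20, 10, 5], [10, 15, 10], [5, 10, 25], [8, 7, 6]])
P = C / C.sum()

lack_of_fit = []
for K in range(3):
    r, c, mu, sigma, nu, fitted = simple_ca(C, K)
    # Chi-square-type distance between the proportions and the rank-K fit.
    lack_of_fit.append(((P - fitted) ** 2 / np.outer(r, c)).sum())
    print(K, lack_of_fit[-1])
```

With $K=0$ the fit is just the independence model, and with $K=\min(I,J)-1$ (here 2) the fit reproduces the table of proportions up to rounding error, so the printed lack of fit shrinks as $K$ grows.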
Biplots plot variously scaled $\mu$ or $\mu^*$ and $\nu$ or $\nu^*$ in the same space. Confusingly, some people use 'CA' to refer to such plots rather than to the first model.
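As a rough illustration of what 'variously scaled' can mean, the sketch below computes two-dimensional row and column coordinates for the first model under three common conventions: scores multiplied by the singular values (often called principal coordinates) or left unscaled (standard coordinates). The function name `ca_biplot_coords` and the scaling labels are my own, not any package's API, and the counts are again arbitrary.

```python
import numpy as np

def ca_biplot_coords(C, scaling="symmetric"):
    """Two-dimensional CA coordinates for rows and columns (a sketch)."""
    C = np.asarray(C, dtype=float)
    P = C / C.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)
    E = np.outer(r, c)
    U, s, Vt = np.linalg.svd((P - E) / np.sqrt(E), full_matrices=False)
    U, s, Vt = U[:, :2], s[:2], Vt[:2]          # keep the first two dimensions
    rows_std = U / np.sqrt(r)[:, None]          # unscaled row scores
    cols_std = Vt.T / np.sqrt(c)[:, None]       # unscaled column scores
    if scaling == "symmetric":                  # both sets scaled by sigma
        return rows_std * s, cols_std * s
    if scaling == "row_principal":              # rows scaled, columns unscaled
        return rows_std * s, cols_std
    if scaling == "col_principal":              # columns scaled, rows unscaled
        return rows_std, cols_std * s
    raise ValueError(f"unknown scaling: {scaling!r}")

# Arbitrary illustrative counts, not real data.
C = np.array([[20, 10, 5], [10, 15, 10], [5, 10, 25], [8, 7, 6]])
rows, cols = ca_biplot_coords(C, scaling="row_principal")
print(rows)
print(cols)
```

Under the `row_principal` choice each row point sits at the weighted average of the column points, with weights given by that row's profile, which is one reason the choice of scaling changes how distances in the plot should be read.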
Comparisons of either model to factor analysis are basically unhelpful. From a measurement perspective, these models capture a proximity, or 'ideal point', item structure rather than the dominance structure assumed in factor analysis.