I'm mostly using the papers
to extend @amoeba's comment into an answer here:
Let's start with the PLS X
$\mathbf X = \mathbf T \mathbf P' + \mathbf E$ and
$\mathbf T = \mathbf X \mathbf W'$
and the Y matrices
$\mathbf Y = \mathbf U \mathbf Q' + \mathbf F$
(outer relations)
(take care to construct the weights $\mathbf W'$ and $\mathbf Q'$ so they refer directly to $\mathbf X$ and $\mathbf Y$, not to deflated matrices!)
Regression then takes place between the X and Y scores, component by component: $\hat u = t b$ (inner relation),
thus
$\mathbf Y = \mathbf T \mathbf B \mathbf Q' + \mathbf F$
$\mathbf{\hat Y} = \mathbf X \mathbf W' \mathbf B \mathbf Q'$
Now, the last three matrices ($\mathbf W' \mathbf B \mathbf Q'$) are all part of the PLS model parameters. We can therefore introduce one matrix $\mathbf B' = \mathbf W' \mathbf B \mathbf Q'$ which gives PLS coefficients in analogy to the usual MLR coefficients and write
$\mathbf{\hat Y} = \mathbf X \mathbf B'$
which is the usual form of a linear regression model.
Your prediction can either use these "shortcut" coefficients, or go through the three steps of calculating
- X scores $\mathbf{\hat T} = \mathbf X \mathbf W'$, then
- Y scores $\mathbf{\hat U} = \mathbf{\hat T} \mathbf B$, and finally
- $\mathbf{\hat Y} = \mathbf{\hat U} \mathbf Q'$
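The equivalence of the shortcut coefficients and the three-step route can be sketched with a minimal NIPALS-style PLS2 implementation in numpy (toy data; all variable names are mine, not from any particular package):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data: 20 samples, 5 X-variables, 2 Y-variables
X0 = rng.normal(size=(20, 5))
Y0 = X0 @ rng.normal(size=(5, 2)) + 0.1 * rng.normal(size=(20, 2))

# center (see the note on centering below)
Xc = X0 - X0.mean(axis=0)
Yc = Y0 - Y0.mean(axis=0)

def nipals_pls2(X, Y, n_comp, tol=1e-10, max_iter=500):
    """NIPALS PLS2 sketch: weights W (w.r.t. the *deflated* X),
    loadings P and Q, inner-relation coefficients b."""
    X, Y = X.copy(), Y.copy()
    W, P, Q, b = [], [], [], []
    for _ in range(n_comp):
        u = Y[:, [0]]
        for _ in range(max_iter):
            w = X.T @ u; w /= np.linalg.norm(w)
            t = X @ w
            q = Y.T @ t; q /= np.linalg.norm(q)
            u_new = Y @ q
            if np.linalg.norm(u_new - u) < tol:
                u = u_new; break
            u = u_new
        tt = float(t.T @ t)
        p = X.T @ t / tt
        b_k = float(u.T @ t) / tt      # inner relation: u-hat = b t
        X = X - t @ p.T                # deflate X
        Y = Y - b_k * (t @ q.T)        # deflate Y
        W.append(w); P.append(p); Q.append(q); b.append(b_k)
    return np.hstack(W), np.hstack(P), np.hstack(Q), np.diag(b)

W, P, Q, B = nipals_pls2(Xc, Yc, n_comp=3)

# weights w.r.t. the *original* (undeflated) X: W* = W (P' W)^{-1}
W_star = W @ np.linalg.inv(P.T @ W)

# shortcut coefficients B' = W* B Q', then Y-hat = X B'
B_prime = W_star @ B @ Q.T
Y_hat_shortcut = Xc @ B_prime

# three-step prediction: X scores -> Y scores -> Y-hat
T_hat = Xc @ W_star
U_hat = T_hat @ B
Y_hat_steps = U_hat @ Q.T

print(np.allclose(Y_hat_shortcut, Y_hat_steps))  # both routes agree
```

Note that `W_star` plays the role of the weights "referring directly to $\mathbf X$" mentioned above; the raw NIPALS weights `W` refer to the deflated matrices and must be converted first.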
update: this procedure, modeling both $\mathbf X$ and $\mathbf Y$ with latent variables and scores, is known as PLS2. In contrast, PLS1 models only one dependent variable $\mathbf y$ (or $\mathbf Y^{(n \times 1)}$) at a time, so no Y scores are obtained. Multiple dependent variates can be modeled by separate PLS1 models -- one per variate.
Whether multiple PLS1 models or a single PLS2 model works better depends on the application, e.g. on whether the dependent variates are correlated and whether an underlying structure with few(er) latent variables is expected.
In practice, you also need to take care of centering (standard practice) and possible scaling (less standard practice) of $\mathbf X$ and $\mathbf Y$.
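The key point with centering (and optional scaling) is that the offsets are estimated from the training data only and become part of the model; a minimal sketch (the least-squares fit is just a stand-in for whatever produces your PLS coefficients $\mathbf B'$):

```python
import numpy as np

rng = np.random.default_rng(1)
X_train = rng.normal(size=(15, 4))
Y_train = X_train @ rng.normal(size=(4, 1))
X_new = rng.normal(size=(3, 4))

# center with *training* means only; these offsets are model parameters
x_mean, y_mean = X_train.mean(axis=0), Y_train.mean(axis=0)
# optional "autoscaling" would additionally divide by X_train.std(axis=0)

# placeholder for the fitted coefficients B' (here: ordinary least squares)
B_prime = np.linalg.lstsq(X_train - x_mean, Y_train - y_mean, rcond=None)[0]

# predict: center new X with the *training* mean, add the training Y mean back
Y_pred = (X_new - x_mean) @ B_prime + y_mean
```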
For cross validation, prediction works exactly the same way as for unknown cases: you fit the model on your training cases and then predict the left-out cases like any other unknown case.
(Assuming this is not asking whether shortcut solutions exist to update a PLS model for exchanging one case during leave-one-out cross validation)