Even though you asked about smoothed n-gram models, your question is more general. You want to know how the computations done in a model on a training set relate to computations on the test set.
Training set computations.
You should learn the parameters of your (n-gram) model using the training set only. In your case, the parameters are the smoothed conditional probabilities. For instance, with add-$\lambda$ smoothing you might estimate $p(\text{cat})=\frac{7+\lambda}{1000+\lambda V}$, where 7 is the count of "cat" in the training data, 1000 is the relevant training count in the denominator (the total number of tokens, for a unigram), and $V$ is your vocabulary size. These numbers are the ones you'd use to compute perplexity on the training set.
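To make that concrete, here is a minimal sketch of the training-set computation, assuming a unigram model with add-$\lambda$ (Lidstone) smoothing; the function name and the toy corpus are illustrative, not something taken from your setup:

```python
from collections import Counter

def train_add_lambda_unigram(tokens, lam=1.0):
    """Add-lambda (Lidstone) smoothed unigram estimates from training tokens."""
    counts = Counter(tokens)
    total = len(tokens)        # plays the role of the 1000 above
    vocab = set(counts)        # words seen in training
    V = len(vocab)             # vocabulary size

    def prob(word):
        # counts[word] plays the role of the 7 above
        return (counts[word] + lam) / (total + lam * V)

    return prob, vocab

# Hypothetical usage on a toy corpus:
prob, vocab = train_add_lambda_unigram("the cat sat on the mat".split(), lam=0.5)
prob("cat")  # (1 + 0.5) / (6 + 0.5 * 5)
```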
Test set computations.
When you compute the perplexity of the model on the test set, you reuse the same parameters learned from the training set. You don't recompute $p(\text{cat})$: you still use $\frac{7+\lambda}{1000+\lambda V}$, regardless of how often "cat" appears in the test data. (One notable problem to beware of: if a word shows up in the test set but is not in your training vocabulary, even the smoothed probability is 0, which makes the perplexity infinite. To fix this, it's common practice to "UNK your data", i.e. reserve an out-of-vocabulary token, which you can look up separately.)
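Continuing the sketch above, this is roughly how test-set perplexity would be computed with the frozen training-time parameters; the `<UNK>` handling and the function name are again illustrative assumptions, not part of your question:

```python
import math

def perplexity(test_tokens, prob, vocab, unk="<UNK>"):
    """Perplexity of the frozen model on held-out tokens.

    Out-of-vocabulary tokens are mapped to a hypothetical <UNK> symbol; for
    this to be well-founded, <UNK> must have been given probability mass at
    training time (e.g. by replacing rare training words with <UNK> before
    counting).
    """
    log_prob = 0.0
    for w in test_tokens:
        if w not in vocab:
            w = unk                    # "UNK your data"
        log_prob += math.log(prob(w))  # parameters come from training only
    return math.exp(-log_prob / len(test_tokens))

# Hypothetical usage, reusing prob and vocab from the training step:
perplexity("the cat sat".split(), prob, vocab)
```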
The point.
The point of this is to see how well your model generalizes. The test data is a surrogate for the real-world data you'll see when deploying your model. You ignore it when fitting the model, and you then compute perplexity on the test data as an estimate of how you'd do on that real-world data.