There is a way to address your question, but it depends on how you measure correlation.
My question is what does the correlation give that the MSE cannot?
Let's start with the definition of the $\text{MSE}$:
\begin{equation}
\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}{\Big( y_i - \hat{y}_i \Big)^2}
\end{equation}
It is a cost function that is convex in the predictions $\hat{y}_i$ (and in the parameters for linear models), which is why it is widely used with gradient-based optimization methods. Because the errors are squared, it is strongly affected by large residuals, which may come from outliers that the model couldn't learn. In addition, every sample carries the same weight $1/n$ in the overall cost of the model.
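To make the effect of a single large error concrete, here is a minimal sketch in plain Python. The data values and the helper name `mse` are made up for illustration only.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


# Illustrative data (an assumption, not taken from any real model).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_small_errors = [1.1, 1.9, 3.1, 3.9, 5.1]  # every prediction is off by 0.1
y_one_outlier = [1.0, 2.0, 3.0, 4.0, 10.0]  # four exact, one miss by 5

mse_small = mse(y_true, y_small_errors)   # about 0.01
mse_outlier = mse(y_true, y_one_outlier)  # 5.0

print(mse_small, mse_outlier)
```

Even though four of the five predictions in the second case are exact, the single squared miss dominates the average.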
On the other hand, a widely used metric to measure the relationship between two variables is the $\text{Pearson}$ correlation:
\begin{equation}
\rho(X, Y) = \frac{E \left[(X - \mu_{x}) (Y - \mu_{y}) \right]}
{\sqrt{E \left[ (X - \mu_{x})^2 \right] E \left[ (Y - \mu_{y})^2 \right] }} = \frac{cov(X, Y)}{\sigma_{X}\sigma_{Y}}
\end{equation}
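Because it is a covariance normalized by the two standard deviations, this coefficient measures only the strength of the linear relationship between $X$ and $Y$. Unlike the $\text{MSE}$, it does not change when either variable is shifted or rescaled by a positive factor. It is well known that $\text{Pearson}$ is not robust: a significant number of outliers can distort it even when the underlying relationship is linear, and it understates non-linear relationships even without outliers.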
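As a rough illustration of the difference, the sketch below computes both quantities in plain Python on made-up data. The helper names `mse` and `pearson` are mine; in practice you would probably reach for a library routine such as `scipy.stats.pearsonr`.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(x, y):
    """Sample Pearson correlation: cov(x, y) / (sigma_x * sigma_y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Illustrative data (an assumption for this example).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_shifted = [t + 10.0 for t in y_true]  # right shape, constant offset
y_cubic = [t ** 3 for t in y_true]      # monotonic but non-linear

r_shifted = pearson(y_true, y_shifted)  # about 1.0
mse_shifted = mse(y_true, y_shifted)    # 100.0
r_cubic = pearson(y_true, y_cubic)      # about 0.943

print(r_shifted, mse_shifted, r_cubic)
```

The shifted predictions follow the ground truth perfectly in the linear sense, which the correlation sees and the MSE does not, while the cubic case scores below 1 although the relationship is perfectly monotonic.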
So, with the $\text{MSE}$ every sample carries equal weight, whether the model learns it or not, and with $\text{Pearson}$ you can only measure the linear relationship. So what about outliers and non-linearities?
Another option that addresses both of these situations is the $\text{Spearman}$ correlation:
\begin{equation}
S(X, Y) = \rho_{rg_{X}, rg_{Y}} = \frac{cov(rg_{X}, rg_{Y})}{\sigma_{rg_{X}}\sigma_{rg_{Y}}}
\end{equation}
This is in fact a particular case of $\text{Pearson}$, applied to the ranks $rg_{X}$ and $rg_{Y}$ of the variables instead of their raw values. When two variables are non-linearly but monotonically related, $\text{Spearman}$ can detect it: a perfectly monotonic relationship results in a coefficient of 1 (or -1), meaning that every data point with a greater $X$ value than a given data point also has a greater (or, for -1, smaller) $Y$ value. Because it works on ranks, it is also less sensitive to outliers.
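The "Pearson on ranks" definition can be sketched directly in plain Python. The helper names and the data are assumptions for illustration; tied values receive the mean of their ranks, which is the usual convention, and a library routine such as `scipy.stats.spearmanr` is the more practical choice.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation: cov(x, y) / (sigma_x * sigma_y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """Ranks from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson applied to the ranked variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative data (an assumption for this example).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_cubic = [v ** 3 for v in x]        # increasing, non-linear
y_decreasing = [-v ** 3 for v in x]  # decreasing, non-linear

s_up = spearman(x, y_cubic)         # 1.0
s_down = spearman(x, y_decreasing)  # -1.0

print(s_up, s_down)
```

Both cases reach the extreme values because only the ordering matters, not the shape of the curve.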
I suggest you consider the following options when choosing your multiple-choice answer:
It does depend on how you calculate the correlation. The $\text{MSE}$ gives you a convex function well suited for gradient-based methods, but it does not consider how the errors vary; it only aggregates the equally weighted individual errors.
Only when using correlation: you can use $\text{Pearson}$ if you are working on a type of outcome where the predictions need to be linearly proportional to the ground truth. Computed on regions or subsamples of your data, it can also hint at where non-linearities appear.
Only when using correlation: you can use $\text{Spearman}$ if you cannot afford to have non-linear differences in the errors among subsamples or regions of your data. Basically, use this one to test whether your predicted values increase or decrease along with the ground truth, for example.
Combine the three to form a hybrid cost function. That way, you get the convexity of the $\text{MSE}$, the local linear relationship from $\text{Pearson}$, and the global non-linear monotonic relationship from $\text{Spearman}$.
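One possible way to combine the three, as a sketch only: add the MSE to the two correlation shortfalls $1 - \rho$ and $1 - S$, so that lower is better. The function name `hybrid_score`, the weights, and the data are all hypothetical. Note also that ranks are not differentiable, so as written this is suited to evaluation or model selection; using the Spearman term inside a gradient-based loss would need a smooth approximation of the ranking step.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson(x, y):
    """Sample Pearson correlation: cov(x, y) / (sigma_x * sigma_y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Ranks from 1 to n. Assumes no ties, to keep the sketch short."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(x, y):
    """Spearman correlation: Pearson applied to the ranked variables."""
    return pearson(ranks(x), ranks(y))


def hybrid_score(y_true, y_pred, w_mse=1.0, w_pearson=1.0, w_spearman=1.0):
    """Hypothetical combined score; lower is better. Weights are arbitrary."""
    return (
        w_mse * mse(y_true, y_pred)
        + w_pearson * (1.0 - pearson(y_true, y_pred))
        + w_spearman * (1.0 - spearman(y_true, y_pred))
    )


# Illustrative data (an assumption for this example).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_perfect = list(y_true)
y_reversed = [5.0, 4.0, 3.0, 2.0, 1.0]

score_perfect = hybrid_score(y_true, y_perfect)
score_reversed = hybrid_score(y_true, y_reversed)

print(score_perfect, score_reversed)
```

How to set the weights is a design choice: the MSE term is unbounded while each correlation term lies between 0 and 2, so the terms are not on the same scale by default.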