It depends on what functional of the future distribution you want to elicit.
Put differently, future outcomes follow some probability distribution (which, judging from your description, may be heavy-tailed and/or zero-inflated), and the point forecast you want to evaluate is a one-number summary of this distribution. This holds even if you never look at the distribution explicitly: it is always there, lurking under the surface.
The issue is that different error measures elicit different one-number summaries of the underlying distribution. The MSE is minimized in expectation by the expectation of the distribution. The MAE is minimized in expectation by its median. (That the MSE is more strongly influenced by the tail of the distribution than the MAE is just another way of saying that the expectation of the distribution is more strongly influenced by the tail than the median.) A quantile loss is minimized in expectation by the corresponding quantile.
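To make this concrete, here is a minimal simulation sketch. The zero-inflated lognormal sample, the grid of candidate forecasts, and all function names are my own assumptions for illustration, not anything taken from your data. It searches for the constant forecast that minimizes each error measure and compares the result to the sample mean, median, and 90% quantile.

```python
import numpy as np

def mse(y, f):
    """Mean squared error of the constant point forecast f."""
    return np.mean((y - f) ** 2)

def mae(y, f):
    """Mean absolute error of the constant point forecast f."""
    return np.mean(np.abs(y - f))

def pinball(y, f, q):
    """Quantile (pinball) loss of the constant forecast f at level q."""
    d = y - f
    return np.mean(np.maximum(q * d, (q - 1.0) * d))

def best_on_grid(loss, y, grid):
    """Return the grid value that minimizes the given loss on the sample y."""
    losses = [loss(y, f) for f in grid]
    return grid[int(np.argmin(losses))]

# Assumed toy data: 40% exact zeros, the rest lognormal (right-skewed, heavy tail).
rng = np.random.default_rng(0)
n = 20_000
is_zero = rng.random(n) < 0.4
y = np.where(is_zero, 0.0, rng.lognormal(mean=1.0, sigma=1.0, size=n))

# Candidate point forecasts, in steps of 0.01.
grid = np.linspace(0.0, 20.0, 2001)

best_mse = best_on_grid(mse, y, grid)
best_mae = best_on_grid(mae, y, grid)
best_q90 = best_on_grid(lambda y_, f: pinball(y_, f, 0.9), y, grid)

print("MSE-optimal forecast:", best_mse, "| sample mean:", y.mean())
print("MAE-optimal forecast:", best_mae, "| sample median:", np.median(y))
print("90% pinball-optimal forecast:", best_q90, "| sample 90% quantile:", np.quantile(y, 0.9))
```

Up to the grid resolution, each minimizer should coincide with the corresponding summary of the sample, and for data this skewed the three "optimal" forecasts are far apart.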
One consequence is that different point forecasts will be optimal for different error measures. Another is that your OLS regression will likely optimize the MSE as its objective function, so it does not really make sense to evaluate forecasts from an OLS model using the MAE. (The MAE would make sense if you believed in symmetric errors, since the expectation and the median then coincide, but that again does not seem to be the case here.)
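The mismatch between fitting objective and evaluation measure can be sketched with the simplest possible "models", namely constant forecasts. Everything here is an assumption for illustration: the simulated zero-inflated lognormal data, the train/test split, and the use of the training mean as a stand-in for what an intercept-only OLS fit would return.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(n):
    """Assumed toy data: 40% exact zeros, the rest lognormal."""
    is_zero = rng.random(n) < 0.4
    return np.where(is_zero, 0.0, rng.lognormal(mean=1.0, sigma=1.0, size=n))

train, test = simulate(20_000), simulate(20_000)

# An intercept-only least-squares fit returns the training mean;
# an intercept-only least-absolute-deviations fit returns the training median.
mean_forecast = train.mean()
median_forecast = np.median(train)

def evaluate(forecast, actuals):
    """Return (MSE, MAE) of a constant forecast on held-out actuals."""
    errors = actuals - forecast
    return np.mean(errors ** 2), np.mean(np.abs(errors))

mse_mean, mae_mean = evaluate(mean_forecast, test)
mse_median, mae_median = evaluate(median_forecast, test)

print(f"mean forecast:   MSE={mse_mean:.3f}  MAE={mae_mean:.3f}")
print(f"median forecast: MSE={mse_median:.3f}  MAE={mae_median:.3f}")
```

On skewed data like this, the mean forecast should win on the MSE and lose on the MAE, so judging an MSE-fitted model by the MAE penalizes it for doing exactly what it was asked to do.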
So the question should first be what functional you are interested in, and only after you have given this some thought should you pick an appropriate error measure. Which functional solves your problem, in turn, depends on what you want to do with the point forecast afterwards.
More information can be found at "What are the shortcomings of the Mean Absolute Percentage Error (MAPE)?", at "Why use a certain measure of forecast error (e.g. MAD) as opposed to another (e.g. MSE)?", and in Kolassa (2020).