Estimators do not exist "out there"; rather, we are searching for the "best" way to "find" (that is, to construct) them. As functions they may share the same general structural form, but their results, the estimates, depend on the input we give them.
The OP states two different models. For these models to have any relation to estimation, each must be accompanied by its own sample of observations, which is presumably the case. Then, applying maximum likelihood estimation, we obtain one estimate if we use only the first sample, another if we use only the second sample, and yet a third if we use both samples together. All of them are "maximum likelihood" estimates, each being the argmax of a different log-likelihood, since in each case the conditioning sample is different.
Since the assumption is that all the $\epsilon$'s are i.i.d. random variables, the two samples can be pooled, and the joint density/likelihood can be written as the product of the $n+n$ marginal normal densities. Then take logarithms, etc.
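To make this concrete, here is a sketch under an assumed specification (the OP's exact models may differ): say $y_{1i} = \mathbf{x}_{1i}'\beta + \epsilon_{1i}$ and $y_{2i} = \mathbf{x}_{2i}'\gamma + \epsilon_{2i}$, each observed $n$ times, with all errors i.i.d. $N(0,\sigma^2)$. The pooled log-likelihood is then

$$\ell(\beta,\gamma,\sigma^2) = -\frac{2n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\left[\sum_{i=1}^{n}\left(y_{1i}-\mathbf{x}_{1i}'\beta\right)^2 + \sum_{i=1}^{n}\left(y_{2i}-\mathbf{x}_{2i}'\gamma\right)^2\right],$$

and maximizing it gives the least-squares estimates $\hat\beta$, $\hat\gamma$, together with the pooled variance MLE

$$\hat\sigma^2_{\text{pooled}} = \frac{1}{2n}\left[\sum_{i=1}^{n}\left(y_{1i}-\mathbf{x}_{1i}'\hat\beta\right)^2 + \sum_{i=1}^{n}\left(y_{2i}-\mathbf{x}_{2i}'\hat\gamma\right)^2\right].$$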
Can we say anything about which of the three MLEs is "better" to use? Of course, this requires defining a criterion against which the three MLEs will be judged (e.g. Mean Squared Error, MSE). Instinctively, we would go with the one that uses both samples as one (larger) sample.
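If one wants to check this ranking rather than just trust the instinct, a small Monte Carlo sketch will do; the design below (sample sizes, coefficients, error variance) is my own assumption for illustration, not taken from the OP:

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2, reps = 30, 2.0, 20_000
beta, gamma = np.array([1.0, 0.5]), np.array([-0.3, 2.0])

def sigma2_mle(y, X):
    # ML estimate of the error variance: RSS divided by the sample size (no df correction)
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return resid @ resid / len(y)

mse = np.zeros(3)  # [sample 1 only, sample 2 only, pooled]
for _ in range(reps):
    X1 = np.column_stack([np.ones(n), rng.normal(size=n)])
    X2 = np.column_stack([np.ones(n), rng.normal(size=n)])
    y1 = X1 @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    y2 = X2 @ gamma + rng.normal(scale=np.sqrt(sigma2), size=n)
    s1, s2 = sigma2_mle(y1, X1), sigma2_mle(y2, X2)
    s_pooled = (n * s1 + n * s2) / (2 * n)   # pooled MLE = total RSS / (2n)
    mse += (np.array([s1, s2, s_pooled]) - sigma2) ** 2

print(mse / reps)  # simulated MSE of each estimator of sigma^2
```

Since the pooled estimator has the same bias as each single-sample MLE but roughly half the variance, it should come out with the smallest MSE, in line with the instinct above.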
Finally, an interesting and instructive option one could consider is to see what would happen (in terms of estimator properties) if one obtained the two MLEs separately from the two samples and then considered some combination of them. Under the assumed normality, the finite-sample distribution of the variance estimators is known. A related post on this option (but not in a regression context) is this one.
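To sketch what that would look like under the specification assumed above (again, my assumption): with $k$ regressors in each model, the residual sums of squares satisfy $\mathrm{RSS}_j/\sigma^2 \sim \chi^2_{\,n-k}$ independently for $j=1,2$, so for a convex combination

$$\tilde\sigma^2(w) = w\,\frac{\mathrm{RSS}_1}{n} + (1-w)\,\frac{\mathrm{RSS}_2}{n}, \qquad w\in[0,1],$$

both $E\!\left[\tilde\sigma^2(w)\right] = \frac{n-k}{n}\,\sigma^2$ and $\mathrm{Var}\!\left[\tilde\sigma^2(w)\right] = \left[w^2+(1-w)^2\right]\frac{2(n-k)}{n^2}\,\sigma^4$ are available in closed form. The bias does not depend on $w$, so the MSE is minimized at $w=1/2$, which reproduces the pooled MLE; one could also consider rescaling the combination (dividing by something other than $n$), which changes the bias-variance trade-off.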