(This question is an attempt to zoom in on the key issue in this question using as little information as possible.)
Let's say I want to derive the likelihood function of $\beta$ given $x$ and $y$ for the model $$y=x\beta+u$$ where $$u\sim NID(0,1).$$ I would start with these steps: \begin{align} \mathcal{L}(\beta\mid x,y) &=f_{XY}(x, y \mid \beta)\\ &=f_u(y=x\beta + u\mid\beta, x, y). \end{align} I think that I need the second step to continue the derivation, but I don't know how to motivate it. I take it entirely on intuition. Is it correct? If so, how do I justify it? If not, what would the next few steps look like?
Edit: Just to give a sense of where I'm going with this, the rest of the derivation would look like this: \begin{align} &=f_u(y-x\beta=u\mid\beta, x, y)\\ &=\varphi(y-x\beta). \end{align}
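To make the target concrete, here is a minimal Python sketch of the final expression $\varphi(y-x\beta)$ evaluated on simulated data. It rests on assumptions not stated above: a sample of independent scalar observations $(x_i, y_i)$, so that the likelihood is the product of the $\varphi(y_i-x_i\beta)$, and $x$ treated as fixed. The names `phi`, `log_phi`, `log_likelihood`, `true_beta` and the simulated data are all hypothetical. The sketch does not justify the step I am asking about; it only checks that the end result behaves as a likelihood for $\beta$ should, peaking at the least-squares estimate.

```python
import math
import random


def phi(z):
    """Standard normal density, i.e. the density of u ~ N(0, 1)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def log_phi(z):
    """Log of the standard normal density, written out to avoid underflow."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def log_likelihood(beta, xs, ys):
    """Sum of log phi(y_i - x_i * beta), assuming independent observations."""
    return sum(log_phi(y - x * beta) for x, y in zip(xs, ys))


# Simulate from y = x * beta + u with u ~ N(0, 1); true_beta is hypothetical.
random.seed(0)
true_beta = 2.0
xs = [random.uniform(-1.0, 1.0) for _ in range(200)]
ys = [x * true_beta + random.gauss(0.0, 1.0) for x in xs]

# With unit error variance and no intercept, the log-likelihood is quadratic
# in beta and is maximized at the least-squares estimate sum(x*y) / sum(x*x).
beta_ols = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Crude grid search over beta in [0, 4] with step 0.01, for comparison.
grid = [i / 100.0 for i in range(0, 401)]
beta_grid = max(grid, key=lambda b: log_likelihood(b, xs, ys))

print("least-squares estimate:", beta_ols)
print("grid maximizer of the log-likelihood:", beta_grid)
```

If the expression is right, the grid maximizer should agree with the least-squares estimate up to the grid spacing.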