I got this problem while I was reading the book "Machine Learning: A Probabilistic Perspective" by Kevin Murphy. It is in section 7.6.1 of the book.
Assume the likelihood is given by
$$ \begin{split} p(\mathbf{y}|\mathbf{X},\mathbf{w},\mu,\sigma^2) & = \mathcal{N}(\mathbf{y}|\mu+\mathbf{X}\mathbf{w}, \sigma^2\mathbf{I}_N) \\ & \propto \exp\left(-\frac{1}{2\sigma^2}(\mathbf{y}-\mu\mathbf{1}_N - \mathbf{X}\mathbf{w})^T(\mathbf{y}-\mu\mathbf{1}_N - \mathbf{X}\mathbf{w})\right) \end{split} \tag{7.53} $$
Here $\mu$ and $\sigma^2$ are scalars; $\mu$ serves as an offset (intercept), and $\mathbf{1}_N$ is the column vector of $N$ ones.
We put an improper prior on $\mu$ of the form $p(\mu) \propto 1$ and then integrate $\mu$ out to get
$$ p(\mathbf{y}|\mathbf{X},\mathbf{w},\sigma^2) \propto \exp\left(-\frac{1}{2\sigma^2}||\mathbf{y}-\bar{y}\mathbf{1}_N - \mathbf{X}\mathbf{w}||_2^2\right) \tag{7.54} $$
where $\bar{y}=\frac{1}{N}\sum_{i=1}^{N}y_i$ is the empirical mean of the output.
I tried expanding the quadratic form in the last line of $(7.53)$ and integrating over $\mu$ directly, but could not complete the derivation.
Any idea or hint on how to derive $(7.54)$ from $(7.53)$?
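**Edit:** as a sanity check (not a derivation), I verified the identity numerically. One thing I noticed while doing so: it only seems to hold when the inputs are centered ($\mathbf{1}_N^T\mathbf{X} = \mathbf{0}$), which, if I understand the section correctly, Murphy assumes there. The sketch below (my own code, not from the book) integrates the exponential in $(7.53)$ over $\mu$ on a fine grid and compares it with the right-hand side of $(7.54)$; the ratio should be the constant $\sqrt{2\pi\sigma^2/N}$ coming from the Gaussian integral over $\mu$.

```python
import numpy as np

# Numerical check of (7.54): after integrating out mu, the marginal likelihood
# should be proportional to exp(-||y - ybar*1 - X w||^2 / (2*sigma^2)), with
# proportionality constant sqrt(2*pi*sigma^2 / N) from the Gaussian integral.
# NOTE: this relies on centered inputs (1^T X = 0); with uncentered X the
# ratio below varies with w, suggesting centering is assumed in the book.
rng = np.random.default_rng(0)
N, D = 6, 3
sigma2 = 1.3

X = rng.standard_normal((N, D))
X -= X.mean(axis=0)              # center the columns so that 1^T X = 0

def marginal_numeric(y, w):
    """Integrate exp(-||y - mu*1 - X w||^2 / (2*sigma2)) over mu on a grid."""
    mus = np.linspace(-30.0, 30.0, 120001)
    r = y - X @ w                # residual without the offset
    # ||r - mu*1||^2 expanded as a quadratic in mu
    sq = np.sum(r**2) - 2.0 * mus * r.sum() + N * mus**2
    vals = np.exp(-sq / (2.0 * sigma2))
    return vals.sum() * (mus[1] - mus[0])   # simple Riemann sum

def marginal_closed(y, w):
    """Right-hand side of (7.54), without the normalizing constant."""
    return np.exp(-np.sum((y - y.mean() - X @ w) ** 2) / (2.0 * sigma2))

const = np.sqrt(2.0 * np.pi * sigma2 / N)
ratios = []
for _ in range(3):
    y = rng.standard_normal(N)
    w = rng.standard_normal(D)
    ratios.append(marginal_numeric(y, w) / marginal_closed(y, w))
print(ratios, const)             # every ratio agrees with the constant
```

The ratio being the same constant for every draw of $(\mathbf{y}, \mathbf{w})$ is exactly the proportionality claimed in $(7.54)$.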