Let $H_2$ denote the orthogonal projector onto the column space of $X_2$. We have that
\begin{align*}
& \min_{\beta_1, \beta_2} \left\{ \|y - X_1\beta_1 - X_2\beta_2\|_2^2 + \lambda \|\beta_1\|_1 \right\} \\
= & \, \min_{\beta_1, \beta_2} \left\{ \|H_2\left(y - X_1\beta_1 \right) - X_2 \beta_2\|_2^2 + \|\left(I-H_2\right)\left(y - X_1\beta_1 \right) \|_2^2 + \lambda \|\beta_1 \|_1 \right\} \\
= & \, \min_{\beta_1} \min_{\beta_2} \left\{ \|H_2\left(y - X_1\beta_1 \right) - X_2 \beta_2\|_2^2 + \|\left(I-H_2\right)\left(y - X_1\beta_1 \right) \|_2^2 + \lambda \|\beta_1 \|_1 \right\},
\end{align*}
where the first equality is the Pythagorean decomposition of the squared error into its component in $\mathrm{col}(X_2)$ and its component in the orthogonal complement (using that $X_2\beta_2 \in \mathrm{col}(X_2)$), and where the inner minimizer
\begin{align*}
\hat\beta_2
& = \arg\min_{\beta_2} \left\{ \|H_2\left(y - X_1\beta_1 \right) - X_2 \beta_2\|_2^2 + \|\left(I-H_2\right)\left(y - X_1\beta_1 \right) \|_2^2 + \lambda \|\beta_1 \|_1 \right\} \\
& = \arg\min_{\beta_2} \left\{ \|H_2\left(y - X_1\beta_1 \right) - X_2 \beta_2\|_2^2 \right\}
\end{align*}
satisfies $X_2 \hat\beta_2 = H_2 (y - X_1 \beta_1)$ for all $\beta_1$, since $H_2 (y - X_1 \beta_1) \in \mathrm{col}(X_2)$ for all $\beta_1$. If, moreover, $X_2$ has full column rank, then $H_2 = X_2 (X_2^T X_2)^{-1} X_2^T$, and we further have that $$\hat\beta_2 = (X_2^T X_2)^{-1} X_2^T (y - X_1 \beta_1).$$
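A quick numerical sanity check of this step (a minimal NumPy sketch; the dimensions, the random seed, and the full-column-rank $X_2$ are assumptions made purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p1, p2 = 50, 5, 3
X1, X2 = rng.normal(size=(n, p1)), rng.normal(size=(n, p2))
y = rng.normal(size=n)
beta1 = rng.normal(size=p1)          # any fixed beta_1

# Orthogonal projector onto col(X2) (X2 assumed to have full column rank)
H2 = X2 @ np.linalg.inv(X2.T @ X2) @ X2.T

r = y - X1 @ beta1                   # partial residual
beta2_hat = np.linalg.solve(X2.T @ X2, X2.T @ r)

# The fitted values equal the projection of the partial residual onto col(X2)
assert np.allclose(X2 @ beta2_hat, H2 @ r)

# Pythagorean split of the squared error used in the first equality above,
# checked for an arbitrary beta_2
beta2 = rng.normal(size=p2)
lhs = np.sum((y - X1 @ beta1 - X2 @ beta2) ** 2)
rhs = (np.sum((H2 @ r - X2 @ beta2) ** 2)
       + np.sum(((np.eye(n) - H2) @ r) ** 2))
assert np.allclose(lhs, rhs)
```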
Plugging this back into the outer minimization over $\beta_1$, we see that
\begin{align*}
\hat\beta_1
& = \arg\min_{\beta_1} \left\{ 0 + \|\left(I-H_2\right)\left(y - X_1\beta_1 \right) \|_2^2 + \lambda \|\beta_1 \|_1 \right\} \\
& =\arg\min_{\beta_1} \left\{ \|\left(I-H_2\right)y - \left(I-H_2\right)X_1\beta_1 \|_2^2 + \lambda \|\beta_1 \|_1 \right\}, \tag{*}
\end{align*}
which can be solved with the usual lasso computational tools. As whuber suggests in his comment, this result is intuitive: the unpenalized coefficients $\beta_2$ can absorb anything in the span of $X_2$, so only the part of the space orthogonal to $\mathrm{col}(X_2)$ matters when evaluating $\hat\beta_1$.
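For concreteness, here is a minimal sketch of the two-step procedure, assuming scikit-learn's `Lasso` as the off-the-shelf solver; note that its objective scales the squared error by $1/(2n)$, so the penalty parameter has to be rescaled to match the $\lambda$ above. The data sizes and seed are arbitrary.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, p1, p2, lam = 100, 10, 4, 5.0
X1, X2 = rng.normal(size=(n, p1)), rng.normal(size=(n, p2))
y = X1[:, :3].sum(axis=1) + X2 @ rng.normal(size=p2) + rng.normal(size=n)

# Project out col(X2)
H2 = X2 @ np.linalg.inv(X2.T @ X2) @ X2.T
M2 = np.eye(n) - H2
y_t, X1_t = M2 @ y, M2 @ X1

# Solve (*) with a standard lasso solver.
# scikit-learn's Lasso minimizes (1/(2n))||y - Xb||^2 + alpha*||b||_1,
# so alpha = lam / (2n) matches the lambda in the derivation above.
fit = Lasso(alpha=lam / (2 * n), fit_intercept=False).fit(X1_t, y_t)
beta1_hat = fit.coef_

# Recover the unpenalized block from its closed form
beta2_hat = np.linalg.solve(X2.T @ X2, X2.T @ (y - X1 @ beta1_hat))
```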
Although the notation here is slightly more general, nearly anyone who has ever used the lasso is familiar with this result. To see this, suppose that $X_2 = \mathbf{1}$ is the (length-$n$) vector of ones, representing the intercept. Then the projection matrix is $H_2 = \mathbf{1} \left( \mathbf{1}^T \mathbf{1} \right)^{-1} \mathbf{1}^T = \frac{1}{n} \mathbf{1} \mathbf{1}^T$, and, for any vector $v$, the orthogonal projection $\left( I - H_2 \right) v = v - \bar{v} \mathbf{1}$ simply demeans the vector. Considering equation $(*)$, this is exactly what people do when they compute the lasso coefficients! They demean the data so that the intercept doesn't have to be considered.
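A minimal sketch of that equivalence, again assuming scikit-learn's `Lasso` (with `fit_intercept=True` it centers the data internally, so the two fits should agree up to solver tolerance); the data and seed are again arbitrary:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n, p = 200, 8
X = rng.normal(size=(n, p))
y = 3.0 + X[:, :2].sum(axis=1) + rng.normal(size=n)

# Fit with an (unpenalized) intercept ...
with_int = Lasso(alpha=0.1, fit_intercept=True).fit(X, y)

# ... versus demeaning y and the columns of X and fitting without one.
Xc, yc = X - X.mean(axis=0), y - y.mean()
no_int = Lasso(alpha=0.1, fit_intercept=False).fit(Xc, yc)

print(np.allclose(with_int.coef_, no_int.coef_))   # True: same slope estimates
# The intercept is then recovered as ybar - xbar' * beta_hat
print(np.isclose(with_int.intercept_, y.mean() - X.mean(axis=0) @ no_int.coef_))
```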