It requires some algebra to derive the result, but it's actually not complicated.
Derivative of vector norm
First, we need to remember the derivative of the vector norm. Let's start with the non-zero case $\mathbf{x} \neq \mathbf{0}$:
$$
\frac{d}{d \mathbf{x}} \lVert \mathbf{x} \rVert =
\frac{\mathbf{x}}{ \lVert \mathbf{x}\rVert } .
$$
The vector norm is not differentiable at zero, so for $\mathbf{x} = \mathbf{0}$ we need to follow a different path: we consider the subdifferential $\partial(\lVert \mathbf{x}\rVert)$, which has the form
\begin{align}
\partial(\lVert \mathbf{x}\rVert)
&= \{ \mathbf{v} \in \mathbb{R}^{p} \mid
\lVert\mathbf{x}^\prime\rVert \geq \lVert\mathbf{x}\rVert + \mathbf{v}^\top (\mathbf{x}^\prime - \mathbf{x}), \forall \mathbf{x}^\prime \in \mathbb{R}^{p} \} \\
&= \{ \mathbf{v} \in \mathbb{R}^{p} \mid
\lVert \mathbf{x}^\prime \rVert \geq \mathbf{v}^\top \mathbf{x}^\prime, \forall \mathbf{x}^\prime \in \mathbb{R}^{p} \} \\
&= \{ \mathbf{v} \in \mathbb{R}^{p} \mid
\lVert \mathbf{v} \rVert \leq 1 \}
\end{align}
Thus, a subgradient vector $\mathbf{v}$ of $\lVert \mathbf{x} \rVert$ at $\mathbf{x} = \mathbf{0}$ needs to satisfy $\lVert \mathbf{v} \rVert \leq 1$.
Putting both cases together, the subdifferential of the norm is
$$
\partial ( \lVert \mathbf{x} \rVert ) =
\begin{cases}
\{ \mathbf{x} / \lVert \mathbf{x} \rVert \} & \text{if $\mathbf{x} \neq \mathbf{0}$} , \\
\{ \mathbf{v} \in \mathbb{R}^{p} \mid
\lVert \mathbf{v} \rVert \leq 1 \}
& \text{if $\mathbf{x} = \mathbf{0}$} .
\end{cases}
$$
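As a quick sanity check, the gradient $\mathbf{x} / \lVert \mathbf{x} \rVert$ for non-zero $\mathbf{x}$ can be compared against finite differences. The following is a minimal NumPy sketch of my own; the helper names are not from any particular library.

```python
import numpy as np

def norm_grad(x):
    """Gradient of the Euclidean norm at a non-zero point x: x / ||x||."""
    return x / np.linalg.norm(x)

def finite_diff_grad(f, x, eps=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    g = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        e = np.zeros_like(x, dtype=float)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

x = np.array([0.5, -1.2, 2.0])
print(norm_grad(x))
print(finite_diff_grad(np.linalg.norm, x))  # should closely match norm_grad(x)
```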
SCAD-penalized linear regression
The objective function we want to minimize, one group $\mathbf{\beta}_j$ at a time in a block coordinate descent fashion, is
\begin{align}
\arg\min_{\mathbf{\beta}_j} &\quad Q(\mathbf{\beta}) \\
Q(\mathbf{\beta}) &=
\frac{1}{2} (\mathbf{y} - \mathbf{X} \mathbf{\beta})^\top
(\mathbf{y} - \mathbf{X} \mathbf{\beta}) + \sum_{j=1}^J p_{\lambda,\gamma}( \lVert{ \mathbf{\beta}_j }\rVert ) ,
\end{align}
where $J$ is the number of groups, $\mathbf{\beta}_j$ denotes the coefficients of the $j$-th group, and $\lambda > 0$ controls the amount of penalization.
The SCAD penalty is defined as
$$
p_{\lambda,\gamma}(u) = \begin{cases}
\lambda u & \text{if $0 \leq u \leq \lambda$}, \\
- \frac{u^2 - 2 \gamma \lambda u + \lambda^2}{2(\gamma-1)}
& \text{if $\lambda < u \leq \gamma \lambda$}, \\
\frac{(\gamma+1)\lambda^2}{2} & \text{if $u > \gamma \lambda$} .
\end{cases}
$$
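For reference, here is a direct transcription of this three-branch penalty as a vectorized NumPy function. This is my own illustrative sketch; it assumes $u \geq 0$ and $\gamma > 2$.

```python
import numpy as np

def scad_penalty(u, lam, gamma):
    """SCAD penalty p_{lambda,gamma}(u) for u >= 0, with gamma > 2."""
    u = np.asarray(u, dtype=float)
    linear = lam * u                                                        # 0 <= u <= lambda
    quadratic = (2 * gamma * lam * u - u**2 - lam**2) / (2 * (gamma - 1))   # lambda < u <= gamma * lambda
    constant = (gamma + 1) * lam**2 / 2                                     # u > gamma * lambda
    return np.where(u <= lam, linear,
                    np.where(u <= gamma * lam, quadratic, constant))
```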
Most papers, such as Breheny and Huang (2015), assume orthonormal groups, i.e., $\left( \mathbf{X}^j \right)^\top \mathbf{X}^j = \mathbf{I}$, which I will assume here as well.
Case 1 – LASSO
Let's consider the first case, which corresponds to the LASSO. I use $\mathbf{X}^{j}$ to denote the columns of $\mathbf{X}$ corresponding to the
$j$-th group, and $\mathbf{X}^{-j}$ to denote all columns not belonging to the $j$-th group. Setting the derivative to zero, we obtain
\begin{align}
\frac{ \partial Q(\beta) }{\partial \beta_j }
&= -(\mathbf{X}^{j})^\top (\mathbf{y} - \mathbf{X}^{-j} \beta_{-j} - \mathbf{X}^{j} \beta_{j})
+ \lambda \partial(\lVert \beta_j \rVert) \\
&= -(\mathbf{X}^{j})^\top (\mathbf{y} - \mathbf{X}^{-j} \beta_{-j}) + (\mathbf{X}^{j})^\top \mathbf{X}^{j} \beta_{j}
+ \lambda \partial(\lVert \beta_j \rVert) \\
&= -(\mathbf{X}^{j})^\top (\mathbf{y} - \mathbf{X}^{-j} \beta_{-j}) + \mathbf{I} \beta_{j}
+ \lambda \partial(\lVert \beta_j \rVert) \\
&= - \mathbf{z}_j + \mathbf{I} \beta_{j}
+ \lambda \partial(\lVert \beta_j \rVert) \\
&= \mathbf{0} ,
\end{align}
where I made use of the orthonormality assumption and set $\mathbf{z}_j = (\mathbf{X}^{j})^\top (\mathbf{y} - \mathbf{X}^{-j} \beta_{-j})$, which is the part not depending on $\beta_j$.
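To make the notation concrete, here is a minimal NumPy sketch of how $\mathbf{z}_j$ could be computed. The list `groups` (column-index arrays) and the helper name are my own assumptions; the identity $\mathbf{y} - \mathbf{X}^{-j} \beta_{-j} = \mathbf{y} - \mathbf{X}\beta + \mathbf{X}^{j}\beta_{j}$ is used to avoid slicing out $\mathbf{X}^{-j}$ explicitly.

```python
import numpy as np

def partial_residual_projection(X, y, beta, groups, j):
    """Compute z_j = (X^j)^T (y - X^{-j} beta_{-j}) for the j-th group."""
    idx_j = groups[j]
    # partial residual with the j-th group's own contribution added back
    r_minus_j = y - X @ beta + X[:, idx_j] @ beta[idx_j]
    return X[:, idx_j].T @ r_minus_j
```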
If $ \beta_j \neq \mathbf{0}$, we substitute the derivative from above and obtain
$$
- \mathbf{z}_j + \beta_{j}
+ \lambda \frac{\beta_j}{\lVert \beta_j \rVert}
= \mathbf{0} ,
$$
which is equivalent to
$$
\beta_{j} = \mathbf{z}_j \left( 1 + \frac{\lambda}{\lVert \beta_j \rVert} \right)^{-1} .
$$
Now, we need to eliminate $\lVert \beta_j \rVert$ on the right-hand side. Taking the norm on both sides gives
$$
\lVert \beta_{j} \rVert
= \lVert \mathbf{z}_j \rVert \left( 1 + \frac{\lambda}{\lVert \beta_j \rVert} \right)^{-1} ,
$$
and multiplying both sides by $1 + \lambda / \lVert \beta_j \rVert$ yields $\lVert \beta_j \rVert + \lambda = \lVert \mathbf{z}_j \rVert$, i.e.,
$$
\lVert \beta_{j} \rVert = \lVert \mathbf{z}_j \rVert - \lambda ;
$$
the same step is explained in more detail in another question on the group LASSO.
Substituting this result in the equation from above, we get
$$
\beta_{j} = \mathbf{z}_j \left( 1 + \frac{\lambda}{\lVert \mathbf{z}_j \rVert - \lambda} \right)^{-1}
= \frac{\mathbf{z}_j}{\lVert \mathbf{z}_j \rVert} \left( \lVert \mathbf{z}_j \rVert - \lambda \right) .
$$
For the case $ \beta_j = \mathbf{0}$, we need to consider the subdifferential and the Karush–Kuhn–Tucker (KKT) conditions, which state
$$
\mathbf{0} \in -\mathbf{z}_{j} + \mathbf{\beta}_j
+ \lambda \mathbf{v} ,
$$
where $\mathbf{v}$ is any vector with $\lVert \mathbf{v} \rVert \leq 1$.
Since $\mathbf{\beta}_j = \mathbf{0}$ forces $\mathbf{v} = \frac{\mathbf{z}_{j}}{\lambda}$, the requirement
$\lVert \mathbf{v} \rVert \leq 1$ is equivalent to $\lVert \mathbf{z}_j \rVert \leq \lambda$.
Thus, the LASSO solution is
$$
\mathbf{\beta}_j = \begin{cases}
\frac{\mathbf{z}_j}{\lVert \mathbf{z}_j \rVert} \left( \lVert \mathbf{z}_j \rVert - \lambda \right)
& \text{if $\lVert \mathbf{z}_j \rVert > \lambda$}, \\
\mathbf{0}
& \text{if $\lVert \mathbf{z}_j \rVert \leq \lambda$}.
\end{cases}
$$
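This is exactly group soft-thresholding of $\mathbf{z}_j$: shrink its norm by $\lambda$ and set the whole group to zero if the norm falls below $\lambda$. A minimal sketch (the function name is mine):

```python
import numpy as np

def group_soft_threshold(z_j, lam):
    """Group LASSO update: shrink z_j towards zero by lambda in norm."""
    norm_z = np.linalg.norm(z_j)
    if norm_z <= lam:
        return np.zeros_like(z_j)
    return (z_j / norm_z) * (norm_z - lam)
```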
Case 2
For the second case of the SCAD penalty, the subdifferential is
$$
\partial( p_{\lambda, \gamma} ( \lVert \beta_j \rVert)) = - \frac{\beta_j - \gamma \lambda \partial(\lVert\beta_j\rVert)}{\gamma - 1} .
$$
Again, we need to distinguish between the non-zero and zero case and essentially repeat the same steps as for case 1. Let's begin with
$ \beta_j \neq \mathbf{0}$, for which the partial derivative is
$$
\frac{ \partial Q(\beta) }{\partial \beta_j }
= - \mathbf{z}_j + \beta_{j}
- \frac{\beta_j - \gamma \lambda \frac{\beta_j}{\lVert\beta_j\rVert}}{\gamma - 1}
= \mathbf{0} .
$$
Solving for $\beta_j$ and then taking the norm on both sides, as in case 1, leads to
\begin{align}
\beta_j &= \mathbf{z}_j
\left( 1 - \frac{1 - \gamma\lambda \frac{1}{\lVert\beta_j\rVert} }{\gamma - 1} \right)^{-1}
= \mathbf{z}_j
\frac{\gamma - 1}
{\gamma - 2 + \gamma\lambda \frac{1}{\lVert\beta_j\rVert}} , \\
\lVert \beta_j \rVert &=
\left( \lVert{ \mathbf{z}_j }\rVert - \frac{\gamma \lambda}{\gamma - 1} \right)
\frac{\gamma -1}{\gamma -2}
= \frac{ \lVert{ \mathbf{z}_j }\rVert (\gamma - 1) -\gamma \lambda }{\gamma - 2} .
\end{align}
Substituting the vector norm into the equation above, we obtain
\begin{align}
\beta_j &= \mathbf{z}_j
\frac{\gamma - 1}
{\gamma - 2 + \gamma\lambda \frac{\gamma - 2}{ \lVert{ \mathbf{z}_j }\rVert (\gamma - 1) -\gamma \lambda }} \\
&= \mathbf{z}_j
\frac{\gamma - 1}
{\frac{\gamma\lVert{ \mathbf{z}_j }\rVert (\gamma - 1) -\gamma^2 \lambda - 2\lVert{ \mathbf{z}_j }\rVert (\gamma - 1) +2\gamma \lambda + \gamma^2\lambda - 2 \gamma \lambda }{ \lVert{ \mathbf{z}_j }\rVert (\gamma - 1) -\gamma \lambda }} \\
&= \mathbf{z}_j
\left(
\frac{ \lVert{ \mathbf{z}_j }\rVert(\gamma - 1) - \gamma \lambda}
{\gamma \lVert{ \mathbf{z}_j }\rVert - 2\lVert{ \mathbf{z}_j }\rVert}
\right) \\
&= \frac{\mathbf{z}_j}{\lVert{ \mathbf{z}_j }\rVert}
\left(
\frac{ \lVert{ \mathbf{z}_j }\rVert(\gamma - 1) - \gamma \lambda}
{\gamma - 2}
\right) \\
&= \frac{\mathbf{z}_j}{\lVert{ \mathbf{z}_j }\rVert}
\left( \lVert{ \mathbf{z}_j }\rVert - \frac{\gamma \lambda}{\gamma - 1} \right)
\frac{\gamma -1}{\gamma -2} .
\end{align}
If $ \beta_j = \mathbf{0}$, we consider the subdifferential with $\lVert \mathbf{v} \rVert \leq 1$ and obtain the KKT condition
$$
\mathbf{0}
\in - \mathbf{z}_j
+ \frac{ \gamma \lambda \mathbf{v}}{\gamma - 1} ,
$$
from which $\mathbf{v} = \frac{(\gamma - 1) \mathbf{z}_j}{\gamma \lambda}$, so that
$$
\lVert \mathbf{v} \rVert \leq 1
\Leftrightarrow
\lVert \mathbf{z}_j \rVert \leq \frac{\gamma \lambda}{\gamma-1} .
$$
Putting both cases together, the solution for the second case is
$$
\mathbf{\beta}_j = \begin{cases}
\frac{\mathbf{z}_j}{\lVert{ \mathbf{z}_j }\rVert}
\left( \lVert{ \mathbf{z}_j }\rVert - \frac{\gamma \lambda}{\gamma - 1} \right)
\frac{\gamma -1}{\gamma -2}
& \text{if $\lVert \mathbf{z}_j \rVert > \frac{\gamma \lambda}{\gamma-1}$}, \\
\mathbf{0}
& \text{if $\lVert \mathbf{z}_j \rVert \leq \frac{\gamma \lambda}{\gamma-1}$}.
\end{cases}
$$
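In code, this middle case is again a soft-thresholding step, now at the threshold $\gamma\lambda/(\gamma-1)$ followed by a rescaling with $(\gamma-1)/(\gamma-2)$. A sketch of my own, assuming $\gamma > 2$:

```python
import numpy as np

def scad_middle_update(z_j, lam, gamma):
    """Case 2 update: soft-threshold at gamma*lam/(gamma-1), rescale by (gamma-1)/(gamma-2)."""
    norm_z = np.linalg.norm(z_j)
    threshold = gamma * lam / (gamma - 1)
    if norm_z <= threshold:
        return np.zeros_like(z_j)
    return (z_j / norm_z) * (norm_z - threshold) * (gamma - 1) / (gamma - 2)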
Case 3
Finally, the third case of the SCAD penalty corresponds to a constant penalty, i.e., no further shrinkage, and yields the solution
$$
\frac{ \partial Q(\beta) }{\partial \beta_j }
= - \mathbf{z}_j + \beta_{j}
= \mathbf{0}
\Leftrightarrow
\beta_{j} = \mathbf{z}_j .
$$
Final Solution
The points where the three cases are joined are identical to those in the univariate case. With the soft-thresholding operator $S(z, \lambda) = \operatorname{sign}(z) \left( |z| - \lambda \right)_{+}$, this brings us to the final solution of group SCAD-penalized linear regression:
\begin{align}
\beta_j &= \frac{\mathbf{z}_j}{\lVert{ \mathbf{z}_j }\rVert} F_{\lambda, \gamma} (\lVert{ \mathbf{z}_j }\rVert) \\
F_{\lambda, \gamma} (z) &= \begin{cases}
S(z, \lambda)
& \text{if $|z| \leq 2\lambda$}, \\
S\left( z, \frac{\gamma \lambda}{\gamma - 1} \right) \frac{\gamma - 1}{\gamma -2}
& \text{if $2 \lambda < |z| \leq \gamma \lambda$}, \\
z
& \text{if $|z| > \gamma \lambda$} .
\end{cases}
\end{align}
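To tie everything together, here is a minimal sketch of the resulting update and a naive block coordinate descent loop. This is my own illustration under the orthonormal-group assumption, with no convergence checks or standardization, and not the reference implementation of Breheny and Huang (2015).

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def scad_threshold(z, lam, gamma):
    """Univariate SCAD thresholding F_{lambda,gamma}(z) from the final solution."""
    if np.abs(z) <= 2 * lam:
        return soft_threshold(z, lam)
    if np.abs(z) <= gamma * lam:
        return soft_threshold(z, gamma * lam / (gamma - 1)) * (gamma - 1) / (gamma - 2)
    return z

def group_scad_update(z_j, lam, gamma):
    """Group update: beta_j = z_j / ||z_j|| * F_{lambda,gamma}(||z_j||)."""
    norm_z = np.linalg.norm(z_j)
    if norm_z == 0:
        return np.zeros_like(z_j)
    return (z_j / norm_z) * scad_threshold(norm_z, lam, gamma)

def group_scad_coordinate_descent(X, y, groups, lam, gamma=3.7, n_iter=100):
    """Cycle over the groups and apply the closed-form update.

    gamma = 3.7 is the value commonly recommended for SCAD (Fan and Li, 2001).
    Assumes each group X[:, idx_j] has been orthonormalized.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        for idx_j in groups:
            # z_j = (X^j)^T (y - X^{-j} beta_{-j}), adding back the group's own fit
            r_minus_j = y - X @ beta + X[:, idx_j] @ beta[idx_j]
            z_j = X[:, idx_j].T @ r_minus_j
            beta[idx_j] = group_scad_update(z_j, lam, gamma)
    return beta
```

Each pass applies the closed-form update derived above to every group in turn, so no line search or inner optimization is needed.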