Short answer
The function $h(X)=X$ is used for the GMM because the resulting GMM estimator coincides with the OLS estimator, which by the Gauss–Markov theorem is the best linear unbiased estimator.
The details
We start with some notation to avoid any confusion with rows and columns:
\begin{equation*}
X =
\begin{bmatrix}
x_{11} & \ldots & x_{1p} \\
\vdots & \ddots & \vdots\\
x_{n1} & \ldots & x_{np}
\end{bmatrix}
,\;\bar{y} =
\begin{bmatrix}
y_1 \\
\vdots\\
y_n
\end{bmatrix}
,\;\bar{\beta} =
\begin{bmatrix}
\beta_1 \\
\vdots\\
\beta_p
\end{bmatrix}
,\;\bar{\epsilon} =
\begin{bmatrix}
\epsilon_1 \\
\vdots\\
\epsilon_n
\end{bmatrix}
\end{equation*}
We assume the linear model $\bar{y} = X\bar{\beta} + \bar{\epsilon}$ and that $X$ has full column rank.
Taking $h(X) = X$, the GMM conditions are
\begin{equation}
E\left[
\begin{bmatrix}
x_{1j} & \cdots & x_{nj}
\end{bmatrix}
\begin{bmatrix}
\epsilon_1 \\
\vdots\\
\epsilon_n
\end{bmatrix}
\right]
= 0
\end{equation}
for $j \in \{1,\ldots,p\}$, i.e. the expected inner product of each column of $X$ with the error vector is 0 (each regressor is orthogonal to the errors in expectation). We can put these $p$ conditions into one neat equation as follows:
\begin{equation}
E\left[ X^T\bar{\epsilon}\right] = \bar{0}
\end{equation}
(Here $\bar{0}$ denotes the zero vector.)
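To see that this stacked form matches the conditions above, note that the $j$-th entry of $X^T\bar{\epsilon}$ is exactly the inner product of the $j$-th column of $X$ with the error vector:
\begin{equation*}
\left(X^T\bar{\epsilon}\right)_j = \sum_{i=1}^{n} x_{ij}\,\epsilon_i
\end{equation*}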
To find an estimate of $\bar{\beta}$ using the GMM, we minimise the norm of the sample analogue of $E\left[ X^T\bar{\epsilon}\right]$ with respect to $\bar{\beta}$. Substituting $\bar{\epsilon} = \bar{y} - X\bar{\beta}$, this means finding the value of $\bar{\beta}$ that minimises the norm of the following expression:
\begin{equation}
X^T\!\left(\bar{y} - X\bar{\beta}\right)
\end{equation}
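In standard GMM language, minimising the norm of this vector is the same as minimising the quadratic form below, where I have taken the weighting matrix to be the identity (an inessential assumption here: the model is exactly identified, so the conditions can be satisfied exactly and the choice of weighting matrix does not change the minimiser):
\begin{equation*}
Q(\bar{\beta}) = \left\| X^T\!\left(\bar{y} - X\bar{\beta}\right) \right\|^2 = \left(\bar{y} - X\bar{\beta}\right)^T X X^T \left(\bar{y} - X\bar{\beta}\right)
\end{equation*}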
Notice that $X\bar{\beta}$ lies in the column space of $X$, since it is a linear combination of the columns of $X$. Also note that $X^T\!\left(\bar{y} - X\bar{\beta}\right) = \bar{0}$ if and only if $X\bar{\beta}$ is the orthogonal projection of $\bar{y}$ onto the column space of $X$: if $X\bar{\beta}$ were any other point in the column space, the residual $\bar{y} - X\bar{\beta}$ would not be orthogonal to the column space, so at least one of the dot products in $X^T\!\left(\bar{y} - X\bar{\beta}\right)$ would be non-zero. The following diagram (taken from Wikipedia) illustrates this point:
We want to minimise the norm of $X^T\!\left(\bar{y} - X\bar{\beta}\right)$ with respect to $\bar{\beta}$, and this is clearly achieved when $X^T\!\left(\bar{y} - X\bar{\beta}\right) = \bar{0}$. Rearranging this equation gives $X^TX\bar{\beta} = X^T\bar{y}$, and since $X$ has full column rank, $X^TX$ is invertible, so the required value of $\bar{\beta}$ is
\begin{equation}
\bar{\beta} = \left(X^TX\right)^{-1}X^T\bar{y}
\end{equation}
But this is just the usual OLS estimator, which by the Gauss–Markov theorem is the best linear unbiased estimator.
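As a quick numerical sanity check (a minimal sketch with simulated data, not part of the argument above; the variable names and the simulated design are my own), solving the sample moment conditions $X^T\!\left(\bar{y} - X\bar{\beta}\right) = \bar{0}$ gives the same numbers as an off-the-shelf least-squares solver:

```python
import numpy as np

# Minimal sketch: simulate data and check that solving the moment
# conditions X^T (y - X b) = 0 reproduces the usual OLS estimate.
rng = np.random.default_rng(0)

n, p = 500, 3
X = rng.normal(size=(n, p))             # design matrix, full column rank (almost surely)
beta_true = np.array([2.0, -1.0, 0.5])  # arbitrary "true" coefficients
eps = rng.normal(size=n)                # errors, generated independently of X
y = X @ beta_true + eps

# GMM estimate with h(X) = X: solve the normal equations X^T X b = X^T y.
beta_gmm = np.linalg.solve(X.T @ X, X.T @ y)

# OLS estimate from a generic least-squares solver.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(beta_gmm, beta_ols))  # True: the two estimates coincide
print(X.T @ (y - X @ beta_gmm))         # sample moment conditions, numerically ~0
```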