I have a question about the sign and size of the OLS bias in a Tobit model. Consider the following setup:
(1) Sample of observations $\{X_i,Y_i\}_{i=1}^n$, i.i.d., where $X_i$ is a $k\times 1$ vector
(2) $ Y_i^\star=X_i'\beta +U_i $
(3) $Y_i = \begin{cases} Y_i^\star & \text{if } Y^\star_i\geq0, \\ 0 & \text{otherwise}. \end{cases}$
(4) $U_i \sim N(0, \sigma^2_u)$, $U_i$ independent of $X_i$
We can show that $$ E(Y_i| X_i, Y_i> 0)= X_i'\beta+\sigma_u\frac{\phi(\frac{X_i'\beta}{\sigma_u})}{\Phi(\frac{X_i'\beta}{\sigma_u})} $$
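For concreteness, here is a minimal Monte Carlo check of this truncated-mean formula in Python (the values $\beta_0=0.5$, $\beta_1=1$, $\sigma_u=1$ and $x=0.3$ are arbitrary, chosen only for illustration):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Arbitrary illustrative values (not from the model above)
beta0, beta1, sigma_u = 0.5, 1.0, 1.0
x = 0.3                       # a fixed value of the scalar regressor
n = 1_000_000

# Simulate Y* = beta0 + beta1*x + U and keep the positive draws
u = rng.normal(0.0, sigma_u, size=n)
y_star = beta0 + beta1 * x + u
y_pos = y_star[y_star > 0]

# Truncated-mean formula: X'beta + sigma_u * phi(z)/Phi(z), z = X'beta/sigma_u
z = (beta0 + beta1 * x) / sigma_u
theory = beta0 + beta1 * x + sigma_u * norm.pdf(z) / norm.cdf(z)

print(f"simulated E(Y | X=x, Y>0): {y_pos.mean():.4f}")
print(f"formula   E(Y | X=x, Y>0): {theory:.4f}")
```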
Suppose I run an OLS regression of $Y_i$ on $X_i$ using only the observations with $Y_i>0$, and suppose I have a single regressor plus an intercept. What are the sign and size of the bias of the OLS slope estimator? My reasoning was the following:
(i) Assume $k=2$ (a scalar regressor plus an intercept), so that on the truncated sample $Y_i=\beta_0+\beta_1X_i+\xi_i$, where $\xi_i:=\epsilon_i+\sigma_u\frac{\phi\left(\frac{\beta_0+\beta_1X_i}{\sigma_u}\right)}{\Phi\left(\frac{\beta_0+\beta_1X_i}{\sigma_u}\right)}$ and, by construction, $E(\epsilon_i\mid X_i, Y_i>0)=0$
(ii) Focus on $\beta_1$
(iii) We can show that $\hat{\beta}_{1,\mathrm{OLS}}-\beta_1\rightarrow_p \sigma_u\,\frac{\operatorname{Cov}\left(X_i,\ \frac{\phi\left(\frac{\beta_0+\beta_1X_i}{\sigma_u}\right)}{\Phi\left(\frac{\beta_0+\beta_1X_i}{\sigma_u}\right)}\right)}{\operatorname{Var}(X_i)}$, where the moments are taken over the truncated sample ($Y_i>0$)
(iv) We know that the inverse Mills ratio $\frac{\phi(z)}{\Phi(z)}$ is strictly decreasing in $z$; hence $\frac{\phi\left(\frac{\beta_0+\beta_1X_i}{\sigma_u}\right)}{\Phi\left(\frac{\beta_0+\beta_1X_i}{\sigma_u}\right)}$ is decreasing in $\beta_0+\beta_1X_i$
(v) Hence, if $\beta_1\geq 0$ then $\operatorname{Cov}\left(X_i,\ \frac{\phi\left(\frac{\beta_0+\beta_1X_i}{\sigma_u}\right)}{\Phi\left(\frac{\beta_0+\beta_1X_i}{\sigma_u}\right)}\right)\leq 0$; if $\beta_1\leq 0$ then the covariance is $\geq 0$. Either way, by (iii) the probability limit of $\hat{\beta}_{1,\mathrm{OLS}}$ is (weakly) closer to zero than $\beta_1$, i.e., the slope is attenuated (the simulation sketch below illustrates this).
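To check the sign claims in (iii)–(v) numerically, here is a minimal simulation sketch (again with arbitrary illustrative parameter values, not anything implied by the model):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Arbitrary illustrative values (not from the question)
beta0, beta1, sigma_u = 0.5, 1.0, 1.0
n = 500_000

x = rng.normal(0.0, 1.0, size=n)
u = rng.normal(0.0, sigma_u, size=n)
y_star = beta0 + beta1 * x + u

# Keep only the truncated sample Y > 0, as in the regression above
keep = y_star > 0
xs, ys = x[keep], y_star[keep]

# OLS slope of Y on X (with intercept) on the truncated sample
slope, intercept = np.polyfit(xs, ys, 1)
print(f"true beta1 = {beta1:.3f}, truncated-sample OLS slope = {slope:.3f}")

# Sign check for step (v): Cov(X, inverse Mills ratio) on the truncated sample
z = (beta0 + beta1 * xs) / sigma_u
mills = norm.pdf(z) / norm.cdf(z)
print(f"Cov(X, phi(z)/Phi(z)) = {np.cov(xs, mills)[0, 1]:.4f}")
```

With $\beta_1>0$ the printed slope should come out below $\beta_1$ and the covariance negative, matching the signs in (v).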
From several sources, however, I found that $\hat{\beta}_{1,\mathrm{OLS}}$ is instead downward biased. Could you help me understand what I am doing wrong?