Your data have come from a normal distribution, so the null hypothesis for the Jarque-Bera test (that the population the sample is drawn from has zero skew and zero excess kurtosis) is actually true. Although we usually call Jarque-Bera a "test for normality", there are other distributions which also have zero skew and zero excess kurtosis (see this answer for an example), so a Jarque-Bera test can't distinguish them from a normal distribution.
A p-value is the probability of getting a result as or more extreme than the observed result, assuming the null hypothesis is true. It is not the probability of rejecting the null hypothesis.
I hope this deals with the "Does it mean that..." aspect of your question. If we see a very small p-value, like 0.001, this means that our observed results would be very improbable if $H_0$ were true (indeed, highly surprising - something as or more extreme than this we'd only expect to happen 1 time in 1000). This leads us to suspect that $H_0$ is incorrect. In contrast, a high p-value means our result is not at all surprising, and although that is not evidence actively in favour of $H_0$, it certainly does not put $H_0$ into doubt. In general we regard low p-values as evidence against $H_0$, and a lower p-value constitutes stronger evidence. What would lead us to reject $H_0$? It's common to set a level of significance, often 5%, and reject $H_0$ if we observe a p-value lower than the significance level. In your case we would not reject $H_0$ at any sensible level of significance.
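For concreteness, here is a minimal sketch of running a single test and making that decision. I'm assuming the `jarque.bera.test` function from the tseries package; other packages offer equivalent implementations.

```r
library(tseries)

set.seed(1)
x <- rnorm(85)              # a sample of size 85 from a genuinely normal population
jb <- jarque.bera.test(x)   # H0: the population has zero skew and zero excess kurtosis
jb$p.value                  # one realisation of the p-value
jb$p.value < 0.05           # TRUE would mean "reject H0 at the 5% significance level"
```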
When $H_0$ is true, the p-value has a continuous uniform distribution between 0 and 1, also known as the rectangular distribution because of the shape of the pdf. This isn't specific to the Jarque-Bera test, and while it isn't quite true for all hypothesis tests (consider tests on discrete distributions, such as a binomial proportion test or a Poisson mean test), "the p-value is equally likely to be anywhere from 0 to 1" is usually a good way of thinking about the p-value under the null.
NB to address a common misconception: just because the null is true does not mean we should expect the p-value to be high! There is a 50% chance of it being above 0.5 and a 50% chance of it being below. If you set your significance level to 5% - that is, you will reject $H_0$ if you obtain a p-value below 0.05 - then be aware this will happen 5% of the time even when the null is true (this is why your significance level is also your probability of a Type I error). But there's also a 5% chance of the p-value being between 0.95 and 1, or between 0.32 and 0.37, or between 0.64 and 0.69. I hope this covers the "why do I get this p-value" aspect of your query.
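You can see this rectangular shape for yourself with a small simulation. A sketch, again assuming `jarque.bera.test` from the tseries package, and using a large sample size (n = 1000) so that the asymptotic approximation behind the test works well:

```r
library(tseries)

set.seed(1)
# 10,000 p-values from the Jarque-Bera test applied to genuinely normal data
pvals <- replicate(10000, jarque.bera.test(rnorm(1000))$p.value)

hist(pvals, breaks = 20)              # roughly flat ("rectangular") between 0 and 1
mean(pvals > 0.5)                     # about 0.5
mean(pvals >= 0.95)                   # about 0.05
mean(pvals >= 0.32 & pvals <= 0.37)   # also about 0.05, like any interval of width 0.05
```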
Caution: I have been describing here the ideal situation where the Jarque-Bera test is working well. The test relies on the sample skewness and sample kurtosis being normally distributed - the Central Limit Theorem guarantees this asymptotically, but the approximation is not very good in smaller samples. In fact your $n = 85$ is too small, and so the reported p-values under the null aren't quite uniformly distributed. But if you'd used `rnorm(1000)` instead, my description would have been accurate.
When you refer to the "probability to discard the normality hypothesis (it being true)" you seem to be thinking about the Type I error rate. But you can't see that from just one sample; you need to think about the chances of making an incorrect decision across many samples. A good way to understand how error rates work is by simulation. Keep running the same R code and you'll keep getting different p-values. Make a histogram of those p-values and you'll find them approximately equally likely to be drawn anywhere between 0 and 1, so long as you've chosen a large enough $n$ for the Jarque-Bera test to work nicely. If you set your significance level at 5% you'll find that, in the long run, you make the Type I error of rejecting the null hypothesis even though it's true (which happens in your simulation when p < 0.05) about 5% of the time. If you want to reduce your Type I error rate to 1%, set your significance level to 1%. You might even set it lower. The problem with doing so is that you make it much harder to reject the null hypothesis when it is false, so you increase the Type II error rate.
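A sketch of that simulation, again assuming `tseries::jarque.bera.test` and a sample size large enough for the asymptotics to hold; the long-run Type I error rate is simply the proportion of simulated p-values falling below your chosen significance level.

```r
library(tseries)

set.seed(2)
pvals <- replicate(10000, jarque.bera.test(rnorm(1000))$p.value)

hist(pvals)          # approximately uniform between 0 and 1
mean(pvals < 0.05)   # Type I error rate at the 5% significance level, close to 0.05
mean(pvals < 0.01)   # lowering the significance level to 1% lowers it to about 0.01
```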
Also, if you do want to apply a Jarque-Bera test on a sample size as low as 85, my earlier caution about small sample sizes applies. Since the reported p-values based on the asymptotic distribution will not be uniformly distributed under the null, p < 0.05 doesn't occur 5% of the time, so you can't achieve a Type I error rate of 5% simply by rejecting $H_0$ whenever the reported p < 0.05! Instead, you have to adjust the critical values, e.g. based on simulation results, as is done in Section 4.1 of Thadewald, T., and H. Büning (2004), "Jarque-Bera test and its competitors for testing normality - A power comparison", Discussion Paper Economics 2004/9, School of Business and Economics, Free University of Berlin.
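One way to make that adjustment is sketched below, again assuming `tseries::jarque.bera.test` (the helper name jb_stat is mine, not from the paper): simulate the test statistic under the null at your actual sample size and use the empirical 95th percentile as the 5% critical value, instead of the asymptotic $\chi^2_2$ value.

```r
library(tseries)

set.seed(3)
n <- 85
jb_stat <- function(x) unname(jarque.bera.test(x)$statistic)

# distribution of the JB statistic under the null at this sample size
null_stats <- replicate(20000, jb_stat(rnorm(n)))

quantile(null_stats, 0.95)   # simulated 5% critical value for n = 85
qchisq(0.95, df = 2)         # asymptotic critical value (about 5.99), for comparison
```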
In your simulation you only considered normally distributed data; what if you simulate data that isn't normal instead? In this case we should reject the null hypothesis, but you will find that you don't always get a p-value below 0.05 (or whatever significance level you set), so sometimes the Jarque-Bera test results do not give you sufficient evidence to reject. The more powerful the test, the better it is at telling you to reject $H_0$ in this situation. You will find that you can improve the power of the test by increasing the sample size: when the null was true, changing the sample size made no difference to the rectangular distribution of the p-values (try it!), but when the data aren't drawn from a normal population, low p-values become increasingly likely as you increase the sample size. The power of the test is also higher if your data depart more blatantly from normality - see what happens as you sample from distributions with more extreme skew and kurtosis. There are alternative normality tests available, and they will have different powers against different types of departure from normality.
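A sketch of such a power check, under the same assumption that you use `tseries::jarque.bera.test` (the helper power_at_n is my own illustrative name): here the data come from a t-distribution with 5 degrees of freedom, which has heavier tails than the normal, so $H_0$ is false and low p-values are what we hope to see.

```r
library(tseries)

set.seed(4)
power_at_n <- function(n, rdist, nsim = 2000, alpha = 0.05) {
  pvals <- replicate(nsim, jarque.bera.test(rdist(n))$p.value)
  mean(pvals < alpha)                        # proportion of simulations that reject H0
}
power_at_n(85,  function(n) rt(n, df = 5))   # estimated power at your sample size
power_at_n(500, function(n) rt(n, df = 5))   # power increases with sample size
power_at_n(85,  rexp)                        # a more blatant departure: higher power
```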
A final word of warning. Be aware that in many practical situations, we do not really want to run a normality test at all. Sometimes normality tests can be useful, though - for instance, if you are of a skeptical disposition and want to check whether the "random normal deviates" generated by your statistical software are genuinely normal. You should find that the `rnorm` function in R is fine, however!