What to make of high R-squared and non-significant p-value of a linear model?

Asked Jan 18 '20 at 21:35

Active Jan 21 '20 at 00:38

I am using doc2vec to produce $\mathbb{R}^{50}$ vector representations of short pieces of text, and then using those vectors as predictors in a linear model of a continuous outcome variable. The $R^2$ is 0.25, which I believe is good considering what I am trying to predict. However, the model's p-value is 0.3, which is not significant.

Is $R^2$ an appropriate measure of model fit in this case? What should I make of the rather high p-value?

I have 69 participants each reporting 6 documents for a total of 414 documents. The vectors are being used to predict beliefs about emotion-related features of the situation that each participant is describing. For example, in one model, I am trying to predict the participant's perceived coping ability to handle the emotional situation (the document) they described. The dependent variable is measured on a 7-point Likert scale.
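To put the numbers above in context, here is a minimal sketch of three standard quantities for a linear model with 50 predictors and 414 observations: the adjusted $R^2$, the overall F-statistic implied by $R^2$, and the $R^2$ expected by chance when no predictor is related to the outcome. It assumes ordinary least squares with an intercept, Gaussian errors, and 414 independent observations. None of this is confirmed in the question, and the independence assumption in particular is doubtful because each participant contributes 6 documents. The function names are hypothetical.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for an OLS model with intercept, n observations, k predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def overall_f_statistic(r2: float, n: int, k: int) -> float:
    """Overall F-statistic implied by R^2, with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def expected_r2_under_null(n: int, k: int) -> float:
    """Mean of R^2 when no predictor is related to the outcome (Gaussian errors)."""
    return k / (n - 1)


if __name__ == "__main__":
    # Values taken from the question; treating all 414 documents as
    # independent observations is an assumption, not something stated there.
    r2, n, k = 0.25, 414, 50

    print("adjusted R^2:          ", round(adjusted_r2(r2, n, k), 4))
    print("overall F-statistic:   ", round(overall_f_statistic(r2, n, k), 4))
    print("expected R^2 by chance:", round(expected_r2_under_null(n, k), 4))
```

The sketch does not compute a p-value, and it says nothing about how the reported p-value of 0.3 was obtained. It only shows how much of a raw $R^2$ can be attributed to the number of predictors alone under the stated assumptions.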

edited Jan 20 '20 at 21:56

asked Jan 18 '20 at 21:35

Ashish

What is your sample size? How are the vectors being used? – Peter Flom Jan 19 '20 at 13:22
I added this information. Is there any other information that is needed? How do I re-open the question in light of the added information? Thank you. – Ashish Jan 20 '20 at 21:57

0 Answers