I am using doc2vec to produce $\mathbb{R}^{50}$ vector representations of short pieces of text. I then use those vectors as predictors in a linear model of a continuous outcome variable. The $R^2$ is 0.25, which I believe is good considering what I am trying to predict. However, the p-value is 0.3, which is rather high.

Is $R^2$ an appropriate metric for model fit in this case? What should I make of the rather high p-value?

I have 69 participants each reporting 6 documents for a total of 414 documents. The vectors are being used to predict beliefs about emotion-related features of the situation that each participant is describing. For example, in one model, I am trying to predict the participant's perceived coping ability to handle the emotional situation (the document) they described. The dependent variable is measured on a 7-point Likert scale.
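
For reference, here is a minimal sketch of how the adjusted $R^2$ and the overall F statistic follow from the numbers above. It assumes an ordinary least squares fit with an intercept, all 50 vector dimensions entered as predictors, and the 414 documents treated as independent observations (which the repeated-measures design, 6 documents per participant, may not satisfy). The function names are hypothetical and only for illustration.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2: penalizes R^2 for the number of predictors used."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def overall_f_statistic(r2, n_obs, n_predictors):
    """Overall F statistic for H0: all slope coefficients are zero.

    Returns the statistic together with its model and residual
    degrees of freedom.
    """
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    f_stat = (r2 / df_model) / ((1.0 - r2) / df_resid)
    return f_stat, df_model, df_resid


# Values taken from the question; OLS with an intercept is an assumption.
r2, n_obs, n_predictors = 0.25, 414, 50

adj = adjusted_r2(r2, n_obs, n_predictors)
f_stat, df_model, df_resid = overall_f_statistic(r2, n_obs, n_predictors)

print(f"adjusted R^2 = {adj:.3f}")
print(f"F({df_model}, {df_resid}) = {f_stat:.2f}")
```

Under these assumptions the adjusted $R^2$ comes out at about 0.147 and the F statistic at 2.42 on 50 and 363 degrees of freedom. The corresponding p-value can be read from an F distribution, for example with `scipy.stats.f.sf(f_stat, df_model, df_resid)`. If the p-value reported by the software differs from that, it may refer to a different test or model specification.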

Ashish
  • What is your sample size? How are the vectors being used? – Peter Flom Jan 19 '20 at 13:22
  • I added this information. Is there any other information that is needed? How do I re-open the question in light of the added information? Thank you. – Ashish Jan 20 '20 at 21:57
