
Normally, when we do logistic regression, we have a dataset that looks something like this:

   X1 X2 Y
1:  A  3 0
2:  A  4 0
3:  A  3 0
4:  B  4 1

(4 observations)

However, for some reason, I only have the aggregated version:

   X1 X2 count Y_count
1:  A  3     2       0
2:  A  4     1       0
3:  B  4     1       1

(i.e. a summary table: for example, there are 2 records with X1 = A and X2 = 3, and none of them has Y = 1)

Now, I understand that I can simply replicate the rows and run the logistic regression the normal way, but my question is whether I can fit the logistic regression directly on the summary table (potentially in R). Are there any implications?
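
To make the question concrete, here is a minimal sketch of what fitting on the summary table could look like with base R's `glm`. The data frame `agg` holds hypothetical counts rather than the table above, because that table is tiny and perfectly separated (Y = 1 only when X1 = B), so its estimates would not be finite. The names `fit_wide`, `fit_weighted`, `fit_long` and `long` are only for illustration.

```r
# Hypothetical aggregated data (NOT the table above): one row per
# covariate pattern, with the group size and the number of Y = 1 cases.
agg <- data.frame(
  X1      = c("A", "A", "B", "B"),
  X2      = c(3, 4, 3, 4),
  count   = c(10, 8, 12, 9),
  Y_count = c(2, 3, 6, 7)
)

# Wide form: two-column response of (successes, failures) per row.
fit_wide <- glm(cbind(Y_count, count - Y_count) ~ X1 + X2,
                family = binomial, data = agg)

# Weighted form: observed proportion as response, group size as weight.
fit_weighted <- glm(Y_count / count ~ X1 + X2,
                    family = binomial, weights = count, data = agg)

# Long form: rebuild one row per observation, purely for comparison.
long <- do.call(rbind, lapply(seq_len(nrow(agg)), function(i) {
  data.frame(
    X1 = agg$X1[i],
    X2 = agg$X2[i],
    Y  = rep(c(1, 0), c(agg$Y_count[i], agg$count[i] - agg$Y_count[i]))
  )
}))
fit_long <- glm(Y ~ X1 + X2, family = binomial, data = long)

# Put the coefficient estimates from the three fits side by side.
cbind(wide     = coef(fit_wide),
      weighted = coef(fit_weighted),
      long     = coef(fit_long))
```

In this sketch the three fits should give the same coefficients and standard errors. The residual deviance and residual degrees of freedom will differ between the long fit and the two aggregated fits, which matters if those are used for goodness-of-fit checks.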

There are other posts with similar questions: here. Also, I'm aware that "weights in logistic regression differs from ... weights in linear regression" (ref).
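
As far as I can tell, the reason the summary table might be enough is the following. The notation is my own assumption: $g$ indexes the rows of the summary table, $n_g$ is `count`, $y_g$ is `Y_count`, and $p_g(\beta) = 1 / (1 + e^{-x_g^\top \beta})$ is the modelled probability for that row's covariates $x_g$.

$$
\ell_{\text{long}}(\beta) = \sum_g \Big[ y_g \log p_g(\beta) + (n_g - y_g) \log\big(1 - p_g(\beta)\big) \Big]
$$

$$
\ell_{\text{agg}}(\beta) = \sum_g \log \binom{n_g}{y_g} + \ell_{\text{long}}(\beta)
$$

The two log-likelihoods differ only by a term that does not involve $\beta$, so they are maximised by the same $\beta$ and have the same curvature in $\beta$.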

LeGeniusII
  • The answer to this already exists in the answer to this question: https://stats.stackexchange.com/questions/259502/in-using-the-cbind-function-in-r-for-a-logistic-regression-on-a-2-times-2-t – StatsStudent Jan 21 '19 at 05:18
  • For people who have similar questions, there are three data formats logistic regression can use: long form, wide form, and weighted form – LeGeniusII Jan 28 '19 at 23:00
  • @LeGeniusll, that's addressed in the previous answer. – StatsStudent Jan 28 '19 at 23:54
