How to deal with repeated rows before multiple linear regression modeling?

Asked Nov 11 '21 at 21:11

Active Nov 11 '21 at 21:11

I want to fit a multiple linear regression model to a dataset with one or two thousand observations, and I have found three or four repeated rows in it. How should I deal with these repeated rows? Should I remove them before modeling, or is it fine to keep them? And how would the results differ between these two choices?

asked Nov 11 '21 at 21:11

Cary

It depends on *why* those rows are repeated. If they represent observations with *independent* errors, then they are just as valid as any other row and there is little basis to remove them. Often a consideration of the potential sources of error is helpful in resolving this issue, because it can reveal the extent to which the errors might depart from the standard assumption of independence. – whuber Nov 11 '21 at 21:33

@whuber Thank you! – Cary Nov 12 '21 at 15:11
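
To get a feel for how much the two choices can differ numerically, here is a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the sample size (1500), the three predictors, the true coefficients, the noise level, and the helper name `fit_ols`. It fits ordinary least squares once with four exact duplicate rows kept and once with them dropped. Keeping an exact duplicate is equivalent to giving that observation a weight of 2 in the fit. The sketch does not settle whether the duplicates *should* be kept; that still depends on why they are repeated, as the comment above explains.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice

# Synthetic data (hypothetical): 1500 observations, 3 predictors.
n, p = 1500, 3
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0, 0.5])
y = 1.0 + X @ beta_true + rng.normal(scale=1.0, size=n)

# Append exact copies of 4 existing rows to mimic repeated rows.
dup_idx = rng.choice(n, size=4, replace=False)
data = np.column_stack([X, y])
data_with_dups = np.vstack([data, data[dup_idx]])


def fit_ols(rows):
    """Return OLS coefficients (intercept first); last column is the response."""
    design = np.column_stack([np.ones(len(rows)), rows[:, :-1]])
    coef, *_ = np.linalg.lstsq(design, rows[:, -1], rcond=None)
    return coef


coef_keep = fit_ols(data_with_dups)
# np.unique with axis=0 drops exact duplicate rows (and sorts the rows,
# which does not affect the least-squares solution).
data_dedup = np.unique(data_with_dups, axis=0)
coef_drop = fit_ols(data_dedup)

print("keep duplicates:", coef_keep)
print("drop duplicates:", coef_drop)
print("max abs difference:", np.max(np.abs(coef_keep - coef_drop)))
```

With only a handful of duplicates among well over a thousand rows, the two sets of coefficients should be very close in a setup like this one. A duplicated row would matter more if it had high leverage or a large residual, or if the dataset were much smaller.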
