My DataFrame has 2919 rows. Take, for example, the column "2ndFlrSF"
2ndFlrSF: second-floor area in square feet
These are the value counts I get after running the pandas command
conc1['2ndFlrSF'].value_counts()
where conc1 is my DataFrame.
Output:
0 1668
546 23
728 18
504 17
672 13
600 13
720 13
896 11
886 10
756 9
780 9
862 8
601 7
702 7
840 7
754 6
462 6
676 6
744 6
804 6
630 6
878 6
739 6
567 6
689 6
858 5
741 5
704 5
684 5
678 5
...
605 1
591 1
1150 1
1152 1
1158 1
1160 1
1074 1
1072 1
1066 1
1060 1
956 1
966 1
679 1
980 1
673 1
990 1
992 1
994 1
998 1
1000 1
1004 1
1008 1
661 1
1028 1
659 1
1036 1
1038 1
1042 1
1048 1
1721 1
Name: 2ndFlrSF, Length: 635, dtype: int64
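One thing worth noting about this output: by default, value_counts() silently drops NaN, so it doesn't tell you how many missing values the column has. Passing dropna=False shows them. A minimal sketch (using a small made-up Series in place of the real conc1 data):

```python
import pandas as pd
import numpy as np

# Small stand-in for conc1 (hypothetical values).
conc1 = pd.DataFrame({"2ndFlrSF": [0, 0, 546, np.nan, 0, 728, np.nan]})

# value_counts() excludes NaN by default; dropna=False includes it,
# so the missing values show up as their own row in the counts.
counts = conc1["2ndFlrSF"].value_counts(dropna=False)
print(counts)
```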
As you can see, the column is mostly 0's (1668 of the 2919 rows), which makes it look uninformative. I have many more columns like this. What should I do with such columns, and how should I impute the NaN values in them accordingly?
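For area-like columns, a 0 usually means the house simply lacks the feature (no second floor), so a common approach is to fill NaN with 0 rather than with the mean or median. Below is a sketch of that idea, using a toy DataFrame and a hypothetical 50% threshold for "mostly zero" (the column names and data here are made up; only the technique is the point):

```python
import pandas as pd
import numpy as np

# Toy DataFrame standing in for conc1 (hypothetical data).
conc1 = pd.DataFrame({
    "2ndFlrSF":   [0, 0, 0, 0, 546, np.nan, 728, 0],
    "GarageArea": [0, 0, 0, np.nan, 400, 0, 0, 520],
})

# Find numeric columns where more than half the values are 0
# (the 0.5 threshold is an arbitrary choice for this sketch).
zero_heavy = [
    col for col in conc1.select_dtypes("number").columns
    if (conc1[col] == 0).mean() > 0.5
]

# In area columns, NaN most plausibly also means "feature absent",
# so filling with 0 keeps the encoding consistent.
conc1[zero_heavy] = conc1[zero_heavy].fillna(0)

print(zero_heavy)
print(conc1["2ndFlrSF"].tolist())
```

Whether to keep such a column at all depends on the model: the zeros are not noise, they encode "no second floor", so dropping the column throws that signal away. An alternative is to add a binary indicator column (e.g. has_2nd_floor = conc1["2ndFlrSF"] > 0) alongside the raw area.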