Row Correlation Heatmap Pandas

Asked Jul 11 '16 at 11:02

Active Oct 04 '18 at 09:54

Viewed 1,025 times

I'm trying to find relationships or patterns among a large number of rows (~2000) in a dataset, and I'm thinking of using a correlation heatmap. However, after computing the row-by-row correlation matrix with `df = df.T.corr()` and plotting only the first 100 rows with seaborn, the heatmap is already unreadable.

Is there a clearer way to do this with a larger number of rows?

edited Oct 04 '18 at 09:54

kjetil b halvorsen

asked Jul 11 '16 at 11:02

user3508494

Sorting the correlation matrix may reveal clusters of variables; see [here](http://stats.stackexchange.com/q/26920/1036) for one description of how to sort it. – Andy W Jul 11 '16 at 12:17
Any Python based solutions? – user3508494 Jul 11 '16 at 13:18
I found `sns.clustermap(df.T.corr(), metric='correlation', method='centroid')` which might do the trick. – tmrlvi Nov 22 '17 at 15:26
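The reordering that a clustered heatmap relies on can also be computed directly, which makes it easier to inspect the row order before plotting anything. The following is only a minimal sketch, not the commenter's exact call: the helper name `cluster_order_corr` and the synthetic `demo` frame are made up for illustration, and it uses average linkage on `1 - correlation` as an assumption, because SciPy documents the `centroid` method as correctly defined only for Euclidean distances.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform


def cluster_order_corr(df: pd.DataFrame, method: str = "average") -> pd.DataFrame:
    """Row-by-row correlation matrix, reordered by hierarchical clustering.

    Hypothetical helper for illustration. Assumes no row is constant
    (a constant row would produce NaN correlations).
    """
    corr = df.T.corr()
    # Correlation -> distance: 0 for perfectly correlated rows, 2 for anti-correlated.
    dist = 1.0 - corr.to_numpy()
    np.fill_diagonal(dist, 0.0)      # remove floating-point noise on the diagonal
    dist = (dist + dist.T) / 2.0     # enforce exact symmetry
    condensed = squareform(dist, checks=False)
    order = leaves_list(linkage(condensed, method=method))
    return corr.iloc[order, order]


# Synthetic data (an assumption): 20 rows alternating between two noisy patterns.
rng = np.random.default_rng(0)
base = rng.normal(size=(2, 50))
demo = pd.DataFrame(
    np.array([base[i % 2] + 0.1 * rng.normal(size=50) for i in range(20)])
)

ordered = cluster_order_corr(demo)
print(list(ordered.index))  # rows from the same pattern end up next to each other
```

Passing `ordered` to `sns.heatmap(ordered, center=0)` should then show the correlated rows as blocks along the diagonal instead of scattered cells.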
Try some basic clustering first (with the kernel trick if necessary), then order the rows of your dataset by cluster label. In Python, you can use scikit-learn's k-means (optionally after a PCA projection) or whatever clustering technique works with your data. – Romain Reboulleau Oct 04 '18 at 11:06
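A rough sketch of the cluster-then-sort idea with scikit-learn's `KMeans` might look like the following. The helper name `order_rows_by_kmeans`, the choice of two clusters, and the synthetic `demo` frame are assumptions for illustration only. Each row is z-scored first, so that Euclidean distance between rows tracks their correlation (for z-scored rows of length n, the squared distance equals 2n(1 - r)).

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans


def order_rows_by_kmeans(df: pd.DataFrame, n_clusters: int = 2, seed: int = 0):
    """Sort rows by k-means cluster label (hypothetical helper).

    Assumes no row is constant, since a constant row has zero standard deviation.
    """
    values = df.to_numpy(dtype=float)
    # z-score each row so k-means groups rows with similar shape, not similar scale.
    z = (values - values.mean(axis=1, keepdims=True)) / values.std(axis=1, keepdims=True)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(z)
    order = np.argsort(labels, kind="stable")  # keep original order within a cluster
    return df.iloc[order], labels[order]


# Synthetic data (an assumption): 20 rows alternating between two noisy patterns.
rng = np.random.default_rng(0)
base = rng.normal(size=(2, 50))
demo = pd.DataFrame(
    np.array([base[i % 2] + 0.1 * rng.normal(size=50) for i in range(20)])
)

sorted_df, sorted_labels = order_rows_by_kmeans(demo, n_clusters=2)
corr = sorted_df.T.corr()  # correlated rows are now adjacent, giving a block structure
```

With ~2000 rows, the number of clusters would need tuning, and plotting one representative row (or the mean) per cluster may be more readable than plotting every row.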

0 Answers