python - AgglomerativeClustering on a correlation matrix

I have a correlation matrix of typical structure, of size 288x288, defined by:

    from sklearn.cluster import AgglomerativeClustering

    df = read_returns()
    correl_matrix = df.corr()

where read_returns gives me a DataFrame with a date index and columns of asset returns.

Now I want to cluster these correlations to reduce the population size.

By doing some reading and experimenting I discovered AgglomerativeClustering, and at first pass it appears to be an appropriate solution to my problem.

I define a distance metric as ((.5*(1-correl_matrix))**.5), and have:

    cluster = AgglomerativeClustering(n_clusters=40, linkage='average')
    cluster.fit(((.5*(1-correl_matrix))**.5).values)
    label_groups = cluster.labels_

To observe some of the data and cross-check my work, I pick out cluster 1, look at its pairwise correlations, and find the minimum correlation between any two items of that group in my dataset:

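    # collect the names of the assets assigned to cluster 1
    single_cluster = []
    for i in range(0, correl_matrix.shape[0]):
        if label_groups[i] == 1:
            single_cluster.append(correl_matrix.index[i])

    # smallest correlation between two distinct members of that cluster
    min_correl = 1.0
    for x in single_cluster:
        for y in single_cluster:
            if x != y:
                if correl_matrix[x][y] < min_correl:
                    min_correl = correl_matrix[x][y]

    print(min_correl)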

and I get a minimum pairwise correlation of .20

To me this seems quite low, although "low based on what?" is a fair question to which I have no answer.

I would like to anticipate/enforce that each pairwise correlation within a cluster is >= .7 or something like this.

Is this possible in AgglomerativeClustering?

Am I accidentally going down the wrong path?

Hierarchical clustering supports different "linkage" strategies.

single-link: connects points based on the minimum distance to the others in the cluster
complete-link: connects points based on the maximum distance to the cluster
...

If you want a high minimum correlation = small maximum distance, this calls for complete linkage.
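A minimal sketch of that idea, assuming SciPy is available and reusing the distance from the question, sqrt(0.5*(1-corr)). The helper name cluster_by_min_correlation and the toy matrix are made up for illustration. With complete linkage, cutting the tree at the distance that corresponds to a correlation of 0.7 should leave only clusters in which every pair is at least that correlated:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_by_min_correlation(corr, min_corr=0.7):
    """Hypothetical helper: complete-linkage clusters whose members are
    all pairwise correlated at min_corr or above."""
    corr = np.asarray(corr, dtype=float)
    # Same distance as in the question: sqrt(0.5 * (1 - corr)).
    dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))
    np.fill_diagonal(dist, 0.0)
    # linkage() expects the condensed (upper-triangle) form of the matrix.
    condensed = squareform(dist, checks=False)
    Z = linkage(condensed, method="complete")
    # Translate the correlation floor into a distance ceiling and cut there.
    threshold = np.sqrt(0.5 * (1.0 - min_corr))
    return fcluster(Z, t=threshold, criterion="distance")


# Toy data (made up): three mutually correlated assets and one unrelated one.
corr = np.array([
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.75, 0.2],
    [0.8, 0.75, 1.0, 0.0],
    [0.1, 0.2, 0.0, 1.0],
])
labels = cluster_by_min_correlation(corr, min_corr=0.7)
```

Note that this fixes the correlation floor rather than the number of clusters, so you get however many clusters the data needs to satisfy it.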

You may want to treat negative correlations as "good", too, i.e. use dist = 1 - abs(corr).
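As a small illustration of that distance (the function name abs_correlation_distance and the numbers are made up), a strongly negative correlation ends up as close as a strongly positive one:

```python
import numpy as np


def abs_correlation_distance(corr):
    """Hypothetical helper: dist = 1 - abs(corr), with a zero diagonal."""
    corr = np.asarray(corr, dtype=float)
    dist = 1.0 - np.abs(corr)
    np.fill_diagonal(dist, 0.0)
    return dist


# Toy data (made up): items 0 and 1 are strongly anti-correlated.
corr = np.array([
    [1.0, -0.9, 0.1],
    [-0.9, 1.0, -0.2],
    [0.1, -0.2, 1.0],
])
dist = abs_correlation_distance(corr)
```

Here items 0 and 1 come out close to each other even though their correlation is negative, while item 2 stays far from both.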

Make sure to use the dendrogram. If you have outliers in your data, you will want to cut into (n_clusters+n_outliers) partitions.
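A rough sketch of that, again assuming SciPy and using a made-up toy matrix with two tight pairs and one outlier. The names and the counts n_clusters and n_outliers are assumptions for the example; in practice you would read them off the dendrogram:

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from scipy.spatial.distance import squareform

# Toy data (made up): A/B and C/D are tight pairs, E is unrelated to everything.
names = ["A", "B", "C", "D", "E"]
corr = np.array([
    [1.0, 0.9, 0.3, 0.3, 0.0],
    [0.9, 1.0, 0.3, 0.3, 0.0],
    [0.3, 0.3, 1.0, 0.9, 0.0],
    [0.3, 0.3, 0.9, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
])

dist = np.sqrt(0.5 * (1.0 - corr))
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="complete")

# no_plot=True only computes the layout; drop it to draw with matplotlib.
tree = dendrogram(Z, labels=names, no_plot=True)
leaf_order = tree["ivl"]  # leaf labels in left-to-right dendrogram order

# Ask for the wanted clusters plus one extra partition per expected outlier.
n_clusters, n_outliers = 2, 1
labels = fcluster(Z, t=n_clusters + n_outliers, criterion="maxclust")

# Singleton partitions are the outlier candidates.
sizes = np.bincount(labels)
outliers = [name for name, lab in zip(names, labels) if sizes[lab] == 1]
```

Cutting this toy tree into only 2 partitions would spend one of them on E and lump the two pairs together, which is why the extra partitions matter.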
