
Cluster analysis in MATLAB with 14 clustering methods


Jun 01, 2021




Cluster analysis, also known as group analysis, is a statistical method for studying how samples should be classified, and it is also an important data mining technique. A cluster is made up of several patterns, where a pattern is usually a vector of measurements. Clustering is based on similarity: patterns in the same cluster are more similar to one another than to patterns in other clusters.
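As a small illustration of the similarity idea, the sketch below uses six made-up 2-D points (hypothetical toy data, not the data used in this article) that form two obvious groups. It checks that the largest distance inside a group is smaller than the smallest distance between the groups, and that a hierarchical clustering cut into two clusters recovers the groups. It assumes the Statistics and Machine Learning Toolbox is available for pdist, linkage, and cluster.

```matlab
% Hypothetical toy data: two well-separated groups of three points each
P=[0 0; 0 1; 1 0; 10 10; 10 11; 11 10];
M=squareform(pdist(P,'euclidean'));   % 6x6 matrix of pairwise distances
within=[M(1:3,1:3); M(4:6,4:6)];      % distances inside each group
between=M(1:3,4:6);                   % distances across the two groups
maxWithin=max(within(:));             % largest within-group distance
minBetween=min(between(:));           % smallest between-group distance
% Single linkage, cut into two clusters, should recover the two groups
T=cluster(linkage(pdist(P,'euclidean'),'single'),'maxclust',2);
disp([maxWithin minBetween]);
disp(T');
```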

Clustering is most often carried out in SPSS, usually by importing the data and selecting a clustering method. This article uses MATLAB instead and applies 14 different clustering methods to the same sample data.

14 clustering methods

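(1) Complete linkage (farthest-neighbor method)

% Seven samples (rows), five variables (columns)
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
D=pdist(X,'euclidean');        % pairwise Euclidean distances
M=squareform(D);               % same distances as a square matrix
Z=linkage(D,'complete');       % complete-linkage cluster tree
H=dendrogram(Z);               % plot the tree
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);               % cophenetic correlation coefficient
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters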


(2) Single linkage (nearest-neighbor method)

X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
D=pdist(X,'euclidean');        % pairwise Euclidean distances
M=squareform(D);
Z=linkage(D,'single');         % single-linkage cluster tree
H=dendrogram(Z);
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);               % cophenetic correlation coefficient
T=cluster(Z,'cutoff',0.8);     % cut where the inconsistency coefficient exceeds 0.8


(3) One-step clustering with clusterdata

X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
T=clusterdata(X,0.8);          % distances, linkage, and cutting in one call
Re=find(T==5)                  % samples assigned to cluster 5 (empty if there is no such cluster)
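For reference, clusterdata bundles the three steps that the other examples spell out. The sketch below assumes the documented defaults (Euclidean distance, single linkage, and, for a cutoff below 2, the inconsistency criterion) and assumes the data matrix has five columns per row. It compares the one-call result with the explicit pdist, linkage, cluster pipeline.

```matlab
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
cutoff=0.8;                        % inconsistency threshold, as in method (3)
T1=clusterdata(X,cutoff);          % one-call version
D=pdist(X,'euclidean');            % step 1: pairwise distances
Z=linkage(D,'single');             % step 2: cluster tree
T2=cluster(Z,'cutoff',cutoff);     % step 3: cut the tree
same=isequal(T1,T2);               % expected to be true under the assumed defaults
disp(same);
```

Spelling out the steps is useful when a different distance or linkage is wanted, which is what the remaining methods do.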

(4) Centroid method with standardized Euclidean distance

S={'Fukuoka';'Hefei';'Wuhan';'Changsha';'Guilin';'Wenzhou';'Chengdu'};   % sample labels
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
D=pdist(X,'seuclidean');       % Euclidean distance with each variable scaled by its standard deviation
M=squareform(D);
Z=linkage(D,'centroid');       % centroid linkage
H=dendrogram(Z,'labels',S);
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters


(5) Centroid method with squared Euclidean distance

S={'Fukuoka';'Hefei';'Wuhan';'Changsha';'Guilin';'Wenzhou';'Chengdu'};   % sample labels
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
D=pdist(X,'euclidean');
D2=D.^2;                       % squared Euclidean distances
M=squareform(D2);
Z=linkage(D2,'centroid');      % centroid linkage on the squared distances
H=dendrogram(Z,'labels',S);
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D2);
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters


(6) Centroid method with precision-weighted distance

S={'Fukuoka';'Hefei';'Wuhan';'Changsha';'Guilin';'Wenzhou';'Chengdu'};   % sample labels
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
[n,m]=size(X);
stdx=std(X);                   % standard deviation of each variable
X2=X./stdx(ones(n,1),:);       % weight each variable by 1/std
D=pdist(X2,'euclidean');
M=squareform(D);
Z=linkage(D,'centroid');
H=dendrogram(Z,'labels',S);
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters
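Dividing each variable by its standard deviation and then taking the Euclidean distance is what the standardized Euclidean option of pdist does internally, so methods (4) and (6) should produce the same distances and hence the same tree. A minimal check, assuming the data matrix has five columns per row:

```matlab
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
n=size(X,1);
stdx=std(X);                            % standard deviation of each column
X2=X./stdx(ones(n,1),:);                % manual scaling, as in method (6)
D_scaled=pdist(X2,'euclidean');         % Euclidean distance on scaled data
D_seuclid=pdist(X,'seuclidean');        % built-in standardized Euclidean, as in method (4)
maxDiff=max(abs(D_scaled-D_seuclid));   % should be at the level of rounding error
disp(maxDiff);
```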


(7) Single linkage with standardized Euclidean distance on the principal components

S={'Fukuoka';'Hefei';'Wuhan';'Changsha';'Guilin';'Wenzhou';'Chengdu'};   % sample labels
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
[E,score,latent,tsq]=pca(X);   % pca supersedes the older princomp; score holds the principal component scores
D=pdist(score,'seuclidean');   % standardized Euclidean distance between the scores
M=squareform(D);
Z=linkage(D,'single');         % single linkage
H=dendrogram(Z,'labels',S);
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters
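One point worth noting: when all principal components are kept, the scores are uncorrelated and their variances are the eigenvalues of the covariance matrix, so the standardized Euclidean distance between scores should match the Mahalanobis distance on the original data used in method (10). The sketch below checks this numerically. It assumes the data matrix has five columns per row and that pca is available; the variable names are illustrative.

```matlab
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
[coeff,score,latent]=pca(X);               % all components are kept by default
D_pc=pdist(score,'seuclidean');            % distance used in method (7)
D_mahal=pdist(X,'mahalanobis');            % distance used in method (10)
relDiff=max(abs(D_pc-D_mahal)./D_mahal);   % largest relative difference
disp(relDiff);
```

The equivalence no longer holds once components are dropped, as in method (14), which keeps only the first two.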


(8) Average linkage with standardized Euclidean distance

S={'Fukuoka';'Hefei';'Wuhan';'Changsha';'Guilin';'Wenzhou';'Chengdu'};   % sample labels
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
D=pdist(X,'seuclidean');       % standardized Euclidean distance
M=squareform(D);
Z=linkage(D,'average');        % unweighted average linkage
H=dendrogram(Z,'labels',S);
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters


(9) Weighted average linkage with standardized Euclidean distance

S={'Fukuoka';'Hefei';'Wuhan';'Changsha';'Guilin';'Wenzhou';'Chengdu'};   % sample labels
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
D=pdist(X,'seuclidean');       % standardized Euclidean distance
M=squareform(D);
Z=linkage(D,'weighted');       % weighted average linkage
H=dendrogram(Z,'labels',S);
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters


(10) Single linkage with Mahalanobis distance

S={'Fukuoka';'Hefei';'Wuhan';'Changsha';'Guilin';'Wenzhou';'Chengdu'};   % sample labels
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
D=pdist(X,'mahalanobis');      % Mahalanobis distance
M=squareform(D);
Z=linkage(D,'single');         % single linkage
H=dendrogram(Z,'labels',S);
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters


(11) Centroid method with Euclidean distance on standardized data

S={'Fukuoka';'Hefei';'Wuhan';'Changsha';'Guilin';'Wenzhou';'Chengdu'};   % sample labels
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
[n,m]=size(X);
mv=mean(X);
st=std(X);
x=(X-mv(ones(n,1),:))./st(ones(n,1),:);   % z-score each variable
D=pdist(x,'euclidean');        % distances on the standardized data x, not the raw X
M=squareform(D);
Z=linkage(D,'centroid');
H=dendrogram(Z,'labels',S);
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters


(12) Complete linkage with Euclidean distance

S={'Fukuoka';'Hefei';'Wuhan';'Changsha';'Guilin';'Wenzhou';'Chengdu'};   % sample labels
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
D=pdist(X,'euclidean');
M=squareform(D);
Z=linkage(D,'complete');       % complete linkage
[H,leafNodes,perm]=dendrogram(Z,'labels',S);   % perm is the left-to-right order of the leaves
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters


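(13) Average linkage with a similarity coefficient (cosine)

S={'Fukuoka';'Hefei';'Wuhan';'Changsha';'Guilin';'Wenzhou';'Chengdu'};   % sample labels
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
D=pdist(X,'cosine');           % one minus the cosine of the angle between samples
M=squareform(D);
Z=linkage(D,'average');        % average linkage, to match the method named in the heading
H=dendrogram(Z,'labels',S);
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters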


(14) Single linkage with standardized Euclidean distance on the first two principal components

S={'Fukuoka';'Hefei';'Wuhan';'Changsha';'Guilin';'Wenzhou';'Chengdu'};   % sample labels
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
[E,score,latent,tsq]=pca(X);   % pca supersedes the older princomp
PCA=[score(:,1),score(:,2)];   % keep only the first two principal components
D=pdist(PCA,'seuclidean');
M=squareform(D);
Z=linkage(D,'single');         % single linkage
H=dendrogram(Z,'labels',S);
xlabel('City');
ylabel('Scale');
C=cophenet(Z,D);
T=cluster(Z,'maxclust',3);     % cut the tree into at most 3 clusters
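Every script above computes the cophenetic correlation coefficient C, which measures how faithfully the tree preserves the original pairwise distances (values closer to 1 indicate a better fit). One possible way to use it, sketched below, is to compare several linkage methods on the same distances. The choice of standardized Euclidean distance, the four linkage methods, and the variable names are illustrative assumptions, and the data matrix is assumed to have five columns per row.

```matlab
X=[16.2 1492 2000 -8.2 6.2;
   15.7 970 2209 -20.6 1.9;
   16.3 1260 2085 -17.3 2.8;
   17.2 1422 1726 -9.5 4.6;
   18.8 1874 1709 -4.9 8.0;
   17.9 1698 1848 -4.5 7.5;
   16.3 976 1239 -4.6 5.6];
D=pdist(X,'seuclidean');                      % one distance vector shared by all methods
linkMethods={'single','complete','average','weighted'};
C=zeros(1,numel(linkMethods));
for k=1:numel(linkMethods)
    Z=linkage(D,linkMethods{k});              % build the tree with this linkage
    C(k)=cophenet(Z,D);                       % cophenetic correlation for this tree
    fprintf('%-9s cophenetic correlation = %.4f\n',linkMethods{k},C(k));
end
[~,best]=max(C);                              % index of the best-fitting linkage
fprintf('Highest coefficient: %s\n',linkMethods{best});
```

The coefficient is only one criterion; the dendrograms themselves should still be inspected before settling on a method.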



Source: www.toutiao.com/a6863649930347545091/
