This documents gives some instructions on how to create graphical
representations of a correlation matrix in the statistical environment R
with package `Correlplot`

, using a variety of different
statistical methods (Graffelman and De Leeuw
(2023)). We use principal component analysis (PCA),
multidimensional scaling (MDS), principal factor analysis (PFA),
weighted alternating least squares (WALS), correlograms (CRG) and
corrgrams to produce displays of correlation structure. The next section
shows how to use the functions of the package in order to create the
different graphical representations, using both R base graphics and
`ggplot2`

graphics (Wickham
(2016)). The computation of goodness-of-fit statistics is also
addressed. All methods are illustrated on a single data set, the wheat
kernel data introduced below.

We first load some packages we will use:

```
library(calibrate)
#> Loading required package: MASS
library(ggplot2)
library(corrplot)
#> corrplot 0.92 loaded
library(Correlplot)
```

Throughout this vignette, we will use the wheat kernel data set taken
from the UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/seeds) in order
to illustrate the different plots. The wheat kernel data (Charytanowicz et al. (2010)) consists of 210
wheat kernels, for which the variables *area* (\(A\)), *perimeter* (\(P\)), *compactness* (\(C = 4*\pi*A/P^2\)), *length*,
*width*, *asymmetry coefficient* and *groove*
(length of the kernel groove) were registered. There are 70 kernels of
each of three varieties *Kama*, *Rosa* and
*Canadian*; here we will only use the kernels of variety
*Kama*. The data is made available with:

```
data("Kernels")
X <- Kernels[Kernels$variety==1,]
X <- X[,-8]
head(X)
#> area perimeter compactness length width asymmetry groove
#> 1 15.26 14.84 0.8710 5.763 3.312 2.221 5.220
#> 2 14.88 14.57 0.8811 5.554 3.333 1.018 4.956
#> 3 14.29 14.09 0.9050 5.291 3.337 2.699 4.825
#> 4 13.84 13.94 0.8955 5.324 3.379 2.259 4.805
#> 5 16.14 14.99 0.9034 5.658 3.562 1.355 5.175
#> 6 14.38 14.21 0.8951 5.386 3.312 2.462 4.956
```

The correlation matrix of the variables is given by:

```
p <- ncol(X)
R <- cor(X)
round(R,digits=3)
#> area perimeter compactness length width asymmetry groove
#> area 1.000 0.976 0.371 0.835 0.900 -0.050 0.721
#> perimeter 0.976 1.000 0.165 0.921 0.802 -0.054 0.794
#> compactness 0.371 0.165 1.000 -0.146 0.667 0.037 -0.131
#> length 0.835 0.921 -0.146 1.000 0.551 -0.037 0.866
#> width 0.900 0.802 0.667 0.551 1.000 -0.027 0.447
#> asymmetry -0.050 -0.054 0.037 -0.037 -0.027 1.000 -0.011
#> groove 0.721 0.794 -0.131 0.866 0.447 -0.011 1.000
```

The corrgram (Friendly (2002)) is a
tabular display of the entries of a correlation matrix that uses colour
and shading to represent correlations. Corrgrams can be made with the
fuction `corrplot`

This shows most correlations are positive, and correlations with
*asymmetry* are weak.

The correlogram (Trosset (2005)) represents correlations by the cosines of angles between vectors.

The vector `theta.cos`

contains the angles (in radians) of
each variable with respect to the positive \(x\)-axis. The approximation provided by
these angles to the correlation matrix is calculated by

The correlogram always perfectly represents the correlations of the
variables with themselves, and these have a structural contribution of
zero to the loss function. We calculate the root mean squared error
(RMSE) of the approximation by using function `rmse`

. We
include the diagonal of the correlation matrix in the RMSE calculation
by supplying a weight matrix of only ones.

This gives and RMSE of 0.2438, which shows this representation has a large amount of error. The correlogram can be modified by using a linear interpretation rule, rendering correlations linear in the angle (Graffelman (2013)). This representation is obtained by:

```
theta.lin <- correlogram(R,ifun="lincos",labs=colnames(R),xlim=c(-1.1,1.1),
ylim=c(-1.1,1.1),main="CRG")
```

The approximation to the correlation matrix by using this linear interpretation function is calculated by

```
Rhat.corlin <- angleToR(theta.lin,ifun="lincos")
rmse.lin <- rmse(R,Rhat.corlin,W=W1)
rmse.lin
#> [1] 0.1667556
```

and this gives a RMSE of 0.1668. In this case, the linear representation is seen to improve the approximation.

Function `ggcorrelogram`

can be used to create
correlograms on a `ggplot2`

canvas (Wickham (2016)). The output object contains the
field `theta`

containing the vector of angles.

We create a PCA biplot of the correlation matrix, doing the
calculations for a PCA by hand, using the singular value decomposition
of the (scaled) standardized data. Alternatively, standard R function
`princomp`

may be used to obtain the coordinates needed for
the correlation biplot. We use function `bplot`

from package
`calibrate`

to make the biplot:

```
n <- nrow(X)
Xt <- scale(X)/sqrt(n-1)
res.svd <- svd(Xt)
Fs <- sqrt(n)*res.svd$u # standardized principal components
Gp <- res.svd$v%*%diag(res.svd$d) # biplot coordinates for variables
bplot(Fs,Gp,colch=NA,collab=colnames(X),xlab="PC1",ylab="PC2",main="PCA")
```

The joint representation of kernels and variables emphasizes this is
a biplot of the (standardized) data matrix. However, this plot is a
*double biplot* because scalar products between variable vectors
approximate the correlation matrix. We stress this by plotting the
variable vectors only, and adding a unit circle. In order to facilitate
recovery of the approximations to the correlations between the
variables, correlation tally sticks can be added as with the
`tally`

function, as is shown in the figure below:

```
bplot(Gp,Gp,colch=NA,rowch=NA,collab=colnames(X),xl=c(-1,1),yl=c(-1,1),main="PCA")
circle()
tally(Gp[-6,1:2],values=seq(-0.2,0.8,by=0.2))
```

The PCA biplot of the correlation matrix can be obtained from a correlation-based PCA or also directly from the spectral decomposition of the correlation matrix. The rank two approximation, obtained by means of scalar products between vectors, is calculated by:

In principle, PCA also tries to approximate the ones on the diagonal of the correlation matrix. These are included in the RMSE calculation by supplying a unit weight matrix.

This gives a RMSE of 0.1460, which is an improvement over the
previous correlograms. Function `rmse`

can also be used to
calculate the RMSE of each variable separately:

```
rmse(R,Rhat.pca,W=W1,per.variable=TRUE)
#> area peri comp leng widt asym groo
#> 0.01429494 0.02168169 0.03158330 0.02386245 0.02047550 0.27686959 0.06000407
```

This shows that *asymmetry*, the shortest biplot vector, has
the worst fit.

PCA biplots can also be made on a `ggplot2`

canvas using
the function `ggbplot`

. We redo the biplots of the data
matrix and the correlation matrix. It is convenient first to establish
the range of variation along both axes considering both row and column
markers. We find these ranges using `jointlim`

:

```
limits <- jointlim(Fs,Gp)
limits
#> $xlim
#> [1] -2.250838 2.529940
#>
#> $ylim
#> [1] -2.650718 1.960994
```

Next, we prepare two dataframes, one with the coordinates and names for the rows, and one for the columns.

```
df.rows <- data.frame(Fs[,1:2])
colnames(df.rows) <- c("PA1","PA2")
df.rows$strings <- 1:n
df.cols <- data.frame(Gp[,1:2])
colnames(df.cols) <- c("PA1","PA2")
df.cols$strings <- colnames(R)
```

We compute the variance decomposition table:

```
lambda <- res.svd$d^2
lambda.frac <- lambda/sum(lambda)
lambda.cum <- cumsum(lambda.frac)
decom <- round(rbind(lambda,lambda.frac,lambda.cum),digits=3)
colnames(decom) <- paste("PC",1:p,sep="")
decom
#> PC1 PC2 PC3 PC4 PC5 PC6 PC7
#> lambda 4.205 1.516 0.999 0.208 0.051 0.020 0.001
#> lambda.frac 0.601 0.217 0.143 0.030 0.007 0.003 0.000
#> lambda.cum 0.601 0.817 0.960 0.990 0.997 1.000 1.000
```

And use the table to construct axis labels that inform about the percentage of variance explained by each PC.

```
xlab <- paste("PC1 (",round(100*lambda.frac[1],digits=1),"%)",sep="")
ylab <- paste("PC2 (",round(100*lambda.frac[2],digits=1),"%)",sep="")
```

Finally, we construct the PCA biplot of the data matrix with
`ggbplot`

.

```
biplotX <- ggbplot(df.rows,df.cols,main="PCA biplot",xlab=xlab,
ylab=ylab,xlim=limits$xlim,ylim=limits$ylim,
colch="")
biplotX
```

The biplot of the correlation matrix is obtained by supplying the same biplot markers twice, once for the rows and once for the columns. Because the goodness-of-fit of the correlation matrix requires the squared eigenvalues (Gabriel (1971); Graffelman and De Leeuw (2023)), we first establish new labels for the axes:

```
lambdasq <- lambda^2
lambdasq.frac <- lambdasq/sum(lambdasq)
lambdasq.cum <- cumsum(lambdasq.frac)
decomsq <- round(rbind(lambdasq,lambdasq.frac,lambdasq.cum),
digits=3)
colnames(decomsq) <- paste("PC",1:p,sep="")
decomsq
#> PC1 PC2 PC3 PC4 PC5 PC6 PC7
#> lambdasq 17.682 2.299 0.997 0.043 0.003 0 0
#> lambdasq.frac 0.841 0.109 0.047 0.002 0.000 0 0
#> lambdasq.cum 0.841 0.950 0.998 1.000 1.000 1 1
xlab <- paste("PC1 (",round(100*lambdasq.frac[1],digits=1),"%)",sep="")
ylab <- paste("PC2 (",round(100*lambdasq.frac[2],digits=1),"%)",sep="")
```

```
biplotR <- ggbplot(df.cols,df.cols,main="PCA Correlation biplot",xlab=xlab,
ylab=ylab,xlim=c(-1,1),ylim=c(-1,1),
rowarrow=TRUE,rowcolor="blue",colch="",
rowch="")
biplotR
```

We now add correlation tally sticks to the biplot, in order to favour
“reading off” the approximations of the correlations. Increments of 0.2
in the correlation scale are marked off along the biplot vectors. For
the shortest biplot vector, `asym`

, we use a modified scale
with 0.01 increments. Note that the goodness-of-fit of the correlation
matrix is (always) better or as good as the goodness-of-fit of the
standardized data matrix (Graffelman and De Leeuw
(2023)).

```
biplotR <- ggtally(Gp[-6,1:2],biplotR,values=seq(-0.2,0.8,by=0.2),dotsize=0.2)
biplotR <- ggtally(Gp[6,1:2],biplotR,values=seq(-0.01,0.01,by=0.01),dotsize=0.2)
biplotR
```

We transform correlations to distances with the \(\sqrt{2(1-r)}\) transformation (Hills (1969)), and use the `cmdscale`

function from the `stats`

package to perform metric
multidimensional scaling. We mark negative correlations with a dashed
red line.

```
Di <- sqrt(2*(1-R))
out.mds <- cmdscale(Di,eig = TRUE)
Fp <- out.mds$points[,1:2]
opar <- par(bty = "l")
plot(Fp[,1],Fp[,2],asp=1,xlab="First principal axis",
ylab="Second principal axis",main="MDS")
textxy(Fp[,1],Fp[,2],colnames(R),cex=0.75)
par(opar)
ii <- which(R < 0,arr.ind = TRUE)
for(i in 1:nrow(ii)) {
segments(Fp[ii[i,1],1],Fp[ii[i,1],2],
Fp[ii[i,2],1],Fp[ii[i,2],2],col="red",lty="dashed")
}
```

We calculate distances in the map, and convert these back to correlations:

Again, correlations of the variables with themselves are perfectly approximated (zero distance), and we include these in the RMSE calculations:

The same MDS map can be made on a `ggplot2`

canvas
with

```
colnames(Fp) <- paste("PA",1:2,sep="")
rownames(Fp) <- as.character(1:nrow(Fp))
Fp <- data.frame(Fp)
Fp$strings <- colnames(R)
MDSmap <- ggplot(Fp, aes(x = PA1, y = PA2)) +
coord_equal(xlim = c(-1,1), ylim = c(-1,1)) +
ggtitle("MDS") +
xlab("First principal axis") + ylab("Second principal axis") +
theme(plot.title = element_text(hjust = 0.5,
size = 8),
axis.ticks = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_blank())
MDSmap <- MDSmap + geom_point(data = Fp, aes(x = PA1, y = PA2),
colour = "black", shape = 1)
MDSmap <- MDSmap + geom_text(data = Fp, aes(label = strings),
size = 3, alpha = 1,
vjust = 2)
Z <- matrix(NA,nrow=nrow(ii),ncol=4)
for(i in 1:nrow(ii)) {
Z[i,] <- c(Fp[ii[i,1],1],Fp[ii[i,1],2],Fp[ii[i,2],1],Fp[ii[i,2],2])
}
colnames(Z) <- c("X1","Y1","X2","Y2")
Z <- data.frame(Z)
MDSmap <- MDSmap + geom_segment(data=Z,aes(x=X1,y=Y1,
xend=X2,yend=Y2),
alpha=0.75,color="red",linetype=2)
MDSmap
```

Principal factor analysis can be performed by the function
`pfa`

of package `Correlplot`

. We extract the
factor loadings.

The biplot of the correlation matrix obtained by PFA is in fact the same as what is known as a factor loading plot in factor analysis, to which a unit circle can be added. The approximation to the correlation matrix and its RMSE are calculated as:

In this case, the diagonal is excluded, for PFA explicitly avoids fitting the diagonal. To make the factor loading plot, aka PFA biplot of the correlation matrix:

```
bplot(L,L,pch=NA,xl=c(-1,1),yl=c(-1,1),xlab="Factor 1",ylab="Factor 2",main="PFA",rowch=NA,
colch=NA)
circle()
```

The RMSE of the plot obtained by PFA is 0.0112, lower than the RMSE
obtained by PCA. Note that variable *area* reaches the unit
circle for having a communality of 1, or, equivalently, specificity 0,
i.e. PFA produces what is known as a *Heywood case* in factor
analysis. The specificities are given by:

```
diag(out.pfa$Psi)
#> area peri comp leng widt asym
#> 0.000000000 0.011494664 0.158097108 0.007885055 0.022509545 0.997799829
#> groo
#> 0.258768379
```

We also construct the PFA biplot on a `ggplot2`

canvas
with

```
L.df <- data.frame(L,rownames(L))
colnames(L.df) <- c("PA1","PA2","strings")
ggbplot(L.df,L.df,xlab="Factor 1",ylab="Factor 2",main="PFA biplot",
rowarrow=TRUE,rowcolor="blue",colch="",rowch="")
```

The correlation matrix can also be factored using weighted
alternating least squares, avoiding the fit of the ones on the diagonal
of the correlation matrix by assigning them weight 0, using function
`ipSymLS`

(De Leeuw
(2006)).

```
bplot(Fp.als,Fp.als,rowch=NA,colch=NA,collab=colnames(R),
xl=c(-1.1,1.1),yl=c(-1.1,1.1),main="WALS")
circle()
```

Weighted alternating least squares has, in contrast to PFA, no
restriction on the vector length. If the vector lengths in the biplot
are calculated, then variable *area* is seen to just move out of
the unit circle.

```
Rhat.wals <- Fp.als%*%t(Fp.als)
sqrt(diag(Rhat.wals))
#> [1] 1.00124368 0.99394213 0.91345321 0.99646265 0.99026217 0.04686397 0.86124152
rmse.als <- rmse(R,Rhat.wals)
rmse.als
#> [1] 0.01118619
```

The RMSE of the approximation obtained by WALS is 0.011186, slightly
below the RMSE of PFA. The WALS low rank approximation to the
correlation matrix can also be obtained by the more generic function
`wAddPCA`

, which allows for non-symmetric matrices and
adjustments (see the next section). In order to get uniquely defined
biplot vectors for each variable, corresponding to symmetric input, an
eigendecomposition is applied to the fitted correlation matrix.

```
out.wals <- wAddPCA(R, Wdiag0, add = "nul", verboseout = FALSE, epsout = 1e-10)
Rhat.wals <- out.wals$a%*%t(out.wals$b)
out.eig <- eigen(Rhat.wals)
Fp.adj <- out.eig$vectors[,1:2]%*%diag(sqrt(out.eig$values[1:2]))
rmse.als <- rmse(R,Rhat.wals)
rmse.als
#> [1] 0.01118619
```

We also make the WALS biplot on a `ggplot2`

canvas
with

```
Fp.als.df <- data.frame(Fp.als,colnames(R))
colnames(Fp.als.df) <- c("PA1","PA2","strings")
ggbplot(Fp.als.df,Fp.als.df,xlab="Dimension 1",ylab="Dimension 2",main="WALS biplot",
rowarrow=TRUE,rowcolor="blue",colch="",rowch="")
```

A scalar adjustment can be employed to improve the approximation of the correlation matrix, and the adjusted correlation matrix is factored for a biplot representation. That means we seek the factorization

\[ {\mathbf R} - \delta {\mathbf J} = {\mathbf R}_a = {\mathbf G} {\mathbf G}', \]

where both \(\delta\) and \(\mathbf G\) are chosen such that the
(weighted) residual sum of squares is minimal. This problem is solved by
using function `wAddPCA`

.

```
out.wals <- wAddPCA(R, Wdiag0, add = "one", verboseout = FALSE, epsout = 1e-10)
delta <- out.wals$delta[1,1]
Rhat <- out.wals$a%*%t(out.wals$b)
out.eig <- eigen(Rhat)
Fp.adj <- out.eig$vectors[,1:2]%*%diag(sqrt(out.eig$values[1:2]))
```

The optimal adjustment \(\delta\) is 0.071. The corresponding biplot is shown below.

```
bplot(Fp.adj,Fp.adj,rowch=NA,colch=NA,collab=colnames(R),
xl=c(-1.2,1.2),yl=c(-1.2,1.2),main="WALS adjusted")
circle()
```

Note that, when calculating the fitted correlation matrix, adjustment \(\delta\) is added back. The fitted correlation matrix and its RMSE are now calculated as:

This gives RMSE 0.0056. This is smaller than the RMSE obtained by WALS without adjustment. We summarize the values of the RMSE of all methods in a table below:

```
rmsevector <- c(rmse.crg,rmse.lin,rmse.pca,rmse.mds,rmse.pfa,rmse.als,rmse.adj)
methods <- c("CRG (cos)","CRG (lin)","PCA","MDS",
"PFA","WALS R","WALS Radj")
results <- data.frame(methods,rmsevector)
results <- results[order(rmsevector),]
results
#> methods rmsevector
#> 7 WALS Radj 0.005560242
#> 6 WALS R 0.011186190
#> 5 PFA 0.011196885
#> 4 MDS 0.068374694
#> 3 PCA 0.145959040
#> 2 CRG (lin) 0.166755585
#> 1 CRG (cos) 0.243753461
```

A summary of RMSE calculations for four methods (PCA and WALS, both
with and without \(\delta\) adjustment)
can be obtained by the functions `FitRwithPCAandWALS`

and
`rmsePCAandWALS`

; the first applies the four methods to the
correlation matrix, and the latter calculates the RMSE statistics for
the four approximations. The bottom line of the table produced by
`rmsePCAandWALS`

gives the overall RMSE for each methods.

```
output <- FitRwithPCAandWALS(R,eps=1e-15)
rmsePCAandWALS(R,output)
#> PCA PCA-A WALS WALS-A
#> area 0.0143 0.0542 0.0081 0.0043
#> peri 0.0217 0.0405 0.0088 0.0064
#> comp 0.0316 0.0590 0.0110 0.0054
#> leng 0.0239 0.0485 0.0067 0.0045
#> widt 0.0205 0.0626 0.0076 0.0068
#> asym 0.2769 0.1182 0.0181 0.0049
#> groo 0.0600 0.0675 0.0135 0.0061
#> All 0.1460 0.0706 0.0112 0.0056
```

The scalar adjustment changes the interpretation of the origin, and
it is convenient to show the zero correlation point on each biplot
vector. We do so by creating tally sticks, redoing the biplot on a
`ggplot2`

canvas. The origin now represents correlation 0.07
for all variables.

```
Fp.adj.df <- data.frame(Fp.adj,colnames(R))
colnames(Fp.adj) <- c("PA1","PA2")
colnames(Fp.adj.df) <- c("PA1","PA2","strings")
WALSbiplot.adj <- ggbplot(Fp.adj.df,Fp.adj.df,xlab="Dimension 1",ylab="Dimension 2",
main="WALS adjusted",rowarrow=TRUE,
rowcolor = "blue",rowch="",colch="")
WALSbiplot.adj <- ggtally(Fp.adj[-6,1:2],WALSbiplot.adj,
adj=-out.wals$delta[1,1],
values=seq(-0.2,0.8,by=0.2),dotsize=0.2)
WALSbiplot.adj
```

We generate an asymmetric approximation to the correlation matrix using a column adjustment, based on the decomposition

\[ {\mathbf R} = \delta {\mathbf 1} {\mathbf 1}' + {\mathbf 1} {\mathbf q}' + {\mathbf G} {\mathbf G}' + {\mathbf E}. \]

This decomposition can be obtained with function
`FitDeltaQSym`

.

```
out.walsq.sym <- FitRDeltaQSym(R,Wdiag0,itmax.inner=30000,itmax.outer=30000,
eps=1e-8)
Gq.sym <- out.walsq.sym$C
rownames(Gq.sym) <- colnames(R)
Rhat.wsym <- out.walsq.sym$Rhat
rmse.wsym <- rmse(R,Rhat.wsym)
rmse.wsym
#> [1] 0.00540069
```

This approximation as a RMSE of 0.0054, a very minor improvement in comparison with using a single scalar adjustment.

The corresponding biplot can be obtained with

```
Gq.sym.df <- data.frame(Gq.sym)
Gq.sym.df$strings <- colnames(R)
colnames(Gq.sym.df) <- c("PA1","PA2","strings")
ll <- 1.5
WALSbiplotq.sym <- ggbplot(Gq.sym.df,Gq.sym.df,main="WALS-q-sym biplot",xlab="Dimension 1",
ylab="Dimension 2",
ylim=c(-ll,ll),xlim=c(-ll,ll),circle=TRUE,
rowarrow=TRUE,rowcolor="blue",rowch="",
colch="")
for(i in c(1:5,7)) {
WALSbiplotq.sym <- ggtally(Gq.sym[i,1:2],WALSbiplotq.sym,
adj=-out.walsq.sym$delta-out.walsq.sym$q[i],
values=seq(-0.2,1,by=0.2),dotsize=0.2)
}
WALSbiplotq.sym
```

This biplot is similar in appearance to the previous biplot that uses a single scalar adjustment only. However, in principle it no longer has a unique origin, as the origin represents a (slightly) different correlation for each variable:

```
out.walsq.sym$delta + out.walsq.sym$q
#> [1] 0.07779402 0.07332050 0.07528872 0.07546010 0.07847079 0.07575909 0.07854010
```

For this data, a single scalar adjustment seems preferable.

Charytanowicz, M., J. Niewczas, P. Kulczycki, P. A. Kowalski, S.
Lukasik, and S. Zak. 2010. “A Complete Gradient Clustering
Algorithm for Features Analysis of x-Ray Images.” In
*Information Technologies in Biomedicine*, edited by Ewa Pietka
and Jacek Kawa, 15–24. Berlin-Heidelberg: Springer-Verlag.

De Leeuw, J. 2006. “A Decomposition Method for Weighted Least
Squares Low-Rank Approximation of Symmetric Matrices.”
*Department of Statistics, UCLA*. https://escholarship.org/uc/item/1wh197mh.

Friendly, M. 2002. “Corrgrams: Exploratory Displays for
Correlation Matrices.” *The American Statistician* 56 (4):
316–24. https://doi.org/10.1198/000313002533.

Gabriel, K. R. 1971. “The Biplot Graphic Display of Matrices with
Application to Principal Component Analysis.” *Biometrika*
58 (3): 453–67.

Graffelman, J. 2013. “Linear-Angle Correlation Plots: New Graphs
for Revealing Correlation Structure.” *Journal of
Computational and Graphical Statistics* 22 (1): 92–106. https://doi.org/10.1080/15533174.2012.707850.

Graffelman, J., and J. De Leeuw. 2023. “Improved Approximation and
Visualization of the Correlation Matrix.” *The American
Statistician*, 1–20. https://doi.org/10.1080/00031305.2023.2186952.

Hills, M. 1969. “On Looking at Large Correlation Matrices.”
*Biometrika* 56 (2): 249–53.

Trosset, M. W. 2005. “Visualizing Correlation.” *Journal
of Computational and Graphical Statistics* 14 (1): 1–19. https://doi.org/10.1198/106186005X27004.

Wickham, H. 2016. *Ggplot2: Elegant Graphics for Data Analysis*.
Springer-Verlag New York. https://ggplot2.tidyverse.org.