matrix_rank() estimates the rank of a given data matrix D by counting the number of "practically nonzero" singular values of D.
The rank of a matrix is the number of linearly independent columns (equivalently, rows) in the matrix, and it governs the structure of the data. Intuitively, it is the number of inherent latent patterns in the data.
A singular value \(s\) is determined to be "practically nonzero" if \(s \geq s_{\max} \cdot \text{thresh}\), i.e. if it is greater than or equal to the maximum singular value in D scaled by a given threshold thresh.
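The thresholding rule can be sketched in a few lines of base R. This is an illustration only, not necessarily how matrix_rank() is implemented: the helper name rank_sketch, its default thresh of 1e-6, and the handling of an all-zero matrix are assumptions made for this example.

```r
# Hypothetical helper illustrating the rule s >= s_max * thresh.
# The name, the default thresh, and the zero-matrix handling are assumptions.
rank_sketch <- function(D, thresh = 1e-6) {
  # Singular values only (nu = 0, nv = 0 skips the singular vectors)
  s <- svd(D, nu = 0, nv = 0)$d
  s_max <- max(s)
  # An all-zero matrix has no nonzero singular values
  if (s_max == 0) return(0L)
  # Count singular values at least as large as the scaled maximum
  sum(s >= s_max * thresh)
}

# A diagonal matrix has its diagonal entries as singular values: 5, 1, 0
A <- diag(c(5, 1, 0))
rank_sketch(A)                # counts 5 and 1
rank_sketch(A, thresh = 0.5)  # only 5 clears 0.5 * 5 = 2.5

# A product of 50 x 3 and 3 x 10 factors has at most 3 independent columns
set.seed(1)
L <- matrix(rnorm(50 * 3), 50, 3) %*% matrix(rnorm(3 * 10), 3, 10)
rank_sketch(L)
```

A larger thresh treats more of the small singular values as noise and so gives a lower estimated rank. A smaller thresh is more permissive.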
Examples
data <- sim_data()
matrix_rank(data$D)
#> [1] 10
matrix_rank(data$L)
#> [1] 3