fast.prcomp {gmodels} | R Documentation |
The standard prcomp and svd functions are very inefficient for wide matrices. fast.prcomp and fast.svd are modified versions which are efficient even for matrices that are very wide.
fast.prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE, tol = NULL)
fast.svd(x, nu = min(n, p), nv = min(n, p), ...)
x | data matrix |
retx, center, scale., tol | See documentation for prcomp |
nu, nv, ... | See documentation for svd |
The current implementation of the function svd in S-Plus and R is much slower when operating on a matrix with a large number of columns than on the transpose of this matrix, which has a large number of rows. As a consequence, prcomp, which uses svd, is also very slow when applied to matrices with a large number of columns.
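The dependence of prcomp on svd can be illustrated with a minimal sketch. It assumes the default arguments of prcomp (centering, no scaling); the small matrix x is made up for illustration. Signs of singular vectors are arbitrary, so the loadings are compared in absolute value.

```r
# Illustrative data: 20 observations, 6 variables
set.seed(3)
x <- matrix(rnorm(20 * 6), nrow = 20)

# Center the columns, as prcomp does by default
xc <- scale(x, center = TRUE, scale = FALSE)

s  <- svd(xc)
pr <- prcomp(x)

# Standard deviations are the singular values divided by sqrt(n - 1)
all.equal(pr$sdev, s$d / sqrt(nrow(x) - 1))

# Loadings are the right singular vectors, up to sign
all.equal(abs(unname(pr$rotation)), abs(s$v))
```

Since the decomposition is the dominant cost, any slowdown in svd on wide input carries over to prcomp.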
For R, the simple solution is to use La.svd instead of svd. A suitable patch to prcomp has been submitted. In the meantime, the function fast.prcomp has been provided as a short-term workaround.
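When substituting La.svd for svd, one practical difference is the shape of the result: La.svd returns vt, the transpose of v. The following small comparison is only a sketch on a made-up matrix m, not a benchmark.

```r
# Illustrative wide matrix: 4 rows, 30 columns
set.seed(2)
m <- matrix(rnorm(4 * 30), nrow = 4)

s1 <- svd(m)     # components d, u, v
s2 <- La.svd(m)  # components d, u, vt

# Same singular values
all.equal(s1$d, s2$d)

# vt is the transpose of v (compared up to sign)
dim(s2$vt)
all.equal(abs(s1$v), abs(t(s2$vt)))
```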
For S-Plus the solution is to replace the standard svd with a version that checks the dimensions of the matrix, and performs the computation on the transposed matrix if it is wider than tall.
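The transpose approach can be sketched in R as follows. The function name svd_via_transpose is hypothetical and is not part of gmodels; it ignores the nu and nv arguments for brevity and is meant only to show why u and v trade places.

```r
# Hypothetical helper (not the gmodels implementation):
# compute the SVD of a wide matrix through its transpose.
svd_via_transpose <- function(x) {
  x <- as.matrix(x)
  if (ncol(x) > nrow(x)) {
    s <- svd(t(x))
    # t(x) = U D V'  implies  x = V D U', so u and v swap roles
    list(d = s$d, u = s$v, v = s$u)
  } else {
    svd(x)
  }
}

# Illustrative wide matrix: 5 rows, 40 columns
set.seed(1)
m <- matrix(rnorm(5 * 40), nrow = 5, ncol = 40)
s <- svd_via_transpose(m)

# Reconstruction error should be near machine precision
max(abs(m - s$u %*% diag(s$d) %*% t(s$v)))
```

The returned list has the same component names and dimensions as the result of svd(m), so callers need no changes.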
For R:
fast.prcomp: a version of prcomp that calls La.svd instead of svd.
fast.svd: calls La.svd.
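For S-Plus:
fast.prcomp: a version of prcomp that calls fast.svd instead of svd.
fast.svd: if the matrix is wider than tall, performs the computation on the transposed matrix using svd. It then swaps u and v and returns the result. Otherwise, it just calls svd and returns the results unchanged.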
See the documentation for prcomp or svd.
Modifications by Gregory R. Warnes gregory.r.warnes@pfizer.com
library(gmodels)

# create test matrix
set.seed(4943546)
nr <- 50
nc <- 2000
x <- matrix( rnorm( nr*nc), nrow=nr, ncol=nc )
tx <- t(x)

# SVD directly on matrix is SLOW:
system.time( val.x <- svd(x)$u )

# SVD on t(matrix) is FAST:
system.time( val.tx <- svd(tx)$v )

# and the results are equivalent:
max( abs(val.x) - abs(val.tx) )

# Time gap disappears using fast.svd:
system.time( val.x <- fast.svd(x)$u )
system.time( val.tx <- fast.svd(tx)$v )
max( abs(val.x) - abs(val.tx) )

library(stats)

# prcomp directly on matrix is SLOW:
system.time( pr.x <- prcomp(x) )

# fast.prcomp is much faster
system.time( fast.pr.x <- fast.prcomp(x) )

# and the results are equivalent
max( pr.x$sdev - fast.pr.x$sdev )
max( abs(pr.x$rotation[,1:49]) - abs(fast.pr.x$rotation[,1:49]) )
max( abs(pr.x$x) - abs(fast.pr.x$x) )

# (except for the last and least significant component):
max( abs(pr.x$rotation[,50]) - abs(fast.pr.x$rotation[,50]) )