
Sunday, January 3, 2016

How much faster is a truncated singular value decomposition?

The Singular Value Decomposition is an important matrix operation which enables many other numerical algorithms. The SVD lets you tame seemingly unwieldy matrices by uncovering their reduced "low rank" representation. A matrix which can be accurately approximated by a low-rank decomposition actually contains much less information than suggested by its dimensions.

A rank-1 matrix you might see often in the wild is an elementary school multiplication table. The multiplication table contains n×n entries which are intimately intertwined (i.e. one entry can't be changed independently of the others). A singular value decomposition of a multiplication table would reveal that the essential information is just the sequence of numbers 1, 2, ..., n (whose outer product reconstructs the table).
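For instance, here's a quick sanity check (my own NumPy sketch, not part of the original post) showing that an n×n multiplication table is just the outer product of 1, 2, ..., n with itself and therefore has rank 1:

import numpy as np

n = 10
counting = np.arange(1, n + 1)
table = np.outer(counting, counting)   # the n x n multiplication table
print(np.linalg.matrix_rank(table))    # prints 1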

This kind of data compression can be very useful in machine learning, where we prefer to learn predictors which use a small number of informative features (rather than thousands of highly correlated features). In addition to letting us boil down redundant features into more informative combinations, the SVD can also be used for denoising samples and imputing missing data.
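To make the denoising claim concrete, here's a minimal sketch (an illustration I'm adding, with made-up sizes and noise level): corrupt a low-rank matrix with noise, then rebuild it from only its largest singular values; the truncated reconstruction is typically closer to the clean matrix than the noisy observation is.

import numpy as np

rank = 5
clean = np.dot(np.random.randn(100, rank), np.random.randn(rank, 50))
noisy = clean + 0.1 * np.random.randn(100, 50)
U, s, V = np.linalg.svd(noisy, full_matrices=False)
# keep only the top `rank` components; the discarded components are mostly noise
denoised = np.dot(U[:, :rank] * s[:rank], V[:rank, :])
print(np.mean(np.abs(noisy - clean)), np.mean(np.abs(denoised - clean)))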

What does the output of an SVD look like? From a given input matrix X, you get back a triplet U, s, V. The two matrices (U and V) are often called the left and right singular vectors. The outer products of these singular vectors are the rank-1 building blocks from which X gets reconstructed. The vector s contains the singular values, each of which acts as a weight on its corresponding pair of singular vectors.

To reconstruct your original matrix, just add up the outer products of the columns of U and rows of V, weighted by the entries of s. That is, you can get back a matrix whose entries are approximately equal to the original by running:

np.sum([np.outer(U[:, i], V[i, :]) * si for i, si in enumerate(s)], axis=0)

Alternatively, you can upgrade the vector s into a diagonal matrix and express the reconstruction of X purely through matrix multiplications:

np.dot(U, np.dot(np.diag(s), V))

When the underlying data is low-rank, many of the singular values will be extremely small or exactly zero. In that case, performing the "full" SVD is wasteful, and it can be advantageous to instead compute a "truncated" decomposition (keeping only the top k singular vectors).
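As a small illustration (a sketch I'm adding with arbitrary sizes, not the benchmark below): if a matrix has rank at most k, keeping only the top k singular triplets of its full SVD reproduces it essentially exactly, so the remaining singular vectors were never needed.

import numpy as np

# a hypothetical matrix of rank 50 with 200 rows and 1000 columns
X = np.dot(np.random.randn(200, 50), np.random.randn(50, 1000))
U, s, V = np.linalg.svd(X, full_matrices=False)
k = 50
X_k = np.dot(U[:, :k] * s[:k], V[:k, :])   # rank-k reconstruction from the top k components
print(np.max(np.abs(X - X_k)))             # tiny reconstruction error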

Exactly how much slower is a full SVD vs. a truncated SVD? I performed the following quick experiment to find out:

  1. Generate 100 random matrices with rank 50, 1000 columns and a random number of rows between 100 and 5000
  2. Record how long it takes to decompose each matrix using one of:
     - numpy.linalg.svd (full SVD)
     - sklearn.decomposition.TruncatedSVD with the randomized solver
     - sklearn.decomposition.TruncatedSVD with the ARPACK solver
  3. Ensure that the reconstructed matrix matches the original

Results

The truncated SVD (using either an exact solver or approximate randomized solver) can be many times faster than a full SVD. If you're decomposing a large dataset with known low rank, go ahead and use a truncated SVD. On the other hand, if you're uncertain about the rank of your matrix (and want to inspect a large spectrum of singular values), you'll have to wait for a full SVD instead.

Source code:

import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sklearn.decomposition
import seaborn

RANK = 50
N_COLS = 1000

def evaluate_svd(svd_fn, reconstruct_fn, min_rows=100, max_rows=5000, n_samples=100, n_cols=N_COLS, rank=RANK, random_seed=0):
    np.random.seed(random_seed)
    elapsed_times = []
    errors = []
    n_rows_array = np.random.randint(low=min_rows, high=max_rows + 1, size=n_samples)
    n_rows_array.sort()
    for n_rows in n_rows_array:
        # construct a low-rank matrix
        left = np.random.randn(n_rows, rank)
        right = np.random.randn(rank, n_cols)
        full = np.dot(left, right)
        # how long does it take to perform the SVD?
        start_t = time.time()
        svd_outputs = svd_fn(full)
        end_t = time.time()
        elapsed_t = end_t - start_t
        elapsed_times.append(elapsed_t)
        # compute mean absolute error of the reconstruction
        reconstructed = reconstruct_fn(svd_outputs)
        diff = full - reconstructed
        mae = np.mean(np.abs(diff))
        errors.append(mae)
        print("n_rows=%d ==> time = %0.4f, MAE = %0.8f" % (n_rows, elapsed_t, mae))
    max_error = np.max(errors)
    print("Max Error=%f" % max_error)
    assert max_error < 0.0000001
    return n_rows_array, elapsed_times, errors
# Full SVD with NumPy
def np_svd(X):
    return np.linalg.svd(X, full_matrices=False, compute_uv=True)

def np_inv_svd(svd_outputs):
    U, s, V = svd_outputs
    return np.dot(U, np.dot(np.diag(s), V))

# Truncated SVD with scikit-learn
def sklearn_svd(X, rank=RANK):
    tsvd = sklearn.decomposition.TruncatedSVD(rank)
    X_reduced = tsvd.fit_transform(X)
    return (tsvd, X_reduced)

def sklearn_inv_svd(svd_outputs):
    tsvd, X_reduced = svd_outputs
    return tsvd.inverse_transform(X_reduced)
# Perform timings
n_rows, np_times, np_errors = evaluate_svd(np_svd, np_inv_svd)

def sklearn_randomized_svd(X, rank=RANK):
    tsvd = sklearn.decomposition.TruncatedSVD(rank, algorithm="randomized", n_iter=1)
    X_reduced = tsvd.fit_transform(X)
    return (tsvd, X_reduced)

def sklearn_arpack_svd(X, rank=RANK):
    tsvd = sklearn.decomposition.TruncatedSVD(rank, algorithm="arpack")
    X_reduced = tsvd.fit_transform(X)
    return (tsvd, X_reduced)
n_rows, sklearn_randomized_times, sklearn_randomized_errors = evaluate_svd(sklearn_randomized_svd, sklearn_inv_svd)
n_rows, sklearn_arpack_times, sklearn_arpack_errors = evaluate_svd(sklearn_arpack_svd, sklearn_inv_svd)
# Plot elapsed time against number of rows for each solver
figure = plt.figure(figsize=(10, 10))
plt.xlim(0, 5000)
plt.ylim(0, 3)
seaborn.regplot(x=pd.Series(n_rows, name="n_rows"), y=pd.Series(np_times, name="elapsed time (s)"))
seaborn.regplot(x=pd.Series(n_rows, name="n_rows"), y=pd.Series(sklearn_randomized_times, name="elapsed time (s)"))
seaborn.regplot(x=pd.Series(n_rows, name="n_rows"), y=pd.Series(sklearn_arpack_times, name="elapsed time (s)"))
plt.legend(("numpy.linalg.svd", "TruncatedSVD (randomized)", "TruncatedSVD (arpack)"))
plt.title("Time to perform SVD (on matrices of rank 50 with 1000 columns)")