11

Say you have a very dense matrix of 30000×30000 elements. It comes from the radiosity equation, which I discussed here.

Say you have $Ax = B$: you know $B$, and $A$ is a 30000×30000 matrix. You need to find $x$.

Solving a system like this in C code introduces an interesting practical problem: at 8 bytes per double, the matrix alone occupies roughly 7 GB, so you cannot store it all in memory at once.

So, I want to know about ways to solve this problem "in pieces": I want to mathematically decompose the 30000×30000 matrix into smaller submatrices (perhaps in chunks of 30000×20?) and somehow solve it that way.

What algorithms exist to break a matrix down and solve it "in pieces" or in steps, regardless of memory restrictions? An iterative technique would be fine.
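For concreteness, here is a minimal sketch (in Python/NumPy, with a made-up file name format and chunk size) of the kind of "in pieces" processing I have in mind: the matrix lives on disk and only one block of rows is ever in memory at a time.

```python
import numpy as np

N = 30000            # matrix dimension
ROWS_PER_CHUNK = 20  # 20 rows of 30000 doubles is only ~4.8 MB in RAM

def matvec_from_disk(path, v):
    """Compute A @ v by streaming A from disk one block of rows at a time.

    Assumes A is stored row-major as raw float64 in the file at `path`.
    """
    result = np.empty(N)
    with open(path, "rb") as f:
        for start in range(0, N, ROWS_PER_CHUNK):
            rows = min(ROWS_PER_CHUNK, N - start)
            block = np.fromfile(f, dtype=np.float64, count=rows * N)
            result[start:start + rows] = block.reshape(rows, N) @ v
    return result
```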

  • 8
Could you clarify 'Solving'? Is it finding determinants, or adding, or multiplying, or inverting? (2010-08-18)
  • 0
Please consider posting this on Stack Overflow. This might be off-topic here. In any case, in its current state your question is liable to be closed as 'not a real question', as you haven't specified what 'solving' means. (2010-08-18)
  • 0
I have edited the question! This is actually not a Stack Overflow question! The meat of the question is "How can I find x in Ax=B in a 'piecewise' sort of way?" (2010-08-18)
  • 1
The techniques are typically just called "block" methods. If you have a 64-bit processor (so you can address the ~7 GB of RAM for A), then you can just use an off-the-shelf solver. If you can write a function that multiplies a vector by A (whether or not you actually have the entries of A in memory somewhere), then there are many efficient iterative solvers. (2010-08-18)
  • 1
Also, is the matrix dense or sparse? This makes a big difference. The problem is likely intractable if dense. (2010-08-18)
  • 0
If it is dense, you may be able to try some sort of guess-and-check code and get the solution as close as you need it for practical purposes, but I don't know much about the subject. (2010-08-18)
  • 0
_a very dense matrix_ (2010-08-19)
  • 0
Looking at your link to your other question, the only discernible structure your matrix has from that presentation is that it can be expressed in the form $\mathbf{I}-\mathbf{P}\mathbf{F}$, $\mathbf{I}$ an identity matrix, $\mathbf{P}$ a diagonal matrix of the reflectivities $\mathrm{diag}(\rho_1\dots\rho_n)$, and $\mathbf{F}$ the matrix of form factors. I would hope that there is something special about $\mathbf{F}$'s structure that can be exploited. (2010-08-19)
  • 1
Otherwise, the only other observation I can make is that if all the eigenvalues of $\mathbf{P}\mathbf{F}$ are less than 1 in magnitude, you can expand the inverse of your matrix $\mathbf{I}-\mathbf{P}\mathbf{F}$ as a geometric series in $\mathbf{P}\mathbf{F}$: $\mathbf{I}+\mathbf{P}\mathbf{F}+(\mathbf{P}\mathbf{F})^2+\dots$; this might prove to be useful (see the sketch after these comments). (2010-08-19)
  • 0
Kaestur: apparently this is not a statistical application but a computer graphics one, per http://web.cs.wpi.edu/~matt/courses/cs563/talks/radiosity.html . PCA (essentially SVD in another garb) might prove to be too prohibitive in space and time for his application! (2010-08-20)
  • 0
@J.Mangaldan: Thanks, I didn't look sufficiently closely at the comments when skimming through the answers. @bobobobo: I've added a link to your previous question to your question body here to provide more context. (2010-08-20)
  • 0
I would be *very* reluctant to solve such a huge, dense linear system directly. Partitioning a la J.M.'s answer will help with memory usage but will not improve the (extremely depressing) time complexity. I haven't studied your particular system in detail, but I recommend considering whether this problem can be formulated in a way that makes it amenable to hierarchical/multigrid methods. (For instance, by grouping together nearby, nearly parallel faces and downsampling them.) (2011-11-16)
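To make the geometric-series suggestion above concrete: when the spectral radius of $\mathbf{P}\mathbf{F}$ is below 1 (typically the case in radiosity, where every reflectivity $\rho_i < 1$ and each row of form factors sums to at most 1), the series turns $(\mathbf{I}-\mathbf{P}\mathbf{F})x = b$ into the fixed-point iteration $x_{k+1} = b + \mathbf{P}\mathbf{F}x_k$. A minimal sketch, assuming some routine `apply_PF` that computes $\mathbf{P}\mathbf{F}v$ without materializing the full matrix:

```python
import numpy as np

def neumann_solve(apply_PF, b, tol=1e-8, max_iter=1000):
    """Solve (I - PF) x = b by iterating x <- b + PF x.

    Equivalent to summing the geometric series (I + PF + (PF)^2 + ...) b;
    converges when the spectral radius of PF is below 1.
    """
    x = b.copy()
    for _ in range(max_iter):
        x_next = b + apply_PF(x)
        if np.linalg.norm(x_next - x) <= tol * np.linalg.norm(b):
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge; check the spectral radius")
```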

4 Answers

19

The first question that should be asked of any person presenting you with a large-ish matrix is: "is the matrix dense or sparse?" In the latter case, one definitely does not need to allocate space for the zero entries, and should thus look into special techniques for storing them (which, as I understand, nowadays rely on a liberal dose of graph theory in the general case, though band matrices are still handled by storing their bands in an appropriate format).

Now, if even after that you have the perfect excuse for having a large dense matrix (which I still believe is quite unlikely), there is a way to invert a large matrix and take its determinant via partitioning.

Say we have

$\textbf{A}=\begin{pmatrix}\textbf{E}&\textbf{F}\\ \textbf{G}&\textbf{H}\end{pmatrix}$

where $\textbf{E}$ and $\textbf{H}$ are square matrices with dimensions $m\times m$ and $n\times n$ respectively, and $\textbf{F}$ and $\textbf{G}$ are dimensioned appropriately (so the dimension of $\textbf{A}$ is $(m+n)\times(m+n)$). The inverse can then be computed as

$\textbf{A}^{-1}=\begin{pmatrix}\textbf{E}^{-1}+\left(\textbf{E}^{-1}\textbf{F}\right)(\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F})^{-1}\left(\textbf{G}\textbf{E}^{-1}\right)&-\left(\textbf{E}^{-1}\textbf{F}\right)(\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F})^{-1}\\-(\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F})^{-1}\left(\textbf{G}\textbf{E}^{-1}\right)&(\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F})^{-1}\end{pmatrix}$

where the parentheses emphasize common factors that you might profit from computing only once.

As for the determinant, it is given by

$\det\;\textbf{A}=\det\left(\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F}\right)\;\det\;\textbf{E}$
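(For the curious, and anticipating a question raised in the comments below: both formulas drop out of the block LU factorization of $\textbf{A}$, in which $\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F}$ is the Schur complement of $\textbf{E}$:

$\begin{pmatrix}\textbf{E}&\textbf{F}\\ \textbf{G}&\textbf{H}\end{pmatrix}=\begin{pmatrix}\textbf{I}&\textbf{0}\\ \textbf{G}\textbf{E}^{-1}&\textbf{I}\end{pmatrix}\begin{pmatrix}\textbf{E}&\textbf{F}\\ \textbf{0}&\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F}\end{pmatrix}$

Inverting the two block-triangular factors gives the partitioned inverse, and taking determinants of the factors gives the determinant formula.)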

EDIT:

I'm not entirely sure what seemed to be unsatisfactory with this answer, so here's a bit more exposition: as always, the "inverse" here is merely notation! One would most certainly perform an LU decomposition of $\textbf{E}$ and $\textbf{H}$ first. One would also partition the right-hand side $\textbf{b}$ accordingly:

$\textbf{b}=\begin{pmatrix}\textbf{c}\\ \textbf{d}\end{pmatrix}$

so $\textbf{c}$ is a length-$m$ vector and $\textbf{d}$ is a length-$n$ vector.

Formally multiplying the partitioned inverse with the partitioned right-hand side gives

$\begin{pmatrix}\textbf{E}^{-1}+\left(\textbf{E}^{-1}\textbf{F}\right)(\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F})^{-1}\left(\textbf{G}\textbf{E}^{-1}\right)&-\left(\textbf{E}^{-1}\textbf{F}\right)(\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F})^{-1}\\-(\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F})^{-1}\left(\textbf{G}\textbf{E}^{-1}\right)&(\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F})^{-1}\end{pmatrix}\cdot\begin{pmatrix}\textbf{c}\\ \textbf{d}\end{pmatrix}$

which when expanded and simplified is

$\begin{pmatrix}\textbf{E}^{-1}\textbf{c}+\left(\textbf{E}^{-1}\textbf{F}\right)(\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F})^{-1}\left(\textbf{G}\textbf{E}^{-1}\textbf{c}-\textbf{d}\right)\\-(\textbf{H}-\textbf{G}\textbf{E}^{-1}\textbf{F})^{-1}\left(\textbf{G}\textbf{E}^{-1}\textbf{c}-\textbf{d}\right)\end{pmatrix}$

At this point you should be able to figure out how you would use an available decomposition of $\textbf{E}$ or $\textbf{H}$, and which common subexpressions can be computed just once.
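To make the bookkeeping explicit, here is a minimal sketch of that computation in Python/NumPy (purely illustrative: `lu_factor`/`lu_solve` stand in for whatever LU routines you have, and it assumes the individual blocks fit in memory):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def block_solve(E, F, G, H, c, d):
    """Solve [[E, F], [G, H]] @ [x; y] = [c; d] via the Schur complement.

    Only LU factorizations are stored; no explicit inverse is ever formed.
    """
    lu_E = lu_factor(E)
    Einv_F = lu_solve(lu_E, F)      # E^{-1} F (common subexpression)
    Einv_c = lu_solve(lu_E, c)      # E^{-1} c
    S = H - G @ Einv_F              # Schur complement H - G E^{-1} F
    lu_S = lu_factor(S)
    y = -lu_solve(lu_S, G @ Einv_c - d)
    x = Einv_c - Einv_F @ y         # top block of the partitioned solution
    return x, y
```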

  • 0
Hmm, there were new comments as I was typing this out; so the first formula would be the one applicable to you. It is up to you how you will organize the computation of subexpressions in your code; just remember to partition your right-hand side conformally with the matrix. (2010-08-18)
  • 0
__Thank you__! This is very helpful. Conceivably you could subdivide the matrix recursively then, and solve that way. I will leave this question open for a few days, in case there are any other solutions out there! (2010-08-19)
  • 0
    bobobobo: "Recursively"... that would sound like setting yourself up for a administrative nightmare when you start writing the program. Two things: 1. you still haven't said why you have a dense matrix in the first place or where it came from. I would hope that it would have some sort of exploitable structure that you won't have to employ the baroque formula in my answer. 2. Again, it is a bad idea to maintain explicit inverses; store the matrices indicated as inverses in the formula above as an appropriate decomposition.2010-08-19
  • 0
The very dense matrix comes from the radiosity equation, which I discussed here: http://math.stackexchange.com/questions/2180/how-can-i-characterize-the-type-of-solution-vector-that-comes-out-of-a-matrix. The matrix is usually quite dense! (2010-08-19)
  • 0
Ah, I've now found where I obtained the formula from: check out "Computational Methods of Linear Algebra" by Faddeev and Faddeeva; the treatment is a bit dated but can still be useful. (2010-08-20)
  • 0
Maybe too obvious to point out now, but... the choice of whether one performs an LU decomposition on $\mathbf{H}-\mathbf{G}\mathbf{E}^{-1}\mathbf{F}$, or just on $\mathbf{H}$ followed by Sherman-Morrison-Woodbury, would probably be a matter of taste and/or convenience. (2010-08-20)
  • 0
J.M., just out of curiosity, where did the intuition behind the decomposition of A come from? I see that it works, but would you happen to know the path from which it was derived? (2011-01-10)
  • 0
Terence Tao makes use of the same in a recent blog post: https://terrytao.wordpress.com/2017/09/16/inverting-the-schur-complement-and-large-dimensional-gelfand-tsetlin-patterns/ , and puts it in relation to a "Schur complement". (2017-10-04)
8

This is too long for a comment, so I'll throw in this choice piece by Forman Acton, from his _Numerical Methods That Work_, on the subject of dealing with large matrices:

Whenever a person eagerly inquires if my computer can solve a set of 300 equations in 300 unknowns, I must quickly suppress the temptation to retort, "Yes, but why bother?" There are, indeed, legitimate sets of equations that large. They arise from replacing a partial differential equation on a set of grid points, and the person who knows enough to tackle this type of problem also usually knows what kind of computer he needs. The odds are all too high that our inquiring friend is suffering from a quite different problem: he probably has collected a set of experimental data and is now attempting to fit a 300-parameter model to it—by Least Squares! The sooner this guy can be eased out of your office, the sooner you will be able to get back to useful work—but these chaps are persistent. They have fitted three-parameter models on desk machines with no serious difficulties and now the electronic computer permits them more grandiose visions. They leap from the known to the unknown with a terrifying innocence and the perennial self-confidence that every parameter is totally justified. It does no good to point out that several parameters are nearly certain to be competing to "explain" the same variations in the data and hence the equation system will be nearly indeterminate. It does no good to point out that all large least-squares matrices are striving mightily to be proper subsets of the Hilbert matrix—which is virtually indeterminate and uninvertible—and so even if all 300 parameters were beautifully independent, the fitting equations would still be violently unstable. All of this, I repeat, does no good—and you end up by getting angry and throwing the guy out of your office...

...The computer center's director must prevent the looting of valuable computer time by these would-be fitters of many parameters. The task is not a pleasant one, but the legitimate computer users have rights, too. The alternative commits everybody to a miserable two weeks of sloshing around in great quantities of "Results" that are manifestly impossible, with no visible way of finding the trouble. The trouble, of course, arises from looking for a good answer to a poorly posed problem, but a computer director seldom knows enough about the subject matter to win any of those arguments with the problem's proposer, and the impasse finally has to be broken by violence—which therefore might as well be used in the very beginning.

  • 3
:) interesting, but not really an answer! (2010-08-19)
  • 0
It's supposed to be a comment, hence "community wiki". (2010-08-19)
  • 0
Google's PageRank gives large matrices to solve: https://en.wikipedia.org/wiki/PageRank#Algebraic (2017-06-27)
  • 1
Deep learning and many other neural network algorithms in machine learning require large matrix operations, and improving these algorithms may require solving any number of difficult linear algebra problems using huge (but probably sparse) matrices. This quote seems out of date to me. (2017-09-13)
  • 0
The essence mostly still stands, @Charles. People who have sparse matrices (e.g. those who deal with PDEs or machine learning) should certainly know better and use specialized methods, but a large dense unstructured matrix should still give you pause for thought. (2017-09-13)
1

If all you are interested in is solving $Ax = B$, I would suggest looking at iterative and/or block-iterative methods. Noting your other question asking about positivity of solutions, I would suggest that you look at _Applied and Computational Linear Algebra: A First Course_ by Charlie Byrne. The MART algorithm described therein is used for finding positive solutions $x$ to $Ax = B$, and addresses many similar problems from tomography and signal processing. If you like it, get his _Applied Iterative Methods_.
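For a feel of the method, here is a rough sketch of a MART-style multiplicative update as I understand the generic scheme; it assumes the entries of $A$ are nonnegative and $b$ is positive, and Byrne's book should be consulted for the precise variant and its convergence conditions:

```python
import numpy as np

def mart(A, b, n_cycles=100):
    """Sketch of a MART-type multiplicative update for A x = b.

    Assumes A >= 0 entrywise and b > 0; every iterate stays positive.
    """
    m, n = A.shape
    gamma = 1.0 / A.max()   # relaxation so that gamma * A[i, j] <= 1
    x = np.ones(n)          # positive starting vector
    for _ in range(n_cycles):
        for i in range(m):  # cycle through the equations
            ratio = b[i] / (A[i] @ x)
            x *= ratio ** (gamma * A[i])   # entrywise multiplicative update
    return x
```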

  • 0
Dense, possibly unsymmetric and unstructured... I don't have high hopes for iterative approaches working well in his application. (2010-08-19)
  • 0
Density is likely a problem (it will be a problem with any method on matrices that large; the finiteness of floating-point precision begins to hurt). Iterative methods no longer need the matrix to be symmetric, positive definite, diagonally dominant, etc., to work. They do have the advantage of giving positive solutions, which it appears that he wants. (2010-08-19)
  • 0
Indeed, he would do well to have a look at the condition number of his matrix; if its logarithm is even just half of the number of digits of precision available, he's out of luck. (2010-08-19)
  • 0
Aha: http://web.cs.wpi.edu/~matt/courses/cs563/talks/radiosity.html and particularly "it is diagonally dominant ... and the upper right of the matrix is computable from the lower left." Maybe your proposal would be a better use of computing power than mine! (2010-08-19)
0

Searching for "Out of core LU" on google might help you.

Essentially, there are two simple choices:

  1. An iterative method that merely depends on repeatedly computing $Av$, and perhaps $A^Tv$, for a given vector $v$. Such a computation can be done without having the entire matrix $A$ in RAM (see the sketch below).

  2. A direct method; for this, the "out-of-core LU" literature might help.

In general, for both choices above, searching for external-memory algorithms will be useful.
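As an illustration of choice 1: SciPy's iterative solvers only need a routine that applies $A$ to a vector, wrapped in a `LinearOperator`. Below is a toy-sized sketch; in the real problem the `matvec` would stream blocks of $A$ from disk instead of holding a dense array.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

# toy stand-in: for the real 30000x30000 problem, the matvec would
# compute A @ v in pieces (e.g. streaming row blocks from disk),
# never holding all of A in memory
rng = np.random.default_rng(0)
n = 1000
A = np.eye(n) + rng.random((n, n)) / n   # diagonally dominant test matrix
b = rng.random(n)

A_op = LinearOperator((n, n), matvec=lambda v: A @ v, dtype=np.float64)
x, info = gmres(A_op, b)                 # info == 0 means convergence
print(info, np.linalg.norm(A @ x - b))   # residual check
```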

  • 0
The matrix of interest is diagonally dominant, but unsymmetric and dense. An iterative method might work, but I think coming up with the matrix multiplication black box is going to be tough (though partitioning again might be useful here). (2010-09-04)