From oleg Tue Feb 27 11:31:08 CST 1996
Newsgroups: comp.sys.mac.programmer.tools,comp.sys.mac.scitech,sci.math.num-analysis
X-also-newsgroups: comp.c++.moderated (mailto:c++-submit@netlab.cs.rpi.edu)
Subject: (lazy) SVD promising a matrix in extended LinAlg.shar
Summary: new and better tricks in a new release of Linear Algebra C++ classlib
Organization: University of North Texas, Denton
Keywords: lazy objects, nested functions, BLAS, Numerical Math, SVD, interpolation, least squares, optimization, regularization, pseudoinverse

An old (but still largely obscure) C++ concept of *lazy* matrices and vector promises got a facelift in a new, extended version of LinAlg.shar. The new version is certainly broader: it now includes Hooke-Jeeves multidimensional optimization, Aitken-Lagrange interpolation over a grid of uniform or arbitrary mesh, and SVD, the venerable Singular Value Decomposition of a rectangular matrix, with applications to least-squares problems, matrix (pseudo)inverse, and regularized solution of simultaneous linear equations. A README file tells more about the changes, including newfound portability: the LinAlg code compiles as is under UNIX (using gcc) and on a Mac (with CodeWarrior 7/8).

The new version has more "promises" (pun intended) and iterators, which is what this article is all about. Just to remind: a lazy matrix is not a matrix at all, but merely a "recipe" (a promise) for how to make one. The promise is obviously tiny compared to the whole matrix. That's why it's _fast_ and convenient to return a lazy object as the result of a function and/or pass it around. The full matrix is rolled out only when and where it's really needed. For example,

    Matrix haar = haar_matrix(9);

haar_matrix is a *class*, not a simple function. However similar this looks to returning a (512x512) matrix, it's dramatically different: haar_matrix() constructs a LazyMatrix, an object just a few bytes long.
A special "Matrix(const LazyMatrix& recipe)" constructor follows the recipe and makes the Haar matrix right in place. No matrix element is moved whatsoever!

Another example of matrix promises:

    test_svd_expansion(const Matrix&);
    REAL array[] = {0,-4,0, 1,0,0, 0,0,9};
    test_svd_expansion(MakeMatrix(3,3,array,sizeof(array)/sizeof(array[0])));

Here, MakeMatrix is a LazyMatrix that promises to deliver a matrix filled in from the specified array. The function test_svd_expansion() forces the promise: the compiler makes a temporary matrix, fills it in using the LazyMatrix's recipe, and passes it to test_svd_expansion(). Once the function returns, the temporary matrix is disposed of. All this goes on behind the scenes. Of course that temporary matrix could have been saved, but we really don't need it in this example; we don't even care to give it a name.

Promises go along very well with iterators:

    class square_add : public LazyMatrix, public ElementAction
    {
      const Vector &v1; Vector &v2;
      void operation(REAL& element)
        { assert(j==1); element = v1(i)*v1(i) + v2(i)*v2(i); }
      void fill_in(Matrix& m) const { m.apply(*this); }
    public:
      square_add(const Vector& _v1, Vector& _v2)
        : LazyMatrix(_v1.q_row_lwb(),_v1.q_row_upb(),1,1), v1(_v1), v2(_v2) {}
    };

    Vector vres = square_add(v2,v3);

    Vector vres1 = v2;
    vres1 = square_add(v2,v3);

Here square_add promises to deliver a vector whose elements are the sums of the squares of the elements of two other vectors. The promise is forced either by a Vector constructor, or by an assignment to another vector (in the latter case, a check is made that the dimensions of the promise are compatible with those of the target). In either case, computation of the new vector elements is done in place; no temporary storage is ever allocated or used. The iteration is also done behind the scenes, relieving the user of worries about index range checking, etc.
Not surprisingly, sloth is rewarded: returning a lazy object wins 2:1 (SunSparc20, 66MHz) over the traditional method of returning the object itself; check out a primitive benchmark, sample_adv.{cc,dat}.

In _many_ regards the C++ lazy matrices act as "closures" in languages such as Scheme, ML, etc. Indeed, a lazy object contains a procedure (the recipe) and an "environment" in which the procedure is called. To put it simply, the environment is the set of parameters for the recipe. As with Scheme closures, the environment is "captured" at the moment the promise is made, at run time.

I must admit, this "capturing of the environment" is a bit trickier than it looks. For example, haar_matrix() needs a very simple environment: the size (the order, actually) of the Haar matrix to make; it's not so difficult to capture a single integer value. square_add(), on the other hand, needs two vectors (the addends) as its environment. You can capture them either by value or by reference. You bet, the latter is the most efficient, but as usual, it comes at a price... If a lazy object was created to read something from a file, then the entire file is the environment, which is quite a big deal to capture by value. By taking a reference, you assume responsibility not to mess with the file until the promise is fulfilled.

BTW, if some "environment" variable is captured by reference, there is an opportunity to modify the referred-to variable. That is, this data item acts like a regular "up-the-scope" variable in languages that allow truly nested functions. BTW, it's easy to guard against wayward recipes modifying "outer-scope" data: just capture by const reference. Thus C++ closures have a bonus of access control. On the other hand, Scheme promises have memoization (that is, "caching" of the computed value). This feature can easily be added to C++ promises, although I have yet to come across a situation where it's needed.
Vector promises are a boon for many numerical algorithms, especially the ones that compute brand-new vectors/matrices and _return_ them. One example is the solution of an overdetermined system of linear equations (in the least-squares sense):

    SVD svd(A);
    cout << "condition number of matrix A " << svd.q_cond_number();
    Vector x = SVD_inv_mult(svd,b);    // Solution of Ax=b

The last function "left-divides" vector b by a (possibly non-square) matrix A. Since the function (a constructor, actually) returns just a promise of this division, the actual computation occurs when it's required (while constructing a new vector in the main function). Note, b doesn't have to be merely a vector; it can be a thick matrix, too.

I'm aching to "templatize" the Matrix class. It would make especially good sense for matrix iterators (I don't mean STL's iterators here, I mean iterators that iterate themselves). Right now, apply() and the other matrix/vector iterators are implemented using abstract classes, which is rather inefficient. With templates, I could do the whole iteration in-line. The most elegant solution requires member function templates, though. Neither gcc 2.7.2 nor CodeWarrior 8 supports them. And gcc still isn't very keen on templates...

In any case, I'm convinced this "promising" stuff is very cool: it makes computation just as efficient as it is with reference counting, "copy-on-write" assignments, etc., but without their overhead, and using _much_ simpler means. Also, unlike STL iterators, "natural" iterators don't have to carry/check status/state information (and they work in a "trusted" environment). I can (and probably will ;) rant about this in more detail if asked.
The LinAlg package itself is available from netlib (say, ftp://netlib.att.com/netlib/c++/lin_alg.shar.Z or its mirrors), or, if you prefer, [obsolete URL]

For convenience, I prepared a Mac archive: [obsolete URL] /info-mac/dev/lib/lin-alg-cpp.hqx
It has the identical source code, slightly different test-run outputs (due to differences in FP compilation/computation between SunSparc and PowerPC), CodeWarrior projects, and a compiled PowerMac library. Both archives are self-contained!

The new README file (with a change log) can be grabbed separately: http://pobox.com/~oleg/ftp/LinAlg.README.txt

I will really appreciate any comments, questions and suggestions, mailto:oleg@pobox.com