Saturday, June 04, 2022

Givens Rotations vs. Partitioned Matrix of Triangular Matrices.

Copyright 2022 by John Halleck.
All rights reserved except as modified by the
GNU Free Documentation License version 1.3 or later.

Last Edited July 29th, 2023 - John Halleck Minor terminology fixes.

*** This page is truely a work in progress, expect it to change from time to time.


Requires:
Knowing what a triangular matrix is.
Basic familiarity with Matrices (transposes, inverses, identities, ...)
Basic familiarity with Orthogonal Matrices, At least for the definition.
Knowledge of what a "Least Squares" solution is.
Very Basic familiarity with the Givens Rotation
Some understanding of the "fill in" properties of Givens Rotations.

As shown in my "fill in" properties of Givens Rotations paper. If you do a givens rotation using any two rows of a triangular matrix, trying to zero out one of non-diagonal non-zeros then you WILL introduce non-zeros into the zeros half of triangular matrix. This makes it non-triangular, and loses all the nice mathematical properties of triangular matrices. This therefore defeats the purpose of having triangular matrices in the first place.

Some people might assume that this means that Givens Rotations are not appropriate for triangular matrices.


Here is the critical observation: If you pick two rows, *In Different triangles* of the same size, and in the same column, and they are both the same numbered row of the respective triangles, they will have *EXACTLY* the same sparcity pattern. This is true both before and after the Givens rotation.
Fill in can only happen (see the sparsity post referenced above) in the case that one row has a non-zero value, and the other doesn't. In those specific cases, the zero item becomes non-zero. For example, using "*" for a value that can possibly be non-zero:

$\begin{bmatrix} * & & \\ * & * & \\ * & * & * \\ \\ * & & \\ * & * & \\ * & * & * \\ \end{bmatrix} $

Consider a Givens Rotation of the first row of each triangle to zero out the first item in the first row of the other. The givens rotation is seeing

$\begin{bmatrix} * & 0 & 0 \\ * & 0 & 0 \end{bmatrix}$

Which can not possibly cause any fill in.
The same reasoning applies to the 2nd row of each... only the first two values can end up with non-zeros. and so one to the end of whatever size the matrix is.
Of course, if the outer matrix has zero matrices, then there can be fill in when one matrix is zero:
Consider:

$\begin{bmatrix} * & & \\ * & * & \\ * & * & * \\ \\ 0 & & \\ 0 & 0 & \\ 0 & 0 & 0 \\ \end{bmatrix} $

In this case,
A Givens of the first row of each will only fill in column 1 in the zeroed matrix.
A Givens of the second row of each will only fill in columns 1, and 2 in the zeroed matrix.
A Givens of the third row of each will only fill in columns 1, 2, and 3 in the zeroed matrix.
So the fillin only makes the zero matrix into another triangular matrix.

Of course, Givens rotation applications need to be applied to the paired rows of every single triangular matrix all the way across the two rows of matrix housing these triangular matrices.

Why triangular matrices? Because they have much lower operation counts for almost anything you would like to do with them. (Even finding the inverse of such a matrix is near trivial.]

Footnotes

1) A massive amount of mathematics is has Gauss' name attached to it. The Gauss in question is Karl Friedrich Gauss, a German Mathematian, who lived 30 April 1777 – 23 February 1855

Wednesday, June 01, 2022

Sparsity and the Givens Rotation.

Last edited Dec 14th, 2023 - John Halleck

Requires:
Basic familiarity with Matrices (transposes, inverses, identities, ...)
Basic familiarity with Orthogonal Matrices.
Very Basic familiarity with the Givens Rotation

Gives:
Some introduction to the "fill in" properties of Givens Rotations.

Every so often, on sites that answer math questions, I see a confused request for help something more or less like the following:

"I read up on Givens Rotations, and how they can be used to make a matrix Upper (or Lower) triangular by zeroing out the part of the matrix below (or above) the diagonal. When I do this I can see that it does do that. *BUT*, when I zero out everything below the diagonal, and then zero out everything above the diagonal, I expect to get the diagonal but instead I've filled in elements set that on the side that was only zeros. What's going on here?"

This normally gets a number of correct answers quoting famous theorems the the person asking not only has never heard of before, but doesn't yet have the mathematics background to even read. The asker reframes the question and gets even more esoteric answers. And they are scared off never to return.

So... This post is intended to answer the question directly without reference to any mathematics beyond the rotation itself. It also goes on to discuss in more detail some of the properties of Givens Rotations on sparsity.

What does the Givens Rotation REALLY do?

Text books explaining Givens Rotations usually show how to zero out an element of a matrix, showing only the column that the zeroing is in. But they don't usually show how it effects the entire rest of the row. It is what happens to the rest of the row that eventually answers the original question.

For an example with several columns (which is almost always the case), let's trace what a Givens Rotation actually does for it. Each column is contrived to demonstrate some different situation.

Given: $\begin{bmatrix} x & w & a & 0 & 0 \\ y & z & 0 & b & 0 \end{bmatrix}$

this is really just two rows of a much larger matrix in general.
This example is contrived to have a column with two abitrary column non-zero entries which we are trying to clear the lower entry with a Givens Rotation.
This is followed by a column that with a zero lower entry and a non-zero upper entry.
And then we have a column with only a non-zero entry in the lower entry.
And finally a column where both relevant entries are zero.

We will expand out the sines and cosines into the equivalent fractions
If this form is unfamiliar to you, see my Introduction to the Givens Rotation

$\begin{bmatrix}
\dfrac{ x}{\sqrt{x^2+y^2}} & \dfrac{y}{\sqrt{x^2+y^2}} \\ \dfrac{-y}{\sqrt{x^2+y^2}} & \dfrac{x}{\sqrt{x^2+y^2}} \end{bmatrix} \begin{bmatrix} x & w & a & 0 & 0 \\ y & z & 0 & b & 0 \end{bmatrix} $

$= \begin{bmatrix} \dfrac{x^2+y^2}{\sqrt{x^2+y^2}} & \dfrac{xw+yz} {\sqrt{x^2+y^2}} & \dfrac{ xa} {\sqrt{x^2+y^2}} & \dfrac{yb} {\sqrt{x^2+y^2}} & 0 \\ \dfrac{-yx+xy} {\sqrt{x^2+y^2}} & \dfrac{-yz+xw}{\sqrt{x^2+y^2}} & \dfrac{-ya} {\sqrt{x^2+y^2}} & \dfrac{xb} {\sqrt{x^2+y^2}} & 0 \end{bmatrix} $

$= \begin{bmatrix} \sqrt{x^2+y^2} & \dfrac{xw+yz}{\sqrt{x^2+y^2}} \dfrac{ xa} {\sqrt{x^2+y^2}} & \dfrac{yb} {\sqrt{x^2+y^2}} & 0 \\ 0 & \dfrac{-yz+xw}{\sqrt{x^2+y^2}} \dfrac{-ya} {\sqrt{x^2+y^2}} & \dfrac{xb} {\sqrt{x^2+y^2}} & 0 \end{bmatrix}
$

As can be seen by inspection: After an application of the Givens Rotation, we have:

  • In the column that was used to form the rotation, the lower row will be cleared.
  • In any other column that has some non-zero entry, the result will have a non-zero entry in BOTH rows of that column.
  • In the final case, if BOTH entries are zero, THEN the column has only zeros.

  • Now we can finally answer the original question.

    Assume you have a lower triangular matrix. (Non-zeros on the diagonal, zeros above it.)

    Assume you want to zero out the non-zeroes to leave a diagonal matrix.

    This can't be done because no matter what two rows you pick, the lower row will have a diagonal element that is non-zero, and all entries above it will be zero, including that column in the upper row. And we just saw that if either row of a Givens Rotation is non-zero, then after the rotation they both do. So we have now introduced a non-zero element above the diagonal.

    Because it has an element above the diagonal, it is no longer triangular since it has non-zeros on both sides of the diagonal. So the best one can hope for is to make a triangular matrix... not a diagonal one. Similar reasoning applies if it starts our upper triangular, just swap upper and lower in the proof. Q.E.D.


    Navigation:

    Previous article | Next article