Video Lectures - Lecture 26
These video lectures of Professor Gilbert Strang teaching 18.06 were recorded in Fall 1999 and do not correspond precisely to the current edition of the textbook. However, this book is still the best reference for more information on the topics covered in each lecture.
Strang, Gilbert. Introduction to Linear Algebra. 4th ed. Wellesley, MA: Wellesley-Cambridge Press, February 2009. ISBN: 9780980232714.
Instructor/speaker: Prof. Gilbert Strang
Free downloads: Free streaming: |
Resources: |
|
Transcript - Lecture 26
Okay. This is a lecture where complex numbers come in. It's a -- complex numbers have slipped into this course because even a real matrix can have complex eigenvalues. So we met complex numbers there as the eigenvalues and complex eigenvectors.
And we -- or -- this is probably the last -- we have a lot of other things to do about eigenvalues and eigenvectors. And that will be mostly real.
But at one point somewhere, we have to see what you do when the numbers become complex numbers.
What happens when the vectors are complex, when the matrixes are complex, when the -- what's the inner product of two, the dot product of two complex vectors -- we just have to make the change, just see -- what is the change when numbers become complex? Then, can I tell you about the most important example of complex matrixes? It comes in the Fourier matrix.
So the Fourier matrix, which I'll describe, is a complex matrix. It's certainly the most important complex matrix. It's the matrix that we need in Fourier transform. And the -- really, the special thing that I want to tell you about is what's called the fast Fourier transform, and everybody refers to it as the FFT and it's in all computer and it's used -- it's being used as we speak in a thousand places, because it has, like, transformed whole industries to be able to do the Fourier transform fast, which means multiplying -- how do I multiply fast by that matrix -- by that n by n matrix? Normally, multiplications by an n by n matrix -- would normally be n squared multiplications, because I've got n squared entries and none of them is zero.
This is a full matrix. And it's a matrix with orthogonal columns. I mean, it's just, like, the best matrix. And this fast Fourier transform idea reduces this n squared, which was slowing up the calculation of Fourier transforms down to n log(n). n log(n), log to the base two, actually. And it's this -- when that hit -- when that possibility hit, it made a big difference.
Everybody realized gradually what, -- that this simple idea -- you'll see it's just a simple matrix factorization -- but it changed everything. Okay.
So I want to talk about complex vectors and matrixes in general, recap a little bit from last time, and the Fourier matrix in particular. Okay.
So what's the deal? All right.
The main point is, what about length? I'm given a vector, I have a vector x.
Or let me call it z as a reminder that it's complex, for the moment. But I can -- later I'll call the components x. They'll be complex numbers.
But it's a vector -- z1, z2 down to zn. So the only novelty is it's not in R^n anymore. It's in complex n dimensional space. Each of those numbers is a complex number. So this z,z1 is in C^n, n dimensional complex space instead of R^n.
So just a different letter there, but now the point about its length is what? The point about its length is that z transpose z is no good. z transpose z -- if I just put down z transpose here, it would be z1, z2, to zn.
Doing that multiplication doesn't give me the right thing.
W-Why not? Because the length squared should be positive.
And if I multiply -- suppose this is, like, 1 and i. What's the length of the vector with components 1 and i? What if I do this, so n is just two. I'm in C^2, two dimensional space, complex space with the vector whose components are 1 and i. All right.
So if I took one times one and i times i and added, z transpose z would be zero. But I don't -- that vector is not -- doesn't have length zero -- the vector with the components 1 and i -- this multiplication -- what I really want is z1 conjugate z1.
You remember that z1 conjugate z1 is -- so you see that first step will be z1 conjugate z1, which is the magnitude of z1 squared, which is what I want. That's, like, three squared or five squared. Now, if it's -- if z1 is i, then I multiplied by minus i gives one plus one, so the component of length -- the component i, its modulus squared is plus one.
That's great. So what I want to do then is do that -- I want z1 bar z1, z2 bar z2, zn bar zn.
And remember that -- you remember this complex conjugate.
So -- so there's the point. Now I can erase the no good and put is good, because that now gives the answer zero for the zero vector, of course, but it gives a positive length squared for any other vector. So it's a -- it's the right definition of length, and essentially the message is that we're always going to be taking -- when we transpose, we also take complex conjugate.
So let's -- let's find the length of one -- so the vector one i, that's z, that's that vector z.
Now I take the conjugate of one is one, the conjugate of i is minus i. I take this vector, I get one plus one -- I get two.
So that's a vector and that's a vector of length -- square root of two. Square root of two is the length and not the zero that we would have got from one minus i squared. Okay.
So the message really is whenever we transpose, we also take conjugates. So here's a symbol -- one symbol to do both. So that symbol H, it stands for a guy named Hermite, who didn't actually pronounce the H, but let's pronounce it -- so I would call that z Hermitian z.
I'll -- let me write that word, Herm- so his name was Hermite, and then we make it into an adjective, Hermitian.
So z Hermitian z. z H z.
Okay. So, that's the -- that's the, length squared. Now what's the inner product? Well, it should match. The inner product of two vectors -- so inner product is no longer -- used to be y transpose x. That's for real vectors.
For complex vectors, whenever we transpose, we also take the conjugate. So it's y Hermitian x.
Of course it's not real anymore, usually.
That -- the inner product will usually be complex number.
But if y and x are the same, if they're the same z, then we have z -- z H z, we have the length squared, and that's what we want, the inner product of a vector with itself should be its length squared.
So this is, like, forced on us because this is forced on us. So -- so this z -- this -- everybody's picking up what this equals.
This is z1 squared plus zn squared.
That's the length squared. And that's the inner product that we have to go with. So it could be a complex number now. One more change.
Well, two more changes. We've got to change the idea of a symmetric matrix. So I'll just recap on symmetric matrixes. Symmetric means A transpose equals A, but not -- no good if A is complex. So what do we instead -- that applies perfectly to real matrixes. But now if my matrixes were complex, I want to take the transpose and the conjugate to equal A. So there's -- that's the -- the right complex version of symmetry.
The com- the symmetry now means when I transpose it, flip across the diagonal and take conjugates. So, for example -- here would be an example. On the diagonal, it had better be real, because when I flip it, the diagonal is still there and it has to -- and then when I take the complex conjugate it has to be still there, so it better be a real number, let me say two and five.
What about entries off the diagonal? If this entry is, say, three plus i, then this entry had better be -- because I want whatever this -- when I transpose, it'll show up here and i conjugate.
So I need three minus I there. So there's a matrix with -- that corresponds to symmetry, but it's complex.
And those matrixes are called Hermitian matrixes.
Hermitian matrixes. A H equals A.
Fine. Okay, that's -- and those matrixes have real eigenvalues and they have perpendicular eigenvectors. What does perpendicular mean? Perpendicular means the inner product -- so let's go on to perpendicular. Well, when I had perpendicular vectors, for example, they were like q1, q2 up to qn. That's my -- q is my letter that I use for perpendicular. Actually, I usually -- I also mean unit length. So those are perpendicular unit vectors. But now what does -- so it's a -- orthonormal basis, I'll still use those words, but how do I compute perpendicular? How do I check perpendicular? This means that the inner product of qi with qj -- but now I not only transpose, I must conjugate, right, to get zero if i is not j and one if i is j.
So it's a unit vector, meaning unit length, orthogonal -- all the angles are right angles, but these are angles in complex n dimensional space.
So it's q1, q on- qi bar transpose.
Or, for short, qi H qj.
So it will still be true -- so let me -- again I'll create a matrix out of those guys. The matrix will have these q-s in its columns, q2 to qn.
And I want to turn that into matrix language, just like before. What does that mean? That means I want all these inner products, so I take these columns of Q, multiply by their rows -- so it was -- it used to be Q -- it used to be Q transpose Q equals I, right? This was an orthogonal matrix.
But what's changed? These are now complex vectors. Their inner products are -- involve conjugating the first factor. So it's not -- it's the conjugate of Q transpose.
It's Q bar transpose Q. Q H.
So can I call this -- let me call it Q H Q, which is I. So that's our new -- you -- you see I'm just translating, and the -- the book h- on one page gives a little dictionary of the right words in the real case, R^n, and the corresponding words in the complex case for the vector space C^n.
Of course, C^n is a vector space, the numbers we multiply are now complex numbers -- we're just moving into complex n dimensional space. Okay.
Now -- actually, I have to say we changed the word symmet- symmetric to Hermitian for those matrixes. People also change this word orthogonal into another word that happens to be unitary, as a word that applies -- that signals that we might be dealing with a complex matrix here. So what's a unitary matrix? It's a -- it's just like an orthogonal matrix.
It's a square, n by n matrix with orthonormal columns, perpendicular columns, unit vectors -- unit vectors computed by -- and perpendicularity computed by remembering that there's a conjugate as well as a transpose. Okay. So those are the words. Now I'm ready to get into the substance of the lecture which is the most famous complex matrix, which happens to be one of these guys.
It has orthogonal columns, and it's named after Fourier because it comes into the Fourier transform, so it's the matrix that's all around us.
Okay. Let me tell you what it is first of all in the n by n case. Then often I'll let n be four because four is a good size to work with.
But here's the n by n Fourier matrix.
Its first column is the vector of ones. It's n by n, of course. Its second column is the powers, the -- actually, better if I move from the math department to EE for this one half hour and then, please, let me move back again. Okay.
What's the difference between those two departments? It's just math starts counting with one and electrical engineers start counting at zero. Actually, they're probably right.
So anyway, we'll give them -- humor them.
So this is really the zeroes column.
And the first column up to the n-1, that's the one inconvenient spot in electrical engineering. All these expressions start at zero, no problem, but they end at n-1.
Well, that's -- that's the difficulty of Course 6. So what's -- they're the powers of a number that I'm going to call W -- W squared, W cubed, W to the -- now what is the W here? What's the power? This was the zeroes power, first power, second power, this will be n minus first power. That's the column.
What's the next column? It's the powers of W squared, W to the fourth, W to the sixth, W to the two n-1. And then more columns and more columns and more columns and what's the last column? It's the powers of -- let's see.
We -- actually, if we look around rows, w- this matrix is symmetric. It's symmetric in the old not quite perfect way, not perfect because these numbers are complex. And so it's -- that first row is all ones. One W W squared up to W to the n-1. That's the last column is the powers of W to the n-1, so this guy matches that, and finally we get W to something here. I guess we could actually figure out what that something is.
What are the entries of this matrix? The i j entry of this matrix are -- I going to -- are you going to allow me to let i go from zero to n minus one? So i and g go from zero to n-1.
So the one -- the zero zero entry is a one -- it's just this same W guy to the power i times j.
Let's see. I'm jumping into formulas here and I have to tell you what W is and then you know everything about this matrix. So W is the -- well, shall we finish here? What was this -- this is the (n-1) (n-1) entry. This is W to the n-1 squared.
Everything's looking like a mess here, because we have -- not too bad, because all the entries are powers of W.
There -- none of them are zero. This is a full matrix.
But W is a very special number. W is the special number whose n-th power is one. In fact -- well, actually, there are n numbers like that.
One of them is one, of course.
But the one we -- the W we want is -- the angle is two pi over n. Is that what I mean? n over two pi. No, two pi over n.
W is E to the I and the angle is two pi over n.
Right. Where is this W in the complex plane? It's -- it's on the unit circle, right? This is -- it's the cosine of two pi over n plus I times the sine of two pi over n. But actually, forget this. It's never good to work with the real and imaginary parts, the rectangular coordinates, when we're taking powers. To take that to the tenth power, we can't see what we're doing.
To take this form to the tenth power, we see immediately what we're doing. It would be e to the i 20 pi over n. So when our matrix is full of powers -- so it's this formula, and where is this on the complex plain? Here are the real numbers, here's the imaginary axis, here's the unit circle of radius one, and this number is on the unit circle at this angle, which is one n-th of the full way round.
So if I drew, for example, n equals six, this would be e to the two pi, two pi over six, it would be one sixth of the way around, it would be 60 degrees.
And where is W squared? So I -- my W is e to the two pi I over six in this case, in this six by -- for the six by six Fourier transform, it's totally constructed out of this number and its powers.
So what are its powers? Well, its powers are on the unit circle, right? Because when I square a number, a complex number, I square its absolute value, which gives me one again. All the powers have -- are on the unit circle. And the -- the angle gets doubled to a hundred and twenty, so there's W squared, there's W cubed, there's W to the fourth, there's W to the fifth and there is W to the sixth, as we hoped, W to the sixth coming back to one. So those are the six -- can I say this on TV? The six sixth roots of one, and it's this one, the primitive one we say, the first one, which is W.
Okay, so what -- let me change -- let me -- I said I would probably switch to n equal four. What's W for that? It's the fourth root of one. W to the fourth will be one.
W will be e to the two pi i over four now. What's that? This is e to the i pi over two. This is a quarter of the way around the unit circle, and that's exactly i, a quarter of the way around. And sure enough, the powers are i, i squared, which is minus one, i cubed, which is minus i and finally i to the fourth which is one, right.
So there's W, W squared, W cubed, W to the fourth -- I'm really ready to write down this Fourier matrix for the four by four case, just so we see that clearly. Let me do it here.
F4 is -- all right, one one one one one one one W -- it's I. I squared.
That's minus one. i cubed is minus i.
I'll -- I could write i squared and i cubed.
Why don't I, just so we see the pattern for sure. i squared, i cubed, i squared, i cubed, i fourth, i sixth -- i fourth, i sixth and i ninth.
You see the exponents fall in this nice -- the exponent is the row number times the column number, always starting at zero.
Okay. And now I can put in those numbers if you like -- one one one one, one i minus one minus i, one minus one, one minus one and one minus i minus one i.
No. Yes.
Right. What's -- why do I think that matrix is so remarkable? It's the four by four matrix that comes into the four point Fourier transform.
When we want to find the Fourier transform, the four point Fourier transform of a vector with four components, we want to multiply by this F4 or we want to multiply by F4 inverse.
One way we're taking the transform, one way we're taking the inverse transform. Actually, they're so close that it's easy to confuse the two.
The inverse of this matrix will be a nice matrix also. So -- and that's, of course, what makes it -- that -- I guess Fourier knew that.
He knew the inverse of this matrix.
A- as you'll see, it just comes from the fact that the columns are orthogonal -- from the fact that the columns are orthogonal, we will quickly figure out what is the inverse. What Fourier didn't know -- didn't notice -- I think Gauss noticed it but didn't make a point of it and then it turned out to be really important was the fact that this matrix is so special that you can break it up into nice pieces with lots of zeroes, factors that have lots of zeroes and multiply by it or by its inverse very, very fast.
Okay. But how did it get into this lecture first? Because the columns are orthogonal. Can I just check that the columns of this matrix are orthogonal? So the inner product of that column with that column is zero.
The inner product of column one with column three is zero.
How about the inner product of two and four? Can I take the inner product of column two with column four? Or even the inner product of two with three, let's -- let's see, does that -- let me do two and four.
Okay. What -- oh, I see, yes, hmm. Hmm.
Let's see, I believe that those two columns are orthogonal. So let me take their inner product and hope to get zero. Okay, now if you hadn't listened to the first half of this lecture, when you took the inner product of that with that, you would have multiplied one by one, i by minus i, and that would have given you one, minus one by minus one would have given you another one minus I by I would have been minus I squared, that's another one.
So do I conclude that the inner product of columns -- I said columns two and four, that's because I forgot those are columns one and three. I'm interested in their inner product and I'm hoping it's zero, but it doesn't look like zero. Nevertheless, it is zero. Those columns are perpendicular. Why? Because the inner product -- we conjugate.
Do you remember that the -- one of the vectors in the inner product has to get conjugated. So when I conjugated, it changes that i to a minus i, changes this to a plus i, changes those -- that second sine and that fourth sine and I do get zero. So those columns are orthogonal. So columns are orthogonal.
They're not quite orthonormal. But I could fix that easily. They -- all those columns have length two.
Length squared is four, like this -- the four I had there -- this length squared, one plus -- one squared one squared one squared one squared is four, square root is two -- so if I really wanted them -- suppose I really wanted to fix life perfectly, I could divide by two, and now I have columns that are actually orthonormal. So what? So I can invert right away, right? O- orthonormal columns means -- now I'm keeping this one half in here for the moment -- c- means F4 Hermitian, can I use that, conjugate transpose times F4 is i.
So I see what the inverse is. The inverse of F4 is -- it's just like an -- an orthogonal matrix.
The inverse is the transpose -- here the inverse is the conjugate transpose. So, fine.
That -- that tells me that anything good that I learn about F4 I'll know the same -- I'll know a similar fact about its inverse, because its inverse is just its conjugate transpose.
Okay, now -- so what's good? Well, first, the columns are orthogonal. That's a key fact, then. That's the thing that makes the inverse easy. But what property is it that leads to the fast Fourier transform? So now I'm going to talk, in these last minutes, about the fast Fourier transform.
What -- here's the idea. F6, our six by six matrix, will c- there's a neat connection to F3, half as big. There's a connection of F8 to F4. There's a connection of F(64) to F(32). Shall I write down what that connection is? What's the connection of F(64) to F(32)? So F(64) is a 64 by 64 matrix whose W is the 64th root of one. So it's one 64th of the way round in F(64). And it -- do- and F(32) is a 32 by 32 matrix. Remember, they're different sizes. And the W in that 32 by 32 matrix is the 32nd root of one, which is twice as far -- that -- you sh- see that key point -- that's the -- that's how 32 and 64 are connected in the Ws.
The W for 64 is one 64th of the way -- so all I'm saying is that if I square the W -- W(64), that's what I'm using for the one over -- the -- W sixty f- this Wn is either the i two pi over n -- so W(64) is one 64th of the way around it. When I square that, what do I get but W(32)? Right? If I square this matrix, I double the angle -- if I square this number, I double the angle, I get, the -- the W(32). So somehow there's a little hope here to connect F(64) with F(32). And here's the connection.
Okay. Let me -- let me go back, -- yes, let me -- I'll do it here.
Here's the connection. F(64).
The 64 by 64 Fourier matrix is connected to two copies of F(32). Let me leave a little space for the connection. So this is 64 by 64.
Here's a matrix of that same size, because it's got two copies of F(32) and two zero matrixes.
Those zero matrixes are the key, because when I multiply by this matrix, just as it is, regular multiplication, I would take -- need 64 -- I would -- I would have 64 squared little multiplications to do. But this matrix is half zero.
Well, of course, the two aren't equal.
I'm going to put an equals sign, but there has to be some fix up factors -- one there and one there -- to make it true.
The beauty is that these fix up factors will be really -- almost all zeroes. So that as soon as we get this formula right, we've got a great idea for how to get from the sixty- from the 64 squared calculations -- so this original -- originally we have 64 squared calculations from there, but this one will give us -- this is -- this will -- we don't need that many -- we only need two times 32 squared, because we've got that twice.
And -- plus the fix-up. So I have to tell you what's in this fix-up matrix. The one on the right is actually a permutation matrix, a very simple odds and evens permutation matrix, the -- ones show up -- I haven't put enough ones, I really need a -- 32 of these guys at -- double space and then -- you see it's -- it's a permutation matrix. What it does -- shall I call it P for permutation matrix? So what that P does when it multiplies a vector, it takes the odd -- the even numbered components first and then the odds.
You see this -- this one skipping every time is going to pick out x0, x2, x4, x6 and then below that will come -- will pick out x1, x3, x5. And of course, that can be hard wired in the computer to be instantaneous.
So that says -- so far, what have we said? We're saying that the 64 by 64 Fourier matrix is really separated into -- separate your vector into the odd -- into the even components and the odd components, then do a 32 size Fourier transform onto those separately, and then put the pieces together again.
So the pieces -- putting them together turns them out to be I and a diagonal matrix and I and a minus, that same diagonal matrix. So the fix-up cost is really the cost of multiplying by D, this diagonal matrix, because there's essentially no cost in -- in the I part or in the permutation part, so really it's -- the fix-up cost is essentially because D is diagonal -- is 32 multiplications.
That's the -- there you're seeing -- of course we didn't check the formula or we didn't even say what D is yet, but I will -- this diagonal matrix D is powers of W -- one W W squared down to W to the 31st. So you see that when I -- to do a multiplication by D, I need to do 32 multiplications. There they are.
Then -- but the other, the more serious work is to do the F(32) twice on the -- separately on the even numbered and odd numbered components, so twice 32 squared.
So 64 squared is gone now. And that's the new count.
Okay, great, but what next? So that's -- I -- we now have the key idea -- we would have to check the algebra, but it's just checking a lot of sums that come out correctly. This is right -- the right way to see the fast Fourier transform, or one right way to see it. Then you've got to see what's the next idea. The next idea is to break the 32s down. Break those 32s down.
So we have this factor, and now we have the F(32), but that breaks into some guy here -- F thirty- F six- F(16) -- F(16). Each -- each F(32) is breaking into two copies of F(16), and then we have a permutation and then the -- so this is a -- like, this was a 64 size permutation, this is a 32 size permutation -- I guess I've got it twice. So it's -- I'm -- I'm just using the same idea recursively -- recursion is the key word -- that on each of those F(32)s -- so here's zero zero -- it's just -- to get F(32) -- this is the odd even permutations -- so you see, we're -- the combination of those permutations, what's it doing? This guy separates into odds -- in -- into evens and odds, and then this guy separates the evens into the ones -- the numbers that are mult- the even evens, which means zero four eight sixteen -- and even odds, which means two, six, ten, fourteen -- and then odd evens and odd odds. You see, together these permutations then break it -- break our vector down into x, even even and three other pieces.
Those are the four pieces that separately get multiplied by F(16) -- separately fixed up by these Is and Ds and Is and minus Ds -- so this count is now reduced.
This count is now -- what's it reduced to? So that's going to be gone, because 32 squared -- that's -- that's the change I'm making, right? The 32 squared -- w- so -- so it's this that's now reduced.
So I still have two times it, but now what's 32 squared? It's gone in favor of two sixteen squareds plus sixteen.
That's -- and then the original 32 to fix.
Maybe you see what's happening. Even easier than this formula is w- what's -- when I do the recursion more and more times, I get simpler and simpler factors in the middle. Eventually I'll be down to two point or one point Fourier transforms.
But I get more and more factors piling up on the right and left.
On the right, I'm just getting permutation matrixes. On the left, I'm getting these guys, these Is and Ds, so that there was a 32 there and -- each one of these is costing 32. Each one of those is costing 32. And how many will there be? So you see the 32 for this original fix up, because D had 32 numbers, 32 for this next fix up, because D has 16 and 16 more. I keep going. So the count in the middle goes down to zip, but these fix up counts are all that I'm left with, and how many factors -- how many fix-ups have I got -- log in -- from 64, one step to 32, one step to 16, one step to eight, four, two and one.
Six steps. So I have six fix-up -- six fix up factors. Finally I get to six times the 32. That's my final count.
Instead of 64 squared, this is log to the base two of 64 times 64 -- actually, half of 64. So actually, the final count is n log to the base two of n -- that's the 32 -- a half.
So can I put a box around that wonderful, extremely important and satisfying conclusion -- that the fast Fourier transform multiplies by an n by n matrix, but it does it not in n squared steps, but in one half n log n steps.
And if we just -- complete by doing a count, let's suppose -- suppose -- a typical case would be two to the tenth. Now n squared is bigger than a million. So it's a thousand twenty four times a thousand twenty four. But what is n -- what is one half -- what is the new count, done the right way? It's n -- the thousand twenty four times one half, and what's the logarithm? It's ten.
So times ten over two. So it's five times -- it's five times a thousand twenty four, where this one was a thousand twenty four times a thousand twenty four.
We've reduced the calculation by a factor of 200 just by factoring the matrix properly.
This was a thousand times n, we're now down to five times n.
So we can do 200 Fourier transforms, where before we could do one, and in real scientific calculations where Fourier transforms are happening all the time, we're saving a factor of 200 in one of the major steps of modern scientific computing.
So that's the idea of the fast Fourier transform, and you see the whole thing hinged on being a special matrix with orthonormal columns. Okay, that's actually it for complex numbers. I'm back next time really to -- to real numbers, eigenvalues and eigenvectors and the key idea of positive definite matrixes is going to show up. What's a positive definite matrix? And it's terrific that this course is going to reach positive definiteness, because those are the matrixes that you see the most in applications. Okay, see you next time.
Thanks.