Lecture 28: Matrix Methods for Inhomogeneous Systems
Topics covered: Matrix Methods for Inhomogeneous Systems: Theory, Fundamental Matrix, Variation of Parameters
Instructor/speaker: Prof. Arthur Mattuck
The real topic is how to solve inhomogeneous systems, but the subtext is what I wrote on the board. I think you will see that really thinking in terms of matrices makes certain things a lot easier than they would be otherwise. And I hope to give you a couple of examples of that today in connection with solving systems of inhomogeneous equations. Now, there is a little problem. We have to have a little bit of theory ahead of time before that, which I thought rather than interrupt the presentation as I try to talk about the inhomogeneous systems it would be better to put a little theory in the beginning. I think you will find it harmless.
And about half of it you know already. The theory I am talking about is, in general, the theory of the systems x' = Ax. I will just state it when n is equal to two. A two-by-two system like you have had up until now. It is also true for n-by-n. It is just a little more tedious to write out and to give the definitions. Here is a little two-by-two system. It is homogeneous. There are no zeros. And it is not necessary to assume this, but since the matrix is going to be constant until the end of the term let's assume it and not go for a spurious generality. So constant matrices like you will have on your homework.
Now, there are two theorems, or maybe three that I want you to know, that you need to know in order to understand what is going on. The first one, fortunately, is already in your bloodstream, I hope. Let's call it theorem A. It is simply the one that says that the general solution to the system, that system I wrote on the board, the two-by-two system is what you know it to be. Namely, from all the examples that you have calculated. It is a linear combination with arbitrary constants for the coefficients of two solutions. In other words, to solve it, to find the general solution you put all your energy into finding two independent solutions. And then, as soon as you found them, the general one is gotten by combining those with arbitrary constants.
The only thing to specify is what the x1 and the x2 are. "Where," I guess, would be the right word to use. Where x1 and x2 are two solutions, but neither must be a constant multiple of the other. That is the only thing I want to stress, they have to be independent. Or, as it is better to say, linearly independent. Are two linearly independent solutions. And department of fuller explanation, i.e., neither is a constant multiple of the other.
That is what it means to be linearly independent. Now, this theorem I am not going to prove. I am just going to say that the proof is a lot like the one for second order equations. It has an easy part and a hard part. The easy part is to show that all of these guys are solutions. And, in fact, that is almost self-evident by looking at the equation. For example, if x1 and x2, each of those solve that equation so does their sum because, when you plug it in, you differentiate the sum by differentiating each term and adding. And here A(x1 + x2) = Ax1 + Ax2.
In other words, you are using the linearity and the superposition principle. It is easy to show that all of these, well, maybe I should actually write something down instead of just talking. Easy that all these are solutions. Every one of those guys, regardless of what c1 and c2 are, is a solution. That is linearity, if I use that buzz word, plus the superposition principle, that the sum of two solutions is a solution.
The hard thing is not to show that these are solutions but to show that these are all the solutions, that there are no other solutions. No matter how you do that it is hard. The hard thing is that there are no other solutions. These are all. Now, you could sort of say, well, it has two arbitrary constants in it. That is sort of a rough and ready reason, but it is not considered adequate by mathematicians. And, in fact, I could go into a song and dance as to just why it is inadequate. But we have other things to do, bigger fish to fry, as they say. Let's fry a fish. No, we have another theorem first.
This one it is mostly the words that I am interested in. Once again, we have our old friend the Wronskian back. The Wronskian of what? Of two solutions. It is the Wronskian of the solutions x1 and x2. They don't, by the way, have to be independent. Just two solutions to the system. And what is it? Hey, didn't we already have a Wronskian? Yeah. Forget about that one for the moment. Postpone it for a minute. This is a determinant, just like the old one was.
This is going to be a great lecture. (x1, x2). Now what is this? x1 is a column vector, right? x2 is a column vector. Two things in it. Two things in it. Together they make a square matrix. And this means its determinant. It is the determinant of this. It is a determinant, in other words, of a square matrix. And that is what it is. I will change this equality. To indicate it is a definition, I will put the colon there, which is what you add, to indicate this is only equal because I say so. It is a definition, in other words. Now, there is a connection between this and the earlier Wronskian which I, unfortunately, cannot explain to you because you are going to explain it to me.
I gave it to you as part one of your homework problem. Make sure you do it. And, if you cannot remember what the old Wronskian is, please look it up in the book. Don't look it up in the solution to the problem. If you do that you will learn something. Then you will see how, in a certain sense, this is a more general definition than I gave you before. The one I gave you before is, in a certain sense, a special case of it.
Now that is just the definition. There is a theorem. And the theorem is going to look just like the one we had for second order equations, if you can remember back that far. The theorem is that if these are two solutions there are only two possibilities for the Wronskian. So either or. Either the Wronskian is -- Now, the Wronskian, these are functions, the column vectors are the solutions, so those are functions of the variable t, so are these. The Wronskian as a whole is a function of the independent variable t after you have calculated out that determinant. I will write it now this way to indicate that it is a function of t. Either the Wronskian is --
One possibility is identically zero. That is zero for all values of t, in other words. And this happens if x1 and x2 are not linearly independent. Usually people just say dependent and hope they are interpreted correctly. Are dependent. But since I did not explain what dependent means, I will say it. Not linearly independent. I know that is horrible, but nobody has figured out another way to say it. That is one possibility, or the opposite of this is never zero for any t value.
I mean a normal function is zero here and there, and the rest of the time not zero. Well, not this Wronskian. You only have two choices. Either it is zero all the time or it is never zero. It is like the function e^t. In other words, an exponential which is never zero, always positive and never zero. Or, it could be a constant. Anyway, it has to be a function which is never zero. And this happens in the other case, so this is --
There is no place to write it. This is the case if x1 and x2 are independent, by which I mean linearly independent. It is just I didn't have room to write it. That is pretty much the end of the theory. And now, let's start in on the matrices. The basic new matrix we are going to be talking about this period and next one on Monday also is the way that most people who work with systems actually look at the solutions to systems, so it is important you learn this word and this way of looking at it. What they do is look not at each solution separately, as we have been doing up until now. They put them all together in a single matrix.
And it is the properties of that matrix that they study and try to do the calculations using. And that matrix is called the fundamental matrix for the system. Sometimes people don't bother writing in the whole system. They just say it is a fundamental matrix for A because, after all, A is the only thing that is varying there. Once you know A, you know what the system is. So what is this guy?
Well, it is a two-by-two matrix. And it is the most harmless thing. It is the precursor of the Wronskian. It is what the Wronskian was before the determinant was taken. In other words, it is the matrix whose two columns are those two solutions. The other question is what we are going to call it. I kept trying everything and settled on calling it capital X because I think that is the one that guides you in the calculations the best.
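Written out, the two board definitions described so far would presumably look something like the following. This is a reconstruction, not a copy of the board: the component names (x1, y1) and (x2, y2) for the entries of the two solutions are the ones used later in the lecture, and the bold-vector notation is an assumption. Note that the Wronskian is defined for any two solutions, while X is only called a fundamental matrix when the two columns are independent.

```latex
\[
X(t) := \begin{pmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{pmatrix}
      = \begin{pmatrix} x_1(t) & x_2(t) \\ y_1(t) & y_2(t) \end{pmatrix},
\qquad
W(\mathbf{x}_1,\mathbf{x}_2)(t) := \det X(t) = x_1 y_2 - x_2 y_1 .
\]
```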
This is definition two, so colon equality. Notice I am not using vertical lines now because that would mean a determinant. It is the matrix whose columns are two independent solutions. Is that all? Yeah. You just put them side-by-side. Why? That will come out. Why should one do this? Well, first of all, in order not to interrupt the basic calculation that I want to make with this during the period, it has two basic properties that we are going to need during this period. These are the properties.
Just two. And one is obvious and the other you will think, I hope, is a little less familiar. I think you will see there is nothing to it. It is just a way of talking, really. The first is the one that is already embedded in the theorem, namely that the determinant of the fundamental matrix is not zero for any t. Why? Well, I just told you it wasn't. This is the Wronskian. The Wronskian is never zero? Why is it never zero? Well, because I said these columns had to be independent solutions. So this is not just not zero, it is never zero. It is not zero for any value of t. That is good. As you will see, we are going to need that property. But the other one is a little stranger.
The only thing I can say is, get used to it. Namely that X prime equals AX. Now, why is that strange? That is not the same as this. This is a column vector. That is a square matrix and this is a column vector. This is not a column vector. This is a square matrix. This is what is called a matrix differential equation where the variable is not a single x or a column vector of a set of x's like the x and the y. It is a whole matrix.
Well, first of all, I should say what is it saying? This is a two-by-two matrix. When I multiply them I get a two-by-two matrix. What is this? This is a two-by-two matrix, every entry of which has been differentiated. That is what it means to put that prime there. To differentiate a matrix means nothing fancy. It just means differentiate every entry. It is just like to differentiate a vector (x, y), to make a velocity vector you differentiate the x and the y. Well, a column vector is a special kind of matrix. The definition applies to any matrix.
Well, why is that so? I state it as a property, but I will continue it by giving you, so to speak, the proof of it. In fact, there is nothing in this. It is nothing more than a little matrix calculation of the most primitive kind. Namely, what does this mean? Let's try to undo that. What does the left-hand side really mean? Well, if that is what x means, the left-hand side must mean the derivative of the first column.
That is its first column. And the derivative of the second column. That is what it means to differentiate the matrix X. You differentiate each column separately. And to differentiate the column you need to differentiate every function in it. Well, what does the right-hand side mean? Well, I am supposed to take A[x1, x2]. Now, I don't know how to prove this, except ask you to think about it.
Or, I could write it all out here. But think of this as a bing, bing, bing, bing. And this is a bing, bing. And this is a bong, bong. How do I do the multiplication? In other words, what is in the first column of the matrix? Well, it is dah, dah, and the lower thing is dah, dah. In other words, it is A times x1. Shut your eyes and visualize it. Got it? Dah, dah is the top entry, and dah, dah is the bottom entry. It is what you get by multiplying A by the column vector x1. And the same way the other guy is --
-- what you get by multiplying A by the column vector x2. This is just matrix multiplication. That is the law of matrix multiplication. That is how you multiply matrices. Well, good, but where does this get us? What does it mean for those two guys to be equal? That is going to happen, if and only if x1 prime is equal to A x1. This guy equals that guy. And similarly for the x2's. The end result is that this matrix, saying that the fundamental matrix satisfies this matrix differential equation is only a way of saying, in one breath, that its two columns are both solutions to the original system.
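The column-by-column argument just described can be summarized in symbols. This is only a sketch of the spoken calculation, with bold letters assumed for the column vectors:

```latex
\[
X' = \begin{pmatrix} \mathbf{x}_1' & \mathbf{x}_2' \end{pmatrix},
\qquad
AX = A\begin{pmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{pmatrix}
   = \begin{pmatrix} A\mathbf{x}_1 & A\mathbf{x}_2 \end{pmatrix},
\]
\[
X' = AX
\iff
\mathbf{x}_1' = A\mathbf{x}_1 \ \text{ and } \ \mathbf{x}_2' = A\mathbf{x}_2 .
\]
```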
It is, so to speak, an efficient way of turning these two equations into a single equation by making a matrix. I guess it is time, finally, to come to the topic of the lecture. I said the thing the matrices were going to be used for is solving inhomogeneous systems, so let's take a look at those. I thought I would give you an example. Inhomogeneous systems. Well, what is one going to look like? So far what we have done is, up until now has been solving, we spent essentially two weeks solving and plotting the solutions to homogeneous systems. There was nothing over there. And homogeneous systems, in fact, with constant coefficients.
Stuff that looked like that that we abbreviated with matrices. Now, to make the system inhomogeneous what I do is add the extra term on the right-hand side, which is some function of t. Except, I will have to have two functions of t because I have two equations. Now it is inhomogeneous. And what makes it inhomogeneous is the fact that these are not zero anymore. There is something there. Functions of t are there.
These are given functions of t like exponentials, polynomials, the usual stuff you have on the right-hand side of the differential equation. What is confusing here is that when we studied second order equations it was homogeneous if the right-hand side was zero, and if there was something else there it was inhomogeneous. Unfortunately, I have stuck this stuff on the right-hand side so it is not quite so clear anymore.
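In components, the kind of system being described would look roughly like this. The names a, b, c, d for the constant entries of the matrix and r1, r2 for the two given functions of t are assumptions made for illustration; the transcript does not record what was written on the board.

```latex
\[
\begin{aligned}
x' &= a\,x + b\,y + r_1(t),\\
y' &= c\,x + d\,y + r_2(t).
\end{aligned}
\]
```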
It has got to look like that, in other words. How would the matrix abbreviation look? Well, the left-hand side is x prime. The homogeneous part is Ax, just as it has always been. The only extra part is those functions r. And this is a column vector, after the multiplication this is a column vector, what is left is a column vector. Now, explicitly it is a function of t, given by explicit functions of t, again, like exponentials.
Or, they could be fancy functions. That is the thing we are trying to solve. Why don't I put it up in green? Our new and better and improved system. Think back to what we did when we studied inhomogeneous equations. We are not talking about systems but just a single equation. What we did was the main theorem -- I guess there are going to be three theorems today, not just two. Theorem C. Is that right? Yes, A, B. We are up to C. Theorem C says that the general solution, that is, the general solution to the system, is equal to the complementary function, which is the general solution to x prime equals Ax, --
-- the homogeneous equation, in other words, plus, what am I going to call it? (x)p, right you are, a particular solution. But the principle is the same and is proved exactly the same way. It is just linearity and superposition. The linearity of the original system and the superposition principle. The essence is that to solve this inhomogeneous system, what we have to do is find a particular solution. This part I already know how to do. We have been doing that for two weeks. The new thing is to find this.
Now, if you remember back before spring break, most of the work in solving the second order equation was in finding that particular solution. You quickly enough learned how to solve the homogeneous equation, but there was no real general method for finding this. We had an exponential input theorem with some modifications to it. We took a week's detour in Fourier series to see how to do it for periodic functions or functions defined on finite intervals. There were other techniques which I did not get around to showing you, techniques involving the so-called method of undetermined coefficients. Although, some of you peeked in your book and learned it from there.
But the work is in finding (x)p. The miracle that occurs here, by contrast, is that it turns out to be easy to find (x)p. And easy in this further sense that I do not have to restrict the kind of function I use. For example, the second homework problem I have given you, the second part two homework problem. You will see how to use systems. For example, to solve this simple equation, I will write it out for you, consider that equation, tan(t). What technique will you apply to solve that? In other words, suppose you wanted to find a particular solution to that. The right-hand side is not an exponential. It is not a polynomial.
It is not like sin(bt) or cos(bt). I could use the Laplace transform. No, because you don't know how to take the L(tan(t)). Neither, for that matter, do I. Fourier series. Not a good choice for a function that goes to infinity at pi over two. So you cannot do this until you do your homework. Now you will be able to do it. In other words, one of the big things is not only will I give you a formula for the Xp but that formula will work even for tangent t, any function at all.
Well, I thought I would try to put a little meat on the bones of the inhomogeneous systems by actually giving you a physical problem so we would actually be able to solve a physical problem instead of just demonstrate a solution method. Here is a mixing problem. Just to illustrate what makes a system of equations inhomogeneous, here are two ugly tanks. I am not going to draw these carefully, but they are both 1 liter. And they are connected by pipes. And I won't bother opening holes in them. There is a pipe with fluids flowing back there and this direction it is flowing this way, but that is not the end.
The end is there is stuff coming in to both of them. And I think I will just make it coming out of this one. There is something realistic. The numbers 2, 3, 2. Let's start there and see what the others have to be. So these are flow rates. One liter tanks. The flow rates are in, let's say, liters per hour. And I have some dissolved substance in, so here is going to be x salt in there and the same chemical in there, whatever it is. x is the amount of salt, let's say, in tank one.
And y, the same thing in tank two. Now, if you have stuff flowing unequally this way, you must have balance. You have to make sure that neither tank is getting emptied or bursting and exploding. What is flowing in? What is x? Three is going out, two is coming in, so this has to be one in order that tank x stay full and not explode. And how about y? How much is going out? Two there and two here. Four is going out, three is coming in. This also has to be one. Those are just the flow rates of water or the liquid that is coming in. Now, the only thing I am going to specify is the concentration of what is coming in.
Here the concentration is 5 e^(-t). And that is what makes the problem inhomogeneous. Here the concentration is going to be zero. In other words, pure water is flowing in here to create the liquid balance. Here, on the other hand, salt solution is flowing in but with a steadily declining concentration. So, what is the system? Well, you have set it up exactly the way you did when you studied first order equations. It is inflow minus outflow. What is the outflow? The outflow is all in this pipe. The flow rates are liters per hour. Three liters per hour flowing out. How much salt does that represent?
It is negative three times the concentration of salt. But the concentration, notice, equals x / 1. In other words, x represents both the concentration and the amount. So I don't have to distinguish. If I had made it two liter tanks then I would have had to divide this by two. I am cheating, but it is enough already. x prime equals minus 3x. That is what is going out. What is coming in? Well, 2y is coming in. Concentration here. What is coming in? Is it y 2 liter? Plus what is coming in from the outside. We have to add that in, and that will be 5 e^(-t). How about y? y prime is changing. What comes in from x?
That is 3x. What goes out? Well, two is leaving here and two is leaving here. It doesn't matter that they are going out through separate pipes. They are both going out. It is minus 4, 2 and 2. How about the inhomogeneous term? There is one coming in, but there is no salt in it. Therefore, that is not changing. What is coming through that pipe is necessary for the liquid balance. But it has no effect whatsoever. I will put a zero here but, of course, you don't have to put that in. This is now an inhomogeneous system. In other words, the system is x prime equals this matrix, negative 3, the same sort of stuff we always had, plus the inhomogeneous term which is the column vector [5 e^(-t), 0].
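Collecting the pieces just read off the tank diagram, the mixing system can be written out as follows. The matrix entries come directly from the flow rates stated above (three liters per hour leaving tank one, two coming in from tank two, four leaving tank two); the typeset layout is a reconstruction.

```latex
\[
\begin{aligned}
x' &= -3x + 2y + 5e^{-t},\\
y' &= \phantom{-}3x - 4y,
\end{aligned}
\qquad\text{that is,}\qquad
\begin{pmatrix} x \\ y \end{pmatrix}'
= \begin{pmatrix} -3 & 2 \\ 3 & -4 \end{pmatrix}
  \begin{pmatrix} x \\ y \end{pmatrix}
+ \begin{pmatrix} 5e^{-t} \\ 0 \end{pmatrix}.
\]
```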
It is the presence of this term that makes this system inhomogeneous. And what that corresponds to is this little closed system being attacked from the outside by these external pipes which are bringing salt in. Without those, of course the balance would be all wrong. I would have to change this to three and cut that out, I guess. But then, it would be just a simple homogeneous system. It is these pipes that make it inhomogeneous. Now, I should start to solve that. I did this just to illustrate where a system might come from. Before I solve that, what I want to do, of course, is solve it in general. In other words, how do you solve this in general? Because I promised you that you would be able to do it in general, regardless of what sort of functions were in the r of t, that column vector. So let's do it.
First of all, you have to learn the name of the method. This method is for solving x prime equals Ax plus r. It is a method for finding a particular solution. Of course, to actually solve it then you have to add the complementary function. We are looking for a particular solution for this system. Now, the whole cleverness of the method, which I think was discovered a couple hundred years ago by, I think, Lagrange, I am not sure. The method is called variation of parameters.
I am giving you that so that when you forget you will be able to look it up in the index of some advanced engineering mathematics book or something, whatever is on your shelf. But, of course, you won't remember the name either so maybe this won't work. Variation of parameters, I will explain to you why it is called that. All the cleverness is in the very first line. If you could remember the very first line then I trust you to do the rest yourself.
I don't know any motivation for this first step, but mathematics is supposed to be mysterious anyway. It keeps me eating. It says, look for a solution and there will be one of the following form. Now, it will look exactly like -- Look carefully because it is going to be gone in a moment. It will look exactly like this. But, of course, it cannot be this because this solves the homogeneous system. If I plug this in with these as constants it cannot possibly be a particular solution to this because it will stop there and satisfy that with r equals zero.
The whole trick is you think of these as parameters which are now variable. Constants that are varying. That is why it is called variation of parameters. You think of these, in other words, as functions of t. We are going to look for a solution which has the form, since they are functions of t, I don't want to call them c1 and c2 anymore. I will call them v because that is what most people call them, v or u, sometimes.
The method says look for a solution of that form. Variation of parameters: these are the parameters that are now varying instead of being constants. Now, if you take it in that form and start trying to substitute into the equation you are going to get a mess. I think I was wrong in saying I could trust you from this point on. I will take the first step for you, and then I could trust you to do the rest after that first step. The first step is to change the way this looks by using the fundamental matrix. Remember what the fundamental matrix was? Its entries were the two columns of solutions. These are solutions to the homogeneous system.
And I am going to write it using the fundamental matrix as, now think about it. The fundamental matrix has columns x1 and x2. Your instinct might be using matrix multiplication to put the v1 and the v2 here, but that won't work. You have to put them here. This says the same thing as that. Let's just take a second out to calculate. The x is going to look like (x1, y1). That is my first solution. My second solution, here is the fundamental matrix, is (x2, y2). And I am multiplying this on the right by (v1, v2). Does it come out right? Look. What is it?
The top is x1v1 + x2v2. The top, x1v1 + x2v2. It is in the wrong order, but multiplication is commutative, fortunately. And the same way the bottom thing will be v1y1 + v2y2. If I had written it on the other side instead, which is tempting because the v's occur on the left here, that won't work. What will I get? I will get v1x1 + v2y1, which is not at all what I want. You must put it on the right. But this is a very important thing. This is going to plague us on Monday, too. It must be written on the right and not on the left as a column vector. The rest of the program is very simple. I will write it out as a program.
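The multiplication check just carried out can be laid out in symbols. This is a sketch of the spoken calculation, using the component names from the lecture; the second line shows what goes wrong when v is placed on the left (as a row vector, which is an assumption about how the lecturer meant it).

```latex
\[
X\mathbf{v}
= \begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \end{pmatrix}
  \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}
= \begin{pmatrix} x_1 v_1 + x_2 v_2 \\ y_1 v_1 + y_2 v_2 \end{pmatrix}
= v_1 \mathbf{x}_1 + v_2 \mathbf{x}_2 ,
\]
\[
\begin{pmatrix} v_1 & v_2 \end{pmatrix}
\begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \end{pmatrix}
= \begin{pmatrix} v_1 x_1 + v_2 y_1 & v_1 x_2 + v_2 y_2 \end{pmatrix}
\neq v_1 \mathbf{x}_1 + v_2 \mathbf{x}_2 \quad\text{in general.}
\]
```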
Substitute into the system, into that, in other words, and see what v has to be. That is what we are looking for. We know what the x1 and x2 are. It is a question of what those coefficients are. And see what v is. Let's do it. Let's substitute. Let's see. The system is x' = Ax + r. I want to put in (x)p, this proposed particular solution. And it is a fundamental matrix, and the v is unknown. How do I differentiate the product of two matrices?
You differentiate the product of two matrices using the product rule that you learned the first day of 18.01. Trust me. Let's do it. I am going to substitute in. In other words, here is my (x)p, (x)p, and I am going to write in what that is. The left-hand side is the derivative of Xv, which is X' v + X v'. Notice that one of these is a column vector and the other is a square matrix. That is perfectly okay. Any two matrices which are the right shape so you can multiply them together, if you want to differentiate their product, in other words, if the entries are functions of t it is the product rule. The derivative of this times that, plus this times the derivative of that. You have to keep them in the right order. You are not allowed to shuffle them around carelessly. So that is that. What is it equal to?
Well, the right-hand side is A. And now I substitute just (x)p in, so that is X times v plus r. Is this progress? What is v? It looks like a mess but it is not. Why not? It is because this is not any old matrix X. This is a matrix whose columns are solutions to the system. And what does that do? That means X satisfies that matrix differential equation, X' = AX. And, by a little miracle, the v is tagging along in both cases. This cancels that and now there is very little left.
The conclusion, therefore, is that Xv = r. What is v? It is v that we are looking for, right? You have to solve a matrix equation, now. This is a square matrix so you have to do it by inverting the matrix. You don't just sloppily divide. You multiply on which side by what matrix? Choice of left or right. You multiply by the inverse matrix on the left or on the right? It has to be on the left. Multiply both sides of the equation by X inverse on the left, and then you will get v = X^(-1) r. How do I know the X inverse exists? Does X inverse exist?
For a matrix inverse to exist, the matrix's determinant must not be zero. Why is the determinant of this not zero? Because its columns are independent solutions. Of course this is not right. I forgot the prime here. I am not failing this course after all. v' equals that. This is done by differentiating each entry in the column vector. And, therefore, we should integrate it. It will be the integral, just the ordinary anti-derivative of X^(-1) r. This is a column vector.
The entries are functions of t. You simply integrate each of those functions in turn. So integrate each entry. There is my v. Sorry, you cannot tell the v's from the r's here. And so, finally, the particular solution is (x)p is equal to -- It is really not bad at all. It is equal to X times v. It's equal to X Integral of [X^(-1) r dt]. Now, actually, there is not much work to doing that. Once you have solved the homogeneous system and gotten the fundamental matrix, taking the inverse of a two-by-two matrix is almost trivial. You flip those two and you change the signs of these two and you divide by the determinant.
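For reference, the whole derivation, with the corrected prime on v, can be collected in a few lines. This is a summary sketch of the steps spoken above, followed by the two-by-two inverse rule just described; the entry names a, b, c, d in the inverse formula are generic placeholders, not notation from the lecture.

```latex
\[
\begin{aligned}
\mathbf{x}_p &= X\mathbf{v}
  && \text{(trial form)}\\
X'\mathbf{v} + X\mathbf{v}' &= AX\mathbf{v} + \mathbf{r}
  && \text{(substitute into } \mathbf{x}' = A\mathbf{x} + \mathbf{r}\text{)}\\
X\mathbf{v}' &= \mathbf{r}
  && \text{(since } X' = AX\text{)}\\
\mathbf{v}' &= X^{-1}\mathbf{r}
  && \text{(} \det X \neq 0 \text{ for every } t\text{)}\\
\mathbf{x}_p &= X \int X^{-1}\mathbf{r}\,dt .
\end{aligned}
\]
\[
\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1}
= \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.
\]
```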
You multiply it by r. And the hard part is if you can do the integration. If not, you just leave the integral sign the way you have learned to do in this silly course and you still have the answer. What about the arbitrary constant of integration? The answer is you don't need to put it in. Just find one particular solution. It is good enough. You don't have to put in the arbitrary constants of integration.
Because they are already in the complementary function here. Therefore, you don't have to add them. I am sorry I didn't get a chance to actually solve that. I will have to let it go. The recitations will do it on Tuesday, will solve that particular problem, which means you will, in effect.
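Since the mixing problem was not solved in the lecture, here is a minimal Python sketch of how the formula might be applied to it and checked numerically. Everything beyond the system itself is an assumption of this sketch, not something from the lecture: the eigenvalues -1 and -6 with eigenvectors (1, 1) and (2, -3) were worked out by hand for the matrix with rows (-3, 2) and (3, -4), the antiderivative of v' was also done by hand, and all function names are made up for illustration. The check compares a central-difference derivative of x_p against A x_p + r.

```python
import math

# Assumption (not computed in the lecture): for A = [[-3, 2], [3, -4]] the
# eigenvalues are -1 and -6, with eigenvectors (1, 1) and (2, -3).

def fundamental_matrix(t):
    """X(t): columns are the two independent homogeneous solutions."""
    return [[math.exp(-t), 2 * math.exp(-6 * t)],
            [math.exp(-t), -3 * math.exp(-6 * t)]]

def inverse_2x2(m):
    """Swap the diagonal, negate the off-diagonal, divide by the determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    """Product of a 2x2 matrix with a column vector of length 2."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def r(t):
    """Inhomogeneous term of the mixing problem: (5 e^(-t), 0)."""
    return [5 * math.exp(-t), 0.0]

def v_prime(t):
    """v'(t) = X(t)^(-1) r(t)."""
    return mat_vec(inverse_2x2(fundamental_matrix(t)), r(t))

def v(t):
    """Antiderivative of v' = (3, e^(5t)), done by hand, constants dropped."""
    return [3 * t, math.exp(5 * t) / 5]

def x_p(t):
    """Particular solution x_p = X v."""
    return mat_vec(fundamental_matrix(t), v(t))

def residual(t, h=1e-5):
    """Largest component of x_p' - (A x_p + r), using a central difference."""
    A = [[-3.0, 2.0], [3.0, -4.0]]
    lo, hi = x_p(t - h), x_p(t + h)
    deriv = [(hi[i] - lo[i]) / (2 * h) for i in range(2)]
    rhs = [p + q for p, q in zip(mat_vec(A, x_p(t)), r(t))]
    return max(abs(deriv[i] - rhs[i]) for i in range(2))

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0):
        print(t, x_p(t), residual(t))
```

Under these assumptions the hand calculation gives x_p = e^(-t) (3t + 2/5, 3t - 3/5), and the printed residuals should be near zero up to finite-difference error. The general solution would then be this plus c1 e^(-t) (1, 1) + c2 e^(-6t) (2, -3).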