Welcome to the thirteenth entry in the calculus series, where I will be covering the chain rule of differentiation. I will begin by demonstrating the chain rule using a very simple and intuitive example. This alone should be enough to secure the tool in your growing calculus toolbox.
However, we are not going to stop there. Following this example, I will also be diving into the mathematical framework that enables us to take advantage of the chain rule.
I won’t bore you with rigorous proofs. But I will go into just enough detail for you to appreciate the beauty of the mathematics behind calculus. Let us begin.
Why Do We Need the Chain Rule in the First Place
To start, let us say that we need to differentiate the following function (y) with respect to the independent variable ‘x’:
y = (x² + 5)^(3/2)
The part within parentheses (x² + 5) is fairly straightforward. But what makes this function challenging is the fact that it is raised to a fractional exponent. So, how could we go about differentiating this function? Well, here are two approaches that we could follow:
1. First cube the part within parentheses [(x² + 5)³], then compute the square root of the result. Finally, differentiate the resulting expression as we have done before.
2. Apply the binomial theorem to resolve the exponent, and then, differentiate the resulting expression as we have done before.
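If we set out with either of these approaches, we are likely to deal with cumbersome computations and messy terms. The first still leaves us with the square root of a sixth-degree polynomial, and the second produces an infinite series because the exponent is not a whole number. It almost feels like we could use a better alternative. Well, here’s another option: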
3. Use the chain rule to compute the derivative directly.
So, this is precisely where the chain rule comes in: to make our calculus lives easier. Now that we have a rough understanding of where the chain rule comes into play, let me show you an intuitive approach to applying the rule next.
How to Intuitively Apply the Chain Rule
As we established earlier, the part that makes this function a little bit scary is the computation of the expression (x² + 5) raised to three halves. Well, what if we skipped that computation, and replaced (x² + 5) with, say, ‘u’? That’s right, we start the path to applying the chain rule with the following substitution:
u = x² + 5, so that y = u^(3/2)
Now, by applying the power rule of calculus, we can compute the derivative of ‘y’ with respect to ‘u’ as follows:
dy/du = (3/2)·u^(1/2)
Next, we could differentiate ‘u’ with respect to ‘x’ (again by applying the power rule) to get the following result:
du/dx = 2x
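Now, all we have to do to obtain the derivative of ‘y’ with respect to ‘x’ is multiply the two results above and substitute (x² + 5) back in for ‘u’. This, in essence, is how the chain rule is applied:
dy/dx = (dy/du) × (du/dx) = (3/2)·u^(1/2) × 2x = 3x·(x² + 5)^(1/2)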
That was easy, right? When you consider the alternatives such as applying the binomial theorem, the chain rule makes our lives much easier.
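As a quick sanity check, here is a minimal Python sketch that compares the chain-rule result with a plain numerical estimate of the slope. The function names and the step size `h` are my own choices for illustration, and the central-difference estimate is only an independent check, not part of the chain rule itself.

```python
import math

def y(x):
    """The function from this essay: y = (x^2 + 5)^(3/2)."""
    return (x**2 + 5) ** 1.5

def dy_dx_chain_rule(x):
    """Chain rule with the substitution u = x^2 + 5: dy/dx = dy/du * du/dx."""
    u = x**2 + 5
    dy_du = 1.5 * math.sqrt(u)  # power rule applied to u^(3/2)
    du_dx = 2 * x               # power rule applied to x^2 + 5
    return dy_du * du_dx

def dy_dx_numeric(x, h=1e-6):
    """Central-difference slope estimate (assumed step size h)."""
    return (y(x + h) - y(x - h)) / (2 * h)

# Print x, the chain-rule derivative, and the numerical estimate side by side.
for x in (0.5, 1.0, 2.0):
    print(x, dy_dx_chain_rule(x), dy_dx_numeric(x))
```

At x = 2, the chain-rule value works out to 18, which is what 3x·(x² + 5)^(1/2) gives by hand, and the numerical estimate should agree with it to several decimal places.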
Most calculus users would be content with this level of understanding. But let us go one step deeper to understand why the chain rule of calculus works the way it does.
The Mathematical Framework Behind the Rule
When it comes to useful calculus tools, the chain rule is right up there. The more one deals with tricky functions, the more one tends to appreciate its usefulness. While we are on the topic of tricky functions, the function ‘y’ we just differentiated is known as a ‘composite function’.
In other words, it is a function of a function. The expression within parentheses (x² + 5) is called the ‘inside function’, and the operation applied to it, raising to the power 3/2, is called the ‘outside function’. In the context of these two functions, what the chain rule actually does is as follows:
Step 1: Differentiate the outside function with respect to the inside one (treating it as one independent variable)
Step 2: Multiply the above result by the derivative of the inside function with respect to ‘x’.
If there are several layers of functions, you could just apply the above steps recursively, and they would work just fine. The chain rule is essentially called so because it lets you compute a chain of derivatives and multiply them by each other to get the final result.
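To see how the two steps extend to several layers, here is a rough Python sketch. It is not a general differentiation tool: the helper `chain_derivative` and the three-layer example y = sin((x² + 5)^(3/2)) are hypothetical, made up purely for illustration, and each layer's derivative is supplied by hand. The sketch uses a simple loop from the innermost layer outward instead of literal recursion, but the product it builds is the same chain of derivatives.

```python
import math

def chain_derivative(layers, x):
    """Multiply the derivatives of each layer, innermost layer first.

    `layers` is a list of (function, derivative) pairs supplied by hand.
    """
    value = x
    derivative = 1.0
    for f, f_prime in layers:
        derivative *= f_prime(value)  # this layer's derivative at its own input
        value = f(value)              # this layer's output feeds the next layer
    return derivative

# Hypothetical three-layer example: y = sin((x^2 + 5)^(3/2))
layers = [
    (lambda x: x**2 + 5, lambda x: 2 * x),               # inside function
    (lambda u: u ** 1.5, lambda u: 1.5 * math.sqrt(u)),  # middle layer
    (math.sin, math.cos),                                # outermost layer
]

print(chain_derivative(layers, 1.0))
```

Passing only the first two layers reproduces the derivative of the function from the earlier example.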
Final Thoughts
Although there are rigorous mathematical proofs to show why the chain rule works, I prefer a brilliant yet simple analogy from Martin Gardner:
Consider three kids A, B, and C. If A is growing twice as fast as B, and B is growing thrice as fast as C, how fast is A growing with respect to C?
If you think about it for a bit, you would realise that if two rates are related by a common variable that is the independent variable in one rate and the dependent variable in the other, we could just multiply the rates to get the answer. In other words, A is growing six times as fast as C.
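Written in derivative notation, and treating the three heights as quantities A, B, and C (my own labelling of Gardner's analogy), the reasoning above amounts to:

```latex
\[
\frac{dA}{dC} \;=\; \frac{dA}{dB} \cdot \frac{dB}{dC} \;=\; 2 \times 3 \;=\; 6
\]
```

Here B plays the role that ‘u’ played earlier: it is the dependent variable in one rate and the independent variable in the other.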
If you are interested in discovering (or rediscovering) the joy of calculus from basic first principles, watch this space for more calculus essays in the future!
References and credit: Silvanus Thompson and Martin Gardner.