Inverse function rule

From HandWiki
Short description: Calculus identity
The thick blue curve and the thick red curve are inverse to each other. A thin curve is the derivative of the same colored thick curve. Inverse function rule:
$$f'(x) = \frac{1}{(f^{-1})'(f(x))}$$

Example for arbitrary $x_0 \approx 5.8$:
$$f'(x_0) = \frac{1}{4}$$
$$(f^{-1})'(f(x_0)) = 4$$

In calculus, the inverse function rule is a formula that expresses the derivative of the inverse of a bijective and differentiable function $f$ in terms of the derivative of $f$. More precisely, if the inverse of $f$ is denoted as $f^{-1}$, where $f^{-1}(y) = x$ if and only if $f(x) = y$, then the inverse function rule is, in Lagrange's notation,

$$\left[f^{-1}\right]'(a) = \frac{1}{f'\left(f^{-1}(a)\right)}.$$

This formula holds in general whenever $f$ is continuous and injective on an interval $I$, with $f$ being differentiable at $f^{-1}(a)$ ($\in I$) and where $f'(f^{-1}(a)) \neq 0$. The same formula is also equivalent to the expression

$$\mathcal{D}\left[f^{-1}\right] = \frac{1}{(\mathcal{D}f) \circ \left(f^{-1}\right)},$$

where $\mathcal{D}$ denotes the unary derivative operator (on the space of functions) and $\circ$ denotes function composition.

Geometrically, a function and its inverse have graphs that are reflections of each other in the line $y = x$. This reflection operation turns the gradient of any line into its reciprocal.[1]

Assuming that $f$ has an inverse in a neighbourhood of $x$ and that its derivative at that point is non-zero, its inverse is guaranteed to be differentiable at $f(x)$ and to have a derivative given by the above formula.

The inverse function rule may also be expressed in Leibniz's notation. As that notation suggests,

$$\frac{dx}{dy} \cdot \frac{dy}{dx} = 1.$$

This relation is obtained by differentiating the equation $f^{-1}(y) = x$ with respect to $x$ and applying the chain rule, which yields

$$\frac{dx}{dy} \cdot \frac{dy}{dx} = \frac{dx}{dx},$$

considering that the derivative of $x$ with respect to $x$ is 1.

Derivation

Let $f$ be an invertible (bijective) function, let $x$ be in the domain of $f$, and let $y$ be in the codomain of $f$. Since $f$ is a bijective function, $y$ is in the range of $f$. This also means that $y$ is in the domain of $f^{-1}$, and that $x$ is in the codomain of $f^{-1}$. Since $f$ is an invertible function, we know that $f(f^{-1}(y)) = y$. The inverse function rule can be obtained by taking the derivative of this equation.

$$\frac{d}{dy} f(f^{-1}(y)) = \frac{d}{dy} y$$

The right side is equal to 1 and the chain rule can be applied to the left side:

$$\begin{aligned}
\frac{d\left(f(f^{-1}(y))\right)}{d\left(f^{-1}(y)\right)} \cdot \frac{d\left(f^{-1}(y)\right)}{dy} &= 1 \\
\frac{df(f^{-1}(y))}{df^{-1}(y)} \cdot \frac{df^{-1}(y)}{dy} &= 1 \\
f'(f^{-1}(y)) \, (f^{-1})'(y) &= 1
\end{aligned}$$

Rearranging then gives

$$(f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))}$$

Rather than using $y$ as the variable, we can rewrite this equation using $a$ as the input for $f^{-1}$, and we get the following:[2]

$$(f^{-1})'(a) = \frac{1}{f'(f^{-1}(a))}$$
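
As a rough numerical sanity check of the formula just derived, the sketch below compares $1/f'(f^{-1}(a))$ with a finite-difference estimate of $(f^{-1})'(a)$. The test function $f(x) = x^3 + x$, the bisection-based inverse, and all helper names are illustrative assumptions, not part of the article.

```python
def f(x):
    """Hypothetical test function; strictly increasing on the reals."""
    return x**3 + x


def f_prime(x):
    """Exact derivative of the test function."""
    return 3 * x**2 + 1


def f_inverse(a, lo=-10.0, hi=10.0, iterations=200):
    """Invert f by bisection; assumes f(lo) <= a <= f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < a:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def central_difference(g, t, h=1e-5):
    """Symmetric finite-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)


a = 10.0  # f(2) = 10, so f_inverse(10) should be close to 2
rule_value = 1 / f_prime(f_inverse(a))             # inverse function rule
numeric_value = central_difference(f_inverse, a)   # direct estimate
print(rule_value, numeric_value)
```

The two printed values should agree to several decimal places; the step size `h` is a judgment call that trades truncation error against rounding error.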

Examples

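  • $y = x^2$ (for positive $x$) has inverse $x = \sqrt{y}$.
$$\frac{dy}{dx} = 2x \quad ; \quad \frac{dx}{dy} = \frac{1}{2\sqrt{y}} = \frac{1}{2x}$$
$$\frac{dy}{dx} \cdot \frac{dx}{dy} = 2x \cdot \frac{1}{2x} = 1.$$

At $x = 0$, however, there is a problem: the graph of the square root function becomes vertical, corresponding to a horizontal tangent for the square function.

  • $y = e^x$ (for real $x$) has inverse $x = \ln y$ (for positive $y$).
$$\frac{dy}{dx} = e^x \quad ; \quad \frac{dx}{dy} = \frac{1}{y} = e^{-x}$$
$$\frac{dy}{dx} \cdot \frac{dx}{dy} = e^x \cdot e^{-x} = 1.$$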
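
The two examples above can be spot-checked numerically. The following is a minimal sketch using only the standard library; the sample point `x = 1.5`, the step size, and the helper name `central_difference` are arbitrary choices made for illustration.

```python
import math


def central_difference(g, t, h=1e-6):
    """Symmetric finite-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)


x = 1.5  # arbitrary positive sample point

# y = x^2 with inverse x = sqrt(y)
dy_dx_square = central_difference(lambda t: t * t, x)
dx_dy_sqrt = central_difference(math.sqrt, x * x)
square_product = dy_dx_square * dx_dy_sqrt

# y = e^x with inverse x = ln(y)
dy_dx_exp = central_difference(math.exp, x)
dx_dy_log = central_difference(math.log, math.exp(x))
exp_product = dy_dx_exp * dx_dy_log

# Both products should be close to 1, up to finite-difference error.
print(square_product, exp_product)
```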

Additional properties

  • Integrating the inverse function rule gives
$$f^{-1}(x) = \int \frac{1}{f'(f^{-1}(x))} \, dx + C.$$
This is only useful if the integral exists. In particular, $f'(x)$ must be non-zero across the range of integration.
It follows that a function that has a continuous derivative has an inverse in a neighbourhood of every point where the derivative is non-zero. This need not be true if the derivative is not continuous.
  • Another useful property is the following:
$$\int f^{-1}(x) \, dx = x f^{-1}(x) - F(f^{-1}(x)) + C,$$
where $F$ denotes an antiderivative of $f$.
  • The inverse of the derivative of $f(x)$ is also of interest, as it is used in showing the convexity of the Legendre transform.

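Let $z = f'(x)$. Then, assuming $f''(x) \neq 0$, we have
$$\frac{d(f')^{-1}(z)}{dz} = \frac{1}{f''(x)}.$$
This can be shown using the previous notation $y = f(x)$. Then we have

$$f'(x) = \frac{dy}{dx} = \frac{dy}{dz} \frac{dz}{dx} = \frac{dy}{dz} f''(x) \quad \Rightarrow \quad \frac{dy}{dz} = \frac{f'(x)}{f''(x)}.$$
Therefore
$$\frac{d(f')^{-1}(z)}{dz} = \frac{dx}{dz} = \frac{dy}{dz} \frac{dx}{dy} = \frac{f'(x)}{f''(x)} \cdot \frac{1}{f'(x)} = \frac{1}{f''(x)}.$$

By induction, we can generalize this result for any integer $n \geq 1$, with $z = f^{(n)}(x)$, the $n$th derivative of $f(x)$, and $y = f^{(n-1)}(x)$, assuming $f^{(i)}(x) \neq 0$ for $0 < i \leq n+1$:

$$\frac{d(f^{(n)})^{-1}(z)}{dz} = \frac{1}{f^{(n+1)}(x)}$$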
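
Two of the identities in this section can be checked numerically on simple cases. The sketch below is illustrative only: it assumes $f = \exp$ (so $f^{-1} = \ln$ and $F = \exp$) for the integral identity, and $f(x) = x^3$ on $x > 0$ (so $f'(x) = 3x^2$, $(f')^{-1}(z) = \sqrt{z/3}$, $f''(x) = 6x$) for the inverse of the derivative. The helper names are hypothetical.

```python
import math


def simpson(g, a, b, n=1000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def central_difference(g, t, h=1e-6):
    """Symmetric finite-difference estimate of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)


def antiderivative_of_inverse(x):
    """x * f^{-1}(x) - F(f^{-1}(x)) for the assumed case f = F = exp."""
    return x * math.log(x) - math.exp(math.log(x))


# Integral identity: compare a numerical integral of ln on [1, 3]
# with the difference of the closed-form antiderivative.
numeric_integral = simpson(math.log, 1.0, 3.0)
closed_form = antiderivative_of_inverse(3.0) - antiderivative_of_inverse(1.0)

# Inverse of the derivative: d/dz (f')^{-1}(z) versus 1 / f''(x).
x = 2.0
z = 3 * x**2                      # z = f'(x)
lhs = central_difference(lambda t: math.sqrt(t / 3), z)
rhs = 1 / (6 * x)                 # 1 / f''(x)

print(numeric_integral, closed_form)
print(lhs, rhs)
```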

Higher derivatives

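The inverse function rule given above is obtained by differentiating the identity $f^{-1}(f(x)) = x$ with respect to $x$. One can continue the same process for higher derivatives. Differentiating the identity twice with respect to $x$, one obtains

$$\frac{d^2y}{dx^2} \cdot \frac{dx}{dy} + \frac{d}{dx}\left(\frac{dx}{dy}\right) \cdot \left(\frac{dy}{dx}\right) = 0,$$

which is simplified further by the chain rule to

$$\frac{d^2y}{dx^2} \cdot \frac{dx}{dy} + \frac{d^2x}{dy^2} \cdot \left(\frac{dy}{dx}\right)^2 = 0.$$

Replacing the first derivative, using the identity obtained earlier, we get

$$\frac{d^2y}{dx^2} = -\frac{d^2x}{dy^2} \cdot \left(\frac{dy}{dx}\right)^3.$$

Similarly for the third derivative:

$$\frac{d^3y}{dx^3} = -\frac{d^3x}{dy^3} \cdot \left(\frac{dy}{dx}\right)^4 - 3 \frac{d^2x}{dy^2} \cdot \frac{d^2y}{dx^2} \cdot \left(\frac{dy}{dx}\right)^2$$

or, using the formula for the second derivative,

$$\frac{d^3y}{dx^3} = -\frac{d^3x}{dy^3} \cdot \left(\frac{dy}{dx}\right)^4 + 3 \left(\frac{d^2x}{dy^2}\right)^2 \cdot \left(\frac{dy}{dx}\right)^5$$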

These formulas are generalized by Faà di Bruno's formula.

These formulas can also be written using Lagrange's notation. If $f$ and $g$ are inverses, then

$$g''(x) = \frac{-f''(g(x))}{[f'(g(x))]^3}$$
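
The second-derivative formula in Lagrange's notation can be spot-checked on a pair of inverses with known closed forms. The sketch below assumes $f = \sinh$ and $g = \operatorname{asinh}$ (so $f' = \cosh$ and $f'' = \sinh$), for which $g''(x) = -x / (1 + x^2)^{3/2}$. The sample point and helper name are illustrative choices.

```python
import math


def second_difference(g, t, h=1e-4):
    """Symmetric finite-difference estimate of g''(t)."""
    return (g(t + h) - 2 * g(t) + g(t - h)) / (h * h)


x = 0.7  # arbitrary sample point
gx = math.asinh(x)

# -f''(g(x)) / [f'(g(x))]^3 with f = sinh, f' = cosh, f'' = sinh
formula = -math.sinh(gx) / math.cosh(gx) ** 3

# Direct estimate of g''(x) and the known closed form for asinh
numeric = second_difference(math.asinh, x)
exact = -x / (1 + x * x) ** 1.5

print(formula, numeric, exact)
```

The finite-difference value is only accurate to a few digits because second differences amplify rounding error, so the comparison with `numeric` is looser than the comparison with `exact`.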

Example

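  • $y = e^x$ has the inverse $x = \ln y$. Using the formula for the second derivative of the inverse function,
$$\frac{dy}{dx} = \frac{d^2y}{dx^2} = e^x = y \quad ; \quad \left(\frac{dy}{dx}\right)^3 = y^3;$$

so that

$$\frac{d^2x}{dy^2} \cdot y^3 + y = 0 \quad ; \quad \frac{d^2x}{dy^2} = -\frac{1}{y^2},$$

which agrees with the direct calculation.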

References