
1. Functions

Definition: A function $f: X \to Y$ between sets $X$ and $Y$ is a map that sends each element of $X$ to a unique element of $Y$

Given a function

$$f: X \to Y$$

The Domain of $f$ is $X$
The Range of $f$ is the subset $R$ of $Y$ containing all elements $f(x)$:

$$R = \{f(x) \mid x \in X\}$$

Notation on intervals

For $a < b \in \mathbb{R}$, write $[a,b]$ for the closed interval from $a$ to $b$, and write $(a,b)$ for the open interval from $a$ to $b$

$$[a,b] = \{x \in \mathbb{R} \mid a \le x \le b\} \qquad (a,b) = \{x \in \mathbb{R} \mid a < x < b\}$$
$$[a,b) = \{x \in \mathbb{R} \mid a \le x < b\} \qquad (a,b] = \{x \in \mathbb{R} \mid a < x \le b\}$$

Special functions

Definition A function $f: X \to Y$ is injective if $f(x_1) = f(x_2) \implies x_1 = x_2$; it is surjective if $\forall y \in Y, \exists x \in X$ s.t. $f(x) = y$; and it is bijective if it is both injective and surjective

Proving bijection and finding inverses of functions

  1. Specify domain and codomain

Eg: $f: \mathbb{R} \to \mathbb{R}$

$$f(x) = 2x + 1$$

  2. First, we need to show that this is an injective function

We need to show: if $f(x_1) = f(x_2)$ then $x_1 = x_2$

$$f(x_1) = f(x_2)$$
$$2x_1 + 1 = 2x_2 + 1$$
$$2x_1 = 2x_2$$
$$x_1 = x_2$$

Therefore, f is injective

  3. Next, we need to prove that it is surjective

We need to show that: $\forall y \in \mathbb{R}, \exists x \in \mathbb{R}$ such that $f(x) = y$

Let $y \in \mathbb{R}$

We want to solve $f(x) = y$ to find the input $x$

$$y = 2x + 1 \implies x = \frac{y-1}{2}$$

Since $\frac{y-1}{2} \in \mathbb{R}$, such an $x$ always exists.

Therefore, $f$ is surjective

Since f is both injective and surjective, it is bijective

Inverse functions

Inverse functions only exist if a function $f$ is bijective
Let $f: X \to Y$ be a function

Definition We say $g: Y \to X$ is the inverse of $f$ if:

$$f \circ g = I_Y \qquad g \circ f = I_X$$

where $I$ is the identity function. This means:

$$\forall y \in Y: f(g(y)) = y$$
$$\forall x \in X: g(f(x)) = x$$

We then write $g = f^{-1}$, so $f(f^{-1}(y)) = y\ \forall y \in Y$ and $f^{-1}(f(x)) = x\ \forall x \in X$
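A quick numeric sanity check (mine, not from the lectures): for the worked example $f(x) = 2x + 1$, the inverse found earlier was $g(y) = \frac{y-1}{2}$, and we can confirm both composition identities on sample points.

```python
# Check that g(y) = (y - 1)/2 inverts f(x) = 2x + 1 on sample points.
def f(x):
    return 2 * x + 1

def g(y):
    return (y - 1) / 2

# g(f(x)) = x and f(g(y)) = y, matching g = f^{-1}
for v in [-3.0, 0.0, 1.5, 10.0]:
    assert g(f(v)) == v
    assert f(g(v)) == v
```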

We can restrict the domain and range of functions (eg: trig functions) to give them an inverse

$f: \mathbb{R} \to \mathbb{R},\ f(x) = \sin(x)$ is not a bijection

But… $f: \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \to [-1, 1],\ f(x) = \sin(x)$ is a bijection, so an inverse does exist

Exponentials And logs

Definition of exponential

$$f: \mathbb{R} \to (0, \infty), \quad f(x) = a^x$$

Properties:

$$a^0 = 1 \qquad a^{-x} = \frac{1}{a^x} \qquad a^x a^y = a^{x+y}\ \ \forall x,y \in \mathbb{R}$$
$$(a^x)^y = a^{xy}\ \ \forall x,y \in \mathbb{R} \qquad (ab)^x = a^x b^x\ \ \forall a,b \in (0,\infty)$$

$a^x$ is also a bijection, so it has an inverse function

Definition The logarithm with base $a$ is the function:

$$\log_a: (0, \infty) \to \mathbb{R}$$

Defined by $\log_a(x) = y$ iff $a^y = x$

$$\log_a(a^x) = x \qquad a^{\log_a x} = x$$

Properties:

$$\log_a(1) = 0 \qquad \log_a\left(\frac{1}{x}\right) = -\log_a(x) \qquad \log_a(xy) = \log_a x + \log_a y$$
$$\log_a(x^y) = y \log_a(x) \qquad \log_a(x) = \frac{\log_b(x)}{\log_b(a)}$$

Proof for property 4 - power property

$$\log_a x^y = y \log_a x$$

Let $c = \log_a x$
Then $x = a^c$
Raise both sides to the power of $y$

$$x^y = (a^c)^y = a^{cy}$$

Take logs of both sides and we have

$$\log_a x^y = \log_a a^{cy}$$

Simplify

$$\log_a x^y = cy$$

Substitute back in for $c$ and we have

$$\log_a x^y = y \log_a x$$
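A numeric spot-check of the power property (my addition, with $a = 2$, $x = 5$, $y = 3$ as arbitrary sample values):

```python
import math

# Check log_a(x^y) = y * log_a(x) numerically for a = 2, x = 5, y = 3.
a, x, y = 2.0, 5.0, 3.0
lhs = math.log(x ** y, a)   # log_2(125)
rhs = y * math.log(x, a)    # 3 * log_2(5)
assert abs(lhs - rhs) < 1e-9
```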

Even functions

Definition A function is even if:

$$f(-x) = f(x), \quad \forall x \text{ in the domain of } f$$

Odd function

Definition A function is odd if:

$$f(-x) = -f(x), \quad \forall x \text{ in the domain of } f$$

2. Limits

Informal intuition

Informally, $$\lim_{x \to a} f(x) = L $$ if f(x) approaches L as x approaches a

Formal definition

Definition

Let $I$ be an open interval containing $a$, and let $f$ be a function defined on $I$, except maybe not at $a$. The limit of $f(x)$, as $x$ approaches $a$, is $L$, given by:

$$\lim_{x \to a} f(x) = L \iff \forall \epsilon > 0,\ \exists \delta > 0 \text{ s.t. } 0 < |x - a| < \delta \implies |f(x) - L| < \epsilon$$

The definition above reads: for any given $\epsilon > 0$, there exists $\delta > 0$ such that for all $x \ne a$, if $x$ is within $\delta$ units of $a$ but not exactly equal to $a$ (limits describe approaching a point), then $f(x)$ will be within $\epsilon$ of $L$

Note the order in which $\epsilon$ and $\delta$ are given. Think of $\epsilon$ as the y-tolerance; the limit will exist if we can find an x-tolerance $\delta$ that works.

For some intuition, think of it this way:

If we can find a value for $\epsilon$ for which there is no corresponding $\delta$, then the limit does not exist

Example proof:

Consider the claim: $\lim_{x \to 2}(2x + 1) = 5$

First, we must determine a value for $\delta$ to work with. To find it, we begin with the final statement in our proof and work backwards:

$$|f(x) - L| < \epsilon \implies |(2x+1) - 5| < \epsilon \implies |2x - 4| < \epsilon$$

We can factor out the 2 to get:

$$2|x - 2| < \epsilon$$

As we want this to be $< \epsilon$, this will only be true if:

$$|x - 2| < \frac{\epsilon}{2}, \text{ so take } \delta = \frac{\epsilon}{2}$$

Now that we have a value for δ, we can begin our proof

Our goal is to show that for every $\epsilon > 0$, there exists a $\delta > 0$ such that if $0 < |x - 2| < \delta$, then $|(2x+1) - 5| < \epsilon$

Suppose $\epsilon > 0$ has been provided
We then define $\delta = \epsilon/2$
Since $\epsilon > 0$, we have $\delta > 0$

Starting with the assumption, we have

$$0 < |x - 2| < \delta$$

Then after substituting in for $\delta$

$$0 < |x - 2| < \frac{\epsilon}{2}$$

Then multiply both sides by 2:

$$2|x - 2| < \epsilon$$

But we know that

$$2|x - 2| = |f(x) - L|$$

So we have shown that:

$$|f(x) - L| < \epsilon$$
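As a sanity check (my addition, not part of the proof): for $f(x) = 2x+1$, $a = 2$, $L = 5$, the choice $\delta = \epsilon/2$ really does keep $f(x)$ within $\epsilon$ of $L$ for sampled points inside the $\delta$-window.

```python
# Sample points x with 0 < |x - 2| < delta and confirm |f(x) - 5| < eps.
def works(eps):
    delta = eps / 2
    for k in range(1, 1000):
        x = 2 + delta * k / 1000  # points inside (2, 2 + delta)
        if abs((2 * x + 1) - 5) >= eps:
            return False
    return True

assert all(works(e) for e in [1.0, 0.1, 0.001])
```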

Properties of Limits

The triangle inequality

$\forall a, b, c \in \mathbb{R}$

$$|a + b| \le |a| + |b|$$

The proof of the inequality below is covered in Linear Algebra I, which uses the Cauchy-Schwarz theorem.

$$|a - b| \le |a - c| + |c - b|$$

To think of this intuitively, the magnitude of the sum is always at most the sum of the magnitudes. In terms of vectors, a direct path is never longer than a detour through a third point.

Uniqueness of limits + proof

$$\lim_{x \to a} f(x) = L \text{ and } \lim_{x \to a} f(x) = M \implies L = M$$

We're essentially trying to show that a function cannot have two different limits at the same point. So the goal is to prove that if a limit exists, it's unique

If $f(x)$ is within an $\epsilon$ band of $L$, and also within an $\epsilon$ band of $M$, then those bands must overlap, and therefore $L$ must equal $M$.

Proof strategy (in words):

To show that $L = M$, we want to prove that the distance between the two satisfies $|L - M| < \epsilon$, $\forall \epsilon > 0$

Proof - the epsilon-delta version

So, if the limits $L \ne M$, that means there is some positive distance $d$ that separates them:

$$d = |L - M| > 0$$

Next, we want to pick an $\epsilon$ that creates a gap between the two bands so that they don't overlap, as that's the only way we can have $f(x)$ not lying in both:

$$\epsilon = \frac{d}{2} = \frac{|L - M|}{2}$$

And now we can use the definition of limit
For the first limit, we have:

$$\lim_{x \to a} f(x) = L \iff \forall \epsilon > 0, \exists \delta_1 > 0 \text{ s.t. } 0 < |x - a| < \delta_1 \implies |f(x) - L| < \epsilon$$

And for the second limit:

$$\lim_{x \to a} f(x) = M \iff \forall \epsilon > 0, \exists \delta_2 > 0 \text{ s.t. } 0 < |x - a| < \delta_2 \implies |f(x) - M| < \epsilon$$

Now, we take $\delta$ to be the smaller of the two so that any such $x$ is inside both $\delta$ intervals

$$\delta = \min(\delta_1, \delta_2)$$

That means that whenever $0 < |x - a| < \delta$, both conditions hold at once. So for any $x$ close enough to $a$, $f(x)$ is within $\epsilon$ of both $L$ and $M$.

So $|f(x) - L| < \epsilon$ and $|f(x) - M| < \epsilon$

And now we can use the triangle inequality, where we add and subtract $f(x)$. We want to go from something we know about $f(x)$ to something relating to the distance between $L$ and $M$.

$$|L - M| = |L - f(x) + f(x) - M|$$

And by the triangle inequality:

$$|L - M| \le |L - f(x)| + |f(x) - M|$$

Given what we already know about those terms:

$$|L - M| < \epsilon + \epsilon = 2\epsilon$$

Given our choice of $\epsilon$ above, we have:

$$|L - M| < 2\epsilon = 2 \times \frac{|L - M|}{2} = |L - M|$$

But this is a contradiction, as we can't have a number that's strictly less than itself. This means our assumption $L \ne M$ was wrong, and therefore $L = M$

Arithmetic of Limits

Suppose that limxaf(x)=L and limxag(x)=M, then the following properties hold

$$\lim_{x \to a}(f(x) + g(x)) = L + M \qquad \lim_{x \to a}(f(x) - g(x)) = L - M$$
$$\lim_{x \to a} f(x)g(x) = LM \qquad \lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L}{M},\ M \ne 0$$
$$\lim_{x \to a} c f(x) = cL \qquad f(x) \le g(x) \text{ near } a \implies L \le M$$

Proof for the sum of limits

Given $\lim_{x \to a} f(x) = L$, we know that

$$\forall \epsilon > 0, \exists \delta_1 > 0 \text{ s.t. } 0 < |x - a| < \delta_1 \implies |f(x) - L| < \epsilon$$

And given the other limit, we also know that:

$$\forall \epsilon > 0, \exists \delta_2 > 0 \text{ s.t. } 0 < |x - a| < \delta_2 \implies |g(x) - M| < \epsilon$$

This proof will be very similar to the uniqueness of limits in terms of algebraic manipulation. So once again we pick $\delta = \min(\delta_1, \delta_2)$

What we're trying to prove is $\lim_{x \to a}(f(x) + g(x)) = L + M$; in other words, $f(x) + g(x)$ is really close to $L + M$, so the distance between them is smaller than any given $\epsilon$. So the expression below is what we're trying to prove:

$$|f(x) + g(x) - (L + M)| < \epsilon$$

We can't control $f(x) + g(x)$ as a whole, but we can control how far $f(x)$ is from $L$ and how far $g(x)$ is from $M$. So we rewrite the expression to get:

$$|f(x) + g(x) - (L + M)| = |(f(x) - L) + (g(x) - M)|$$

And now we can apply the triangle inequality to get:

$$|(f(x) - L) + (g(x) - M)| \le |f(x) - L| + |g(x) - M|$$

Now, because we want the whole expression to be $< \epsilon$, we apply each limit definition with $\frac{\epsilon}{2}$ in place of $\epsilon$ so that each term is less than half of $\epsilon$:

$$|f(x) - L| < \frac{\epsilon}{2} \qquad |g(x) - M| < \frac{\epsilon}{2}$$

After adding the expressions above, we have shown that

$$|f(x) - L| + |g(x) - M| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon$$

A similar method can be used to prove the other properties in the arithmetic of limits

Evaluating polynomials

Let $p(x)$ and $q(x)$ be polynomials and $a \in \mathbb{R}$ s.t. $q(a) \ne 0$. Then

$$\lim_{x \to a} \frac{p(x)}{q(x)} = \frac{p(a)}{q(a)}$$

The proof is built from the limit properties in the previous section, so we don't need to do any $\epsilon$-$\delta$ proofs; we can assume the laws of arithmetic for limits

Squeeze theorem + proof

Imagine we have three functions:

$$f(x) \le g(x) \le h(x), \quad \forall x \text{ near } a \text{ except maybe at } a \text{ itself}$$

If both the outer functions $f$ and $h$ "squeeze" towards the same limit $L$ as $x \to a$, then the middle function $g$ will also tend towards $L$.

We can use the inequality property to prove this theorem.
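A classic illustration of the squeeze (my example, not from the notes): $g(x) = x^2 \sin(1/x)$ satisfies $-x^2 \le g(x) \le x^2$ near 0, so $g(x) \to 0$ as $x \to 0$ even though $\sin(1/x)$ oscillates wildly.

```python
import math

# g(x) = x^2 sin(1/x) is squeezed between -x^2 and x^2 near 0.
for x in [0.1, 0.01, 0.001]:
    g = x ** 2 * math.sin(1 / x)
    assert -x ** 2 <= g <= x ** 2        # the squeeze holds
assert abs((1e-6) ** 2 * math.sin(1e6)) < 1e-9  # g is tiny near 0
```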

One-sided limits

Sometimes, we only care about how a function behaves as x approaches a point from either the left or the right.

Let $f$ be a function defined on an interval around $a$

The right hand limit (approaching from values greater than $a$) is defined as:

$$\lim_{x \to a^+} f(x) = L \iff \forall \epsilon > 0, \exists \delta > 0 \text{ s.t. } 0 < x - a < \delta \implies |f(x) - L| < \epsilon$$

And the left hand limit (approaching from values smaller than $a$) is defined as:

$$\lim_{x \to a^-} f(x) = L \iff \forall \epsilon > 0, \exists \delta > 0 \text{ s.t. } 0 < a - x < \delta \implies |f(x) - L| < \epsilon$$

If $\lim_{x \to a^+} f(x) \ne \lim_{x \to a^-} f(x)$, then the limit does not exist

Notice how in the one-sided limit there are no absolute value signs. This is because, approaching from the right hand side, the distance $x - a$ is already positive. If we wrote $0 < |x - a| < \delta$ for one-sided limits, it would allow $x$ to approach from both directions, which is not what we want.

Limits at Infinity

Now, instead of approaching a finite point a, we can consider what happens as x grows without bound or decreases without bound

$$\lim_{x \to \infty} f(x) = L \iff \forall \epsilon > 0, \exists N > 0 \text{ s.t. } x > N \implies |f(x) - L| < \epsilon$$

Similarly,

$$\lim_{x \to -\infty} f(x) = L \iff \forall \epsilon > 0, \exists N > 0 \text{ s.t. } x < -N \implies |f(x) - L| < \epsilon$$

For a limit to exist, $\lim_{x \to a} f(x) = L$ means that the value of $f(x)$ gets arbitrarily close to a finite number $L$; if no such finite $L$ exists, then the limit does not exist.
So the statement $\lim_{x \to \infty} f(x) = \infty$ just means that $f(x)$ grows without bound as $x$ grows without bound.

3. Continuity

Continuous Functions

Intuitively, continuous functions are those that we can draw without lifting the pencil off the page.
Definition A function f is continuous at a point a if $$ \lim_{x \to a} f(x) = f(a)$$
A function that is non-continuous is called discontinuous

Think back to limits, limits don’t care about what happens at a point, but continuity does.

Now we want to think about what it means for a function to be continuous on an interval. Just how we can take limits from the left and right, we can do the same for continuity:

Definition A function f is continuous at a on the right if

$$\lim_{x \to a^+} f(x) = f(a)$$

And it’s continuous on the left if $$ \lim_{x \to a^-} f(x) = f(a) $$

Continuity in an interval

Definition A function $f$ is continuous (conts) on $(b,c)$ if it is conts at $a$ $\forall a \in (b,c)$

If we were to restrict it to a half-open or closed interval, then we would have to use one-sided continuity at the endpoints, as we could only come in from the left or the right

Definition A function $f$ is conts on $[b,c]$ if it is conts at $a$ $\forall a \in (b,c)$, conts on the right at $x = b$, and conts on the left at $x = c$

Arithmetic for Continuous functions

The arithmetic for continuous functions mirrors the arithmetic of limits. Continuity boils down to limits; everything in calculus boils down to limits.

Suppose $f$ and $g$ are conts at $x = a$; then $f + g$, $f - g$, $fg$, and (provided $g(a) \ne 0$) $\frac{f}{g}$ are all conts at $a$

We can prove one of these properties below:

As $f$, $g$ are conts at $a$, we have $\lim_{x \to a} f(x) = f(a)$ and $\lim_{x \to a} g(x) = g(a)$

So by the arithmetic of limits and the defn of continuity we have:

$$\lim_{x \to a}(f + g)(x) = \lim_{x \to a}(f(x) + g(x)) = f(a) + g(a) = (f + g)(a)$$

As we did for limits, we can also have continuity for polynomials and composite functions:

We have that any polynomial $p(x)$ is continuous at $a$ $\forall a \in \mathbb{R}$, and any function $\frac{p(x)}{q(x)}$ is continuous at $a$ provided that $q(a) \ne 0$

Now for the composition: if $g$ is conts at $a$, and if $f$ is conts at $g(a)$, then $f \circ g$ is conts at $a$

For the proof of the above, we want to think about the defn of composition, so

$$(f \circ g)(a) = f(g(a))$$

So we want $g$ to be conts at $a$, and then if $f$ is conts at $g(a)$, we have that the composition is conts at $a$. The actual proof is said to be more confusing than enlightening

Intermediate Value Theorem (IVT)

The intuitive idea behind this is, when we have two points connected by a continuous curve with one point below the line and another point above the line, then there is at least one place where the curve crosses the line.

Definition More formally, if a function $f$ is continuous on the interval $[a,b]$ and $K$ is a number such that $f(a) < K < f(b)$ (or $f(b) < K < f(a)$), then $\exists c \in (a,b)$ such that $f(c) = K$.

IVT is useful for a number of reasons, one of the applications is to find zeroes of a function, especially when we cannot do so by factoring

Consider the example below:
Show that sinx+2cosxx2=0 has a solution

At the moment this is an equation, so let's turn it into a function
Let $f(x) = \sin x + 2\cos x - x^2$, and now we want to show that $f(c) = 0$ for some $c$

We know that $f$ is conts because it is the sum of conts functions, by the arithmetic of conts functions.

And now we want to pick two points

$$f(0) = 2 > 0 \qquad f(\pi/2) = 1 - (\pi/2)^2 < 0$$

So by the IVT, there is a $c \in (0, \pi/2)$ s.t. $f(c) = 0$
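A bisection sketch (my addition): the IVT guarantees a root of $f(x) = \sin x + 2\cos x - x^2$ in $(0, \pi/2)$, and repeatedly halving the interval traps it numerically.

```python
import math

# Bisection: f changes sign on (0, pi/2), so the IVT gives a root there.
def f(x):
    return math.sin(x) + 2 * math.cos(x) - x ** 2

lo, hi = 0.0, math.pi / 2        # f(lo) > 0 > f(hi)
for _ in range(60):
    mid = (lo + hi) / 2
    if f(mid) > 0:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2
assert abs(f(c)) < 1e-9
```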

Proof of the IVT (not something we’ll be tested on)

The IVT essentially says that if $f: [a,b] \to \mathbb{R}$ is conts, and $K$ is such that $f(a) < K < f(b)$, then there exists at least one $c \in (a,b)$ such that $f(c) = K$

And now, we want to prove that such a $c$ exists. So our idea is to search for where $f$ crosses the level $K$ (and this is where bounds will come in), and then use the continuity of $f$ to guarantee that we can trap that crossing point

Define $g(x) = f(x) - K$. We want to show that $\exists c \in [a,b]$ with $g(c) = 0$

First step is to define a set, so let

$$S = \{x \in [a,b] \mid g(x) \le 0\}$$

This means $S$ collects all points where $f(x) \le K$

Some goofy algebra proof

Extreme Value Theorem

Maximum and Minimum

Before we talk about maximum and minimum, we need to define the terms lower and upper bounds

Definition A fn $f$ on an interval $I$ has an upper bound $M$ if $f(x) \le M$ $\forall x \in I$

Definition $f$ has a lower bound $N$ if $f(x) \ge N$ $\forall x \in I$

Definition Let $f$ be a fn on an interval $I$. Then $f$ has a maximum $M$ if $M$ is an upper bound and there is a $c \in I$ s.t. $f(c) = M$

$f$ has a minimum $N$ if $N$ is a lower bound and there is a $d \in I$ s.t. $f(d) = N$

And now we can construct the definition for the extreme value theorem:

Definition
Suppose that $f(x)$ is continuous on the interval $[a,b]$; then there are two numbers $c, d \in [a,b]$ such that $f(c)$ is an absolute maximum for the function and $f(d)$ is an absolute minimum for the function.

So if we have a continuous function on a closed interval, then we are guaranteed to have both an absolute maximum and an absolute minimum somewhere in the interval. The theorem doesn't tell us where they will occur or if they occur more than once, but we do know that they exist somewhere.


4. Differentiation

At its core, differentiation measures how fast something is changing: the rate of change. If $f(x)$ is a curve, then its derivative at a point $a$ is just the slope of the tangent line to the curve at that point.

To generalise this into a definition, take two points on a function $f(x)$, at $x$ and $x + h$

The average rate of change (i.e. slope of the secant line) is given by:

$$\frac{f(x+h) - f(x)}{(x+h) - x} = \frac{f(x+h) - f(x)}{h}$$

Now if we let $h \to 0$, then the secant line becomes a tangent, and the limit gives us the formal definition of the derivative:

$$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$

Example with $\sqrt{x}$

Not all functions are differentiable on their whole domain. Consider the function

$$f(x) = \sqrt{x}$$

The domain of the function is $[0, \infty)$

But the derivative is not defined at 0, as derivatives are based on limits, and the left-hand limit $h \to 0^-$ does not exist

Differentiability vs Continuity

A differentiable function is continuous. But a continuous function doesn't have to be differentiable

We just need a counterexample to prove the second statement above false. Consider the following function which is continuous:

$$f(x) = |x| = \begin{cases} x & x \in [0, \infty) \\ -x & x \in (-\infty, 0) \end{cases}$$

We can observe the following for the gradient of the tangent line (derivative) at 0:

$$\lim_{h \to 0^+} \frac{f(0+h) - f(0)}{h} = 1 \qquad \lim_{h \to 0^-} \frac{f(0+h) - f(0)}{h} = -1$$

The limits from the left and right are not equal, so the limit does not exist, so the function is not differentiable at 0. Geometrically, at $x = 0$, the gradient changes abruptly from $-1$ to $1$

Therefore, $f(x) = |x|$ is conts. on $\mathbb{R}$ and differentiable on $(-\infty, 0) \cup (0, \infty)$
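The one-sided difference quotients can be computed directly (my check): for $f(x) = |x|$ at 0, the right-hand quotient is exactly $+1$ and the left-hand quotient exactly $-1$ for every $h$, so the two one-sided limits disagree.

```python
# One-sided difference quotients of f(x) = |x| at 0.
f = abs
for h in [0.1, 0.01, 1e-6]:
    assert (f(0 + h) - f(0)) / h == 1.0       # h -> 0+ gives +1
    assert (f(0 - h) - f(0)) / (-h) == -1.0   # h -> 0- gives -1
```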

Proof: differentiability continuity

As $f$ is differentiable, we can use the definition to say that:

$$f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \text{ exists}$$

Now if $f$ is differentiable at some point $x = a$, then:

$$f'(a) = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h}$$

And based on the defn. of continuity, we want to show that $$\lim_{x \to a} f(x) = f(a)$$
So let $h = x - a \implies x = a + h$
By the arithmetic of limits we have:

$$\lim_{x \to a} f(x) - f(a) = \lim_{h \to 0} [f(a+h) - f(a)]$$

We multiply and divide by $h$ to get:

$$f(a+h) - f(a) = h \cdot \frac{f(a+h) - f(a)}{h}$$

And now when we take limits:

$$\lim_{h \to 0} [f(a+h) - f(a)] = \lim_{h \to 0} h \cdot \frac{f(a+h) - f(a)}{h}$$

Since we know that the derivative exists, we have by the arithmetic of limits:

$$\lim_{h \to 0} \frac{f(a+h) - f(a)}{h} = f'(a) \text{ and } \lim_{h \to 0} h = 0$$

Therefore,

$$\lim_{h \to 0} [f(a+h) - f(a)] = 0$$

So as $$\lim_{x \to a} f(x) - f(a) = 0 \implies \lim_{x \to a} f(x) = f(a)$$

Arithmetic of Differentiable functions

As with functions and limits and continuity, there is arithmetic of differentiable functions:

$$(f + g)'(x) = f'(x) + g'(x) \qquad (f - g)'(x) = f'(x) - g'(x)$$

Next up is the Product Rule

$$(fg)'(x) = f'(x)g(x) + f(x)g'(x)$$

And the Quotient rule

$$\left(\frac{f}{g}\right)'(x) = \frac{f'(x)g(x) - f(x)g'(x)}{(g(x))^2}$$

Proof of the Product rule

To be done by future me, so I’ll just add a useful link here

Once we have this, we can use the product rule to prove the quotient rule by writing the quotient as $(f \cdot g^{-1})(x)$

Power rule

$$f(x) = x^n \implies f'(x) = nx^{n-1}, \quad n \ge 1$$

This can be proved by induction and the product rule
Base case is $n = 1$: $(x)' = 1 = 1 \cdot x^0$

Now assume that the rule holds for some $n = k$
So $(x^k)' = kx^{k-1}$

Observe that $$x^{k+1} = x^k \cdot x$$
So we can apply the product rule to get:

$$(x^{k+1})' = (x^k)'x + x^k(x)'$$

From our inductive hypothesis we know that:

$$(x^k)' = kx^{k-1}$$

and that $(x)' = 1$

So

$$(x^{k+1})' = (kx^{k-1})x + x^k(1) = kx^k + x^k = (k+1)x^k$$

So the rule holds for $n = k + 1$

If true for $n = k$, it's also true for $n = k + 1$. As it is true for $n = 1$, by mathematical induction it holds $\forall n \in \mathbb{Z}^+$

Differentiating polynomials

If

$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0$$

Then

$$p'(x) = a_n n x^{n-1} + a_{n-1}(n-1) x^{n-2} + \dots + a_1$$
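A quick check of term-by-term differentiation against a central difference quotient, using a sample polynomial of my choosing:

```python
# p(x) = 3x^3 + 2x^2 - 5x + 7, so p'(x) = 9x^2 + 4x - 5 by the power rule.
def p(x):
    return 3 * x ** 3 + 2 * x ** 2 - 5 * x + 7

def dp(x):
    return 9 * x ** 2 + 4 * x - 5

h = 1e-6
for x in [-2.0, 0.0, 1.5]:
    numeric = (p(x + h) - p(x - h)) / (2 * h)  # central difference
    assert abs(numeric - dp(x)) < 1e-4
```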

Derivative as a Rate of Change

There is more than one way to think of what a derivative is for a function $f(x)$.
It can be viewed as the gradient of the tangent line to the curve $f(x)$ at $x$.

Or, as a rate of change, i.e. the change in $f$ values in proportion to the change in $x$ values, namely

$$\frac{f(x+h) - f(x)}{h}$$

If we take limits, then we get the instantaneous rate of change:

$$\lim_{h \to 0} \frac{f(x+h) - f(x)}{h} = f'(x)$$

In this course, we will be using Leibniz notation for the derivative, which comes from the rate-of-change view

Given a fn $y = f(x)$, then

$$\frac{dy}{dx} = f'(x)$$

Which is basically the rate of change of $y$ values in proportion to the change of $x$ values

Chain rule

Essentially a method for taking the derivative of a composite function

Let $y = f \circ g$. Then $y = f(u)$ where $u = g(x)$

So the rate of change of $y$ with respect to $x$ depends on the rate of change of $y$ with respect to $u$ and the rate of change of $u$ with respect to $x$.

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}$$

Going back to our definition for a composite function.

We have $y = f \circ g$, so $y = f(g(x))$

$$(f \circ g)'(x) = f'(g(x)) \cdot g'(x)$$
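A numeric chain-rule check (my example): for $y = \sin(x^2)$, with $f = \sin$ and $g(x) = x^2$, the rule gives $y' = \cos(x^2) \cdot 2x$, which matches a difference quotient.

```python
import math

# Chain rule for y = sin(x^2): y'(x) = cos(x^2) * 2x.
def y(x):
    return math.sin(x ** 2)

def dy(x):
    return math.cos(x ** 2) * 2 * x

h = 1e-6
for x in [0.5, 1.0, 2.0]:
    numeric = (y(x + h) - y(x - h)) / (2 * h)  # central difference
    assert abs(numeric - dy(x)) < 1e-5
```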

Trig functions - key properties

$$\frac{d}{dx}(\sin x) = \cos x \qquad \frac{d}{dx}(\cos x) = -\sin x \qquad \frac{d}{dx}(\tan x) = \sec^2 x$$
$$\frac{d}{dx}(\sec x) = \sec x \tan x \qquad \frac{d}{dx}(\csc x) = -\csc x \cot x \qquad \frac{d}{dx}(\cot x) = -\csc^2 x$$

Proof of $(\sin x)' = \cos x$ (mostly left to the reader)

We need to assume two lemmas here:

$$\lim_{x \to 0} \frac{\sin x}{x} = 1 \qquad \lim_{x \to 0} \frac{\cos x - 1}{x} = 0$$

Using these lemmas and the compound angle formula for $\sin(x + h)$, the rest of the proof is left as an exercise to the reader

$$(\sin x)' = \lim_{h \to 0} \frac{\sin(x+h) - \sin x}{h} = \cos(x)$$

Exponentials and Logarithmic Functions

From first principles (see the typed notes):

$$(a^x)' = a^x \cdot f'(0)$$

Where $f'(0)$ is a constant. There is a unique $a$ with $f'(0) = 1$: this is when $a = e$

Lemma: $$(\ln x)' = \frac{1}{x}$$
As $\ln x$ and $e^x$ are inverse functions:

$$e^{\ln x} = x$$

And now we can differentiate both sides using the chain rule:

$$e^{\ln x} \cdot \frac{d}{dx}(\ln x) = 1 \implies x \cdot \frac{d}{dx}(\ln x) = 1 \implies \frac{d}{dx}(\ln x) = \frac{1}{x}$$

So $$(a^x)' = a^x \cdot \ln a$$

$$(\log_a x)' = \frac{1}{x \ln a}$$

Hyperbolic Trig Functions

$$\sinh(x) = \frac{e^x - e^{-x}}{2}$$

Some properties of sinh(x):

$$\cosh(x) = \frac{e^x + e^{-x}}{2}$$

Some properties of cosh(x):

Notice how the trig identity $\cos^2 x + \sin^2 x = 1$ means the trig functions lie on the circle $x^2 + y^2 = 1$. Similarly,

$$\cosh^2 x - \sinh^2 x = 1$$

means the hyperbolic trig functions lie on the hyperbola $x^2 - y^2 = 1$

Some other key derivatives; they can be proved by writing the hyperbolic functions in terms of exponentials

$$(\sinh x)' = \cosh x \qquad (\cosh x)' = \sinh x \qquad \tanh x = \frac{\sinh x}{\cosh x}$$

Implicit Differentiation

When we have functions that are in terms of both $x$ and $y$, eg:

$$y^3 - x^2 = 3$$

We can't solve explicitly for $y$, as the function is defined implicitly. So to differentiate, we use the chain rule:

$$3y^2 y' - 2x = 0 \implies y' = \frac{2x}{3y^2}$$

One of the things we can do with implicit differentiation is prove the derivative rule for rational powers:

$$\left(x^{\frac{p}{q}}\right)' = \frac{p}{q} x^{\frac{p}{q} - 1}$$

Inverse Trig Functions

We need to restrict the trig functions so that they are bijections and we can find their inverses

$$\sin x: \left[-\frac{\pi}{2}, \frac{\pi}{2}\right] \to [-1, 1] \qquad \sin^{-1} x: [-1, 1] \to \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$$

Then to find $(\sin^{-1} x)'$:

Let $y = \sin^{-1} x \implies \sin y = x$

And then we can differentiate implicitly:

$$\cos y \cdot y' = 1 \implies y' = \frac{1}{\cos y}$$

We know by the trig identity $\cos^2 y + \sin^2 y = 1$ that $\cos y = \sqrt{1 - \sin^2 y}$

Hence, after substituting, we get the derivatives of the inverse trig functions:

$$(\sin^{-1} x)' = \frac{1}{\sqrt{1 - x^2}} \qquad (\cos^{-1} x)' = -\frac{1}{\sqrt{1 - x^2}} \qquad (\tan^{-1} x)' = \frac{1}{1 + x^2}$$

For the inverse of $\tan x$, we use the identity $1 + \tan^2 x = \sec^2 x$

5. Curve Sketching

Let $f$ be a differentiable function; then a point $c$ is called a critical point if $f'(c) = 0$

Definition $f$ has a local minimum at $x_0$ in an interval $I$ if $f(x_0) \le f(x)$ $\forall x \in I$

Definition $f$ has a local maximum at $x_1$ in an interval $I$ if $f(x_1) \ge f(x)$ $\forall x \in I$

Maximum and Minimums

Proof of local minimum and maximum

So if $f$ is differentiable on $(a,b)$ and has a local max/local min at a point $c$, then $f'(c) = 0$

We can prove the above by considering the limit definition at the point $c$

Suppose that $c$ is a local max on $(a,b)$

This means by definition, there exists some small interval around $c$ such that:

$$f(x) \le f(c) \quad \forall x \text{ near } c$$

Now we pick an $h$ value that is positive and lands in the interval

Let $h > 0$ s.t. $c + h \in (a,b)$ so that the value is still defined

Now $c$ is a local maximum, so

$$f(c) \ge f(c+h) \implies f(c+h) - f(c) \le 0$$

After dividing both sides by $h > 0$ we have:

$$\frac{f(c+h) - f(c)}{h} \le 0$$

And now we can take the limit

$$\lim_{h \to 0^+} \frac{f(c+h) - f(c)}{h} \le 0$$

As $f$ is differentiable, the limit from the left equals the limit from the right, so this equals $f'(c)$

Now we let $h < 0$; we still have $f(c+h) - f(c) \le 0$, but dividing by the negative $h$ flips the inequality:

$$\lim_{h \to 0^-} \frac{f(c+h) - f(c)}{h} \ge 0$$

Once again, as $f$ is differentiable, the left and right hand limits match, so this also equals $f'(c)$

Now we have that $0 \le f'(c) \le 0$, hence $f'(c) = 0$

Global minimum and maximum

Definition A fn $f$ has an absolute (global) maximum at $x_0$ if $f(x_0) \ge f(x)$ $\forall x$ in the domain of $f$
Definition A fn $f$ has an absolute (global) minimum at $x_1$ if $f(x_1) \le f(x)$ $\forall x$ in the domain of $f$

There are also different kinds of local/global minimums and maximums

Definition A point c where f is conts but not differentiable is called a singular point

Extreme values (local max, local min) occur at critical points, singular points, or the endpoints of the interval

Mean Value Theorem (MVT)

This will help us identify where the graph of $f$ increases or decreases

Definition
Let $f$ be a conts fn on $[a,b]$ that is differentiable on $(a,b)$. Then there is a point $c \in (a,b)$ s.t.

$$f'(c) = \frac{f(b) - f(a)}{b - a}$$

Think of the right hand side as the gradient of the secant line joining the two endpoints on the graph. So what this is saying is that there is some point $c \in (a,b)$ at which the gradient of the tangent line equals the gradient of the secant line

Proof of the MVT

The formal proof involves Rolle's Theorem. The idea here is that we want to "subtract off" the secant line so the endpoints are level.

Let $g(x) = f(x) - L(x)$, where $L(x)$ is the equation of the secant line.

The secant line passing through $(a, f(a))$ and $(b, f(b))$ has the gradient:

$$m = \frac{f(b) - f(a)}{b - a}$$

And now we can use the straight line equation to get:

$$y - y_1 = m(x - x_1) \implies y - f(a) = \frac{f(b) - f(a)}{b - a}(x - a)$$

Rearranging for $y = L(x)$, we get:

$$L(x) = f(a) + \frac{f(b) - f(a)}{b - a}(x - a)$$

So we define

$$g(x) = f(x) - L(x)$$

Now, we check $g$ at the endpoints

$$g(a) = f(a) - L(a) = f(a) - f(a) = 0$$
$$g(b) = f(b) - L(b) = f(b) - \left[f(a) + \frac{f(b) - f(a)}{b - a}(b - a)\right] = 0$$

So we have that $g(a) = g(b) = 0$

Rolle's Theorem (which is the special case of the MVT with level endpoints) says that if a fn $g$ is conts on $[a,b]$, differentiable on $(a,b)$, and $g(a) = g(b)$, then $\exists c \in (a,b)$ s.t. $g'(c) = 0$.

What we want is to apply that special case to the general case, and we have done that by removing the secant line.

And as our function $g$ satisfies the hypotheses, $\exists c$ with $g'(c) = 0$

Now, if we differentiate $g(x)$, we have:

$$g'(x) = f'(x) - \frac{f(b) - f(a)}{b - a}$$

And setting $g'(c) = 0$ gives us the MVT:

$$f'(c) = \frac{f(b) - f(a)}{b - a}$$
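A concrete MVT illustration (my example): for $f(x) = x^2$ on $[0, 2]$, the secant slope is $\frac{f(2) - f(0)}{2 - 0} = 2$, and $f'(c) = 2c$ matches it at $c = 1 \in (0, 2)$.

```python
# MVT for f(x) = x^2 on [0, 2]: find c with f'(c) = secant slope.
f = lambda x: x ** 2
a, b = 0.0, 2.0
secant = (f(b) - f(a)) / (b - a)
c = secant / 2            # solve f'(c) = 2c = secant
assert secant == 2.0
assert a < c < b          # c = 1 lies strictly inside (0, 2)
```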

Intuitively, the MVT links average change to instantaneous change.
This will be useful in deriving the Fundamental Theorem of Calculus later in the course

Intervals of Increasing and Decreasing

Definition Let $f$ be a differentiable fn on an open interval $I$, then:

$$f'(x) > 0\ \forall x \in I \implies f \text{ is increasing}$$
$$f'(x) < 0\ \forall x \in I \implies f \text{ is decreasing}$$

The proof of the above uses the MVT:

First, to use the MVT, we assume that $f$ is conts on $[a,b]$ and differentiable on $(a,b)$. Then, we pick any two points in that interval: $x_1, x_2 \in [a,b]$, $x_1 < x_2$

By the MVT, $\exists c \in (x_1, x_2)$ s.t.

$$f'(c) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}$$

Now we consider cases

Case 1 - $f'(x) > 0$ everywhere on $(a,b)$

Then, $\forall c$, $f'(c) > 0$
So by the MVT:

$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c) > 0$$

But as $x_2 - x_1 > 0$, we must have that $f(x_2) - f(x_1) > 0 \implies f(x_2) > f(x_1)$ on $(a,b)$. Therefore, by definition, $f$ is strictly increasing

Case 2 - $f'(x) < 0$ everywhere on $(a,b)$

By the same logic,

$$\forall c,\ f'(c) < 0 \implies \frac{f(x_2) - f(x_1)}{x_2 - x_1} < 0$$

But as $x_2 - x_1 > 0$, we must have that $f(x_2) - f(x_1) < 0 \implies f(x_2) < f(x_1)$ on $(a,b)$. Therefore, by definition, $f$ is strictly decreasing

Naturally, if $f'(x) = 0$ everywhere, then

$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c) = 0 \implies f(x_2) - f(x_1) = 0 \implies f(x_1) = f(x_2)$$

So $f$ would be constant on that interval

Concavity and Points of Inflection

Definition
Let $f$ be a fn that has a second derivative on an open interval $I$. Then $f$ is concave up on $I$ if $f''(x) > 0$ $\forall x \in I$, and concave down on $I$ if $f''(x) < 0$ $\forall x \in I$

Graphically, the tangent lines lie below the curve where $f$ is concave up, and above the curve where $f$ is concave down

Definition
A fn $f$ that has a second derivative has an inflection point at $x$ if the concavity changes at $x$ and $f''(x) = 0$.

So if $f$ is a fn whose second derivative exists on an open interval $I$, then the sign of $f''$ determines the concavity on $I$

Second Derivative Test

We can use $f''$ to determine the local max and min

So let $f$ be a twice differentiable function on some open interval $I$, and suppose for some critical point $c$, $f''(c)$ exists; then:

$$f''(c) > 0 \implies c \text{ is a local minimum}$$
$$f''(c) < 0 \implies c \text{ is a local maximum}$$
$$f''(c) = 0 \implies \text{more information needed}$$

Horizontal and Vertical Asymptotes

Definition A graph has a vertical asymptote at $x = a$ if:

$$\lim_{x \to a^+} f(x) = \pm\infty \text{ or } \lim_{x \to a^-} f(x) = \pm\infty$$

Definition A graph has a horizontal asymptote if

$$\lim_{x \to \infty} f(x) = L \text{ or } \lim_{x \to -\infty} f(x) = L$$

Where L is some finite number

The best way to practice this topic is to do curve sketching questions

6. Integration

Anti-Derivatives

Definition Let $f$ be a function; then a function $g$ is an antiderivative of $f$ if $g$ is differentiable and $g' = f$

Now suppose that $g'(x) = f(x)$. Then $h(x) = g(x) + c \implies h'(x) = f(x)$, where $c$ is a constant. The proof of this is fairly trivial. But what this means is that antiderivatives are not unique: for any antiderivative, we can always add a constant to get another antiderivative

Geometrically speaking, the antiderivative on its own has no meaning, but it will be used in finding integrals.

Definite Integral

Let $f$ be continuous, and suppose we want to find the area $A$ between the curve and the x axis that lies over $[a,b]$

One way to approach this problem is to consider a sequence of rectangles that we can fit below the curve with total area $L$, and a sequence of rectangles above the curve with total area $U$. Then we have the area bounded as $L \le A \le U$. As the curve isn't made up of neat rectangles, we approximate the area by taking a limit as those rectangles get thinner and thinner.

First, we divide our interval $[a,b]$ into segments:

$$[a, x_1], [x_1, x_2], \dots, [x_{n-2}, x_{n-1}], [x_{n-1}, b], \quad a < x_1 < x_2 < \dots < x_{n-1} < b$$

We let $l_i$ be the length of segment $[x_{i-1}, x_i]$. This division is called a partition, and we label it $P$.

Consider restricting $f$ to $[x_{i-1}, x_i]$. On that closed interval $f$ is continuous, so it has a maximum and minimum value by the EVT. Let $M_i$ be that max value, and $m_i$ the min value.

Now we form rectangles on our restricted interval $[x_{i-1}, x_i]$. The lower rectangle has area $l_i m_i$ and the upper rectangle has area $l_i M_i$

When we sum the rectangles, the areas of the upper and lower sums are given as follows:

$$L(f, P) = l_1 m_1 + \dots + l_n m_n \qquad U(f, P) = l_1 M_1 + \dots + l_n M_n$$

Therefore, we know that our area $A$ is bounded by

$$L(f, P) \le A \le U(f, P)$$

Now, if we keep increasing $n$, i.e. make the rectangles thinner, then the lower and upper estimates get closer together, because $f$ can't change much over the tiny intervals. It's a bit like the squeeze theorem.

Definition Let $f$ be a continuous function on the interval $[a,b]$; then the definite integral of $f$ from $a$ to $b$ is the unique number $I$ s.t. $L(f, P) \le I \le U(f, P)$ for all partitions $P$ of $[a,b]$. This is written as:

$$I = \int_a^b f(x)\ dx$$
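The squeeze between lower and upper sums can be seen numerically (my example, equal-width partition): for $f(x) = x^2$ on $[0, 1]$, $f$ is increasing, so each piece's min is at its left endpoint and its max at its right, and both sums close in on $\frac{1}{3}$.

```python
# Lower/upper Riemann sums for f(x) = x^2 on [0, 1] with n equal pieces.
def sums(n):
    l = sum((i / n) ** 2 / n for i in range(n))        # L(f, P): left endpoints
    u = sum(((i + 1) / n) ** 2 / n for i in range(n))  # U(f, P): right endpoints
    return l, u

l, u = sums(100000)
assert l <= 1 / 3 <= u    # the true area is squeezed between the sums
assert u - l < 1e-4       # the gap shrinks as n grows
```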

Properties of definite integrals

There are some nice properties of integrals because of the arithmetic of limits, and how everything boils down to limits

$$\int_a^b (cf(x) + dg(x))\ dx = c\int_a^b f(x)\ dx + d\int_a^b g(x)\ dx$$
$$\int_a^a f(x)\ dx = 0$$
$$\int_a^t f(x)\ dx + \int_t^b f(x)\ dx = \int_a^b f(x)\ dx$$
$$\int_b^a f(x)\ dx = -\int_a^b f(x)\ dx$$
$$f(x) \le g(x)\ \forall x \in [a,b] \implies \int_a^b f(x)\ dx \le \int_a^b g(x)\ dx$$

Fundamental Theorem of Calculus

There is a reason this is called the Fundamental Theorem of Calculus. Not only does it show us the relationship between integration and differentiation, it also guarantees that any continuous function has an antiderivative

Part I

If $f$ is a continuous function over an interval $[a,b]$, and the function $F(x)$ is defined by:

$$F(x) = \int_a^x f(t)\ dt$$

Then $F'(x) = f(x)$ over $[a,b]$

So $F(x)$ is the area under $f$ from $a$ to $x$; it's the area accumulation function.

To show this holds, we apply the definition of the derivative to our function to get:

$$F'(x) = \lim_{h \to 0} \frac{F(x+h) - F(x)}{h} = \lim_{h \to 0} \frac{1}{h}\left[\int_a^{x+h} f(t)\ dt - \int_a^x f(t)\ dt\right]$$

After splitting the integral we have:

$$= \lim_{h \to 0} \frac{1}{h}\left[\int_a^x f(t)\ dt + \int_x^{x+h} f(t)\ dt - \int_a^x f(t)\ dt\right] = \lim_{h \to 0} \frac{1}{h}\int_x^{x+h} f(t)\ dt$$

Now we either estimate this using a lower and upper bound (which is how it's given in the notes), or we can use the Mean Value Theorem for integrals.

Notice that $\frac{1}{h}\int_x^{x+h} f(t)\ dt$ is just the average value of $f$ over the interval $[x, x+h]$. So by the MVT, $\exists c \in [x, x+h]$ s.t.:

$$\frac{1}{h}\int_x^{x+h} f(t)\ dt = f(c)$$

And now we can take limits: since $c$ is between $x$ and $x+h$, we have $c \to x$ as $h \to 0$, and $f$ is continuous, so

$$\lim_{h \to 0} f(c) = \lim_{c \to x} f(c) = f(x)$$

So combining all of this together we have:

$$F'(x) = \lim_{h \to 0} \frac{1}{h}\int_x^{x+h} f(t)\ dt = \lim_{h \to 0} f(c) = f(x)$$

Part II

If $f$ is continuous over the interval $[a,b]$, and $F(x)$ is any antiderivative of $f(x)$, then

$$\int_a^b f(x)\ dx = F(b) - F(a)$$
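A numeric check of Part II (my example): for $f(x) = \sin x$ on $[0, \pi]$, the antiderivative $F(x) = -\cos x$ gives $F(\pi) - F(0) = 2$, which matches a midpoint Riemann sum of the integral.

```python
import math

# FTC Part II check: integral of sin x over [0, pi] equals F(pi) - F(0) = 2.
a, b, n = 0.0, math.pi, 100000
h = (b - a) / n
riemann = sum(math.sin(a + (i + 0.5) * h) * h for i in range(n))  # midpoint sum
exact = -math.cos(b) - (-math.cos(a))                             # F(b) - F(a)
assert abs(exact - 2.0) < 1e-12
assert abs(riemann - exact) < 1e-6
```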

What this means is that if we can find an antiderivative for the integrand, then we can evaluate the definite integral by evaluating the antiderivative at the endpoints of the interval

The proof can be done two ways again: using Riemann sums, or using the definition of an antiderivative; both make use of the MVT. This one is similar to the one given in the notes, but I attempted to explain it in more detail:

Suppose $F$ is an antiderivative of $f$, with $f$ being continuous on $[a,b]$. Now let:

$$G(x) = \int_a^x f(t)\ dt$$

By Part I, we know that $G$ is also an antiderivative of $f$
So we have that $F'(x) = G'(x)$.

Now, if two functions have the same derivative, then their difference has derivative 0: $F' - G' = 0$
By the MVT, this implies that $F - G$ is a constant function, i.e. $\exists c \in \mathbb{R}$ s.t. $G(x) = F(x) + c$. When we let $x = a$ and compute $G(a)$ we have:

$$F(a) + c = G(a) = \int_a^a f(t)\ dt = 0 \implies c = -F(a)$$

So we have that $G(x) = F(x) - F(a)$.

And then when we evaluate at $x = b$, using the definition of $G(x)$ above we have:

$$\int_a^b f(x)\ dx = G(b) = F(b) - F(a)$$

Area Problem

Suppose $f(x) \ge g(x)$ on $[a,b]$ and we want to find the area between $f$ and $g$

The area is given as:

$$\int_a^b f(x) - g(x)\ dx$$

If it's not immediately obvious which curve is above, then we can evaluate the functions at a point in the interval $(a,b)$

7. Methods of Integration

Substitution

This is like the integration version of chain rule

So by the chain rule we have that:

$$\frac{d}{dx} f(g(x)) = f'(g(x)) g'(x)$$

And when we integrate we have:

$$\int f'(g(x)) g'(x)\ dx = f(g(x)) + c$$

Now if we let $u = g(x)$ then $\frac{du}{dx} = g'(x) \implies du = g'(x)\ dx$

After substituting in we have:

$$\int f'(g(x)) g'(x)\ dx = \int f'(u)\ du = f(u) + c = f(g(x)) + c$$

It's important to note that the derivative is not a fraction; the proper justification for why the fraction manipulation works is covered in Differentials

Interesting Trig odd and even properties

Below are the half angle formulas for sine and cosine which will be useful in integrating even powers of trig functions

$$\cos^2 x = \frac{1}{2}(1 + \cos 2x) \qquad \sin^2 x = \frac{1}{2}(1 - \cos 2x)$$

To generalise:

$$\int \sin^m x \cos^n x\ dx$$

If either one of $m$ or $n$ is odd, then we can do this integral by substitution. This is because we can peel one factor off the odd power and convert the rest using $\sin^2 x + \cos^2 x = 1$

If both $m$ and $n$ are even, then we must use the half angle properties

Trig Substitutions

$$\sqrt{a^2 - x^2} \implies x = a\sin\theta \qquad \sqrt{a^2 + x^2} \implies x = a\tan\theta \qquad \sqrt{x^2 - a^2} \implies x = a\sec\theta$$

Integration by Parts

$$\int u\ dv = uv - \int v\ du$$
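A worked check (my example): with $u = x$ and $dv = e^x\,dx$, integration by parts gives $\int x e^x\,dx = x e^x - e^x$, so $\int_0^1 x e^x\,dx = (e - e) - (0 - 1) = 1$, which a Riemann sum confirms.

```python
import math

# Midpoint Riemann sum of x * e^x over [0, 1]; by parts the exact value is 1.
n = 100000
h = 1.0 / n
riemann = sum((i + 0.5) * h * math.exp((i + 0.5) * h) * h for i in range(n))
assert abs(riemann - 1.0) < 1e-6
```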

Partial Fractions

Good for integrals of the form:

$$\int \frac{P(x)}{Q(x)}\ dx$$

The reason we can factor Q(x) is because of the Fundamental Theorem of Algebra, which will be covered in Complex Analysis in Year 3

There are basically 3 main cases of denominators

Case 1: Distinct linear factors

If Q(x)=(xa)(xb)(xc)

Then $$ \frac{P(x)}{Q(x)} = \frac{A}{x-a} + \frac{B}{x-b} + \frac{C}{x-c}$$
To find A,B,C, multiply both sides by the denominator and equate coefficients or substitute roots
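For instance (my own example): $\frac{1}{(x-1)(x-2)}=\frac{A}{x-1}+\frac{B}{x-2}$; substituting the roots gives $A=-1$ at $x=1$ and $B=1$ at $x=2$. We can spot-check the decomposition at any other point:

```python
# Spot check: 1/((x-1)(x-2)) = -1/(x-1) + 1/(x-2) away from the roots.
x = 3.5
lhs = 1 / ((x - 1) * (x - 2))
rhs = -1 / (x - 1) + 1 / (x - 2)
print(abs(lhs - rhs) < 1e-12)  # True
```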

Case 2: Repeated Linear Factors

If Q(x)=(xa)n
Then we need every power up to $n$:

P(x)(xa)n=A1xa+A2(xa)2++An(xa)n

Case 3: Irreducible Quadratic factors

If the denominator has a quadratic that doesn't factor, then we can't split it into linear factors like $\frac{A}{x-a}$, so the next thing is to have $\frac{Ax+B}{ax^2+bx+c}$

This works because $Ax+B$ is the most general linear numerator

So if Q(x)=ax2+bx+c

Then $$ \frac{P(x)}{ax^2 + bx + c} = \frac{Ax + B}{ax^2 + bx + c} $$

8. Indeterminate Forms and Improper Integrals

L'Hôpital Rule

This is useful when a limit gives us an indeterminate form:

$$\frac{0}{0}\quad\text{or}\quad\frac{\infty}{\infty}$$

Because in these cases, the functions go to zero, or to infinity, at the same time, and we want to know which one gets there faster, so we compare their rates of change (i.e. derivatives) to find that. There are two versions of this:

Let $f,g$ be differentiable with $g'(x)\ne 0$ near $a$.

Version 1

Suppose limxaf(x)=0 and limxag(x)=0, then

$$\lim_{x\to a}\frac{f'(x)}{g'(x)}=L\implies\lim_{x\to a}\frac{f(x)}{g(x)}=L$$

Version 2

Suppose limxaf(x)=± and limxag(x)=±, then

$$\lim_{x\to a}\frac{f'(x)}{g'(x)}=L\implies\lim_{x\to a}\frac{f(x)}{g(x)}=L$$

And note that a here can be ±
$e^x$ grows faster than any polynomial; $\ln x$ grows slower than any polynomial.

Fun example

limx0+xx

We can do some manipulation to get $\ln(x^x)=x\ln x$ and then:

limx0+xlnx=limx0+lnx1x=limx0+1x1x2=limx0+x=0

So to get back to the limit of xx, we need to exponentiate this so we have:

limx0+xx=e0=1

Pretty cool right!
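We can watch this limit happen numerically (a quick sketch of my own): $x^x$ creeps up to 1 as $x\to 0^+$.

```python
# x**x approaches 1 from below as x -> 0+, matching the limit above.
values = [x ** x for x in (0.1, 0.01, 0.001, 0.0001)]
print(values)                     # increasing towards 1
print(abs(values[-1] - 1) < 0.01) # True
```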

Cauchy’s Mean Value Theorem

Before proving L’Hôpital, we need to prove the generalised version of the MVT.

The idea behind this is that if two functions satisfy the conditions of MVT, then some point c exists for where the ratio of their derivatives equals the ratio of their total changes.

So if $f,g$ are continuous on $[a,b]$ and differentiable on $(a,b)$, and $g'(x)\ne 0$, then

$$\exists c\in(a,b)\ \text{ s.t. }\ \frac{f'(c)}{g'(c)}=\frac{f(b)-f(a)}{g(b)-g(a)}$$

Similar to the proof of Rolle's Theorem and the MVT, we want to generate a function whose endpoints match, apply Rolle's Theorem, and then rearrange to get the expression

The full proof is in the lecture notes, and very similar to MVT, so I won't write it again here

Proof of L’Hôpital

The strategy here is essentially to apply the CMVT to $f$ and $g$ on $[a,x]$.
This gives us $\frac{f(x)}{g(x)}=\frac{f'(c)}{g'(c)}$ for some $c\in(a,x)$ (using $f(a)=g(a)=0$ in the $\frac{0}{0}$ case).
Since $c\in(a,x)$, when $x\to a$ we also have $c\to a$.
We can then take limits on the right-hand side and conclude.

Improper Integrals

An integral becomes improper if either the interval is infinite or there is a vertical asymptote.

Eg:

11x2 dx,011x dx

In these cases, the Riemann definition using sums of rectangles doesn't work, so we need to work using limits.

Integrals converging/diverging

Consider this example of a function:

af(x) dx

If a limit exists of this integral:

limbabf(x) dx=L

Then, the integral converges to L. If the limit does not exist, then the integral diverges

We can use the limit definition to show that $\int_1^\infty\frac{1}{x}\ dx$ diverges

1b1x dx=lnx|1b=ln(b)ln(1)=lnblimb1b1x dx=limblnb=

So since the limit does not exist, the integral diverges

Similarly, we can use limits again to show that $\int_1^\infty\frac{1}{x^2}\ dx$ converges

$$\int_1^b\frac{1}{x^2}\ dx=-\frac{1}{x}\Big|_1^b=-\Big(\frac{1}{b}-1\Big)=1-\frac{1}{b}\qquad\lim_{b\to\infty}\int_1^b\frac{1}{x^2}\ dx=\lim_{b\to\infty}\Big(1-\frac{1}{b}\Big)=1
$$

So as the limit exists, the integral converges to 1
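Seeing the two examples side by side numerically (a sketch of mine): as $b$ grows, $\int_1^b\frac{1}{x}\ dx=\ln b$ keeps growing, while $\int_1^b\frac{1}{x^2}\ dx=1-\frac{1}{b}$ settles at 1.

```python
# ln(b) blows up while 1 - 1/b converges to 1 as b grows.
import math
for b in (10, 1_000, 100_000):
    print(math.log(b), 1 - 1 / b)
```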

P-test

11xp dx converges p>1

We can test cases to see why this works. Consider:

1bxp dx

Case 1: p1

$$\int x^{-p}\ dx=\frac{x^{-p+1}}{-p+1}$$

So

1bxp dx=b1p11p

And now we can look at the limit as b.

p>11p<0b1p0p<11p>0b1p

So the limit exists and converges if p>1

Now, the case for when p=1

1b1x dx=lnb

And from the example we know this integral diverges, since $\ln b\to\infty$ as $b\to\infty$. So $p=1$ diverges too.

Hence, the only case when the integral converges is when p>1

Another case to consider:

011xp dx converges p<1

The proof of this is left as an exercise to the reader

Other forms

Another form of an improper integrals is:

f(x) dx

In the case above, we can split the integral to get:

af(x) dx+af(x) dx

And then treat each of them as separate improper integrals. Here's a fun problem, the solution to which is given in the notes. Show that:

11+x2 dx=π

Comparison Test

When thinking of integrals as areas under curve, if we have

0f(x)g(x)

Then the area under $f$ is at most the area under $g$. So if $g$'s area is finite, then $f$'s area must be finite too, as $f$ fits under $g$

And if f's area is infinite, then g's area must also be infinite.

So suppose 0f(x)g(x) for all large x, then

ag(x) dx converges af(x) dx also convergesaf(x) dx diverges ag(x) dx diverges 

Note that the inequality direction matters, i.e. if f(x)g(x) then the convergence of g convergence of f, however, the converse tells us nothing.

Consider the example:

111+x+x2 dx

At first glance, $1+x+x^2\ge x^2$, so we compare with $\frac{1}{x^2}$: this gives $\frac{1}{1+x+x^2}\le\frac{1}{x^2}$, which is the inequality in the right direction, since $\int_1^\infty\frac{1}{x^2}\ dx$ converges and our integrand is the smaller function. So the integral converges.

Had the inequality gone the other way, comparing a bigger function to a convergent one would have told us nothing.

9. Taylor Series

The idea behind this is that we can approximate a complicated-looking function with a polynomial by matching all its derivatives at a point. Imagine zooming in on a smooth curve at some point $x=a$: if we zoom in enough, the curve looks like a straight line (the tangent)

3Blue1Brown has a good video explaining the intuition behind this. But in terms of approximating, the best polynomial approximation is the one matching all derivatives. So if our polynomial is

P(x)=c0+c1(xa)+c2(xa)2+

And we want this to behave exactly like f around the point a, then it must have the same value, slope, and curvature.

So if we matched the derivatives of all orders, then the polynomial behaves exactly like the function at that point

Therefore, the general formula for Taylor series of f around x=a is:

f(x)=n=0f(n)(a)n!(xa)n

When $a=0$, we get the Maclaurin series, which is just a special case of the Taylor series and is what is given in the notes:

f(x)=n=0f(n)(0)n!(x)n

Computing the Taylor series for a function is just a matter of computing derivatives at x=0 and then putting that into the formula

Some common Maclaurin series are as follows:

ex=n=0xnn!=1+x+x22!+x33!+sinx=n=0(1)nx2n+1(2n+1)!=xx33!+x55!x77!+cosx=n=0(1)nx2n(2n)!=1x22!+x44!x66!+ln(1+x)=n=1(1)n1xnn=xx22+x33x44+
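Partial sums of these series converge quickly because of the factorials in the denominators. A quick sketch (my own check, using the $e^x$ series above):

```python
# Partial sums of the Maclaurin series for e^x: 10 terms already
# pin down e to about 1e-7.
import math

def exp_partial(x, n):
    """Sum of the Maclaurin series of e^x up to the x**n term."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

print(abs(exp_partial(1.0, 10) - math.e) < 1e-7)           # True
print(abs(exp_partial(0.5, 10) - math.exp(0.5)) < 1e-10)   # True
```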

Cauchy’s Remainder Theorem

When we build a Taylor polynomial Pn(x), we know the value of f, its slope and curvature, etc. up to the nth derivative at a point. So the polynomial is perfect at the expansion point. But the moment we move away, the question then turns into an error approximation one, which is where the remainder theorem comes in

Geometrically, the first way the function can differ from our polynomial is through the $(n+1)$th derivative: at the expansion point, the error function $f-P_n$ vanishes along with its first $n$ derivatives. And the simplest function with a zero of order $n+1$ at $0$ is $x^{n+1}$.

Therefore we know that the error function must look like: something $\times\ x^{n+1}$
And that 'something' is the $(n+1)$th derivative at some point in between, by MVT-style logic

It's important to know that we're not saying the error happens at some random point, but rather that the behaviour of the $(n+1)$th derivative somewhere in between controls the error.

So the remainder theorem is given by:

$$R_{n+1}(x)=\frac{f^{(n+1)}(c)}{(n+1)!}x^{n+1}$$

Bounding the error

Bounding just means we don't know the exact value, but we know it cannot be bigger than some value, say M

So in our case, we don't know what this value of $c$ is, but we do know how big the derivative can get via properties of the function. So we instead bound the error:

$$|R_{n+1}(x)|\le\frac{M}{(n+1)!}|x|^{n+1},\qquad M=\max|f^{(n+1)}(x)|$$

So this essentially gives us a max possible error, which basically means that no matter what, the error cannot be worse than this
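The bound in action (my own example): for $\sin x$ every derivative is bounded by $M=1$, so $|R_{n+1}(x)|\le\frac{|x|^{n+1}}{(n+1)!}$, and the actual error of the degree-3 polynomial sits comfortably inside it.

```python
# Compare the actual Taylor error of sin(x) with the remainder bound.
import math

x, n = 0.5, 3
poly = x - x ** 3 / 6                 # degree-3 Maclaurin polynomial of sin
actual_error = abs(math.sin(x) - poly)
bound = abs(x) ** (n + 1) / math.factorial(n + 1)  # M = 1 for sin
print(actual_error <= bound)  # True
```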

10. Differential Equations

A differential equation is just a rule that tells us how a function changes, rather than what it is. So instead of y=x2, we have: dydx=2x

So an ODE describes a family of curves, and when we solve one, we are reconstructing the curve(s) consistent with the given rate of change

The order is the highest derivative present

Definition

A general solution of an ODE is the most general function $y=f(x)$ that satisfies the ODE. If we want to pin down a particular solution, we can do it with initial or boundary conditions

An initial value problem (IVP) is an ODE together with an initial condition. A boundary value problem (BVP) is an ODE with boundary conditions

First order ODEs

A first-order ODE has the general form:

dydx=f(x,y)

It's important to note that there isn't a single method for solving all DEs; the method depends on the structure of the equation

A linear first order DE has the form:

dydx+p(x)y=q(x)

A DE is homogeneous if:

dydx+p(x)y=0

There are no external forces acting, so this is always separable

A DE is non-homogeneous if:

dydx+p(x)y=q(x)

Notice that the RHS term is not 0, so this will require us to use the integrating factor method

Separation of Variables

A first order DE is separable if it can be arranged into:

g(y)dydx=h(x)

The idea here is that we can 'separate' all the $y$ and $x$ dependencies

The derivative represents the change in y per change in x, so if the DE splits cleanly into a function of y times a function of x, then we can accumulate changes on each side independently to get:

g(y)dy=h(x)dx

Note: derivatives are NOT fractions, this is just using the chain rule backwards

The method of separation of variables is fairly simple, we rearrange to isolate the x and y terms:

g(y)dy=h(x)dx

And then integrate both sides

g(y) dy=h(x) dx+c
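A worked example of my own: $\frac{dy}{dx}=xy$. Separating gives $\frac{dy}{y}=x\,dx$, integrating gives $\ln|y|=\frac{x^2}{2}+c$, so $y=Ce^{x^2/2}$. We can check this solution satisfies the ODE numerically:

```python
# Check y = C * e^{x^2/2} satisfies dy/dx = x * y at a sample point.
import math

C = 2.0
y = lambda x: C * math.exp(x ** 2 / 2)

x0, h = 0.8, 1e-6
slope = (y(x0 + h) - y(x0 - h)) / (2 * h)  # numerical dy/dx
print(abs(slope - x0 * y(x0)) < 1e-3)      # True
```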

Integrating factor

The problem we want to solve here is:

dydx+p(x)y=q(x)

The LHS is almost the derivative of a product, but not exactly, so we want to turn it into $\frac{d}{dx}(\text{something})$, because derivatives of products are easy to integrate. The notation used below is slightly different to that used in the notes, but the idea is the same.

The product rule says:

$$\frac{d}{dx}(\mu y)=\mu\frac{dy}{dx}+\mu' y$$

We want this to match:

dydx+p(x)y

So we choose μ(x) such that:

$$\mu'(x)=\mu(x)p(x)$$

This is a tiny DE whose solution is:

μ(x)=ep(x)dx

And the function μ(x) is called the integrating factor.

So given:

dydx+p(x)y=q(x)

The steps to solving this:

Compute:

μ(x)=ep(x)dx

Multiply the entire equation by μ(x)
The LHS becomes:

ddx(μy)

Integrate:

μy=μq(x)dx+C

Solve for y
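The steps above in action on an example of my own: $y'+y=x$, so $p(x)=1$, $\mu=e^x$, $(e^xy)'=xe^x$, and integrating by parts gives $e^xy=(x-1)e^x+C$, i.e. $y=x-1+Ce^{-x}$.

```python
# Verify the integrating-factor solution y = x - 1 + C*e^{-x}
# satisfies y' + y = x.
import math

C = 3.0
y = lambda x: x - 1 + C * math.exp(-x)

x0, h = 1.2, 1e-6
slope = (y(x0 + h) - y(x0 - h)) / (2 * h)  # numerical y'
print(abs(slope + y(x0) - x0) < 1e-6)      # True
```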

2nd order constant-coefficient ODEs

A second-order differential equation involves the second derivative:

$$y''=\frac{d^2y}{dx^2}$$

For $ay''+by'+cy=0$, we look for solutions of the form:

y=erx

The exponential function appears here because differentiation sends $e^{rx}$ to a multiple of itself, so every term in the equation stays proportional to $e^{rx}$ and can be factored out.

Characteristic equation:

When we substitute y=erx:

$$y'=re^{rx},\qquad y''=r^2e^{rx}$$

We get:

ar2erx+brerx+cerx=0

Factor out erx0:

ar2+br+c=0

This quadratic is called the characteristic equation. So solving the DE reduces to solving a quadratic equation.

Now we can have 3 cases for the equation:

Case 1: Two distinct real roots:

y=Aer1x+Ber2x

Case 2: Repeated real root

y=(A+Bx)erx

The extra x is because erx alone is not sufficient to generate two independent solutions, so multiplying by x creates that linear independence

Case 3: Complex roots:

r=α±βi

$$y=e^{\alpha x}(A\cos\beta x+B\sin\beta x)$$
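A spot check of Case 1 (my own example): $y''-3y'+2y=0$ has characteristic equation $r^2-3r+2=0$ with roots $r=1,2$, so $y=Ae^x+Be^{2x}$ should satisfy the ODE for any $A,B$.

```python
# Numerically verify y = A*e^x + B*e^{2x} solves y'' - 3y' + 2y = 0.
import math

A, B = 1.5, -0.5
y = lambda x: A * math.exp(x) + B * math.exp(2 * x)

x0, h = 0.4, 1e-4
y1 = (y(x0 + h) - y(x0 - h)) / (2 * h)             # numerical y'
y2 = (y(x0 + h) - 2 * y(x0) + y(x0 - h)) / h ** 2  # numerical y''
print(abs(y2 - 3 * y1 + 2 * y(x0)) < 1e-4)         # True
```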

Non-homogeneous equations:

For $ay''+by'+cy=f(x)$
We write: y=yh+yp

where:

- $y_h$ is the general solution of the homogeneous equation $ay''+by'+cy=0$
- $y_p$ is any particular solution of the full equation

The homogeneous part gives the natural behaviour and the particular part accounts for the forcing.

The reason we can split this might seem quite familiar. As linear operators satisfy:

L(y1+y2)=L(y1)+L(y2)

We can see that if $L(y_h)=0$ and $L(y_p)=f(x)$, then $L(y_h+y_p)=0+f(x)=f(x)$, so $y_h+y_p$ solves the full equation.

This is the same linearity idea we’ve seen in linear algebra.

So the steps are as follows:

Find the general solution of the associated homogeneous ODE which is called the complementary function (CF)
Find any solution to the full non-homogeneous ODE which is called the particular integral (PI)
Then the general solution is y=CF+PI
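The steps above on an example of my own: for $y''+y=x$, the CF is $A\cos x+B\sin x$; trying $y_p=ax+b$ forces $a=1$, $b=0$, so $y=A\cos x+B\sin x+x$.

```python
# Verify y = A*cos(x) + B*sin(x) + x solves y'' + y = x.
import math

A, B = 2.0, -1.0
y = lambda x: A * math.cos(x) + B * math.sin(x) + x

x0, h = 0.9, 1e-4
y2 = (y(x0 + h) - 2 * y(x0) + y(x0 - h)) / h ** 2  # numerical y''
print(abs(y2 + y(x0) - x0) < 1e-4)                 # True
```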

Trial Functions for Particular Integrals

Note that if a trial function has the same form as part of the complementary function, multiply the trial function by x to form a new trial function

Optional Asides:

Where is e coming from when finding complementary functions?

https://math.stackexchange.com/questions/764171/proof-that-ex-is-the-eigenvector-of-the-derivative-operator