0. Complex Numbers

$$\mathbb{C} = \{a + bi \mid a, b \in \mathbb{R}\}, \qquad \mathrm{Re}(a+bi) = a, \qquad \mathrm{Im}(a+bi) = b$$

Why is i on the other side??

Graphically, $z = a + bi \in \mathbb{C}$ corresponds to the point $(a, b) \in \mathbb{R}^2$

Definition The modulus of a complex number $a + bi$ is

$$\mid z \mid = \sqrt{a^2 + b^2}$$

Think of it like vectors and distances from origin

Definition Argument of a complex number is the angle between z and the positive x axis measured in radians anti-clockwise

$\mathrm{Arg}(z)$ is the principal value of the argument, while $\arg(z)$ is multivalued, obtained by adding $2\pi n$, $n \in \mathbb{Z}$

The principal argument lies in the interval $(-\pi, \pi]$ : not including $-\pi$ but including $\pi$

$$e^{i\theta} = \cos\theta + i\sin\theta, \quad \theta \in \mathbb{C}$$

Polar form

Every non-zero complex number can be written in the form of

$$z = r(\cos\theta + i\sin\theta) = re^{i\theta}$$

Where

$r = \mid z \mid$, $\quad \theta = \arg(z)$

Let $z_1 = r_1 e^{i\theta_1}$ and $z_2 = r_2 e^{i\theta_2}$

Multiplication

$$z_1 z_2 = (r_1 r_2)\, e^{i(\theta_1 + \theta_2)}$$

Division

$$\frac{z_1}{z_2} = \frac{r_1}{r_2}\, e^{i(\theta_1 - \theta_2)}$$

Powers (De Moivre's)

$$(r(\cos\theta + i\sin\theta))^n = r^n(\cos n\theta + i\sin n\theta)$$

$$z^n = (re^{i\theta})^n = r^n e^{in\theta}$$
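None of this code is from the course; it's just a quick Python sanity check of De Moivre's formula above, with arbitrary values of $r$, $\theta$, $n$:

```python
import cmath

# De Moivre: (r e^{i theta})^n = r^n e^{i n theta}
r, theta, n = 2.0, 0.7, 5
z = cmath.rect(r, theta)            # builds r(cos theta + i sin theta)
lhs = z ** n                        # power computed directly
rhs = cmath.rect(r ** n, n * theta) # power computed via the formula
assert abs(lhs - rhs) < 1e-9
```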

Roots of unity

$$z^{1/n} = r^{1/n}\, e^{i(\theta + 2k\pi)/n}, \quad k = 0, 1, \dots, n-1$$

Example

Take z=1+i
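The example stops short above, so here is a hedged Python sketch of it: computing the two square roots of $z = 1+i$ from the formula (where $r = \sqrt{2}$, $\theta = \pi/4$) and checking that each root squares back to $z$:

```python
import cmath
import math

z = 1 + 1j
r, theta = abs(z), cmath.phase(z)   # r = sqrt(2), theta = pi/4
n = 2
# z^{1/n} = r^{1/n} e^{i(theta + 2k pi)/n} for k = 0, ..., n-1
roots = [r ** (1 / n) * cmath.exp(1j * (theta + 2 * k * math.pi) / n)
         for k in range(n)]
for w in roots:
    assert abs(w ** n - z) < 1e-9   # every root raised to n recovers z
```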

Conjugates and Reciprocals

Why do they matter?


If $z = a + bi$, the conjugate is defined as: $\bar{z} = a - bi$

To think about the point above geometrically, $z$ and $\bar{z}$ are just two vectors which have the same magnitude, but one is the reflection of the other in the real axis. So if we multiply those two vectors, the arguments cancel and we are essentially squaring the magnitude of the vector: $z\bar{z} = \mid z \mid^2$. This also makes intuitive sense in polar form: $$z \cdot \frac{1}{z} = 1$$

The reciprocal of $z$ is by definition the number that, when multiplied by $z$, gives the result 1
In terms of geometry, $z$ essentially scales a vector by $r$ and rotates it by $\theta$, and what $z^{-1}$ does is scale the vector by $\frac{1}{r}$ and rotate it by $-\theta$

Reciprocals: $z^{-1} = \frac{\bar{z}}{\mid z \mid^2}$

Proof:

Let z=a+bi

We want z1=1z

Therefore:

$$\frac{1}{z} = \frac{1}{a+bi} \times \frac{a-bi}{a-bi} = \frac{a-bi}{a^2+b^2}$$

The conjugate is $\bar{z} = a - bi$

And the modulus is $\mid z \mid = \sqrt{a^2 + b^2}$
Therefore $$ z^{-1} = \frac{\bar{z}}{\mid z \mid^2}, \space z \neq 0$$

Reciprocals in polar form

Any non-zero complex number can be written as:

z=reiθ

The reciprocal z1 can be written as:

$$z^{-1} = \frac{1}{z} = \frac{1}{re^{i\theta}} = \frac{1}{r} \times \frac{1}{e^{i\theta}}$$

But $\frac{1}{e^{i\theta}} = e^{-i\theta}$

So we have

$$z^{-1} = \frac{1}{r}\, e^{-i\theta}$$

This form can also be derived from the earlier formula: $z^{-1} = \frac{\bar{z}}{\mid z \mid^2}$

We have $\bar{z} = re^{-i\theta}$ and $\mid z \mid^2 = r^2$

$$\frac{\bar{z}}{\mid z \mid^2} = \frac{re^{-i\theta}}{r^2} = \frac{1}{r}\, e^{-i\theta}$$

Polar form is very useful when converting numbers from their cartesian form to mod-arg form

If we had z=a+bi, we would have to expand a product, deal with conjugates and end up with fractions for both real and imaginary parts.

But if we had reciprocals in polar form with z=reiθ, then we just need to invert the modulus and flip the angle

Consider this example: z=3+4i

$$z^{-1} = \frac{3-4i}{3^2+4^2} = \frac{3}{25} - \frac{4}{25}i$$

And now consider the polar way:

$r = 5$, $\quad \theta = \arctan\left(\frac{4}{3}\right)$

$$z^{-1} = \frac{1}{5}\, e^{-i\theta}$$
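A quick Python check (not from the notes) that the Cartesian and polar routes for $z = 3+4i$ agree, and that both really give the reciprocal:

```python
import cmath

z = 3 + 4j
# Cartesian route: conjugate over |z|^2
cart = z.conjugate() / abs(z) ** 2
# polar route: invert the modulus, flip the angle
r, theta = abs(z), cmath.phase(z)
polar = (1 / r) * cmath.exp(-1j * theta)

assert abs(cart - polar) < 1e-12    # both routes agree
assert abs(z * cart - 1) < 1e-12    # z times its reciprocal is 1
```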

Proof of fundamental theorem of algebra here

Roots and conjugate pairs

If a polynomial has real coefficients, then any non-real complex root must appear with its complex conjugate as another root

So if $a + bi$, $b \neq 0$, is a root, then $a - bi$ is also a root
Why?

Suppose the polynomial is

$$P(x) = a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0, \quad a_k \in \mathbb{R}$$

Now assume that zC is a root:

$$P(z) = a_n z^n + a_{n-1} z^{n-1} + \dots + a_1 z + a_0 = 0$$

This is essentially saying that if we plug in $z \in \mathbb{C}$ into $P$, the result is 0, which means $z$ is a solution to the polynomial

Then if we take the conjugate of both sides:

$$\overline{P(z)} = \bar{0}$$

However, $\bar{0} = 0$

And because the coefficients are real:

$$\overline{P(z)} = a_n \bar{z}^n + a_{n-1} \bar{z}^{n-1} + \dots + a_1 \bar{z} + a_0 = P(\bar{z})$$

So we've shown that $$ P(\bar{z}) =0 $$
Hence, if z is a root, so is z¯

Solving roots in polar form


1. Real n space Rn

Definition The real n-space $\mathbb{R}^n$ is the collection of n-tuples of real numbers

$$\vec{v} = (x_1, x_2, \dots, x_n), \quad x_i \in \mathbb{R}$$

Eg:
R0 = a single point
R1 = the real number line
R2 = the plane
Etc…

Rn can also be viewed as set of points

Scalar/Dot product

A scalar product (aka dot product) of two vectors returns a scalar (i.e a real or a complex number)

For $\vec{u} = (u_1, u_2, \dots, u_n)$ and $\vec{v} = (v_1, v_2, \dots, v_n)$ in $\mathbb{R}^n$, we define

$$\vec{u} \cdot \vec{v} = u_1 v_1 + u_2 v_2 + \dots + u_n v_n = \sum_{i=1}^{n} u_i v_i$$

Properties of scalar product

Definition Orthogonal vectors - vectors at right angles

$$\vec{a} \cdot \vec{b} = 0$$

Norm of a vector

Norm is basically the magnitude or length of a vector

$$||\vec{v}|| = \sqrt{\vec{v} \cdot \vec{v}}$$

Unit vector - A vector with the magnitude of 1

$$\hat{a} = \frac{\vec{a}}{||\vec{a}||}$$

Generalized Pythagoras theorem:

$$|| \vec{a} + \vec{b} ||^2 = ||\vec{a}||^2 + ||\vec{b}||^2 \quad \text{for orthogonal } \vec{a}, \vec{b}$$

Projections

Intuitively, imagine we have two vectors: a and b, and we want to know how much of a lies in the direction of b .

The projection of a vector a onto another vector b is the "shadow" of a in the direction of b. It tells you how much of a lies along b.

Now, the problem is, a might not lie perfectly along b, it might be slanted, so we want to find the part of a that does lie along it, and that's what projection is.

We know that any point on the line through $\vec{b}$ must look like a multiple of $\vec{b}$. So every point on that line can be written as:

$$\lambda \vec{b}, \quad \lambda \in \mathbb{R}$$

We need to find a λ multiple of b gives the point on that line where a drops a perpendicular, so we need a scalar λ such that the point P on the line through b satisfies the equation:

$$\vec{OP} = \lambda \vec{b}$$

and the vector from $P$ to the tip of $\vec{a}$ (which is $\vec{a} - \lambda\vec{b}$) is perpendicular to $\vec{b}$

Since perpendicular vectors have dot product = 0, we have

$$(\vec{a} - \lambda\vec{b}) \cdot \vec{b} = 0 \implies \vec{a} \cdot \vec{b} - \lambda\, \vec{b} \cdot \vec{b} = 0 \implies \lambda = \frac{\vec{a} \cdot \vec{b}}{\vec{b} \cdot \vec{b}}$$

Therefore, the component of $\vec{a}$ along $\vec{b}$ is the scalar $\frac{\vec{a} \cdot \vec{b}}{|\vec{b}|^2}$, and the projection is the vector $\lambda\vec{b}$
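The derivation above can be sketched in Python (hypothetical helper names `dot` and `proj`, not from the notes); the residual $\vec{a} - \lambda\vec{b}$ should come out perpendicular to $\vec{b}$:

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def proj(a, b):
    # lambda = (a . b) / (b . b); projection is lambda * b
    lam = dot(a, b) / dot(b, b)
    return [lam * x for x in b]

a, b = [3.0, 4.0], [1.0, 0.0]
p = proj(a, b)                       # shadow of a on the x-axis
perp = [x - y for x, y in zip(a, p)] # a minus its projection
assert abs(dot(perp, b)) < 1e-12     # residual is perpendicular to b
```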

Cauchy-Schwarz Inequality

Using the dot product definition, we have

$$\vec{a} \cdot \vec{b} = |\vec{a}||\vec{b}|\cos\theta$$

Since $|\cos\theta| \leq 1$, we immediately get:

$$|\vec{a} \cdot \vec{b}| \leq |\vec{a}||\vec{b}|$$

This is the Cauchy-Schwarz inequality, and it essentially means that a projection (shadow) can never be longer than the actual vector.

Triangle inequality

The triangle inequality here below can be proved using the Cauchy-Schwarz inequality:

$$|\vec{a} + \vec{b}|^2 = |\vec{a}|^2 + 2(\vec{a} \cdot \vec{b}) + |\vec{b}|^2$$

After expanding the LHS, we can now apply the Cauchy-Schwarz inequality

$$\vec{a} \cdot \vec{b} \leq |\vec{a}||\vec{b}|$$

So we have

$$|\vec{a} + \vec{b}|^2 \leq |\vec{a}|^2 + 2|\vec{a}||\vec{b}| + |\vec{b}|^2 = (|\vec{a}| + |\vec{b}|)^2$$

After taking square roots, we have:

$$|\vec{a} + \vec{b}| \leq |\vec{a}| + |\vec{b}|$$
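Both inequalities can be spot-checked numerically; a small Python sketch (my own, with random test vectors, not from the notes):

```python
import math
import random

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

random.seed(0)
for _ in range(100):
    a = [random.uniform(-5, 5) for _ in range(3)]
    b = [random.uniform(-5, 5) for _ in range(3)]
    # Cauchy-Schwarz: |a . b| <= |a||b|
    assert abs(dot(a, b)) <= norm(a) * norm(b) + 1e-9
    # triangle inequality: |a + b| <= |a| + |b|
    s = [x + y for x, y in zip(a, b)]
    assert norm(s) <= norm(a) + norm(b) + 1e-9
```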

Equation of a Line in Rn

To define a line, we need two main things: a point on the line, and a direction vector.

There are a few ways to describe lines.

Vector equation

The vector equation of a line is the most geometric form, if the line passes through the point A and is parallel to the direction of vector d, then every r on the line can be reached by:

$$\vec{r} = \vec{a} + \lambda\vec{d}$$

where λ is a scalar that moves along the line. Intuitively, we're starting at a and then moving in direction d for some distance λ

Parametric equation

This is just a component form of the vector equation, but it's useful when working with intersections or substituting into other equations for λ.

λR is essentially the parameter controlling position along the line (and yes that's why it's parametric)

If we write vectors in components:

$$\vec{a} = (x_1, y_1, z_1), \quad \vec{d} = (a, b, c), \quad \vec{r} = (x, y, z)$$

Then

$$(x, y, z) = (x_1, y_1, z_1) + \lambda(a, b, c)$$

This gives the parametric equations of the line:

$$x = x_1 + a\lambda, \quad y = y_1 + b\lambda, \quad z = z_1 + c\lambda$$

Cartesian equation

The cartesian form eliminates λ and expresses the relation between x,y,z. It's also useful when finding where two lines/planes intersect.

Converting from parametric to Cartesian equation example

When converting from parametric to cartesian, our goal is to eliminate λ

So given our three parametric equations above, we can solve each of them for λ to get our cartesian equation of a line, which in general form is:

$$\frac{x - x_1}{a} = \frac{y - y_1}{b} = \frac{z - z_1}{c}$$

Geometrically speaking, we get a ratio of how much we've moved along the line (scaled by λ) in each coordinate direction

Parallel, Intersection, Skew

In $\mathbb{R}^2$, lines are either parallel or they intersect at a point. But in $\mathbb{R}^3$, lines can be parallel, intersecting, skew, or coincident.

It's useful to outline what each of them mean and what their test would look like:

Definition

Forming and solving systems of equations to determine the arrangement of lines in R3

Let the two lines L1 and L2 be given by:

$$L_1: \vec{r} = \vec{a}_1 + \lambda\vec{d}_1, \qquad L_2: \vec{r} = \vec{a}_2 + \mu\vec{d}_2$$

where

In component form, this gives a system of 3 linear simultaneous equations in the two unknowns λ and μ

$$x_1 + a_1\lambda = x_2 + a_2\mu, \quad y_1 + b_1\lambda = y_2 + b_2\mu, \quad z_1 + c_1\lambda = z_2 + c_2\mu$$

And we can either

Once we have our result, we can use our definitions to check the arrangement of lines in R3

Equation of a plane

Let's start with the intuition, what is a plane? (Yes it is something that flies..)

A plane is a flat 2D surface that extends infinitely in three dimensions. We can think of it as :

Definition A plane can be defined by a point a that lies on it, and a normal vector n that is perpendicular to the plane

Vector equation

So a plane is a set of all points r=(x,y,z) such that the vector (ra) lies orthogonal (or perpendicular) to n

so if r=(x,y,z) is a general point on the plane, then we have

$$\vec{n} \cdot (\vec{r} - \vec{a}) = 0$$

Check that this makes sense, the dot product being zero means that the vectors are perpendicular

Cartesian equation

Say the normal vector $\vec{n} = (A, B, C)$ and $\vec{a} = (x_0, y_0, z_0)$; then after computing the dot product and simplifying

$$Ax + By + Cz + D = 0$$

where $D = -(Ax_0 + By_0 + Cz_0)$
Also note that $$ \vec{r} \cdot \vec{n} = \vec{a} \cdot \vec{n} $$

Intuitively, this equation defines all the points (x,y,z) that live in a plane tilted such that its perpendicular points in the direction of (A,B,C)

There are pretty good 3b1b videos on Linear Algebra for some visual understanding. I often imagine solving problems in Grant Sanderson's voice to give myself some confidence when approaching questions

Parametric equation

As with lines, planes can also be defined using a point and two non-parallel direction vectors

So if a is a point on the plane, and b and c are two independent direction vectors lying in the plane, then any point on the plane can be written as:

$$\vec{r} = \vec{a} + \lambda\vec{b} + \mu\vec{c}$$

And every point on the plane is reachable by some combination of λ and μ

Vector product

To find perpendicular/orthogonal vectors, we need to have something that lets us calculate that, and this is where the vector product comes in. The ’trick’ so to speak of calculating this is similar to finding determinants in matrices

Let

$$\vec{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} \quad \text{and} \quad \vec{b} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}$$

be two vectors in R3.

The vector product a×b is the vector in R3 with coordinates

$$\vec{a} \times \vec{b} = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}.$$
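The component formula is easy to get wrong by hand, so here is a small Python sketch (my own helper names, not from the notes) that computes it and checks orthogonality to both inputs:

```python
def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    # component formula for a x b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

a, b = (1, 2, 3), (4, 5, 6)
c = cross(a, b)
assert dot(c, a) == 0 and dot(c, b) == 0  # orthogonal to both a and b
```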

Vector product properties

When we take the cross product a×b, we get a new vector that's orthogonal to both a and b . It's magnitude is given by:

$$|\vec{a} \times \vec{b}| = |\vec{a}||\vec{b}|\sin\theta$$

Parallelograms...

Where θ is the angle between a and b. This is also the area of a parallelogram formed by the two vectors. And the direction of the cross product is orthogonal to the parallelogram.

Given that there are two directions to a plane, up and down, we need to know which direction the cross product points towards. And this is where the right hand rule comes in.

Now, if we flip the order, and instead do b×a, the magnitude remains the same, but the direction reverses

So we get:

$$\vec{a} \times \vec{b} = -\,\vec{b} \times \vec{a}$$

Geometrically speaking, the cross product represents the oriented area of the parallelogram spanned by a and b.

Intersection of Lines and Planes in R3

There are 3 cases for an intersection between a line and a plane:

The first two cases are pretty straightforward. And the final case can be checked by computing the normal vector to π and seeing if it’s orthogonal to the direction of L.

Now, the intersection of 3 planes in R3 has 4 possibilities:

Distances in R3

Point and a Plane

Given a point P and a plane π in R3, we want to find the shortest distance between a plane and a point

Say if we've got a plane:

n(ra)=0

and a point P with position vector p

Then the distance from P to the plane is the projection of (pa) onto the normal n :

$$d = \frac{|\vec{n} \cdot (\vec{p} - \vec{a})|}{|\vec{n}|}$$

Imagine standing above a plane with a flashlight pointing along the normal vector. Then the shadow of the position vector onto that normal is the perpendicular, i.e the shortest path.
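The distance formula above, as a Python sketch (the function name is mine, not from the notes), checked on an easy case where the plane is $z = 0$:

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def point_plane_distance(p, a, n):
    # d = |n . (p - a)| / |n|
    diff = [pi - ai for pi, ai in zip(p, a)]
    return abs(dot(n, diff)) / math.sqrt(dot(n, n))

# plane z = 0: a = origin lies on it, normal points along z
d = point_plane_distance((1.0, 2.0, 5.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
assert abs(d - 5.0) < 1e-12  # distance is just the z-coordinate
```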

Point and a line

Let's say we want to find the distance between the line $L: \vec{r} = \vec{A} + \lambda\vec{a}$ and the point $P$, both in $\mathbb{R}^3$

Using Pythagoras, we can show that the distance between $P$ and $L$ is equal to $|\vec{PN}|$, where $N$ is the point such that $\vec{PN}$ is perpendicular to $\vec{a}$

We can do this by considering projections. The projections of PA along a is

$$\mathrm{proj}_{\vec{a}}(\vec{PA}) = \frac{(\vec{PA} \cdot \vec{a})}{(\vec{a} \cdot \vec{a})}\,\vec{a}$$

The PA here is made up of two parts, the parallel part (projection) and the perpendicular part. So to get the perpendicular component, we have to subtract the projection:

$$\vec{PN} = \vec{PA} - \mathrm{proj}_{\vec{a}}(\vec{PA})$$

The distance is therefore given by the norm or the magnitude of both sides

$$||\vec{PN}|| = \left|\left| \vec{PA} - \frac{(\vec{PA} \cdot \vec{a})}{(\vec{a} \cdot \vec{a})}\,\vec{a} \right|\right|$$
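The subtract-the-projection recipe, as a Python sketch (my own function name), checked on the x-axis where the answer is obvious:

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def point_line_distance(p, A, a):
    # PA minus its projection onto the direction a = the perpendicular part
    PA = [pi - Ai for pi, Ai in zip(p, A)]
    lam = dot(PA, a) / dot(a, a)
    perp = [x - lam * y for x, y in zip(PA, a)]
    return math.sqrt(dot(perp, perp))

# distance from (0, 3, 0) to the x-axis through the origin is 3
d = point_line_distance((0.0, 3.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
assert abs(d - 3.0) < 1e-12
```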

2. Matrix Algebra

Definitions

Definition Suppose that m,nN are natural numbers. 
An m×n matrix A is a rectangular array with m rows and n columns:

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$

The coefficients $a_{11}, a_{12}, \dots, a_{mn}$ are called the entries of the matrix.
For any $i \in \{1, \dots, m\}$ and $j \in \{1, \dots, n\}$, the entry $a_{ij}$ is said to be the $(i,j)$-th element (or the $(i,j)$-th entry) of $A$.

$(A)_{ij}$ is used to denote $a_{ij}$.

Definition Zero matrix
Size m×n matrix in which all entries are 0.

Definition Diagonal matrix
A square matrix A is diagonal if aij=0 whenever ij 
(i.e., all non-diagonal entries are zero):

$$A = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix}$$

Definition Identity matrix
An n×n diagonal matrix with 1's on the diagonals. Denoted by In

$$I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}_{n \times n}$$

Thus $I_n = (\delta_{ij})$, where

$$\delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \neq j. \end{cases}$$

Addition and subtraction of matrices are only defined for matrices of the same size, but multiplication, as we will see below, has different requirements

Why can’t we add or subtract matrices that aren’t the same size??

Matrix multiplication

Let A be an m×n matrix and B be a n×p matrix.

then C=AB is the m×p matrix defined as:

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$

The matrix product is essentially built from the row vectors of $A$ and the column vectors of $B$: each entry works like a dot product.

$$C = AB = \begin{pmatrix} \vec{a}_1 \cdot \vec{b}_1 & \vec{a}_1 \cdot \vec{b}_2 & \cdots & \vec{a}_1 \cdot \vec{b}_p \\ \vec{a}_2 \cdot \vec{b}_1 & \vec{a}_2 \cdot \vec{b}_2 & \cdots & \vec{a}_2 \cdot \vec{b}_p \\ \vdots & & & \vdots \\ \vec{a}_m \cdot \vec{b}_1 & \vec{a}_m \cdot \vec{b}_2 & \cdots & \vec{a}_m \cdot \vec{b}_p \end{pmatrix}_{m \times p}$$

So, cij=aibj, the (i,j)-th entry of AB is equal to the scalar product of the i-th row of A with the j-th column of B
So the product AB is only defined if the number of columns in A equals the number of rows in B
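The row-times-column rule, as a short Python sketch (a naive triple loop via comprehensions, not from the course):

```python
def matmul(A, B):
    m, n, p = len(A), len(B), len(B[0])
    # columns of A must equal rows of B
    assert len(A[0]) == n
    # c_ij = sum over k of a_ik * b_kj
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert matmul(A, B) == [[19, 22], [43, 50]]
```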

Properties:

Transposition of a Matrix

Let $A = (a_{ij})$ be an $m \times n$ matrix. The transpose of $A$ is the $n \times m$ matrix $A^T$ defined by

$$(A^T)_{ij} = (A)_{ji} = a_{ji}, \quad \text{for all } i, j, \ 1 \leq i \leq n, \ 1 \leq j \leq m.$$

$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}_{m \times n} \quad \text{then} \quad A^T = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{pmatrix}_{n \times m}.$$

In other words, rows of A become columns of AT, and columns of A become rows of AT.

Example:

$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}^T = \begin{pmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{pmatrix}.$$

If A=AT, then the matrix A is said to be symmetric

Properties

Inverse of a Matrix

For numbers, the inverse "undoes" the effect. Eg: $$ 3 \times \frac{1}{3} = 1 $$
For matrices, the inverse works in the same way: it undoes the transformation that the matrix does:
So $$ AA^{-1} = I$$
Where I is the identity (the "do nothing") matrix

An n×n matrix A is said to be invertible if there exists an n×n matrix B such that

$$AB = I_n \quad \text{and} \quad BA = I_n,$$

where

$$I_n = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

is the n×n identity matrix. In this case, B is called the inverse of A, and is denoted B=A1.
Let A be a 2×2 matrix with adbc0.

A=(abcd)

Then A is invertible and

$$A^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$
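The $2 \times 2$ formula, as a Python sketch (function name mine); it swaps the diagonal, negates the off-diagonal, and divides by the determinant:

```python
def inverse_2x2(A):
    (a, b), (c, d) = A
    det = a * d - b * c
    assert det != 0, "matrix is not invertible"
    s = 1 / det
    # swap a and d, negate b and c, scale by 1/det
    return [[ s * d, -s * b],
            [-s * c,  s * a]]

A = [[2, 1], [1, 1]]            # det = 2*1 - 1*1 = 1
Ainv = inverse_2x2(A)
assert Ainv == [[1.0, -1.0], [-1.0, 2.0]]
```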

Properties

Let A and B be invertible n×n matrices. Then the following are true:

Powers of a Matrix

If A is a square matrix (n×n), then we can multiply it by itself:

$$A^2 = AA, \quad A^3 = AAA, \quad \dots$$

3. Systems of Linear Equations

A system of m equations in n unknowns can be written as:
And in matrix form:

$$A\vec{x} = \vec{b}$$

$$\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \\ \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \end{pmatrix}.$$

Where:

Augmented Matrix

The augmented matrix of system is

$$(A \mid \vec{b}) = \left(\begin{array}{ccc|c} a_{11} & \cdots & a_{1n} & b_1 \\ \vdots & & \vdots & \vdots \\ a_{m1} & \cdots & a_{mn} & b_m \end{array}\right)$$

Row Operations

There are 3 allowed row operations that let us simplify systems without changing the solution set, i.e. the resulting matrices are row equivalent

Row Scaling

$$R_i \mapsto \lambda R_i, \quad \lambda \neq 0$$

Row interchange (swap)

$$R_i \leftrightarrow R_j$$

Row replacement

$$R_i \mapsto R_i + \mu R_j, \quad i \neq j, \quad \mu \in \mathbb{R}$$

Gaussian Elimination

Consider we're trying to solve a system of simultaneous linear equations like

$$\begin{cases} x + 2y - z = 3 \\ 2x + y + z = 8 \\ 3x + y + 2z = 5 \end{cases}$$

Instead of solving one variable at a time using substitution, we can turn this into an augmented matrix.

So we instead have:

$$\left(\begin{array}{ccc|c} 1 & 2 & -1 & 3 \\ 2 & 1 & 1 & 8 \\ 3 & 1 & 2 & 5 \end{array}\right)$$

So each row is an equation, and each column, except the last one, is one variable's coefficients.

The idea now is to use the row operations listed above to reduce the augmented matrix to row echelon form (REF) or reduced row echelon form (RREF)

A pivot is the first non-zero entry in a row of a matrix

A matrix is in REF if:

A matrix is in RREF if:

Definition Gaussian Elimination is the method of solving linear equations by starting with $(A \mid \vec{b})$ and performing row operations to put $A$ into RREF, then reading off the solutions.

Each row in the matrix represents a plane in 3D (or a line in 2D or a hyperplane in higher dimensions). So Gaussian elimination is essentially rotating and scaling these planes until they're "aligned with the axes". The intersection point doesn't change, we just manipulate it to make it easier to understand

Reading solutions

Once our matrix is in REF, the augmented matrix will correspond to the solutions of the equation

$$\left(\begin{array}{ccc|c} 1 & 0 & 0 & \alpha \\ 0 & 1 & 0 & \beta \\ 0 & 0 & 1 & \gamma \end{array}\right) \implies x = \alpha, \quad y = \beta, \quad z = \gamma$$
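The whole elimination loop can be sketched in Python. This is my own toy Gauss-Jordan routine (not the course's algorithm statement), run on a hypothetical invertible system $x + y = 3$, $2x - y = 0$; each commented step is one of the three row operations:

```python
def solve(aug):
    """Gauss-Jordan on an augmented matrix [A | b] for invertible A."""
    n = len(aug)
    M = [row[:] for row in aug]
    for col in range(n):
        # row interchange: bring the largest pivot candidate up
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        # row scaling: make the pivot 1
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        # row replacement: clear the column everywhere else
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    # left half is now the identity; read solutions off the last column
    return [row[-1] for row in M]

x = solve([[1.0, 1.0, 3.0],
           [2.0, -1.0, 0.0]])   # x = 1, y = 2
assert all(abs(v - e) < 1e-9 for v, e in zip(x, [1.0, 2.0]))
```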

Reducing a matrix to RREF

Our goal is to transform a given matrix A into a version where:

$$A_{RREF} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ 0 & 0 & \cdots & 0 \end{pmatrix}$$

This means that:

To reduce a matrix to RREF:

Matrix inverses using row operations

Definition A system of linear equations is consistent iff the augmented matrix, when put into REF, has no rows of the form $(0 \cdots 0 \mid b)$ with $b \neq 0$

Each matrix has a unique matrix in RREF

If A is invertible, then there is a unique solution given by x=A1b

Given any x with Ax=b

$$\vec{x} = I_n\vec{x} = (A^{-1}A)\vec{x} = A^{-1}(A\vec{x}) = A^{-1}\vec{b}$$

Here, the key idea is that row operations that turn A into I will also turn I into A1.

Definition Elementary matrix - A matrix obtained from the identity matrix In by doing ONE row operation. Doing row operations is the same as multiplying by elementary matrices. One of the problem sheets goes through this proof

Each elementary matrix is invertible, and its inverse is an elementary matrix of the same type. I.e. if we multiply any matrix $A$ by $E$, we perform that same row operation on $A$.

If A is an n×n matrix, then:

If A and B are n×n matrices and AB=In, then both A and B are invertible and A1=B and B1=A

Algorithm for finding inverses using row operations

Suppose we have an n×n matrix A and we want to find A1 or show that it is not invertible

We first form an augmented matrix:

$$(A \mid I_n)_{n \times 2n}$$

Then we apply row operations to this matrix to bring the left half $A$ to $I_n$. If this works, then we get $(I_n \mid A^{-1})$, so $A$ is invertible. If it doesn't work, then $A$ is not invertible

Since we can get from $A$ to $I_n$ by applying finitely many row operations, there are elementary $n \times n$ matrices $E_1, E_2, \dots, E_k$ s.t.

$$E_k(\cdots E_2(E_1 A)) = I_n \implies (E_k \cdots E_1)A = I_n \implies BA = I_n \implies B = A^{-1}$$

where $B = E_k \cdots E_1$

Why the RHS equals A1

As each row operation can be represented by an elementary matrix E, when we do all of them in sequence, we're multiplying their product:

$$E_k E_{k-1} \cdots E_1 A = I \implies E_k E_{k-1} \cdots E_1 = A^{-1}$$

But in the augmented matrix form, we've been applying those same exact operations to the identity matrix on the RHS, so the RHS ends up being:

$$E_k E_{k-1} \cdots E_1 I = A^{-1}$$

And therefore:

$$(A \mid I_n) \to (I_n \mid A^{-1})$$

Example

Suppose we want to find the inverse of the matrix:

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$$

First we write it in the augmented form:

$$\left(\begin{array}{cc|cc} 2 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{array}\right)$$

We want to scale so that the first pivot is 1, so we do $R_1 \mapsto R_1 - R_2$. Remember to perform the same row operations on the RHS as well

$$\left(\begin{array}{cc|cc} 1 & 0 & 1 & -1 \\ 1 & 1 & 0 & 1 \end{array}\right)$$

Next, we want to eliminate below the pivot: $R_2 \mapsto R_2 - R_1$

$$\left(\begin{array}{cc|cc} 1 & 0 & 1 & -1 \\ 0 & 1 & -1 & 2 \end{array}\right)$$

Now, as the LHS is I2, this means the RHS is by definition A1. And we can check this by computing:

$$AA^{-1} = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = I_2$$
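That final check is easy to script; a quick Python verification of the worked example (not from the notes):

```python
# multiply A by the inverse found in the example and check we get I_2
A    = [[2, 1], [1, 1]]
Ainv = [[1, -1], [-1, 2]]

product = [[sum(A[i][k] * Ainv[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
assert product == [[1, 0], [0, 1]]
```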

Rank of a Matrix

Definition Given a m×n matrix A, the rank of A is the number of non-zero rows in any REF of A

The rank essentially measures how many independent rows or columns a matrix has. So for an $n \times n$ matrix, having rank $n$ means that all its columns and rows are linearly independent. So in this case:

Therefore, if $A$ is an $n \times n$ matrix, then $A$ is invertible $\iff \mathrm{rank}(A) = n$
This is also why if there is a zero row in an RREF form, the inverse of a matrix does not exist

Properties of a Rank

Let A be a m×n matrix and B be any matrix:

4. Determinants

Mn(R) - the set of all n×n matrices with real entries. We don't start by just saying what the determinant is, but what kind of function it must be. So the axiomatic definitions are given below. Note that these are slightly less formal than what's in the official lecture notes, but that's mostly down to making it easier and more intuitive to understand.

Definition
A function D:Mn(R)R is a determinant if the following 3 conditions hold:

1. Linearity in each row

If we multiply a row by a scalar $\lambda$, then the determinant scales by $\lambda$.
And the determinant is additive in each row: if one row is a sum of two rows (with all other rows fixed), the determinant splits as the sum of the two corresponding determinants.

$$D(\dots, \lambda\vec{r}_i, \dots) = \lambda D(\dots, \vec{r}_i, \dots)$$
$$D(\dots, \vec{r}_i + \vec{r}_i{}', \dots) = D(\dots, \vec{r}_i, \dots) + D(\dots, \vec{r}_i{}', \dots)$$

2. Alternating

If $A \in M_n(\mathbb{R})$ has two equal rows, then $D(A) = 0$

So if two rows are equal, then the determinant is 0
Swapping two rows flips the sign of the determinant

3. Normalisation

$$D(I_n) = 1, \quad I_n = n \times n \text{ identity matrix}$$

Using the axioms above, we can further prove the row operations and their effects on matrices

So

$$D(B) = \alpha D(A)$$

where $\alpha$ depends on the type of row operation.

Determinants of Upper Triangular Matrices

Definition

An upper triangular matrix is a matrix where all entries below the diagonal are zero:

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ 0 & a_{22} & a_{23} & \cdots & a_{2n} \\ 0 & 0 & a_{33} & \cdots & a_{3n} \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a_{nn} \end{pmatrix}$$

So if A is an upper triangular matrix, then the determinant of A is the product of the diagonal entries of A:

$$D(A) = a_{11} a_{22} a_{33} \cdots a_{nn}$$

Intuitively, the matrix A represents a transformation that:

Each diagonal entry $a_{ii}$ is essentially a scaling of one axis direction. So if we started with the identity matrix (which has $D = 1$) and just scaled each axis by those numbers, the volume (which is basically what a determinant is) would be the product of all the diagonal entries. And shearing doesn't change the determinant, as it's the same idea as adding a multiple of one row to another.

A more formal proof can be done by using row operations and transforming the above matrix into In, and all those transformations involve row replacement options only.

Determinant of any square matrix

Once we have the facts above, we now have an algorithm to compute the determinant of any square matrix:

To compute D(A):

Notation wise, determinant of A = |A|=detA

Proof for a 2×2 matrix

For

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \quad \det(A) = ad - bc$$

Geometrically speaking, in $\mathbb{R}^2$, the two column vectors of $A$ span a parallelogram, and the area of that parallelogram is $|ad - bc|$

Now, for the more formal proof, we can compute det(A) by performing row operations until we make A upper triangular, and then take the product of the diagonal entries

We can do the row operation $R_2 \mapsto R_2 - \frac{c}{a}R_1$ (assuming $a \neq 0$), which gives us:

$$B = \begin{pmatrix} a & b \\ 0 & d - \frac{c}{a}b \end{pmatrix}$$

From the property of determinants, the row replacement does not change the determinant, so det(B)=det(A). Now as B is upper triangular, we can calculate the determinant by:

$$\det(B) = a\left(d - \frac{c}{a}b\right) = ad - bc$$

Therefore, det(A)=adbc

The determinant of a matrix with integer entries is always an integer

Existence and Uniqueness

$\forall n \in \mathbb{N}$, $\exists$ a unique determinant $D: M_n(\mathbb{R}) \to \mathbb{R}$ satisfying the axiomatic definitions of determinants

This will be proved in Linear Algebra II by giving a different definition of determinant. So the lesson is to never trust anything.

Determinants and Invertibility

An $n \times n$ matrix $A$ is invertible $\iff \det(A) \neq 0$

Proof

Starting with the $\implies$ direction:

If $A$ is invertible, then $\exists A^{-1}$ s.t. $AA^{-1} = I_n$

Now we can take the determinant of both sides

$$\det(AA^{-1}) = \det(I_n) \implies \det(A)\det(A^{-1}) = 1 \implies \det(A) = \frac{1}{\det(A^{-1})}$$

This means that det(A)0 as 1/0 is not defined

Now for the $\impliedby$ direction.
For a bit of intuition, recall that the determinant is geometrically the scaling factor of volume when $A$ acts as a linear transformation. So if $\det(A) = 0$, then the volume collapses to a lower dimension, and that transformation is not reversible.

Now for the algebraic proof. We know that when we apply Gaussian elimination to turn A into an upper triangular matrix (U):

$$A = E_k \cdots E_2 E_1 U$$

Each Ei is an elementary matrix which is invertible. So we have that:

$$\det(A) = (\det(E_k) \cdots \det(E_1))\det(U)$$

Now, if $\det(A) \neq 0$, then $\det(U) \neq 0$.
But $\det(U)$ is the product of the diagonal entries, so none of them can be zero. This means that all pivot positions in Gaussian elimination are non-zero, which means we can reduce $A$ all the way down to $I_n$. Hence $A$ is invertible

There is another proof by contrapositive, given in the lectures notes.
So if we can prove that A is not invertible, then det(A)=0, that will be same as proving the original statement

So if $A$ is not invertible, it means $\mathrm{rank}(A) < n$. This means that any REF form $B$ of $A$ has a row of zeroes. But $A$ can be reduced to this REF form via a finite sequence of row operations, so $\det(B) = \beta \det(A)$ for some $\beta \neq 0$.

$$\det(A) = \frac{1}{\beta}\det(B) = 0 \quad \text{since } B \text{ has a zero row}$$

And if a matrix has a zero row, the determinant will be 0, as it's the product of the diagonals of an upper triangular matrix.

Determinant using Cofactors

Now that we know how to calculate the determinant of a $2 \times 2$ matrix, we want a formula that still captures the same alternating and linear properties. The cofactor expansion can be thought of as a recursive way to do this: it breaks a big determinant down into smaller ones (minors)

So if A is an n×n matrix, then det(A) is defined in terms of (n1)×(n1) determinants.

For some definitions, consider an n×n matrix A=(aij)
Then:

$$C_{ij} = (-1)^{i+j} M_{ij}$$

So to find the determinant, we pick a row/column, say row 1, then for each element a1j in that row:

So formally, if

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$

Then

$$\det(A) = \sum_{j=1}^{n} a_{1j} C_{1j} = a_{11} C_{11} + a_{12} C_{12} + \dots + a_{1n} C_{1n}$$

Where

$$C_{1j} = (-1)^{1+j} \det(M_{1j})$$

And M1j is the minor obtained by deleting row 1 and column j

So the formula for any n×n matrix A=(aij), expand along the i-th row:

$$\det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det(M_{ij})$$

The sign pattern comes from one of the determinant axiom which tells us that if we swap two rows, the determinant changes the sign.

As we're expanding the determinant along a row, we're sort of "isolating" one element aij and its minor. Intuitively, we're bringing aij up to position by moving rows/columns around, and to account for that movement, we multiply by (1)i+j

This is quite efficient when we have a matrix with lots of zero entries
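The recursion is short enough to write out; a Python sketch of Laplace expansion along the first row (my own naive implementation, exponential-time and only for small matrices), checked against the $2 \times 2$ formula and the diagonal-product rule:

```python
def det(A):
    """Cofactor (Laplace) expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # minor M_1j: delete row 1 and column j
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        # (-1)^(1+j) in 1-indexed notation is (-1)^j for 0-indexed j
        total += (-1) ** j * A[0][j] * det(minor)
    return total

assert det([[1, 2], [3, 4]]) == -2                     # ad - bc
assert det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]) == 24    # product of diagonal
```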

#explainfuture

Determinant of a Transpose

For any square matrix A,

$$\det(A^T) = \det(A)$$

What this means is that transposing a matrix (i.e. flipping rows and columns) does not change its determinant

Intuitively, if we think determinant as the volume scaling factor, then transposing it just changes our point of view, so instead of looking at how A acts on row vectors, you look at how it acts on column vectors. Hence, the volume scaling remains the same.

The other way to think about it uses the fact that (for a triangular matrix) the determinant is the product of the diagonal entries, and when we transpose a square matrix, the diagonal stays the same, hence the determinant is the same.

Determinant of a Product

If A,B are n×n matrices, then det(AB)=det(A)det(B)

Geometrically again, scaling a volume by a factor of det(A) and scaling it by det(B) is the same as scaling it by det(A)det(B)

More formally speaking, we can think of every invertible matrix A as a product of elementary matrices:

A=EkEk1E1

And as each elementary matrix corresponds to a single row operation, and because we know how determinant changes under row operations:

$$\det(E_i A) = \det(E_i)\det(A)$$

and since every invertible matrix can be built from the elementary matrices, we have that:

$$\det(AB) = \det(A)\det(B)$$

The proof in the notes is bit more detailed and uses the 3 cases of the elementary matrices, but the idea is the same.

Powers of a matrix

$$\det(A^k) = (\det A)^k, \quad k \in \mathbb{Z} \qquad \det(A^{-1}) = \frac{1}{\det(A)}$$

The proof is rather trivial, and can be proved by induction if wanted as an exercise to the reader

Inverting a Matrix using Cofactors

This section is skipped as it will not be examined

5. Linear Transformations

This is where the proper linear algebra starts. A linear transformation is just taking a function from one vector space to another, following a couple of specific rules, namely additivity and scaling.

Definition Let $m, n \in \mathbb{N}$, and let $T: \mathbb{R}^n \to \mathbb{R}^m$ be a function. Then $T$ is a linear transformation if $\forall \vec{u}, \vec{v} \in \mathbb{R}^n$ and $\lambda \in \mathbb{R}$ it satisfies two properties, additivity and homogeneity:

$$T(\vec{u} + \vec{v}) = T(\vec{u}) + T(\vec{v}), \qquad T(\lambda\vec{u}) = \lambda T(\vec{u})$$

Intuitively, a linear transform never bends, curves, or shifts space, it can only scale, rotate, shear, reflect, etc. So essentially, if the origin doesn't stay fixed, then the transformation is not linear.

The most important thing is probably this: every linear transformation is a matrix multiplication.
So if $A$ is an $m \times n$ matrix, then

$T: \mathbb{R}^n \to \mathbb{R}^m$ defined by $T(\vec{x}) = A\vec{x}$, $\forall \vec{x} \in \mathbb{R}^n$, is a linear transformation

So every linear transform, is a multiplication by some matrix A. So if we know what the transformation does to the basis vectors, we know what it does to every vector in the space. The proof of this is trivial, and is explained in the notes.

Now that we know that T:RnRm, is a linear transform, then standard basis vectors of Rn can help us determine the whole transformation.

Consider the standard basis vectors in Rn :

$$\vec{e}_i = (0, \dots, 0, 1, 0, \dots, 0)$$

where the 1 is in the $i$-th entry

And we know that any vectors x can be written in terms of the basis vectors:

$$\vec{x} = x_1\vec{e}_1 + x_2\vec{e}_2 + \dots + x_n\vec{e}_n$$

So if $T$ is linear, then

$$T(\vec{x}) = x_1 T(\vec{e}_1) + x_2 T(\vec{e}_2) + \dots + x_n T(\vec{e}_n) = \begin{pmatrix} T(\vec{e}_1) & \cdots & T(\vec{e}_n) \end{pmatrix} \vec{x} = A\vec{x}$$

Hence just showing the basis vectors, is enough to define the entire transformation. It's like taking the unit square and seeing how it changes under a matrix.

Linear Transformation in R2

Every linear transformation in R2 is completely determined by: T(e1),T(e2), where

$$\vec{e}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \vec{e}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

So to understand a transformation, we need to think about what it does to e1,e2

Rotation about the origin

Let T:R2R2 be the rotation by θ[0,2π) anti-clockwise about the origin:

Now think about what happens to e1=(1,0), when rotating, it lands at (cosθ,sinθ)
And when $\vec{e}_2 = (0, 1)$ is rotated, it lands at $(\cos(\theta + \pi/2), \sin(\theta + \pi/2))$, which simplified is $(-\sin\theta, \cos\theta)$.
Therefore the matrix of rotation is given by:

$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

We can check that rotation preserves area: the determinant of this matrix is in fact $\cos^2\theta + \sin^2\theta = 1$

And therefore the matrix of linear transformation is given by:

$$T\begin{pmatrix} x \\ y \end{pmatrix} = A\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{pmatrix}, \quad \begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2$$
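The rotation matrix in a quick Python sketch (my own helper, not from the notes): rotate $\vec{e}_1$ by $\pi/2$ and confirm it lands at $(0, 1)$, and confirm the determinant is 1:

```python
import math

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s,  c]]

theta = math.pi / 2
R = rotation(theta)
# e1 = (1, 0) lands at (cos theta, sin theta); for theta = pi/2 that's (0, 1)
x, y = 1.0, 0.0
Tx = (R[0][0] * x + R[0][1] * y, R[1][0] * x + R[1][1] * y)
assert abs(Tx[0]) < 1e-12 and abs(Tx[1] - 1.0) < 1e-12
# determinant is cos^2 + sin^2 = 1, so area is preserved
assert abs(R[0][0] * R[1][1] - R[0][1] * R[1][0] - 1.0) < 1e-12
```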

Reflection in a line through origin

Note that a reflection keeps points on the line fixed, and flips points perpendicular to the line. There is a diagram in the notes which makes it easy to see why it's $2\theta$, but another way to think about it is to consider the basis vectors and how they change.

A reflection in the line at angle $\theta$ is the same as a rotation by $-\theta$ to align the line with the x axis, then a reflection across the x axis, then a rotation back by $\theta$

The matrix is therefore given by:

$$T = \begin{pmatrix} \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & -\cos(2\theta) \end{pmatrix}$$

The determinant of this matrix, when calculated, is 1

Stretch/shrink

A scaling just has the effect of multiplying axes. So the matrix here is self-explanatory:

$$T = \begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}$$

T stretches/shrinks the plane by λ in the direction of the x axis and by μ in the y axis

Shear

$$\begin{pmatrix} 1 & \lambda \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 0 \\ \mu & 1 \end{pmatrix}$$

It keeps one of the axes fixed, and moves all other points parallel to that axis.

Composition of Linear Transformation

Composition of transformations is the same as what it's like in functions:

$$(S \circ T)(\vec{x}) = S(T(\vec{x}))$$

So we apply T first, then S. It's just like functions, because transformation is a function.

Matrix multiplication is the same as composing linear transformations. So if T(x)=Ax and S(x)=Bx, then

$$(S \circ T)(\vec{x}) = S(T(\vec{x})) = B(A\vec{x}) = (BA)\vec{x}$$

So the matrix of the composite transformation $S \circ T$ is $BA$. And that's why matrix multiplication is defined the way it is!

The composition of a linear transformation is a linear transformation. It can be proved using the definition of composition and the transformation definitions above

Finding linear transforms

Inverses of Linear Transformations

Definition A linear transformation T:RnRn is invertible if there exists a linear transformation S s.t.:

$$S \circ T = T \circ S = \mathrm{Id}$$

Like any inverse, an inverse transformation just undoes the original transformation.

Proof of the inverse of linear transforms

Think back to calculus: an inverse exists $\iff T$ is a bijection.

So suppose T is a bijection, and now we want to show that it has an inverse linear transformation.

So let S=T1

And now we need to show that S is linear so that satisfies the definition of a linear transformation, ie. we need to show that:

$$S(\vec{u} + \vec{v}) = S(\vec{u}) + S(\vec{v}), \qquad S(\lambda\vec{u}) = \lambda S(\vec{u})$$

So let a=S(u) and b=S(v), which lets us write T(a)=u and T(b)=v, and now we use this substitution.

Consider $T(a+b)$; using the linearity of $T$:

$$T(a+b)=T(a)+T(b)=u+v \qquad T(S(u+v))=u+v$$

As S is the inverse of T, we have:

T(S(u+v))=T(a+b)

As $T$ is a bijection, we know that it's also an injection, so we have that $T(x)=T(y)\implies x=y$. And by that logic:

$$S(u+v)=a+b=S(u)+S(v)$$

Now... for the scalar multiplication, consider T(λa)
As T is linear:

$$T(\lambda a)=\lambda T(a)=\lambda u \qquad T(S(\lambda u))=\lambda u=T(\lambda a)$$

Using the injectivity of T we have:

$$S(\lambda u)=\lambda a=\lambda S(u)$$

Invertibility

Let $T:\mathbb{R}^n\to\mathbb{R}^n$ with matrix $A$. Then $T$ is invertible $\iff$ $A$ is invertible, and if $T$ is invertible, then the matrix of $T^{-1}$ is $A^{-1}$

As this is an if-and-only-if statement, we need to prove both directions

For the forward direction, suppose $T$ is invertible; then there exists $S$ s.t.

$$S\circ T=\text{Id},\qquad T\circ S=\text{Id}$$

So let B be the matrix of S, and A be the matrix of T

By composition of linear transformations, the matrix of $S\circ T$ is $BA$, and the matrix of Id is $I_n$

So since $S\circ T=\text{Id}$:

BA=In

And since $T\circ S=\text{Id}$:

AB=In

Therefore we have that:

AB=BA=In

And by definition A is invertible, and its inverse is B

Now for the converse direction:

We assume that A is invertible, and now we need to show that T is invertible

Let S(x)=A1x, and now we compute the composition ST:

$$(S\circ T)(x)=S(T(x))=S(Ax)=A^{-1}(Ax)=(A^{-1}A)x=I_nx=x$$

So we know that $S\circ T=\text{Id}$

Now to compute the other composition to check for invertibility:

$$(T\circ S)(x)=T(S(x))=T(A^{-1}x)=A(A^{-1}x)=(AA^{-1})x=I_nx=x$$

So $T\circ S=\text{Id}$

And since both compositions give the identity, $S$ is the inverse of $T$; and since $S$ is linear and given by $A^{-1}$, the matrix of $T^{-1}$ is $A^{-1}$
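The two compositions above can be checked on a concrete matrix. A small pure-Python sketch (the matrix and its inverse here are illustrative, not from the notes):

```python
def mat_vec(A, x):
    """Apply a matrix (nested list of rows) to a vector."""
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

A = [[2, 1], [1, 1]]        # matrix of T; det = 2*1 - 1*1 = 1, so invertible
A_inv = [[1, -1], [-1, 2]]  # its inverse: the matrix of S = T^{-1}
x = [3, 4]

st = mat_vec(A_inv, mat_vec(A, x))  # (S∘T)(x) = A^{-1}(Ax)
ts = mat_vec(A, mat_vec(A_inv, x))  # (T∘S)(x) = A(A^{-1}x)
```

Both compositions return the original vector `x`, matching $S\circ T=T\circ S=\text{Id}$.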

6. Subspaces of Rn

Vector subspaces

Intuitively speaking, a vector subspace of $\mathbb{R}^n$ is just a subset of $\mathbb{R}^n$ that behaves like a vector space on its own: adding vectors in it, or scaling them, never takes you outside the subset. In particular, it must include the origin.

It has three properties:

- it contains the zero vector
- it is closed under addition: if $u,v$ are in the subspace, so is $u+v$
- it is closed under scalar multiplication: if $u$ is in the subspace and $\lambda\in\mathbb{R}$, then $\lambda u$ is too

So anytime we're given a subset, we just need to check the properties above to see if it's a subspace of $\mathbb{R}^n$ or not

Null Spaces

Definition

Let T:RnRm be a linear transformation. Then the null space (kernel) of T is the subspace Rn defined by:

$$N(T)=\{x\in\mathbb{R}^n \mid T(x)=0_{\mathbb{R}^m}\}$$

I.e. it's basically the set of all vectors that get mapped to the zero vector. Similarly, for an $m\times n$ matrix $A$:

$$N(A)=\{x\in\mathbb{R}^n \mid Ax=0_{\mathbb{R}^m}\}$$

The null space is a subspace of Rn, and to show that, we just need to show that it satisfies the three properties:

And by linearity of T and definition of a linear transformation, we can show that the null space/kernel is always a subspace

Invertibility and Null spaces

$$A\text{ is invertible}\iff\ker(A)=\{0\}$$

If the only solution to $Ax=0$ is $x=0$, then $A$ is invertible. Another way to think about it: if there is any non-zero vector $x\neq0$ with $Ax=0$, then $A$ collapses that direction to zero and cannot be invertible.
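A concrete instance of the second point, as a pure-Python sketch (the matrix is an illustrative singular example):

```python
A = [[1, 2], [2, 4]]  # second row is twice the first, so det(A) = 0
v = [2, -1]           # a non-zero vector in the kernel

Av = [A[0][0] * v[0] + A[0][1] * v[1],
      A[1][0] * v[0] + A[1][1] * v[1]]
# Av is the zero vector, so ker(A) ≠ {0} and A cannot be invertible
```

Since the non-zero direction `v` is sent to $0$, no transformation can undo $A$: two different inputs ($0$ and `v`) land on the same output.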

Linear Span

Linear span is just the set of all linear combinations of vectors, and this is always a subspace because linear combination rules automatically satisfy the subspace axioms

$$\text{span}\{v_1,\dots,v_k\}=\{c_1v_1+\cdots+c_kv_k \mid c_1,\dots,c_k\in\mathbb{R}\}$$

Range and Column Space

The column space of a matrix $A$ is the span of its columns $v_1,\dots,v_k$:

$$\text{col}(A)=\text{span}\{v_1,\dots,v_k\}$$

And the range is the subset of Rm of a transformation T:RnRm:

$$R(T)=\{T(x) \mid x\in\mathbb{R}^n\}$$

And the range is also a subspace, which can be tested using the properties of subspace again

Linear Independence

Definition

Vectors are linearly independent if:

$$c_1v_1+\cdots+c_kv_k=0 \implies c_1=c_2=\cdots=c_k=0$$

What this means is that no vector can be written as a combination of the others. So if the columns of a matrix are linearly dependent, at least one column is a combination of the other columns.

The trivial solution just means the all-zero solution, so a non-trivial solution means at least one coefficient is non-zero. Vectors are linearly independent $\iff$ only the trivial solution exists. If the only way to get $0$ is to make all the coefficients $0$, then the vectors all point in genuinely different directions.

Conversely, if we can combine the vectors with some non-zero coefficients to make $0$, then one of them must be a combination of the others.

So for a square matrix $A$, all of the conditions below are equivalent:

- $A$ is invertible
- the columns of $A$ are linearly independent
- $\ker(A)=\{0\}$
- $\det(A)\neq0$

Test for linear independence

To see if vectors are linearly independent or not, put them as the rows of a matrix, reduce it to REF, and count the non-zero rows: the vectors are independent $\iff$ every row is non-zero (equivalently, the rank equals the number of vectors)
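The REF test can be sketched in pure Python: row-reduce and count the pivots. This is an illustrative implementation, not the algorithm as stated in the lecture notes:

```python
def rank(rows, eps=1e-9):
    """Reduce a copy of the matrix to row echelon form and count
    the non-zero rows (= number of pivots)."""
    M = [row[:] for row in rows]
    r = 0  # next pivot row
    for c in range(len(M[0])):
        # find a row at or below r with a non-zero entry in column c
        piv = next((i for i in range(r, len(M)) if abs(M[i][c]) > eps), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        # eliminate the entries below the pivot
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

dependent = [[1, 0, 2], [0, 1, 1], [1, 1, 3]]    # third row = first + second
independent = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # standard basis vectors
```

For `dependent`, elimination produces a zero row, so the rank (2) is less than the number of vectors (3) and the set is linearly dependent.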

Bases

A basis is just a minimal set of vectors that can generate (span) the whole space with no redundancy. So they give you all possible directions in your space, with the smallest possible set of directions. A bit like choosing the axes of the space.

So in R2, the standard basis would be e1=(1,0) and e2=(0,1). And that means any vector (x,y) can be made by a combination of these two basis vectors

Definition
A collection of vectors $v_1,\dots,v_k$ forms a basis of a vector space $V$ if:

- they span $V$
- they are linearly independent

Bases are like the coordinate system for a vector space. The other important thing about a basis is that representations in it are unique: every vector $x$ has a unique expression as a linear combination of the basis vectors

7. Eigenvectors, Eigenvalues and their applications

A bit of a diversion, as in my notes I will try and focus on the intuitive idea and then the definition.

So consider a linear transformation T. Most vectors are rotated, stretched, squished, etc. So when we apply a matrix, most vectors change direction. But some vectors lie on their own span and don't change directions. I.e. they may stretch, shrink, flip, but they don't turn. And those no turning directions are what we call eigenvectors. And the amount they stretch/shrink by is the eigenvalue.

Definition
A number $\lambda\in\mathbb{R}$ is said to be an eigenvalue of $A$ if there is a non-zero vector $v\in\mathbb{R}^n$ s.t.

Av=λv

$A$ is the matrix representation of the transformation, $v$ is the eigenvector, and $\lambda$ is the eigenvalue

So eigenvectors are the directions a transformation preserves, and eigenvalues are the scaling factors for those directions.

Determinant criteria

Notice how the left-hand side of the equation is a matrix-vector product, but the right side is scalar multiplication. To turn the RHS into a matrix-vector product too, we can write $\lambda v=\lambda I_n v$, where $I_n$ is the identity matrix.

So we have:

Av=λIv

And we can rearrange and factor out v to get:

$$(A-\lambda I)v=0$$

When solving the equation above, all we're doing is looking for a non-zero vector that gets sent to zero by the linear transformation $A-\lambda I$

The reason for this is that the kernel of $A-\lambda I$ is just the set of directions that get sent to $0$, i.e. the directions in which the transformation collapses into a lower dimension. It's answering the question: "for what values of $\lambda$ does the equation have a non-zero solution?"

And this happens when the determinant is 0:

$$\lambda\in\mathbb{R}\text{ is an eigenvalue of }A\iff\det(A-\lambda I_n)=0$$

So an eigenvalue exists when the shifted matrix $A-\lambda I$ loses invertibility

Characteristic polynomial

$p_A(\lambda)=0$ (i.e. $\det(A-\lambda I_n)=0$) is the characteristic equation of $A$, where $p_A(\lambda)=\det(A-\lambda I_n)$ is the characteristic polynomial

There's a reason it's called the characteristic polynomial: it captures characteristic data of the matrix, since its roots are exactly the eigenvalues.

This basically helps us solve for the values of $\lambda$ for which the transformation collapses, which is all an eigenvalue is. The roots of this polynomial are exactly the values of $\lambda$ where $A-\lambda I$ is not invertible. So $p_A(\lambda)=0\iff\lambda$ is an eigenvalue
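For a $2\times2$ matrix the characteristic polynomial is $\lambda^2-\operatorname{tr}(A)\lambda+\det(A)$, so the eigenvalues come straight from the quadratic formula. A pure-Python sketch (the example matrix is illustrative):

```python
import math

def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial
    det(A - λI) = λ² - tr(A)·λ + det(A) of a 2x2 matrix.
    Assumes the roots are real (always true for symmetric A)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

A = [[2, 1], [1, 2]]
# characteristic polynomial: (2-λ)² - 1 = λ² - 4λ + 3 = (λ-3)(λ-1)
lams = eigenvalues_2x2(A)
```

The roots $3$ and $1$ are exactly the values of $\lambda$ at which $A-\lambda I$ becomes singular.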

Upper triangular matrices

Say we have an upper triangular matrix:

$$A=\begin{pmatrix}a_1 & * & *\\ 0 & a_2 & *\\ 0 & 0 & a_3\end{pmatrix}$$

And now consider the matrix AλI

$$A-\lambda I=\begin{pmatrix}a_1-\lambda & * & *\\ 0 & a_2-\lambda & *\\ 0 & 0 & a_3-\lambda\end{pmatrix}$$

This is still an upper triangular matrix! And the determinant of this matrix is the product of its diagonal entries.

$$\det(A-\lambda I)=(a_1-\lambda)(a_2-\lambda)\cdots(a_n-\lambda)$$

And if we set this equal to 0, the roots will be just: $\lambda=a_1,a_2,\dots,a_n$

Therefore, the diagonal entries of an upper triangular matrix are the eigenvalues of that matrix!
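This is easy to confirm numerically, since the determinant of a triangular matrix is just the product of its diagonal. A pure-Python sketch on an illustrative matrix:

```python
def det_triangular(U):
    """Determinant of a triangular matrix = product of its diagonal entries."""
    p = 1
    for i in range(len(U)):
        p *= U[i][i]
    return p

A = [[2, 5, 1],
     [0, 3, 4],
     [0, 0, 7]]

# det(A - λI) should vanish exactly at the diagonal entries 2, 3, 7
results = []
for lam in (2, 3, 7):
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(3)]
               for i in range(3)]  # A - λI is still upper triangular
    results.append(det_triangular(shifted))
```

Each shifted determinant is $0$, so the diagonal entries are precisely the eigenvalues.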

Applications in Google’s Page Ranking system

Read the MathsRant blog page!

Symmetric Matrices

A matrix $A$ is symmetric if $A=A^T$. Symmetric matrices correspond to transformations that don't twist space, only stretch, squash or reflect it

Let $A$ be an $n\times n$ symmetric matrix with real entries. Then:

- all the eigenvalues of $A$ are real
- $\mathbb{R}^n$ has a basis consisting of eigenvectors of $A$ (an eigenbasis)

We can also think of the dot product as a matrix product: $v\cdot w=v^Tw$, a scalar

This is a result of Spectral Theorem, but the proof was omitted.

So as $I_n$ is a symmetric matrix, the standard basis vectors are all eigenvectors of it (in fact every non-zero vector is). An eigenbasis is just a basis of the space consisting entirely of eigenvectors of $A$. So if $A$ has an eigenbasis, then every vector can be written as a combination of eigenvectors

We want to find linearly independent eigenvectors because a basis must by definition be linearly independent, and geometrically, we need enough directions to describe the whole space. If the eigenvectors are dependent, they don't span enough directions.

Multiplicity

Multiplicity just tells us how many times an eigenvalue appears as a root of the characteristic polynomial. This is algebraic multiplicity.
E.g. $(\lambda-2)^3(\lambda+1)$ has eigenvalue $2$ with algebraic multiplicity $3$

Then there is geometric multiplicity, which tells us the dimension of the eigenspace (number of independent eigenvectors).

The key rule is that $1\le$ geometric multiplicity $\le$ algebraic multiplicity

However for symmetric matrices, the geometric multiplicity = algebraic multiplicity

Orthogonality of eigenvectors - Thm 7.27

For symmetric matrices - eigenvectors corresponding to distinct eigenvalues are orthogonal.

Recall that two vectors being orthogonal means they meet at right angles and point in completely independent directions, so movement in one direction has no component in the other. For symmetric matrices, the transformation therefore stretches space along perpendicular directions.
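The theorem can be seen on a small symmetric matrix. A pure-Python check (the matrix and eigenvectors are an illustrative example, not from the notes):

```python
A = [[2, 1], [1, 2]]  # symmetric, with eigenvalues 3 and 1
v1 = [1, 1]           # eigenvector for λ = 3
v2 = [1, -1]          # eigenvector for λ = 1

# confirm Av = λv for each
Av1 = [A[0][0] * v1[0] + A[0][1] * v1[1], A[1][0] * v1[0] + A[1][1] * v1[1]]
Av2 = [A[0][0] * v2[0] + A[0][1] * v2[1], A[1][0] * v2[0] + A[1][1] * v2[1]]

dot = v1[0] * v2[0] + v1[1] * v2[1]  # 0: the eigenvectors are orthogonal
```

The eigenvalues $3\neq1$ are distinct, and indeed $v_1\cdot v_2=0$.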

Diagonalisation of a Matrix - Thm 7.30

At its core, diagonalisation is trying to answer a fairly simple question: can we find a coordinate system where the linear transformation acts in the simplest possible way? i.e. just stretching along axes without rotating or shearing?

So a diagonal matrix D is a matrix s.t.

$$D=\begin{pmatrix}\lambda_1 & 0 & 0\\ 0 & \lambda_2 & 0\\ 0 & 0 & \lambda_3\end{pmatrix}\qquad D\begin{pmatrix}x\\y\\z\end{pmatrix}=\begin{pmatrix}\lambda_1x\\\lambda_2y\\\lambda_3z\end{pmatrix}$$

Geometrically, each coordinate direction moves independently, and no direction interferes with another. Most matrices are not in this form, so diagonalisation is about changing to a basis that the transformation works well with: the eigenbasis

Recall that $Av=\lambda v$.
Applying $A$ to $v$ just stretches $v$, so eigenvectors are the directions where the matrix already behaves diagonally. So if we choose all our basis vectors to be eigenvectors, then the matrix must be diagonal in that basis.

Change of basis

Suppose that v1,...vn are all eigenvectors of A and they form a basis

Let P be a matrix with these eigenvectors as columns:

$$P=\begin{pmatrix}v_1 & v_2 & \cdots & v_n\end{pmatrix}$$

Then all $P$ does is convert coordinates in the eigenbasis into standard coordinates. So $P^{-1}$ converts standard coordinates into eigenbasis coordinates

Now consider applying $A$ to a vector in standard coordinates: convert into eigenbasis coordinates with $P^{-1}$, scale each coordinate with $D$, then convert back with $P$

Writing this as a composite transformation, we have that:

$$A=PDP^{-1}\iff D=P^{-1}AP$$

So finding a diagonal matrix just means finding independent directions (perpendicular, in the symmetric case) where the transformation acts by pure scaling.

Powers

Diagonalising also helps us compute large powers:

$$A=PDP^{-1}\implies A^k=PD^kP^{-1},\qquad D^k=P^{-1}A^kP=\begin{pmatrix}\lambda_1^k & 0 & 0\\ 0 & \lambda_2^k & 0\\ 0 & 0 & \lambda_3^k\end{pmatrix}$$

This is also why symmetric matrices always diagonalise: they have enough linearly independent eigenvectors
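The power trick can be verified in pure Python on the symmetric matrix $A=\begin{pmatrix}2&1\\1&2\end{pmatrix}$ (eigenvalues $3$ and $1$, eigenvectors $(1,1)$ and $(1,-1)$); this is a sketch with an illustrative matrix, not an example from the notes:

```python
def mat_mul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, 1], [1, 2]]               # symmetric, eigenvalues 3 and 1
P = [[1, 1], [1, -1]]              # columns are the eigenvectors
P_inv = [[0.5, 0.5], [0.5, -0.5]]  # inverse of P

k = 5
Dk = [[3**k, 0], [0, 1**k]]        # D^k: just raise the diagonal to the k

Ak = mat_mul(P, mat_mul(Dk, P_inv))  # A^k = P D^k P^{-1}

# compare with k-1 direct matrix multiplications
direct = A
for _ in range(k - 1):
    direct = mat_mul(direct, A)
```

Only the two diagonal entries get raised to the power $k$; the change of basis does the rest.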

8. Orthogonal sets and Quadratic forms

Orthogonal vectors are essentially directions that don't overlap. So they're independent in the geometric sense. Think of the x and y axis for example.

Definition A set of vectors {v1,...vn}Rn is orthogonal if

$$v_i\cdot v_j=0\quad\forall\, i\neq j$$

Therefore, we can see that non-zero orthogonal vectors are automatically linearly independent. This will help us compute projections and eigenvectors, and simplify quadratic forms

Definition An orthonormal set is the same as orthogonal, but now each vector has a unit length, i.e. we have normalised the vectors

So a set is orthonormal if:

$$v_i\cdot v_j=0\ \ \forall\, i\neq j \quad\text{and}\quad \|v_i\|=1\ \ \forall\, i$$

Orthonormal is more useful than just orthogonal because the coordinate of $x$ along $v_i$ is simply $x\cdot v_i$,
so we just have to deal with dot products instead of worrying about scaling

Orthogonal matrices

Intuitively, an orthogonal matrix (thinking in terms of transformations) is a rotation or a reflection of space

So the lengths and angles are preserved

Definition
A matrix PRn×n is orthogonal if:

PTP=I

Then the following are all equivalent:

- $P$ is orthogonal
- $P$ is invertible with $P^{-1}=P^T$
- the columns of $P$ form an orthonormal set
- the rows of $P$ form an orthonormal set

The reason why $P^TP=I$ is that the $(i,j)$ entry of $P^TP$ is the dot product of columns $i$ and $j$ of $P$, which for orthonormal columns is $1$ when $i=j$ and $0$ otherwise

Gram-Schmidt Orthogonalisation

Before diving into the algorithm, it's good to think about the problem this process is trying to solve.

Consider vectors spanning a space, most of them are messy and angled, and what we want is the same space, but having all directions be perpendicular.

So the intuition behind Gram-Schmidt, given the vectors $v_1,\dots,v_n$: keep $v_1$; from each subsequent vector, subtract its projections onto the directions already built, leaving only the genuinely new perpendicular component; then normalise if an orthonormal set is wanted.

So we remove the overlap of vectors step by step without changing the span.

I will not be covering the proof as that is given in the lecture notes, and my notes are just to cover a bit of the intuition that I find isn't in the notes, but it's also worth watching some YT videos to understand the intuition behind it. Tom Crawford has a nice video on it.
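The process itself is short enough to sketch in pure Python. This is an illustrative implementation of the idea above (subtract projections, then normalise), assuming the input vectors are linearly independent:

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalise linearly independent vectors: subtract from each
    vector its projections onto the directions already built, then
    normalise the perpendicular part that is left."""
    basis = []
    for v in vectors:
        w = v[:]
        for u in basis:
            c = dot(w, u)  # coordinate of w along u (u has unit length)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

u1, u2 = gram_schmidt([[3, 1], [2, 2]])
```

The output vectors are mutually perpendicular, have unit length, and span the same space as the messy inputs.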

Orthonormal basis

A basis, as talked about briefly earlier, is just a coordinate system. Recall that for a set of vectors to form a basis of $\mathbb{R}^n$, the vectors must span $\mathbb{R}^n$ and be linearly independent.

With that, an orthogonal basis is just a basis where all directions are perpendicular

Now for an important Theorem: Every subspace of Rn has an orthonormal basis

What this means is that no matter how tilted or messy a subspace is, it has a coordinate system of mutually perpendicular unit vectors

This isn't obvious, but Gram-Schmidt process constructs it

Orthogonally Diagonalising Symmetric Matrices

Recall from previous chapter that diagonalisation is simply finding a coordinate system where the matrix acts simply.

For a symmetric matrix $A=A^T$, there is an orthogonal matrix $P$ and a diagonal matrix $D$ with $A=PDP^T$; that is, $A$ can be diagonalised by an orthonormal basis of eigenvectors

Symmetric matrices also respect the dot product.

Quadratic Forms

Definition

A quadratic form is a function:

$$Q(x)=x^TAx,\quad x\in\mathbb{R}^n$$

They are used to measure curvature and distance-like behaviour, so they describe shapes instead of transformations.

So, after diagonalising, every quadratic form becomes a weighted sum of squares

Michael Penn has a good video on quadratic forms under the Number Theory playlist. Most of these notes, especially for the final two chapters were completed during the holidays, so the priority is given to doing past papers and building sufficient intuition instead of copying proofs into my own notes.

Definiteness

When working with quadratic forms, definiteness is a property of the sign of xTAx

We want to think of $x^TAx$ as a function that takes a vector $x$ and outputs a single number, and definiteness is about whether the sign of that number is consistent.

There are four kinds of definiteness. So we let $A$ be a symmetric matrix and define:

Positive/Negative definite

$$x^TAx>0\ \ \forall x\neq0 \qquad\text{or}\qquad x^TAx<0\ \ \forall x\neq0$$

This means that no matter which direction we go in, it is strictly positive/negative

Positive/Negative semi-definite

$$x^TAx\ge0\ \ \forall x\neq0 \qquad\text{or}\qquad x^TAx\le0\ \ \forall x\neq0$$

Never negative (respectively, never positive), but can be zero in some directions

Indefinite

xTAx can be positive for some x, negative for others

So the sign depends on the direction

This is also a nice reason for why diagonalisation is useful:

For a symmetric matrix, we have:

$$A=PDP^T\implies x^TAx=\lambda_1y_1^2+\cdots+\lambda_ny_n^2$$

where $y=P^Tx$ is $x$ written in the orthonormal eigenbasis.

Now $y_i^2\ge0$, so the sign of each term comes from $\lambda_i$: the eigenvalues control definiteness
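So classifying a quadratic form reduces to checking eigenvalue signs. A pure-Python sketch for the $2\times2$ symmetric case (the example matrices are illustrative):

```python
import math

def definiteness(A):
    """Classify a symmetric 2x2 quadratic form by the signs of the
    eigenvalues of A, found via the characteristic polynomial
    λ² - tr(A)·λ + det(A)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = math.sqrt(tr * tr - 4 * det)  # real for symmetric A
    l1, l2 = (tr + disc) / 2, (tr - disc) / 2  # l1 >= l2
    if l1 > 0 and l2 > 0:
        return "positive definite"
    if l1 < 0 and l2 < 0:
        return "negative definite"
    if l2 >= 0:
        return "positive semi-definite"
    if l1 <= 0:
        return "negative semi-definite"
    return "indefinite"

pos = definiteness([[2, 1], [1, 2]])    # eigenvalues 3, 1
indef = definiteness([[1, 2], [2, 1]])  # eigenvalues 3, -1
semi = definiteness([[1, 0], [0, 0]])   # eigenvalues 1, 0
```

For $\begin{pmatrix}1&2\\2&1\end{pmatrix}$ the form $x^2+4xy+y^2$ is positive along $(1,1)$ but negative along $(1,-1)$, matching the mixed eigenvalue signs.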

The lectures skipped the section on conic sections as that will not be assessed, and that concludes Linear Algebra I