Basis and dimension
👋 About the exercise session
Time flies by and we are already in the 5th week of the course! Let me quickly restate the topics we have covered:
(1) System of linear equations
(2) Matrices (and what they represent - linear transformations)
(3) The determinant (if non-zero, there exists a unique solution to the given system of linear equations)
(4) Vector spaces (and spanning sets)
In the first two sessions, we learnt how to represent a given problem using a system of linear equations, translate it into matrix form and then solve it using corresponding tools like Gauss-Jordan elimination. We also learnt that a matrix, or more specifically its columns, represents some linear transformation. In the session about the determinant of a given matrix, we learnt that a non-zero determinant means that there exists a unique solution, otherwise there are either infinitely many solutions or none. Finally, in last week’s session, we talked about the notion of vector spaces. In addition, we also mentioned how to determine if a provided set of vectors spans a given vector space.
In today’s session, we will talk about special vector spaces: the column, row and null vector spaces and their relation to the solution of the given system of linear equations. The learning goals are more practical, like being able to compute a basis and the rank of a matrix, but I will also do my best to give you the intuition for why things are the way they are. I suggest you watch the following video from 3blue1brown, from which I take a lot of inspiration for what I write in the notes below. I hope the video and my notes complement each other well.
Last but not least, we covered a lot of details including proofs, but try to keep in mind the big picture:
\[Ax = b\]Simply put, this is what we have been studying in detail so far. 🙃 Without further ado, let’s get to work!
📓 Notes
\[\newcommand{\ihat}{\hat{\textbf{i}}} \newcommand{\jhat}{\hat{\textbf{j}}}\]
Basis of a vector space
In the second exercise session, I talked about two special vectors in the 2D system called \(\ihat\) and \(\jhat\). These two vectors formed a basis of \(R^2\). So what is so special about these two vectors? Well, just two things:
(1) They are linearly independent: as fancy as it might sound, it simply means that I cannot obtain one vector as a linear combination of the other vector(s). In the case of \(\ihat\) and \(\jhat\), it simply means there exists no scalar \(c\) such that I can scale one vector to obtain the other. Clearly this holds for the two vectors since they are perpendicular to each other.
(2) They span the given vector space: meaning any vector in the given vector space can be written as a linear combination of the basis vectors.
Last week, we talked about how to prove the second property (see exercise 4). The first property is also relatively easy to check. For instance, if we wanted to test whether \(\ihat\) and \(\jhat\) form a set of linearly independent vectors, we need to find the solutions of the following linear equation:
\[c_1 \ihat + c_2 \jhat = 0\]If the only solution is \(c_1 = c_2 = 0\), i.e., the system has only the trivial solution, we conclude that the two vectors are linearly independent, otherwise they are dependent. We will practice this in this week’s exercise, so take a look if this still seems too abstract.
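If you want to double-check such a computation, here is a minimal sketch in Python using NumPy (the vectors here are \(\ihat\) and \(\jhat\), but any candidate set works the same way): the homogeneous system has only the trivial solution exactly when the rank of the matrix of stacked vectors equals the number of vectors.

```python
import numpy as np

# Stack the candidate vectors as columns of a matrix C, so that solving
# C @ c = 0 is exactly the equation c1 * i_hat + c2 * j_hat = 0.
i_hat = np.array([1.0, 0.0])
j_hat = np.array([0.0, 1.0])
C = np.column_stack([i_hat, j_hat])

# The homogeneous system has ONLY the trivial solution exactly when the
# rank of C equals the number of candidate vectors.
if np.linalg.matrix_rank(C) == C.shape[1]:
    print("only the trivial solution c1 = c2 = 0 -> linearly independent")
else:
    print("a non-trivial solution exists -> linearly dependent")
```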
Note that sometimes it is possible to test only one of the two properties to conclude that a given set forms a basis of a given vector space - I discuss when this is the case in the section below.
Properties of basis and dimension of vector space
Below, I write down important theorems relating to the basis of a vector space:
(1) Uniqueness of basis representation: a vector \(v \in V\) (\(V\) is a vector space) can be written in only one way as a linear combination of vectors in the given basis set. We will talk about the importance of this in the following exercise session.
(2) Number of vectors in basis: if a given vector space \(V\) has a basis that contains \(n\) vectors, then any other basis of this vector space must also have \(n\) vectors
(3) Basis and linear dependence: if a given vector space \(V\) has a basis that contains \(n\) vectors, then any set containing more than \(n\) vectors is linearly dependent
The last two theorems revolve around the number of vectors in a basis, denoted by \(n\). This is in fact a quite important number because it determines the dimension of the given vector space: \(dim(V) = n\). In the sections below, we will discuss in more depth the relation between the dimension of the given column space and the solution to the given system of linear equations.
Lastly, if we are asked to determine whether a given set with \(n\) vectors forms a basis for a given \(n\)-dimensional vector space \(V\), we only need to check one of the two defining properties (linear independence or spanning).
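As a small illustration of this shortcut, here is a sketch in Python with three made-up vectors in \(R^3\): since their count matches the dimension, checking linear independence alone (via a non-zero determinant) is enough to conclude they form a basis.

```python
import numpy as np

# Three candidate vectors (chosen purely for illustration), stacked as columns.
v1 = np.array([1.0, 0.0, 0.0])
v2 = np.array([1.0, 1.0, 0.0])
v3 = np.array([1.0, 1.0, 1.0])
B = np.column_stack([v1, v2, v3])

# We have exactly n = 3 vectors in the 3-dimensional space R^3, so checking
# one property suffices; here, linear independence via the determinant.
if not np.isclose(np.linalg.det(B), 0.0):
    print("non-zero determinant -> the three vectors form a basis of R^3")
else:
    print("zero determinant -> not a basis")
```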
What does it actually mean to have zero, unique or infinitely many solutions?
Recall that in the intro I said that it is all about understanding the following equation:
\[Ax = b\]and the role of the respective terms \(A, x, b\). Therefore, the above equation can be written in words as:
Take \(x\) from the original vector space, apply some linear transformation \(A\), which should yield a corresponding vector \(b\) in the transformed vector space
I am writing should because we actually do not know if this is possible. We can encounter the following two cases:
(1) non-zero determinant (\(A^{-1}\) exists): in this case we map the given vector \(x\) from the \(n\)-dimensional space into a transformed vector space that is still \(n\)-dimensional. We do not lose any dimensions. Meaning, space does not get squashed and we have a unique solution, i.e., for a given \(b\) there exists precisely one vector \(x\) such that \(Ax = b\).
(2) zero determinant (\(A^{-1}\) does NOT exist): in this case, we know that there are either zero or infinitely many solutions. Visually, this means that the transformation squashes the original \(n\)-dimensional space into a transformed space which has fewer than \(n\) dimensions.
With all of this being said, in practice we actually know \(b\) and are looking for \(x\). Recall that for this specific reason we rewrote the equation above as:
\[x = A^{-1}b\]In words, to obtain \(x\), we reverse the linear transformation. For instance, if \(A\) represents a rotation by 90 degrees to the left, then \(A^{-1}\) represents the same rotation but to the right. This particular example corresponds to the first case, i.e., the dimension of the transformed vector space remains the same and we have a unique solution.
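To make the rotation example concrete, here is a minimal numerical sketch (the particular \(b\) is made up): \(A\) rotates by 90 degrees counter-clockwise, and solving the system effectively computes \(x = A^{-1}b\), i.e., reverses the rotation.

```python
import numpy as np

# A: counter-clockwise rotation by 90 degrees (non-zero determinant, so A^{-1} exists).
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])
b = np.array([2.0, 3.0])   # the vector we observe in the transformed space

# Conceptually x = A^{-1} b; in practice we let solve() reverse the transformation.
x = np.linalg.solve(A, b)
print(x)                      # the unique x with A @ x = b
print(np.allclose(A @ x, b))  # sanity check: rotating x by 90 degrees gives b back
```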
However, what if the transformation squashes the vector space? (zero determinant) This means one of the following two things (a small code sketch distinguishing them follows this list):
(1) zero solutions: \(b\) does not live in the vector space spanned by the column vectors of \(A\). In other words, there is no way to combine the column vectors of \(A\) to obtain \(b\).
(2) infinitely many solutions: infinitely many vectors \(x\) (original vector space) are mapped to the same \(b\) (transformed vector space)
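One way to tell these two cases apart numerically (a sketch with made-up matrices, not the only approach) is to compare the rank of \(A\) with the rank of the augmented matrix \([A \mid b]\): if appending \(b\) increases the rank, then \(b\) lies outside the column space of \(A\) and there are zero solutions; otherwise there are infinitely many.

```python
import numpy as np

# A singular matrix: the second column is twice the first, so the columns
# only span a 1-dimensional line and det(A) = 0.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

for b in (np.array([1.0, 1.0]),    # not on the line spanned by the columns
          np.array([1.0, 2.0])):   # on that line
    rank_A = np.linalg.matrix_rank(A)
    rank_Ab = np.linalg.matrix_rank(np.column_stack([A, b]))
    if rank_Ab > rank_A:
        print(b, "-> zero solutions (b is outside the column space of A)")
    else:
        print(b, "-> infinitely many solutions (many x are squashed onto this b)")
```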
Column, row, null space and rank of linear transformation
The column vector space is simply the set of vectors that can be obtained by linearly combining the column vectors of the given matrix \(A\) representing some linear transformation. The same applies to the row vector space, except here you combine row vectors.
Note that the above definition only says that the given vectors (column or row) span the corresponding vector space, BUT just because they span it does NOT mean they form its basis. And we actually DO care about what the basis of the transformed vector space looks like, or rather how many vectors it has. (I will explain why in a second) So how do we get a basis from column or row vectors?
For row vectors, you reduce the matrix into row echelon form and the rows with a leading 1 are the basis vectors. The same procedure works for the column vectors, except you first transpose the matrix and then reduce it to row echelon form.
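Here is a sketch of that procedure in Python using SymPy’s rref (the matrix is made up; its third row is the sum of the first two, so only two non-zero rows survive the reduction):

```python
import sympy as sp

# Illustrative 3x3 matrix whose third row equals the sum of the first two.
A = sp.Matrix([[1, 2, 3],
               [0, 1, 1],
               [1, 3, 4]])

# Basis of the row space: reduce A to (reduced) row echelon form and keep
# the non-zero rows, i.e., the rows with a leading 1.
rref_A, _ = A.rref()
row_basis = [rref_A.row(i) for i in range(rref_A.rows) if any(rref_A.row(i))]

# Basis of the column space: the same procedure applied to the transpose.
rref_At, _ = A.T.rref()
col_basis = [rref_At.row(i) for i in range(rref_At.rows) if any(rref_At.row(i))]

print(row_basis)   # two basis vectors -> the row space is 2-dimensional
print(col_basis)   # likewise two basis vectors for the column space
```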
Now to why we actually care. Following the procedure defined above, we might either:
(1) get the same number of basis vectors as there are column/row vectors,
(2) or get fewer.
In the first case, it means that the transformation does NOT reduce the dimension of the transformed vector space. Therefore:
Determinant of \(A\) is non-zero \(\Rightarrow\) \(A^{-1}\) exists \(\Rightarrow\) system has unique solution
In the other case, the transformed vector space is squashed. In other words, some set of vectors from the original vector space will be squashed into a single vector in the new transformed vector space (infinitely many solutions), while some vectors \(b\) in the transformed vector space will have no vector in the original space mapping onto them (zero solutions).
Now is the proper time to introduce the notion of rank. The rank of the matrix \(A\) (a.k.a. some linear transformation) tells you the dimension of the transformed vector space. If we have a full-rank matrix, i.e., the number of variables (a.k.a. the dimension of the original vector space) is equal to the dimension of the transformed vector space, we can say the system has a unique solution. But what if, for instance, we have 3 variables and \(rank(A) = 2\)? Then of course we are in the situation of having zero or infinitely many solutions. Note that I am saying rank of the matrix \(A\) instead of rank of either the column or the row space, because these are all equivalent.
Naturally, one might ask: where did that one dimension then go? Well, the simple answer is the null space of \(A\). The null space of \(A\) is formally the set of all \(x\) that satisfy \(Ax = 0\). In words, these vectors are mapped to the null vector in the transformed vector space, i.e., they lose their identity. The dimension of this null space is called nullity. And in relation to the number of variables \(n\) in our system, we can say that:
\[rank(A) + null(A) = n\]where \(null(A)\) denotes the dimension of the nullspace (a.k.a. nullity). Now, you might ask, what is so special about the nullspace? I mean, there are certainly also other sets of vectors \(x\) that end up being mapped to some single \(b\), so why do we not care about those sets? Well, based on the definition of a vector space, we need it to include the zero vector. This implies that corresponding subspaces of this vector space must also contain the zero vector. And guess what, the nullspace is the only such set that includes this special zero vector, and as such it is a subspace of the original vector space. As we discussed last session, in mathematics, it is simply nice to know something is a subspace because then you know what kind of rules apply. But there is another cool property of the nullspace.
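Before getting to that property, here is a small sketch showing the rank-nullity relation in action, reusing the made-up rank-2 matrix from the earlier sketch:

```python
import sympy as sp

# The same illustrative rank-2 matrix: 3 variables, but the transformation
# squashes one dimension away.
A = sp.Matrix([[1, 2, 3],
               [0, 1, 1],
               [1, 3, 4]])

n = A.cols                        # number of variables
rank_A = A.rank()                 # dimension of the column (and row) space
nullity_A = len(A.nullspace())    # dimension of the null space

print(rank_A, nullity_A)          # 2 and 1
print(rank_A + nullity_A == n)    # rank(A) + null(A) = n
```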
Imagine we take two vectors \(u, v\) from the original vector space. We know that \(v\) maps to some corresponding vector \(b \neq 0\) in the transformed space, so we can write \(Av = b\). For \(u\), assume instead that it maps to the zero vector, i.e., \(u\) is in the null space of \(A\), so we can write \(Au = 0\). So what happens if we add \(u, v\) together and transform the result? Let’s try:
\[A(v + u) = Av + Au = Av + 0 = Av = b\]This means that the solution to \(Ax = b\) where \(b \neq 0\) (non-homogeneous system) can be written in the form \(x = x_p + x_h\), where \(x_p\) is a particular solution (in my example \(v\) - it maps to the non-zero \(b\)) and \(x_h\) is some solution to \(Ax = 0\) (\(x_h\) is from the null space).
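To see this numerically, here is a minimal sketch (the matrix and vectors are made up): we take a particular solution \(x_p\), a null-space vector \(x_h\), and check that shifting \(x_p\) by a multiple of \(x_h\) still solves \(Ax = b\).

```python
import sympy as sp

# A singular matrix and a b that does lie in its column space (made up for illustration).
A = sp.Matrix([[1, 2],
               [2, 4]])
b = sp.Matrix([1, 2])

x_p = sp.Matrix([1, 0])       # a particular solution:  A * x_p = b
x_h = A.nullspace()[0]        # a homogeneous solution: A * x_h = 0

print(A * x_p == b)                # True
print(A * (x_p + 3 * x_h) == b)    # True: adding a null-space vector keeps Ax = b
```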
⛳️ Learning goals checklist
Let me first say that this has been a lot to digest. So do not worry if you feel overwhelmed. As I said in the beginning, my notes focus on the intuition behind why certain things work the way they do. While having an intuition is important, at this stage of learning, you should as a priority focus on obtaining the practical skills:
- Be able to check if a set of vectors is linearly independent, check if it is a basis
- Compute bases for the column, row and null space, and their corresponding dimensions
- Describe the set of all solutions to a given system of linear equations (zero, one or infinitely many)