
Basis and Dimension

Preliminary: Euclidean Space and Vectors

A sheet of paper is two-dimensional because, given any reference point on the sheet, which we can call $(0,0)$, any other point can be specified by how far to the right and how far up it is from the reference. This is just the notion of an arrow with a magnitude and direction from the first post on vectors.

Similarly, space as we know it is 3-dimensional because, given any reference point, which we can call $(0,0,0)$, any other point can be specified by its distance in length, width, and depth from the reference.

In this post, we'll formalize the dimension of a vector space by introducing the concept of a basis. I will introduce the basic terminology as well as full proofs of a few important results; while this won't be an exhaustive linear algebra primer by any means, any further questions are welcome in the comments section.


Basis of a vector space


Sticking with the 3-dimensional space example, let's call the $x$-axis the left-right direction, so that $(1,0,0)$ corresponds to a step one unit to the right. A step forward would be along the $y$-axis, or $(0,1,0)$, and a step upwards would be along the $z$-axis at $(0,0,1)$. As discussed in the preliminary post, an arrow of any direction and magnitude can be written as a linear combination of these 3 vectors. In symbols, if we call these three ${\bf e}_1$, ${\bf e}_2$, and ${\bf e}_3$, then any vector ${\bf v}$ can be written as $$
{\bf v} = a_1 {\bf e}_1 + a_2 {\bf e}_2 + a_3 {\bf e}_3
$$ for some scalars $a_1, a_2, a_3$. We write this vector as ${\bf v} = (a_1, a_2, a_3)$, implicitly expressing it in terms of the ${\bf e}_i$'s. Since any vector can be written this way, we say that the ${\bf e}_i$'s span the space ${\Bbb R}^3$.
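For the numerically inclined, here is a tiny sketch in Python/NumPy (my own illustration, not part of the original discussion) of the fact that a vector's components are exactly its coefficients with respect to the standard basis:

```python
import numpy as np

e1, e2, e3 = np.array([1.0, 0, 0]), np.array([0, 1.0, 0]), np.array([0, 0, 1.0])

v = np.array([4.0, -2.0, 7.0])        # an arbitrary vector
a1, a2, a3 = v                        # its coordinates in the standard basis are just its components
reconstructed = a1 * e1 + a2 * e2 + a3 * e3

print(np.allclose(v, reconstructed))  # True: v = a1*e1 + a2*e2 + a3*e3
```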

If a vector ${\bf v}$ can be written as a linear combination of a set of other vectors, as above, then ${\bf v}$ is said to be linearly dependent on those vectors. A set of vectors is linearly independent if none of its members can be written as a linear combination of the others; equivalently, the only linear combination of them equal to the zero vector is the one with all coefficients equal to $0$. Clearly, none of the ${\bf e}_i$ vectors above can be obtained by combining the other two, so they are linearly independent.

A set of vectors which (1) is linearly independent and (2) spans the entire vector space is called a basis, and the ${\bf e}_i$'s above are called the standard basis of ${\Bbb R}^3$. Essentially, a basis is a "minimal" (see the next section) set which still spans all of $V$ using the vector space operations of vector addition and scalar multiplication: the linear independence requirement ensures that a basis excludes any extra "trash" that doesn't add to the span. For example, since scalar multiplication is built into any vector space, if we have ${\bf e}_1 = (1,0,0)$ in our basis, we clearly don't also need $(2,0,0) = 2 {\bf e}_1$; any vector expressed as a linear combination including $(2,0,0)$ could include ${\bf e}_1$ instead with the coefficient doubled, so $(2,0,0)$ really isn't adding to the span.
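One convenient way to test linear independence numerically is to stack the vectors into a matrix and compare its rank with the number of vectors. The sketch below uses NumPy's `matrix_rank`; the function name `is_linearly_independent` is my own, purely for illustration.

```python
import numpy as np

def is_linearly_independent(vectors):
    """Return True if the given equal-length vectors are linearly independent."""
    A = np.array(vectors, dtype=float)
    return np.linalg.matrix_rank(A) == len(vectors)

print(is_linearly_independent([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))  # True: the standard basis
print(is_linearly_independent([[1, 0, 0], [2, 0, 0], [0, 1, 0]]))  # False: (2,0,0) = 2*(1,0,0)
```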

On the other hand, we could remove $(1,0,0)$ from the standard basis and replace it with $(2,0,0)$ to obtain a different basis, so a basis certainly isn't unique. In fact, we will prove below that any linearly independent set of vectors can be extended to a basis. However, the constants $a_i$ which express a given vector in terms of a particular basis are unique: if some vector ${\bf v}$ had a second representation with constants $b_i$, then we would have $$
\begin{align}
{\bf v} &= a_1 {\bf e}_1 + a_2 {\bf e}_2 + a_3 {\bf e}_3 \\[1mm]
&= b_1 {\bf e}_1 + b_2 {\bf e}_2 + b_3 {\bf e}_3 \\[2mm]
\implies {\bf v} - {\bf v} = {\bf 0} &= (a_1-b_1) {\bf e}_1 + (a_2-b_2) {\bf e}_2 + (a_3-b_3) {\bf e}_3
\end{align}
$$ If $a_i \neq b_i$ for at least one value of $i$, then this would contradict the fact that the basis vectors are linearly independent (since a linear combination of them without all coefficients equal to 0 yields the zero vector). Thus, we must have $a_i=b_i$ for all values of $i$, i.e. the representation of ${\bf v}$ in terms of the particular basis is unique. The same argument applies to vector spaces other than just ${\Bbb R}^3$.
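As a concrete check, the unique coefficients can be computed by solving the linear system whose coefficient matrix has the basis vectors as columns. A short sketch (the particular non-standard basis here is my own example):

```python
import numpy as np

b1, b2, b3 = [1.0, 0, 0], [1.0, 1, 0], [1.0, 1, 1]   # an example (non-standard) basis of R^3
B = np.column_stack([b1, b2, b3])                    # basis vectors as columns

v = np.array([2.0, 3.0, 4.0])
a = np.linalg.solve(B, v)         # unique solution, because the columns are linearly independent

print(a)                          # [-1. -1.  4.]
print(np.allclose(B @ a, v))      # True: v = a1*b1 + a2*b2 + a3*b3
```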


Dimension of a vector space


The number of vectors forming a basis (which is infinite for some spaces) is called the dimension of the space. ${\Bbb R}^3$ has dimension $3$ because it has $\{ {\bf e}_1, {\bf e}_2, {\bf e}_3 \}$ as a basis, while the $\ell^p$ spaces from the last post are infinite-dimensional: the unit sequences ${\bf e}_i$, each with a $1$ in the $i$-th component and zeros elsewhere, form an infinite linearly independent set, so no finite set of vectors can span those spaces.

For dimension to be well-defined in this way, every basis of a given space must have the same number of elements. This is indeed the case, and for finite-dimensional spaces it follows from the following lemma.

Lemma: If $B = \{ {\bf b}_1, {\bf b}_2, \dotsc, {\bf b}_n \}$ is a basis for a vector space $V$, and $W = \{ {\bf w}_1, {\bf w}_2, \dotsc, {\bf w}_m \}$ is a linearly independent set of vectors, then $m \leq n$. If $m=n$, then $W$ spans $V$ and is thus another basis.

Proof: Since $B$ is a basis for $V$, we can express ${\bf w}_1$ as a linear combination of the vectors in $B$, $$
{\bf w}_1 = c_1 {\bf b}_1 + c_2 {\bf b}_2 + \dotsb + c_n {\bf b}_n
$$ where at least one of the $c_i$'s is nonzero (indeed, ${\bf w}_1 \neq {\bf 0}$, since a linearly independent set cannot contain the zero vector). Let $j$ be an index such that $c_j \neq 0$. Then we have $$
{\bf b}_j = \tfrac{1}{c_j}{\bf w}_1 - \tfrac{c_1}{c_j}{\bf b}_1 - \dotsb - \tfrac{c_{j-1}}{c_j}{\bf b}_{j-1} - \tfrac{c_{j+1}}{c_j}{\bf b}_{j+1} - \dotsb - \tfrac{c_n}{c_j}{\bf b}_n
$$ Since $B$ is a basis, any vector ${\bf v} \in V$ can be written as a linear combination of the ${\bf b}_i$'s: $$
\begin{align}
{\bf v} &= v_1 {\bf b}_1 + v_2 {\bf b}_2 + \dotsb + v_n {\bf b}_n \\[2mm]
&= v_1 {\bf b}_1 + \dotsb + v_{j-1} {\bf b}_{j-1} \\[1mm]
& \ \ \ + v_j \underbrace{\left[ \tfrac{1}{c_j}{\bf w}_1 - \tfrac{c_1}{c_j}{\bf b}_1 - \dotsb - \tfrac{c_{j-1}}{c_j}{\bf b}_{j-1} - \tfrac{c_{j+1}}{c_j}{\bf b}_{j+1} - \dotsb - \tfrac{c_n}{c_j}{\bf b}_n \right]}_{{\bf b}_j} \\[1mm]
& \ \ \  +  v_{j+1} {\bf b}_{j+1} + \dotsb + v_n {\bf b}_n
\end{align}
$$ This eliminates ${\bf b}_j$ from the expansion, showing that the arbitrary vector ${\bf v}$ is also a linear combination of the set $B_1 = \{ {\bf b}_1, {\bf b}_2, \dotsc , {\bf b}_{j-1}, {\bf w}_1, {\bf b}_{j+1}, \dotsc , {\bf b}_n \}$. Thus, $B_1$ spans $V$.

Furthermore, $B_1$ is linearly independent. Suppose a linear combination of the vectors in $B_1$, with not all coefficients equal to $0$, yielded ${\bf 0}$. If the coefficient on ${\bf w}_1$ were $0$, this would already be a nontrivial linear combination of vectors in $B$ equal to ${\bf 0}$, contradicting the linear independence of $B$. If the coefficient on ${\bf w}_1$ were nonzero, then replacing ${\bf w}_1$ by its expansion $c_1 {\bf b}_1 + c_2 {\bf b}_2 + \dotsb + c_n {\bf b}_n$ would produce a linear combination of vectors in $B$ equal to ${\bf 0}$ in which the coefficient of ${\bf b}_j$ is nonzero (${\bf b}_j$ appears in no other term, since it is not in $B_1$, and $c_j \neq 0$), again contradicting the linear independence of $B$. So $B_1$ is linearly independent and thus is a basis for $V$.

Now, since $B_1$ is a basis, we can uniquely express ${\bf w}_2$ as a linear combination $$
{\bf w}_2 = d_1 {\bf b}_1 + d_2 {\bf b}_2 + \dotsb + d_{j-1} {\bf b}_{j-1} + d_j {\bf w}_1 + d_{j+1} {\bf b}_{j+1} + \dotsb + d_n {\bf b}_n
$$ Since $W$ is linearly independent, it is not the case that ${\bf w}_2 = d_j {\bf w}_1$ with all other $d_i$'s equal to zero, so there exists a $d_k \neq 0$ with $k \neq j$. Then we can repeat the whole process to kick out ${\bf b}_k$ and replace it with ${\bf w}_2$, yielding a new basis $B_2$.

If $m=n$, then we can continue in this fashion until we kick out the last of the original vectors from $B$ at the $n$-th iteration, obtaining a new basis $B_n$ consisting of the vectors ${\bf w}_1, {\bf w}_2, \dotsc, {\bf w}_n$. This proves that any linearly independent set of $n$ vectors is a basis.

Furthermore, if $m>n$, we could continue the process and express ${\bf w}_{n+1}$ as a linear combination of the basis $B_n = \{ {\bf w}_1, {\bf w}_2, \dotsc, {\bf w}_n \}$. But then ${\bf w}_{n+1}$ would be linearly dependent on other vectors in $W$, contradicting the assumption that $W$ is linearly independent. Thus, it must be the case that $m \leq n$.
$\square$
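
To make the "kicking out" procedure in the proof concrete, here is a small computational sketch for vectors in ${\Bbb R}^n$ (the function `exchange` and its details are my own, purely for illustration, not a standard library routine): each ${\bf w}_i$ replaces some remaining original basis vector on which it has a nonzero coefficient.

```python
import numpy as np

def exchange(basis, w_list):
    """Swap vectors of `basis` out for the linearly independent vectors in `w_list`,
    one at a time, so that the result is still a basis of R^n at every step."""
    basis = [np.asarray(b, dtype=float) for b in basis]
    swapped = [False] * len(basis)                   # slots that already hold a w
    for w in w_list:
        B = np.column_stack(basis)
        coeffs = np.linalg.solve(B, np.asarray(w, dtype=float))
        # pick an index j with a nonzero coefficient on a *remaining original* vector;
        # one exists because the w's are linearly independent
        j = next(i for i, c in enumerate(coeffs) if abs(c) > 1e-12 and not swapped[i])
        basis[j] = np.asarray(w, dtype=float)        # kick out b_j, put w in its place
        swapped[j] = True
    return basis

new_basis = exchange([[1, 0, 0], [0, 1, 0], [0, 0, 1]],   # the standard basis of R^3
                     [[1, 1, 0], [0, 1, 1]])              # a linearly independent pair
print(np.linalg.matrix_rank(np.column_stack(new_basis)))  # 3: the result is still a basis
```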

Corollary: If $A = \{ {\bf a}_1, \dotsc, {\bf a}_m \}$ and $B = \{ {\bf b}_1, \dotsc, {\bf b}_n \}$ are two bases for a finite-dimensional vector space $V$, then $m=n$.

Proof: By the above lemma, $m \leq n$ since $A$ is linearly independent and $B$ is a basis. Likewise, $n \leq m$ since $B$ is linearly independent and $A$ is a basis. Therefore, $m=n$.
$\square$

Note that this also allows us to conclude that in a vector space $V$ with dimension $n$, a set with fewer than $n$ vectors cannot span $V$: such a set could be pared down to a basis with fewer than $n$ elements (by discarding any vector that is a linear combination of the others), contradicting the corollary. Thus, a basis is a minimal set that still spans all of $V$. On the other hand, by the lemma, a basis is a maximal set of vectors that is still linearly independent.


Extending to a basis


If we have a vector space $V$ with finite dimension $n$ and a set $S$ of $p$ linearly independent vectors with $p<n$, then we can extend $S$ to a basis for $V$.

Indeed, since $S$ has only $p<n$ vectors, it cannot span $V$. Thus, there exists a vector ${\bf v}_1$ such that ${\bf v}_1$ is not in the span of $S$ (the set of all linear combinations of the vectors in $S$). So $S \cup \{ {\bf v}_1 \}$ is a linearly independent set of $p+1$ vectors: in any linear combination of these vectors equal to ${\bf 0}$, the coefficient on ${\bf v}_1$ must be zero (otherwise we could solve for ${\bf v}_1$ as a linear combination of the vectors in $S$), and then all the remaining coefficients must be zero as well, since $S$ is linearly independent.

Continuing in this manner until we have exactly $n$ vectors (which will take $n-p$ steps), we will obtain a set $S'$ of $n$ linearly independent vectors. The lemma above implies that $S'$ will span $V$ and thus be a new basis.
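
Computationally, one simple way to carry this out in ${\Bbb R}^n$ is to try candidate vectors (say, the standard basis vectors) and keep any that increase the rank, i.e. that lie outside the current span. A sketch (the function name and approach are my own, for illustration only):

```python
import numpy as np

def extend_to_basis(S):
    """Extend a linearly independent list of vectors in R^n to a basis of R^n."""
    vectors = [np.asarray(v, dtype=float) for v in S]
    n = len(vectors[0])
    for i in range(n):
        candidate = np.eye(n)[i]                  # try the i-th standard basis vector
        if np.linalg.matrix_rank(np.vstack(vectors + [candidate])) > len(vectors):
            vectors.append(candidate)             # outside the current span, so keep it
        if len(vectors) == n:
            break
    return vectors

print(extend_to_basis([[0, 3, 0], [1, 0, -2]]))   # appends (1, 0, 0), giving a basis of R^3
```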

In practice, finding the vectors ${\bf v}_1, {\bf v}_2, \dotsc, {\bf v}_{n-p}$ as described above comes down to either guess-and-check or solving systems of linear equations. For example, if we have the vectors $(0,3,0)$ and $(1,0,-2)$ and would like to extend them to a basis for ${\Bbb R}^3$, then we need to find a vector ${\bf x} = (x_1, x_2, x_3)$ which is not a linear combination of the first two. If ${\bf x}$ were such a linear combination, we would have:$$
(x_1, x_2, x_3) = a_1 (0,3,0) + a_2 (1,0,-2)
$$ This gives us 3 equations: $$
\begin{align}
x_1 &= a_2 \\[2mm]
x_2 &= 3 a_1 \\[2mm]
x_3 &= -2 a_2
\end{align}
$$ We just need to find one vector which does not satisfy these equations, regardless of the choice of the $a_i$'s. Since the first and third equations imply that $-\tfrac{1}{2}x_3 = x_1$, any vector not satisfying that equation will work; in this case ${\bf x} = (0,0,1)$ does the job. Note that the solution is not unique; there are infinitely many choices, such as ${\bf x} = (0,0,x_3)$ for any nonzero value of $x_3$, among many others.

While the system of equations above was rather easy, we can use the more formal method of Gaussian elimination to simplify tougher systems.
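
For instance, row-reducing the matrix whose rows are the three vectors confirms that the extended set is a basis. A sketch using SymPy's `rref` (my own choice of tool here, not something the post depends on):

```python
from sympy import Matrix

A = Matrix([[0, 3,  0],
            [1, 0, -2],
            [0, 0,  1]])          # rows: the two original vectors plus (0,0,1)

rref_form, pivot_columns = A.rref()
print(rref_form)                  # the 3x3 identity: three pivots, so the rows form a basis of R^3
print(pivot_columns)              # (0, 1, 2)
```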


Extending to a basis: the infinite-dimensional case


In an infinite-dimensional vector space, we can also extend a linearly independent set to a basis, but the proof of this fact is not as simple as setting up a system of linear equations to solve (there would be infinitely many equations). In fact, the proof relies on the axiom of choice and thus does not show us how to actually perform the basis extension. For the sake of completeness, I'll provide a sketch of the proof anyway, with the warning that it won't make much sense to those not somewhat well versed in set theory.

Sketch of proof: Let $V$ be a vector space, not necessarily finite-dimensional, and let $S$ be a linearly independent subset of $V$. Let $A$ be the collection of all linearly independent subsets of $V$ which contain $S$. In symbols, $$
A = \{ T \subseteq V : \ S \subseteq T \text{ and $T$ is linearly independent} \}
$$ A chain in $A$ is a subcollection of $A$ that is totally ordered by the $\subseteq$ relation, for example $T_1 \subseteq T_2 \subseteq T_3 \subseteq \dotsb$ with each $T_i \in A$. The union $T_{\cup}$ of all the sets in a chain is an upper bound for the chain in that it is bigger (as defined by the $\subseteq$ relation) than each element of the chain. Moreover, $T_{\cup}$ belongs to $A$: it contains $S$, and it is linearly independent, since any finite linear combination of its vectors uses vectors from only finitely many sets in the chain, hence (because the chain is totally ordered) from a single one of those sets, which is linearly independent.

Since every chain in $A$ has an upper bound in $A$, Zorn's Lemma (equivalent to the axiom of choice, though I will omit the proof of this fact) implies that $A$ has a maximal element, i.e. a set $M \in A$ which is not properly contained in any other element of $A$. Suppose $M$ does not span $V$. Then there exists a vector ${\bf z} \in V$ which is not a linear combination of the vectors in $M$. But then $M' = M \cup \{ {\bf z} \}$ is a linearly independent set which contains $S$ and properly contains $M$. So $M' \in A$, a contradiction since $M$ was supposed to be maximal. Thus it must be the case that $M$ does span $V$, so $M$ is a basis containing $S$.
$\square$

As with most AC-based proofs, this one isn't very satisfying, but there you have it: even in the infinite-dimensional case, you can extend a linearly independent set to a basis (but I can't tell you how!).

That will do it for this post. Thanks for reading, and please post any questions in the comments section.
