Powered by Blogger.

The Plane in R^3

Prerequisites: Euclidean Space and Vectors

Suppose we have a plane in ${\Bbb R}^{3}$ which contains the point $(0,0,0)$ but does not intersect the axes at any other point. How many octants does the plane intersect?

Let's start by analyzing the 2-dimensional analog, then come back to the 3-dimensional problem with an algebraic approach and a geometric interpretation. Finally, we'll try to generalize the answer to higher dimensions.

Lines in the Plane ${\Bbb R}^{2}$

The 2-dimensional analog of a plane in ${\Bbb R}^{3}$ containing the origin is a line in ${\Bbb R}^{2}$ containing the origin. For those familiar with linear algebra, the former is a vector subspace of ${\Bbb R}^{3}$ of dimension 2 (i.e. co-dimension 1), and the latter is a vector subspace of ${\Bbb R}^{2}$ of dimension 1 (still co-dimension 1). If you aren't familiar with linear algebra, ignore that sentence and just keep reading.

The equation of a line in ${\Bbb R}^{2}$ is $y = mx + b$, where $m$ is the slope ("rise over run," or change in $y$ per change in $x$), and $b$ is the $y$-intercept (where the line hits the $y$-axis). A point $(x_{0},y_{0})$ is on the line if it satisfies the equation, i.e. if the equality $y_0 = mx_0 + b$ is true.

A line passing through the origin has a $y$-intercept of 0, thus $b=0$ and the equation is simply $y=mx$. Now, if $m$ is 0, then the line would just go along the $x$-axis, so if it doesn't touch the axis, we must have $m \neq 0$, so either $m > 0$ or $m < 0$. In the case where $m>0$, a positive $x$ value gives a positive $y$, and a negative $x$ gives a negative $y$, so the line would pass through quadrants 1 and 3 (top-right and bottom-left). In the case where $m<0$, the same analysis shows that the line passes through quadrants 2 and 4 (top-left and bottom-right). So the line passes through 2 quadrants.

The $y = mx + b$ formulation works fine for this question, but if we want to put $x$ and $y$ on more even footing (this will come in handy in the higher-dimensional cases so that we don't always have to solve for one of the variables), we can use the other form of the equation of a line, $ax + by = c$, where $a,b,c \in {\Bbb R}$ are constants. In order to have the origin on the line, we must have $c=0$, because the point $(0,0)$ is on the line, and thus $a(0)+b(0) = 0 = c$. so the equation is simply $ax+by=0$. If either $a$ or $b$ were zero, the line would just be one of the axes, so we must have $a,b \neq 0$. Whether they are positive or negative, you can work out by plugging in positive or negative $x$ and $y$ values that the line will either pass through quadrants 1 and 3 or 2 and 4.

For example, if $a,b>0$, then if $x>0$, we must have $y<0$ in order to have $ax+by=0$. Similarly, if $x<0$, then $y$ would have to be positive. So the line passes through quadrants 2 and 4. We get the same result in the case where $a,b<0$. If $a$ and $b$ have opposite signs, then the line will pass through quadrants 1 and 3.

Notice that the line does not pass through the quadrant containing the point $(a,b)$. The vector $(a,b)$ is actually perpendicular to the line $ax+by=0$, and we can see that from the fact that the equation can be rewritten as ${\bf n} \cdot {\bf x} = 0$ where ${\bf n} = (a,b)$ and ${\bf x} = (x,y)$ (recall that two vectors are perpendicular if and only if their dot product is zero).

The Plane in Space ${\Bbb R}^{3}$

Let's go back to the ${\Bbb R}^{3}$ case, building off of the above discussion. The general equation of a plane is $ax+by+cz=d$, where $a,b,c,d \in {\Bbb R}$ are constants. In order for the plane to contain $(0,0,0)$, we must have $d=0$, so the equation is now just $ax+by+cz=0$. If any of the constants is 0, then the plane will actually look like the equation of a line from above. For example, if $c=0$, then we'd have $ax+by=0$, which would be a line in the $xy$-plane extended vertically up and down in the positive and negative $z$ directions, and in fact containing the entire $z$-axis. Can you see why (hint: show that the points on the $z$-axis, i.e. points of the form $(0,0,z)$, all satisfy the equation of the plane)?

We can also interpret the equation geometrically as follows: the equation $ax+by+cz=0$ is equivalent to ${\bf n} \cdot {\bf x} = 0$, where ${\bf n} = (a,b,c)$ and ${\bf x} = (x,y,z)$. Note that here, ${\bf n}$ and ${\bf x}$ are vectors, and their dot product is a scalar, so the $0$ on the right is a scalar zero, not the zero vector ${\bf 0} = (0,0,0)$.

As mentioned above, the dot product of two vectors is zero if and only if the vectors are perpendicular. Therefore, this equation is saying that any vector ${\bf x}$ that is perpendicular to ${\bf n}$ is on the plane. For this reason, ${\bf n}$ is called the plane's normal vector (normal is a synonym for perpendicular, as is orthogonal, which is also used frequently). In the example above where $c=0$, the plane's normal vector is $(a,b,0)$, which lies in the $xy$-plane. Thus, the $z$-axis, being orthogonal to the $xy$-plane, is contained in our plane.

Now, in order to answer the geometric question of which vectors are orthogonal to ${\bf n}$, we can look at the algebraic equation $ax+by+cz=0$.

Since the plane does not touch the axes except at the origin, we must have $a,b,c \neq 0$. As an example, let's look at the case where $a,b,c>0$. Then we can have the following combinations for $(x,y,z)$ in order to have $ax+by+cz=0$:
$$(+,+,-) \\
(+,-,+) \\
(+,-,-) \\
(-,-,+) \\
(-,+,-) \\
(-,+,+)$$ The remaining two combinations, $(+,+,+)$ and $(-,-,-)$, do not work, because then the left side of the equation would have to be positive or negative (respectively) and thus not zero.

If we grind through the algebra of the other 7 combinations for $(a,b,c)$, we see that we get 6 possibilities each time, so the plane intersects 6 of the 8 octants, and we have the answer to the problem. I'm not going to go through all the cases, because that would be quite boring, but you can see that there is a certain symmetry in the plane's equation between $(a,b,c)$ and $(x,y,z)$. Once you've solved it for the case $a,b,c>0$, you've pretty much solved it for all the cases. Can you see why? So we've got the answer- it's 6.

The Hyperplane in ${\Bbb R}^{n}$

${\Bbb R}^{n}$ is the $n$-dimensional analog of ${\Bbb R}^{3}$ and is the set of ordered $n$-tuples of real numbers: ${\Bbb R}^{n} =
\ \colon \
{\scr each} \ x_i \in {\Bbb R}
\}$. We can't picture this $n$-dimensional space, but we can use the same types of algebraic equations that work in ${\Bbb R}^{3}$ to analyze ${\Bbb R}^{n}$.

${\Bbb R}^{n}$ is divided into $2^n$ orthants, also known as hyperoctants or $n$-hyperoctants, based on the signs, positive or negative, of the $n$ components of a point. A 2-hyperoctant is a quadrant in ${\Bbb R}^{2}$ and a 3-hyperoctant is an octant in ${\Bbb R}^{3}$. The $x_i$-axis is the set of points where all coordinates except possibly the $i^{\scr th}$ are zero.

A hyperplane in ${\Bbb R}^{n}$ is a set $P$ of points (equivalently, vectors) that are orthogonal to a normal vector ${\bf n} = (a_1, a_2, ... a_n)$. In symbols, $P = \{ {\bf x} \in {\Bbb R}^{n} \ \colon \ {\bf n} \cdot {\bf x} = 0 \}$. For those familiar with linear algebra, the hyperplane containing the origin is a vector subspace of ${\Bbb R}^{n}$ of dimension $n-1$, i.e. co-dimension 1. A hyperplane in ${\Bbb R}^{2}$ is a line, and a hyperplane in ${\Bbb R}^{3}$ is a plane.

How many $n$-hyperoctants does a hyperplane $P \subset {\Bbb R}^{n}$ intersect, given that it contains the origin, but does not intersect the axes at any other point?

To answer this question, we can use the discussion above from the $n=2$ and $n=3$ cases and generalize the results. We can then prove the answer is correct using induction.

When we went from $n=2$ to $n=3$, we took the equation $ax+by=0$ (i.e. the line in ${\Bbb R}^{2}$ with normal vector $(a,b)$), extended it into 3-dimensional space to make the plane whose normal vector is $(a,b,0)$, and then added a non-zero third coordinate to the normal vector to "tilt" the plane off of the $z$-axis.

Now, a hyperplane (including the line and plane in the $n=2$ and $n=3$ cases) is orthogonal to its normal vector ${\bf n}$ as well as the negative of the normal vector, $-{\bf n}$. In fact, the hyperplane is orthogonal to any scalar multiple of ${\bf n}$, but my point in mentioning $-{\bf n}$ is that the hyperplane won't intersect the $n$-hyperoctants that contains ${\bf n}$ or $-{\bf n}$.

Let's look at the case where ${\bf n}$ lies in the first $n$-hyperoctant, i.e. has all positive coordinates. As mentioned above, the other cases are pretty much the same because of the symmetries of the equation ${\bf n} \cdot {\bf x} = \sum_{i=1}^{n}{a_i x_i} = 0$, so the number of $n$-hyperoctants the hyperplane intersects is the same in all cases. In the case that the $a_i$ are positive, the hyperplane doesn't intersect the first $n$-hyperoctant or the one with all negative coordinates (whatever number we want to assign to that one).

In the ${\Bbb R}^{2}$ case, the line intersects quadrants 2 and 4. When we extended ${\bf n} = (a,b)$ to ${\bf n} = (a,b,0)$ in ${\Bbb R}^{3}$, we got a plane that contained the entire $z$-axis. The intersection of this plane with the $xy$-plane is the line $ax+by=0$, which remains the case regardless of the third coordinate of ${\bf n}$. Now, this plane intersects the octants $(+,-,+)$, $(+,-,-)$, $(-,+,+)$, and $(-,+,-)$. We took the original quadrants 2 and 4 and multiplied them by 2 to get 4 octants.

When we add a non-zero third coordinate to ${\bf n}$ (let's assume it's positive), the new plane also intersects two additional octants: $(+,+,-)$ and $(-,-,+)$. The first two coordinates of these two would have not been included in the 2-d case, but the third coordinate allows us to use those combinations and still get the equation $ax+by+cz$ to equal zero. $(+,+,+)$ and $(-,-,-)$ still don't work though.

The same logic works when going from ${\Bbb R}^{n}$ to ${\Bbb R}^{n+1}$ when $n>2$, and we can prove it by induction.

Thoerem: For $n \geq 2$, a hyperplane in ${\Bbb R}^{n}$ containing the origin, but not intersecting the coordinate axes at any other point, intersects $2^{n}-2$ $n$-hyperoctants.

The proof is a bit lengthy, but basically just formalizes the idea of extending the line in ${\Bbb R}^{2}$ into a plane in ${\Bbb R}^{3}$ and then tilting it off the $z$-axis

Proof: The base case of $n=2$ was already shown above.

For the induction step, assume the theorem is true for ${\Bbb R}^{n-1}$, and consider a hyperplane $P = \{{\bf x} \in {\Bbb R}^{n} \ \colon \ {\bf n} \cdot {\bf x}=0 \}$ where ${\bf n} = (a_1,a_2,...,a_n)$.

The equation of $P$ is $\sum_{i=1}^{n}{a_i x_i} = \sum_{i=1}^{n-1}{a_i x_i} + a_n x_n = 0$. By the induction hypothesis, the solutions to the equation $\sum_{i=1}^{n-1}{a_i x_i} = 0$ intersect $2^{n-1}-2$ $(n-1)$-hyperoctants. Let's call those solutions $P_{n-1}$, which is a hyperplane in ${\Bbb R}^{n-1}$

Take a point ${\bf x}_{0, n-1} = (x_{0,1}, x_{0,2},...,x_{0,n-1}) \in {\Bbb R}^{n-1}$ which satisfies the equation of $P_{n-1}$. If $x_{0,1}>0$, then $x_{0,1}+\epsilon>0$ as well, where $\epsilon = \frac{1}{2}|x_{0,1}|$. Similarly, if $x_{0,1}<0$, then $x_{0,1}+\epsilon<0$ as well, so the point ${\bf x}_{1,n-1} = (x_{0,1}+\epsilon, x_{0,2},...,x_{0,n-1})$ is in the same $(n-1)$-hyperoctant as ${\bf x}_{0, n-1}$. By a similar argument, so is the point ${\bf x}_{2,n-1} = (x_{0,1}-\epsilon, x_{0,2},...,x_{0,n-1})$.

Define the points ${\bf x}_{1} = (x_{0,1}+\epsilon, x_{0,2},..., x_{0,n-1}, -\frac{a_1}{a_n}\epsilon), \ {\bf x}_{2} = (x_{0,1}-\epsilon, x_{0,2},..., x_{0,n-1}, \frac{a_1}{a_n}\epsilon) \in {\Bbb R}^{n}$. Then $$
{\bf n} \cdot {\bf x}_{1}
&= a_1 (x_{0,1}+\epsilon) + a_2 x_{0,2} + ... + a_{n-1} x_{0,n-1} + a_n (-\frac{a_1}{a_n}\epsilon) \\[2mm]
&= a_1 (x_{0,1}+\epsilon -\epsilon) + a_2 x_{0,2} + ... + a_{n-1} x_{0,n-1} \\[2mm]
&= a_1 x_{0,1} + a_2 x_{0,2} + ... + a_{n-1} x_{0,n-1} = 0
with the final equality being true because ${\bf x}_{0,n-1} \in P_{n-1}$.

This shows that ${\bf x}_{1} \in P$. Similarly, ${\bf x}_{2} \in P$. The first $n-1$ coordinates of these two points are in the same $(n-1)$-hyperoctant as ${\bf x}_{0,n-1}$, and the $n^{\scr th}$ coordinates of ${\bf x}_{1}$ and ${\bf x}_{2}$ have opposite sign. This shows that we have kept the $2^{n-1}-2$ $(n-1)$-hyperoctants of the $(n-1)$-hyperplane when we extended it into ${\Bbb R}^{n}$ and actually multiplied them by 2 (by adding both positive and negative $n^{\scr th}$ coordinates) to get $2(2^{n-1}-2)$ = $2^{n}- 4$ $n$-hyperoctants.

We just need to show that we've also added two more $n$-hyperoctants. These are the ones where the first $n-1$ coordinates all have the same sign or all have the opposite sign as the first $n-1$ coordinates of ${\bf n}$, just like when we went from $n=2$ to $n=3$ above. Examples of solutions to the equation of $P$ that are in those 2 $n$-hyperoctants are $(a_1,a_2,...,a_{n-1},-\dfrac{1}{a_n}\sum_{i=1}^{n-1}{a_i^2})$ and $(-a_1,-a_2,...,-a_{n-1},\dfrac{1}{a_n}\sum_{i=1}^{n-1}{a_i^2})$.

So now we are up to $2^{n}-4+2 = 2^{n}-2$, so $P$ intersects at least that many $n$-hyperoctants. There are only 2 more $n$-hyperoctants, and those are the ones that contain $\pm {\bf n}$, but we already know that points in those $n$-hyperoctants cannot satisfy the equation of ${\bf n} \cdot {\bf x} = 0$, so $P$ intersects exactly $2^{n}-2$ of the $2^n$ $n$-hyperoctants, and the theorem is proved.

Here's a diagram illustrating the objects described in the proof in the case where $n=3$ and $n-1=2$. Apologies for the low quality (I made it in MS Paint), but note that the bottom of the red plane, $P$, comes out towards the viewer, in front of the blue plane, and the top half of $P$ is behind the blue plane. The points ${\bf x}_{0,n-1}$, ${\bf x}_{1,n-1}$, and ${\bf x}_{2,n-1}$ are in in the $x_{1}x_{2}$-plane, with ${\bf x}_{1,n-1}$ in front of $P$ and ${\bf x}_{2,n-1}$ behind $P$.

Thanks for reading. Post any questions in the comments section.


  1. \mathfrak Fantastic!

    Question: Is hyperplane, same as hypersurface?
    whatever u told here about hyperoctants, hold good?

  2. A hypersurface is a generalization of a hyperplane in the context of manifolds. I'm not sure the question about octants would make sense in that context since a manifold need not be in ${\Bbb R}^{n}$ (though it has a map to ${\Bbb R}^{n}$ in a neighborhood of each point). Maybe there's a way to reformulate the question to apply in that context...