
Power Means (AM-GM Part 3)

In the last two posts on the AM-GM inequality (Part 1 and Part 2), I introduced two types of averages: the arithmetic mean and the geometric mean. We saw that the geometric mean was the more appropriate choice for portfolio returns (due to the compounding effect) and that the geometric mean is always less than the arithmetic mean (or equal when all inputs are equal).

In essence, the AM-GM inequality tells us that the GM weights smaller inputs more than the AM would; a loss in one year continues to affect a portfolio in the following years, since next year's percentage gain applies to a smaller portfolio balance. Thus the GM takes into account the relevance of the product of the inputs, while the AM applies when the sum of the inputs is most relevant.

In this post, I will introduce a family of different "average" formulas and show how the AM-GM inequality is a special case of a more general result.


The Harmonic Mean ($p=-1$)


Suppose we take a 3-leg trip where we travel 20 miles in 30 minutes (40 mph) in the first leg, 65 miles in 60 minutes (65 mph) in the second leg, and 90 miles in 90 minutes (60 mph) in the third leg. At the most basic level, our "average speed" for the trip should be the constant speed that one would need to travel for the total time, 180 minutes, to cover the total distance, 175 miles, or $$
\dfrac{175 \text{ miles}}{180 \text{ min}} \times \dfrac{60 \text{ min}}{1 \text{ hour}} = 58.33 \ \dfrac{\text{miles}}{\text{hour}}
$$ This is analogous to the portfolio example, in which the GM (CAGR) was the constant return which would yield the final portfolio value from the initial one by annual compounding (i.e. multiplying instead of adding) over the full time period in question.

In our travel example, the arithmetic mean will give us the correct result if we use a time-weighted AM: $$
\left( 40 \times \dfrac{30}{180} \right) + \left( 65 \times \dfrac{60}{180} \right) + \left( 60 \times \dfrac{90}{180} \right) = 58.33 \ \checkmark
$$ However, if we weight by distance instead, the AM will return 59.57 mph, an overestimation: $$
\left( 40 \times \dfrac{20}{175} \right) + \left( 65 \times \dfrac{65}{175} \right) + \left( 60 \times \dfrac{90}{175} \right) = 59.57 \ ( \text{overestimate})
$$ When taking an average of ratios (speed is the ratio of distance to time) and weighting by the numerator, we need to use a different average, the (weighted) harmonic mean, defined as the reciprocal of the arithmetic mean of the reciprocals: $$
\text{HM}(x_1, x_2, \dotsc, x_n) = \left( \sum_{i=1}^{n}{w_i  x_i^{-1}} \right)^{-1}
$$ where the $w_i$'s are a sequence of weights which add up to 1; the unweighted HM is simply the special case where all weights equal $1/n$. Note that in order to avoid issues with division by zero, we restrict the definition of the HM to positive inputs only.

The table below shows the results of the AM and HM for our travel example:


In addition to confirming that the HM yields the "average" that we sought in the case of numerator-weighting, the table also suggests that the HM and AM satisfy an inequality similar to AM-GM. Indeed they do, as we shall see in the next section.
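The travel-example figures can be recomputed with a few lines of Python. This is only a minimal sketch: the helper names `normalize`, `weighted_am`, and `weighted_hm` are illustrative choices, not functions from any library. It evaluates both means under both weightings.

```python
# Speeds (mph), times (hours), and distances (miles) for the three legs.
speeds = [40.0, 65.0, 60.0]
times = [0.5, 1.0, 1.5]
distances = [20.0, 65.0, 90.0]

def normalize(values):
    """Scale positive values so that they sum to 1 (illustrative helper)."""
    total = sum(values)
    return [v / total for v in values]

def weighted_am(xs, ws):
    """Weighted arithmetic mean: sum of w_i * x_i."""
    return sum(w * x for x, w in zip(xs, ws))

def weighted_hm(xs, ws):
    """Weighted harmonic mean: reciprocal of the sum of w_i / x_i."""
    return 1.0 / sum(w / x for x, w in zip(xs, ws))

time_w = normalize(times)
dist_w = normalize(distances)

am_time = weighted_am(speeds, time_w)   # time-weighted AM
am_dist = weighted_am(speeds, dist_w)   # distance-weighted AM
hm_time = weighted_hm(speeds, time_w)   # time-weighted HM
hm_dist = weighted_hm(speeds, dist_w)   # distance-weighted HM

print(f"AM, time-weighted:     {am_time:.2f} mph")
print(f"AM, distance-weighted: {am_dist:.2f} mph")
print(f"HM, time-weighted:     {hm_time:.2f} mph")
print(f"HM, distance-weighted: {hm_dist:.2f} mph")
```

The time-weighted AM and the distance-weighted HM both come out to $175/3 \approx 58.33$ mph, and for either choice of weights the HM is smaller than the AM.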


Power Means


Let $x_1, x_2, \dotsc, x_n$ be positive real numbers and $p$ be a non-zero real number (not necessarily an integer), and let $w_1, w_2, \dotsc, w_n$ be a sequence of positive weights which add up to 1. Then the (weighted) power mean with exponent $p$ of the $x_i$'s is defined as $$
M_{p}(x_1, x_2, \dotsc, x_n) = \left( \sum_{i=1}^{n}{w_i x_i^p} \right)^{1/p} \tag{$\star$}
$$ Note that the unweighted version is obtained by setting all weights equal to $1/n$.

The AM, GM, and HM are each special cases: $M_1$ is the arithmetic mean, and $M_{-1}$ is the harmonic mean. Now, due to the division by $p$ in the exponent, we can't technically define the case of $p=0$ directly from $( \star )$, so instead we define $M_0$ to be the geometric mean, which turns out (proof deferred to the end of this post) to equal $\displaystyle{\lim_{p \rightarrow 0}{M_p}}$.

Similarly, we define $M_{-\infty}(x_1,x_2, \dotsc , x_n)=\min(x_1, x_2, \dotsc, x_n)$ and $M_{\infty}(x_1,x_2, \dotsc , x_n)=\max(x_1, x_2, \dotsc, x_n)$. The two relevant limits also turn out (proof deferred to the end as well) to equal these definitions. All of the power means are "averages" in the sense that they fall between the minimum and maximum values of the inputs, i.e. $$
M_{-\infty}(x_1, x_2, \dotsc, x_n) \leq M_{p}(x_1, x_2, \dotsc, x_n) \leq M_{\infty}(x_1, x_2, \dotsc, x_n)
$$ for all values of $p$. Furthermore, at the end of the post, we'll prove the following result, which implies the above, as well as the AM-GM and AM-HM inequalities.

Power Means Inequality: For $-\infty \leq p < q \leq +\infty$ and positive inputs $x_1, x_2, \dotsc, x_n$, we have $$
M_{p}(x_1, x_2, \dotsc, x_n) \leq M_{q}(x_1, x_2, \dotsc, x_n)
$$ with equality if and only if $x_1 = x_2 = \dotsb = x_n$.
$\square$
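To make the definitions concrete, here is a minimal Python sketch of $M_p$, assuming positive inputs and weights that sum to 1. The function name `power_mean` and its signature are assumptions made for illustration. It treats $p=0$ and $p=\pm\infty$ as the special cases defined above and prints a short chain of means in increasing order of $p$.

```python
import math

def power_mean(xs, p, ws=None):
    """Weighted power mean M_p of positive inputs xs.

    ws defaults to equal weights 1/n. p may be any real number,
    including 0 (geometric mean) and +/- math.inf (max / min).
    """
    n = len(xs)
    if ws is None:
        ws = [1.0 / n] * n
    if any(x <= 0 for x in xs):
        raise ValueError("inputs must be positive")
    if p == math.inf:
        return max(xs)
    if p == -math.inf:
        return min(xs)
    if p == 0:
        # M_0 is defined as the weighted geometric mean.
        return math.exp(sum(w * math.log(x) for x, w in zip(xs, ws)))
    return sum(w * x ** p for x, w in zip(xs, ws)) ** (1.0 / p)

xs = [1.0, 4.0, 16.0]
exponents = [-math.inf, -1, 0, 1, 2, math.inf]
means = [power_mean(xs, p) for p in exponents]

for p, m in zip(exponents, means):
    print(f"p = {p}: M_p = {m:.4f}")
```

For these inputs the arithmetic mean is 7 and the geometric mean is 4 (up to floating-point rounding), and the printed values never decrease as $p$ grows.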

Below is a chart of the power means for different values of $p$ with the $x_i$'s being the set of numbers $\{ 1,2,3,4,5,6,7,8,9,10 \}$ and the weights all equal to $1/10$, i.e. unweighted means:


As expected based on the power means inequality, the orange graph of $M_p$ is strictly upward-sloping with increasing $p$. If you want to play around with the input set and weights and see how this affects the chart, you can download my Excel sheet at this link. After clicking the link, make sure to press the download button at the top so that you can open the file in Excel instead of Google Docs.
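The values behind such a chart can also be tabulated directly. The sketch below is an illustration, not the Excel sheet itself; the particular list of exponents is an arbitrary choice. It computes the unweighted $M_p$ of $\{1, 2, \dotsc, 10\}$ for a handful of exponents and checks that the values increase with $p$.

```python
import math

def power_mean(xs, p):
    """Unweighted power mean M_p; p = 0 gives the geometric mean."""
    n = len(xs)
    if p == 0:
        return math.exp(sum(math.log(x) for x in xs) / n)
    return (sum(x ** p for x in xs) / n) ** (1.0 / p)

xs = list(range(1, 11))
ps = [-10, -5, -2, -1, -0.5, 0, 0.5, 1, 2, 5, 10]  # arbitrary sample of exponents
values = [power_mean(xs, p) for p in ps]

for p, m in zip(ps, values):
    print(f"p = {p:>5}: M_p = {m:.4f}")

# Each value should exceed the previous one, and all should lie in [1, 10].
is_increasing = all(a < b for a, b in zip(values, values[1:]))
print("strictly increasing:", is_increasing)
```

Changing `xs` or the list `ps` is a quick way to see how the curve responds to different inputs.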


Now, let's look at a few examples of the power means inequality applied.

Example 1: If $p,q,x,y>0$ and $p+q \leq 1$, then we have $(px+qy)^2 \leq px^{2} + qy^{2}$.

Proof: Applying the power means inequality with weights $\frac{p}{p+q}$ and $\frac{q}{p+q}$ to $M_{1}(x,y)$ (the arithmetic mean) and $M_{2}(x,y)$ (called the quadratic mean or root mean square), we obtain:$$
\begin{alignat}{5}
&&\frac{p}{p+q}x &+ \frac{q}{p+q}y \ &
&\stackrel{(1)}{\leq}& \ \ &\left( \frac{p}{p+q}x^{2} + \frac{q}{p+q}y^{2} \right)^{1/2} \\[2mm]

\stackrel{(2)}{\implies} \ &&px&+qy \ & &\leq& \ \ &(p+q)^{1/2} \left( px^{2} + qy^{2} \right)^{1/2} \\[2mm]

\stackrel{(3)}{\implies} \ &&(px&+qy)^{2}\  & &\leq& \ \ &(p+q) \left( px^{2} + qy^{2} \right) \\[2mm]

\stackrel{(4)}{\implies} \ &&(px&+qy)^{2} \  & &\leq& \ \ & px^{2} + qy^{2} \\
\end{alignat}
$$ (1) is the power means inequality. (2) is multiplication by $(p+q)$ on both sides, and (3) is squaring both sides; since both sides are positive, neither of these operations reverses the direction of the inequality. Finally, (4) holds because $p+q \leq 1$.
$\square$
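A numerical spot check is no substitute for the proof, but it is a cheap way to catch algebra slips. The sketch below samples random admissible values and records the smallest gap between the two sides. The sampling ranges and the helper name `example1_gap` are arbitrary illustrative choices.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def example1_gap(p, q, x, y):
    """Right side minus left side of (px + qy)^2 <= p x^2 + q y^2."""
    return p * x**2 + q * y**2 - (p * x + q * y) ** 2

worst = float("inf")
for _ in range(10_000):
    total = random.uniform(0.01, 1.0)    # p + q, at most 1
    share = random.uniform(0.01, 0.99)   # fraction of the total given to p
    p, q = total * share, total * (1.0 - share)
    x, y = random.uniform(0.01, 100.0), random.uniform(0.01, 100.0)
    worst = min(worst, example1_gap(p, q, x, y))

print("smallest gap found:", worst)
```

The gap should never be negative beyond rounding error, and it vanishes when $x=y$ and $p+q=1$.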

Example 2: If $a,b,c \geq 0$, then $\sqrt{3(a+b+c)} \geq \sqrt{a} + \sqrt{b} + \sqrt{c}$.

Proof: This one is quite simple with an application of the power means inequality: $$
\underbrace{\frac{a+b+c}{3}}_{M_{1}(a,b,c)} \geq
\underbrace{\left( \frac{\sqrt{a} + \sqrt{b} + \sqrt{c}}{3} \right)^{2}}_{M_{1/2}(a,b,c)}
$$ from which the desired inequality follows immediately by taking the square root and then multiplying by 3 on both sides. Once again, these operations do not reverse the direction of the inequality.
$\square$
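To see the inequality and its equality case in numbers, here is a small Python sketch. The function name `example2_sides` and the sample inputs are illustrative choices.

```python
import math

def example2_sides(a, b, c):
    """Return (left, right) of sqrt(3(a+b+c)) >= sqrt(a) + sqrt(b) + sqrt(c)."""
    left = math.sqrt(3.0 * (a + b + c))
    right = math.sqrt(a) + math.sqrt(b) + math.sqrt(c)
    return left, right

print(example2_sides(1.0, 4.0, 9.0))   # unequal inputs: left exceeds right
print(example2_sides(4.0, 4.0, 4.0))   # equal inputs: the two sides coincide
print(example2_sides(0.0, 0.0, 5.0))   # zeros are allowed in this example
```

With $a=b=c$ the two sides agree, which matches the equality condition of the power means inequality.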

Finally, like AM-GM, we can use the power means inequality to verify the minimum value of a function subject to a constraint. Note that in the examples of Part 1, a volume-like constraint on the product of the inputs, along with a suitable function involving a sum, allowed us to use the AM-GM inequality. In this last example, however, the function consists of a sum of reciprocals, symmetric in the inputs, and thus calls for the harmonic mean; the constraint, a symmetric sum of the inputs, calls for the arithmetic mean.

Example 3: Let $x,y,z>0$ with $x+y+z=1$. Find the minimum value of the function $$
f(x,y,z) = \frac{1}{x} + \frac{1}{y} + \frac{1}{z}
$$ Solution: The AM-HM inequality (power means inequality with $p=-1$ and $q=1$) implies that $$
\begin{align}
\frac{1}{f(x,y,z)} =
\underbrace{\frac{1}{\frac{1}{x} + \frac{1}{y} + \frac{1}{z}}}_{\frac{1}{3}M_{-1}(x,y,z)}
&\leq
\underbrace{\frac{1}{3} \cdot \frac{x+y+z}{3}}_{\frac{1}{3}M_{1}(x,y,z)} \\
&= \frac{x+y+z}{9} \\
&= \frac{1}{9} \tag{$x+y+z=1$}
\end{align}
$$ Since $x,y,z>0$, so is $f(x,y,z)$, so we can multiply both sides by $9f(x,y,z)$ without reversing the direction of the inequality, and we obtain $$
9 \leq f(x,y,z)
$$ Since $f(\frac{1}{3},\frac{1}{3},\frac{1}{3})=9$, this must be the minimum value.
$\square$
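The bound can also be probed numerically. The sketch below samples random positive points with $x+y+z=1$ and compares the smallest sampled value of $f$ with the value at $(\frac{1}{3},\frac{1}{3},\frac{1}{3})$. The sampling scheme and the helper name `random_simplex_point` are illustrative assumptions.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible

def f(x, y, z):
    """The objective from Example 3."""
    return 1.0 / x + 1.0 / y + 1.0 / z

def random_simplex_point():
    """Positive x, y, z with x + y + z = 1 (up to rounding)."""
    raw = [random.uniform(0.01, 1.0) for _ in range(3)]
    total = sum(raw)
    return [r / total for r in raw]

smallest = min(f(*random_simplex_point()) for _ in range(10_000))

print("smallest sampled value:", smallest)
print("value at (1/3, 1/3, 1/3):", f(1 / 3, 1 / 3, 1 / 3))
```

No sampled point should fall below 9, in line with the AM-HM argument.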

The main take-away from these examples is that the power means inequality generates all sorts of inequalities as long as we have an expression that looks something like $M_p$ for some $p$. We have the flexibility to choose any $p$ and $q$ that suit our needs for the particular problem.

That concludes the non-proof portion of this post.


Deferred Proofs


Let's start with the proofs (straight from Wikipedia) of the limit cases $\lim_{p \rightarrow 0}{M_p}$ and $\lim_{p \rightarrow \pm \infty}{M_p}$.

Proof that $\lim_{p \rightarrow 0}{M_p} = M_0$ (the GM): To begin, note that $$
\begin{align}

\lim_{p \rightarrow 0}{M_p}

&= \lim_{p \rightarrow 0}{\left( \sum_{i=1}^{n}{w_i x_i^p} \right)^{1/p}} \\[3mm]

&= \lim_{p \rightarrow 0}{\exp \left( \ln \left[ \left( \sum_{i=1}^{n}{w_i x_i^p} \right)^{1/p} \right] \right)} \tag{$\exp(\ln[z])=z$ for $z>0$}\\[3mm]

&= \exp \left( \lim_{p \rightarrow 0}{\ln \left[ \left( \sum_{i=1}^{n}{w_i x_i^p} \right)^{1/p} \right]} \right) \tag{since exp is continuous} \\[3mm]

&\stackrel{( \spadesuit )}{=} \exp \left( \lim_{p \rightarrow 0}{
\frac{\ln \left( \sum_{i=1}^{n}{w_i x_i^p} \right)}{p}
} \right) \tag{properties of logs}

\end{align}
$$ We can use L'Hôpital's rule to evaluate this limit since it is the indeterminate form $\frac{0}{0}$: $$
\begin{align}
\lim_{p \rightarrow 0}{
\frac{\ln \left( \sum_{i=1}^{n}{w_i x_i^p} \right)}{p}
}
&= \lim_{p \rightarrow 0}{
\frac
{ \frac{d}{dp} \ln \left( \sum_{i=1}^{n}{w_i x_i^p} \right)}
{ \frac{d}{dp} p}
} \\[5mm]

&= \lim_{p \rightarrow 0}{
\frac
{ \frac{\sum_{i=1}^{n}{w_i x_i^p \ln(x_i)}}{\sum_{i=1}^{n}{w_i x_i^p}}}
{1}
} \\[3mm]

&= \lim_{p \rightarrow 0}{
\frac{\sum_{i=1}^{n}{w_i x_i^p \ln(x_i)}}{\sum_{i=1}^{n}{w_i x_i^p}}
} \\[3mm]

&= \frac{\sum_{i=1}^{n}{w_i \ln(x_i)}}{\sum_{i=1}^{n}{w_i}} \tag{$\lim \nolimits _{p \rightarrow 0}{x_{i}^{p}}=1$} \\[3mm]

&= \frac{\sum_{i=1}^{n}{w_i \ln(x_i)}}{1} \tag{weights sum to 1} \\[3mm]

&= \ln \left[ \prod_{i=1}^{n}{x_{i}^{w_i}} \right] \tag{properties of logs}

\end{align}
$$ Finally, we can substitute the result of the limit evaluation back into $( \spadesuit )$: $$
\begin{align}
\lim_{p \rightarrow 0}{M_p}
&= \exp \left( \ln \left[ \prod_{i=1}^{n}{x_{i}^{w_i}} \right] \right) \\[2mm]

&= \prod_{i=1}^{n}{x_{i}^{w_i}} \\[2mm]

&= M_{0}(x_1, x_2, \dotsc , x_n) \tag{definition of weighted GM}
\end{align}
$$ $\square$

Note that in the last line, the definition of the weighted GM coincides with the definition of the unweighted GM we saw in Part 1 when we set $w_i = \frac{1}{n}$ for each $i$: $$
\prod_{i=1}^{n}{x_{i}^{1/n}} = \prod_{i=1}^{n}{\sqrt[n]{x_{i}}} = \sqrt[n]{x_1 x_2 \dotsm x_n}
$$
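The convergence $M_p \rightarrow M_0$ is easy to observe numerically. The following sketch uses illustrative inputs and weights. It evaluates $M_p$ straight from $( \star )$ for exponents approaching 0 from both sides and prints the weighted GM alongside.

```python
import math

def power_mean(xs, ws, p):
    """Weighted power mean for nonzero p, straight from the definition."""
    return sum(w * x ** p for x, w in zip(xs, ws)) ** (1.0 / p)

def geometric_mean(xs, ws):
    """Weighted geometric mean: product of x_i ** w_i."""
    return math.prod(x ** w for x, w in zip(xs, ws))

xs = [1.0, 4.0, 16.0]
ws = [0.2, 0.3, 0.5]   # positive weights summing to 1
gm = geometric_mean(xs, ws)

for p in [1.0, 0.1, 0.01, 0.001, -0.001, -0.01, -0.1, -1.0]:
    print(f"p = {p:>6}: M_p = {power_mean(xs, ws, p):.6f}   (GM = {gm:.6f})")
```

Values with small positive $p$ sit just above the GM and values with small negative $p$ sit just below it, as the power means inequality predicts.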
Proof that $\lim \limits_{p \rightarrow \infty}{M_{p}} = M_{\infty}$ (maximum): To begin, assume that $x_1 \geq x_2 \geq \dotsb \geq x_n$. This is without loss of generality because if our original list does not satisfy this condition, we can rearrange the $x_i$'s. Note that we also rearrange the weights $w_i$ so that this does not change the value of the sum in the formula for $M_p$. Thus, we have: $$
\lim_{p \rightarrow \infty}{M_p}

= \lim_{p \rightarrow \infty}{\left( \sum_{i=1}^{n}{w_i x_i^p} \right)^{1/p}}

= x_1 \lim_{p \rightarrow \infty}{
\left( \sum_{i=1}^{n}{w_i \left( \frac{x_i}{x_1} \right)^p} \right)^{1/p}
}

=x_1
$$ where the final equality holds because each ratio $x_i / x_1 \leq 1$ due to our ordering of the $x_i$'s. The sum inside the parentheses is therefore at most $\sum_{i} w_i = 1$ and at least its first term $w_1 > 0$, so the limit is squeezed between $\lim_{p \rightarrow \infty} w_1^{1/p} = 1$ and $\lim_{p \rightarrow \infty} 1^{1/p} = 1$. Furthermore, $x_1 = \max(x_1, x_2, \dotsc, x_n)$ due to the ordering.
$\square$

Proof that $\lim_{p \rightarrow -\infty}{M_p} = M_{-\infty}$ (minimum): This proof piggybacks off of the $+\infty$ case.

Note that for $p>0$, $$
M_{-p}(x_1, x_2, \dotsc, x_n) = \left( \sum_{i=1}^{n}{w_i x_i^{-p}} \right)^{-1/p} = \frac{1}{M_{p}(\frac{1}{x_1}, \frac{1}{x_2}, \dotsc, \frac{1}{x_n})}
$$ Taking the limit as $p \rightarrow \infty$, $$
\begin{align}
\lim_{p \rightarrow \infty}{M_{-p}(x_1, x_2, \dotsc, x_n)}

&= \frac{1}{
\lim_{p \rightarrow \infty}{M_{p}(\frac{1}{x_1}, \frac{1}{x_2}, \dotsc, \frac{1}{x_n})}
} \\[3mm]

&= \frac{1}{
M_{\infty}(\frac{1}{x_1}, \frac{1}{x_2}, \dotsc, \frac{1}{x_n})
} \\[3mm]

&=\frac{1}{
\max(\frac{1}{x_1}, \frac{1}{x_2}, \dotsc, \frac{1}{x_n})
}
\end{align}
$$ Finally, $\max(\frac{1}{x_1}, \frac{1}{x_2}, \dotsc, \frac{1}{x_n}) = \frac{1}{\min(x_1, x_2, \dotsc, x_n)}$, so that $\lim_{p \rightarrow \infty}{M_{-p}(x_1, x_2, \dotsc, x_n)} = M_{-\infty}$.
$\square$
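The factoring trick from the $+\infty$ proof is also useful in practice: evaluating $x_i^p$ directly can overflow for large $|p|$, while the ratios stay bounded by 1. The sketch below is an illustration under that assumption, and the name `power_mean_stable` is made up for this example. It factors out the largest input for $p>0$ and the smallest for $p<0$.

```python
def power_mean_stable(xs, ws, p):
    """M_p for nonzero p, computed by factoring out the dominant input.

    For p > 0 the largest input is factored out, for p < 0 the smallest,
    so every ratio raised to the power p is at most 1 and cannot overflow.
    """
    pivot = max(xs) if p > 0 else min(xs)
    scaled = sum(w * (x / pivot) ** p for x, w in zip(xs, ws))
    return pivot * scaled ** (1.0 / p)

xs = [2.0, 3.0, 7.0]
ws = [0.5, 0.3, 0.2]   # positive weights summing to 1

for p in [1, 10, 100, 1000]:
    up = power_mean_stable(xs, ws, p)
    down = power_mean_stable(xs, ws, -p)
    print(f"p = {p:>4}: M_p = {up:.5f}, M_-p = {down:.5f}")
```

As $p$ grows, $M_p$ approaches $\max(x_i)=7$ from below and $M_{-p}$ approaches $\min(x_i)=2$ from above.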

Now to the main event (at least for the nerds who read this far): the proof of the power means inequality. We'll start with two lemmas.

Lemma 1: $M_p \geq M_q \iff M_{-p} \leq M_{-q}$.

Proof: Suppose $M_p = \sqrt[p]{\sum_{i=1}^{n}{w_i x_i^p}} \geq \sqrt[q]{\sum_{i=1}^{n}{w_i x_i^q}} = M_q$. Since this inequality holds for any list of $n$ input values, we can replace each $x_i$ with its reciprocal, so we have $$
\sqrt[p \uproot 3]{\sum_{i=1}^{n}{w_i x_i^{-p}}} \geq \sqrt[q \uproot 3]{\sum_{i=1}^{n}{w_i x_i^{-q}}}
$$ The map $z \mapsto z^{-1}$ is strictly decreasing on the positive reals, so raising both sides to the power of $-1$ (which replaces the $\sqrt[p]{\ \ }$ with $\sqrt[-p]{\ \ }$) reverses the direction of the inequality. Thus $$
\sqrt[-p \uproot 3]{\sum_{i=1}^{n}{w_i x_i^{-p}}} \leq \sqrt[-q \uproot 3]{\sum_{i=1}^{n}{w_i x_i^{-q}}}
$$ That is, $M_{-p} \leq M_{-q}$. Since the same arguments all work in reverse, this proves the asserted equivalence.
$\square$

Lemma 2: For any $p>0$, $M_{-p} \leq M_0 \leq M_p$.

Proof: Using properties of logarithms $( \dagger )$ and then Jensen's inequality $( \ddagger )$ (which applies in reverse since the $\ln$ is concave instead of convex), we have: $$
\ln \left( \prod_{i=1}^{n}{x_i^{w_i}} \right)
\stackrel{( \dagger )}{=}
\sum_{i=1}^{n}{w_i \ln(x_i)}
\stackrel{( \ddagger )}{\leq}
\ln \left( \sum_{i=1}^{n}{w_i x_i} \right)
$$ Since the exponential function ($f(z) = e^z$) is a strictly increasing function of the input $z$, we can apply it to both sides without reversing the direction of the inequality. This eliminates the logs and yields $$
\begin{align}
&&   &&\prod_{i=1}^{n}{x_i^{w_i}} &\leq \ \sum_{i=1}^{n}{w_i x_i} \\[2mm]

&&\stackrel{( \clubsuit )}{\implies} &&\prod_{i=1}^{n}{x_i^{p w_i}} &\leq \ \sum_{i=1}^{n}{w_i x_i^p} \\[2mm]

&&\stackrel{( \heartsuit )}{\implies} &&\left( \prod_{i=1}^{n}{x_i^{p w_i}} \right)^{1/p} &\leq \ \left( \sum_{i=1}^{n}{w_i x_i^p} \right)^{1/p} \\[2mm]

&&\iff &&\prod_{i=1}^{n}{x_i^{w_i}} &\leq \ \left( \sum_{i=1}^{n}{w_i x_i^p} \right)^{1/p}

\end{align}
$$ In line $( \clubsuit )$, we replaced each $x_i$ by $x_i^p$, which is valid since the inequality $( \ddagger )$ holds for any choice of inputs. In the next line $( \heartsuit )$, we raised both sides to the power of $1/p$; note that raising an argument to a positive power is a strictly increasing function and thus this does not reverse the direction of the inequality. In the final line, we just simplified the left side.

This completes the proof that $M_0 \leq M_p$ for $p>0$. The proof that $M_{-p} \leq M_0$ is exactly the same, except that we substitute $-p$ everywhere there's a $p$ above. This reverses the inequality in line $( \heartsuit )$ since raising to a negative power is a decreasing function.
$\square$

Proof of the power means inequality: There are 3 cases to be proved:

  1. $0<p<q$
  2. $p<q<0$
  3. $p<0<q$

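Lemma 2 tells us that the power mean for any positive (negative) power is always greater (less) than or equal to the geometric mean. This immediately proves the power means inequality for case 3, since $M_p \leq M_0 \leq M_q$ whenever $p<0<q$. The boundary cases $p=0$ or $q=0$ are Lemma 2 itself, and the cases $p=-\infty$ or $q=+\infty$ follow by taking limits of the finite cases. Furthermore, by Lemma 1, cases 1 and 2 are equivalent, so we actually only need to prove the power means inequality for case 1.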

To that end, define the function $f$ which maps a positive real number $x$ to the positive real number $f(x) = x^{q/p}$. The second derivative of this function is $$
f''(x) = \left( \frac{q}{p} \right) \left( \frac{q}{p}-1 \right) x^{q/p-2}
$$ which is positive for $x>0$ since $q/p>1$. Therefore, $f$ is convex, and so by Jensen's inequality, we have $$
f \left( \sum_{i=1}^{n}{w_i x_i^p} \right) \leq \sum_{i=1}^{n}{w_i f(x_i^p)}
$$ that is, $$
\begin{align}
&& &&\left( \sum_{i=1}^{n}{w_i x_i^p} \right)^{q/p} &\leq \sum_{i=1}^{n}{w_i (x_i^p)^{q/p}} \\[2mm]

&&\iff &&\left( \sum_{i=1}^{n}{w_i x_i^p} \right)^{q/p} &\leq \sum_{i=1}^{n}{w_i x_i^q} \\[2mm]

&&\stackrel{( \diamondsuit )}{\implies} &&\left( \sum_{i=1}^{n}{w_i x_i^p} \right)^{1/p} &\leq \left( \sum_{i=1}^{n}{w_i x_i^q} \right)^{1/q}

\end{align}
$$ In line $( \diamondsuit )$, we raised both sides to the power of $1/q$, a positive number, and thus didn't reverse the direction of the inequality. This completes the proof of case 1 and therefore of the power means inequality.
$\square$
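The Jensen step at the heart of case 1 can be spot-checked with random data. This sketch draws random positive inputs, weights, and exponents $0<p<q$, then counts how often $\left( \sum w_i x_i^p \right)^{q/p} \leq \sum w_i x_i^q$ fails. The ranges, the tolerance, and the name `jensen_sides` are arbitrary illustrative choices.

```python
import random

random.seed(2)  # fixed seed so the run is reproducible

def jensen_sides(xs, ws, p, q):
    """Return (left, right) of (sum w_i x_i^p)^(q/p) <= sum w_i x_i^q."""
    left = sum(w * x ** p for x, w in zip(xs, ws)) ** (q / p)
    right = sum(w * x ** q for x, w in zip(xs, ws))
    return left, right

violations = 0
for _ in range(5_000):
    n = random.randint(2, 6)
    xs = [random.uniform(0.1, 10.0) for _ in range(n)]
    raw = [random.uniform(0.1, 1.0) for _ in range(n)]
    ws = [r / sum(raw) for r in raw]          # positive weights summing to 1
    p = random.uniform(0.1, 3.0)
    q = p + random.uniform(0.1, 3.0)          # guarantees 0 < p < q
    left, right = jensen_sides(xs, ws, p, q)
    if left > right * (1.0 + 1e-9):           # small relative tolerance
        violations += 1

print("violations found:", violations)
```

A count of zero is what the proof leads us to expect; when all inputs are equal, the two sides coincide.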

Thanks for reading. Please post any questions in the comments section.
