## Stars and Bars

This post comes in response to the very first reader request:

My request is for a topic on object set problems that come up often on the GRE: permutations and combinations. I know the basics, like combination = order doesn't matter & permutation = order does matter, and the former is reduced in number from the latter. Where it gets dicey for me is when it comes to situations of repetition vs. no repetition.

So, for example, with a combination with repetition (a store sells shirts that come in 6 different colors and I have a coupon to get 3 shirts of whatever color, how many variations of that can I make?) what is the best way to solve for that when you can have two or more colors be the same? I know there are formulas for each scenario that you can use, but I guess I'm curious about the basic proof behind the formulas (especially the combination with repetition).

Indeed, there is a formula for this exact scenario, but as you mentioned, let's use the question from the request above to illustrate the thought processes required to solve this problem in a few different ways, and then derive the general formula as a result.

The question here is how many different sets of 3 shirts we can make given 6 choices of shirt color. If we were picking one shirt for Monday, one for Tuesday, and one for Wednesday (i.e. if order mattered), then there would be $6^3 = 216$ choices, since we can choose from a set of 6, 3 times. But since the order does not matter in this question, this would be double-counting a lot of different combos: if we make 3 sequential selections from the pool of 6 colors, there are 3 which result in, for example, 2 blue shirts and 1 red shirt (R-B-B, B-R-B, and B-B-R). And there would be $6=3!$ (that's 3 factorial) different ways to end up with the set B,Y,G: B-Y-G, B-G-Y, G-B-Y, G-Y-B, Y-G-B, and Y-B-G.

As we can see from the two examples above, starting with $6^3$ and then subtracting out the repetitions is a bit tricky because the number of repetitions actually depends on which colors we choose (more precisely, on how many different colors end up in our set of 3). If each shirt is the same color, there are no repetitions to count; if all 3 shirts are different colors, there are 6 repetitions. And if we choose 1 of one color and 2 of another, there are 3 repetitions. We need to consider these cases separately in order to correctly count all the possibilities.

Case 1: 3 shirts of 3 different colors
In this case, we just need to choose 3 colors from the 6 possible ones, and then our set is completely determined, so there are $6 \choose 3$ possibilities, where $6 \choose 3$ is the binomial coefficient, calculated as $\dfrac{6!}{3! (6-3)!}$.

Case 2: 2 of one color and 1 of another
In order to choose such a set of 3, we need to first choose two colors from the 6. There are $6 \choose 2$ ways to do that. Next, we need to pick which of the two colors will be the color of 2 shirts and which will be the color of 1. There are 2 options for this, so the number of ways to create a set falling under this case is $2 \times {6 \choose 2}$.

Case 3: 3 shirts of the same color
This is the easiest case- there are clearly 6 ways to choose 3 shirts of all the same color, since all we need to do is choose the color. Note that we can also write this as $6 \choose 1$.

Finally, the answer to this question is \begin{align} \text{# 3-shirt sets given 6 color choices} &= \sum_{\text{cases}}{\text{# per case}} \\[3mm] &= {6 \choose 3} + \left[ 2 \times {6 \choose 2} \right] + {6 \choose 1} \\[3mm] &= 20 + [2 \times 15] + 6 \\[3mm] &= 56 \end{align} We can actually back out the $6^3$ from the case where order matters based on the above by multiplying by the number of repetitions in each case: $6^3 = 216 = (6 \times 20) + (3 \times 2 \times 15) + 6$.

### The general formula: "stars and bars" (or "shirts and bars"...)

In the example above, in the case where we choose 2 colors, say red and blue, for 3 shirts, we need to choose how many of the 3 will be red and how many will be blue. Obviously, there are 2 choices for this: 1-2 and 2-1.

If we had needed to pick, for example, 5 shirts, it would have been a bit more complicated: we could have chosen between 1 and 5 different colors. If we look at the case where we choose 3 different colors (say blue, red, and fuschia) for 5 shirts, we would need to choose how many shirts would be each of the 3 colors; there are 6 ways to do this: 1-1-3, 1-3-1, 3-1-1, 1-2-2, 2-1-2, and 2-2-1. There are also numerous other combinations to count, so this case is a lot harder to count directly than the example above.

Luckily, there is an easy tool called stars and bars that helps us visualize and count in these tougher examples. I'm going to use shirts and bars instead since we're talking about shirts, but it's the same thing.

If we have 5 shirts and 3 colors, we can see in the pictures below that there are 5 shirts, and thus $5-1=4$ slots in which to insert $3-1=2$ bars to divide the shirts into 3 groups to be colored in. The second/third pictures corresponds to option 1-2-2 from above.

This clearly illustrates that if there are 6 colors, and we need to choose 5 shirts, then in the case where we have 3 different colors, the number of possible combinations is: \begin{align} &\text{ways to choose 3 colors from 6} \times \text{ways to color 5 shirts with 3 colors} \\[3mm] &={6 \choose 3} \times {{5-1} \choose {3-1}} \\[3mm] &= 20 \times 6 \\[3mm] &= 120 \end{align} Using this logic, we can see that the general formula for the number of ways to choose $k$ shirts from a pool of $n$ colors is the sum of the numbers for each case: $$\sum_{j=1}^{k}{{n \choose j}{{k-1} \choose {j-1}}} \tag{1}$$ where $j$ is the number of different colors, ${n \choose j}$ is the number of ways to choose those colors, and ${{k-1} \choose {j-1}}$ is the number of ways to color the $k$ shirts using the $j$ colors, which we got via stars and bars above. Try this formula for case where $n=6$ and $k=3$, and you will see we recover the answer 56 from our first example.

Note that the sum in (1) only goes up to $k$, since we can't select more colors than the number of shirts (here, I'm assuming $n>k$ as in our example). However, if $j>k$, then the number of ways to choose $j$ items from a set of $k$ is 0, so ${{k-1} \choose {j-1}}=0$. This means we can extend the limit of summation up to $n$ without changing the value of the sum. We can also extend the lower limit of summation from 1 to 0, since the $j-1$ factor will zero out the second term when $j=0$ (i.e. there are zero ways to choose a negative number of items from a pool of $k-1$). Note that the same logic holds if $n \leq k$, since then the $n \choose j$ factor becomes zero when $j>n$. Thus, regardless of whether $n>k$ or $n \leq k$, we have: $$\sum_{j=1}^{k}{{n \choose j}{{k-1} \choose {j-1}}} = \sum_{j=0}^{n}{{n \choose j}{{k-1} \choose {j-1}}}$$ but furthermore: \begin{align} \sum_{j=0}^{n}{{n \choose j}{{k-1} \choose {j-1}}} &= \sum_{j=0}^{n}{{n \choose j}{{k-1} \choose {(k-1)-(j-1)}}} \\[3mm] &= \sum_{j=0}^{n}{{n \choose j}{{k-1} \choose {k-j}}} \\[3mm] &= {{n+k-1} \choose {k}} \end{align} where the first equality is due to the fact that for any positive integers $m,r$ with $r<m$, we have ${m \choose r} = {m \choose {m-r}}$. This just means that selecting $r$ items from a pool of $m$ is equivalent to choosing $m-r$ items not to include in your selection. The third equality is called the Vandermonde Convolution, and if you read the pool handicapping post, you might have guessed that it has both an algebraic and a combinatorial proof. I'll leave them to you to figure out (see the end of this post for a hint on each).

The fact that the sum from before boils down to a single binomial coefficient suggests that there is a simpler solution to our counting problem. Indeed there is, and it's actually another stars and bars method.

Going back to the 6 colors, 3 shirts example, let's draw 3 shirts and place $6-1=5$ bars anywhere to the left or right of any shirt. We are allowed (and in this example, forced) to place two bars next to each other with no shirt in between. Suppose we then color the shirts in a fixed order (e.g. blue, red, green, yellow, fuschia, orange): every shirt to the left of the first bar (if any shirts) will be blue, shirts between the first and second bar would be red, between the second and third green, and so forth. Shirts to the right of the fifth bar would be orange. So the case of 2 blues and 1 red would look like this:
Note that there are $6+3-1$ objects ($6-1$ of them are bars, and 3 are shirts), and choosing a set of 3 shirts from the 6 color options is equivalent to simply choosing where in the line-up of 8 objects the 3 shirts should go. Thus, the number of ways to do this is ${{6+3-1} \choose {3}} = {8 \choose 3} = 56$, the same answer as before.

To recap, you now know two different stars and bars methods which you can apply in combinatorial counting problems. I hope this was informative and thoroughly addressed the request. If any questions remain, feel free to ask in the comments section.

Lastly, here are the hints for the two proofs of the Vandermonde Convolution:

Algebraic proof: Note that ${{n+k-1} \choose {k}}$ is the coefficient of $x^k$ in the expansion of $(1+x)^{n+k-1}$. What is the coefficient of $x^k$ in the expansion of $(1+x)^{n} (1+x)^{k-1}$?

Combinatorial Proof: Suppose we have a bag of $n+k-1$ (numbered) marbles, of which $n$ are blue and $k-1$ are red. How many different sets of $k$ marbles can we create from the bag? How many ways can we choose $k$ marbles from the bag, where $j$ are blue and $k-j$ are red?