[ Can you explain this recursive "n choose k" code to me? ]
Here is the code to a subset problem with arguments n and k. n represents the total number of students and k represents the amount of the students I want to get out of n. The code attempts to give the number of possible combinations of pulling k number of students out of n number of students.
def subset(n, k):
if k == 0:
return 1
if n == k:
return 1
else:
return subset(n-1, k-1) + subset(n-1, k)
I understand the first part of the recursive call, but I'm having trouble understanding the + subset(n-1, k) part. Can anyone explain this to me?
Answer 1
The recursion is based on a simple observation, for which I will give a combinatorial argument, as to why it is true, rather than a mathematical proof through formulae.
Whenever you choose k
elements out of n
, there are two cases:
- You choose element
#n
- You don't choose element
#n
Since these events are mutually exclusive, the total amount of combinations is given by the amount of combinations when choosing #n
, and those when you don't choose #n
.
Choosing element #n
Since we have already chosen one element, we need only choose another k-1
elements. Also, since we have decided upon one element – as to whether it is included or not – already, we only need to consider the remaining n-1
elements.
Thus, the amount of combinations for choosing element #n
is given by
subset(n - 1, k - 1)
Not choosing element #n
There are still k
elements to choose, but since we have already made up our mind about element #n
, there remain only n - 1
elements to choose from. Thus:
subset(n - 1, k)
The base case
The recursion uses the fact, that we can usually differentiate between two situations, solutions where element n
is part of that solution, and those where it is not.
However, such a distinction can not always be made:
- When choosing all elements
- or when choosing no elements at all
In these cases, there is only exactly one solution, hence
if k == 0:
return 1
if n == k:
return 1
Ensuring it works
To do that, we need to convince ourselves (or prove) that the base case is always hit at some point.
Let us assume, that n < k
at some point. Since per our assumption, n
was originally greater or equal to k
, there must have been some point where n = k
, because n
and k
decrease in unison or only n
decreases by one, i.e. it follows
This implies, that there must have been a call to subset(n - 1, k)
for it to happen, that n
decreases below k
. However, this is not possible since we have a base case on n = k
where we return a constant 1
.
We conclude that either n
decreases at some point such that n = k
, or decrease in unison exactly k
times such that k = 0
.
Thus, the base case works.
Answer 2
This is more a mathematical problem and not a programming question. What you are doing is calculating the Binomial coefficient, the formula is
(n, k) = (n-1, k-1) + (n-1, k) with all (n, 0) and (n, n) having a value of 1.
Please see here for a full explanation. A recursive solution can be seen here. In case the mentioned link goes invalid simply search for it using Google. I doubt you will get a better explanation on SO than after reading this article on Wikipedia.