Collatz Conjecture
Is this sequence familiar to you?
1, 2, 8, 3, 6, 9, 17, 4, 20, 7, 15, 10, 10, …
It’s called A008908 in the Online Encyclopedia of Integer Sequences. You probably encountered in your first programing class. The teacher probably called the assignment something like the 3n+1 problem, or the hailstorm sequence problem.
Depending on the language you were learning, you probably ended up writing something which looked like this (Lisp):
(defun collatz (n) (if (= 1 n) 1 (if (evenp n) (1+ (collatz (/ n 2))) (1+ (collatz (1+ (* 3 n)))) ) ) )
or this (Python):
def collatz(n): steps = [] while n > 1: steps.append(n) if (n&1): n = 3*n+1 else: n = n/2 return steps
or even this (MATLAB):
function y = collatz(n) y = n; while n > 1 if rem(n,2) == 0 n = n/2; else n = 3*n+1; end y = [y n]; end end
This is a favorite programming assignment for a number of reasons. One is that it’s very easy to implement in most languages. Another is that it does something surprising. If you plot the first 200 values of the sequence, you get something which looks like this:
That’s a more complex behavior than you might have guessed from the definition of the function.
One of the other reasons this is a favorite is that there’s a deep bit of mathematics hiding behind it. In 1937, Lothar Collatz explored it and proposed that the length was finite for all values of n. This became known as the Collatz Conjecture. Since then, a number of mathematicians have attempted to prove this conjecture, but it has remained unproven. It’s always kind of neat for a student to stumble into something like this in a seemingly simple homework assignment.
A lot of programmers become fascinated with this sequence and end up spending hours searching for patterns in it. I’ve wasted some time at it myself. XKCD joked about this phenomena a while back:
Well, this week Gerhard Opfer published a paper which claims to have finally proven the conjecture. If his proof holds up, it’ll be kind of cool to see the problem solved after all these years. It’ll also be kind of sad, won’t it? It’s kind of nice to have beginning programmers stumbling into the unknown like that. It kind of gives them the sense that there can be surprises hiding around any corner.
Brian Hayes has a nice account on his blog. I hadn’t seen the Kakutani quote before.
Jason Davies has a nice visualization of this.
Via Flowing Data.
It doesn’t work in matlab for some reason. :/
That function returns the series generated from a seed. The series A008908 is the number of values in the series starting with different numbers. So to generate the first 10,000 values of A008908, you would use the collatz function like so:
npts=1e4;
c=zeros(1,npts);
for i=1:npts
c(i) = numel(collatz(i));
end
plot(c)