Project Euler #4 – Big Stuff Going On

This post discusses problem #4 from Project Euler. We’ll work out a solution in Python, discussing how to improve the speed using some math.

Problem statement

A number is called a palindrome if it is the same when written forwards or backwards. For example, 32823 and 11766711 are palindromes, but 17375 is not a palindrome. What is the largest palindrome that can be obtained by multiplying two 3-digit numbers together?

We’ll solve this by writing a program in Python. We’ll start by writing a basic solution. After that, we can improve it by taking advantage of some mathematical properties of multiplication and palindromes.

The first attempt

We’re looking for the biggest palindrome that appears as $x\cdot y$ where $x$ and $y$ are 3-digit numbers. So the most straightforward thing to do is to make a list of all such palindromes, and then pick out the biggest palindrome on the list. We can do this as follows:

Loop over pairs of 3-digit numbers, $x$ and $y$ .
Multiply each together to get a product $p = x\cdot y$ .
Test $p$ to see if it is a palindrome.
If $p$ is a palindrome, add it to a list of palindromes.
Return the maximum of the list of palindromes.

Here’s the Python code for one implementation of this idea:

def f1():
    palindromes = []
    for x in range(100,1000):
        for y in range(100,1000):
            p = x*y
            if str(p) == str(p)[::-1]:
                palindromes.append(p)
    biggest = max(palindromes)
    print biggest

We’ve tested whether $p$ is a palindrome by converting it to a string (using str()) and comparing it to the reversed string (using the [::-1] slice). We’ve also written this as a function f1() so that we can run it later, time it, and compare it to our later solutions.

To test how long this function takes, we can use the time module. The code below calls f1() a certain number of times, then prints out the total time taken and the average time for each call of f1().

#import time
N = 10 # how many times to try
t0 = time.time()
for k in xrange(N):
    f1()
t1 = time.time()
total = t1 - t0
print total
print total/N, 'seconds per call'

For me, f1 takes 1.0946 seconds per call. The code above only performs f1 ten times, for a total of around 10 seconds.

In the next sections, we significantly improve the speed by taking advantage of some properties of multiplication and palindromes.

Speeding things up

When trying to speed up the code, we can look at it and ask ourselves whether we are being redundant or ignoring something special about the problem.

The nested for-loops in f1 are a good place to start looking. For ideas of what to take advantage of, we can make some guesses by thinking about keywords or special phrases that distinguish this problem:

multiplication
palindromes
biggest
3-digit numbers

So we’ll want to see if we can make use of these special features of the problem.

Let’s start by looking at the for-loops and thinking about multiplication.

Commutativity

We’re using the for-loops to loop over pairs $(x,y)$ so we can make our products $p=x\cdot y$ . Currently, we’re looking at $900\cdot 900= 810000$ pairs $(x,y)$ . One of the first observations we can make is:

We don’t have to look at all pairs $(x,y)$ to get all products $p = x\cdot y$ .

One reason this is true is because $x\cdot y$ is the same as $y\cdot x$ . Right now, we are getting each value of $p$ at least twice because multiplication is commutative. We can immediately reduce the number of pairs we look at by half. Below, we update our code to only look at pairs $(x,y)$ where $x \leq y$ . We can change our code to do this by changing the bounds for the loop involving $y$ .

def f2():
    palindromes = []
    for x in range(100,1000):
        for y in range(x,1000):
            p = x*y
            if str(p) == str(p)[::-1]:
                palindromes.append(p)
    biggest = max(palindromes)
    print biggest

For me, this updated function f2 takes 0.5511 seconds per call. This reduced the time by about half.

We can reduce this further by taking advantage of another property of multiplication.

Order

Imagine going through our current program. We move along, increasing $x$ and $y$ one at a time, occasionally finding a palindrome and adding it to our list. If we printed out these palindromes along the way, we would see that sometimes they increase and sometimes they decrease. However, in the end, we are only interested in the biggest palindrome we find.

This suggests that many of our iterations are wasted. Whenever we consider a pair $(x,y)$ where $p=x\cdot y$ is smaller than a palindrome we have already found, we are wasting time. So, we might be able to save some time by avoiding these cases as much as possible.

For example, the test str(p) == str(p)[::-1]str(p) == str(p)[::-1]str(p) == str(p)[::-1] takes some time, but is unnecessary if $p$ is small. Below, we’ve updated the code to keep track of the biggest palindrome so far, rather than keep a list of palindromes. This lets us reduce to only checking $p$ when it might matter, which turns out to be a few thousand cases.

def f3():
    biggest = 0
    for x in range(100,1000):
        for y in range(x,1000):
            p = x*y
            if p > biggest:
                if str(p) == str(p)[::-1]:
                    biggest = p
    print biggest

For me, f3 takes 0.1255 seconds per call, which is only about 23% of the previous time.

We can make more use of order by remembering that multiplication is order-preserving:

If $x_1 > x_2$ , then $x_1 \cdot y > x_2 \cdot y$ , for positive $x_1, x_2, y$ .
If $y_1 > y_2$ , then $x \cdot y_1 > x \cdot y_2$ , for positive $x, y_1, y_2$ .

Right now, our code searches through $x$ and $y$ starting with low numbers around 100, and increases up to larger numbers near 999. But, our biggest prime is likely going to be found by multiplying numbers near 999.

This is important to notice for two reasons:

We are probably wasting a lot of time looking at the smaller values of $x$ and $y$ .
$p=x\cdot y$ can only decrease when we decrease either of $x$ or $y$ .

The second point above might require careful reading, but it lets us skip a lot more pairs. It tells us that if $p=x\cdot y$ is too small, then we can skip any smaller $y' < y$ in our inner for-loop. Also, since the biggest $y$ we ever use is 999, if $p = x\cdot 999$ is too small, we can skip any smaller $x$ in our outer for-loop.

We will update our code in two ways:

We will start our loops at 999 and decrease toward 100.
We will skip smaller $x$ and $y$ when possible using breaks, as suggested in the last paragraph.

def f4():
    biggest = 0
    for x in range(999,99,-1):
        if x*999 < biggest:
            break
        for y in range(999,x-1,-1):
            p = x*y
            if p <= biggest:
                break
            if str(p) == str(p)[::-1]:
                biggest = p
    print biggest

For me, f4 takes 0.0048 seconds per call, which is about 4% of the previous time.

So far we’ve made use of some properties of multiplication and the ordering. We haven’t yet used properties of palindromes. We’ll make one final improvement in the next section.

Divisibility

The palindromes we are interested in are products of 3-digit numbers. Our code is still looking at a few thousand pairs $(x,y)$ , but we can get this down to a few hundred by using a property of these palindromes. Namely, palindromes with an even number of digits are divisible by 11.

If we multiply two large 3-digit numbers together, we get a 6-digit number. If this 6-digit number is a palindrome, it only has three distinct digits $d_1,d_2,d_3$ (because palindromes are the same backwards and forwards) and is equal to the following.

$10^5 d_1 + 10^4 d_2 + 10^3 d_3 + 10^2 d_3 + 10^1 d_2 + d_1$

This can be rewritten as

$(10^5 + 10^0)\cdot d_1 + (10^4 + 10^1)\cdot d_2 + (10^3 + 10^2)\cdot d_3$

which is divisible by 11 because each coefficient is.

This means that we only need to consider $p$ which are multiples of 11. Moreover, since $p=x\cdot y$ and 11 is prime, this means we only need to consider pairs $(x,y)$ where at least one of $x$ or $y$ is divisible by 11. This is another significant reduction in the number of pairs we need to look at.

We should be careful about this idea, however. For smaller values of $x$ and $y$ , we can get a 5-digit product, which we haven’t discussed. But, since we expect the biggest palindrome to come from the larger pairs, this shouldn’t be an issue. If we obtain a biggest palindrome which is 6 digits long, we will not have missed anything.

The final code for our discussion is below. We have broken the construction of the inner for-loop into two cases:

If $x$ is divisible by 11, we loop over $y \geq x$ like before.
If $x$ is not divisible by 11, we loop over $y \geq x$ which are multiples of 11, working from 990 down.

We’ve also stored str(p) in a variable to save a second type-cast. Testing shows this speeds things up some.

def f5():
    biggest = 0
    for x in xrange(999,99,-1):
        if x*999 <= biggest:
            break
        if x%11==0:
            for y in xrange(999,x-1,-1):
                p = x*y
                if p <= biggest:
                    break
                s = str(p)
                if s == s[::-1]:
                    biggest = p
                    break
        else:
            for y in xrange(990,x-1,-11):
                p = x*y
                if p <= biggest:
                    break
                s = str(p)
                if s == s[::-1]:
                    biggest = p
                    break
    print biggest

For me, f5 takes 0.000648 seconds per call, which is about 14% of the previous time. This works out to around 0.06% of the time taken by our first function f1. In other words, f5 ended up being about 1690 times faster than f1.