Project Euler #1-5

To keep my algorithmic programming skills in shape, I decided to pick up the Project Euler Challenge. This daily training in computational thinking and debugging will hopefully serve well as ‘mental gymming’. In this post, I share the first 5 problems in this challenge.

Problem 1

Problem statement: Find the sum of all the multiples of \(3\) or \(5\) below \(N\).

Input: \(T+1\) lines of integers. First line contains \(T\) that denotes the number of test cases. This is followed by \(T\) lines, each containing an integer, \(N\).

Constraints:

\[1 \leq T \leq 10^5\] \[1 \leq N \leq 10^9\]

This means our solution has to take into account up to 100,000 unique values of up to a billion!

First thoughts

First up, let’s scope out the frame of the program. We parse the input (assuming only positive real numbers given in constraints, and that the number of lines match \(T+1\)), reading each test case and immediately returning the result.

def get_sum(n):
    # TODO

t = int(input().strip())

for a0 in range(t):
    n = int(input().strip())
    print(get_sum(n))

Naive for-loop count

A first glance, the problem is reminiscent of the FizzBuzz problem commonly used in software engineering interviews. We note that this problem, however, requires us to sum multiples of \(3\) and \(5\), and a check for divisibility with either would count multiples of \(15\) once only. Hence, we’re spared from having to deal with the latter case.

def get_sum(n, i=1, val=0):
    while i < n:
        if i%3==0 or i%5==0:
            val+=i
        i+=1
    return val

One quick note is the solution is \(\mathcal{O}(N.T)\) in runtime, where we count up to \(N\) for each \(T\) integers. This solution can be very slow if we need to calculate large numbers many times (say, an input of \(\{9999, 10000, 10001\}\)). Is there a way to save previously calculated results?

For-loop counting with cache

In this implementation, we store each calculated input-value pair into a dictionary. For each additional input, we add the new input into a cached list of previously calculated inputs, checking if the new value is larger than any previously calculated inputs (in decreasing order for efficiency). The trade-off we pay is in storage of input cache as a list and input-value cache as a dictionary takes up \(\mathcal{O}(T)\) in memory size, as well as additional time required for to sort the list of previous inputs (estimating the python implementation to be \(\mathcal{O}(Tlog(T))\)). However, we greatly save the computation required if \(N >> T\).

def get_sum(n, i=1, val=0):
    while i < n:
        if i%3==0:
            val+=i
        elif i%5==0:
            val+=i
        i+=1
    return val

t = int(input().strip())

inputs = []
ans_dict = {}

for a0 in range(t):
    n = int(input().strip())
    inputs.insert(0,n)
    inputs.sort(reverse=True)
    i = 0
    while i < len(inputs):
        saved_val = inputs[i]
        if n > saved_val:
            ans = get_sum(n, saved_val, ans_dict.get(saved_val))
            ans_dict[n] = ans
            print(ans)
            break
        elif i == len(inputs)-1:
            ans = get_sum(n)
            ans_dict[n] = ans
            print(ans)
        i+=1

Unfortunately, submitting the solution to Hackerrank yields only 60%, failing two of the test cases due to runtime timeout! Is there a more efficient solution around?

Arithmetic progression

Looking at the requirements of the problem carefully, we are required to return the sum of values subject certain conditions. Can we express the problem as a series of arithmetic progressions (in the form of multiples of values)? Concretely, we can simply calculate the value of a series using the formula below:

\(S_{n}=\frac{n}{2}(2a+(n-1).d)\), where \(n\) is the value we wish to sum to,\(a\) is the first number in the series and \(d\) is the common difference between numbers in the series. Note that the small \(n\) in this AP-summation formula differs from our problem’s definition of \(N\) by at least \(1\), as we are required to sum values up to \(N\) but not inclusive.

The solution below comprises an addition of two series in multiples of \(3\) and \(5\), deducting a series in multiples of \(15\) to prevent double-counting.

def get_sum(n):
    n = n-1
    sum3 = 0 if n < 3 else int((n - (n%3))/3)
    sum5 = 0 if n < 5 else int((n- (n%5))/5)
    sum15 = 0 if n < 15 else int((n- (n%15))/15)
    return (sum5*(10+(sum5-1)*5) + sum3*(6+(sum3-1)*3) - sum15*(30+(sum15-1)*15))>>1

t = int(input().strip())

for a0 in range(t):
    n = int(input().strip())
    print(get_sum(n))

Special note: We do bit-wise manipulation to divide the series by two instead of the normal division. This is because float division leads to large rounding errors during type-casting back to integers when the values are very large, which is well within the constraints of the problem.

It’s interesting to see that the most computationally efficient solution (\(\mathcal{O}(1)\) in runtime) is so concise. Also, solving this problem required thought not just in computing, but mathematical efficiency… looking forward to the other problems in the series!

Problem 2

In this problem, we look at the definitive recursion problem in computer science: dealing with the Fibonacci sequence by summing over even numbers in the sequence for the second example:

High level idea: Find the sum of all the even numbers of the Fibonacci sequence below \(N\).

Input: \(T+1\) lines of integers. First line contains \(T\) that denotes the number of test cases. This is followed by \(T\) lines, each containing an integer, \(N\). (Note: This is similar to the first problem). I have a feeling this is going to be the Hackerrank default template).

Constraints:

\[1 \leq T \leq 10^5\] \[10 \leq N \leq 4 \times 10^6\]

Note that the value of N begins from 10.

First thoughts

Let’s define the vanilla Fibonacci sequence:

\[F_{n} = F_{n-1} + F_{n-2}\]

This means that for each number of the sequence, it is obtained by adding the previous two numbers before it. Here is the first \(10\) numbers in the sequence:

\[(0),1,2,3,5,8,13,21,34,55,89,\ldots\]

For example, to obtain the fourth number in the sequence \(F_{4}\), we sum \(2\) and \(3\) to obtain \(5\).

Here is the code implementation of the Fibonnaci sequence in Python:

# naive fibonnaci
def fib(n):
    if n == 1:
        return 1
    elif n == 0:
        return 0
    else:
        return fib(n-1) + fib(n-2)

Recursive Even Fibonacci

Next, we analyse the definition of the Fibonacci sequence. As we are required to obtain the sum of only even-numbered Fibonacci numbers, can we obtain a formula to just calculate said numbers in the sequence?

First, we observe the following lemmas regarding additions of numbers:

Odd + Even = Odd

Odd + Odd = Even

Considering the definition of the Fibonacci sequence being the sum of the previous two numbers, if and only if the previous two numbers are odd will the next number be even. Based on this findings, our goal is to obtain a formulation of the original Fibonacci number in terms of every third number of the sequence. After working it out for some time, we derived a new formulation for the \(n^{th}\) number in the even-numbered Fibonacci sequence as follows:

\[F_{n} = 4 \times F_{n-1} + F_{n-2}\]

Expressed in code below:

# recursive even fibonnaci
def fib(n):
    if n == 1:
        return 2
    elif n == 0:
        return 0
    else:
        return 4*fib(n-1) + fib(n-2)

Now the solution is starting to take form. Can we further optimize it? What happens when \(N\) is very large? In this case, our recursive depth stack will hit its limit.

Iterative Even Fibonacci

To prevent the recursive depth stack error, we create an iterative version of the even fibonacci method. To calculate the preceeding two numbers in the sequence, we save them as variables outside the while loop, which checks if the subsequent calculated value in the sequence is still \(<N\) before summing to give the answer.

# iterative even fibonacci
def fib(n, store):
    n_minus_2 = 0
    n_minus_1 = 2
    ans = 2
    while n_minus_1 < n:
        old_n_minus_2 = n_minus_1
        n_minus_1 = 4*n_minus_1 + n_minus_2
        n_minus_2 = old_n_minus_2
        if n_minus_1 < n:
            ans += n_minus_1
    print(ans)

This concludes the second Euler challenge solution! Futher optimizations can include caching previous calculated Fibonacci numbers, but we leave that for future work.

Problem 3

High level idea: Find the largest prime factor of a given \(N\).

Input: \(T+1\) lines of integers. As always, the first line contains \(T\) that denotes the number of test cases. This is followed by \(T\) lines, each containing an integer, \(N\).

Constraints:

\[1 \leq T \leq 10^5\] \[10 \leq N \leq 10^12\]

Note that the value of N begins from 10.

##3 First thoughts One way to tackle the problem is to factorise a given number into its primes and return its largest factor. Prime factorisation is one of the first few lessons taught in secondary school math… it’s interesting to get to solve it programmatically now!

A prime number is a natural number greater than \(1\) that is only divisible by itself and \(1\).

Based on that definition,we can iteratively build a list of prime numbers to check if the next number is divisible by any number in our list of primes. We can omit checking divisibility with non-primes (i.e. composite numbers), as divisibility with composite numbers can then be further decomposed into primes, defying the definition of prime numbers.

Iterative generation

In our solution, we first instantiate a list of primes under the smallest expected input test case, \(10\). This gives us a list \(\{2,3,5,7\}\). In the main method prime_factor, if we are given \(N\) greater than the largest value in our generated prime number list primes, we run get_primes to generate the list of primes \(\leq N\). Given the full list of primes, we check if the input number is a prime and returning it if so (trivially). Otherwise, we check for divisibility with primes in our list of primes, saving the biggest prime and returning it when the list is exhausted.

def is_prime(n, primes):
    # if n <= primes[-1]:
    #     return True if n in primes else False
    # else:
    sqrt_n = int(math.sqrt(n))
    for prime in primes:
        # prime >= sqrt_n or 
        if n % prime == 0:
            return False
        if prime >= sqrt_n:
            return True
    return True
                
                    
# instantiating with primes under 10
primes = [2, 3, 5, 7]

# build list of primes, biggest prime in list > n
def get_primes(n, primes):
    last = primes[-1]
    while n > primes[-1]:
        last += 2
        if is_prime(last, primes):
            primes.append(last)

def prime_factor(n, primes):
    if n > primes[-1]:
        get_primes(n, primes)
    if n in primes:
        return n
    else:
        biggest_prime = 0
        for prime in primes:
            if n % prime == 0:
                biggest_prime = prime
        return biggest_prime

Better stopping criterion

Next, we improve the speed of the algorithm by using two tricks: Firstly, we reduce the input number as we divide by a divisible prime number, and check to see if the new number is then smaller than the next prime divisor. For the second trick, note that there exists a unique decomposition of a composite number into its primes. This allows us to continue the search for prime division by checking divisibility with primes larger than the previously saved biggest_prime. Optimization implemented in code below.

def prime_factor(n, primes):
    if n > primes[-1]:
        get_primes(n, primes)
    if n in primes:
        return n
    else:
        biggest_prime = 0
        i = 0
        while i < len(primes):
            prime = primes[i]
            if prime > n:
                break
            if n % prime == 0:
                biggest_prime = prime
                n /= prime
            i += 1            
        return biggest_prime

Stopping at \(\sqrt{n}\)

To speed things up, we can shorten the search space for checking if a number \(n\) is a prime by checking for divisibility \(\leq \sqrt{n}\). Here’s why:

If the number \(n\) is not a prime, it can be expressed into at least two numbers \(a\) and \(b\). \(a \geq \sqrt{n} \implies b < \sqrt{n}\), so finding \(b\) in that case would suffice to show it is not a prime.

This is expressed in the Python code below:

def is_prime(n, primes):
    # if n <= primes[-1]:
    #     return True if n in primes else False
    # else:
    sqrt_n = int(math.sqrt(n))
    for prime in primes:
        if n % prime == 0:
            return False
        if prime >= sqrt_n:
            return True
    return True

Alas, it seems like we are still not fast enough. Is needing to calculate all the prime numbers holding us back?

No prime cache

In this solution, we borrow the continuous division of input \(n\) from our better stopping criterion section, but omit the need to check against prime numbers. First, we note the lemma that there exists a unique prime factorisaion for any natural number. Thus, we can contiuously check for divisibility with any natural number starting from 2, reducing our search space by whittling at the number.

In the code implementation below, we first continuously divide the input number by \(2\) if possible, then do so for \(i = 3, 5, 7, \ldots\). We check until \(\sqrt{n}\), returning the result if the final quotient is greater than two (indicating the input value is a prime number), or the saved largest prime.

This solution omits the need to generate prime numbers when we are not required to do so.

def prime_factor(n):
    while n % 2 == 0:
        n = n>>1
    if n == 1:
        return 2
    stop = int(math.sqrt(n))
    prime = 0
    i = 3
    while i <= stop:
        while n % i == 0:
            n = int(n/i)
            prime = i
        i += 2
    return n if n > 2 else prime

To be honest, the final solution was obtained from looking up discussion boards, as my initial assumptions that the prime number list is essential was crippling my ability to generate a faster solution. Looking forward, I should overturn my assumptions only calculate values when needed!

Problem 4

High level idea: Find the largest palindrome \(<N\) that is the sum of two three-digit integers.

Input: \(T+1\) lines of integers. As always, the first line contains \(T\) that denotes the number of test cases. This is followed by \(T\) lines, each containing an integer, \(N\).

Constraints:

\[1 \leq T \leq 100\] \[101101 \leq N \leq 10^6\]

Note that the value of N begins from 101101.

First thoughts

Let’s understand what a palindrome is. Basically, a palindromic number is visually symmetrical, i.e. a number that remains the same when its digits are reversed.

Based on that definition, let’s build a simple function that helps us check if a number is a palindrome:

def is_palindrome(n):
    number = str(n)
    i = 0
    while i <= len(number)/2-1:
        if number[i] != number[-1-i]:
            return False
        i += 1
    return True

We worked with prime factorisation in the third problem. Can we use the functions we wrote to help solve this problem?

First, let’s start with the main loop, which iterates over the range of expected \(N\) values in decreasing order to check if it is a palindrome. We can then check each palindrome to see if it is a product of two three-digit numbers, returning the palindrome if so.

def main(n):
    i = n-1
    while i >= 101101:
        if is_palindrome(i):
            if factor(i):
                return i
        i -= 1

How does prime factorisation come into play? Here is the key idea: We break down each palindrome we find into its primes, we can then check all permutations of multiples of its primes that are under \(1000\) (to satisfy the three-digit requirement). At the end of it, if we can find up to two such values, then our mission is complete. Here is the attempt in code form:

# prime factorisation
def factor(n):
    factors = []
    while n % 2 == 0:
        n = n>>1
        factors.append(2)
    stop = int(math.sqrt(n))
    i = 3
    while i <= stop:
        while n % i == 0:
            n = int(n/i)
            factors.append(i)
        i += 2
    if n > 2:
        return False # if the palindrome is a prime, it cannot be a multiple of three-digit numbers
        
    for factor in factors:
        if factor > 1000:
            return False

    # checking permutations of factors to form two three-digit values
    i = 0
    j = len(factors) - 1
    while j > 0:
        val1, val2 = factors[i], factors[j]
        val = val1 * val2
        # print(val1, val2, i, j)
        if val < 1000:
            factors.remove(val1)
            factors.remove(val2)
            factors.append(val)
            if len(factors) == 2:
                return True
            j = len(factors) -1
        j -= 1
    return False

Unfortunately, this solution is corrrect for only certain test cases. One mistake in this solution is the permutations are not exhaustive. This attempt was abandoned as searching all permutations of factor multiplication for each palindrome is too tedious. Can we instead generate palindromes from multiplying numbers, instead of decomposing them into its factors and re-generating its product?

Generating all palindromes from multiples

First, we create a function that generates all palindromes from numbers between \(100\) and \(999\). We save the result as a list, which is computed only once.

def get_palindromes():
    pal = []
    for i in range(100,1000):
        for j in range(100,1000):
            val = i * j
            if is_palindrome(val) and val not in pal:
                pal.append(val)
    return pal

The main loop simply checks for the largest number in the palindrome list.

Final thoughts: Again, I overcomplicated the problem by attempting to innovate on the math front. Turns out ‘brute force’ approach of generating the list of palindromes and checking for existence is enough to solve this problem.

Problem 5

High level idea: Find the smallest multiple that is divisible by all integers \(\leq N\).

Input: \(T+1\) lines of integers. As always, the first line contains \(T\) that denotes the number of test cases. This is followed by \(T\) lines, each containing an integer, \(N\).

Constraints:

\[1 \leq T \leq 10\] \[1 \leq N \leq 40\]

Solution

This problem requires us to find the lowest common multiple of all numbers up to \(N\).

We can iteratively build the list of prime multiples that satisfies the LCM problem. According to the fundamental theorem of arithmetic, for every composite number \(\mathbb{C} \in \{1,\ldots, N \}\), there exists prime numbers smaller than the composite number which are factors of the composite.

For example, if we have pre-computed the list of prime factors up to \(N = 3\), our list will look like \(\{2,3\}\). To produce the LCM of up to \(4\), we need to add \(2\) and not \(4\), as \(4\) is a composite number that can be factorised into two \(2\)s. Hence, our final list will be \(\{2,3,2\}\)

In the code implementation below, we save a list of prime factors, factors, that satisfies the problem for \(N =\) max_n. num_factors is a dictionary that gives us the number of prime multiples to multiply over when obtaining the LCM for a given \(N\). Our algorithm will thus check if each factor in factors is a factor of \(N\), dividing it and storing the new value if so. We then store the final prime remainder into factors, updating num_factors to include the new value of \(N\) that we computed.

factors = [1]
num_factors = {1:1}
max_n = 1

def build_arr(n, max_n):
    if max_n < n:
        i = max_n
        while i < n:
            i += 1
            dividend = i
            for divisor in factors:
                if dividend % divisor == 0:
                    dividend /= divisor
            if dividend > 1:
                factors.append(int(dividend))
            num_factors[i] = len(factors)
            max_n += 1
        return max_n

We can then multiply over the list of primes to regenerate the LCM. For further time optimization at the cost of memory, we can cache results in a dictionary, which can even be stored as a tuple with our previous num_factors dictionary.

def lcm(n):
    i = 0
    ans = 1
    while i < num_factors[n]:
        ans *= factors[i]
        i += 1
    return ans

Bill Pung

Project Euler #1-5

Problem 1

First thoughts

Naive for-loop count

For-loop counting with cache

Arithmetic progression

Problem 2

First thoughts

Recursive Even Fibonacci

Iterative Even Fibonacci

Problem 3

Iterative generation

Better stopping criterion

Stopping at \(\sqrt{n}\)

No prime cache

Problem 4

First thoughts

Generating all palindromes from multiples

Problem 5

Solution

Comments