Friday 3 July 2009

Scala, lazy evaluation and the Sieve of Eratosthenes

This blog posting is primarily concerned with looking at lazy evaluation in functional programing, what it means and how it can be put to use.

Lazy evaluation can be described as an expression that has a value, but that is not evaluated until it's actually needed. Conversely strict refers to the opposite property where all expressions are evaluated up front when declared.

Some functional languages are described as Pure Lazy (no, this is not a derogatory term!), to reflect the fact that all evaluation is performed on demand. Haskell is on such language. Scala has both aspects of strict evaluation and lazy evaluation, as such it couldn't be referred to as 'pure' in either sense.

By means of the simplest example I can think of to illustrate Lazy evaluation, consider the following interaction with the Scala CLI:


scala> lazy val a = b + 1; lazy val b = 1;
a: Int = <lazy>
b: Int = <lazy>

scala> a
res9: Int = 2

scala> b
res10: Int = 1


I think this illustrates lazy evaluation, without defining the 'vals' as lazy the statement lazy val a = b + 1; lazy val b = 1; as val a = b + 1; val b = 1; couldn't be evaluated because b would be undefined when evaluating a!

The examples are based on the Sieve of Eratosthenes. Eratosthenes was an ancient Greek Mathmetician from 276 BC – c. 195 BC. The Sieve of Eratosthenes is a ancient algorithm for finding prime numbers from 2 .. n. Stated simply, this algorithm can be expressed as:


  • Create a list of all the numbers from 2 .. n.
  • Starting at p = 2
  • Strike out all multiples of p up until n from the list
  • Let p now be the first (lowest) number that has not been struck-off the list
  • Repeat the last two steps until p^2 > n
  • All remaining numbers not struck from the list are prime

In Haskell this can be expressed very elegantly and succinctly, as follows:

primes = sieve [2..]
sieve (p : xs) = p : sieve [x | x <− xs, x `mod` p > 0]


Note that [2..] creates an infinite steam of all the integers from 2 to infinity. You might think this is impossible and would create an out of memory error on the machine it ran on, but this is perfectly valid in a pure lazy language because none of the numbers are actually generated until they are requested!

In Scala there's not really a way to represent this quite so succinctly, but it is possibly to create a Lazy stream of numbers using:

var is = Stream from 2

which is essentially equivalent to Haskell's:

[2..]

Syntax.


Another important thing to note and be aware of is that Scala's List class is not a lazy list. Therefore, trying to make the equivalent of the Haskell algorithm in Scala using a List is not going to work.

Scala's Stream class however is Lazy, and so it can form the basic of the equivalent algorithm in Scala. To be specific, Scala's Stream class is strict about head and lazy about tail - which is ok since head contains a finite set of 0|1 elements.


The following Scala code shows two ways to implement the Sieve using Scala Lazy streams.


Example 1:


package primes;

/* Haskell...

primes = sieve [2..]
sieve (p : xs) = p : sieve [x | x <− xs, x `mod` p > 0]
*/
object Primes1 {

def primes : Stream[Int] = {

var is = Stream from 2

def sieve(numbers: Stream[Int]): Stream[Int] = {
Stream.cons(
numbers.head,
sieve(for (x <- numbers.tail if x % numbers.head > 0) yield x))
}

sieve(is)
}

def main(args : Array[String]) = {

primes take 100 foreach println
}
}

object Primes2 {
def primes = {
def sieve(is: Stream[Int]): Stream[Int] = {
val h = is.head
Stream.cons(h, sieve(is filter (_ % h > 0)))
}

sieve(Stream.from(2))
}

def main(args: Array[String]) {
println(primes take 100 toList)
}
}



Note the use of the 'take' method to extract the first 100 primes from the stream (so the program terminates!).


This second example shows how we can define a Lazy list like trait and then use it to implement the same, equivalent behavior:


Example 2:


package primes


object PrimesLazyTrait {

sealed trait Lazy[+A] {
def head: A = this match {
case Nil => error("head called on empty")
case Cons(h, _) => h()
}

def tail: Lazy[A] = this match {
case Nil => error("tail on empty")
case Cons(_, t) => t()
}

def filter(p: A => Boolean): Lazy[A] = this match {
case Nil => Nil
case Cons(h, t) => if(p(h())) Cons(h, () => t() filter p) else t() filter p
}

def foreach(f: A => Unit) {
this match {
case Nil =>
case Cons(h, t) => {
f(h())
t() foreach f
}
}
}

def toList: List[A] = this match {
case Nil => scala.Nil
case Cons(h, t) => h() :: t().toList
}
}

final case class Cons[+A](h: () => A, t: () => Lazy[A]) extends Lazy[A]
case object Nil extends Lazy[Nothing]

def from(n: Int): Lazy[Int] = Cons(() => n, () => from(n + 1))

def primes = {
def sieve(is: => Lazy[Int]): Lazy[Int] = {
lazy val h = is.head
Cons(() => h, () => sieve(is filter (_ % h > 0)))
}

sieve(from(2))
}

def main(args: Array[String]) {
primes foreach println
}
}




In this version we see the Lazy trait provides the necessary methods head, tail, filter, foreach and toList.


These little examples help demonstrate some of the power, elegance and conciseness of lazy functional programming - as well as the beauty of this Ancient Greek algorithm for finding primes.


The basic algorithms complexity is:

  • O((nlogn)(loglogn))

And has a memory requirement of:

  • O(n).

Without any optimization.

--

Additional example - e (Euler's number):

Calculating e, a list of converging numbers in the sequence of e

e can be defined as lim n->inf ( 1 + 1/n)^n and therefore expressed directly as a function e of n as follows:

def e(n : Int) = { Math.pow((1.0 + (1.0/n)), n) }

So, how can we turn this into a stream? If we create a stream of numbers from 1 that are mapped using the e function, which can be defined as an anon lambda argument to map:

val es = Stream.from(1).map((n => { Math.pow((1.0 + (1.0/n)), n) }))

Example usage of the e stream:

es take 5 toList


.

5 comments:

TimA said...

I've updated your example to use the new #:: operator available in the lastest eclipse plugin nightly build

http://quoiquilensoit.blogspot.com/2009/10/scala-lazy-evaluation-and-sieve-of.html

jwhiteheadcc said...
This comment has been removed by the author.
jwhiteheadcc said...

This has to be one of the least CPU-efficient for finding primes with hundreds of digits. ;)

However, that's not the point. This is a just plain cool algorithm for finding primes. In theory, new computing hardware could make the the sieve popular, again. It won't probably happen with binary(digital) computers, though.

Everyone is using the remainder theorim or such right now. You should post a link to some of the prime number sites. Mersenne forums for example.

jwhiteheadcc said...

Bah, spammer. Note that qubits are not the only possibility. In theory, optical methods come to mind. I wonder how one would use something like interference patterns to implement algorithms... supposedly this is how some antiradar equipment work?

Davicero said...

I followed your steps in C++11. Very inspiring article of yours. Thank you.

A lazy stream implementation in C++11