« How to find out which Active Directory security group you belong to | Main | Find the size of your Mac's Music folder using the command line »
Thursday
Jun162011

Is Scala, or Go, the next Java?

A recurring meme of the Scala community is that Scala is a better Java than Java. Or Scala is the next Java. Some even say it will replace Java.

I must admit, I salivate over things like list comprehensions, closures, and a REPL. And for that, Scala had me at hello.

Toss in a little immutability and Erlang’esq message passing to simplify concurrency oriented programming, and I’m taking Scala home to meet mom.

When I discovered Akka, a Scala library that gives you things like supervision, and other goodies pulled from the Erlang/OTP pot of gold, I was in like a bear claw in a honey pot.

But then it happend. I set up a Lift site on my box, and started hacking. Scala’s documentation is good, but not as superlative as what I’ve seen with other platforms and languages. This led me to start reading a lot of Scala source code. That’s when I found a mole with a hair growing out of it.

Scala embraces creativity. For example, you can write your own domain specific language. You can overload operators like Bill Clinton re-defines the word “is”.  Add these capabilities together and devs start writing code that reads like Mrs. Cleaver’s lines from Airplane!

Sheeeet. Transation: Scala is a beautiful language and an engineering marvel. The virtues of its flexibility far outweigh source code I do not understand.

But my last smokey smokey off the Scala hooka went down like the first time I swallowed my saliva with a chew in. If you’ve done this, you know what I’m talking about.

Point a browser window at Typesafe. Scala. Akka. Simple. That’s the tag line. Typesafe was created by the inventors of Scala and Akka. This firm is backed by some serious pedigree. The firm was created in part to deliver a simple one stop shop for all your IDE, Scala language, and build tool needs.

I liked what I saw, so I checked out the getting started guide. Typesafe’s tutorial walks a user through writing a program that calculates pi concurrently. The equation is:

The Scala tutorial performs the calculation by sending messages to a bunch of worker processes whose job it is to calculate an iteration of k. It then collects the results from the workers and prints it out. It’s kind of map reduce’ish.

Wikipedia says you need to sum 300 iterations to get a precision of 3.14. The tutorial burns through 1000 iterations.

Normally, I’d think this solution is pretty slick. It features Java’s familiar inheritance, a new twist on the singleton, some mixed in traits, some message passing, and a cool Akka doohicky called a load balancer (it load balances tasks across the pool of worker processes).

Here’s the Scala version of the code.

package akka.tutorial.first.scala

import akka.actor.{Actor, PoisonPill}
import Actor._
import akka.routing.{Routing, CyclicIterator}
import Routing._

import System.{currentTimeMillis => now}
import java.util.concurrent.CountDownLatch

object Pi extends App {

 calculate(nrOfWorkers = 4, nrOfElements = 10000, nrOfMessages = 10000)

 // ====================
 // ===== Messages =====
 // ====================
 sealed trait PiMessage
 case object Calculate extends PiMessage
 case class Work(start: Int, nrOfElements: Int) extends PiMessage
 case class Result(value: Double) extends PiMessage

 // ==================
 // ===== Worker =====
 // ==================
 class Worker extends Actor {

   // define the work
   def calculatePiFor(start: Int, nrOfElements: Int): Double = {
     var acc = 0.0
     for (i <- start until (start + nrOfElements))
       acc += 4.0 * (1 - (i % 2) * 2) / (2 * i + 1)
     acc
   }

   def receive = {
     case Work(start, nrOfElements) =>
       self reply Result(calculatePiFor(start, nrOfElements)) // perform the work
   }
 }

 // ==================
 // ===== Master =====
 // ==================
 class Master(nrOfWorkers: Int, nrOfMessages: Int, nrOfElements: Int, latch: CountDownLatch)
   extends Actor {

   var pi: Double = _
   var nrOfResults: Int = _
   var start: Long = _

   // create the workers
   val workers = Vector.fill(nrOfWorkers)(actorOf[Worker].start())

   // wrap them with a load-balancing router
   val router = Routing.loadBalancerActor(CyclicIterator(workers)).start()

   // message handler
   def receive = {
     case Calculate =>
       // schedule work
       for (i <- 0 until nrOfMessages) router ! Work(i * nrOfElements, nrOfElements)

       // send a PoisonPill to all workers telling them to shut down themselves
       router ! Broadcast(PoisonPill)

       // send a PoisonPill to the router, telling him to shut himself down
       router ! PoisonPill

     case Result(value) =>
       // handle result from the worker
       pi += value
       nrOfResults += 1
       if (nrOfResults == nrOfMessages) self.stop()
   }

   override def preStart() {
     start = System.currentTimeMillis
   }

   override def postStop() {
     // tell the world that the calculation is complete
     println(
       "\n\tPi estimate: \t\t%s\n\tCalculation time: \t%s millis"
       .format(pi, (System.currentTimeMillis - start)))
     latch.countDown()
   }
 }

 // ==================
 // ===== Run it =====
 // ==================
 def calculate(nrOfWorkers: Int, nrOfElements: Int, nrOfMessages: Int) {

   // this latch is only plumbing to know when the calculation is completed
   val latch = new CountDownLatch(1)

   // create the master
   val master = actorOf(new Master(nrOfWorkers, nrOfMessages, nrOfElements, latch)).start()

   // start the calculation
   master ! Calculate

   // wait for master to shut down
   latch.await()
 }
}

There is one problem. I’ve been toying around with a language called Go. I can’t put it down. That’s because after Go, programming in anything else seems as difficult as balancing the State of California’s budget.

You can summarize the Go programming language in one word. Simple.

Here’s a version of Go that solves this problem using Go idioms.

Instead of load balancing tasks across a pool of worker processes, this program spawns a goroutine (think lightweight process) to solve each iteration of the equation. Processes run in parallel. After each process does the math, it adds the sends the result to a channel. Finally the function that spawned the goroutines in the first place collects the results from the channel and sums them.

Here’s the code:

package main

import (
   "fmt"
   "math"
)

func main() {
   //spawn 300K goroutines to calculate pi
   fmt.Println(CalculatePi(300000))
}


func CalculatePi(term int) float64 {
   
   ch := make(chan float64)

   //map
   for i := 0; i <= term; i++ {
       go formula(ch, float64(i))
   }
   result := 0.0
   //reduce
   for i := 0; i <= term; i++ {
       result += <- ch
   }
   return result
}


func formula(ch chan float64, term float64) {
   result := 4 * (math.Pow(-1, term)/(2.0 * term + 1.0))
   ch <- result
}

Hmmm. 33 lines. Not bad. Let’s do a deep dive.

If you don’t have Go already, follow the installation instructions at golang.org. Next, dump the above code into your favorite text editor then save it as pi.go.

From the command line, you can compile, link, then execute your code by running this command:

$ 6g pi.go && 6l pi.6 && ./6.out
3.14159598691202

Let’s revisit what we just did.

Each file must be assigned to a package. The main function in the main package is what gets executed first in a Go program.

Main tells the function CalculatePI to compute 300,000 iterations of the Pi equation.

CalculatePI is dirty simple. It begins by defining a channel. In this case, I’m using a channel as a threadsafe queue to add the result of each iteration to.

Next, CalculatePi loops 300,000 times, each time spawning a new goroutine to calculate the next iteration of pi using the formula function.

Finally, CalculatePi pulls each result off of the queue, summing the result in the process.

This code runs on a dual core Mac mini in about 2 seconds. Think about this. The code spawned 300,000 processes and retrieved the results in 2 seconds. That’s totally awesome!

Goroutines and channels are my favorite aspect of the Go programming language. Goroutines provide you with a simple mechanism to spawn off pieces of work to happen at the same time.  And channels enable you to easily synchronize access to a mutable data structure.

Scala is an immensely powerful and beautiful langauge. But Go’s simplicity and power is attractive. And the source code is easy to read and understand. For these reasons, I think Go has a shot at being the next Java.

PrintView Printer Friendly Version

EmailEmail Article to Friend

Reader Comments (15)

This post is not exactly fair. It compares a goroutines-based solution with an actors library-based solution. These are pretty different approaches to concurrency. Of course, that contributes to the resulting code complexity.

June 17, 2011 | Unregistered CommenterMichael

How much memory did it eat? And what about scala performance/memory usage?

June 17, 2011 | Unregistered Commenterhtosnly

I like Go, but I cannot accept that a statically typed language invented in the 21st century does not have any support for generic data structures. There is a load of literature on the subject, as well as many existing implementation of all sorts of ways to do it. I can understand that it is not necessary for the authors of Go, but until Go has an answer to that problem, I cannot consider it.

June 17, 2011 | Unregistered Commentergnuvince

Akka is a big/enterprise solution. Where you have a lot of additional functionality.
If you only want to distribute the work on different threads/actors you could use Futures:

import scala.actors.Futures._
// Create the 300'000 Actors which each compute one step
val f = 0.to(300000).map{i => future(4.0 * (1 - (i % 2) * 2) / (2 * i + 1))}
// Wait for all futures to finish and sum them up
val pi = f.map{e => e()}.sum
=> pi = 3.14159598691202

June 17, 2011 | Unregistered CommenterFabian

Using the parallel collections is probably much more efficient than using futures.
(0 to 300000).par.map(i => 4. * math.pow(-1, i) / (2. * i + 1.)).sum

June 17, 2011 | Unregistered Commenterllemieng

Correction, using this I can max out my 6 core machine
(0 to 300000000).par.view.map(i => 4. * math.pow(-1, i) / (2. * i + 1.)).sum
The view eliminates the intermediary results

June 17, 2011 | Unregistered Commenterllemieng

Dude, your Go program produces the wrong value for Pi. Should be 3.141592653589793...

June 17, 2011 | Unregistered CommenterGordon

@Gordon, if you _read_ the post, you'll understand why it's different.

June 17, 2011 | Unregistered CommenterGustavo Niemeyer

I'm not sure this is the best example of parallelism in Go. The cost for spawning a Goroutine is dwarfing the cost of calculating each term. I found this unaltered code took about 2s to run on my machine, but removing the goroutines entirely and reducing it to a simple loop brought that down to around 0.06s.

I've cooked up a new version that runs in around 0.02s when I export GOMAXPROCS=4. It's basically your code, but instead of spawning 300,000 Gorountines, it splits the work across only 8. Keep in mind that if you're not setting GOMAXPROCS, all your gorountines will run on a single OS thread. It's fun to add a fmt.Println(start, end) at the start of formula() so that you can see them running out of order.


package main

import (
"fmt"
"math"
)

const NCPU = 8

func main() {
fmt.Println(CalculatePi(300001))
}


func CalculatePi(end int) float64 {
ch := make(chan float64)
for i := 0; i < NCPU; i++ {
go formula(ch, i*end/NCPU, (i+1)*end/NCPU)
}
result := 0.0
for i := 0; i < NCPU; i++ {
result += <-ch
}
return result
}


func formula(ch chan float64, start, end int) {
result := 0.0
for i := start; i < end; i++ {
x := float64(i)
result += 4 * (math.Pow(-1, x) / (2.0*x + 1.0))
}
ch <- result
}

This form seems to be eating my indentation, but you should get the idea. Nothing a little gofmt can't fix.

June 17, 2011 | Unregistered Commentermtklein

Thanks everybody for the comments.

I agree with Gus that this isn't the best example of parallelism. A computation so puny is really not even worth the overhead of spawning extra processes. And this comment applies to the Scala example on Typesafe's website also.

One thing I didn't mention is that on my Mac mini, when I set MAXGOPROCS to 2 (dual core machine) the example takes much longer to complete. Must be the overhead of spawning 300K goroutines over 2 cores.

What the example does articulate however, is that you can easily split up chunks of work, then aggregate the results back to a single mutable variable.

Go's built-in language constructs make this so easy to do. No need for external libraries. I think that is pretty cool.

June 17, 2011 | Registered CommenterCharles Thompson

thought simple, i can always see some sexy code in go..

June 17, 2011 | Unregistered Commenterfanwall

"Go's built-in language constructs make this so easy to do. No need for external libraries. I think that is pretty cool."

Interesting. Most people out there have the complete opposite opinion. Why is it not possible in Go to just implement it in a library?

June 19, 2011 | Unregistered Commentersteve

I always remember that scene when I think of that movie the i think it was a funny scene.
best essay editing service

January 12, 2012 | Unregistered Commentermolly

When I checked Go (on Windows) about a year ago, it was cheating as far as concurrency was concerned. It was using a single thread for all go routines regardless of the value of MAXGOPROCS. The language has a potential but the implemenations are not yet ready.

May 28, 2012 | Unregistered CommenterFrancis

Good comaprism.
I like go very much for it's simplicity and power.
But I think - regarding to Java - the Java programmers will not leave there plattform (JVM). On the JVM will be the decission between Scala and Clojure.

September 20, 2012 | Unregistered CommenterAlexander

PostPost a New Comment

Enter your information below to add a new comment.

My response is on my own website »
Author Email (optional):
Author URL (optional):
Post:
 
Some HTML allowed: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <code> <em> <i> <strike> <strong>