I was looking for a convenient way to access Cassandra from Scala for a new version of our app that is currently in development. After a long and tedious afternoon I finally got something working that feels solid, so I thought I’d share my experience here.
The first time I checked the Client Options wiki page there were a couple of projects listed for the Scala language. I wasn’t convinced by any of the alternatives I looked at. Given that Cassandra is such a fast-moving target, I felt that the client library needs to have at least one committed developer (and some community momentum) behind it.
I took a quick look at scromium, but I wasn’t convinced by its choice of different token separators. I then looked more deeply into scalandra but just couldn’t make it work with Cassandra 0.6-rc1. I was initially interested in the Akka option, since I was already looking at Akka for its distributed actors, but I was disappointed by its current implementation.
You can use Akka to access Cassandra via the Pluggable Persistence module. According to the documentation you can bypass the Akka STM and use it as a library; they call it the Cassandra Session API. After a lot of tinkering (building Thrift on Mac OS X, installing MacPorts to get the missing pieces, etc.), I still couldn’t get a simple test to run against Cassandra 0.6-rc1. I later noticed that even the Akka developers were considering a shift to Hector, a Java-based library. Given this endorsement I decided to try it myself.
Hector is a thin wrapper over the Cassandra Thrift API. It basically makes it a bit more object-oriented and easy to use from Java.
It turns out that talking to Cassandra from Scala using Hector is very simple, albeit not very Scala-ish. I set up a basic Scala project in Eclipse and added the Hector libraries to the classpath. Then I modified ExampleClient.java and with Cassandra running on the default port it worked like a charm:
import me.prettyprint.cassandra.service.CassandraClientPoolFactory
import me.prettyprint.cassandra.utils.StringUtils._
import org.apache.cassandra.thrift.{Column, ColumnPath}

object CassandraScalaHectorTest {
  def main(args: Array[String]): Unit = {
    val pool = CassandraClientPoolFactory.INSTANCE.get()
    // var client = pool.borrowClient(Array("cas1:9160", "cas2:9160"))
    var client = pool.borrowClient(Array("localhost:9160"))
    try {
      val keyspace = client.getKeyspace("Keyspace1")
      val columnPath = new ColumnPath("Standard1").setColumn(bytes("column-name"))
      // insert
      keyspace.insert("key", columnPath, bytes("value"))
      // read
      val col = keyspace.getColumn("key", columnPath)
      System.out.println("Stored value: " + string(col.getValue()))
      // This line makes sure that even if the client had
      // failures and recovered, a correct
      // releaseClient is called, on the up to date client.
      client = keyspace.getClient()
    } finally {
      // return client to pool. do it in a finally block to
      // make sure it's executed
      pool.releaseClient(client)
    }
  }
}
As you can see this was only slightly modified so it compiles in Scala. It’s a little cleaner and less verbose than the Java version, mostly because of Scala’s type inference, which reduces the need to declare all types.
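The borrow/try/finally/release dance in the example is the kind of thing Scala can hide behind a small helper, often called the loan pattern. The following is only a minimal sketch: `Client`, `Pool`, `Loan.withClient` and `CountingPool` are hypothetical stand-ins I made up for illustration, not Hector types, and the sketch ignores the failover detail where the example re-reads the client from the keyspace before releasing it.

```scala
// Hypothetical stand-ins for a pooled client; these are NOT Hector classes.
trait Client { def name: String }

trait Pool {
  def borrow(): Client
  def release(c: Client): Unit
}

object Loan {
  // Borrows a client, runs `body` with it, and always returns the
  // client to the pool, even if `body` throws.
  def withClient[A](pool: Pool)(body: Client => A): A = {
    val client = pool.borrow()
    try body(client)
    finally pool.release(client)
  }
}

object LoanDemo {
  // In-memory pool that only counts how many clients are checked out.
  class CountingPool extends Pool {
    var outstanding = 0
    def borrow(): Client = {
      outstanding += 1
      new Client { val name = "localhost:9160" }
    }
    def release(c: Client): Unit = outstanding -= 1
  }

  def main(args: Array[String]): Unit = {
    val pool = new CountingPool
    val result = Loan.withClient(pool)(c => "connected to " + c.name)
    println(result)
    println("outstanding clients: " + pool.outstanding)
  }
}
```

With a helper like this the calling code no longer needs a mutable `var client` or its own finally block; whether that is worth a wrapper layer over Hector is a matter of taste.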
One of the things I don’t like is the explicit conversions to and from byte arrays. This is exactly the kind of boilerplate that can be done for you by using Scala’s implicit conversions and, overall, it’s just not very idiomatic Scala. Although I’m not a purist and I could live with this solution, I would prefer a cleaner approach. I still haven’t ruled out the possibility of wrapping Hector in Scala and providing a couple of conversions to make it work as expected.
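To make the idea concrete, here is a rough sketch of what such conversions could look like. It does not touch Hector at all: `ByteConversions` and the `store` method are names I invented for illustration, with `store` standing in for any API that only accepts byte arrays, and UTF-8 is an assumed encoding.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import scala.language.implicitConversions

// Hypothetical helper object; not part of Hector or Cassandra.
object ByteConversions {
  // Lets a String be passed wherever an Array[Byte] is expected.
  implicit def stringToBytes(s: String): Array[Byte] = s.getBytes(UTF_8)
  // Lets an Array[Byte] be used wherever a String is expected.
  implicit def bytesToString(b: Array[Byte]): String = new String(b, UTF_8)
}

object ByteConversionsDemo {
  import ByteConversions._

  // Stand-in for a client method that only takes byte arrays (assumption).
  def store(value: Array[Byte]): Array[Byte] = value

  def main(args: Array[String]): Unit = {
    // The String argument is converted to Array[Byte] by the compiler.
    val raw: Array[Byte] = store("value")
    // The declared type String triggers the reverse conversion.
    val text: String = raw
    println("Stored value: " + text)
  }
}
```

The trade-off is that the conversions become invisible at the call site, so I would keep them in a small object that has to be imported explicitly, as above, rather than making them available everywhere.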
Since I started working on this test the Cassandra Client Options wiki page was updated with a new Scala library called cascal. I haven’t had a chance to try it yet but it looks promising so I will evaluate that next.
Perhaps there’s a silver bullet I haven’t even heard of yet. I’m curious to know how many people are fronting Cassandra with a Scala app.


Hey, I maintain scromium and I wanted to know what it was specifically that turned you off to the project. I’m using it myself, so the API is mainly oriented around my needs, but I’m always open to some constructive criticism from other folks. Thanks.
Hi Cliff. Yours was one of the first ones I looked at and I honestly didn’t explore it much. So I wasn’t making a judgement on the merits of the implementation. My first gut feeling was that my data access code would look a bit awkward with all the different separator characters (i.e. () % / !). I prefer a simpler nomenclature, especially in Scala, where so many libraries do different things with non-alphanumeric characters. But maybe I should revisit it.
Thanks for sharing it with the world.
Just to let you know, Scalandra has been updated to work with Cassandra 0.6. While Scalandra is far from perfect, it has been able to do the job for us.
Cascal does indeed look interesting. The project seems to be active, but then again it has only been around for a few weeks so it’s hard to say what will happen in the long run. IMHO, Cascal is also another victim of operator-itis, but this is probably just a personal preference.
Thanks for the heads up. I’ll have to give Scalandra another try.
I think Cascal is simpler in its operators, although I’m not sure the double backslash is absolutely necessary. Some of the other stuff Chris and his team are doing with simple object mapping seems very practical and utilitarian.
nice info, thanks