How Java helped OpenWorm wriggle to life
Inside the open source Java project dedicated to creating a virtual organism inside a computer.
In this article, originally published in the January edition of JAX Magazine, Elliot Bentley looks at the bio-cyber breakthrough that’s getting the Javasphere excited.
Streaming 200MB per second to browsers is just one of many engineering challenges faced when simulating a living organism. A video of a wriggling microscopic worm seems an unlikely news story, but over Christmas 2013, it was featured on dozens of sites including Mail Online, Engadget and BoingBoing.
This was no ordinary worm, however: it was a computer simulation, the culmination of over two years’ work by an international team of scientists and engineers. Together they have been building OpenWorm, “the world’s first virtual organism in a computer”, and this two-minute video was recorded to show off its muscle systems in action.
Under the surface of this worm is an incredibly complex and entirely open source Java application built with OSGi and Spring. OpenWorm co-founder Matteo Cantarelli told JAXenter that Java was an obvious choice for a project of this scale.
“Basically, we wanted an enterprise solution, because what we’re building will have to scale up a lot,” he said in a phone interview. “When you compare the speed of Python and Java when looking at how big the objects in memory are, those kinds of things – for something robust and long-term, Java was what we ended up going with.”
Cantarelli originally trained in electrical and systems engineering, before working for around six years as a software engineer. In his spare time, however, he was exploring a very different side of software: simulating life itself.
“I started a process of self-educating, just reading loads of [biology] books,” says Cantarelli. Eventually, he met a group of like-minded people over the internet – or as Cantarelli puts it, “A couple of individuals who had the same dream: to simulate a C. elegans”.
Cut it, and it’ll bleed Java
Caenorhabditis elegans, or just C. elegans to friends, is a microscopic nematode worm that, thanks to its prevalence and simplicity, has become one of the most-studied organisms in the world. Not only was it the first multicellular organism to have its entire genome sequenced, but the life patterns of every single one of its 1,031 cells have been studied in depth.
Despite this, modelling a C. elegans on a computer was still a wildly ambitious task when the four founders began work on OpenWorm. The complexity of even the simplest of organisms rivals some of the toughest problems in computer science. To make things harder still, the group had settled on a flexible client-server architecture that needed to be scalable.
Everything in OpenWorm is written twice – first by the scientist half of the team, who create models in Python and C++ based on real-life observations, and then a second time by the engineering half as a highly modular Java application.
“We wanted to build something with a client-server architecture, and we wanted something robust,” says Cantarelli, “and Java has a very good track record as offering server-based applications. We wanted something that was, for instance, strongly typed.”
“It’s not plain Java, because that wouldn’t cut it. We used a lot of OSGi, in the sense that we want the whole architecture to be modular and independent, so we want good dependencies decoupling. And using Spring offers benefits along the same line.” Eclipse Virgo was chosen because it was “one of the first web servers to allow OSGi within a web server container”.
Cantarelli stresses the importance of this modularisation. Not only are systems within the worm kept separate, but the organism’s data is decoupled too – providing a solid foundation on which different organisms could also be simulated.
“The idea is that the architecture and the platform will be reusable entirely,” says Cantarelli. “The worm will be just data.”
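To make that separation concrete, here is a minimal Java sketch of a platform that treats the organism as plain data and each simulated system as an independent module. It is an illustration only: the names OrganismModel, Simulator, SimulationPlatform and PlatformDemo are hypothetical and are not taken from the OpenWorm codebase, and the sketch leaves out OSGi and Spring entirely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: none of these names come from OpenWorm.

/** The organism is plain data: a name plus named numeric parameters. */
final class OrganismModel {
    final String name;
    final Map<String, Double> parameters;

    OrganismModel(String name, Map<String, Double> parameters) {
        this.name = name;
        // Defensive copy so the model cannot change under a running simulator.
        this.parameters = Collections.unmodifiableMap(new LinkedHashMap<>(parameters));
    }
}

/** One independent aspect of the simulation (for example muscles or neurons). */
interface Simulator {
    String aspect();

    /** Advances this aspect by one timestep and returns a short summary. */
    String step(OrganismModel model, int timestep);
}

/** Generic platform code: it knows nothing about worms specifically. */
final class SimulationPlatform {
    private final List<Simulator> simulators = new ArrayList<>();

    void register(Simulator simulator) {
        simulators.add(simulator);
    }

    /** Runs every registered simulator once against the same organism data. */
    List<String> runStep(OrganismModel model, int timestep) {
        List<String> results = new ArrayList<>();
        for (Simulator s : simulators) {
            results.add(s.aspect() + ": " + s.step(model, timestep));
        }
        return results;
    }
}

final class PlatformDemo {
    /** Toy simulator that only reads one parameter from the model. */
    static Simulator particleCounter() {
        return new Simulator() {
            @Override
            public String aspect() {
                return "mechanics";
            }

            @Override
            public String step(OrganismModel model, int timestep) {
                double particles = model.parameters.getOrDefault("particles", 0.0);
                return model.name + " t=" + timestep + " particles=" + (long) particles;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Double> params = new LinkedHashMap<>();
        params.put("particles", 150_000.0); // particle count quoted in the article
        OrganismModel worm = new OrganismModel("C. elegans", params);

        SimulationPlatform platform = new SimulationPlatform();
        platform.register(particleCounter());
        for (String line : platform.runStep(worm, 0)) {
            System.out.println(line);
        }
    }
}
```

In a design like this, simulating a different organism would mean supplying a different OrganismModel rather than rewriting the platform, which is the spirit of the quote above.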
Burrowing OpenWorm through the internet
The current build of OpenWorm runs on Amazon EC2, but Cantarelli says there is ongoing research into a “more diverse composition for the different nodes of the network”. Some aspects of the simulation might require high-performance computing, for example, while others might use a parallelised architecture. At this stage, though, no permanent decisions have been made.
This client-server model may be forward-thinking, but Cantarelli describes it as an “ambitious choice”. The idea, he says, is to make the worm simulation so accessible that it can be used “the same way as Google Docs”.
But Google Docs deals mostly with text, whereas an organism, even one as simple as a nematode, is vastly more complex. For an accurate visualisation, huge amounts of data will need to be streamed to the client, and this presents one of the largest engineering challenges of the project.
What does this data consist of? “If you think of the particle simulation, for instance,” explains Cantarelli, “that particle simulation there has roughly 150,000 particles. That is the current model, so it’s not even the final model, and it is only one aspect [of the entire model].
“And by one aspect, I mean it is just concerned with the particle physics of the locomotion of the worm, so how the muscle stretches, how the body stretches, how the body wriggles, friction with the liquid or gel that the worm is swimming into.”
Compared to the type of data typically streamed over the web, the output of this calculation is enormous. “Each timestep is in the order of magnitude of five megabytes,” says Cantarelli. “Now, imagine that you want to stream at 35 frames per second… you’re looking at a huge bandwidth [requirement].”
As a result, this highly detailed model outputs a whopping 200MB or so per second (35 frames of roughly five megabytes each comes to about 175MB) – around a hundred times the bandwidth needed for 4K video. Work hasn’t yet started on optimising this stream, but Cantarelli says that it will require some creative thinking on the part of the engineers.
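The arithmetic behind that figure can be checked with a short back-of-envelope Java sketch. It uses only the approximate numbers quoted by Cantarelli (about five megabytes per timestep, 35 frames per second, roughly 150,000 particles) and assumes, for simplicity, that one megabyte is 1,000,000 bytes. The class name StreamEstimate is hypothetical.

```java
// Back-of-envelope estimate using the approximate figures quoted in the article.
// Assumption: 1 MB = 1,000,000 bytes. StreamEstimate is a hypothetical name.
final class StreamEstimate {
    static final long BYTES_PER_TIMESTEP = 5_000_000L; // "order of magnitude of five megabytes"
    static final int FRAMES_PER_SECOND = 35;
    static final int PARTICLES = 150_000;

    /** Raw bytes that must be sent each second at the given frame rate. */
    static long bytesPerSecond(long bytesPerFrame, int fps) {
        return bytesPerFrame * fps;
    }

    /** The same rate expressed in megabytes per second. */
    static double megabytesPerSecond(long bytesPerFrame, int fps) {
        return bytesPerSecond(bytesPerFrame, fps) / 1_000_000.0;
    }

    /** Average payload per particle in a single timestep. */
    static double bytesPerParticle(long bytesPerFrame, int particles) {
        return (double) bytesPerFrame / particles;
    }

    public static void main(String[] args) {
        System.out.printf("Stream rate: %.0f MB/s%n",
                megabytesPerSecond(BYTES_PER_TIMESTEP, FRAMES_PER_SECOND));
        System.out.printf("Per particle: %.1f bytes per timestep%n",
                bytesPerParticle(BYTES_PER_TIMESTEP, PARTICLES));
    }
}
```

With these inputs the rate works out to 175MB per second, in line with the rounded 200MB figure, and to roughly 33 bytes per particle per timestep. Because the per-timestep size is only an order-of-magnitude estimate, the result should be read the same way.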
It’s one of many technical challenges yet to be solved, but Cantarelli and his engineers are on track to deal with them. In just two years OpenWorm has already grown from a four-person side project to a sprawling organisation.
“I wake up every day and there’s new stuff happening, and there’s new people doing things, new people contributing, and it just makes me very happy,” says Cantarelli. “The project now has a life of its own!”