Must go faster

FastR: A new R virtual machine written in Java

Jane Elizabeth

Looking for an alternative to R? Rethink your approach to data science and statistical analysis with FastR, a virtual machine written in Java that boasts a faster, more efficient runtime.

R is something of a niche language. Unless they are academics or data scientists, most developers tend to give this dynamic language a pass. Still, with over 3 million users, it’s no slouch for a programming language first developed in 1993.

However, R does show its age sometimes. R is generally quite slow because it lacks a JIT compiler. It is also something of a memory hog: it has large objects, allocates profusely, and uses a non-moving garbage collector. Additionally, R has tricky semantics, which makes it difficult for undergraduates and graduate students alike to learn in their data science labs or poli sci research courses.

As “legacy software that must be maintained,” R “must also evolve to meet new challenges.”

FastR

So, R must evolve. But how?

One way is through the FastR project, an attempt to rethink how R is implemented. This virtual machine is an implementation of R written in Java on top of the Truffle framework, so it builds on tested technologies. FastR is efficient, compatible, and polyglot.

R is neither the fastest nor the most efficient programming language around. FastR improves on the original by making extensive use of the dynamic optimization features provided by the Truffle framework. These strip away the overhead of the abstractions introduced by the R language, allowing the Graal compiler to create optimized machine code on the fly.
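
To give a rough feel for what “dynamic optimization” can mean here, the following is a minimal sketch in plain Java of an addition node that specializes itself according to the operand types it observes. It is a toy under stated assumptions, not FastR or Truffle code: the names `SpecializingAdd`, `Node`, `Var`, `Add`, and `State` are hypothetical and do not come from either project.

```java
// Toy illustration only: none of these classes are part of Truffle, Graal, or FastR.
final class SpecializingAdd {

    /** A node in a tiny expression tree; evaluation returns a boxed value. */
    interface Node {
        Object execute();
    }

    /** A leaf whose value can change between runs, much like a variable. */
    static final class Var implements Node {
        private Object value;
        Var(Object value) { this.value = value; }
        void set(Object value) { this.value = value; }
        @Override public Object execute() { return value; }
    }

    /** The specialization states an addition node can be in. */
    enum State { UNINITIALIZED, INT_ONLY, GENERIC }

    /** Addition that specializes itself based on the operand types it observes. */
    static final class Add implements Node {
        private final Node left;
        private final Node right;
        private State state = State.UNINITIALIZED;

        Add(Node left, Node right) { this.left = left; this.right = right; }

        State state() { return state; }

        @Override public Object execute() {
            Object l = left.execute();
            Object r = right.execute();
            boolean bothInts = l instanceof Integer && r instanceof Integer;
            if (state == State.UNINITIALIZED) {
                // First run: speculate that the types seen now will keep occurring.
                state = bothInts ? State.INT_ONLY : State.GENERIC;
            } else if (state == State.INT_ONLY && !bothInts) {
                // Speculation failed: fall back to the generic path permanently.
                state = State.GENERIC;
            }
            if (state == State.INT_ONLY) {
                // Fast path: plain int arithmetic (overflow handling omitted in this sketch).
                return (Integer) l + (Integer) r;
            }
            // Slow path: treat both operands as doubles.
            return ((Number) l).doubleValue() + ((Number) r).doubleValue();
        }
    }

    public static void main(String[] args) {
        Var x = new Var(1);
        Add add = new Add(x, new Var(2));
        System.out.println(add.execute() + " " + add.state()); // 3 INT_ONLY
        x.set(1.5);
        System.out.println(add.execute() + " " + add.state()); // 3.5 GENERIC
        x.set(4);
        System.out.println(add.execute() + " " + add.state()); // 6.0 GENERIC
    }
}
```

In this sketch the node still checks operand types on every call, so it only models the bookkeeping. The speed-up in a real system would come from a compiler that generates machine code for just the specialized path and discards it when the speculation fails.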

The Truffle framework also allows FastR to address language incompatibility issues. R is powerful and flexible, but its interfaces to other languages like Java, Fortran, and C/C++ often create significant overhead. This is caused by the different execution strategies employed by different languages, e.g., compiled vs. interpreted, and by incompatible internal data representations.

How do we fix this kind of fundamental mismatch? The Truffle framework builds the necessary polyglot primitives directly into the runtime. FastR uses this infrastructure to allow multiple languages to interact transparently and seamlessly. No matter where the language boundaries fall, all parts of a polyglot application can be compiled by the same compiler, and they can be executed and debugged together as well.

Plus, the addition of a JIT compiler can only help with the speed issue.

Where to get it

FastR is currently available in two forms:

  • As a pre-built binary with the Truffle implementations of Ruby and JavaScript for Linux and Mac OS X. Unfortunately, there is no Windows version available yet.
  • As a source release on GitHub. This option does not come with Ruby or JavaScript.

While FastR is eventually intended to be a drop-in replacement for R, the implementation is still a work in progress. Contributions and collaboration are welcome. If you’re interested in this approach to R, head on over to GitHub to find out more.

Author
Jane Elizabeth
Jane Elizabeth is an assistant editor for JAXenter.com
