Rejigging gingerbread

Facebook’s “completely insane” Dalvik hack


Struggling with the limitations of legacy Android? Facebook’s engineers decided to hack the underlying VM.

The pain that Android’s slow OS adoption causes app developers is well documented. Hacking the Dalvik virtual machine itself, however, must be the most extreme workaround yet.

Version 2.0 of Facebook’s Android app, released in December, was praised for its speed increases – attributed to its use of native code rather than its old hybrid model. Yet the development process was anything but smooth, Facebook engineer David Reiss revealed yesterday, as the porting of large amounts of JavaScript to Java resulted in too many methods for older versions of the Dalvik VM to handle.

The ‘dexopt’ program, used to prepare an app for installation on a specific phone, has a considerably smaller ‘LinearAlloc’ buffer on older versions of Android, including Froyo and Gingerbread. As this limit translates to around 3 million method references (according to TechCrunch), Facebook’s app must have contained far more.

After “a bit of panic,” wrote Reiss, the team decided to split the app into multiple dex files, and then inject these directly into the system class loader. “This isn’t normally possible, but we examined the Android source code and used Java reflection to directly modify some of its internal structures.”
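The reflection half of that trick can be illustrated in plain Java. What follows is a minimal sketch, not Facebook’s code: `StandInLoader` and its private `dexEntries` field are hypothetical stand-ins for the class loader’s internal structures, which Reiss’s post does not spell out and which differ between Android versions. It only shows the general idea of reaching a private array through reflection and swapping in a longer one.

```java
import java.lang.reflect.Field;
import java.util.Arrays;

public class ReflectionPatchSketch {

    /** Hypothetical stand-in for a class loader that keeps its search path in a private array. */
    static final class StandInLoader {
        private String[] dexEntries;

        StandInLoader(String... initialEntries) {
            this.dexEntries = initialEntries.clone();
        }

        /** Read-only view, used here only to observe the effect of the patch. */
        String describe() {
            return Arrays.toString(dexEntries);
        }
    }

    /** Appends extra entries to the loader's private array via reflection. */
    static void appendEntries(StandInLoader loader, String... extraEntries)
            throws ReflectiveOperationException {
        Field field = StandInLoader.class.getDeclaredField("dexEntries");
        field.setAccessible(true); // bypass the 'private' modifier

        // Copy the existing entries into a longer array, then add the new ones at the end.
        String[] current = (String[]) field.get(loader);
        String[] combined = Arrays.copyOf(current, current.length + extraEntries.length);
        System.arraycopy(extraEntries, 0, combined, current.length, extraEntries.length);

        field.set(loader, combined); // swap the enlarged array back in
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        StandInLoader loader = new StandInLoader("classes.dex");
        appendEntries(loader, "secondary-1.dex", "secondary-2.dex");
        System.out.println(loader.describe());
    }
}
```

On a real device the target would be an undocumented field inside the platform’s own loader, which is why the approach is fragile: any vendor or version that renames or restructures that field breaks the patch.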

Unfortunately, this wasn’t the end of the team’s problems: as they later discovered, the LinearAlloc buffer exists not just in dexopt but “within every running Android program”.

“There was no way to work around this with dex files since all of our classes were being loaded into one process, and we weren’t able to find any information about anyone who had faced this problem before (since it is only possible once you are already using multiple dex files, which is a difficult technique in itself). We were on our own.”

Their solution? Hacking the Dalvik engine itself, using a Java Native Interface extension to replace the existing LinearAlloc buffer with a larger one.

“At first, this idea seemed completely insane,” wrote Reiss. “Modifying the internals of the Java class loader is one thing, but modifying the internals of the Dalvik VM while it was running our code is incredibly dangerous.”

The journey didn’t end there, however – in testing, they found that their hack failed on the Galaxy S II (“The most popular Gingerbread phone of all time”) because of a small tweak by manufacturer Samsung, so the JNI extension had to be redesigned to detect any similar modifications.

While the hack is clever, many have panned Facebook’s engineers for failing to reduce their method count to a manageable size in the first place. “This is madness. Instead of admitting you had simply too many methods to manage [...] you blindly moved forward, hacking away until you were lucky enough to find a way to make it work,” Chris Schmitz commented on the post, while on Hacker News, Julian Morrison wrote: “They bloated the app so badly, they [...] had to monkey patch the OS to let it run at all”.

On the one hand, it’s a triumph of hackery over careful planning and design – but on the other, it’s an example of where an open-source operating system can be truly useful. “Needless to say,” concluded Reiss, “without Android’s open platform, we wouldn’t have had the opportunity to ship our best version of the app.”

Photo by Aiden Jones.
