days
-7
0
hours
0
-6
minutes
-5
-4
seconds
-5
-4
search
You sure you've been doing it the right way?

Akka anti-patterns: Java serialization

Manuel Bernhardt
akka

The Akka documentation discourages the use of Java serialization but performance is just one reason to not use Java serialization – there are stronger reasons to not engage with it, especially in the context of Akka. What to do instead? In this article, Manuel Bernhardt gives some alternatives.

Akka makes use of serialization when messages leave the JVM boundaries. This can happen in mainly two scenarios: sending messages over the network when using Akka Cluster (do not use Akka Remote directly) or using Akka Persistence.

Now here’s the catch: the default serialization technology configured in Akka is nothing but the infamous Java serialization, which Mark Reinhold called a “horrible mistake” and which Oracle plans to dump in the near future anyway.

When using Java serialization, you get warning messages in your logs:

[WARN] [03/29/2016 16:53:40.137] [LookupSystem-akka.remote.default-remote-dispatcher-8] 
[akka.serialization.Serialization(akka://LookupSystem)] Using the default Java serializer for class 
[sample.remote.calculator.Subtract] which is not recommended because of performance implications.
Use another serializer or disable this warning using the setting 'akka.actor.warn-about-java-serializer-usage'
 

That’s nice, but it’s not enough – they are, after all, easy to ignore (and the warnings can be turned off – which can be tempting).

The Akka documentation discourages the use of Java serialization – there’s a note in the serialization documentation that mainly points out the performance issue:

 "It is highly encouraged to disable java serialization, so please plan to do so at the earliest possibility you have in your project.
One may think that network bandwidth and latency limit the performance of remote messaging, but serialization is a more typical bottleneck."

Now, performance is one reason to not use Java serialization – but there are stronger reasons to not engage with it, especially in the context of Akka:

  • security: Java serialization is a known attack vector for some time now
  • schema evolution: when using Akka Persistence for event sourcing, you will encounter situations in which you need to evolve the stored messages (unless you don’t use the project in production and the requirements never evolve). Java serialization is ill-suited for dealing with changes in message encoding, there are ways to be backwards compatible but they are cumbersome at best. If you don’t plan ahead and have a running production system with Java serialization, you’re in for a bad surprise – you’ll have to switch the serialization technology mid-flight in a running production system, which needs to be very carefully coordinated
  • message protocol evolution: when you’re running a clustered application and want to redeploy it in a rolling fashion (i.e. upgrading one node after the other), you’ll be faced with having nodes at different protocol versions running. There is no silver bullet for this – for incompatible changes, you’ll likely need to build message adapters anyway – but for more common changes such as adding or renaming a field, binary protocols make life a lot easier than Java serialization

What to do instead?

When starting a new project, the first thing you should do is to disable Java serialization by adding this line to application.conf:

akka.actor.allow-java-serialization = off
 

Next, you will need a binary serialization protocol. You really don’t want to use JSON, unless you are the type of person who enjoys load-testing their garbage collector as Martin Thompson puts it. There are quite a few around by now:

Martin Kleppmann wrote a nice comparison of Avro, Protocol Buffers and Thirft with a focus on message evolution. There are different approaches in binary protocols, the ones that use a per-field approach and the ones that go for a full schema – both approaches have their pros and cons and really need to be evaluated in the light of your project and organizational structure.

Akka uses Protocol Buffers internally, which has the advantage that integrating those is quite straightforward. For example, if you use ScalaPB to generate case classes from the Protocol Buffer definitions you only need the following configuration to get things to work:

akka {
  actor {
    # repeated here, because you really should have this line in your project
    akka.actor.allow-java-serialization = off
 
    # which serializers are available under which key
    serializers {
      proto = "akka.remote.serialization.ProtobufSerializer"
    }
 
    # which interfaces / traits / classes should be handled by which serializer
    serialization-bindings {
      "scalapb.GeneratedMessage" = proto
    }
  }
}
 

One more thing: with the rise of Akka Typed, the best practice of defining one protocol per actor (instead of sharing messages) becomes even the more relevant. As such, one approach I’ve used successfully in a number of projects now uses the following pattern (in combination with ScalaPB):

  • define one protocol file for each actor
  • the proto file is placed in the same directory structure as the package structure of the actor, e.g. src/main/protobuf/foo/bar/SomeActorProtocol.proto for the actor foo.bar.SomeActor
  • enable the right options to have only one scala source file generated for all actor messages

For example, the SomeActorProtoco.proto file would look as follows:

syntax = "proto3";
 
import "scalapb/scalapb.proto";
import "google/protobuf/timestamp.proto";
 
package foo.bar.SomeActorProtocol;
 
option (scalapb.options) = {
    single_file: true
    flat_package: true
    preamble: "sealed trait Command"
    preamble: "sealed trait Event"
};
 
// Commands
 
message Greet {
    option (scalapb.message).extends = "Command";
    string message = 1;
}
 
// Events
 
message GreetingReceived {
    option (scalapb.message).extends = "Event";
    String greeting = 1;
    google.protobuf.Timestamp timestamp = 2;
}
 

That’s it! If you realize now that you’ve been doing it all wrong and are likely going to run into trouble with Java serialization, I invite you to check out the Akka anti-patterns overview. Chances are that this won’t lift your mood, but you might find out one useful thing or two.

This post was originally published on Manuel Bernhardt’s blog.

Author
Manuel Bernhardt
Manuel Bernhardt is a passionate engineer, author, speaker and consultant who has a keen interest in the science of building and operating networked applications that run smoothly despite their distributed nature. Since 2008, he has guided and trained enterprise teams on the transformation to distributed computing. In recent years he is focusing primarily on production systems that embrace the reactive application architecture, using Scala, Play Framework and Akka to this end. Manuel likes to travel and is a frequent speaker at international conferences. He lives in Vienna where he is a co-organizer of the Scala Vienna User Group. Next to thinking, talking about and fiddling with computers he likes to spend time with his family, run, scuba-dive and read. You can find out more about Manuel's recent work at http://manuel.bernhardt.io.