This ain't your grandma's Java

Everything I know about Java collections I learned from Scala

From December's JAX Magazine, Jessica Kerr explains how using Scala concepts can help us when using plain old Java.

Why should a Java developer learn a new language? To write better Java, of course!

Scala, a language that compiles to Java bytecode, enables stronger object-oriented design and slathers functional support on top of that. Who would have thought that playing around with Scala would teach me new ways of working with such a core piece of Java as Collections?

Techniques and idioms of Scala open whole new lines of reasoning that apply right back to Java code. This article details a few lines of reasoning: how they’re exemplified in Scala, how we can use them in Java, and how we’ll use them even more in Java soon.

Do you think this is simple?

I’m going to show a piece of Java code, contrast it with Scala, and then call out three disadvantages of the Java style that I never thought about before learning Scala. Then I’ll explain what it is about Scala that makes its superior style possible, and finally show how we can do the same thing in Java with a little more effort.

Here’s a common piece of Java code in Listing 1. Say we have a list of nearby creatures and we want a list of all the hostile ones:

Listing 1

 

public List<Creature> getHostileCreatures(List<Creature> allCreatures) {
        List<Creature> output = new LinkedList<Creature>();
        for(creature : allCreatures) {
                if (isHostile(creature)) {
                        output.add(creature);
                }
        }
        return output;
}

 

This code is familiar to any Java programmer. Loop through a list, pulling out all the elements that meet a condition. It’s all over the Enterprise Java code I’ve written and maintained for years. The only part that varies is the conditional expression (highlighted in bold in Listing 1).

This common recipe is not a design pattern – don’t give it credit! It is a quilting pattern ie. a section of code that appears repeatedly in many places, with only specific sections varied. Remember: just because your grandmother wrote Java this way doesn’t mean you have to. It’s so familiar we don’t have to read it; we can skip straight to the conditional, to the one snippet of meaning in these nine lines.

I’ll show you simple

Reducing the boilerplate of quilting patterns is a strength of Scala. What are we really trying to do in getHostileCreatures? We’re going from one list to another with fewer elements; like coffee grinds the unwanted elements don’t make it through the filter. Here’s how this looks in Scala:

val enemies = allCreatures.filter(isHostile) 

Oh, that’s shorter. It’s declarative: it states what we’re trying to do – filter some elements out of the list – without showing how that is done. The immediate benefit of this style is readability. Unlike Listing 1, this is readable to people who haven’t spend a decade staring at idiomatic Java code. One caveat – some people expect filter to eliminate elements that meet the condition; it seems backwards. However, this syntax describes the elements in the returned list, such as “creatures that are hostile.” We’re making a positive statement about the elements we want.

In addition to readability, there are two other big wins to Scala’s declarative style here. Follow along.

Simple means the hard stuff is easy

When we separate “what we’re doing” from “how we’re doing it,” we get more flexibility in the “how.” If this is the big battle scene at the end, the list of creatures could be huge. We could break it into pieces and filter in parallel! In Java, that’s way too much work. But in Scala:

val enemies = allCreatures.par.filter(isHostile)

The par property of Scala’s default List returns a different implementation of List, one whose filter method breaks the list into sections and filters each section on a different thread. The operation we’re performing isn’t inherently “loop through each element and add some to a new list;” that’s an implementation detail hard-coded into the Java version. Separating “what” from “how” means the “how” can be customized independently. This is a separation of concerns. It meets the Prime Directive of development, the Single Responsibility Principle.

The third superiority of Scala’s filter method lies in this line from the Java code:

List<Creature> output = new LinkedList<Creature>();

The particular implementation of List is tightly coupled with this method. What if lists of creatures are more efficiently expressed as ArrayList? Or not as lists at all, but as sets or streams? Every place this quilting pattern was used, a concrete implementation of List is stitched into our application.

Consider filtering an extremely long list, something read from a 250M file. The style in Listing 1 requires the entire list in memory all at once; every line must be processed before the method returns. For serious data processing, this is not acceptable. Scala’s filter method, on the other hand, is defined on collection types that work differently, including streams and iterators. For processing a very long file, I’ll stick with an Iterator (see Listing 2). 

Listing 2

val source = io.Source.fromFile("creatures.scala")
val linesIterator = source.getLines()
linesIterator.filter(isRelevant).foreach { s =>
   // process and output
}
source.close()

 

The entire file never has to be in memory in this Scala code (Listing 2). The filter is applied lazily, one row at a time, as the elements are requested from the filtered iterator. Elements that were filtered out or have already been processed can be garbage collected. We can process files of arbitrary size in limited memory.

What's going on here?

We’ve seen that Scala’s collection processing idioms are more readable and more flexible than traditional Java style. There are many more collection-processing idioms that work the same way. Before looking at a few of those, let’s explore what it is about Scala that makes this style both possible and preferred.

val enemies = allCreatures.filter(isHostile)

Here, isHostile is a function that accepts a Creature and returns a Boolean.

def isHostile(creature : Creature) = creature.alignment == EVIL

In Scala, functions are first-class citizens, right up there with primitives and objects in Java. This means functions can be parameters to other functions. It’s convenient, and highly idiomatic in Scala to pass functions and methods around like this. Or to declare our own little anonymous function right inside the argument list:

val enemies = allCreatures.filter((creature) => isHostile(creature))

It is like being able to pass just the important snippet from Listing 1isHostile(creature) – into a method. This is how we separate “what” we’re doing from “how” we’re doing it. Passing bits of code around enables much more fine-grained separation of concerns.

That isn't magic

Now that we’ve defined what’s going on here, we can do it in Java. One way to pass code is the Strategy design pattern, which wraps a “how” in a custom object. In list filtration, the Strategy is “how can I know whether an element belongs in the output?” We call this function of one argument that returns boolean a Predicate. This works in Java:

Listing 3

 

import com.google.common.base.Predicate;
import static com.google.common.collect.Iterables.filter;
…
private static final Predicate<Creature> isHostile = new Predicate<Creature>() {
    @Override
    public boolean apply(Creature input) {
        return input.alignment == EVIL;
    }
};
…
Iterable<Creature> enemies = filter(allCreatures, isHostile);

 

The Guava library from Google provides some of the useful collections processing methods that are prevalent in Scala. There is some pain in the declaration of the Predicate because we have to wrap the code in an anonymous class. However, after shipping that mess off to a constant somewhere, we get all three benefits: our application logic is declarative and readable; the mechanics of filter are encapsulated elsewhere; and the filter method works on any Iterable or Iterator in a just-in-time fashion. Using Guava in this way, we aren’t tying ourselves to a particular implementation of List.

Caution

Be careful when switching between Iterables and Iterators. Iterables can be viewed, looped over, filtered, and transformed many times. These operations don’t change the state of a List or Set. With an Iterator you only get one shot – call one operation on an Iterator, and consider that variable expired.

The bad news is: this doesn’t give the warm fuzzies we’re use to from the quilting-pattern for-loop. Some of my past colleagues have declared use of Guava Iterables unreadable, because it is unfamiliar to them.

The good news: in Java 8 this will look a lot more like Scala:

Iterable<Creature> enemies = allCreatures.filter(creature -> isHostile(creature));

Two Java 8 features combine to make this possible: extension methods and lambda expressions. Virtual extension methods add to existing interfaces (such as Iterable) without breaking any old code. They do this by writing new methods in terms of existing interface methods. The quilting pattern for-loop will be encapsulated in a filter method on List. Hurray, we never have to write it again! The second killer feature in Java 8 is called lambda expressions. These eliminate the boilerplate in the Guava example, where it took seven lines to declare a Predicate.

In Java 8, we will write code snippets inline. The compiler forms these into whatever anonymous implementation of a single-method interface is required. creature -> isHostile(creature) becomes a Predicate at compilation. The code is much less cluttered, and no import is required. Therefore, if your coworkers complain about Guava collections hurting their poor widdle eyes, tell them to suck it up because a year from now this will be idiomatic, native Java.

Back up a step

That was a whole lot of analysis about a very simple task. It is worthwhile because the same thought process works on other operations on collections. For a quick example, instead of “extract some elements” substitute “pull some piece of data out of the elements.” Let’s take our hostile creatures and translate it into a list of their remaining strength (hit points). The same logic applies.

Listing 4 shows the Java example:

 

List<Integer> getHitPoints(List<Creature> input) {
        List<Integer> output = new LinkedList<Integer>();
        for(creature : input) {
                output.add(creature.getHitPoints());
        }
        return output;
}
List<Integer> enemyStrengths = getHitPoints(enemies);

 

That’s a quilting pattern all right. What we’re doing is turning each list element into something else. The one meaningful piece of code is creature.getHitPoints().

Here’s the Scala. For mathy historical reasons the method that does “translate each element into something else” is called “map.”

val enemyStrengths = enemies.map(_.hitPoints)

Scala programmers don’t like typing, so we use that underscore character to refer to the list element. This is equivalent to:

val enemyLife = enemies.map((creature) => creature.hitPoints)

In Java 7 + Guava, the method is called “transform” (Listing 5):

 

import com.google.common.base.Function;
import static com.google.common.collect.Iterables.transform;
…
    private static final Function<Creature, Integer> getHitPoints = new Function<Creature, Integer>() {
        @Override
        public Integer apply(Creature creature) {
            return creature.hitPoints;
        }
    };
…
  Iterable<Integer> enemyLife = transform(enemies, getHitPoints);

 

This has exactly the same perks as replacing the other quilting pattern: readable application logic, flexibility in concrete implementation, and just-in-time operations. The transforming function won’t be applied until later code requests elements from enemyLife.

Do it once, do it twice, third time generalize 

Here, we took a piece of code so common I could write it with my eyes closed, and stopped writing it over and over. After working with Scala’s collections library, every time typing code gets this easy, I stop and ask myself, “Wait. Has someone done this for me already? How can I do this more generally?”

For example, perhaps I want to categorize my enemies according to character class, mages in one group, ranged fighters in another, melee fighters in a third. I want a map of character classes to lists of enemies. In standard Java I’d create the map by hand, looping through the list. In Scala:

val enemiesByClass = enemies.groupBy(_.characterClass)

Now enemiesByClass is a map whose keys are character classes, and whose values are sequences of Creatures. What kind of sequence? Whatever kind enemies is. It could be a List, a Set, or a streaming Iterator. Dividing a list into categories like this is a general enough operation that someone has implemented it for me. I didn’t have to think about how to instantiate a mutable map and loop through a list and instantiate new lists to go into the map values – all I had to think about was what I wanted to do. My code makes that clear, and my code is better for it.

Guava gives us this same ability in Java 7. Its Multimap is a generalization of maps that hold multiple values per key. To create the one we want:

Multimap<CharacterClass, Creature> enemiesByClass = Multimap.index(enemies, getCharacterClass)

When code is passed around as a parameter, no repetition is too small to be eliminated. This doesn’t mean we should generalize every for loop we see; there’s a balance. But every time I think about writing a for-loop, I ask myself, “Is there a function to do this in one line?” In Scala, the answer is usually “yes.”

Want to combine the elements of a list into one? That’s reduce. It could be a sum or a concatenation or any other operation that turns two elements into one element. Want to find a particular element? Find. Turn a list of lists into one combined list? Flatten. Check whether all elements meet a condition? You get the idea.

Purpose Scala's GenIterable Guava Iterable
Get the first element head getFirst
Get the first element head getFirst
form a comma-separated string mkString(“, ”) Joiner.on(“, ”).join
backwards reverse reverse
search through lists    
get the first element that satisfies a condition find tryFind or firstMatch
get the index of the first element that meets a condition indexWhere (lastIndexWhere) indexOf
ask a question    
Does any element meet this condition? exists any or anyMatch
Do all elements meet this condition? forall all or allMatch
choose a subset    
Forget the first so many elements drop, dropWhile skip
Keep only the first so many elements take limit
Keep elements that do NOT meet a condition filterNot removeIf
divide into smaller collections    
break into collections of a specific size grouped partition
split into ones that meet the predicate and ones that don’t partition  
remove duplicates distinct toImmutableSet
build a new collection    
put collections together concat concat
transform each element into 0 or more elements flatMap transform and then flatten
 
 Table 1: Useful collection functions for Guava and Scala

Table 1 lists some other useful collection functions in Guava and Scala. The only consistency is inconsistency, so look for these same functions in other languages (such as Groovy) under entirely different names.

The payload

Each of these tools is useful, but the benefits compound when you put them together. Let’s get the mage with the fewest hit points. In Scala:

val target = enemiesByClass.get(Mage).map(_.minBy(_.hitPoints))

Guava offers this fluent interface in Java:

Creature target = FluentIterable.from( creaturesByClass.get(CharacterClass.MAGE)) .toImmutableSortedSet(compareByHp).first();

Composition is the silver bullet of Scala and other functional languages. The smaller the functions, the more separated the concerns, the more ways we can assemble them. Going from Java to Scala felt like graduating from Mega Bloks to Lego. There are so many more ways to fit the pieces together!

You don’t have to switch to Scala in order to get the benefits of thinking and coding in smaller, more precise pieces. Use Guava, cross your fingers that Java 8 comes out with lambda expressions in 2013, and consider your options before you embed another quilting patterns in your code. Trust me, it’s more beautiful without them.

Zoom in on the code (1)

FluentIterable.from( creaturesByClass.get(CharacterClass.MAGE)) .toImmutableSortedSet(compareByHp).first();

  • from: Guava’s FluentIterable wraps any Iterable with a fluent interface.

  • get: Multimap’s get method never returns null – only an empty collection.

  • toImmutableSortedSet: Passing a Comparator to a sort method is normal Java – see, this stuff isn’t weird!

  • first: This is a standard Java Set method. It will throw an exception if the set is empty, which will happen after you’ve killed all the enemy mages.

Zoom in on the code (2)

val target = enemiesByClass.get (Mage).map(_.minBy(_.hitPoints))

  • get: Scala’s map method never returns null; instead, you get a collection of 0 or 1 elements, called Option. If there are any mages, the Option will contain a list of them.

  • map: The operation passed to map applies to the list of mages inside the Option.

  • minBy: This will choose the minimum mage, where .hitPoints defines what minimum means. We don’t have to pass a Comparator to minBy; any function that converts a Creature into something Comparable will do.

  • target’s type is Option[Creature]. If the enemy has mages, target contains the weakest one. If there are no mages to fight, target is empty.

Further reading:

Author BioJessica Kerr is a Java-turned-Scala developer in St. Louis, MO. Her passion for learning is countered only by her taste for whisky. She participates in the JetBrains Academy, helps organize the local Java User Group, entertains her two daughters, and tweets as @jessitron.

This article previously featured in JAX Magazine TomEE: The Tomcat you love and more. For that issue and others, check here.

Jessica Kerr
Jessica Kerr

What do you think?

JAX Magazine - 2014 - 05 Exclucively for iPad users JAX Magazine on Android

Comments

Latest opinions