This ain't your grandma's Java

Everything I know about Java collections I learned from Scala

JessicaKerr
Scala-logo

From December’s JAX Magazine, Jessica Kerr explains how using Scala concepts can help us when using plain old Java.

From December’s JAX Magazine, Jessica Kerr explains how using
Scala concepts can help us when using plain old Java.

Why should a Java developer learn a new language? To write
better Java, of course!

Scala, a language that compiles to Java bytecode, enables
stronger object-oriented design and slathers functional support on
top of that. Who would have thought that playing around with Scala
would teach me new ways of working with such a core piece of Java
as Collections?

Techniques and idioms of Scala open whole new lines of reasoning
that apply right back to Java code. This article details a few
lines of reasoning: how they’re exemplified in Scala, how we can
use them in Java, and how we’ll use them even more in Java
soon.

Do you think this is simple?

I’m going to show a piece of Java code, contrast it with Scala,
and then call out three disadvantages of the Java style that I
never thought about before learning Scala. Then I’ll explain what
it is about Scala that makes its superior style possible, and
finally show how we can do the same thing in Java with a little
more effort.

Here’s a common piece of
Java code in
Listing 1. Say we have a list of nearby creatures and we
want a list of all the hostile ones:

Listing 1

 

public List<Creature> getHostileCreatures(List<Creature> allCreatures) {
        List<Creature> output = new LinkedList<Creature>();
        for(creature : allCreatures) {
                if (isHostile(creature)) {
                        output.add(creature);
                }
        }
        return output;
}

 

This code is familiar to any Java programmer. Loop through a
list, pulling out all the elements that meet a condition. It’s all
over the Enterprise Java code I’ve written and maintained for
years. The only part that varies is the conditional expression
(highlighted in bold in Listing 1).

This common recipe is not a design pattern – don’t give it
credit! It is a quilting pattern ie. a section of code that appears repeatedly in many places,
with only specific sections varied. Remember: just because your
grandmother wrote Java this way doesn’t mean you have to.

It’s so familiar we don’t have to read it; we can skip straight to
the conditional, to the one snippet of meaning in these nine
lines.

I’ll show you simple

Reducing the boilerplate of quilting patterns is a strength of
Scala. What are we really trying to do in getHostileCreatures? We’re
going from one list to another with fewer elements; like coffee
grinds the unwanted elements don’t make it through the filter.
Here’s how this looks in Scala:

val enemies =
allCreatures.filter(isHostile)
 

Oh, that’s shorter. It’s declarative: it states what we’re
trying to do – filter some elements out of the list – without
showing how that is done. The immediate benefit of this style is
readability. Unlike Listing 1, this is readable to
people who haven’t spend a decade staring at idiomatic Java code.
One caveat – some people expect filter to eliminate elements that
meet the condition; it seems backwards. However, this syntax
describes the elements in the returned list, such as “creatures
that are hostile.” We’re making a positive statement about the
elements we want.

In addition to readability, there are two other big wins to
Scala’s declarative style here. Follow along.

Simple means the hard stuff is easy

When we separate “what we’re doing” from “how we’re doing it,”
we get more flexibility in the “how.” If this is the big battle
scene at the end, the list of creatures could be huge. We could
break it into pieces and filter in parallel! In Java, that’s way
too much work. But in Scala:

val enemies =
allCreatures.
par.filter(isHostile)

The par property of Scala’s default List returns
a different implementation of List, one whose filter method breaks
the list into sections and filters each section on a different
thread. The operation we’re performing isn’t inherently “loop
through each element and add some to a new list;” that’s an
implementation detail hard-coded into the Java version. Separating
“what” from “how” means the “how” can be customized independently.
This is a separation of concerns. It meets the Prime Directive of
development, the Single Responsibility Principle.

The third superiority of Scala’s filter method lies in this line
from the Java code:

List<Creature>
output = new LinkedList<Creature>();

The particular implementation of List is tightly coupled with
this method. What if lists of creatures are more efficiently
expressed as ArrayList? Or not as lists at all, but as sets or
streams? Every place this quilting pattern was used, a concrete
implementation of List is stitched into our application.

Consider filtering an extremely long list, something read from a
250M file. The style in Listing 1 requires the
entire list in memory all at once; every line must be processed
before the method returns. For serious data processing, this is not
acceptable. Scala’s filter method, on the other hand, is defined on
collection types that work differently, including streams and
iterators. For processing a very long file, I’ll stick with an
Iterator (see Listing 2). 

Listing 2

val source = io.Source.fromFile("creatures.scala")
val linesIterator = source.getLines()
linesIterator.filter(isRelevant).foreach { s =>
   // process and output
}
source.close()

 

The entire file never has to be in memory in this Scala code
(Listing 2). The filter is applied lazily, one row
at a time, as the elements are requested from the filtered
iterator. Elements that were filtered out or have already been
processed can be garbage collected. We can process files of
arbitrary size in limited memory.

What’s going on here?

We’ve seen that Scala’s collection processing idioms are more
readable and more flexible than traditional Java style. There are
many more collection-processing idioms that work the same way.
Before looking at a few of those, let’s explore what it is about
Scala that makes this style both possible and preferred.

val enemies =
allCreatures.filter(isHostile)

Here, isHostile is a function that accepts a Creature and
returns a Boolean.

def
isHostile(creature : Creature) = creature.alignment ==
EVIL

In Scala, functions are first-class citizens, right up there
with primitives and objects in Java. This means functions can be
parameters to other functions. It’s convenient, and highly
idiomatic in Scala to pass functions and methods around like this.
Or to declare our own little anonymous function right inside the
argument list:

val enemies =
allCreatures.filter((creature) =>
isHostile(creature))

It is like being able to pass just the important snippet from
Listing 1isHostile(creature) – into a
method. This is how we separate “what” we’re doing from “how” we’re
doing it. Passing bits of code around enables much more
fine-grained separation of concerns.

That isn’t magic

Now that we’ve defined what’s going on here, we can do it in
Java. One way to pass code is the Strategy design pattern, which
wraps a “how” in a custom object. In list filtration, the Strategy
is “how can I know whether an element belongs in the output?” We
call this function of one argument that returns boolean a
Predicate. This works in Java:

Listing 3

 

import com.google.common.base.Predicate;
import static com.google.common.collect.Iterables.filter;
…
private static final Predicate<Creature> isHostile = new Predicate<Creature>() {
    @Override
    public boolean apply(Creature input) {
        return input.alignment == EVIL;
    }
};
…
Iterable<Creature> enemies = filter(allCreatures, isHostile);

 

The Guava library from Google provides some of the useful
collections processing methods that are prevalent in Scala. There
is some pain in the declaration of the Predicate because we have to
wrap the code in an anonymous class. However, after shipping that
mess off to a constant somewhere, we get all three benefits: our
application logic is declarative and readable; the mechanics of
filter are encapsulated elsewhere; and the filter method works on
any Iterable or Iterator in a just-in-time fashion. Using Guava in
this way, we aren’t tying ourselves to a particular implementation
of List.

Caution

Be careful when switching between Iterables and Iterators.
Iterables can be viewed, looped over, filtered, and transformed
many times. These operations don’t change the state of a List or
Set. With an Iterator you only get one shot – call one operation on
an Iterator, and consider that variable expired.

The bad news is: this doesn’t give the warm fuzzies we’re use to
from the quilting-pattern for-loop. Some of my past colleagues have
declared use of Guava Iterables unreadable, because it is
unfamiliar to them.

The good news: in Java 8 this will look a lot more like
Scala:

Iterable<Creature>
enemies = allCreatures.filter(creature ->
isHostile(creature));

Two Java 8 features combine to make this possible: extension
methods and lambda expressions. Virtual extension methods add to
existing interfaces (such as Iterable) without breaking any old
code. They do this by writing new methods in terms of existing
interface methods. The quilting pattern for-loop will be
encapsulated in a filter method on List. Hurray, we never have to
write it again! The second killer feature in Java 8 is called
lambda expressions. These eliminate the boilerplate in the Guava
example, where it took seven lines to declare a Predicate.

In Java 8, we will write code snippets inline. The compiler
forms these into whatever anonymous implementation of a
single-method interface is required. creature ->
isHostile(creature)
becomes a Predicate at compilation. The
code is much less cluttered, and no import is required. Therefore,
if your coworkers complain about Guava collections hurting their
poor widdle eyes, tell them to suck it up because a year from now
this will be idiomatic, native Java.

Back up a step

That was a whole lot of analysis about a very simple task. It is
worthwhile because the same thought process works on other
operations on collections. For a quick example, instead of “extract
some elements” substitute “pull some piece of data out of the
elements.” Let’s take our hostile creatures and translate it into a
list of their remaining strength (hit points). The same logic
applies.

Listing 4 shows the Java example:

 

List<Integer> getHitPoints(List<Creature> input) {
        List<Integer> output = new LinkedList<Integer>();
        for(creature : input) {
                output.add(creature.getHitPoints());
        }
        return output;
}
List<Integer> enemyStrengths = getHitPoints(enemies);

 

That’s a quilting pattern all right. What we’re doing is turning
each list element into something else. The one meaningful piece of
code is creature.getHitPoints().

Here’s the Scala. For mathy historical reasons the method that
does “translate each element into something else” is called
“map.”

val
enemyStrengths = enemies.map(_.hitPoints)

Scala programmers don’t like typing, so we use that underscore
character to refer to the list element. This is equivalent to:

val enemyLife =
enemies.map((creature) => creature.hitPoints)

In Java 7 + Guava, the method is called “transform”
(Listing 5):

 

import com.google.common.base.Function;
import static com.google.common.collect.Iterables.transform;
…
    private static final Function<Creature, Integer> getHitPoints = new Function<Creature, Integer>() {
        @Override
        public Integer apply(Creature creature) {
            return creature.hitPoints;
        }
    };
…
  Iterable<Integer> enemyLife = transform(enemies, getHitPoints);

 

This has exactly the same perks as replacing the other quilting
pattern: readable application logic, flexibility in concrete
implementation, and just-in-time operations. The transforming
function won’t be applied until later code requests elements from
enemyLife.

Do it once, do it twice, third time generalize 

Here, we took a piece of code so common I could write it with my
eyes closed, and stopped writing it over and over. After working
with Scala’s collections library, every time typing code gets this
easy, I stop and ask myself, “Wait. Has someone done this for me
already? How can I do this more generally?”

For example, perhaps I want to categorize my enemies according
to character class, mages in one group, ranged fighters in another,
melee fighters in a third. I want a map of character classes to
lists of enemies. In standard Java I’d create the map by hand,
looping through the list. In Scala:

val
enemiesByClass = enemies.groupBy(_.characterClass)

Now enemiesByClass is a map whose
keys are character classes, and whose values are sequences of
Creatures. What kind of sequence? Whatever kind enemies is. It could be a
List, a Set, or a streaming Iterator. Dividing a list into
categories like this is a general enough operation that someone has
implemented it for me. I didn’t have to think about how to
instantiate a mutable map and loop through a list and instantiate
new lists to go into the map values – all I had to think about was
what I wanted to do. My code makes that clear, and my code
is better for it.

Guava gives us this same ability in Java 7. Its Multimap is a
generalization of maps that hold multiple values per key. To create
the one we want:

Multimap<CharacterClass,
Creature> enemiesByClass = Multimap.index(enemies,
getCharacterClass)

When code is passed around as a parameter, no repetition is too
small to be eliminated. This doesn’t mean we should generalize
every for loop we see; there’s a balance. But every time I think
about writing a for-loop, I ask myself, “Is there a function to do
this in one line?” In Scala, the answer is usually “yes.”

Want to combine the elements of a list into one? That’s reduce.
It could be a sum or a concatenation or any other operation that
turns two elements into one element. Want to find a particular
element? Find. Turn a list of lists into one combined list?
Flatten. Check whether all elements meet a condition? You get the
idea.

Purpose Scala’s
GenIterable
Guava Iterable
Get the first element head getFirst
Get the first element head getFirst
form a comma-separated string mkString(“, ”) Joiner.on(“, ”).join
backwards reverse reverse
search through lists    
get the first element that satisfies a
condition
find tryFind or firstMatch
get the index of the first element that meets a
condition
indexWhere (lastIndexWhere) indexOf
ask a question    
Does any element meet this condition? exists any or anyMatch
Do all elements meet this condition? forall all or allMatch
choose a subset    
Forget the first so many elements drop, dropWhile skip
Keep only the first so many elements take limit
Keep elements that do NOT meet a condition filterNot removeIf
divide into smaller
collections
   
break into collections of a specific size grouped partition
split into ones that meet the predicate and
ones that don’t
partition  
remove duplicates distinct toImmutableSet
build a new collection    
put collections together concat concat
transform each element into 0 or more
elements
flatMap transform and then flatten
 
 Table 1: Useful collection functions for Guava and
Scala

Table 1 lists some other useful collection
functions in Guava and Scala. The only consistency is
inconsistency, so look for these same functions in other languages
(such as Groovy) under entirely different names.

The payload

Each of these tools is useful, but the benefits compound when
you put them together. Let’s get the mage with the fewest hit
points. In Scala:

val target =
enemiesByClass.get(Mage).map(_.minBy(_.hitPoints))

Guava offers this fluent interface in Java:

Creature target =
FluentIterable.from( creaturesByClass.get(CharacterClass.MAGE))
.toImmutableSortedSet(compareByHp).first();

Composition is the silver bullet of Scala and other functional
languages. The smaller the functions, the more separated the
concerns, the more ways we can assemble them. Going from Java to
Scala felt like graduating from Mega Bloks to Lego. There are so
many more ways to fit the pieces together!

You don’t have to switch to Scala in order to get the benefits
of thinking and coding in smaller, more precise pieces. Use Guava,
cross your fingers that Java 8 comes out with lambda expressions in
2013, and consider your options before you embed another quilting
patterns in your code. Trust me, it’s more beautiful without
them.

Zoom in on the code (1)

FluentIterable.from(
creaturesByClass.get(CharacterClass.MAGE))
.toImmutableSortedSet(compareByHp).first();

  • from: Guava’s FluentIterable wraps any
    Iterable with a fluent interface.

  • get: Multimap’s get method never
    returns null – only an empty collection.

  • toImmutableSortedSet: Passing a
    Comparator to a sort method is normal Java – see, this stuff isn’t
    weird!

  • first: This is a
    standard Java Set method. It will throw an exception if the set is
    empty, which will happen after you’ve killed all the enemy
    mages.

Zoom in on the code (2)

val
target = enemiesByClass.get
(Mage).map(_.minBy(_.hitPoints))

  • get: Scala’s
    map method never returns null; instead, you get a collection of 0
    or 1 elements, called Option. If there are any mages, the Option
    will contain a list of them.

  • map: The operation passed to
    map applies to the list of mages inside the Option.

  • minBy:
    This will choose the minimum mage, where
    .hitPoints
    defines what minimum means. We don’t
    have to pass a Comparator to
    minBy; any function that converts a Creature into
    something Comparable will do.

  • target’s
    type is Option[Creature]. If the
    enemy has mages,
    target contains the weakest one. If there are no mages
    to fight,
    target is empty.

Further reading:

Author BioJessica Kerr is a
Java-turned-Scala developer in St. Louis, MO. Her passion for
learning is countered only by her taste for whisky. She
participates in the JetBrains Academy, helps organize the local
Java User Group, entertains her two daughters, and tweets as
@jessitron.

This article previously featured in JAX Magazine TomEE: The
Tomcat you love and more. For that issue and others, check
here
.

Author
JessicaKerr
Jessica Kerr is a Java-turned-Scala developer in St. Louis, MO. Her passion for learning is countered only by her taste for whisky. She participates in the JetBrains Academy, helps organize the local Java User Group, entertains her two daughters, and tweets as @jessitron.
Comments
comments powered by Disqus