A Lot to Explore

Java 7 – Project Coin

Ben Evans

Welcome to Java 7. Things around here are a little different
than you may be used to! This is a really good thing – we have a
lot to explore now that the dust has settled and Java 7 is on its
way. We’re going to warm you up with a gentle introduction to Java
7, but one that still acquaints you with its powerful features.
We’ll showcase Project Coin, a collection of small yet effective
new features. You’ll learn new syntax, such as an improved way of
handling exceptions (multi-catch). You’ll also learn about
try-with-resources and how it helps you to avoid bugs in the code
that deals with files or other resources such as JDBC connections.
By the end of this article, you’ll be writing Java in a new way and
you’ll be fully primed and ready for larger changes in Java 7 such
as NIO.2. So, let’s get going, shall we?

We’re going to talk in some detail about some of the proposals
in Project Coin. We’ll discuss the syntax and the meaning of the
new features and also some of the “whys” – that is, we’ll try to
explain the motivations behind the feature whenever possible
without resorting to the full formal details of the proposals. All
that material is available from the archives of the coin-dev
mailing list so, if you’re a budding language designer, you can
read the full proposals and discussion there. Without further ado,
let’s kick off with our very first new Java 7 feature – string
values in a switch statement.

Strings in Switch

Java’s switch statement allows you to write an efficient
multiple-branch statement without lots and lots of ugly nested ifs,
like this:
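A minimal sketch of such a statement follows. The class name, method name, and day numbering are assumptions made for illustration and are not taken from the original listing.

```java
class IntSwitchDemo {
    // Maps a day number to a name; the numbering (1 = Monday) is hypothetical.
    static String dayName(int dayOfWeek) {
        switch (dayOfWeek) {
            case 1: return "Monday";
            case 2: return "Tuesday";
            case 3: return "Wednesday";
            case 4: return "Thursday";
            case 5: return "Friday";
            case 6: return "Saturday";
            case 7: return "Sunday";
            default:
                throw new IllegalArgumentException("Unknown day: " + dayOfWeek);
        }
    }

    public static void main(String[] args) {
        System.out.println(dayName(3));
    }
}
```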

In Java 6 and earlier versions, the
values for the cases can only be constants of type byte, char,
short, and int (or, technically, their reference-type equivalents,
Byte, Character, Short, and Integer) or enum constants. With Java
7, the spec has been extended to allow for Strings to be used as
well – they’re constants after all:
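As a rough illustration of the Java 7 form, the sketch below switches directly on a String. The class and method names are hypothetical, and the grouping of days is just an example.

```java
class StringSwitchDemo {
    // Classifies a day name; several case labels can share one branch.
    static String kindOfDay(String dayOfWeek) {
        switch (dayOfWeek) {
            case "Saturday":
            case "Sunday":
                return "weekend";
            case "Monday":
            case "Tuesday":
            case "Wednesday":
            case "Thursday":
            case "Friday":
                return "weekday";
            default:
                throw new IllegalArgumentException("Unknown day: " + dayOfWeek);
        }
    }

    public static void main(String[] args) {
        System.out.println(kindOfDay("Sunday"));
        System.out.println(kindOfDay("Tuesday"));
    }
}
```

Two details are worth keeping in mind: the comparison is case-sensitive (it behaves like String.equals()), and switching on a null String throws a NullPointerException.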

 

 

In all other respects, the switch
statement remains the same; like many Project Coin changes, this
really is a very simple change for making life in Java 7 a little
easier.

 

Enhanced syntax for numeric
literals

 

Several proposals offered new syntax
for integers. The aspects that were ultimately chosen were:

• Numeric constants (one of the
integer primitive types) expressed as binary values.

• Use of underscores in integer
constants for readability.

A proposed suffix to denote that an integer constant has type
short or byte did not make it into the final release.

Neither of these is particularly
earth-shattering at first sight but both have, in their own way,
been a minor annoyance to the Java programmer.

The first is of special
interest to the low-level programmer – the sort of person who works
with raw network protocols, encryption, or other pursuits where
they may have to indulge in a certain amount of bit twiddling. So
let’s take a look at that first.

 

Binary Literals

 

Before Java 7, if you’d wanted to
manipulate a binary value, you’d either have had to engage in
awkward (and error-prone) base conversion or write an expression
like this:

 

int x = Integer.parseInt("1100110", 2);

 

This is a lot of typing just to
ensure that x ends up with that bit pattern (which is 102 in
decimal, by the way). There’s worse to come though. Despite looking
fine at first glance, there are a number of problems. It:

 

• Is really verbose.

 

• Has a performance hit for that
method call.

 

• Means you’d have to know about the
two-argument form of parseInt().

 

• Requires you to remember the
detail of how parseInt() behaves when it has two args.

 

• Makes life hard for the JIT
compiler.

 

• Is representing a compile-time
constant as a runtime expression (so it can’t be used as a value in
a switch statement).

 

• Will give you a runtime exception
(but no compile-time exception) if you get a typo in the binary
value.

 

Fortunately, with the advent of Java
7, we can now write:

int x = 0b1100110;

 

Now, no one’s saying that this is
doing anything that couldn’t be done before, but it has none of the
problems we listed above. So, if you’ve got a reason to work with
binary values – for example, low-level handling of bytes, where you
can now have bit-patterns as binary constants in switch statements
– this is one small feature that you might find helpful.
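To make the comparison concrete, here is a small sketch that puts the two approaches side by side and uses binary constants as case labels. The flag names and values are invented for the example.

```java
class BinaryLiteralDemo {
    // Binary literals are compile-time constants, so they are legal case labels.
    static final int READ    = 0b001;
    static final int WRITE   = 0b010;
    static final int EXECUTE = 0b100;

    static String describe(int flag) {
        switch (flag) {
            case READ:    return "read";
            case WRITE:   return "write";
            case EXECUTE: return "execute";
            default:      return "unknown";
        }
    }

    public static void main(String[] args) {
        int parsed  = Integer.parseInt("1100110", 2); // converted at runtime
        int literal = 0b1100110;                      // fixed at compile time
        System.out.println(parsed == literal);
        System.out.println(describe(0b010));
    }
}
```

A typo such as 0b1100112 is rejected by the compiler, whereas the same typo inside the string passed to parseInt() only shows up as a NumberFormatException at runtime.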

 

Underscores in
numbers

 

You’ve probably noticed that the
human mind is really quite radically different from a computer’s
CPU. One specific example of this is in the way that our minds
handle numbers. Humans aren’t in general very comfortable with long
strings of numbers. That’s one reason we invented the hexadecimal
system – because our minds find it easier to deal with shorter
strings that contain more information rather than long strings
containing little information per character.

That is, we find 1c372ba3 easier to
deal with than 00011100001101110010101110100011, even though a CPU
would really only ever see the second form.

 

One way that we humans deal with
long strings of numbers is to break them up. A US phone number is
usually represented like this:

 

404-555-0122

 

Other long strings of numbers have
separators too:

 

$100,000,000 (Large sums of
money)

08-92-96 (UK banking sort codes)

 

Unfortunately, both , and - have too
many possible meanings within the realm of handling numbers while
programming, so we can’t use either of those as a separator.
Instead, the Project Coin proposal borrowed an idea from Ruby and
introduced the underscore, _, as a separator. Note that this is
just a bit of easy-on-the-eyes compile time syntax – the compiler
just strips out those underscores and stores the usual digits.

 

So, you can write 100_000_000 and
you should hopefully not confuse that with 10_000_000 (unlike
100000000, which is easily confused with 10000000). Or, to apply
this to our own examples:

 

long l2 = 2_147_483_648L;

int bitPattern = 0b0001_1100__0011_0111__0010_1011__1010_0011;

 

Notice how much easier it is to read
the value assigned to l2. (Yes, it’s 2G, and it’s too big to fit
into an int.) By now, you should be convinced of the benefit of
these tweaks to the handling of integers, so let’s move on.
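If you want to convince yourself that the underscores really are stripped out, a sketch like the following compares the underscored forms with their plain equivalents. The constant names are made up for the example.

```java
class UnderscoreDemo {
    static final int  ONE_HUNDRED_MILLION = 100_000_000;
    static final long TWO_GIG             = 2_147_483_648L;
    // The same bit pattern written in binary and in hexadecimal.
    static final int  BIT_PATTERN = 0b0001_1100_0011_0111_0010_1011_1010_0011;
    static final int  HEX_PATTERN = 0x1c37_2ba3;

    public static void main(String[] args) {
        System.out.println(ONE_HUNDRED_MILLION == 100000000);
        System.out.println(TWO_GIG == Integer.MAX_VALUE + 1L);
        System.out.println(BIT_PATTERN == HEX_PATTERN);
    }
}
```

Underscores may only appear between digits, so forms such as _100 (which is an identifier), 100_ or 0b_1 are not valid literals.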

     
   

 

Improved exception
handling

 

There are two parts to this
improvement – multi-catch and, effectively, final rethrow. To see
why they’re helpful, consider the following Java 6 code, which
tries to find, open, and parse a configuration file and handles a
number of different possible exceptions, as shown in listing 1.
getConfig() is a method that can encounter a number of different
exceptional conditions:

 

• The configuration file may not
exist.

• It may disappear while we’re
trying to read from it.

• It may be malformed
syntactically.

• It may have invalid information in
it.

 

The exceptions really fit into two
distinct functional groups. Either the file is missing or bad in
some way or the file is present and correct in theory but was not
retrievable (perhaps because of hardware failure or network
outage). It would be nice to compress the cases down into just
these two cases.

 

Java 7 allows us to do this, as
shown in listing 2.

 

Note that the exception has to be
handled in the catch block as the common supertype of the possible
exceptions (which will usually be Exception or Throwable in
practice) because the exact type is not knowable at
compile-time.
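The sketch below is not the original listing. It is a self-contained approximation in which a hypothetical load() method simulates the failure modes, so that the two catch groups can be seen in action.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.text.ParseException;

class MultiCatchDemo {
    // Hypothetical loader: the 'scenario' argument selects which failure to simulate.
    static String load(String scenario) throws IOException, ParseException {
        switch (scenario) {
            case "missing":    throw new FileNotFoundException("no such file");
            case "malformed":  throw new ParseException("bad syntax", 0);
            case "unreadable": throw new IOException("disk error");
            default:           return "key=value";
        }
    }

    static String getConfig(String scenario) {
        try {
            return load(scenario);
        } catch (FileNotFoundException | ParseException e) {
            // Group 1: the file is missing or bad in some way.
            return "missing or bad: " + e.getMessage();
        } catch (IOException e) {
            // Group 2: the file should be fine but could not be retrieved.
            return "not retrievable: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(getConfig("missing"));
        System.out.println(getConfig("unreadable"));
    }
}
```

Inside the multi-catch block, e can only be used through methods common to both alternatives, which is why the sketch sticks to getMessage().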

 

An additional bit of new syntax is
for helping with rethrowing exceptions. In many cases, developers
may want to manipulate a thrown exception before rethrowing it. The
problem is that, in previous versions of Java, code like this:

 

 

will force the programmer to declare
the exception signature of this code as Throwable – the real
dynamic type of the exception has been swallowed. However, it’s
relatively easy to see that the exception can only be an
IOException or a SQLException and, if we can see it, then so can
the compiler. In this snippet, we’ve made a single word change to
use the new Java 7 syntax:

 

 

The appearance of the final keyword
indicates that the type that is actually thrown is the runtime type
of the exception that was actually encountered – in this example,
this would either be IOException or SQLException. This is referred
to as “final rethrow” and can protect against throwing an overly
general type here, which then has to be caught by a very general
catch in a higher scope. Enhancements in the compiler mean that the
final keyword is actually optional, but we’ve found that, while
starting out with this feature, it’s actually easier to include
it.
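As an illustration of the idea (again, not the original snippet), the following sketch catches a general Exception but declares only the two precise checked types. The helper methods and their boolean switches are hypothetical.

```java
import java.io.IOException;
import java.sql.SQLException;

class RethrowDemo {
    // Hypothetical operations that fail on request.
    static void mightThrowIo(boolean fail) throws IOException {
        if (fail) throw new IOException("io problem");
    }

    static void mightThrowSql(boolean fail) throws SQLException {
        if (fail) throw new SQLException("sql problem");
    }

    // The signature lists the precise types, not Exception or Throwable.
    static void process(boolean failIo, boolean failSql)
            throws IOException, SQLException {
        try {
            mightThrowIo(failIo);
            mightThrowSql(failSql);
        } catch (final Exception e) {
            System.err.println("Rethrowing " + e.getClass().getSimpleName());
            throw e; // the compiler works out that only the two types above are possible
        }
    }
}
```

A caller can then write separate catch blocks for IOException and SQLException instead of one very general handler.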

 

In addition to these general
improvements to exception handling, the specific case of handling
resources has been improved in 7 – so that’s where we’ll turn
next.

 

Try-with-resources


This change is easy to explain but
has proven to have hidden subtleties, which made it much less easy
to implement than originally hoped. The basic idea is to allow a
resource (for example, a file or something a bit like one) to be
scoped to a block in such a way that the resource is automatically
closed when control exits the block.

 

 

This is an important change for the
simple reason that virtually no one gets the manual handling of
resource closing 100 percent right. Until recently, even the
reference how-to from Sun was wrong. The proposal submitted to
Project Coin for this change includes the astounding claim that two
thirds of the uses of close() in the JDK had bugs in them!

 

Fortunately, compilers can be made
to excel at producing exactly the sort of pedantic, boilerplate
code that humans so often get wrong, and that’s the approach taken
by this change, which is usually referred to as
try-with-resources.

 

This is a big help in writing
error-free code. To see just how much, consider how you would write
a block of code in order to read from a URL-based stream and
write to a file with Java 6. It would look something like what is
shown in listing 3.

 

 

The key point here is that, when
handling external resources, Murphy’s Law applies – anything can go
wrong at any time:

 

1. The InputStream can fail:

• To open from the URL.

• To read from it.

• To close properly.

 

2. The file corresponding to the
OutputStream can fail:

• To open.

• To write to it.

• To close properly.

3. Or some combination of more
than one of the above can go wrong.

This last possibility is actually
where a lot of the headaches come from – the possibility of some
combination of exceptions is very difficult to deal with well.
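To give a feel for what the failure modes above mean in pre-Java 7 code, here is a simplified sketch of manual resource handling. It is not the original listing: it takes an already opened InputStream as a parameter (in the scenario described, that stream would come from the URL), and the class and method names are hypothetical.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class ManualCloseDemo {
    // Java 6 style: nested try/finally so that both streams get closed.
    static void copy(InputStream in, File target) throws IOException {
        try {
            OutputStream out = new FileOutputStream(target);
            try {
                byte[] buf = new byte[4096];
                int len;
                while ((len = in.read(buf)) >= 0) {
                    out.write(buf, 0, len);
                }
            } finally {
                out.close(); // if this throws, it hides any exception from the loop
            }
        } finally {
            in.close(); // if this throws, it hides everything that failed earlier
        }
    }
}
```

Even this version quietly loses information: an exception thrown by either close() call replaces whatever went wrong before it.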

 

Let’s consider some Java 7 code for
saving code from the web. As the name suggests, url is a URL object
that points at the entity we want to download, and file is a File
object where we want to save what we’re downloading. Let’s look at
Listing 4.

 

 

 

This basic form shows the new syntax
for a block with automatic management – the try with the resource
in round brackets. For C# programmers, this is probably a bit
reminiscent of a using clause and that’s a good starting point when
working with this new feature. The resources are used by the block
and then automatically disposed of when you’re done with them. You
still need to worry about handling exceptions with regard to
finding the valid resource in the first place, but, once you’re
using it, the resource gets automatically closed.
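A rough sketch of this form is shown here. It is an approximation rather than the original listing: the method takes an InputStream instead of opening the URL itself, and the names save, source, and file are assumptions.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class TryWithResourcesDemo {
    // Copies 'source' into 'file' and returns the number of bytes written.
    // Both streams are closed automatically, in reverse order of declaration.
    static long save(InputStream source, File file) throws IOException {
        try (InputStream in = source;
             OutputStream out = new FileOutputStream(file)) {
            byte[] buf = new byte[4096];
            long total = 0;
            int len;
            while ((len = in.read(buf)) >= 0) {
                out.write(buf, 0, len);
                total += len;
            }
            return total;
        }
    }
}
```

With a URL object to hand, a call might look like save(url.openStream(), file). Any class that implements java.lang.AutoCloseable can be managed this way.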

 

This is the main reason for
preferring the new syntax – it’s just much less error prone – the
compiler is not susceptible to the mistakes that basically every
developer will make when trying to write this type of code
manually.

 

Diamond syntax

 

One of the problems with generics is
that the definitions and the setup of instances can be really
verbose. Let’s suppose that you have some users, whom you identify
by a user id (which is an integer), and each user has some lookup
tables, and the tables are specific to each user. What would that
look like in code?
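Assuming each lookup table maps String keys to String values, the Java 6 declaration might look something like this sketch. The wrapper class and method are only there to make it compile on its own.

```java
import java.util.HashMap;
import java.util.Map;

class VerboseGenericsDemo {
    static Map<Integer, Map<String, String>> newUsersLists() {
        // Java 6 style: the type arguments are spelled out on both sides.
        Map<Integer, Map<String, String>> usersLists =
                new HashMap<Integer, Map<String, String>>();
        return usersLists;
    }
}
```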

 

That’s quite a mouthful, and almost
half of it is just duplicated characters. Wouldn’t it be better if
we could just write something like the code below, and have the
compiler just infer the type information on the right hand
side?

 

Map<Integer, Map<String, String>> usersLists = new HashMap<>();

 

Thanks to the magic of Project Coin
– you can. In Java 7, the shortened form for declarations like that
is entirely legal. It’s backwards compatible as well so, when you
find yourself revisiting old code, you can just cut the older, more
verbose declaration and start using the new type-inferred syntax to
save a few pixels.
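Here is a small sketch of the diamond form in use, assuming the lookup tables map String keys to String values. The user id and table contents are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

class DiamondDemo {
    static Map<Integer, Map<String, String>> buildLookups() {
        // The compiler infers <Integer, Map<String, String>> from the left-hand side.
        Map<Integer, Map<String, String>> usersLists = new HashMap<>();

        Map<String, String> lookup = new HashMap<>(); // inferred as <String, String>
        lookup.put("theme", "dark");
        usersLists.put(42, lookup);
        return usersLists;
    }

    public static void main(String[] args) {
        System.out.println(buildLookups().get(42).get("theme"));
    }
}
```

Note that new HashMap<>() is not the same as the raw new HashMap(): the diamond still produces a fully typed instance, so no unchecked warning is involved.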

 

Simplified varargs method
invocation

 

This is one of the simplest changes
of all – it just moves a warning about type information for quite a
specific case where varargs combines with generics in a method
signature. Put another way, unless you’re in the habit of writing
code that takes as arguments a variable number of references of
type T and does something to make a collection out of them, such as
code that looks like this:

 

 

 

Then you can move on to the next
section. Still here? Good. So what’s this issue all about?

Well, as you probably already know,
a varargs method is one that takes a variable number of parameters
(all of the same type) at the end of the argument list. What you
may not know is how varargs is implemented. All of the variable
parameters at the end are put into an array (which the compiler
automatically creates for you) and are passed as a single
parameter.
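You can see the array behind the curtain with a sketch like this one (the class and method names are hypothetical). Inside the method, the varargs parameter is an ordinary array, and a caller may even pass an array directly.

```java
class VarargsDemo {
    // 'items' is really a String[] that the compiler builds at each call site.
    static int count(String... items) {
        return items.length;
    }

    public static void main(String[] args) {
        System.out.println(count("a", "b", "c"));             // array of three
        System.out.println(count(new String[] {"a", "b"}));   // array passed as-is
        System.out.println(count());                          // empty array, not null
    }
}
```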

 

This is all well and good, but here
we run into one of the admitted weaknesses of Java’s generics – you
are not normally allowed to create an array of a known generic
type. So, this:

 

HashMap<String, String>[] arryHm = new HashMap<>[2];

 

Won’t compile; you can’t make arrays
of a specified generic type. Instead, you have to do this:

 

HashMap<String, String>[] warnHm = new HashMap[2];

 

Which gives a warning that has to be
ignored. Notice that you can define warnHm to be of the type array
of HashMap<String, String>. You just can’t create any
instances of that type and, instead, have to hold your nose (or at
least, suppress the warning) and force an instance of the raw type
(which is array of HashMap) into warnHm.

 

These two features – varargs methods
really working on the synthetic arrays that the compiler conjures
up and arrays of known generic types not being valid instantiable
types – come together to cause us a slight headache. Consider this
bit of code:

 

 

 

The compiler will attempt to create
an array to contain hm1 and hm2, but the type of the array should
strictly be one of the forbidden array types. Faced with this
dilemma, the compiler basically cheats and breaks its own rule
about the forbidden array of generic type. It creates the array
instance but grumbles about it, producing a compiler warning that
mutters darkly about “uses unchecked or unsafe operations.”
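The situation can be reproduced with a sketch along these lines. The body of doSomething() is an assumption (it simply collects its arguments into a list), and the code is written in Java 6 style. It compiles and runs, but the compiler should report unchecked warnings for it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class GenericVarargsDemo {
    // Hypothetical API: gathers a variable number of T references into a list.
    static <T> List<T> doSomething(T... entries) {
        List<T> result = new ArrayList<T>();
        for (T entry : entries) {
            result.add(entry);
        }
        return result;
    }

    static int demo() {
        HashMap<String, String> hm1 = new HashMap<String, String>();
        HashMap<String, String> hm2 = new HashMap<String, String>();
        // Behind the scenes the compiler has to create a HashMap<String, String>[]
        // here, which is one of the "forbidden" array types.
        List<HashMap<String, String>> all = doSomething(hm1, hm2);
        return all.size();
    }
}
```

Compiling with -Xlint:unchecked shows the detailed messages rather than the one-line summary.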

 

From the point of view of the type
system, this is fair enough. However, the poor developer just
wanted to use what seemed like a perfectly sensible API and now
there are these scary-sounding warnings for no adequately explained
reason.

 

What’s changed in Java
7


Java 7 brought a change in the
emphasis of the warning. After all, there is a potential for
violating type safety in these types of constructions, and somebody
had better be informed about them. There’s not much that the users
of these types of APIs can really do, though. Either the code
inside doSomething() is evil and violates type safety or it
doesn’t. In any case, it’s out of the developer’s hands.

 

The person who should really be
warned about this issue is the person who wrote doSomething() – the
API producer, rather than the consumer. So that’s where the warning
goes – it’s moved from the site of the API use (the warning used to
be triggered when the code that used the API was compiled) to the
site where the API was defined (so the warning is now triggered
when an API is written, which has the possibility to trigger this
kind of potential type safety violation). The compiler warns the
coder implementing the API and it’s up to them to pay proper
attention to the type system.
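One detail that goes slightly beyond the description above: in the released JDK 7, an API author who has checked that the method body is safe can say so with the @SafeVarargs annotation (java.lang.SafeVarargs), which also silences the unchecked warning for callers. The sketch below assumes the same hypothetical doSomething() method, whose body only reads from the array.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class SafeVarargsDemo {
    // The author asserts that nothing unsafe is done with 'entries'.
    // @SafeVarargs is allowed on static methods, final methods, and constructors.
    @SafeVarargs
    static <T> List<T> doSomething(T... entries) {
        List<T> result = new ArrayList<>();
        for (T entry : entries) {
            result.add(entry);
        }
        return result;
    }

    static int demo() {
        HashMap<String, String> hm1 = new HashMap<>();
        HashMap<String, String> hm2 = new HashMap<>();
        // No unchecked warning is expected at this call site.
        List<HashMap<String, String>> all = doSomething(hm1, hm2);
        return all.size();
    }
}
```

The annotation is a promise rather than a check, so it only belongs on methods that never store anything into the varargs array or let it escape.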

 

Changes to the type
system

 

That’s an awful lot of words to
describe a very small change. Moving a warning from one place to
another is hardly a game-changing language feature, but it does
serve to illustrate one very important point. Project Coin
encouraged contributors to mostly try to stay away from the type
system when proposing changes.

This example shows how involved you
need to get when figuring out how different features of the type
system interact, and how that interaction will alter when a change
to the language is implemented. This isn’t even a particularly
complex change; larger changes would be far, far more involved with
potentially dozens of subtle ramifications.

 

This final example illustrates how
intricate the effect of small changes can be and completes our
discussion of the changes brought in by Project Coin. Although they
represent mostly small syntactic changes, once you’ve started using
them in practice, you will probably find that they have a positive
impact on your code that is out of proportion with the size of the
change.

 

Summary

 

This article has been all about
introducing some of the smaller changes in the syntax for Java 7.
You saw that, although the changes are not earth-shattering, Java 7
will be a little bit easier to write in a more concise and
error-free manner. You also learned that there can be challenges
that cause language designers to make smaller and more conservative
changes than they might otherwise wish.

 

We hope you enjoyed this article and
look forward to discussing more about Java 7 and polyglot
programming on the JVM with you in a pub near you soon!

 


Author
Ben Evans
Ben Evans is a member of the Java SE/EE Executive Committee, helping define standards for the Java ecosystem. He works as a technical architect and development lead in the financial industry. He is an organizer for the UK Graduate Developer Community, a co-leader of the London Java Community and a regular speaker on Java, concurrency, new programming languages and related topics.