Java 7 – Project Coin
Welcome to Java 7. Things around here are a little different than you may be used to! This is a really good thing – we have a lot to explore now that the dust has settled and Java 7 is on its way. We’re going to warm you up with a gentle introduction to Java 7, but one that still acquaints you with its powerful features. We’ll showcase Project Coin, a collection of small yet effective new features. You’ll learn new syntax, such as an improved way of handling exceptions (multi-catch). You’ll also learn about try-with-resources and how it helps you to avoid bugs in the code that deals with files or other resources such as JDBC. By the end of this article, you’ll be writing Java in a new way and you’ll be fully primed and ready for larger changes in Java 7 such as NIO.2. So, let’s get going, shall we!
We’re going to talk in some detail about some of the proposals in Project Coin. We’ll discuss the syntax and the meaning of the new features and also some of the “whys” – that is, we’ll try to explain the motivations behind the feature whenever possible without resorting to the full formal details of the proposals. All that material is available from the archives of the coin-dev mailing list so, if you’re a budding language designer, you can read the full proposals and discussion there. Without further ado, let’s kick off with our very first new Java 7 feature – string values in a switch statement.
Strings in Switch
Java’s switch statement allows you to write an efficient multiple-branch statement without lots and lots of ugly nested ifs, like this:
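A minimal sketch of such a statement, as it could be written in Java 6. The DayKind class and its day numbering (1 for Monday through 7 for Sunday) are assumptions made for illustration, not part of the original article:

```java
class DayKind {
    // Classifies a day number, assuming 1 = Monday ... 7 = Sunday.
    static String kindOfDay(int dayOfWeek) {
        switch (dayOfWeek) {
            case 1: case 2: case 3: case 4: case 5:
                return "weekday";
            case 6: case 7:
                return "weekend";
            default:
                throw new IllegalArgumentException("Not a day of the week: " + dayOfWeek);
        }
    }
}
```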
In Java 6 and earlier versions, the values for the cases can only be constants of type byte, char, short, and int (or, technically, their reference-type equivalents, Byte, Character, Short, and Integer) or enum constants. With Java 7, the spec has been extended to allow for Strings to be used as well – they’re constants after all:
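As a rough illustration, a switch over String values might look like the sketch below. The DayKindByName class is hypothetical. Case labels are matched by string equality, so the comparison is case sensitive, and switching on a null String throws a NullPointerException:

```java
class DayKindByName {
    // Classifies a day by its English name; matching is case sensitive.
    static String kindOfDay(String dayName) {
        switch (dayName) {
            case "Monday":
            case "Tuesday":
            case "Wednesday":
            case "Thursday":
            case "Friday":
                return "weekday";
            case "Saturday":
            case "Sunday":
                return "weekend";
            default:
                throw new IllegalArgumentException("Not a day of the week: " + dayName);
        }
    }
}
```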
In all other respects, the switch statement remains the same; like many Project Coin changes, this really is a very simple change for making life in Java 7 a little easier.
Enhanced syntax for numeric literals
Several proposals offered new syntax for integers. The aspects that were ultimately chosen were:
• Numeric constants (one of the integer primitive types) expressed as binary values.
• A specific suffix to denote that an integer constant has type short or byte (this was proposed, but it did not make it into the final Java 7 release).
• Use of underscores in integer constants for readability.
None of these is particularly earth-shattering at first sight but all have, in their own way, been a minor annoyance to the Java programmer.
The first two are of special interest to the low-level programmer – the sort of person who works with raw network protocols, encryption, or other pursuits where they may have to indulge in a certain amount of bit twiddling. So let’s take a look at those first.
Before Java 7, if you’d wanted to manipulate a binary value, you’d either have had to engage in awkward (and error-prone) base conversion or write an expression like this:
int x = Integer.parseInt("1100110", 2);
This is a lot of typing just to ensure that x ends up with that bit pattern (which is 102 in decimal, by the way). There’s worse to come though. Despite looking fine at first glance, there are a number of problems. It:
• Is really verbose.
• Has a performance hit for that method call.
• Means you’d have to know about the two-argument form of parseInt().
• Requires you to remember the detail of how parseInt() behaves when it has two args.
• Makes life hard for the JIT compiler.
• Is representing a compile-time constant as a runtime expression (so it can’t be used as a value in a switch statement).
• Will give you a runtime exception (but no compile-time error) if you make a typo in the binary value.
Fortunately, with the advent of Java 7, we can now write:
int x = 0b1100110;
Now, no one’s saying that this is doing anything that couldn’t be done before, but it has none of the problems we listed above. So, if you’ve got a reason to work with binary values – for example, low-level handling of bytes, where you can now have bit-patterns as binary constants in switch statements – this is one small feature that you might find helpful.
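To make the point about switch statements concrete, here is a small sketch. The BinaryLiterals class and the permission bit patterns are invented for illustration; only the literal 0b1100110 and the parseInt() call come from the text above:

```java
class BinaryLiterals {
    // Hypothetical permission bits, chosen only for illustration.
    static String describe(int permissionBits) {
        switch (permissionBits) {
            // Binary literals are compile-time constants, so they can be case labels.
            case 0b100: return "read";
            case 0b110: return "read/write";
            case 0b111: return "read/write/execute";
            default:    return "other";
        }
    }

    // The literal and the parseInt() call from the text produce the same value.
    static boolean sameValue() {
        return 0b1100110 == Integer.parseInt("1100110", 2);
    }
}
```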
Underscores in numbers
You’ve probably noticed that the human mind is really quite radically different from a computer’s CPU. One specific example of this is in the way that our minds handle numbers. Humans aren’t in general very comfortable with long strings of numbers. That’s one reason we invented the hexadecimal system – because our minds find it easier to deal with shorter strings that contain more information rather than long strings containing little information per character.
That is, we find 1c372ba3 easier to deal with than 00011100001101110010101110100011, even though a CPU would really only ever see the second form.
One way that we humans deal with long strings of numbers is to break them up. A US phone number, for instance, is usually written as three groups of digits separated by hyphens.
Other long strings of numbers have separators too:
$100,000,000 (Large sums of money)
08-92-96 (UK banking sort codes)
Unfortunately, both , and - have too many possible meanings within the realm of handling numbers while programming, so we can’t use either of those as a separator. Instead, the Project Coin proposal borrowed an idea from Ruby and introduced the underscore, _, as a separator. Note that this is just a bit of easy-on-the-eyes compile-time syntax: the compiler simply strips out the underscores and stores the usual digits.
So, you can write 100_000_000 and you should hopefully not confuse that with 10_000_000 (unlike 100000000, which is easily confused with 10000000). Or, to apply this to our own examples:
long l2 = 2_147_483_648L;
int bitPattern = 0b0001_1100__0011_0111__0010_1011__1010_0011;
Notice how much easier it is to read the value assigned to l2. (Yes, it’s 2G, and it’s too big to fit into an int.) By now, you should be convinced of the benefit of these tweaks to the handling of integers, so let’s move on.
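A quick way to convince yourself that the underscores change nothing about the stored value is a check like the following sketch. The UnderscoreLiterals class is hypothetical; the two literals are the ones shown above:

```java
class UnderscoreLiterals {
    static final long l2 = 2_147_483_648L;
    static final int bitPattern = 0b0001_1100__0011_0111__0010_1011__1010_0011;

    // Each comparison pairs an underscored literal with its plain equivalent.
    static boolean underscoresAreIgnored() {
        return 100_000_000 == 100000000
            && l2 == Integer.MAX_VALUE + 1L   // one more than the largest int
            && bitPattern == 0x1c372ba3;      // the hex form used earlier in the text
    }
}
```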
Improved exception handling
There are two parts to this improvement – multi-catch and, effectively, final rethrow. To see why they’re helpful, consider the following Java 6 code, which tries to find, open, and parse a configuration file and handles a number of different possible exceptions, as shown in listing 1. getConfig() is a method that can encounter a number of different exceptional conditions:
• The configuration file may not exist.
• It may disappear while we’re trying to read from it.
• It may be malformed syntactically.
• It may have invalid information in it.
The exceptions really fit into two distinct functional groups. Either the file is missing or bad in some way or the file is present and correct in theory but was not retrievable (perhaps because of hardware failure or network outage). It would be nice to compress the cases down into just these two cases.
Java 7 allows us to do this, as shown in listing 2.
Note that the exception has to be handled in the catch block as the common supertype of the possible exceptions (which will usually be Exception or Throwable in practice) because the exact type is not knowable at compile-time.
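The original listings are not reproduced here, so the following is only a sketch of the two styles discussed above. The ConfigLoader class, the ConfigurationException type, and the one-line "port=<number>" file format are all assumptions made for illustration. loadJava6() repeats the same handler three times, while loadJava7() collapses those cases into a single multi-catch clause:

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.text.ParseException;

class ConfigLoader {
    /** Hypothetical exception: the file parsed, but a value is out of range. */
    static class ConfigurationException extends Exception {
        ConfigurationException(String message) { super(message); }
    }

    // Hypothetical format: the first line must read "port=<number>".
    static int getConfig(String fileName)
            throws IOException, ParseException, ConfigurationException {
        // FileReader throws FileNotFoundException if the file is missing.
        BufferedReader in = new BufferedReader(new FileReader(fileName));
        try {
            String line = in.readLine(); // may throw IOException mid-read
            if (line == null || !line.matches("port=\\d{1,5}")) {
                throw new ParseException("Expected port=<number>", 0);
            }
            int port = Integer.parseInt(line.substring(5));
            if (port < 1 || port > 65535) {
                throw new ConfigurationException("Port out of range: " + port);
            }
            return port;
        } finally {
            in.close();
        }
    }

    // Java 6 style: one catch block per exception type, with duplicated bodies.
    static String loadJava6(String fileName) {
        try {
            return "port " + getConfig(fileName);
        } catch (FileNotFoundException e) {
            return "missing or bad file";
        } catch (ParseException e) {
            return "missing or bad file";
        } catch (ConfigurationException e) {
            return "missing or bad file";
        } catch (IOException e) {
            return "file not retrievable";
        }
    }

    // Java 7 style: multi-catch groups the three "missing or bad" cases.
    static String loadJava7(String fileName) {
        try {
            return "port " + getConfig(fileName);
        } catch (FileNotFoundException | ParseException | ConfigurationException e) {
            return "missing or bad file";
        } catch (IOException e) {
            return "file not retrievable";
        }
    }
}
```

Note that the types in a single multi-catch clause must not be subclasses of one another, which is why FileNotFoundException and IOException sit in separate clauses here.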
An additional bit of new syntax is for helping with rethrowing exceptions. In many cases, developers may want to manipulate a thrown exception before rethrowing it. The problem is that, in previous versions of Java, code like this:
will force the programmer to declare the exception signature of this code as Throwable, because the real dynamic type of the exception has been swallowed. However, it’s relatively easy to see that the exception can only be an IOException or a SQLException and, if we can see it, then so can the compiler. In this snippet, we’ve made a single-word change to use the new Java 7 syntax:
The appearance of the final keyword indicates that the type that is actually thrown is the runtime type of the exception that was actually encountered – in this example, this would either be IOException or SQLException. This is referred to as “final rethrow” and can protect against throwing an overly general type here, which then has to be caught by a very general catch in a higher scope. Enhancements in the compiler mean that the final keyword is actually optional, but we’ve found that, while starting out with this feature, it’s actually easier to include it.
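The snippets this passage refers to are not reproduced here, so the following is a sketch of the idea only. The RethrowDemo class and its readData() and storeData() methods are hypothetical stand-ins for real I/O and database work. The point to notice is that process() declares only IOException and SQLException, even though the catch parameter has the very general type Throwable:

```java
import java.io.IOException;
import java.sql.SQLException;

class RethrowDemo {
    // Hypothetical operations standing in for real I/O and database work.
    static void readData(boolean fail) throws IOException {
        if (fail) throw new IOException("disk problem");
    }

    static void storeData(boolean fail) throws SQLException {
        if (fail) throw new SQLException("database problem");
    }

    // The throws clause lists only the checked types the try block can raise.
    static void process(boolean failRead, boolean failStore)
            throws IOException, SQLException {
        try {
            readData(failRead);
            storeData(failStore);
        } catch (final Throwable t) {
            // Inspect or log the exception here, then rethrow it unchanged.
            System.err.println("Rethrowing: " + t.getMessage());
            throw t;
        }
    }
}
```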
In addition to these general improvements to exception handling, the specific case of handling resources has been improved in Java 7, so that’s where we’ll turn next.
Try-with-resources
This change is easy to explain but has proven to have hidden subtleties, which made it much less easy to implement than originally hoped. The basic idea is to allow a resource (for example, a file or something a bit like one) to be scoped to a block in such a way that the resource is automatically closed when control exits the block.
This is an important change for the simple reason that virtually no one gets the manual handling of resource closing 100 percent right. Until recently, even the reference how-to from Sun was wrong. The proposal submitted to Project Coin for this change includes the astounding claim that two thirds of the uses of close() in the JDK had bugs in them!
Fortunately, compilers can be made to excel at producing exactly the sort of pedantic, boilerplate code that humans so often get wrong, and that’s the approach taken by this change, which is usually referred to as try-with-resources.
This is a big help in writing error-free code. To see just how much, consider how you would write a block of code that reads from a URL-based stream and writes to a file in Java 6. It would look something like listing 3.
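The original listing is not reproduced here. As a rough sketch of the manual approach, the hypothetical Java6Download class below uses nested try/finally blocks so that each stream is closed even if the copy fails. It is shorter than what many developers write by hand and still imperfect: if both the copy and a close() call fail, the exception from close() replaces the original one.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;

class Java6Download {
    // Copies the content behind url into file, closing both streams by hand.
    static void save(URL url, File file) throws IOException {
        InputStream is = url.openStream();
        try {
            OutputStream out = new FileOutputStream(file);
            try {
                byte[] buf = new byte[4096];
                int len;
                while ((len = is.read(buf)) != -1) {
                    out.write(buf, 0, len);
                }
            } finally {
                out.close(); // runs whether or not the copy succeeded
            }
        } finally {
            is.close(); // runs even if opening or closing the output failed
        }
    }
}
```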
The key point here is that, when handling external resources, Murphy’s Law applies – anything can go wrong at any time:
1. The InputStream can fail:
• To open from the URL.
• To read from it.
• To close properly.
2. The file corresponding to the OutputStream can fail:
• To open.
• To write to it.
• To close properly.
Or have some combination of more than one of the above.
This last possibility is actually where a lot of the headaches come from – the possibility of some combination of exceptions is very difficult to deal with well.
Let’s consider some Java 7 code for saving code from the web. As the name suggests, url is a URL object that points at the entity we want to download, and file is a File object where we want to save what we’re downloading. Let’s look at Listing 4.
This basic form shows the new syntax for a block with automatic management: the try with the resource in round brackets. For C# programmers, this is probably a bit reminiscent of a using clause, and that’s a good starting point when working with this new feature. The resources are used by the block and then automatically disposed of when you’re done with them. You still need to worry about handling exceptions with regard to finding the valid resource in the first place but, once you’re using it, the resource gets automatically closed.
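The basic form described above can be sketched as follows. The Java7Download class and its save() method are hypothetical names; url and file play the roles described in the text. Resources declared in the round brackets are closed automatically in reverse order of declaration, and an exception thrown by close() while another exception is already propagating is recorded as a suppressed exception rather than replacing it.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;

class Java7Download {
    // Copies the content behind url into file; both streams close automatically.
    static void save(URL url, File file) throws IOException {
        try (InputStream is = url.openStream();
             OutputStream out = new FileOutputStream(file)) {
            byte[] buf = new byte[4096];
            int len;
            while ((len = is.read(buf)) != -1) {
                out.write(buf, 0, len);
            }
        }
    }
}
```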
This is the main reason for preferring the new syntax – it’s just much less error prone – the compiler is not susceptible to the mistakes that basically every developer will make when trying to write this type of code manually.
Diamond syntax
One of the problems with generics is that the definitions and the setup of instances can be really verbose. Let’s suppose that you have some users, whom you identify by a user id (which is an integer), and each user has some lookup tables, and the tables are specific to each user. What would that look like in code?
Map<Integer, Map<String, String>> usersLists = new HashMap<Integer, Map<String, String>>();
That’s quite a mouthful, and almost half of it is just duplicated characters. Wouldn’t it be better if we could just write something like the code below, and have the compiler just infer the type information on the right hand side?
Map<Integer, Map<String, String>> usersLists = new HashMap<>();
Thanks to the magic of Project Coin – you can. In Java 7, the shortened form for declarations like that is entirely legal. It’s backwards compatible as well so, when you find yourself revisiting old code, you can just cut the older, more verbose declaration and start using the new type-inferred syntax to save a few pixels.
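For comparison, here is a small sketch with both forms side by side. The UserTables class and its addEntry() helper are hypothetical; the two declarations have exactly the same type, and only the amount of typing differs.

```java
import java.util.HashMap;
import java.util.Map;

class UserTables {
    // Java 6 style: the type arguments are spelled out on both sides.
    static final Map<Integer, Map<String, String>> verboseUsersLists =
            new HashMap<Integer, Map<String, String>>();

    // Java 7 style: the compiler infers the type arguments on the right.
    static final Map<Integer, Map<String, String>> usersLists = new HashMap<>();

    // Adds one key/value pair to the lookup table of the given user.
    static void addEntry(int userId, String key, String value) {
        Map<String, String> table = usersLists.get(userId);
        if (table == null) {
            table = new HashMap<>();
            usersLists.put(userId, table);
        }
        table.put(key, value);
    }
}
```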
Simplified varargs method invocation
This is one of the simplest changes of all – it just moves a warning about type information for quite a specific case where varargs combines with generics in a method signature. Put another way, unless you’re in the habit of writing code that takes as arguments a variable number of references of type T and does something to make a collection out of them, such as code that looks like this:
Then you can move on to the next section. Still here? Good. So what’s this issue all about?
Well, as you probably already know, a varargs method is one that takes a variable number of parameters (all of the same type) at the end of the argument list. What you may not know is how varargs is implemented. All of the variable parameters at the end are put into an array (which the compiler automatically creates for you) and are passed as a single parameter.
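The array that the compiler creates can be made visible with a small sketch like this one. The VarargsBasics class and its join() method are hypothetical; the two calls in joinedTwice() are equivalent, because the first is compiled into the second.

```java
class VarargsBasics {
    // parts is an ordinary String[] inside the method body.
    static String join(String separator, String... parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(parts[i]);
        }
        return sb.toString();
    }

    static boolean joinedTwice() {
        String implicit = join("-", "08", "92", "96");                    // compiler builds the array
        String explicit = join("-", new String[] { "08", "92", "96" });   // array passed by hand
        return implicit.equals(explicit);
    }
}
```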
This is all well and good, but here we run into one of the admitted weaknesses of Java’s generics – you are not normally allowed to create an array of a known generic type. So, this:
HashMap<String, String>[] arryHm = new HashMap<>[2];
Won’t compile; you can’t make arrays of a specified generic type. Instead, you have to do this:
HashMap<String, String>[] warnHm = new HashMap[2];
Which gives a warning that has to be ignored. Notice that you can define warnHm to be of the type array of HashMap<String, String>. You just can’t create any instances of that type and, instead, have to hold your nose (or at least, suppress the warning) and force an instance of the raw type (which is array of HashMap) into warnHm.
These two features – varargs methods really working on the synthetic arrays that the compiler conjures up and arrays of known generic types not being valid instantiable types – come together to cause us a slight headache. Consider this bit of code:
The compiler will attempt to create an array to contain hm1 and hm2, but the type of the array should strictly be one of the forbidden array types. Faced with this dilemma, the compiler basically cheats and breaks its own rule about the forbidden array of generic type. It creates the array instance but grumbles about it, producing a compiler warning that mutters darkly about “uses unchecked or unsafe operations.”
From the point of view of the type system, this is fair enough. However, the poor developer just wanted to use what seemed like a perfectly sensible API and now there are these scary-sounding warnings for no adequately explained reason.
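The code this passage discusses is not reproduced here, so the following is a sketch of the situation under stated assumptions: the GenericVarargs class and the body of doSomething() are invented, while the names hm1, hm2, and doSomething() follow the text. The code compiles and runs, but the compiler reports unchecked warnings for it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;

class GenericVarargs {
    // Hypothetical API: builds a collection from a variable number of T references.
    static <T> Collection<T> doSomething(T... entries) {
        return new ArrayList<T>(Arrays.asList(entries));
    }

    static int demo() {
        HashMap<String, String> hm1 = new HashMap<>();
        HashMap<String, String> hm2 = new HashMap<>();
        // The compiler must create a HashMap<String, String>[] here,
        // which is one of the forbidden array types, hence the warning.
        Collection<HashMap<String, String>> coll = doSomething(hm1, hm2);
        return coll.size();
    }
}
```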
What’s changed in Java 7
Java 7 brought a change in the emphasis of the warning. After all, there is a potential for violating type safety in these types of constructions, and somebody had better be informed about them. There’s not much that the users of these types of APIs can really do, though. Either the code inside doSomething() is evil and violates type safety or it doesn’t. In any case, it’s out of the developer’s hands.
The person who should really be warned about this issue is the person who wrote doSomething() – the API producer, rather than the consumer. So that’s where the warning goes – it’s moved from the site of the API use (the warning used to be triggered when the code that used the API was compiled) to the site where the API was defined (so the warning is now triggered when an API is written, which has the possibility to trigger this kind of potential type safety violation). The compiler warns the coder implementing the API and it’s up to them to pay proper attention to the type system.
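The text above does not say what the API author can do once warned. One option that Java 7 provides is the java.lang.SafeVarargs annotation, which is allowed on static methods, final methods, and constructors, and which records the author's promise that the method does nothing unsafe with its varargs array. A sketch, using a hypothetical SafeApi class and the same invented doSomething() body as before:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;

class SafeApi {
    // The method only reads from entries; it never stores into the array
    // or lets it escape, so the author vouches for its type safety.
    @SafeVarargs
    static <T> Collection<T> doSomething(T... entries) {
        return new ArrayList<T>(Arrays.asList(entries));
    }

    static int demo() {
        HashMap<String, String> hm1 = new HashMap<>();
        HashMap<String, String> hm2 = new HashMap<>();
        // With the annotation in place, callers see no unchecked warning here.
        return doSomething(hm1, hm2).size();
    }
}
```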
Changes to the type system
That’s an awful lot of words to describe a very small change. Moving a warning from one place to another is hardly a game-changing language feature, but it does serve to illustrate one very important point. Earlier in this article, we mentioned that Project Coin encouraged contributors to mostly try and stay away from the type system when proposing changes.
This example shows how involved you need to get when figuring out how different features of the type system interact, and how that interaction will alter when a change to the language is implemented. This isn’t even a particularly complex change; larger changes would be far, far more involved with potentially dozens of subtle ramifications.
This final example illustrates how intricate the effect of small changes can be and completes our discussion of the changes brought in by Project Coin. Although they represent mostly small syntactic changes, once you’ve started using them in practice, you will probably find that they have a positive impact on your code that is out of proportion with the size of the change.
This article has been all about introducing some of the smaller changes in the syntax for Java 7. You saw that, although the changes are not earth-shattering, Java 7 will be a little bit easier to write in a more concise and error-free manner. You also learned that there can be challenges that cause language designers to make smaller and more conservative changes than they might otherwise wish.
We hope you enjoyed this article and look forward to discussing more about Java 7 and polyglot programming on the JVM with you in a pub near you soon!