At long last

Java 13 – why text blocks are worth the wait

Tim Zöller

For a long time, Java developers had to take awkward detours to write multi-line string literals while looking enviously at Scala, Kotlin, Groovy and other languages. With JEP 355, text blocks arrive as a preview feature in Java 13.

Writing string literals in Java code over multiple lines, possibly formatted with whitespace, is annoying for many developers. For example, to lay out an SQL statement clearly over several lines, it is common to use the + operator with manually inserted line breaks at the end of each fragment (Listing 1).

To see that most languages already offer a way to format string literals properly, just look at Groovy, Scala and Kotlin. Other high-level languages such as C#, Swift or Python are in no way inferior. With so many models to draw on, Java can adopt the best properties of the existing implementations. A first attempt was JEP 326 for raw string literals, originally intended for Java 12; after many comments and objections from the community, the proposal was withdrawn.

 

 

JEP 355 takes many of these comments into account and, starting with Java 13, lets developers define multi-line strings in code easily. The feature is introduced in JDK 13 as a preview, so if you want to use it, you have to compile your code with the --enable-preview flag (together with --release 13) and also pass --enable-preview to the java launcher at run time.

String statement = "SELECT\n" +
                   "   c.id,\n" +
                   "   c.firstname,\n" +
                   "   c.lastName,\n" +
                   "   c.customernumber\n" +
                   "FROM customers c;";

JEP 326: Raw Strings

To better understand some of the design decisions behind text blocks, it is worth taking a look at the withdrawn JEP 326. It targeted JDK 12 and intended to introduce raw strings in Java, that is, strings that can extend over several lines and do not interpret escape sequences.

To be as flexible as possible and to work independently of the contained text, an arbitrary number of backquotes (`) was proposed as the boundary sequence. If the string itself contains a backquote, the boundary sequence would have to consist of at least two. If the string contains two consecutive backquotes, the boundary sequence would need at least three, and so on.


In the announcement of the JEP's withdrawal, Brian Goetz mentions some criticisms from the community that led to this decision. Many developers feared that the variable number of characters in the boundary sequence could be confusing for people and development environments. There was also criticism that the backquote, one of the few delimiters still unused in Java code, would be “used up” and no longer be available for future features.

Additionally, the fact that every multi-line string literal would automatically have had to be a raw string under the proposed feature contributed to the decision to withdraw JEP 326. The approaches in JEP 355 explicitly build on the findings of this process.

Delimiters and Formatting

Multi-line string literals are enclosed by a sequence of three quotation marks ("""). The opening delimiter may be followed only by optional whitespace and a mandatory line break. The actual content of the literal starts on the next line of code (Listing 2). Single and double quotation marks may occur in the literal without being escaped. If the literal itself contains three consecutive quotation marks, at least one of them must be escaped.

// Correct: the content starts on the line after the opening delimiter
String correctString = """
                       {
                         "type": "json",
                         "content": "sampletext"
                       }
                       """;

// Incorrect: characters follow the opening delimiter on the same line (does not compile)
String incorrectString = """{
                           "type": "json",
                           "content": "sampletext"
                         }
                         """;

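The escaping rules described above can be tried out with a small sketch. The class and method names are made up for this illustration, and the code assumes a JDK on which text blocks are available (preview in JDK 13 and 14, standard from JDK 15).

```java
public class TextBlockQuotes {

    // Single and double quotation marks need no escaping inside a text block.
    static String plainQuotes() {
        return """
            She said: "It's fine."
            """;
    }

    // Three consecutive quotation marks would end the block,
    // so at least one of them has to be escaped.
    static String tripleQuotes() {
        return """
            Text blocks start with \""" and end with \""".
            """;
    }

    public static void main(String[] args) {
        System.out.print(plainQuotes());
        System.out.print(tripleQuotes());
    }
}
```

In both cases the resulting string contains the quotation marks as ordinary characters; the backslash is consumed when the escape sequences are resolved.
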
Multi-line string literals are processed at compile time. The compiler always follows the same scheme: first, all line breaks in the string are converted to a line feed (\u000A), independent of the operating system. Line breaks written explicitly as escape sequences such as \n or \r are excluded from this.

In the second step, the whitespace that is only there because of the code formatting is removed. The compiler determines the smallest indentation shared by all non-blank content lines and the line with the closing three quotation marks and strips it from every line, so the position of the closing delimiter can be used to mark the left margin. This allows you to place text blocks in the code so that they match the rest of the code formatting (Listing 3).

In Scala and Kotlin, multi-line literals must either be written left-aligned without leading whitespace in the source text or be freed from it by string manipulation at runtime. Java, on the other hand, closely follows Swift's approach, so cleaning up the strings at runtime is not necessary.

Finally, all escape sequences in the text are interpreted and resolved. After compilation, it is no longer possible to tell how a string was defined in the code, that is, whether it was written as a text block or not.
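That a compiled text block can no longer be told apart from a conventional literal can be checked with a short sketch. The class and constant names are hypothetical, and the identity comparison relies on both strings being compile-time constants, which the JVM interns.

```java
public class TextBlockEquivalence {

    // Written as a text block ...
    static final String BLOCK = """
        line one
        line two
        """;

    // ... and as a conventional literal with explicit line feeds.
    static final String CLASSIC = "line one\nline two\n";

    public static void main(String[] args) {
        // Line breaks are normalized to \n, whatever the source file uses.
        System.out.println(BLOCK.equals(CLASSIC));
        // Both are compile-time constants and refer to the same interned object.
        System.out.println(BLOCK == CLASSIC);
        // No carriage return survives, even if the source file has Windows line endings.
        System.out.println(BLOCK.contains("\r"));
    }
}
```

The == comparison is used here only to make the point about constants; in regular code, strings should still be compared with equals.
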

String string1 = """
                 {
                   "type": "json",
                   "content": "sampletext"
                 }
                 """;

// Content of string1.
// The left margin is marked by | for illustration purposes.
//|{
//|  "type": "json",
//|  "content": "sampletext"
//|}

String string2 = """
                 {
                   "type": "json",
                   "content": "sampletext"
                 }
             """;

// Content of string2.
// The left margin is marked by | for illustration purposes.
//|    {
//|      "type": "json",
//|      "content": "sampletext"
//|    }
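The closing delimiter influences not only the left margin but also whether the string ends with a line feed. The following sketch, with hypothetical class and constant names, puts the three common placements side by side.

```java
public class ClosingDelimiter {

    // Closing delimiter on its own line: the content ends with a line feed.
    static final String WITH_NEWLINE = """
        SELECT 1
        """;

    // Closing delimiter directly after the last character: no trailing line feed.
    static final String WITHOUT_NEWLINE = """
        SELECT 1""";

    // Closing delimiter four columns to the left of the content:
    // four spaces of indentation survive on every line.
    static final String INDENTED = """
            SELECT 1
        """;

    public static void main(String[] args) {
        System.out.println("[" + WITH_NEWLINE + "]");
        System.out.println("[" + WITHOUT_NEWLINE + "]");
        System.out.println("[" + INDENTED + "]");
    }
}
```
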

Use cases

With the new possibilities that text blocks offer, code can in some cases be written more cleanly and readably. As mentioned, SQL statements, for example, can now be laid out more clearly in Java code. Since text blocks can be used everywhere conventional string literals are allowed, this even applies to named queries in JPQL or native SQL that are defined within annotations (Listing 4).

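import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.NamedQuery;

// The entity name has to match the name used in the query
@Entity(name = "Customer")
@NamedQuery(name = "findByCustomerNumberAndCity",
            query = """
                    from Customer c
                    where c.customerNo = :customerNo
                    and c.city = :city
                    """)
public class CustomerEntity {

  // Assumption: the customer number serves as the identifier (JPA requires an @Id)
  @Id
  String customerNo;
  String city;

  public String getCustomerNo() {
    return customerNo;
  }

  public void setCustomerNo(String customerNo) {
    this.customerNo = customerNo;
  }
}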
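For comparison, the concatenated SQL statement from the beginning of the article could be written as a text block as follows. This is only a sketch; the class and constant names are invented for the illustration.

```java
public class SqlTextBlock {

    // The concatenated form with manual line breaks.
    static final String CONCATENATED = "SELECT\n" +
                                       "   c.id,\n" +
                                       "   c.firstname,\n" +
                                       "   c.lastName,\n" +
                                       "   c.customernumber\n" +
                                       "FROM customers c;";

    // The same statement as a text block. The closing delimiter follows the
    // last character directly, so no trailing line feed is added.
    static final String BLOCK = """
        SELECT
           c.id,
           c.firstname,
           c.lastName,
           c.customernumber
        FROM customers c;""";

    public static void main(String[] args) {
        System.out.println(BLOCK.equals(CONCATENATED));
    }
}
```
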

Further useful applications arise when string literals serve as templates, for example for JSON payloads in unit tests of a REST service or for preparing HTML snippets on the server side (Listing 5).

String htmlContent = """
                     <div>
                       <h2>My header</h2>
                       <ul>
                         <li>An entry</li>
                         <li>Another entry</li>
                       </ul>
                     </div>
                     """;

In addition, text blocks significantly simplify the use of polyglot features, be it with the old Rhino engine or with GraalVM's Context. The JavaScript code defined in the literal would be much less readable and maintainable if it had to be assembled with the string concatenation and manual line breaks shown at the beginning (Listing 6).

ScriptEngine engine = new ScriptEngineManager().getEngineByName("js");
Object result = engine.eval("""
                            function add(int1, int2) {
                              return int1 + int2;
                            }
                            add(1, 2);""");

New methods

JEP 355 adds three new instance methods to the String class: formatted, stripIndent, and translateEscapes. While formatted is a helper that makes working with multi-line literals easier, stripIndent and translateEscapes correspond to the last two steps the compiler performs when processing a text block.

As mentioned, multi-line strings are excellent for defining templates in code. To fill placeholders in string templates with values, the static method format already exists in the String class. Since multi-line string literals are used no differently from single-line literals, they can of course also be formatted with this method.

To provide a clearer approach for text blocks, the new instance method formatted has been added to the String class; it behaves analogously to the static method format. The result is likewise a string with the placeholders filled in, but in combination with text blocks the code is tidier (Listing 7).

String.format("Hello, %s, %s to see you!", "Duke", "nice");

String xmlString = """
                   <customer>
                     <no>%s</no>
                     <name>%s</name>
                   </customer>
                   """.formatted("12345", "Franz Kunde");

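That formatted and the static format method lead to the same result can be checked with a small sketch. The class, constant and method names are hypothetical, and the code assumes a JDK on which formatted is available without preview flags (JDK 15 or later).

```java
public class FormattedDemo {

    static final String TEMPLATE = """
        <customer>
          <no>%s</no>
          <name>%s</name>
        </customer>
        """;

    // Classic variant: the template is passed as the first argument.
    static String viaStaticFormat(String no, String name) {
        return String.format(TEMPLATE, no, name);
    }

    // New variant: the call is chained directly onto the template.
    static String viaFormatted(String no, String name) {
        return TEMPLATE.formatted(no, name);
    }

    public static void main(String[] args) {
        System.out.print(viaFormatted("12345", "Franz Kunde"));
        System.out.println(viaStaticFormat("12345", "Franz Kunde")
                .equals(viaFormatted("12345", "Franz Kunde")));
    }
}
```
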
The stripIndent method removes the leading whitespace that all lines of a multi-line string have in common, i.e. it moves the entire text to the left without changing the relative formatting. The translateEscapes method interprets all escape sequences contained in a string. Listing 8 illustrates this behavior.

// The left margin is marked by | for illustration purposes.
String example = " a\n  b\\n   c";

System.out.println(example);
//| a
//|  b\n   c

System.out.println(example.stripIndent());
//|a
//| b\n   c

System.out.println(example.translateEscapes());
//| a
//|  b
//|   c

// stripIndent sees only two lines here, so the line break created
// afterwards by translateEscapes leaves the spaces before c untouched.
System.out.println(example.stripIndent().translateEscapes());
//|a
//| b
//|   c
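When the two methods are applied to strings from other sources, the order of the calls matters, because stripIndent only considers line breaks that already exist at the time of the call. The following sketch uses the same example string; the class and method names are made up for the illustration.

```java
public class StripAndTranslateOrder {

    // A real line feed after "a", an escaped one (backslash + n) after "b".
    static final String EXAMPLE = " a\n  b\\n   c";

    // The indentation is computed over two lines; the third line only
    // comes into existence afterwards and keeps its three spaces.
    static String stripThenTranslate() {
        return EXAMPLE.stripIndent().translateEscapes();
    }

    // The escape is resolved first, so stripIndent sees three lines
    // and removes one leading space from each of them.
    static String translateThenStrip() {
        return EXAMPLE.translateEscapes().stripIndent();
    }

    public static void main(String[] args) {
        System.out.println(stripThenTranslate());
        System.out.println(translateThenStrip());
    }
}
```

The first variant mirrors the order the compiler uses for text blocks: indentation is removed before escape sequences are interpreted.
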

Conclusion

With JEP 355, multi-line string literals are introduced in a way that quickly feels natural both in their declaration and in their formatting. The developers have benefited from the experience gained in other programming languages and have chosen an approach that makes post-processing of strings, for example to remove annoying whitespace, unnecessary while still allowing readable formatting of the source code.

The introduction is rounded off by three new methods in the String class: formatted, stripIndent and translateEscapes. Not only do they make the new text blocks easier to use, but they can also be used to format strings from other sources, remove indentation, or interpret escape sequences.

Personally, I'm very much looking forward to text blocks becoming a fully rounded-out feature, and I notice almost daily how much I miss them at the moment.

Author

Tim Zöller

Tim Zöller has been working with Java for ten years, and for fun he sometimes develops software in Clojure. He works as an IT consultant and software developer at ilum:e informatik ag and is co-founder of JUG Mainz.
