Java 13 is released today. Besides a number of otherwise rather inconspicuous changes, the release extends the language with multi-line string literals, called text blocks in Java's jargon, which were introduced by JEP 355. Text blocks are a great way to generate longer text output, to embed snippets of other languages such as HTML or JSON (which will make some tests more readable), or even to formulate SQL statements.
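The resulting string is identical to the one produced by the classic and much less pleasant variant, in which the string is concatenated from single-line literals with explicit \n escapes.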
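A minimal sketch of the two variants side by side. The JSON content and the class name TextBlockIntro are assumptions chosen for illustration, and four spaces are used for the inner indentation:

```java
final class TextBlockIntro {

    // The JSON snippet written as a text block.
    static String jsonBlock() {
        return """
                {
                    "greeting": "Hello",
                    "audience": "Text Blocks",
                    "punctuation": "!"
                }
                """;
    }

    // The same content as a classic concatenation of string literals.
    static String jsonPhrase() {
        return "{\n"
                + "    \"greeting\": \"Hello\",\n"
                + "    \"audience\": \"Text Blocks\",\n"
                + "    \"punctuation\": \"!\"\n"
                + "}\n";
    }

    public static void main(String[] args) {
        System.out.print(jsonBlock());
        System.out.println(jsonBlock().equals(jsonPhrase())); // prints "true"
    }
}
```

Both methods return the same characters; only the readability of the source differs.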
Before you can use text blocks in Java 13, you have to unlock them, because for now they are only included as a preview feature. For compilation and execution on the command line or in the build tool, the option --enable-preview must be passed to javac and java; javac additionally requires --release 13 (or -source 13) alongside it. In Eclipse, there is a corresponding setting in the compiler options. In IntelliJ IDEA, the language level for the module must be set accordingly.
Three quotes for a text block
Text blocks can be used anywhere classic string literals “like this” are accepted, and they have a simple syntax:
They start with three quotes followed by a line break. They end with three quotes, either at the end of the block's last line or on a line of their own. Within the string, escape sequences are interpreted as usual, so text blocks are not raw strings.
The line break at the beginning of the block does not become part of the resulting string. If the closing quotes are on their own line, the string ends with a line break; whitespace on that line, such as spaces and tabs, never becomes part of the string.
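The effect of the closing quotes' position can be sketched as follows. The class and method names are hypothetical and only serve the illustration:

```java
final class TextBlockDelimiters {

    // Closing quotes at the end of the last content line:
    // the string does not end with a line break.
    static String withoutTrailingBreak() {
        return """
                first line
                second line""";
    }

    // Closing quotes on a line of their own:
    // the string ends with a line break.
    static String withTrailingBreak() {
        return """
                first line
                second line
                """;
    }

    public static void main(String[] args) {
        System.out.println(withoutTrailingBreak().equals("first line\nsecond line")); // prints "true"
        System.out.println(withTrailingBreak().equals("first line\nsecond line\n"));  // prints "true"
    }
}
```

In both cases the line break directly after the opening quotes is not part of the string.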
Since individual quotation marks have no syntactic meaning inside a text block, and in particular do not end it, they do not need to be escaped. This eliminates the most common source of escape sequences and makes it much easier to embed languages like HTML or JSON in Java.
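As a small illustration, here is an HTML fragment with quoted attributes, once as a text block and once as a classic literal. The markup, the URL, and the names are made up for this sketch:

```java
final class TextBlockQuotes {

    // Quotes inside the text block need no escaping.
    static String htmlBlock() {
        return """
                <a href="https://example.org" title="An example">example</a>
                """;
    }

    // The classic literal needs a backslash before every quote.
    static String htmlLiteral() {
        return "<a href=\"https://example.org\" title=\"An example\">example</a>\n";
    }

    public static void main(String[] args) {
        System.out.println(htmlBlock().equals(htmlLiteral())); // prints "true"
    }
}
```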
If you have a text block in the source code, it is of course the compiler's job to process it. How it does that is explained in the following sections, but it makes sense to briefly talk about the result first.
Once the compiler has processed the block and translated it into a string, the result is indistinguishable from an ordinary string literal with the same content. In the above example, this means that jsonBlock.equals(jsonPhrase) is true.
But the compiler goes one step further. String constants are deduplicated at compile time to save memory at run time. As a result, identical literals are not only equal according to equals, but also identical according to ==. This is handy if the same snippets appear in many places in the code, but it can be very unpleasant if you forget about it and, for example, use a string as a lock.
This mechanism applies regardless of whether a string was created with a literal or a text block. This means that even jsonBlock == jsonPhrase is true. Text blocks are thus a pure compiler feature, and at run time you cannot tell how a string instance was defined in the code.
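A sketch of equality and identity for string constants. The shortened JSON content and the class name are assumptions; the copy created with new String is added only to contrast a run-time instance with a compile-time constant:

```java
final class TextBlockIdentity {

    // A compile-time constant defined as a text block.
    static final String JSON_BLOCK = """
            {
                "greeting": "Hello"
            }
            """;

    // The same content as a concatenation of literals,
    // which is a compile-time constant as well.
    static final String JSON_PHRASE = "{\n"
            + "    \"greeting\": \"Hello\"\n"
            + "}\n";

    static boolean sameContent() {
        return JSON_BLOCK.equals(JSON_PHRASE);
    }

    // Both constants refer to the same deduplicated instance.
    static boolean sameInstance() {
        return JSON_BLOCK == JSON_PHRASE;
    }

    // A string created at run time is equal, but a different instance.
    static boolean copyIsSameInstance() {
        return new String(JSON_BLOCK) == JSON_BLOCK;
    }

    public static void main(String[] args) {
        System.out.println(sameContent());        // prints "true"
        System.out.println(sameInstance());       // prints "true"
        System.out.println(copyIsSameInstance()); // prints "false"
    }
}
```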
Normalization of line endings
An interesting aspect of text blocks is that properties of the text files containing the source code mingle with the semantics of the code. The obvious case is line endings: whether lines end with \r\n (CRLF, common on Windows) or \n (LF, common on Unix) is purely a matter of configuration, and it would be unpleasant if the compiler silently carried that choice over into the generated strings. The same applies to whitespace at the end of a line, as it is rarely visible and carrying it over into the string could easily cause confusion.
Therefore, the compiler normalizes line endings by removing all whitespace after the last non-whitespace character of each line and by replacing all line terminators (that is, \r, \r\n, and \n) with \n. Since escape sequences are only evaluated after this normalization, CRLF can still be produced by ending lines with the escape sequence \r.
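A minimal sketch of both cases, with hypothetical names and content. The first block yields \n line endings no matter how the source file is saved; the second produces CRLF explicitly:

```java
final class TextBlockLineEndings {

    // Line endings are normalized to \n, independent of the source file.
    static String unixEndings() {
        return """
                Hello
                World
                """;
    }

    // The escape sequence \r is evaluated after normalization,
    // so each line ends with \r\n.
    static String windowsEndings() {
        return """
                Hello\r
                World\r
                """;
    }

    public static void main(String[] args) {
        System.out.println(unixEndings().equals("Hello\nWorld\n"));        // prints "true"
        System.out.println(windowsEndings().equals("Hello\r\nWorld\r\n")); // prints "true"
    }
}
```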
Indentation is more interesting and a little more complicated than line endings. Here, developers and the compiler have to distinguish between incidental and intentional indentation: the indentation that results from formatting the code should have no effect on the created string, as it is irrelevant to the content. Any indentation beyond that, however, should be recognized as intentional and be reflected in the result.
Indentation and formatting of text blocks
In the earlier example, jsonBlock is meant to have its opening and closing brackets at the very beginning of their respective lines, while the three lines in between are each indented with one tab.
To make this possible, the compiler applies a not uninteresting algorithm, which has the following effects:
1. All whitespace characters are treated alike; in particular, a space counts the same as a tab.
2. The same number of leading whitespace characters is removed from every line, namely as many as it takes until at least one non-blank line no longer starts with such a character.
3. If the closing quotes are on their own line, that line is also taken into account in the previous step.
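The second effect can be sketched by writing the same block at two different nesting depths. The names and the JSON content are assumptions, and four spaces stand in for the tab used as intentional indentation:

```java
final class TextBlockIncidentalIndent {

    // Declared at field level: the content starts 8 spaces in.
    static final String SHALLOW = """
        {
            "greeting": "Hello"
        }
        """;

    // The same block nested deeper: the content starts 16 spaces in.
    static String deep() {
        return """
                {
                    "greeting": "Hello"
                }
                """;
    }

    public static void main(String[] args) {
        // The incidental indentation is removed in both cases,
        // only the four extra spaces of the middle line survive.
        System.out.println(SHALLOW.equals(deep()));                            // prints "true"
        System.out.println(SHALLOW.equals("{\n    \"greeting\": \"Hello\"\n}\n")); // prints "true"
    }
}
```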
Point 2 is why jsonBlock has the desired content: no matter how far the code is indented, this indentation is recognized as incidental and removed. But what if all lines are supposed to be indented and should, for example, start with a tab?
Varying indentation by formatting
If the indentation is to be varied, point 3 comes into play: since the line with the closing quotes is taken into account when determining the characters to be removed, no more whitespace is removed than that last line contains. Effectively, the other lines are aligned relative to the closing quotes. If they are indented further, this is reflected in the result.
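A sketch of such a block, under the assumption that each indentation level is four spaces instead of a tab. The constant name follows the article's jsonIndentBlock; the JSON content is illustrative:

```java
final class TextBlockClosingDelimiter {

    // Indentation levels of the six relevant lines: 2, 3, 3, 3, 2, 1.
    // The closing quotes sit at level 1, so only one level is removed.
    static final String JSON_INDENT_BLOCK = """
        {
            "greeting": "Hello",
            "audience": "Text Blocks",
            "punctuation": "!"
        }
    """;

    public static void main(String[] args) {
        String expected = "    {\n"
                + "        \"greeting\": \"Hello\",\n"
                + "        \"audience\": \"Text Blocks\",\n"
                + "        \"punctuation\": \"!\"\n"
                + "    }\n";
        System.out.println(JSON_INDENT_BLOCK.equals(expected)); // prints "true"
    }
}
```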
In this example, the six lines that are taken into account when determining the indentation begin with 2, 3, 3, 3, 2, and 1 tab, respectively. The compiler therefore removes one tab from each line. As a result, the content lines remain indented with one tab (opening and closing brackets) or two tabs (the three lines in between).
For additional indentation, the line contents must be positioned relative to the closing quotes, which implies that the """ must be on a line of its own. As mentioned at the beginning, this results in a line break at the end of the string. If you want the indentation but not the line break, the new methods of the String class come into play.
Varying indentation with string methods
Since Java 12, the method String::indent indents all lines of a string by the given number of spaces.
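A sketch of String::indent applied to a text block, assuming source code that is formatted with four spaces. The method name follows the article's jsonIndentMethod; the JSON content is illustrative:

```java
final class TextBlockIndentMethod {

    // The block itself has no indentation before the brackets;
    // indent(4) then prefixes every line with four spaces.
    // Note that indent also terminates every line, including
    // the last one, with \n.
    static String jsonIndentMethod() {
        return """
                {
                    "greeting": "Hello",
                    "audience": "Text Blocks",
                    "punctuation": "!"
                }""".indent(4);
    }

    public static void main(String[] args) {
        String expected = "    {\n"
                + "        \"greeting\": \"Hello\",\n"
                + "        \"audience\": \"Text Blocks\",\n"
                + "        \"punctuation\": \"!\"\n"
                + "    }\n";
        System.out.println(jsonIndentMethod().equals(expected)); // prints "true"
    }
}
```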
In jsonIndentMethod, there are four spaces in front of the opening and closing brackets. The other three lines are somewhat less obvious: they start with those four spaces plus the indentation that results from the formatting of the source code. If the source is indented with four spaces, this works out nicely. But if it uses, for example, two spaces or a tab, the resulting indentation is uneven. Unfortunately, there is no convenient solution for this.
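Assuming an indentation with four spaces, jsonIndentMethod has the same content as jsonIndentBlock. Note that String::indent also normalizes line endings and terminates every line, including the last one, with \n, so its result ends with a line break just like a text block whose closing quotes are on their own line. If you do not want that final line break, you have to remove it afterwards, for example with String::stripTrailing.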
In addition, String::stripIndent was added in Java 13. This method removes indentation according to the same logic as the compiler. In the string indentedJsonLiteral, each line is indented with at least one tab.
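A sketch of String::stripIndent on a classic literal. The constant names follow the article's indentedJsonLiteral and jsonBlock, the JSON content is illustrative, and four spaces stand in for each tab. The literal deliberately does not end with a line break, because stripIndent also takes a blank last line into account:

```java
final class StripIndentDemo {

    // A classic literal in which every line carries one extra indentation level.
    static final String INDENTED_JSON_LITERAL =
            "    {\n"
            + "        \"greeting\": \"Hello\",\n"
            + "        \"audience\": \"Text Blocks\",\n"
            + "        \"punctuation\": \"!\"\n"
            + "    }";

    // The corresponding text block without the extra level.
    static final String JSON_BLOCK = """
            {
                "greeting": "Hello",
                "audience": "Text Blocks",
                "punctuation": "!"
            }""";

    public static void main(String[] args) {
        System.out.println(INDENTED_JSON_LITERAL.equals(JSON_BLOCK));               // prints "false"
        System.out.println(INDENTED_JSON_LITERAL.stripIndent().equals(JSON_BLOCK)); // prints "true"
    }
}
```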
The extra tab makes indentedJsonLiteral unequal to the introductory jsonBlock, but stripIndent fixes that: indentedJsonLiteral.stripIndent().equals(jsonBlock) is true.
Formatting text blocks
Jim Laskey and Stuart Marks of Oracle have written a guide to text blocks for developers that is well worth reading. It contains these ten recommendations (slightly reworded and without the eleventh):
1. Use text blocks where they improve the readability of the code, especially for multi-line strings.
2. If a string fits on one line without concatenation and without inserting \n, it should probably be defined as a classic literal.
3. There is nothing wrong with using escape sequences like \n in text blocks if that aids readability.
4. In most cases, the three opening quotes should be at the end of their line and the three closing quotes on a line of their own.
5. Text lines and closing quotes should not be aligned with the opening quotes.
6. Instead of using text blocks inside more complex expressions such as stream pipelines, it is usually better for readability to keep them in local variables or static constants.
7. Because tabs and spaces are treated alike, text blocks should be indented with only one of the two characters to avoid uneven results.
8. If a text block is to contain three or more quotes in a row, escaping the first quote of each group of three is sufficient.
9. By default, the lines of a text block should be indented according to the standard formatting of the surrounding code.
10. If text blocks contain very long lines, horizontal scrolling can be avoided by aligning the lines to the far left.
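Recommendation 8 can be sketched with a text block that itself contains Java source with a text block. The names and the embedded snippet are made up for illustration:

```java
final class TextBlockTripleQuotes {

    // Escaping only the first quote of each run of three keeps
    // the outer text block from being closed prematurely.
    static String nestedTextBlockSource() {
        return """
                String text = \"""
                    a text block inside a text block
                    \""";
                """;
    }

    public static void main(String[] args) {
        String expected = "String text = \"\"\"\n"
                + "    a text block inside a text block\n"
                + "    \"\"\";\n";
        System.out.println(nestedTextBlockSource().equals(expected)); // prints "true"
    }
}
```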
This covers not only technical considerations such as syntax, deduplication, normalization of line endings and dealing with indentation, but also stylistic ones. There is even more information in this blog post about text blocks.