Strings

Strings for human readable output

A very frequent use of strings is for human readable output to a console, a log file, etc. A simple and easy syntax to create such strings simplifies everyday programming and debugging significantly.

`+` operator


out.println("x is "+x+", y is "+y+"!");

This would imply a concatenation operator + on strings. It is nice since it requires no specific syntax, but it takes four additional characters "++" to include a single value x.

Open parameter list


out.println("x is ",x,", y is ",y,"!");

This would imply that println has an open parameter list. It does not require specific syntax either, but it also takes four additional characters ",," to include a single value x.

Braces `{}` to enclose expressions


out.println("x is {x}, y is {y}!");

Parsing this is somewhat strange, in particular if we do not print a variable, but an arbitrary expression such as y*y. On the plus side, it requires only two additional characters {} to include a single value x and it is somewhat nice to read.

A big disadvantage here is that this approach is not open to internationalization, i.e., replacing the strings with variables that depend on the language environment.
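The internationalization concern can be illustrated with a minimal Python sketch. It assumes a hypothetical message catalog `MESSAGES` and helper `describe`, neither of which comes from the original text. Because the template is an ordinary run-time value, it can be selected per language, which is not possible when the expressions are embedded in a literal fixed at compile time.

```python
# Hypothetical message catalog: one template per language (illustration only).
MESSAGES = {
    "en": "x is {x}, y is {y}!",
    "de": "x ist {x}, y ist {y}!",
}


def describe(lang: str, x: int, y: int) -> str:
    # The template is looked up at run time and filled in afterwards,
    # so the surrounding text can change with the language environment.
    return MESSAGES[lang].format(x=x, y=y)


print(describe("en", 3, 4))
print(describe("de", 3, 4))
```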

Prefixed `$`


println("x is $x, y is $y!");

This Kotlin approach requires only one additional character $. However, to include more than a single field identifier or to separate it from following text, three additional characters are required, e.g., "length is ${length}cm".

Implicit conversions


out.println("x is " x ", y is " y "!");

This would require a sequence of strings and expressions to imply a conversion to strings and their concatenation. In effect, pure sequencing acts as an operator. It requires two additional characters "" to include a single value x (the spaces are optional). The needed change in the grammar is minor.

C-printf style


printf("x is %d, y is %d!", x, y);

This C-printf style formatting is very flexible, but it separates the formatting from the variable. It requires three extra characters for each value printed: %d in the format string plus the comma in the argument list.

C++ style


std::cout << "x is " << x << ", y is " << y << "!\n";

The C++ style may have looked cool 20 years ago, but it is not very helpful: it adds six characters "<<<<" for every value to be printed.

Sidef style


say ("Partition %5d into %2d prime piece" % (num, parts),
     parts == 1 ? ':  ' : 's: ', prime_partition(num, parts).join('+') || 'not possible')

Sidef uses the operator % on strings to format arguments provided as a tuple. Quite nice.
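Python's % operator on strings behaves much the same way, so it gives a runnable comparison. The values of `num` and `parts` and the list `pieces` are made-up illustration data standing in for the result of `prime_partition`.

```python
num, parts = 18, 2
pieces = [5, 13]  # made-up data standing in for prime_partition(num, parts)

# The tuple on the right of % supplies the values for the %d placeholders.
line = "Partition %5d into %2d prime piece" % (num, parts)
line += ":  " if parts == 1 else "s: "
line += "+".join(str(p) for p in pieces) or "not possible"
print(line)
```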

Fuzion's approach to human readable strings

The main idea is to provide an abstract string class that can be converted into a list or stream of bytes. Strings then do not need to be physically present in memory; string concatenation just means creating a list (stream) by concatenating two existing lists (streams).
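The idea can be sketched in Python as follows. This is only an illustration under stated assumptions, not Fuzion's implementation: the class `LazyString` and its method `utf8` are hypothetical names. Concatenation merely records its operands, and bytes are produced only when the stream is consumed.

```python
from typing import Iterable, Iterator


class LazyString:
    """Hypothetical abstract string: a recipe for producing bytes."""

    def __init__(self, parts: Iterable[object]):
        self._parts = tuple(parts)

    def __add__(self, other: object) -> "LazyString":
        # Concatenation only records both operands; no bytes are copied.
        return LazyString((self, other))

    def utf8(self) -> Iterator[int]:
        # Bytes are generated on demand while walking the recorded parts.
        for part in self._parts:
            if isinstance(part, LazyString):
                yield from part.utf8()
            else:
                # Stand-in for calling as_string on an arbitrary value.
                yield from str(part).encode("utf-8")


x, y = 3, 4
s = LazyString(["x is "]) + x + ", y is " + y + "!"
text = bytes(s.utf8()).decode("utf-8")
print(text)
```

The design choice here is that `+` costs constant time regardless of string length, at the price of walking a nested structure when the bytes are finally needed.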

As in Java, infix + can be used as a standard way to concatenate strings with other values, by calling the as_string feature that is defined for every feature via inheritance from Any:


out.println("x is "+x+", x+y is "+(x+y)+"!");

Support for {x} notation within strings can be reflected in the grammar by introducing tokens lstring, rstring and mstring for strings that end and/or start with braces. Then, the string in


out.println("x is {x}, x+y is {x+y}!");

would be split by the lexer into three different string tokens, a t_lstring "x is ", a t_mstring ", x+y is " and a t_rstring "!", with all the tokens of the expressions x and x+y in between. The parser could then be extended to support expressions of the form


string -> t_string
        | t_lstring { expr t_mstring }* expr t_rstring

and convert them into an AST that is equal to the code using normal strings and infix + explicitly.
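The splitting and the conversion to the explicit + form can be sketched in Python. The functions `lex_string` and `desugar` are hypothetical helpers, and the sketch assumes no escape sequences and no nested braces. Expressions are kept as raw text, whereas a real lexer would tokenize them.

```python
def lex_string(body: str) -> list[tuple[str, str]]:
    """Split the text between the quotes into string and expression tokens."""
    tokens = []
    pos = 0
    first = True
    while True:
        open_at = body.find("{", pos)
        if open_at < 0:
            # No further brace: a plain string, or the closing right part.
            tokens.append(("t_string" if first else "t_rstring", body[pos:]))
            return tokens
        tokens.append(("t_lstring" if first else "t_mstring", body[pos:open_at]))
        close_at = body.index("}", open_at)
        tokens.append(("expr", body[open_at + 1:close_at]))
        pos = close_at + 1
        first = False


def desugar(tokens: list[tuple[str, str]]) -> str:
    """Render the token list as the equivalent chain of infix + calls."""
    return "+".join(f"({t})" if k == "expr" else f'"{t}"' for k, t in tokens)


tokens = lex_string("x is {x}, x+y is {x+y}!")
print(tokens)
print(desugar(tokens))
```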

A bit more tricky is handling of $ as in


out.println("x is $x, y is $y!");

The lexer could treat $ as a string termination character like ", convert the $x into a special identifier sident and parse the remainder as if a new string was started with ". The grammar would then become


string -> ( t_string
          | t_lstring { expr t_mstring }* expr t_rstring
          ) { t_sident string }

The big advantage: This whole string magic would be handled mostly by the lexer and in part by the parser. The problem is solved once the AST has been built.
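A minimal Python sketch of the $ handling, under stated assumptions: the helper `lex_dollar` is hypothetical, identifiers are taken to match `[A-Za-z_][A-Za-z0-9_]*`, and brace expressions and escapes are ignored. Each `$name` ends the current string token, yields a t_sident token, and starts a new string.

```python
import re

# Assumed identifier syntax; the real lexer's rule may differ.
SIDENT = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def lex_dollar(body: str) -> list[tuple[str, str]]:
    """Split the text between the quotes at each $identifier."""
    tokens = []
    pos = 0
    for match in SIDENT.finditer(body):
        # '$' terminates the current string, just as '"' would.
        tokens.append(("t_string", body[pos:match.start()]))
        tokens.append(("t_sident", match.group(1)))
        pos = match.end()
    # The remainder is lexed as if a new string had been started.
    tokens.append(("t_string", body[pos:]))
    return tokens


print(lex_dollar("x is $x, y is $y!"))
```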

Debugging output

Python permits a handy debug string: f'{x * y = }' is shorthand for f'x * y = {x * y!r}'. NYI: This would be convenient for debugging; something similar might be considered for Fuzion as well.
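A short Python sketch of this shorthand (it needs Python 3.8 or later). The whitespace around = is preserved exactly as written, and the value is rendered with repr by default, which is why strings appear quoted. The variable names are illustration only.

```python
x, y = 3, 4
name = "fuzion"

spaced = f"{x * y = }"   # whitespace around '=' is kept
tight = f"{x * y =}"     # no space after '=' in the output
quoted = f"{name=}"      # repr is used, so the string shows its quotes

print(spaced)
print(tight)
print(quoted)
```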

Formatting

Python permits f'{num:.2f}' to use a type-specific format string, in this case .2f for two decimals, when printing a value. The same could be achieved in Fuzion using an operator infix $(string format). Then we could have a Fuzion string like "{num$".2f"}" to achieve the same effect. NYI: is this useful enough to be supported?
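In Python, the embedded format spec is equivalent to passing the spec as an ordinary string to the built-in `format`, which is roughly the shape the proposed infix $ operator would have. The value of `num` is made up for illustration.

```python
num = 3.14159

# Format spec embedded in the f-string.
embedded = f"{num:.2f}"

# The same spec passed as a plain string argument; this mirrors the
# shape of the proposed Fuzion expression num $ ".2f".
explicit = format(num, ".2f")

print(embedded, explicit)
```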

Text Blocks

Text blocks were introduced in Java 15 (JEP 378: Text Blocks). They use """ to start and end a multi-line string constant. This is now supported in Fuzion, too.

last changed: 2025-05-12
