Strings
Strings for human readable output
A very frequent use of strings is for human readable output to a console, a log file, etc. A simple and easy syntax to create such strings simplifies everyday programming and debugging significantly.
+ operator
out.println("x is "+x+", y is "+y+"!");
This would imply a concatenation operator + on strings. It is nice since it requires no specific syntax, but it takes four additional characters "++" to include a single value x.
Open parameter list
out.println("x is ",x,", y is ",y,"!");
This would require println to accept an open parameter list. It also does not require specific syntax, but it also takes four additional characters ",," to include a single value x.
Braces {} to enclose expressions
out.println("x is {x}, y is {y}!");
Parsing this is somewhat awkward because expression tokens have to be recognized inside a string literal, in particular if we do not print a variable, but an arbitrary expression such as y*y. On the plus side, it requires only two additional characters {} to include a single value x and it is somewhat nice to read.
A big disadvantage here is that this approach is not open to internationalization, i.e., replacing the strings with variables that depend on the language environment.
Prefixed $
println("x is $x, y is $y!");
This Kotlin approach requires only one additional character $. However, to include more than a single field identifier or to separate it from the following text, three additional characters are required, e.g., "length is ${length}cm".
Implicit conversions
out.println("x is " x ", y is " y "!");
This would require a sequence of strings and expressions to imply a conversion to strings and their concatenation, so pure sequencing becomes an operator. It requires two additional characters, a closing and a reopening ", to include a single value x (the spaces are optional). The needed change in the grammar is minor.
C-printf style
printf("x is %d, y is %d!", x, y);
This C-printf style formatting is very flexible, but separates the format specifier from the value it formats. It requires three extra characters %d, for each value printed.
C++ style
std::cout << "x is " << x << ", y is " << y << "!\n";
The C++ style may have looked cool 20 years ago, but it is of little help: it adds six characters "<<<<" for every value to be printed.
Sidef style
say ("Partition %5d into %2d prime piece" % (num, parts),
     parts == 1 ? ': ' : 's: ', prime_partition(num, parts).join('+') || 'not possible')
Sidef uses the operator % on strings to format arguments provided as a tuple. Quite nice.
Fuzion's approach to human readable strings
The main idea is to provide an abstract string class that can be converted into a list or stream of bytes. Strings then do not need to be physically present in memory: concatenating two strings means creating a new list (stream) that chains the two existing lists (streams).
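The idea of a string as a byte stream whose concatenation merely chains two streams can be sketched in Python. This is only an illustration under stated assumptions, not Fuzion's actual design: the class name LazyString and its methods of and __radd__ are hypothetical, and "stringable" is approximated by anything Python's str() accepts.

```python
from itertools import chain


class LazyString:
    """Hypothetical abstract string: a factory for a fresh stream of UTF-8 bytes."""

    def __init__(self, make_bytes):
        # make_bytes is a zero-argument callable returning an iterable of ints.
        self._make_bytes = make_bytes

    @classmethod
    def of(cls, value):
        # Assumption: anything str() can render counts as "stringable".
        if isinstance(value, LazyString):
            return value
        return cls(lambda: str(value).encode("utf-8"))

    def __iter__(self):
        return iter(self._make_bytes())

    def __add__(self, other):
        right = LazyString.of(other)
        # Concatenation only records both operands; no bytes are copied yet.
        return LazyString(lambda: chain(self, right))

    def __radd__(self, other):
        return LazyString.of(other) + self

    def __str__(self):
        # Bytes are produced only when the string is actually consumed.
        return bytes(self).decode("utf-8")


x, y = 3, 4
s = LazyString.of("x is ") + x + ", y is " + y + "!"
print(s)  # x is 3, y is 4!
```

In this sketch the bytes of the result exist only while it is iterated, for example when it is printed.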
As in Java, infix + can be used as a standard way to concatenate strings with other values that are stringable, an abstract feature providing as_string:
out.println("x is "+x+", x+y is "+(x+y)+"!");
Support for {x} notation within strings can be reflected in the grammar by introducing tokens t_lstring, t_rstring and t_mstring for strings that end and/or start with braces. Then, the string in
out.println("x is {x}, x+y is {x+y}!");
would be split by the lexer into three different string tokens, a t_lstring "x is ", a t_mstring ", x+y is " and a t_rstring "!", with all the tokens of the expressions x and x+y in between. The parser could then be extended to support expressions of the form
string -> t_string
| t_lstring { expr t_mstring }* expr t_rstring
and convert them into an AST that is equal to the code using normal strings and infix + explicitly.
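The splitting and the rewrite into explicit infix + can be sketched as follows. This is a simplified illustration, not the Fuzion lexer: the function names split_brace_string and desugar are hypothetical, embedded expressions are kept as raw text instead of being lexed into tokens, and nested braces and escape sequences are not handled.

```python
def split_brace_string(literal):
    """Split the body of a string literal into (kind, text) pairs.

    The kind 'expr' stands in for the tokens of an embedded expression.
    """
    tokens = []
    pos = 0
    first = True
    while True:
        open_at = literal.find("{", pos)
        if open_at < 0:
            # No further brace: a plain string, or the closing part.
            kind = "t_string" if first else "t_rstring"
            tokens.append((kind, literal[pos:]))
            return tokens
        kind = "t_lstring" if first else "t_mstring"
        tokens.append((kind, literal[pos:open_at]))
        close_at = literal.index("}", open_at)
        tokens.append(("expr", literal[open_at + 1:close_at]))
        pos = close_at + 1
        first = False


def desugar(tokens):
    """Rewrite the token list into the equivalent code using infix +."""
    parts = []
    for kind, text in tokens:
        parts.append("(" + text + ")" if kind == "expr" else '"' + text + '"')
    return "+".join(parts)


tokens = split_brace_string("x is {x}, x+y is {x+y}!")
print(tokens)
print(desugar(tokens))  # "x is "+(x)+", x+y is "+(x+y)+"!"
```

The token sequence produced matches the rule t_lstring { expr t_mstring }* expr t_rstring, and a literal without braces yields a single t_string.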
A bit more tricky is the handling of $ as in
out.println("x is $x, y is $y!");
The lexer could treat $ as a string termination character like ", convert the $x into a special identifier token t_sident and parse the remainder as if a new string was started with ". The grammar would then become
string -> ( t_string
| t_lstring { expr t_mstring }* expr t_rstring
) { t_sident string }
The big advantage: This whole string magic would be handled mostly by the lexer and in part by the parser. The problem is solved once the AST has been built.
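A minimal sketch of the $ handling, restricted to plain strings without braces: each $identifier ends the current string token, becomes a t_sident token, and the rest is lexed as a new string. The function name split_dollar_string and the identifier pattern (a letter or underscore followed by word characters) are assumptions made for illustration.

```python
import re

# Assumption: an identifier is a letter or underscore followed by word characters.
_SIDENT = re.compile(r"\$([A-Za-z_]\w*)")


def split_dollar_string(literal):
    """Split a string body into alternating t_string and t_sident tokens."""
    tokens = []
    pos = 0
    for match in _SIDENT.finditer(literal):
        # $ terminates the current string like a closing quote would.
        tokens.append(("t_string", literal[pos:match.start()]))
        tokens.append(("t_sident", match.group(1)))
        # The remainder is treated as if a new string had been started.
        pos = match.end()
    tokens.append(("t_string", literal[pos:]))
    return tokens


print(split_dollar_string("x is $x, y is $y!"))
```

The result alternates string and identifier tokens, which is the shape described by the rule string -> t_string { t_sident string }.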
Debugging output
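Python (3.8 and later) permits handy debug strings using f'{x * y = }' as shorthand for f'x * y = {x * y!r}': the expression text is echoed verbatim, including its whitespace, followed by the repr of its value. NYI: This would be convenient for debugging, so it might be considered for Fuzion as well.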
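A short Python illustration of this debug notation (requires Python 3.8 or later). The variable names are arbitrary; the point is that whitespace around = is reproduced exactly and that values are rendered with repr, which is why the string value appears quoted.

```python
x, y = 3, 4

compact = f"{x * y =}"   # no space after '=' in the source, none in the output
spaced = f"{x * y = }"   # whitespace around '=' is echoed verbatim
name = "Fuzion"
quoted = f"{name = }"    # values are shown using repr()

print(compact)  # x * y =12
print(spaced)   # x * y = 12
print(quoted)   # name = 'Fuzion'
```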
Formatting
Python permits f'{num:.2f}' to use a type-specific format string, in this case .2f for two decimals, when printing a value. The same could be achieved in Fuzion using an operator infix $(string format). Then we could have a Fuzion string like "{num$".2f"}" to achieve the same effect. NYI: is this useful enough to be supported?
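In Python, the format specification inside an f-string is handed to the value's own formatting logic, which is also what the built-in format(value, spec) does. A binary formatting operator could therefore be modeled as a plain two-argument function. The name infix_dollar below is hypothetical and only stands in for the proposed operator; it says nothing about how Fuzion would implement it.

```python
def infix_dollar(value, fmt):
    # Hypothetical stand-in for the proposed `value $ fmt` operator:
    # delegates to the value's type-specific formatting.
    return format(value, fmt)


num = 3.14159
via_fstring = f"{num:.2f}"
via_operator = infix_dollar(num, ".2f")

print(via_fstring, via_operator)  # 3.14 3.14
print(infix_dollar(255, "x"))     # ff
```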
Text Blocks
Text blocks were introduced in Java 15 (JEP 378: Text Blocks). They use """ to start and end a multi-line string constant.