TADS Tip No. 2

The Subtleties of Whitespace

By: Mark J. Musante

Copyright 1998 by Mark J. Musante. All rights reserved.

TADS has a neat built-in feature: collapsing whitespace. Whitespace is a generic term for the ASCII characters which separate words. The ones that TADS uses are the space, the newline, the tab and the line break.

The space character is simply " "; the newline is "\n"; the tab is "\t"; and the line break, which produces a blank line, is "\b". For the first two characters, the space and the newline, TADS compresses them when they are output. So a line which looks like:

	"This        is \n\n  a test!"
will display on the screen as:
This is
a test!

For the tab and the line break, on the other hand, TADS allows you to string as many of these together as you like, and no compression will occur. So if you write something like:

	"Now\t\tis the time\b\bfor all good\t\t\tmen..."
it will show up like:
Now        is the time

for all good       men...

Another effect that TADS has on your output is to automatically put two spaces after a "." or ":" character. But it will do that ONLY IF there is already at least one space there. For example:

	"One:two\nthree: four"
will end up looking like:
One:two
three:  four
Notice the extra space inserted after the ":" in "three:", while "One:two", which had no space after its colon, is left alone. TADS will behave identically if you replace the ":"s in the above example with "."s.

What if you want just one space after the colon? There's a special escape in TADS just for that purpose. When you insert a tab or a newline, you say "\t" or "\n", using the "\" character as an escape to tell TADS not to print out a "t" or an "n", but to print the tab or newline as appropriate.

The escape for a special space character is "\ ". That's the backslash followed by a space. Now we change our example to look like:

	"One:two\nthree:\ four"
Thus we have escaped the space, forcing it to be ignored by the TADS whitespace formatter. The result will look like:
One:two
three: four

Using this method, you can force TADS to print as many space characters as you want. Simply escape each one and have your fill. But what about escaping newlines? You can't say "\n\n\n" and expect to get three of them, since TADS collapses them into a single newline. And TADS doesn't let you escape the already-escaped "\n" character.

The answer, of course, is "\b". Using "\b" guarantees a blank line for you. The sequence "\n\b" is equivalent to "\b", so saying "this is a test\n\b" works the same as "this is a test\b" -- each one will write the "this is a test" line followed by a single blank line. Multiple "\b" sequences do not get collapsed, so, as in the second example code above, "hello\b\bworld" will print "hello" followed by two blank lines, followed by "world".
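If you'd like to watch these rules in action, a small scratch function along the following lines should do. This is only a sketch: the name whitespaceDemo is made up for illustration, and it assumes you call it from some place in your game that is allowed to print.

```tads
// Hypothetical scratch function; not part of adv.t or std.t.
whitespaceDemo: function
{
	"collapsed:   three   spaces\n\n\n";	// runs of spaces and newlines are compressed
	"kept:\ \ \ three escaped spaces\n";	// each "\ " is printed as a real space
	"two blank lines follow\b\b";		// "\b" sequences are never collapsed
	"done. ";
}
```

Compare what appears on the screen with what is in the source, and the difference between the collapsible characters and the escaped ones should be easy to see.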

The next bit of trickery involving whitespace that I'd like to cover here is in handling daemons. TADS can be told to run a method or a function each turn through the use of its notify and setdaemon built-in functions. There is also the setfuse function, which causes something to be run once at some specified time in the future.
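As a rough sketch of how these three built-ins are called, consider the fragment below. Every name in it (startTimers, weatherDaemon, alarmFuse, wallClock and its tick method) is invented for illustration, and it assumes the usual adv.t definitions such as fixeditem are included.

```tads
// Forward declarations, so the functions can be named before they are defined.
weatherDaemon: function;
alarmFuse: function;

// Hypothetical helper; call it once, for example from your init function.
startTimers: function
{
	setdaemon( weatherDaemon, nil );	// call weatherDaemon( nil ) every turn
	setfuse( alarmFuse, 5, nil );		// call alarmFuse( nil ) once, five turns from now
	notify( wallClock, &tick, 0 );		// call wallClock.tick every turn (0 means "every turn")
}

weatherDaemon: function( parm )
{
	"The wind howls outside. ";
}

alarmFuse: function( parm )
{
	"An alarm goes off somewhere. ";
}

wallClock: fixeditem
	noun = 'clock'
	sdesc = "clock"
	location = certain_room
	tick = { "The clock ticks. "; }
;
```

If startTimers is called from code that appears earlier in the source than its definition, it needs a forward declaration of its own.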

What does that have to do with whitespace? Well, typically a daemon will display some sort of output each turn; fuses, too, will typically tell the user something interesting has happened once the fuse "burns down". This could actually end up interfering with the output of regular user actions.

Let's create an example. Say we have a daemon which prints "You hear a faint dripping noise coming from the east." for every turn that the user is in a certain room. Let's also put a light in the room that looks like this:

light: fixeditem
	noun = 'light'
	sdesc = "light"
	ldesc = "The light is affixed to the wall. It is burnt out."
	location = certain_room
;
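For reference, the daemon itself might be written something like the sketch below. The name dripDaemon is made up, and the code assumes that the player character object is called Me, as it is in the standard std.t.

```tads
dripDaemon: function;	// forward declaration

// Registered once, for example from init, with:  setdaemon( dripDaemon, nil );
dripDaemon: function( parm )
{
	// Me is assumed to be the player character object.
	// In TADS' default operator mode, "=" is the equality test.
	if ( Me.location = certain_room )
		"You hear a faint dripping noise coming from the east. ";
}
```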

So far so good. But let's take a look at a sample transcript:

Time passes...
You hear a faint dripping noise coming from the east.

>x light
The light is affixed to the wall.  It is burnt out.You hear a faint dripping
noise coming from the east.


Oops! What happened? Why are "burnt out." and "You hear" touching one another? Before answering that, look at the gap between the first two sentences: there are two spaces, but the code only specified one. Remember the discussion above: the characters ":" and "." are automatically followed by two spaces if there already is at least one space there. So you can see that TADS followed that rule: there was one space (we could have put in a dozen and ended up with the same result) and TADS reformatted it to have two spaces.

But notice that the last character of the ldesc for the light object is a "."; so far so good, right? But the FIRST character of the daemon's output is a "Y". So there is no space between the "." and the "Y" and, therefore, TADS assumes you don't want any spaces there at all (it doesn't insert spaces where there are none; that would be A Bad Thing To Do).

The solution is to always end text strings with a space. Change the ldesc of the light object to be:

	ldesc = "The light is affixed to the wall. It is burnt out. "

The change is subtle, but it will result in the daemon's text getting two spaces after the "." of the light's ldesc. Which is exactly what we want.

There's nothing wrong with having too many spaces in your output -- TADS will collapse them into 1 or 2 depending on the situation. So as long as you follow the rule of always putting a space at the end of every text string, your game's output will be formatted nicely and legibly.

The last bit of subtlety involving whitespace is the use of the "<<" and ">>" construct within strings. Most of the time this works just like the say built-in function. For example, if count is a variable which represents an integer, the following two lines produce identical output:

	"The count is "; say( count ); ". ";
	"The count is <<count>>. ";

The problem comes in when the line of text that you're trying to write is so long that it wraps around within your source code to the next line. For example:

	ldesc = "This is a really long line of text which serves to illustrate
		an interesting point. "

Notice how the ldesc text is indented within the code. There is one tab at the beginning of the first line and two tabs at the beginning of the next. Due to TADS' whitespace collapsing algorithm, the user doesn't see the two tabs that are actually in the middle of the string. Instead, TADS prints out a single space.

But what if we had a "<<>>" construct in there? As long as it's embedded in the middle of a line, or it evaluates to a number, we have no problems:

couch: fixeditem
	// ...
	count = 69105
	ldesc = "On this couch you see <<self.count>> marbles, each of which
		glistens in the light. "
	// ...

Everything comes out as you'd expect. But what if we tried to put a string in there? How about this:

marbles: fixeditem
	// ...
	sdesc = "marbles"
	verDoTake( actor ) = {
		"You try to pick them up, but carrying that many
		<<self.sdesc>> at once proves to be impossible. ";
	}
	// ...

In this case, TADS doesn't know if the self.sdesc is supposed to be part of the word "many" or not. The output ends up looking like:

You try to pick them up, but carrying that manymarbles at once proves to be

And the "many" and "marbles" end up getting smashed together. The answer is to either put an escaped-space (the "\ " sequence) before the <<self.sdesc>>, or rearrange the line so that the sdesc isn't the first (or last) item.

In other words, either:

	verDoTake( actor ) = {
		"You try to pick them up, but carrying that many
		\ <<self.sdesc>> at once proves to be impossible. ";
	}

or:

	verDoTake( actor ) = {
		"You try to pick them up, but carrying that
		many <<self.sdesc>> at once proves to be impossible. ";
	}

will work.

If you like, you can take a look at some sample source code which may help to illustrate what's going on.

And, as always, feel free to email me at olorin@std.com if you have any questions or desire clarification.