Chapter 2. Tcl—Introduction And Overview

Expect does not have its own special-purpose language. Expect uses Tcl, a popular language for embedding in applications. Tcl provides lots of basic commands such as if/then/else, while, and set. Expect extends the language with commands such as expect and interact.

This chapter is an introduction and overview of Tcl. While not covering all of Tcl, this chapter does provide everything that the rest of the book depends on, and this is enough to write virtually any Expect script. Even if you already know Tcl, you may find it helpful to read this chapter. In this chapter, I will emphasize things about Tcl that you may not have thought much about before.

You probably want to get on with using Expect, and I can understand the urge to skip this chapter in the hopes of learning as little Tcl as possible so you can put Expect to work for you now. Please be patient and it will all fit together that much more easily.

If you do skip this chapter and you find yourself wondering about points in the other chapters, turn back to this chapter and read it.

A few concepts will not be covered here but will be explained as they are encountered for the first time in other chapters. The index can help you locate where each command is first defined and used.

I will occasionally mention when a particular Tcl command or feature is similar to C. It is not necessary that you know C in order to use Tcl, but if you do know it, such statements are clues that you can rely on what you already know from that language.

Everything Is A String

The types of variables are not declared in Tcl. There is no need since there is only one type: string. Every value is a string. Numbers are strings. Even commands and variables are strings! The following commands set the variable name to the string value "Don", the variable word to the value "foobar", and the variable pi to "3.14159".

set name Don
set word foobar
set pi 3.14159

Variable names, values, and commands are case sensitive. So the variable name is different than Name.

To access a variable’s value, prefix the variable name with a dollar sign ($). The following command sets the variable phrase to foobar by retrieving it from the variable word.

set phrase $word

Variable substitutions can occur anywhere in a command, not just at the beginning of an argument. The following command sets the variable phrase2 to the string "word=foobar".

set phrase2 word=$word

You can insert a literal dollar sign by prefixing it with a backslash. The following command sets the variable money to the value "$1000".

set money \$1000

The backslash is also a useful way to embed other special characters in strings. For example, "\t" is a tab and "\b" is a backspace. Most of the Standard C backslash conventions are supported including \ followed by one to three octal digits and \x followed by any number of hex digits.[7] I will mention more of these escapes later.

# stick control-Z in a variable
set controlZ \032
# define control-C
set controlC \x03
# define string with embedded control-Z and tab
set oddword foo\032bar\tgorp
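As a quick check of these escapes, the following sketch measures the strings they produce. It is only an illustration and assumes the string length command, which is not introduced until later; the variable names are reused from the examples above.

```tcl
# Each backslash sequence becomes a single character in the stored value.
set money \$1000
set oddword foo\032bar\tgorp

puts $money                     ;# prints $1000
puts [string length $money]     ;# prints 5: the backslash itself is not stored
puts [string length $oddword]   ;# prints 12: control-Z and tab are one character each
```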

A command beginning with "#" is a comment. It extends to the end of the line. You can think of "#" as a command whose arguments are discarded.

Multiple commands can be strung together if they are separated by a semicolon. A literal semicolon can be embedded by prefacing it with a backslash.

set word1 foo; set word2 bar   ;# two commands
set word3 foo\;bar            ;# one command

The ";#" sequence above is a common way of tacking comments on the end of a line. The ";" ends the previous command and the "#" starts the comment. Writing the ";#" together avoids the possibility of having your comment unintentionally accepted as additional arguments because of a forgotten semicolon.

Commands are normally terminated at the end of a line, but a backslash at the end of a line allows multi-line commands. The backslash-newline sequence and any following whitespace behaves as if it were a single space. Later, I will show other ways of writing multi-line commands.

set word \
       really-long-string-which-does-not-quite-fit-on-previous-line

Quoting Conventions

Tcl separates arguments by whitespace (blanks, tabs, etc.). You can pass space characters by double quoting an argument. The quotes are not part of the argument; they just serve to keep it together. Tcl is similar to the shell in this respect.

set string1 "Hello world"

Double quotes prevent “;” from breaking things up.

set string2 "A semicolon ; is in here"

Keeping an argument together is all that double quotes do. Character sequences such as $, \t, and \b still behave as before.

set name "Sam"
set age 17
set word2 "My name is $name; my age is $age;"

After execution of these three commands, word2 is left with the value "My name is Sam; my age is 17;".

Notice that in the first command Sam was quoted, while in the second command 17 was not quoted, even though neither contained blanks. When arguments do not have blanks, you do not have to quote them but I often do anyway—they are easier to read. However, I do not bother to quote numbers because quoted numbers simply look funny. Anyway, numbers never contain whitespace.

You can actually have whitespace in command names and variable names in which case you need to quote them too. I will show how to do this later, but I recommend you avoid it if possible.

Return Values

All commands return values. For example, the pid command returns the process id of the current process. To evaluate a command and use its return value as an argument or inside an argument, embed the command in brackets. The bracketed command is replaced by its return value.

set p "My pid is [pid]."

The set command returns its second argument. The following command sets b and a to 0.

set b "[set a 0]"

When it comes to deciding what are arguments, brackets are a special case. Tcl groups everything between matching brackets, so it is not necessary to quote arguments that only have spaces inside brackets. The following are all legal.

set b [set a 0]
set b "[set a 0]"
set b "[set a 0]hello world"
set b [set a 0]hello

After execution of this last command, a is set to "0" and b is set to "0hello".

With only one argument, set returns the value of the variable named by that argument.

set c [set b]

Calling set with one argument is similar to using the dollar sign. Indeed, the previous command can be rewritten as "set c $b". However they are not always interchangeable. Consider the two commands:

set d $phello
set d [set p]hello

The first sets d to the value of the variable phello. The second sets d to the value of the variable p concatenated with the string "hello".

The $ syntax is shorter but does not automatically mark where the variable name ends. In the rare cases where the variable name runs right into more alphanumeric characters, the one-argument set command in brackets is useful.
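Tcl also lets you mark the end of a variable name by wrapping the name in braces after the dollar sign. The sketch below compares that form with the one-argument set; the value 12389 is made up for illustration.

```tcl
set p 12389
set a [set p]hello     ;# the one-argument set ends the variable name explicitly
set b ${p}hello        ;# braces after the dollar sign do the same job
puts $a                ;# prints 12389hello
puts $b                ;# prints 12389hello
```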

The one-argument set command is also useful when entering commands interactively. For example, here is what it looks like when I type a command to the Tcl interpreter:[8]

tclsh> set p
12389

After entering "set p", the return value was printed. When using Tcl interactively, you will always see the return value printed. When executing commands from a script, the return values are discarded. Inside scripts, use the puts command to print to the standard output.

Puts

You can print values out with the puts command. In its simplest form, it takes a single string argument and prints it followed by a newline.

puts "Hello world!"
puts "The value of name is $name"

Expressions

Strings can be treated as expressions having mathematical values. For example, when the string "1+1" is treated as an expression, it has the value 2. Tcl does not automatically treat strings as expressions, but individual commands can. I will demonstrate this in a moment. For now, just remember that expressions are not complete commands in themselves.

Expressions are quite similar to those in the C language. The following are all valid expressions:

1 + 5
1+5
1+5.5
1e10+5.5
(5%3)*sin(4)
1 <= 2
(1 <= 2) || (2 != 3)

All of the usual mathematical operators are available including +, -, *, /, and % (modulo). Many functions exist such as sin, cos, log, and sqrt. Boolean operators include || (or), && (and), and ! (not). The usual comparison operators are available (<= (less than or equal), == (equal), != (not equal), etc.). They return 0 if the expression is false and 1 if it is true.

Whitespace may be used freely to enhance readability. A number of numeric forms are supported including scientific notation as well as octal (any number with a leading 0) and hexadecimal (any number with a leading 0x). Functions such as floor, ceil, and round convert from floating-point to integer notation.

Precedence and associativity follow C rules closely. For instance, the expression "1-2*3" is interpreted as "1-(2*3)" because multiplication is of higher precedence than subtraction. All binary operators at the same level of precedence are left-associative. This means, for instance, that the expression "1-2-3" is interpreted as "(1-2)-3". Since Tcl expressions rarely become complex, I will omit a lengthy discussion of the numerous levels of precedence, and instead note that you can always use parentheses to override a particular precedence or associativity.[9] See the Tcl reference material for the complete list of operators and their precedences.
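These rules can be checked with the expr command, which is described later in this chapter. The following is a small sketch of the cases just mentioned, not a test of the whole operator table.

```tcl
set r1 [expr 1-2*3]      ;# multiplication binds tighter than subtraction
set r2 [expr (1-2)*3]    ;# parentheses override precedence
set r3 [expr 1-2-3]      ;# left-associative: (1-2)-3
set r4 [expr 1-(2-3)]    ;# parentheses override associativity
puts "$r1 $r2 $r3 $r4"   ;# prints -5 -3 -4 2
```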

Variable values may also be used in expressions.

1 + $age
$argc < 10

Return values can be used in expressions using the bracket notation. For example, an expression to compare the current process id to 0 is:

[pid] == 0

If the process id is 0, the expression equals 1; otherwise it equals 0.

Expressions are not commands by themselves. Rather, certain commands treat their arguments as expressions, evaluating them in the process of command execution. For example, the while command treats its first argument as an expression. I will describe while and similar commands later.

The expr command takes any number of arguments and evaluates them as a single expression and returns the result.

set x "The answer is 1 + 3"
set y "The answer is [1 + 3]"
set z "The answer is [expr 1 + 3]"

After evaluation of the first command, x has the value "The answer is 1 + 3". The last command leaves z with the value "The answer is 4". The middle command causes an error. "1 + 3" is not a valid command because 1 is not a command.

Here is a more complicated-looking command (legal this time). It computes a result based on the current process id value and the value of the variable mod.

set x [expr (5 % $mod) + ( 17 == [pid])]

Braces—Deferring Evaluation

Braces are similar to double quotes. Both function as a grouping mechanism; however, braces defer any evaluation of brackets, dollar signs, and backslash sequences. In fact, braces defer everything.

set var1 "a$b[set c]\r"
set var2 {a$b[set c]\r}

After evaluation of these two commands, var1 contains an "a" followed by the values of b and c, terminated by a return character. The variable var2 contains the characters "a", "$", "b", "[", "s", "e", "t", " ", "c", "]", "\", and "r".

As with double quotes, the braces are not part of the argument they enclose. They just serve to group and defer. The primary use of braces is in writing control commands such as while loops, for loops, and procedures. These commands need to see the strings without having $ substitutions and bracket evaluations made.
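The difference between the two grouping mechanisms is easy to see by printing the results. This is a minimal sketch; the values given to b and c are arbitrary, and string length is used only to count characters.

```tcl
set b 1
set c 2
set var1 "a$b[set c]"    ;# substitutions happen inside double quotes
set var2 {a$b[set c]}    ;# nothing is substituted inside braces
puts $var1                      ;# prints a12
puts $var2                      ;# prints a$b[set c]
puts [string length $var2]      ;# prints 10
```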

Control Structures

Control structures change the flow of control. Many of the control structures in Tcl are patterned directly after their C equivalents. Tcl gives you the power to write your own control structures, so if you do not like those of C, you may yet find happiness. I will not describe how to do it, but it is surprisingly easy. (The hard part is designing something that makes sense.)

The while Command

The while command loops until an expression is false. It looks very similar to a while in the C language. The following while loop computes the factorial of the number stored in the variable count.

set total 1

while {$count > 0} {
    set total [expr $total * $count]
    set count [expr $count-1]
}

The body of the loop is composed of the two set commands between the braces. The body is executed as long as $count is greater than 0.

Taking a step back, the while command has two arguments. The first is the controlling expression. The second is the body. Notice that both arguments are enclosed in braces. That means that no $ substitutions or bracket evaluations are made. For instance, this while command literally gets the string "$count > 0" as its first argument. Similarly, for the body. So how does anything happen?

The answer is that the while command itself evaluates the expression. If true (nonzero), the while command evaluates the body. The while command then re-evaluates the expression. As long as the expression keeps re-evaluating to a nonzero value, the while command keeps re-evaluating the body.

It is useful to compare this with the set command. The set command does not do any evaluation of its second argument. Consider this command:

set count [expr $count-1]

The [expr ...] part is evaluated before the set command even begins. If count is 7, the set command sees an argument of 6. In contrast, the while command sees the argument "$count > 0". It would not make any sense to evaluate that expression before the while command, since it has to change every time through the loop.

Using braces this way is fundamental toward the correct use of Tcl’s control structures. You will see that all the other ones follow easily from this.

The incr Command

Many loops use a counter of some sort. Incrementing or decrementing a counter is so common that there is a command to simplify it. It is called incr. It modifies the variable given as its first argument. With no other argument incr adds one, otherwise incr adds the remaining argument.

The two commands are equivalent:

set count [expr $count-1]
incr count -1

The for Command

The for command is similar to the while command. The for command has a controlling expression and a body. However, before the expression is a “start” argument, and after the expression is a “next” argument.

for start expression next {
    # commands here make up
    # the body of the for
}

Both the start and next arguments are commands. The start argument is executed before the first evaluation of the controlling expression. The next argument is evaluated immediately after the body of the loop.

The code shown earlier to compute a factorial can be simplified using the incr command and a for loop as follows:

for {set total 1} {$count > 0} {incr count -1} {
    set total [expr $total * $count]
}

Either the start or the next argument can be empty, but you have to leave a placeholder. For example, you could express an infinite loop as:

for {} {1} {} {
    ... some command ...
}

The if Command

In its simplest form, the if command takes a controlling expression and a body to execute if the expression is nonzero. It looks a lot like a while command, but the body is executed at most once.

if {$count < 0} {
    set total 1
}

If present, an optional else fragment is executed only if the expression evaluates to zero. Here is an example:

if {$count < 5} {
    puts "count is less than five"
} else {
    puts "count is not less than five"
}

It is also possible to add more conditions using elseif arguments. Any number of elseif arguments may be used.

if {$count < 0} {
    puts "count is less than zero"
} elseif {$count > 0} {
    puts "count is greater than zero"
} else {
    puts "count is equal to zero"
}

In the while and for commands, the controlling expressions are written with braces to defer their evaluation. Their evaluation is deferred because they need to be re-evaluated repeatedly. The expression in an if is not re-evaluated and so it does not need to be deferred. The braces are still useful to group the arguments of the expression together but if the grouping behavior is not needed, then the braces can be omitted entirely. For example, the following two commands are equivalent:

if $a {incr a}
if {$a} {incr a}

The following two commands are not equivalent.

while $a {incr a}
while {$a} {incr a}

It does not hurt to write braces around all expressions; however, if you frequently read other people’s code, you must get used to seeing the braces omitted in some expressions.
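The difference between the two while commands can be observed directly. The sketch below starts a at 3 and counts down instead of up, and it adds a safety stop using the break command (described later in this chapter) so that the unbraced loop terminates; the counters n and m are hypothetical names.

```tcl
# Unbraced: $a is substituted once, so while receives the constant "3".
set a 3
set n 0
while $a {
    incr a -1
    incr n
    if {$n >= 10} break    ;# safety stop; the expression "3" never becomes false
}
puts "unbraced: ran $n times, a is $a"    ;# prints unbraced: ran 10 times, a is -7

# Braced: while re-evaluates $a before every iteration.
set a 3
set m 0
while {$a} {
    incr a -1
    incr m
}
puts "braced: ran $m times, a is $a"      ;# prints braced: ran 3 times, a is 0
```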

The switch Command

The switch command is similar to the if command but is more specialized. Instead of evaluating a mathematical expression, the switch command compares a string to a set of patterns. Each pattern is also associated with a body of Tcl commands. The first pattern that matches has its associated body evaluated.

Here is a fragment that sets the variable type depending on the value of count. For example, if count is the string big, then type is set to array. If count matches none of the choices, the special default body is used.

switch -- $count \
  1 {
    set type byte
} 2 {
    set type word
} big {
    set type array
} default {
    puts "$count is not a valid type"
}

By default, the string must match a pattern exactly. With the -glob flag, shell-style pattern matching is used. For example "?" matches any single character and "*" matches any string. I will describe the different types of pattern matching in more detail later.

The "--" immediately after "switch" is a placeholder for several flags that can modify the way the switch command works. The flags are rarely used so I will not describe them; nevertheless it is still a good idea to use the "--". This will prevent your string from inadvertently matching a flag.
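Here is a small sketch of shell-style patterns in a switch. It names the matching style explicitly with the -glob flag so that it does not depend on which style is the default, and it groups all patterns and bodies in a single braced argument, an alternative form that avoids the backslash. The variable names and file name are hypothetical.

```tcl
set filename main.c      ;# hypothetical input
switch -glob -- $filename {
    *.c     { set kind "C source" }
    *.h     { set kind "C header" }
    default { set kind unknown }
}
puts $kind               ;# prints C source
```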

Continuation Lines

By default, commands do not continue beyond the end of a line. However, there are several exceptions to this. One is that a backslash at the end of a line continues the command. I used this in the previous example where the first line of the switch had a backslash to continue the command. Without it, the command would have ended after $count and the 1 on the next line would have mistakenly been interpreted as another command.

Another exception is that open braces cause commands to continue across lines. This is precisely how I have written all of the other multi-line examples so far. Fortunately, this style looks a lot like another common style—the C language. Even if you are not used to C, it will be helpful if you adopt the C formatting style—just leave an open brace at the end of the current line and you can omit the backslashes.

Consider the following three if commands:

if {$count < 0} {
    puts "count is less than zero"
}

if {$count < 0} \
    { puts "count is less than zero"
}

if {$count < 0}
{
    puts "count is less than zero"
}

The first two examples are correct. The third one is incorrect—the if command is missing a body.

Open braces nest, so this guideline works if you write braced commands inside of braces. Later in this chapter and in the next, I will return to the subject of braces and how to use them effectively.

Double quotes and brackets also cause commands to continue across lines. As before, a \n or literal newline is retained and a backslash-newline-whitespace sequence is replaced by a single space. Compare the following two commands:

set oneline "hello\
    world"
set twolines "hello
    world"

After execution, oneline is set to "hello world", while twolines is set to "hello<newline><space><space><space><space>world".

The break And continue Commands

The break and continue commands change the normal flow inside control structure commands such as for and while.

The break command causes the current loop command to return so that the next command after the loop can run. For example, the following would loop infinitely except for the break command in the middle. If a ever equals three, the break command will execute and the while command will return.

set a 0
while {1} {
    incr a
    if {$a == 3} break
    puts "hello"
}

The continue command drives control back to the top of the loop so that no more commands are executed during the current iteration. In the following example, the continue is executed whenever the value of a modulo three is not equal to zero. This has the effect of printing "hello" three times for every "there" printed.

set a 0
while {1} {
    incr a
    puts "hello"
    if {$a%3 != 0} continue
    puts "there"
}

The proc And return Commands

It is possible to create your own commands using the proc command. Such commands are called procedures but they behave the same as if they were built-in commands.

The proc command takes three arguments. The first argument is a command name. The second argument is a list of variables, each initialized to an argument of the procedure when it is called. (The variables are occasionally called formal parameters or just parameters to distinguish them from the actual arguments to which they are set.) The third argument of proc is a body of code.

The following command defines a procedure called fib that computes the nth Fibonacci number given any two starting numbers. Fibonacci numbers are sequences of numbers where each new number in the sequence is generated by adding the most recent two together. The starting two numbers are the first two arguments. The last argument defines which element of the sequence to return.

proc fib {pen ult n} {
    for {set i 0} {$i<$n} {incr i} {
        set new [expr $ult+$pen]
        set pen $ult                        ;# new penultimate value
        set ult $new                        ;# new ultimate value
    }
    return $pen
}

The return command in the last line takes its argument and makes the procedure fib itself return with that value. Once defined, fib can then be used as a command. For example, it could be called as:

set m [fib 0 1 9]

Although fib always returns a number, any string can be returned using return. A lot of procedures do not have a need to return anything—they just need to return. In this case, it is not necessary to provide return with an argument. The return command itself may be omitted if it would otherwise be the last command in a procedure. Then, the procedure returns with whatever is returned by the last command executed within the procedure.

You can force a single procedure to return by using return. In contrast, use the exit command to make the script (i.e., process) end and return to the shell (or whatever invoked the original script). The exit command works even inside of a procedure. The exit command can only return a number because that is all that UNIX permits. For example:

exit 1

Procedures can be called only after they are defined. Procedures share a single namespace with no name scoping. That means that once a procedure is defined, it can be called from any procedure, including itself. Here is another version of fib. This one is recursive—it calls itself.

proc fib {pen ult n} {
    if {$n == 0} {
        return $pen
    }
    return [fib $ult [expr $pen+$ult] [expr $n-1]]
}
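To see the recursive version at work, the following sketch repeats the definition and prints the first ten elements of the sequence that starts with 0 and 1. The variable series and the append command, which adds text to the end of a variable, are not part of the original example.

```tcl
proc fib {pen ult n} {
    if {$n == 0} {
        return $pen
    }
    return [fib $ult [expr {$pen+$ult}] [expr {$n-1}]]
}

set series ""
for {set n 0} {$n < 10} {incr n} {
    append series [fib 0 1 $n] " "    ;# collect each element followed by a space
}
puts $series    ;# prints 0 1 1 2 3 5 8 13 21 34 (plus a trailing space)
```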

In contrast, variables are usually local to the current procedure. In the example above, the variables ult, pen, and n are not visible to any procedures that fib calls, such as expr. They are not even visible to other invocations of the fib procedure. The fib procedure calls expr, passing it the value of ult but not the string "ult". The expr command cannot modify the variable ult. Computer scientists call this pass by value.

It is possible for a procedure to change a variable in the caller’s scope. The set and incr commands handle their first argument this way. This technique is called pass by reference, and I will explain how to write procedures that do this on page 56. Another technique to communicate values between commands is to use global variables. Using a lot of global variables can be confusing, but since most scripts are short, global variables are a very popular technique.

The global command identifies variables to consider global. This means that references to those variables inside the procedure are the same as references to those variables outside all the procedures.

For example, you could define the constant pi as a global variable outside any procedure. A procedure that needed the value would then access it using the global command.

proc area_of_circle {radius} {
    global pi

    return [expr $pi*$radius*$radius]
}

You can list additional variable names as arguments to a global command, and you can have multiple global commands in a procedure.

global pi e golden_ratio

The return command may be omitted if it is the last command in a procedure. In that case, the procedure returns the last value computed. For example, the previous procedure could be simplified as follows:

proc area_of_circle {radius} {
    global pi

    expr $pi*$radius*$radius
}
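The effect of the global command is easiest to see by leaving it out. In this sketch, the procedure names bump_local and bump_global and the variable counter are made up for illustration.

```tcl
set counter 0

proc bump_local {} {
    set counter 100      ;# creates a local variable; the global is untouched
}

proc bump_global {} {
    global counter
    incr counter         ;# modifies the global variable
}

bump_local
bump_global
puts $counter            ;# prints 1
```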

The source Command

Procedures and variables that are used in numerous scripts can be stored in other files, allowing them to be conveniently shared. A file of commands can be read with the source command. Its argument is the name of a file to read. Tcl understands the tilde convention from the C shell. For example, the following command reads the file definitions.tcl from your home directory:

source ~/definitions.tcl

As the file is read, the commands are executed. So if the file contains procedure definitions, the procedures will be callable after the source command returns. Tcl’s source command is similar to the C shell’s source command.

Tcl’s library facility provides a way to automatically source files as needed. It is described on page 66.

The return command can be used to make a source command return. Otherwise, source returns only after executing the last command in the file.

More On Expressions

In the while and if command examples, I enclosed the expressions in braces and said that the expressions were evaluated by the commands themselves. For example, in the while command, the expression was

$count > 0

Because the expression was wrapped in braces, evaluation of $count was deferred. During expression evaluation, a $ followed by a variable name is interpreted in just the same way it is done with arguments that are not wrapped in braces. Similarly, brackets are also interpreted the same way in both contexts. For this reason, commands like the following two have the same result. But in the first command, $count is evaluated before expr executes, while in the second command, expr itself evaluates $count.

expr $count > 0
expr {$count > 0}
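The two forms agree when the variable holds a plain number. A value that is itself an expression makes it visible who performs the substitution. This sketch uses a made-up value for count and the catch command, which returns 1 when the command given to it fails.

```tcl
set count 7
set r1 [expr $count > 0]      ;# Tcl substitutes $count before expr runs
set r2 [expr {$count > 0}]    ;# expr substitutes $count itself
puts "$r1 $r2"                ;# prints 1 1

set count "1 + 1"
set r3 [expr $count * 2]              ;# expr sees the text 1 + 1 * 2
set r4 [catch {expr {$count * 2}}]    ;# "1 + 1" is not a number, so expr fails
puts "$r3 $r4"                        ;# prints 3 1
```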

The expr command can perform some string operations. For example, quoted strings are recognized as string literals. Thus, you can say things like:

expr {$name == "Don"}

Unquoted strings cannot be used as string literals. The following fails:

expr {$name == Don}

Unquoted strings are not permitted inside expressions to prevent ambiguities in interpretation. But strings in expressions can still be tricky. In fact, I recommend avoiding expr for these implicit string operations. The reason is that expr tries to interpret strings as numbers. Only if they are not numbers, are they treated as uninterpreted strings. Consider:

tclsh> if {"0x0" == "0"} {puts equal}
equal

Strings which are not even internally representable as numbers can cause problems:

tclsh> expr {$x=="1E500"}
floating-point value too large to represent
    while executing
"expr {$x=="1E500"}"

On page 46, I will describe the "string compare" operation which is a better way of comparing arbitrary strings. However, because many people do use expr for string operations, it is important to be able to recognize and understand it in scripts.
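As a brief preview, the following sketch contrasts the numeric interpretation made by expr with string compare, which returns zero only when the two strings are identical.

```tcl
set as_numbers [expr {"0x0" == "0"}]        ;# both sides are read as the number zero
set as_strings [string compare "0x0" "0"]   ;# nonzero: the strings differ
set same       [string compare "Don" "Don"] ;# zero: the strings are identical
puts "$as_numbers $as_strings $same"        ;# prints 1 1 0
```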

Lists

In the proc command, the second argument was a list of variables.

proc fib {pen ult n} {

The parameter list is just a string containing the characters, "p", "e", "n", " ", "u", "l", "t", " ", and "n". Intuitively, the string can also be thought of as a list of three elements: pen, ult, and n. The whitespace just serves to separate the elements.

Lists are very useful, and Tcl provides many commands to manipulate them. For example, llength returns the length of a list.[10]

tclsh> llength "a b c"
3
tclsh> llength ""
0
tclsh> llength [llength "a b c"]
1

In the next few sections, I will describe more commands to manipulate lists.

Selecting Elements Of Lists

The lindex and lrange commands select elements from a list by their index. The lindex command selects a single element. The lrange command selects a set of elements. Elements are indexed starting from zero.[11]

tclsh> lindex "a b c d e" 0
a
tclsh> lindex "a b c d e" 2
c
tclsh> lrange "a b c d e" 0 2
a b c
tclsh> llength [lrange "a b c d e" 0 2]
3

You can step through the members of a list using an index and a for loop. Here is a loop to print out the elements of a list in reverse.

for {set i [expr [llength $list]-1]} {$i>=0} {incr i -1} {
    puts [lindex $list $i]
}

Iterating from front to back is much more common than the reverse. In fact, it is so common, there is a command to do it called foreach. The first argument is a variable name. Upon each iteration of the loop, the variable is set to the next element in the list, provided as the second argument. The third argument is the loop body.

For example, this fragment prints each element in list.

foreach element $list {
    puts $element
}

After execution, the variable element remains set to the last element in the list.
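Both directions of iteration can be tried on a small list. This sketch collects the elements into strings rather than printing them one per line; the variables forward and backward and the append command are additions for illustration.

```tcl
set list {a b c}

# Front to back with foreach.
set forward ""
foreach element $list {
    append forward $element
}

# Back to front with an index and a for loop.
set backward ""
for {set i [expr {[llength $list]-1}]} {$i >= 0} {incr i -1} {
    append backward [lindex $list $i]
}

puts $forward     ;# prints abc
puts $backward    ;# prints cba
puts $element     ;# prints c: the loop variable keeps the last element
```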

Varying Argument Lists

Usually procedures must be called with the same number of arguments as they have formal parameters. When a procedure is called, each parameter is set to one of the arguments. However, if the last parameter is named args, all the remaining arguments are stored in a list and assigned to args. For example, imagine a procedure p1 definition that begins:

proc p1 {a b args} {

If you call it as "p1 red box cheese whiz", a is set to red, b is set to box, and args is set to "cheese whiz". If called as "p1 red box", args is set to the empty list.

Here is a more realistic example. The procedure sum takes an arbitrary number of arguments, adds them together, and returns the total.

proc sum {args} {
    set total 0
    foreach int $args {
        incr total $int
    }
    return $total
}
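A few calls show how args adapts to the number of arguments. The sketch repeats the definition so that it can be run on its own; the numbers are arbitrary.

```tcl
proc sum {args} {
    set total 0
    foreach int $args {
        incr total $int
    }
    return $total
}

puts [sum 1 2 3]    ;# prints 6
puts [sum 40 2]     ;# prints 42
puts [sum]          ;# prints 0: args is the empty list, so the loop never runs
```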

I will show another example of a procedure with varying arguments on page 57.

Lists Of Lists

List elements can themselves be lists. This is a concern in several situations, but the simplest is when elements contain whitespace. For example, there must be a way to distinguish between the list of "a" and "b" and the list containing the single element "a b".

Assuming an argument has begun with a double quote, the next double quote ends the argument. It is possible to precede double quotes with backslashes inside a list, but this is very hard to read, given enough levels of quoting. Here is a list of one element.

set x "\"a b\""

As an alternative, Tcl supports braces to group lists inside lists. Using braces, it is easy to construct arbitrarily complex lists.

set a "a b {c d} {e {f g {xyz}}}"

Assuming you do not need things like variable substitution, you can replace the top-level double quotes with braces as well.

set a {a b {c d} {e {f g {xyz}}}}

This looks more consistent, so it is very common to use braces at the top level to write an argument consisting of a list of lists.

The reason braces work so much better for this than double quotes is that left and right braces are distinguishable. There is no such thing as a right double quote, so Tcl cannot tell when quotes should or should not match. But it is easy to tell when braces match. Tcl counts the left-hand braces and right-hand braces. When they are equal, the list is complete.

This is exactly what happens when Tcl sees a for, while, or other control structure. Examine the while command below. The last argument is the body. There is one open brace on the first line and another on the fourth. The close brace on the sixth line matches one, and the close brace on the last line matches the other, terminating the list and hence the argument.

while {1} {                 ;# one open brace
    incr a
    puts "hello"
    if {$a%3 != 0} {        ;# two open braces
        continue
    }                       ;# one open brace
    puts "there"
}                           ;# zero open braces

Double quotes and brackets have no special meaning inside of braces and do not have to match. But braces themselves do. To embed a literal brace, you have to precede it with a backslash.

Here are some examples of lists of lists:

set x  { a b c }
set y  { a b {Hello world!}}
set z  { a [ { ] } }

All of these are three-element lists. The third element of y is a two-element list. The first and second elements of y can be considered one-element lists even though they have no grouping braces around them.

tclsh> llength $y
3
tclsh> llength [lindex $y 2]
2
tclsh> llength [lindex $y 0]
1

The second and third elements of z are both one-element lists.

tclsh> lindex $z 1
[
tclsh> lindex $z 2
 ]

Notice the spacing. Element two of z is a three-character string. The first and last characters are spaces, and the middle character is a right bracket. In contrast, element one is a single character. The spaces separating the elements of z were stripped off when the elements were extracted by lindex. The braces are not part of the elements.

The braces are, however, part of the string z. Considered as a string, z has eleven characters including the inner braces. The outer braces are not part of the string.

Similarly, the string in y begins with a space and ends with a right brace. The last element of y has only a single space in the middle.

tclsh> lindex $y 2
Hello world!

The assignment to y could be rewritten with double quotes.

tclsh> set y2 { a b "Hello world!" }
 a b "Hello world!"
tclsh> lindex $y2 2
Hello world!

In this case, the last element of y2 is the same. But more complicated strings cannot be stored this way. Tcl will complain.

tclsh> set y3 { a b "My name is "Goofy"" }
 a b "My name is "Goofy""
tclsh> lindex $y3 2
list element in quotes followed by "Goofy""" instead of space

There is nothing wrong with y3 as a string. However, it is not a list.
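If a string comes from a user or a file, it may be worth checking that it is a well-formed list before treating it as one. One way to sketch this is to wrap a list command in catch, which is described later in this chapter. The procedure name is_list is invented for this example.

```tcl
# Return 1 if the value can be used as a list, 0 otherwise.
# llength raises an error when its argument is not a valid list.
proc is_list {value} {
    if {[catch {llength $value}]} {
        return 0
    }
    return 1
}

set y2 { a b "Hello world!" }
set y3 { a b "My name is "Goofy"" }

puts [is_list $y2]    ;# prints 1
puts [is_list $y3]    ;# prints 0
```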

This section may seem confusing at first. You might want to come back to it after you have written some Tcl scripts.

Creating Lists

With care, lists can be created with the set command or with any command that creates a string with the proper structure. To make things easier, Tcl provides three commands that create strings that are guaranteed to be lists. These commands are list, lappend, and linsert.

The list And concat Commands

The list command takes all of its arguments and combines them into a list. For example:

tclsh> list a b "Hello world"
a b {Hello world}

In this example, a three-element list is returned. Each element corresponds to one of the arguments. The double quotes have been replaced by braces, but that does not affect the contents. When the third element is extracted, the braces will be stripped off.

tclsh> lindex [list a b "Hello World"] 2
Hello World

The list command is particularly useful if you need to create a list composed of variable values. Simply appending them is insufficient. If either variable contains embedded whitespace, for example, the list will end up with more than two elements.

tclsh> set a "foo bar \"hello\""
foo bar "hello"
tclsh> set b "gorp"
gorp
tclsh> set ab "$a $b"                            ;# WRONG
foo bar "hello" gorp
tclsh> llength $ab
4

In contrast, the list command correctly preserves the embedded lists. The list command also correctly handles things such as escaped braces and quotes.
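For comparison, here is a sketch of the same two variables combined with list instead of string concatenation. The variable values are the ones used above.

```tcl
set a "foo bar \"hello\""
set b "gorp"

# list keeps each variable's value as exactly one element
set ab [list $a $b]
puts [llength $ab]      ;# prints 2
puts [lindex $ab 0]     ;# prints foo bar "hello"
puts [lindex $ab 1]     ;# prints gorp
```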

If you want to append several lists together, use the concat command.

tclsh> concat a b "Hello world"
a b Hello world

The concat command treats each of its arguments as a list. The elements of all of the lists are then returned in a new list. Compare the output from concat (above) and list (below).

tclsh> list a b "Hello world"
a b {Hello world}

Here is another example of concat. Notice that whitespace inside elements is preserved.

tclsh> concat a {b {c d}}
a b {c d}

In practice, concat is rarely used. However, it is helpful to understand concat because several commands exhibit concat-like behavior. For example, the expr command concatenates its arguments together before evaluating them—much in the style of concat. Thus, the following commands produce the same result:

tclsh> expr 1 - {2 - 3}
-4
tclsh> expr 1 - 2 - 3
-4

Building Up Lists With The lappend Command

Building up lists is a common operation. For example, you may want to read in lines from a file and maintain them in memory as a list. Assuming the existence of commands get_a_line and more_lines_in_file, your code might look something like this:

while {[more_lines_in_file]} {
    set list "$list [get_a_line]"
}

The body builds up the list. Each time through the loop, a new line is appended to the end of the list.

This is such a common operation that Tcl provides a command to do this more efficiently. The lappend command takes a variable name as its first argument and appends the remaining arguments. The example above could be rewritten:

while {[more_lines_in_file]} {
    lappend list [get_a_line]
}

Notice that the first argument is not passed by value. Only the name is passed. Tcl appends the remaining arguments in place—that is, without making a copy of the original list. You do not have to use set to save the new list. This behavior of modifying the list in place is unusual—the other list commands require the list to be passed by value.
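The two loops are not quite interchangeable when lines contain whitespace. The sketch below stands in for the hypothetical get_a_line and more_lines_in_file commands with a foreach over two made-up lines, and builds one list each way.

```tcl
set lines_by_set ""
set lines_by_lappend ""

foreach line {"first line" "second line"} {
    # String concatenation: the words of each line become separate elements
    set lines_by_set "$lines_by_set $line"
    # lappend: each line is kept as a single element
    lappend lines_by_lappend $line
}

puts [llength $lines_by_set]        ;# prints 4
puts [llength $lines_by_lappend]    ;# prints 2
puts $lines_by_lappend              ;# prints {first line} {second line}
```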

The linsert Command

As its name implies, the linsert command inserts elements into a list. The first argument is the list itself, passed by value rather than by name. The second argument is a numeric index describing where to insert into the list. The remaining arguments are the elements to be inserted. The new list is returned; the original is left unchanged.

tclsh> set list {a b c d}
a b c d
tclsh> set list [linsert $list 0 new]
new a b c d
tclsh> linsert $list 1 foo bar {hello world}
new foo bar {hello world} a b c d

The lreplace Command

The lreplace command is similar to the linsert command except that lreplace deletes existing elements before inserting the new ones. The second and third arguments identify the beginning and ending indices of the elements to be deleted, and the remaining arguments are inserted in their place.

tclsh> set list {a b c d e}
a b c d e
tclsh> lreplace $list 1 3 x y
a x y e
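If no new elements are given, lreplace simply deletes the range. As with linsert, the original list is not modified, so the result must be saved with set if you want to keep it. A short sketch:

```tcl
set list {a b c d e}

puts [lreplace $list 1 3]    ;# prints a e
puts $list                   ;# prints a b c d e (unchanged)

# Save the result to actually drop the first element
set list [lreplace $list 0 0]
puts $list                   ;# prints b c d e
```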

The lsearch Command

The lsearch command is the opposite of the lindex command. The lsearch command returns the index of the first occurrence of an element in a list. If the element does not exist, −1 is returned.

tclsh> lsearch {a b c d e} "c"
2
tclsh> lsearch {a b c d e} "f"
-1

The lsearch command uses the element as a shell-style pattern by default. If you want an exact match, use the -exact flag.

tclsh> lsearch {a b c d ?} "?"
0
tclsh> lsearch -exact {a b c d ?} "?"
4

The lsort Command

The lsort command sorts a list. By default, it sorts in increasing ASCII order.

tclsh> lsort {one two three four five six}
five four one six three two

Several flags are available including -integer, -real, and -decreasing to sort in ways suggested by their names.

It is also possible to supply lsort with a comparison procedure of your own. This is useful for lists with elements that are lists in themselves. However, lists of more than a hundred or so elements are sorted slowly enough that it is more efficient to have them sorted by an external program such as the UNIX sort command.[12]
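As a sketch of a comparison procedure, suppose each element is itself a two-element list holding a user name and a user id. The procedure name by_uid and the sample data are invented here. The -command flag tells lsort to call the procedure with two elements; the procedure returns a negative, zero, or positive integer according to whether the first element should sort before, together with, or after the second.

```tcl
# Compare two {name uid} pairs by their numeric uid
proc by_uid {x y} {
    set ux [lindex $x 1]
    set uy [lindex $y 1]
    if {$ux < $uy} {return -1}
    if {$ux > $uy} {return 1}
    return 0
}

set users {{josh 101} {root 0} {dave 100} {uucp 2}}
puts [lsort -command by_uid $users]
# prints {root 0} {uucp 2} {dave 100} {josh 101}
```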

The split And join Commands

The split and join commands are useful for splitting strings into lists and, conversely, for joining lists into strings.

The split command splits a string into a list. The first argument is the string to be split. The second argument is a string of characters, each of which separates elements in the first argument.

For example, if the variable line contained a line from /etc/passwd, it could be split as:

tclsh> split $line ":"
root Gw19QKxuFWDX7 0 1 Operator / /bin/csh

The directories of a file name can be split using a "/".

tclsh> split "/etc/passwd" "/"
{} etc passwd

Notice the empty element because of the / at the beginning of the string.

The join command does the opposite of split. It joins elements in a list together. The first argument is a list to join together. The second argument is a string to place between all the elements.

tclsh> join {{} etc passwd} "/"
/etc/passwd

With an empty second argument, split splits between every character, and join joins the elements together without inserting any separating characters.

tclsh> split "abc" ""
a b c
tclsh> join {a b c} ""
abc
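The two commands are often used as a pair: split a line into fields, change one field, and join the fields back together. The following sketch does this with a made-up password line built from the fields shown earlier.

```tcl
# A sample /etc/passwd line (made up for this example)
set line "root:Gw19QKxuFWDX7:0:1:Operator:/:/bin/csh"

set fields [split $line ":"]
puts [llength $fields]        ;# prints 7
puts [lindex $fields 0]       ;# prints root
puts [lindex $fields 6]       ;# prints /bin/csh

# Change the shell and rebuild the line
set fields [lreplace $fields 6 6 /bin/sh]
puts [join $fields ":"]       ;# prints root:Gw19QKxuFWDX7:0:1:Operator:/:/bin/sh
```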

More Ways To Manipulate Strings

There are a number of other useful commands for string manipulation. These include scan, format, string, and append. Two more string manipulation commands are regexp and regsub. Those two commands require a decent understanding of regular expressions, so I will hold off describing regexp and regsub until Chapter 6 (p. 135). Then, the commands will be much easier to understand.

The scan And format Commands

The scan and format commands extract and format substrings corresponding to low-level types such as integers, reals, and characters. scan and format are good at dealing with filling, padding, and generating unusual characters. These commands are analogous to sscanf and sprintf in the C language, and most of the C conventions are supported.

As an example, the following command assigns to x a string composed of a ^A immediately followed by "foo     ==1700.000000" (the number of zeros after the decimal point may differ on your system). The string foo is left-justified in an eight-character field, so it is padded on the right with five spaces.

set x [format "%1c%-8s==%f" 1 foo 17.0e2]

The first argument is a description of how to print the remaining arguments. The remaining arguments are substituted for the fields that begin with a "%". In the example above, the "-" means “left justify” and the 8 is a minimum field width. The "c", "s", and "f" force the arguments to be treated as a character, a string, and a real number (f stands for float), respectively. The "==" is passed through literally since it does not begin with a "%".

The scan command does the opposite of format. For example, the output above can be broken back down with the following command:

scan $x "%c%8s%*\[ =]%f" char string float

The first argument, $x, holds the string to be scanned. The first character is assigned to the variable char (as its integer character code). The next string (ending at the first whitespace or after eight characters, whichever comes first) is assigned to the variable string. Any number of blanks and equal signs are matched and discarded. The asterisk suppresses the assignment. Finally, the real is matched and assigned to the variable float. The scan command returns the number of percent-style formats that matched and were assigned to variables.

I will not describe scan and format further at this point, but I will return to them later in the book. For a complete discussion, you can also consult the Tcl reference material or any C language reference.
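A simpler round trip may make the two commands easier to compare. This sketch pads an integer with format and reads it back with scan. The field width of five, the sample values, and the variable names are arbitrary choices.

```tcl
# Pad the number with leading zeros to a width of five characters
set padded [format "%05d" 42]
puts $padded                     ;# prints 00042

# Read it back; scan returns the number of values it assigned
set count [scan $padded "%d" n]
puts $count                      ;# prints 1
puts $n                          ;# prints 42

# Several fields at once: pick apart a dotted version number
scan "7.6.13" "%d.%d.%d" major minor patch
puts "$major $minor $patch"      ;# prints 7 6 13
```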

The string Command

The string command is a catchall for a number of miscellaneous but very useful string manipulation operations. The first argument to the string command names the particular operation. While discussing these operations, I will refer to the remaining arguments as if they were the only arguments.

As with the list commands, the string commands also use zero-based indices. So the first character in a string is at position 0, the second character is at position 1, and so on.

The compare operation compares the two arguments lexicographically (i.e., dictionary order). The command returns −1, 0, or 1, depending on whether the first argument is less than, equal to, or greater than the second argument.

As an example, this could be used in an if command like so:

if {[string compare $a $b] == 0} {
    puts "strings are equal"
} else {
    puts "strings are not equal"
}

The match operation returns 1 if the first argument matches the second or 0 if it does not match. The first argument is a pattern similar in style to the shell, where * matches any number of characters and ? matches any single character.

tclsh> string match "*.c" "main.c"
1

I will cover these patterns in more detail in Chapter 4 (p. 87). The regexp command provides more powerful patterns. I will describe regexp in Chapter 5 (p. 107).

The first operation searches for a string (first argument) in another string (second argument). It returns the first position in the second argument where the first argument is found. −1 is returned if the second string does not contain the first.

tclsh> string first "uu" "uunet.uu.net"
0

The last operation returns the last position where the first argument is found.

tclsh> string last "uu" "uunet.uu.net"
6

The length operation returns the number of characters in the string.

tclsh> string length "foo"
3
tclsh> string length ""
0

The index operation returns the character corresponding to the given index (second argument) in a string (first argument). For example:

tclsh> string index "abcdefg" 2
c

The range operation is analogous to the lrange command, but range works on strings. Indices correspond to character positions. All characters between the indices inclusive are returned. The string "end" may be used to refer to the last position in the string.

tclsh> string range "abcdefg" 2 3
cd

The tolower and toupper operations convert an argument to lowercase and uppercase, respectively.

tclsh> string tolower "NeXT"
next

The trimleft operation removes characters from the beginning of a string. The string is the first argument. The characters removed are any which appear in the optional second argument. If there is no second argument, then whitespace is removed.

string trimleft $num "-"     ;# force $num nonnegative

The trimright operation is like trimleft except that characters are removed from the end of the string. The trim operation removes characters from both the beginning and the end of the string.
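The three trimming operations can be compared side by side. In this sketch, angle brackets are printed around each result only to make the remaining whitespace visible. The sample strings are made up.

```tcl
set padded "   hello   "
puts "<[string trimleft $padded]>"     ;# prints <hello   >
puts "<[string trimright $padded]>"    ;# prints <   hello>
puts "<[string trim $padded]>"         ;# prints <hello>

# With a second argument, the named characters are removed instead
puts [string trim "xxabcxyx" "x"]      ;# prints abcxy
puts [string trimleft "-17" "-"]       ;# prints 17
```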

The append Command

The following command appends a string to another string in a variable.

set var "$var$string"

Appending strings is a very common operation. For example, it is often used in a loop to read the output of a program and to create a single variable containing the entire output.

Appending occurs so frequently that there is a command specifically for this purpose. The append command takes a variable name as the first argument and appends all of the remaining strings to that variable.

tclsh> append var "abc"
abc
tclsh> append var "def" "ghi"
abcdefghi

Notice that the first argument is not passed by value. Only the name is passed. Tcl appends the remaining arguments in place, that is, without making a copy of the original string. This allows Tcl to take some shortcuts internally. Using append is much more efficient than the alternative set command.

This behavior of modifying the string in place is unusual—none of the string operations work this way. However, the lappend command does. So just remember that append and lappend work this way. It might be helpful to go back to the lappend description (page 42) now and compare its behavior with append.

Both append and lappend share a few other features. Neither requires that the variable be initialized. If uninitialized, the first string argument is set rather than appended (as if the set command had been used). append and lappend also return the final value of the variable. However, this return value is rarely used since both commands already store the value in the variable.
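Here is a small sketch of these features together. The variable buffer is deliberately left unset before the loop, and the words being appended are made up.

```tcl
# buffer does not exist yet; the first append creates it
foreach word {alpha beta gamma} {
    append buffer $word ","
}
puts $buffer                      ;# prints alpha,beta,gamma,

# append returns the new value of the variable
puts [append buffer "done"]       ;# prints alpha,beta,gamma,done
puts [string length $buffer]      ;# prints 21
```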

Arrays

Earlier, I described how multiple strings can be stored together in a list. Tcl provides a second mechanism for storing multiple strings together called arrays.

Each string stored in an array is called an element and has an associated name. The element name is given in parentheses following the array name. For example, an array of user ids could be defined as:

set uid(0)      "root"
set uid(1)      "daemon"
set uid(2)      "uucp"
set uid(100)    "dave"
set uid(101)    "josh"
. . .

Once defined, elements can then be accessed by name:

set number 101
puts "User id $number is $uid($number)"

You can use any string as an array element name, not just a number. For example, the following additional assignments allow user ids to be looked up by either user id or name.

set uid(root)      0
set uid(daemon)    1
set uid(uucp)      2
set uid(dave)      100
set uid(josh)      101

Because element names can be arbitrary strings, it is possible to simulate multi-dimensional arrays or structures. For example, a password database could be stored in an array like this:

set uid(dave,uid)         100
set uid(dave,password)    diNBXuprAac4w
set uid(dave,shell)       /usr/local/bin/zsh
set uid(josh,uid)         101
set uid(josh,password)    gS4jKHp1AjYnd
set uid(josh,shell)       /usr/local/bin/tcsh

Now, an arbitrary user’s shell can be retrieved as $uid($user,shell). The choice of a comma to separate parts of the element name is arbitrary. You can use any character and you can use more than one in a name.
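As a sketch of how such a structure might be used, the following loop builds part of each element name from a variable. Only a subset of the assignments above is repeated, to keep the example short.

```tcl
set uid(dave,uid)      100
set uid(dave,shell)    /usr/local/bin/zsh
set uid(josh,uid)      101
set uid(josh,shell)    /usr/local/bin/tcsh

# $user is substituted inside the parentheses to form the element name
foreach user {dave josh} {
    puts "$user has uid $uid($user,uid) and runs $uid($user,shell)"
}
# prints:
# dave has uid 100 and runs /usr/local/bin/zsh
# josh has uid 101 and runs /usr/local/bin/tcsh
```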

It is possible to have element names with whitespace in them. For example, it might be convenient to find out the user name, given a full name. Doing it in two steps is easy and usually what you want anyway—presumably, the name variable is set elsewhere.

set name "John Ousterhout"
set uid($name) ouster

If you just want to embed a literal array reference that contains whitespace, you have to quote it. Remember, any string with whitespace must be quoted to keep it as a single argument (unless it is already in braces).

set "uid(John Ousterhout)" ouster

This is not specific to arrays. Any variable whose name contains whitespace can be set similarly. The following sets the variable named "a b".

set "a b" 1

From now on when I want to explicitly talk about a variable that is not an array, I will use the term scalar variable.

Earlier, I described how to pass scalar variables into procedures—as parameters or globals. Arrays can be accessed as globals, too. (Name the array in a global command and all of the elements become visible in the procedure.) Arrays can be passed as parameters but not as easily as scalar variables. Later in this chapter (page 57), I will describe the upvar command that provides the means to accomplish this.

Indirect References

Earlier I described how the single-argument set command is useful to separate variable names from other adjacent characters. This can be used for arbitrarily complex indirect references. For example, the following commands dynamically form a variable name from the contents of b and a literal c character. This result is taken as another variable name, and its contents are assigned to d.

set xc 1
set b x
set d [set [set b]c]     ;# sets d to value of xc

This type of indirection works with array names too. For example, the following sequence stores an array name in a and then retrieves a value through it.

set a(1) foo
set a2 a
puts [set [set a2](1)]

In contrast, replacing either of the set commands with the $ notation fails. The first of the next two commands incorrectly tries to use a2 as the array name.

tclsh> puts [set $a2(1)]
can't read "a2(1)": variable isn't array
tclsh> puts $[set a2](1)
$a(1)

In the second command, the dollar sign is substituted literally because there is not a variable name immediately following it when it is first scanned.

Variable Information

Tcl provides the info command to obtain assorted pieces of internal information about Tcl. The most useful of the info operations is "info exists". Given a variable name, "info exists" returns 1 if the variable exists or 0 otherwise. Only variables accessible from the current scope are checked. For example, the following command shows that haha has not been defined. An attempt to read it would fail.

tclsh> info exists haha
0

Three related commands are "info locals", "info globals", and "info vars". They return a list of local, global, and all variables respectively. They can be constrained to match a subset by supplying an optional pattern (in the style of the "string match" command). For example, the following command returns a list of all global variables that begin with the letters "mail".

info globals mail*

Tcl has similar commands for testing whether commands and procedures exist. "info commands" returns a list of all commands. "info procs" returns a list of just the procedures (commands defined with the proc command).

"info level" returns information about the stack. With no arguments, the stack depth is returned. "info level 0" returns the command and arguments of the current procedure. "info level -1" returns the command and arguments of the calling procedure of the current procedure. -2 indicates the next previous caller, and so on.

The "info script" command returns the file name of the current script being executed. This is just one of a number of other information commands that give related types of information useful in only the most esoteric circumstances. See the Tcl reference material for more information.
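Two of these operations can be tried in a short script. In this sketch, the procedure names show and whoami and the variable verbose are invented for illustration.

```tcl
# Use info exists to avoid reading a variable that has not been set
proc show {} {
    global verbose
    if {[info exists verbose]} {
        return "verbose is $verbose"
    }
    return "verbose is not set"
}

# info level 0 reports how the current procedure was invoked
proc whoami {args} {
    return [info level 0]
}

puts [show]          ;# prints verbose is not set
set verbose 1
puts [show]          ;# prints verbose is 1
puts [whoami a b]    ;# prints whoami a b
```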

Array Information

While the info command can be used on arrays, Tcl provides some more specialized commands for this purpose. For example, "array size" returns the number of elements in the array named by its argument.

tclsh> set color(pavement) black
black
tclsh> set color(snow) white
white
tclsh> array size color
2

The command "array names" returns the element names of an array.

tclsh> array names color
pavement snow

Here is a loop to print the elements of the array and their values.

tclsh> foreach name [array names color] {
    puts "The color of $name is $color($name)."
}
The color of pavement is black.
The color of snow is white.
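The order in which "array names" returns the element names is not defined, so the lines may come out in a different order on your system. If the order matters, one option is to sort the names first with lsort, as in this sketch. The third element, grass, is added here only to make the sorting visible.

```tcl
set color(pavement) black
set color(snow) white
set color(grass) green

# Sorting the names gives the same order on every system
foreach name [lsort [array names color]] {
    puts "The color of $name is $color($name)."
}
# prints:
# The color of grass is green.
# The color of pavement is black.
# The color of snow is white.
```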

Unsetting Variables

The unset command unsets a variable. After being unset, a variable no longer exists. You can unset scalar variables or entire arrays. For example:

unset a
unset array(elt)
unset array

After a variable is unset, it can no longer be read and "info exists" returns 0.
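The effect of each form of unset can be checked with "info exists" and "array names". The variable names in this sketch follow the example above, and the values are made up.

```tcl
set a 1
set array(elt) x
set array(other) y

unset a
puts [info exists a]         ;# prints 0

# Unsetting one element leaves the rest of the array alone
unset array(elt)
puts [array names array]     ;# prints other

# Unsetting the array name removes every remaining element
unset array
puts [info exists array]     ;# prints 0
```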

Tracing Variables

Variable accesses can be traced with the trace command. Using trace, you can evaluate procedures whenever a variable is accessed. While this is useful in many ways, I will cover trace in more detail in the discussion on debugging (Chapter 18 (p. 402)) since that is almost always where trace first shows its value. In that same chapter, I will also describe how to trace commands.

Handling Errors

When typing commands interactively, errors cause the interpreter to give up on the current command and reprompt for a new command. All well and good. However, you do not want this to happen while running a script.

While many errors are just the result of typing goofs, some errors are more difficult to avoid and it is easier to react to them “after the fact”. For example, if you write a procedure that does several divisions, code before each division can check that the denominator is not zero. A much easier alternative is to check that the whole procedure did not fail. This is done using the catch command. catch evaluates its argument as another command and returns 1 if there was an error or 0 if the procedure returned normally.

Assuming your procedure is named divalot, you can call it this way:

if [catch divalot] {
    puts "got an error in divalot!"
    exit
}

The argument to catch is a list of the command and arguments to be evaluated. If your procedure takes arguments, then they must be grouped together. For example:

catch {puts "Hello world"}
catch {divalot some args}

If your procedure returns a value itself, this can be saved by providing a variable name as the second argument to catch. For example, suppose divalot normally returns a value of 17 or 18.

tclsh> catch {divalot some args} result
0
tclsh> set result
17

Here, catch returned 0 indicating divalot succeeded. The variable result is set to the value returned by divalot.

This same mechanism can be used to get the messages produced by an error. For example, you can compute the values of x for the equation 0 = ax² + bx + c by using the quadratic formula. In mathematical notation, the quadratic formula looks like this:

Figure 2-1. The quadratic formula: x = (-b ± sqrt(b² - 4ac)) / 2a

Here is a procedure for the quadratic formula:

proc qf {a b c} {
    set s [expr sqrt($b*$b-4*$a*$c)]
    set d [expr 2*$a]
    return [list [expr (-$b+$s)/$d] \
                 [expr (-$b-$s)/$d]]
}

When run successfully, qf produces a two-element list of values:

tclsh> catch {qf 1 0 -2} roots
0
tclsh> set roots
1.41421 -1.41421

When run unsuccessfully, this same command records the error in roots:

tclsh> catch {qf 0 0 -2} roots
1
tclsh> set roots
divide by zero

By using catch this way, you avoid having to put a lot of error-checking code inside qf. In this case, there is no need to check for division by zero or taking the square root of a negative number. This simplifies the code.

While it is rarely useful in a script, it is possible to get a description of all the commands and procedures that were in evaluation when an error occurred. This description is stored in the global variable errorInfo. In the example above, errorInfo looks like this:

tclsh> set errorInfo
divide by zero
    while executing
"expr (-$b+$s)/$d"
    invoked from within
"list [expr (-$b+$s)/$d]..."
    invoked from within
"return [list [expr (-$b+$s)/$d]..."
    (procedure "qf" line 4)
    invoked from within
"qf 0 0 -2"

errorInfo is actually set when the error occurs. You can use errorInfo whether or not you use catch to, well, . . . catch the error.

Causing Errors

The error command is used to create error conditions which can be caught with the catch command. error is useful inside of procedures that return errors naturally already.

For example, if you wanted to restrict the qf routine so that a could not be larger than 100, you could rewrite the beginning of it as:

proc qf {a b c} {
    if {$a > 100} {error "a too large"}
    set s [expr sqrt($b*$b-4*$a*$c)]
    . . .

Now if a is greater than 100, "catch {qf ...}" will return 1. The message "a too large" will be stored in the optional variable name supplied to catch as the second argument.
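Putting the pieces together, here is a sketch of the restricted qf as a complete procedure along with a caller that uses catch. The elided part of the procedure is filled in from the earlier definition; bracing the expr arguments is a choice made for this example rather than something the text requires. The exact digits printed for the roots depend on your version of Tcl, so no output is shown for the first call.

```tcl
proc qf {a b c} {
    if {$a > 100} {error "a too large"}
    set s [expr {sqrt($b*$b - 4*$a*$c)}]
    set d [expr {2*$a}]
    return [list [expr {(-$b+$s)/$d}] [expr {(-$b-$s)/$d}]]
}

# A successful call: catch returns 0 and roots holds the list
if {[catch {qf 1 0 -2} roots]} {
    puts "qf failed: $roots"
} else {
    puts "roots are $roots"
}

# A call that trips the new check: catch returns 1
if {[catch {qf 200 0 -2} message]} {
    puts "qf failed: $message"    ;# prints qf failed: a too large
}
```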

Evaluating Lists As Commands

Everything in Tcl is represented as a string. This includes commands. You can manipulate commands just like any other string. Here is an example where a command is stored in a variable.

tclsh> set output "puts"
puts
tclsh> $output "Hello world!"
Hello world!

The variable output could be used to select between several different forms of output. If this command was embedded inside a procedure, it could handle different forms of output with the same parameterized code. The Tk extension of Tcl uses this technique to manipulate multiple widgets with the same code.

Evaluating an entire command cannot be done the same way. Look what happens:

tclsh> set cmd "puts \"Hello world!\""
puts "Hello world!"
tclsh> $cmd
invalid command name "puts "Hello world!""

The problem is that the entire string in cmd is taken as the command name rather than a list of a command and arguments.

In order to treat a string as a list of a command name and arguments, the string must be passed to the eval command. For instance:

tclsh> eval $cmd
Hello world!

The eval command treats each of its arguments as a list. The elements from all of the lists are used to form a new list that is interpreted as a command. The first element becomes the command name. The remaining elements become the arguments to the command.

The following example uses the arguments "append", "v1", "a b", and "c d" to produce and evaluate the command "append v1 a b c d".

tclsh> eval append v1 "a b" "c d"
abcd

Remember the concat command from page 42? The eval command treats its arguments in exactly the same way as concat. For example, notice how internal space is preserved:

tclsh> eval append v2 {a b} {c {d e}}
abcd e

The list command will protect any argument from being broken up by eval.

tclsh> eval append v3 [list {a b}] [list {c {d e}}]
a bc {d e}

When the arguments to eval are unknown (because they are stored in a variable), it is particularly important to use the list command. For example, the previous command is more likely to be expressed in a script this way:

eval append somevar [list $arg1] [list $arg2]

Unless you want your arguments to be broken up, surround them with list commands.
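The difference is easiest to see with the same variables evaluated both ways. In this sketch, the variable names unprotected and protected are invented, and arg1 and arg2 hold made-up values containing whitespace.

```tcl
set arg1 "a b"
set arg2 "c {d e}"

# Without list, eval breaks each value apart at whitespace
eval append unprotected $arg1 $arg2
puts $unprotected                  ;# prints abcd e

# With list, each value arrives as a single argument
eval append protected [list $arg1] [list $arg2]
puts $protected                    ;# prints a bc {d e}
```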

The eval command also performs $ substitution and [] evaluation so that the command is handled as if it had originally been typed in and evaluated rather than stored in a variable. In fact, when a script is running, eval is used internally to break the lines into command names and arguments, and evaluate them. Commands such as if, while, and catch use the eval command internally to evaluate their command blocks. So the same conventions apply whether you are using eval explicitly, writing commands in a script, or typing commands interactively.

Again, the list command will protect against unwanted $ substitution and [] evaluation.

These eval conventions such as $ substitution and [] evaluation are only done when a command is evaluated. So if you have a string with embedded dollar signs or whitespace, for example, you have to protect it only when it is evaluated.

tclsh> set a "\$foo"
$foo
tclsh> set b $a
$foo
tclsh> set b
$foo

Passing By Reference

By default, you can only refer to global variables (after using the global command) or variables declared within a procedure. The upvar command provides a way to refer to variables in any outer scope. A common use for this is to implement pass by reference. When a variable is passed by reference, the calling procedure can see any changes the called procedure makes to the variable.

The most common use of upvar is to get access to a variable in the scope of the calling procedure. If a procedure is called with the variable v as an argument, the procedure associates the caller’s variable v with a second variable so that when the second variable is changed, the caller’s v is changed also.

For example, the following command associates the variable name stored in name with the variable p.

upvar $name p

After this command, any references to p also refer to the variable named within name. If name contains "v", "set p 1" sets p to 1 inside the procedure and v to 1 in the caller of the procedure.
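A minimal sketch of this, using an invented procedure called set_to_one, looks like the following. The caller passes the name v, not the value of v.

```tcl
# Set the caller's variable, whose name is passed in, to 1
proc set_to_one {name} {
    upvar $name p
    set p 1
}

set v 0
set_to_one v      ;# note: v, not $v
puts $v           ;# prints 1
```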

The qf procedure can be rewritten to use upvar. As originally written, qf returned a list. This is a little inconvenient because the list always has to be torn apart to get at the two values. Lists are handy when they are long or of unknown size, but they are a nuisance just for handling two values. However, Tcl only allows procedures to return a single value, and a list is the only way to make two values “feel” like one.

Here is another procedure to compute the quadratic formula but written with upvar. This procedure, called qf2, writes its results into the caller’s fourth and fifth parameters.

proc qf2 {a b c name1 name2} {
    upvar $name1 r1 $name2 r2
    set s [expr sqrt($b*$b-4*$a*$c)]
    set d [expr $a+$a]
    set r1 [expr (-$b+$s)/$d]
    set r2 [expr (-$b-$s)/$d]
}

The qf2 procedure looks like this when it is called.

tclsh> catch {qf2 1 0 -2 root1 root2}
0
tclsh> set root1
1.41421
tclsh> set root2
-1.41421

A specific caller can be chosen by specifying a level immediately after the command name. Integers describe the number of levels up the procedure call stack. The default is 1 (the calling procedure). If an integer is preceded by a "#", then the level is an absolute level with 0 equivalent to the global level. For example, the following associates the global variable curved_intersection_count with the local variable x.

upvar #0 curved_intersection_count x

The upvar command is especially useful for dealing with arrays because arrays cannot be passed by value. (There is no way to refer to the value of an entire array.) However, arrays can be passed by reference.

For example, imagine you want to compute the distance between two points in an xyz-coordinate system. Each point is represented by three numbers. Rather than passing six numbers, it is simpler to pass the entire array. Here is a procedure which computes the distance assuming the numbers are all stored in a single array:

proc distance {name} {
    upvar $name a
    set xdelta [expr $a(x,2) - $a(x,1)]
    set ydelta [expr $a(y,2) - $a(y,1)]
    set zdelta [expr $a(z,2) - $a(z,1)]
    expr {sqrt(
        $xdelta*$xdelta +
        $ydelta*$ydelta +
        $zdelta*$zdelta)
    }
}
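A call might look like the following sketch. The array name p and the coordinate values are assumptions chosen so that the arithmetic is easy to check by hand; the procedure is repeated only so that the fragment runs on its own.

```tcl
# Same procedure as above, repeated so this fragment is self-contained.
proc distance {name} {
    upvar $name a
    set xdelta [expr $a(x,2) - $a(x,1)]
    set ydelta [expr $a(y,2) - $a(y,1)]
    set zdelta [expr $a(z,2) - $a(z,1)]
    expr {sqrt($xdelta*$xdelta + $ydelta*$ydelta + $zdelta*$zdelta)}
}

# Point 1 is (1,2,3) and point 2 is (4,6,3).
set p(x,1) 1; set p(y,1) 2; set p(z,1) 3
set p(x,2) 4; set p(y,2) 6; set p(z,2) 3

# Pass the NAME of the array, not its value.
set d [distance p]
```

The deltas are 3, 4, and 0, so d is the square root of 25, namely 5.0. Note that the caller writes "distance p" rather than "distance $p", since an array has no value of its own to substitute.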

Evaluating Commands In Other Scopes

The uplevel command is similar in spirit to upvar. With uplevel, commands can be evaluated in the scope of the calling procedure. The syntax is similar to eval. For example, the following command increments x in the scope of the calling procedure.

uplevel incr x

The uplevel command can be used to create new control structures such as variations on if and while or even more powerful constructs. Space does not permit a discussion of this technique, so I refer you to the Tcl reference material.
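To give only a flavor of the idea, here is a rough sketch of a simple looping command. The name repeat is hypothetical, and the sketch deliberately ignores break, continue, and return inside the body, which a production-quality control structure would have to handle.

```tcl
# repeat is an invented name; it evaluates body count times
# in the scope of whoever called repeat.
proc repeat {count body} {
    for {set i 0} {$i < $count} {incr i} {
        uplevel 1 $body
    }
}

set total 0
repeat 3 {incr total 10}
```

Because the body is evaluated with uplevel, "incr total 10" changes the caller's total (leaving it at 30) rather than a variable local to repeat.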

The following procedure (written by Karl Lehenbauer with a modification by Allan Brighton) provides static variables in the style of C. Like variables declared global, variables declared static are accessible from other procedures. However, the same static variables cannot be accessed by procedures in different files. This can be helpful in avoiding naming collisions between two programmers—both of whom unintentionally choose the same names for global variables that are private to their own files.

proc static {args} {
    set unique [info script]
    foreach var $args {
        uplevel 1 "upvar #0 static($unique:$var) $var"
    }
}

The procedure makes its arguments be references into an array (appropriately called static). Because of the uplevel command, all uses of the named variable after the static call become references into the array. The array elements have the file name embedded in them. This prevents conflicts with similarly-named variables in other files. By setting unique to "[lindex [info level -1] 0]", static can declare persistent variables that cannot be accessed by any other procedure even in the same file.
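The following sketch shows one way static might be used. The procedure next_id and the variable count are hypothetical names, and the definition of static is repeated only so that the fragment runs by itself.

```tcl
# Definition repeated from above so this fragment is self-contained.
proc static {args} {
    set unique [info script]
    foreach var $args {
        uplevel 1 "upvar #0 static($unique:$var) $var"
    }
}

# next_id is an invented example: it hands out 1, 2, 3, ...
proc next_id {} {
    static count
    if {![info exists count]} {
        set count 0        ;# first call: initialize the saved value
    }
    incr count
}

set first  [next_id]
set second [next_id]
set third  [next_id]
```

The value of count survives between calls because it really lives in the global array static, yet next_id never has to name that array.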

If you have significant amounts of Tcl code, you may want to consider even more sophisticated scoping techniques. For instance, [incr Tcl], written by Michael McLennan, is a Tcl extension that supports object-oriented programming in the style of C++. [incr Tcl] provides mechanisms for data encapsulation within well-defined interfaces, greatly increasing code readability while lessening the effort to write such code in the first place. The Tcl FAQ describes how to obtain [incr Tcl]. For more information on how to get the FAQ, see Chapter 1 (p. 20).

Working With Files

Tcl has commands for accessing files. The open command opens a file. The second argument determines how the file should be opened. "r" opens a file for reading; "w" truncates a file and opens it for writing; "a" opens a file for appending (writing without truncation). The second argument defaults to "r".

open "/etc/passwd" "r"
open "/tmp/stuff.[pid]" "w"

The first command opens /etc/passwd for reading. The second command opens a file in /tmp for writing. The process id is used to construct the file name, which is an ideal way to construct unique temporary names.

The open command returns a file identifier. This identifier can be passed to the many other file commands, such as the close command. The close command closes a file that is open. The close command takes one argument—a file identifier. Here is an example:

set input [open "/etc/passwd" "r"]    ;# open file
close $input                          ;# close same file

The open command is a good example of a command that is frequently evaluated from a catch command. Attempting to open (for reading) a nonexistent file generates an error. Here is one way to catch it:

if [catch {open $filename} input] {
    puts "$input"
    return
}

By printing the error message from open, this fragment accurately reports any problems related to opening the file. For example, the file might exist yet not allow permission to read it.

The open command may also be used to read from or write to pipelines specified as /bin/sh-like commands. A pipe character ("|") signifies that the remainder of the argument is a command. For example, the following command searches through all the files in the current directory hierarchy and finds each occurrence of the word book. Each matching occurrence can be read as if you were reading it from a plain file.

open "| find . -type f -print | xargs grep book"

The argument to open must be a valid list. Each element in the list becomes a command or argument in the pipeline. If you needed to search for "good book", you could do it in a number of ways. Here are just two:

open "| find . -type f -print | xargs grep \"good book\""
open {| find . -type f -print | xargs grep {good book}}
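Here is a small sketch of reading from a pipeline once it is open. The sh command line is a stand-in chosen so that the output is predictable; the braces around it keep it a single list element, as described above.

```tcl
# The braced element is passed to sh as ONE argument.
set pipe [open {| sh -c {echo alpha; echo beta}} r]

set lines {}
while {[gets $pipe line] != -1} {
    lappend lines $line      ;# collect each line the pipeline prints
}
close $pipe
```

When the loop finishes, lines holds the two elements alpha and beta. Closing a pipeline can itself generate an error if the command failed, so in real scripts the close is often wrapped in catch.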

File I/O

Once a file is open, you can read from it and write to it.

Use the puts command to write to a file. If you provide a file identifier, puts will write to that file.

set file [open /tmp/stuff w]
puts $file "Hello World"       ;# write to /tmp/stuff

Remember that puts writes to the standard output by default. Sometimes it is convenient to refer to the standard output explicitly. You can do that using the predefined file identifier stdout. (You can also refer to the standard input as stdin and the standard error as stderr.)

The puts command also accepts the argument -nonewline, which skips adding a newline to the end of the line.

puts -nonewline $file "Hello World"

If you are writing strings without newlines to a character special file (such as a terminal), the output will not immediately appear because the I/O system buffers output one line at a time. However, there are strings you want to appear immediately and without a newline. Prompts are good examples. To force them out, use the flush command.

puts -nonewline $file $prompt
flush $file

Use the gets command or read command to read from a file. gets reads a line at a time and is good for text files in simple applications. read is appropriate for everything else.

The gets command takes a file identifier and an optional variable name in which to store the string that was read. When used this way, the length of the string is returned. If the end of the file is reached, -1 is returned.

I frequently read through files with the following code. Each time through the loop, one line is read and stored in the variable line. Any other commands in the loop are used to process each line. The loop terminates when all the lines have been read.

while 1 {
    if {[gets $file line] == -1} break
    # do something with $line
}

The read command is similar to gets. The read command reads input but not line by line like gets. Instead, read reads a fixed number of characters. It is ideal if you want to process a file a huge chunk at a time. The maximum number to be read is passed as the second argument and the actual characters read are returned. For example, to read 100000 bytes you would use:

set chunk [read $file 100000]

The characters read may be less than the number requested if there are no more characters in the file or if you are reading from a terminal or similar type of special file.

The command eof returns a 0 if the file has more bytes left or a 1 otherwise. This can be used to rewrite the loop above (using gets) to use read.

while {![eof $file]} {
    set buffer [read $file 100000]
    # do something with $buffer
}

If you omit the length, the read command reads the entire file. If the file fits in physical memory[13], you can read things with this form of read much more efficiently than with gets. For example, if you want to process each line in a file, you can write:

foreach line [split [read $file] "\n"] {
    # do something with $line
}
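One detail of this idiom is easy to miss: if the file ends with a newline, split produces an extra empty element at the end of the list. The sketch below illustrates this with a scratch file (the name /tmp/lines.[pid] is an assumption) and shows the -nonewline flag of read, which discards the final newline.

```tcl
set name /tmp/lines.[pid]          ;# hypothetical scratch file
set out [open $name w]
puts $out "first"
puts $out "second"
close $out

# Plain read keeps the trailing newline, so split yields {first second {}}.
set in [open $name]
set all [split [read $in] "\n"]
close $in

# read -nonewline drops it, so split yields {first second}.
set in [open $name]
set trimmed [split [read -nonewline $in] "\n"]
close $in

file delete $name
```

Here all has three elements while trimmed has two, matching what a loop using gets would have seen.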

There are two other commands to manipulate files: seek and tell. They provide random access into files and are analogous to the UNIX lseek and tell system calls. They are rarely used, so I will not describe them further. See the Tcl reference material for more information.

File Name Matching

If a file name starts with a tilde character and a user name, the open command translates this to the named user’s home directory. If the tilde is immediately followed by a slash, it is translated to the home directory of the user running the script. This is the same behavior that most shells support.

However, the open command does not do anything special with other metacharacters such as "*" and "?". The following command opens a file with a "*" at the end of its name!

open "/tmp/foo*" "w"

The glob command takes file patterns as arguments and returns the list of files that match. For example, the following command returns the files in the current directory that end with .exp or .c.

glob *.exp *.c

The result of glob can be passed to open (presuming that it only matches one file). An error occurs if no files are matched. Using glob as the source in a foreach loop provides a way of opening each file separately.

foreach filename [glob *.exp] {
    set file [open $filename]
    # do something with $file
    close $file
}
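Since glob can generate an error when nothing matches, it is often wrapped in catch. The following is a sketch of one way to do that; matching_files is a hypothetical helper name.

```tcl
# matching_files is an invented name: it returns the list of
# matching files, or an empty list if glob reports an error.
proc matching_files {pattern} {
    if {[catch {glob $pattern} files]} {
        return {}
    }
    return $files
}

foreach filename [matching_files *.exp] {
    set file [open $filename]
    # do something with $file
    close $file
}
```

With this wrapper, the loop body is simply skipped when there are no .exp files.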

The characters understood by glob are * (matches anything), ? (matches any single character), [] (matches a set or range of characters), {} (matches a choice of strings), and \ (matches the next character literally). I will not go into details on these; they are similar to matching done by many shells such as csh. Plus I will be talking about most of them in later chapters anyway.

Setting And Getting The Current Directory

If file names do not begin with a "~" or "/", they are relative to the current directory. The current directory can be set with cd. It is analogous to the cd command in the shell. As in the open command, the tilde convention is supported but all other shell metacharacters are not. There is no built-in directory stack.

tclsh> cd ~libes/bin
tclsh> pwd
/usr/libes/bin

File Name Manipulation

The file command does a large number of different things all related to file names. The first argument names the function and the second argument is the file name to work on.

Four functions are purely textual. The same results can be accomplished with the string functions, but these are very convenient.

The "file dirname" command returns the directory part of the file name. (It returns a "." if there is no slash. It returns a slash if there is only one slash and it is the first character.) For example:

tclsh> file dirname /usr/libes/bin/prog.exp
/usr/libes/bin

The opposite of "file dirname" is "file tail". It returns everything after the last slash. (If there is no slash, it returns the original file name.)
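As a brief sketch, the two commands split a name into pieces that can be put back together with a slash. The path is the same one used in the earlier example.

```tcl
set path /usr/libes/bin/prog.exp
set dir  [file dirname $path]     ;# /usr/libes/bin
set name [file tail $path]        ;# prog.exp

# Rejoining the two parts gives back the original name.
set rebuilt "$dir/$name"
```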

The "file extension" command returns the last dot and anything following it in the file name. (It returns the empty string if there is no dot.) For example:

tclsh> file extension /usr/libes/src/my.prog.c
.c

The opposite of "file extension" is "file rootname". It returns everything but the extension.

tclsh> file rootname /usr/libes/src/my.prog.c
/usr/libes/src/my.prog

While these functions are very useful with file names, they can be used on any string where dots and slashes are separators.

For example, suppose you have an IP address in addr and want to change the last field to the value stored in the variable new. You could use split and join, but the file name manipulation functions do it more easily.

tclsh> set addr
127.0.1.2
tclsh> set new
42
tclsh> set addr [file rootname $addr].$new
127.0.1.42

When you need to construct arbitrary compound names, consider using dots and slashes so that you can use the file name commands. You can also use blanks, of course, in which case you can use the string commands. However, since blanks are used as argument separators, you have to be much more careful when using commands such as eval.

File Information

The file command can be used to test for various attributes of a file. Listed below are a number of predicates and their meanings. Each variation returns a 0 if the condition is false for the file or 1 if it is true. Here is an example to test whether the file /tmp/foo exists:

tclsh> file exists /tmp/foo
1

The predicates are:

file isdirectory file

true if file is a directory

file isfile file

true if file is a plain file (i.e., not a directory, device, etc.)

file executable file

true if you have permission to execute file

file exists file

true if file exists

file owned file

true if you own file

file readable file

true if you have permission to read file

file writable file

true if you have permission to write file

All the predicates return 0 if the file does not exist.
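The predicates combine naturally in ordinary if commands. The following sketch classifies a name using a few of them; describe is a hypothetical procedure name, and the strings it returns are arbitrary.

```tcl
# describe is an invented helper used only for illustration.
proc describe {name} {
    if {![file exists $name]} {
        return "missing"
    }
    if {[file isdirectory $name]} {
        return "directory"
    }
    if {[file isfile $name] && [file readable $name]} {
        return "readable file"
    }
    return "other"
}

set kind [describe /tmp]
```

On a typical UNIX system, kind ends up as "directory".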

While the predicates make it very easy to test whether a file meets a condition, it is occasionally useful to directly ask for file information. Tcl provides a number of commands that do that. Each of these takes a file name as the last argument.

The "file size" command returns the number of bytes in a file. For example:

tclsh> file size /etc/motd
63

The "file atime" command returns the time in seconds when the file was last accessed. The "file mtime" command returns the time in seconds when the file was last modified. The number of seconds is counted starting from January 1, 1970.

The "file type" command returns a string describing the type of the file such as file, directory, characterSpecial, blockSpecial, link, or socket. The "file readlink" command returns the name to which the file points, assuming it is a symbolic link.

The "file stat" command returns the raw values of a file’s inode. Each value is written as an element in an array. The array name is given as the third argument to the file command. For example, the following command writes the information to the array stuff.

file stat /etc/motd stuff

Elements are written for atime, ctime, mtime, type, uid, gid, ino, mode, nlink, dev, size. These are all written as integers except for the type element which is written as I described before. Most of these values are also accessible more directly by using one of the other arguments to the file command. However, some of the more unusual elements (such as nlink) have no corresponding analog. For example, the following command prints the number of links to a file:

file stat $filename fileinfo
puts "$filename has $fileinfo(nlink) links"

If the file is a symbolic link, "file stat" returns information about the file to which it points. The "file lstat" command works similarly to "file stat" except that it returns information about the link itself. See the UNIX stat documentation for more information on stat and lstat.

All of these file information commands require the file to exist or else they will generate an error. This error can be caught using catch.
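For instance, a comparison of modification times in the style of make might be sketched as follows. The procedure name newer is hypothetical, and treating a missing file as "older than anything" is just one possible policy.

```tcl
# newer is an invented name: returns 1 if file a was modified
# more recently than file b, and 0 otherwise.
proc newer {a b} {
    if {[catch {file mtime $a} ta]} {
        return 0          ;# a does not exist: it cannot be newer
    }
    if {[catch {file mtime $b} tb]} {
        return 1          ;# b does not exist: a wins by default
    }
    expr {$ta > $tb}
}
```

Each catch stores either the modification time or the error message in its variable, so the times are only compared once both commands have succeeded.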

Executing UNIX Commands

Most UNIX commands can be executed by calling exec. The arguments generally follow the /bin/sh conventions including ">", "<", "|", "&", and variations on them. Use whitespace before and after the redirection symbols.

tclsh> exec date
Thu Feb 24  9:32:00 EST 1994
tclsh> exec date | wc -w
       6
tclsh> exec date > /tmp/foo
tclsh> exec cat /tmp/foo
Thu Feb 24  9:32:03 EST 1994

Unless redirected, the standard output of the exec command is returned as the result. This enables you to save the output of a program in a variable or use it in another command.

tclsh> puts "The date is [exec date]"
The date is Thu Feb 24  9:32:17 EST 1994

Tcl assumes that UNIX programs return the exit value 0 if successful. Use catch to test whether a program succeeds or not. The following command returns 0 if mv succeeds and 1 if it fails, which could, for example, indicate that a file did not exist.

catch {exec mv oldname newname}

Many programs return nonzero exit values even if they were successful. For example, diff returns an exit value of 1 when it finds that two files are different. Some UNIX programs are sloppy and return a random exit value which can generate an error in exec. An error is also generated if a program writes to its standard error stream. It is common to use catch with exec to deal with these problems.
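When the actual exit value matters, it can be dug out of the global variable errorCode, which Tcl sets to a list beginning with CHILDSTATUS when a child process exits with a nonzero value. The following is only a sketch; exit_status is a hypothetical name, and returning -1 for every other kind of failure is an arbitrary choice.

```tcl
# exit_status is an invented helper: it runs a command and returns
# its exit value, or -1 if exec failed for some other reason
# (for example, the program wrote to its standard error).
proc exit_status {args} {
    global errorCode
    if {[catch {eval exec $args} output]} {
        if {[lindex $errorCode 0] == "CHILDSTATUS"} {
            # errorCode looks like: CHILDSTATUS pid exitvalue
            return [lindex $errorCode 2]
        }
        return -1
    }
    return 0
}

set status [exit_status sh -c "exit 3"]
```

Here status is 3, the value that the shell was told to exit with.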

Tilde substitution is performed on the command but not on the arguments, and no globbing is done at all. So if you want to delete all the .o files in a directory, for instance, use glob to expand the pattern and eval to pass each resulting file name to rm as a separate argument:

eval exec rm [glob *.o]

Beyond the /bin/sh conventions, exec supports special redirections to reference open files. In particular, an @ after a redirection symbol introduces a file identifier returned from open. For example, the following command writes the date to an open file.

set file [open /tmp/foo w]
exec date >@ $file

The exec command has a number of other esoteric features. See the reference documentation for more information.

Environment Variables

The global array env is pre-initialized so that each element corresponds to an environment variable. For example, the path is a list of directories to search for executable programs. From the shell, the path is stored in the variable PATH. When using Tcl, the path is contained in env(PATH). It is manipulated just like any other variable.

tclsh> set env(PATH)
/usr/local/bin:/usr/bin:/bin
tclsh> set env(PATH) ".:$env(PATH)"    ;# prepend current dir
.:/usr/local/bin:/usr/bin:/bin

Modifications to the env array do not affect the parent environment, but new processes that are created (using exec, for instance) will inherit the current values (including any new elements that have been created).
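The inheritance can be seen with a small sketch such as the following. The environment variable GREETING is a made-up name; the braces keep Tcl from substituting $GREETING so that the shell does it instead.

```tcl
# GREETING is a hypothetical environment variable.
set env(GREETING) hello

# The child shell, not Tcl, expands $GREETING here.
set result [exec sh -c {echo $GREETING}]
```

The child process prints the value it inherited, so result is "hello".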

Handling Unknown Commands

The unknown command is called when another command is executed which is not known to the interpreter. Rather than simply issuing an error message, this gives you the opportunity to handle the problem and recover in an intelligent way. For example, you could attempt to re-evaluate the arguments as an expression. This would allow you to be able to evaluate expressions without using the expr command.

set a [1+1]

To make unknown do what you want, simply define it as a procedure. The list of arguments is available as a parameter to the unknown command. Here is a definition of unknown which supports expression evaluation without having to specify the expr command:

proc unknown {args} {
    expr $args
}

By default, Tcl comes with a definition for unknown that does a number of things such as attempt history substitution. I will only go into detail on the most useful action that unknown takes—retrieving procedure definitions from libraries.
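Replacing unknown outright throws away that default behavior. One way to keep it, sketched below under the assumption that the default unknown is an ordinary procedure that can be renamed, is to try the expression first and fall back to the original handler. The name default_unknown is hypothetical.

```tcl
# Keep the stock handler under a new (invented) name.
rename unknown default_unknown

proc unknown {args} {
    # First try the words as an expression.  Note that expr
    # performs its own round of substitution on them.
    if {[catch {expr $args} result] == 0} {
        return $result
    }
    # Otherwise hand the words to the original handler,
    # evaluated in the scope of the caller.
    uplevel 1 default_unknown $args
}

set a [1+1]
```

With this definition, a is set to 2, while commands that are not valid expressions are still passed along for library loading and the other default actions.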

Libraries

By default, the unknown command tries to find procedure definitions in a library. A library is simply a file that contains procedure definitions. Libraries can be explicitly read using the source command. However, it is possible to prepare an index file which describes the library contents in such a way that Tcl knows which library to load based on the command name. Once a library is loaded, the unknown command calls the new procedure just defined. After the procedure completes, unknown completes and it appears as if the procedure had been defined all along.

As an example, one of Tcl’s default libraries defines the parray procedure. parray prints out the contents of an array. It is a parameterized version of the code on page 51. The info command shows that parray is not defined before it is invoked, but it is defined afterwards.

tclsh> info command parray
tclsh> parray color
color(pavement) = black
color(snow)     = white
tclsh> info command parray
parray

You can add procedures to the libraries or create new libraries. See the Tcl reference material for more information on using libraries.
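As a rough sketch of the mechanics, the standard auto_mkindex procedure writes an index file (named tclIndex) for a directory, and the global list auto_path tells unknown which directories to search. The directory name, file name, and the greet procedure below are all assumptions made up for the illustration, and the fragment is meant to be run at the top level of a script.

```tcl
# Hypothetical library directory; [pid] keeps the name unique.
set libdir /tmp/mylib.[pid]
file mkdir $libdir

# A library file containing a single (invented) procedure.
set f [open $libdir/greet.tcl w]
puts $f {proc greet {who} { return "hello, $who" }}
close $f

auto_mkindex $libdir *.tcl         ;# writes $libdir/tclIndex
lappend auto_path $libdir          ;# tell unknown where to look

set before [info commands greet]   ;# empty: not loaded yet
set msg    [greet world]           ;# unknown loads greet.tcl, then calls greet
set after  [info commands greet]   ;# now contains greet

file delete -force $libdir         ;# clean up the scratch directory
```

As with parray, the procedure does not exist until the first attempt to call it.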

Is There More To Tcl?

This chapter has covered most of the Tcl commands and data structures. I will expand on a few of these descriptions later in the book, but for the most part, you have now seen the entire Tcl language.

Even though Tcl is a small language, it is capable of handling very large and sophisticated scripts. However, Tcl was originally designed for writing small scripts with most of the work being done in the underlying commands themselves. Indeed, Tcl supports the ability to add additional commands written in other languages such as C and C++. This is useful for commands that must be very fast or do something unusual (such as the Expect commands do).

Fortunately, the need to resort to implementing your own commands is growing increasingly rare. People have already written commands for just about anything you can imagine. They are packaged into collections called extensions and are available from the Tcl archive. I have already mentioned [incr Tcl] which provides commands for object-oriented programming. Another popular extension is TclX, which provides commands for most of the UNIX system and library calls. There are a variety of extensions to support different databases (e.g., SQL, Oracle, Dbm). And there are many extensions to support graphics (e.g., SIPP, GL, PHIGS). These extensions and others are described in the Tcl FAQ (page 20). In Chapter 22 (p. 507), I will describe how to add existing extensions to Expect.

If none of these extensions provides what you are looking for, you can always write your own. Tcl has always supported this way of adding new commands and it is surprisingly easy to do. If you are interested in learning more about this, I recommend Ousterhout’s Tcl and the Tk Toolkit.

Exercises

  1. Is Tcl like any other language you know? Bourne shell? C shell? Lisp? C?

  2. As best as you can remember (or guess), write down the precedence table for Tcl expressions. Now look it up in the reference material. How close were you? Repeat this exercise with Perl, C, Lisp, and APL.

  3. What is the best thing about Tcl? What is the worst thing about Tcl? (That bad, eh?)

  4. Try putting comments where they do not belong—for instance, inside the arguments to a procedure. What happens?

  5. Write a procedure to reverse a string. If you wrote an iterative solution, now write a recursive solution or vice versa.

  6. Repeat the previous exercise but with a list instead of a string.

  7. Write a procedure to rename all the files in a directory ending with ".c" to names ending in ".cc".

  8. Write a procedure that takes a list of variable names and a list of values, and sets each variable in the list to the respective value in the other list. Think of different alternatives to handle the case when the lists are of different lengths.

  9. Write a procedure that creates a uniquely-named temporary file.

  10. Write a procedure that can define other procedures that automatically have access to global variables.



[7] The use of \0 by itself to represent the null character is the only escape not supported. I will describe how to handle nulls in Chapter 6 (p. 153).

[8] When Tcl is installed, it creates a program called tclsh (which stands for “Tcl shell” but is usually pronounced “ticklish”). tclsh is a program that contains only the Tcl commands and interpreter. Typing directly to tclsh is not the usual way to use Tcl, but tclsh is convenient for experimenting with the basic Tcl commands. tclsh actually prompts with a bare "%", but I show it here as "tclsh>" so that you cannot confuse it with the C-shell prompt.

[9] I have long considered numerous levels of precedence to be more a hindrance than a benefit. I am reminded of this whenever I switch back and forth between languages that have differing precedence tables, each with dozens of levels. To avoid mental anguish, I frequently use more parentheses than necessary. In Tcl and the Tk Toolkit, Ousterhout echoes my sentiments when he says: "Except in the simplest and most obvious cases you should use parentheses to indicate the way operators should be grouped; this will prevent errors by you and by others who modify your programs."

[10] That is not a misspelling: all of the list manipulation commands start with an "l".

[11] When I use ordinal terms such as “first”, I mean “index 0”.

[12] Alternatively, you can write your own Tcl command to do this in C, C++, or any other faster language.

[13] If the file is larger than physical memory, algorithms that require multiple passes over strings will cause thrashing.
