Expect does not have its own special-purpose language. Expect uses Tcl, a popular language for embedding in applications. Tcl provides lots of basic commands such as if/then/else, while, and set. Expect extends the language with commands such as expect and interact.
This chapter is an introduction and overview of Tcl. While not covering all of Tcl, this chapter does provide everything that the rest of the book depends on, and this is enough to write virtually any Expect script. Even if you already know Tcl, you may find it helpful to read this chapter. In this chapter, I will emphasize things about Tcl that you may not have thought much about before.
You probably want to get on with using Expect, and I can understand the urge to skip this chapter in the hopes of learning as little Tcl as possible so you can put Expect to work for you now. Please be patient and it will all fit together that much more easily.
If you do skip this chapter and you find yourself wondering about points in the other chapters, turn back to this chapter and read it.
A few concepts will not be covered here but will be explained as they are encountered for the first time in other chapters. The index can help you locate where each command is first defined and used.
I will occasionally mention when a particular Tcl command or feature is similar to C. It is not necessary that you know C in order to use Tcl, but if you do know it, such statements are clues that you can rely on what you already know from that language.
The types of variables are not declared in Tcl. There is no need since there is only one type: string. Every value is a string. Numbers are strings. Even commands and variables are strings! The following commands set the variable name to the string value "Don", the variable word to the value "foobar", and the variable pi to "3.14159".
set name Don
set word foobar
set pi 3.14159
Variable names, values, and commands are case sensitive. So the variable name is different from Name.
To access a variable’s value, prefix the variable name with a dollar sign ($). The following command sets the variable phrase to foobar by retrieving it from the variable word.
set phrase $word
Variable substitutions can occur anywhere in a command, not just at the beginning of an argument. The following command sets the variable phrase2 to the string "word=foobar".
set phrase2 word=$word
You can insert a literal dollar sign by prefixing it with a backslash. The following command sets the variable money to the value "$1000".
set money \$1000
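As a quick check of the substitution rules so far, here is a minimal sketch that combines the commands above and prints the results. The puts command and double quotes are described later in this chapter; nothing else here goes beyond the text.

```tcl
set word foobar
# $word is replaced by the value of word
set phrase $word
# substitution also happens in the middle of an argument
set phrase2 word=$word
# the backslash keeps the dollar sign literal
set money \$1000
puts "$phrase $phrase2 $money"
```

Run as a script, this should print "foobar word=foobar $1000".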
The backslash is also a useful way to embed other special characters in strings. For example, "\t" is a tab and "\b" is a backspace. Most of the Standard C backslash conventions are supported including \ followed by one to three octal digits and \x followed by any number of hex digits.[7] I will mention more of these escapes later.
# stick control-Z in a variable
set controlZ \032
# define control-C
set controlC \x03
# define string with embedded control-Z and tab
set oddword foo\032bar\tgorp
A command beginning with "#" is a comment. It extends to the end of the line. You can think of "#" as a command whose arguments are discarded.
Multiple commands can be strung together if they are separated by a semicolon. A literal semicolon can be embedded by prefacing it with a backslash.
set word1 foo; set word2 bar ;# two commands
set word3 foo\;bar ;# one command
The ";#" sequence above is a common way of tacking comments on the end of a line. The ";" ends the previous command and the "#" starts the comment. Writing the ";#" together avoids the possibility of having your comment unintentionally accepted as additional arguments because of a forgotten semicolon.
Commands are normally terminated at the end of a line, but a backslash at the end of a line allows multi-line commands. The backslash-newline sequence and any following whitespace behaves as if it were a single space. Later, I will show other ways of writing multi-line commands.
set word \
    really-long-string-which-does-not-quite-fit-on-previous-line
Tcl separates arguments by whitespace (blanks, tabs, etc.). You can pass space characters by double quoting an argument. The quotes are not part of the argument; they just serve to keep it together. Tcl is similar to the shell in this respect.
set string1 "Hello world"
Double quotes prevent “;” from breaking things up.
set string2 "A semicolon ; is in here"
Keeping an argument together is all that double quotes do. Character sequences such as $, \t, and \b still behave as before.
set name "Sam"
set age 17
set word2 "My name is $name; my age is $age;"
After execution of these three commands, word2 is left with the value "My name is Sam; my age is 17;".
Notice that in the first command Sam was quoted, while in the second command 17 was not quoted, even though neither contained blanks. When arguments do not have blanks, you do not have to quote them but I often do anyway because they are easier to read. However, I do not bother to quote numbers because quoted numbers simply look funny. Anyway, numbers never contain whitespace.
You can actually have whitespace in command names and variable names in which case you need to quote them too. I will show how to do this later, but I recommend you avoid it if possible.
All commands return values. For example, the pid command returns the process id of the current process. To evaluate a command and use its return value as an argument or inside an argument, embed the command in brackets. The bracketed command is replaced by its return value.
set p "My pid is [pid]."
The set command returns its second argument. The following command sets both a and b to 0.
set b "[set a 0]"
When it comes to deciding what are arguments, brackets are a special case. Tcl groups everything between matching brackets, so it is not necessary to quote arguments that only have spaces inside brackets. The following are all legal.
set b [set a 0]
set b "[set a 0]"
set b "[set a 0]hello world"
set b [set a 0]hello
After execution of this last command, a is set to "0" and b is set to "0hello".
With only one argument, set returns the value of the variable named by that argument.
set c [set b]
Calling set with one argument is similar to using the dollar sign. Indeed, the previous command can be rewritten as "set c $b". However they are not always interchangeable. Consider the two commands:
set $phello
set [set p]hello
In the first, $phello is replaced by the value of the variable phello. In the second, [set p]hello is replaced by the value of the variable p concatenated with the string "hello". In both cases, the outer set then returns the value of the variable with the resulting name.
The $ syntax is shorter but does not mark where the variable name ends. In the rare cases where the variable name runs right into more alphanumeric characters, the one-argument set command in brackets is useful.
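A small sketch of the situation just described, where a variable name runs into following letters. The value abc is only an illustration. The ${p} form is not covered in the text above; it is standard Tcl syntax that marks the end of the name with braces, shown here for comparison.

```tcl
set p abc
# one-argument set in brackets, as described above
set a [set p]hello
# braces around the name mark where it ends
set b ${p}hello
puts "$a $b"
```

Both a and b end up holding "abchello", whereas writing $phello would have looked for a variable named phello.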
The one-argument set command is also useful when entering commands interactively. For example, here is what it looks like when I type a command to the Tcl interpreter:[8]
tclsh> set p
12389
After entering "set p", the return value was printed. When using Tcl interactively, you will always see the return value printed. When executing commands from a script, the return values are discarded. Inside scripts, use the puts command to print to the standard output.
Strings can be treated as expressions having mathematical values. For example, when the string "1+1" is treated as an expression, it has the value 2. Tcl does not automatically treat strings as expressions, but individual commands can. I will demonstrate this in a moment. For now, just remember that expressions are not complete commands in themselves.
Expressions are quite similar to those in the C language. The following are all valid expressions:
1 + 5
1+5
1+5.5
1e10+5.5
(5%3)*sin(4)
1 <= 2
(1 <= 2) || (2 != 3)
All of the usual mathematical operators are available including +, -, *, /, and % (modulo). Many functions exist such as sin, cos, log, and sqrt. Boolean operators include || (or), && (and), and ! (not). The usual comparison operators are available: <= (less than or equal), == (equal), != (not equal), etc. They return 0 if the expression is false and 1 if it is true.
Whitespace may be used freely to enhance readability. A number of numeric forms are supported including scientific notation as well as octal (any number with a leading 0) and hexadecimal (any number with a leading 0x). Functions such as floor, ceil, and round convert from floating-point to integer notation.
Precedence and associativity follow C rules closely. For instance, the expression "1-2*3" is interpreted as "1-(2*3)" because multiplication is of higher precedence than subtraction. All binary operators at the same level of precedence are left-associative. This means, for instance, that the expression "1-2-3" is interpreted as "(1-2)-3". Since Tcl expressions rarely become complex, I will omit a lengthy discussion of the numerous levels of precedence, and instead note that you can always use parentheses to override a particular precedence or associativity.[9] See the Tcl reference material for the complete list of operators and their precedences.
Variable values may also be used in expressions.
1 + $age
$argc < 10
Return values can be used in expressions using the bracket notation. For example, an expression to compare the current process id to 0 is:
[pid] == 0
If the process id is 0, the expression equals 1; otherwise it equals 0.
Expressions are not commands by themselves. Rather, certain commands treat their arguments as expressions, evaluating them in the process of command execution. For example, the while command treats its first argument as an expression. I will describe while and similar commands later.
The expr command takes any number of arguments, evaluates them as a single expression, and returns the result.
set x "The answer is 1 + 3"
set y "The answer is [1 + 3]"
set z "The answer is [expr 1 + 3]"
After evaluation of the first command, x has the value "The answer is 1 + 3". The last command leaves z with the value "The answer is 4". The middle command causes an error. "1 + 3" is not a valid command because 1 is not a command.
Here is a more complicated-looking command (legal this time). It computes a result based on the current process id value and the value of the variable mod.
set x [expr (5 % $mod) + ( 17 == [pid])]
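To make the precedence and comparison rules concrete, here is a short sketch using expr. The variable names a through d and the value of mod are chosen only for illustration.

```tcl
set mod 3
# modulo: 5 % 3
set a [expr 5 % $mod]
# comparisons yield 1 (true) or 0 (false)
set b [expr (1 <= 2) || (2 != 3)]
# multiplication binds tighter than subtraction: 1-(2*3)
set c [expr 1 - 2 * 3]
# subtraction is left-associative: (1-2)-3
set d [expr 1 - 2 - 3]
puts "$a $b $c $d"
```

This should print "2 1 -5 -4".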
Braces are similar to double quotes. Both function as a grouping mechanism; however, braces defer any evaluation of brackets, dollar signs, and backslash sequences. In fact, braces defer everything.
set var1 "a$b[set c]\r"
set var2 {a$b[set c]\r}
After evaluation of these two commands, var1 contains an "a" followed by the values of b and c, terminated by a return character. The variable var2 contains the characters "a", "$", "b", "[", "s", "e", "t", " ", "c", "]", "\", and "r".
As with double quotes, the braces are not part of the argument they enclose. They just serve to group and defer. The primary use of braces is in writing control commands such as while loops, for loops, and procedures. These commands need to see the strings without having $ substitutions and bracket evaluations made.
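The difference between the two grouping styles can be seen by measuring the results. This is a minimal sketch: the values of b and c are arbitrary, and string length is a standard Tcl command that this chapter has not introduced yet.

```tcl
set b 1
set c 2
# double quotes group, but substitution still happens
set var1 "a$b[set c]"
# braces group and defer everything
set var2 {a$b[set c]}
puts [string length $var1]
puts [string length $var2]
```

Here var1 holds the three characters "a12", while var2 holds the ten characters typed between the braces, so the script should print 3 and then 10.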
Control structures change the flow of control. Many of the control structures in Tcl are patterned directly after their C equivalents. Tcl gives you the power to write your own control structures, so if you do not like those of C, you may yet find happiness. I will not describe how to do it, but it is surprisingly easy. (The hard part is designing something that makes sense.)
The while command loops until an expression is false. It looks very similar to a while in the C language. The following while loop computes the factorial of the number stored in the variable count.
set total 1
while {$count > 0} {
    set total [expr $total * $count]
    set count [expr $count-1]
}
The body of the loop is composed of the two set commands between the braces. The body is executed as long as $count is greater than 0.
Taking a step back, the while command has two arguments. The first is the controlling expression. The second is the body. Notice that both arguments are enclosed in braces. That means that no $ substitutions or bracket evaluations are made. For instance, this while command literally gets the string "$count > 0" as its first argument. Similarly for the body. So how does anything happen?
The answer is that the while command itself evaluates the expression. If true (nonzero), the while command evaluates the body. The while command then re-evaluates the expression. As long as the expression keeps re-evaluating to a nonzero value, the while command keeps re-evaluating the body.
It is useful to compare this with the set command. The set command does not do any evaluation of its second argument. Consider this command:
set count [expr $count-1]
The [expr ...] part is evaluated before the set command even begins. If count is 7, the set command sees an argument of 6. In contrast, the while command sees the argument "$count > 0". It would not make any sense to evaluate that expression before the while command, since it has to change every time through the loop.
Using braces this way is fundamental to the correct use of Tcl’s control structures. You will see that all the other ones follow easily from this.
Many loops use a counter of some sort. Incrementing or decrementing a counter is so common that there is a command to simplify it. It is called incr. It modifies the variable given as its first argument. With no other argument incr adds one; otherwise incr adds the remaining argument.
The following two commands are equivalent:
set count [expr $count-1]
incr count -1
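A brief sketch of the three ways incr is typically called; the starting value 5 is arbitrary.

```tcl
set count 5
# no second argument: add one
incr count
# add an explicit amount
incr count 4
# a negative amount decrements
incr count -1
puts $count
```

The counter goes from 5 to 6, then 10, then 9, so the script should print 9.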
The for command is similar to the while command. The for command has a controlling expression and a body. However, before the expression is a "start" argument, and after the expression is a "next" argument.
for start expression next {
    # commands here make up
    # the body of the for
}
Both the start and next arguments are commands. The start argument is executed before the first evaluation of the controlling expression. The next argument is evaluated immediately after the body of the loop.
The code shown earlier to compute a factorial can be simplified using the incr command and a for loop as follows:
for {set total 1} {$count > 0} {incr count -1} {
    set total [expr $total * $count]
}
Either of the start or next argument can be empty, but you have to leave a placeholder. For example, you could express an infinite loop as:
for {} {1} {} {
    ... some command ...
}
In its simplest form, the if command takes a controlling expression and a body to execute if the expression is nonzero. It looks a lot like a while command, but the body is executed at most once.
if {$count < 0} {
    set total 1
}
If present, an optional else fragment is executed only if the expression evaluates to zero. Here is an example:
if {$count < 5} {
    puts "count is less than five"
} else {
    puts "count is not less than five"
}
It is also possible to add more conditions using elseif arguments. Any number of elseif arguments may be used.
if {$count < 0} {
    puts "count is less than zero"
} elseif {$count > 0} {
    puts "count is greater than zero"
} else {
    puts "count is equal to zero"
}
In the while and for commands, the controlling expressions are written with braces to defer their evaluation. Their evaluation is deferred because they need to be re-evaluated repeatedly. The expression in an if is not re-evaluated and so it does not need to be deferred. The braces are still useful to group the arguments of the expression together but if the grouping behavior is not needed, then the braces can be omitted entirely. For example, the following two commands are equivalent:
if $a {incr a}
if {$a} {incr a}
The following two commands are not equivalent.
while $a {incr a}
while {$a} {incr a}
It does not hurt to write braces around all expressions; however, if you frequently read other people’s code, you must get used to seeing the braces omitted in some expressions.
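The following sketch illustrates why braces matter for while but not for if. The starting values are chosen so that both loops terminate; with a nonzero starting value, the unbraced form would loop forever because while would see a constant.

```tcl
set a 0
# $a is substituted before while runs, so while sees the constant "0"
# and the body never executes
while $a {incr a}

set b -3
# the braced expression is re-evaluated on every pass,
# so the loop stops once b reaches 0
while {$b} {incr b}
puts "$a $b"
```

Both variables finish at 0, but for different reasons: the first loop never ran, while the second ran three times.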
The switch command is similar to the if command but is more specialized. Instead of evaluating a mathematical expression, the switch command compares a string to a set of patterns. Each pattern is also associated with a body of Tcl commands. The first pattern that matches has its associated body evaluated.
Here is a fragment that sets the variable type depending on the value of count. For example, if count is the string big, then type is set to array. If count matches none of the choices, the special default body is used.
switch -- $count \
    1 {
        set type byte
    } 2 {
        set type word
    } big {
        set type array
    } default {
        puts "$count is not a valid type"
    }
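Shell-style pattern matching is also available. For example, "?" matches any single character and "*" matches any string. (In current Tcl releases, switch compares the string exactly unless a flag such as -glob asks for shell-style matching.) I will describe the different types of pattern matching in more detail later.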
The "--" immediately after "switch" is a placeholder for several flags that can modify the way the switch command works. The flags are rarely used so I will not describe them; nevertheless it is still a good idea to use the "--". This will prevent your string from inadvertently matching a flag.
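As an illustration of one such flag, the sketch below asks for shell-style matching explicitly with -glob, which is a standard option of the switch command. The value big7, the pattern big*, and the fallback value unknown are invented for the example.

```tcl
set count big7
# -glob selects shell-style patterns; "--" still ends the flags
switch -glob -- $count \
    1       { set type byte } \
    2       { set type word } \
    big*    { set type array } \
    default { set type unknown }
puts $type
```

Because big7 matches the pattern big*, type is set to array.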
By default, commands do not continue beyond the end of a line. However, there are several exceptions to this. One is that a backslash at the end of a line continues the command. I used this in the previous example where the first line of the switch had a backslash to continue the command. Without it, the command would have ended after $count and the 1 on the next line would have mistakenly been interpreted as another command.
Another exception is that open braces cause commands to continue across lines. This is precisely how I have written all of the other multi-line examples so far. Fortunately, this style looks a lot like another common style—the C language. Even if you are not used to C, it will be helpful if you adopt the C formatting style—just leave an open brace at the end of the current line and you can omit the backslashes.
Consider the following three if commands:
if {$count < 0} {
    puts "count is less than zero"
}
if {$count < 0} \
{
    puts "count is less than zero"
}
if {$count < 0}
{
    puts "count is less than zero"
}
The first two examples are correct. The third one is incorrect: the if command is missing a body.
Open braces nest, so this guideline works if you write braced commands inside of braces. Later in this chapter and in the next, I will return to the subject of braces and how to use them effectively.
Double quotes and brackets also cause commands to continue across lines. As before, a "\n" or literal newline is retained and a backslash-newline-whitespace sequence is replaced by a single space. Compare the following two commands:
set oneline "hello\
    world"
set twolines "hello
    world."
After execution, oneline is set to "hello world", while twolines is set to "hello<newline><space><space><space><space>world.".
The break and continue commands change the normal flow inside control structure commands such as for and while.
The break command causes the current loop command to return so that the next command after the loop can run. For example, the following would loop infinitely except for the break command in the middle. If a ever equals three, the break command will execute and the while command will return.
set a 0
while {1} {
    incr a
    if {$a == 3} break
    puts "hello"
}
The continue command drives control back to the top of the loop so that no more commands are executed during the current iteration. In the following example, the continue is executed whenever the value of a modulo three is not equal to zero. This has the effect of printing "hello" three times for every "there" printed.
set a 0
while {1} {
    incr a
    puts "hello"
    if {$a%3 != 0} continue
    puts "there"
}
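Since the loop above never terminates on its own, here is a bounded sketch that uses both commands and records what happened. The variable out and the limits are made up for the example; lappend, which adds an element to a list, is mentioned later in the chapter.

```tcl
set out {}
for {set a 1} {$a <= 10} {incr a} {
    # skip even numbers; in a for loop, the "next" argument still runs
    if {$a % 2 == 0} continue
    # leave the loop entirely once a passes 7
    if {$a > 7} break
    lappend out $a
}
puts $out
```

The loop collects 1, 3, 5, and 7, skips the even values, and stops when a reaches 9.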
It is possible to create your own commands using the proc command. Such commands are called procedures but they behave the same as if they were built-in commands.
The proc command takes three arguments. The first argument is a command name. The second argument is a list of variables, each initialized to an argument of the procedure when it is called. (The variables are occasionally called formal parameters or just parameters to distinguish them from the actual arguments to which they are set.) The third argument of proc is a body of code.
The following command defines a procedure called fib that computes the nth Fibonacci number given any two starting numbers. Fibonacci numbers are sequences of numbers where each new number in the sequence is generated by adding the most recent two together. The starting two numbers are the first two arguments. The last argument defines which element of the sequence to return.
proc fib {pen ult n} {
    for {set i 0} {$i<$n} {incr i} {
        set new [expr $ult+$pen]
        set pen $ult ;# new penultimate value
        set ult $new ;# new ultimate value
    }
    return $pen
}
The return command in the last line takes its argument and makes the procedure fib itself return with that value. Once defined, fib can then be used as a command. For example, it could be called as:
set m [fib 0 1 9]
Although fib always returns a number, any string can be returned using return. A lot of procedures do not have a need to return anything; they just need to return. In this case, it is not necessary to provide return with an argument. The return command itself may be omitted if it would otherwise be the last command in a procedure. Then, the procedure returns with whatever is returned by the last command executed within the procedure.
You can force a single procedure to return by using return. In contrast, use the exit command to make the script (i.e., process) end and return to the shell (or whatever invoked the original script). The exit command works even inside of a procedure. The exit command can only return a number because that is all that UNIX permits. For example:
exit 1
Procedures can be called only after they are defined. Procedures share a single namespace with no name scoping. That means that once a procedure is defined, it can be called from any procedure, including itself. Here is another version of fib. This one is recursive: it calls itself.
proc fib {pen ult n} {
    if {$n == 0} {
        return $pen
    }
    return [fib $ult [expr $pen+$ult] [expr $n-1]]
}
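To check that the loop-based and recursive definitions agree, the sketch below defines both under different names and calls them with the same arguments. The names fib_iter and fib_rec are hypothetical, introduced only so the two versions can coexist in one script.

```tcl
# iterative version, following the loop-based definition
proc fib_iter {pen ult n} {
    for {set i 0} {$i < $n} {incr i} {
        set new [expr {$ult + $pen}]
        set pen $ult
        set ult $new
    }
    return $pen
}

# recursive version, following the definition that calls itself
proc fib_rec {pen ult n} {
    if {$n == 0} {
        return $pen
    }
    return [fib_rec $ult [expr {$pen + $ult}] [expr {$n - 1}]]
}

puts [fib_iter 0 1 9]
puts [fib_rec 0 1 9]
```

Starting from 0 and 1, both calls should print 34.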
In contrast, variables are usually local to the current procedure. In the example above, the variables ult, pen, and n are not visible to any procedures that fib calls, such as expr. They are not even visible to other invocations of the fib procedure. The fib procedure calls expr, passing it the value of ult but not the string "ult". The expr command cannot modify the variable ult. Computer scientists call this pass by value.
It is possible for a procedure to change a variable in the caller’s scope. The set and incr commands handle their first argument this way. This technique is called pass by reference, and I will explain how to write procedures that do this on page 56. Another technique to communicate values between commands is to use global variables. Using a lot of global variables can be confusing, but since most scripts are short, global variables are a very popular technique.
The global command identifies variables to consider global. This means that references to those variables inside the procedure are the same as references to those variables outside all the procedures.
For example, you could define the constant pi as a global variable outside any procedure. A procedure that needed the value would then access it using the global command.
proc area_of_circle {radius} {
    global pi
    return [expr $pi*$radius*$radius]
}
You can list additional variable names as arguments to a global command, and you can have multiple global commands in a procedure.
global pi e golden_ratio
The return command may be omitted if it is the last command in a procedure. In that case, the procedure returns the last value computed. For example, the previous procedure could be simplified as follows:
proc area_of_circle {radius} {
    global pi
    expr $pi*$radius*$radius
}
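The effect of the global command is easiest to see by leaving it out. In this sketch, broken_area is a hypothetical procedure that omits the declaration, and catch is a standard Tcl command (not yet covered in this chapter) that traps the resulting error and returns 1 when one occurs.

```tcl
set pi 3.14159

proc area_of_circle {radius} {
    global pi
    expr {$pi * $radius * $radius}
}

proc broken_area {radius} {
    # no "global pi" here, so $pi refers to a local variable
    # that does not exist
    expr {$pi * $radius * $radius}
}

puts [area_of_circle 2]
set failed [catch {broken_area 2} msg]
puts "$failed: $msg"
```

The first call succeeds, while the second fails with a message saying that the variable pi cannot be read.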
Procedures and variables that are used in numerous scripts can be stored in other files, allowing them to be conveniently shared. A file of commands can be read with the source command. Its argument is the name of a file to read. Tcl understands the tilde convention from the C shell. For example, the following command reads the file definitions.tcl from your home directory:
source ~/definitions.tcl
As the file is read, the commands are executed. So if the file contains procedure definitions, the procedures will be callable after the source command returns. Tcl’s source command is similar to the C shell’s source command.
Tcl’s library facility provides a way to automatically source files as needed. It is described on page 66.
The return command can be used to make a source command return. Otherwise, source returns only after executing the last command in the file.
In the while and if command examples, I enclosed the expressions in braces and said that the expressions were evaluated by the commands themselves. For example, in the while command, the expression was
$count > 0
Because the expression was wrapped in braces, evaluation of $count was deferred. During expression evaluation, a $ followed by a variable name is interpreted in just the same way it is done with arguments that are not wrapped in braces. Similarly, brackets are also interpreted the same way in both contexts. For this reason, commands like the following two have the same result. But in the first command, $count is evaluated before expr executes, while in the second command, expr itself evaluates $count.
expr $count > 0
expr {$count > 0}
The expr command can perform some string operations. For example, quoted strings are recognized as string literals. Thus, you can say things like:
expr {$name == "Don"}
Unquoted strings cannot be used as string literals. The following fails:
expr {$name == Don}
Unquoted strings are not permitted inside expressions to prevent ambiguities in interpretation. But strings in expressions can still be tricky. In fact, I recommend avoiding expr for these implicit string operations. The reason is that expr tries to interpret strings as numbers. Only if they are not numbers are they treated as uninterpreted strings. Consider:
tclsh> if {"0x0" == "0"} {puts equal}
equal
Strings which are not even internally representable as numbers can cause problems:
tclsh> expr {$x=="1E500"}
floating-point value too large to represent
    while executing
"expr {$x=="1E500"}"
On page 46, I will describe the "string compare" operation, which is a better way of comparing arbitrary strings. However, because many people do use expr for string operations, it is important to be able to recognize and understand it in scripts.
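As a preview, the sketch below contrasts the two approaches on the strings from the example above. The variable names are arbitrary. The string compare command returns 0 when its two arguments are identical and a nonzero value otherwise.

```tcl
set a 0x0
set b 0
# expr interprets both strings as numbers, so they compare equal
set numeric [expr {$a == $b}]
# string compare looks at the characters themselves
set textual [string compare $a $b]
puts "$numeric $textual"
```

Here numeric is 1 because both strings represent zero, while textual is nonzero because the strings themselves differ.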
In the proc command, the second argument was a list of variables.
proc fib {ult pen n} {
The parameter list is just a string containing the characters "u", "l", "t", " ", "p", "e", "n", " ", and "n". Intuitively, the string can also be thought of as a list of three elements: ult, pen, and n. The whitespace just serves to separate the elements.
Lists are very useful, and Tcl provides many commands to manipulate them. For example, llength returns the length of a list.[10]
tclsh> llength "a b c"
3
tclsh> llength ""
0
tclsh> llength [llength "a b c"]
1
In the next few sections, I will describe more commands to manipulate lists.
The lindex and lrange commands select elements from a list by their index. The lindex command selects a single element. The lrange command selects a set of elements. Elements are indexed starting from zero.[11]
tclsh> lindex "a b c d e" 0
a
tclsh> lindex "a b c d e" 2
c
tclsh> lrange "a b c d e" 0 2
a b c
tclsh> llength [lrange "a b c d e" 0 2]
3
You can step through the members of a list using an index and a for loop. Here is a loop to print out the elements of a list in reverse.
for {set i [expr [llength $list]-1]} {$i>=0} {incr i -1} {
    puts [lindex $list $i]
}
Iterating from front to back is much more common than the reverse. In fact, it is so common, there is a command to do it called foreach. The first argument is a variable name. Upon each iteration of the loop, the variable is set to the next element in the list, provided as the second argument. The third argument is the loop body.
For example, this fragment prints each element in list.
foreach element $list {
    puts $element
}
After execution, the variable element remains set to the last element in the list.
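The two styles of iteration can be compared side by side. This sketch walks a sample list forward with foreach and backward with an index; the list contents and the names forward and backward are illustrative only, and lappend is used to collect the results.

```tcl
set list {a b c d e}

# front to back with foreach
set forward {}
foreach element $list {
    lappend forward $element
}

# back to front with an index and a for loop
set backward {}
for {set i [expr {[llength $list] - 1}]} {$i >= 0} {incr i -1} {
    lappend backward [lindex $list $i]
}

puts $forward
puts $backward
puts $element
```

The last line shows that element is still set to e after the foreach loop has finished.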
Usually procedures must be called with the same number of arguments as they have formal parameters. When a procedure is called, each parameter is set to one of the arguments. However, if the last parameter is named args, all the remaining arguments are stored in a list and assigned to args. For example, imagine a procedure p1 definition that begins:
proc p1 {a b args} {
If you call it as "p1 red box cheese whiz", a is set to red, b is set to box, and args is set to "cheese whiz". If called as "p1 red box", args is set to the empty list.
Here is a more realistic example. The procedure sum takes an arbitrary number of arguments, adds them together, and returns the total.
proc sum {args} {
    set total 0
    foreach int $args {
        incr total $int
    }
    return $total
}
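The behavior of args can be checked with a few calls. In this sketch, sum is the procedure defined above, and the body given to p1 is an assumption made for illustration: it simply reports how many extra arguments it received.

```tcl
proc sum {args} {
    set total 0
    foreach int $args {
        incr total $int
    }
    return $total
}

# hypothetical body: count the arguments collected in args
proc p1 {a b args} {
    return [llength $args]
}

puts [sum]
puts [sum 1 2 3]
puts [p1 red box cheese whiz]
puts [p1 red box]
```

The calls should print 0, 6, 2, and 0 in that order.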
I will show another example of a procedure with varying arguments on page 57.
List elements can themselves be lists. This is a concern in several situations, but the simplest is when elements contain whitespace. For example, there must be a way to distinguish between the list of "a" and "b" and the list containing the single element "a b".
Assuming an argument has begun with a double quote, the next double quote ends the argument. It is possible to precede double quotes with backslashes inside a list, but this is very hard to read, given enough levels of quoting. Here is a list of one element.
set x "\"a b\""
As an alternative, Tcl supports braces to group lists inside lists. Using braces, it is easy to construct arbitrarily complex lists.
set a "a b {c d} {e {f g {xyz}}}"
Assuming you do not need things like variable substitution, you can replace the top-level double quotes with braces as well.
set a {a b {c d} {e {f g {xyz}}}}
This looks more consistent, so it is very common to use braces at the top level to write an argument consisting of a list of lists.
The reason braces work so much better for this than double quotes is that left and right braces are distinguishable. There is no such thing as a right double quote, so Tcl cannot tell when quotes should or should not match. But it is easy to tell when braces match. Tcl counts the left-hand braces and right-hand braces. When they are equal, the list is complete.
This is exactly what happens when Tcl sees a for, while, or other control structure. Examine the while command below. The last argument is the body. There is one open brace on the first line and another on the fourth. The close brace on the sixth line matches one, and the close brace on the last line matches the other, terminating the list and hence the argument.
while {1} { ;# one open brace
    incr a
    puts "hello"
    if {$a%3 != 0} { ;# two open braces
        continue
    } ;# one open brace
    puts "there"
} ;# zero open braces
Double quotes and brackets have no special meaning inside of braces and do not have to match. But braces themselves do. To embed a literal brace, you have to precede it with a backslash.
Here are some examples of lists of lists:
set x { a b c }
set y { a b {Hello world!}}
set z { a [ { ] } }
All of these are three-element lists. The third element of y is a two-element list. The first and second elements of y can be considered one-element lists even though they have no grouping braces around them.
tclsh> llength $y
3
tclsh> llength [lindex $y 2]
2
tclsh> llength [lindex $y 0]
1
The second and third elements of z are both one-element lists.
tclsh> lindex $z 1
[
tclsh> lindex $z 2
 ] 
Notice the spacing. Element two of z is a three-character string. The first and last characters are spaces, and the middle character is a right bracket. In contrast, element one is a single character. The spaces separating the elements of z were stripped off when the elements were extracted by lindex. The braces are not part of the elements.
The braces are, however, part of the string z. Considered as a string, z has eleven characters including the inner braces. The outer braces are not part of the string.
Similarly, the string in y begins with a space and ends with a right brace. The last element of y has only a single space in the middle.
tclsh> lindex $y 2
Hello world!
The assignment to y
could be rewritten with double quotes.
tclsh> set y2 { a b "Hello world!" }
 a b "Hello world!" 
tclsh> lindex $y2 2
Hello world!
In this case, the last element of y2
is the same. But more complicated strings cannot be stored this way. Tcl will complain.
tclsh> set y3 { a b "My name is "Goofy"" }
 a b "My name is "Goofy"" 
tclsh> lindex $y3 2
list element in quotes followed by "Goofy""" instead of space
There is nothing wrong with y3
as a string. However, it is not a list.
This section may seem confusing at first. You might want to come back to it after you have written some Tcl scripts.
With care, lists can be created with the set command or with any command that creates a string with the proper structure. To make things easier, Tcl provides three commands that create strings that are guaranteed to be lists. These commands are list, lappend, and linsert.
The list
command takes all of its arguments and combines them into a list. For example:
tclsh> list a b "Hello world"
a b {Hello world}
In this example, a three-element list is returned. Each element corresponds to one of the arguments. The double quotes have been replaced by braces, but that does not affect the contents. When the third element is extracted, the braces will be stripped off.
tclsh> lindex [list a b "Hello World"] 2
Hello World
The list
command is particularly useful if you need to create a list composed of variable values. Simply appending them is insufficient. If either variable contains embedded whitespace, for example, the list will end up with more than two elements.
tclsh> set a "foo bar \"hello\""
foo bar "hello"
tclsh> set b "gorp"
gorp
tclsh> set ab "$a $b"   ;# WRONG
foo bar "hello" gorp
tclsh> llength $ab
4
In contrast, the list
command correctly preserves the embedded lists. The list
command also correctly handles things such as escaped braces and quotes.
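As a minimal sketch of the difference, the following script rebuilds the two variables from the example above and combines them both ways. The variable names ab_string and ab_list are mine and are used only for illustration.

```tcl
# Two values, one of which contains embedded whitespace and quotes.
set a "foo bar \"hello\""
set b "gorp"

# Plain string substitution loses the boundary between a and b.
set ab_string "$a $b"

# The list command quotes each argument so that it stays one element.
set ab_list [list $a $b]

puts [llength $ab_string]   ;# 4
puts [llength $ab_list]     ;# 2
puts [lindex $ab_list 0]    ;# foo bar "hello"
```

Extracting an element with lindex strips the protective braces again, so the original value of a comes back unchanged.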
If you want to append several lists together, use the concat
command.
tclsh> concat a b "Hello world"
a b Hello world
The concat
command treats each of its arguments as a list. The elements of all of the lists are then returned in a new list. Compare the output from concat
(above) and list
(below).
tclsh> list a b "Hello world"
a b {Hello world}
Here is another example of concat
. Notice that whitespace inside elements is preserved.
tclsh> concat a {b {c d}}
a b {c d}
In practice, concat
is rarely used. However, it is helpful to understand concat
because several commands exhibit concat
-like behavior. For example, the expr command concatenates its arguments together before evaluating them—much in the style of concat
. Thus, the following commands produce the same result:
tclsh> expr 1 - {2 - 3}
-4
tclsh> expr 1 - 2 - 3
-4
Building up lists is a common operation. For example, you may want to read in lines from a file and maintain them in memory as a list. Assuming the existence of commands get_a_line
and more_lines_in_file
, your code might look something like this:
while {[more_lines_in_file]} {
    set list "$list [get_a_line]"
}
The body builds up the list. Each time through the loop, a new line is appended to the end of the list.
This is such a common operation that Tcl provides a command to do this more efficiently. The lappend
command takes a variable name as its first argument and appends the remaining arguments. The example above could be rewritten:
while {[more_lines_in_file]} {
    lappend list [get_a_line]
}
Notice that the first argument is not passed by value. Only the name is passed. Tcl appends the remaining arguments in place—that is, without making a copy of the original list. You do not have to use set
to save the new list. This behavior of modifying the list in place is unusual—the other list commands require the list to be passed by value.
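Besides being efficient, lappend adds each argument as exactly one element, even when the argument contains whitespace. Here is a small sketch of that behavior. The input strings and the variable name lines are invented for the example and stand in for whatever get_a_line would return.

```tcl
# Start from an empty list.
set lines {}

# Hypothetical input lines; the second one contains a space.
foreach line [list "root:0" "dave smith:100" "josh:101"] {
    lappend lines $line
}

puts [llength $lines]    ;# 3
puts [lindex $lines 1]   ;# dave smith:100
```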
Like its name implies, the linsert
command inserts elements into a list. The first argument is the list itself, passed by value (not the name of a variable). The second argument is a numeric index describing where to insert into the list. The remaining arguments are the elements to be inserted. The new list is returned, and the original is not modified.
tclsh> set list {a b c d}
a b c d
tclsh> set list [linsert $list 0 new]
new a b c d
tclsh> linsert $list 1 foo bar {hello world}
new foo bar {hello world} a b c d
The lreplace
command is similar to the linsert
command except that lreplace
deletes existing elements before inserting the new ones. The second and third arguments identify the beginning and ending indices of the elements to be deleted, and the remaining arguments are inserted in their place.
tclsh> set list {a b c d e}
a b c d e
tclsh> lreplace $list 1 3 x y
a x y e
The lsearch
command is the opposite of the lindex
command. The lsearch
command returns the index of the first occurrence of an element in a list. If the element does not exist, −1 is returned.
tclsh> lsearch {a b c d e} "c"
2
tclsh> lsearch {a b c d e} "f"
-1
The lsearch
command uses the element as a shell-style pattern by default. If you want an exact match, use the -exact
flag.
tclsh> lsearch {a b c d ?} "?"
0
tclsh> lsearch -exact {a b c d ?} "?"
4
The lsort
command sorts a list. By default, it sorts in increasing ASCII order.
tclsh> lsort {one two three four five six}
five four one six three two
Several flags are available including -integer
, -real
, and -decreasing
to sort in ways suggested by their names.
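The difference between the default ASCII ordering and the -integer flag is easiest to see on a list of numbers. The list below is made up for illustration.

```tcl
set sizes {10 9 100 1}

# Default: compare elements as strings, so "100" sorts before "9".
puts [lsort $sizes]                       ;# 1 10 100 9

# Compare elements as integers.
puts [lsort -integer $sizes]              ;# 1 9 10 100

# Flags can be combined.
puts [lsort -integer -decreasing $sizes]  ;# 100 10 9 1
```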
It is also possible to supply lsort
with a comparison procedure of your own. This is useful for lists with elements that are lists in themselves. However, lists of more than a hundred or so elements are sorted slowly enough that it is more efficient to have them sorted by an external program such as the UNIX sort
command.[12]
The split
and join
commands are useful for splitting strings into lists, and vice versa—joining lists into strings.
The split
command splits a string into a list. The first argument is the string to be split. The second argument is a string of characters, each of which separates elements in the first argument.
For example, if the variable line contained a line from /etc/passwd
, it could be split as:
tclsh> split $line ":"
root Gw19QKxuFWDX7 0 1 Operator / /bin/csh
The directories of a file name can be split using a "/
“.
tclsh> split "/etc/passwd" "/"
{} etc passwd
Notice the empty element because of the /
at the beginning of the string.
The join
command does the opposite of split
. It joins elements in a list together. The first argument is a list to join together. The second argument is a string to place between all the elements.
tclsh> join {{} etc passwd} "/"
/etc/passwd
With an empty second argument, split
splits between every character, and join
joins the elements together without inserting any separating characters.
tclsh> split "abc" ""
a b c
tclsh> join {a b c} ""
abc
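Because split and join are opposites, they combine naturally with the list commands described earlier. The sketch below assumes that line holds a password entry consistent with the split output shown above. It changes the last field and reassembles the line.

```tcl
# Assumed input, reconstructed from the split output shown earlier.
set line "root:Gw19QKxuFWDX7:0:1:Operator:/:/bin/csh"

# Break the line into its seven colon-separated fields.
set fields [split $line ":"]
puts [lindex $fields 0]      ;# root
puts [lindex $fields 6]      ;# /bin/csh

# Replace the shell (element 6) and put the line back together.
set fields [lreplace $fields 6 6 /bin/sh]
puts [join $fields ":"]      ;# root:Gw19QKxuFWDX7:0:1:Operator:/:/bin/sh
```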
There are a number of other useful commands for string manipulation. These include scan
, format
, string
, and append
. Two more string manipulation commands are regexp
and regsub
. Those two commands require a decent understanding of regular expressions, so I will hold off describing regexp
and regsub
until Chapter 6 (p. 135). Then, the commands will be much easier to understand.
The scan
and format
commands extract and format substrings corresponding to low-level types such as integers, reals, and characters. scan
and format
are good at dealing with filling, padding, and generating unusual characters. These commands are analogous to sscanf
and sprintf
in the C language, and most of the C conventions are supported.
As an example, the following command assigns to x a string composed of a ^A immediately followed by "foo     ==1700.000000" (the number of zeros after the decimal point may differ on your system). The string "foo" is left-justified in an eight-character field, so it is padded on the right with five spaces.
set x [format "%1c%-8s==%f" 1 foo 17.0e2]
The first argument is a description of how to print the remaining arguments. The remaining arguments are substituted for the fields that begin with a "%
“. In the example above, the "-
" means “left justify” and the 8 is a minimum field width. The "c
“, "s
“, and "f
" force the arguments to be treated as a character, a string, and a real number (f stands for float), respectively. The "==
" is passed through literally since it does not begin with a "%
“.
The scan
command does the opposite of format
. For example, the output above can be broken back down with the following command:
scan $x "%c%8s%*\[ =]%f" char string float
The first argument, $x
, holds the string to be scanned. The first character is assigned to the variable char
. The next string (ending at the first whitespace or after eight characters, whichever comes first) is assigned to the variable string. Any number of blanks and equal signs are matched and discarded. The asterisk suppresses the assignment. Finally, the real is matched and assigned to the variable float
. The scan
command returns the number of percent-style formats that matched.
I will not describe scan
and format
further at this point, but I will return to them later in the book. For a complete discussion, you can also consult the Tcl reference material or any C language reference.
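To see the two commands work together, here is a sketch that formats the string from the example above and then scans it back. The variable n is my own addition to capture the return value of scan. I have put the scan format in braces so that the left bracket is not taken as the start of a command. Note that in Tcl, the %c conversion of scan stores the character's integer code rather than the character itself.

```tcl
# Build the string: ^A, then "foo" padded to eight columns, "==", a real.
set x [format "%1c%-8s==%f" 1 foo 17.0e2]

# Take it apart again. Braces keep Tcl from evaluating [ =] as a command.
set n [scan $x {%c%8s%*[ =]%f} char string float]

puts $n        ;# 3 (the suppressed %*[ =] field is not counted)
puts $char     ;# 1 (integer code of ^A)
puts $string   ;# foo
```

The variable float holds the real number 1700. Its printed form can vary with the Tcl version, so it is not shown here.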
The string
command is a catchall for a number of miscellaneous but very useful string manipulation operations. The first argument to the string
command names the particular operation. While discussing these operations, I will refer to the remaining arguments as if they were the only arguments.
As with the list commands, the string commands also use zero-based indices. So the first character in a string is at position 0, the second character is at position 1, and so on.
The compare
operation compares the two arguments lexicographically (i.e., dictionary order). The command returns −1, 0, or 1, depending on if the first argument is less than, equal to, or greater than the second argument.
As an example, this could be used in an if
command like so:
if {[string compare $a $b] == 0} {
    puts "strings are equal"
} else {
    puts "strings are not equal"
}
The match
operation returns 1 if the first argument matches the second or 0 if it does not match. The first argument is a pattern similar in style to the shell, where *
matches any number of characters and ?
matches any single character.
tclsh> string match "*.c" "main.c"
1
I will cover these patterns in more detail in Chapter 4 (p. 87). The regexp
command provides more powerful patterns. I will describe regexp
in Chapter 5 (p. 107).
The first
operation searches for a string (first argument) in another string (second argument). It returns the first position in the second argument where the first argument is found. −1 is returned if the second string does not contain the first.
tclsh> string first "uu" "uunet.uu.net"
0
The last
operation returns the last position where the first argument is found.
tclsh> string last "uu" "uunet.uu.net"
6
The length
operation returns the number of characters in the string.
tclsh> string length "foo"
3
tclsh> string length ""
0
The index
operation returns the character corresponding to the given index (second argument) in a string (first argument). For example:
tclsh> string index "abcdefg" 2
c
The range
operation is analogous to the lrange
command, but range
works on strings. Indices correspond to character positions. All characters between the indices inclusive are returned. The string "end
" may be used to refer to the last position in the string.
tclsh> string range "abcdefg" 2 3
cd
The tolower
and toupper
operations convert an argument to lowercase and uppercase, respectively.
tclsh> string tolower "NeXT"
next
The trimleft
operation removes characters from the beginning of a string. The string is the first argument. The characters removed are any which appear in the optional second argument. If there is no second argument, then whitespace is removed.
string trimleft $num "-" ;# force $num nonnegative
The trimright
operation is like trimleft
except that characters are removed from the end of the string. The trim
operation removes characters from both the beginning and the end of the string.
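The three trimming operations can be compared side by side. In this sketch, the value of raw is invented, and the vertical bars are printed only to make the remaining whitespace visible.

```tcl
set raw "  --42--  "

# Without a second argument, only whitespace is removed.
puts "[string trimleft $raw]|"     ;# --42--  |
puts "|[string trimright $raw]"    ;# |  --42--

# With a second argument, any of the listed characters are removed
# from both ends (here: spaces and dashes).
puts [string trim $raw " -"]       ;# 42
```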
The following command appends a string to another string in a variable.
set var "$var$string"
Appending strings is a very common operation. For example, it is often used in a loop to read the output of a program and to create a single variable containing the entire output.
Appending occurs so frequently that there is a command specifically for this purpose. The append
command takes a variable name as the first argument and appends all of the remaining strings to that variable.
tclsh> append var "abc"
abc
tclsh> append var "def" "ghi"
abcdefghi
Notice that the first argument is not passed by value. Only the name is passed. Tcl appends the remaining arguments in place, that is, without making a copy of the original string. This allows Tcl to take some shortcuts internally. Using append
is much more efficient than the alternative set
command.
This behavior of modifying the string in place is unusual—none of the string
operations work this way. However, the lappend
command does. So just remember that append
and lappend
work this way. It might be helpful to go back to the lappend
description (page 42) now and compare its behavior with append
.
Both append
and lappend
share a few other features. Neither requires that the variable be initialized. If uninitialized, the first string argument is set rather than appended (as if the set
command had been used). append
and lappend
also return the final value of the variable. However, this return value is rarely used since both commands already store the value in the variable.
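Here is a short sketch of both shared features. The variable names buf and items are arbitrary. The two catch commands simply make sure the variables do not exist beforehand.

```tcl
# Make sure neither variable exists (ignore the error if they already do not).
catch {unset buf}
catch {unset items}

# append creates buf on first use and returns the new value each time.
puts [append buf "abc"]          ;# abc
puts [append buf "def" "ghi"]    ;# abcdefghi

# lappend does the same, but each argument becomes one list element.
puts [lappend items "a b" c]     ;# {a b} c
puts [llength $items]            ;# 2
```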
Earlier, I described how multiple strings can be stored together in a list. Tcl provides a second mechanism for storing multiple strings together called arrays.
Each string stored in an array is called an element and has an associated name. The element name is given in parentheses following the array name. For example, an array of user ids could be defined as:
set uid(0) "root"
set uid(1) "daemon"
set uid(2) "uucp"
set uid(100) "dave"
set uid(101) "josh"
. . .
Once defined, elements can then be accessed by name:
set number 101
puts "User id $number is $uid($number)"
You can use any string as an array element name, not just a number. For example, the following additional assignments allow user ids to be looked up by either user id or name.
set uid(root) 0
set uid(daemon) 1
set uid(uucp) 2
set uid(dave) 100
set uid(josh) 101
Because element names can be arbitrary strings, it is possible to simulate multi-dimensional arrays or structures. For example, a password database could be stored in an array like this:
set uid(dave,uid) 100
set uid(dave,password) diNBXuprAac4w
set uid(dave,shell) /usr/local/bin/zsh
set uid(josh,uid) 101
set uid(josh,password) gS4jKHp1AjYnd
set uid(josh,shell) /usr/local/bin/tcsh
Now, an arbitrary user’s shell can be retrieved as $uid($user,shell)
. The choice of a comma to separate parts of the element name is arbitrary. You can use any character and you can use more than one in a name.
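The following sketch shows such a lookup in a loop. It repeats a few of the assignments from above so that it can be run on its own. The loop variable user is my own choice of name.

```tcl
set uid(dave,uid) 100
set uid(dave,shell) /usr/local/bin/zsh
set uid(josh,uid) 101
set uid(josh,shell) /usr/local/bin/tcsh

# The element name is built by substituting $user before the lookup.
foreach user {dave josh} {
    puts "$user has uid $uid($user,uid) and runs $uid($user,shell)"
}
```

Each element name, such as "dave,shell", is still just a single string. The comma has no special meaning to Tcl.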
It is possible to have element names with whitespace in them. For example, it might be convenient to find out the user name, given a full name. Doing it in two steps is easy and usually what you want anyway—presumably, the name variable is set elsewhere.
set name "John Ousterhout"
set uid($name) ouster
If you just want to embed a literal array reference that contains whitespace, you have to quote it. Remember, any string with whitespace must be quoted to keep it as a single argument (unless it is already in braces).
set "uid(John Ousterhout)" ouster
This is not specific to arrays. Any variable containing whitespace can be set similarly. The following sets the variable named "a b
“.
set "a b" 1
From now on when I want to explicitly talk about a variable that is not an array, I will use the term scalar variable.
Earlier, I described how to pass scalar variables into procedures—as parameters or globals. Arrays can be accessed as globals, too. (Name the array in a global
command and all of the elements become visible in the procedure.) Arrays can be passed as parameters but not as easily as scalar variables. Later in this chapter (page 57), I will describe the upvar
command that provides the means to accomplish this.
Earlier I described how the single-argument set
command is useful to separate variable names from other adjacent characters. This can be used for arbitrarily complex indirect references. For example, the following commands dynamically form a variable name from the contents of b
and a literal c
character. This result is taken as another variable name, and its contents are assigned to d
.
set xc 1
set b x
set d [set [set b]c]   ;# sets d to value of xc
This type of indirection works with array names too. For example, the following sequence stores an array name in a2 and then retrieves a value through it.
set a(1) foo
set a2 a
puts [set [set a2](1)]
In contrast, replacing either of the set
commands with the $
notation fails. The first of the next two commands incorrectly tries to use a2
as the array name.
tclsh> puts [set $a2(1)]
can't read "a2(1)": variable isn't array
tclsh> puts $[set a2](1)
$a(1)
In the second command, the dollar sign is substituted literally because there is not a variable name immediately following it when it is first scanned.
Tcl provides the info
command to obtain assorted pieces of internal information about Tcl. The most useful of the info
operations is "info exists
“. Given a variable name, "info exists
" returns 1 if the variable exists or 0 otherwise. Only variables accessible from the current scope are checked. For example, the following command shows that haha
has not been defined. An attempt to read it would fail.
tclsh> info exists haha
0
Three related commands are "info locals
“, "info globals
“, and "info vars
“. They return a list of local, global, and all variables respectively. They can be constrained to match a subset by supplying an optional pattern (in the style of the "string match
" command). For example, the following command returns a list of all global variables that begin with the letters "mail
“.
info globals mail*
Tcl has similar commands for testing whether commands and procedures exist. "info commands
" returns a list of all commands. "info procs
" returns a list of just the procedures (commands defined with the proc
command).
"info level
" returns information about the stack. With no arguments, the stack depth is returned. "info level 0
" returns the command and arguments of the current procedure. "info level −1
" returns the command and arguments of the calling procedure of the current procedure. −2 indicates the next previous caller, and so on.
The "info script
" command returns the file name of the current script being executed. This is just one of a number of other information commands that give related types of information useful in only the most esoteric circumstances. See the Tcl reference material for more information.
While the info
command can be used on arrays, Tcl provides some more specialized commands for this purpose. For example, "array size b
" returns the number of elements in an array.
tclsh> set color(pavement) black
black
tclsh> set color(snow) white
white
tclsh> array size color
2
The command "array names
" returns the element names of an array.
tclsh> array names color
pavement snow
Here is a loop to print the elements of the array and their values.
tclsh>foreach name [array names color] {
puts "The color of $name is $color($name)."
}
The color of snow is white.
The color of pavement is black.
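The names come back from "array names" in no particular order, as the output above suggests. If a predictable order matters, one option is to sort the names first with lsort. Here is a sketch of that variation.

```tcl
set color(pavement) black
set color(snow) white

# Sort the element names so the output order is always the same.
foreach name [lsort [array names color]] {
    puts "The color of $name is $color($name)."
}
```

With the names sorted, pavement is always printed before snow.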
The unset
command unsets a variable. After being unset, a variable no longer exists. You can unset scalar variables or entire arrays. For example:
unset a
unset array(elt)
unset array
After a variable is unset, it can no longer be read and "info exists
" returns 0.
Variable accesses can be traced with the trace
command. Using trace
, you can evaluate procedures whenever a variable is accessed. While this is useful in many ways, I will cover trace
in more detail in the discussion on debugging (Chapter 18 (p. 402)) since that is almost always where trace
first shows its value. In that same chapter, I will also describe how to trace commands.
When typing commands interactively, errors cause the interpreter to give up on the current command and reprompt for a new command. All well and good. However, you do not want this to happen while running a script.
While many errors are just the result of typing goofs, some errors are more difficult to avoid and it is easier to react to them “after the fact”. For example, if you write a procedure that does several divisions, code before each division can check that the denominator is not zero. A much easier alternative is to check that the whole procedure did not fail. This is done using the catch
command. catch
evaluates its argument as another command and returns 1 if there was an error or 0 if the procedure returned normally.
Assuming your procedure is named divalot
, you can call it this way:
if [catch divalot] {
    puts "got an error in divalot!"
    exit
}
The argument to catch
is a list of the command and arguments to be evaluated. If your procedure takes arguments, then they must be grouped together. For example:
catch {puts "Hello world"}
catch {divalot some args}
If your procedure returns a value itself, this can be saved by providing a variable name as the second argument to catch
. For example, suppose divalot
normally returns a value of 17 or 18.
tclsh> catch {divalot some args} result
0
tclsh> set result
17
Here, catch
returned 0 indicating divalot
succeeded. The variable result
is set to the value returned by divalot
.
This same mechanism can be used to get the messages produced by an error. For example, you can compute the values of x for the equation 0 = ax^2 + bx + c by using the quadratic formula. In mathematical notation, the quadratic formula looks like this:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$
Here is a procedure for the quadratic formula:
proc qf {a b c} {
    set s [expr sqrt($b*$b-4*$a*$c)]
    set d [expr 2*$a]
    list [expr (-$b+$s)/$d] [expr (-$b-$s)/$d]
}
When run successfully, qf
produces a two-element list of values:
tclsh> catch {qf 1 0 -2} roots
0
tclsh> set roots
1.41421 -1.41421
When run unsuccessfully, this same command records the error in roots
:
tclsh> catch {qf 0 0 -2} roots
1
tclsh> set roots
divide by zero
By using catch
this way, you avoid having to put a lot of error-checking code inside qf
. In this case, there is no need to check for division by zero or taking the square root of a negative number. This simplifies the code.
While it is rarely useful in a script, it is possible to get a description of all the commands and procedures that were in evaluation when an error occurred. This description is stored in the global variable errorInfo
. In the example above, errorInfo
looks like this:
tclsh> set errorInfo
divide by zero
while executing
"expr (-$b+$s)/$d"
invoked from within
"list [expr (-$b+$s)/$d]..."
invoked from within
"return [list [expr (-$b+$s)/$d]..."
(procedure "qf" line 4)
invoked from within
"qf 0 0 -2"
errorInfo
is actually set when the error occurs. You can use errorInfo
whether or not you use catch
to, well, . . . catch the error.
The error
command is used to create error conditions which can be caught with the catch
command. error
is useful inside of procedures that return errors naturally already.
For example, if you wanted to restrict the qf
routine so that a
could not be larger than 100, you could rewrite the beginning of it as:
proc qf {a b c} {
    if {$a > 100} {error "a too large"}
    set s [expr sqrt($b*$b-4*$a*$c)]
    . . .
Now if a
is greater than 100, "catch {qf ...}
" will return 1. The message "a too large
" will be stored in the optional variable name supplied to catch
as the second argument.
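Putting the pieces together, a caller might react to the error as in the sketch below. The full body of qf is repeated so the example runs on its own. The braces around the expr arguments and the variable name result are my additions, not part of the original procedure.

```tcl
proc qf {a b c} {
    if {$a > 100} {error "a too large"}
    set s [expr {sqrt($b*$b-4*$a*$c)}]
    set d [expr {2*$a}]
    list [expr {(-$b+$s)/$d}] [expr {(-$b-$s)/$d}]
}

# catch returns 1 on error and stores the error message in result.
if {[catch {qf 101 0 -2} result]} {
    puts "qf failed: $result"      ;# qf failed: a too large
} else {
    puts "roots: $result"
}
```

The same if command handles both outcomes. On success, result holds the two-element list of roots instead of a message.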
Everything in Tcl is represented as a string. This includes commands. You can manipulate commands just like any other string. Here is an example where a command is stored in a variable.
tclsh> set output "puts"
puts
tclsh> $output "Hello world!"
Hello world!
The variable output
could be used to select between several different forms of output. If this command was embedded inside a procedure, it could handle different forms of output with the same parameterized code. The Tk extension of Tcl uses this technique to manipulate multiple widgets with the same code.
Evaluating an entire command cannot be done the same way. Look what happens:
tclsh> set cmd "puts \"Hello world!\""
puts "Hello world!"
tclsh> $cmd
invalid command name "puts "Hello world!""
The problem is that the entire string in cmd
is taken as the command name rather than a list of a command and arguments.
In order to treat a string as a list of a command name and arguments, the string must be passed to the eval
command. For instance:
tclsh> eval $cmd
Hello world!
The eval
command treats each of its arguments as a list. The elements from all of the lists are used to form a new list that is interpreted as a command. The first element becomes the command name. The remaining elements become the arguments to the command.
The following example uses the arguments "append
“, "v1
“, "a b
“, and "c d
" to produce and evaluate the command "append v1 a b c d
“.
tclsh> eval append v1 "a b" "c d"
abcd
Remember the concat
command from page 42? The eval
command treats its arguments in exactly the same way as concat
. For example, notice how internal space is preserved:
tclsh> eval append v2 {a b} {c {d e}}
abcd e
The list
command will protect any argument from being broken up by eval
.
tclsh> eval append v3 [list {a b}] [list {c {d e}}]
a bc {d e}
When the arguments to eval
are unknown (because they are stored in a variable), it is particularly important to use the list
command. For example, the previous command is more likely to be expressed in a script this way:
eval append somevar [list $arg1] [list $arg2]
Unless you want your arguments to be broken up, surround them with list
commands.
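Here is a sketch contrasting the two styles when the arguments come from variables. The variable names broken and intact, and the argument values, are chosen only for illustration.

```tcl
set arg1 "a b"
set arg2 "c d"

# Start with both result variables undefined.
catch {unset broken}
catch {unset intact}

# Without list, eval splits each argument at its whitespace,
# so append receives four strings: a, b, c, and d.
eval append broken $arg1 $arg2

# With list, each argument reaches append as one string.
eval append intact [list $arg1] [list $arg2]

puts $broken   ;# abcd
puts $intact   ;# a bc d
```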
The eval
command also performs $
substitution and []
evaluation so that the command is handled as if it had originally been typed in and evaluated rather than stored in a variable. In fact, when a script is running, eval
is used internally to break the lines into command names and arguments, and evaluate them. Commands such as if
, while
, and catch
use the eval
command internally to evaluate their command blocks. So the same conventions apply whether you are using eval
explicitly, writing commands in a script, or typing commands interactively.
Again, the list
command will protect against unwanted $
substitution and []
evaluation.
These eval
conventions such as $
substitution and []
evaluation are only done when a command is evaluated. So if you have a string with embedded dollar signs or whitespace, for example, you have to protect it only when it is evaluated.
tclsh> set a "\$foo"
$foo
tclsh> set b $a
$foo
tclsh> set b
$foo
By default, you can only refer to global variables (after using the global
command) or variables declared within a procedure. The upvar
command provides a way to refer to variables in any outer scope. A common use for this is to implement pass by reference. When a variable is passed by reference, the calling procedure can see any changes the called procedure makes to the variable.
The most common use of upvar
is to get access to a variable in the scope of the calling procedure. If a procedure is called with the variable v
as an argument, the procedure associates the caller’s variable v
with a second variable so that when the second variable is changed, the caller’s v
is changed also.
For example, the following command associates the variable name stored in name
with the variable p
.
upvar $name p
After this command, any references to p
also refer to the variable named within name
. If name
contains "v
“, "set p 1
" sets p
to 1 inside the procedure and v
to 1 in the caller of the procedure.
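As a minimal sketch of pass by reference, the hypothetical procedure below (double is my own name, not a Tcl command) doubles whatever variable the caller names.

```tcl
# Hypothetical helper: the argument is a variable NAME, not a value.
proc double {name} {
    upvar $name p            ;# p now refers to the caller's variable
    set p [expr {$p * 2}]
}

set v 21
double v                     ;# pass the name v, not $v
puts $v                      ;# 42
```

Notice that the caller writes "double v" rather than "double $v". Passing $v would hand the procedure the string 21, which upvar would then treat as a variable name.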
The qf
procedure can be rewritten to use upvar
. As originally written, qf
returned a list. This is a little inconvenient because the list always has to be torn apart to get at the two values. Lists are handy when they are long or of unknown size, but they are a nuisance just for handling two values. However, Tcl only allows procedures to return a single value, and a list is the only way to make two values “feel” like one.
Here is another procedure to compute the quadratic formula but written with upvar
. This procedure, called qf2
, writes its results into the caller’s fourth and fifth parameters.
proc qf2 {a b c name1 name2} {
    upvar $name1 r1 $name2 r2
    set s [expr sqrt($b*$b-4*$a*$c)]
    set d [expr $a+$a]
    set r1 [expr (-$b+$s)/$d]
    set r2 [expr (-$b-$s)/$d]
}
The qf2
procedure looks like this when it is called.
tclsh> catch {qf2 1 0 -2 root1 root2}
0
tclsh> set root1
1.41421
tclsh> set root2
-1.41421
A specific caller can be chosen by specifying a level immediately after the command name. Integers describe the number of levels up the procedure call stack. The default is 1 (the calling procedure). If an integer is preceded by a "#
“, then the level is an absolute level with 0 equivalent to the global level. For example, the following associates the global variable curved_intersection_count
with the local variable x
.
upvar #0 curved_intersection_count x
The upvar
command is especially useful for dealing with arrays because arrays cannot be passed by value. (There is no way to refer to the value of an entire array.) However, arrays can be passed by reference.
For example, imagine you want to compute the distance between two points in an xyz-coordinate system. Each point is represented by three numbers. Rather than passing six numbers, it is simpler to pass the entire array. Here is a procedure which computes the distance assuming the numbers are all stored in a single array:
proc distance {name} {
    upvar $name a
    set xdelta [expr $a(x,2) - $a(x,1)]
    set ydelta [expr $a(y,2) - $a(y,1)]
    set zdelta [expr $a(z,2) - $a(z,1)]
    expr {sqrt($xdelta*$xdelta + $ydelta*$ydelta + $zdelta*$zdelta)}
}
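A call to distance might look like the sketch below. The procedure is repeated (with braces added around the expr arguments) so the example is self-contained. The array name pt and the coordinates are invented for illustration.

```tcl
proc distance {name} {
    upvar $name a
    set xdelta [expr {$a(x,2) - $a(x,1)}]
    set ydelta [expr {$a(y,2) - $a(y,1)}]
    set zdelta [expr {$a(z,2) - $a(z,1)}]
    expr {sqrt($xdelta*$xdelta + $ydelta*$ydelta + $zdelta*$zdelta)}
}

# Point 1 is the origin; point 2 is at (1,2,2).
set pt(x,1) 0
set pt(y,1) 0
set pt(z,1) 0
set pt(x,2) 1
set pt(y,2) 2
set pt(z,2) 2

# Pass the array by name, not by value.
puts [distance pt]   ;# 3.0
```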
The uplevel
command is similar in spirit to upvar
. With uplevel
, commands can be evaluated in the scope of the calling procedure. The syntax is similar to eval
. For example, the following command increments x
in the scope of the calling procedure.
uplevel incr x
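To see the effect, here is a sketch with two hypothetical procedures, bump and caller. The variable x exists only in caller, yet bump is able to increment it.

```tcl
# bump has no variable x of its own; it increments the caller's x.
proc bump {} {
    uplevel incr x
}

proc caller {} {
    set x 5
    bump
    return $x
}

puts [caller]   ;# 6
```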
The uplevel
command can be used to create new control structures such as variations on if
and while
or even more powerful constructs. Space does not permit a discussion of this technique, so I refer you to the Tcl reference material.
The following procedure (written by Karl Lehenbauer with a modification by Allan Brighton) provides static variables in the style of C. Like variables declared global
, variables declared static
are accessible from other procedures. However, the same static
variables cannot be accessed by procedures in different files. This can be helpful in avoiding naming collisions between two programmers—both of whom unintentionally choose the same names for global variables that are private to their own files.
proc static {args} {
    set unique [info script]
    foreach var $args {
        uplevel 1 "upvar #0 static($unique:$var) $var"
    }
}
The procedure makes its arguments be references into an array (appropriately called static
). Because of the uplevel
command, all uses of the named variable after the static
call become references into the array. The array elements have the file name embedded in them. This prevents conflicts with similarly-named variables in other files. By setting unique
to "[lindex [info level -1] 0]
“, static
can declare persistent variables that cannot be accessed by any other procedure even in the same file.
If you have significant amounts of Tcl code, you may want to consider even more sophisticated scoping techniques. For instance, [incr Tcl], written by Michael McLennan, is a Tcl extension that supports object-oriented programming in the style of C++. [incr Tcl] provides mechanisms for data encapsulation within well-defined interfaces, greatly increasing code readability while lessening the effort to write such code in the first place. The Tcl FAQ describes how to obtain [incr Tcl]. For more information on how to get the FAQ, see Chapter 1 (p. 20).
Tcl has commands for accessing files. The open command opens a file. The second argument determines how the file should be opened. "r" opens a file for reading; "w" truncates a file and opens it for writing; "a" opens a file for appending (writing without truncation). The second argument defaults to "r".
open "/etc/passwd" "r"
open "/tmp/stuff.[pid]" "w"
The first command opens /etc/passwd for reading. The second command opens a file in /tmp for writing. The process id is used to construct the file name, which is an ideal way to construct unique temporary names.
The open command returns a file identifier. This identifier can be passed to the many other file commands, such as the close command. The close command takes one argument, a file identifier, and closes that open file. Here is an example:
set input [open "/etc/passwd" "r"]    ;# open file
close $input                          ;# close same file
The open command is a good example of a command that is frequently evaluated from a catch command. Attempting to open (for reading) a nonexistent file generates an error. Here is one way to catch it:
if [catch {open $filename} input] {
    puts "$input"
    return
}
By printing the error message from open, this fragment accurately reports any problems related to opening the file. For example, the file might exist yet not allow permission to read it.
The open command may also be used to read from or write to pipelines specified as /bin/sh-like commands. A pipe character ("|") signifies that the remainder of the argument is a command. For example, the following command searches through all the files in the current directory hierarchy and finds each occurrence of the word book. Each matching occurrence can be read as if you were reading it from a plain file.
open "| find . -type f -print | xargs grep book"
The argument to open must be a valid list. Each element in the list becomes a command or argument in the pipeline. If you needed to search for "good book", you could do it in a number of ways. Here are just two:
open "| find . -type f -print | xargs grep \"good book\""
open {| find . -type f -print | xargs grep {good book}}
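Reading from a pipeline then looks just like reading from a file. The following sketch assumes a UNIX-like system with the echo program on the path; echo simply stands in for a more interesting command.

```tcl
# Open a pipeline for reading; the identifier is used like a file's.
set pipe [open "| echo hello from a pipeline"]

set lines {}
while {[gets $pipe line] != -1} {
    lappend lines $line
}
close $pipe

puts $lines
```

Closing the identifier also waits for the commands in the pipeline to finish.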
Once a file is open, you can read from it and write to it.
Use the puts command to write to a file. If you provide a file identifier, puts will write to that file.
set file [open /tmp/stuff w]
puts $file "Hello World"        ;# write to /tmp/stuff
Remember that puts writes to the standard output by default. Sometimes it is convenient to refer to the standard output explicitly. You can do that using the predefined file identifier stdout. (You can also refer to the standard input as stdin and the standard error as stderr.)
The puts command also accepts the argument -nonewline, which skips adding a newline to the end of the line.
puts -nonewline $file "Hello World"
If you are writing strings without newlines to a character special file (such as a terminal), the output will not immediately appear because the I/O system buffers output one line at a time. However, there are strings you want to appear immediately and without a newline. Prompts are good examples. To force them out, use the flush command.
puts -nonewline $file $prompt
flush $file
Use the gets command or read command to read from a file. gets reads a line at a time and is good for text files in simple applications. read is appropriate for everything else.
The gets command takes a file identifier and an optional variable name in which to store the string that was read. When used this way, the length of the string is returned. If the end of the file is reached, -1 is returned.
I frequently read through files with the following code. Each time through the loop, one line is read and stored in the variable line. Any other commands in the loop are used to process each line. The loop terminates when all the lines have been read.
while 1 {
    if {[gets $file line] == -1} break
    # do something with $line
}
The read command is similar to gets, but it does not read input line by line. Instead, read reads a fixed number of characters. It is ideal if you want to process a file a huge chunk at a time. The maximum number to be read is passed as the second argument and the actual characters read are returned. For example, to read 100000 bytes you would use:
set chunk [read $file 100000]
The characters read may be less than the number requested if there are no more characters in the file or if you are reading from a terminal or similar type of special file.
The command eof returns a 0 if the file has more bytes left or a 1 otherwise. This can be used to rewrite the loop above (using gets) to use read.
while {![eof $file]} {
    set buffer [read $file 100000]
    # do something with $buffer
}
If you omit the length, the read command reads the entire file. If the file fits in physical memory[13], you can read things with this form of read much more efficiently than with gets. For example, if you want to process each line in a file, you can write:
foreach line [split [read $file] "\n"] {
    # do something with $line
}
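The two styles of reading can be compared side by side. This sketch writes a two-line scratch file and reads it back both ways; the file name is an assumption, and the cleanup with "file delete" assumes a reasonably recent version of Tcl. One detail worth noticing: a file that ends in a newline yields an empty final element when split, which the -nonewline flag to read avoids.

```tcl
# Create a small scratch file; [pid] keeps the name unique to this process.
set name "/tmp/lines.[pid]"
set file [open $name w]
puts $file "first"
puts $file "second"
close $file

# Style 1: read a line at a time with gets.
set file [open $name r]
set count 0
while {[gets $file line] != -1} {
    incr count
}
close $file

# Style 2: read everything at once, then split into lines.
# -nonewline drops the final newline so split produces no empty last element.
set file [open $name r]
set lines [split [read -nonewline $file] "\n"]
close $file

file delete $name
puts "$count lines via gets, [llength $lines] lines via read"
```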
There are two other commands to manipulate files: seek and tell. They provide random access into files and are analogous to the UNIX lseek and tell system calls. They are rarely used, so I will not describe them further. See the Tcl reference material for more information.
If a file name starts with a tilde character and a user name, the open command translates this to the named user’s home directory. If the tilde is immediately followed by a slash, it is translated to the home directory of the user running the script. This is the same behavior that most shells support.
However, the open command does not do anything special with other metacharacters such as "*" and "?". The following command opens a file with a "*" at the end of its name!
open "/tmp/foo*" "w"
The glob command takes file patterns as arguments and returns the list of files that match. For example, the following command returns the files in the current directory that end with .exp or .c.
glob *.exp *.c
The result of glob can be passed to open (presuming that it only matches one file). An error occurs if no files are matched. Using glob as the source in a foreach loop provides a way of opening each file separately.
foreach filename [glob *.exp] {
    set file [open $filename]
    # do something with $file
    close $file
}
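Since glob generates an error when nothing matches, a loop like this one is often guarded with catch. The following is one possible sketch; the helper name matches is hypothetical.

```tcl
# matches: like glob, but returns an empty list instead of
# generating an error when no files match the pattern.
proc matches {pattern} {
    if {[catch {glob $pattern} files]} {
        return {}
    }
    return $files
}

# The loop body simply never runs if there are no .exp files.
foreach filename [matches *.exp] {
    set file [open $filename]
    # do something with $file
    close $file
}
```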
The characters understood by glob are * (matches anything), ? (matches any single character), [] (matches a set or range of characters), {} (matches a choice of strings), and \ (matches the next character literally). I will not go into details on these; they are similar to matching done by many shells such as csh. Plus I will be talking about most of them in later chapters anyway.
If file names do not begin with a "~" or "/", they are relative to the current directory. The current directory can be set with cd. It is analogous to the cd command in the shell. As in the open command, the tilde convention is supported but all other shell metacharacters are not. There is no built-in directory stack.
tclsh>cd ~libes/bin
tclsh>pwd
/usr/libes/bin
The file command does a large number of different things all related to file names. The first argument names the function and the second argument is the file name to work on.
Four functions are purely textual. The same results can be accomplished with the string functions, but these are very convenient.
The "file dirname" command returns the directory part of the file name. (It returns a "." if there is no slash. It returns a slash if there is only one slash and it is the first character.) For example:
tclsh> file dirname /usr/libes/bin/prog.exp
/usr/libes/bin
The opposite of "file dirname" is "file tail". It returns everything after the last slash. (If there is no slash, it returns the original file name.)
The "file extension" command returns the last dot and anything following it in the file name. (It returns the empty string if there is no dot.) For example:
tclsh> file extension /usr/libes/src/my.prog.c
.c
The opposite of "file extension" is "file rootname". It returns everything but the extension.
tclsh> file rootname /usr/libes/src/my.prog.c
/usr/libes/src/my.prog
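The four textual functions are often combined. As a sketch, the following takes one path apart and then builds the name of a matching object file; the variable names path and object are made up for illustration.

```tcl
set path /usr/libes/src/my.prog.c

puts [file dirname $path]       ;# /usr/libes/src
puts [file tail $path]          ;# my.prog.c
puts [file extension $path]     ;# .c
puts [file rootname $path]      ;# /usr/libes/src/my.prog

# Strip the directory and swap the extension.
set object [file rootname [file tail $path]].o
puts $object                    ;# my.prog.o
```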
While these functions are very useful with file names, they can be used on any string where dots and slashes are separators.
For example, suppose you have an IP address in addr and want to change the last field to the value stored in the variable new. You could use split and join, but the file name manipulation functions do it more easily.
tclsh>set addr
127.0.1.2
tclsh>set new
42
tclsh>set addr [file rootname $addr].$new
127.0.1.42
When you need to construct arbitrary compound names, consider using dots and slashes so that you can use the file name commands. You can also use blanks, of course, in which case you can use the string commands. However, since blanks are used as argument separators, you have to be much more careful when using commands such as eval.
The file command can be used to test for various attributes of a file. Listed below are a number of predicates and their meanings. Each variation returns a 0 if the condition is false for the file or 1 if it is true. Here is an example to test whether the file /tmp/foo exists:
tclsh> file exists /tmp/foo
1
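The predicates are:
| file isdirectory file | true if file is a directory |
| file isfile file | true if file is a plain file |
| file executable file | true if you have permission to execute file |
| file exists file | true if file exists |
| file owned file | true if you own file |
| file readable file | true if you have permission to read file |
| file writable file | true if you have permission to write file |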
All the predicates return 0 if the file does not exist.
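Because each predicate returns 0 or 1, the predicates drop naturally into if commands. Here is a sketch of a hypothetical helper, checkfile, that describes a file using a few of them.

```tcl
# checkfile: return a one-line description of the named file.
proc checkfile {name} {
    if {![file exists $name]} {
        return "$name does not exist"
    }
    if {[file isdirectory $name]} {
        return "$name is a directory"
    }
    if {![file readable $name]} {
        return "$name is not readable"
    }
    return "$name is readable"
}

puts [checkfile /tmp]
puts [checkfile /no/such/file]    ;# prints "/no/such/file does not exist"
```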
While the predicates make it very easy to test whether a file meets a condition, it is occasionally useful to directly ask for file information. Tcl provides a number of commands that do that. Each of these takes a file name as the last argument.
The "file size" command returns the number of bytes in a file. For example:
tclsh> file size /etc/motd
63
The "file atime" command returns the time in seconds when the file was last accessed. The "file mtime" command returns the time in seconds when the file was last modified. The number of seconds is counted starting from January 1, 1970.
The "file type" command returns a string describing the type of the file such as file, directory, characterSpecial, blockSpecial, link, or socket. The "file readlink" command returns the name to which the file points, assuming it is a symbolic link.
The "file stat" command returns the raw values of a file’s inode. Each value is written as an element in an array. The array name is given as the third argument to the file command. For example, the following command writes the information to the array stuff.
file stat /etc/motd stuff
Elements are written for atime, ctime, mtime, type, uid, gid, ino, mode, nlink, dev, and size. These are all written as integers except for the type element which is written as I described before. Most of these values are also accessible more directly by using one of the other arguments to the file command. However, some of the more unusual elements (such as nlink) have no analog. For example, the following command prints the number of links to a file:
file stat $filename fileinfo
puts "$filename has $fileinfo(nlink) links"
If the file is a symbolic link, "file stat" returns information about the file to which it points. The "file lstat" command works similarly to "file stat" except that it returns information about the link itself. See the UNIX stat documentation for more information on stat and lstat.
All of these file information commands require the file to exist or else they will generate an error. This error can be caught using catch.
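For instance, a guarded version of "file size" might look like the following sketch. The file name is deliberately one that should not exist.

```tcl
set name /no/such/file

# catch returns 1 if "file size" generated an error; in that case
# the variable size holds the error message rather than a number.
if {[catch {file size $name} size]} {
    set failed 1
    puts "cannot get size: $size"
} else {
    set failed 0
    puts "$name is $size bytes"
}
```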
Most UNIX commands can be executed by calling exec. The arguments generally follow the /bin/sh conventions including ">", "<", "|", "&", and variations on them. Use whitespace before and after the redirection symbols.
tclsh>exec date
Thu Feb 24 9:32:00 EST 1994
tclsh>exec date | wc -w
6
tclsh>exec date > /tmp/foo
tclsh>exec cat /tmp/foo
Thu Feb 24 9:32:03 EST 1994
Unless redirected, the standard output of the exec command is returned as the result. This enables you to save the output of a program in a variable or use it in another command.
tclsh> puts "The date is [exec date]"
The date is Thu Feb 24 9:32:17 EST 1994
Tcl assumes that UNIX programs return the exit value 0 if successful. Use catch to test whether a program succeeds or not. The following command returns 0 if mv succeeds and 1 if mv fails, which could, for example, indicate that a file did not exist.
catch {exec mv oldname newname}
Many programs return nonzero exit values even if they were successful. For example, diff returns an exit value of 1 when it finds that two files are different. Some UNIX programs are sloppy and return a random exit value which can generate an error in exec. An error is also generated if a program writes to its standard error stream. It is common to use catch with exec to deal with these problems.
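The catch command itself only reports whether an error occurred. When a program exits with a nonzero value, the value is recorded in the global variable errorCode as a list of the form "CHILDSTATUS pid exitvalue". The following sketch assumes a UNIX-like system with sh on the path; "exit 3" simply stands in for a program that fails.

```tcl
if {[catch {exec sh -c "exit 3"} msg]} {
    # Distinguish a nonzero exit value from other kinds of failure,
    # such as the program writing to its standard error.
    if {[lindex $::errorCode 0] eq "CHILDSTATUS"} {
        set status [lindex $::errorCode 2]
        puts "exit value was $status"     ;# prints "exit value was 3"
    } else {
        puts "exec failed: $msg"
    }
}
```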
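Tilde substitution is performed on the command but not on the arguments, and no globbing is done at all. So if you want to delete all the .o files in a directory, for instance, it must be done as follows. (The eval is needed because glob returns a single list; eval breaks the list into separate arguments for rm.)
eval exec rm [glob *.o]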
Beyond the /bin/sh conventions, exec supports special redirections to reference open files. In particular, an @ after a redirection symbol introduces a file identifier returned from open. For example, the following command writes the date to an open file.
set file [open /tmp/foo w]
exec date >@ $file
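A slightly fuller sketch shows the script and the program sharing one file. The file must be open for writing, and it is prudent to flush your own output first so that the lines land in order. The file name and the header text are assumptions, and "file delete" assumes a reasonably recent version of Tcl.

```tcl
# [pid] keeps the scratch file name unique to this process.
set name "/tmp/date.[pid]"
set file [open $name w]

puts $file "The date is:"
flush $file                 ;# push our line out before date writes
exec date >@ $file          ;# date's output goes straight to the open file
close $file

# Read the file back to confirm that both lines are there.
set file [open $name r]
set lines [split [read -nonewline $file] "\n"]
close $file
file delete $name

puts [llength $lines]       ;# prints 2
```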
The exec command has a number of other esoteric features. See the reference documentation for more information.
The global array env is pre-initialized so that each element corresponds to an environment variable. For example, the path is a list of directories to search for executable programs. From the shell, the path is stored in the variable PATH. When using Tcl, the path is contained in env(PATH). It is manipulated just like any other variable.
tclsh>set env(PATH)
/usr/local/bin:/usr/bin:/bin
tclsh>set env(PATH) ".:$env(PATH)"    ;# prepend current dir
.:/usr/local/bin:/usr/bin:/bin
Modifications to the env array do not affect the parent environment, but new processes that are created (using exec, for instance) will inherit the current values (including any new elements that have been created).
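The inheritance is easy to demonstrate. In this sketch, GREETING is a made-up variable name, and a UNIX-like system with sh on the path is assumed.

```tcl
# Create a new environment variable from Tcl.
set env(GREETING) hello

# The braces keep Tcl from substituting $GREETING itself;
# the child shell looks it up in the environment it inherited.
set out [exec sh -c {echo $GREETING}]
puts $out           ;# prints hello
```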
The unknown command is called when another command is executed which is not known to the interpreter. Rather than simply issuing an error message, this gives you the opportunity to handle the problem and recover in an intelligent way. For example, you could attempt to re-evaluate the arguments as an expression. This would allow you to evaluate expressions without using the expr command.
set a [1+1]
To make unknown do what you want, simply define it as a procedure. The list of arguments is available as a parameter to the unknown command. Here is a definition of unknown which supports expression evaluation without having to specify the expr command:
proc unknown {args} { expr $args }
By default, Tcl comes with a definition for unknown that does a number of things such as attempt history substitution. I will only go into detail on the most useful action that unknown takes: retrieving procedure definitions from libraries.
By default, the unknown command tries to find procedure definitions in a library. A library is simply a file that contains procedure definitions. Libraries can be explicitly read using the source command. However, it is possible to prepare an index file which describes the library contents in such a way that Tcl knows which library to load based on the command name. Once a library is loaded, the unknown command calls the new procedure just defined. After the procedure completes, unknown completes and it appears as if the procedure had been defined all along.
As an example, one of Tcl’s default libraries defines the parray procedure. parray prints out the contents of an array. It is a parameterized version of the code on page 51. The info command shows that parray is not defined before it is invoked, but it is defined afterwards.
tclsh>info command parray
tclsh>parray color
color(pavement) = black
color(snow) = white
tclsh>info command parray
parray
You can add procedures to the libraries or create new libraries. See the Tcl reference material for more information on using libraries.
This chapter has covered most of the Tcl commands and data structures. I will expand on a few of these descriptions later in the book, but for the most part, you have now seen the entire Tcl language.
Even though Tcl is a small language, it is capable of handling very large and sophisticated scripts. However, Tcl was originally designed for writing small scripts with most of the work being done in the underlying commands themselves. Indeed, Tcl supports the ability to add additional commands written in other languages such as C and C++. This is useful for commands that must be very fast or do something unusual (such as the Expect commands do).
Fortunately, the need to resort to implementing your own commands is growing increasingly rare. People have already written commands for just about anything you can imagine. They are packaged into collections called extensions and are available from the Tcl archive. I have already mentioned [incr Tcl] which provides commands for object-oriented programming. Another popular extension is TclX, which provides commands for most of the UNIX system and library calls. There are a variety of extensions to support different databases (e.g., SQL, Oracle, Dbm). And there are many extensions to support graphics (e.g., SIPP, GL, PHIGS). These extensions and others are described in the Tcl FAQ (page 20). In Chapter 22 (p. 507), I will describe how to add existing extensions to Expect.
If none of these extensions provides what you are looking for, you can always write your own. Tcl has always supported this way of adding new commands and it is surprisingly easy to do. If you are interested in learning more about this, I recommend Ousterhout’s Tcl and the Tk Toolkit.
Is Tcl like any other language you know? Bourne shell? C shell? Lisp? C?
As best as you can remember (or guess), write down the precedence table for Tcl expressions. Now look it up in the reference material. How close were you? Repeat this exercise with Perl, C, Lisp, and APL.
What is the best thing about Tcl? What is the worst thing about Tcl? (That bad, eh?)
Try putting comments where they do not belong—for instance, inside the arguments to a procedure. What happens?
Write a procedure to reverse a string. If you wrote an iterative solution, now write a recursive solution or vice versa.
Repeat the previous exercise but with a list instead of a string.
Write a procedure to rename all the files in a directory ending with .c to names ending in ".cc".
Write a procedure that takes a list of variable names and a list of values, and sets each variable in the list to the respective value in the other list. Think of different alternatives to handle the case when the lists are of different lengths.
Write a procedure that creates a uniquely-named temporary file.
Write a procedure that can define other procedures that automatically have access to global variables.