Chapter 2. Bash Primer

Bash is more than just a simple command-line interface for running programs. It is a programming language in its own right. Its default operation is to launch other programs. As we said earlier, when several words appear on the command line, bash assumes that the first word is the name of the program to launch and the remaining words are the arguments to pass to that program.

But as a programming language, it also has features to support input and output, and control structures such as if, while, for, case, and more. Its basic data type is strings (such as filenames and pathnames) but it also supports integers. Because its focus is on scripts and launching programs and not on numerical computation, it doesn’t directly support floating-point numbers, though other commands can be used for that. Here, then, is a brief look at some of the features that make bash a powerful programming language, especially for scripting.

Output

As with any programming language, bash has the ability to output information to the screen. Output can be achieved by using the echo command:

$ echo "Hello World"

Hello World

You may also use the printf built-in command, which allows for additional formatting:

$ printf "Hello World\n"

Hello World

You have already seen (in the previous chapter) how to redirect that output to files or to stderr or, via a pipe, into another command. You will see much more of these commands and their options in the pages ahead.
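As a rough sketch of the additional formatting that printf allows, the following uses a format string with a %s (string) and a %d (integer) placeholder, each with a field width. The sample values are made up for illustration.

```shell
#!/bin/bash
# %-10s left-justifies a string in a 10-character field;
# %5d right-justifies an integer in a 5-character field.
# Unlike echo, printf adds a newline only where \n appears.
printf "%-10s|%5d|\n" "alice" 42
# prints: alice     |   42|
```

The vertical bars are there only to make the field widths visible.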

Variables

Bash variables begin with an alphabetic character or underscore followed by alphanumeric characters. They are string variables unless declared otherwise. To assign a value to the variable, you write something like this:

MYVAR=textforavalue

To retrieve the value of that variable—for example, to print out the value by using the echo command—you use the $ in front of the variable name, like this:

echo $MYVAR

If you want to assign a series of words to the variable, that is, to preserve any whitespace, use quotation marks around the value, as follows:

MYVAR='here is a longer set of words'
OTHRV="either double or single quotes will work"

The use of double quotes will allow other substitutions to occur inside the string. For example:

firstvar=beginning
secondvr="this is just the $firstvar"
echo $secondvr

This results in the following output:

this is just the beginning

A variety of substitutions can occur when retrieving the value of a variable; we show those as we use them in the scripts to follow.

Warning

Remember that by using double quotes ("), any substitutions that begin with the $ will still be made, whereas inside single quotes (') no substitutions of any sort are made.
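The difference between the two kinds of quotes can be seen side by side in a short sketch. The variable names DOUBLE and SINGLE are chosen here only for illustration.

```shell
#!/bin/bash
firstvar=beginning

# Double quotes: $firstvar is replaced by its value.
DOUBLE="this is just the $firstvar"

# Single quotes: the characters $firstvar are kept literally.
SINGLE='this is just the $firstvar'

echo "$DOUBLE"    # this is just the beginning
echo "$SINGLE"    # this is just the $firstvar
```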

You can also store the output of a shell command by using $( ) as follows:

CMDOUT=$(pwd)

That executes the command pwd in a subshell, and rather than printing the result to stdout, it will store the output of the command in the variable CMDOUT. You can also pipe together multiple commands within the $( ).
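A minimal sketch of both forms follows: one captures a single command, the other a small pipeline. The variable name LINECOUNT and the three sample lines are assumptions made for the example.

```shell
#!/bin/bash
# Capture the output of a single command.
CMDOUT=$(pwd)

# Capture the output of a pipeline: wc -l counts the lines that
# printf produces. The three sample lines are arbitrary.
LINECOUNT=$(printf 'one\ntwo\nthree\n' | wc -l)

echo "current directory: $CMDOUT"
echo "lines counted: $LINECOUNT"
```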

Positional Parameters

It is common when using command-line tools to pass data into the commands by using arguments or parameters. Parameters are separated by whitespace and are accessed inside bash by using a special set of identifiers. In a bash script, the first parameter passed into the script can be accessed using $1, the second using $2, and so on. $0 is a special parameter that holds the name of the script, and $# returns the total number of parameters. Take a look at the script in Example 2-1:

Example 2-1. echoparams.sh
#!/bin/bash -
#
# Cybersecurity Ops with bash
# echoparams.sh
#
# Description:
# Demonstrates accessing parameters in bash
#
# Usage:
# ./echoparams.sh <param 1> <param 2> <param 3>
#

echo $#
echo $0
echo $1
echo $2
echo $3

This script first prints out the number of parameters ($#), then the name of the script ($0), and then the first three parameters. Here is the output:

$ ./echoparams.sh bash is fun

3
./echoparams.sh
bash
is
fun
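Quoting changes how words are grouped into parameters. The sketch below uses the set -- built-in, which is not part of the example above, to assign positional parameters inside the script so that it can run without command-line arguments. The variable names COUNT and SECOND are illustrative.

```shell
#!/bin/bash
# "set --" replaces the current positional parameters.
# The words themselves are arbitrary.
set -- bash "is fun" today

COUNT=$#      # 3, because the quoted "is fun" stays one parameter
SECOND=$2     # is fun

echo "number of parameters: $COUNT"
echo "second parameter: $SECOND"
```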

Input

User input is received in bash by using the read command. The read command obtains user input from stdin and stores it in a specified variable. The following script reads user input into the MYVAR variable and then prints it to the screen:

read MYVAR
echo "$MYVAR"

You have already seen (in the previous chapter) how to redirect that input to come from files. You will see much more of read and its options, and of this redirecting, in the pages ahead.
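The read command can also fill more than one variable at a time. In this sketch the input comes from a here-string (<<<), a bash feature used here only so that the example runs without anyone typing; the -r option tells read to leave backslashes alone. The names FIRST and REST are illustrative.

```shell
#!/bin/bash
# With several variable names, read splits the line into words;
# the last variable receives whatever is left over.
read -r FIRST REST <<< "bash is fun"

echo "first word: $FIRST"    # first word: bash
echo "remainder: $REST"      # remainder: is fun
```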

Conditionals

Bash has a rich variety of conditionals. Many, but not all, begin with the keyword if.

Any command or program that you invoke in bash may produce output, but it also always returns a success or fail value. In the shell, this value can be found in the $? variable immediately after a command has run. A return value of 0 is considered “success” or “true”; any nonzero value is considered “error” or “false.” The simplest form of the if statement uses this fact. It takes the following form:

if cmd
then
   some cmds
else
   other cmds
fi

Warning

Using 0 for true and nonzero for false is the exact opposite of many programming languages (C++, Java, Python, to name a few). But it makes sense for bash because a program that fails should return an error code (to explain how it failed), whereas a success would have no error code, that is, 0. This reflects the fact that many operating system calls return 0 if successful or -1 (or other nonzero value) if an error occurs. But there is an exception to this rule in bash for values inside double parentheses (more on that later).

For example, the following script attempts to change directories to /tmp. If that command is successful (returns 0), the body of the if statement will execute.

if  cd /tmp
then
    echo "here is what is in /tmp:"
    ls -l
fi
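To see return values directly, the following sketch runs the standard true and false commands, which do nothing except succeed and fail, and saves $? after each. The variable names are chosen for illustration.

```shell
#!/bin/bash
true
STATUS_OK=$?       # 0 means success

false
STATUS_FAIL=$?     # nonzero means failure

echo "true returned $STATUS_OK, false returned $STATUS_FAIL"

# The if statement tests the return value directly; no brackets needed.
if false
then
    BRANCH="then"
else
    BRANCH="else"
fi
echo "if false takes the $BRANCH branch"
```

Note that $? must be saved right away, because the next command that runs overwrites it.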

Bash can even handle a pipeline of commands in a similar fashion:

if ls | grep pdf
then
    echo "found one or more pdf files here"
else
    echo "no pdf files found"
fi

With a pipeline, it is the success/failure of the last command in the pipeline that determines if the “true” branch is taken. Here is an example where that fact matters:

ls | grep pdf | wc

This series of commands will be “true” even if no pdf string is found by the grep command. That is because the wc command (a word count of the input) will succeed and print the following:

0       0       0

That output indicates zero lines, zero words, and zero bytes (characters) when no output comes from the grep command. That is still a successful (thus true) result for wc, not an error or failure. It counted as many lines as it was given, even if it was given zero lines to count.
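If the result of an earlier stage matters, one option (not otherwise covered in this chapter) is bash's PIPESTATUS array, which holds the return value of each command in the most recent pipeline. The sketch below is a minimal illustration; the sample filenames and the array name STATUSES are assumptions.

```shell
#!/bin/bash
# No sample line contains "pdf", so grep fails while wc succeeds.
printf 'notes.txt\nimage.png\n' | grep pdf | wc -l
# PIPESTATUS must be copied immediately, before another command resets it.
STATUSES=("${PIPESTATUS[@]}")

echo "printf=${STATUSES[0]} grep=${STATUSES[1]} wc=${STATUSES[2]}"
```

Here grep reports 1 (no match) even though the pipeline as a whole is considered successful.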

A more typical form of if used for comparison makes use of the compound command [[ or the shell built-in command [ or test. Use these to test file attributes or to make comparisons of value.

To test whether a file exists on the filesystem:

if [[ -e $FILENAME ]]
then
    echo $FILENAME exists
fi

Table 2-1 lists additional tests that can be done on files by using if comparisons.

Table 2-1. File test operators
File test operator    Use
-d                    Test if a directory exists
-e                    Test if a file exists
-r                    Test if a file exists and is readable
-w                    Test if a file exists and is writable
-x                    Test if a file exists and is executable
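A few of these file tests can be tried safely in a scratch directory. This is only a sketch: mktemp -d creates the temporary directory, and the names SCRATCH, FILE, and sample.txt are illustrative.

```shell
#!/bin/bash
SCRATCH=$(mktemp -d)
FILE="$SCRATCH/sample.txt"
touch "$FILE"            # creates an empty, non-executable file

IS_DIR=no
if [[ -d $SCRATCH ]]
then
    IS_DIR=yes
fi

READABLE=no
if [[ -r $FILE ]]
then
    READABLE=yes
fi

EXECUTABLE=no
if [[ -x $FILE ]]
then
    EXECUTABLE=yes
fi

echo "directory: $IS_DIR, readable: $READABLE, executable: $EXECUTABLE"
rm -r "$SCRATCH"         # clean up the scratch directory
```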

To test whether the variable $VAL is less than the variable $MIN:

if [[ $VAL -lt $MIN ]]
then
    echo "value is too small"
fi

Table 2-2 lists additional numeric tests that can be done using if comparisons.

Table 2-2. Numeric test operators
Numeric test operator    Use
-eq                      Test for equality between numbers
-gt                      Test if one number is greater than another
-lt                      Test if one number is less than another

Warning

Be cautious of using the less-than symbol (<). Take the following code:

if [[ $VAL < $OTHR ]]

In this context, the less-than operator uses lexical (alphabetical) ordering. That means that 12 is less than 2, because they alphabetically sort in that order (just as a < b, so 1 < 2, but also 12 < 2anything).

If you want to do numerical comparisons with the less-than sign, use the double-parentheses construct. It assumes that the variables are all numerical and will evaluate them as such. Empty or unset variables are evaluated as 0. Inside the parentheses, you don’t need the $ operator to retrieve a value, except for positional parameters like $1 and $2 (so as not to confuse them with the constants 1 and 2). For example:

if (( VAL < 12 ))
then
    echo "value $VAL is too small"
fi

Warning

Inside the double parentheses, a more numerical (C/Java/Python) logic plays out. Any nonzero value is considered “true,” and only zero is “false”—the reverse of all the other if statements in bash. For example, if (( $? )) ; then echo "previous command failed" ; fi will do what you would want/expect—if the previous command failed, then $? will contain a nonzero value; inside the (( )), the nonzero value will be true and the then branch will run.
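The following sketch puts the two kinds of comparison next to each other, using the values 12 and 2 from the earlier warning, and then checks how (( )) treats zero. The result variable names are illustrative.

```shell
#!/bin/bash
VAL=12
OTHR=2

# Inside [[ ]], < compares as strings, so "12" sorts before "2".
if [[ $VAL < $OTHR ]]
then
    LEXICAL="less"
else
    LEXICAL="not less"
fi

# Inside (( )), < compares as numbers, so 12 is not less than 2.
if (( VAL < OTHR ))
then
    NUMERIC="less"
else
    NUMERIC="not less"
fi

# Inside (( )), zero by itself is false and any nonzero value is true.
if (( 0 ))
then
    ZERO="true"
else
    ZERO="false"
fi

echo "lexical: $LEXICAL, numeric: $NUMERIC, (( 0 )) is $ZERO"
```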

In bash, you can even make branching decisions without an explicit if/then construct. Commands are typically separated by a newline—that is, they appear one per line. You can get the same effect by separating them with a semicolon. If you write cd $DIR ; ls, bash will perform the cd and then the ls.

Two commands can also be separated by either && or || symbols. If you write cd $DIR && ls, the ls command will run only if the cd command succeeds. Similarly, if you write cd $DIR || echo cd failed, the message will be printed only if the cd fails.

You can use the [[ syntax to make various tests, even without an explicit if:

[[ -d $DIR ]] && ls "$DIR"

That means the same as if you had written the following:

if [[ -d $DIR ]]
then
  ls "$DIR"
fi

Warning

When using && or ||, you need to group multiple statements if you want more than one action within the then clause. For example:

[[ -d $DIR ]] || echo "error: no such directory: $DIR" ; exit

This will always exit, whether or not $DIR is a directory.

What you probably want is this:

[[ -d $DIR ]] || { echo "error: no such directory: $DIR" ; exit ; }

Here, the braces will group both statements together.
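The effect of the braces can be observed without ending the script by substituting a variable assignment for exit. This sketch assumes that /tmp exists, so the directory test succeeds and the error branch should be skipped entirely; the names UNGROUPED and GROUPED are illustrative.

```shell
#!/bin/bash
DIR=/tmp     # assumed to exist

# Without braces: the assignment after the semicolon always runs.
UNGROUPED=no
[[ -d $DIR ]] || echo "error: no such directory: $DIR" ; UNGROUPED=yes

# With braces: both statements are skipped because the test succeeded.
GROUPED=no
[[ -d $DIR ]] || { echo "error: no such directory: $DIR" ; GROUPED=yes ; }

echo "ungrouped assignment ran: $UNGROUPED"
echo "grouped assignment ran: $GROUPED"
```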

Looping

Looping with a while statement is similar to the if construct in that it can take a single command or a pipeline of commands for the decision of true or false. It can also make use of the brackets or parentheses as in the previous if examples.

In some languages, braces (the { } characters) are used to group the statements together that are the body of the while loop. In others, such as Python, indentation is the indication of which statements are the loop body. In bash, however, the statements are grouped between two keywords: do and done.

Here is a simple while loop:

i=0
while (( i < 1000 ))
do
    echo $i
    let i++
done

The preceding loop will execute while the variable i is less than 1,000. Each time the body of the loop executes, it will print the value of i to the screen. It then uses the let command to execute i++ as an arithmetic expression, thus incrementing i by 1 each time.
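The same pattern can accumulate a result rather than just print a counter. As a small sketch, this loop adds up the numbers 1 through 10; the variable name SUM is illustrative.

```shell
#!/bin/bash
i=1
SUM=0
while (( i <= 10 ))
do
    let SUM+=i     # add the current value of i to the running total
    let i++        # then move on to the next number
done
echo "sum of 1 through 10 is $SUM"
```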

Here is a more complicated while loop that executes commands as part of its condition:

while ls | grep -q pdf
do
    echo -n 'there is a file with pdf in its name here: '
    pwd
    cd ..
done

A for loop is also available in bash, in three variations.

Simple numerical looping can be done using the double-parentheses construct. It looks much like the for loop in C or Java, but with double parentheses and with do and done instead of braces:

for ((i=0; i < 100; i++))
do
    echo $i
done

Another useful form of the for loop is used to iterate through all the parameters that are passed to a shell script (or function within the script)—that is, $1, $2, $3, and so on. Note that ARG in args.sh can be replaced with any variable name of your choice:

Example 2-2. args.sh
for ARG
do
    echo here is an argument: $ARG
done

Here is the output of Example 2-2 when three parameters are passed in:

$ ./args.sh bash is fun

here is an argument: bash
here is an argument: is
here is an argument: fun
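The same form of the loop works inside a function, where it walks through the function's own arguments. In this sketch, showargs and COUNT are hypothetical names; the comment shows the longer spelling that the short form stands for.

```shell
#!/bin/bash
function showargs ()
{
    COUNT=0
    for ARG            # same as: for ARG in "$@"
    do
        echo "here is an argument: $ARG"
        let COUNT+=1
    done
}

# The quoted words arrive as a single argument.
showargs bash "is fun"
echo "loop ran $COUNT times"
```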

Finally, for an arbitrary list of values, use a similar form of the for statement and simply name each of the values you want for each iteration of the loop. That list can be explicitly written out, like this:

for VAL in 20 3 dog peach 7 vanilla
do
    echo $VAL
done

The values used in the for loop can also be generated by calling other programs or using other shell features:

for VAL in $(ls | grep pdf) {0..5}
do
    echo $VAL
done

Here the variable VAL will take, in turn, the name of each file listed by ls whose name contains the letters pdf (e.g., doc.pdf or notapdfile.txt) and then each of the numbers 0 through 5. It may not be that sensible to have the variable VAL be a filename sometimes and a single digit other times, but this shows you that it can be done.

Note

The braces can be used to generate a sequence of numbers (or single characters) {first..last..step}, where the ..step can be positive or negative but is optional. In recent versions of bash, a leading 0 will cause numeric values to be zero-padded to the same width. For example, the sequence {090..104..2} will expand into the even numbers from 090 to 104 inclusive, with all values zero-padded to three digits wide.
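A few sequences can be expanded with echo to see what the braces generate. This sketch assumes a bash new enough to support the step and zero-padding features (version 4 or later); the variable names are illustrative.

```shell
#!/bin/bash
UP=$(echo {1..5})                # counting up
DOWN=$(echo {10..0..5})          # counting down in steps of 5
LETTERS=$(echo {a..e})           # single characters
PADDED=$(echo {090..104..2})     # zero-padded to three digits

echo "$UP"         # 1 2 3 4 5
echo "$DOWN"       # 10 5 0
echo "$LETTERS"    # a b c d e
echo "$PADDED"     # 090 092 094 096 098 100 102 104
```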

Functions

You define a function with syntax like this:

function myfun ()
{
  # body of the function goes here
}

Not all of that syntax is necessary. You can use either the function keyword or the (); you don’t need both. We recommend using both, and will do so throughout, mostly for readability.

There are a few important considerations to keep in mind with bash functions:

  • Unless declared with the local built-in command inside the function, variables are global in scope. A for loop that sets and increments i could be messing with the value of i used elsewhere in your code.

  • The braces are the most commonly used grouping for the function body, but any of the shell’s compound command syntax is allowed—though why, for example, would you want the function to run in a subshell?

  • Redirecting input/output (I/O) on the braces does so for all the statements inside the function. Examples of this will be seen in upcoming chapters.

  • No parameters are declared in the function definition. Whatever and however many arguments are supplied on the invocation of the function are passed to it.

The function is called (invoked) just as any command is called in the shell. Having defined myfun as a function, you can call it like this:

myfun 2 /arb "14 years"

This calls the function myfun, supplying it with three arguments.
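The scoping consideration listed above is easy to demonstrate. In this sketch, the function names leaky and tidy are hypothetical: both assign 99 to a variable named i, but only one of them declares it local.

```shell
#!/bin/bash
function leaky ()
{
    i=99            # no local: this overwrites the caller's i
}

function tidy ()
{
    local i=99      # local: visible only inside this function
}

i=1
tidy
AFTER_TIDY=$i       # still 1
leaky
AFTER_LEAKY=$i      # now 99

echo "after tidy: $AFTER_TIDY, after leaky: $AFTER_LEAKY"
```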

Function Arguments

Inside the function definition, arguments are referred to in the same way as parameters to the shell script—as $1, $2, etc. Realize that this means that they “hide” the parameters originally passed to the script. If you want access to the script’s first parameter, you need to store $1 into a variable before you call the function (or pass it as a parameter to the function).

Other variables are set accordingly too. $# gives the number of arguments passed to the function, whereas normally it gives the number of arguments passed to the script itself. The one exception to this is $0, which doesn’t change in the function. It retains its value as the name of the script (and not of the function).
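A short sketch can show which values change inside a function and which do not. The function name inner and the INNER_ variables are hypothetical, and set -- is used only to stand in for arguments given on the command line.

```shell
#!/bin/bash
set -- script-arg-one script-arg-two

function inner ()
{
    INNER_COUNT=$#     # counts the function's arguments, not the script's
    INNER_FIRST=$1     # the function's first argument
    INNER_ZERO=$0      # unchanged: still the name of the script
}

inner alpha beta gamma

echo "script: count=$# first=$1 name=$0"
echo "inner: count=$INNER_COUNT first=$INNER_FIRST name=$INNER_ZERO"
```

After the function returns, $1 and $# once again refer to the script's own parameters.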

Returning Values

Functions, like commands, should return a status—a 0 if all goes well, and a nonzero value if an error has occurred. To return other kinds of values (pathnames or computed values, for example), you can set a variable to hold that value, because those variables are global unless declared local within the function. Alternatively, you can send the result to stdout; that is, print the answer. Just don’t try to do both.

Warning

If your function prints the answer, you will want to use that output as part of a pipeline of commands (e.g., myfunc args | next step | etc ), or you can capture the output like this: RETVAL=$( myfunc args ) . In both cases, the function will be run in a subshell and not in the current shell. Thus, changes to any global variables will be effective only in that subshell and not in the main shell instance. They are effectively lost.
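The lost update can be observed with a function that both prints a result and changes a global variable. This is a sketch only; addone, CALLS, and CALLS_AFTER_CAPTURE are hypothetical names.

```shell
#!/bin/bash
CALLS=0
function addone ()
{
    let CALLS+=1           # side effect on a global variable
    echo $(( $1 + 1 ))     # the result is printed to stdout
}

RETVAL=$( addone 41 )      # the function runs in a subshell
CALLS_AFTER_CAPTURE=$CALLS # still 0: the increment was lost
echo "RETVAL=$RETVAL CALLS=$CALLS_AFTER_CAPTURE"

addone 41 > /dev/null      # the function runs in the current shell
echo "CALLS=$CALLS"        # now 1
```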

Pattern Matching in bash

When you need to name a lot of files on a command line, you don’t need to type each and every name. Bash provides pattern matching (sometimes called wildcarding) to allow you to specify a set of files with a pattern.

The easiest wildcard is simply an asterisk (*) or star, which will match any number of any character. When used by itself, therefore, it matches all files in the current directory (except hidden files, whose names begin with a dot). The asterisk also can be used in conjunction with other characters. For example, *.txt matches all the files in the current directory that end with the four characters .txt. The pattern /usr/bin/g* will match all the files in /usr/bin that begin with the letter g.

Another special character in pattern matching is the question mark (?), which matches a single character. For example, source.? will match source.c or source.o but not source.py or source.cpp.

The last of the three special pattern-matching characters are the square brackets: [ ]. A match can be made with any one of the characters listed inside the square brackets, so the pattern x[abc]y matches any or all of the files named xay, xby, or xcy, assuming they exist. You can specify a range within the square brackets, like [0-9] for all digits. If the first character within the brackets is either an exclamation point (!) or a caret (^), then the pattern means anything other than the remaining characters in the brackets. For example, [aeiou] would match a vowel, whereas [^aeiou] would match any character (including digits and punctuation characters) except the vowels.

Similar to ranges, you can specify character classes within the square brackets. Table 2-3 lists the character classes and their descriptions.

Table 2-3. Pattern-matching character classes
Character class    Description
[:alnum:]          Alphanumeric
[:alpha:]          Alphabetic
[:ascii:]          ASCII
[:blank:]          Space and tab
[:cntrl:]          Control characters
[:digit:]          Number
[:graph:]          Anything other than control characters and space
[:lower:]          Lowercase
[:print:]          Anything other than control characters
[:punct:]          Punctuation
[:space:]          Whitespace including line breaks
[:upper:]          Uppercase
[:word:]           Letters, numbers, and underscore
[:xdigit:]         Hexadecimal

Character classes are specified like [:cntrl:] but within square brackets (so you have two sets of brackets). For example, the pattern *[[:punct:]]jpg will match any filename that has any number of any characters followed by a punctuation character, followed by the letters jpg. So it would match files named wow!jpg or some,jpg or photo.jpg but not a file named this.is.myjpg, because there is no punctuation character right before the jpg.
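These patterns can be tried in a scratch directory populated with the filenames mentioned above. The sketch assumes mktemp -d is available for creating the directory; SCRATCH, PUNCT, and ONECHAR are illustrative names. Each cd happens inside parentheses or $( ), so the script's own working directory never changes.

```shell
#!/bin/bash
SCRATCH=$(mktemp -d)
( cd "$SCRATCH" && touch 'wow!jpg' 'some,jpg' photo.jpg this.is.myjpg source.c source.py )

# Punctuation character immediately before jpg: three names qualify.
PUNCT=$( cd "$SCRATCH" && echo *[[:punct:]]jpg )

# Exactly one character after the dot: only source.c qualifies.
ONECHAR=$( cd "$SCRATCH" && echo source.? )

echo "$PUNCT"      # photo.jpg some,jpg wow!jpg
echo "$ONECHAR"    # source.c
rm -r "$SCRATCH"   # clean up the scratch directory
```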

More-complex aspects of pattern matching are available if you turn on the shell option extglob (like this: shopt -s extglob) so that you can repeat patterns or negate patterns. We won’t need these in our example scripts, but we encourage you to learn about them (e.g., via the bash man page).

There are a few things to keep in mind when using shell pattern matching:

  • Patterns aren’t regular expressions (discussed later); don’t confuse the two.

  • Patterns are matched against files in the filesystem; if the pattern begins with a pathname (e.g., /usr/lib ), the matching will be done against files in that directory.

  • If the pattern matches no files, the shell will use the special pattern-matching characters as literal characters of the filename. For example, if your script indicates echo data > /tmp/*.out, but there is no file in /tmp that ends in .out, then the shell will create a file called *.out in the /tmp directory. Remove it like this: rm /tmp/\*.out, using the backslash to tell the shell not to pattern-match with the asterisk.

  • No pattern matching occurs inside quotes (either double or single quotes), so if your script says echo data > "/tmp/*.out", it will create a file called /tmp/*.out (which we recommend you avoid doing).
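The literal fallback for an unmatched pattern can be seen in an empty scratch directory. The sketch also shows one way to change that behavior, the nullglob shell option, which is not discussed in this chapter; the option is set inside $( ) so it should not affect the rest of the script. The variable names are illustrative.

```shell
#!/bin/bash
SCRATCH=$(mktemp -d)     # empty directory, so *.out matches nothing

# Default behavior: the unmatched pattern is passed along literally.
DEFAULT=$( cd "$SCRATCH" && echo *.out )

# With nullglob set, an unmatched pattern expands to nothing instead.
WITHNULL=$( cd "$SCRATCH" && shopt -s nullglob && echo *.out )

echo "default: '$DEFAULT'"      # default: '*.out'
echo "nullglob: '$WITHNULL'"    # nullglob: ''
rm -r "$SCRATCH"
```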

Note

The dot, or period, is just an ordinary character and has no special meaning in shell pattern matching—unlike in regular expressions, which are discussed later.

Writing Your First Script—Detecting Operating System Type

Now that we have gone over the fundamentals of the command line and bash, you are ready to write your first script. The bash shell is available on a variety of platforms, including Linux, macOS, and Windows (for example, through Git Bash). As you write more-complex scripts in the future, it is imperative that you know what operating system you are interacting with, as each one has a slightly different set of commands available. The osdetect.sh script, shown in Example 2-3, helps you in making that determination.

The general idea of the script is that it will look for a command that is unique to a particular operating system. The limitation is that on any given system, an administrator may have created and added a command with that name, so this is not foolproof.

Example 2-3. osdetect.sh
#!/bin/bash -
#
# Cybersecurity Ops with bash
# osdetect.sh
#
# Description:
# Distinguish between MS-Windows/Linux/macOS
#
# Usage: bash osdetect.sh
#   output will be one of: Linux MSWin macOS
#

if type -t wevtutil &> /dev/null           # (1)
then
    OS=MSWin
elif type -t scutil &> /dev/null           # (2)
then
    OS=macOS
else
    OS=Linux
fi
echo $OS

(1) We use the type built-in in bash to tell us what kind of command (alias, keyword, function, built-in, or file) its arguments are. The -t option tells it to print nothing if the command isn’t found. The command returns as “false” in that case. We redirect all the output (both stdout and stderr) to /dev/null, thereby throwing it away, as we want to know only whether the wevtutil command was found.

(2) Again, we use the type built-in, but this time we are looking for the scutil command, which is available on macOS systems.
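To get a feel for what type -t reports, the following sketch asks about a built-in, a keyword, and a name that is assumed not to exist on the system. The name no_such_command_here is made up for this purpose, and the variable names are illustrative.

```shell
#!/bin/bash
# cd is a shell built-in and if is a shell keyword in bash.
KIND_CD=$(type -t cd)
KIND_IF=$(type -t if)

# For an unknown name, type -t prints nothing and returns "false".
if type -t no_such_command_here &> /dev/null
then
    FOUND=yes
else
    FOUND=no
fi

echo "cd: $KIND_CD, if: $KIND_IF, made-up command found: $FOUND"
```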

Summary

The bash shell can be seen as a programming language, one with variables and if/then/else statements, loops, and functions. It has its own syntax, similar in many ways to other programming languages, but just different enough to catch you if you’re not careful.

It has its strengths—such as easily invoking other programs or connecting sequences of other programs. It also has its weaknesses: it doesn’t have floating-point arithmetic or much support (though some) for complex data structures.

Tip

There is so much more to learn about bash than we can cover in a single chapter. We recommend reading the bash man page (repeatedly) and also considering the bash Cookbook by Carl Albing and JP Vossen (O’Reilly).

Throughout this book, we describe and use many commands and bash features in the context of cybersecurity operations. We further explore some of the features touched on here, and other more advanced or obscure features. Keep your eyes out for those features, and practice and use them for your own scripting.

In the next chapter, we explore regular expressions, which are an important component of many of the commands we discuss throughout the book.

Workshop

  1. Experiment with the uname command, seeing what it prints on the various operating systems. Rewrite the osdetect.sh script to use the uname command, possibly with one of its options. Caution: not all options are available on every operating system.

  2. Modify the osdetect.sh script to use a function. Put the if/then/else logic inside the function and then call it from the script. Don’t have the function itself produce any output. Make the output come from the main part of the script.

  3. Set the permissions on the osdetect.sh script to be executable (see man chmod) so that you can run the script without using bash as the first word on the command line. How do you now invoke the script?

  4. Write a script called argcnt.sh that tells how many arguments are supplied to the script.

    1. Modify your script to have it also echo each argument, one per line.

    2. Modify your script further to label each argument like this:

    $ bash argcnt.sh this is a "real live" test
    
    there are 5 arguments
    arg1: this
    arg2: is
    arg3: a
    arg4: real live
    arg5: test
    $

  5. Modify argcnt.sh so it lists only the even arguments.

Visit the Cybersecurity Ops website for additional resources and the answers to these questions.
