Bash scripts are easily human readable, which is a feature of the language by design. Readability is a desirable attribute for most applications, but not so for penetration testing. In most cases, you do not want your target to be able to easily read or reverse engineer your tools when performing offensive operations. To counter that, you can use obfuscation.
Obfuscation is a suite of techniques used to make something purposely difficult to read or understand. There are three main methods for obfuscating scripts:
Obfuscate the syntax
Obfuscate the logic
Encode or encrypt
We look at each of these methods in detail in the sections that follow.
We introduce base64 for data conversions and the eval command to execute arbitrary command statements.
The base64 command is used to encode data using the Base64 format.
For additional information on Base64 encoding, see RFC 4648.
-d: Decode Base64-encoded data
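As a quick illustration, the following sketch encodes a string and then decodes it again. It assumes the GNU coreutils version of base64, where -d selects decoding; the sample string and the variable names are arbitrary choices made for this example.

```shell
#!/bin/bash
# Encode a string to Base64, then decode it back.
# Assumes GNU coreutils base64 (-d = decode).

plain='uname -a'

encoded=$(printf '%s' "$plain" | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)

echo "encoded: $encoded"
echo "decoded: $decoded"
```

Keep in mind that Base64 is an encoding, not encryption: anyone can reverse it without a key.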
The eval command executes the arguments given to it in the context of the current shell. For example, you can provide shell commands and arguments in the format of a string to eval, and it will execute it as if it were a shell command. This is particularly useful when dynamically constructing shell commands within a script.
In this example, we dynamically concatenate a shell command with an argument and execute the result in the shell by using the eval command:
$ commandOne="echo"
$ commandArg="Hello World"
$ eval "$commandOne $commandArg"
Hello World
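The two commands can also be combined. The sketch below hides a command as a Base64 string, then decodes it and hands the result to eval. It assumes GNU coreutils base64; the function name run_hidden and the command string are made up for illustration.

```shell
#!/bin/bash
# Sketch: hide a command as Base64, then decode and run it with eval.
# Assumes GNU coreutils base64; run_hidden is a hypothetical helper.

hidden=$(printf '%s' 'echo "Hello World"' | base64)

# Decode into a string, then execute that string in the current shell.
run_hidden() {
    local cmd
    cmd=$(printf '%s' "$1" | base64 -d)
    eval "$cmd"
}

run_hidden "$hidden"
```

The decoded text never appears in the script itself, but recovering it takes a single base64 -d, so this only deters a casual reader.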
Obfuscating the syntax of a script aims to purposely make it difficult to read—in other words, make it look ugly. To accomplish this, throw out any best practice you have ever learned about writing well-formatted and readable code. Example 14-1 provides a sample of well-formatted code.
#!/bin/bash -
#
# Cybersecurity Ops with bash
# readable.sh
#
# Description:
# Simple script to be obfuscated
#

if [[ $1 == "test" ]]
then
    echo "testing"
else
    echo "not testing"
fi

echo "some command"
echo "another command"
In bash, you can place the entire script on one line, separating commands by using a semicolon (;) instead of a newline. Example 14-2 shows the same script with all of its commands on one line (the line may wrap when displayed).
#!/bin/bash -
#
# Cybersecurity Ops with bash
# oneline.sh
#
# Description:
# Demonstration of one-line script obfuscation
#

if [[ $1 == "test" ]]; then echo "testing"; else echo "not testing"; fi; echo "some command"; echo "another command"
Although this might not look that bad for the preceding simple script, imagine a script that was a few hundred or a few thousand lines of code. If the entire script was written in one line, it would make understanding it quite difficult without reformatting.
Another technique for obfuscating syntax is to make variable and function names as nondescript as possible. In addition, you can reuse names as long as it is for different types and scopes. Example 14-3 shows a sample:
#!/bin/bash -
#
# Cybersecurity Ops with bash
# synfuscate.sh
#
# Description:
# Demonstration of syntax script obfuscation
#

a ()
{
    local a="Local Variable a"
    echo "$a"
}

a="Global Variable a"
echo "$a"

a
Example 14-3 includes three different items, all of them named a:
A function named a
A local variable named a
A global variable named a
Using nondescript naming conventions and reusing names where possible makes following the code difficult, particularly for larger code bases. To make things even more confusing, you can combine this with the earlier technique of placing everything on one line:
#!/bin/bash -
a(){ local a="Local Variable a";echo "$a";};a="Global Variable a";echo "$a";a
Lastly, when obfuscating the syntax of scripts, be sure to remove all comments. You do not want to give the analyst reverse engineering the code any hints.
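Comment removal is easy to automate, at least in part. The following is a minimal sketch, not a complete tool: it drops comment-only and blank lines from a script read on stdin and keeps the shebang. The function name strip_comments is our own. It deliberately leaves trailing comments alone, because telling them apart from a # inside a string requires real parsing, and it would also remove lines starting with # inside a here-document.

```shell
#!/bin/bash
# Sketch: remove comment-only lines and blank lines from stdin,
# keeping a shebang on the first line. strip_comments is a hypothetical name.

strip_comments() {
    local line first=1
    local re='^[[:space:]]*(#|$)'      # comment-only or empty line
    while IFS= read -r line || [[ -n $line ]]
    do
        if (( first )) && [[ $line == '#!'* ]]
        then
            printf '%s\n' "$line"
        elif [[ ! $line =~ $re ]]
        then
            printf '%s\n' "$line"
        fi
        first=0
    done
}

strip_comments <<'EOF'
#!/bin/bash -
#
# Description: demo
#
a="Global Variable a"   # trailing comments are not touched

echo "$a"
EOF
```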
Another technique is to obfuscate the logic of the script. The idea here is to make the script difficult to follow logically. The script still performs the same function in the end, but it does so in a roundabout way. This technique does incur an efficiency and size penalty for the script.
Here are a few things you can do to obfuscate logic:
Use nested functions.
Add functions and variables that don’t do anything that is critical to the functionality of the script.
Write if statements with multiple conditions, where only one might matter.
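As a small illustration of the last item, here is a sketch of an if statement with three conditions where only the first one decides the outcome. The function name check and the variables q, r, and s are invented for this example.

```shell
#!/bin/bash
# Sketch: an if statement with several conditions, only one of which matters.
# The second and third tests are always true, so they are pure noise.

check() {
    local q="$1"
    local r=$(( 4 * 2 ))    # always 8
    local s="${#r}"         # length of the string "8", always 1

    if [[ $q == "test" && $r -gt 3 && $s -lt 2 ]]
    then
        echo "testing"
    else
        echo "not testing"
    fi
}

check "test"
check "anything else"
```

A reader has to work out that r and s are constants before realizing that the condition reduces to a single string comparison.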
Example 14-4 is a script that implements some of the logic obfuscation techniques. Take a look at it and see if you can figure out what the script is doing before reading the explanation.
#!/bin/bash -
#
# Cybersecurity Ops with bash
# logfuscate.sh
#
# Description:
# Demonstration of logic obfuscation
#

f="$1"

a() (
    b()
    {
        f="$(($f+5))"
        g="$(($f+7))"
        c
    }

    b
)

c() (
    d()
    {
        g="$(($g-$f))"
        f="$(($f-2))"
        echo "$f"
    }

    f="$(($f-3))"
    d
)

f="$(($f+$2))"
a
Here is a step-by-step explanation of what the script is doing, in order of execution:
The value of the first argument is stored in variable f.
The value of the second argument is added to the current value of f and the result is stored in f.
Function a is called.
Function b is called.
Adds 5 to the value of f and stores the result in f.
Adds 7 to the value of f and stores the result in variable g.
Function c is called.
Subtracts 3 from the value of f and stores the result in f.
Function d is called.
Subtracts f from the value of g and stores the result in g.
Subtracts 2 from the value of f and stores the result in f.
Prints the value of f to the screen.
So, what does the script do in totality? It simply accepts two command-line arguments and adds them together. The entire script could be replaced by this:
echo "$(($1+$2))"
The script uses nested functions that do little or nothing other than call additional functions. Useless variables and computation are also used. Multiple computations are done with variable g, but they never actually affect the output of the script.
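One way to convince yourself that the two versions are equivalent is to run them side by side. The sketch below wraps the obfuscated logic in a function (in compressed form) and compares it with plain addition for a few inputs. The function names obfuscated and simple, and the test values, are our own choices.

```shell
#!/bin/bash
# Sketch: check that the obfuscated logic matches plain addition.
# obfuscated and simple are hypothetical names used only in this example.

obfuscated() (
    f="$1"
    a() ( b() { f="$((f+5))"; g="$((f+7))"; c; }; b )
    c() ( d() { g="$((g-f))"; f="$((f-2))"; echo "$f"; }; f="$((f-3))"; d )
    f="$((f+$2))"
    a
)

simple() { echo "$(($1+$2))"; }

for pair in "1 2" "10 -4" "0 0" "123 877"
do
    read -r x y <<< "$pair"
    if [[ $(obfuscated "$x" "$y") == $(simple "$x" "$y") ]]
    then
        echo "$x + $y: match"
    else
        echo "$x + $y: MISMATCH"
    fi
done
```

The net effect on f is +5, -3, and -2, which cancel out, so only the addition of the two arguments survives.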
There are limitless ways to obfuscate the logic of your script. The more convoluted you make the script, the more difficult it will be to reverse engineer.
Syntax and logic obfuscation are typically done after a script is written and tested. To make this easier, consider creating a script whose purpose is to obfuscate other scripts using the techniques described.
One of the most effective methods to obfuscate a script is to encrypt it with a wrapper. This not only makes reverse engineering difficult, but if done correctly, the script will not even be able to be run by anyone unless they have the proper key. However, this technique does come with a fair amount of complexity.
Cryptography is the science and principles of rendering information into a secure, unintelligible form for storage or transmission. It is one of the oldest forms of information security, dating back thousands of years.
A cryptographic system, or cryptosystem, comprises five basic components:
Plaintext: The original intelligible message
Encryption function: The method used to transform the original intelligible message into its secure unintelligible form
Decryption function: The method used to transform the secure unintelligible message back into its original intelligible form
Cryptographic key: Secret code used by the function to encrypt or decrypt
Ciphertext: The unintelligible encrypted message
Encryption is the process of transforming an original intelligible message (plaintext) into its secure unintelligible form (ciphertext). To encrypt, a key is required, which is to be kept secret and be known only by the person performing the encryption or the intended recipients of the message. Once encrypted, the resulting ciphertext will be unreadable except to those with the appropriate key.
Decryption is the process of transforming an encrypted unintelligible message (ciphertext) back into its intelligible form (plaintext). As with encryption, the correct key is required to decrypt and read the message. A ciphertext message cannot be decrypted unless the correct key is used.
The cryptographic key used to encrypt the plaintext message is critical to the overall security of the system. The key should be protected, remain secret at all times, and be shared only with those intended to decrypt the message.
Modern cryptosystems have keys ranging in length from 128 bits to 4,096 bits. Generally, the larger the key size, the more difficult it is to break the security of the cryptosystem.
Encryption will be used to secure the main (or inner) script so it cannot be read by a third party without the use of the correct key. Another script, known as a wrapper, will be created, containing the inner encrypted script stored in a variable. The primary purpose of the wrapper script is to decrypt the encrypted inner script and execute it when the proper key is provided.
The first step in this process is to create the script that you want to obfuscate. Example 14-5 will serve this purpose.
echo "This is an encrypted script"
echo "running uname -a"
uname -a
Once you have created the script, you then need to encrypt it. You can use the OpenSSL tool to do that. OpenSSL is available by default in many Linux distributions and is included with Git Bash. In this case, we will use the Advanced Encryption Standard (AES) algorithm, which is considered a symmetric-key algorithm because the same key is used for both encryption and decryption. To encrypt the file:
openssl aes-256-cbc -base64 -in innerscript.sh -out innerscript.enc -pass pass:mysecret
The aes-256-cbc argument specifies the 256-bit version of AES. The -in option specifies the file to encrypt, and -out specifies the file to which to output the ciphertext. The -base64 option specifies the output to be Base64 encoded. The Base64 encoding is important and is needed because of the way the ciphertext will be used later. Lastly, the -pass option is used to specify the encryption key.
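To see the options working together, the following sketch encrypts a short script and decrypts it again entirely through pipes, without touching the disk. It assumes an openssl binary is on the PATH; the helper names encrypt and decrypt are ours. Newer OpenSSL releases may print a key-derivation warning on stderr for this form of the command, which the sketch silences.

```shell
#!/bin/bash
# Sketch: round-trip a script through openssl aes-256-cbc using pipes.
# encrypt and decrypt are hypothetical helper names; $1 is the password.

inner='echo "This is an encrypted script"'

encrypt() {
    openssl aes-256-cbc -base64 -pass pass:"$1" 2>/dev/null
}

decrypt() {
    openssl aes-256-cbc -base64 -d -pass pass:"$1" 2>/dev/null
}

cipher=$(printf '%s\n' "$inner" | encrypt mysecret)
plain=$(printf '%s\n' "$cipher" | decrypt mysecret)

echo "$cipher"
echo "$plain"
```

Because OpenSSL adds a random salt, the ciphertext differs on every run even with the same password, while the decrypted text stays the same.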
The output from OpenSSL, which is the encrypted version of innerscript.sh, is as follows:
U2FsdGVkX18WvDOyPFcvyvAozJHS3tjrZIPlZM9xRhz0tuwzDrKhKBBuugLxzp7T
MoJoqx02tX7KLhATS0Vqgze1C+kzFxtKyDAh9Nm2N0HXfSNuo9YfYD+15DoXEGPd
Now that the inner script is encrypted and in Base64 format, you can write a wrapper for it. The primary job of the wrapper is to decrypt the inner script (given the correct key), and then execute the script. Ideally, this should all occur in main memory. You want to avoid writing the unencrypted script to the hard drive, as it might be found later. Example 14-6 shows the wrapper script.
#!/bin/bash -
#
# Cybersecurity Ops with bash
# wrapper.sh
#
# Description:
# Example of executing an encrypted "wrapped" script
#
# Usage:
# wrapper.sh
#     Enter the password when prompted
#

encrypted='U2FsdGVkX18WvDOyPFcvyvAozJHS3tjrZIPlZM9xRhz0tuwzDrKhKBBuugLxzp7T
MoJoqx02tX7KLhATS0Vqgze1C+kzFxtKyDAh9Nm2N0HXfSNuo9YfYD+15DoXEGPd'

read -s word

innerScript=$(echo "$encrypted" | openssl aes-256-cbc -base64 -d -pass pass:"$word" )

eval "$innerScript"
The first assignment stores the encrypted inner script in a variable called encrypted. The reason we Base64-encoded the OpenSSL output earlier is so that it can be included inside the wrapper.sh script. If your encrypted script is very large, you can also consider storing it in a separate file, but in that case, you will need to upload two files to the target system.
The read command reads the decryption key into the variable word. The -s option is used so the user input is not echoed to the screen.
The next line pipes the encrypted script into OpenSSL for decryption. The result is stored in the variable innerScript.
The last line executes the code stored in innerScript by using the eval command.
When the program is executed, it first prompts the user to enter the decryption key. As long as the correct key (same one used for encryption) is entered, the inner script will be decrypted and executed:
$ ./wrapper.sh
This is an encrypted script
running uname -a
MINGW64_NT-6.3 MySystem 2.9.0(0.318/5/3) 2017-10-05 15:05 x86_64 Msys
The use of encryption has two significant advantages over syntax and logic obfuscation:
It is mathematically secure and essentially unbreakable so long as a good encryption algorithm and a sufficiently long key are used. The syntax and logic obfuscation methods are not unbreakable and merely cause an analyst to spend more time reverse engineering the script.
Someone trying to reverse engineer the inner script cannot even execute the script without knowing the correct key.
One weakness with this method is that when the script is executing, it is stored in an unencrypted state in the computer’s main memory. The unencrypted script could possibly be extracted from main memory by using appropriate forensic techniques.
The preceding encryption method works great if OpenSSL is installed on the target system, but what do you do if it is not installed? You can either install OpenSSL on the target, which could be noisy and increase operational risk, or you can create your own implementation of a cryptographic algorithm inside your script.
In most cases, you should never create your own cryptographic algorithm, or even attempt to implement an existing one such as AES. You should instead use industry-standard algorithms and implementations that have been reviewed by the cryptographic community.
In this case, we will implement an algorithm for operational necessity and to demonstrate fundamental cryptographic principles, but realize that it should not be considered strong encryption or secure.
The algorithm that we will use has a few basic steps and is easy to implement. It is a basic stream cipher that uses a random number generator to create a key that is the same length as the plain text to be encrypted. Next, each byte (character) of the plain text is exclusive-or’ed (XOR) with the corresponding byte of the key (random number). The output is the encrypted ciphertext. Table 14-1 illustrates how to use the XOR method to encrypt the plain-text echo.
Plain text | e | c | h | o
ASCII (hex) | 65 | 63 | 68 | 6f
Key (hex) | ac | 27 | f2 | d9
XOR | - | - | - | -
Ciphertext (hex) | c9 | 44 | 9a | b6
To decrypt, simply XOR the ciphertext with the exact same key (sequence of random numbers), and the plain text will be revealed. Like AES, this is considered a symmetric-key algorithm. Table 14-2 illustrates how to use the XOR method to decrypt a ciphertext.
Ciphertext (hex) | c9 | 44 | 9a | b6
Key (hex) | ac | 27 | f2 | d9
XOR | - | - | - | -
ASCII (hex) | 65 | 63 | 68 | 6f
Plain text | e | c | h | o
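The XOR arithmetic is easy to reproduce with bash's own ^ operator. The sketch below encrypts a short string against a fixed list of key bytes and then decrypts the result with the same bytes. The key bytes are the ones from the tables; the function names xor_encrypt and xor_decrypt are ours, and the sketch only handles input up to the length of the key.

```shell
#!/bin/bash
# Sketch: XOR each character with a key byte, print hex, then reverse it.
# xor_encrypt and xor_decrypt are hypothetical names for this example.

key=(0xac 0x27 0xf2 0xd9)

xor_encrypt() {             # $1 = plain text, prints a hex string
    local txt="$1" i byte
    for ((i = 0; i < ${#txt}; i++))
    do
        printf -v byte '%d' "'${txt:i:1}"      # character -> ASCII code
        printf '%02x' $(( byte ^ key[i] ))
    done
    echo
}

xor_decrypt() {             # $1 = hex string, prints the plain text
    local hex="$1" i code esc
    for ((i = 0; i < ${#hex}; i += 2))
    do
        code=$(( 0x${hex:i:2} ^ key[i / 2] ))
        printf -v esc '\\x%02x' "$code"        # build a \xHH escape
        printf '%b' "$esc"
    done
    echo
}

cipher=$(xor_encrypt "echo")
echo "$cipher"
xor_decrypt "$cipher"
```

Applying XOR twice with the same key byte returns the original value, which is why one operation serves for both directions.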
In order for this to work properly, you need to have the same key to decrypt the ciphertext that was used to encrypt it. That can be done by using the same seed value for the random number generator. If you run the same random number generator, using the same starting seed value, it should generate the same sequence of random numbers. Note that the security of this method is highly dependent on the quality of the random number generator you are using. Also, you should choose a large seed value and should use a different value to encrypt each script.
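The effect of the seed can be checked directly. This sketch seeds bash's RANDOM variable and collects a few values, then repeats the process; the function name keystream and the seed values are arbitrary. The exact numbers produced can differ between bash versions, so encrypt and decrypt with the same version of the shell.

```shell
#!/bin/bash
# Sketch: the same seed replays the same sequence of RANDOM values.
# keystream is a hypothetical helper: $1 = seed, $2 = how many numbers.

keystream() {
    local i out=""
    RANDOM="$1"
    for ((i = 0; i < $2; i++))
    do
        out+="$RANDOM "
    done
    echo "$out"
}

first=$(keystream 25624 5)
second=$(keystream 25624 5)
other=$(keystream 1776 5)

[[ $first == "$second" ]] && echo "same seed, same sequence"
[[ $first != "$other" ]] && echo "different seed, different sequence"
```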
Here’s an example of how you might run this script. You specify the encryption key as the argument (in this case, 25624). The input is a single phrase, the Linux command uname -a, and the output, the encryption of this phrase, is a sequence of hex digits all run together:
$ bash streamcipher.sh 25624
uname -a
5D2C1835660A5822
$
To test, you can decrypt right after encrypting to see if you get the same result:
$ bash streamcipher.sh 25624 | bash streamcipher.sh -d 25624
uname -a
uname -a
$
The first uname -a is the input to the encrypting script; the second is the output from the decrypting script. It worked!
The script in Example 14-7 reads in a specified file and then encrypts or decrypts the file by using the XOR method and the key provided by the user.
#!/bin/bash -
#
# Cybersecurity Ops with bash
# streamcipher.sh
#
# Description:
# A lightweight implementation of a stream cipher
# Pedagogical - not recommended for serious use
#
# Usage:
# streamcipher.sh [-d] <key> < inputfile
#   -d     Decrypt mode
#   <key>  Numeric key
#
#

source ./askey.sh

#
# Ncrypt - Encrypt - reads in characters
#                    outputs 2digit hex #s
#
function Ncrypt ()
{
    TXT="$1"
    for((i=0; i< ${#TXT}; i++))
    do
        CHAR="${TXT:i:1}"
        RAW=$(asnum "$CHAR")    # " " needed for space (32)
        NUM=${RANDOM}
        COD=$(( RAW ^ ( NUM & 0x7F )))
        printf "%02X" "$COD"
    done
    echo
}

#
# Dcrypt - DECRYPT - reads in a 2digit hex #s
#                    outputs characters
#
function Dcrypt ()
{
    TXT="$1"
    for((i=0; i< ${#TXT}; i=i+2))
    do
        CHAR="0x${TXT:i:2}"
        RAW=$(( $CHAR ))
        NUM=${RANDOM}
        COD=$(( RAW ^ ( NUM & 0x7F )))
        aschar "$COD"
    done
    echo
}

if [[ -n $1 && $1 == "-d" ]]
then
    DECRYPT="YES"
    shift
fi

KEY=${1:-1776}
RANDOM="${KEY}"
while read -r
do
    if [[ -z $DECRYPT ]]
    then
        Ncrypt "$REPLY"
    else
        Dcrypt "$REPLY"
    fi
done
The source statement reads in the specified file, and it becomes part of the script. In this instance, it contains the definitions for two functions, asnum and aschar, which we will use later in the code.
The Ncrypt function will take a string of text as its first (and only) argument and encrypt each character, printing out the encrypted string.
It loops for the length of the string, taking the ith character on each pass.
When we reference that one-character string, we put it in quotes in case that character is a space (ASCII 32) that the shell might otherwise just ignore as whitespace.
Inside the double parentheses, we don’t need the $ in front of variable names as we would elsewhere in the script. The variable RANDOM is a special shell variable that will return a random number (integer) between 0 and 32,767 (7FFF hex). We use the bitwise AND operator (&) to clear out all but the lower 7 bits.
We print the new, encoded value as a zero-padded, two-digit hexadecimal number.
This echo will print a newline at the end of the line of hex digits.
The Dcrypt function will be called to reverse the action of the encryption.
The input for decrypting is hex digits, so we take two characters at a time.
We build a substring with the literal 0x followed by the two-character substring of the input text.
Having built a hex number in the format that bash understands, we can just evaluate it as a mathematical expression (using the dollar-double-parens), and bash will return its value. You could write it as follows:
$(( $CHAR + 0 ))
This emphasizes the fact that we are doing a mathematical evaluation, but it adds needless overhead.
Our algorithm for encoding and decoding is the same. We take a random number and exclusive-or it with our input. The sequence of random numbers must be the same as when we encrypted our message, so we need to use the same seed value.
The aschar function converts the numerical value into an ASCII character, printing it out. (Remember, this is a user-defined function, not part of bash.)
The -n test checks that the first argument is not null; if it is not null, the second test checks whether it is the -d option, which indicates that we want to decode (rather than encode) a message. If so, the script sets a flag to check later.
The shift discards that -d option so the next argument, if any, now becomes the first argument, $1.
The first argument, if any, is assigned to the variable KEY. If no argument is specified, we will use 1776 as the default value.
By assigning a value to RANDOM, we set the seed for the sequence of (pseudo-) random numbers that will be produced by each reference to the variable.
The -r option on the read command disables the special meaning of the backslash character. That way, if our text has a backslash, it is just taken as a literal backslash, no different than any other character. We need to preserve the leading (and trailing) whitespace on the lines that we read in. If we specify one or more variable names on the read command, the shell will try to parse the input into words in order to assign the words to the variables we specify. By not specifying any variable names, the input will be kept in the shell built-in variable REPLY. Most important for our use here, it won’t parse the line, so it preserves the leading and trailing whitespace. (Alternately, you could specify a variable name but precede the read with an IFS="" to defeat any parsing into words, thereby preserving the whitespace.)
The if statement checks whether the flag is set (if the variable is empty or not) to decide which function to call: Dcrypt or Ncrypt. In either case, it passes in the line just read from stdin, putting it in quotes to keep the entire line as a single argument and preserving any whitespace in the line of text (really needed only for the Ncrypt case).
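The whitespace behavior of read described above is easy to observe. The following sketch reads the same line three ways and prints each result between brackets; the variable names named and kept are arbitrary.

```shell
#!/bin/bash
# Sketch: compare how read treats leading and trailing whitespace.

line='   two  words   '

read -r named <<< "$line"           # a variable name: edges are trimmed
read -r <<< "$line"                 # no name: the raw line lands in REPLY
IFS="" read -r kept <<< "$line"     # a name, but IFS is emptied for this read

printf '[%s]\n' "$named" "$REPLY" "$kept"
```

Only the first form loses the surrounding spaces, which would change the ciphertext if the plain text began or ended with whitespace.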
The first command in streamcipher.sh uses the source built-in to include external code from the file askey.sh. That file contains the aschar and asnum functions as shown in Example 14-8.
# functions to convert decimal to ascii and vice-versa

# aschar - print the ascii character representation
#          of the number passed in as an argument
# example: aschar 65 ==> A
#
function aschar ()
{
    local ashex
    printf -v ashex '\\x%02x' "$1"
    printf '%b' "$ashex"
}

# asnum - print the ascii (decimal) number
#         of the character passed in as $1
# example: asnum A ==> 65
#
function asnum ()
{
    printf '%d' "\"$1"
}
Two rather obscure features of printf are in use here, one for each function.
We begin with a local variable, so as not to mess with any variables in a script that might source this file.
This call to printf takes the function parameter ($1) and prints it as a hex value in the format \xHH, where HH is a zero-padded two-digit hexadecimal number. The first two characters, the leading backslash and x, are needed for the next call. But this string is not printed to stdout. The -v option tells printf to store the result in the shell variable specified (we specified ashex).
We now take the string in ashex and print it by using the %b format. This format tells printf to print the argument as a string but to interpret any escape sequences found in the string. You typically see escape sequences (such as \n for newline) only in the format string. If they appear in an argument, they are treated like plain characters. But using the %b format tells printf to interpret those sequences in the parameter. For example, the first and third printf statements here print a newline (a blank line), whereas the second will print only the two characters backslash and n:
printf "\n"
printf "%s" "\n"
printf "%b" "\n"
The escape sequence we’re using for this aschar function is one that takes a hex number, denoted by the sequence backslash-x (\x) and a two-digit hex value, and prints the ASCII character corresponding to that number. That’s why we took the decimal number passed into the function and printed it into the variable ashex, in the format of this escape sequence. The result is the ASCII character.
Converting from a character to a number is simpler. We print the character as a decimal number by using printf. The printf function would normally give an error if we tried to print a string as a number, so we put a double quote character in front of the character in the argument. We escaped that double quote (using a backslash) to tell the shell that we want a literal double quote character; this is not the start of a quoted string. What does that do for us? Here’s what the POSIX standard for the printf command says:
If the leading character is a single-quote or double-quote, the value shall be the numeric value in the underlying codeset of the character following the single-quote or double-quote.
The Open Group Base Specifications Issue 7, 2018 edition, IEEE Std 1003.1-2017 (Revision of IEEE Std 1003.1-2008), Copyright © 2001-2018 IEEE and The Open Group
The askey.sh file gives you two functions, asnum and aschar, so that you can convert back and forth between ASCII and integer values. You may find them useful in other scripts, which is one reason why we didn’t just define them as part of the streamcipher.sh script. As a separate file, you can source them into other scripts as needed.
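The two printf features can also be tried on their own, outside of any function. This short sketch shows the leading-quote rule with both quote characters and then the %b conversion; the variable name ashex mirrors the one used in aschar, and the value 65 is just a sample.

```shell
#!/bin/bash
# Sketch: the two printf tricks used by asnum and aschar, in isolation.

# A leading quote character makes printf report the character's code.
printf '%d\n' "'A"          # single-quote form
printf '%d\n' "\"A"         # escaped double-quote form, as in asnum

# Build a \xHH escape in a variable, then let %b interpret it.
printf -v ashex '\\x%02x' 65
printf '%b\n' "$ashex"
```

The first two commands print the same number, and the last one turns that number back into the character it came from.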
Obfuscating the content of a script is an important step in maintaining operational security during a penetration test. The more-sophisticated techniques you use, the more difficult it will be for someone to reverse engineer your toolset.
In the next chapter, we explore how to identify possible vulnerabilities in scripts and executables by building a fuzzer.
Look again at streamcipher.sh and consider this: If you output, when encrypting, not a hex number but the ASCII character represented by that hex number, then the output would be one character for each character of input. Would you need a separate “decode” option for the script, or could you just run the exact same algorithm? Modify the code to do that.
There is a basic flaw in this approach, though not with the encryption algorithm. Think about what that might be—what wouldn’t work and why.
Obfuscate the following script by using the techniques described earlier to make it difficult to follow.
#!/bin/bash -

for args
do
    echo $args
done
Encrypt the preceding script, and create a wrapper by using OpenSSL or streamcipher.sh.
Write a script that reads in a script file and outputs an obfuscated version of it.
Visit the Cybersecurity Ops website for additional resources and the answers to these questions.