The OS interface and the command line

Generally, the operating system's shell starts applications with several pieces of information that constitute the OS API:

  • The shell provides each application with its collection of environment variables. In Python, these are accessed through os.environ.
  • The shell prepares three standard files. In Python, these are mapped to sys.stdin, sys.stdout, and sys.stderr. There are some other modules, such as fileinput, that can provide access to sys.stdin.
  • The command line is parsed by the shell into words. Parts of the command line are available in sys.argv. For POSIX operating systems, the shell may replace shell environment variables and glob wildcard filenames. In Windows, the simple cmd.exe shell will not glob filenames for us.
  • The OS also maintains context settings, such as the current working directory, user identity, and user group information, among many other things. These are available through the os module. They aren't provided as arguments on the command line.
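
As a rough illustration, the following sketch gathers one item from each of these sources. The os_context() and report() names, and the choice of the HOME variable, are assumptions made for this example; they aren't part of any standard API.

```python
import os
import sys


def os_context():
    """Collect a sample of what the OS provides to a running Python process."""
    return {
        # Environment variables: os.environ behaves like a dict of strings.
        "home": os.environ.get("HOME"),  # None if the variable is not set
        # Command-line words, as parsed by the shell.
        "argv": list(sys.argv),
        # Context settings kept by the OS, not passed as arguments.
        "cwd": os.getcwd(),
        # os.getuid() exists on POSIX only, so fall back to None elsewhere.
        "uid": os.getuid() if hasattr(os, "getuid") else None,
    }


def report(context, out=None, err=None):
    """Write results to standard output and diagnostics to standard error."""
    # Look the streams up at call time so redirection is respected.
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    for name, value in context.items():
        print(f"{name}: {value!r}", file=out)
    print(f"reported {len(context)} items", file=err)
```

Calling report(os_context()) writes the four items to sys.stdout and a one-line summary to sys.stderr, so the two streams can be redirected separately by the shell.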

The OS expects an application to provide a numeric status code when it terminates. If we want to return a specific numeric code, we can use sys.exit() in our applications. The os module defines a number of constants, such as os.EX_OK, for status codes with common meanings; these constants are available on Unix only. Python will return zero if our program terminates normally, one if the program ends with an unhandled exception, and two if the command-line arguments were invalid.
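
One common way to organize this, shown here as a minimal sketch, is to have a main() function return the status code and leave the call to sys.exit() to the last line of the script. The main() function, its usage message, and the choice of EX_USAGE and EX_NOINPUT are assumptions made for illustration.

```python
import os
import sys

# The os.EX_* constants are defined on Unix only, so fall back to the
# conventional sysexits.h values when they are missing.
EX_OK = getattr(os, "EX_OK", 0)
EX_USAGE = getattr(os, "EX_USAGE", 64)
EX_NOINPUT = getattr(os, "EX_NOINPUT", 66)


def main(argv):
    """Return a status code instead of calling sys.exit() directly."""
    if len(argv) != 1:
        print("usage: demo FILE", file=sys.stderr)
        return EX_USAGE
    if not os.path.exists(argv[0]):
        print(f"no such file: {argv[0]}", file=sys.stderr)
        return EX_NOINPUT
    return EX_OK


# A script would typically end with the following line, which hands the
# status code to the OS:
#     sys.exit(main(sys.argv[1:]))
```

Keeping sys.exit() out of main() makes the function easy to call from a test, because the status code comes back as an ordinary return value.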

The shell's operation is an important part of this OS API. Given a line of input, the shell performs a number of substitutions, depending on the (rather complex) quoting rules and substitution options. It then parses the resulting line into space-delimited words. The first word must be either a built-in shell command (such as cd or set) or it must be the name of a file, such as python3. The shell searches its defined PATH for this file.

To make effective use of executable files, we must be sure that the directory containing those files is named by the PATH environment variable. On POSIX systems, the entries in PATH are separated by colons, so you should append a colon (:) and the directory for your script. In Windows, the separator is a semicolon, so you should append a semicolon (;) and the directory for your script.
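
We can inspect PATH from inside Python to confirm that a directory is listed. The following is a small sketch; the three function names are invented for this example, while os.pathsep and shutil.which() come from the standard library.

```python
import os
import shutil


def path_directories():
    """Split PATH on the platform's separator (':' on POSIX, ';' on Windows)."""
    return [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]


def is_on_path(directory):
    """Return True if the directory is named by PATH."""
    target = os.path.normcase(os.path.abspath(directory))
    return any(
        os.path.normcase(os.path.abspath(d)) == target
        for d in path_directories()
    )


def locate(command):
    """Search PATH for an executable, much as the shell does."""
    return shutil.which(command)  # None when nothing is found
```

For example, locate("python3") returns the full path of the python3 file that the shell would run, or None if no directory on PATH contains it.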

The file named by the first word of a command must have execute (x) permission. The chmod +x somefile.py shell command marks a file as executable. Trying to run a file that isn't executable produces an OS Permission Denied error. Use the OS ls -l (or the Windows equivalent) command to see file permissions.
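
The same checks can be made from Python with the os and stat modules. This is a sketch that assumes a POSIX filesystem; the function names are hypothetical. Note that add_execute_permission() sets the x bit for everyone, like chmod a+x, and doesn't consult the umask.

```python
import os
import stat


def can_execute(path):
    """Ask the OS whether the current user may execute the file."""
    return os.access(path, os.X_OK)


def add_execute_permission(path):
    """Set the x bit for user, group, and others, similar to chmod a+x."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def permission_string(path):
    """Return the permission column that ls -l shows, such as '-rwxr-xr-x'."""
    return stat.filemode(os.stat(path).st_mode)
```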

The first bytes of an executable file have a magic number that is used by the shell to decide how to execute that file. Some magic numbers indicate that the file is a binary executable; the shell can fork a child process and execute it. Other magic numbers, specifically the value encoded by the two bytes b'#!', indicate that the file is a text script and requires an interpreter. The rest of the first line of this kind of file is the name of the interpreter.

We often use a line like the following in a Python file:

#!/usr/bin/env python3

If the Python file has permission to execute, and has this as the first line, then the shell will run the env program. The env program's argument (python3) will cause it to set up an environment and run the Python 3 program with the Python file as the first positional argument.
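
The check for the b'#!' magic number can be imitated in a few lines of Python. This is only a sketch of the idea, not what the shell or OS actually runs; the interpreter_from_shebang() name is invented for this example.

```python
def interpreter_from_shebang(path):
    """Return the interpreter command from a #! line, or None if absent."""
    with open(path, "rb") as source:
        first_line = source.readline()
    if not first_line.startswith(b"#!"):
        return None  # no script magic number; perhaps a binary executable
    # Everything after the two magic bytes names the interpreter.
    return first_line[2:].decode("utf-8", errors="replace").strip()
```

For a file that starts with the line shown previously, this function returns the string '/usr/bin/env python3'.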

After setting the PATH correctly, what happens when we enter ch18_demo.py -s someinput.csv at the command line? The sequence of steps that the program works through from the OS shell via an executable script to Python looks like the following:

  1. The shell parses the ch18_demo.py -s someinput.csv line. The first word is ch18_demo.py. This file is on the shell's PATH and has the x executable permission. The shell opens the file and finds the #! bytes. The shell reads the rest of this line and finds the /usr/bin/env python3 command.
  2. The shell parses the new /usr/bin/env command, which is a binary executable. The shell starts the env program. This program, in turn, starts python3. The sequence of words from the original command line, as parsed by the shell, ['ch18_demo.py', '-s', 'someinput.csv'], is provided to Python.
  3. Python will extract any options that are prior to the first argument. Options are distinguished from arguments by having a leading hyphen, -. These first options are used by Python during startup. In this example, there are no options. The first argument must be the Python filename that is to be run. This filename argument and all of the remaining words on the line will be assigned to sys.argv.
  4. The Python startup is based on the options found. Depending on the -S option, the site module may be imported to set up the import path, sys.path; the -s option leaves the user site-packages directory out of that path. If we use the -m option, Python will use the runpy module to start our application. The given script files may be (re)compiled to byte code. The -v option will expose the imports that are being performed.
  5. Our application can make use of sys.argv to parse options and arguments with the argparse module. Our application can use environment variables in os.environ. It can also parse configuration files; you can read Chapter 14, Configuration Files and Persistence, for more on this topic.
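
The last step might look like the following sketch. The text doesn't say what the -s option of ch18_demo.py means, so treating it as a boolean flag named --summary, followed by a positional filename, is purely an assumption for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the hypothetical ch18_demo.py command line."""
    parser = argparse.ArgumentParser(prog="ch18_demo.py")
    # Assumption: -s is an on/off flag; the real script may define it differently.
    parser.add_argument("-s", "--summary", action="store_true",
                        help="hypothetical flag used for this example")
    parser.add_argument("source", help="name of the input file")
    return parser


def parse(argv):
    """Parse the words after the script name, that is, sys.argv[1:]."""
    return build_parser().parse_args(argv)
```

With the example command line, parse(['-s', 'someinput.csv']) produces a namespace with summary set to True and source set to 'someinput.csv'. When the arguments are invalid, argparse writes a usage message to standard error and exits with a status code of two.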

If there is no filename, the Python interpreter will read from standard input. If the standard input is a console (called a TTY, in Linux parlance), then Python will enter a read-execute-print loop (REPL) and display the >>> prompt. While we use this mode as developers, we don't generally make use of this mode for a finished application.
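
An application can make a similar test on its own standard input by using the isatty() method of the file object. The following is a minimal sketch; the input_mode() function and the two labels it returns are invented for this example.

```python
import sys


def input_mode(stream=None):
    """Classify a stream as an interactive console or redirected input."""
    # Look up sys.stdin at call time so that redirection is respected.
    stream = sys.stdin if stream is None else stream
    return "console" if stream.isatty() else "redirected"
```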

Because of Python's flexibility, there are some other ways of providing input to the Python runtime. The standard input can be a redirected file; for example, python <some_file or some_app | python. While both examples are valid uses of Python, they are potentially confusing because the application source is not very obvious.
