T. Mailund, Introducing Markdown and Pandoc, https://doi.org/10.1007/978-1-4842-5149-2_10

10. Preprocessing

Thomas Mailund¹

(1) Aarhus N, Denmark

A Markdown document is just plain text, so you can rewrite that text in any way you like before you pass it to Pandoc. Any such rewriting is called preprocessing. Pandoc reads from standard input when it is not given an input file, so we can pipe the result of preprocessing into it on the command line (see Figure 10-1).

Assuming that the preprocessor takes the input file as input and that it writes its output to standard out, then a pipeline can look like this:

preprocessor infile.md |
    pandoc --from markdown ... -o outfile

When Pandoc reads from standard input, there is no file name from which it can guess the input format. It falls back to Markdown in that case, but it is clearer to state the format explicitly with the --from option.

The preprocessor can do whatever you want it to as long as it outputs a file that Pandoc can process. The output does not need to be Markdown—you can change the --from option if it is not—but it must be a file in a format that Pandoc can read. I will use Markdown as my output in the following.
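To make that contract concrete, here is a minimal sketch of a preprocessor in Python. It assumes a made-up convention that is not part of Markdown or Pandoc: lines starting with `%%` are notes to yourself that should never reach the output. Both the function name `preprocess` and the `%%` marker are hypothetical and only serve the illustration.

```python
import sys

def preprocess(text):
    """Drop lines that start with the (hypothetical) note marker '%%'."""
    kept = [line for line in text.splitlines(keepends=True)
            if not line.startswith("%%")]
    return "".join(kept)

if __name__ == "__main__" and len(sys.argv) > 1:
    # Read the file named on the command line and write the result to
    # standard out, so that it can be piped into pandoc.
    with open(sys.argv[1], encoding="utf-8") as infile:
        sys.stdout.write(preprocess(infile.read()))
```

Saved under a name of your choosing, the script takes the place of `preprocessor` in the pipeline above. The only things Pandoc cares about are that the text arrives on standard input and that it is in the format named by --from.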

Figure 10-1. Document formatting pipeline with a preprocessing step

Examples

In the following examples, I will use GPP¹ for the first two and Python² for the last. GPP is a preprocessor with somewhat limited functionality, but for including files and for selectively including or excluding segments of a file, it works excellently. Getting Python to do the same is additional work. On the other hand, since Python is a general-purpose programming language, we can get it to do whatever we want with the input document.

Including Files

One use for a preprocessor is to keep information that several documents share in one file and the document-specific input, such as a document’s body, in another. That is the idea behind templates, but there are other cases where such a setup is useful.

Imagine that you are teaching a class and hand out exercises every week. Some information, such as the name of the class and the name of the instructor, does not change from week to week, but other information, for example the week number, does.

We can make a file header.yml with the general information:

class: Markdown and Pandoc

instructor: Thomas Mailund

The header here is, of course, artificially simple. In practice, you would only split out a file of some complexity, but the example shows the principle.

For a specific week, we can then specify the week information, for example, the week number and the actual exercises for that week. Here is such a file; let us call it exercises.md. It holds the exercises for week 14 of the class.

---

#include "header.yml"

week: Week 14

---

# This is an exercise

Do something difficult

# This is another exercises

Do something even more difficult

The line #include "header.yml" is where the preprocessor does its thing.

Notice that the three dashes delimiting the YAML specification are not in header.yml. If they were, we could not include the file and still set the variable week in exercises.md. Because we include it inside the YAML header, we can combine the general variables set in header.yml with the file-specific variables.

If we pipe the document through the preprocessor

gpp < exercises.md

we get this result:

---

class: Markdown and Pandoc

instructor: Thomas Mailund

week: Week 14

---

# This is an exercise

Do something difficult

# This is another exercises

Do something even more difficult
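The text earlier notes that getting Python to do the same is additional work. As a rough illustration of how much, here is a sketch of the include step alone. It is not how GPP is implemented, and it does not attempt nested includes or any of GPP's other features. The names `expand_includes` and `read_file` are hypothetical; `read_file` is passed in as a parameter so that the example runs without touching the file system.

```python
import re

# Matches a line of the form: #include "some/file"
INCLUDE = re.compile(r'^#include\s+"([^"]+)"\s*$')

def expand_includes(text, read_file):
    """Replace each #include line with the contents of the named file.

    read_file is a function from a file name to that file's text.
    """
    out = []
    for line in text.splitlines(keepends=True):
        match = INCLUDE.match(line)
        if match:
            included = read_file(match.group(1))
            # Make sure the included text ends with a newline so the
            # following line is not glued onto it.
            if included and not included.endswith("\n"):
                included += "\n"
            out.append(included)
        else:
            out.append(line)
    return "".join(out)

# In-memory stand-ins for header.yml and the top of exercises.md.
files = {"header.yml": "class: Markdown and Pandoc\ninstructor: Thomas Mailund\n"}
document = '---\n#include "header.yml"\nweek: Week 14\n---\n'
print(expand_includes(document, files.__getitem__), end="")
```

With real files, `read_file` could be a small function that opens the named file and returns its contents.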

We can combine this with a template:

\documentclass{article}
\usepackage{hyperref}
\title{$class$: $week$}
\author{$instructor$}
\begin{document}
\maketitle
$body$
\end{document}

Combining the preprocessor and Pandoc now lets us build a document with our exercises.

gpp exercises.md |
    pandoc --template exercises.tex \
           --from markdown \
           -o exercises.pdf

Conditional Inclusion

Continuing with the exercise example, imagine that you have TAs for your class and want to give them solutions to the exercises. It is easier to keep the solutions in the same document as the exercises, but you don’t want to hand the solutions to your students. So, what you want is a way to include the solutions when you make documents for the TAs and exclude them otherwise. This is something GPP is excellent at as well.

You can test if a variable is defined using #ifdef. A variable here should not be confused with the variables that Pandoc works with. Remember that the preprocessor sees the document before Pandoc and does not communicate with Pandoc other than piping its output into it.

If we want to include or exclude a block of text, we can put it between #ifdef and #endif. We can do that for the solutions to our exercises:

---

#include "header.yml"

week: Week 14

---

# This is an exercise

Do something difficult

#ifdef SOLUTIONS

This is the solution to the exercise

#endif

# This is another exercises

Do something even more difficult

#ifdef SOLUTIONS

This is the solution to the exercise

#endif

If you build a document from the preceding file as it stands, you will not get the solutions in the output. To get them, you need to define SOLUTIONS. You can do this in the file with a #define statement, but for this particular application, we might as well define it on the gpp command line with the -D option. The following command line builds a PDF that contains both the exercises and the solutions.

gpp -DSOLUTIONS week14_exercises.md |
    pandoc --template exercises.tex \
           --from markdown \
           -o week14_exercises_solutions.pdf
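To see what the conditional step amounts to, here is a small Python sketch of #ifdef and #endif handling. It only mimics the behavior described above and is not GPP's implementation. The function name `filter_ifdef` and the `defined` parameter, which plays the role of the -D option, are hypothetical.

```python
def filter_ifdef(text, defined):
    """Keep text between '#ifdef NAME' and '#endif' only if NAME is in defined."""
    out = []
    stack = []  # one boolean per open #ifdef: is that block included?
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped.startswith("#ifdef "):
            name = stripped.split(None, 1)[1]
            stack.append(name in defined)
        elif stripped == "#endif":
            if not stack:
                raise ValueError("#endif without matching #ifdef")
            stack.pop()
        elif all(stack):
            # Emit the line only when every enclosing block is included.
            out.append(line)
    if stack:
        raise ValueError("#ifdef without matching #endif")
    return "".join(out)

doc = (
    "# This is an exercise\n"
    "Do something difficult\n"
    "#ifdef SOLUTIONS\n"
    "This is the solution to the exercise\n"
    "#endif\n"
)
for_students = filter_ifdef(doc, set())
for_tas = filter_ifdef(doc, {"SOLUTIONS"})
print(for_tas, end="")
```

The stack is there so that nested blocks behave sensibly: a line is kept only if every block around it is included.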

Running Code

Leaving the exercises, imagine that you are writing a book about programming and you have code examples. You want to show the result of running the code, so you want to evaluate all your code and insert the result into your document.

For example, you have the code

```python
for i in range(10):
    print(i, end = ' ')
```

```python
for i in range(10):
    print(-i, end = ' ')
```

and you want the first code block to be followed by the numbers from 0 to 9 and the second by the numbers from 0 to -9.³

The following Python code iterates over all lines in the input. It uses sys.stdin to read the input, so you must pipe input to it rather than call it with a file name. For each line, it checks whether it is a code block line, that is, whether it starts with three backticks. If it does, and the backticks are followed by python, it starts collecting lines until it sees the end of the block. When it gets there, it evaluates the Python code using exec. This function executes the code and produces any output the code prints, which is what we want here. Since every call to exec shares the same environment, functions and variables defined in earlier blocks can be used in later blocks.

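from sys import stdin

def main():
    exec_env = {}    # environment shared by all code blocks
    incode = False   # are we inside a python code block?
    codeblock = []   # lines collected from the current block
    for line in stdin:
        # Every input line is passed through unchanged
        print(line, end="")
        if line.startswith("```python"):
            incode = True
            continue
        if incode:
            if line.startswith("```"):
                # End of block: run the collected code; whatever it
                # prints lands right after the closing fence
                exec("".join(codeblock), exec_env)
                incode = False
                codeblock = []
                continue
            codeblock.append(line)

if __name__ == "__main__":
    main()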

You can call the preprocessor like this

python3 evalpy.py < eval-python.md

and get this result:

```python
for i in range(10):
    print(i, end = ' ')
```
0 1 2 3 4 5 6 7 8 9

```python
for i in range(10):
    print(-i, end = ' ')
```
0 -1 -2 -3 -4 -5 -6 -7 -8 -9
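In this result the printed numbers become an ordinary paragraph after each code block. One possible variation, which is not from the script above, is to capture what each block prints and wrap it in a code block of its own. The sketch below does this with `io.StringIO` and `contextlib.redirect_stdout` from the standard library. The function name `evaluate_blocks` is hypothetical, and the sketch assumes every input line ends with a newline.

```python
import io
from contextlib import redirect_stdout

FENCE = "`" * 3  # three backticks

def evaluate_blocks(lines):
    """Return the input with each python block followed by its captured output."""
    exec_env = {}   # shared, so later blocks see earlier definitions
    incode = False
    codeblock = []
    out = []
    for line in lines:
        out.append(line)
        if not incode and line.startswith(FENCE + "python"):
            incode = True
            continue
        if incode:
            if line.startswith(FENCE):
                # Run the block with standard out redirected into a buffer.
                captured = io.StringIO()
                with redirect_stdout(captured):
                    exec("".join(codeblock), exec_env)
                result = captured.getvalue()
                if result:
                    if not result.endswith("\n"):
                        result += "\n"
                    # Emit the output as a separate fenced block.
                    out.extend(["\n", FENCE + "\n", result, FENCE + "\n"])
                incode = False
                codeblock = []
                continue
            codeblock.append(line)
    return "".join(out)

source = [
    FENCE + "python\n",
    "for i in range(3):\n",
    "    print(i, end=' ')\n",
    FENCE + "\n",
]
print(evaluate_blocks(source), end="")
```

To use it as a preprocessor in the same way as the script above, you could pass it `sys.stdin` and write the returned string to standard out.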

Exercises

If you have gpp installed, preprocess a document using a flag so that you get different output when you create HTML than when you create LaTeX. You have to set the variables explicitly to do this, but see the next chapter for how to handle output formats in filters.
