In this section, we are going to read a PDF file using the PyPDF2 module and get the number of pages in it. This module has a class called PdfFileReader() that helps in reading a PDF file. Make sure you have a PDF file on your system. Right now, I have the test.pdf file on my system, so I will use this file throughout this section. Enter your PDF filename in place of test.pdf. Create a script called read_pdf.py and write the following content in it:
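import PyPDF2

# Open the PDF in binary mode; PdfFileReader() expects a binary file object
with open('test.pdf', 'rb') as pdf:
    read_pdf = PyPDF2.PdfFileReader(pdf)
    # numPages holds the total number of pages in the document
    print("Number of pages in pdf :", read_pdf.numPages)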
Run the script as follows:
student@ubuntu:~/work$ python3 read_pdf.py
Following is the output:
Number of pages in pdf : 20
In the preceding example, we used the PyPDF2 module. First, we opened test.pdf in binary read mode ('rb'), which gave us a file object. We then passed this object to PdfFileReader(), which reads the PDF. After reading the PDF file, we got the number of pages using the numPages property. In this case, the PDF has 20 pages.
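The same steps can be wrapped in a small reusable function. The following is only a sketch: count_pages() is a hypothetical helper name that is not part of PyPDF2. It also assumes that newer PyPDF2 releases provide a PdfReader class with a pages sequence in place of PdfFileReader() and numPages, so it checks which of the two is available before reading the file:

```python
def count_pages(path):
    """Return the number of pages in the PDF at path (hypothetical helper)."""
    # Imported inside the function so that a missing PyPDF2 installation
    # only raises an error when the helper is actually called.
    import PyPDF2

    with open(path, 'rb') as pdf:
        if hasattr(PyPDF2, 'PdfReader'):
            # Assumption: newer releases expose the pages as a sequence.
            return len(PyPDF2.PdfReader(pdf).pages)
        # Older releases: the same API that read_pdf.py uses.
        return PyPDF2.PdfFileReader(pdf).numPages
```

For the test.pdf file used in this section, calling count_pages('test.pdf') should return 20, the same value that read_pdf.py prints.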