
Python is easy to learn. You’re probably here because now that your code runs correctly, you need it to run faster. You like the fact that your code is easy to modify and that you can iterate on ideas quickly. The trade-off between “easy to develop” and “runs as quickly as I need” is a well-understood and often-bemoaned phenomenon. There are solutions.

Some people have serial processes that have to run faster. Others have problems that could take advantage of multicore architectures, clusters, or graphics processing units. Some need scalable systems that can process more or less as expediency and funds allow, without losing reliability. Others will realize that their coding techniques, often borrowed from other languages, perhaps aren’t as natural as examples they see from others.

In this book we will cover all of these topics, giving practical guidance for understanding bottlenecks and producing faster and more scalable solutions. We also include some war stories from those who went ahead of you, who took the knocks so you don’t have to.

Python is well suited for rapid development, production deployments, and scalable systems. The ecosystem is full of people who are working to make it scale on your behalf, leaving you more time to focus on the more challenging tasks around you.

Who This Book Is For

You’ve used Python for long enough to have an idea about why certain things are slow and to have seen technologies like Cython, numpy, and PyPy being discussed as possible solutions. You might also have programmed with other languages and so know that there’s more than one way to solve a performance problem.

While this book is primarily aimed at people with CPU-bound problems, we also look at data transfer and memory-bound solutions. Typically these problems are faced by scientists, engineers, quants, and academics.

We also look at problems that a web developer might face, including the movement of data and the use of just-in-time (JIT) compilers like PyPy for easy-win performance gains.

It might help if you have a background in C (or C++, or maybe Java), but it isn’t a prerequisite. Python’s most common interpreter (CPython, the standard you normally get if you type python at the command line) is written in C, and so the hooks and libraries all expose the gory inner C machinery. There are lots of other techniques that we cover that don’t assume any knowledge of C.

You might also have a lower-level knowledge of the CPU, memory architecture, and data buses, but again, that’s not strictly necessary.

Who This Book Is Not For

This book is meant for intermediate to advanced Python programmers. Motivated novice Python programmers may be able to follow along as well, but we recommend having a solid Python foundation.

We don’t cover storage-system optimization. If you have a SQL or NoSQL bottleneck, then this book probably won’t help you.

What You’ll Learn

Your authors have been working with large volumes of data, a requirement for “I want the answers faster!” and a need for scalable architectures, for many years in both industry and academia. We’ll try to impart our hard-won experience to save you from making the mistakes that we’ve made.

At the start of each chapter, we’ll list questions that the following text should answer (if it doesn’t, tell us and we’ll fix it in the next revision!).

We cover the following topics:

  • Background on the machinery of a computer so you know what’s happening behind the scenes

  • Lists and tuples—the subtle semantic and speed differences in these fundamental data structures

  • Dictionaries and sets—memory allocation strategies and access algorithms in these important data structures

  • Iterators—how to write in a more Pythonic way and open the door to infinite data streams using iteration

  • Pure Python approaches—how to use Python and its modules effectively

  • Matrices with numpy—how to use the beloved numpy library like a beast

  • Compilation and just-in-time computing—processing faster by compiling down to machine code, making sure you’re guided by the results of profiling

  • Concurrency—ways to move data efficiently

  • multiprocessing—the various ways to use the built-in multiprocessing library for parallel computing, efficiently share numpy matrices, and some costs and benefits of interprocess communication (IPC)

  • Cluster computing—convert your multiprocessing code to run on a local or remote cluster for both research and production systems

  • Using less RAM—approaches to solving large problems without buying a humungous computer

  • Lessons from the field—lessons encoded in war stories from those who took the blows so you don’t have to

Python 3

Python 3 is the standard version of Python as of 2020, with Python 2.7 deprecated after a 10-year migration process. If you’re still on Python 2.7, you’re doing it wrong: many libraries no longer support your line of Python, and support will become more expensive over time. Please do the community a favour and migrate to Python 3, and make sure that all new projects use Python 3.

In this book we use 64-bit Python. Whilst 32-bit Python is supported, it is far less common for scientific work. We’d expect all the libraries to work as usual but numeric precision, which depends upon the number of bits available for counting, is likely to change. 64-bit is dominant in this field, along with *nix environments (often Linux or Mac). 64-bit lets you address larger amounts of RAM. *nix lets you build applications that can be deployed and configured in well-understood ways with well-understood behaviors.
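If you are unsure which kind of interpreter you are running, a quick check along the following lines may help. This is a minimal sketch using only the standard library; the helper name pointer_width_bits is our own and purely illustrative.

```python
import struct
import sys


def pointer_width_bits() -> int:
    """Return the width of a C pointer in this interpreter, in bits."""
    # struct.calcsize("P") is the size in bytes of a native pointer (void *).
    return struct.calcsize("P") * 8


bits = pointer_width_bits()
print(f"This is a {bits}-bit Python build")

# sys.maxsize is the largest value a Py_ssize_t can hold; it only exceeds
# 2**32 on a 64-bit build, so it offers a second way to make the same check.
print(f"sys.maxsize exceeds 2**32: {sys.maxsize > 2**32}")
```

Both checks describe the interpreter itself rather than the operating system, so a 32-bit Python installed on a 64-bit machine will still report 32 bits.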

If you’re a Windows user, then you’ll have to buckle up. Most of what we show will work just fine, but some things are OS-specific, and you’ll have to research a Windows solution. The biggest difficulty a Windows user might face is the installation of modules: searching sites like Stack Overflow should give you the solutions you need. If you’re on Windows, then having a virtual machine (e.g., using VirtualBox) with a running Linux installation might help you to experiment more freely.

Windows users should definitely look at a packaged solution like those available through Anaconda, Canopy, Python(x,y), or Sage. These same distributions will make the lives of Linux and Mac users far simpler too.

Changes from Python 2.7

If you’ve upgraded from Python 2.7 then you might not be aware of a few relevant changes:

  • / meant integer division in Python 2.7, whereas it performs float division in Python 3.

  • str and unicode were used to represent text data in Python 2.7; in Python 3, everything is a str and these are always Unicode. For clarity, a bytes type is used if we’re using unencoded byte sequences.
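The two changes above can be seen in a few lines of Python 3. This is a small illustrative sketch; the variable names and the sample string are our own choices.

```python
# Division: "/" always performs float division in Python 3,
# while "//" gives the floor (integer) division that Python 2.7's "/"
# performed on two integers.
true_div = 7 / 2    # 3.5
floor_div = 7 // 2  # 3

# Text: a str is always Unicode; encoding it produces a separate bytes object.
text = "café"
raw = text.encode("utf-8")

print(type(text).__name__, len(text))  # number of characters
print(type(raw).__name__, len(raw))    # number of bytes after UTF-8 encoding

# Decoding the bytes with the same codec recovers the original text.
round_tripped = raw.decode("utf-8")
```

Note that the character count and the byte count differ here, because the accented character needs two bytes in UTF-8. Mixing str and bytes in one operation (for example, concatenating them) raises a TypeError in Python 3 rather than converting silently.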

If you’re in the process of upgrading your code, two good guides are “Porting Python 2 Code to Python 3” and “Porting to Python 3: An in-depth guide.” With a distribution like Anaconda or Canopy, you can run both Python 2 and Python 3 simultaneously, which will simplify your porting.


This book is licensed under Creative Commons Attribution-NonCommercial-NoDerivs 3.0.

You’re welcome to use this book for noncommercial purposes, including for noncommercial teaching. The license only allows for complete reproductions; for partial reproductions, please contact O’Reilly (see [Link to Come]). Please attribute the book as noted in the following section.

We negotiated that the book should have a Creative Commons license so the contents could spread further around the world. We’d be quite happy to receive a beer if this decision has helped you. We suspect that the O’Reilly staff would feel similarly about the beer.

How to Make an Attribution

The Creative Commons license requires that you attribute your use of a part of this book. Attribution just means that you should write something that someone else can follow to find this book. The following would be sensible: “High Performance Python (2nd edition) by Micha Gorelick and Ian Ozsvald (O’Reilly). Copyright 2020 Micha Gorelick and Ian Ozsvald, 978-1-449-36159-4.”

Errata and Feedback

We encourage you to review this book on public sites like Amazon—please help others understand if they’d benefit from this book! You can also email us at [email protected].

We’re particularly keen to hear about errors in the book, successful use cases where the book has helped you, and high performance techniques that we should cover in the next edition. You can access the page for this book at

Complaints are welcomed through the instant-complaint-transmission-service > /dev/null.

Conventions Used in This Book

The following typographical conventions are used in this book:


Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.


This element signifies a tip or suggestion.


This element signifies a general note.


This element indicates a warning or caution.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at

If you have a technical question or a problem using the code examples, please send email to .

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at .

O’Reilly Online Learning


For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise through books, articles, conferences, and our online learning platform. O’Reilly’s online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O’Reilly and 200+ other publishers. For more information, please visit

How to Contact Us

Please address comments and questions concerning this book to the publisher:

  • O’Reilly Media, Inc.
  • 1005 Gravenstein Highway North
  • Sebastopol, CA 95472
  • 800-998-9938 (in the United States or Canada)
  • 707-829-0515 (international or local)
  • 707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at

Email to comment or ask technical questions about this book.

For more information about our books, courses, conferences, and news, see our website at

Find us on Facebook:

Follow us on Twitter:

Watch us on YouTube:


Thanks to Jake Vanderplas, Brian Granger, Dan Foreman-Mackey, Kyran Dale, John Montgomery, Jamie Matthews, Calvin Giles, William Winter, Christian Schou Oxvig, Balthazar Rouberol, Matt “snakes” Reiferson, Patrick Cooper, and Michael Skirpan for invaluable feedback and contributions. Ian thanks his wife Emily for letting him disappear for 10 months to write this (thankfully she’s terribly understanding). Micha thanks Elaine and the rest of his friends and family for being so patient while he learned to write. O’Reilly are also rather lovely to work with.

Our contributors for the “Lessons from the Field” chapter very kindly shared their time and hard-won lessons. We give thanks to Ben Jackson, Radim Řehůřek, Sebastjan Trebca, Alex Kelly, Marko Tasic, and Andrew Godwin for their time and effort.
