Performance tips

Throughout this book, we have paid attention to performance. Here, we summarize the main performance topics and give some additional tips. Not every tip applies in every situation, so always benchmark or profile your code to measure the effect of a change. That said, applying some of them can often yield a remarkable performance improvement. Annotating every variable with a type is certainly not the way to go; Julia's type inference engine does that work for you:

  • Refrain from using global variables. If unavoidable, make them constant with const, or at least annotate the types. It is better to use local variables instead; they are often only kept on the stack (or even in registers), especially if they are immutable.
  • Use a main() function to structure your code.
  • Use functions that do their work on local variables via function arguments, rather than mutating global objects.
  • Type stability is very important:
    • Avoid changing the types of variables over time
    • The return type of a function should only depend on the type of the arguments
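
The two type-stability rules above can be illustrated with a minimal sketch. The function names are hypothetical and chosen only for this example; each unstable version is paired with a stable counterpart:

```julia
# Hypothetical names, used only for illustration.

# Type-unstable: returns the Int 0 for negative input and a Float64 otherwise,
# so the return type depends on the *value* of x, not only on its type.
unstable_relu(x::Float64) = x < 0 ? 0 : x

# Type-stable: zero(x) has the same type as x, so the result is always a Float64.
stable_relu(x::Float64) = x < 0 ? zero(x) : x

# Type-unstable accumulator: `total` starts as an Int and turns into a Float64
# as soon as the first element is added.
function sum_unstable(v::Vector{Float64})
    total = 0
    for x in v
        total += x
    end
    return total
end

# Type-stable accumulator: `total` is a Float64 from the start.
function sum_stable(v::Vector{Float64})
    total = zero(eltype(v))
    for x in v
        total += x
    end
    return total
end
```

Running `@code_warntype sum_unstable([1.0, 2.0])` in the REPL should highlight `total` as a union of `Int64` and `Float64`, whereas the stable version should show only `Float64`.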

Even if you do not know which concrete type will be passed to a function, but you do know that it will always be one and the same type T, define the function with a type parameter, as in the following code snippet:

function myFunc(a::T, c::Int) where T 
  # code 
end 
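
As a slightly fuller sketch of the same idea, the following hypothetical function `scale_by` (not from the book) takes a vector whose element type T is only fixed by the caller. The restriction `T<:Number` is an assumption added here so that multiplication is defined:

```julia
# `scale_by` is a hypothetical example. T is whatever numeric element type
# the caller passes; Julia compiles a specialized version of the method
# for each concrete T it encounters.
function scale_by(a::Vector{T}, c::Int) where T<:Number
    result = similar(a)              # same element type and length as `a`
    for i in eachindex(a)
        result[i] = a[i] * c
    end
    return result
end

scale_by([1.5, 2.5], 2)    # here T is Float64
scale_by([1, 2, 3], 2)     # here T is Int
```
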
  • If large arrays or dictionaries are needed, indicate their final size with sizehint! from the start (refer to the Ranges and arrays section of Chapter 2, Variables, Types, and Operations). The following is an example of its use:
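d1 = Dict{String,Int}()   # concrete key and value types instead of Any
sizehint!(d1, 10000)      # reserve room for 10,000 entries up front
for i in 1:10000          # iterate over the range itself, not over [1:10000]
    d1[string(i)] = 2*i
end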
  • If arr is a very large array that you no longer need, you can make its memory reclaimable by setting arr = nothing, provided no other references to the array remain. The occupied memory will be released the next time the garbage collector runs. You can force this to happen by invoking GC.gc().
  • In certain cases (such as real-time applications), disabling garbage collection (temporarily) with GC.enable(false) can be useful.
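  • Before Julia 0.5, named functions were faster than anonymous functions. In current versions, there is no performance difference, so use whichever form reads best.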
  • In general, use small functions.
  • Don't test for the types of arguments inside a function; use an argument type annotation instead.
  • If necessary, code different versions of a function (several methods) according to the types, so that multiple dispatch applies. Normally, this won't be necessary, because the JIT compiler is optimized to deal with types as they come.
  • Use types for keyword arguments; avoid using the splat operator (...) for dynamic lists of keyword arguments.
  • Using mutating APIs (functions with ! at the end) is helpful, for example, to avoid copying large arrays.
  • Prefer array operations to comprehensions; for example, x.^2 can be faster than [val^2 for val in x], and consecutive dot operations are fused into a single loop without temporary arrays.
  • Don't use try/catch in the inner loop of a calculation.
  • Use immutable types (see the ImmutableArrays package).
  • Avoid using type Any, especially in collection types.
  • Avoid using abstract types in a collection.
  • Type annotate fields in composite types.
  • Avoid using a large number of variables, large temporary arrays, and collections, because this provokes a great deal of garbage collection. Also, don't make copies of variables if you don't have to.
  • Avoid using string interpolation ($) when writing to a file; just write the values directly.
  • Devectorize your code, that is, use explicit for loops on array elements instead of simply working with the arrays and matrices. (This is the exact opposite of advice commonly given to R, MATLAB, or Python users.)
  • If appropriate, use a parallel reducing form with @distributed instead of a normal for loop (refer to Chapter 8, IO, Networking, and Parallel Computing).
  • Reduce data movement between workers in a parallel execution as much as possible (refer to Chapter 8, IO, Networking, and Parallel Computing).
  • Fix deprecation warnings.
  • Use the @inbounds macro to skip array bounds checking in an expression, but only if you are absolutely certain that no BoundsError can occur.
  • Avoid using eval at runtime.
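
A few of the tips above (annotated fields, concretely typed collections, a mutating function, and @inbounds) can be combined in one short sketch. All names are hypothetical and serve only as an illustration; the explicit length check is an assumption added here to make the use of @inbounds safe:

```julia
# All names below are hypothetical and exist only for this illustration.

# Fields annotated with an abstract type: the compiler cannot know the
# concrete type stored in `x` and `y`.
struct LooseVec
    x::Real
    y::Real
end

# Immutable struct with concretely typed fields.
struct Vec2
    x::Float64
    y::Float64
end

# Mutating function: writes the squares of `v` into the preallocated `out`
# instead of allocating a new array on every call.
function square!(out::Vector{Float64}, v::Vector{Float64})
    length(out) == length(v) ||
        throw(DimensionMismatch("out and v must have equal length"))
    # The length check above guarantees that every index is valid.
    @inbounds for i in eachindex(v)
        out[i] = v[i]^2
    end
    return out
end

function main()
    loose = Any[1.0, 2.0, 3.0]    # Vector{Any}: elements are stored as boxed values
    tight = [1.0, 2.0, 3.0]       # Vector{Float64}: elements are stored contiguously
    out = similar(tight)          # allocate the result buffer once
    square!(out, tight)
    return eltype(loose), eltype(tight), out
end

main()
```

Whether each change pays off depends on the workload, so it is worth timing both variants (for example with the @time macro) before settling on one.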

In general, split your code into functions. Data types will be determined at function calls, and when a function returns. Types that are not supplied will be inferred, but the Any type does not translate to efficient code. If types are stable (that is, variables stick to the same type) and can be inferred, then your code will run quickly.
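
The point that types are determined at function calls can be sketched as follows. This is a hypothetical example in which the type of `data` is only known at run time, and the actual work is moved into a separate function:

```julia
# Hypothetical example: the concrete type of `data` depends on a run-time value,
# so the compiler cannot fix it while compiling `load_and_sum`.
function load_and_sum(use_floats::Bool)
    data = use_floats ? [1.0, 2.0, 3.0] : [1, 2, 3]   # Vector{Float64} or Vector{Int}
    return kernel_sum(data)   # the call dispatches on the concrete type of `data`
end

# Inside this function, the element type T is known, so the loop is compiled
# for that specific type.
function kernel_sum(v::Vector{T}) where T<:Number
    total = zero(T)
    for x in v
        total += x
    end
    return total
end

load_and_sum(true)     # sums a Vector{Float64}
load_and_sum(false)    # sums a Vector{Int}
```

Only the single call to `kernel_sum` has to be resolved at run time; the loop itself runs in code specialized for the element type.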
