Day 2: Getting Assimilated

Yesterday we looked at Julia’s basic types and operators, and we spent quite a lot of time with its arrays. Julia’s basic data structures are versatile, but it has even more to offer.

First we’ll quickly review control flow, which should feel quite familiar. We’ll also hit abstract and user-defined types and learn all about functions and multiple dispatch.

Finally, we’ll wrap up the day by playing with Julia’s concurrency features, which Julia has assimilated from languages like Erlang.

Control Flow

Julia’s if, while, and for are pretty standard. Their syntax feels like a pleasant mix of Ruby and Python. Julia’s for loops are able to iterate over a variety of things, which is quite handy.

Let’s look at branching with if first:

 
julia> x = 10
10

julia> if x < 10
           println("My chair is too small")
       elseif x > 10
           println("My chair is too big")
       else
           println("My chair is just right")
       end
My chair is just right

One notable difference between Julia and languages like C, Python, and JavaScript is that the test expression must evaluate to a Boolean; 0, 1, and empty collections are not coercible to Boolean values. This is Julia’s underlying strong typing asserting itself.
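To make the Boolean rule concrete, here is a minimal sketch. The helper name usable_as_condition is hypothetical and exists only for this illustration; it tries a value as an if test and reports what happened.

```julia
# Hypothetical helper: try `value` as the test expression of an `if`.
function usable_as_condition(value)
    try
        if value
            return "accepted, took the true branch"
        else
            return "accepted, took the false branch"
        end
    catch err
        # A non-Bool test is not coerced; Julia raises an error instead.
        return "rejected with $(typeof(err))"
    end
end

println(usable_as_condition(true))  # accepted, took the true branch
println(usable_as_condition(0))     # rejected with TypeError
println(usable_as_condition([]))    # rejected with TypeError
```

When you do want C-style truthiness, spell the test out explicitly, for example x != 0 or !isempty(items).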

while loops are also what you’d expect:

 
julia> x = 8
8

julia> while x < 11
           x = x + 1
           println("More!")
       end
More!
More!
More!

Here are some for loops showing several different kinds of iteration.

This example iterates over an array. You can also use in instead of = if you prefer. Also, note that Julia’s strings interpolate from the current scope using $. You can reference variable names or entire expressions like $(a + 10).

 
julia> for a = [1, 2, 3]
           println("$a")
       end
1
2
3
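The loop above only interpolates a bare variable. As a small sketch of the expression form mentioned earlier, the hypothetical helper describe below uses both $a and $(a + 10) in one string.

```julia
# Hypothetical helper that builds a string using both interpolation forms.
function describe(a)
    "a is $a and a plus ten is $(a + 10)"
end

println(describe(5))   # a is 5 and a plus ten is 15
```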

Here we iterate over a range. 1:10 is all the integers from 1 to 10 inclusive.

 
julia> sum = 0
0

julia> for a = 1:10
           sum += a
       end

julia> sum
55

Iterating over other collections like dictionaries is easy too. Here we deconstruct each element, which for a dictionary is a tuple of the key and value.

 
julia> numbers = [:one => 1, :two => 2]
Dict{Symbol,Int64} with 2 entries:
:two => 2, :one => 1

julia> for (key, value) in numbers
           println("The name of $value is $key")
       end
The name of 2 is two
The name of 1 is one

Compared to multidimensional arrays, the control flow of Julia is unambitious. Sometimes simplicity is best, but you’ll see control flow shine in more complex examples later.

User-Defined Types and Functions

Julia has some great types, but no language is complete without the ability to make your own. You can define your own types in Julia, and it has a limited form of abstract types and subtyping as well.

After types, we’ll talk about user-defined functions, including Julia’s powerful multiple dispatch, which is a functional incarnation of polymorphism.

Let’s build a simple type to hold movie characters. Types in Julia are like structs in C or classes without methods if you are familiar with Java or Ruby.

Fields in a type definition can be constrained to be of a particular type with the :: operator. If no type constraint is given, the field is of type Any. It has the same kind of behavior as fields in Ruby, Python, or JavaScript.

Constructing a value of a type is done with its constructor function, which has the same name as the type and takes an argument for each field.

 
julia> type MovieCharacter
           heart :: Bool
           name
       end

julia> cowardly_lion = MovieCharacter(false, "Lion")
MovieCharacter(false,"Lion")

Accessing fields on a value of a type is done with the . operator, just as in many other languages.

 
julia> cowardly_lion.name
"Lion"

Abstract types have no fields, but serve as a way to group multiple types together. Concrete types are then defined as subtypes of the abstract type. This allows for extension and default behavior.

Abstract types cannot be constructed, but they can be used as field type specifiers or in typed array literals.

 
julia> abstract Story

julia> Story()
ERROR: type cannot be constructed

Defining a subtype is done with the <: operator, but looks exactly like a normal type definition otherwise. Multiple subtypes can coexist next to each other.

 
julia> type Book <: Story
           title
           author
       end

julia> type Movie <: Story
           title
           director
       end
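With Book and Movie in hand, the earlier remark about abstract types as field type specifiers and in typed array literals can be sketched. This block assumes Julia 1.x, where abstract type ... end and struct replace the abstract and type keywords used in the listings here. The Review type is hypothetical and not part of the chapter.

```julia
# Julia 1.x spelling of the chapter's definitions.
abstract type Story end

struct Book <: Story
    title
    author
end

struct Movie <: Story
    title
    director
end

# Hypothetical type: the abstract type Story constrains a field.
struct Review
    subject::Story
    stars::Int
end

# Typed array literal whose element type is the abstract type.
shelf = Story[Book("The Wonderful Wizard of Oz", "L. Frank Baum"),
              Movie("The Wizard of Oz", "Victor Fleming")]

review = Review(shelf[1], 5)
println(review.subject isa Book)   # true
println(length(shelf))             # 2
```

Any concrete subtype of Story is accepted for the subject field, while something like Review("a string", 5) fails because a string is not a Story.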

Like any dynamic language, Julia can use introspection to walk the type hierarchy. You can easily find the supertype as well as all the subtypes.

 
julia> super(Book)
Story

julia> super(Story)
Any

julia> subtypes(Story)
2-element Array{Any,1}:
 Book
 Movie

You can’t subtype a concrete type. This is perhaps unexpected, but avoids many pitfalls of traditional object-oriented languages. Only abstract types can have subtypes, so concrete types are always the leaves of the hierarchy.

 
julia> type Short <: Movie
           plot
       end
ERROR: invalid subtyping in definition of Short

We can now abstract over data, but we still need to abstract over code. Let’s see how user-defined functions look in Julia. You’ll find they have quite a Python flavor.

Functions return the last expression in their bodies. You can also use return to exit early.
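As a minimal sketch of both behaviors, the hypothetical function chair_verdict below exits early with return in two cases and otherwise falls through to its last expression.

```julia
# Hypothetical function, named here only for illustration.
function chair_verdict(height)
    if height < 10
        return "too small"   # `return` exits the function immediately
    end
    if height > 10
        return "too big"
    end
    "just right"             # the last expression is the implicit return value
end

println(chair_verdict(8))    # too small
println(chair_verdict(10))   # just right
```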

 
julia> function hello(name)
           "Hello, $(name)!"
       end
hello (generic function with 1 method)

julia> hello("world")
"Hello, world!"

Default arguments can be provided. If not specified when the function is invoked, the default values will be used.

 
julia> function with_defaults(a, b=10, c=11)
           println("a is $a, b is $b, and c is $c")
       end
with_defaults (generic function with 3 methods)

julia> with_defaults(1, 2)
a is 1, b is 2, and c is 11

julia> with_defaults(1)
a is 1, b is 10, and c is 11

Using ... on the final argument will make it a collection of all the remaining arguments if any exist.

 
julia> function it_depends(args...)
           for arg in args
               println(arg)
           end
       end
it_depends (generic function with 1 method)

julia> it_depends(:one, :two)
one
two

All of Julia’s operators are also functions and can be used in prefix notation too.

 
julia> +(1, 2)
3

julia> numbers = 1:10
1:10

When ... appears in a function definition’s argument list, it gathers arguments into a collection. When ... appears in a function invocation it expands the collection into arguments. It’s a very tidy feature that saves you from what other languages call apply.

 
julia> +(numbers...)
55

Functions in Julia really start to shine when you couple them with multiple dispatch. The same function can be defined multiple times for different types.

You might be familiar with overloading from other languages, but multiple dispatch is even more powerful. Instead of picking a function to call based on its first argument (or the object on which it’s invoked in object-oriented languages), multiple dispatch actually picks the function based on the types of all the arguments.

In Julia, each version of a function is called a method, but unlike object-oriented programming, the methods don’t belong to one particular type. This makes a lot of sense given Julia’s focus on scientific code; after all, if the dividend and the divisor have different types, which type should the division operator / belong to? In object-oriented languages, it ends up being whichever type is written on the left, which doesn’t make a lot of sense, but people have gotten used to it.

Let’s see multiple dispatch in action in a simple set of methods to concatenate two values together.

What makes the following a method instead of a function is that the types of the arguments are specified. This method is defined only when both arguments are Int64. This version of concat does a little math to append the numbers together.

 
julia> function concat(a :: Int64, b :: Int64)
           zeros = int(ceil(log10(b+1)))
           a * 10^zeros + b
       end
concat (generic function with 1 method)

julia> concat(117, 5)
1175

If we try to call our function on different kinds of arguments, Julia complains that no method was found.

 
julia> concat(117, "5")
ERROR: no method concat(Int64, ASCIIString)

Now we’ll define a concat method that takes a string as the second argument and returns a string. Now when we call the function, the correct method is selected. Notice that to pick the method, Julia had to look at the types of all the arguments. This is multiple dispatch at work.

 
julia> function concat(a :: Int64, b :: ASCIIString)
           "$a$b"
       end
concat (generic function with 2 methods)

julia> concat(117, "5")
"1175"

Multiple dispatch is a rarely seen language feature assimilated directly from Lisps. Clojure is probably the most mainstream language that includes it. Although little known, it is quite powerful and makes for some beautiful code.

It allows for open extension where normal object-oriented methods do not. There’s no need to subclass Int64 to add a new type of concat, nor do you need to modify the Int64 object with monkey patching. If your library provides methods for common types, users of the library can extend those methods to their own types without modifying your library at all.
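A small sketch of that open extension, assuming Julia 1.x (where String stands in for ASCIIString and struct for type). The Tag type is hypothetical; think of the first method as living in a library and everything after it as user code.

```julia
# "Library" code: the chapter's string-building method, Julia 1.x spelling.
concat(a::Int64, b::String) = "$a$b"

# "User" code: a hypothetical type defined outside the library.
struct Tag
    label::String
end

# A new method for the new type. The existing method is left untouched
# and is even reused here.
concat(a::Int64, b::Tag) = Tag(concat(a, b.label))

println(concat(117, "5"))             # 1175
println(concat(117, Tag("b")).label)  # 117b
println(length(methods(concat)))      # 2
```

Neither Int64 nor the original method had to change; dispatch picks the new method whenever the second argument is a Tag.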

Julia’s whole standard library relies heavily on multiple dispatch. The behavior of all the numeric types and operators is built with it. If you’re curious, try running methods(+) at the REPL, which will show you all the definitions for addition.

Concurrency

You’ve now seen all the basics—some familiar, some new. Taken together, it makes for quite a nice dynamic language with strong typing and abstraction. Julia is a language with a prime directive—to make writing numerical code better.

One of the biggest issues with numerical code is that it takes a long time to run, even on supercomputers. To eke out the maximum performance, concurrency and distributed computing are a necessity, and so Julia has both built right in.

Julia concurrency works a lot like Erlang. You communicate with other processes via message passing. Whether those processes are on the same machine or on remote machines makes no difference.

Before we can start using these processes, we must create some. There are two ways to do this. The first is to use addprocs to add local processes. The second is to start Julia with -p N, where N is the number of processes to create.

 
julia> addprocs(2)
2-element Array{Any,1}:
2, 3

julia> workers()
2-element Array{Int64,1}:
2, 3

addprocs creates new processes and returns their IDs. You might have noticed it starts at 2. Process 1 is the process for the REPL. workers returns the IDs of the worker processes.

Now that we have some processes, we can send and receive messages from them with remotecall and fetch. Note that these are the low-level primitives the rest of the system is built on, not necessarily things you’d use all the time.

 
julia> r1 = remotecall(2, rand, 10000000)
RemoteRef(2,1,7)

julia> r2 = remotecall(3, rand, 10000000)
RemoteRef(3,1,9)

julia> println("Not blocking")
Not blocking

julia> rand_list = fetch(r1)
10000000-element Array{Float64,1}:
0.902002, 0.495766, ...

remotecall executes a function on a particular worker. The first argument is the worker’s ID. Then comes the name of the function, and the rest of the arguments are passed to the given function. It returns a RemoteRef, which can be used to retrieve the result later.

remotecall returns immediately, as long as the worker ID is not 1—it does not block the shell process. We can still run code even if the processes are busy crunching numbers.

fetch takes a RemoteRef and returns the result of the function the worker was evaluating. If the worker isn’t done yet, this will block and wait until the result is available.
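The listings in this section use the API of the Julia release the chapter was written against. As a hedged sketch for Julia 1.x (an assumption about your installation), the same primitives live in the Distributed standard library, remotecall takes the function before the worker ID, and the handle it returns is a Future rather than a RemoteRef.

```julia
using Distributed

# Start one local worker only if this session has none yet.
nprocs() == 1 && addprocs(1)
w = first(workers())

# Julia 1.x argument order: function first, then the worker ID.
r = remotecall(rand, w, 1000)
println("Not blocking")      # runs without waiting for the worker

samples = fetch(r)           # blocks until the result is available
println(length(samples))     # 1000
```

The array is kept small here so the sketch runs quickly; the chapter’s 10000000 elements work the same way.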

Adding processes interactively is a little tedious. It’s a bit easier to start a REPL with a bunch of processes already available. Julia takes the -p argument to set the number of processes to start.

 
$ julia -p 8
2014-06-22 10:14:01.021 julia[93233:707] App did finish launching

Now we have a REPL with nine processes: one for the shell itself and eight spares to do parallel tasks with. Let’s put them to work on something more substantial than generating random arrays.

We’re going to write a coin flipping simulator using Julia’s higher-level parallel programming features instead of dealing with remotecall and fetch directly. First, we’ll start with a nonparallel version.

The function flip_coins returns the number of heads after doing all the flips. It uses a simple for loop.

 
julia> function flip_coins(times)
           count = 0
           for i = 1:times
               count += int(randbool())
           end
           count
       end
flip_coins (generic function with 1 method)

julia> flip_coins(20)
9

julia> flip_coins(20)
10

The @time macro will evaluate the given expression and print out how much time it took. As the number of flips increases, flip_coins becomes quite slow.

 
julia> @time flip_coins(100000000)
elapsed time: 0.391368303 seconds (96 bytes allocated)
49994306

julia> @time flip_coins(1000000000)
elapsed time: 4.219781844 seconds (96 bytes allocated)
500005355

We can speed this code up by flipping coins in parallel with Julia’s parallel for loops. Using the @parallel macro we can change a normal for loop into a parallel reducing version. The first argument is the combining operator, and the value of the loop body’s last expression is what gets combined. Note that the combining operation must be commutative, since the order in which iterations run is arbitrary as they get scheduled over the processes.

 
julia> function pflip_coins(times)
           @parallel (+) for i = 1:times
               int(randbool())
           end
       end
pflip_coins (generic function with 1 method)

julia> @time pflip_coins(100000000)
elapsed time: 0.293102855 seconds (113932 bytes allocated)
50001665

julia> @time pflip_coins(1000000000)
elapsed time: 2.19619143 seconds (55248 bytes allocated)
499995729

The parallel version is even a bit easier to read as the explicit summation is now gone.

If you compare these numbers to the previous nonparallel ones, you’ll see the parallel one is 30–50% faster. That’s a pretty good result for such a minor syntactic change.
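If you are following along on Julia 1.x (an assumption, since the listings here target an older release), a rough equivalent of pflip_coins can be sketched with the Distributed standard library. There the macro is spelled @distributed, and int(randbool()) becomes Int(rand(Bool)). The worker count of two is arbitrary.

```julia
using Distributed

# Add two local workers only if this session has none yet.
nprocs() == 1 && addprocs(2)

function pflip_coins(times)
    # Each worker sums its share of the range; (+) combines the partial sums.
    @distributed (+) for i = 1:times
        Int(rand(Bool))
    end
end

heads = pflip_coins(1_000_000)
println(0 <= heads <= 1_000_000)   # true
```

Timings will differ from the ones quoted here, so measure with @time on your own machine before drawing conclusions.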

Not all code will be this easy to parallelize, but you’d be surprised at how many things can be expressed as a parallel reduction. Lispers have been expressing things this way for decades, resulting in concise and powerful code, and Clojure has recently added parallel reducers as well.

Before we wrap up the day, let’s create a histogram to see the distribution of coin flips across multiple runs. Along the way we’ll point out a few more of Julia’s many charms.

 
julia> function flip_coins_histogram(trials, times)
           bars = zeros(times + 1)
           for i = 1:trials
               bars[pflip_coins(times) + 1] += 1
           end
           hist = pmap((len -> repeat("*", int(len))), bars)
           for line in hist
               println("|$(line)")
           end
       end
flip_coins_histogram (generic function with 1 method)

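bars[1] tracks the number of simulations that resulted in 0 heads, bars[2] tracks those with 1 head, and so on, because Julia arrays are indexed from 1. There is one more bar than times since the result could range from 0 heads to times heads (0 to 10 in the run below).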

In addition to a normal map, Julia provides pmap, which runs the mapping function in parallel across all the processes but preserves the order of the result.

The -> notation is Julia’s lightweight anonymous function syntax.

Let’s run it:

 
julia> flip_coins_histogram(100, 10)
|*
|
|*****
|*****
|*********************
|*****************************
|***********************
|**********
|******
|
|

Julia makes short work of data analysis tasks both in terms of the amount of time it takes you to code them and in how fast they run. It’s nice to be able to use the whole machine for work without having to juggle processes or mutexes yourself.

Interview with Julia’s Founders: Jeff Bezanson, Stefan Karpinski, Viral Shah, Alan Edelman

Now that you’ve seen some of Julia’s main features and put the language through its paces, you have a better appreciation for some of the trade-offs that came into play. Let’s check in with all of Julia’s founders: Jeff Bezanson, Stefan Karpinski, Viral Shah, and Alan Edelman.

Us:

Why did you create Julia?

Julia founders:

Our motivation for creating a new language is captured pretty well (if a bit lyrically) in our first blog post about Julia.[67] We wanted a language that combined the best of computer science and scientific computing. Historically, there has been a divide between the practical tools that scientists use to get work done and the systems carefully designed by computer scientists which don’t seem to work out in practice for the scientific crowd. There has also been a longstanding tension between productivity and performance. For speed and control, you have to write C or Fortran, but for productivity, people use high-level, dynamic languages like MATLAB, R, or Python. We wanted to have our cake and eat it too: get the performance of C in a language as easy to use as Python; to have all that great programming languages can offer in a form that is usable for hard scientific problems. To a large extent, we feel that Julia has shown that this is possible.

Julia is a bit different from other high-performance dynamic language projects in that the language is designed for performance from the beginning. This means there is more control over memory usage and layout and it’s easy to interact with C and Fortran. It also means we didn’t need lots of difficult implementation tricks to get speed. Julia’s execution model is pretty straightforward and transparent once you get the hang of it.

Us:

What do you like most about it?

Julia founders:

Once you get used to multiple dispatch it is very hard to go back to single dispatch. It just feels so natural to provide multiple methods of a function that do slightly different things based on what types of values you pass. We’re also happy with how clean and uncluttered the language is. The core language is quite minimal, channeling the spirit of Scheme in many ways. Of course, Scheme doesn’t have the burden of syntax, which Julia has to deal with. On the other hand, in Julia basic numeric types like Int and Float64 are defined in the standard library instead of being baked into the language spec. Multiple dispatch is absolutely crucial here because mathematical operators like addition and array indexing are by far the most polymorphic things in most languages—in Julia they’re just syntax for calling generic functions.

We are most proud of the Julia community. Not only are the people who frequent the Julia mailing lists and GitHub repositories brilliant and knowledgeable, but the standards of politeness, civility, and helpfulness are remarkable. Every time someone new is confused or rude the community response is unfailingly civil and kind.

Us:

What kinds of problems does it solve best?

Julia founders:

Julia is ideal for really hard technical problems that require a flexible, productive language to explore the problem space efficiently, but also need great performance to get answers in reasonable time. Traditionally, technical computing languages have been quite limited once you stray beyond number crunching. Julia is not like this—it is also a general-purpose language. You can solve hard computational problems, but also build a web service in front of that computation, all in the same language.

Us:

What’s the most surprising place you’ve seen Julia in production?

Julia founders:

We have seen interesting and sometimes quite unexpected applications in aerospace, finance, and real-time audio. We are also starting to see startups deploy Julia in web applications to solve computational problems on demand. There is a surprising amount of interest in using Julia for embedded systems. Our in-progress port to ARM should help accelerate this trend. The ability to compile Julia scripts to executables will also help—you can already do this but it’s not as convenient as it should be.

Us:

If you were to start from scratch, is there anything you’d do differently?

Julia founders:

When we started, there was a trade-off between making new users feel comfortable in Julia vs. clean language design. In hindsight, we were probably more concerned than we should have been with maintaining superficial similarity to other technical computing languages. For example, for our array concatenation syntax, it would have been better to do something more general as long as it was reasonably easy to use. Of course, it’s not too late to change some of these choices.

What We Learned in Day 2

We started off today looking into Julia’s control flow constructs. Control flow looks similar to many other languages, especially Python and Ruby.

Next we dove into user-defined types and functions. Abstract types group concrete subtypes, and concrete types cannot be subtyped further, but Julia lets you mix types via the Any type. Functions are built on multiple dispatch, which is a more powerful version of the overloading and dynamic dispatch that you might be familiar with from object-oriented languages.

Finally we dove into Julia’s concurrency, starting from the primitives and then working up to the high level with parallel for loops and pmap. With just a few tweaks we made a coin flipping function twice as fast.

Your Turn

Now that you have nearly the whole language in your grasp, you can work on some more interesting problems.

Find…

  • The parallel computing part of the Julia manual. Specifically, read up on @spawn and @everywhere.

  • The Wikipedia page on multiple dispatch.

Do (Easy):

  • Write a for loop that counts backward using Julia’s range notation.

  • Write an iteration over a multidimensional array like [1 2 3; 4 5 6; 7 8 9]. In what order does it get printed out?

  • Use pmap to take an array of trial counts and produce the number of heads found for each element.

Do (Medium):

  • Write a factorial function as a parallel for loop.

  • Add a method for concat that can concatenate an integer with a matrix. concat(5, [1 2; 3 4]) should produce [5 5 1 2; 5 5 3 4].

  • You can extend built-in functions with new methods too. Add a new method for + to make "jul" + "ia" work.

Do (Hard):

  • Parallel for loops dispatch loop bodies to other processes. Depending on the size of the loop body, this can have noticeable overhead. See if you can beat Julia’s parallel for loop version of pflip_coins by writing something using the lower-level primitives like @spawn or remotecall.
