
14   Software Infrastructure


Software used in neuromorphic systems can be categorized according to the role(s) it performs. Each of these roles has particular features and presents particular challenges. Optimization and Application Programming Interface (API) design are important, especially for software that is directly involved in processing streams of address-events. Several example software systems are briefly presented.

14.1   Introduction

The software used in AER systems is an often neglected component of such systems and historically, relatively little has been published on the subject (with a few exceptions, e.g., Dante et al. 2005; Delbrück 2008; Oster et al. 2005), although it forms an essential part of all but the simplest of systems. The software in AER systems generally covers one or more of the following roles: chip and system description, configuration, AE stream handling, mapping, and placement and routing.

Chip and System Description software generally consists of databases and/or description languages, which enable knowledge of the properties of relatively disparate hardware to be maintained and interrogated by one or more of the remaining kinds of software in a uniform manner.

By Configuration Software, we mean software which is not necessarily involved once the AER system is up and running, but which is used to configure, for instance, various parameters and bias values in mixed-signal chips, as well as software which is used for parameter tuning and calibration.

By AE Stream Handling Software, we mean software which actively participates in the handling of the streams of AEs on their way to and from hardware devices, to replay or generate stimuli for the hardware, or to capture AEs for the purposes of display, statistical analysis, later replay, and so on, or even to carry out algorithmic processing on the AE streams, as further discussed in Chapter 15.

Mapping Software is software which controls the mapping of AEs between address spaces, whether one-to-one mappings or fan-out (one-to-many) mappings.

Placement and Routing Software is involved in configuring large scale AER systems consisting of many identical hardware devices with multiple possible paths for AEs to flow between them. The task of such software is to optimally distribute or place neural populations across the available hardware and to determine how to optimally route the AE traffic between them. This task is analogous to the placement and routing problems which arise in the design of printed circuit boards, integrated circuits in general, and field programmable gate arrays. Software for doing this is beyond the scope of this chapter, but see for example Brüderle et al. (2011) and Ehrlich et al. (2010) (in which placement and routing is referred to as mapping), and Chapter 16.

14.1.1   Importance of Cross-Community Commonality

To promote the development of neuromorphic systems to a scale that can deal with real-world problems, the community relies on exchange and cooperation between different laboratories. Much effort has been put into facilitating communication between the researchers themselves, for example at the annual Telluride and CapoCaccia workshops or by the Institute of Neuromorphic Engineering (Cap n.d.; Ins n.d.). Multi-lab efforts such as those in, for example, the CAVIAR (CAVIAR 2002), ALAVLSI (e.g., Chicca et al. 2007), and later FACETS (Brüderle et al. 2011) projects were facilitated by having common chip setup descriptions, configuration interfaces, AE stream formats, and so on.

14.2   Chip and System Description Software

One of the challenges facing those who build neuromorphic systems is the complexity of the hardware. Even the simplest of systems may have tens of biases and parameters which must be adjusted to reach a desired operating regime. Frequently, compromises in the chip designs due, for instance, to area constraints lead to certain, but not all, parameters being shared across diverse structures on the chip. For example, on a hypothetical multineuron chip, all the weights of synapses of a certain type may be forced to be controlled by a single parameter across all neurons on the chip, whereas the weights of other synapses might be individually adjustable. Some neurons may be able to use on-chip connections and others not. The addresses emitted by a 1000-neuron chip might not range from 0 to 999, but from, say, 48 to 2046 with only the even-numbered addresses being used, while the corresponding synapses might be addressed with the neurons numbered in the opposite order in bits 5..14 (i.e., from 31 968 down to 0) and the index of the synapse on the neuron in the low-order 5 bits. In hardware designs, all things are possible, and the software needs to take account of this!
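
As a concrete illustration of the address arithmetic that such a description must capture, the following Python sketch encodes the hypothetical 1000-neuron chip just described. The constants and function names are invented for illustration; in practice they would come from the chip and system description database rather than being hard-coded.

# Address arithmetic for the hypothetical 1000-neuron chip described above.
N_NEURONS = 1000
SYNAPSES_PER_NEURON = 32          # synapse index occupies the low-order 5 bits

def neuron_event_address(neuron):
    # AE emitted by neuron 0..999: even addresses 48, 50, ..., 2046.
    assert 0 <= neuron < N_NEURONS
    return 48 + 2 * neuron

def synapse_address(neuron, synapse):
    # Synapse input address: reversed neuron index in bits 5..14,
    # synapse index in bits 0..4.
    assert 0 <= neuron < N_NEURONS and 0 <= synapse < SYNAPSES_PER_NEURON
    return ((N_NEURONS - 1 - neuron) << 5) | synapse

def neuron_from_event_address(address):
    # Invert neuron_event_address().
    return (address - 48) // 2

assert neuron_event_address(0) == 48
assert neuron_event_address(999) == 2046
assert synapse_address(0, 3) == 31971      # (999 << 5) | 3
assert neuron_from_event_address(2046) == 999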

The number of parameters grows with the size of systems. And inherent in the use of analog technology is the issue of mismatch, which means that two instances of what is notionally the same chip will generally require slightly different bias parameters to operate in approximately the same regime, so the parameter values for one chip are likely to be unique to that very chip and cannot be used for other chips of the same type. The proper design of bias generators (see Chapter 11) can help with this problem, but many of the current academic research developments still require custom settings or calibration for individual chips, and these settings must be maintained together with the hardware components. Furthermore, different chips will be mounted on different boards, and in an inhomogeneous system, each board type may be addressed in different ways.

In order to master all of this complexity, it helps build up a database of some kind describing the parameters, chips, boards, and so on involved, or potentially involved in a particular setup. For very simple systems, this could be done with simple text files, but typically some form of XML (Extensible Markup Language) is used.

Given a database (in whatever form) containing knowledge about the hardware system, it can be interrogated by all of the other software in a uniform manner to determine how to address given synapses and neurons, how to connect them, how to automatically generate graphical user interface (GUI) controls, how to display results, and so on.

14.2.1   Extensible Markup Language

XML (see, e.g., Bradley 2002) lends itself well to describing typically hierarchically organized neuromorphic systems, since it is itself a hierarchical yet versatile format which can be easily extended. The files can be edited with a standard text editor and displayed in a browser. The syntax is understandable by anyone who is familiar with the concepts of HTML. Most importantly, one can easily import and process XML documents in a variety of programming environments, for example MATLAB, Java, and Python.

An entry in XML format consists of tags and attributes of the form

<tag attribute1="value1" attribute2="value2" ... > content </tag>

If no content is given, this can be shortened to

<tag attribute1="value1" attribute2="value2" ... />.

Hierarchical structures are built by nesting tags as content into other tags. Files can include arbitrary tags and attributes that are ignored during further processing, so any kind of additional content can be added.
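
As a sketch of how such a description might be interrogated by other software, the fragment below embeds a small, hypothetical chip description and reads it with Python's standard library; the tag and attribute names are invented for illustration and are not taken from any particular description format.

import xml.etree.ElementTree as ET

# A small, hypothetical chip description; tag and attribute names are
# invented for illustration only.
CHIP_XML = """
<chip name="ifchip1" neurons="1000">
  <bias name="vleak" default="0.35" transistor="N"/>
  <bias name="vthr"  default="0.82" transistor="P"/>
  <addressing>
    <neuron-out base="48" stride="2"/>
    <synapse-in neuron-bits="5:14" synapse-bits="0:4" reversed="true"/>
  </addressing>
</chip>
"""

root = ET.fromstring(CHIP_XML)
print(root.get("name"), "has", root.get("neurons"), "neurons")
for bias in root.iter("bias"):
    # Tags and attributes that a given tool does not recognize are simply
    # ignored, which is what makes the format easy to extend.
    print(bias.get("name"), "defaults to", bias.get("default"), "V")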

14.2.2   NeuroML

A particularly important flavor of XML used in neural network modeling is NeuroML (Gleeson et al. 2010). NeuroML is structured into three levels and includes MorphML for describing the morphology of biological neurons, ChannelML for describing channel and synapse properties, and NetworkML for describing networks of neurons. Not all of its levels are necessarily relevant in describing neuromorphic, hardware-based systems, but the structure of NeuroML makes it possible to use only the components which are relevant.

14.3   Configuration Software

To facilitate testing and operation of neuromorphic systems with a multitude of parameters, a software infrastructure to easily interface to the hardware is desirable. Without this, the setup and operation of the chips require too much skill and experience. Also it is often the case that parameters have to be tuned to a narrow operating range, and a working set of values established after many testing sessions should not be easily lost and should be easy to retrieve at a later time. Persistent storage of parameters between runtime sessions is essential so that users can tweak parameters and return to the same state later.

When hand-tuning is required, a GUI automatically constructed from a database description of the system is extremely useful. An example of such a system is presented in Section 14.6.1 below.

Hand-tuning, however, does not scale to larger systems. In larger systems it is more desirable to perform parameter estimation and calibration automatically. Until recently, automated methods for mapping VLSI circuit bias voltages to neural network type parameters were based on heuristics and resulted in ad hoc, custom-made calibration routines. For example, in Brüderle et al. (2009) the authors perform an exhaustive search of the parameter space to calibrate their hardware neural networks, using the simulator-independent description language ‘PyNN’ (Davison et al. 2008).

This type of brute-force approach is possible because of the accelerated nature of the hardware used, but it becomes intractable for real-time hardware or for very large systems, due to the massive amount of data that must be measured and analyzed to carry out the calibration procedure. An alternative, model-based approach is proposed in Neftci et al. (2011), where the authors fit data from experimental measurements with equations derived from transistor, circuit, and computational models to map the bias voltages of VLSI spiking neuron circuits to the parameters of the corresponding software neural network. This approach does not require extensive parameter-space search techniques, but new models and mappings need to be formulated every time a new circuit or chip is used, making its application quite laborious.

An example of automatic parameter estimation and calibration software is presented in Section 14.6.4.

14.4   Address Event Stream Handling Software

AE systems by their nature lend themselves rather well to integration with this kind of software, as all of the information being conveyed has been converted to a digital form. However, applying software to AE processing is challenging from the point of view of latency and bandwidth. (For a formal treatment of latency and bandwidth, see Chapter 2 and Section 2.3 in particular.)

It is more efficient to process a buffer full of AE data than to process each event as it arrives, and even if the software is designed following the best principles for real-time systems, latencies may be bounded but not always consistent. Therefore, in order to retain the timing information present in an incoming AE stream on its asynchronous bus as it enters the synchronous world of a computer, it is essential that the interface hardware stores not only the address embodied in the event, but also its arrival time to sufficient resolution in the form of a time-stamp. Thus, software handling AEs in a computer is usually dealing on the input side with data consisting of pairs of addresses and time-stamps. On the output side, a slightly different form is often required by the hardware, namely pairs of addresses and inter-spike intervals, that is times which are relative to the output of the preceding AE rather than to some clock which runs continuously.
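
The conversion between these two representations is simple but worth making explicit. The sketch below assumes events are plain (address, timestamp) pairs with timestamps in microseconds; real interfaces use their own packed binary formats.

def to_intervals(events, start_time=0):
    # Convert (address, timestamp) pairs into (address, inter-spike-interval)
    # pairs suitable for a sequencer that expects relative times.
    out, previous = [], start_time
    for address, timestamp in events:
        out.append((address, timestamp - previous))
        previous = timestamp
    return out

def to_timestamps(events, start_time=0):
    # Inverse operation: accumulate intervals back into absolute timestamps.
    out, t = [], start_time
    for address, interval in events:
        t += interval
        out.append((address, t))
    return out

# A short stream from two addresses, timestamps in microseconds.
stream = [(17, 1000), (42, 1250), (17, 1900)]
assert to_intervals(stream) == [(17, 1000), (42, 250), (17, 650)]
assert to_timestamps(to_intervals(stream)) == stream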

14.4.1   Field-Programmable Gate Arrays

Even given highly optimized code in a hard real-time environment, conventional software ‘in-the-loop’ is unlikely to be able to keep up with the demands of processing all of the spikes flowing through an AE system. It is for this reason that mappers are usually constructed using FPGAs rather than CPUs to perform the mapping function. Of course, the FPGAs must be ‘programmed’ using a hardware description language (HDL), but this kind of logic design is outside of our scope here.

14.4.2   Structure of AE Stream Handling Software

Inevitably some AEs must leave the core of an AE system and enter a conventional computer, whether for debugging, monitoring, or control purposes. And often it is desirable to be able to feed a pre-computed AE stream, for instance a test stimulus, into an AE system from a computer. In these cases software must be written that directly handles an AE stream. This software typically contains drivers which talk to and provide an abstraction of the specific hardware being used; a library that presents a well-defined, stable API; and applications written on top of this library, possibly with a GUI, which provide a means to monitor and record what is going on inside the AE system, possibly further process the AE output, or indeed provide input such as a test stimulus to the AE system.

The capture and algorithmic processing of AE data on computers is central to jAER, which is introduced in Section 14.6.3, and this algorithmic processing is discussed separately in Chapter 15.

14.4.3   Bandwidth and Latency

Bandwidth and latency issues have already been discussed primarily in relation to hardware in Chapters 2 and 13, but of course bandwidth is also almost always an important consideration in AE stream handling software. Although it may be inherent in the nature of spike-based communication that the loss of a spike or two should not be critical, in experimental situations it is usually desired not to drop any spikes, but to be able to faithfully record all of the output from a hardware system under test. Latency may or may not be an issue, depending on the nature of the system. Once AEs have been time-stamped by hardware, the original inter-spike intervals can always be recovered, so the latency between reading the AE and its time-stamp from the hardware and further processing steps within the computer is not generally an issue while such processing remains within the computer. However, if these AEs or subsequent AEs arising from their processing are to be reinjected into the same AE system from which they came, then the latency is likely to be critical, as providing spikes back into a not purely feed-forward neural network too late, that is, after some essentially arbitrary delay, may have non-negligible effects. That being said, depending on the nature of the system, if it operates on biological neural timescales, jitter of up to a few hundred microseconds may be acceptable.

So how are high bandwidth and low latency to be achieved? Probably the most important concern is to minimize the copying of data (i.e., the AEs). In a naïve approach, within a monitor driver the data are copied from hardware buffers to kernel buffers, and then, when demanded by the overlying application, copied again to user space buffers. Each of these copies requires CPU time. To eliminate this, direct memory access (DMA) can be used (if the hardware supports it, and it should) to copy the data from the hardware buffers. The use of DMA removes the CPU from the first copy, except insofar as it needs to set up the DMA transfer. If buffers can be memory mapped into user space, the second copy can be eliminated. Memory mapping of buffers into user space also avoids having to make time-consuming transitions into and out of kernel mode to read the data. One difficulty that can arise here is that the user space application needs a means to know how much data are available in the buffers, that is, what portion of the buffers contains valid data. The hardware and/or driver must make this information available by some means.
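
A rough sketch of the user-space side of this arrangement is given below. Both the device node and the buffer layout (a 4-byte count of valid events followed by 32-bit address/timestamp pairs) are hypothetical; a real driver defines its own layout and its own mechanism for publishing how much of the buffer is valid.

import mmap
import os
import struct

DEVICE = "/dev/aer_monitor0"        # hypothetical device node
BUFFER_SIZE = 1 << 20               # 1 MiB, hypothetical

def read_events(buf):
    # Read the valid portion of a memory-mapped monitor buffer without any
    # further copy into intermediate buffers.
    (count,) = struct.unpack_from("<I", buf, 0)     # number of valid events
    events, offset = [], 4
    for _ in range(count):
        address, timestamp = struct.unpack_from("<II", buf, offset)
        events.append((address, timestamp))
        offset += 8
    return events

fd = os.open(DEVICE, os.O_RDWR)
buf = mmap.mmap(fd, BUFFER_SIZE)    # map the driver's buffer into user space
events = read_events(buf)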

Always waiting until a given quantity (a buffer-full) of data is available before signaling to user space that data is available does not work for general purpose AE monitoring, since an AE system may sometimes produce very few output events for extended periods of time (e.g., a silicon retina observing an unchanging scene) and these output events would not be available to the application until later. As mentioned in the description of the early packet feature of the USBAERmini2 in Section 13.2.3 and in Section 15.2.1, it is important that hardware interfaces send their contents at a minimum rate (e.g., 1 kHz) even if their FIFOs are not full.

Typically, overlapping or asynchronous input and output must be performed to decouple data acquisition from processing. Separate threads or even processes are used to process the AE data and to perform the actual data acquisition and data output. If overlapping I/O is used, a processing thread can work on newly acquired data passed to it from an acquisition thread, while the acquisition thread is waiting for new data. Otherwise, the processor might be idle while waiting for new data to be captured.

If the data rate from a device is too high, it can overwhelm processing capability, possibly resulting in an ever-increasing backlog of data being held in memory. One way to deal with this is to allocate a certain maximum time for processing a buffer of data. If this time is exceeded, the rest of the data are discarded. This way, at least the most recent data are processed, even though some data are discarded. Another way to implement this approach in hardware is by only capturing events up to some determined rate. The rest of the events are then discarded by the hardware without being transmitted to the processor.
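
A minimal sketch of this decoupling, combined with the per-buffer time budget just described, is shown below; acquire_buffer() and process_event() are hypothetical placeholders for the real driver call and the real event processing.

import queue
import threading
import time

buffers = queue.Queue(maxsize=8)   # bounded, so a backlog cannot grow without limit
TIME_BUDGET = 0.010                # at most 10 ms of processing per buffer (illustrative)

def acquisition_loop(acquire_buffer):
    # Blocks in the driver waiting for data, then hands buffers to the consumer.
    while True:
        buf = acquire_buffer()     # e.g., a list of (address, timestamp) pairs
        try:
            buffers.put(buf, timeout=0.1)
        except queue.Full:
            pass                   # consumer is too far behind: drop the whole buffer

def processing_loop(process_event):
    # Runs in its own thread so that processing overlaps with acquisition.
    while True:
        buf = buffers.get()
        deadline = time.monotonic() + TIME_BUDGET
        for event in buf:
            if time.monotonic() > deadline:
                break              # budget exceeded: discard the rest of this buffer
            process_event(event)

# threading.Thread(target=acquisition_loop, args=(my_acquire,), daemon=True).start()
# threading.Thread(target=processing_loop, args=(my_process,), daemon=True).start()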

Similar, reciprocal arguments apply on the AE output (sequencer) side.

14.4.4   Optimization

It may be that to achieve the best possible throughput and lowest latency, the software will need to be optimized. However, one should not be tempted to micro-optimize from the word go. Knuth (1974) wrote:

[…] programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.

This might have been (at least mostly) hyperbole, but in trying to optimize too early or too broadly, one can end up trying to second-guess what optimizations the compiler or interpreter will perform, and it is very easy to waste a great deal of time trying to optimize areas of code that do not need optimizing. The risk is that the code will end up being difficult to read, understand, and maintain.

On modern processors it is very difficult to predict how a given small change to code will affect its runtime performance, primarily because of cache effects (Drepper 2007). For instance, new buffers should not be continually allocated and old buffers freed: re-using old buffers for new data not only avoids the pure processing-time overhead associated with the allocation and freeing operations, but also means that there is a higher chance that the memory used remains cached.

Optimization needs to be performed in an empirical way. First, the performance of the software needs to be measured. A profiler can be used to determine which routines are taking up the most time and therefore where optimization effort is best spent. After each would-be optimization has been coded, the performance should be measured again to ensure that performance has indeed been improved, and not made worse. The need to perform such measurements should be considered during the design of the software and consideration should be given to building in a means of ‘instrumentation,’ although this should of course itself be designed to have a minimal impact on performance.

14.4.5   Application Programming Interface

For interoperability between different systems developed in different laboratories, or between different generations of hardware from the same laboratory, it is desirable that a common API be used, so that application-level software can be ported from one hardware base to another with as little effort as possible. Developing a good API is a difficult task, particularly where it should, to some degree, be ‘future-proof’ in a field in which the technology is developing rapidly. However, some aspects of dealing with AE streams are likely to remain unchanging over time, irrespective of whether the underlying hardware is PCI based or USB based, for example. The fundamental operations of reading and writing an AE stream, and perhaps imposing hardware-level filters on such streams to select which AEs or ranges of AEs are monitored, remain the same. However, the hardware has to play its part if true compatibility is to be achieved. If some hardware formats its AE streams differently from others, the benefit of reformatting the data stream in the API layer for consistency between the two types of hardware is probably not worth the processor cycles it would cost.

Further points that are sometimes neglected in the development of APIs in this area (some of which appear in the sketch following this list) are:

  • for any property for which there is a ‘set’ function, there should also be a corresponding ‘get’ function;
  • the ability to correlate the time-stamps applied by the hardware to the computer’s clock, and/or so-called ‘wall-clock’ time, necessary to relate AE data to stimuli presentation, other (e.g., oscilloscope) measurements, and so on;
  • scalability, in terms of allowing for multiple hardware instances; and
  • reentrancy, such that a library can be safely used by multiple threads.
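
The following minimal sketch, in Python, shows what a hardware-independent monitor/sequencer interface reflecting the fundamental operations above and some of these points (set/get pairing, time-stamp correlation) might look like; the class and method names are invented for illustration and do not correspond to any existing library.

from abc import ABC, abstractmethod
from typing import List, Tuple

Event = Tuple[int, int]            # (address, timestamp in microseconds)

class AEStream(ABC):
    # Hypothetical hardware-independent monitor/sequencer interface.

    @abstractmethod
    def read(self, max_events: int, timeout_ms: int) -> List[Event]:
        """Return up to max_events time-stamped events, waiting at most timeout_ms."""

    @abstractmethod
    def write(self, events: List[Event]) -> None:
        """Queue events for output; the implementation converts timestamps to
        inter-spike intervals if the hardware requires it."""

    @abstractmethod
    def set_filter(self, lo: int, hi: int) -> None:
        """Monitor only addresses in the range [lo, hi]."""

    @abstractmethod
    def get_filter(self) -> Tuple[int, int]:
        """Every 'set' has a corresponding 'get'."""

    @abstractmethod
    def timestamp_origin(self) -> float:
        """Wall-clock time (seconds since the epoch) corresponding to hardware
        timestamp zero, for correlating AE data with other measurements."""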

14.4.6   Network Transport of AE Streams

For integration with other software components, some means of transferring AE streams across a network is useful. It is particularly useful to transport raw streams of AEs by UDP (Postel 1980) or TCP (Braden 1989; Postel 1981). UDP is not a reliable protocol in the technical sense, meaning that data may be lost or suffer errors or duplication. However, such occurrences are rare on modern networks and UDP is preferred over TCP for real-time applications requiring low latency and overhead. Also, UDP is connectionless, so transmitters and receivers can appear and disappear in any order.
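
As a sketch of what such a transport might look like in Python, the fragment below sends and receives raw AEs as UDP datagrams; the port number and the packet format (a flat sequence of 32-bit address/timestamp words in network byte order) are arbitrary choices for illustration, not an established AE network format.

import socket
import struct

PORT = 8991                        # arbitrary choice for illustration
MAX_PACKET = 512 * 8               # at most 512 (address, timestamp) pairs per datagram

def send_events(sock, host, events):
    # Pack (address, timestamp) pairs as 32-bit words and send one datagram.
    payload = b"".join(struct.pack("!II", a, t) for a, t in events)
    sock.sendto(payload, (host, PORT))

def receive_events(sock):
    # Receive one datagram and unpack it back into (address, timestamp) pairs.
    payload, _sender = sock.recvfrom(MAX_PACKET)
    return [struct.unpack_from("!II", payload, 8 * i) for i in range(len(payload) // 8)]

# Sender:    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#            send_events(sock, "192.168.1.42", [(17, 1000), (42, 1250)])
# Receiver:  sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#            sock.bind(("", PORT)); events = receive_events(sock)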

For an introduction to writing programs that communicate across a network using UDP or TCP, see for example, Stevens et al. (2003).

14.5   Mapping Software

As explained in Section 14.4.1, mappers are usually constructed using FPGAs rather than CPUs to perform the mapping function, so the ‘software’ (so far as the term is applicable at all) that actually performs the mapping function is typically written in VHDL, which is outside of our scope here. It is of course possible to create algorithmic mappers, that is to say mappers in which the destination addresses emitted are determined by applying some fixed arithmetical rules to the incoming source addresses, to achieve, say, some fixed pattern of fan-out or projective field. But most mappers are constructed as more general purpose, table-lookup driven mappers in which the incoming source addresses are merely looked up in a table in RAM in which the corresponding list of destination addresses is to be found. These general purpose mappers are much more flexible, but a software interface is needed to program the various mappings into them. This is the software with which we are concerned in this section.

One might consider mapping software to be an extension of the category of configuration software, but there are significant differences. First, the underlying hardware is different. Configuration software as defined here is typically dealing with setting parameters and biases on mixed-signal chips by manipulating digital to analog converters (DACs) or sending signals to on-chip bias generators, and is not generally directly involved in AE processing (although some chips require that parameterization information is conveyed into the chip by piggybacking on an AE stream). Mapping software is however typically directed at writing lookup tables into RAM which will be read by the actual mapper hardware. Second, the parameters and biases set by the configuration software usually remain static for the course of an experiment, whereas mappings may need to be added and removed on-line as a result of learning algorithms being applied to the AE data.

Here again, many of the same considerations apply regarding constructing a useful API as were mentioned for the AE stream handling software APIs in Section 14.4.5. A common set of fundamental operations for a mapping API is:

  • set a new mapping from a source address to a list of destination addresses, replacing an existing mapping if necessary.
  • delete a mapping for a given source address such that the arrival of that source address no longer produces any output.
  • determine the current mapping for a given source address.
  • add additional destination event addresses to an existing mapping.
  • remove a set of destination event addresses from an existing mapping.

As with the AE stream handling software APIs, where there is a ‘set’ function there should be a corresponding ‘get’ function, and scalability and reentrancy should also be considered. If mappings need to be added and removed for on-line learning, the speed with which this can be done may be important, and the code paths involved in doing so may then be considered hot paths, which merit the same use of optimization as the direct AE stream handling software.

If the mapper lookup tables are implemented with flexible, variable-length destination event lists, then considerably more sophisticated memory management will be needed than with fixed-length destination event lists: free space within the mapper’s lookup tables must be tracked, and it must be possible to add and remove individual destination event addresses to and from destination lists of arbitrary length.
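
The operations listed above can be sketched in a few lines of Python. In a real system the table would live in the mapper’s RAM and be written through driver calls; the in-memory dictionary here is only an illustration of the API semantics.

class MapperTable:
    # Software model of a table-lookup mapper: each source address maps to a
    # list of destination addresses.  Illustrative only; see the text above.

    def __init__(self):
        self._table = {}                           # source address -> destinations

    def set_mapping(self, source, destinations):
        # Set (or replace) the mapping for a source address.
        self._table[source] = list(destinations)

    def delete_mapping(self, source):
        # Arrival of this source address will no longer produce any output.
        self._table.pop(source, None)

    def get_mapping(self, source):
        # Every 'set' has a corresponding 'get'.
        return list(self._table.get(source, []))

    def add_destinations(self, source, destinations):
        self._table.setdefault(source, []).extend(destinations)

    def remove_destinations(self, source, destinations):
        remaining = [d for d in self._table.get(source, [])
                     if d not in set(destinations)]
        if remaining:
            self._table[source] = remaining
        else:
            self._table.pop(source, None)

# One-to-many fan-out: source address 48 projects to four synapse addresses.
table = MapperTable()
table.set_mapping(48, [31968, 31969, 31970])
table.add_destinations(48, [31971])
assert table.get_mapping(48) == [31968, 31969, 31970, 31971]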

14.6   Software Examples

Even single chips and small systems comprising a few neuromorphic chips require configuration. In the earliest days of neuromorphic engineering, configuration consisted of connecting AER chips by ribbon cables and carefully turning potentiometers to set bias voltages. Nowadays, much, if not all, of the configuration is digitally programmable. Chapter 11 describes on-chip bias generators, but the values of these biases must be loaded onto the chips somehow. And Chapter 13 discusses mapper hardware, but the mapper lookup tables must be loaded from somewhere. This section discusses examples of software solutions to these needs.

14.6.1   ChipDatabase – A System for Tuning Neuromorphic aVLSI Chips1

Oster (2005) described a software infrastructure for interfacing with the Rome PCI-AER board (Dante et al. 2005) and for setting up the configuration of biases on particular chips. An XML-based system called ChipDatabase is used to set biases on neuromorphic aVLSI chips: an XML file describes the chip pinout and how the pins are connected to digital/analog computer interfaces, and a MATLAB-based GUI provides an intuitive method for tuning the biases. This setup facilitates the exchange of aVLSI chips by defining a common interface, allowing remote tuning and the sharing of bias settings over the web.

The ChipDatabase project created a GUI (see example in Figure 14.1) that allowed the user to set biases from a standard MATLAB working environment, without knowing about the underlying hardware interfaces. It defined a standardized documentation of chips, adapter boards, and setups that included names instead of cryptic pin numbers, a description of bias functionalities, default voltages, and so on, all together in a flexible, easy-to-use, and extendable database format (XML). It also provided mechanisms to easily exchange the documentation and tuning settings between different researchers over a common web interface, and it used computer-controlled DAC hardware, driven from a high-level mathematical language, to allow easy and complete characterization of chips. The same environment and data acquisition systems could be used for the remote tuning of chips when they were exchanged between different laboratories.

All these capabilities come at a price, of course. A lot of information has to be entered before the GUI for a particular chip can be used. However, it is also necessary and useful that every chip has a common standard description. ChipDatabase also distinguished between the definition of the chip itself and the test setup consisting of, for example, the boards on which the chips are mounted, which are normally built by different developers.


Figure 14.1   Example bias group window in the ChipDatabase Graphical User Interface (GUI). For each bias in a group, it contains the bias name and a slider to graphically set the bias value. The N or P marker determines the direction in which the voltage value decreases. The text field shows the value that is currently set. If the off button is checked, the bias is set to the ‘off’ value. When the button is unchecked, the voltage that was active when switching off is restored. The rightmost button is a push button to set the current voltage to the predefined default voltage from the chip class definition. The working settings can be saved to and restored from a file. From Oster (2004). Reproduced with permission of Matthias Oster

The ChipDatabase setup was used in several projects (CAVIAR, ALAVLSI, and other academic projects). The CAVIAR and ALAVLSI systems used the DAC boards developed in the CAVIAR project as the underlying DAC hardware, whereas some other projects used dedicated boards with additional functionality.

As a good example of the software design principles of information hiding and encapsulation, the functionality of the different DAC interface boards used in various projects is hidden from the database code by different hardware ‘drivers’. These drivers also encapsulate the low-level communication functions that are dependent on the operating system (OS), for example, to access different OS-specific code on Windows and Linux machines. The GUI code simply calls one of four commands which have to be supplied by the drivers:

setchannel (dacboardid, channel, value, type)     % set the value of one channel of the given sub-device type on a DAC board
value = getchannel (dacboardid, channel, type)    % read back the current value of one channel
setchannels (dacboardid, values, type)            % set several channels of the given type in a single call
values = getchannels (dacboardid, type)           % read back the values of the channels of the given type

The use of device descriptors, sub-device types, and channel numbers is also a good step in the direction of the adoption of existing interfaces, in that it was designed to be compatible with the interface defined by the comedi project (Hess and Abbott 2012), a project which provides drivers for many data acquisition cards. This would make it easy to provide a generic interface to the standardized comedi functions and thus make the ChipDatabase software usable with any of the data acquisition cards supported by comedi. Adopting existing standards and interfaces rather than taking a not-invented-here approach (i.e., reinventing the wheel) can not only save initial development time, but also make future, perhaps originally unforeseen, integration with other software much easier.


Figure 14.2   Example jAER windows showing the synchronized playback of the AE output from a retina chip and a cochlea chip. (a) User-friendly control of DVS silicon retina biases. (b) DVS output rendered as a 2D histogram of events over the last 20 ms. (c) Output from an AER-EAR silicon cochlea rendered as a spike rastergram. As one of the people seen in the DVS view claps their hands together, bursts of cochlea spikes are produced

14.6.2   Spike Toolbox2

One example of software that creates AE streams for injection into AE systems, and also monitors the AE streams produced by such systems, is the Spike Toolbox (Muir 2008). This is a custom MATLAB toolbox for the off-line generation, manipulation, and analysis of digital spike trains. It allows arbitrary spike trains to be easily generated with control over temporal structure and allows the trains to be manipulated as opaque objects.

The toolbox also has links for stimulating external spike-based communications devices, using either the Rome PCI-AER hardware (Section 13.2.2) or a so-called ‘spike server’. The toolbox can monitor spikes from devices such as spiking retinas directly from MATLAB and can be configured for arbitrary hardware addressing schemes.

14.6.3   jAER

In contrast to previous software packages, the Java-based software project jAER (usually pronounced ‘jay-er’) is focused on real-time processing of AER sensor output (Delbrück 2008; jAER 2007). Figure 14.2 shows jAER rendering the output from two chips simultaneously. Other systems integrated in jAER include silicon retinas from several groups, convolution chips, AER monitor and sequencer boards developed in the CAVIAR project, silicon cochleas as described in Chapter 4, servo motor controllers, and several special-purpose optical sensors. jAER has been used for chip testing, algorithm development, and the creation of entire robots such as the robotic goalie (Delbrück and Lichtsteiner 2007) and the DVS-based pencil balancer (Conradt et al. 2009).

jAER applications are mostly written in Java, currently the most popular programming language (TIOBE 2013). jAER allows plugging in one or more AER devices with USB interfaces, and then viewing the events coming from the devices, logging them to disk, and playing them back. Network transport of events via TCP or UDP is also supported and has been used, for example, to interface 10 DVS silicon retinas in a permanent installation in a railway station in Switzerland, where the event streams from several DVS retinas are fused together to form a single very wide view of a railway station passenger underpass (Derrer et al. n.d.).

UDP control of jAER allows users to control experiments from a dynamic programming environment such as MATLAB. A variety of pre-built ‘filters’ allow for reducing noise, extracting low-level features, tracking objects, and controlling servo motors. Generally an application in jAER is written as a pipeline of previously developed filters nested inside a custom filter. Java introspection mechanisms are used to automatically build GUI control panels that allow control and persistent storage of parameters.

jAER also supports chips with the programmable bias current generators discussed in Chapter 11 to provide persistent GUI control of chip biases (see Section 11.5). jAER also serves as the repository for full-design kits for on-chip bias generators, with sample schematics, chip layout, board design, firmware, and host-side software.

jAER’s internal model of processing based on temporally-ordered packets of time-stamped event objects has prompted a new way of thinking about how to perform computer vision and audition tasks on the basis of applying iterative algorithms to these packets. These algorithms are discussed in Chapter 15.

14.6.4   Python and PyNN

The Python-based PyNN (pronounced the same as ‘pine’) simulator-independent framework for building neuronal network models (Davison et al. 2008) has in recent years practically revolutionized the use of software in modeling neural networks and helped bring the Python programming language (Python 2012) to prominence in the field (Gewaltig et al. 2009). Working in Python brings the advantage of giving the programmer access to a vast collection of libraries that have been developed in other fields for scientific computing and plotting, amongst other things. It is platform independent and easy to extend using other programming languages.

Python for Neural Networks

PyNN was developed with the intention of providing a common high-level API for multiple neural network simulators (e.g., NEURON, Hines and Carnevale 2003; NEST, Diesmann and Gewaltig 2002; PCSIM, Pecevski et al. 2009; and Brian, Goodman and Brette 2008). This allows a network model to be written once and then be run on any of the supported simulators. PyNN not only supports modeling networks at the level of populations of neurons, layers, columns, and the connections between them, but also supports dealing with individual neurons and synapses. It provides a set of simulator-independent models of neurons, synapses, and synaptic plasticity, and various connectivity algorithms, while still allowing connectivity to be specified by the user.

Given suitable back-ends, PyNN can also be used to interface to neuromorphic hardware, as has been done as part of the FACETS project (Brüderle et al. 2009, 2011) and in the SpiNNaker project (Galluppi et al. 2012). This gives the advantage that modelers can move their models directly from the simulator of their choice onto neuromorphic hardware without having to learn about the details of the hardware implementation.
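
For illustration, a minimal network in the PyNN 0.8-style API is sketched below; exact class and argument names vary between PyNN versions and back-ends, so this should be read as a sketch rather than a reference. The point is that only the import line ties the script to a particular simulator or hardware back-end.

import pyNN.nest as sim            # swapping this import retargets the model,
                                   # e.g. to pyNN.neuron or a hardware back-end

sim.setup(timestep=0.1)            # ms

# 100 Poisson spike sources driving 50 conductance-based integrate-and-fire neurons.
source = sim.Population(100, sim.SpikeSourcePoisson(rate=20.0))
target = sim.Population(50, sim.IF_cond_exp())

sim.Projection(source, target,
               sim.FixedProbabilityConnector(0.1),
               synapse_type=sim.StaticSynapse(weight=0.005, delay=1.0))

target.record('spikes')
sim.run(1000.0)                    # ms
data = target.get_data()           # simulator-independent result container
sim.end()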

pyNCS and pyTune3

An alternative approach, also based on Python, has been implemented by Sheik et al. (2011) to simplify the configuration of multichip neuromorphic VLSI systems, and automate the mapping of neural network model parameters to neuromorphic circuit bias values.

Sheik et al. (2011) proposed a modular framework for the tuning of parameters on multichip neuromorphic systems. On the one hand, the modularity of the framework allows the definition of a wide range of models (network, neural, synapse, circuit) that can be used in the parameter translation routines; on the other hand, the framework does not require detailed knowledge of the hardware/circuit properties, and can optimize the search and evaluate the effectiveness of the parameter translations by measuring experimentally the behavior of the hardware neural network. This framework was implemented using Python, and makes use of its object-oriented features.

The framework consists of two software modules: pyNCS (Stefanini et al. 2014) and pyTune. The pyNCS toolset allows the user to interface the hardware to a workstation, to access and modify the VLSI chip bias settings, and to define the functional circuit blocks of the hardware system as abstract software modules. The abstracted components represent computational neuroscience relevant entities (e.g., synapses, neurons, populations of neurons, etc.), which do not depend directly on the chip’s specific circuit details and provide a framework that is independent of the hardware used. The pyTune toolset allows users to define abstract high-level parameters of these computational neuroscience relevant entities, as functions of other high- or low-level parameters (such as circuit bias settings). This toolset can then be used to automatically calibrate the properties of the corresponding hardware components (neurons, synapses, conductances, etc.), or to determine the optimal set of high- and low-level parameters that minimize arbitrarily defined cost-functions.

Using this framework, neuromorphic hardware systems can be automatically configured to reach a desired configuration or state, and parameters can be tuned to maintain the system in the optimal state.

The pyNCS Toolset

At the lowest level, dedicated drivers are required to interface custom neuromorphic chips to computers. Although custom drivers must be developed for each specific piece of hardware, they can be cast as Python modules and integrated as plug-ins in the pyNCS toolset. Once the drivers are implemented, pyNCS creates an abstraction layer to simplify the configuration of the hardware and its integration with other software modules. The experimental setup is then defined using information provided by the designer on the circuit functional blocks, their configuration biases, and the chip’s analog and digital input and output channels. The setup, the circuits, and their biases are encapsulated into abstract components controllable via a GUI or an API.

Experiments (equivalent to software simulation runs) can be defined, set-up, and carried out, using methods and commands analogous to those present in software neural simulators such as Brian (Goodman and Brette 2008) or PCSIM (Pecevski et al. 2009).

pyNCS uses a client-server architecture, thereby allowing multiclient support, load sharing, and remote access to the multichip setups. Thanks to this client-server architecture, multiple clients can control the hardware remotely, regardless of the OS used.

The pyTune Toolset

The pyTune toolset is a Python module which automatically calibrates user-defined, high-level parameters and optimizes user-defined cost-functions. The parameters are defined using a dependency tree that specifies lower-level sub-parameters in a recursive hierarchical way. This hierarchical scheme allows the definition of complex parameters and related cost-functions. For example, synaptic efficacies in neural network models can be related to the bias voltages which control the gain of synaptic circuits in neuromorphic chips. Using the pyTune toolset it is possible to automatically search a space of bias voltages and set a desired synaptic efficacy by measuring the neuron’s response properties from the chip.

This automated parameter search can be applied to more complex scenarios to optimize high-level parameters related to network properties. For example, the user can specify the mapping between low-level parameters and the gain of a winner-take-all network (Yuille and Geiger 2003), or the error of a learning algorithm (Hertz et al. 1991).

The pyTune toolset relies on the translation of the problem into parameter dependencies. The user defines each parameter by its measurement routine (getValue function) and its sub-parameter dependencies. At the lowest level, the parameters are defined only by their interaction with the hardware, that is, they represent circuit biases. The user can choose a minimization algorithm from those available in the package or can define custom methods to do the optimization that sets the parameters’ values. Optionally, one can also define a specific cost function that needs to be minimized. By default, the cost function is computed as (p − p_desired)^2, where p is the current measured value of the parameter and p_desired is the desired value. Explicit options (such as the maximum tolerance for the desired value, the maximum number of iteration steps, etc.) can also be passed as arguments to the optimization function.

Finally, the sub-parameters’ methods are mapped by the appropriate plug-in onto the corresponding driver calls in the case of a hardware system, or onto method calls and variables in the case of a system simulated in software. Each mapping specific to a system has to be separately implemented and included in pyTune as a plug-in.

Because it relies on measuring outputs from the chips, the pyTune toolset can be used to adjust the corresponding parameters and obtain the desired neural properties irrespective of temperature effects, mismatch between different instances of the chips, and other sources of heterogeneity.
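
The recursive structure described above can be illustrated with a short sketch. This is not the actual pyTune API: the class names, the naive candidate search, and the placeholder hardware calls (set_bias, measure_firing_rate) are all invented for illustration, and only the default quadratic cost function is taken from the description above.

class Bias:
    # Lowest-level parameter: a circuit bias, defined only by its interaction
    # with the hardware through a (hypothetical) driver call.
    def __init__(self, name, set_bias):
        self.name = name
        self.set = set_bias            # e.g. a wrapper around a driver call

class Parameter:
    # Illustrative parameter node: a measurement routine plus the
    # sub-parameters it depends on, tuned against a quadratic cost.
    def __init__(self, name, get_value, sub_parameters):
        self.name = name
        self.get_value = get_value     # measurement routine, cf. getValue
        self.sub_parameters = list(sub_parameters)

    def cost(self, desired):
        return (self.get_value() - desired) ** 2    # default cost: (p - p_desired)^2

    def tune(self, desired, candidates, tolerance=1e-3):
        # Naive search over candidate settings of the first sub-parameter; a
        # real toolset would use a proper minimization algorithm instead.
        best_cost, best_value = None, None
        for value in candidates:
            self.sub_parameters[0].set(value)
            c = self.cost(desired)
            if best_cost is None or c < best_cost:
                best_cost, best_value = c, value
            if c <= tolerance:
                break
        self.sub_parameters[0].set(best_value)
        return best_value, best_cost

# Hypothetical usage: tune a synaptic weight bias until the measured firing
# rate of a driven neuron population reaches 50 Hz.
# weight_bias = Bias('vw_exc', set_bias=lambda v: set_bias('vw_exc', v))
# rate = Parameter('firing_rate', measure_firing_rate, [weight_bias])
# rate.tune(desired=50.0, candidates=[0.1 + 0.01 * i for i in range(60)])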

Modularity and Integration with other Python Tools

Thanks to the modularity of pyNCS and pyTune, they are in principle completely compatible with other existing Python tools such as PyNN, and can be considered as an additional useful tool that can be included in the increasing number of Python applications developed for the neuroscience and neuromorphic engineering community.

14.7   Discussion

In this chapter we have been principally concerned with software used in relatively small systems. In the domains of chip and system description software, mapping software, and even in the challenging area of AE stream handling software, the necessary techniques and technologies are essentially known and available. To some extent, creating software in these areas is a matter of following best practices in database design, API design, buffering, and optimization. In medium and large scale systems, such as those described in Chapter 16, there are significant scalability challenges, particularly, as touched upon in Section 14.3 and Section 14.6.4, in the realm of configuration software, and in placement and routing software, as mentioned in Section 14.1. However, the kind of software which has been described in the present chapter remains an indispensable ingredient of larger systems too. And it formed part of the inspiration for the algorithmic processing of AE event streams described in Chapter 15.

References

Braden R. 1989. Requirements for internet hosts – communication layers. RFC 1122, RFC Editor. http://www.rfc-editor.org/rfc/rfc1122.txt (accessed August 6, 2014).

Bradley N. 2002. The XML Companion. 3rd edn. Addison-Wesley.

Brüderle D, Müller E, Davison A, Muller E, Schemmel J, and Meier K. 2009. Establishing a novel modeling tool: a Python-based interface for a neuromorphic hardware system. Front. Neuroinformat. 3, 17, doi:10.3389/neuro.11.017.2009.

Brüderle D, Petrovici MA, Vogginger B, Ehrlich M, Pfeil T, Millner S, Grübl A, Wendt K, Müller E, Schwartz MO, de Oliveira D, Jeltsch S, Fieres J, Schilling M, Müller P, Breitwieser O, Petkov V, Muller L, Davison A, Krishnamurthy P, Kremkow J, Lundqvist M, Muller E, Partzsch J, Scholze S, Zühl L, Mayr C, Destexhe A, Diesmann M, Potjans T, Lansner A, Schüffny R, Schemmel J, and Meier K. 2011. A comprehensive workflow for general-purpose neural modeling with highly configurable neuromorphic hardware systems. Biol. Cybern. 104(4), 263–296.

Cap. n.d. Capo Caccia Cognitive Neuromorphic Engineering Workshop. http://capocaccia.ethz.ch/ (accessed August 6, 2014).

CAVIAR. 2002. CAVIAR Project, http://www.imse-cnm.csic.es/caviar/ (accessed August 6, 2014).

Chicca E, Whatley AM, Dante V, Lichtsteiner P, Delbrück T, Del Giudice P, Douglas RJ, and Indiveri G. 2007. A multi-chip pulse-based neuromorphic infrastructure and its application to a model of orientation selectivity. IEEE Trans. Circuits Syst. I 54(5), 981–993.

Conradt J, Cook M, Berner R, Lichtsteiner P, Douglas RJ, and Delbruck T. 2009. A pencil balancing robot using a pair of AER dynamic vision sensors. Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), pp. 781–784.

Dante V, Del Giudice P, and Whatley AM. 2005. Hardware and software for interfacing to address-event based neuromorphic systems. The Neuromorphic Engineer 2(1), 5–6.

Davison AP, Brüderle D, Eppler JM, Kremkow J, Muller E, Pecevski DA, Perrinet L, and Yger P. 2008. PyNN: a common interface for neuronal network simulators. Front. Neuroinformat. 2, 11. doi:10.3389/neuro.11.011.2008.

Delbrück T. 2008. Frame-free dynamic digital vision. Proceedings of International Symposium on Secure-Life Electronics, Advanced Electronics for Quality Life and Society, University of Tokyo, March 6–7. pp. 21–26.

Delbrück T and Lichtsteiner P. 2007. Fast sensory motor control based on event-based hybrid neuromorphic-procedural system. Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), pp. 845–848.

Derrer R, Jauslin S, Vehovar M, Suisseplan Ingenieure AG, Vehovar & Jauslin Architektur AG, Keller S, Gössling L, Delbruck T, Brändli C, Steinweber P, Schilling M, Santana E, and Schnick-Schnack-Systems GmbH. n.d. Atelier Derrer | Gravity – Bahnhof Aarau, http://www.lightlife.de/gravity-bahnhof-aarau/ (accessed August 6, 2014).

Diesmann M and Gewaltig M. 2002. NEST: an environment for neural systems simulations. In: Forschung und wissenschaftliches Rechnen, Beiträge zum Heinz-Billing-Preis 2001 (eds Plesser T and Macho V), vol. 58. Gesellschaft für wissenschaftliche Datenverarbeitung. pp. 43–70.

Drepper U. 2007. What every programmer should know about memory. Technical report, Red Hat Inc, http://people.redhat.com/drepper/cpumemory.pdf (accessed August 6, 2014).

Ehrlich M, Wendt K, Zühl L, Schüffny R, Brüderle D, Müller E, and Vogginger B. 2010. A software framework for mapping neural networks to a wafer-scale neuromorphic hardware system. Proc. Artificial Neural Netw. Intell. Inf. Process. Conf. (ANNIIP), pp. 43–52.

Galluppi F, Davies S, Rast A, Sharp T, Plana LA, and Furber S. 2012. A hierachical configuration system for a massively parallel neural hardware platform. Proceedings of the 9th Conference on Computing Frontiers (CF ’12), pp. 183–192.

Gewaltig MO, Hines M, Kötter R, Diesmann M, Davison AP, Muller E, and Bednar JA. 2009. Python in neuroscience, http://www.frontiersin.org/Neuroinformatics/researchtopics/Python_in_neuroscience/8 (accessed August 6, 2014).

Gleeson P, Crook S, Cannon RC, Hines ML, Billings GO, Farinella M, Morse TM, Davison AP, Ray S, Bhalla US, Barnes SR, Dimitrova YD, and Silver RA. 2010. NeuroML: a language for describing data driven models of neurons and networks with a high degree of biological detail. PLoS Comput. Biol. 6(6), e1000815.

Goodman DF and Brette R. 2008. Brian: a simulator for spiking neural networks in Python. Front. Neuroinformat. 2, 5. doi:10.3389/neuro.11.005.2008.

Hertz J, Krogh A, and Palmer RG. 1991. Introduction to the Theory of Neural Computation. Addison Wesley, Reading, MA.

Hess FM and Abbott I. 2012. Comedi: linux control and measurement device interface, http://www.comedi.org/ (accessed August 6, 2014).

Hines ML and Carnevale NT. 2003. The NEURON simulation environment. In: The HandBook of Brain Theory and Neural Networks (ed. Arbib MA), 2nd edn. MIT Press, Cambridge, MA. pp. 769–773.

Ins. n.d. Institute of Neuromorphic Engineering, http://www.ine-web.org/ (accessed August 6, 2014).

jAER. 2007. jAER Open Source Project, http://jaerproject.org (accessed August 6, 2014).

Knuth DE. 1974. Computer programming as an art. Commun. ACM 17(12), 667–673.

Muir D. 2008. Spike toolbox for matlab, http://spike-toolbox.ini.uzh.ch/ (accessed August 6, 2014).

Neftci E, Chicca E, Indiveri G, and Douglas R. 2011. A systematic method for configuring VLSI networks of spiking neurons. Neural Comput. 23(10), 2457–2497.

Oster M. 2004. ChipDatabase – a system for tuning neuromorphic aVLSI chips, http://www.ini.ethz.ch/~mao/ChipDatabase/ChipDatabase.pdf (accessed August 6, 2014).

Oster M. 2005. Tuning aVLSI chips with a mouse click. The Neuromorphic Engineer 2(1), 9.

Oster M, Whatley AM, Liu SC, and Douglas RJ. 2005. A hardware/software framework for real-time spiking systems. Springer Lecture Notes in Computer Science (ed. Duch W, Kacprzyk J, Oja E and Zadroznyet S), vol. 3696. Springer GmbH, Heidelberg. pp. 161–166.

Pecevski D, Natschläger T, and Schuch K. 2009. PCSIM: a parallel simulation environment for neural circuits fully integrated with Python. Front. Neuroinformat. 3, 11. doi:10.3389/neuro.11.011.2009.

Postel J. 1980. User datagram protocol. RFC 768, RFC Editor, http://www.rfc-editor.org/rfc/rfc768.txt (accessed August 6, 2014).

Postel J. 1981. Transmission control protocol. RFC 793, RFC Editor, http://www.rfc-editor.org/rfc/rfc793.txt (accessed August 6, 2014).

Python. 2012. Python programming language – official website, http://www.python.org/ (accessed August 6, 2014). Python Software Foundation.

Sheik S, Stefanini F, Neftci E, Chicca E, and Indiveri G. 2011. Systematic configuration and automatic tuning of neuromorphic systems. Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), pp. 873–876.

Stefanini F, Neftci EO, Sheik S, and Indiveri G. 2014. PyNCS: a microkernel for high-level definition and configuration of neuromorphic electronic systems. Front. Neuroinformat. 8(73), 1–14.

Stevens WR, Fenner B, and Rudoff AM. 2003. Unix Network Programming: The Sockets Networking API, vol. 1, 3rd edn. Addison-Wesley, Reading, MA.

TIOBE. 2013. TIOBE programming community index, http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html (accessed August 6, 2014).

Yuille AL and Geiger D. 2003. Winner-take-all networks. In: The Handbook of Brain Theory and Neural Networks, 2nd edn. MIT Press, Cambridge, MA. pp. 1228–1231.

__________

1The text in this section is chiefly adapted from Oster (2004). Reproduced with permission of Matthias Oster.

2The text in this section is taken from Muir (2008). Reproduced with permission of Dylan Muir.

3The text in this section is © 2011 IEEE. Reprinted, with permission, from Sheik et al. (2011).
