Chapter 5
A Sample App

This chapter presents general architectural aspects of the sample HappyWalk Android HiTLCPS application and, in doing so, some of its underlying concepts and technologies, from a practical perspective. We encourage avid readers to further explore the presented technologies and solutions by complementing this presentation with material from books focusing on each of the addressed topics.

HappyWalk's base architecture will be presented, comprising an Android client application and a server-side application. Also, the technologies that will be used for the application development are briefly identified. Subsequently, the main classes that constitute this sample app will be listed and succinctly explained, both for the client side and the server side. Finally, the architectural options concerning emotion awareness will be presented and justified, and some initial implementation aspects will also be discussed.

5.1 A Sample Behavior Change Intervention App

As previously mentioned in Section 4.1.2, BCIs are therapeutic systems that focus on providing advice, support and relevant information to patients. Traditionally associated with presential therapeutic consultations, BCIs have recently begun to be delivered through the Internet and smartphones [11]. Using smartphone sensors to monitor humans with BCIs not only provides more effective feedback to help users in adapting or controlling some aspects of their behavior but also helps behavioral scientists' research. The journey towards our first HiTL system begins with a simple Android application, named HappyWalk, which we will modify throughout the book to introduce several HiTL capabilities, turning it into a full BCI system. HappyWalk will be an HiTL BCI application that attempts to positively influence its user's mood through moderate physical exercise.

In fact, recent research work has found evidence that moderate walking exercise and a change of environments can contribute to the improvement of mental health, providing several cognitive benefits such as improved memory, attention, and mood [133, 134]. Other studies suggest that contact with natural environments not only makes people feel better but also makes them behave better, thus presenting both personal health benefits and broader social benefits [135].

Thus, in this part of the book we will develop and extend an Android app that promotes walking with the aim of improving the user's mood. Our objective is to use the smartphone's sensors and a machine learning algorithm to trigger positive feedback notifications that suggest walking exercise whenever the data from the smartphone's sensors indicates a negative state of mind. Collaborative data gathering is also employed to show heatmaps representing the near real-time context of nearby points of interest (POIs) that might be of interest to visit. Thus, HappyWalk will employ a full closed-loop HiTL control, as shown in Figure 5.1.

Scheme for HappyWalk HiTL control

Figure 5.1 HappyWalk HiTL control.

5.2 The Sample App's Base Architecture

Since the objective of this book is not to teach Android programming, we will not be developing HappyWalk from scratch. Instead, we will be enhancing an existing base app, capable of showing POIs on a map, with HiTL control. A high-level architecture of HappyWalk can be seen in Figure 5.2, which shows that the basis of the system is composed of an Android client application and a server-side application.

Illustration of HappyWalk's architecture.

Figure 5.2 HappyWalk's architecture.

5.2.1 The Android App

The focus of this chapter is on HappyWalk's Android App, responsible for the interaction with the end user. It displays a map where the relevant POIs are shown, as well as menus that show information about them.

Android is an ecosystem supported by the Open Handset Alliance made up of devices and, primarily, an open-source operating system (OS) designed for smartphones, tablets, and other embedded devices. While Android applications are written in the Java programming language,1 they are not run within a traditional Java virtual machine. Android has its own runtime (Android Runtime and, in older devices, its predecessor Dalvik) and performs its own management of the application's life cycle. The end user is not concerned about which apps are running. Android is optimized for low-power, low-memory devices, and is capable of closing and opening processes as the device's capabilities dictate. This directly translates into a programming model in which the developer must always be aware of the possibility of the app suddenly being shut down because of processing or memory constraints.

For the didactic purposes of our book, we would like to ensure that our application is compatible with as many versions of Android as possible. Thus, as our minimum Android SDK we chose API 10, corresponding to Android 2.3.3 (Gingerbread). This ensures compatibility with 99.5% of Android handsets, according to Google's statistics.2 This tutorial uses Android API level 21 as the compile SDK; since Android is an ever-evolving development environment, the reader might be tempted to use a more recent compilation API. However, to ensure that the tutorial can be smoothly followed, we ask the reader to refrain from doing so, since it would certainly imply the adaptation of various parts of the code.

Additionally, during this part of the book we strongly recommend using a real device to develop the application, as virtual devices cannot be used to debug the usage of sensors, such as microphone and accelerometer.

Most Android apps are constructed by linking several building blocks known as Activities. In essence, an activity can be described as a “thing” that the user can do.3 This implies that, for the most part, an activity has a way of interacting with the user (typically, a graphical user interface). The management of an Android application is inherently tied to the lifecycle of its activities. During their lifecycle, Android activities can be in one of several states, as shown in Figure 5.3, which displays Android's Activity Lifecycle. Activity states are managed by the OS itself; the developer has the responsibility of handling the transitions between each state. This is done through special method calls, which should be overridden whenever necessary.

Illustration of Android's activity lifecycle.

Figure 5.3 Android's activity lifecycle.

The entire lifetime of an activity occurs between the first call to onCreate() and a single final call to onDestroy():

  • onCreate() is called when the activity begins. It is typically used to perform initialization tasks and the global setup of the activity.
  • onRestart() runs when a previously stopped activity restarts and is always followed by onStart().
  • onStart() marks the beginning of the visible lifetime of an activity. It is called to indicate that an activity is about to be displayed and can be used to maintain the necessary visual resources.
  • onResume() is called when an activity is about to become ready for user interaction. It is important to note that this method may be called multiple times, since an activity may frequently waver between the resumed and paused states. Thus, this is a good place to update visual elements or start animations and music.
  • onPause() is a very important method and is called whenever an activity is about to go into the background. Its importance stems from the fact that this is the last non-killable method in older Android versions (pre-3.0); that is, after this method returns, the process hosting the activity may be killed by Android at any time, due to memory or processing constraints. This is important to us, since we will be targeting Android 2.3.3 (Gingerbread). Therefore, onPause() should be used to write any persistent data to storage. Additionally, this method is also typically used to stop animations, music, etc.
  • onStop() is called when the activity is no longer visible to the user. Starting with Android 3.0 (Honeycomb), an application is not in the killable state until this method has returned. The method can be followed by either onRestart() (activity goes back to user interaction) or onDestroy().
  • onDestroy() marks the end of the activity's lifecycle, called right before it is destroyed. This is triggered by specific methods (e.g. finish()) or because the system is destroying the activity to save space.
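As an illustration, the callbacks above are typically overridden in an Activity skeleton like the following. This is a minimal sketch: the class name, log tag, and log messages are ours, not part of HappyWalk.

```java
import android.app.Activity;
import android.os.Bundle;
import android.util.Log;

// Minimal sketch of the lifecycle callbacks discussed above.
// LifecycleDemoActivity is a hypothetical name, not a HappyWalk class.
public class LifecycleDemoActivity extends Activity {
    private static final String TAG = "LifecycleDemo";

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        Log.d(TAG, "onCreate: global setup, inflate layout, etc.");
    }

    @Override
    protected void onResume() {
        super.onResume();
        Log.d(TAG, "onResume: update visual elements, start animations/music");
    }

    @Override
    protected void onPause() {
        // Last non-killable callback before Android 3.0:
        // persist any important data here.
        Log.d(TAG, "onPause: write persistent data to storage");
        super.onPause();
    }

    @Override
    protected void onDestroy() {
        Log.d(TAG, "onDestroy: release remaining resources");
        super.onDestroy();
    }
}
```

Note how the persistence work is placed in onPause(), matching the pre-3.0 killability rule discussed above.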

An overview of HappyWalk's class structure is shown in Figure 5.4, and the relationship between its main classes is detailed in Figure 5.5.

Illustration of HappyWalk's Android class structure.

Figure 5.4 HappyWalk's Android class structure.

Illustration of HappyWalk Android app's main classes.

Figure 5.5 An overview of HappyWalk Android app's main classes.

HappyWalk's primary class is named MapsActivity, which encompasses the handling of the application's main activity: the one containing the map. HappyWalk uses the Google Maps Android API v2 to show its map and POIs.4 The MapsActivity class is responsible for managing the Google Maps object and the user position marker. The clustering of individual marker POIs on larger icons that occurs when the user zooms out is performed by the GeoClusterer class, in conjunction with the GeoCluster and GeoItem helper classes.

Whenever the user taps on a POI marker, the POIDescription activity is called. This activity is responsible for displaying information regarding the selected POI, such as its name, address, description, coordinates, and an illustrative image.

Another essential component of HappyWalk is its background service, which provides various functionalities to other classes. It runs in the background on a separate thread, showing a notification while it is running, and also provides a handler object that can be used to perform background tasks. The background service supports the HwLocationListener class, which is responsible for acquiring and managing the user's location.

All of these functions are supported by helper Thread classes, which perform various tasks in parallel with the main application (i.e. on their own threads). Namely, ThreadGetPoi and ThreadGetDetailPoi handle the fetching of POI information.

Finally, the utilities package contains several useful classes that are used throughout the application. CalcDistance is used to calculate the distance between two points from their latitude and longitude coordinates. The CommunicationClass contains several methods that facilitate communicating with the server. Lastly, the GlobalVariables class is used to store values that are used by most other classes. This class allows us to easily tune various aspects of the application (server URL, when to update POIs, etc.).
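A distance-between-coordinates helper in the spirit of CalcDistance can be sketched with the haversine great-circle formula. This is an illustrative, self-contained version; the class and method names are ours, and HappyWalk's actual implementation may differ:

```java
public final class CalcDistanceSketch {
    private static final double EARTH_RADIUS_M = 6371000.0; // mean Earth radius

    // Haversine distance, in meters, between two latitude/longitude points.
    public static double between(double lat1, double lon1,
                                 double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }
}
```

As a sanity check, one degree of longitude at the equator comes out at roughly 111 km, which is a useful reference when reasoning about POI search radii.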

5.2.2 The Server

HappyWalk's server-side application is responsible for the provision, management, and fetching of POI information. It is a Java EE web application implemented through the Java Servlet API5 that runs on the Apache Tomcat™ 7 open-source web server.6 It communicates through Representational State Transfer (RESTful) web services, the communication style of the World Wide Web. This form of communication typically occurs over HTTP and uses the usual HTTP verbs (GET, POST, PUT, DELETE, etc.). Services are identified through a URI, and data is encapsulated within an Internet media type, which in our case is JSON. These RESTful web services and the JSON encapsulation are supported by the Jersey Java library.7

HappyWalk's POI information is retrieved from the well-known Foursquare® database.8 Foursquare® is a location-based mobile social network that takes into consideration the position of its users to provide suggestions of places to visit. It allows users to discover places that fit their interests based on the advice of other users they trust. Foursquare®'s POI database is very comprehensive and available free of charge through a web API with a limit of 5000 requests per hour, which is more than enough for our educational purposes. To facilitate the integration with Java, our server uses the foursquare-api-java library.9

The server also communicates with a PostgreSQL database,10 where the records of emotions and POI locations are kept, through Hibernate11. PostgreSQL is an open-source object-relational database system that has earned a reputation for reliability and correctness. On the other hand, Hibernate is an Object/Relational Mapping framework concerned with data persistence in relational databases. Hibernate allows us to cleanly map Java objects with PostgreSQL tables, greatly simplifying HappyWalk's database management.
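To give an idea of what such a mapping looks like, here is a hypothetical JPA-style annotated entity. The table name, column names, and fields are illustrative assumptions and do not necessarily match HappyWalk's actual HibernateMaps classes:

```java
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical sketch of how Hibernate maps a Java class to a
// PostgreSQL table; HappyWalk's real mapping classes may differ.
@Entity
@Table(name = "poi")
public class PoiEntity {
    @Id
    @Column(name = "id")
    private Long id;

    @Column(name = "name")
    private String name;

    @Column(name = "latitude")
    private double latitude;

    @Column(name = "longitude")
    private double longitude;

    // Getters and setters omitted for brevity.
}
```

With a mapping like this, Hibernate translates queries and saves on PoiEntity objects into SQL against the poi table, which is what lets the server code stay free of hand-written SQL.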

Figure 5.6 presents an overview of the server's packages and classes. Each package holds a specific purpose:

  • The Model package contains classes that are representative of the JSON communication messages, providing encapsulation through the Jersey library. By convention, each class has either a “Request” or a “Response” prefix.
  • The Web package contains classes that implement the RESTful web services' interfaces. Using Jersey, two POST services were implemented: GetDetailPoi, which returns detailed information regarding a certain POI, and GetListPoi, which returns a list of POIs around a certain location.
  • The Com package contains the actual intelligence of the server. The classes within this package tend to communicate with other entities, such as the Database and Foursquare®. The GetDetailPoi and GetListPoi web service requests are processed by the classes ComGetDetailPoi and ComGetListPoi, respectively. The classes ComUpdatePoiImagesandTips and ComGetPoiFoursquare retrieve the necessary information from Foursquare®.
  • Within the DAO package rest several data access objects that facilitate communication with the database. These are, in turn, supported by the HibernateMaps classes, which provide an interface to the database's tables.
  • Finally, the Utilities package contains an ImageUtils class that processes images and a GlobalVariables class, similar to the one used in the Android app, for storing important variables that are used throughout the server.

Illustration of HappyWalkServer's main classes.

Figure 5.6 An overview of HappyWalkServer's main classes.
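To illustrate how such a service looks with Jersey, consider the following hypothetical sketch in the style of GetListPoi. The path, the message classes, and their fields are our own assumptions, not HappyWalk's actual code:

```java
import java.util.Collections;
import java.util.List;

import javax.ws.rs.Consumes;
import javax.ws.rs.POST;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

// Hypothetical Jersey resource in the style of GetListPoi.
@Path("/getListPoi")
public class GetListPoiResource {

    // Illustrative message POJOs, following the Model package's
    // "Request"/"Response" naming convention; real field names may differ.
    public static class RequestGetListPoi {
        public double latitude;
        public double longitude;
    }

    public static class ResponseGetListPoi {
        public List<String> poiIds;
    }

    @POST
    @Consumes(MediaType.APPLICATION_JSON)
    @Produces(MediaType.APPLICATION_JSON)
    public ResponseGetListPoi getListPoi(RequestGetListPoi request) {
        // In the real server, this request is delegated to ComGetListPoi,
        // which queries the database and, if needed, Foursquare.
        ResponseGetListPoi response = new ResponseGetListPoi();
        response.poiIds = Collections.emptyList();
        return response;
    }
}
```

Jersey handles the JSON serialization of the request and response objects, so the resource method only deals with plain Java objects.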

Since the focus of this book is on HiTL control, which rests mostly on the Android app, as far as the server is concerned we will focus mostly on the intelligence associated with handling emotional information. Database structure and communication are already implemented and ready for handling HiTL information. The tutorial sections will simply use the Dao and HibernateMaps helper classes to save and update emotions. We will explain how to use these classes in Section 9.1.4.

We will also detail how to fetch and deploy the server using the Eclipse Mars IDE for Java EE Developers.12 Despite the fact that the scope of this book does not include many of the server's inner workings, its source code is openly available for the inquisitive reader who might desire to tinker with it.

5.3 Enhancing the Sample App with HiTL Emotion-awareness

From this base map application, we will now take steps to introduce HiTL control. Since we will be dealing with possibly sensitive data, such as location and mood, one of the fundamental requirements of the design of our HiTL application will be to respect the privacy of users. We will consider this requirement through data anonymization, by generating a pseudo-random identifier (we will discuss this in greater detail in Section 8.3). While the app will be responsible for acquiring and processing GPS positions, as well as accelerometer and microphone data, the resulting emotion will be periodically sent in an anonymous way to the server. This allows HappyWalk to display a near real-time average of the mood at each POI, through heatmaps with different colors. This information allows users to pick areas which are either livelier (euphoric mood) or calmer (relaxed mood). All of this real-time information may provide the necessary motivation for walking and visiting places that the user feels are better suited to his/her current mood.
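The pseudo-random identifier can be as simple as a random UUID generated once on first run. The following is a minimal sketch; the class name is ours, and in the app the identifier would be generated once and kept in app-private storage:

```java
import java.util.UUID;

// Sketch of pseudo-random identifier generation for anonymization.
// A random (type 4) UUID carries no link to the device, SIM, or user
// account; generating it once and reusing it lets the server aggregate
// mood reports without knowing who sent them.
public final class AnonymousIdSketch {
    public static String generate() {
        return UUID.randomUUID().toString();
    }
}
```
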

5.3.1 Choosing a Machine Learning Technique

In HappyWalk, the core of our emotion-awareness rests on our ability to associate sensory input with certain emotions. It would be arguably possible to simply periodically ask the user how they are feeling. However, such an approach would go against the principles of “calm” computing and non-intrusiveness inherent to HiTLCPS design. Thus, HappyWalk will employ a state inference mechanism based on machine learning which will attempt to automate the detection of mood. Nevertheless, the detection of emotions is an extremely complex issue and, as such, we will still require the use of supervised learning systems which rely on direct user feedback. Even so, we will attempt to reduce the amount of required feedback whenever the state inference component is performing well enough.

As mentioned in the introduction to this part of the book, our objective with HappyWalk is not to propose/develop robust methods for mood detection, but instead to present a practical “proof of concept” that can show the reader how HiTL concepts can be applied to create a simple, smart HiTLCPS. Thus, in order to determine a good machine learning technique for our application, let us consider previous comparisons between the different possibilities. Previous research work [18], the results of which are shown in Table 5.1, has scored different classification algorithms in terms of correct classification rate and in terms of CPU time needed for the classification. The latter is of particular importance for smartphone HiTLCPSs, since these are limited in terms of available processing power and energy. Based on this study, we will be using an artificial neural network (ANN) as our mood inference tool, since it offers a reasonable classification rate while being one of the least time-consuming techniques. Other good alternatives could be C4.5 decision trees and support vector machines, but ANNs make it significantly easier to update our inference model with new data provided by the user.

Table 5.1 Machine learning approaches for sensing context in smartphones [18]. Source: Adapted from Guinness 2013

Machine learning technique Correct classification rate
Random Forest 96.5%
Support vector machines 80.2%
Naive Bayes classifiers 81.5%
Bayesian networks 90.9%
Logistic regression 83.4%
Artificial neural networks 87.2%

(The CPU time requirements rankings appear only as images in the original table; see [18] for those values.)

ANNs are machine learning techniques based on biological neurons, the cells that are the most representative of the thinking function of animal brains. As put by Haykin in his book Neural Networks: A Comprehensive Foundation [136], the brain is a highly complex, nonlinear and parallel computer that processes information in an entirely different way from the conventional digital computer. Much like biological brains, ANNs are systems of interconnected “neurons” which exchange messages with each other. They possess plasticity, that is, the ability to adapt the “strength” of the connections between neurons, through numeric weights that can be tuned based on experience. This endows ANNs with the ability to learn from training data. In Figure 5.7 we can see a typical ANN architecture. The neurons of an ANN are usually grouped in layers; the first layer receives the input, while the last layer transmits the final output. In between these layers are the hidden layers, which allow the ANN to extract higher-order information from the data, by providing additional transformations and processing. ANNs are usually considered black-box systems, in the sense that their functioning is opaque: studying an ANN's inner structure does not provide any logical insight into the function being approximated. All the processing and memory of the ANN rests within the weights of the connections between its neurons, which, by themselves, do not mean much to the ANN designer.

Scheme for artificial neural network architecture.

Figure 5.7 A typical artificial neural network architecture.

5.3.2 Implementing Emotion-awareness

An important matter to decide on how to implement our emotion-aware ANN is which sources of input should be used to teach it. In order to avoid obligating the reader to use additional hardware and perform particularly complicated integration tasks, we want to limit our choice of sensors to those already provided by the smartphone device.

Current scientific knowledge does not yet have an exhaustive picture of all the factors influencing a person's emotions [137]. Nevertheless, in the case of our sample application, we intend to consider at least three general sources of data: contextual information, environmental clues, and body movement information.

In terms of context, most smartphones are equipped with GPS, allowing us to know where the user is located. Regarding environmental clues and body movement, the accelerometer and microphone have been previously identified as effective sensors for identifying human context [97]. Thus, our application will acquire raw data from these sensors, and will process it through a simple classifier.

Our classifiers will use some concepts of “signal processing”. In particular, our accelerometer processor will use a “Fourier transformation”, a powerful signal processing technique. In fact, signals can be understood from two different perspectives. The time perspective is the way we instinctively perceive our reality: things happen and vary as time passes. However, every signal in nature can also be described as a frequency spectrum, determined by its inherent frequencies [138]. Thus, we can analyze a signal either in the time (or spatial) domain or in the frequency domain.

A Fourier transformation is a mathematical process which converts a finite signal, acquired at a certain frequency for a certain duration, into a series of coefficients, which represent a finite combination of complex sinusoid functions [139]. This combination of sinusoid functions represents that same signal in its frequency domain, as shown in Figure 5.8.13 Our accelerometer classifier will use this type of analysis by performing a fast Fourier transformation (FFT) and summing the resulting Fourier coefficients. This sum gives the neural network an idea of the amount of movement detected by the smartphone.

Illustration of Sound signal in the time domain (left side) analyzed through a Fourier transformation to show its frequency domain (right side).

Figure 5.8 Sound signal in the time domain (left side) analyzed through a Fourier transformation to show its frequency domain (right side).
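As a sketch of the idea (not HappyWalk's actual classifier code), the “sum of Fourier coefficient magnitudes” feature can be computed with a naive discrete Fourier transform. An FFT produces the same coefficients, only much faster, so a real implementation would use an FFT routine instead:

```java
public final class MovementFeatureSketch {
    // Naive DFT over a window of accelerometer magnitude samples, summing
    // the magnitudes of all non-DC coefficients. Skipping k = 0 discards
    // the constant offset (gravity), so a perfectly still phone scores ~0.
    public static double spectralSum(double[] x) {
        int n = x.length;
        double sum = 0.0;
        for (int k = 1; k < n; k++) {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = -2.0 * Math.PI * k * t / n;
                re += x[t] * Math.cos(angle);
                im += x[t] * Math.sin(angle);
            }
            sum += Math.sqrt(re * re + im * im);
        }
        return sum;
    }
}
```

A constant input window yields a sum near zero, while an oscillating window (i.e. movement) yields a clearly larger value, which is exactly the property the neural network needs.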

The user's emotion will be inferred once or twice an hour. The time between two sensory acquisitions will be randomly determined within these constraints, in order to avoid user habituation. In our example implementation, we will consider four distinct moods: euphoria, calmness, boredom, and anxiety. Boredom and anxiety are considered negative emotions, whereas euphoria and calmness are considered their positive counterparts. Users receive a notification when an emotion is detected, and, by selecting it, the application will open and display a feedback screen. The output representing the inferred emotion will be shown as a yellow circle in a two-dimensional space containing the four emotions (Figure 5.9). The user can provide corrective feedback by dragging the yellow circle to a new position, now shown in green. This feedback initiates an ANN re-training process, which will reflect the correction in future inference tasks. After some training, and when the neural network begins to become accurate, the feedback notifications will be progressively replaced with notifications suggesting walking exercise, whenever negative emotions are detected.

Illustration of HappyWalk's Emotional Feedback.

Figure 5.9 HappyWalk's Emotional Feedback.

The ANN will be implemented using the Encog14 [140] machine learning framework. Designing an ANN architecture is a challenging task. How should we select a number of neurons that provides minimal error and highest accuracy?

Previous research has shown that excessive hidden neurons will cause overfitting; that is, the neural network overestimates the complexity of the target problem. A rule of thumb is selecting a size between the number of input neurons and the number of output neurons [141]. Thus, it was decided to test two possibilities: using a hidden layer with four neurons or using two hidden layers, three neurons in the first and two neurons in the second. Both of these configurations fit the rule of thumb without becoming overly complex and taxing smartphone hardware. In the decision process, two major requirements were considered: the amount of effort required for training the network (which is important in terms of processing power and battery drain) and the accuracy of the network.

In order to test the training effort, simulated emotions were generated. For each type of emotion, a probability value for different ranges of its input components (movement, background noise, etc.) was empirically defined. Through this method, 150 simulated emotions were generated; while not valid for testing accuracy, these are sufficient for testing training performance. Thus, the number of epochs necessary to successfully train the network for each configuration was counted. The results are shown in Table 5.2. These show that using two hidden layers increases the training effort significantly. Therefore, it was also necessary to test if using more layers would bring any benefits in terms of accuracy. A test subject used HappyWalk for a week, during which his sensory data and emotional feedback were recorded for a total of 41 records. Using this data, both neural network configurations were evaluated through two statistical measures known as “sensitivity” and “specificity”, with negative emotions considered the events of interest. In our case, sensitivity is the proportion of negative emotions that were correctly identified as such; in other words, it measures whether our system was capable of detecting that it was necessary to actuate. Specificity, on the other hand, measures the proportion of correctly identified positive emotions. A perfect emotional predictor would present the maximum value of 1 for each of these metrics.

Table 5.2 Testing training performance (150 emotions)

Configuration Number of epochs
One hidden layer 100
Two hidden layers 3000
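Once the classifications are tallied, the two evaluation metrics are straightforward to compute. A small sketch (the class and method names are ours), where a “true positive” is a correctly detected negative emotion:

```java
// Sketch of the two evaluation metrics, with negative emotions as the
// events of interest: a true positive (tp) is a correctly detected
// negative emotion, a true negative (tn) a correctly identified positive
// one, and fn/fp are the corresponding misclassifications.
public final class EmotionMetricsSketch {
    public static double sensitivity(int tp, int fn) {
        return (double) tp / (tp + fn);
    }

    public static double specificity(int tn, int fp) {
        return (double) tn / (tn + fp);
    }
}
```
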

The results shown in Table 5.3 suggest that using a two-hidden-layer configuration leads to considerably better results. After weighing the results, it was decided that, despite being more demanding, a two-hidden-layer configuration, the first layer containing three nodes and the second two, constitutes a good compromise between training time and accuracy. Thus, this is the configuration used for our sample application. The proposed neural network's architecture is presented in Figure 5.10.

Table 5.3 Testing neural network accuracy (41 emotions)

Configuration Sensitivity Specificity
One hidden layer 0.679 0.766
Two hidden layers 0.720 0.830

Scheme for HappyWalk's neural network design.

Figure 5.10 HappyWalk's neural network design.
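To make the chosen architecture concrete, the forward pass of such a sigmoid feedforward network can be sketched in plain Java. In the tutorial the Encog library builds and trains the real network; here, the input and output sizes and all weights are illustrative assumptions, and only the hidden-layer shape (three neurons, then two) follows the configuration chosen above:

```java
// Plain-Java sketch of one forward pass through a sigmoid feedforward
// network shaped like Figure 5.10: input -> hidden (3) -> hidden (2) -> output.
public final class ForwardPassSketch {

    public static double sigmoid(double x) {
        return 1.0 / (1.0 + Math.exp(-x));
    }

    // One layer: out[j] = sigmoid(bias[j] + sum_i w[j][i] * in[i])
    public static double[] layer(double[] in, double[][] w, double[] bias) {
        double[] out = new double[bias.length];
        for (int j = 0; j < bias.length; j++) {
            double sum = bias[j];
            for (int i = 0; i < in.length; i++) {
                sum += w[j][i] * in[i];
            }
            out[j] = sigmoid(sum);
        }
        return out;
    }

    // Chains the three weight matrices and bias vectors of the network.
    public static double[] infer(double[] input,
                                 double[][][] weights, double[][] biases) {
        double[] h1 = layer(input, weights[0], biases[0]);
        double[] h2 = layer(h1, weights[1], biases[1]);
        return layer(h2, weights[2], biases[2]);
    }
}
```

Training is then just a matter of adjusting the weight matrices to reduce the error between infer()'s output and the user's feedback, which is the job we delegate to Encog.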

5.4 In Summary

In this chapter we have seen a high-level overview of the HappyWalk app, which we will use in this part of the book for illustrating the development of HiTLCPS applications. This is a kind of BCI application that provides feedback to the users and may influence their behavior, through a Data Acquisition, State Inference, and Actuation cycle.

HappyWalk is a BCI system that attempts to positively influence its user's mood through moderate physical exercise. The app's base architecture consists of a client application that requires at least Android Gingerbread. It already has classes for controlling POI clustering, the map interface, detecting location, and showing POI information. Many of its tasks are run under a background service to avoid encumbering the user interface. As for the server side, it runs on Apache Tomcat™ 7 and uses the Jersey Java library to handle RESTful web service communication. The server also communicates with Foursquare® to retrieve POI information, and a PostgreSQL database to permanently store data.

The architectural options concerning emotion awareness were also presented and justified. We will be using the Encog library to implement a simple ANN with two hidden layers as our mood inference tool. This network will be fed with various sources of data provided by the smartphone, including location, movement, and noise. We will pre-process this data before feeding it to the network through simple signal processing techniques, including frequency analysis. The user's emotion will be inferred once or twice an hour, within four possible moods: euphoria, calmness, boredom, and anxiety.
