Build a 3-layer DNN

H2O exposes a slightly different way of building models; however, it is unified across all H2O models. There are three basic building blocks:

  • Model parameters: Define the input data and algorithm-specific parameters
  • Model builder: Accepts model parameters and produces a model
  • Model: Contains the model definition, but also technical information about model building, such as scoring times or error rates for each iteration

Prior to building our model, we need to construct parameters for the DeepLearning algorithm:

import _root_.hex.deeplearning._
import DeepLearningParameters.Activation

// Configure the DeepLearning algorithm
val dlParams = new DeepLearningParameters()
dlParams._train = trainingHF._key                // training H2O frame
dlParams._valid = testHF._key                    // validation H2O frame
dlParams._response_column = "label"              // column to predict
dlParams._epochs = 1                             // passes over the training data
dlParams._activation = Activation.RectifierWithDropout
dlParams._hidden = Array[Int](500, 500, 500)     // three hidden layers of 500 neurons each
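With the parameters defined, the remaining two building blocks follow: the model builder consumes the parameters and produces the model. The following is a minimal sketch using H2O's Scala API, where trainModel starts a training job and get blocks until the model is ready (the score call and the predictionHF name are illustrative):

val dl = new DeepLearning(dlParams)      // model builder (in scope via the earlier wildcard import)
val dlModel = dl.trainModel.get          // run the training job and retrieve the resulting model

// Optionally, score the validation frame with the trained model
val predictionHF = dlModel.score(testHF)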

Let's walk through the parameters to understand the model we just configured:

  • train and valid: Specify the training and testing sets we created earlier. Note that these RDDs are, in fact, H2O frames.
  • response_column: Specifies the label column to predict, which, as we declared beforehand, is the first element (indexed from 0) in each frame.
  • epochs: An extremely important parameter that specifies how many times the network should pass over the training data. Generally, models trained for more epochs allow the network to learn new features and produce better results. The caveat, however, is that networks trained for too long tend to overfit and may not generalize well to new data.
  • activation: The non-linear function applied to the weighted inputs of each neuron. In H2O, there are three primary activations to choose from (a formula sketch follows this list):
  • Rectifier: Sometimes referred to as a rectified linear unit (ReLU), this function has a lower limit of 0 and grows linearly towards positive infinity. In biological terms, these units have been shown to be closer to actual neuron activations. Currently, this is the default activation function in H2O, given its results on tasks such as image recognition and its speed.
Figure 14 - Rectifier activation function
  • Tanh: A modified logistic function bound between -1 and 1 that passes through the origin at (0, 0). Due to its symmetry around 0, convergence is usually faster.
Figure 15 - Tanh activation function and the logistic function - note the difference between them
  • Maxout: A function whereby each neuron picks the largest value coming from k separate channels (see the formula sketch after this list).
  • hidden: Another extremely important hyper-parameter, this is where we specify two things:
    • The number of layers (which you can create with additional commas). Note that in the GUI, the default is a network with two hidden layers of 200 neurons each.
    • The number of neurons per layer. As with most things in machine learning, there is no set rule for what this number should be, and experimentation is usually best. However, there are some additional tuning parameters we will cover in the next chapter that will help you think about this, namely L1 and L2 regularization and dropout.
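For reference, the three activations can be written as follows; the maxout form is the generic k-channel formulation implied by the description above (our notation, as the original figures are not reproduced here):

$$
\begin{aligned}
\text{Rectifier:}\quad & f(x) = \max(0, x) \\
\text{Tanh:}\quad & f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \\
\text{Maxout:}\quad & f(x_1, \dots, x_k) = \max_{j \in \{1, \dots, k\}} x_j
\end{aligned}
$$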