tools – Some Useful Tools and Macros

Neural network tools

pybrain.tools.shortcuts.buildNetwork(*layers, **options)

Build arbitrarily deep networks.

layers should be a list or tuple of integers that indicate how many neurons each layer should have. bias and outputbias are flags indicating whether the network should have the corresponding biases; both default to True.

To adjust the classes for the layers use the hiddenclass and outclass parameters, which expect a subclass of NeuronLayer.

If the recurrent flag is set, a RecurrentNetwork will be created, otherwise a FeedForwardNetwork.

If the fast flag is set, faster arac networks will be used instead of the pybrain implementations.

class pybrain.tools.neuralnets.NNregression(DS, **kwargs)

Learns to numerically predict the targets of a set of data, with optional online progress plots.

__init__(DS, **kwargs)

Initialize with the training data set DS. All keywords given are set as member variables. The following are particularly important:

Key hidden: number of hidden units
Key tds: test data set for checking convergence
Key vds: validation data set for final performance evaluation
Key epoinc: number of epochs to train for before checking convergence (default: 5)
initGraphics(ymax=10, xmax=-1)
Initialize the interactive graphics output window and return a handle to the plot.
setupNN(trainer=<class 'pybrain.supervised.trainers.rprop.RPropMinusTrainer'>, hidden=None, **trnargs)
Constructs a 3-layer FNN for regression. Optional arguments are passed on to the Trainer class.
runTraining(convergence=0, **kwargs)
Trains the network on the stored dataset. If convergence is >0, check after that many epoch increments whether test error is going down again, and stop training accordingly. CAVEAT: No support for Sequential datasets!
saveTrainingCurve(learnfname)
Save the training curves into a file with the given name (CSV format).
saveNetwork(fname)
Save the trained network to a file.
class pybrain.tools.neuralnets.NNclassifier(DS, **kwargs)

Learns to classify a set of data, with optional online progress plots.

__init__(DS, **kwargs)
Initialize the classifier: the least we need is the dataset to be classified. All keywords given are set as member variables.
initGraphics(ymax=10, xmax=-1)
Initialize the interactive graphics output window and return a handle to the plot.
setupNN(trainer=<class 'pybrain.supervised.trainers.rprop.RPropMinusTrainer'>, hidden=None, **trnargs)
Setup FNN and trainer for classification.
setupRNN(trainer=<class 'pybrain.supervised.trainers.backprop.BackpropTrainer'>, hidden=None, **trnargs)
Setup an LSTM RNN and trainer for sequence classification.
runTraining(convergence=0, **kwargs)
Trains the network on the stored dataset. If convergence is >0, check after that many epoch increments whether test error is going down again, and stop training accordingly.
saveTrainingCurve(learnfname)
Save the training curves into a file with the given name (CSV format).
saveNetwork(fname)
Save the trained network to a file.

Dataset tools

pybrain.tools.datasettools.convertSequenceToTimeWindows(DSseq, NewClass, winsize)

Converts a sequential classification dataset into time windows of fixed length. Assumes the correct class is given at the last timestep of each sequence. Incomplete windows at the sequence end are pruned. No overlap between windows.

Parameters:
  • DSseq – the sequential data set to cut up
  • winsize – size of the data window
  • NewClass – class of the windowed data set to be returned (gets initialised with indim*winsize, outdim)
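The window-cutting rule described above can be illustrated in plain Python (a sketch of the documented behavior, not PyBrain's implementation, which also copies the class label into each window):

```python
# Cut a sequence into non-overlapping windows of fixed length; the
# incomplete window at the end of the sequence is pruned.
def cut_into_windows(sequence, winsize):
    n_full = len(sequence) // winsize        # incomplete tail is dropped
    return [sequence[i * winsize:(i + 1) * winsize] for i in range(n_full)]

windows = cut_into_windows(list(range(10)), 3)
# -> [[0, 1, 2], [3, 4, 5], [6, 7, 8]]; the trailing sample (index 9) is pruned
```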

Training performance validation tools

class pybrain.tools.validation.Validator

This class provides methods for validating calculated output values against their intended target values. It does not know anything about modules or other pybrain classes; it works purely on arrays and therefore contains only the core calculations.

The class contains only classmethods, as it serves as a kind of namespace rather than an object definition.

classmethod ESS(output, target)

Returns the explained sum of squares (ESS).

Parameters:
  • output – array of output values
  • target – array of target values
classmethod MSE(output, target, importance=None)

Returns the mean squared error. The multidimensional arrays will get flattened in order to compare them.

Parameters:
  • output – array of output values
  • target – array of target values
Key importance:

each squared error will be multiplied with its corresponding importance value. After summing up these values, the result will be divided by the sum of all importance values for normalization purposes.
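The importance weighting described above amounts to a weighted mean of squared errors. A plain-Python sketch (PyBrain's version works element-wise on flattened arrays):

```python
# Weighted MSE: each squared error is scaled by its importance value,
# and the sum is normalised by the total importance.
def weighted_mse(output, target, importance=None):
    if importance is None:
        importance = [1.0] * len(output)
    total = sum((o - t) ** 2 * w
                for o, t, w in zip(output, target, importance))
    return total / sum(importance)

weighted_mse([1.0, 2.0], [0.0, 2.0], importance=[1.0, 3.0])
# -> (1.0 * 1.0 + 0.0 * 3.0) / 4.0 = 0.25
```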

classmethod classificationPerformance(output, target)

Returns the hit rate of the outputs compared to the targets.

Parameters:
  • output – array of output values
  • target – array of target values
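The hit rate is simply the fraction of outputs that match their targets. A sketch with plain class labels (for one-of-many outputs, an argmax would be applied first):

```python
# Hit rate: fraction of positions where output equals target.
def hit_rate(output, target):
    hits = sum(1 for o, t in zip(output, target) if o == t)
    return hits / len(target)

hit_rate([0, 1, 1, 0], [0, 1, 0, 0])  # 3 of 4 correct -> 0.75
```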
class pybrain.tools.validation.ModuleValidator

This class provides methods for validating calculated output values against their intended target values. It specifically handles pybrain's module and dataset classes; the core calculations are delegated to the Validator class.

The class contains only classmethods, as it serves as a kind of namespace rather than an object definition.

classmethod MSE(module, dataset)

Returns the mean squared error.

Parameters:
  • module – Object of any subclass of pybrain’s Module type
  • dataset – Dataset object at least containing the fields ‘input’ and ‘target’ (for example SupervisedDataSet)
classmethod calculateModuleOutput(module, dataset)

Calculates the module’s output on the dataset. Can be called with any type of dataset.

Parameters:
  • dataset – Any Dataset object containing an ‘input’ field.
classmethod classificationPerformance(module, dataset)

Returns the hit rate of the module’s output compared to the targets stored inside dataset.

Parameters:
  • module – Object of any subclass of pybrain’s Module type
  • dataset – Dataset object at least containing the fields ‘input’ and ‘target’ (for example SupervisedDataSet)
classmethod validate(valfunc, module, dataset)

Generic validation function that is used heavily by this class. First, it calculates the module’s output on the dataset; it then compares the output to the dataset’s target values via the valfunc function and returns the result.

Parameters:
  • valfunc – A function expecting arrays for output, target and importance (optional). See Validator.MSE for an example.
  • module – Object of any subclass of pybrain’s Module type
  • dataset – Dataset object at least containing the fields ‘input’ and ‘target’ (for example SupervisedDataSet)
class pybrain.tools.validation.CrossValidator(trainer, dataset, n_folds=5, valfunc=<bound method type.classificationPerformance of <class 'pybrain.tools.validation.ModuleValidator'>>, **kwargs)

Class for cross-validating data. An object of CrossValidator must be supplied with a trainer that contains a module and a dataset. The dataset is then shuffled and split up into n parts of equal length.

A clone of the trainer and its module is made, and trained with n-1 parts of the split dataset. After training, the module is validated with the n’th part of the dataset that was not used during training.

This is done for each possible combination of n-1 dataset pieces. The mean of the calculated validation results is then returned.
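The fold logic described above can be sketched in plain Python (an illustration of the splitting scheme, not PyBrain's implementation; training and validation on each split are left to the cloned trainer):

```python
# n-fold split: shuffle indices, cut into n parts, and for each fold
# yield (training part = n-1 pieces, validation part = held-out piece).
import random

def crossval_folds(indices, n_folds, seed=0):
    idx = list(indices)
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for i, val_part in enumerate(folds):
        train_part = [x for j, f in enumerate(folds) if j != i for x in f]
        yield train_part, val_part

splits = list(crossval_folds(range(10), n_folds=5))
# 5 splits; each trains on 8 samples and validates on the held-out 2
```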

setArgs(**kwargs)

Set the specified member variables.

Key max_epochs: maximum number of epochs the trainer should train the module for
Key verbosity: set verbosity level
validate()
The main method of this class. It runs the crossvalidation process and returns the validation result (e.g. performance).
pybrain.tools.validation.testOnSequenceData(module, dataset)
Fetch targets and calculate the module’s output on dataset. Output and target are in one-of-many format. The class for each sequence is determined by first summing the probabilities for each individual sample over the sequence, and then finding its maximum.
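The per-sequence voting rule described above can be sketched as follows (plain Python; the per-sample outputs are assumed to be one-of-many probability vectors):

```python
# Sum the per-sample class probabilities over the sequence, then pick the
# class with the largest summed probability.
def sequence_class(per_sample_outputs):
    n_classes = len(per_sample_outputs[0])
    sums = [sum(sample[c] for sample in per_sample_outputs)
            for c in range(n_classes)]
    return sums.index(max(sums))

seq = [[0.6, 0.4], [0.2, 0.8], [0.1, 0.9]]
sequence_class(seq)  # -> 1 (summed probabilities: 0.9 vs 2.1)
```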

Auxiliary functions

pybrain.tools.functions.semilinear(x)
This function ensures that the values of the array are always positive. It is x + 1 for x >= 0 and exp(x) for x < 0.
pybrain.tools.functions.semilinearPrime(x)
This function is the first derivative of the semilinear function (above). It is needed for the backward pass of the module.
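A scalar sketch of the two functions above (the PyBrain versions operate element-wise on arrays): both branches agree at x = 0, so the function is continuous and strictly positive everywhere.

```python
import math

def semilinear(x):
    # x + 1 for x >= 0, exp(x) for x < 0; always positive
    return x + 1.0 if x >= 0 else math.exp(x)

def semilinear_prime(x):
    # derivative: 1 on the linear branch, exp(x) on the exponential branch
    return 1.0 if x >= 0 else math.exp(x)

semilinear(0.0)   # -> 1.0 (both branches agree at x = 0)
semilinear(-1.0)  # -> exp(-1), about 0.3679
```

Note that for x < 0 the derivative equals the function itself, as with the plain exponential.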
pybrain.tools.functions.sigmoid(x)
Logistic sigmoid function.
pybrain.tools.functions.sigmoidPrime(x)
Derivative of logistic sigmoid.
pybrain.tools.functions.tanhPrime(x)
Derivative of tanh.
pybrain.tools.functions.safeExp(x)
Bounded range for the exponential function (won’t produce inf or NaN).
pybrain.tools.functions.ranking(R)
Produces a linear ranking of the values in R.
pybrain.tools.functions.multivariateNormalPdf(z, x, sigma)
The pdf of a multivariate normal distribution (not in scipy). The sample z and the mean x should be 1-dim-arrays, and sigma a square 2-dim-array.
pybrain.tools.functions.simpleMultivariateNormalPdf(z, detFactorSigma)
Assumes z has been transformed to zero mean and identity covariance. The determinant of the factorized (real) covariance matrix must be provided.
pybrain.tools.functions.multivariateCauchy(mu, sigma, onlyDiagonal=True)
Generates a sample according to a given multivariate Cauchy distribution.
