.. _optimization:

Black-box Optimization
======================

This tutorial will illustrate how to use the optimization algorithms in PyBrain.

Many practical problems can be framed as optimization problems: finding the best settings for a controller, minimizing the risk of an investment portfolio, finding a good strategy in a game, etc. Solving one always involves determining a number of *variables* (the *problem dimension*), each chosen from a set, such that a given *objective function* is maximized (or minimized).

The main categories of optimization problems are based on the kinds of sets the variables are chosen from:

* all real numbers: continuous optimization
* real numbers with bounds: constrained optimization
* integers: integer programming
* combinations of the above
* others, e.g. graphs

These can be further classified according to properties of the objective function (e.g. continuity, explicit access to partial derivatives, quadratic form, etc.). In black-box optimization the objective function is a black box, i.e. no assumptions are made about it. The optimization tools that PyBrain provides are all for the most general, black-box case. They fall into two groups:

* :class:`~pybrain.optimization.optimizer.BlackBoxOptimizer`, applicable to all kinds of variable sets
* :class:`~pybrain.optimization.optimizer.ContinuousOptimizer`, usable only for continuous optimization

We will introduce the optimization framework for the more restrictive kind first, because that case is simpler.

Continuous optimization
------------------------

Let's start by defining a simple objective function for (:mod:`numpy` arrays of) continuous variables, e.g. the sum of squares:

>>> def objF(x): return sum(x**2)

and an initial guess for where to start looking:

>>> from numpy import array
>>> x0 = array([2.1, -1])

Now we can initialize one of the optimization algorithms, e.g. :class:`~pybrain.optimization.distributionbased.cmaes.CMAES`:

>>> from pybrain.optimization import CMAES
>>> l = CMAES(objF, x0)

By default, all optimization algorithms *maximize* the objective function, but you can change this by setting the :attr:`minimize` attribute:

>>> l.minimize = True

.. note:: We could also have done that upon construction: ``CMAES(objF, x0, minimize = True)``

Stopping criteria can be algorithm-specific, but in addition it is always possible to define the following ones (set via optimizer attributes, as shown below):

* maximal number of evaluations
* maximal number of learning steps
* reaching a desired value

..

>>> l.maxEvaluations = 200
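
The other two criteria are set analogously. A minimal sketch, assuming the optimizer base class exposes them as the attributes :attr:`maxLearningSteps` and :attr:`desiredEvaluation`:

>>> l.maxLearningSteps = 100     # stop after at most 100 learning steps
>>> l.desiredEvaluation = 1e-10  # stop once a fitness this good is found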

Now that the optimizer is set up, all we need to do is call the :meth:`learn` method, which will attempt to optimize the variables until a stopping criterion is reached. It returns a tuple with the best evaluable (= array of variables) found, and the corresponding fitness:

>>> l.learn()
(array([ -1.59778097e-05,  -1.14434779e-03]), 1.3097871509722648e-06)


General optimization: using :class:`Evolvable`
------------------------------------------------

Our approach to doing optimization in the most general setting (no assumptions about the variables) is to let the user define a subclass of :class:`Evolvable` that implements:

* a :meth:`copy` operator,
* a method for generating random other points: :meth:`randomize`,
* :meth:`mutate`, an operator that does a small step in search space, according to *some* distance metric,
* (optionally) a :meth:`crossover` operator that produces *some* combination with other evolvables of the same class.

The optimization algorithm is then initialized with an instance of this class and an objective function that can evaluate such instances.

Here's a minimalistic example of such a subclass with a single constrained variable (and a bias to do mutation steps toward larger values):

>>> from random import random
>>> from pybrain.structure.evolvables.evolvable import Evolvable
>>> class SimpleEvo(Evolvable):
...     def __init__(self, x): self.x = max(0, min(x, 10))
...     def mutate(self):      self.x = max(0, min(self.x + random() - 0.3, 10))
...     def copy(self):        return SimpleEvo(self.x)
...     def randomize(self):   self.x = 10 * random()
...     def __repr__(self):    return '<-%.2f->' % self.x

which can be optimized using, for example, :class:`~pybrain.optimization.hillclimber.HillClimber`:

>>> from pybrain.optimization import HillClimber
>>> x0 = SimpleEvo(1.2)
>>> l = HillClimber(lambda x: x.x, x0, maxEvaluations = 50)
>>> l.learn()
(<-10.00->, 10)

Optimization in Reinforcement Learning
--------------------------------------

This section illustrates how to use optimization algorithms in the reinforcement learning framework. As our objective function we use any episodic task, e.g.:

>>> from pybrain.rl.environments.cartpole.balancetask import BalanceTask
>>> task = BalanceTask()

Then we construct a module that can interact with the task, for example a neural network controller,

>>> from pybrain.tools.shortcuts import buildNetwork
>>> net = buildNetwork(task.outdim, 3, task.indim)

and we choose any optimization algorithm, e.g. a simple :class:`HillClimber`.

Now we have two (equivalent) ways of connecting these:

1) using the same syntax as before, with the task playing the role of the objective function directly:

>>> HillClimber(task, net, maxEvaluations = 100).learn()

2) or using the agent-based framework:

>>> from pybrain.rl.agents import OptimizationAgent
>>> from pybrain.rl.experiments import EpisodicExperiment
>>> agent = OptimizationAgent(net, HillClimber())
>>> exp = EpisodicExperiment(task, agent)
>>> exp.doEpisodes(100)

.. note:: This is very similar to the typical (non-optimization) reinforcement learning setup, which uses a :class:`LearningAgent` instead of an :class:`OptimizationAgent`:

   >>> from pybrain.rl.learners import ENAC
   >>> from pybrain.rl.agents import LearningAgent
   >>> agent = LearningAgent(net, ENAC())
   >>> exp = EpisodicExperiment(task, agent)
   >>> exp.doEpisodes(100)
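
In both variants, the best parameters found can be read back from the optimizer afterwards. A minimal sketch for the :class:`OptimizationAgent` variant above, assuming the learner keeps the :attr:`bestEvaluable` and :attr:`bestEvaluation` attributes from which :meth:`learn` built its return tuple in the earlier examples:

>>> optimizer = agent.learner  # the HillClimber wrapped by the OptimizationAgent
>>> best, fitness = optimizer.bestEvaluable, optimizer.bestEvaluation

The best evaluable here is the network itself, so it can be used as a controller directly, e.g. by calling its :meth:`activate` method on an observation.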