.. _optimization:

Black-box Optimization
======================

This tutorial will illustrate how to use the optimization algorithms in PyBrain.

Many practical problems can be framed as optimization problems: finding the best settings for a controller, minimizing the risk of an investment portfolio, finding a good strategy in a game, etc. Solving one always involves determining a number of *variables* (the *problem dimension*), each chosen from a set, such that a given *objective function* is maximized (or minimized).

The main categories of optimization problems are based on the kinds of sets the variables are chosen from:

* all real numbers: continuous optimization
* real numbers with bounds: constrained optimization
* integers: integer programming
* combinations of the above
* others, e.g. graphs

These can be further classified according to properties of the objective function (e.g. continuity, explicit access to partial derivatives, quadratic form, etc.). In black-box optimization the objective function is a black box, i.e. no assumptions are made about it. The optimization tools that PyBrain provides are all for the most general, black-box case. They fall into two groups:

* :class:`~pybrain.optimization.optimizer.BlackBoxOptimizer`, applicable to all kinds of variable sets
* :class:`~pybrain.optimization.optimizer.ContinuousOptimizer`, usable only for continuous optimization

We will introduce the optimization framework for the more restrictive kind first, because that case is simpler.

Continuous optimization
------------------------

Let's start by defining a simple objective function for (:mod:`numpy` arrays of) continuous variables, e.g. the sum of squares:

>>> def objF(x): return sum(x**2)

and an initial guess for where to start looking:

>>> from numpy import array
>>> x0 = array([2.1, -1])

Now we can initialize one of the optimization algorithms, e.g. :class:`~pybrain.optimization.distributionbased.cmaes.CMAES`:

>>> from pybrain.optimization import CMAES
>>> l = CMAES(objF, x0)

By default, all optimization algorithms *maximize* the objective function, but you can change this by setting the :attr:`minimize` attribute:

>>> l.minimize = True

.. note:: We could also have done that upon construction: ``CMAES(objF, x0, minimize = True)``

Stopping criteria can be algorithm-specific, but in addition it is always possible to define the following ones (set via optimizer attributes, as shown below):

* maximal number of evaluations
* maximal number of learning steps
* reaching a desired value

..

>>> l.maxEvaluations = 200
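
The other two criteria are set analogously. A minimal sketch, assuming the optimizer base class exposes them as the attributes :attr:`maxLearningSteps` and :attr:`desiredEvaluation`:

>>> l.maxLearningSteps = 100     # stop after at most 100 learning steps
>>> l.desiredEvaluation = 1e-10  # stop once a fitness this good is found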

Now that the optimizer is set up, all we need to do is call the :meth:`learn` method, which will attempt to optimize the variables until a stopping criterion is reached. It returns a tuple with the best evaluable (= array of variables) found, and the corresponding fitness:

>>> l.learn()
(array([ -1.59778097e-05,  -1.14434779e-03]), 1.3097871509722648e-06)


General optimization: using :class:`Evolvable`
------------------------------------------------

Our approach to doing optimization in the most general setting (no assumptions about the variables) is to let the user define a subclass of :class:`Evolvable` that implements:

* a :meth:`copy` operator,
* a method for generating random other points: :meth:`randomize`,
* :meth:`mutate`, an operator that does a small step in search space, according to *some* distance metric,
* (optionally) a :meth:`crossover` operator that produces *some* combination with other evolvables of the same class.

The optimization algorithm is then initialized with an instance of this class and an objective function that can evaluate such instances.

Here's a minimalistic example of such a subclass with a single constrained variable (and a bias to do mutation steps toward larger values):

>>> from random import random
>>> from pybrain.structure.evolvables.evolvable import Evolvable
>>> class SimpleEvo(Evolvable):
...     def __init__(self, x): self.x = max(0, min(x, 10))
...     def mutate(self):      self.x = max(0, min(self.x + random() - 0.3, 10))
...     def copy(self):        return SimpleEvo(self.x)
...     def randomize(self):   self.x = 10 * random()
...     def __repr__(self):    return '<-%.2f->' % self.x

which can be optimized using, for example, :class:`~pybrain.optimization.hillclimber.HillClimber`:

>>> from pybrain.optimization import HillClimber
>>> x0 = SimpleEvo(1.2)
>>> l = HillClimber(lambda x: x.x, x0, maxEvaluations = 50)
>>> l.learn()
(<-10.00->, 10)

Optimization in Reinforcement Learning
--------------------------------------

This section illustrates how to use optimization algorithms in the reinforcement learning framework. As our objective function we use any episodic task, e.g.:

>>> from pybrain.rl.environments.cartpole.balancetask import BalanceTask
>>> task = BalanceTask()

Then we construct a module that can interact with the task, for example a neural network controller,

>>> from pybrain.tools.shortcuts import buildNetwork
>>> net = buildNetwork(task.outdim, 3, task.indim)

and we choose any optimization algorithm, e.g. a simple :class:`HillClimber`.

Now we have two (equivalent) ways of connecting these:

1) using the same syntax as before, with the task playing the role of the objective function directly:

>>> HillClimber(task, net, maxEvaluations = 100).learn()

2) or using the agent-based framework:

>>> from pybrain.rl.agents import OptimizationAgent
>>> from pybrain.rl.experiments import EpisodicExperiment
>>> agent = OptimizationAgent(net, HillClimber())
>>> exp = EpisodicExperiment(task, agent)
>>> exp.doEpisodes(100)

.. note:: This is very similar to the typical (non-optimization) reinforcement learning setup, which uses a :class:`LearningAgent` instead of an :class:`OptimizationAgent`:

   >>> from pybrain.rl.learners import ENAC
   >>> from pybrain.rl.agents import LearningAgent
   >>> agent = LearningAgent(net, ENAC())
   >>> exp = EpisodicExperiment(task, agent)
   >>> exp.doEpisodes(100)
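
In both variants, the best parameters found can be read back from the optimizer afterwards. A minimal sketch for the :class:`OptimizationAgent` variant above, assuming the learner keeps the :attr:`bestEvaluable` and :attr:`bestEvaluation` attributes from which :meth:`learn` built its return tuple in the earlier examples:

>>> optimizer = agent.learner  # the HillClimber wrapped by the OptimizationAgent
>>> best, fitness = optimizer.bestEvaluable, optimizer.bestEvaluation

The best evaluable here is the network itself, so it can be used as a controller directly, e.g. by calling its :meth:`activate` method on an observation.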