In the CAPM and APT pricing theories, we assumed linearity in the models and solved for expected security prices using regressions in Python.
As the number of securities in our portfolio increases, certain limitations are introduced as well. A portfolio manager would find himself constrained by these rules in pursing certain objectives mandated by investors.
Linear optimization helps you overcome the problem of portfolio allocation. Optimization focuses on minimizing or maximizing the value of the objective functions. The examples are maximizing returns and minimizing volatility. These objectives are usually governed by certain regulations, such as no short-selling rule, limits on the number of securities to be invested, and so on.
Unfortunately, in Python there is no single official package that supports this solution. However, there are third-party packages available with the implementation of the simplex algorithm for linear programming. For the purpose of this demonstration, we will use PuLP, an open source linear programming modeler, to assist us in this particular linear programming problem.
You can obtain PuLP from https://github.com/coin-or/pulp. The project page contains a comprehensive list of documentations to help you get started with your optimization process.
Suppose that we are interested in investing in two securities, and . We would like to find out the actual number of units to invest for every 3 units of security X and 2 units of security Y, such that the total number of units invested is maximized, where possible. However, there are certain constraints on our investment strategy:
The maximization problem can be mathematically represented as follows:
By plotting the constraints on an by graph, the set of feasible solutions is shown in the shaded area:
The problem can be translated to Python with the PuLP package:
""" A simple linear optimization problem with 2 variables """ import pulp x = pulp.LpVariable("x", lowBound=0) y = pulp.LpVariable("y", lowBound=0) problem = pulp.LpProblem("A simple maximization objective", pulp.LpMaximize) problem += 3*x + 2*y, "The objective function" problem += 2*x + y <= 100, "1st constraint" problem += x + y <= 80, "2nd constraint" problem += x <= 40, "3rd constraint" problem.solve()
The LpVariable
function defines a variable to be solved. The LpProblem
function initializes the problem with a text description of the problem and the type of optimization, which in this case is the maximization method. The +=
operation allows an arbitrary number of constraints to be added, along with a text description. Finally, the solve
function is called to begin performing linear optimization. Each variable value is printed to show the values that the optimizer has solved for us.
The following output is generated when the code runs:
>>> print "Maximization Results:" >>> for variable in problem.variables(): ... print variable.name, "=", variable.varValue Maximization Results: x = 20.0 y = 60.0
The results show that obtaining the maximum value of 180 is possible when the value of is 20 and is 60 while fulfilling the given set of constraints.
There are three outcomes in linear optimization, as follows:
In the simple optimization problem we have investigated earlier, so far the variables have been allowed to be continuous or fractional. What if the use of fractional values or results is not realistic? This problem is called the linear integer programming problem, where all the variables are restricted as integers. A special case of an integer variable is a binary variable that can either be 0 or 1. Binary variables are especially useful to model decision making given a set of choices.
Integer programming models are frequently used in operational research to model real-world working problems. More often than not, stating nonlinear problems in a linear or even binary fashion requires more art than science.
Suppose we must go for 150 contracts in a particular over-the-counter exotic security from three dealers. Dealer X quoted $500 per contract plus handling fees of $4,000, regardless of the number of contracts sold. Dealer Y charges $450 per contract plus a transaction fee of $2,000. Dealer Z charges $450 per contract plus a fee of $6,000. Dealer X will sell at most 100 contracts, dealer Y at most 90, and dealer Z at most 70. The minimum transaction volume from any dealer is 30 contracts if any are transacted with that dealer. How should we minimize the cost of purchasing 150 contracts?
Using the pulp
package, let's set up the required variables:
""" An example of implementing an integer programming model with binary conditions """ import pulp dealers = ["X", "Y", "Z"] variable_costs = {"X": 500, "Y": 350, "Z": 450} fixed_costs = {"X": 4000, "Y": 2000, "Z": 6000} # Define PuLP variables to solve quantities = pulp.LpVariable.dicts("quantity", dealers, lowBound=0, cat=pulp.LpInteger) is_orders = pulp.LpVariable.dicts("orders", dealers, cat=pulp.LpBinary)
The dealers
variable simply contains the dictionary identifiers that are used to reference lists and dictionaries later on. The variable_costs
and fixed_costs
variables are dictionaries that contain their respective contract cost and fees charged by each dealer. The PuLP solver solves for the values of quantities
and is_orders
, which are defined by the LpVariable
function. The dicts
function tells PuLP to treat the assigned variable as a dictionary object, using the dealers
variable for referencing. Note that the quantities
variable has a lower boundary of 0 that prevents us from entering a short position in any securities. The is_orders
values are treated as binary objects, indicating whether we should enter into a transaction with any of the dealers.
What is the best approach to modeling this integer programming problem? At first glance, this seems fairly straightforward by applying this equation:
Where
The equation simply states that we want to minimize the total costs with the binary variable , determining whether to account for the costs associated with buying from a specific dealer should we choose to.
Let's implement this model in Python:
""" This is an example of implementing an integer programming model with binary variables the wrong way. """ # Initialize the model with constraints model = pulp.LpProblem("A cost minimization problem", pulp.LpMinimize) model += sum([(variable_costs[i] * quantities[i] + fixed_costs[i])*is_orders[i] for i in dealers]), "Minimize portfolio cost" model += sum([quantities[i] for i in dealers]) == 150, "Total contracts required" model += 30 <= quantities["X"] <= 100, "Boundary of total volume of X" model += 30 <= quantities["Y"] <= 90, "Boundary of total volume of Y" model += 30 <= quantities["Z"] <= 70, "Boundary of total volume of Z" model.solve()
What happens when we run the solver?
TypeError: Non-constant expressions cannot be multiplied
As it turned out, we were trying to perform multiplication on two unknown variables, quantities
and is_order
, which unknowingly led us to perform nonlinear programming. Such are the pitfalls encountered when performing integer programming.
Another method of formulating the minimization objective is to place all unknown variables in a linear fashion such that they are additive:
Comparing with the previous objective equation, we would obtain the same fixed cost values. However, the unknown variable remains in the first term of the equation. Hence, the variable is required to be solved as a function of such that the constraints are stated as follows:
Let's apply these formulas in Python:
""" This is an example of implementing an IP model with binary variables the correct way. """ # Initialize the model with constraints model = pulp.LpProblem("A cost minimization problem", pulp.LpMinimize) model += sum([variable_costs[i]*quantities[i] + fixed_costs[i]*is_orders[i] for i in dealers]), "Minimize portfolio cost" model += sum([quantities[i] for i in dealers]) == 150, "Total contracts required" model += is_orders["X"]*30 <= quantities["X"] <= is_orders["X"]*100, "Boundary of total volume of X" model += is_orders["Y"]*30 <= quantities["Y"] <= is_orders["Y"]*90, "Boundary of total volume of Y" model += is_orders["Z"]*30 <= quantities["Z"] <= is_orders["Z"]*70, "Boundary of total volume of Z" model.solve()
What happens when we try to run the solver?
>>> print "Minimization Results:" >>> for variable in model.variables(): ... print variable, "=", variable.varValue >>> >>> print "Total cost: %s" % pulp.value(model.objective) Minimization Results: orders_X = 0.0 orders_Y = 1.0 orders_Z = 1.0 quantity_X = 0.0 quantity_Y = 90.0 quantity_Z = 60.0 Total cost: 66500.0
The output tells us that buying 90 contracts from the dealer Y and 60 contracts from the dealer Z gives us the lowest possible cost of $66,500 while fulfilling all the other constraints.
As we can see, careful planning is required in the designing of integer programming models to arrive at an accurate solution in order for them to be useful in decision making.