Buying and selling stocks is a bit like shopping, and shopping is something that supermarkets and online bookstores know well. Such businesses often apply techniques such as basket analysis and recommendation engines. If, for example, you are a fan of a writer who writes historically inaccurate novels, a recommendation engine will probably recommend another novel by the same writer or other historically inaccurate novels.
A recommendation engine for stocks can't work this way. For instance, if you only have stocks of oil producers in your portfolio and the oil price moves against you, then the whole portfolio will lose value. So, we should try to have stocks from different sectors, industries, or geographical regions. We can measure similarity with the correlation of returns.
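As a rough illustration of measuring similarity with the correlation of returns, the following sketch uses synthetic prices rather than real market data. The tickers `OIL_A`, `OIL_B`, and `TECH` are hypothetical, and the two "oil" series are deliberately built to share a common shock so that they come out highly correlated.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices for three hypothetical tickers (not real market data).
rng = np.random.default_rng(42)
common = rng.normal(0, 0.01, 250)  # shock shared by the two "oil" stocks
prices = pd.DataFrame({
    'OIL_A': 100 * np.exp(np.cumsum(common + rng.normal(0, 0.002, 250))),
    'OIL_B': 100 * np.exp(np.cumsum(common + rng.normal(0, 0.002, 250))),
    'TECH': 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 250))),
})

# Log returns: difference of the log prices, dropping the first NaN row.
log_rets = np.log(prices).diff().dropna()

# Pairwise Pearson correlation of the returns.
corr = log_rets.corr()
print(corr.round(2))
```

A pair with a correlation close to 1 offers little diversification, whereas a pair with a low or negative correlation is a better candidate for a combined portfolio.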
As with the Sharpe ratio (refer to the Ranking stocks with the Sharpe ratio and liquidity recipe), we want to maximize the average return of our portfolio and minimize the variance of the portfolio returns. These ideas are also present in Modern Portfolio Theory (MPT), whose originator, Harry Markowitz, was awarded the Nobel Prize in Economics. For a two-asset portfolio, we have the following equations:
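In standard MPT notation, the expected return and variance of a two-asset portfolio are usually written as shown next. This form is reconstructed to agree with the code in this recipe, which sets both weights to 0.5. The symbols are assumed notation: E(R) is the mean return, σ is the standard deviation of returns, and ρ_AB is the correlation between the returns of assets A and B.

```latex
\begin{align}
E(R_p) &= w_A E(R_A) + w_B E(R_B) \\
\sigma_p^2 &= w_A^2 \sigma_A^2 + w_B^2 \sigma_B^2
              + 2 w_A w_B \sigma_A \sigma_B \rho_{AB}
\end{align}
```

With w_A = w_B = 0.5, the expected return reduces to 0.5 (E(R_A) + E(R_B)), and the variance reduces to 0.25 (σ_A² + σ_B² + 2 σ_A σ_B ρ_AB).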
The weights wA and wB are the portfolio weights and sum up to 1. A weight can be negative, because investors can sell a security short (selling without owning it, which incurs borrowing costs). We can solve the portfolio optimization problem with linear algebra methods or general optimization algorithms. However, for two-asset portfolios with equal weights and only a handful of stocks, the brute-force approach is good enough.
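The brute-force idea can be sketched independently of the book's helper modules: score every unordered pair of stocks and keep the best one. The tickers and returns below are synthetic, and `pair_ratio` is a hypothetical helper written for this illustration, not part of `dautil` or `ch7util`.

```python
from itertools import combinations

import numpy as np

# Synthetic daily log returns for four hypothetical tickers (illustration only).
rng = np.random.default_rng(0)
tickers = ['AAA', 'BBB', 'CCC', 'DDD']
rets = {t: rng.normal(0.0005, 0.01 + 0.005 * i, 500)
        for i, t in enumerate(tickers)}


def pair_ratio(ra, rb, wa=0.5):
    """Expected return divided by variance for a two-asset portfolio."""
    wb = 1 - wa
    expected = wa * ra.mean() + wb * rb.mean()
    corr = np.corrcoef(ra, rb)[0, 1]
    sa, sb = ra.std(ddof=1), rb.std(ddof=1)
    var = (wa * sa) ** 2 + (wb * sb) ** 2 + 2 * wa * wb * sa * sb * corr
    return expected / var


# Brute force: evaluate every unordered pair exactly once.
scores = {(a, b): pair_ratio(rets[a], rets[b])
          for a, b in combinations(tickers, 2)}
best = max(scores, key=scores.get)
print(best, scores[best])
```

With n stocks there are only n(n - 1)/2 pairs to evaluate, which is why an exhaustive search stays cheap for a short list of stocks.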
The following is a breakdown of the portfolio_optimization.ipynb file in this book's code bundle:
import dautil as dl
import ch7util
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

def expected_return(stocka, stockb, means):
    # Expected return of an equally weighted two-asset portfolio
    return 0.5 * (means[stocka] + means[stockb])

def variance_return(stocka, stockb, stds):
    # Variance of an equally weighted two-asset portfolio
    ohlc = dl.data.OHLC()
    dfa = ohlc.get(stocka)
    dfb = ohlc.get(stockb)
    # Align both price series on their common dates
    merged = pd.merge(left=dfa, right=dfb,
                      right_index=True, left_index=True,
                      suffixes=('_A', '_B')).dropna()
    retsa = ch7util.log_rets(merged['Adj Close_A'])
    retsb = ch7util.log_rets(merged['Adj Close_B'])
    corr = np.corrcoef(retsa, retsb)[0][1]
    return 0.25 * (stds[stocka] ** 2 + stds[stockb] ** 2 +
                   2 * stds[stocka] * stds[stockb] * corr)

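def calc_ratio(stocka, stockb, means, stds, ratios):
    # A stock paired with itself is not a two-asset portfolio
    if stocka == stockb:
        return np.nan

    # Reuse a cached ratio if this pair was already evaluated
    key = stocka + '_' + stockb
    ratio = ratios.get(key)

    # Compare with None so that a cached ratio of 0 is still reused
    if ratio is not None:
        return ratio

    expected = expected_return(stocka, stockb, means)
    var = variance_return(stocka, stockb, stds)
    ratio = expected / var
    ratios[key] = ratio
    return ratio
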
# Mean and standard deviation of the log returns for each stock
means = {}
stds = {}
ohlc = dl.data.OHLC()

for stock in ch7util.STOCKS:
    close = ohlc.get(stock)['Adj Close']
    rets = ch7util.log_rets(close)
    means[stock] = rets.mean()
    stds[stock] = rets.std()

pairs = dl.collect.grid_list(ch7util.STOCKS)
# Sort each pair so that (A, B) and (B, A) map to the same cache key
sorted_pairs = [[sorted(row[i]) for row in pairs]
                for i in range(len(ch7util.STOCKS))]
ratios = {}
grid = [[calc_ratio(row[i][0], row[i][1], means, stds, ratios)
         for row in sorted_pairs]
        for i in range(len(ch7util.STOCKS))]

%matplotlib inline
plt.title('Expected Return/Return Variance for 2 Asset Portfolio')
sns.heatmap(grid, xticklabels=ch7util.STOCKS, yticklabels=ch7util.STOCKS)
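
The heatmap shows the best pair visually, but it can also be read off the grid programmatically. The following is a minimal sketch, assuming the grid is symmetric with NaN on the diagonal (as `calc_ratio` returns NaN for identical stocks). The tickers and values here are made up for illustration and stand in for `ch7util.STOCKS` and the computed `grid`.

```python
import numpy as np

# Hypothetical tickers and a small symmetric grid standing in for `grid`;
# the diagonal is NaN because a stock is never paired with itself.
stocks = ['AAA', 'BBB', 'CCC']
grid = np.array([[np.nan, 1.5, 3.2],
                 [1.5, np.nan, 0.7],
                 [3.2, 0.7, np.nan]])

# nanargmax ignores NaN and returns a flat index; unravel it to (row, column).
row, col = np.unravel_index(np.nanargmax(grid), grid.shape)
best_pair = (stocks[row], stocks[col])
print(best_pair, grid[row, col])
```

Because the grid is symmetric, each pair appears twice, and `np.nanargmax` simply returns the first occurrence.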