Now that we have an idea of designing a video game for creating a backtesting trading system, we can begin our object-oriented approach by first defining the required classes for the various components in our trading system.
We are interested in implementing a simple backtesting system to test a mean-reverting strategy. Using the daily historical prices from Google Finance, we will take the closing price of each day to compute the daily price returns for a particular stock, using the ticker symbol AAPL as an example. We want to test a theory that when the most recent return, standardized by the mean and standard deviation of returns over an elapsed number of days, is far from the mean of zero by a particular threshold, a buy or sell signal is generated. When such a signal is generated, a market order is sent to the exchange to be executed at the opening price of the next trading day.
As soon as we open a position, we would like to track our unrealized and realized profits to date. Our open position can be closed when an opposing signal is generated. On completion of the backtest, we will plot our profits and losses to see how well our strategy holds up.
Does our theory sound like a viable trading strategy? Well, let's find out! The following sections explain the classes that will be used for implementing a backtesting system.
The TickData class represents a single unit of data received from a market data source. In this example, we are interested in just the stock symbol, the timestamp of the data, the opening price, the last price, and the total volume:

""" Store a single unit of data """
class TickData:
    def __init__(self, symbol, timestamp, last_price=0, total_volume=0):
        self.symbol = symbol
        self.timestamp = timestamp
        self.open_price = 0
        self.last_price = last_price
        self.total_volume = total_volume
Further details of a single unit of tick data, such as the bid price, ask price, or last volume, can be added as our system evolves.
An instance of the MarketData class is used throughout the system by the various components to store and retrieve prices. Essentially, it is a container that stores the most recent tick data for each symbol. Additional helper functions are included to provide easy reference to the required information:
class MarketData:
    def __init__(self):
        self.__recent_ticks__ = dict()

    def add_last_price(self, time, symbol, price, volume):
        tick_data = TickData(symbol, time, price, volume)
        self.__recent_ticks__[symbol] = tick_data

    def add_open_price(self, time, symbol, price):
        tick_data = self.get_existing_tick_data(symbol, time)
        tick_data.open_price = price

    def get_existing_tick_data(self, symbol, time):
        if symbol not in self.__recent_ticks__:
            tick_data = TickData(symbol, time)
            self.__recent_ticks__[symbol] = tick_data
        return self.__recent_ticks__[symbol]

    def get_last_price(self, symbol):
        return self.__recent_ticks__[symbol].last_price

    def get_open_price(self, symbol):
        return self.__recent_ticks__[symbol].open_price

    def get_timestamp(self, symbol):
        return self.__recent_ticks__[symbol].timestamp
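As a rough illustration of how this container behaves over two trading days, the sketch below uses condensed copies of the two classes (the attribute name recent_ticks and all prices are made up for the example, not taken from real AAPL quotes). It highlights one detail worth knowing: add_last_price stores a fresh TickData object, so the opening price reads as zero until add_open_price is called for that day.

```python
import datetime as dt


class TickData:
    """Condensed copy of the TickData class from this section."""
    def __init__(self, symbol, timestamp, last_price=0, total_volume=0):
        self.symbol = symbol
        self.timestamp = timestamp
        self.open_price = 0
        self.last_price = last_price
        self.total_volume = total_volume


class MarketData:
    """Condensed copy of MarketData: one most-recent tick per symbol."""
    def __init__(self):
        self.recent_ticks = {}  # simplified attribute name (assumption)

    def add_last_price(self, time, symbol, price, volume):
        # Replaces any earlier tick for this symbol with a new object.
        self.recent_ticks[symbol] = TickData(symbol, time, price, volume)

    def add_open_price(self, time, symbol, price):
        if symbol not in self.recent_ticks:
            self.recent_ticks[symbol] = TickData(symbol, time)
        self.recent_ticks[symbol].open_price = price

    def get_last_price(self, symbol):
        return self.recent_ticks[symbol].last_price

    def get_open_price(self, symbol):
        return self.recent_ticks[symbol].open_price

    def get_timestamp(self, symbol):
        return self.recent_ticks[symbol].timestamp


md = MarketData()

# Day 1: store the close first, then the open (illustrative prices).
day1 = dt.datetime(2014, 1, 2)
md.add_last_price(day1, "AAPL", 101.0, 1000)
md.add_open_price(day1, "AAPL", 100.0)

# Day 2: the new tick replaces day 1, so the open is 0 until it is set.
day2 = dt.datetime(2014, 1, 3)
md.add_last_price(day2, "AAPL", 99.5, 1200)
open_before_update = md.get_open_price("AAPL")
md.add_open_price(day2, "AAPL", 101.5)

print(md.get_timestamp("AAPL"), md.get_open_price("AAPL"),
      md.get_last_price("AAPL"))
```

Only the latest tick per symbol is kept, which is why the strategy later in this section maintains its own price history.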
The MarketDataSource class helps us fetch historical data from an external source, such as Google Finance or Yahoo! Finance. The required parameter values, such as start, end, ticker, and source, are provided by the host component of this class, which we will discuss later. After saving the opening and closing prices of each day, the event_tick variable, which holds a callback function supplied by the host component, is invoked on every tick event. Notice that we are using the DataReader function of pandas (the pandas.io.data module, which has since moved to the separate pandas-datareader package) to retrieve historical prices. The acceptable parameters are yahoo for the Yahoo! Finance data source and google for the Google Finance data source:
# pandas.io.data has been removed from pandas; the separate
# pandas-datareader package now provides DataReader.
import pandas_datareader.data as web

""" Download prices from an external data source """
class MarketDataSource:
    def __init__(self):
        self.event_tick = None
        self.ticker, self.source = None, None
        self.start, self.end = None, None
        self.md = MarketData()

    def start_market_simulation(self):
        data = web.DataReader(self.ticker, self.source, self.start, self.end)
        for time, row in data.iterrows():
            self.md.add_last_price(time, self.ticker, row["Close"], row["Volume"])
            self.md.add_open_price(time, self.ticker, row["Open"])
            if self.event_tick is not None:
                self.event_tick(self.md)
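Because the simulation above depends on a live download, it can be convenient to replay prices from a local source while developing. The following is a minimal sketch of that idea, not part of the original system: CsvMarketDataSource and SAMPLE_CSV are hypothetical names, the prices are made up, and each tick is passed to the callback as a plain dictionary rather than a MarketData object to keep the example self-contained.

```python
import csv
import datetime as dt
import io

# Made-up prices for illustration only; these are not real quotes.
SAMPLE_CSV = """Date,Open,Close,Volume
2014-01-02,100.0,101.0,1000
2014-01-03,101.5,99.5,1200
2014-01-06,99.0,100.5,900
"""


class CsvMarketDataSource:
    """Hypothetical offline stand-in for MarketDataSource."""

    def __init__(self, csv_text):
        self.csv_text = csv_text
        self.event_tick = None  # callback set by the host component

    def start_market_simulation(self):
        # Replay each row in file order, invoking the callback once per day.
        for row in csv.DictReader(io.StringIO(self.csv_text)):
            tick = {
                "timestamp": dt.datetime.strptime(row["Date"], "%Y-%m-%d"),
                "open": float(row["Open"]),
                "close": float(row["Close"]),
                "volume": int(row["Volume"]),
            }
            if self.event_tick is not None:
                self.event_tick(tick)


received = []
source = CsvMarketDataSource(SAMPLE_CSV)
source.event_tick = received.append  # the "host" simply collects ticks
source.start_market_simulation()
print(len(received), "ticks replayed")
```

The callback pattern is the same as in MarketDataSource: the source knows nothing about its consumer beyond the function stored in event_tick.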
The Order class represents a single order sent by the strategy to the server. Each order contains a timestamp, the symbol, the quantity, the price, the direction (buy or sell), and whether it is a market order. In this example, we are using market orders only. Other order types, such as limit and stop orders, can be implemented if desired. Once an order is filled, the order is updated with the filled time, quantity, and price:
class Order:
    def __init__(self, timestamp, symbol, qty, is_buy, is_market_order, price=0):
        self.timestamp = timestamp
        self.symbol = symbol
        self.qty = qty
        self.price = price
        self.is_buy = is_buy
        self.is_market_order = is_market_order
        self.is_filled = False
        self.filled_price = 0
        self.filled_time = None
        self.filled_qty = 0
The Position class helps us keep track of our current market position and account balance. Note that the position_value variable starts with a value of zero. When stocks are bought, the value of the securities is debited from this account. When stocks are sold, the value of the securities is credited into this account:
class Position:
    def __init__(self):
        self.symbol = None
        self.buys, self.sells, self.net = 0, 0, 0
        self.realized_pnl = 0
        self.unrealized_pnl = 0
        self.position_value = 0

    def event_fill(self, timestamp, is_buy, qty, price):
        if is_buy:
            self.buys += qty
        else:
            self.sells += qty

        self.net = self.buys - self.sells
        changed_value = qty * price * (-1 if is_buy else 1)
        self.position_value += changed_value

        if self.net == 0:
            self.realized_pnl = self.position_value

    def update_unrealized_pnl(self, price):
        if self.net == 0:
            self.unrealized_pnl = 0
        else:
            self.unrealized_pnl = price * self.net + self.position_value

        return self.unrealized_pnl
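To make the debit and credit bookkeeping concrete, here is a small worked example using a condensed copy of the Position class. The prices and the "day 1" and "day 2" labels are invented for illustration: we short 100 shares at 50, mark the position at 48, and then cover at 47.

```python
class Position:
    """Condensed copy of the Position class from this section."""
    def __init__(self):
        self.buys, self.sells, self.net = 0, 0, 0
        self.realized_pnl = 0
        self.unrealized_pnl = 0
        self.position_value = 0

    def event_fill(self, timestamp, is_buy, qty, price):
        if is_buy:
            self.buys += qty
        else:
            self.sells += qty
        self.net = self.buys - self.sells
        # Buying debits cash (negative); selling credits cash (positive).
        self.position_value += qty * price * (-1 if is_buy else 1)
        if self.net == 0:
            self.realized_pnl = self.position_value

    def update_unrealized_pnl(self, price):
        if self.net == 0:
            self.unrealized_pnl = 0
        else:
            self.unrealized_pnl = price * self.net + self.position_value
        return self.unrealized_pnl


position = Position()

# Short 100 shares at 50: the account is credited 5000.
position.event_fill("day 1", False, 100, 50.0)
print(position.net, position.position_value)    # -100 5000.0

# Mark to market at 48: buying back would cost 4800, so we are up 200.
print(position.update_unrealized_pnl(48.0))      # 200.0

# Cover at 47: the position is flat and the profit becomes realized.
position.event_fill("day 2", True, 100, 47.0)
print(position.net, position.realized_pnl)       # 0 300.0
```

Note that realized_pnl is only refreshed when the net position returns to zero, so partial closes are not reflected until the position is flat.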
The Strategy class is the base class for all other strategy implementations. The event_tick method is called when new market tick data arrives. The event_order method is called whenever there are order updates. The event_position method is called whenever there are updates to our positions. The send_market_order method is called when the implementing strategy sends a market order to the host component to be routed to the server for execution:

""" Base strategy for implementation """
class Strategy:
    def __init__(self):
        self.event_sendorder = None

    def event_tick(self, market_data):
        pass

    def event_order(self, order):
        pass

    def event_position(self, positions):
        pass

    def send_market_order(self, symbol, qty, is_buy, timestamp):
        if self.event_sendorder is not None:
            order = Order(timestamp, symbol, qty, is_buy, True)
            self.event_sendorder(order)
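The callback wiring between a strategy and its host can be sketched as follows. This is an illustrative assumption rather than part of the original system: BuyOnceStrategy is a hypothetical subclass, Order and Strategy are trimmed-down copies, and the market data argument is a plain dictionary standing in for a MarketData object.

```python
class Order:
    """Trimmed-down Order holding only the fields used in this sketch."""
    def __init__(self, timestamp, symbol, qty, is_buy, is_market_order):
        self.timestamp = timestamp
        self.symbol = symbol
        self.qty = qty
        self.is_buy = is_buy
        self.is_market_order = is_market_order


class Strategy:
    """Condensed copy of the base class from this section."""
    def __init__(self):
        self.event_sendorder = None  # set by the host component

    def event_tick(self, market_data):
        pass

    def send_market_order(self, symbol, qty, is_buy, timestamp):
        if self.event_sendorder is not None:
            self.event_sendorder(Order(timestamp, symbol, qty, is_buy, True))


class BuyOnceStrategy(Strategy):
    """Hypothetical strategy: buys 100 shares on the first tick only."""
    def __init__(self, symbol):
        Strategy.__init__(self)
        self.symbol = symbol
        self.has_ordered = False

    def event_tick(self, market_data):
        if not self.has_ordered:
            self.has_ordered = True
            self.send_market_order(self.symbol, 100, True,
                                   market_data["timestamp"])


# The "host" here is just a list that collects routed orders.
received_orders = []
strategy = BuyOnceStrategy("AAPL")
strategy.event_sendorder = received_orders.append

strategy.event_tick({"timestamp": 1})  # dictionary stand-in for MarketData
strategy.event_tick({"timestamp": 2})
print(len(received_orders), "order(s) routed to the host")
```

If event_sendorder is never assigned, send_market_order silently does nothing, which makes a strategy safe to exercise in isolation.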
In this example, we are implementing a mean-reverting strategy with the MeanRevertingStrategy class, which inherits from the Strategy class. We will use the stock symbol AAPL.
The event_position method is overridden and updates the state of the strategy to indicate a long or a short position on every change in position. Knowing the current state of the strategy prevents us from adding to our positions and entering more orders than intended.
The event_tick method is overridden to make the trading decision on every incoming tick. The prices are stored in a pandas DataFrame object, which is used to calculate the strategy parameters. The lookback_intervals variable defines a maximum of 20 days of historical prices to store.
The calculate_z_score method implements our mean-reverting calculations. The daily percentage change of close prices over the previous day is computed. The dropna function removes any empty values from the result. The returns are then Z-scored as follows:

$$ z = \frac{x - \mu}{\sigma} $$

Here, $x$ is the most recent return, $\mu$ is the mean of returns, and $\sigma$ is the standard deviation of returns. A z_score value of 0 indicates that the score is the same as the mean. When the value of z_score rises above 1.5 or falls below -1.5, as defined by the sell_threshold and buy_threshold variables respectively, this could indicate a strong sell or buy signal, since the Z-score for the following periods is expected to revert back to the mean of zero. When a signal is generated, it can be used either to open a position or to close an existing position:

""" Implementation of a mean-reverting strategy based on the Strategy class """
import pandas as pd

class MeanRevertingStrategy(Strategy):
    def __init__(self, symbol, lookback_intervals=20,
                 buy_threshold=-1.5, sell_threshold=1.5):
        Strategy.__init__(self)
        self.symbol = symbol
        self.lookback_intervals = lookback_intervals
        self.buy_threshold = buy_threshold
        self.sell_threshold = sell_threshold
        self.prices = pd.DataFrame()
        self.is_long, self.is_short = False, False

    def event_position(self, positions):
        if self.symbol in positions:
            position = positions[self.symbol]
            self.is_long = position.net > 0
            self.is_short = position.net < 0

    def event_tick(self, market_data):
        self.store_prices(market_data)

        if len(self.prices) < self.lookback_intervals:
            return

        signal_value = self.calculate_z_score()
        timestamp = market_data.get_timestamp(self.symbol)

        if signal_value < self.buy_threshold:
            self.on_buy_signal(timestamp)
        elif signal_value > self.sell_threshold:
            self.on_sell_signal(timestamp)

    def store_prices(self, market_data):
        timestamp = market_data.get_timestamp(self.symbol)
        self.prices.loc[timestamp, "close"] = market_data.get_last_price(self.symbol)
        self.prices.loc[timestamp, "open"] = market_data.get_open_price(self.symbol)

    def calculate_z_score(self):
        # Keep only the most recent lookback window of prices
        self.prices = self.prices[-self.lookback_intervals:]
        returns = self.prices["close"].pct_change().dropna()
        # .iloc[-1] selects the most recent value by position
        z_score = ((returns - returns.mean()) / returns.std()).iloc[-1]
        return z_score

    def on_buy_signal(self, timestamp):
        if not self.is_long:
            self.send_market_order(self.symbol, 100, True, timestamp)

    def on_sell_signal(self, timestamp):
        if not self.is_short:
            self.send_market_order(self.symbol, 100, False, timestamp)
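The Z-score calculation can also be reproduced without pandas, which may help when checking the numbers by hand. The sketch below is an illustration under stated assumptions: z_score_of_last_return is a hypothetical helper, the closing prices are invented, and the sample standard deviation from the statistics module is used because it divides by n - 1, the same default as the pandas std method.

```python
import statistics


def z_score_of_last_return(closes):
    """Z-score of the most recent daily return within the given window."""
    # Daily percentage change of each close over the previous close.
    returns = [closes[i] / closes[i - 1] - 1 for i in range(1, len(closes))]
    mean = statistics.mean(returns)
    std = statistics.stdev(returns)  # sample standard deviation (n - 1)
    return (returns[-1] - mean) / std


# Invented prices: four quiet days followed by a sharp 10% jump.
closes = [100.0, 101.0, 100.0, 101.0, 100.0, 110.0]
z = z_score_of_last_return(closes)

# Same thresholds as the defaults in MeanRevertingStrategy.
buy_threshold, sell_threshold = -1.5, 1.5
if z < buy_threshold:
    signal = "BUY"
elif z > sell_threshold:
    signal = "SELL"
else:
    signal = "NONE"

print(round(z, 2), signal)
```

A large positive jump relative to the recent returns pushes the score above the sell threshold, which is the situation in which the strategy expects prices to revert and therefore sells.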
After defining all of our core components, we are now ready to implement the backtesting engine as the Backtester class.
The start_backtest method initializes our strategy, registers the evthandler_order method as the order handler for this strategy, and sets up and runs the market data source. When data is received from the market data source, the evthandler_tick method handles each incoming tick and passes it to our strategy.
Thereafter, the match_order_book method, in conjunction with the is_order_unmatched method, is called to attempt to match any outstanding orders in our system against the current market prices. The is_order_unmatched method returns True when the order is not filled, or False otherwise. On filling an order, it calls the update_filled_position method for further processing. This includes updating the position values, notifying the Strategy object of a position update, and keeping track of our profits and losses. The is_order_unmatched method also notifies the Strategy object of an order update event when an order is filled.
Lastly, the position updates are printed to the console to help us keep track of our account status. This main loop of the backtesting engine continues until the last tick from the market data source has been processed. The full implementation of the Backtester class is given as follows:
import datetime as dt
import pandas as pd

class Backtester:
    def __init__(self, symbol, start_date, end_date, data_source="google"):
        self.target_symbol = symbol
        self.data_source = data_source
        self.start_dt = start_date
        self.end_dt = end_date
        self.strategy = None
        self.unfilled_orders = []
        self.positions = dict()
        self.current_prices = None
        self.rpnl, self.upnl = pd.DataFrame(), pd.DataFrame()

    def get_timestamp(self):
        return self.current_prices.get_timestamp(self.target_symbol)

    def get_trade_date(self):
        timestamp = self.get_timestamp()
        return timestamp.strftime("%Y-%m-%d")

    def update_filled_position(self, symbol, qty, is_buy, price, timestamp):
        position = self.get_position(symbol)
        position.event_fill(timestamp, is_buy, qty, price)
        self.strategy.event_position(self.positions)
        self.rpnl.loc[timestamp, "rpnl"] = position.realized_pnl

        print(self.get_trade_date(), "Filled:",
              "BUY" if is_buy else "SELL", qty, symbol, "at", price)

    def get_position(self, symbol):
        if symbol not in self.positions:
            position = Position()
            position.symbol = symbol
            self.positions[symbol] = position

        return self.positions[symbol]

    def evthandler_order(self, order):
        self.unfilled_orders.append(order)

        print(self.get_trade_date(), "Received order:",
              "BUY" if order.is_buy else "SELL", order.qty, order.symbol)

    def match_order_book(self, prices):
        if len(self.unfilled_orders) > 0:
            # Keep only the orders that are still unmatched
            self.unfilled_orders = [
                order for order in self.unfilled_orders
                if self.is_order_unmatched(order, prices)]

    def is_order_unmatched(self, order, prices):
        symbol = order.symbol
        timestamp = prices.get_timestamp(symbol)

        if order.is_market_order and timestamp > order.timestamp:
            # Order is matched and filled at the next available open price.
            order.is_filled = True
            open_price = prices.get_open_price(symbol)
            order.filled_time = timestamp
            order.filled_price = open_price
            order.filled_qty = order.qty
            self.update_filled_position(symbol, order.qty, order.is_buy,
                                        open_price, timestamp)
            self.strategy.event_order(order)
            return False

        return True

    def print_position_status(self, symbol, prices):
        if symbol in self.positions:
            position = self.positions[symbol]
            close_price = prices.get_last_price(symbol)
            position.update_unrealized_pnl(close_price)
            self.upnl.loc[self.get_timestamp(), "upnl"] = position.unrealized_pnl

            print(self.get_trade_date(), "Net:", position.net,
                  "Value:", position.position_value,
                  "UPnL:", position.unrealized_pnl,
                  "RPnL:", position.realized_pnl)

    def evthandler_tick(self, prices):
        self.current_prices = prices
        self.strategy.event_tick(prices)
        self.match_order_book(prices)
        self.print_position_status(self.target_symbol, prices)

    def start_backtest(self):
        self.strategy = MeanRevertingStrategy(self.target_symbol)
        self.strategy.event_sendorder = self.evthandler_order

        mds = MarketDataSource()
        mds.event_tick = self.evthandler_tick
        mds.ticker = self.target_symbol
        mds.source = self.data_source
        mds.start, mds.end = self.start_dt, self.end_dt

        print("Backtesting started...")
        mds.start_market_simulation()
        print("Completed.")
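The matching rule used by the engine, where a market order placed on one tick is filled at the opening price of the first later tick, can be isolated in a few lines. The sketch below is a simplified illustration and not the Backtester itself: match_orders is a hypothetical function, orders and ticks are plain dictionaries, timestamps are integers, and the prices are invented.

```python
def match_orders(unfilled_orders, tick):
    """Fill orders placed before this tick at the tick's opening price.

    Returns a tuple of (orders still unfilled, fills made on this tick).
    """
    still_unfilled, fills = [], []
    for order in unfilled_orders:
        if tick["timestamp"] > order["timestamp"]:
            fill = dict(order)
            fill["filled_price"] = tick["open"]
            fill["filled_time"] = tick["timestamp"]
            fills.append(fill)
        else:
            # Placed on this very tick: wait for the next one.
            still_unfilled.append(order)
    return still_unfilled, fills


# Invented ticks: timestamps are day numbers for simplicity.
ticks = [
    {"timestamp": 1, "open": 10.0, "close": 10.5},
    {"timestamp": 2, "open": 10.8, "close": 10.2},
]

# A sell order generated from the close of day 1.
orders = [{"timestamp": 1, "qty": 100, "is_buy": False}]

orders, fills_day1 = match_orders(orders, ticks[0])
orders, fills_day2 = match_orders(orders, ticks[1])

print(fills_day1)  # nothing fills on the day the order is placed
print(fills_day2[0]["filled_price"], fills_day2[0]["filled_time"])
```

Requiring a strictly later timestamp is what prevents the backtest from filling an order at a price that was already known when the signal was computed.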
To run our backtester, simply create an instance of the class with the required parameters. Here, we define the ticker symbol AAPL for the period from January 1, 2014 to December 31, 2014. By default, our target market data source is defined as google. Then, we call the start_backtest method:
>>> backtester = Backtester("AAPL",
...                         dt.datetime(2014, 1, 1),
...                         dt.datetime(2014, 12, 31))
>>> backtester.start_backtest()
The output will begin to run like this:
Backtesting started...
2014-02-27 Received order: SELL 100 AAPL
2014-02-28 Filled: SELL 100 AAPL at 75.58
2014-02-28 Net: -100 Value: 7558.0 UPnL: 40.0 RPnL: 0
2014-03-03 Net: -100 Value: 7558.0 UPnL: 19.0 RPnL: 0
2014-03-04 Net: -100 Value: 7558.0 UPnL: -31.0 RPnL: 0
…
Almost a year's worth of daily information will be printed onto the console. The output will end with something like this:
…
2014-12-29 Net: -100 Value: 12504.0 UPnL: 1113.0 RPnL: 1278.0
2014-12-30 Net: -100 Value: 12504.0 UPnL: 1252.0 RPnL: 1278.0
2014-12-31 Net: -100 Value: 12504.0 UPnL: 1466.0 RPnL: 1278.0
Completed.
In the MeanRevertingStrategy class, we trade shares of AAPL in quantities of 100. Note that when the backtest is completed, we still have an outstanding short position of 100 shares. Our realized profit and loss is $1,278, while the unrealized profit from the short position is $1,466.
Since we store the daily realized and unrealized profits and losses in two pandas DataFrame objects, named rpnl and upnl respectively, we can plot the results to visualize the returns from our strategy:
>>> import matplotlib.pyplot as plt
>>> backtester.rpnl.plot()
>>> plt.show()

>>> backtester.upnl.plot()
>>> plt.show()
In this section, we looked at creating a simple backtesting system based on daily closing prices for a mean-reverting strategy. Several considerations would make such a backtesting model more realistic. Are historical daily prices sufficient to test our model? Should intra-day limit orders be used instead? Our account value started from zero; how can we reflect our capital requirements accurately? Are we able to borrow shares for shorting?
Since we took an object-oriented approach to create a backtesting system, how easy would it be to integrate other components in the future? A trading system could accept more than one source of market data. We could also create components that allow us to deploy our system to the production environment.
The list of concerns mentioned is not exhaustive. To guide us in implementing a robust backtesting model, the next section spells out ten considerations in the design of such a system.