Chapter 12
Implementations
As we have proposed several online portfolio selection (OLPS) algorithms, we are
interested in whether they work in real markets. To examine their empirical efficacy,
we conducted an extensive set of empirical studies on a variety of real datasets. In our
evaluations, we adopted six real datasets, which were collected from several diverse
financial markets. The performance metrics include cumulative wealth (return) and
risk-adjusted returns (based on volatility risk and drawdown risk). We also compared
the proposed algorithms with various existing algorithms. The results clearly demonstrate that the proposed algorithms consistently surpass the state-of-the-art techniques in terms of both metrics.
This chapter is organized as follows. Section 12.1 describes the experimental platform, that is, the OLPS platform. Section 12.2 details the experimental testbed, including six real datasets. Section 12.3 sets up all the proposed algorithms and illustrates several compared approaches. Section 12.4 introduces the performance metrics used for the empirical studies. Finally, Section 12.5 summarizes this chapter.
12.1 The OLPS Platform
To evaluate the performance of a proposed algorithm, researchers and practitioners usually implement a back-test system that simulates the strategies using historical market data. We also designed a back-test system, named “OLPS,” as follows, and
Appendix A describes the details of the OLPS toolbox. It implements a frame-
work for back-testing and various algorithms for online portfolio selection. Based
on MATLAB, it is compatible with Windows, Linux, and Mac OS. Figure 12.1
illustrates the structure of the OLPS toolkit, which consists of three parts. The first
part on the upper left preprocesses data, that is, it loads a specified dataset and initializes the trading environments, such as log files and timing variables. The second part on
the lower level calls OLPS algorithms and simulates the trading process for strategies
based on the data prepared in the first part. The third part in the upper right postprocesses the outputs from the second part, that is, it statistically analyzes the returns and calculates some risk-adjusted returns.
More details on MATLAB are available at http://www.mathworks.com
T&F Cat #K23731 — K23731_C012 — page 95 — 9/30/2015 — 16:46
[Figure 12.1 depicts three modules. Preprocess: load data; initialize log files. Algorithmic trading: benchmarks (Market, Best stock, BCRP); follow-the-winner approaches (universal portfolios, exponential gradient, online Newton step, switching portfolios); follow-the-loser approaches (anticorrelation, passive–aggressive mean reversion, confidence-weighted mean reversion, online moving average reversion); pattern matching–based approaches (nonparametric kernel-based log-optimal, nonparametric nearest neighbor log-optimal, correlation-driven nonparametric learning). Postprocess: statistical t-test; volatility risk and Sharpe ratio; drawdown analysis and Calmar ratio.]
Figure 12.1 Structure of the OLPS toolbox.
12.1.1 Preprocess
This step aims to prepare trading environments. As existing datasets are often in MAT files, OLPS accepts datasets in MAT format. The dataset often contains an n × m
matrix, where n denotes the number of trading periods and m refers to the number of
assets. It is straightforward to incorporate market feeds
from real markets, such that
the toolkit can handle real-time data and conduct paper or even real trading.
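Concretely, the n × m price-relative matrix can be built from raw closing prices by elementwise division of consecutive rows. The following is a minimal sketch in Python rather than the toolbox's MATLAB, with hypothetical function names:

```python
import numpy as np

def to_price_relatives(prices):
    """Convert an (n+1) x m matrix of closing prices into an n x m matrix
    of price relatives x_t = p_t / p_{t-1} (hypothetical helper)."""
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1]

# Two assets over four days: the first rises 10% daily, the second is flat.
prices = np.array([[10.0, 50.0],
                   [11.0, 50.0],
                   [12.1, 50.0],
                   [13.31, 50.0]])
x = to_price_relatives(prices)
print(x.shape)  # (3, 2)
```

Each row of the result is one trading period, matching the n × m layout the toolbox expects.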
12.1.2 Algorithmic Trading
This step conducts simulations based on historical real-market data. In our framework,
implementing a new strategy generally requires four files: a start file, a run file, a
kernel file, and an expert file. The start (entry) file extracts parameters and calls the
corresponding run file. The run file simulates a whole trading process and calls its
kernel file to construct a portfolio for each period, which is used for rebalancing.
The kernel file outputs a final portfolio, and it facilitates the development of meta-algorithms, which effectively combine multiple experts’ portfolios. The expert file outputs one portfolio depending on the input data and specific parameters. If there is only one expert, the kernel file is unnecessary, and the run file directly calls the expert file.
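The run-file loop described above can be illustrated with a minimal back-test sketch; the function names and interface below are illustrative, not the toolbox's actual API. The strategy sees only past price relatives, constructs a portfolio, and the period return is the inner product of the portfolio with the realized price relatives:

```python
import numpy as np

def backtest(price_relatives, strategy):
    """Minimal back-test loop (illustrative). `strategy(history)` returns
    the portfolio weights (summing to 1) used for the next period."""
    n, _ = price_relatives.shape
    wealth = 1.0
    for t in range(n):
        b = strategy(price_relatives[:t])               # decide before seeing x_t
        wealth *= float(np.dot(b, price_relatives[t]))  # multiply by period return b . x_t
    return wealth

def uniform_crp(history):
    # Uniform constant rebalanced portfolio: equal weight on both assets.
    return np.array([0.5, 0.5])

x = np.array([[1.1, 0.9], [0.9, 1.1], [1.1, 0.9]])
print(backtest(x, uniform_crp))
```

A real kernel file would additionally aggregate several experts' portfolios before rebalancing, but the wealth-update step is the same.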
OLPS implements the following OLPS algorithms:
Benchmarks (Market, Best stock, and BCRP).
Follow the winner approaches (UP, EG, and ONS): make portfolio decisions following the assumption that the next price relatives (or experts for UP) will follow the previous ones.
Follow the loser approaches (Anticor, PAMR, CWMR, and OLMAR): make portfolio decisions by assuming that the next price relatives will revert to previous trends.
A full description of MAT files can be found at http://www.mathworks.com/help/pdf_doc/matlab/matfile_format.pdf
For example, Interactive Brokers (http://www.interactivebrokers.com) provides free APIs.
Both paper and real trading require users to implement an order submission step, while back-testing does not.
Pattern matching–based approaches (B^K, B^NN, and CORN): locate a set containing similar price relatives and make optimal portfolios based on the set.
Others: some are ad hoc algorithms, such as M0/T0.
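As one concrete instance of the follow-the-winner family, the exponential gradient (EG) update of Helmbold et al. (1998) tilts the current portfolio toward assets that just performed well and then renormalizes. A sketch in Python (the learning rate value is illustrative):

```python
import numpy as np

def eg_update(b, x, eta=0.05):
    """Exponential gradient (EG) portfolio update: multiply each weight by
    exp(eta * x_i / (b . x)) and renormalize to the simplex."""
    b = np.asarray(b, dtype=float)
    x = np.asarray(x, dtype=float)
    w = b * np.exp(eta * x / np.dot(b, x))
    return w / w.sum()

b = np.array([0.5, 0.5])
x = np.array([1.2, 0.8])      # asset 0 outperformed last period
b_next = eg_update(b, x)
print(b_next[0] > b_next[1])  # True: weight shifts toward the winner
```

A follow-the-loser method such as PAMR would instead move weight away from the recent winner, betting on mean reversion.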
12.1.3 Postprocess
After the algorithmic trading simulation, this step processes the results by providing
the following performance metrics:
Cumulative return: The metric most widely used in related studies;
Volatility and Sharpe ratio: Typically used to measure risk-adjusted return in the
investment industry;
Drawdown and Calmar ratio: Used to measure downside risk and related risk-
adjusted return;
T-test statistics: Tests whether a strategy’s return is significantly different from that
of the market.
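Two of these risk measures can be sketched as follows (Python for illustration; the helper names are hypothetical, not the toolbox's API). The Sharpe ratio divides mean excess return by its standard deviation, and the maximum drawdown is the largest peak-to-trough decline of the wealth curve; the Calmar ratio then divides annualized return by this drawdown:

```python
import numpy as np

def sharpe_ratio(period_returns, periods_per_year=252, rf=0.0):
    """Annualized Sharpe ratio from simple per-period returns."""
    r = np.asarray(period_returns, dtype=float) - rf / periods_per_year
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)

def max_drawdown(wealth_curve):
    """Largest relative peak-to-trough decline of a cumulative wealth curve."""
    w = np.asarray(wealth_curve, dtype=float)
    peaks = np.maximum.accumulate(w)   # running maximum of wealth so far
    return float(np.max(1.0 - w / peaks))

wealth = np.array([1.0, 1.2, 0.9, 1.5, 1.3])
print(max_drawdown(wealth))  # the 1.2 -> 0.9 fall is the worst decline, 0.25
```

The t-test metric simply compares the strategy's per-period returns against the market's via a standard paired test.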
12.2 Data
In our study, we focus on historical daily closing prices in stock markets, which are easy to obtain from public domains (such as Yahoo Finance and Google Finance), and thus are publicly available to other researchers. Data from other types of markets, such as high-frequency intraday quotes and Forex markets, are either too expensive or hard to obtain and process, and thus may reduce the experimental reproducibility. As summarized in Table 12.1, six real and diverse datasets from several financial markets are employed.
The first dataset, “NYSE (O),” is one “standard” dataset pioneered by Cover
(1991) and followed by others (Helmbold et al. 1998; Borodin et al. 2004; Agarwal et al. 2006; Györfi et al. 2006, 2008). This dataset contains 5651 daily price relatives of 36 stocks in the New York Stock Exchange (NYSE) for a 22-year period from July 3, 1962, to December 31, 1984.
The second dataset is an extended version of the NYSE (O) dataset. For consistency, we collected the latest data in the NYSE from January 1, 1985, to June 30, 2010, a period that consists of 6431 trading days. We denote this new dataset as
“NYSE (N).” Note that the new dataset consists of 23 stocks rather than the previous 36 stocks owing to amalgamations and bankruptcies. All self-collected price relatives are adjusted for splits and dividends, which is consistent with the previous “NYSE (O)” dataset.

Yahoo Finance: http://finance.yahoo.com; and Google Finance: http://www.google.com/finance
We did evaluate certain algorithms using high-frequency data and weekly data, as in Li et al. (2013).
All related codes and datasets, including their compositions, are available at http://stevenhoi.org/olps
Borodin et al. (2004)’s datasets (NYSE (O), TSE, SP500, and DJIA) are also available at http://www.cs.technion.ac.il/rani/portfolios/
According to Helmbold et al. (1998), the dataset was originally collected by Hal Stern. The stocks are mainly large cap stocks in NYSE; however, we do not know the criteria for choosing these stocks.
The dataset before 2007 was collected by Gábor Gelencsér (http://www.cs.bme.hu/oti/portfolio); we collected the remaining data from 2007 to 2010 via Yahoo Finance.

Table 12.1 Summary of the six datasets from real markets

Dataset   Market  Region  Time Frame                          # Periods  # Assets
NYSE (O)  Stock   USA     July 3, 1962–December 31, 1984      5651       36
NYSE (N)  Stock   USA     January 1, 1985–June 30, 2010       6431       23
TSE       Stock   CA      January 4, 1994–December 31, 1998   1259       88
SP500     Stock   USA     January 2, 1998–January 31, 2003    1276       25
MSCI      Index   Global  April 1, 2006–March 31, 2010        1043       24
DJIA      Stock   USA     January 14, 2001–January 14, 2003   507        30
The third dataset, “TSE,” is collected by Borodin et al. (2004), and it consists
of 88 stocks from the Toronto Stock Exchange (TSE) containing price relatives of
1259 trading days, ranging from January 4, 1994, to December 31, 1998. The fourth dataset, “SP500,” is also collected by Borodin et al. (2004), and it consists of the 25 stocks with the largest market capitalizations among the SP500 components. It ranges from January 2, 1998, to January 31, 2003, containing 1276 trading days.
The fifth dataset is “MSCI,” which is a collection of global equity indices that constitute the MSCI World Index. It contains 24 indices that represent the equity markets of 24 countries around the world, and it consists of a total of 1043 trading days, ranging from April 1, 2006, to March 31, 2010. The final dataset is the DJIA dataset (Borodin et al. 2004), which consists of 30 Dow Jones composite stocks. DJIA contains 507 trading days, ranging from January 14, 2001, to January 14, 2003.
Besides the six real-market datasets, in the main experiments (i.e., Experiment 1 in Section 13.1), we also evaluate each dataset in its reversed form (Borodin et al. 2004). For each dataset, we create a reversed dataset, which reverses the original order and inverts the price relatives. We denote these reversed datasets using a −1 superscript on the original dataset names. By nature, these reversed datasets are quite different from the original datasets, and we are interested in the behaviors of the proposed algorithms on such artificial datasets.
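The reversal construction amounts to reversing the rows in time and taking the elementwise reciprocal of the price relatives. A two-line sketch in Python (names illustrative):

```python
import numpy as np

def reverse_dataset(price_relatives):
    """Reversed dataset of Borodin et al. (2004): reverse the time order
    of the rows and invert each price relative (x -> 1/x)."""
    x = np.asarray(price_relatives, dtype=float)
    return 1.0 / x[::-1]

x = np.array([[2.0, 0.5],
              [1.25, 0.8]])
print(reverse_dataset(x))  # rows reversed, each entry inverted
```

Intuitively, an asset that trended up in the original dataset trends down in the reversed one, so momentum and mean-reversion assumptions are stressed in opposite directions.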
Unlike previous studies, the above testbed covers much longer trading peri-
ods from 1962 to 2010 and much more diversified markets, which enables us to
examine the behaviors of the proposed strategies under different events and crises.
For example, it covers several well-known events in the stock markets, such as the dot-com bubble from 1995 to 2000 and the subprime mortgage crisis from 2007 to 2009. The five stock datasets are mainly chosen to test the capability of the proposed algorithms on regional stock markets, while the index dataset aims to test their capability on global indices, which may be potentially applicable to a fund of funds (FOF).

The constituents of the MSCI World Index are available on MSCI Barra (http://www.mscibarra.com), accessed on 28 May 2010.
As a remark, although we numerically test the proposed algorithms on stock
and exchange traded funds (ETF) markets, we note that the proposed strategies could
be generally applied to any type of financial market.
12.3 Setups
In our experiments, we implemented all the proposed approaches: CORN-U,
CORN-K, PAMR, PAMR-1, PAMR-2, CWMR-Var, CWMR-Stdev, OLMAR-1, and
OLMAR-2. For CWMR algorithms, we only present the results achieved by the
deterministic versions. The results of the stochastic versions are presented in Li et al.
(2013). Besides individual algorithms, we also designed their buy and hold (BAH) versions, whose results can be found in their respective studies (Li et al. 2011b, 2012, 2013; Li and Hoi 2012). Without ambiguity, when referring to CORN, PAMR,
CWMR, and OLMAR, we often focus on their representative versions, that is,
CORN-U, PAMR, CWMR-Stdev, and OLMAR-1, respectively.
As the proposed algorithms are all online, we follow the existing work and simply
set the parameters empirically without tuning for each dataset separately. Note that
the best values for these parameters are often dataset dependent, and our choices are
not always the best, as we will further evaluate in Section 13.3. Below, we introduce
the parameter settings of the proposed algorithms.
For the proposed CORN experts, two parameters can affect their performance, that is, the correlation coefficient threshold ρ and the window size w. In our evaluations, we simply fix ρ = 0.1 and w = 5 for the CORN-U algorithm, which is not always the best choice. For the CORN-K algorithm, we first fix W = 5, P = 10, and K = 50, which effectively chooses all experts in the experiments; we denote this setting as “CORN-K1.” We also provide “CORN-K2,” whose parameters are fixed as W = 5, P = 10, and K = 5.
There are two key parameters in the proposed PAMR algorithms. One is the sensitivity parameter ε, and the other is the aggressiveness parameter C. Specifically, for all datasets and experiments, we set the sensitivity parameter ε to 0.5 in the three algorithms and set the aggressiveness parameter C to 500 in both PAMR-1 and PAMR-2, with which the cumulative wealth achieved tends to be stable on most datasets. Our experiments on parameter sensitivity show that the proposed PAMR algorithms are quite robust with respect to different parameter settings.
CWMR has two key parameters, that is, the confidence parameter φ and the sensitivity parameter ε. We set the sensitivity parameter ε to 0.5 and the confidence parameter φ to 2.0, or equivalently a 95% confidence level, in both CWMR-Var and CWMR-Stdev. As the results show, the proposed CWMR algorithm is generally
Note that not every index is tradable through ETFs.
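For reference, the parameter settings stated above can be gathered into a single configuration sketch (a Python dict; the layout and key names are illustrative, while the values are those used in the text):

```python
# Parameter settings used in the experiments, as stated in the text.
PARAMS = {
    "CORN-U":     {"rho": 0.1, "w": 5},
    "CORN-K1":    {"W": 5, "P": 10, "K": 50},   # K = 50 chooses all experts
    "CORN-K2":    {"W": 5, "P": 10, "K": 5},
    "PAMR":       {"epsilon": 0.5},
    "PAMR-1":     {"epsilon": 0.5, "C": 500},
    "PAMR-2":     {"epsilon": 0.5, "C": 500},
    "CWMR-Var":   {"epsilon": 0.5, "phi": 2.0},  # phi = 2.0 ~ 95% confidence
    "CWMR-Stdev": {"epsilon": 0.5, "phi": 2.0},
}
print(PARAMS["PAMR-1"]["C"])  # 500
```

Keeping all settings in one place makes it easy to verify that no parameter was tuned per dataset.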