Non-relational databases are widespread. Because more and more data takes the form of time series, this type of database has developed rapidly over the last decade. The best non-relational database for time series is KDB, which is designed for high performance on this kind of data. It has many competitors, including InfluxDB, MongoDB, Cassandra, TimescaleDB, OpenTSDB, and Graphite.
All of these databases have their pros and cons:
| Database | Pros | Cons |
| --- | --- | --- |
| KDB | High performance | Price; very difficult to use because of a non-SQL language |
| InfluxDB | Free, performant, quick start | Small community; poor performance analysis tool, no security |
| MongoDB | Faster than relational databases | No data joins; slow |
| Cassandra | Faster than relational databases | Unpredictable performance |
| TimescaleDB | SQL support | Performance |
| Graphite | Free, widespread support | Performance |
| OpenTSDB | Faster than relational databases | Small number of features |
As the table shows, it is difficult to choose an alternative to KDB. We will now write a Python example using the KDB library pyq, similar to the one we created for PostgreSQL:
from datetime import date
from pyq import q

# Run this once on the kdb side to create the empty table:
# googdata:([]dt:();high:();low:();open:();close:();volume:();adj_close:())

# Insert two daily rows: date, high, low, open, close, volume, adjusted close
q.insert('googdata', (date(2014, 1, 2), 555.263550, 550.549194, 554.125916, 552.963501, 3666400.0, 552.963501))
q.insert('googdata', (date(2014, 1, 3), 554.856201, 548.894958, 553.897461, 548.929749, 3355000.0, 548.929749))

# Display the whole table
q.googdata.show()
                  High         Low        Open       Close     Volume   Adj Close
Date
2014-01-02  555.263550  550.549194  554.125916  552.963501  3666400.0  552.963501
2014-01-03  554.856201  548.894958  553.897461  548.929749  3355000.0  548.929749
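# Define this function on the kdb side; it returns the rows for one date:
# f:{[d]select from googdata where dt=d}

# Pass a date object (not a string) so that it compares against the dt column
x = q.f(date(2014, 1, 2))
x.show()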
2014-01-02 555.263550 550.549194 554.125916 552.963501 3666400.0 552.963501
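Because the choice of storage mostly comes down to how quickly a query like this one returns, it can help to time the query in isolation. The following is only a minimal sketch under stated assumptions: `ROWS`, `select_by_date`, and `best_time` are hypothetical names introduced for illustration, and the in-memory list stands in for the kdb table so that the example runs without a kdb installation.

```python
import time
from datetime import date

# Stand-in data (assumption): the two rows inserted above, held in memory
# so the sketch runs without kdb or pyq.
ROWS = [
    (date(2014, 1, 2), 555.263550, 550.549194, 554.125916, 552.963501, 3666400.0, 552.963501),
    (date(2014, 1, 3), 554.856201, 548.894958, 553.897461, 548.929749, 3355000.0, 548.929749),
]


def select_by_date(d):
    """Hypothetical stand-in for the q function f: keep rows whose date equals d."""
    return [row for row in ROWS if row[0] == d]


def best_time(fn, *args, repeats=5):
    """Call fn(*args) `repeats` times; return (last result, fastest time in seconds)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return result, best


rows, seconds = best_time(select_by_date, date(2014, 1, 2))
print(len(rows), "row(s) returned; fastest run took", seconds, "seconds")
```

With a working pyq session, the same helper could presumably wrap the kdb query instead, for example `best_time(q.f, date(2014, 1, 2))`, which would give a like-for-like number to compare against other storage back ends.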
This code ends this section on data storage. Data storage is a critical part of your backtester's design: the faster the data can be retrieved, the shorter each backtest runs, and the more backtests you can run to validate your trading strategy. Now that we have covered the different ways of storing financial data, we will introduce how a backtester works.