Tick Data for Backtesting and Algo Trading: Difference between revisions

From Volatility.RED
No edit summary
No edit summary
 
(25 intermediate revisions by 2 users not shown)
Line 1: Line 1:
I noticed that many guides or lists of tick data sources were out of date, broken, or otherwise not working, so I decided to make an updated list with modern sources for use as a resource with the algo trading community. I will also keep this up to date as new sources appear and old sources break.
I noticed that many guides or lists of tick data sources were out of date, broken, or otherwise not working, so I decided to make an updated list with modern sources for use as a resource with the [[Algorithmic_Trading | algo trading]] community. I will also keep this up to date as new sources appear and old sources break.


Below are the guides for downloading tick or bar level data for various markets which can be used for backtesting and analysis.
Below are the guides for downloading tick or bar level data for various markets which can be used for backtesting and analysis.
Line 7: Line 7:
____TOC____
____TOC____


=='''<p style="font-size:18px">Stocks [Daily Bars]'''</p>==
=='''Stock Daily Bar Data'''==


Yahoo Finance - US and International stock exchanges - On a stock by stock basis, you can download decades of historical daily bar data by doing the following steps:
===Yahoo Finance===


::1. Go to https://finance.yahoo.com
US and International stock exchanges - On a stock-by-stock basis, you can download decades of historical daily bar data by doing the following steps:
::2.In the search field, type in your desired symbol or company name and select the appropriate result
::3.Click the ''''Historical Data'''' tab
::4.Set the date range to ''''MAX''''
::5.Hit the ''''Apply'''' button
::6.Click the ''''Download Data'''' link just below the ''''Apply'''' button
::7.Enjoy your daily bar data


'''Google Finance -''' Rest in peace.. :( Google nuked their finance site back to the stone age many years ago so this service is pretty useless now (Honestly Google, what were you thinking? Google Finance was awesome back in 2012 and prior.) Google also used to have an API but that too got turned off. I included an entry for Google here because many people will be searching around the net and find references to using Google Finance's API but be unsure why it's not working as expected.
*Go to https://finance.yahoo.com
*In the search field, type in your desired symbol or company name and select the appropriate result
*Click the ''''Historical Data'''' tab
*Set the date range to ''''MAX''''
*Hit the ''''Apply'''' button
*Click the ''''Download Data'''' link just below the ''''Apply'''' button
*Enjoy your daily bar data


'''Cautious mention - QuantConnect - tick and minute bar data on stocks and forex.''' I'm currently reviewing QuantConnect as a potential source. They do offer a lot of free data but since their ecosystem was built around offering up crowd sourced strategies to prop firms I'm hesitant to recommend them until I've wrapped my head around their offering and can say I trust their platform.
===Google Finance===
Rest in peace.. :( Google nuked their finance site back to the stone age many years ago so this service is pretty useless now (Honestly Google, what were you thinking? Google Finance was awesome back in 2012 and prior.) Google also used to have an API but that too got turned off. I included an entry for Google here because many people will be searching around the net and find references to using Google Finance's API but be unsure why it's not working as expected.


===[https://pypi.org/project/yfinance/ yfinance Python Library]===


=='''<p style="font-size:18px">Forex [Tick Data]'''</p>==
yfinance aims to solve this problem by offering a reliable, threaded, and Pythonic way to download historical market data from Yahoo! finance.
''(Since tick data is available from multiple sources without cost, I will only mention bar data at the end.*)''


Jack from [https://fxgears.com/ FXGears.com] has provided a basic daily bar download script for yfinance in Python, but check out the yfinance docs for more advanced features:


'''<p style="font-size:18px">[https://fxgears.com/out/darwinex Darwinex]-</p>'''
Make sure you have yfinance installed. From your command line:
::Darwinex is easy but you'll need to have a live account. Anyone can [https://fxgears.com/out/darwinex/register signup for an account (live)] and head over to this page here once logged in: [https://fxgears.com/out/darwinex/tickdata Historical Tick Data Download]. From there, just click the '''"Request FTP Access"''' button to have your FTP login details generated for you. Darwinex provides tick level data all major and minor currency pairs, as well as index and commodity CFDs, with most of them going back to mid-late 2017. You don't need to deposit to have a live account at Darwinex, you just need to have a verified account with them, so there's little downside to enabling access to this high quality source of tick data. (As a side benefit, you can also get access to historical price data of their [http://fxgears.com/out/darwinex/explore Darwin Exchange], so you can model investing in other traders and model their strategy returns, which is pretty cool.)
<syntaxhighlight lang="PowerShell">
pip install yfinance
</syntaxhighlight>


<syntaxhighlight lang="python" line='line'>
import yfinance as yf
import datetime
import os


'''<p style="font-size:18px">[https://fxgears.com/out/pepperstone Pepperstone] / Integral / [https://www.truefx.com/truefx-historical-downloads/ TrueFX]''' -</p>'''
# define start date, go back further if you need but not all years are available for all symbols.
::Pepperstone relies on price feeds and liquidity very similar to Integral's offering, so similar that for a while Pepperstone used to publish Integral's data on their own FTP for easy customer access. Unfortunately, they stopped offering this FTP access some time ago (thus why most online tutorials referencing them are out of date) and the source of data that Pepperstone currently recommends is Integral's TrueFX data service offering. [https://www.truefx.com/truefx-historical-downloads/ A free account at TrueFX.com'] will get you access to tick data on most major and minor currency pairs from the past year only (Currently only 2019 YTD is available for free.) I encouraged Pepperstone to bring back their historical data access to regain access to years prior to 2019 and if they ever open back up then I'll update this spot here with access info.​
start_date = '2020-01-01'


end_date = datetime.date.today() - datetime.timedelta(days=1)
end_date = end_date.strftime(format="%Y-%m-%d")


'''<p style="font-size:18px">[https://fxgears.com/out/dukascopy Dukascopy] -</p>'''
symbols = ['SPY', 'IBM', 'BAC']
::This source is a staple in the forex tick data world. Dukascopy Bank has been around for a long time and has offered historical tick data on their products with dates going back into the mid-early 2000's. Like the other sources, you'll need signup to their site, but they do not require a live account to access data. To access, all you need to do is visit the [https://fxgears.com/out/dukascopy/tickdata Dukascopy Bank Historical Data Feed page], and use the historical download tool provided (with registering or logging in when prompted before it will let you download.) The download tool makes it easy to select what data you want, but note that all non-forex listings here (like stocks) are CFDs on Duka's platform, NOT the underlying stock data that trades on a stock exchange. (If you're interested in Dukascopy but need to trade under Euro zone regulations, you can find their [https://fxgears.com/out/dukascopy/europe Europe subsidiary here] with different trading conditions.)​


for item in symbols:
    working_df = yf.download(item, start=start_date, end=end_date)
    working_df.reset_index(inplace=True)
    working_df.to_csv('{}_{}.csv'.format(item, end_date), index=False)
</syntaxhighlight>


<nowiki>*</nowiki>Regarding Bar type data - Downscale it from the tick data above as needed. However, if you really only want daily bar data, you can find it from Oanda's website, or others pretty easily.
Note: You will have to know the Yahoo! Finance symbology to download your target symbols. (It's sometimes not intuitive, "CAD=X" is USD/CAD for example.)


=='''Forex [Tick Data]'''==


=='''<p style="font-size:18px">Crypto</p>'''==
Quality of data matters and the below options are some of the better quality sources available for free.


===[https://geni.us/Darwinex Darwinex]===
::Darwinex is easy but you'll need to have a live account. Anyone can [https://geni.us/Darwinex sign up for an account (live)] and head over to this page here once logged in: [https://geni.us/DarwinexTickData Historical Tick Data Download]. From there, just click the '''"Request FTP Access"''' button to have your FTP login details generated for you. Darwinex provides tick-level data for all major and minor [[currency]] pairs, as well as index and commodity CFDs, with most of them going back to mid-to-late 2017. You don't need to deposit to have a live account at Darwinex; you just need to have a verified account with them, so there's little downside to enabling access to this high-quality source of tick data. (As a side benefit, you can also get access to historical price data of their Darwin Exchange, so you can model investing in other [[Trader_Types | traders]] and model their [[Technical_Trading_Strategies | strategy]] returns, which is pretty cool.)


'''[https://fxgears.com/out/binance Binance -]''' One of the top exchanges by volume with easily one of the largest collections of crypto pairs to trade. Binance is very API friendly, and is structured in a way that even trading smaller amounts can be done efficiently. Recently, Binance opened up trading on leveraged products (spot, and futures,) as well as started offering options.
===[https://geni.us/Pepperstone Pepperstone] / Integral / [https://www.truefx.com/truefx-historical-downloads/ TrueFX]===


https://www.truefx.com/truefx-historical-downloads/


You're going to need an account, which is free to get started, so if you don't already have an account [https://fxgears.com/out/binance you can register here.]
::Pepperstone relies on price feeds and liquidity very similar to Integral's offering, so similar that for a while Pepperstone published Integral's data on their own FTP for easy customer access. Unfortunately, they stopped offering this FTP access some time ago (which is why most online tutorials referencing them are out of date), and the source of data that Pepperstone currently recommends is Integral's TrueFX data service. [https://www.truefx.com/truefx-historical-downloads/ A free account at TrueFX.com] will get you access to tick data on most major and minor [[currency]] pairs from the past year only (currently only 2019 YTD is available for free). I encouraged Pepperstone to bring back their historical data access to regain access to years prior to 2019, and if they ever open back up then I'll update this spot here with access info.
 
=='''Crypto Minute Bar Data'''==
 
 
===[https://geni.us/TickDataBinance Binance]===
 
One of the top exchanges by volume with easily one of the largest collections of crypto pairs to [[Trading | trade]]. Binance is very API friendly and is structured in a way that even trading smaller amounts can be done efficiently. Recently, Binance opened up trading on leveraged products (spot, and futures,) as well as started offering options.
 
 
You're going to need an account, which is free to get started, so if you don't already have an account [https://geni.us/TickDataBinance you can register here.]




Line 56: Line 84:




On the following page, find the create field and enter any nick name you want:<br>
On the following page, find the Create field and enter any nickname you want:<br>
https://i.imgur.com/RPmtR41.png
https://i.imgur.com/RPmtR41.png


Line 62: Line 90:
Hit the create button. You'll likely have to authenticate or verify via email that you want to generate an API key, but once done you will see the generated API key itself, and the API secret code as well. Save these somewhere secure and safe.
Hit the create button. You'll likely have to authenticate or verify via email that you want to generate an API key, but once done you will see the generated API key itself, and the API secret code as well. Save these somewhere secure and safe.


Next up we need to setup Python on our system to do the leg work of downloading data. Head over to https://python.org and grab the latest version of Python 3 for your OS. Install said package.
Next up we need to set up Python on our system to do the leg work of downloading data. Head over to https://python.org and grab the latest version of Python 3 for your OS. Install said package.


Once installed we will add the python-binance and pandas packages to access the Binance API from within Python and to save the resulting data to a CSV file.
Once installed we will add the python-binance and pandas packages to access the Binance API from within Python and to save the resulting data to a CSV file.
Line 68: Line 96:
To do this, [https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-r2-and-2012/jj717276(v%3Dws.11) open a CMD (Command Line) window as Administrator on Windows] (or the appropriate terminal in Linux / OSX) and run the following commands:
To do this, [https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-r2-and-2012/jj717276(v%3Dws.11) open a CMD (Command Line) window as Administrator on Windows] (or the appropriate terminal in Linux / OSX) and run the following commands:


 
Install python-binance python library and pandas from your command line:
{| class ="wikitable"
<syntaxhighlight lang="PowerShell">
| style="background-color: #ccf;" |
 
pip install python-binance
pip install python-binance
pip install pandas
pip install pandas
 
</syntaxhighlight>
|}




Once that's done, copy the code below to a new python file and save it somewhere handy:
Once that's done, copy the code below to a new python file and save it somewhere handy:


<syntaxhighlight lang="python" line='line'>
import pandas as pd
from binance.client import Client
import datetime


{| class ="wikitable"
# YOUR API KEYS HERE
| style="background-color: #ccf;" |
api_key = ""    #Enter your own API-key here
Python:
api_secret = "" #Enter your own API-secret here
 
<span style="color:Purple">import</span> pandas <span style="color:Brown">as</span> pd<br>
<span style="color:Brown">from</span> binance.client import Client<br>
<span style="color:Brown">import</span> datetime<br>
 
 
<span style="color:DarkSalmon"><nowiki>#</nowiki> YOUR API KEYS HERE</span> <br>
api_key = ""    <span style="color:DarkSalmon">#Enter your own API-key here</span> <br>
api_secret = ""   <span style="color:DarkSalmon">#Enter your own API-secret here</span>
 


bclient = Client(api_key=api_key, api_secret=api_secret)
bclient = Client(api_key=api_key, api_secret=api_secret)


 
start_date = datetime.datetime.strptime('1 Jan 2016', '%d %b %Y')
start_date = datetime.datetime.strptime(<span style="color:Brown">'1 Jan 2016', '%d %b %Y'</span>)
today = datetime.datetime.today()
today = datetime.datetime.today()


def binanceBarExtractor(symbol):
    print('working...')
    filename = '{}_MinuteBars.csv'.format(symbol)


<span style="color:Purple">def</span> <span style="color:Blue">binanceBarExtractor</span>(symbol):
    klines = bclient.get_historical_klines(symbol, Client.KLINE_INTERVAL_1MINUTE, start_date.strftime("%d %b %Y %H:%M:%S"), today.strftime("%d %b %Y %H:%M:%S"), 1000)
::<span style="color:Purple">print</span>(<span style="color:Brown">'working...'</span>)
    data = pd.DataFrame(klines, columns = ['timestamp', 'open', 'high', 'low', 'close', 'volume', 'close_time', 'quote_av', 'trades', 'tb_base_av', 'tb_quote_av', 'ignore' ])
::filename = <span style="color:Brown">'{}_MinuteBars.csv'</span>.<span style="color:Blue">format</span>(symbol)
    data['timestamp'] = pd.to_datetime(data['timestamp'], unit='ms')


    data.set_index('timestamp', inplace=True)
    data.to_csv(filename)
    print('finished!')


::klines = bclient.get_historical_klines(symbol, Client.KLINE_INTERVAL_1MINUTE, start_date.strftime(<span style="color:Brown">"%d %b %Y %H:%M:%S"</span>), today.strftime(<span style="color:Brown">"%d %b %Y %H:%M:%S"</span>), <span style="color:Green">1000</span>)
::data = pd.DataFrame(klines, columns = [<span style="color:Brown">'timestamp'</span>, '<span style="color:Brown">open</span>', '<span style="color:Brown">high</span>', '<span style="color:Brown">low</span>', '<span style="color:Brown">close</span>', '<span style="color:Brown">volume</span>', '<span style="color:Brown">close_time</span>', '<span style="color:Brown">quote_av</span>', '<span style="color:Brown">trades</span>', '<span style="color:Brown">tb_base_av</span>', '<span style="color:Brown">tb_quote_av</span>', '<span style="color:Brown">ignore</span>' ])
::data[<span style="color:Brown">'timestamp'</span>] = pd.to_datetime(data[<span style="color:Brown">'timestamp'</span>], unit=<span style="color:Brown">'ms'</span>)


 
if __name__ == '__main__':
::data.set_index(<span style="color:Orange">'timestamp'</span>, inplace=<span style="color:Blue">True,</span>)
    # Obviously replace BTCUSDT with whichever symbol you want from binance
 
    # Wherever you've saved this code is the same directory you will find the resulting CSV file
 
    binanceBarExtractor('BTCUSDT')
::data.to_csv(filename)
</syntaxhighlight>
::<span style="color:Purple">print</span>(<span style="color:Brown">'finished!'</span>)
 
 
<span style="color:Purple">if</span> __name__ == <span style="color:Brown">'__main__'</span>:
::<span style="color:DarkSalmon"># Obviously replace BTCUSDT with whichever symbol you want from binance</span>
::<span style="color:DarkSalmon"># Wherever you've saved this code is the same directory you will find the resulting CSV file</span>
::binanceBarExtractor(<span style="color:Brown">'BTCUSDT'</span>)
|}




Line 136: Line 148:




Done! Enjoy a few million lines of 1m data per symbol on any crypto currency pair that [https://fxgears.com/out/binance binance] has listed! Enjoy!:
Done! Enjoy a few million lines of 1m data per symbol on any cryptocurrency pair that [https://geni.us/TickDataBinance binance] has listed! Enjoy!:


https://i.imgur.com/HWHxOyd.png
https://i.imgur.com/HWHxOyd.png


===BitMEX (Crypto Futures)===


'''<p style="font-size:18px">[https://fxgears.com/out/bitmex BitMEX -]</p>'''One of the top exchanges by volume (number 1 at the time of writing,) with plenty of spot, futures, and alt coin options. Their API is pretty easy to use and there are many wrappers out there for commonly used languages. I'll be using Python for this example since that's my language of choice.
BitMEX provides public access to their quote and [[Trading | trade]] (transaction) level data via a web/file server here:  
 
https://public.bitmex.com/
 
You're going to need to register an account with BitMEX in order to use my code below since it relies on the higher message rate limit that registered accounts are afforded over non-authenticated sessions. [https://fxgears.com/out/bitmex You can register an account here if you need.] (Plus new users referred by us get 10% off commissions for 6 months if you decide to trade.)
 
 
Once you've registered an account and logged in to their web interface, you'll need to generate an API key to use with the script I'll provide later. To set that up, click your username or email link in the top right corner of the website, then click the "Account & Security" button on the resulting drop-down menu:
 
https://i.imgur.com/Un2QlC8.png
 
 
From the account settings page that you'll now be looking at, select the "API Keys" link on the left hand side:
https://i.imgur.com/NdaA8Zu.png
 
 
Finally, give your key a nickname in the "Name" field, and click "Create API Key" at the bottom of this page.
 
 
Copy down your newly generated Key and Secret for use later!
 
 
Next up we need to set up Python on our system to do the legwork of downloading data. Head over to [https://python.org/ https://python.org] and grab the latest version of Python 3 for your OS. Install said package.
 
 
Once installed we will add the BitMEX API wrapper for python, pandas, and pytz, to access the BitMEX API from within Python and to save the resulting data to a CSV file.
 
 
To do this, [https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-r2-and-2012/jj717276(v%3Dws.11) open a CMD (Command Line) window as Administrator on Windows] (or the appropriate terminal in Linux / OSX) and run the following commands:
 
{| class ="wikitable"
| style="background-color: #ccf;" |
 
pip install bitmex
 
pip install pandas
 
pip install pytz
 
|}
 
Once that's done, copy the code below to a new python file and save it somewhere handy:
{| class ="wikitable"
| style="background-color: #ccf;" |
<span style="color:Orange">Python:</span>
 
<span style="color:Purple">import</span> pandas <span style="color:Purple">as</span> pd <br>
<span style="color:Purple">from</span> bitmex <span style="color:Purple">import</span> bitmex <br>
<span style="color:Purple">import</span> datetime <br>
<span style="color:Purple">import</span> pytz <br>
<span style="color:Purple">import</span> time <br>
 
 
<span style="color:DarkSalmon"># YOUR API KEYS HERE</span><br>
api_id = <span style="color:DarkSalmon">""    #Enter your own API-key here</span><br>
api_secret = <span style="color:DarkSalmon">""    #Enter your own API-secret here</span>
 
 
bclient =  bitmex(test=<span style="color:Blue">False</span>, api_key=api_id, api_secret=api_secret)
 
 
<span style="color:Purple">def</span> <span style="color:Blue">bitmexBarExtractor</span>(symbol):<br>
::start_date = datetime.datetime.strptime(<span style="color:Brown">'1 Jan 2015'</span>, <span style="color:Brown">'%d %b %Y'</span>)<br>
::<span style="color:Purple">print</span>(<span style="color:Brown">'working...'</span>)<br>
::filename = '<span style="color:Brown">{}_MinuteBars_BitMEX.csv'</span>.<span style="color:Blue">format</span>(symbol)<br>
::staging = []
 
::<span style="color:Purple">while</span> start_date.replace(tzinfo=pytz.utc) < (datetime.datetime.utcnow().replace(tzinfo=pytz.utc) - datetime.timedelta(days=1)):<br>
:::start_point = time.time()<br>
:::<span style="color:Purple">print</span>(<span style="color:Brown">'processing from: {}'</span>.<span style="color:Blue">format</span>(start_date))<br>
:::klines = bclient.Trade.Trade_getBucketed(symbol=symbol, binSize=<span style="color:Brown">'1m'</span>, count=1000, startTime=start_date, endTime=datetime.datetime.utcnow()).result()[<span style="color:Green">0</span>]
:::<span style="color:Purple">if</span> <span style="color:Blue">len</span>(klines) == 0: <span style="color:DarkSalmon"># no data, start date reference too early.</span>
::::start_date = start_date + datetime.timedelta(weeks=<span style="color:Green">4.5</span>)
:::<span style="color:Purple">else</span>:
::::start_date = klines[<span style="color:Blue">len</span>(klines)-1][<span style="color:Brown">'timestamp'</span>].replace(tzinfo=pytz.utc)
::::<span style="color:Purple">for</span> item <span style="color:Brown">in</span> klines:
:::::staging.append(item)
 
 
:::end_point = time.time()
:::diff_time = end_point - start_point
 
 
:::<span style="color:Purple">if</span> diff_time < 1: <span style="color:DarkSalmon"><nowiki>#</nowiki>if less than 1 second, sleep the difference so we don't trigger the rate limiter on BitMEX's end.</span><br>
::::<span style="color:DarkSalmon"><nowiki>#</nowiki>print('sleeping for {}ms'.format((1 - diff_time + 0.010) * 1000))</span>
::::time.sleep((<span style="color:Green">1</span> - diff_time + <span style="color:Green">0.010</span>))
 
 
::data = pd.DataFrame(staging)
 
 
::data[<span style="color:Brown">'timestamp'</span>] = pd.to_datetime(data[<span style="color:Brown">'timestamp'</span>], unit=<span style="color:Brown">'ms'</span>)
::data = data.drop_duplicates()
 
 
::data = data.dropna(axis=0, subset=[<span style="color:Brown">'open'</span>]) <span style="color:DarkSalmon"># Clean up rows with no price data because BitMEX server has iffy data sometimes.</span><br>
 
 
::data.set_index(<span style="color:Brown">'timestamp'</span>, inplace=<span style="color:Blue">True</span>)<br>
 
 
::data.to_csv(filename)
::<span style="color:Purple">print</span>(<span style="color:Brown">'finished!'</span>)<br>
::<span style="color:Purple">return</span> data<br>
 
 
if __name__ == <span style="color:DarkSalmon">'__main__'</span>:<br>
::bitmexBarExtractor(<span style="color:DarkSalmon">'XBTUSD'</span>)
|}
 
 
Where you see the api_id and api_secret variables near the beginning of the file, you will add the generated API key and API secret from your BitMEX account '''BETWEEN the quotes.'''
 
 
Finally, at the bottom on the last line, change the 'XBTUSD' to whichever crypto pair you desire to extract from BitMEX... browse around BitMEX's exchange to see all your options.


However, this is for crypto Futures unique to their [[The_Best_and_Most_Popular_Forex_Platforms_for_Demo | platform]], not the underlying crypto currencies themselves or a fungible asset that's tradable on other exchanges. Be aware of this when designing any [[Technical_Trading_Strategies | strategy]] that's "tick" or "price" sensitive.


Run the code... the script will write a CSV file to the same directory you saved the code in, and these files will be ~250mb in size depending on how far back their history goes. As for timing, '''for XBTUSD it takes my connection about 45 minutes to grab minute bar data going all the way back to mid-2015''' (you've been warned :p).. so don't hold your breath. This is because BitMEX rate limits their historical data requests.


=='''Related Wikis'''==


Done! Enjoy a few million lines of 1m data per symbol on any crypto currency pair that [https://fxgears.com/out/bitmex BitMEX] has listed! Enjoy!
Readers of '''Editing Tick Data for Backtesting and [[Algorithmic_Trading | Algo Trading]]''' also viewed:


https://i.imgur.com/BSmb5Yn.png
* [[Algorithmic Trading]]
* [[Best_and_Worst_Times_to_Trade | Best and Worst Times to Trade]]
* [[Technical Trading Strategies]]
* [[Sentiment_Analysis#Volatility | Volatility]]
* [[Developing_your_Trading_Process | Developing your Trading Process]]

Latest revision as of 10:27, 6 October 2023

I noticed that many guides or lists of tick data sources were out of date, broken, or otherwise not working, so I decided to make an updated list with modern sources for use as a resource with the algo trading community. I will also keep this up to date as new sources appear and old sources break.

Below are the guides for downloading tick or bar level data for various markets which can be used for backtesting and analysis.

Note: Tick and bar data typically come as CSV files. You might need to adjust or re-arrange these files to fit your analytics and/or backtesting program.
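As a rough illustration of that kind of re-arranging, here is a minimal pandas sketch. The sample data, the `rearrange_bars` helper, and the target column names are all assumptions made up for this example; your backtesting program will dictate the layout it actually wants.

```python
import io
import pandas as pd

# Hypothetical sample in a Yahoo-style daily bar layout; real files may differ.
RAW_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-02,10.0,11.0,9.5,10.5,10.4,1000
2020-01-03,10.5,12.0,10.0,11.5,11.4,1500
"""

def rearrange_bars(csv_source, column_map, column_order):
    """Load a bar CSV, rename columns via column_map, and keep only column_order."""
    df = pd.read_csv(csv_source)
    df = df.rename(columns=column_map)
    return df[column_order]

bars = rearrange_bars(
    io.StringIO(RAW_CSV),  # pass a file path here instead for a real download
    column_map={"Date": "timestamp", "Open": "open", "High": "high",
                "Low": "low", "Close": "close", "Volume": "volume"},
    column_order=["timestamp", "open", "high", "low", "close", "volume"],
)
print(bars.to_csv(index=False))
```

Keeping the rename map and the column order as arguments means the same helper can be pointed at files from different sources without editing its body.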


Stock Daily Bar Data

Yahoo Finance

US and International stock exchanges - On a stock-by-stock basis, you can download decades of historical daily bar data by doing the following steps:

  • Go to https://finance.yahoo.com
  • In the search field, type in your desired symbol or company name and select the appropriate result
  • Click the 'Historical Data' tab
  • Set the date range to 'MAX'
  • Hit the 'Apply' button
  • Click the 'Download Data' link just below the 'Apply' button
  • Enjoy your daily bar data

Google Finance

Rest in peace.. :( Google nuked their finance site back to the stone age many years ago so this service is pretty useless now (Honestly Google, what were you thinking? Google Finance was awesome back in 2012 and prior.) Google also used to have an API but that too got turned off. I included an entry for Google here because many people will be searching around the net and find references to using Google Finance's API but be unsure why it's not working as expected.

yfinance Python Library

yfinance offers a reliable, threaded, and Pythonic way to download historical market data from Yahoo! Finance.

Jack from FXGears.com has provided a basic daily bar download script for yfinance in Python, but check out the yfinance docs for more advanced features:

Make sure you have yfinance installed. From your command line:

pip install yfinance

Then save and run the following Python script:

import yfinance as yf
import datetime

# define start date, go back further if you need but not all years are available for all symbols.
start_date = '2020-01-01'

end_date = datetime.date.today() - datetime.timedelta(days=1)
end_date = end_date.strftime(format="%Y-%m-%d")

symbols = ['SPY', 'IBM', 'BAC']

for item in symbols:
    working_df = yf.download(item, start=start_date, end=end_date)
    working_df.reset_index(inplace=True)
    working_df.to_csv('{}_{}.csv'.format(item, end_date), index=False)

Note: You will have to know the Yahoo! Finance symbology to download your target symbols. (It's sometimes not intuitive, "CAD=X" is USD/CAD for example.)

Forex [Tick Data]

Quality of data matters and the below options are some of the better quality sources available for free.

Darwinex

Darwinex is easy but you'll need to have a live account. Anyone can sign up for an account (live) and head over to this page here once logged in: Historical Tick Data Download. From there, just click the "Request FTP Access" button to have your FTP login details generated for you. Darwinex provides tick-level data for all major and minor currency pairs, as well as index and commodity CFDs, with most of them going back to mid-to-late 2017. You don't need to deposit to have a live account at Darwinex; you just need to have a verified account with them, so there's little downside to enabling access to this high-quality source of tick data. (As a side benefit, you can also get access to historical price data of their Darwin Exchange, so you can model investing in other traders and model their strategy returns, which is pretty cool.)

Pepperstone / Integral / TrueFX

https://www.truefx.com/truefx-historical-downloads/

Pepperstone relies on price feeds and liquidity very similar to Integral's offering, so similar that for a while Pepperstone published Integral's data on their own FTP for easy customer access. Unfortunately, they stopped offering this FTP access some time ago (which is why most online tutorials referencing them are out of date), and the source of data that Pepperstone currently recommends is Integral's TrueFX data service. A free account at TrueFX.com will get you access to tick data on most major and minor currency pairs from the past year only (currently only 2019 YTD is available for free). I encouraged Pepperstone to bring back their historical data access to regain access to years prior to 2019, and if they ever open back up then I'll update this spot here with access info.
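Tick files from sources like these are often aggregated into bars before backtesting. The sketch below shows one way to do that with pandas. It assumes a hypothetical CSV layout of timestamp, bid, and ask columns and builds bars from the mid price; check the actual column names and timestamp format of whichever provider you use.

```python
import io
import pandas as pd

# Hypothetical tick layout (timestamp, bid, ask); real provider files will differ.
TICKS_CSV = """timestamp,bid,ask
2019-06-03 00:00:01.100,1.1170,1.1172
2019-06-03 00:00:30.500,1.1174,1.1176
2019-06-03 00:00:59.900,1.1168,1.1170
2019-06-03 00:01:10.000,1.1171,1.1173
"""

def ticks_to_bars(ticks, rule="1min"):
    """Aggregate bid/ask ticks into OHLC bars of the mid price."""
    mid = (ticks["bid"] + ticks["ask"]) / 2.0
    bars = mid.resample(rule).ohlc()
    bars["ticks"] = mid.resample(rule).count()
    # Drop intervals in which no tick arrived
    return bars[bars["ticks"] > 0]

ticks = pd.read_csv(io.StringIO(TICKS_CSV), parse_dates=["timestamp"],
                    index_col="timestamp")
bars = ticks_to_bars(ticks)
print(bars)
```

Changing `rule` to something like "1h" or "1D" gives hourly or daily bars from the same ticks. Whether to build bars from the bid, the ask, or the mid is a modelling choice that depends on your strategy.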

Crypto Minute Bar Data

Binance

One of the top exchanges by volume with easily one of the largest collections of crypto pairs to trade. Binance is very API friendly and is structured in a way that even trading smaller amounts can be done efficiently. Recently, Binance opened up trading on leveraged products (spot and futures) and started offering options.


You're going to need an account, which is free to get started, so if you don't already have an account you can register here.


From your account page, you'll then need to generate API keys to access the API and pull down data.


On the following page, find the Create field and enter any nickname you want.


Hit the create button. You'll likely have to authenticate or verify via email that you want to generate an API key, but once done you will see the generated API key itself, and the API secret code as well. Save these somewhere secure and safe.
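One common way to keep the key and secret out of your script files is to read them from environment variables. The sketch below is only an illustration of that idea: the `load_api_credentials` helper and the variable names BINANCE_API_KEY and BINANCE_API_SECRET are hypothetical conventions, not anything Binance or python-binance requires.

```python
import os

def load_api_credentials(prefix="BINANCE"):
    """Read the API key and secret from environment variables.

    The names <PREFIX>_API_KEY and <PREFIX>_API_SECRET are a hypothetical
    convention chosen for this example.
    """
    key = os.environ.get(f"{prefix}_API_KEY", "")
    secret = os.environ.get(f"{prefix}_API_SECRET", "")
    if not key or not secret:
        raise RuntimeError(f"Set {prefix}_API_KEY and {prefix}_API_SECRET first")
    return key, secret

if __name__ == "__main__":
    # Demo values only; in practice, set these in your shell, not in code.
    os.environ.setdefault("BINANCE_API_KEY", "demo-key")
    os.environ.setdefault("BINANCE_API_SECRET", "demo-secret")
    api_key, api_secret = load_api_credentials()
    print("Loaded key ending in", api_key[-4:])
```

The returned pair can then be assigned to the api_key and api_secret variables used by a download script, so the script itself can be shared without leaking credentials.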

Next up we need to set up Python on our system to do the leg work of downloading data. Head over to https://python.org and grab the latest version of Python 3 for your OS. Install said package.

Once installed we will add the python-binance and pandas packages to access the Binance API from within Python and to save the resulting data to a CSV file.

To do this, open a CMD (Command Line) window as Administrator on Windows (or the appropriate terminal in Linux / OSX) and run the following commands:

Install python-binance python library and pandas from your command line:

pip install python-binance
pip install pandas


Once that's done, copy the code below to a new python file and save it somewhere handy:

import pandas as pd
from binance.client import Client
import datetime

# YOUR API KEYS HERE
api_key = ""    #Enter your own API-key here
api_secret = "" #Enter your own API-secret here

bclient = Client(api_key=api_key, api_secret=api_secret)

start_date = datetime.datetime.strptime('1 Jan 2016', '%d %b %Y')
today = datetime.datetime.today()

def binanceBarExtractor(symbol):
    print('working...')
    filename = '{}_MinuteBars.csv'.format(symbol)

    klines = bclient.get_historical_klines(symbol, Client.KLINE_INTERVAL_1MINUTE, start_date.strftime("%d %b %Y %H:%M:%S"), today.strftime("%d %b %Y %H:%M:%S"), 1000)
    data = pd.DataFrame(klines, columns = ['timestamp', 'open', 'high', 'low', 'close', 'volume', 'close_time', 'quote_av', 'trades', 'tb_base_av', 'tb_quote_av', 'ignore' ])
    data['timestamp'] = pd.to_datetime(data['timestamp'], unit='ms')

    data.set_index('timestamp', inplace=True)
    data.to_csv(filename)
    print('finished!')


if __name__ == '__main__':
    # Obviously replace BTCUSDT with whichever symbol you want from binance
    # Wherever you've saved this code is the same directory you will find the resulting CSV file
    binanceBarExtractor('BTCUSDT')


Where you see the api_key and api_secret variables near the beginning of the file, you will add the generated API key and API secret from your Binance account BETWEEN the quotes.


Finally, at the bottom on the last line, change the 'BTCUSDT' to whichever crypto pair you desire to extract from Binance... browse around Binance's exchange to see all your options.


Run the code... the script will write a CSV file to the same directory you saved the code in, and these files will be ~100-200 MB in size depending on how far back their history goes. As for timing, for BTCUSDT it takes my connection about 10-12 minutes to grab minute bar data going all the way back to ~2017, so don't hold your breath. This is because Binance rate limits their historical data requests.
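Before feeding a file like this into a backtest, it can be worth checking it for missing minutes, since long downloads of exchange data may contain gaps. The following is a minimal sketch with pandas. The `find_gaps` helper and the sample rows are made up for illustration, and it assumes the CSV has a timestamp column like the one the script writes.

```python
import io
import pandas as pd

# Tiny stand-in for a *_MinuteBars.csv file; the 00:02 and 00:03 bars are missing.
SAMPLE = """timestamp,open,high,low,close,volume
2017-08-17 00:00:00,4261.5,4262.0,4261.0,4261.8,1.7
2017-08-17 00:01:00,4261.8,4263.0,4261.5,4262.5,0.9
2017-08-17 00:04:00,4262.5,4264.0,4262.0,4263.1,2.3
"""

def find_gaps(bars, expected="1min"):
    """Return one row per gap: last bar before, first bar after, missing bar count."""
    step = pd.Timedelta(expected)
    times = bars.index.to_series()
    prev = times.shift(1)
    delta = times - prev
    mask = delta > step  # the first row has no predecessor and compares False
    return pd.DataFrame({
        "last_bar_before": prev[mask],
        "first_bar_after": times[mask],
        "missing_bars": (delta[mask] / step).astype(int) - 1,
    }).reset_index(drop=True)

# For a real file, pass its path instead of the StringIO object.
bars = pd.read_csv(io.StringIO(SAMPLE), parse_dates=["timestamp"],
                   index_col="timestamp")
gaps = find_gaps(bars)
print(gaps)
```

How to handle any gaps found (forward-fill, drop the affected period, or re-download) depends on the strategy being tested.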


Done! Enjoy a few million lines of 1m data per symbol on any cryptocurrency pair that Binance has listed!

BitMEX (Crypto Futures)

BitMEX provides public access to their quote and trade (transaction) level data via a web/file server here: https://public.bitmex.com/

However, this is for crypto futures unique to their platform, not the underlying cryptocurrencies themselves or a fungible asset that's tradable on other exchanges. Be aware of this when designing any strategy that's "tick" or "price" sensitive.

