Can you recommend one quick way to improve my model's performance?
Expecting your model to behave the same way for SPY (a $280 stock that trades 86m shares per day on average) as it does for SPXT (a $50 stock with an average daily volume of 600 shares) is unrealistic.
So the simplest thing to do, once you have your basic model set up, is to split it into 2 or 4 different models.
One for NYSE, one for NASDAQ, and then a further two-way split of each by Price, Volume or Price*Volume (DollarVol).
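As a rough illustration of that four-way split, the sketch below assigns a symbol to a group from its exchange and DollarVol. The function name `model_group`, the group labels and the $1m default threshold are assumptions made up for this example, not anything provided by the platform, and symbols on other exchanges are simply left unassigned.

```python
def model_group(exchange, prev_close, avol, dollar_vol_split=1000000.0):
    """Return a label such as 'NYSE_HIGH', or None for other exchanges."""
    if exchange not in ("NYSE", "NASDAQ"):
        # ARCA, BATS, AMEX etc. would need their own rule; not decided here.
        return None
    dollar_vol = prev_close * avol  # DollarVol = Price * Volume
    # Hypothetical threshold: at or above the split counts as HIGH.
    tier = "HIGH" if dollar_vol >= dollar_vol_split else "LOW"
    return exchange + "_" + tier
```

Each label would then map to its own copy of the model with its own qualification criteria.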
For 6/3/2019, is_symbol_qualified returned 8792 symbols, broken down as follows...
By Exchange...
3431 NASDAQ 39.0%
3110 NYSE 35.4%
1588 ARCA 18.1%
324 BATS 3.7%
306 AMEX 3.5%
32 None 0.4% (test symbols)
1 IEX 0.0% (IBKR - I didn't know about that!)
By price we get...
$0-$5 15.3%
$5-$10 9.3%
$10-$15 10.5%
$15-$20 8.0%
$20-$25 11.0%
$25-$30 4.5%
$30+ 33.0%
The midpoint (median) that splits the universe in half by price would be $23.56.
By average volume (avol) we get...
0k - 10k = 21.6%
10k - 20k = 8.2%
20k - 40k = 9.5%
40k - 100k = 12.4%
100k - 200k = 10.6%
200k - 1m = 22.0%
1m+ = 12.8%
The midpoint (median) that splits the universe in two by volume would be around 90,000 shares avol.
DollarVol = Prev_Close * avol
$0 - $100k = 18.0%
$100k - $1m = 29.6%
$1m - $2m = 9.0%
$2m - $10m = 16.7%
$2m+ = 38.5%
$10m+ = 26.3%
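For anyone who wants to do this bucketing in code rather than in a spreadsheet, here is a minimal sketch. It assumes the non-overlapping ranges from the list above as bucket edges and treats each lower edge as inclusive; the names `EDGES`, `LABELS` and `dollar_vol_bucket` are hypothetical.

```python
from bisect import bisect_right

# Upper edges of each bucket in dollars (assumed from the ranges listed above).
EDGES = [100000, 1000000, 2000000, 10000000]
LABELS = ["$0 - $100k", "$100k - $1m", "$1m - $2m", "$2m - $10m", "$10m+"]


def dollar_vol_bucket(prev_close, avol):
    """Return the bucket label for DollarVol = prev_close * avol."""
    dollar_vol = prev_close * avol
    # bisect_right counts the edges <= dollar_vol, which is the bucket index.
    return LABELS[bisect_right(EDGES, dollar_vol)]
```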
DollarVol as Percent of Total DollarVol
SPY makes up 7.4% of the Dollar Volume traded = $275.27 * 23375500523 = $6,013,815,019,552.21 !
The next 10 symbols make up 12.8% of the total Dollar Volume traded (AMZN, QQQ, AAPL, BABA, EEM, MSFT, IWM, TSLA, FB, AMD)
To split the market into two groups using "DollarVol as a percentage of Total DollarVol" the break point is around $16,000,000,000,000,000 !! That would be just the top 127 symbols.
Just 21 symbols make up 25% of the DollarVol traded (SPY, AMZN, QQQ, AAPL, BABA, EEM, MSFT, IWM, TSLA, FB, AMD, HYG, NFLX, EFA, GOOGL, NVDA, BA, GOOG, QCOM, FXI, ROKU)
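The "how many symbols make up X% of the DollarVol" question is just a running total over the symbols sorted from largest to smallest. A small sketch, assuming you already have a dict mapping each symbol to its DollarVol (the function name `symbols_for_share` is hypothetical):

```python
def symbols_for_share(dollar_vols, target_share):
    """Return the largest symbols whose combined DollarVol first reaches
    target_share (e.g. 0.25) of the total."""
    total = sum(dollar_vols.values())
    ranked = sorted(dollar_vols.items(), key=lambda kv: kv[1], reverse=True)
    picked = []
    running = 0.0
    for symbol, dollar_vol in ranked:
        # Stop as soon as the symbols picked so far cover the target share.
        if running >= target_share * total:
            break
        picked.append(symbol)
        running += dollar_vol
    return picked
```

Calling it with 0.25 and 0.5 gives the kind of symbol lists quoted above.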
By exchange, DollarVol is broken out as follows...
NYSE 41.3%
NASDAQ 33.8%
ARCA 23.6%
OTHERS 1.3%
All I used to work out these values was the tiny script below.
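from cloudquant.interfaces import Strategy

class stats(Strategy):
    @classmethod
    def is_symbol_qualified(cls, symbol, md, service, account):
        # Dump the per-symbol stats to the console (Python 2 print statement).
        print symbol, md.stat.avol, md.stat.exchange, md.stat.prev_close
        # Qualify nothing: we only want the printout, not a running strategy.
        return False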
I took the output from the console and ran it through some simple pivot tables in Excel.
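If you would rather skip Excel, the same pivots can be sketched in a few lines of plain Python 3 run outside the platform. This assumes each console line holds exactly the four whitespace-separated fields printed by the script (symbol, avol, exchange, prev_close). The function names `parse_line` and `summarize` are hypothetical, and lines that do not parse (for example a missing avol) are simply skipped.

```python
from collections import Counter
from statistics import median


def parse_line(line):
    """Return (symbol, avol, exchange, prev_close), or None if unparseable."""
    parts = line.split()
    if len(parts) != 4:
        return None
    symbol, avol, exchange, prev_close = parts
    try:
        return symbol, float(avol), exchange, float(prev_close)
    except ValueError:
        return None


def summarize(lines):
    """Symbol count, percentage by exchange, and median price and avol."""
    rows = [row for row in map(parse_line, lines) if row is not None]
    if not rows:
        return None
    by_exchange = Counter(row[2] for row in rows)
    return {
        "count": len(rows),
        "exchange_pct": {ex: 100.0 * n / len(rows) for ex, n in by_exchange.items()},
        "median_price": median(row[3] for row in rows),
        "median_avol": median(row[1] for row in rows),
    }
```

The two medians correspond to the price and volume midpoints quoted earlier.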
Obviously, you will already have some qualification criteria for your model, but there is no harm in breaking the model into smaller groups.
You will find that your qualification criteria will change quite dramatically from group to group.