Revolutionizing Trading: Master Quant Strategies with Python
In prior chapters, we explored acquiring high-quality data and calculating Bayesian probabilities to identify the most lucrative bets and their associated risks. We also practiced backtesting through Monte Carlo simulations to refine our trading algorithms. However, there are numerous other techniques to enhance our algorithms, and Deep Reinforcement Learning stands out as a promising avenue for advancing algorithmic trading by merging deep learning's capabilities with reinforcement learning's efficiency.
It’s essential to recognize that trading resembles a game. No domain is more suited for reinforcement learning than gaming, so why not allow our algorithms to engage in this "game," learn from it, and subsequently guide us on how to win with a set of successful strategies?
In this chapter, we will delve into how to achieve this.
Introducing Gymnasium
OpenAI, a leader in artificial intelligence, created Gym—a library specifically designed for reinforcement learning. This initiative marked a significant advance due to its user-friendly nature, later evolving into Gymnasium, which has benefited from a vibrant developer community that has made it even more robust. Currently, it’s evident that integrating decision optimization into AI is crucial for achieving Artificial General Intelligence (AGI). Given its intuitive design and the credibility of its creator, Gymnasium is arguably the premier library for reinforcement learning today.
Introducing the VIX Index
In earlier chapters, we developed a successful day trading approach. Now, I wish to explore other strategies that are better suited for longer durations but are equally rewarding.
> For long-term investors, it’s clear that stocks tend to appreciate over time. The secret to succeeding with long-term strategies is identifying when the market is poised for a downturn. By recognizing these critical moments and adapting accordingly, we can profit significantly in both bullish and bearish markets.
There are various popular metrics traders analyze, including fundamental indicators like the price-to-earnings ratio and technical indicators such as moving averages. However, for long-term investments, you need a tool that offers foresight, and there is no better tool than the VIX index.
The VIX, or volatility index, serves as the most effective gauge of market anxiety. While I won’t delve into its complex mathematical details, it’s crucial to grasp this key concept: The VIX is derived from option prices; higher option prices reflect greater anticipated volatility as investors are willing to pay more to shield against potential price movements. Thus, when the VIX rises, it signals an increase in investor panic, making a market crash more probable.
Researchers from Tilburg University in the Netherlands have highlighted significant correlations between the VIX index and long-term stock market returns in their studies.
To substantiate this with actual data, let’s gather some information:
```python
!pip install yfinance

import yfinance as yf
import pandas as pd

# Fetch the S&P 500 monthly history from Yahoo Finance
sp500 = yf.Ticker("^GSPC")
sp500_history = sp500.history(period="max", interval="1mo")

# Keep the last 400 monthly closes, trimmed to align with the VIX series
df = pd.DataFrame(sp500_history.tail(400)['Close'])
df = df[2:-8]

# Monthly VIX closes (e.g., the VIXCLS series downloadable from FRED)
data = pd.read_csv('VIXCLS.csv')
data = data[17:]
df['vix'] = [i for i in data['VIXCLS']]

# Month-over-month S&P 500 returns
df['SP500_Returns'] = df['Close'].pct_change()
df = df.dropna()
df
```
The above code generates a dataframe containing monthly VIX data alongside S&P 500 returns.
Now, we can visualize these together to observe the inverse relationship:
```python
import matplotlib.pyplot as plt

# VIX level vs. S&P 500 monthly return
plt.scatter(df['vix'], df['SP500_Returns'])
```
As illustrated, when the VIX exceeds approximately 25, negative returns become more prevalent than positive ones, indicating that higher VIX values increase the likelihood of stock market declines. This aligns with our expectations, as the VIX fundamentally measures market fear.
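To put a rough number on that, here is a quick sanity check (a minimal sketch, assuming the `df` built above) comparing the share of negative months on either side of a VIX threshold of 25:

```python
# Share of negative monthly returns above vs. below a VIX threshold of 25
high_vix = df[df['vix'] > 25]['SP500_Returns']
low_vix = df[df['vix'] <= 25]['SP500_Returns']

print(f"VIX > 25:  {(high_vix < 0).mean():.0%} of months negative")
print(f"VIX <= 25: {(low_vix < 0).mean():.0%} of months negative")
```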
Next, let’s examine the VIX against the S&P 500 from 2000 to the present:
A minimal plotting sketch follows, assuming the monthly `df` built above (the styling choices are illustrative):
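```python
import matplotlib.pyplot as plt

# Plot the S&P 500 and the VIX on separate axes, since their scales differ
recent = df.loc['2000':]

fig, ax1 = plt.subplots()
ax1.plot(recent.index, recent['Close'], color='tab:blue')
ax1.set_ylabel('S&P 500', color='tab:blue')

ax2 = ax1.twinx()
ax2.plot(recent.index, recent['vix'], color='tab:red')
ax2.set_ylabel('VIX', color='tab:red')

plt.title('S&P 500 vs. VIX, 2000 to present')
plt.show()
```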
The graphs reveal that when the VIX spikes, signaling investor anxiety, the S&P 500 tends to decline, resulting in a bear market.
Now, consider this long-term strategy:
Imagine utilizing an ETF that mirrors the S&P 500 while borrowing 100% leverage from your broker (brokers such as Degiro or IBKR facilitate this). What would our long-term profits look like?
With this strategy, our returns would be roughly double those of the S&P 500 (before borrowing costs), simply by leveraging our position.
Moreover, if we used part of that leverage to establish short positions as a hedge during high VIX scenarios, our profits would rise even further, as these short positions would yield returns during market declines, safeguarding our portfolio.
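A quick back-of-the-envelope example makes the mechanics concrete (the numbers here are illustrative only, not results from the strategy):

```python
# Illustrative arithmetic for 200% gross exposure split between long and short
capital = 100.0

# Bull month: market +2%, fully long at 200%
long_w, short_w, market = 2.0, 0.0, 0.02
pnl = capital * (long_w * market - short_w * market)
print(capital + pnl)  # 104.0 -> double the unleveraged +2%

# Bear month: market -5%, hedged 50% long / 150% short
long_w, short_w, market = 0.5, 1.5, -0.05
pnl = capital * (long_w * market - short_w * market)
print(capital + pnl)  # 105.0 -> the short leg more than offsets the long loss
```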
This is precisely the strategy we will implement in this chapter.
Creating a Reinforcement Learning Algorithm
Next, we need to define a reinforcement learning model that incorporates our strategy. We will treat this situation like a game, setting the rules of engagement and allowing Gym to handle the intricate calculations to guide us toward success.
We will assess market conditions to determine whether they are bullish, bearish, or ambiguous. Additionally, we can utilize tools like leverage and long/short positions to balance our portfolio and mitigate risks.
> Based on this framework, I propose a long-short strategy that leverages our position, enabling us to increase exposure during low VIX levels while utilizing short positions to shield ourselves when the VIX rises.
The central question is: How much of your portfolio should be allocated to hedging positions at any given moment?
This leads us to an optimization challenge, and we will employ reinforcement learning to repeatedly simulate the game, determining the optimal risk allocation for maximum returns.
Here is how we can set this up:
```python
!pip install stable-baselines3 gymnasium shimmy

import gym  # legacy Gym API; shimmy lets Stable-Baselines3 wrap such envs
from gym import spaces
import pandas as pd
import numpy as np

# Define the possible long allocations from 0 to 2 (where 2 represents 200% allocation)
long_values = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1,
               1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2]
print(len(long_values))  # 21 discrete actions
```
```python
def returns(long, short, sp, c):
    """Calculate the new capital for the current step, given long/short
    weights, the S&P 500 return `sp`, and the current capital `c`."""
    if sp >= 0:
        l = sp * long * c
        s = -sp * short * c
        return c + l + s
    else:
        l = sp * long * c
        if abs(l) > c:
            l = 0  # if the long-side loss would exceed the capital, zero it out
        s = -sp * short * c
        return c + s + l


class TradingEnvironment(gym.Env):
    def __init__(self, df):
        super(TradingEnvironment, self).__init__()
        self.df = df
        # Observation: the current VIX level
        self.observation_space = spaces.Box(low=self.df['vix'].min(),
                                            high=self.df['vix'].max(),
                                            shape=(1,), dtype=np.float32)
        # Action: an index into long_values (the long allocation)
        self.action_space = spaces.Discrete(len(long_values))
        self.c = 339  # initial capital, set to the S&P 500 level at the start for simplicity
        self.current_step = 0
        self.max_steps = len(df) - 1

    def reset(self):
        self.current_step = 0
        self.c = 339
        return np.array([self.df['vix'].iloc[self.current_step]], dtype=np.float32)

    def step(self, action):
        """Calculate the reward at each step based on the action taken."""
        if self.current_step >= self.max_steps:
            done = True
            print('last_step')
            return np.array([self.df['vix'].iloc[self.current_step]], dtype=np.float32), 0, done, {}
        done = False
        l = long_values[action]
        s = 2 - l  # total gross exposure is fixed at 200%
        # The reward is the change in capital over the month
        new_c = returns(l, s, self.df.iloc[self.current_step]['SP500_Returns'], self.c)
        reward = new_c - self.c
        print(reward, ' ----- > ', self.c, ' ------ > ', self.current_step)
        self.c = new_c
        self.current_step += 1
        obs = np.array([self.df['vix'].iloc[self.current_step]], dtype=np.float32)
        return obs, reward, done, {}
```
Now that we’ve established this trading game, we can start running simulations:
```python
from stable_baselines3 import PPO

# Create the trading environment
env = TradingEnvironment(df)

# Initialize the PPO agent with a simple feed-forward policy
model = PPO("MlpPolicy", env, verbose=1)

# Train the agent (adjust total_timesteps as needed)
model.learn(total_timesteps=1000000)
```
This training process may take a while, but it will eventually yield a trained model. The output will display the rewards (profits for the current step), the capital (current funds), and the step (current month).
Once trained, you can save and utilize the model as follows:
```python
model.save("trained_trading_model")

obs = env.reset()  # reset the environment to its initial state
done = False
cumulative_reward = 0
episode_length = 0
capital = []
actions = []

while not done:
    action, _states = model.predict(obs)
    obs, reward, done, _ = env.step(int(action))
    cumulative_reward += reward
    capital.append(env.c)   # track capital over time
    actions.append(action)  # track the chosen allocation index
    episode_length += 1

final_portfolio_value = env.c
```
Essentially, our algorithm simulated investing a starting capital c, replaying the 1991 to 2023 period over 1,000,000 training timesteps and varying the long and short positions throughout. Each episode recorded the actions taken and the resulting profits, termed rewards. The final trained model retains the actions that yielded the highest rewards; here, actions refer to the long and short allocations.
Understanding Results
At this stage, you can generate a column in the dataframe to visualize the optimal long and short positions identified by our reinforcement learning algorithm based on each VIX value. Additionally, I’ve included a “Value” column that corresponds to the price-to-earnings ratio at that time, illustrating that PE ratios and VIX can diverge significantly at times. Many traders base their decisions on PE ratios, yet the correlation between “PE ratio and next month’s returns” is often much weaker than “VIX and next month’s returns.”
> Frequently, the PE ratio may be elevated, suggesting stocks are overpriced, while the VIX remains low, indicating a lack of market panic. Consequently, investors who sell based on high PE ratios might miss substantial profits during bullish periods—a scenario that can be avoided by paying attention to the VIX.
```python
actions = np.array(actions).flatten()
df['actions'] = actions
df
```
We can also visualize the actions against the VIX to determine optimal strategy values:
A minimal sketch, assuming the `df` with the `actions` column built above:
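```python
import matplotlib.pyplot as plt

# Long allocation chosen by the agent at each VIX level
plt.scatter(df['vix'], [long_values[int(a)] for a in df['actions']])
plt.xlabel('VIX')
plt.ylabel('Long allocation (0 = fully short, 2 = 200% long)')
plt.show()
```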
The observations indicate that the RL algorithm suggests long positions nearing 200% (full allocation to long positions) when the VIX is low. As the VIX surpasses 20, it starts favoring short positions with a similar allocation. Intermediate values are also present.
If the model continues to train over a more extended period, we anticipate a sigmoidal function that shifts rapidly from long to short positions around a VIX value between 20 and 30.
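For intuition, the limiting policy we expect resembles a reversed logistic curve. Here is a hypothetical sketch, where the midpoint of 25 and the steepness are illustrative assumptions, not fitted values:

```python
# Hypothetical limiting policy: long allocation as a reversed logistic in the VIX
vix_grid = np.linspace(10, 50, 200)
midpoint, steepness = 25, 0.5  # illustrative assumptions, not fitted values
long_alloc = 2 / (1 + np.exp(steepness * (vix_grid - midpoint)))

plt.plot(vix_grid, long_alloc)
plt.xlabel('VIX')
plt.ylabel('Long allocation')
plt.show()
```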
While the VIX index proves highly valuable, it can be further enhanced by integrating Bayesian probabilities from earlier chapters to provide additional input to the RL model, potentially amplifying your profits.
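One way to wire that in is to widen the observation space. Below is a sketch only, assuming your dataframe carries a hypothetical `bayes_prob` column holding the Bayesian probability from the earlier chapters:

```python
# Sketch: extend the observation to (VIX, Bayesian probability).
# Assumes a hypothetical 'bayes_prob' column (values in [0, 1]) exists in df.
class TradingEnvironmentBayes(TradingEnvironment):
    def __init__(self, df):
        super().__init__(df)
        self.observation_space = spaces.Box(
            low=np.array([df['vix'].min(), 0.0], dtype=np.float32),
            high=np.array([df['vix'].max(), 1.0], dtype=np.float32),
            dtype=np.float32)

    def _obs(self):
        row = self.df.iloc[self.current_step]
        return np.array([row['vix'], row['bayes_prob']], dtype=np.float32)

    def reset(self):
        self.current_step = 0
        self.c = 339
        return self._obs()

    # step() would be overridden analogously, building its observations
    # via self._obs() instead of the VIX-only array.
```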
I implemented this approach and developed a model named Bayesian Risk Balancer, which achieved returns exceeding the S&P 500 by 550% from 1991 to 2024.
The following plot illustrates the returns of the pure RL model:
A minimal sketch of such a plot, using the `capital` series collected in the loop above against a buy-and-hold S&P 500 baseline on the same starting capital:
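```python
# RL strategy capital vs. buy-and-hold S&P 500, same starting capital
buy_and_hold = 339 * (1 + df['SP500_Returns']).cumprod()

plt.plot(df.index[:len(capital)], capital, label='RL strategy')
plt.plot(df.index, buy_and_hold, label='S&P 500 buy-and-hold')
plt.legend()
plt.ylabel('Portfolio value')
plt.show()
```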
And here’s a plot showing the results of RL in conjunction with Bayesian inference, delivering a more balanced outcome and mitigating extreme fluctuations:
(The plotting code mirrors the sketch above, with the combined model's capital series swapped in.)
If you're interested in accessing the outputs of this model and exploring additional exclusive content, I invite you to follow my Patreon account below.
Quantitative Trading Strategies
Thank you for reading through to the end. Before you leave:
- Visit my Patreon Site: https://patreon.com/ModernAiTrading for more in-depth content on quantitative trading, access to fully implemented algorithms, and opportunities to interact with me during Q&A sessions.
- Check out my YouTube Channel: https://youtube.com/@ModernaiTrading