Backtesting a Crossover Moving Average Strategy Algorithm in the Forex Market.

Introduction

This post aims to explore the effectiveness of a straightforward Forex market investment algorithm. Amidst numerous algorithmic possibilities, I chose to embrace simplicity as a stepping stone before delving into more complex strategies. Inspiration was drawn from Gurrib’s 2016 paper, published on the Global Review of Accounting and Finance, available at SSRN. Gurrib’s study, which benchmarked a crossover simple moving average strategy on daily S&P500 candles between 1993 and 2014, reported an impressive 24% return over 1593 investment days. Let’s embark on this journey to assess the potential of a similar approach in the Forex market.

SMA Crossover Strategy

The SMA crossover strategy works by assuming that the series contains short and long-run trends. The short-run trend follows the series closely and reacts more rapidly to variations in the series in comparison to the long-run trend. The core idea of the strategy is that we can find market signals of buying or selling by monitoring the intersections between the short and long-run trends in the series. If the short-run trend intersects the long-run trend and moves upward the value of the series is increasing (also called a golden cross), and therefore the algorithm sends a buy signal. Conversely, if the short-run trend intersects the long-run trend and moves downward the price decreases (known as a dead cross), and it is time to sell.

To understand better the behavior of the algorithm I have created a visualization that monitors the interaction between the USD.SEK series that I downloaded from Interactive Brokers in candles of 30 seconds (blue) and the short and long-run trends (red and green respectively) that I have estimated using SMA.

If you are interested, you can recreate the animation with the code bellow in Python:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import os
from matplotlib.animation import FuncAnimation
from IPython.display import display, clear_output

# change working directory

# Specify the target directory
new_directory = r'C:\Users\mglez\Documents\PHD\Semester 16\01092023_SMA_blog\material\dynamc_graph'

# Change the working directory
os.chdir(new_directory)

# Read the CSV file and extract the first column of the USD.SEK as y1
df = pd.read_csv('USDSEK_dur_5D_candle_30sec_2023-08-20_2023-08-24_CLOSE.csv')
y1 = df.iloc[301:, 1].values
# Loading the simple moving averages
y2 = df.iloc[301:, 2].values
y3 = df.iloc[301:, 3].values

# Create x data
n = len(y1)
x = np.arange(n)

# Initialize the plot
fig, ax = plt.subplots()
line1, = ax.plot(x, y1, label='USD.SEK', color='blue')
line2, = ax.plot(x, y2, label='SMA 5', color='red')
line3, = ax.plot(x, y3, label='SMA 300', color='green')
ax.set_title('SMA Crossover Strategy: USD.SEK')
ax.legend()

# Set the y-axis limits to display values between 10.5 and 11.5
ax.set_ylim(10.9, 11)

# Initialize text annotation
text = ax.text(0.1, 0.90, '', transform=ax.transAxes, fontsize=12, color='black')

# Number of values to display on the horizontal axis
num_values_to_display = 100

# Function to update the plot for each frame
def update(frame):
    x_data = x[max(0, frame - num_values_to_display):frame]
    line1.set_data(x_data, y1[max(0, frame - num_values_to_display):frame])
    line2.set_data(x_data, y2[max(0, frame - num_values_to_display):frame])
    line3.set_data(x_data, y3[max(0, frame - num_values_to_display):frame])
    
    # Calculate the x-axis limits dynamically based on the frame and num_values_to_display
    min_x = max(0, frame - num_values_to_display)
    max_x = frame
    ax.set_xlim(min_x, max_x)

    # Check if there are enough data points to calculate min and max
    if len(x_data) >= num_values_to_display:
        min_y = min(min(y1[max(0, frame - num_values_to_display):frame]), min(y2[max(0, frame - num_values_to_display):frame]), min(y3[max(0, frame - num_values_to_display):frame]))
        max_y = max(max(y1[max(0, frame - num_values_to_display):frame]), max(y2[max(0, frame - num_values_to_display):frame]), max(y3[max(0, frame - num_values_to_display):frame]))
        ax.set_ylim(min_y - 0.009, max_y + 0.009)
    
    if y2[frame] > y3[frame]:
        text.set_text('Buy')
        text.set_color('red')
    else:
        text.set_text('Sell')
        text.set_color('green')
    return line1, line2, line3, text

# Create the animation
ani = FuncAnimation(fig, update, frames=n, interval=300)  # Update every 3 seconds

# Display the animation in Jupyter Notebook
display(fig)

try:
    # Continuously update the animation
    for i in range(n):
        update(i)
        clear_output(wait=True)
        display(fig)
except KeyboardInterrupt:
    pass

Requesting Data from Interactive Brokers

To request data from Interactive Brokers, you need a trading account and to connect the API of the Trader Workstation to Python or R. If you are using R, you need to install the IBrokers package before you attempt to download the data. To teach how to download data from the API, can be a tutorial in itself. But the key elements that you need are a connection to the API, a contract with a correct symbol for the stock, a duration and a candle size. In this example, I am creating a connection that I call tws, then I am creating a contract with the twsContract() with the correct symbol USD.SEK and finally I am setting a duration of 5 D (5 days) with candles of 30 sec (seconds).

library('IBrokers')


#### ACCOUNT ####
tws = twsConnect(port=7496)
twsConnectionTime(tws)
ac <- reqAccountUpdates(tws)

#### CONTRACT ####
contract <- twsContract()
contract$symbol <- "USD"
contract$sectype <- "CASH"
contract$currency <- "SEK"
contract$exch <- "IDEALPRO"
contract$includeExpired = TRUE
is.twsContract(contract)


#### REQUEST HIST DATA ####

duration <- "5 D"
barSizeSetting <- "30 sec"

data <- reqHistoricalData(conn=tws, Contract=contract, duration=duration, barSize=barSizeSetting, whatToShow="MIDPOINT")

The setting that I am using on my Trader Workstation are the following:

Optimizing the SMA Crossover bands (Short-run Backtesting)

A key requirement for the success of the algorithm is to identify which set of bands (long and short-run) are better to predict changes in the behavior of the series. We are interesting on benchmarking a series of pair of bands so we can identify which combination is more profitable. In other words, we are going to set the criteria of the highest balance at the end of the testing period to select the pair of bands.

Probably there is fastest vectorized way to identify the intersections, but to calculate the final profit I think it is only possible to do with a loop. Because, the margins of profit/loss change in every transaction and the accumulation of capital depends on this interactive process.

The range I selected for the SMA in the short run is from 5:100 and 10:300for the long run. In each iteration the algorithm will select one pair of bands, fit the corresponding models, calculate benchmarks and estimate the capital at the end of the period. Effectively algorithm tests 27550 combinations of short and long-run bands and saves measurements of performance for latter analysis. The optimization of the bands was conducted in a data set that runs over 5 days in candles of 30 seconds, a data set of 94624.

Similar to the study by Gurrib (2016), I assume that:

The frequency of data is set to candles of 30 seconds.
The effect of discounts, taxes and commissions are ignored.
All orders occur immediately at market prices.
Limit and stop order options are not allowed at this stage.

perf_df <- data.frame(matrix(ncol = 13, nrow = 0))
colnames(perf_df) <-
  c(
    "n",
    "m",
    "capital",
    "num_trades",
    "trades_per_min",
    "numWinningTrades",
    "numLosingTrades",
    "mae_short",
    "mae_long",
    "rmse_short",
    "rmse_long",
    "corr_short",
    "corr_long"
  )

# Set commission rate
# commission_rate <- 0.00075
commission_rate <- 0

# Loop over n and m
for (n in 5:100) {
  for (m in 10:300) {
    # Calculate moving averages
    data$sma_short <- SMA(data$USD.SEK.Close, n = n)
    data$sma_long <- SMA(data$USD.SEK.Close, n = m)
    # data$sma_short[is.na(data$sma_short)] <- 0
    # data$sma_long[is.na(data$sma_long)] <- 0
    
    # Mean Absolute Error (MAE)
    mae_short <- mean(abs(data$sma_short - data$USD.SEK.Close),na.rm = TRUE)
    mae_long <- mean(abs(data$sma_long - data$USD.SEK.Close), na.rm = TRUE)
    
    # Root Mean Squared Error (RMSE)
    rmse_short <-
      sqrt(mean(sum(data$sma_short - data$USD.SEK.Close, na.rm = T) ^ 2))
    rmse_long <- sqrt(mean(sum(data$sma_long - data$USD.SEK.Close, na.rm = TRUE) ^ 2))
    
    # Correlation Coefficient
    corr_short <- cor(data$sma_short, data$USD.SEK.Close, use = "complete.obs")
    corr_long <- cor(data$sma_long, data$USD.SEK.Close, use = "complete.obs")
    
    
    # Initialize variables
    init_capital = 2000
    capital = 2000
    pos = 0
    numTrades = 0
    numWinningTrades = 0
    numLosingTrades = 0
    
    
    
    
    # Backtest strategy
    for (i in 2:nrow(data)){      # Check for a cross
      # c <- c + 1L
      # #print cross
      # print(paste0("cross: ", c))
      if (!is.na(data$sma_short[i - 1]) &&
          !is.na(data$sma_long[i - 1]) &&
          data$sma_short[i - 1] <= data$sma_long[i - 1] &&
          data$sma_short[i] > data$sma_long[i] && capital > 0) {
        # Buy
        pos = (capital - capital * commission_rate) * as.numeric(data$USD.SEK.Close[i])
        print(paste("BUY:", i, pos))
        capital = 0
        numTrades = numTrades + 1
      }
      else if (!is.na(data$sma_short[i - 1]) &&
               !is.na(data$sma_long[i - 1]) &&
               data$sma_short[i - 1] >= data$sma_long[i - 1] &&
               data$sma_short[i] < data$sma_long[i] && pos > 0) {
        # Sell
        capital = as.numeric(pos / data$USD.SEK.Close[i] - pos / data$USD.SEK.Close[i] * commission_rate)
        print(paste("SELL:", i, capital))
        pos = 0
        numTrades = numTrades + 1
        
        if (capital > init_capital) {
          numWinningTrades = numWinningTrades + 1
          
        } else {
          numLosingTrades = numLosingTrades + 1
          
        }
      }
      
    }
  }
  # c <- 0L
  # Append performance to dataframe
  perf_df <-
    rbind(
      perf_df,
      data.frame(
        n,
        m,
        capital,
        numTrades,
        trades_per_min = numTrades / ( (nrow(data) * 30)/60 ),
        numWinningTrades,
        numLosingTrades,
        mae_short,
        mae_long,
        rmse_short,
        rmse_long,
        corr_short = corr_short,
        corr_long = corr_short
      )
    )
  print(tail(perf_df))
}

  
colnames(perf_df) <-
  c(
    "n",
    "m",
    "capital",
    "numTrades",
    "trades_per_min",
    "numWinningTrades",
    "numLosingTrades",
    "mae_short",
    "mae_long",
    "rmse_short",
    "rmse_long",
    "corr_short",
    "corr_long"
  )




#### TOP PERFORMANCE ####
# Print final max capital
print(perf_df[which.max(perf_df$capital),])

# Print final max numWinningTrades  
print(perf_df[which.max(perf_df$numWinningTrades ),])



#### BEST FIT ####

# Print final max rmse_short   
print(perf_df[which.max(perf_df$rmse_short),1])

# Print final max mae_short   
print(perf_df[which.max(perf_df$mae_short),1])

# Print final max mae_short   
print(perf_df[which.max(perf_df$corr_short),1])

# Print final max rmse_long   
print(perf_df[which.max(perf_df$rmse_long),2])

# Print final max mae_long   
print(perf_df[which.max(perf_df$mae_long),2])

# Print final max mae_long   
print(perf_df[which.max(perf_df$corr_long),2])

In terms of performance (capital return) the pair that won is the 8, 300 followed closely by the 5, 300 for the short and long-run respectively.

Long Run Backtesting

To have a better idea of the behavior of the algorithm, I decided to run the algorithm using 6 months of data in candles 30 seconds with a total of 2961360 data points. Testing the algorithm over six months will give us a better perspective of how well the SMA bands capture the long and short run trends in the data and a better approximation of the financial return.

Improvements

I decided to make some small changes to the previous algorithm. Firstly, I wanted to compute the grossprofit/loss of each transaction. Secondly, I estimate the return of investment (ROI) of each transaction to compute the average and standard deviation of the returns at the end of the exercise and approximate a Sharpe Ratio. Thirdly, I wanted to correct a misleading numWinningTrades/numLosingTrades indicator in the previous algorithm. In the previous algorithm, I consider a wining trade if the current capital was higher than the initial capital after each transaction. However to be more accurate it is better to consider a winning trade when the buyPrice > sellPrice. This is a bit counter intuitive but remember that the algorithm buys when the price is increasing, so the USD (dollar) invested will render more Krones (SEK). For instance, imagine that you invest 10 USD, and the price of the Krone is 12 (buying price), that is 120 SEK. Then, if the algorithm identifies a selling signal at a price of 10.5 SEK (selling price), your profit would have been 1.43 USD for this transaction, calculated as (120 / 10.5 = 11.43).

First run of the algorithm

In the first run of the algorithm I wanted to secure winning transactions only. I attempt to achieve this by adding a rule buyPrice > as.numeric(data$USD.SEK.Close[i]), so I will guarantee that the selling price was always bellow the buying price and make winning trades all the time. The buying rule was transformed as follows:

 else if (!is.na(data$sma_short[i - 1]) &&
           !is.na(data$sma_long[i - 1]) &&
           data$sma_short[i - 1] >= data$sma_long[i - 1] &&
           data$sma_short[i] < data$sma_long[i] && pos > 0 && buyPrice > as.numeric(data$USD.SEK.Close[i])) 

Unfortunately this change didn’t report a greater performance than the regular unconstrained moving average. The issue is that the series eventually reach local maximum or minimum values. For instance, if the algorithm buys at a local minimum point in the series the selling condition will never be fulfilled buyPrice > as.numeric(data$USD.SEK.Close[i]). The algorithm’s performance suffered because it purchased an asset at a local minimum price, and since then, the price has consistently risen. This situation makes it unlikely for future prices to be lower, leading to lower overall performance. In a nutshell, it seems that in order to take advance of the volatility of the series and make higher profit it is necessary to lose some trades as long as on the averages we are winning more often. This is the reported performance of the first run:

n	m	capital	net_profit	grossProfit	grossLoss
8	300	2086.763	86.763	86.763	0

SMA constrained Crossover performance

The total profit over the six months was only 86.763 USD, a return of investment of only 4.33 %. However as expected, the total number of trades is low and more importantly there are no trades on loss.

buynumTrades	sellnumTrades	trades_per_min	numWinningTrades	numLosingTrades
56	55	0.001	55	0

Second run of the algorithm

In my second attempt I ran the unconstrained version (original version) with the additional elements that I discussed previously, as follows:

n <- 8
m <- 300

# Calculate moving averages
data$sma_short <- SMA(data$USD.SEK.Close, n = n)
data$sma_long <- SMA(data$USD.SEK.Close, n = m)
# data$sma_short[is.na(data$sma_short)] <- 0
# data$sma_long[is.na(data$sma_long)] <- 0

# Mean Absolute Error (MAE)
mae_short <- mean(abs(data$sma_short - data$USD.SEK.Close),na.rm = TRUE)
mae_long <- mean(abs(data$sma_long - data$USD.SEK.Close), na.rm = TRUE)

# Root Mean Squared Error (RMSE)
rmse_short <-
  sqrt(mean(sum(data$sma_short - data$USD.SEK.Close, na.rm = T) ^ 2))
rmse_long <- sqrt(mean(sum(data$sma_long - data$USD.SEK.Close, na.rm = TRUE) ^ 2))

# Correlation Coefficient
corr_short <- cor(data$sma_short, data$USD.SEK.Close, use = "complete.obs")
corr_long <- cor(data$sma_long, data$USD.SEK.Close, use = "complete.obs")



# Initialize variables
init_capital = 2000
capital = 2000
buyCapital <- 0
pos = 0
grossPnL = 0
buynumTrades = 0
sellnumTrades = 0
numWinningTrades = 0
numLosingTrades = 0
grossProfit = 0
grossLoss = 0
commission_rate = 0
buyPrice = 0  # Initialize previousPrice to 0
sellPrice = 0
roi = vector("numeric", length = 0)

# Backtest strategy
for (i in 2:nrow(data)){      # Check for a cross
  # ...
  if (!is.na(data$sma_short[i - 1]) &&
      !is.na(data$sma_long[i - 1]) &&
      data$sma_short[i - 1] <= data$sma_long[i - 1] &&
      data$sma_short[i] > data$sma_long[i] && capital > 0) {
    # Buy
    buyPrice <- as.numeric(data$USD.SEK.Close[i])  # Store the buy price
    buyCapital <- capital
    pos = (capital - capital * commission_rate) * buyPrice
    print(paste("BUY:", i, pos))
    capital = 0
    buynumTrades = buynumTrades + 1
  }
  else if (!is.na(data$sma_short[i - 1]) &&
           !is.na(data$sma_long[i - 1]) &&
           data$sma_short[i - 1] >= data$sma_long[i - 1] &&
           data$sma_short[i] < data$sma_long[i] && pos > 0){ #&& buyPrice > as.numeric(data$USD.SEK.Close[i])
    # Sell
    sellPrice <- as.numeric(data$USD.SEK.Close[i])  # Calculate PnL based on current capital
    capital = as.numeric(pos / sellPrice - pos / sellPrice * commission_rate)
    grossPnL <- capital - buyCapital
    print(paste("SELL:", i, capital))
    pos = 0
    sellnumTrades = sellnumTrades + 1
    
    roi <- c(roi, (buyPrice-sellPrice/buyPrice)*100)
    if (buyPrice > sellPrice) {
      numWinningTrades = numWinningTrades + 1
      grossProfit = grossProfit + grossPnL
    } else {
      numLosingTrades = numLosingTrades + 1
      grossLoss = grossLoss + abs(grossPnL)
    }
  }
}

The reported performance over the same period of data (6 months) is presented on the table bellow. The net profit of the unconstrained simple moving average over six months was 317.278 with an initial investment of 2000 USD. A total of 15.86 % return of investment, not bad at all, considering that we only tested half a year.

n	m	capital	net_profit	grossProfit	grossLoss
8	300	2317.278	317.278	1700.261	1382.983

SMA unconstrained Crossover performance

Remarkably, the unconstrained variant of the algorithm, absent the condition buyPrice > as.numeric(data$USD.SEK.Close[i]), exhibited a loss in approximately 20% of its trades. This is quite high, and it is an area of opportunity for further implementations of the algorithm. I will start by testing a less restrictive condition of selling that allows to sell on loss but only around certain margin, perhaps the standard deviation of long-run SMA.

buynumTrades	sellnumTrades	trades_per_min	numWinningTrades	numLosingTrades
2025	2025	0.001	1612	413

Composition of the trades

The distribution of the Return of Investment (ROI) of each trade is the following:

Min.	1st Qu.	Median	Mean	3rd Qu.	Max.	sd
-2.688	0.006	0.025	0.007	0.053	1.31	0.15

ROI Summary Statistics

Final remarks and areas of improvement.

The SMA crossover algorithm proved to be successful, achieving a total return on investment of 15.86% over six months with 30-second candles. However, it’s important to note that this performance heavily depends on specific parameter values (bands), candle intervals, and the chosen stock. In our rigorous testing, we explored a staggering 27,550 combinations of short and long-run bands over five days to identify the winning pair.

While the algorithm showed promise, there are areas for improvement. First, we observed a relatively high gross loss (1382 USD) compared to the gross gain (1780 USD), resulting in approximately 20% of losses. Enhancing the algorithm with additional rules, such as introducing resistance bands, may help mitigate losses during market uncertainties. Secondly, more realistic estimations of transaction commissions need to be incorporated to provide a more accurate representation of algorithm performance. It’s worth noting that Interactive Brokers limits regular trading accounts to one-minute candles, which may impact trading strategies. Looking ahead, optimizing and testing the algorithm’s performance in the equity market, particularly with stocks displaying higher returns and upward trends, could yield even better results. Finally, the next phase involves implementing the algorithm using live market data through the reqMktData function and testing it in a paper trading account to assess its real-time performance.

Table of Content: