The Journey So Far…

Building out a multi-objective reinforcement learning model and creating custom indicators along the way is not something I would have been able to do, or even thought of doing, before starting this course. It seems like the kind of work PhD-level people would be doing, so I am gladly soaking up all that I can from this project. I’m sure the theory and the more mathematical parts are covered, but for the build itself I really need to know how to set these things up with existing tools rather than building them from scratch. I am taking this course alongside an open source class, which has been a great pairing because I had no idea how many projects are open source, Pandas for example. I had used Pandas before, but I never really knew where it came from until I took these classes, and that led me to look for more libraries I could use for this project. Learning new technologies has been the most rewarding part so far, and I think this class will help me in the future when I have to go out and learn without much guidance. Another cool thing on top of the coding is learning about formulas from the financial world, like the one here:

Learning about all the indicators has been interesting because it shows me how I might trade in the future, and it is something I will be looking into long after this course ends.

Tools and Style

My team is spread across different time zones, so commits can happen at any time of day. Because this project took a lot of research up front, it started off a little slow while we got our bearings, but it has ramped up quickly and commits are now coming in daily. For project tracking we have been using Jira to make sure our deadlines are met and that we are all on the same page about what work needs to be done, which helps us hold each other accountable. I find it pretty easy to use since I use it at work, and it’s so nice to have a much simpler board that keeps us focused, without the customer bugs and tech debt that have built up over the years.


I think we will get a working product for sure. Whether it ends up tanking our paper trading accounts is another matter, but we are definitely trying to optimize the strategies as best we can. We have been using the backtesting library, which you can fine-tune to give some insane results, but of course hindsight is 20/20. I think we will get a better idea of profitability when we feed all the features into the model and train it to produce a buy or sell signal, which is coming up in the next two weeks. This is the part that gets really exciting: seeing the sample models start to learn from the data we present. I started out by giving the lunar lander sample a go, which helped solidify the idea behind reinforcement learning for me.

Overall I feel good about the project at this point and excited to see how we wrap it all together in the coming weeks! Check out my most recent post here!

The Tech Stack

Why did you and your team choose the technologies you did?

There are plenty of technologies to choose from for machine learning and artificial intelligence. With the explosion of AI, specifically OpenAI’s ChatGPT, it seems like there is a new technology every day that can enhance and optimize our lives in ways large and small. Taking something complex and wrapping it in a simple interface turns work that used to take hours or days into minutes or even seconds. So where do you go to find great tech for the specific task of creating a crypto trading bot trained with multi-objective reinforcement learning? There are a few options at this point, and from what we gathered we just have to select several and blend them together into essentially our own tech stack for this project. We start with the basics of Python, given that it is the go-to language for problems like this, and with the popularity of Jupyter it felt like that power duo would let us train and test models most efficiently. We are also using Pandas, Scikit-learn, and NumPy, popular Python libraries in their own right, so we can accurately define a data set and structure it in a meaningful way. Preprocessing is a large part of the project because we need to be working with quality data; that can’t be overlooked, otherwise our results may not be accurate at all. For the data itself, we found that the ease of use and accessibility of yfinance would work well for our purposes, and it has done just that so far.
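To make the preprocessing step a bit more concrete, here is a minimal sketch of the kind of cleanup Pandas makes easy. It is not our actual pipeline: the `preprocess` helper, the `return` and `sma_3` feature names, and the hand-typed prices are all made up for illustration, and the table is built by hand instead of downloaded with yfinance so that the example runs offline.

```python
import numpy as np
import pandas as pd


def preprocess(prices: pd.DataFrame) -> pd.DataFrame:
    """Clean a daily price table and add two simple features (hypothetical helper)."""
    df = prices.sort_index()
    # Drop duplicate timestamps, keeping the first row for each date.
    df = df[~df.index.duplicated(keep="first")].copy()
    # Fill gaps with the last known close.
    df["Close"] = df["Close"].ffill()
    # Daily percentage return and a 3-day simple moving average.
    df["return"] = df["Close"].pct_change()
    df["sma_3"] = df["Close"].rolling(window=3).mean()
    # Drop the leading rows that lack enough history for every feature.
    return df.dropna()


# Made-up closing prices with one missing value (not real market data).
dates = pd.date_range("2023-01-01", periods=6, freq="D")
raw = pd.DataFrame(
    {"Close": [100.0, 102.0, np.nan, 101.0, 105.0, 110.0]}, index=dates
)
clean = preprocess(raw)
print(clean)
```

Forward-filling is only one way to handle a gap; dropping the row or interpolating are other options, and which one is appropriate depends on the data.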

Reinforcement Learning. Figure 1.

Now, once we gather the data and clean it, what is next? Training, of course. For that we followed sponsor and instructor guidance and opted for TensorFlow, with OpenAI’s Gym as the environment. With so much buzz around the company and their products, it seems like a great chance to learn more and get involved at such an early stage. Together, these technologies can all combine to build an amazing product, and we are excited to present our findings!

How will your project use them?

When we build a machine learning model, our main goal is to create a model that can make accurate predictions on new, unseen data. To achieve this, we need to ensure that our model is not only good at fitting the data it has seen but also generalizes well to new data. The dataset we have is usually divided into two parts: the training set and the testing set.

Figure 2.

Training set: This is the part of the dataset that we use to “teach” or “train” our machine learning model. During the training process, the model learns patterns and relationships between the input features (e.g., characteristics of a flower) and the output (e.g., the flower’s species). Another example would be student study hours as the input and their grades as the output. The model adjusts its internal parameters to minimize the error between its predictions and the actual output values in the training set.

Testing set: This is the part of the dataset that we use to evaluate the performance of our trained model. We do not use the testing set during the training process, so it represents new, unseen data for the model. By comparing the model’s predictions on the testing set with the actual output values, we can estimate how well the model will perform on real-world data that it has never seen before.
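The train/test idea above can be sketched in a few lines with Scikit-learn, using the study-hours-and-grades example. The data here is randomly generated for illustration (the slope of 4, the noise level, and the 80/20 split are all assumptions, not anything from our project), so treat it as a toy, not a result.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Made-up data: hours studied (input feature) and grade (output).
rng = np.random.default_rng(seed=0)
hours = rng.uniform(0, 10, size=100).reshape(-1, 1)
grades = 50 + 4 * hours.ravel() + rng.normal(0, 2, size=100)

# Hold out 20% of the rows; the model never sees them during training.
X_train, X_test, y_train, y_test = train_test_split(
    hours, grades, test_size=0.2, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)  # learn slope and intercept from the training set

# R^2 on the held-out rows estimates performance on unseen data.
test_r2 = model.score(X_test, y_test)
print(f"learned slope: {model.coef_[0]:.2f}, test R^2: {test_r2:.2f}")
```

One caveat for price data: `train_test_split` shuffles rows by default, which can leak future information into training for a time series. Passing `shuffle=False` keeps the split chronological, which is probably closer to what a trading project wants.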

Figure 2 shows where each of the technologies discussed previously fits. Python is the base language across the entire project, with the modeling taking place in Jupyter notebooks, which helps with efficient memory usage and lets us process the data once instead of running large data sets over and over each time we adjust the algorithm. We use NumPy arrays instead of built-in Python lists because NumPy arrays are more efficient for numerical computations, which are at the core of machine learning algorithms. NumPy also provides a wide range of built-in functions for performing mathematical operations on arrays; these functions are highly optimized and easy to use, which makes arrays convenient to work with. Scikit-learn is a great library for ML-related tools. For example, when we create a LinearRegression model object, we’re instantiating an object from the LinearRegression class provided by Scikit-learn. This class has several methods, including the fit() method, which is responsible for training the model using the provided input features and target values. OpenAI’s Gym makes setting up an environment for training super easy, like five-lines-of-code easy. It really is the only exposure to training models I have had to this point, but I can’t see it getting much better than this. Here is an example snippet:

import gym
from my_custom_environment import MyCustomEnv
from mo_dqn import MODQN

# Create the custom environment
env = MyCustomEnv()

# Initialize the multi-objective DQN agent
agent = MODQN(env)

So as you can see, we can quickly build off existing MORL agents and just get to testing.
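For a feel of what sits behind a custom environment like the one in the snippet, here is a rough sketch. It is a plain-Python stand-in, not our real MyCustomEnv and not a Gym subclass: the `ToyTradingEnv` name, the two objectives (profit and a per-trade penalty), and the prices are all hypothetical. It only mimics the shape of Gym's classic `reset`/`step` interface, where `step` returns an observation, a reward, a done flag, and an info dict. Newer Gym releases split the done flag into separate terminated and truncated values, so the exact return signature depends on the version. The multi-objective part is simply that the reward is a vector instead of a single number.

```python
import numpy as np


class ToyTradingEnv:
    """Hypothetical stand-in for a custom env; mimics Gym's reset/step shape."""

    def __init__(self, prices):
        self.prices = np.asarray(prices, dtype=float)
        self.t = 0
        self.position = 0  # 0 = flat, 1 = holding one unit

    def _observation(self):
        return np.array([self.prices[self.t], self.position], dtype=float)

    def reset(self):
        self.t = 0
        self.position = 0
        return self._observation()

    def step(self, action):
        """action: 0 = sell or stay flat, 1 = buy or keep holding."""
        traded = int(action != self.position)
        self.position = action
        price_change = self.prices[self.t + 1] - self.prices[self.t]
        self.t += 1
        # Two objectives: profit from the held position, and a trade penalty.
        reward = np.array([self.position * price_change, -float(traded)])
        done = self.t == len(self.prices) - 1
        return self._observation(), reward, done, {}


# Made-up prices; an always-hold policy just to exercise the loop.
env = ToyTradingEnv(prices=[100.0, 101.0, 99.0, 103.0])
obs = env.reset()
total = np.zeros(2)
done = False
while not done:
    obs, reward, done, info = env.step(1)
    total += reward
print(total)
```

A multi-objective agent would then have to decide how to trade off the entries of that reward vector, for example by weighting them, which is the part an existing MORL agent handles for us.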

What are their pros and cons?

Well, the initial con is that there are paid products that already do what we are trying to do, so depending on your financial situation you could use, for example, QuantConnect and build this out in a matter of minutes, including connecting to a brokerage of your choice and executing the trades. I guess that could be considered a pro as well. For the enjoyment of learning, though, I will stick with calling it a con, because I want to learn how to do it rather than just use someone else’s product. I think the pros of using all this tech come with the failures and iterations that will surely have to be made, because they mean that we are learning and building a tool suite of our own for future projects. It is one thing to say you made money with a Bitcoin bot, but it is another to say that you developed the bot itself. Even better would be to say that you in some manner sold the bot, whether through subscription or whatever else. At the end of the day, the guys who made the most in the gold rush were the ones selling shovels and pans, not the gold miners themselves, so I think that is the overall pro of the tools we selected and the learning that comes along with them.

Want to see more? Check out my latest post here!