AI Trading Bot

Hi all! This is probably my last blog for this course, so I thought I would share some of what I have learned during this really cool project. My team has been developing a Bitcoin trading bot that uses reinforcement learning to trade Bitcoin and try to beat the market. I've used deep learning and neural networks on other projects in the past, and I am really amazed at the power and potential of AI. This project was my first opportunity to work with reinforcement learning, and my team chose to use PyTorch for the project instead of TensorFlow (which was originally prescribed). I'm all about learning new things, especially with AI, so I have really enjoyed this project.

The specific algorithm we are currently testing with our bot is Proximal Policy Optimization (PPO), a family of reinforcement learning algorithms developed by OpenAI. PPO alternates between sampling data through interaction with an environment (states, observations, reward structure, etc.) and optimizing a "surrogate" objective function using stochastic gradient ascent. PPO limits how much the policy can change in each update, which keeps training stable while the agent still has to balance exploration (trying new actions) and exploitation (using known actions). It does this by updating the policy network with a clipped objective function that removes the incentive for large deviations from the previous policy.
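To make the clipping idea a bit more concrete, here is a minimal sketch of the clipped surrogate objective in plain Python. This is not our bot's actual PyTorch code. The function name `clipped_surrogate`, its parameters, and the clip range of 0.2 are all assumptions I picked for illustration, and the advantage values are assumed to be computed elsewhere.

```python
import math


def clipped_surrogate(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    Hypothetical helper for illustration only. Each list holds one entry
    per sampled (state, action) pair.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the new policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms,
        # so moving far from the old policy earns no extra credit.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


# The new policy makes a good action 1.5x more likely, but the objective
# only gives credit for a ratio of up to 1.2.
print(clipped_surrogate([math.log(1.5)], [0.0], [1.0]))
```

In a real training loop you would compute this with tensors and let the framework take gradients of it with respect to the policy network's parameters. The plain-Python version is only meant to show the arithmetic.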

There is a lot more to learn and a lot more we plan to test, but so far we have had really interesting results using PPO and reinforcement learning.

