My project this term was to create a replica of Atari’s Breakout and use ML-Agents, Unity’s machine learning toolkit, to train a neural network that would play alongside a human player on a split screen. My biggest success was writing code that successfully trained the agent. By the nature of the project, each of our 4 group members basically had to be proficient in almost all aspects of the project, namely Unity, ML-Agents, and the Unity API, for us to succeed. Another group member had already created code that would train our AI agent, but I felt it was necessary for my own understanding to create a viable model as well. My breakthrough moment in creating the agent came from adjusting the reward structure and episode length, as well as using the isactive() method to recognize whether a brick was active or deactivated.
During the project, it was neither possible nor desirable for us to modify the actual machine learning algorithms; that would have been prohibitively complex. Instead, we treated the algorithms as a black box, feeding them observation data through C# Unity scripts and configuring them through hyperparameters. This was very tricky because without the right rewards, penalties, and episode lengths, the agent would often end up acting in bizarre ways.
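As a rough illustration of what tuning rewards, penalties, and episode length involves, here is a minimal Python sketch. It is not the project's C# code and does not use the ML-Agents API. The names (`RewardConfig`, `step_reward`, `episode_done`) and every numeric value are hypothetical, chosen only to show the shape of the problem: a reward per broken brick, a penalty for losing the ball, a small per-step penalty, and a cap on episode length.

```python
from dataclasses import dataclass


@dataclass
class RewardConfig:
    # All values are illustrative assumptions, not the project's settings.
    brick_reward: float = 1.0     # reward for each brick deactivated this step
    miss_penalty: float = -1.0    # penalty when the ball gets past the paddle
    step_penalty: float = -0.001  # small cost per step to discourage stalling
    max_steps: int = 5000         # hard cap on episode length


def count_active(bricks):
    """Count bricks still in play; each entry is True if the brick is active."""
    return sum(1 for active in bricks if active)


def step_reward(active_before, active_after, ball_lost, cfg):
    """Reward for one step, based on how many bricks were deactivated."""
    broken = active_before - active_after
    reward = broken * cfg.brick_reward + cfg.step_penalty
    if ball_lost:
        reward += cfg.miss_penalty
    return reward


def episode_done(active_after, ball_lost, step, cfg):
    """End the episode on a miss, a cleared board, or the step cap."""
    return ball_lost or active_after == 0 or step >= cfg.max_steps


if __name__ == "__main__":
    cfg = RewardConfig()
    before = [True, True, False]
    after = [True, False, False]  # one brick was deactivated this step
    r = step_reward(count_active(before), count_active(after), False, cfg)
    print(r, episode_done(count_active(after), False, 10, cfg))
```

Small changes to these numbers can change behavior a lot. For example, if the step penalty outweighs what the agent expects to earn from bricks, ending the episode quickly by missing the ball can look like the best strategy.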
I took a risk when I decided to pursue the ML-Agents project. There were other projects with more familiar tools that I knew I could easily accomplish, but I had mostly played it safe during my degree, and I have a great interest in Unity and machine learning, so I decided to take the risk. There were absolutely scary moments, especially in the beginning, when I thought I was in over my head, but working through the project every single day brought me to a place where I’m confident I could create another agent for another game, which is a good feeling. I had never worked with Unity before, and it was an absolute blast being introduced to it.
I’m really happy with how the course worked out for me. Its greatest strength, for me, was how it introduced me to new tools and reinforced old ones. I absolutely feel more confident with Git than I did coming in because I had to push code constantly. I feel like I have a good foundation in C# now, and I feel quite comfortable with Unity. Because ML-Agents uses PyTorch, we had to create virtual environments many times, with many versions of Python and NumPy. This was my first experience using virtual environments, and I feel very confident in the process now. I’m even more confident using Google Docs!
The greatest weakness of the course is perhaps unavoidable. The class is obviously a group project, and that comes with a lot of baggage. It is absolutely important for us to learn how to work in a group, but because we have very little sense of what we are doing when we start the project, it is very difficult to plan a schedule and divide tasks properly. Generally, group work is most rewarding when there is a strong sense of direction, a well-thought-out structure, apportioned roles, and an experienced leader. Without these things, each group member may work at their own pace, sometimes duplicating work that has already been done. There are times when you feel there is not much for you to do, and other times, when problems have been discovered, when there is too much to do. One inherently difficult thing about group work is that organization, task assignment, and goal tracking are very important aspects of a project, but because work on them is harder to show in something like a progress report, they are often ignored. In addition, we generally all just want to write code, but there is so much more involved in the project! With that in mind, my suggestion for improving the course would be to find a way to familiarize each group with more details of the project early on so they can apportion tasks and responsibilities more easily. This is obviously a hard problem to solve, and there is even some merit in the projects being vague so that students create their own solutions, but I do believe more foresight into the project early on would yield better results for students.