Exploring MarI/O

During the holidays, while exploring potential projects for this course, I dug into some of the neural networks being used to play video games.  A few examples I enjoyed reading follow:

A Neuroevolution Approach to General Atari Game Playing (http://www.cs.utexas.edu/users/pstone/Papers/bib2html-links/TCIAIG13-mhauskn.pdf)

Playing Atari with Deep Reinforcement Learning (https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf)

SethBling’s MarI/O (https://www.youtube.com/watch?v=qv6UVOQ0F44)

While all are great, I’m going to focus specifically on some of my notes/thoughts while exploring MarI/O’s input layer.  MarI/O is an implementation of a genetic algorithm used to play Super Mario World (“SMW”) for the SNES.  While digging into this implementation, I discovered the fascinating community surrounding SNES ROM hacking.

For SMW specifically, there exists a community called SMW Central (https://www.smwcentral.net/).  It consists of dedicated enthusiasts who create and share hacks of the original ROM.  The community has created and maintains very detailed technical specifications of SMW, and freely discusses and shares details for other games as well.  This resource was invaluable in digging into the inputs for MarI/O.

The following is a list of memory locations utilized by MarI/O for SMW to create the inputs:

Variable | Memory Location (per MarI/O) | Description (from SMW Central – https://www.smwcentral.net/?p=nmap&m=smwram)
getPositions – marioX | 0x94 | $7E0094 – Player X position (16-bit) within the level, next frame (calculates player position one frame ahead, as opposed to $7E:00D1). It’s also used as a player X position on-screen on the overworld border.
getPositions – marioY | 0x96 | $7E0096 – Player Y position (16-bit) within the level, next frame (calculates player position one frame ahead, as opposed to $7E:00D3). It’s also used as a player Y position on-screen on the overworld border.
getPositions – layer1x | 0x1A | $7E1462 – Layer 1 X position, current frame. Mirror of SNES register $210D.
getPositions – layer1y | 0x1C | $7E1464 – Layer 1 Y position, current frame. Mirror of SNES register $210E.
getTile – return | 0x1C800 | $7FC800 – Map16 high byte table. Same format as $7E:C800. $7F:FFF8 through $7F:FFFD are also used by Lunar Magic’s title screen recording ASM.
getSprites – status | 0x14C8 | $7E14C8 – Sprite status table: #$00 = Free slot, non-existent sprite. #$01 = Initial phase of sprite. #$02 = Killed, falling off screen. #$03 = Smushed. Rex and shell-less Koopas can be in this state. #$04 = Killed with a spinjump. #$05 = Burning in lava; sinking in mud. #$06 = Turn into coin at level end. #$07 = Stay in Yoshi’s mouth. #$08 = Normal routine. #$09 = Stationary / Carryable. #$0A = Kicked. #$0B = Carried. #$0C = Powerup from being carried past goaltape. States 08 and above are considered alive; sprites in other states are dead and should not be interacted with.
getSprites – spritex | 0xE4 | $7E00E4 – Sprite X position, low byte.
getSprites – spritex | 0x14E0 | $7E14E0 – Sprite X position, high byte.
getSprites – spritey | 0xD8 | $7E00D8 – Sprite Y position, low byte.
getSprites – spritey | 0x14D4 | $7E14D4 – Sprite Y position, high byte.
getExtendedSprites – number | 0x170B | $7E170B – Extended sprite number. Last two bytes reserved for fireballs.
getExtendedSprites – spritex | 0x171F | $7E171F – Extended sprite X position, low byte. Last two bytes reserved for fireballs.
getExtendedSprites – spritex | 0x1733 | $7E1733 – Extended sprite X position, high byte. Last two bytes reserved for fireballs.
getExtendedSprites – spritey | 0x1715 | $7E1715 – Extended sprite Y position, low byte. Last two bytes reserved for fireballs.
getExtendedSprites – spritey | 0x1729 | $7E1729 – Extended sprite Y position, high byte. Last two bytes reserved for fireballs.
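
Several rows in the list above come in low byte / high byte pairs, while Mario's position is a single 16-bit value. As a rough illustration of how such values could be read, here is a minimal Python sketch. It is not MarI/O's code (which is a Lua script running inside an emulator); the `wram` buffer, the helper names, and the choice of a signed read are all assumptions made for the example.

```python
# Hypothetical stand-in for SNES work RAM ($7E0000-$7FFFFF, 128 KiB).
# In the real setup the emulator provides this memory; here it is a plain buffer.
wram = bytearray(0x20000)

def read_s16_le(ram, addr):
    """Read a signed 16-bit little-endian value stored in two adjacent bytes."""
    return int.from_bytes(ram[addr:addr + 2], "little", signed=True)

def read_split_u16(ram, low_table, high_table, slot):
    """Combine one entry from a low-byte table and one from a high-byte table."""
    return ram[low_table + slot] | (ram[high_table + slot] << 8)

# Fake data so the example is runnable: Mario at X = 0x0110, sprite slot 3 at X = 0x0234.
wram[0x94], wram[0x95] = 0x10, 0x01
wram[0xE4 + 3] = 0x34      # sprite X, low byte table
wram[0x14E0 + 3] = 0x02    # sprite X, high byte table

mario_x = read_s16_le(wram, 0x94)
sprite_x = read_split_u16(wram, 0xE4, 0x14E0, 3)
print(mario_x, sprite_x)
```

The split tables are why each sprite coordinate takes two rows in the list: the low and high bytes of slot N live in different tables at the same offset N.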

While the list is fairly straightforward, digging into the “why” behind each of these was very interesting.  Let’s start with the inputs from the getPositions function.  The marioX and marioY addresses hold Mario’s X and Y coordinates within the level.  They are used to drive the fitness within the algorithm, to help determine which Map16 tiles need to be pulled from memory to build the input tiles, and to determine Mario’s distance from enemies.  It doesn’t appear that the layer1 variables are ultimately used in any of the calculations.

The getTile variable was probably the most confusing for me to find specific detail about.  This variable maps to the Map16 high byte.  Thanks to the resources at SMW Central I was able to find Lunar Magic, which helped clear up a lot of my confusion around how this memory was being used.  In general, a Map16 high byte of 01 in SMW maps to tiles that are solid. These can include everything from ground tiles and question mark blocks to slopes and (unfortunately for the Mario character being controlled by this implementation) even spikes.
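
To make the tile lookup more concrete, here is a minimal Python sketch of how a pixel offset from Mario could be turned into an index into the Map16 high byte table. The indexing formula (0x1B0 bytes per 16-tile-wide screen, 0x10 bytes per row, an 8-pixel horizontal nudge) is my recollection of how the MarI/O script does it for horizontal levels and should be treated as an assumption; the `wram` buffer and function names are hypothetical.

```python
MAP16_HIGH = 0x1C800  # offset of $7FC800 from the start of work RAM ($7E0000)

def tile_index(mario_x, mario_y, dx, dy):
    """Index into the Map16 high byte table for the tile at (mario + offset).

    Assumed layout: screens are 16 tiles wide and stored 0x1B0 bytes apart,
    with 0x10 bytes per row of tiles inside a screen.
    """
    x = (mario_x + dx + 8) // 16   # tile column within the level
    y = (mario_y + dy) // 16       # tile row within the level
    return (x // 0x10) * 0x1B0 + y * 0x10 + (x % 0x10)

def is_solid(ram, mario_x, mario_y, dx, dy):
    """True when the Map16 high byte at that tile is 01 (treated as solid)."""
    return ram[MAP16_HIGH + tile_index(mario_x, mario_y, dx, dy)] == 1

# Fake RAM with one solid tile so the example runs on its own.
wram = bytearray(0x20000)
idx = tile_index(256, 32, 16, 0)
wram[MAP16_HIGH + idx] = 1
print(idx, is_solid(wram, 256, 32, 16, 0), is_solid(wram, 256, 32, 32, 0))
```

Note that the check only ever sees the high byte, so a spike and a patch of ground with the same high byte are indistinguishable at this point.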

Map16 Tiles – Example 1 – The tiles in this tilemap seem like they would generally work fairly well. 🙂
Map16 Tiles – Example 2 – The tiles in this tilemap have tiles that would kill Mario. 🙁

These inputs are all brought together in the getInputs function, which builds the feature matrix fed to the neural network; each feature represents a small area of the screen.  Every feature is initially assigned a value of 0, then is assigned a value of 1 if there is a solid tile in that area, or a value of -1 if an enemy (sprite) or obstacle (extended sprite) exists in that area.
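
The encoding described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the getInputs function itself: the grid radius of 6 tiles, the 16-pixel step, the 8-pixel sprite reach, the rule that a sprite overrides a solid tile, and the `solid_at` callback are all choices made for the example.

```python
def build_inputs(solid_at, sprites, mario_x, mario_y, radius=6, tile=16, reach=8):
    """Return a flat list of cell values for a square grid centered on Mario.

    solid_at(x, y) -> bool is a hypothetical callback for the tile lookup;
    sprites is a list of (x, y) level coordinates for enemies and obstacles.
    """
    inputs = []
    span = radius * tile
    for dy in range(-span, span + 1, tile):
        for dx in range(-span, span + 1, tile):
            cx, cy = mario_x + dx, mario_y + dy
            value = 0                      # default: nothing of interest here
            if solid_at(cx, cy):
                value = 1                  # solid tile
            for sx, sy in sprites:
                if abs(sx - cx) <= reach and abs(sy - cy) <= reach:
                    value = -1             # sprite nearby (assumed to override a tile)
            inputs.append(value)
    return inputs

# Toy scene: flat ground one tile below Mario, one enemy two tiles to his right.
ground = lambda x, y: y >= 116
grid = build_inputs(ground, [(132, 100)], mario_x=100, mario_y=100)
for row in range(13):
    print(grid[row * 13:(row + 1) * 13])
```

With a radius of 6 the grid is 13 by 13, so the network sees 169 values per frame in this sketch, each one of -1, 0, or 1.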

While it isn’t lost on me that MarI/O was created mostly to create an awesome YouTube video, and I shouldn’t be too critical, this choice of input features wouldn’t work for a large number of the game’s levels.  Simply assigning a value to a tile based on the high byte loses a lot of information that could be provided to the neural network, and will lead to inconsistent results across various tilesets.

For example, as discussed above, the example 2 tileset contains spikes with a high byte of 01. The image below shows an area where this would likely lead to issues for MarI/O. Based on how the current inputs are used, the spikes would be assigned a value of 1, the same as any other solid tile.

The spikes on this level wouldn’t be recognized as being different from other ground tiles.

Ultimately, I’d like to pick up and continue to explore this (and other) implementations of ML in gaming. This examination led me down a path of exploring emulation, and I’m excited to dig into it further by building an NES emulator as my project for this course.
