Exploring MarI/O

During the holidays (while exploring potential projects for this course), I dug into some of the neural networks being used to play video games. A few examples I enjoyed follow:

A Neuroevolution Approach to General Atari Game Playing (http://www.cs.utexas.edu/users/pstone/Papers/bib2html-links/TCIAIG13-mhauskn.pdf)

Playing Atari with Deep Reinforcement Learning (https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf)

SethBling’s MarI/O (https://www.youtube.com/watch?v=qv6UVOQ0F44)

While all are great, I’m going to focus specifically on some of my notes/thoughts while exploring MarI/O’s input layer. MarI/O is an implementation of a genetic algorithm used to play Super Mario World (“SMW”) for the SNES. While digging into this implementation, I discovered the fascinating community surrounding SNES ROM hacking.

For SMW specifically, there exists a community called SMW Central (https://www.smwcentral.net/). It consists of dedicated enthusiasts who create and share hacks of the original ROM. Its members have created and maintain very detailed technical specifications of SMW, and they freely discuss and share details for other games as well. This resource was invaluable in digging into the inputs for MarI/O.

The following is a list of memory locations utilized by MarI/O for SMW to create the inputs:

Variable	Memory Location – Per MarI/O	Description (From SMW Central – https://www.smwcentral.net/?p=nmap&m=smwram)
getPositions – marioX	0x94	$7E0094 – Player X position (16-bit) within the level, next frame (calculates player position one frame ahead, as opposed to $7E:00D1). It’s also used as a player X position on-screen on the overworld border.
getPositions – marioY	0x96	$7E0096 – Player Y position (16-bit) within the level, next frame (calculates player position one frame ahead, as opposed to $7E:00D3). It’s also used as a player Y position on-screen on the overworld border.
getPositions – layer1x	0x1A	$7E001A – Layer 1 X position, current frame. Mirror of SNES register $210D.
getPositions – layer1y	0x1C	$7E001C – Layer 1 Y position, current frame. Mirror of SNES register $210E.
getTile – return	0x1C800	$7FC800 – Map16 high byte table. Same format as $7E:C800. $7F:FFF8 through $7F:FFFD are also used by Lunar Magic’s title screen recording ASM.
getSprites – status	0x14C8	$7E14C8 – Sprite status table: #$00 = Free slot, non-existent sprite. #$01 = Initial phase of sprite. #$02 = Killed, falling off screen. #$03 = Smushed. Rex and shell-less Koopas can be in this state. #$04 = Killed with a spinjump. #$05 = Burning in lava; sinking in mud. #$06 = Turn into coin at level end. #$07 = Stay in Yoshi’s mouth. #$08 = Normal routine. #$09 = Stationary / Carryable. #$0A = Kicked. #$0B = Carried. #$0C = Powerup from being carried past goaltape. States 08 and above are considered alive; sprites in other states are dead and should not be interacted with.
getSprites – spritex	0xE4	$7E00E4 – Sprite X position, low byte.
getSprites – spritex	0x14E0	$7E14E0 – Sprite X position, high byte.
getSprites – spritey	0xD8	$7E00D8 – Sprite Y position, low byte.
getSprites – spritey	0x14D4	$7E14D4 – Sprite Y position, high byte.
getExtendedSprites – number	0x170B	$7E170B – Extended sprite number. Last two bytes reserved for fireballs.
getExtendedSprites – spritex	0x171F	$7E171F – Extended sprite X position, low byte. Last two bytes reserved for fireballs.
getExtendedSprites – spritex	0x1733	$7E1733 – Extended sprite X position, high byte. Last two bytes reserved for fireballs.
getExtendedSprites – spritey	0x1715	$7E1715 – Extended sprite Y position, low byte. Last two bytes reserved for fireballs.
getExtendedSprites – spritey	0x1729	$7E1729 – Extended sprite Y position, high byte. Last two bytes reserved for fireballs.
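
Several of the rows above come in low-byte/high-byte pairs that live at separate addresses. Here is a minimal sketch of how such a pair can be combined into one 16-bit coordinate. The `ram` dictionary is a hypothetical stand-in for the emulator's memory-read call, and the sample byte values are made up for illustration:

```python
# Hypothetical RAM snapshot: address -> byte value (sample values are made up).
ram = {
    0x00E4: 0x34,  # sprite X position, low byte (slot 0)
    0x14E0: 0x02,  # sprite X position, high byte (slot 0)
}


def read_u16_split(ram, low_addr, high_addr, slot=0):
    """Combine separately stored low and high bytes into one 16-bit value."""
    low = ram[low_addr + slot]
    high = ram[high_addr + slot]
    return low + high * 256


sprite_x = read_u16_split(ram, 0x00E4, 0x14E0)
print(hex(sprite_x))  # 0x234
```

The `slot` argument reflects that each table holds one byte per sprite slot, so the same two base addresses serve every sprite.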

While the list is fairly straightforward, digging into the “why” behind each of these was very interesting. Let’s start with the inputs from the getPositions function. The first two addresses hold Mario’s X and Y coordinates within the level. These are used to drive the fitness within the algorithm, to help determine which Map16 tiles need to be pulled from memory to build the input tiles, and to determine Mario’s distance from enemies. It doesn’t appear as if the layer1 variables are ultimately used in any of the calculations.

The getTile variable was probably the most confusing for me to find specific detail about. This variable maps to the Map16 high byte. Thanks to the resources at SMW Central I was able to find Lunar Magic, which helped clear up a lot of my confusion about how this memory was being used. In general, a Map16 high byte of 01 in SMW maps to tiles that are solid. These include ground tiles, question mark blocks, slopes, and (unfortunately for the Mario character being controlled by this implementation) even spikes.
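
To make the lookup concrete, here is a rough sketch of how a pixel position could be turned into an offset into the Map16 high byte table and tested for solidity. Only the base offset 0x1C800 and the "high byte 01 means solid" rule come from the notes above. The stride of 0x1B0 bytes per 16-tile-wide block, the 8-pixel horizontal nudge, and the `wram` dictionary are my assumptions about the layout, not a copy of MarI/O's code:

```python
MAP16_HIGH_BASE = 0x1C800  # emulator offset for $7FC800, per the table
TILE_SIZE = 16             # Map16 tiles are 16x16 pixels
TILES_PER_ROW = 16         # assumed: tiles per row within one screen-wide block
SCREEN_STRIDE = 0x1B0      # assumed: 16 x 27 tiles stored per screen-wide block
X_NUDGE = 8                # assumed: sample from Mario's horizontal centre


def tile_offset(mario_x, mario_y, dx=0, dy=0):
    """Offset of the Map16 high byte for the tile at Mario's position + (dx, dy)."""
    tile_x = (mario_x + dx + X_NUDGE) // TILE_SIZE
    tile_y = (mario_y + dy) // TILE_SIZE
    block = tile_x // TILES_PER_ROW
    return (MAP16_HIGH_BASE + block * SCREEN_STRIDE
            + tile_y * TILES_PER_ROW + tile_x % TILES_PER_ROW)


def is_solid(wram, mario_x, mario_y, dx=0, dy=0):
    """Treat a high byte of 0x01 as solid; unknown offsets count as empty."""
    return wram.get(tile_offset(mario_x, mario_y, dx, dy), 0x00) == 0x01


# Hypothetical memory: one solid tile directly below Mario.
wram = {tile_offset(256, 32, 0, 16): 0x01}
print(is_solid(wram, 256, 32, 0, 16))  # tile below Mario
print(is_solid(wram, 256, 32, 0, 0))   # tile Mario occupies
```

Note that the check only ever sees the high byte, so it cannot tell which solid tile it found.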

Map16 Tiles – Example 1 – The tiles in this tilemap seem like they would generally work fairly well. 🙂

Map16 Tiles – Example 2 – The tiles in this tilemap have tiles that would kill Mario. 🙁

These inputs are all brought together in the getInputs function, which builds a feature matrix for the neural network in which each feature represents a small area of the screen. Every feature is initially assigned a value of 0, then is reassigned a value of 1 if there is a solid tile in that area, or a value of -1 if an enemy (sprite) or obstacle (extended sprite) exists in that area.
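
The 0 / 1 / -1 encoding described above can be sketched as follows. This is an illustration under my own assumptions rather than MarI/O's actual getInputs: the `build_inputs` name, the grid radius, the 8-pixel hazard tolerance, and the sample coordinates are all hypothetical.

```python
def build_inputs(solid_tiles, hazards, center, radius=1, tile_size=16):
    """Return a (2*radius+1) square grid of 0 / 1 / -1 values around `center`.

    solid_tiles: set of (tile_x, tile_y) coordinates that are solid
    hazards:     list of (x, y) pixel positions of sprites / extended sprites
    center:      (x, y) pixel position of the player
    """
    cx, cy = center
    span = radius * tile_size
    grid = []
    for dy in range(-span, span + 1, tile_size):
        row = []
        for dx in range(-span, span + 1, tile_size):
            px, py = cx + dx, cy + dy
            value = 0  # default: nothing of interest in this cell
            if (px // tile_size, py // tile_size) in solid_tiles:
                value = 1  # solid tile
            for hx, hy in hazards:
                # Assumed tolerance: a hazard within 8 pixels claims the cell.
                if abs(hx - px) <= 8 and abs(hy - py) <= 8:
                    value = -1
            row.append(value)
        grid.append(row)
    return grid


# Hypothetical scene: a row of ground below the player, one enemy to the right.
ground = {(1, 3), (2, 3), (3, 3)}
enemies = [(58, 40)]
grid = build_inputs(ground, enemies, center=(40, 40))
for row in grid:
    print(row)
# [0, 0, 0]
# [0, 0, -1]
# [1, 1, 1]
```

Because the hazard check runs after the tile check, a sprite standing on a solid tile overwrites the 1 with a -1 in this sketch.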

While it isn’t lost on me that MarI/O was created mostly to make an awesome YouTube video, and I shouldn’t be too critical, this input feature selection wouldn’t work for a large number of the game’s levels. Simply assigning a value to a tile based on the high byte loses a lot of information that could be provided to the neural network, and will lead to inconsistent results across tilesets.

For example, as discussed above, the example 2 tileset contains spikes with a high byte of 01. The image below shows an area where this would likely lead to issues for MarI/O. Based on how the current inputs are used, the spikes would be assigned a value of 1, the same as any other solid tile.

The spikes on this level wouldn’t be recognized as being any different from other ground tiles.
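
A tiny sketch of the information loss: if the only thing the network sees is whether the high byte equals 01, every solid tile collapses to the same input. The tile names below are illustrative labels only, and the 0x00 high byte for the empty tile is my assumption:

```python
# Illustrative tiles: (label, Map16 high byte). Labels are hypothetical;
# the 0x01 high byte for solid tiles is from the discussion above.
tiles = [
    ("ground", 0x01),
    ("question block", 0x01),
    ("spikes", 0x01),
    ("empty", 0x00),  # assumed high byte for a non-solid tile
]


def encode(high_byte):
    """Input value under the high-byte-only scheme: 1 if solid, else 0."""
    return 1 if high_byte == 0x01 else 0


encoded = {label: encode(high) for label, high in tiles}
print(encoded)
# {'ground': 1, 'question block': 1, 'spikes': 1, 'empty': 0}
```

With this encoding the network gets identical input for ground and spikes, so it has no signal from which to learn that one of them is deadly.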

Ultimately, I’d like to pick this up again and continue to explore this (and other) implementations of ML in gaming. This examination led me down a path of exploring emulation, and I’m excited to dig in further by building an NES emulator as my project for this course.
