How CPUs Understand 0s and 1s

I recently started an exciting project: building an NES emulator in C++. An emulator lets a computer mimic the behavior of other hardware, in my case the Nintendo Entertainment System. It works by loading a copy of an NES game’s memory (its ROM) and executing the program much as a real NES would. This means replicating the behavior of the CPU, the PPU (Picture Processing Unit), and the APU (Audio Processing Unit).

I chose to start with the brain of the system, the CPU. The NES used a variant of the MOS Technology 6502, a legendary 8-bit microprocessor that, in one variant or another, also powered other machines of the era such as the Commodore 64, the Atari 5200, and the Apple II. So I set out to learn how a CPU works.

Instruction Execution

CPUs execute instructions. The “8-bit” part means that data is processed in 8-bit chunks. The 6502’s data registers (small, high-speed storage areas inside the CPU that hold temporary values) are 8 bits wide, which means instructions typically manipulate 8-bit numbers. The one exception is the program counter, which is 16 bits wide so that it can address all of memory.
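To make the register sizes concrete, here is a minimal sketch of how the 6502 register set might be modelled in C++. The struct and field names are my own illustrative choices rather than anything from a real emulator, and the initial values are placeholders, not the chip's actual power-on state.

```cpp
#include <cstdint>

// Hypothetical register file for a 6502 emulator (names are illustrative).
struct Registers {
    std::uint8_t  a = 0;       // accumulator
    std::uint8_t  x = 0;       // X index register
    std::uint8_t  y = 0;       // Y index register
    std::uint8_t  sp = 0;      // stack pointer
    std::uint8_t  status = 0;  // processor status flags
    std::uint16_t pc = 0;      // program counter: the only 16-bit register
};

// Adds one to an 8-bit value. Because the result is cast back to 8 bits,
// 0xFF + 1 wraps around to 0x00, just like an 8-bit register would.
inline std::uint8_t increment(std::uint8_t value) {
    return static_cast<std::uint8_t>(value + 1);
}
```

Using fixed-width types such as std::uint8_t means the wraparound behaviour of an 8-bit register falls out of the type itself instead of needing manual masking.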

What is an Instruction?

An instruction is a directive telling the CPU to do something, such as move data around, perform a calculation, or jump to another section of the program. Here’s a snippet of 6502 assembly code to illustrate:

LDA #$01 ;Load the value 01 into the accumulator register
TAX      ;Transfer value from accumulator to the X register
INX      ;Increment the X register by one
BRK      ;Break, or stop the program

We load a value into register A, transfer it into X, and then increment X. In a higher-level language, the same operations might look like:

int A = 1;
int X = A;
X = X + 1;
// End program

Assembly vs. Machine Code

For the CPU to run an instruction, the assembly text first has to be translated, or “assembled”, into machine code. Going the other way, from machine code back to assembly text, is called “disassembling”. Writing an assembler and a disassembler sounds like a great weekend, but neither is required for CPU emulation, because the emulator works on the machine code directly.
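Disassembly is the direction that turns machine-code bytes back into assembly text. As a rough illustration, here is a minimal sketch of a disassembler that only knows the four instructions from the snippet above. The function name is hypothetical, and the opcode values (0xA9, 0xAA, 0xE8, 0x00) are taken from the standard 6502 opcode table.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Turns machine code back into assembly text, one instruction per line.
// Only four opcodes are handled; any other byte is printed as raw data.
std::string disassemble(const std::vector<std::uint8_t>& code) {
    std::string out;
    char buf[32];
    for (std::size_t i = 0; i < code.size(); ++i) {
        switch (code[i]) {
        case 0xA9:  // LDA immediate: the operand is the byte that follows
            if (i + 1 < code.size()) {
                std::snprintf(buf, sizeof buf, "LDA #$%02X\n",
                              static_cast<unsigned>(code[i + 1]));
                out += buf;
                ++i;  // skip the operand byte we just consumed
            } else {
                out += "LDA ???\n";  // program ended before the operand
            }
            break;
        case 0xAA: out += "TAX\n"; break;
        case 0xE8: out += "INX\n"; break;
        case 0x00: out += "BRK\n"; break;
        default:
            std::snprintf(buf, sizeof buf, ".byte $%02X\n",
                          static_cast<unsigned>(code[i]));
            out += buf;
            break;
        }
    }
    return out;
}
```

Feeding it the bytes 0xA9 0x01 0xAA 0xE8 0x00 gives back the four lines of the assembly snippet, minus the comments.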

Executing Machine Code

Looking back at the first line of the assembly snippet, LDA #$01 assembles to the two bytes 0xA9 0x01. As you may have guessed, 0xA9 is the opcode for LDA (in its “immediate” form, where the operand is the value itself) and 0x01 is the value 01.

To understand how, let’s load the above program into memory, starting at the address 0x0000:

Address   Value
0x0000    0xA9      ;LDA #$01
0x0001    0x01
0x0002    0xAA      ;TAX
0x0003    0xE8      ;INX
0x0004    0x00      ;BRK

Let’s briefly talk about how the CPU would execute this code.

  • The program counter (PC) starts program execution at address 0x0000.
  • The CPU sees 0xA9 and recognizes it as a valid opcode for a 2-byte instruction that tells the CPU to load the next byte of memory into the A register. The PC moves on to 0x0001.
  • The next byte of memory is at 0x0001, where the PC now points. The value at this address, 0x01, gets loaded into the A register, and the PC increments to the next address, 0x0002.
  • The CPU recognizes 0xAA as a valid opcode for TAX, a 1-byte instruction. It transfers the value from register A to register X and increments the program counter to the next address, 0x0003.
  • At 0x0003 we have another 1-byte instruction, INX (0xE8), which increments the value in register X and moves the PC to 0x0004.
  • The CPU sees 0x00 (BRK) and, in this simplified model, stops execution.
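The steps above can be sketched as a small fetch-decode-execute loop. This is a simplified illustration under a few assumptions, not a faithful 6502 core: the Cpu struct and run function are hypothetical names, only the four opcodes from this program are handled (using the values from the standard 6502 opcode table, where TAX is 0xAA), BRK simply halts, and the status flags that the real instructions update are left out.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical minimal CPU state (illustrative names, flags omitted).
struct Cpu {
    std::uint8_t  a = 0;        // accumulator
    std::uint8_t  x = 0;        // X index register
    std::uint16_t pc = 0;       // program counter
    bool          halted = false;
};

// Runs until BRK, an unknown opcode, or the end of memory.
// Returns false if the program contained something this sketch can't run.
bool run(Cpu& cpu, const std::vector<std::uint8_t>& memory) {
    while (!cpu.halted && cpu.pc < memory.size()) {
        const std::uint8_t opcode = memory[cpu.pc++];  // fetch, advance PC
        switch (opcode) {                              // decode
        case 0xA9:  // LDA immediate: load the next byte into A
            if (cpu.pc >= memory.size()) return false; // operand missing
            cpu.a = memory[cpu.pc++];
            break;
        case 0xAA:  // TAX: copy A into X
            cpu.x = cpu.a;
            break;
        case 0xE8:  // INX: increment X, wrapping at 8 bits
            cpu.x = static_cast<std::uint8_t>(cpu.x + 1);
            break;
        case 0x00:  // BRK: treated here as "stop the program"
            cpu.halted = true;
            break;
        default:    // opcode not implemented in this sketch
            return false;
        }
    }
    return true;
}
```

Running it on a memory image holding 0xA9 0x01 0xAA 0xE8 0x00 leaves 1 in A and 2 in X, matching the walkthrough.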

Where do Opcodes Come From?

The chip manufacturer designs the instructions and determines which opcode maps to which instruction. Programmers rely on the data sheet, a chip manual provided by the manufacturer, to look up the opcodes and other relevant chip information.
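In an emulator, the data sheet's opcode listing typically ends up as a lookup table. Here is a minimal sketch containing just the four opcodes used in this post. The struct, table, and function names are my own, and a real table would cover every opcode the data sheet lists.

```cpp
#include <cstdint>

// One row of a (hypothetical) opcode table.
struct OpcodeInfo {
    std::uint8_t opcode;    // the byte the CPU sees in memory
    const char*  mnemonic;  // the assembly name for it
    std::uint8_t size;      // instruction length in bytes, opcode included
};

// A tiny subset of the 6502 opcode table.
static const OpcodeInfo kOpcodeTable[] = {
    {0x00, "BRK", 1},
    {0xA9, "LDA", 2},  // immediate form: opcode plus one operand byte
    {0xAA, "TAX", 1},
    {0xE8, "INX", 1},
};

// Returns the table entry for an opcode, or nullptr if it isn't listed.
const OpcodeInfo* lookup(std::uint8_t opcode) {
    for (const OpcodeInfo& entry : kOpcodeTable) {
        if (entry.opcode == opcode) {
            return &entry;
        }
    }
    return nullptr;
}
```

A linear search keeps the sketch short. With the full table in place, a 256-entry array indexed directly by the opcode byte would be the more natural choice.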

More than Just Bits and Bytes

The 6502 CPU has 56 documented instructions. Because many of them come in several addressing modes, they are spread across 151 official opcodes, each a single byte between 0x00 and 0xFF. The “aha!” moment came when I realized that, at its core, the CPU is just an interpreter. In essence, the process of developing the 6502 emulator has been:

  • Load an opcode into memory
  • Implement the instruction for the given opcode
  • Execute the instruction and test to make sure the registers and memory are updated correctly
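As a rough illustration of that loop, here is a minimal sketch with one handler function per instruction and a test that sets up some state, executes a single instruction, and checks the registers afterwards. All names are hypothetical, and status flags are left out to keep it short.

```cpp
#include <cstdint>

// Hypothetical minimal CPU state (illustrative names, flags omitted).
struct Cpu {
    std::uint8_t  a = 0;
    std::uint8_t  x = 0;
    std::uint16_t pc = 0;
};

// Each implemented instruction gets its own handler function.
using Handler = void (*)(Cpu&);

void tax(Cpu& cpu) { cpu.x = cpu.a; }
void inx(Cpu& cpu) { cpu.x = static_cast<std::uint8_t>(cpu.x + 1); }

// Maps an opcode to its handler, or nullptr if it isn't implemented yet.
Handler handler_for(std::uint8_t opcode) {
    switch (opcode) {
    case 0xAA: return tax;
    case 0xE8: return inx;
    default:   return nullptr;
    }
}

// Test pattern: set up state, execute one instruction, check the result.
bool test_tax_copies_a_into_x() {
    Cpu cpu;
    cpu.a = 0x42;
    const Handler handler = handler_for(0xAA);
    if (handler == nullptr) {
        return false;  // opcode not implemented
    }
    handler(cpu);
    // X must now hold the value and A must be left unchanged.
    return cpu.x == 0x42 && cpu.a == 0x42;
}
```

Adding a new instruction then means writing one more handler, one more case in the switch, and one more test like the one above.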

Of course, it’s not the whole picture; I’m still deep in the weeds of trying to understand how everything works, but the work I’ve done so far has given me a newfound appreciation for how computers work. It’s like discovering that the “magic” of computers is a sequence of operations executed perfectly in sync.

If you’re curious about NES emulation, here are some of the resources I used to get started:

Thanks for reading!
