What is Computer Architecture?
Computer architecture is a set of rules and methods that describe the functionality, organization, and implementation of computer systems.
How does Computer Architecture relate to Embedded Systems?
A good understanding of computer architecture allows microcontrollers to be designed efficiently, making them cheaper and faster.
Although computer architecture and embedded systems are similar, they differ in how they’re designed. A computer architecture is usually made open ended, while an embedded system has a very specific end result it’s meant to achieve.
FPGAs allow computer architecture to integrate more easily into embedded systems. With an FPGA, custom hardware and custom computer architecture can be designed, prototyped, and implemented in a project. This lets designers build custom hardware that is better suited to the project than an off-the-shelf microcontroller.
What challenges does Computer Architecture face today?
Semiconductor manufacturing is hitting a plateau. Moore’s law is breaking down at 13 nm. As the feature size of the manufacturing process for embedded processors and chips decreases, a number of effects begin to hinder chip performance and slow the pace of development. These effects include the transistor-based Body Effect, which can cause transistors to malfunction; Velocity Saturation, which puts a limit on the speed of the chip; and Dopant Variation, which causes voltage swings that reduce the strength of the transistors in embedded chips.
Power consumption is an extremely important part of computer architecture. Approximate computing is potentially a way to save power while doing lots of computations. Using approximate computing, we can save computers huge amounts of time and reduce power consumption considerably. Some applications (mostly in multimedia) don’t care about perfect accuracy: you can’t really tell the difference between a pixel shaded #204feb vs. #204fea. Also, some classes of applications, namely heuristics, don’t even have “right” answers in the first place – they were approximations to begin with! For example, by training a hardware neural network, you can trade off a bit of accuracy while improving power efficiency significantly. However, there are a lot of problems with this. Just because your approximate accelerator performs well on inputs A, B, and C doesn’t mean that it’ll perform well on input D, especially if input D didn’t even exist before the product was launched. How do you theoretically guarantee quality of service? Is a proof of QoS even possible with things like neural networks? What kinds of applications can you accelerate? Where do you draw the line for “good enough”? Do you have a one-size-fits-all “approximate computing” accelerator, or have multiple accelerators tailored for multiple applications? Should these approximate computing accelerators be fixed, or adapt at runtime?
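To make the pixel example above concrete, here is a minimal Python sketch of one very simple form of approximation: dropping the low-order bits of each color channel. The function names (`hex_to_rgb`, `max_channel_error`, `truncate_pixel`) are made up for illustration, and bit truncation is only a stand-in for the idea. It is not how any particular approximate accelerator works.

```python
def hex_to_rgb(color):
    """Convert a '#rrggbb' string to an (r, g, b) tuple of 0-255 integers."""
    digits = color.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def max_channel_error(a, b):
    """Largest per-channel difference between two '#rrggbb' colors."""
    return max(abs(x - y) for x, y in zip(hex_to_rgb(a), hex_to_rgb(b)))


def truncate_pixel(color, dropped_bits):
    """Approximate a pixel by zeroing the low `dropped_bits` bits of each channel.

    Hypothetical stand-in for reduced-precision hardware: fewer bits to
    store and move, at the cost of a bounded error per channel.
    """
    mask = 0xFF & ~((1 << dropped_bits) - 1)
    r, g, b = (channel & mask for channel in hex_to_rgb(color))
    return f"#{r:02x}{g:02x}{b:02x}"
```

For example, `max_channel_error("#204feb", "#204fea")` returns `1`, which is the one-step difference in the blue channel that nobody can see. `truncate_pixel("#204feb", 4)` returns `"#2040e0"`, which is off by as much as 15 in a channel. Where to draw the line between those two is exactly the “good enough” question raised above.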
Another byproduct of Moore’s law is that we are creating less reliable transistors and increasing the complexity of designs by using such a high number of transistors. One problem that comes with high transistor density is faults, which can be permanent or temporary. These are caused by the heat the transistors generate from drawing so much power. The heat can damage registers and slow down computation. As we introduce more transistors, circuits become more and more complex and more difficult to deal with. Within each core, manufacturers use predictive microarchitectural structures, which are very complex and can make it difficult to connect multiple cores together. As mentioned above, at some point we will hit a plateau and new techniques will need to be implemented.
Patents and Licensing
Patents and licensing can provide significant barriers to utilizing certain technologies. This can often result in a market controlled by a very small group of companies, typically the ones with the cash flow to pay for or develop these technologies. This often limits innovation as big companies get complacent and small companies can’t afford to compete.
One significant example is the x86_64 architecture used by most high-performance desktop and server computing. The current implementation of x86_64 is AMD64, a version developed by AMD, which currently owns patents on the design. Because AMD64 is so widely used, thanks to cross-licensing between Intel and AMD, developing a high-performance processor that will succeed in today’s market requires compatibility with AMD64 and significant royalties to use AMD’s patents. Intel and AMD also have no interest in letting new companies tread upon their market leadership. That, coupled with the cost of producing silicon, has left the market with only two companies that develop performance CPUs. With AMD heavily overshadowed by Intel for the past decade up until recently, the lack of innovation by Intel shows how easily progress can be stifled by such a situation.
In the embedded/mobile space, something similar can be said about ARM Holdings, the company that develops and licenses the ARM RISC architecture used in smartphones and other embedded systems. Because software is developed to run on specific architectures, the market share of ARM-based devices makes it nearly impossible to escape ARM’s influence. Even under ARM’s control, patents and licensing often limit the use of chips built upon ARM’s architecture. Certain companies like Qualcomm (an ARM licensee) have a huge influence on the development of technology used in the embedded space. Qualcomm has often used anti-competitive licensing practices such as exclusivity arrangements to expand usage of its CPUs, modems, and other silicon. In 2019, Intel decided to exit the mobile 5G market due to an exclusivity agreement between Apple and Qualcomm, removing Qualcomm’s main 5G competitor. These kinds of practices severely limit the development and availability of new technologies and can heavily impact anyone who wants to use the technology but isn’t a big corporation.
What are some useful resources to learn more about Computer Architecture?
A preview version of Designing Embedded Hardware, 2nd Edition by John Catsoulis, where the first chapter introduces computer architecture through segmented topics:
Common and Popular Computer Architectures
- CISC (Complex Instruction Set Computing)
- RISC (Reduced Instruction Set Computing)
- MIPS (Microprocessor without Interlocked Pipelined Stages)
- ARM (Advanced RISC Machine)
RISC-V, often pronounced “risk-five”, is an open standard instruction set architecture (ISA). That may not sound very special or exciting, but by the time you finish reading this section I think I’ll have changed your mind.
RISC-V was created to solve the problem of there being no royalty-free, open source CPU designs. Its creators wanted a practical ISA that could be used both academically and in real applications without paying royalties to anyone. Because it was going to be used academically, building on reduced instruction set computer (RISC) principles was an easy choice, as a RISC instruction set is simpler to learn and implement than a CISC one.
Why build your own CPU?
If you haven’t heard of Intel Management Engine (ME), then you are in for a surprise. You’d think that when you buy a computer and install your preferred operating system on it that makes you the boss, the administrator. Windows hides a bunch of settings behind walls designed to keep people from breaking their computers, but if you dig hard enough you can usually change them. You are still the boss.
Since 2008, however, Intel has incorporated an autonomous subsystem into virtually all of its chips, and that includes your Intel CPU. Before you ask about AMD: they do it too! Both chip manufacturers include code that has more authority than the root user and can read and write arbitrary memory, and they don’t like telling anyone what they are up to. In other words, if Intel or AMD ever needed to, they could bust into your computer without your permission. Or at least this is what is speculated.
There are several open source RISC-V CPU designs out there, and several low cost purchasable chips implementing these designs. If you are unsettled by the presence of Intel/AMD secret code being executed in your CPU, consider purchasing a RISC-V CPU and contributing to the RISC-V embedded community.
How do Quantum Computers differ from traditional silicon-based chips?
Silicon chips rely on a traditional transistor-based design where data is processed in binary format. These binary bits are represented in terms of voltages and are then manipulated on each instruction clock cycle. Quantum Computers, on the other hand, use something called qubits. A qubit can hold two different states at the same time. This is jarring for most people to understand, but there is the cat-in-a-box analogy that works very well. Here is a video I really enjoyed:
In essence, quantum particles seem to settle into a single state only at the moment they are observed. One of the biggest advantages of Quantum Computers is their ability to solve computationally complex problems: the sort of problems that get exponentially harder as they grow, such as those behind encryption. Google recently announced that their Quantum Computer reached quantum supremacy. This means that they achieved something on their Quantum Computer that traditional computers today cannot do. Here is the Blog by Google scientists:
Does this mean that in a few years we will have Quantum Computers in our homes? The answer is no, unless something radically changes. Quantum Computers need an extremely stable environment to function. That means temperatures close to 0 Kelvin, very sophisticated stabilizers, and other forms of insulation. This is because any vibration causes interference that will destroy or distort the results from a Quantum Computer. Also, Quantum Computers are actually not very fast when it comes to mundane day-to-day computing tasks. So our traditional computers are not going anywhere.
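To make the idea of a qubit holding two states at once a little more concrete, here is a minimal Python sketch that simulates a single qubit on an ordinary computer. It represents the qubit as a pair of amplitudes, applies the standard Hadamard gate to put it into an equal superposition, and then "observes" it. The function names are made up for illustration, and this is a toy model only: real quantum hardware is not programmed this way, and simulating many qubits like this quickly becomes impractical.

```python
import math
import random


def hadamard(state):
    """Apply the Hadamard gate to a single-qubit state (amp_of_0, amp_of_1)."""
    a, b = state
    s = 1 / math.sqrt(2)
    return (s * (a + b), s * (a - b))


def probabilities(state):
    """Probability of observing 0 and 1: the squared magnitude of each amplitude."""
    a, b = state
    return (abs(a) ** 2, abs(b) ** 2)


def measure(state, rng):
    """Observe the qubit: returns (outcome, collapsed_state).

    After measurement the superposition is gone and the qubit is
    definitely 0 or definitely 1.
    """
    p0, _ = probabilities(state)
    outcome = 0 if rng.random() < p0 else 1
    collapsed = (1.0, 0.0) if outcome == 0 else (0.0, 1.0)
    return outcome, collapsed


if __name__ == "__main__":
    zero = (1.0, 0.0)                      # a qubit that is definitely 0
    superposed = hadamard(zero)            # now "both" 0 and 1
    print(probabilities(superposed))       # roughly 50% / 50%
    print(measure(superposed, random.Random()))
```

Before the measurement the qubit has a roughly 50% chance of reading as 0 and 50% as 1. After it, the state is just an ordinary bit again, which is the "only one state at the moment they are observed" behaviour described above.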
Learning VLSI Design with Open-Source Software
The typical workflow for Very Large Scale Integration (VLSI) design involves expensive programs created by Cadence and Mentor Graphics. While these are the most widely used programs in industry, they are also complex and expensive, which is possible in a monopolized market with no competition to drive down cost or improve efficiency and maintenance. Virtually anyone going into the silicon industry for design needs to know Cadence’s Virtuoso for circuit design and layout, and Mentor Graphics’ Calibre and ModelSim for integrated circuit (IC) verification and functional testing. These products have extremely high prices, and with the actual price kept behind “contact us for a quote” it’s hard to find out just how much the software costs. According to an article posted all the way back in 2003 by eetimes.com, the cost of Virtuoso at the time for a 1-year license started at “$140,000 for Virtuoso Multi-mode Simulation”. Besides being a high price for most companies, this creates another barrier: a high entry cost for students and others getting started in the VLSI design field.
A possible solution to the cost of getting started and practicing VLSI design is to use freely-available open-source software like Electric from Static Free Software. There are many benefits to using open-source software and hardware, as can be shown by the Arduino and Raspberry Pi platforms for getting into microcontroller/microprocessor systems and electronics. Some of the main benefits are increasing the size and diversity of the community through easier access to learning tools, increased collaboration between disciplines, and increased overall knowledge of subject areas.
Open-source products can also have disadvantages when compared to their closed-source counterparts. Closed-source or proprietary products can come with technical support and version updates that can help increase efficiency and remove bugs. However, this places a lot of trust with the companies that make the products and support can sometimes be unreliable.
With the current process limitations in transistor sizing, it is important for more people to be involved in the VLSI design industry. Using software like Electric can help to spark more interest in the area with an easy way to access design tools and methodologies that can be carried over to industry-standard software. Having a larger and more diverse community with access to design tools can help the industry mitigate process limitations and eventually push past them with new innovations.
x86 Architecture Overview
The IA-32 is the instruction set architecture (ISA) of Intel’s most successful line of 32-bit processors, and the Intel 64 ISA is its extension into 64-bit processors.
Here’s a link to an article that goes over x86 architecture briefly, if you’re interested in learning more.
Computer Architecture at OSU
There are many ways to learn about computer architecture at OSU, with the most notable being ECE 375. This is a class that is required for ECE majors and is often the first experience students get with computer architecture. The class talks about the fetch and execute cycle and how the CPU uses its memory registers to store and move data where it needs to go. The class also teaches assembly language, which can also further help students understand computer architecture since it is a low-level language.
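For readers who have not met the fetch and execute cycle yet, here is a minimal Python sketch of the idea. It models a toy CPU with four 8-bit registers and a five-instruction set (`LDI`, `ADD`, `DEC`, `BRNE`, `HALT`). The instruction set and the `run` function are invented for this example and are not taken from ECE 375 or from any real processor.

```python
def run(program, max_steps=1000):
    """Run a toy program and return the final register values.

    Each instruction is a tuple (opcode, operand_a, operand_b).
    The instruction set is hypothetical and exists only for this sketch.
    """
    regs = [0, 0, 0, 0]   # four 8-bit general-purpose registers
    pc = 0                # program counter: index of the next instruction

    for _ in range(max_steps):
        op, a, b = program[pc]   # fetch the instruction the PC points at
        pc += 1                  # advance the PC before executing

        # decode and execute
        if op == "LDI":          # load immediate: regs[a] = b
            regs[a] = b & 0xFF
        elif op == "ADD":        # regs[a] = regs[a] + regs[b], wrapping at 8 bits
            regs[a] = (regs[a] + regs[b]) & 0xFF
        elif op == "DEC":        # regs[a] = regs[a] - 1 (operand b unused)
            regs[a] = (regs[a] - 1) & 0xFF
        elif op == "BRNE":       # branch to instruction b if regs[a] is not zero
            if regs[a] != 0:
                pc = b
        elif op == "HALT":
            return regs
        else:
            raise ValueError(f"unknown opcode: {op}")

    raise RuntimeError("program did not halt")


# Multiply 5 by 3 using repeated addition.
MULTIPLY = [
    ("LDI", 0, 0),    # r0 = 0  (running total)
    ("LDI", 1, 5),    # r1 = 5  (value to add)
    ("LDI", 2, 3),    # r2 = 3  (loop counter)
    ("ADD", 0, 1),    # r0 += r1
    ("DEC", 2, 0),    # r2 -= 1
    ("BRNE", 2, 3),   # if r2 != 0, jump back to the ADD
    ("HALT", 0, 0),
]
```

Calling `run(MULTIPLY)` returns `[15, 5, 0, 0]`. The loop in `run` is the whole cycle in miniature: fetch the instruction at the program counter, advance the counter, then decode and execute, with registers holding the data in between.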
There is also ECE 472, which is titled “Computer Architecture”. This class likely builds off of the knowledge of ECE 375, since it is a prerequisite, and would be good for a person wanting to learn more about computer architecture.
The professor who is teaching ECE 472 spring term of 2020 is Dr. Lizhong Chen, and he is doing some significant research into implementing machine learning algorithms towards the optimization of GPU architecture design. While this is not necessarily directly related to embedded systems, the overall thrust of that research could show up in embedded design in the very near future. You can find some reading on that research on Dr. Chen’s website, here.
One of the greatest tools for embedded systems is the FPGA, although the FPGA does little on its own. External peripherals are a great way to increase the utility of the FPGA and make implementation easier for users. Companies often sell boards that pair their FPGA with peripherals such as flash memory, PLLs, and more. The IDE for the VLSI flow or hardware description language of choice will include functional code that maps the designed hardware onto the existing peripheral instead of building it in the fabric. In the case of flash memory, this saves an amazing amount of space, as flash is very expensive to implement in fabric.
Over the years, the complexity of these peripherals has greatly increased. This is due in part to the power of the FPGA and the number of usable pins, as well as more descriptive and operational high-speed communication. Protocols like PCIe and SATA, alongside SerDes, have made it easy to connect an FPGA to peripherals with a high throughput of information.
Peripherals come in all shapes and forms, but the ability to increase the capabilities of an FPGA by connecting new components is ideal for embedded processing. An FTDI chip is a fantastic example of how an FPGA can be connected to the rest of the system. FTDI chips allow the FPGA to communicate off board, or off the smaller embedded system, to much larger systems. This could be direct serial communication with a computer, communication through any desired protocol with another part of the embedded system, or simply conveying bits to a smaller chip. Systems like this are extremely valuable.
Most of my research came from “A New Standard for FPGA Peripherals”.
Useful Books to learn more about Computer Architecture
- Structured Computer Organization – Andrew S. Tanenbaum
- Code: The Hidden Language of Computer Hardware and Software – Charles Petzold
- Computer Architecture: A Quantitative Approach – John L. Hennessy
- Computer Organization and Design: The Hardware/Software Interface – David A. Patterson
- Digital Design and Computer Architecture – David Harris
- Learning Computer Architecture with Raspberry Pi – Eben Up