Funny how, the older you get, the more things seem to go in circles. You start out thinking you are heading in a new direction, and then suddenly find yourself back where you started. I don’t know about you, but just rereading the previous sentence reminds me of the lyrics to Time by Pink Floyd – the part that goes, “And you run, and you run to catch up with the sun but it’s sinking, racing around to come up behind you again. The sun is the same in a relative way, but you’re older, shorter of breath, and one day closer to death.” Wow! Sounds a little depressing, doesn’t it? But I really do feel quite optimistic – honest!
As I have mentioned several times, my first job after graduating from college in 1980 was as a member of a central processing unit (CPU) design team for mainframe computers. A little over a year after I started, two of my managers left to start their own business called Cirrus Designs, and it didn’t take long for them to ask a few of the guys from the old team, including yours truly, to join them.
Although Cirrus Designs started out as a small business (I was the sixth person on the payroll), we covered a lot of ground in the nine years I was with them, after which I received an offer I couldn’t refuse from another company, which culminated in my move to the United States in 1990. We started out writing test programs for printed circuit boards (PCBs) designed by other companies. After a few years, we also started to develop our own RISC-based hardware accelerator. The idea here was for a “box” that sat next to your main Unix computer. The Unix machine remained in charge of the operating system, file management, and the like, but it handed the execution of applications over to the accelerator. Since you could see a 2X to 10X speedup depending on the task, that was a pretty big deal, let me tell you.
Another project that we worked on with our sister company, Cirrus Computers (which was famous for developing the HiLo and System HiLo digital logic simulators), was a hardware emulator whose name I have forgotten. In this case, the system accepted a netlist of your design – which could be anything from an application-specific integrated circuit (ASIC) to a PCB – and created an internal representation that was functionally equivalent to the final product. Using the same stimulus (waveforms) as the System HiLo logic simulator, you could use the emulator to debug the functionality of your design before committing to a real-world implementation. While there were already huge emulators (in terms of both size and cost) on the market, ours was intended to be a much more affordable desktop system.
I have to admit that I drifted away from emulation in the 1990s, but I remember meeting Lauro Rizzatti at a Design Automation Conference (DAC) in the early 2000s. At that time, Lauro was the managing director for the United States and vice president of global marketing at a French company called EVE, which was founded in 2000. When I met Lauro, EVE had recently launched its flagship hardware acceleration and emulation product, the ZeBu (where “ZeBu” stands for “Zero Bugs”). Thanks to a lot of hard work on both the technology and marketing fronts, EVE and ZeBu enjoyed huge success, with EVE eventually being acquired by Synopsys in 2012.
By the way, the term “zebu” also refers to a species of domesticated cattle that originated in the Indian subcontinent. These little rascals are well adapted to withstand high temperatures and are farmed throughout tropical countries, both as pure zebu and as hybrids with taurine cattle. Now, I bet you think I’ve wandered off topic as usual, but …
… I just finished a videoconference with the guys and gals from Mipsology. One of the people I was speaking with was Ludovic Larzul, who was co-founder and VP of Engineering at EVE, and who is now Founder and CEO of Mipsology. As Ludovic told me, in the early days of artificial intelligence (AI) and machine learning (ML), he looked into the idea of using FPGAs to speed up the inference performed by artificial neural networks (ANNs), which led to the creation of Mipsology in 2015.
If you visit the Mipsology website, you will see them say, “We focus on acceleration, you focus on your application.” And they go on to say, “Speed up your inference, anywhere, in no time and effortlessly using Mipsology’s Zebra.”
Wait; Zebu… Zebra… do we see a pattern here?
On the one hand, we have the zebras we all know and love, i.e., African equines with distinctive black-and-white striped coats, of which there are three extant species: Grévy’s zebra (also known as the imperial zebra, these little rascals were named after the French lawyer and politician Jules Grévy), the plains zebra, and the mountain zebra. On the other hand, we have Mipsology’s Zebra, which, in their own words, “is the ideal deep learning computational engine for neural network inference. Zebra seamlessly replaces or complements CPUs/GPUs, allowing any neural network to compute faster with lower power consumption and at a lower cost. Zebra deploys quickly, transparently and painlessly, without knowledge of the underlying hardware technology, without the use of specific compilation tools, and without modification of the neural network, training, framework or application.” Well, it’s definitely a mouthful, but it pretty much covers all the bases.
The Zebra stack, which was first introduced in 2019, is an AI inference accelerator for FPGA-based PCIe accelerator cards. Suppose you create a neural network using one of the usual framework suspects (TensorFlow, PyTorch, Caffe, MXNet…), and train it using a traditional CPU/GPU setup. Transitioning from the CPU/GPU training environment to a Zebra inference deployment is a plug-and-play process that does not require any changes to the framework, neural network, or application. As the folks at Mipsology say, “Transitioning from the CPU/GPU environment to Zebra and switching between the two is possible at any time without effort.”
If you ask the folks at Mipsology, they can (and will) explain in excruciating detail how AI/ML inference using FPGA-based PCIe accelerator cards handily beats CPU/GPU solutions in terms of lower cost and higher performance, but that’s not what I wanted to talk about here. The point is, it’s all well and good to use FPGA-based PCIe accelerator cards in the data center servers that form the cloud, but large amounts of AI/ML inference tasks are migrating into embedded systems and out to the edge of the internet. For all kinds of reasons, including cost, power consumption, and physical size, it is impractical to use PCs with PCIe cards as embedded systems or edge devices, so what are we to do?
Well, fear not, because the folks at Mipsology have us covered. They recently introduced their Zebra IP to accelerate neural network computation for edge and embedded AI/ML applications. As the name suggests, Zebra IP is a macro that can be implemented in the programmable fabric of an FPGA and integrated with other functions inside the FPGA (it includes a simple interface based on Arm’s AXI that facilitates communication with on-chip processor cores and the like). Now, this is where things start to get really clever. I think we need a diagram, but I don’t have access to the one I like, so I cobbled together the following illustration from pieces of other images (with apologies to all concerned).
Zebra and Zebra IP development flows.
We start with a neural network that has been trained in the traditional way on a CPU/GPU-based system. As previously stated, this trained network can be transferred to Zebra on an FPGA-based PCIe card using a plug-and-play process. This is where we can check its performance and accuracy.
When it comes to Zebra IP, the first clever part is that it can be loaded into the FPGA(s) on the PCIe accelerator card, which makes it possible to qualify the behavior of the IP and to reproduce any potential edge-related problems. The last step is to load the Zebra IP instantiation of the neural network into the FPGA in the target edge device or embedded system.
As always, “the proof of the pudding is in the eating,” as the old saying goes, so how good is Zebra IP? Well, first of all, we have a little problem in that people tend to talk in terms of TOPS (tera operations per second), but real, sustained TOPS are not the same as the peak TOPS that are so often quoted. A more reliable measure in the case of a vision-based inference application is the actual performance measured in FPS (frames per second). On this basis, Zebra is said to deliver 5X to 10X more FPS for the same “peak” TOPS, as shown in the image below.
Inference Performance: Zebra IP vs. GPUs and Custom ASICs
(Image source: Mipsology)
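To make the peak-versus-sustained TOPS distinction a little more concrete, here is a minimal sketch of the arithmetic. Every number in it is a hypothetical placeholder chosen purely for illustration (none of them are Mipsology measurements), and the function name `frames_per_second` is my own invention. The assumption is simply that achievable FPS depends on the sustained operation rate, which is the peak rate scaled by how well the hardware is actually utilized, divided by the number of operations the network needs per frame.

```python
def frames_per_second(peak_tops: float, utilization: float, gops_per_frame: float) -> float:
    """Estimate FPS from peak TOPS, achieved utilization (0..1), and giga-ops per frame."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    if gops_per_frame <= 0:
        raise ValueError("gops_per_frame must be positive")
    # Sustained rate = peak rate scaled by the fraction of it we really achieve.
    sustained_ops_per_second = peak_tops * 1e12 * utilization
    return sustained_ops_per_second / (gops_per_frame * 1e9)


# Hypothetical example: two accelerators quoting the same 10 peak TOPS, both
# running a network that needs 8 giga-operations per frame.
PEAK_TOPS = 10.0
GOPS_PER_FRAME = 8.0

fps_low = frames_per_second(PEAK_TOPS, utilization=0.10, gops_per_frame=GOPS_PER_FRAME)
fps_high = frames_per_second(PEAK_TOPS, utilization=0.80, gops_per_frame=GOPS_PER_FRAME)

print(f"10% utilization: {fps_low:.0f} FPS")
print(f"80% utilization: {fps_high:.0f} FPS")
print(f"ratio: {fps_high / fps_low:.1f}x")
```

In this made-up case, two devices with identical peak TOPS differ in delivered FPS by the ratio of their utilizations, which is why a measured FPS figure tells you more than a headline TOPS number.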
I’m starting to feel like an old fool (but where can you find one at this time of day?). For example, I designed my first ASIC in 1980, which is over four decades ago as I pen these words. At that time, I couldn’t conceive of anything that could outperform an ASIC.
I remember when FPGAs first hit the scene in 1985 in the form of Xilinx’s XC2064. This little scamp had an 8 x 8 array of configurable logic blocks (CLBs), each containing two 3-input look-up tables (LUTs). I remember being vaguely interested, but I had no idea how these devices were going to explode in terms of capacity and performance.
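For anyone who hasn’t met a look-up table before, a 3-input LUT is essentially an 8-entry truth table: the three inputs form an address, and the bit stored at that address is the output. The following Python sketch models that idea. The function names (`make_lut3` and `lut_lookup`) are hypothetical, and this illustrates the general LUT concept rather than the actual XC2064 circuitry. It also works out the LUT total implied by the figures quoted above.

```python
from itertools import product
from typing import Callable, List


def make_lut3(func: Callable[[int, int, int], int]) -> List[int]:
    """'Program' a 3-input LUT by storing func's output for all 8 input combinations."""
    return [func(a, b, c) & 1 for a, b, c in product((0, 1), repeat=3)]


def lut_lookup(table: List[int], a: int, b: int, c: int) -> int:
    """Evaluate the LUT: the inputs form a 3-bit address into the stored table."""
    return table[(a << 2) | (b << 1) | c]


# Configure one LUT as a 3-input majority function and another as a 3-input XOR.
majority = make_lut3(lambda a, b, c: 1 if a + b + c >= 2 else 0)
xor3 = make_lut3(lambda a, b, c: a ^ b ^ c)

print(majority)
print(lut_lookup(xor3, 1, 0, 1))

# LUT count implied by the quoted figures: an 8 x 8 array of CLBs, two LUTs each.
total_luts = 8 * 8 * 2
print(total_luts)
```

Reconfiguring the logic amounts to loading different table contents, which is why the same hardware can implement any Boolean function of three inputs.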
I also remember when what we would now think of as GPUs started to appear in the 1990s (the term “GPU” was coined by Sony in reference to the Sony 32-bit GPU, designed by Toshiba, which appeared in the PlayStation video game console released in 1994).
As the name suggests, GPUs – which can be pictured (no pun intended) as an array of relatively simple processors, each with a certain amount of local memory – were designed to meet the demands of computer graphics and image processing. Over time, it was realized that this architecture could be used for a wide variety of parallel processing tasks, including mining Bitcoin and performing AI/ML inference.
I fear I’m starting to ramble. The point is, just a few years ago, if you had asked me to guess the relative performance of ASICs, FPGAs, and GPUs for tasks like AI/ML inference, I would have placed ASICs at the top of the stack, GPUs in the middle, and FPGAs at the bottom, which shows how little I know. All I can say is that if I ever get my time machine running (you just can’t find the parts here in Huntsville, Alabama, where I’m currently hanging my hat), one of the things I’m going to do – besides buying stock in Apple, Amazon, Facebook, and Google, and splashing some cash on Bitcoins when they could be had for just 10 cents apiece – is invest in companies like Mipsology. What say you?