A first step toward more agile hardware design, debugging
A newly developed suite of hardware design tools has brought developers closer to being able to fix bugs in deployed reconfigurable hardware. The tools, designed by a team at the University of Michigan, take advantage of the reconfigurability of devices like field-programmable gate arrays (FPGAs) to enable a more software-like debugging and design cycle in hardware that is traditionally unchangeable after it’s been deployed.
The paper, titled “Debugging in the Brave New World of Reconfigurable Hardware,” was authored by CSE students Jiacheng Ma, Gefei Zuo, Kevin Loughlin, and Haoyang Zhang; Andrew Quinn (an alum, now at the University of California, Santa Cruz); and Prof. Baris Kasikci.
Originally motivated by a need for fast, task-specific circuits, FPGAs can be configured and reconfigured as different circuits to suit developers’ needs. The authors of the project saw this as an opportunity to discover and correct errors in the hardware’s configuration, potentially enabling more agile development like that seen in the software world.
“In traditional hardware, like a CPU, when you find a bug it is likely to be there forever,” says Jiacheng Ma, PhD student in Computer Science and Engineering. “There’s no way to fix it after the chip has been fabricated.”
Because of this limitation, hardware design time and resources are usually devoted to an extremely rigorous process of design verification. Traditional chip specifications need to be thoroughly ironed out prior to manufacturing.
“With reconfigurable hardware, the paradigm shifts a bit,” says Kevin Loughlin, CSE PhD student. “You can move towards addressing some of these bugs after you’ve actually shipped the hardware because, much like software, you can find the bug, redo your design, and update the accelerator design that’s been synthesized into the FPGA.”
The suite of tools designed by the team makes this process of hardware debugging easier for developers. Currently, FPGAs ship with rudimentary libraries that help record the behavior of circuits during execution. But these are labor-intensive and difficult to use, says Ma, and nothing like the robust debugging tools available for software.
“What we’re trying to do is design things similar to what software developers have available to help localize bugs on FPGAs,” Ma says. “You tell the tools what you want to do and they will take care of several difficult things for you, like automatically generating code.”
The team first collected and classified a number of hardware bugs from open source repositories in order to assemble a testbed of reproducible bugs. They found that the types of bugs that tend to arise in hardware designs are closely analogous to those found in software. Their paper describes three high-level categories (data mis-access bugs, communication bugs, and semantic bugs) and a number of subcategories that bear intuitive similarities to their software counterparts.
One example given by Ma is the familiar buffer overflow, a bug common to both hardware and software in which a data structure is accessed with an out-of-bounds index. The similarities tend to end, however, at the specific effects of the bug.
“A lot of the time the error in the human intuition is the same between the software and hardware,” says Loughlin, “but the way that the bug manifests leads to a different set of symptoms, and the way it’s detected requires a different set of tools than software.”
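The contrast Loughlin describes can be illustrated with a toy model. The sketch below is not taken from the paper: the function names and the 3-bit address width are assumptions chosen for illustration. It makes the same out-of-bounds mistake twice, once against a software list that rejects the access and once against a simplified model of a hardware memory in which only the low address bits reach the buffer, so the write silently lands on a valid entry.

```python
def software_write(buf, index, value):
    """Software-style access: an index past the end raises IndexError."""
    buf[index] = value


def hardware_write(buf, index, value, addr_bits=3):
    """Toy hardware-style access (hypothetical model).

    Only the low `addr_bits` bits of the index are wired to the memory,
    so an out-of-bounds index wraps around instead of raising an error.
    Returns the entry that was actually written.
    """
    wrapped = index & ((1 << addr_bits) - 1)
    buf[wrapped] = value
    return wrapped
```

With an eight-entry buffer, `software_write(buf, 8, 1)` fails at the faulty line, while `hardware_write(buf, 9, 42)` quietly overwrites entry 1. Nothing flags the second write when it happens, which is one way the same mistake can produce different symptoms and call for different tools.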
The team considers their initial set of debugging tools a first step in this field, covering some simple monitoring and analytical gaps. For example, one component can interpret statements similar to a programming language’s print statement, allowing the developer to log intermediate outputs about variables on a running circuit.
The other components cover several further tasks. One monitors a variable, reporting the values of the registers it depends on and any updates made to it during execution. Others gather statistics about events that occur during execution, detect and track data loss, and track state machines, a common component of hardware design.
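As a rough software analogy for this kind of monitoring, the sketch below replays a cycle-by-cycle trace of signal values, logs every update to the watched signals, and counts how often each one changes. It is a hypothetical illustration only: the `SignalMonitor` class, its methods, and the trace format are assumptions made for this example and are not the interface of the team's tools, which operate on the circuit itself.

```python
from collections import Counter


class SignalMonitor:
    """Toy monitor (hypothetical): logs updates to watched signals."""

    def __init__(self, watched):
        self.watched = sorted(watched)
        self.previous = {}              # last value seen for each signal
        self.log = []                   # (cycle, name, old, new) records
        self.update_counts = Counter()  # number of updates per signal

    def sample(self, cycle, signals):
        """Record every watched signal whose value changed this cycle."""
        for name in self.watched:
            if name not in signals:
                continue
            new = signals[name]
            # The first observation is logged with an old value of None.
            if name not in self.previous or self.previous[name] != new:
                self.log.append((cycle, name, self.previous.get(name), new))
                self.update_counts[name] += 1
                self.previous[name] = new


# Replay a short trace of a three-state machine.
trace = [
    {"state": "IDLE"},
    {"state": "IDLE"},
    {"state": "BUSY"},
    {"state": "BUSY"},
    {"state": "DONE"},
]
monitor = SignalMonitor(["state"])
for cycle, signals in enumerate(trace):
    monitor.sample(cycle, signals)
```

After the replay, `monitor.log` holds one record per state transition and `monitor.update_counts` holds the per-signal totals, a small-scale version of the print-style logging, state machine tracking, and event statistics described above.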
The authors hope that these intuitive tools will enable hardware developers to approach their work with reconfigurable devices more like software, with the expectation that the toolset will grow more complex from here.
“In the future people will definitely build more hardware debugging and localization tools,” says Ma. “These future tools will use more sophisticated logic to help with various specific bugs.”
“If these tools develop in robustness and in their ability to identify, isolate, localize, and correct bugs after hardware has been shipped,” says Loughlin, “then it would be interesting to see if there would be a paradigm shift toward a willingness to ship reconfigurable hardware without total verification in order to save up-front and possibly lifetime costs.”
This project was presented at the 2022 International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’22).