Over the past few years, companies have released ASICs that support certain machine learning tasks.
These chips connect to the rest of a PCB design through standard digital interfaces.
Designers will need to make some smart design choices to support these ASICs.
Machine learning and artificial intelligence (AI) are buzzwords that are commonly misunderstood, even by technology professionals. The software world gets most of the attention when it comes to AI and machine learning development, but there is another important aspect of deploying machine learning models in the field that often gets overlooked. At some point, end devices can’t continue to rely on the cloud for even the most basic machine learning inference tasks; these tasks need to be instantiated on the device itself or performed at the edge of the network.
Edge servers are one option for offloading machine learning tasks from end devices, but these units are basically trimmed-down, highly specific data centers that perform a single set of tasks for client devices. To reduce network traffic and overhead even further, the other option is to deploy on the end device with an application-specific integrated circuit (ASIC). The industry recognized this need early and has begun responding to demand with specialized ASICs that provide machine learning capabilities.
Designers who want to add machine learning to their system through an ASIC should properly plan their board layout and stackup to support these capabilities. In this article, we’ll look at some of the machine learning capabilities available in ASICs, as well as some practical steps designers can take to build boards that support them.
Application-Specific Integrated Circuits With Machine Learning Capabilities
All ASICs implement highly specialized digital logic, which may be programmable through external configuration pins and/or an external digital bus that interfaces with a system controller (MCU, FPGA, another ASIC, etc.). Application-specific integrated circuits that implement machine learning and AI are specialized for certain types of inference tasks and/or neural network structures, meaning their logic is dedicated to AI computation rather than general-purpose processing.
The current class of ASICs available today focuses on performing neural network inference computation directly on the device. These chips implement their own processor core and logic blocks where these computations are performed quickly and efficiently, whereas the same tasks may be power- and time-inefficient on a typical MCU built from general-purpose combinational and sequential logic. These chips are also sometimes referred to as AI accelerators, as they execute machine learning algorithms much faster and with less power consumption than the host processor.
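To see why dedicated logic helps, it’s worth remembering what the workload actually is: a neural network layer reduces to a very large number of multiply-accumulate (MAC) operations, usually on quantized integers. The Python sketch below is a hypothetical illustration of that core operation only, not any vendor’s actual kernel; an accelerator performs thousands of these MACs in parallel in hardware.

```python
# Hypothetical illustration of the multiply-accumulate (MAC) workload an
# AI accelerator offloads: an int8-quantized dot product accumulated in a
# wider register, the core operation behind inference layers.

def int8_dot(activations, weights):
    """Accumulate int8 products in a wide accumulator, as an accelerator would."""
    acc = 0
    for a, w in zip(activations, weights):
        assert -128 <= a <= 127 and -128 <= w <= 127, "int8 range"
        acc += a * w  # one MAC; an ASIC performs many of these in parallel
    return acc

# Example: a single neuron's pre-activation value
acts = [12, -7, 100, 3]
wts = [2, 5, -1, 40]
print(int8_dot(acts, wts))  # 24 - 35 - 100 + 120 = 9
```

A software loop like this pays per-instruction fetch/decode overhead on every MAC; a hardware MAC array does not, which is where the speed and power advantage comes from.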
A typical system-level block diagram for a digital design with a machine learning-capable ASIC is shown below.
Block diagram with an AI/ML-capable ASIC.
In this block diagram, we have a host controller running an embedded application, or possibly an embedded operating system, directly on the device. This host controller sends collected data to the ASIC for inference tasks. Note that the current class of ASICs is not suited to on-device training due to the extreme computational demands and the amount of data required; training is best performed at the edge or in the cloud, after which the device’s application can be updated with a new model.
Some elements are not shown in the diagram. First, there will be a power management system implemented in hardware and software. Specific interfaces on the machine learning ASIC are also not shown, but these could be low-speed for configuration (SPI, I2C, etc.) and high-speed for sending and receiving data streams (usually PCIe).
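As a concrete illustration of the low-speed configuration path, the host typically writes the accelerator’s control registers over SPI or I2C before streaming data. The sketch below builds such a configuration frame in Python; the opcode, register width, and byte order are assumptions chosen for illustration, not any real device’s register map.

```python
# Hypothetical sketch of building a register-write frame for an ML ASIC's
# low-speed SPI configuration interface. The opcode (WRITE_CMD), 16-bit
# register address, 32-bit value, and big-endian byte order are all
# assumptions for illustration -- consult the device datasheet in practice.

WRITE_CMD = 0x02  # hypothetical "register write" opcode

def build_config_frame(reg_addr: int, value: int) -> bytes:
    """Pack a write opcode, 16-bit register address, and 32-bit value."""
    if not (0 <= reg_addr < 1 << 16):
        raise ValueError("register address out of range")
    if not (0 <= value < 1 << 32):
        raise ValueError("value out of range")
    return bytes([WRITE_CMD]) + reg_addr.to_bytes(2, "big") + value.to_bytes(4, "big")

# Example: enable a hypothetical inference engine via control register 0x0010
frame = build_config_frame(0x0010, 0x0000_0001)
print(frame.hex())  # 02001000000001
```

In firmware, a frame like this would be handed to the SPI peripheral driver; the high-speed PCIe link then carries the actual activation and result data streams.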
Why Use an ASIC With Machine Learning?
There are several reasons why an ASIC with a machine learning core would be used in a digital system. These include some of the points in the following list:
Reduces application complexity, as the logic required to implement inference tasks in a neural network exists in the ASIC rather than in the firmware or software.
Reduces processor demands, which allows the host controller to allocate resources to other tasks in the system rather than spending its compute on inference tasks. This enables the designer to select a smaller processor.
Ensures uptime; many of today’s systems that use machine learning don’t perform any inference tasks on the device. Instead, they send data to the cloud and receive the result for local processing. If you want a device to survive without a network connection, then you must perform inference tasks on the device.
Reconfigurability; the neural network instantiated on the ASIC is part of the device’s configuration and can be changed as needed. Updates to the network could be provided by a web service in the cloud, an edge data center that trains models, or another device.
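The uptime point above can be sketched as host-controller routing logic: prefer the on-board accelerator so inference keeps working when the network is down. In this Python sketch, `run_on_asic` and `run_in_cloud` are hypothetical stand-ins for a real accelerator driver call and a cloud endpoint; they are assumptions for illustration only.

```python
# Hypothetical host-side routing logic illustrating the uptime benefit:
# inference runs locally on the ML ASIC when one is present, and only
# falls back to the cloud when a network link is available.

def classify(sample, network_up, run_on_asic=None, run_in_cloud=None):
    """Route an inference request so the device survives without a network."""
    if run_on_asic is not None:
        return run_on_asic(sample)   # local inference via the ML ASIC
    if network_up and run_in_cloud is not None:
        return run_in_cloud(sample)  # fall back to a cloud endpoint
    raise RuntimeError("no accelerator and no network: inference unavailable")

# Example with stub backends standing in for real driver/network calls
asic = lambda s: ("asic", s)
cloud = lambda s: ("cloud", s)
print(classify([1, 2], network_up=False, run_on_asic=asic, run_in_cloud=cloud))
# ('asic', [1, 2])
```

A cloud-only device hits the `RuntimeError` branch the moment connectivity drops, which is exactly the failure mode on-device inference avoids.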
Designing PCBs to Support ASICs With Machine Learning
These devices operate with a single high-speed digital interface (normally PCIe) and should be designed as such. Make sure you follow the standard set of high-speed PCB layout and routing guidelines to ensure signal integrity. Power integrity is also important: as system size and I/O counts scale, these systems draw more power during high-speed signaling, and excessive noise can radiate from the board unless the PDN and stackup are designed properly. Make sure to use the best design software available to build these advanced systems and scale them into volume production.
If you are designing a custom PCB to support an application-specific integrated circuit with machine learning, you will need the right set of PCB layout and design software. Allegro PCB Designer and Cadence’s full suite of design tools can help.
Leading electronics providers rely on Cadence products to optimize power, space, and energy needs for a wide variety of market applications. If you’re looking to learn more about our innovative solutions, talk to our team of experts or subscribe to our YouTube channel.