FPGA: The super chip in the age of artificial intelligence

Artificial intelligence (AI) is developing rapidly, and the Field Programmable Gate Array (FPGA) has long been a focus of research as a platform for driving it. This paper studies the theoretical basis, applications, deficiencies, and future development directions of FPGAs. It describes the three defining characteristics of an FPGA (it is a gate array, it is programmable, and it can be configured in the field), its positioning among digital chips, and the structure, principles, tools, process, and description languages of FPGA design. It then presents the unique advantages of FPGAs in the field of artificial intelligence: flexibility and configurability, special optimizations for convolutional neural networks, and deterministic low latency. Finally, it covers several typical applications of FPGAs in artificial intelligence, their deficiencies and possible solutions, and two future development directions. This article is intended to contribute to the future development of FPGAs in the field of artificial intelligence.


Introduction
With the development of science and technology, artificial intelligence is maturing rapidly. The FPGA, as an artificial intelligence chip, has received growing attention. It has become an essential part of this field and is widely used in industry and daily life. This article gives a detailed description of the FPGA and of research on its application in the field of artificial intelligence. First, it introduces the theoretical basis of the FPGA, including its basic features, positioning, and history, followed by its structure, design principles, design tools, design process, and hardware description languages. Second, it presents the unique advantages of the FPGA and its applications in artificial intelligence. Third, it discusses the deficiencies of FPGAs and the critical directions for future development. The research in this paper is intended to benefit future research on FPGAs and their development and application in the field of artificial intelligence.

The basic introduction of FPGA
The Field Programmable Gate Array (FPGA), essentially a type of semiconductor chip, is the result of further development of programmable devices such as Programmable Array Logic (PAL) and Generic Array Logic (GAL). An FPGA has three significant features: it is a gate array, it is programmable, and it can be configured in the field. It is therefore a chip with powerful functions and high flexibility.
For most chips, the logic function is fixed once the final product is manufactured. If the logic function needs to change, or a problem is found in the chip itself, redesigning the chip consumes a great deal of money, labor, and time. To solve this problem, engineers designed a general-purpose chip whose basic logic gates can be rearranged and combined to realize different chip functions [1]. This is the original idea behind the FPGA. Digital chips can be broadly divided into standard devices and custom chips. Standard devices fall into three categories: Standard Logic Devices, Application Specific Standard Parts (ASSP), and Programmable Logic Devices (PLD). PLDs are logic devices that can be programmed to implement different logic circuits. The FPGA is a type of PLD with a higher degree of design freedom than earlier PLDs. Its invention therefore removes both the limitations of custom circuits and the limited gate count of the original programmable devices.
According to Moore's Law, put forward by Gordon Moore, one of Intel's founders, the number of transistors that can fit into an integrated circuit doubles roughly every two years. The development of FPGAs has followed this law. Xilinx, founded in 1984, was the first company to commercialize the FPGA, launching the first commercial device, the XC2064. The chip has 64 programmable logic cells, each with two 3-input look-up tables (LUTs), and a total of about 800 logic gates. At the end of the 20th century, the semiconductor industry developed rapidly, and increasing FPGA capacity became the first priority. As FPGAs grew more powerful, their structure became increasingly complicated, so tools optimized for FPGA design appeared. FPGA manufacturers released their own development tools, such as Xilinx's ISE and Vivado and Intel's Quartus. Today, programmable systems-on-chip with complex functions lead the market: reusable IP cores are built in for some essential functions, and many dedicated logic units are placed in new FPGAs. In the era of artificial intelligence, dedicated IP cores and logic units for artificial intelligence are being added to modern FPGAs, making the FPGA a super accelerator.

The design of FPGA
An FPGA consists mainly of six parts: (1) programmable input/output units (IO), (2) programmable logic units (CLB), (3) routing resources, (4) embedded block RAM, (5) clock management resources, and (6) embedded dedicated hardware modules and low-level functional units. The most important of these are the programmable input/output units, the programmable logic units, and the routing resources. The logic of the FPGA is implemented in logic blocks built around a look-up table (LUT), which is usually combined with other components, such as flip-flops, to form an independent logic block. Most of the functions of the FPGA are realized through these logic blocks. At present, an FPGA used for customized functions in the field of artificial intelligence has 200,000 or even millions of logic blocks. The FPGA has a built-in, fixed hardware loading interface, and it enters the configuration process in about 0.2 seconds. Because the look-up tables and the routing matrix are volatile, the configuration information is generally stored in EEPROM or flash memory outside the FPGA [2]. The internal structure of an FPGA is shown in figure 1.
The placement and routing of the many logic blocks is done by design software, so FPGA development tools include both software and hardware. On the software side, FPGA manufacturers and EDA software companies provide EDA tools for each stage of FPGA design, such as Xilinx's ISE and EDK and Altera's Quartus II and SOPC Builder. On the hardware side, the main tools are FPGA development boards made by FPGA manufacturers or third parties, such as the Elbert2 from Numato Lab and the Xilinx-based Papilio, together with download cables and board-level debugging instruments such as oscilloscopes and logic analyzers.
Making full use of the characteristics of the various tools, and combining several EDA tools in one flow, is the key to a good FPGA design. The most typical approach is to write register transfer level (RTL) code in a hardware description language such as Verilog HDL or VHDL to describe the circuit. The RTL description then goes through logic synthesis, technology mapping, packing, placement, and routing, and finally a bitstream is generated to realize the target circuit in the FPGA [3].
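As an illustration of how a look-up table implements logic, the following is a minimal Python sketch rather than real FPGA tooling. It models a LUT as a stored truth table addressed by its input bits. The names `make_lut`, `lut_eval`, and `full_adder` are assumptions introduced here for illustration and do not come from any vendor tool. "Configuring" the LUT simply means filling the table with different bits, which is why the same hardware block can realize different functions.

```python
from itertools import product


def make_lut(func, n_inputs):
    """Precompute the truth table of func as a list of 2**n_inputs bits.

    Entry k holds the output for the input combination whose bits,
    read most significant first, spell the binary number k.
    """
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]


def lut_eval(table, *inputs):
    """Evaluate a LUT: the input bits form an address into the table."""
    address = 0
    for bit in inputs:
        address = (address << 1) | (bit & 1)
    return table[address]


# Two 3-input LUTs with different configuration bits (illustrative choice).
sum_lut = make_lut(lambda a, b, c: a ^ b ^ c, 3)                      # parity
carry_lut = make_lut(lambda a, b, c: (a & b) | (a & c) | (b & c), 3)  # majority


def full_adder(a, b, carry_in):
    """One-bit full adder built only from the two LUTs above."""
    return lut_eval(sum_lut, a, b, carry_in), lut_eval(carry_lut, a, b, carry_in)


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        s, carry = full_adder(a, b, c)
        print(a, b, c, "-> sum", s, "carry", carry)
```

The two tables have the same size and are read in the same way; only their contents differ. In a real device those contents are part of the bitstream loaded at configuration time.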

The unique advantages of FPGA in artificial intelligence
Flexible and configurable.
In practical applications of deep neural networks with very high real-time requirements, how quickly an image can be processed is the critical factor affecting delay. Because an FPGA is reconfigurable, its circuit structure can be changed to fit the workload, so that the hardware is used to its full performance. For example, weight data is essential to neural network computation. The weight data within a layer may contain repeated values, so sharing weights to reduce the amount of data stored is an effective way to speed up an FPGA. An FPGA can also increase batch processing capability, reduce bandwidth without affecting accuracy, and use its parallelism to process multiple data items simultaneously. Similar methods include lightweight and compact networks.
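The storage saving from weight sharing can be made concrete with a small Python sketch. It assumes one common reading of weight sharing, in which each weight is replaced by an index into a small shared codebook of values; the uniform codebook, the function names, and the 32-bit word size are all assumptions chosen for illustration, not a method taken from the cited work.

```python
def share_weights(weights, n_levels):
    """Map each weight to the nearest entry of a uniform codebook.

    Returns (codebook, indices). Assumes n_levels >= 2.
    """
    lo, hi = min(weights), max(weights)
    if hi == lo:                      # all weights identical: one shared value
        return [lo], [0] * len(weights)
    step = (hi - lo) / (n_levels - 1)
    codebook = [lo + i * step for i in range(n_levels)]
    indices = [round((w - lo) / step) for w in weights]
    return codebook, indices


def reconstruct(codebook, indices):
    """Recover approximate weights from the shared codebook."""
    return [codebook[i] for i in indices]


def storage_bits(n_weights, n_levels, word_bits=32):
    """Bits needed for the index array plus the codebook itself."""
    index_bits = max(1, (n_levels - 1).bit_length())
    return n_weights * index_bits + n_levels * word_bits


if __name__ == "__main__":
    weights = [-1.0, -0.4, 0.0, 0.3, 1.0]
    codebook, indices = share_weights(weights, n_levels=5)
    print(codebook, indices, reconstruct(codebook, indices))
    print("dense bits: ", 1024 * 32)
    print("shared bits:", storage_bits(1024, 16))
```

With these assumed numbers, 1024 weights stored as 4-bit indices into a 16-entry codebook need far fewer bits than 1024 full 32-bit words, at the cost of a bounded rounding error per weight.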

Special optimizations for convolutional neural networks.
Arithmetic operations, many of them floating-point, make up the bulk of the work in a convolutional neural network. By tailoring the FPGA's computing units, bandwidth, and local memory to the storage and access patterns of the network, the performance of the FPGA can be improved at several levels and dimensions, so that the AI computing stage runs with full efficiency and performance [4].

Deterministic low latency.
Most application scenarios of deep learning are on mobile devices or industrial edge devices, especially for real-time monitoring. The most demanding cases are drones and autonomous driving, where inference delay directly affects the response time and braking distance of a car. Because an FPGA has flexible and customizable I/O, it can provide deterministic, low-latency I/O. An FPGA can also provide deterministic system delay and exploit the parallelism of the chip to reduce computing delay, so it can meet the needs of artificial intelligence. Deep learning in these settings has strict real-time requirements: the system should respond at least as quickly as a human would. This is the most basic real-time requirement for drones and autonomous driving [5].
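The link between inference delay and braking distance is simple arithmetic: while the system is still computing, the vehicle keeps moving at its current speed. The Python sketch below works this out. The speeds and latency figures are assumed values chosen for illustration, not measurements from the cited work, and the function names are hypothetical.

```python
def distance_during_latency(speed_kmh, latency_ms):
    """Distance in metres travelled before the system can react."""
    speed_m_per_s = speed_kmh * 1000.0 / 3600.0
    return speed_m_per_s * (latency_ms / 1000.0)


def worst_case_distance(speed_kmh, latencies_ms):
    """A safety margin must be sized for the slowest observed response."""
    return distance_during_latency(speed_kmh, max(latencies_ms))


if __name__ == "__main__":
    speed = 72.0  # km/h, i.e. 20 m/s (assumed)
    jittery = [20.0, 25.0, 120.0]        # low average, occasional spike (assumed)
    deterministic = [30.0, 30.0, 30.0]   # same delay every time (assumed)
    print(worst_case_distance(speed, jittery))
    print(worst_case_distance(speed, deterministic))
```

The sketch also shows why determinism matters as much as low average delay: the margin has to cover the worst case, so a platform with a slightly higher but constant delay can need less distance than one with a lower average and occasional spikes.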

The application of FPGA
Real-time monitoring and processing.
In real-time monitoring and processing, the fields with the highest real-time requirements for deep learning are drones (unmanned aerial vehicles, UAVs) and autonomous driving.
The UAV's onboard electronic equipment and its flight control system are the two major elements in controlling a UAV. UAV controllers are realized mainly through hardware circuits and embedded software, most commonly on single-chip microcomputers and ARM processors. However, the measurement and control system has timing synchronization and command delay requirements, and some applications need a particularly low delay on the UAV's remote-control channel. The delay introduced by a traditional UAV controller struggles to meet these requirements. This calls for a UAV controller based on an FPGA design, which takes full advantage of the FPGA's synchronous design and parallel data processing. The peripheral circuit consists of level conversion and chip drivers, while functional modules such as command encoding and display and asynchronous serial command transmission are integrated into the FPGA. Such a system is simpler and more scalable, and the delay of a command data group sent over the remote-control channel is less than 80 ms. A system designed this way processes control commands in real time and can meet the real-time control requirements of various types of UAVs [6].
The autonomous vehicle is another essential application of artificial intelligence. Its basic principle is to use onboard sensors to perceive the vehicle's surroundings, obtain information about roads, vehicle positions, and obstacles through feedback, and use this information to control the steering and speed of the car. The data from the onboard sensors must therefore be computed, processed, and fed back at high speed, a task well suited to an FPGA. Automotive FPGAs provide high-throughput, real-time processing for the car's camera-based ADAS/AD systems, for example in sensor acquisition, pre-processing, and acceleration, with advantages such as high flexibility, low latency, and high performance per watt. The basic principle of the autonomous vehicle is shown schematically in figure 2.

The Brainwave project of Microsoft.
Microsoft's first large-scale use of FPGAs was the "Catapult" project, which helped Microsoft deploy tens of thousands of FPGA acceleration resources in its cloud data centers around the world. From the end of 2015, Microsoft deployed Catapult FPGA boards on its newly purchased servers, which also made Microsoft one of the largest FPGA customers in the world. With artificial intelligence developing rapidly, using the Catapult platform for FPGA acceleration of AI applications became the goal of the next plan. This is the Brainwave project, which mainly addresses the need for low latency and high bandwidth in practical AI applications while keeping the model intact, with no loss of accuracy or quality. It gives users automatic deployment and hardware acceleration without requiring hardware design, while keeping systems and models real-time and low-cost.
The specific plan of the Brainwave project has three parts: implement and optimize an NPU soft core and its instruction set on the FPGA; automatically partition the trained DNN model into sub-models according to resources and requirements, forming a complete toolchain; and map the partitioned sub-models onto the FPGA and CPU system architecture. The Brainwave project's entire acceleration process for DNNs is shown in figure 3.
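The idea of dividing a trained model according to available resources can be sketched with a generic greedy partitioner in Python. This is not Microsoft's algorithm, whose details are not given here; the per-layer cost numbers, the single resource budget, and the function name `partition_layers` are all hypothetical and serve only to show the principle of cutting a layer sequence into sub-models that each fit on one device.

```python
def partition_layers(layer_costs, budget):
    """Split consecutive layers into groups whose total cost fits the budget.

    layer_costs: resource cost of each layer, in model order (hypothetical units).
    budget:      resources available on one device (same units).
    Returns a list of groups, each a list of layer indices.
    """
    partitions, current, used = [], [], 0
    for index, cost in enumerate(layer_costs):
        if cost > budget:
            raise ValueError(f"layer {index} alone exceeds the device budget")
        if used + cost > budget:      # layer does not fit: close current group
            partitions.append(current)
            current, used = [], 0
        current.append(index)
        used += cost
    if current:
        partitions.append(current)
    return partitions


if __name__ == "__main__":
    costs = [4, 3, 5, 2, 6, 1]   # assumed per-layer resource costs
    print(partition_layers(costs, budget=8))
```

A greedy split like this keeps layers in order and never exceeds the budget, but it does not minimize the number of groups or the traffic between them; a production toolchain would have to weigh those factors as well.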

Intelligent system field.
A primary research direction of artificial intelligence is to replace human mental work by using automata to simulate human thinking and behavior. It integrates knowledge from many fields, such as the natural and social sciences. Intelligent computing, intelligent decision-making, and optimal path planning are regarded as the three major applications in this direction.
(1) Intelligent computing. Intelligent computing refers to machines that, through systematic calculation, program automatically, write and draw automatically, generate verse automatically, and so on. There are also specific applications such as online intelligent course management: an integrated learning system based on FPGA that uses machine learning algorithms to create new interactive environments quickly and at low cost, thereby improving students' learning efficiency [7].
(2) Intelligent decision. Intelligent decision-making takes social science and decision-making theory as its basis, analyzes and models past data and historical records, and finally makes the optimal decision. The most common example is stock trading. The system makes optimal trading decisions from original financial information, transaction data, and historical records, thereby avoiding subjective judgments in the investment process. In recent years, applications have employed FPGAs to find the best strategy for optimizing resource allocation, allowing clients to share the same infrastructure in order to maximize scalability and resource utilization [8].
(3) Optimal path planning. Optimal path planning refers to the system deriving the best route from stored route information, taking into account distance, road conditions, and other factors that humans cannot easily predict and that involve large amounts of data. The navigation systems in our vehicles and mobile phones use this principle. Another example is the logistics system in the distribution industry, which is usually designed for the highest efficiency and accuracy, the lowest cost, and the shortest distance in order to obtain the optimal distribution route; this is so-called intelligent logistics planning. The Internet of Things has gradually taken shape with the advancement of science and technology, and using FPGA-based systems for cost-effective planning and proper management is key to developing this industry [9].

The deficiencies and future development of FPGA

Deficiencies of FPGA and solutions
Although the FPGA has the advantages described above and a wide range of applications, its development as a modern artificial intelligence chip also has pain points. Designing an efficient FPGA platform is a highly complicated task: compared with CPUs and GPUs, development is more complex, the learning curve is steep, and debugging is challenging. Reducing the difficulty of development and improving the usability of FPGAs has therefore been a problem that FPGA companies have needed to solve in recent years.
Some companies have adopted OpenCL-based high-level development for FPGAs and released their own API and SDK tools, allowing engineers to develop on the FPGA platform in a programming language they already know [10]. High-level synthesis for FPGAs has also appeared, and it is one of the directions of future FPGA development.

The development of FPGA in the future
Looking back over decades of FPGA development, it has consistently followed Moore's Law in the chip industry. Its structure, application scenarios, and development tools have all changed every so often. Looking forward, the application of artificial intelligence and high-level synthesis will be the development directions of the FPGA.

The application of artificial intelligence.
In the future, artificial intelligence applications based on the FPGA platform will gradually expand to the edge and endpoints of the network. Edge computing combined with artificial intelligence, in areas such as autonomous driving, drones, robots, and video collection and processing, will be a development direction for enterprises at home and abroad. With an FPGA, the hardware can directly model, process, and accelerate external data. In addition, the design and acceleration of specific artificial intelligence applications is a critical direction for future development.

High-level synthesis.
High-level synthesis (HLS) automatically converts a logical structure described in a high-level language into a circuit model described in a low-level language. This technology has received a lot of attention and research, mainly for three reasons: software and algorithm engineers can rely on it to participate in, and even lead, the design of chips or FPGAs; future integrated circuit design will model circuits at a higher level of abstraction; and reusing IP written in high-level languages can improve efficiency.
In the future, using FPGAs for customized architecture design will be a cost-effective choice in artificial intelligence applications, because it can improve data throughput while balancing cost, flexibility, and power consumption. In terms of high-level synthesis, more advanced and easier-to-use development tools will let more artificial intelligence software and algorithm engineers choose the FPGA as the hardware platform for their algorithms and software. The design and development of high-level synthesis tools and algorithms aimed at artificial intelligence in particular will be an inevitable part of future development. Developing these two aspects together, so that they promote and complement each other, is an essential direction for FPGAs in the field of artificial intelligence.

Conclusion
The conclusions obtained through the above research are as follows:
1. The FPGA has three significant features: it is a gate array, it is programmable, and it can be configured in the field. It is a type of PLD with a higher degree of design freedom than earlier PLDs, and its development has always followed Moore's Law. Although its structure is complex, it has a clear design principle, and most functions are realized by independent logic blocks. Its development tools include hardware and software, and making full use of the various tools is the key to a good FPGA design. The most typical method is to write register transfer level (RTL) code in a hardware description language such as Verilog HDL to describe the circuit, and then implement the target circuit in the FPGA through a series of flow steps.
2. In the field of artificial intelligence, the FPGA has the advantages of flexibility and configurability, special optimizations for convolutional neural networks, and deterministic low latency. Its typical applications include real-time monitoring and processing, the Brainwave project of Microsoft, and the intelligent system field.
3. Although designing an efficient FPGA platform is a highly complicated task, this problem is expected to ease as technology develops, and the FPGA has broad room for development. The application of artificial intelligence and high-level synthesis are the two major directions of future development.
In the future, with the increasing popularity of artificial intelligence in industry and life, there is reason to believe that FPGA will shine as an artificial intelligence chip.

Figure 1. The internal structure diagram of FPGA.

Figure 2. Schematic diagram of the basic principle of the autonomous vehicle.

Figure 3. Complete acceleration flow chart for DNN of the Brainwave project.
Regarding testing, the Brainwave project ran acceleration experiments on the DeepScan and Turing Prototype (TP1) models in Microsoft's Bing search. Compared with the CPU, the Brainwave solution achieved more than a tenfold reduction in delay at more than ten times the model size [1].