#### PAPER • OPEN ACCESS

# A Novel Sleep Scheduling Strategy on RISC-V Processor

To cite this article: Weixin Zhou et al 2020 J. Phys.: Conf. Ser. 1631 012028



Weixin Zhou<sup>1,2</sup>, Dehua Wu<sup>1,2</sup>, Wan'ang Xiao<sup>3,4</sup>, Shan Gao<sup>1,2</sup> and Wanlin Gao<sup>1,2\*</sup>

<sup>1</sup> Key Laboratory of Agricultural Information Standardization, Ministry of Agriculture and Rural Affairs, China Agricultural University, Beijing 100083, China

<sup>2</sup> College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China

<sup>3</sup> Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China

<sup>4</sup> Center of Materials Science and Optoelectronics Engineering, School of Microelectronics, University of Chinese Academy of Sciences, Beijing, China

Email: gaowlin@cau.edu.cn

**Abstract.** With the development of the Internet of Things, low power consumption has become a primary concern in processor design. Processor sleep mode is considered an effective low-power technique. However, operation errors may occur if a long-cycle instruction has not completed when the processor switches into sleep mode. To solve this problem, a novel sleep scheduling strategy based on the RISC-V instruction set architecture is proposed. In this paper, the structure of a RISC-V processor with a task dispatching mechanism is described first. Then, the WFI instruction and clock gating are adopted to realize the sleep scheduling strategy. Finally, hardware simulation is performed to demonstrate the feasibility of the novel sleep scheduling strategy.

Keywords. RISC-V; processor; task dispatching; sleep mode; clock gating

#### 1. Introduction

With the promotion of the RISC-V instruction set architecture, RISC-V microprocessors have been widely used in a variety of application scenarios. The microprocessor is the core component of information collection and network communication in the IoT (Internet of Things) [1]. As one of the core devices in the development of the information society, the microprocessor faces great challenges in power consumption. With improvements in manufacturing technology, processor performance has made great progress. However, the performance growth of single-core processors gradually slowed after about a decade of rapid increase. Meanwhile, power consumption has become one of the main factors that influence and restrict the development of processors [2]. High power consumption causes many problems: it usually brings high cost and affects the portability of the equipment and the reliability of the processor [3]. To extend the working life of the processor, power consumption must be reduced as much as possible, and sleep mode is considered one of the most effective ways to do so. However, it is hard to switch into sleep mode directly while the processor is running, because incomplete instructions may cause functional errors.

To solve the problems above, a novel sleep scheduling strategy based on the RISC-V instruction set architecture is proposed in this paper. First, the sources of power dissipation and low-power technology are discussed. Then a two-stage pipeline RISC-V processor and the theory of the novel sleep scheduling strategy are presented. The WFI (Wait for Interrupt) sleep instruction and clock gating are used to reduce power, while the task dispatching mechanism ensures that the processor runs correctly. Finally, experiments are carried out to demonstrate the feasibility of the novel sleep scheduling strategy.

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. Published under licence by IOP Publishing Ltd

#### 2. Low Power Technology

Power dissipation in a circuit falls into two broad categories: dynamic power and static power [4]. Dynamic power is dissipated when the circuit is active and is mainly composed of switching power and internal power. Static power is dissipated in several ways, mainly through source-to-drain subthreshold leakage and leakage current between the diffusion layers and the substrate.
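As a point of reference, the switching component of dynamic power is commonly approximated by the first-order textbook model below. This expression is not taken from the paper; the symbols are the usual ones: $\alpha$ is the switching activity factor, $C_L$ the switched load capacitance, $V_{DD}$ the supply voltage and $f$ the clock frequency.

$$
P_{\text{switching}} = \alpha \, C_L \, V_{DD}^{2} \, f
$$

Under this model, stopping the clock to an idle block drives its effective activity toward zero, which is why clock gating targets the dynamic term, while the static (leakage) term is left unaffected.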

In synchronous circuit design, the circuit dissipates power as long as the clock is running, and much of this power is redundant. In many cases, clock gating provides a power-efficient implementation of register banks that are disabled during some clock cycles [5]. As long as a certain Verilog HDL (Hardware Description Language) coding style is followed, EDA (Electronic Design Automation) tools can infer clock gating directly from the code to reduce dynamic power [6]. The RISC-V instruction set architecture provides a WFI instruction for sleep mode [7]. When the processor executes the WFI instruction, it stops the current instruction flow and enters sleep mode. Because multi-cycle instructions exist, the processor must be prevented from entering sleep mode while such instructions are still incomplete; otherwise, functional errors may occur. This paper adopts the WFI instruction and clock gating to implement the novel sleep scheduling strategy. A task dispatching mechanism is designed to ensure that the instructions in flight have finished before the processor enters sleep mode.
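To make the clock gating idea concrete, the following is a minimal behavioral sketch in Python of a latch-based clock gate. The latch-based structure, the function names `icg` and `count_rising_edges`, and the sample waveforms are assumptions chosen for illustration; the paper does not specify how its gating cell is built.

```python
def icg(clk_samples, en_samples):
    """Behavioral model of a latch-based clock gate (an assumed structure).

    The enable is captured while clk is low, and the gated clock is
    clk AND the latched enable, so a change of enable while clk is high
    cannot cut a clock pulse short.
    """
    latched = 0
    gated = []
    for clk, en in zip(clk_samples, en_samples):
        if clk == 0:
            latched = en  # latch is transparent only while clk is low
        gated.append(clk & latched)
    return gated


def count_rising_edges(samples):
    """Count 0 -> 1 transitions, assuming the signal starts low."""
    edges, prev = 0, 0
    for s in samples:
        if prev == 0 and s == 1:
            edges += 1
        prev = s
    return edges


# Hypothetical waveforms, two samples per clock period.
clk = [0, 1, 0, 1, 0, 1, 0, 1]
en = [1, 1, 1, 0, 0, 0, 0, 1]

gated_clk = icg(clk, en)
print(count_rising_edges(clk), count_rising_edges(gated_clk))
```

In this example the register bank behind the gate sees only the clock edges that occur while it is enabled, which is the source of the dynamic power saving.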

#### 3. Theory of the Sleep Scheduling Strategy

The processor used in this paper is a two-stage pipeline processor based on the RISC-V instruction set architecture, as shown in figure 1. The first pipeline stage is the IFU, which consists of the fetch unit that fetches instructions from the memory. The second pipeline stage is the EXU, which consists of the decode unit, task dispatch unit, arithmetic unit, sleep control unit, write-back unit and register files. Table 1 summarizes the functions of the units.

The task dispatching mechanism is divided into task dispatching and task committing. As shown in figure 2, the task dispatch unit is built around a FIFO. It dispatches a task when an instruction is decoded, and it commits a task when the result is written into the register files. After a task is committed, it is considered finished. The task clear signal switches to 0 once all tasks are completed.
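The dispatch and commit behavior described above can be sketched as a small Python model. The class name `TaskDispatchUnit`, its method names and the in-order commit are assumptions made for illustration; only the FIFO structure and the polarity of the task clear signal (0 when no task is outstanding) come from the text.

```python
from collections import deque


class TaskDispatchUnit:
    """Illustrative model of a FIFO-based task dispatch unit (names are hypothetical)."""

    def __init__(self):
        self._fifo = deque()

    def dispatch(self, task):
        """Record a task when an instruction is decoded."""
        self._fifo.append(task)

    def commit(self):
        """Retire the oldest task when its result is written to the register files."""
        return self._fifo.popleft()

    @property
    def task_clear(self):
        """1 while any task is outstanding, 0 once all tasks are committed."""
        return 1 if self._fifo else 0


tdu = TaskDispatchUnit()
tdu.dispatch("mul")        # a multi-cycle instruction is decoded
tdu.dispatch("add")
busy = tdu.task_clear      # tasks are outstanding
first = tdu.commit()       # oldest task retires first
tdu.commit()
idle = tdu.task_clear      # everything has been written back
```

Here `busy` is 1 and `idle` is 0, matching the statement that the task clear signal only falls to 0 after every dispatched task has been committed.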



Figure 1. Processor architecture.

Table 1. Functions of units.

| Name                    | Functions                                                                                                          |
|-------------------------|--------------------------------------------------------------------------------------------------------------------|
| Fetch Unit              | Fetch instruction from the memory and send the instruction to the decode unit.                                     |
| Decode Unit             | Decode instruction and send the instruction information to the task dispatch unit and the arithmetic & logic unit. |
| Arithmetic & Logic Unit | Perform arithmetic operations and send the result to the write back unit.                                          |
| Write Back Unit         | Write the result back to the register files and send the information of result to the task dispatch unit.          |
| Task Dispatch Unit      | Dispatch and commit instruction tasks.                                                                             |
| Sleep Control Unit      | Receive the instruction information from decode unit and send sleep signal to fetch unit and decode unit.          |



Figure 2. Task dispatching architecture.

The novel sleep scheduling strategy works through the following steps, as shown in figure 3:

(1) After decoding a sleep instruction, the decode unit generates a sleep command signal and sends it to the sleep control unit;

(2) The sleep control unit generates a sleep signal (0 signal) and transmits it to the IFU and the EXU to control the clock signals of the two pipeline stages;

(3) Because the clock of the fetch unit is gated with the sleep signal, when the sleep signal is 0 the clock cannot reach the IFU and the fetch unit stops working;

(4) If the processor still has unfinished instructions, such as long multi-cycle multiplication and division instructions, the EXU continues working;

(5) After all instructions are completed, the task dispatch unit sends a task clear signal (0 signal), which passes through an OR gate together with the sleep signal to generate a new signal. Because the sleep signal is already 0, this new signal becomes 0 once the task clear signal is 0. The clock then cannot reach the EXU, and the EXU stops working;

(6) The sleep control unit receives the task clear signal and then outputs a core sleep signal to indicate that the processor core has stopped working.
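The gating logic in the steps above can be summarized as a small combinational function. This is a sketch under stated assumptions: the function name `sleep_gating` is hypothetical, and deriving the core sleep signal directly from the EXU clock enable is a simplification of step (6), not the authors' circuit.

```python
def sleep_gating(sleep_signal, task_clear):
    """Combinational view of steps (2) to (6).

    sleep_signal: 0 after a sleep instruction is decoded, 1 otherwise.
    task_clear:   1 while tasks are outstanding, 0 when all are committed.
    Returns (ifu_clk_en, exu_clk_en, core_sleep).
    """
    ifu_clk_en = sleep_signal               # step (3): IFU clock gated by the sleep signal
    exu_clk_en = sleep_signal | task_clear  # step (5): OR of sleep and task clear signals
    core_sleep = 1 if exu_clk_en == 0 else 0  # step (6), simplified assumption
    return ifu_clk_en, exu_clk_en, core_sleep


# Enumerate all input combinations to obtain the truth table.
truth_table = {
    (s, t): sleep_gating(s, t) for s in (0, 1) for t in (0, 1)
}
```

The only input combination that stops both clocks is a sleep request with no outstanding task; a sleep request with a pending task stops the IFU clock alone, which is the behavior that protects multi-cycle instructions.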



Figure 3. Novel sleep scheduling strategy mechanism.

#### 4. Experiment

Figure 4 shows the full flow of the experiment. The riscv-gcc compiler translates the assembly code of the instructions into binary and stores it in a separate file [8]. Then a testbench is written to load this file, and VCS+VERDI is used for hardware simulation to analyze the waveform of the design. The waveform is analyzed in detail below.



Figure 4. Experiment flow.

If the processor is not executing a multi-cycle instruction when the WFI instruction arrives, the simulation waveform is as shown in figure 5. The wfi\_control signal (sleep signal) switches to 0 when the sleep command signal arrives, and the task\_done signal (task clear signal) is also 0 since no multi-cycle instruction is running. The gating enable signals ifu\_clk\_en and exu\_clk\_en therefore switch to 0, holding the clock signals clk\_ifu and clk\_exu at 0. Finally, the core\_wfi signal (core sleep signal) switches to 1, which indicates that the core has entered sleep mode.




Figure 5. No Multi-cycle instruction simulation waveform.

If the processor is executing a multi-cycle instruction when the WFI instruction arrives, the simulation waveform is as shown in figure 6. The wfi\_control signal (sleep signal) switches to 0 when the sleep command signal arrives, but the task\_done signal (task clear signal) is 1 since a multi-cycle instruction is running. To prevent errors caused by interrupting the multi-cycle instruction, only the gating enable signal ifu\_clk\_en switches to 0, holding the clock signal clk\_ifu at 0. After the multi-cycle instruction has completed, the task\_done signal switches to 0. Then the signal exu\_clk\_en switches to 0, holding the clock signal clk\_exu at 0. Finally, the core\_wfi signal (core sleep signal) switches to 1, which indicates that the core has entered sleep mode.
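The two waveform scenarios can be reproduced qualitatively with a cycle-level sketch. The function name `simulate_sleep_entry` and its parameters are hypothetical; the signal names follow the text. The model is purely combinational per cycle, so any register delays visible in the real waveforms are not represented.

```python
def simulate_sleep_entry(busy_cycles, total_cycles=6):
    """Trace the gating signals from the cycle in which WFI is decoded.

    busy_cycles: cycles the EXU still needs to finish a multi-cycle
                 instruction (0 means the EXU is idle).
    Returns one dict of signal values per cycle.
    """
    trace = []
    remaining = busy_cycles
    for cycle in range(total_cycles):
        wfi_control = 0                        # sleep signal is low after WFI decode
        task_done = 1 if remaining > 0 else 0  # 1 while a task is outstanding
        ifu_clk_en = wfi_control
        exu_clk_en = wfi_control | task_done
        core_wfi = 1 if exu_clk_en == 0 else 0
        trace.append({
            "cycle": cycle,
            "task_done": task_done,
            "ifu_clk_en": ifu_clk_en,
            "exu_clk_en": exu_clk_en,
            "core_wfi": core_wfi,
        })
        if exu_clk_en and remaining > 0:
            remaining -= 1                     # EXU clock still runs, work progresses
    return trace


idle_trace = simulate_sleep_entry(busy_cycles=0, total_cycles=3)
busy_trace = simulate_sleep_entry(busy_cycles=3, total_cycles=5)
for row in busy_trace:
    print(row)
```

With an idle EXU the core sleeps immediately, whereas with three cycles of work pending the EXU clock stays enabled for those three cycles before core\_wfi rises, mirroring the ordering of events described for figures 5 and 6.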



Figure 6. Multi-cycle instruction simulation waveform.

#### 5. Conclusion

A novel sleep scheduling strategy based on the RISC-V instruction set architecture is proposed in this paper. A two-stage pipeline RISC-V processor is presented and the theory of the strategy is introduced. The WFI sleep instruction and clock gating are used to reduce power, and the task dispatching mechanism is designed to ensure the correct operation of the processor. Finally, the results of hardware simulation demonstrate the feasibility of the novel sleep scheduling strategy.

#### References

- [1] Do Rosario V M, Pisani F, Gomes A R, et al. 2018 Fog-assisted translation: towards efficient software emulation on heterogeneous IoT devices *2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)* pp 1268-1277.

AICS 2020

Journal of Physics: Conference Series

- [2] Ghasemazar M, Pakbaznia E and Pedram M 2010 Minimizing the power consumption of a chip multiprocessor under an average throughput constraint *2010 11th International Symposium on Quality Electronic Design (ISQED)* pp 362-371.
- [3] Givargis T D, Vahid F and Henkel J 2001 Evaluating power consumption of parameterized cache and bus architectures in system-on-a-chip designs *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* pp 500-508.
- [4] Kuroda T 2002 Low-power, high-speed CMOS VLSI design *IEEE International Conference on Computer Design: VLSI in Computers and Processors* pp 310-315.
- [5] Arsalan S, Saad A, Muhammad Y Q, et al. 2016 Power optimization using clock gating and power gating: A review *Innovative Research and Applications in Next-Generation High Performance Computing* pp 1-20.
- [6] Teng S K and Soin N 2010 Low power clock gates optimization for clock tree distribution *11th International Symposium on Quality Electronic Design* pp 488-492.
- [7] Waterman A, Lee Y and Patterson D 2014 *The RISC-V Instruction Set Manual*.
- [8] Dennis D K, Priyam A, Virk S S, et al. 2017 Single cycle RISC-V micro architecture processor and its FPGA prototype *2017 7th International Symposium on Embedded Computing and System Design (ISED)* pp 1-5.