
Mohammed Waseem Khanooni and S. D. Chede / Elixir Elec. Engg. 108 (2017) 47560-47562

Available online at www.elixirpublishers.com (Elixir International Journal)



**Electrical Engineering** 



Elixir Elec. Engg. 108 (2017) 47560-47562

# Low Complexity Frequency Logic Controller For Network On Chip Router

Mohammed Waseem Khanooni<sup>1</sup> and S. D. Chede<sup>2</sup> <sup>1</sup>Priyadarshini College of Engineering, Nagpur, India. <sup>2</sup>Suryodaya College of Engineering, Nagpur, India.

### ARTICLE INFO

Article history:

Received: 2 June 2017; Received in revised form: 28 June 2017; Accepted: 6 July 2017;

# Keywords

Network-on-chip, Router, Dynamic voltage frequency scaling, Power efficiency, Register transfer logic.

### ABSTRACT

The Network-on-Chip (NoC) paradigm allows designers to efficiently integrate more intellectual properties (IPs) into a single-chip system. However, power consumption has become one of the most critical issues in designing such large, complex systems. Low power design can be achieved by scaling the voltage and frequency of the target components. The question is how to adapt the voltage-frequency scaling to the required performance of the system at run-time while reducing the power consumption as much as possible. This paper presents a novel solution for reducing power consumption in NoC architectures. Because the communication traffic is not equally distributed over the network, each router is supplied with a voltage and frequency matched to its communication load, which minimizes the power consumption while keeping the necessary communication throughput.

© 2017 Elixir All rights reserved.

# I. Introduction

Over the last decade, the Network-on-Chip (NoC) paradigm has emerged as a solution for designing large, complex systems-on-chip (SoCs) [1]. In NoC based systems, computing units (i.e., Intellectual Properties or IPs) communicate with each other using a micro network that is composed of network routers and network links. As a result, designers integrate more and more IPs into a system in order to meet the needs of the targeted applications. On the other hand, power consumption becomes one of the most important factors in designing such complex systems. In practice, the system never works at its maximum power capacity: some parts of it are often idle or operating at low speed. Therefore, there is room to apply low power techniques to this kind of system.

The techniques for estimating and reducing the power in NoCs can be broadly classified according to the abstraction levels of the design process. Firstly, at the gate level, the lowest level of abstraction, the transistor network can be explored to apply low power techniques. In [2], [3], PMOS power switches controlled by an ultra-cut-off technique are applied to reduce the static power consumption. Secondly, at the register-transfer level, Dynamic Voltage and Frequency Scaling (DVFS) and Adaptive Voltage-Frequency Scaling (AVS) are the techniques most often used to reduce the power of NoC architectures. In [4], the authors investigated NoC architectures partitioned into different voltage-frequency domains. The so-called "Asynchronous Programmable Self-Timed Ring" for controlling the dynamic workload and the process variability effects has been proposed for Globally Asynchronous - Locally Synchronous (GALS) based NoC architectures.

Finally, at the system level, the system's functionalities can be considered in order to apply power techniques without going into the hardware details of the individual components. Power models built at the system level are therefore less accurate, but they have the advantage of requiring less time and fewer resources [5].

In this paper, we propose a low power solution for NoC architectures at the router level. Each network router is equipped with a voltage-frequency controller that determines the voltage and frequency needed to minimize the power consumption while keeping the communication capacity required by the application. The design is modeled and verified using VHDL at the register-transfer level and implemented using Xilinx FPGA technology.

The remaining part of the paper is organized as follows. Section II describes an overview of the proposed solution. Section III describes the detailed architecture of the proposed voltage frequency controller. The simulation and experimental results are provided in Section IV. Finally, conclusions and remarks are given in Section V.

# II. The Proposed Voltage-Frequency Controller

In this paper, we assume that the traffic going through a router is a quantity that reflects the activity of this router. If the router carries heavy communication traffic, it must be supplied with a higher frequency, as well as a higher voltage, to sustain the high data transmission rate, and vice versa.
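The link between supply settings and power can be made explicit with the standard first-order model of dynamic power in CMOS logic. The paper does not state this model; it is added here only as background, and the symbols $\alpha$ (switching activity), $C$ (switched capacitance), $V$ (supply voltage) and $f$ (clock frequency) are notation introduced for this sketch:

$$P_{\text{dyn}} = \alpha \, C \, V^{2} f$$

Under this model, scaling both the voltage and the frequency by the same factor $k < 1$ scales the dynamic power by $k^{3}$:

$$P'_{\text{dyn}} = \alpha \, C \, (kV)^{2} (kf) = k^{3} P_{\text{dyn}}$$

For example, $k = 0.5$ would reduce the dynamic power to one eighth of its original value, which is why a lightly loaded router is a good candidate for a lower voltage-frequency pair.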

Therefore, we propose to use a voltage-frequency controller that scales the voltage and frequency of the router according to its activity, in order to reduce the power consumption of a router in a NoC based system. To do so, the controller monitors the traffic through the router, then predicts the change in traffic and decides whether to increase or decrease the voltage and frequency accordingly. To simplify the structure and reduce the hardware resources of the system, we propose a Voltage-Frequency Controller (VFC) as shown in Figure 1. In this controller, we use a fuzzy logic processor to predict the communication traffic and decide on the values of frequency and voltage.



Figure 1. Proposed voltage frequency controller.

In this architecture, each input port of the target router is equipped with a traffic counter. These counters count the data flits passing through the router over a fixed number of clock cycles (average traffic), based on the corresponding response signals from the router. Since the router normally has 5 input/output ports [11], there are 5 communication traffic values from the router. The Max Average (MA) block compares these five average traffic values and selects the maximum. This maximum is then sent to Input 1 of the LP for processing.

The Derivative (DER) block calculates the derivative of the traffic obtained from the counters. To do that, it receives the traffic values from the counters and stores these values in buffers. The derivative of the traffic is calculated from the present value and the previous value. The DER determines the derivative of the traffic according to the maximum traffic value decided by the MA block and then passes it to Input 2 of the LP for further processing.

The Logic Processor (LP) processes the given information (maximum traffic value and derivative value) to predict the next communication load passing through the router and to decide the suitable voltage and frequency supplied to the router. The operation of the LP is based on a state machine model, which simplifies modeling and calculation. This in turn reduces the hardware resources required to implement the whole voltage-frequency controller.

The Voltage-Frequency Adjusting (VFA) block controls the voltage and the frequency supplied to the router. In this design, the router can be supplied with one of three frequency-voltage pairs (low, medium, high). When the frequency is changed, the voltage is also adjusted to the level corresponding to the new frequency. The change of frequency is determined by a control signal at the output of the LP.
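One possible way to picture the VFA behaviour is a small three-level state machine driven by the LP control signal. The sketch below is only an illustration under stated assumptions: the entity name `vf_adjust`, the port names, the step-up/step-down encoding of `ctrl_in`, the encoding of `vf_level`, and the choice of the high level after reset are all hypothetical and are not taken from the paper.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity vf_adjust is
  port (
    clk      : in  std_logic;
    rst_n    : in  std_logic;                     -- active-low reset (assumed)
    ctrl_in  : in  std_logic_vector(1 downto 0);  -- hypothetical: "01" step up, "10" step down, others hold
    vf_level : out std_logic_vector(1 downto 0)   -- hypothetical: "00" low, "01" medium, "10" high
  );
end entity vf_adjust;

architecture rtl of vf_adjust is
  -- The three frequency-voltage pairs described in the text.
  type level_t is (low_lv, medium_lv, high_lv);
  signal level : level_t;
begin
  step : process (clk, rst_n) is
  begin
    if rst_n = '0' then
      level <= high_lv;  -- assumption: start at full throughput
    elsif rising_edge(clk) then
      case ctrl_in is
        when "01" =>     -- LP requests a higher pair
          if level = low_lv then
            level <= medium_lv;
          elsif level = medium_lv then
            level <= high_lv;
          end if;
        when "10" =>     -- LP requests a lower pair
          if level = high_lv then
            level <= medium_lv;
          elsif level = medium_lv then
            level <= low_lv;
          end if;
        when others =>   -- hold the current pair
          null;
      end case;
    end if;
  end process step;

  -- A single level code selects frequency and voltage together,
  -- so the two can never be set to mismatched values.
  with level select vf_level <=
    "00" when low_lv,
    "01" when medium_lv,
    "10" when high_lv;
end architecture rtl;
```

Driving both the clock selection and the supply selection from one level code reflects the statement that the voltage always follows the frequency; how the code is turned into an actual clock and supply is outside the scope of this sketch.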

# III. Modeling The Voltage-Frequency Controller

A. The Counter

The model of Counter_x is described in Figure 2. In this model, the Clock counter block counts the events of the clock signal (clk). The Signal counter block counts events on the resp_in signal (a handshaking signal at the router indicating that a flit transaction is complete). When the Clock counter reaches a fixed value, the number of events in the Signal counter block is sent to the output of Counter_x as the traffic value.



Figure 2. Model of counter.

The behaviour of Counter_x is modeled as a finite state machine (FSM) with three states: init_st, count_st and count_full_st (Figure 3). In the init_st state, all signals are reset to their initial values, and the next state is count_st. The count_st state counts the events of the resp_in and clk signals. When the number of clk events equals 0x64 (100), the next state of the FSM is count_full_st. In the count_full_st state, the number of resp_in events is sent to the output as the traffic value and the FSM returns to the init_st state. In this state, a signal (end_count) is also sent to the DER to indicate that a counting process has finished.



Figure 3. Counter finite state machine.
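The three-state counter described above can be sketched in VHDL as follows. This is a minimal illustration, not the authors' source: the state and signal names follow the text, while the entity name `traffic_counter`, the active-low reset `rst_n`, the 8-bit and 16-bit counter widths, and the assumption that `resp_in` is synchronous to `clk` and high for exactly one cycle per completed flit are choices made for this sketch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity traffic_counter is
  port (
    clk           : in  std_logic;
    rst_n         : in  std_logic;                      -- active-low reset (assumed)
    resp_in       : in  std_logic;                      -- one-cycle pulse per completed flit (assumed)
    val_count_out : out std_logic_vector(15 downto 0);  -- traffic of the last window
    end_count     : out std_logic                       -- one-cycle pulse when a window ends
  );
end entity traffic_counter;

architecture rtl of traffic_counter is
  type state_t is (init_st, count_st, count_full_st);
  constant WINDOW : unsigned(7 downto 0) := x"64";      -- 100 clock events per window
  signal state    : state_t;
  signal clk_cnt  : unsigned(7 downto 0);
  signal sig_cnt  : unsigned(15 downto 0);
begin
  fsm : process (clk, rst_n) is
  begin
    if rst_n = '0' then
      state         <= init_st;
      clk_cnt       <= (others => '0');
      sig_cnt       <= (others => '0');
      val_count_out <= (others => '0');
      end_count     <= '0';
    elsif rising_edge(clk) then
      end_count <= '0';                                 -- default: no pulse
      case state is
        when init_st =>
          -- Clear both counters before a new window starts.
          clk_cnt <= (others => '0');
          sig_cnt <= (others => '0');
          state   <= count_st;
        when count_st =>
          -- Count clock events and flit responses in parallel.
          clk_cnt <= clk_cnt + 1;
          if resp_in = '1' then
            sig_cnt <= sig_cnt + 1;
          end if;
          if clk_cnt = WINDOW - 1 then                  -- 100th cycle of the window
            state <= count_full_st;
          end if;
        when count_full_st =>
          -- Publish the traffic value and notify the DER block.
          val_count_out <= std_logic_vector(sig_cnt);
          end_count     <= '1';
          state         <= init_st;
      end case;
    end if;
  end process fsm;
end architecture rtl;
```

In this sketch the FSM spends one cycle in `init_st` and one in `count_full_st` between windows, so responses arriving in those two cycles are not counted; whether the original design behaves the same way cannot be determined from the text.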

B. The Max Average

The MA block receives the traffic values from the Counter_x blocks at the five ports of the router. By comparing those values, this block chooses the maximum among them and sends it to Input 1 of the LP. The structure of this block is simply a combination of 16-bit comparators (COMP), as shown in Figure 4.



Figure 4. Max Average block.
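A combinational maximum over five 16-bit values can be sketched with four comparator stages. The arrangement below is an assumption made for illustration, since the exact comparator wiring is only given in the figure; the entity name `max_average`, the port names and the helper function `max2` are likewise hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity max_average is
  port (
    traf_0, traf_1, traf_2, traf_3, traf_4 : in  std_logic_vector(15 downto 0);
    max_out                                : out std_logic_vector(15 downto 0)
  );
end entity max_average;

architecture rtl of max_average is
  -- One 16-bit comparator followed by a 2:1 multiplexer.
  function max2 (a, b : unsigned(15 downto 0)) return unsigned is
  begin
    if a > b then
      return a;
    else
      return b;
    end if;
  end function max2;

  signal m01, m23, m0123 : unsigned(15 downto 0);
begin
  -- First level: compare the ports pairwise.
  m01     <= max2(unsigned(traf_0), unsigned(traf_1));
  m23     <= max2(unsigned(traf_2), unsigned(traf_3));
  -- Second level: winner of the first four ports.
  m0123   <= max2(m01, m23);
  -- Third level: compare against the fifth port.
  max_out <= std_logic_vector(max2(m0123, unsigned(traf_4)));
end architecture rtl;
```

The traffic values are treated as unsigned numbers here, which matches their role as event counts. A linear chain of four comparators would give the same result with a longer combinational path.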

C. The Derivative

The main structure of the DER is composed of two 16-bit registers. One register stores the traffic value at the present time. The other stores the value from one time frame earlier. The derivative of the traffic is calculated as the absolute value of the difference between those registers.



Figure 5. Derivative block.

The VHDL process that describes the DER block is given below:

```
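-- DER block: on every end-of-count pulse, store the new traffic value and
-- output the absolute difference between the two stored values.
-- Assumes the signals are of type unsigned (ieee.numeric_std), or that
-- ieee.std_logic_unsigned is in use, so that ">" and "-" are defined.
der_process : process (end_in, rst_n) is
begin
  if rst_n = '0' then
    -- Asynchronous, active-low reset
    dev_traf_out <= x"0000";
    reg_nx <= x"0000";
    reg_pr <= x"0000";
  elsif end_in'event and end_in = '1' then
    -- Shift the traffic history: reg_pr takes the new value,
    -- reg_nx takes the value reg_pr held before this edge.
    reg_pr <= val_traf_in;
    reg_nx <= reg_pr;
    -- Signal reads see the values held before this edge, so the
    -- difference is taken between the two previously stored values.
    if reg_pr > reg_nx then
      dev_traf_out <= reg_pr - reg_nx;
    else
      dev_traf_out <= reg_nx - reg_pr;
    end if;
  end if;
end process der_process;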
```


Figure 6. Derivative block source code.

# IV. Simulation and Implementation Results

After all blocks of the controller had been modeled at the RTL level, we simulated the operation of those blocks using the ModelSim software. A test bench for each block generates random data for the inputs. By observing the simulation waveforms and comparing the simulation results with the calculated results, we can draw conclusions about the operation of those blocks.
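The verification approach described above, random inputs checked against independently calculated results, can be sketched as a self-checking test bench. The example below is not one of the authors' test benches: it exercises a stand-in absolute-difference expression modeled on the DER arithmetic, and the entity name `abs_diff_tb`, the signal names, the seed values and the number of iterations are all assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use ieee.math_real.all;

entity abs_diff_tb is
end entity abs_diff_tb;

architecture sim of abs_diff_tb is
  signal a, b, y : unsigned(15 downto 0) := (others => '0');
begin
  -- Logic under test: absolute difference of two 16-bit values.
  y <= a - b when a > b else b - a;

  stimulus : process is
    variable seed1, seed2 : positive := 1;  -- arbitrary seeds
    variable r            : real;
    variable ia, ib       : integer;
  begin
    for i in 1 to 100 loop
      -- Draw two random integers in the range 0 to 65535.
      uniform(seed1, seed2, r);
      ia := integer(floor(r * 65536.0));
      uniform(seed1, seed2, r);
      ib := integer(floor(r * 65536.0));
      a <= to_unsigned(ia, 16);
      b <= to_unsigned(ib, 16);
      wait for 10 ns;
      -- Compare the simulated output with the calculated result.
      assert to_integer(y) = abs(ia - ib)
        report "mismatch for inputs " & integer'image(ia) &
               " and " & integer'image(ib)
        severity error;
    end loop;
    report "random test finished";
    wait;
  end process stimulus;
end architecture sim;
```

Computing the expected value with integer arithmetic, separately from the unsigned logic under test, is what makes the comparison meaningful; the same pattern would apply to the counter and MA blocks with their own reference calculations.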

Figure 7 shows a short waveform of Counter_x. When the number of clock events reaches 100, the next state of the FSM is count_full_st and the output val_count_out takes the value 78.



Figure 7. Counter operation.

One of the simulation results of the MA is shown in Figure 8. Given five input values, the output returns the value 178, which is the maximum among them.


Figure 8. Max Average operation.

Figure 9 shows the simulation results of the DER. In the dashed rectangle, the current input value is 4161 and the previous input value is 8449, so the result is the absolute value of their difference, reported as 3824.



Figure 9. Derivative operation.

The simulation of the LP is shown as a short waveform in Figure 10. In this waveform, when the value of each input is 0x28, the output takes the value 0x4F. This agrees with the calculated result. All test results show that the LP operates in accordance with the proposed model.



Figure 10. Logic processor operation.

After being successfully modeled and verified, the LP was implemented on an FPGA device (Spartan 3E-xc3s500e-5vq100) using the Xilinx ISE tool suite. The implementation results are given in Table I.

Table I.

| Logic Utilization | Used | Available | Utilization |
|-------------------|------|-----------|-------------|
| Slices            | 711  | 4656      | 15%         |
| Flip flop slices  | 197  | 9312      | 2%          |
| 4 input LUTs      | 1325 | 9312      | 14%         |
| Bonded IOBs       | 26   | 66        | 39%         |
| MULT18X18SIOs     | 19   | 20        | 95%         |
| GCLKs             | 1    | 24        | 4%          |

# V. Conclusions

In this paper, we have proposed a low power solution based on dynamic voltage-frequency control for NoC architectures. Each network router is equipped with a Voltage-Frequency Controller, which analyzes the communication traffic and its variation and decides whether to increase or decrease the frequency and voltage applied to the router. The design of the voltage-frequency controller using VHDL has also been presented, together with the main simulation and implementation results.

**References**

[1] W. Dally, B. Towles, Route packets, not wires: On-chip interconnection networks, in: Proceedings of the 38th Design Automation Conference (DAC), 2001, pp. 684–689.

[2] E. Beigne, F. Clermidy, H. Lhermet, S. Miermont, Y. Thonnart, X.-T. Tran, A. Valentian, D. Varreau, P. Vivet, X. Popon, H. Lebreton, An asynchronous power aware and adaptive NoC based circuit, IEEE Journal of Solid-State Circuits 44 (4) (2009) 1167–1177.

[3] E. Beigne, F. Clermidy, S. Miermont, Y. Thonnart, A. Valentian, P. Vivet, A localized power control mixing hopping and super cut-off techniques within a GALS NoC, in: IEEE ICICDT, 2008, pp. 37–42.

[4] H. Zakaria, L. Fesquet, Process variability robust energy-efficient control for nano-scaled complex SoCs, in: Faible Tension Faible Consommation (FTFC), 2011, pp. 95–98.

[5] S. E. Lee, N. Bagherzadeh, A high level power model for network-on-chip (NoC) router, Journal Computers and Electrical Engineering 35 (6) (2009) 837–845.

[6] N.-K. Dang, T.-V. Le-Van, X.-T. Tran, FPGA implementation of a low latency and high throughput network-on-chip router architecture, in: Proceedings of the 2011 ICDV, 2011, pp. 112–116.

[7] M. Sugeno, An introductory survey of fuzzy control, Information Sciences 36 (1) (1985) 59–83.

[8] H.-P. Phan, X.-T. Tran, A fuzzy-logic based voltage-frequency controller for network-on-chip routers, in: Proceedings of the 11th Conference on PhD Research in Microelectronics and Electronics (IEEE PRIME), IEEE, Glasgow, Scotland, 2015, pp. 192–195.

[9] K. Deliparaschos, F. Nenedakis, S. Tzafestas, Design and implementation of a fast digital fuzzy logic controller using FPGA technology, Journal of Intelligent and Robotic Systems 45 (1) (2006) 77–96.

[10] K. Deliparaschos, F. Nenedakis, S. Tzafestas, A fast digital fuzzy logic controller: FPGA design and implementation, in: Proceedings of 10th IEEE Conference on Emerging Technologies and Factory Automation (ETFA), Vol. 1, 2005, pp. 4 pp.–262.