# 6K-3

# Flexible L1 Cache Optimization for a Low Power Embedded System

Huatao ZHAO, Jiongyao YE, Takahiro WATANABE

Graduate School of Information, Production and Systems. Waseda University.

**Abstract:** Reducing cache power consumption is an important subject in high-performance, low-power microprocessor systems, especially embedded systems. In this paper, we propose a low-power cache optimization method tailored to a particular application. The proposed method adjusts configuration parameters such as cache size, line size and associativity, and the L1 cache is then reconfigured for the application program.

The effectiveness of this method will be verified by experiments using CACTI 6.5 and the SPEC2006 benchmark on Simple-scalar 3.0.

### Introduction:

For a general-purpose processor system, no single cache architecture is the most effective for all applications. For an embedded system whose application is fixed, however, we can adjust the cache parameters [1] to meet the requirements of minimum hardware resources and low power consumption.

There are two approaches to the search heuristic used for adjusting cache parameters: static and dynamic. The static approach [2] predetermines the optimal configuration through profile-based sample execution or simulation. It needs little extra hardware, but it requires analysis in advance and has to examine all possible combinations of parameters [3]. The dynamic approach adjusts the parameters automatically during execution. It is more automatic and more widely applicable, but it needs more extra hardware, and the exploration itself might interfere with system behavior.

E-Mail: zhaohuatao@ruri.waseda.jp

In this paper, we use the static approach to evaluate the relationship among cache size, miss rate and power consumption.
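The exhaustive exploration behind the static approach can be sketched as follows. This is a minimal illustration, not the authors' tool flow: `CacheConfig`, `enumerate_configs`, `best_config` and the `cost` callback are hypothetical names, line size is given in bytes for convenience, and in practice the cost of each configuration would come from a profiling or simulation run.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, NamedTuple


class CacheConfig(NamedTuple):
    """One candidate L1 configuration (hypothetical container)."""
    size_kb: int      # total cache capacity in KB
    line_bytes: int   # line size in bytes (assumed unit)
    ways: int         # associativity


def enumerate_configs(sizes_kb: Iterable[int],
                      line_sizes: Iterable[int],
                      ways_options: Iterable[int]) -> Iterator[CacheConfig]:
    """Yield every combination of parameters that has at least one set."""
    for size_kb, line_bytes, ways in product(sizes_kb, line_sizes, ways_options):
        if size_kb * 1024 >= line_bytes * ways:
            yield CacheConfig(size_kb, line_bytes, ways)


def best_config(configs: Iterable[CacheConfig],
                cost: Callable[[CacheConfig], float]) -> CacheConfig:
    """Return the configuration with the lowest cost.

    `cost` stands in for an offline evaluation (for example, estimated
    power from a simulation run); it is an assumption of this sketch.
    """
    return min(configs, key=cost)
```

Because every combination is evaluated, the number of offline runs grows with the product of the option counts, which is the cost of the static approach noted above.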

### Proposition:

For a fixed application, we consider parameters such as the cache size, the line size and the cache access mode.





Cache size has a decisive impact on the miss rate. As shown in Fig. 1, the miss rate decreases almost linearly as the cache size increases. Although a low miss rate reduces dynamic power consumption, a large cache also increases the energy consumed by the hardware itself, so a trade-off has to be made. A larger cache suits non-stationary workloads, which need enough capacity to handle every possible application.
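This trade-off can be made concrete with a simple first-order energy model. The sketch below is an assumed, commonly used formulation rather than the model used in this paper: dynamic energy is charged per access and per miss, and static energy is leakage power multiplied by run time. All field names and units are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheRun:
    """Inputs for one cache configuration (hypothetical field names)."""
    accesses: int                 # total cache accesses
    misses: int                   # misses among those accesses
    energy_per_access_nj: float   # dynamic energy per access (nJ)
    energy_per_miss_nj: float     # extra energy per miss, e.g. next-level access (nJ)
    leakage_power_mw: float       # static power of the cache (mW)
    runtime_ms: float             # execution time (ms)


def total_energy_nj(run: CacheRun) -> float:
    """First-order estimate: dynamic energy plus static (leakage) energy."""
    dynamic = (run.accesses * run.energy_per_access_nj
               + run.misses * run.energy_per_miss_nj)
    # 1 mW * 1 ms = 1 microjoule = 1000 nJ
    static = run.leakage_power_mw * run.runtime_ms * 1000.0
    return dynamic + static
```

Under this model, a larger cache lowers `misses` but typically raises `energy_per_access_nj` and `leakage_power_mw`, so the total can have a minimum at an intermediate size.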

Moreover, for the set-associative access mode, Fig. 2 shows that access time increases with associativity [4] and also grows with cache size.



Figure 2: Access Time vs Cache Size with different way associativities (Bzip2 application).
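How cache size, line size and associativity interact can be illustrated by deriving the cache geometry from them. The helper below is a sketch with a hypothetical name (`cache_geometry`); it assumes power-of-two sizes, byte-addressed lines and a 32-bit address, none of which is stated in the paper. At a fixed capacity, doubling the associativity halves the number of sets and adds one tag bit, and each access has to compare more tags, which is one commonly cited reason for the longer access time.

```python
def cache_geometry(size_bytes: int, line_bytes: int, ways: int,
                   address_bits: int = 32) -> tuple[int, int, int, int]:
    """Return (sets, offset_bits, index_bits, tag_bits) for a set-associative cache."""
    def is_pow2(n: int) -> bool:
        return n > 0 and n & (n - 1) == 0

    if ways <= 0 or not is_pow2(line_bytes):
        raise ValueError("line size must be a power of two and ways positive")
    if size_bytes % (line_bytes * ways) != 0:
        raise ValueError("size must be a multiple of line_bytes * ways")

    sets = size_bytes // (line_bytes * ways)
    if not is_pow2(sets):
        raise ValueError("number of sets must be a power of two")

    offset_bits = line_bytes.bit_length() - 1   # log2(line_bytes)
    index_bits = sets.bit_length() - 1          # log2(sets)
    tag_bits = address_bits - index_bits - offset_bits
    return sets, offset_bits, index_bits, tag_bits
```

For example, `cache_geometry(128 * 1024, 64, 4)` describes a 128 KB, 4-way cache with 64-byte lines.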

We can then search for a reasonable cache size that optimizes the power consumption [5] under the designated process technology.

### Simulation:

Using the SPEC2006 benchmark [6] on Simple-scalar 3.0 [7] together with parameters from CACTI 6.5, we estimate the power consumption. Fig. 3 shows the results for a cache line size of 64 bits, two banks, and a 45 nm process technology.



Figure 3: Power consumption (Bzip2 application)

As shown in Fig. 3, the lowest power consumption is achieved when the cache size is 128 KB. For the Bzip2 application, this is the L1 cache configuration that best balances the parameters considered.

### Conclusion:

The experimental results show that the power consumption of an embedded system is closely related to cache size, and that optimized values can be obtained for a specified application such as the Bzip2 benchmark.

We will experiment with other benchmarks, including six integer benchmarks and five floating-point benchmarks, to verify the applicability of this proposal. We will also examine the other parameters to arrive at a comprehensive choice of optimized cache configuration.

This work is partially supported by JSPS KAKENHI 2350006.

### References:

[1] Ye J., Watanabe T., "A behavior-based reconfigurable cache for the low-power embedded processor", 2011 IEEE 9th International Conference on ASIC (ASICON), pp. 1-5, Oct. 2011.

[2] Zhang C., Yang J., Vahid F., "Low static-power frequent-value data caches", The Design, Automation and Test in Europe Conference and Exhibition, Paris, France, Vol. 1, pp. 214-219, Feb. 2004.

[3] Liu W., Huang M. C., "EXPERT: expedited simulation exploiting program behavior repetition", ICS '04: Proceedings of the 18th Annual International Conference on Supercomputing, pp. 126-135, June 2004.

[4] Chi Zhang, Xiang Wang, "Dynamic time tuning for way prediction cache in low power embedded processors", 28th Digital Avionics Systems Conference, pp. 7.E.1-1 - 7.E.1-8, October 25-29, 2009.

[5] Chuanjun Zhang, Frank Vahid, "A Highly Configurable Cache Architecture for Embedded Systems", Proceedings of the Int. Symp. on Computer Architecture, San Diego, CA, pp. 136-146, June 2003.

[6] SPEC2006, Standard Performance Evaluation Corporation, http://www.specbench.org.

[7] Simplescalar LLC, Infrastructure for hardware modeling and software analysis, http://www.simplescalar.com.