SmartApprox: Learning-based Configuration of Approximate Memories for Energy-efficient Execution

Our participation in the 12th International Green and Sustainable Computing Conference (IGSC)

Abstract

Approximate memories reduce power and increase energy efficiency, at the expense of errors in stored data. These errors may be tolerated, up to a point, by many applications with negligible impact on the quality of results. Uncontrolled errors in memory may, however, lead to crashes or broken outputs. Error rates are determined by fabrication and operation parameters, and error tolerance depends on algorithms, implementation, and inputs. An ideal configuration features parameters for approximate memory that minimize energy while allowing applications to produce acceptable results. This work introduces SmartApprox, a framework that configures approximation levels based on features of applications. In SmartApprox, a training phase executes a set of applications under different approximation settings, building a knowledge base that correlates application features (e.g., types of instructions and cache efficiency) with suitable approximate memory configurations. At runtime, features of new applications are sampled and approximation knobs are adjusted to correspond to the predicted error tolerance, according to existing knowledge and the current error scenario, in consonance with hardware characterization. In this work, we list and discuss sets of features that influence the approximation results and measure their impact on the error tolerance or applications. We evaluate SmartApprox on different voltage-scaled DRAM scenarios using a knowledge base of 26 applications, wherein energy savings of 36% are possible with acceptable output. An evaluation using a combined energy and quality metric shows that SmartApprox scores 97% of an exhaustive search for ideal configurations, with significantly lower effort and without application-specific quality evaluation.

Content

Video presentation

Presentation slides (to appear)
Link to conference proceedings (to appear)
Link to data and source code of SmartApprox

Applications

The experimental evaluation of SmartApprox was performed with 26 applications on the knowledge base.

The following table contains the complete list of applications, their source, quality metric used, input workload, execution time with AxPIKE, and total instructions executed.

application	benchmark	quality metric	input	exec. time (s)	total instructions
2mm	Polybench [5]	MAPE	32x32 double matrices	46	8,62E+07
atax	Polybench [5]	FEE	500x500 double matrices	43	6,72E+07
blackscholes	AxBench [1]	MAPE	1,000 entries	45	8,20E+07
bunzip2	CBench [3]	SSI	compressed 256x256 image	43	5,52E+07
bzip2	CBench [3]	SSI	256x256 image	109	1,11E+08
correlation	Polybench [5]	FEE	64x64 long int matrices	26	4,77E+07
covariance	Polybench [5]	FEE	32x32 double matrices	61	9,09E+07
dijkstra	MiBench [4]	FEE	complete graph of 60 vertices	18	3,37E+07
fannkuch-redux	Benchmarks Game [2]	MAPE	N=9	64	1,29E+08
fasta	Benchmarks Game [2]	FEE	N=50,000	10	2,15E+07
fft	MiBench [4]	MAPE	Max waves 8, size 1024	82	1,56E+08
floyd-warshall	Polybench [5]	FEE	complete graph of 64 vertices	68	1,30E+08
inversek2j	AxBench [1]	MAPE	1,000 coordinates	59	9,82E+07
jacobi-2d-imper	Polybench [5]	FEE	Arrays of 32 double values	36	6,39E+07
jmeint	AxBench [1]	FEE	1,0000 coordinates	62	9,77E+07
jpeg	AxBench [1]	SSI	512x512 image	127	2,30E+08
k-nucleotide	Benchmarks Game [2]	MAPE	2 sequences of 50,000 nucleotides	121	9,79E+07
mandelbrot	Benchmarks Game [2]	SSI	N=500	28	7,49E+07
mm		MAPE	100x100 int matrices	42	8,01E+07
nbody	Benchmarks Game [2]	MAPE	N=100,000	35	6,48E+07
pi		MAPE	N=1,000,000	17	3,53E+07
qsort	MiBench [4]	FEE	10,000 strings	36	6,68E+07
reg_detect	Polybench [5]	MAPE	Iter=10,000, Length=32, MaxGrid=6	41	7,78E+07
reverse-complement	Benchmarks Game [2]	FEE	2 sequences of 50,000 nucleotides	30	5,67E+07
sobel		SSI	256x256 image	124	1,54E+08
spectralnorm	Benchmarks Game [2]	MAPE	N=500	48	1,26E+08

Benchmarks repositories:

[1] AxBench: http://axbench.org/

[2] Benchmarks Game: https://benchmarksgame-team.pages.debian.net/benchmarksgame/

[3] CBench: http://ctuning.org/cbench

[4] MiBench: http://vhosts.eecs.umich.edu/mibench/

[5] Polybench: https://web.cse.ohio-state.edu/~pouchet.2/software/polybench/

The source code of the applications are available at https://github.com/VArchC/apps

Quality Metrics

Quality metrics quantify how different is an approximate output compared to the accurate output. Thus, the quality metric is resulting from a comparison between the outputs of accurate and approximate executions of the same code with the same input. The quality metrics express how much the approximate output data array (F) deviates from an accurate output data array (A). Below, we show the quality metrics used for each application.

Fraction of Equal Elements (FEE)

FEE

where:
n: the number of elements in the data array;
A_i: the i-th element in the accurate array;
F_i: the i-th element in the approximate array.

Mean Absolute Percentage Error (MAPE) - https://en.wikipedia.org/wiki/Mean_absolute_percentage_error

where:
n: the number of elements in the data array;
A_i: the i-th element in the accurate array;
F_i: the i-th element in the approximate array.

Structural Similarity Index (SSI) - http://doi.org/10.1109/TIP.2003.819861 , https://en.wikipedia.org/wiki/Structural_similarity

SSI

where:
μ_x: the average of the array x;
δx^2: the variance of the array x;
δ{xy}: the covariance of x and y;
c_1 = (k_1L)^2, c_2=(k_2L)^2 : two variables to stabilize the division with weak denominator;
L: the dynamic range of the pixel values (typicaly 2^{#bits per pixel} - 1);
k_1 = 0.01 and k_2=0.03.

Cite us

@INPROCEEDINGS{SmartApprox-IGSC2021,
  author={Jo\~ao {Fabr\'icio Filho} and Isa\'ias Felzmann and Lucas Wanner},
  booktitle={12th International Green and Sustainable Computing Conference (IGSC)}, 
  title={{SmartApprox: learning-based configuration of approximate memories for energy-efficient execution,}}, 
  year={2021}
}

[Download BibTeX]