Performance Improvement Techniques in Tightly Coupled Multicore Architectures for Single-Thread Applications

Keita, Doi; Ryota, Shioya; Hideki, Ando

https://ipsj.ixsq.nii.ac.jp/records/189999

Name / File	License
IPSJ-JNL5906003.pdf (1.6 MB)	Copyright (c) 2018 by the Information Processing Society of Japan
Open access

Item type

Journal(1)

Publication date

2018-06-15

Title

Performance Improvement Techniques in Tightly Coupled Multicore Architectures for Single-Thread Applications

Language

eng

Keywords

Subject scheme

Other

Subject

[Regular Paper] multicore, single-thread performance

Resource type

Resource type identifier

http://purl.org/coar/resource_type/c_6501

Resource type

journal article

Author affiliation

Department of Electrical Engineering and Computer Science, Nagoya University / Presently with Okuma Corporation

Author affiliation

Department of Information and Communication Engineering, Nagoya University

Author affiliation

Department of Information and Communication Engineering, Nagoya University

Author names

Keita, Doi
Ryota, Shioya
Hideki, Ando

Abstract

Description type

Other

Description

Current multicore processors achieve high throughput by executing multiple independent programs in parallel. However, it is difficult to use multiple cores effectively to reduce the execution time of a single program, because of problems such as slow inter-thread communication and high-overhead thread creation. Dramatic improvements in single-core architecture have reached their limit; thus, multiple cores must be used effectively to reduce single-program execution time. Tightly coupled multicore architectures provide a potential solution because of their very low-latency inter-thread communication and very lightweight thread creation. One such architecture, called SKY, has been proposed. SKY has shown its effectiveness in multithreaded execution of a single program, but several problems must be overcome before further performance improvements can be achieved. This paper focuses on the following problems: 1) The SKY compiler partitions programs at the basic block level but does not explore the inside of basic blocks, and so misses opportunities to find a good partitioning. 2) The SKY processor always sequentializes a new thread if the core on which it is supposed to be created is busy, which is not necessarily a good decision. 3) If the execution of a register communication instruction among cores is delayed, other register communication instructions can also be delayed, causing the execution of the following thread to stall. This situation occurs when the instruction window of a core becomes full. To address these problems, we propose the following three software and hardware techniques: 1) Instruction-level thread partitioning: the compiler explores the inside of basic blocks to find a better program partition. 2) Selective thread creation: the hardware selectively either sequentializes a new thread or waits to create it, whichever achieves better performance. 3) Automatic register communication: register communication is performed automatically with a small amount of hardware support instead of using instruction window resources. We evaluated the performance of SKY using SPEC2000 benchmark programs. Results on four cores show that the proposed techniques improved performance by 4% and 26% on average (maximum of 11% and 206%) for SPECint2000 and SPECfp2000 programs, respectively, compared with the case where the proposed techniques are not applied. As a result, performance improvements of 1.21 and 1.93 times on average (maximum of 1.52 and 3.30 times) were achieved, respectively, compared with the performance of a single core.
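The trade-off behind selective thread creation (technique 2) can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not the mechanism used by SKY: the abstract does not describe how the hardware makes its decision, and the names `ForkPoint`, `time_if_sequentialized`, `time_if_waiting`, and `choose` are hypothetical. The model also assumes the remaining work and the time until the target core frees up are known exactly, which real hardware would have to estimate.

```python
from dataclasses import dataclass


@dataclass
class ForkPoint:
    """Hypothetical description of one fork decision (all values in cycles)."""
    parent_remaining: int  # parent thread's work left after the fork point
    child_length: int      # work in the thread that would be forked
    core_free_in: int      # time until the busy target core becomes free


def time_if_sequentialized(f: ForkPoint) -> int:
    # No new thread is created: the forking core executes the parent's
    # remaining work and the child's work one after the other.
    return f.parent_remaining + f.child_length


def time_if_waiting(f: ForkPoint) -> int:
    # Toy assumption: the parent stalls until the target core is free,
    # then the parent and the new thread run in parallel.
    return f.core_free_in + max(f.parent_remaining, f.child_length)


def choose(f: ForkPoint) -> str:
    # Pick whichever option finishes earlier under this model.
    if time_if_waiting(f) < time_if_sequentialized(f):
        return "wait"
    return "sequentialize"


# A long child and a short wait: waiting wins (110 cycles vs. 200).
print(choose(ForkPoint(parent_remaining=100, child_length=100, core_free_in=10)))
# A short child and a long wait: sequentializing wins (105 cycles vs. 150).
print(choose(ForkPoint(parent_remaining=100, child_length=5, core_free_in=50)))
```

Under this model, always sequentializing loses whenever the target core frees up soon and the child thread is long, which is consistent with the observation in the abstract that sequentializing is not necessarily a good decision.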
------------------------------
This is a preprint of an article intended for publication in the Journal of
Information Processing (JIP). This preprint should not be cited. This
article should be cited as: Journal of Information Processing Vol.26(2018) (online)
DOI http://dx.doi.org/10.2197/ipsjjip.26.445
------------------------------

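Bibliographic record ID

Source identifier type

NCID

Source identifier

AN00116647

Bibliographic information

IPSJ Journal (情報処理学会論文誌)

Vol. 59, No. 6, issue date 2018-06-15

ISSN

Source identifier type

ISSN

Source identifier

1882-7764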
