A Comparative Analysis of Branch Prediction Schemes
Related Work
Branch prediction performance issues have been studied extensively.
J. Smith [S81]
gave a survey of early simple static and dynamic schemes.
The best scheme in his paper is the one which uses 2-bit saturating
up/down counters to collect history information which is then used to
make predictions. This is perhaps the best-known technique.
McFarling [M93] referred to it as
bimodal branch prediction. It was also referred to as
one-level branch prediction in Yeh and Patt's paper
[YP91]. We will discuss this
scheme in more detail in later sections.
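
As a rough illustration of the 2-bit saturating up/down counter scheme described above, the following minimal sketch keeps one counter per table entry and indexes the table with the low bits of the branch address. The class name, the table size, the weakly-not-taken initial state, and the `accuracy` helper are assumptions made for this example; they are not taken from any of the cited papers.

```python
class BimodalPredictor:
    """Table of 2-bit saturating up/down counters indexed by branch address."""

    def __init__(self, index_bits=10):
        # Table size and initial counter value are illustrative assumptions.
        self.mask = (1 << index_bits) - 1
        self.counters = [1] * (1 << index_bits)  # 1 = weakly not-taken

    def _index(self, pc):
        # Use the low-order bits of the branch address as the table index.
        return pc & self.mask

    def predict(self, pc):
        # Counter values 2 and 3 predict taken; 0 and 1 predict not-taken.
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        # Count up on a taken branch, down otherwise, saturating at 0 and 3.
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def accuracy(predictor, trace):
    """Fraction of correct predictions over a list of (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct / len(trace)


if __name__ == "__main__":
    # Hypothetical trace: a loop branch taken 9 times, then not taken, 10 times over.
    loop_trace = ([(0x40, True)] * 9 + [(0x40, False)]) * 10
    print(accuracy(BimodalPredictor(), loop_trace))
```

Because the counter must be wrong twice in a row before its prediction flips, the single not-taken outcome at each loop exit costs only one misprediction per loop in this trace.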
Lee and Smith [LS92]
evaluated several branch prediction schemes. In addition, they
addressed how to use branch target buffers to reduce the delay
due to target address calculation.
McFarling and Hennessy [MH86]
compared various hardware and software approaches to reducing
branch cost including using profiling information. Fisher
and Freudenberger
[FF92]
studied the stability of profile information across separate
runs of a program. In programs with intensive control flow,
the direction of a branch is often affected by the behavior
of other branches. Observing this, Pan, So, & Rahmeh
[PSR92] and
Yeh & Patt [YP91]
independently proposed correlated branch prediction schemes,
also called two-level adaptive branch prediction schemes in Yeh and Patt's
paper. Correlation schemes use both the history of a single conditional branch
and global branch history. Pan, So, and Rahmeh
[PSR92]
described how both global history and branch address information can be
used in one predictor. This approach substantially improved prediction
accuracy. Several variations of this kind of dynamic scheme exist,
differing in indexing method and buffer organization.
Yeh and Patt [YP91] gave a comparison of these approaches.
Several variations also exist in the design of the 2-bit counter
used in many of the dynamic schemes. Yeh and Patt
[YP93]
discussed these variations. McFarling
[M93]
explored the possibility of combining branch predictors
to achieve even higher prediction accuracy. He also presented a
sharing index scheme, referred to as gshare,
and a new scheme using combined predictors. Ball and Larus
[BL93]
described several techniques for guessing the most common branch
directions at compile time using static information.
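
The gshare indexing mentioned above can be sketched as follows, assuming the commonly described form in which the global branch history is XORed with the low bits of the branch address to select a 2-bit counter. The class name, table size, history length, and initial counter state are assumptions for illustration; the combined-predictor scheme is not shown.

```python
class GsharePredictor:
    """2-bit counters indexed by (branch address XOR global history)."""

    def __init__(self, index_bits=10):
        # Table size, history length, and initial state are illustrative assumptions.
        self.mask = (1 << index_bits) - 1
        self.history = 0                          # global history shift register
        self.counters = [1] * (1 << index_bits)   # 1 = weakly not-taken

    def _index(self, pc):
        # Fold the global history into the address bits with XOR.
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, taken):
        # Train the selected counter, then shift the outcome into the history.
        i = self._index(pc)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask


if __name__ == "__main__":
    # Hypothetical trace: one branch whose direction alternates every time.
    predictor = GsharePredictor(index_bits=4)
    results = []
    for step in range(200):
        taken = (step % 2 == 0)
        results.append(predictor.predict(0x40) == taken)
        predictor.update(0x40, taken)
    print(all(results[100:]))
```

A strictly alternating branch defeats a lone 2-bit counter, but here the two history patterns select two different counters, so each counter sees a consistent outcome once the history register has filled.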
Young and M. Smith
[YM94]
[YM95]
introduced the notion of static correlation branch prediction (SCBP).
In a recent paper
[GSY],
Gloy, M. Smith, and Young addressed performance issues of this approach.
They claimed better performance than some dynamic approaches.
Several studies
[JW89]
[W91]
have examined the effect of branches on available
instruction-level parallelism (ILP). These studies show that the branch
misprediction rate is a crucial parameter in determining the amount
of parallelism that can be exploited.