A Comparative Analysis of Branch Prediction Schemes

Zhendong Su and Min Zhou

Computer Science Division
University of California at Berkeley
Berkeley, CA 94720


Related Work

Branch prediction performance has been studied extensively. J. Smith [S81] surveyed early simple static and dynamic schemes. The best scheme in his paper uses 2-bit saturating up/down counters to collect history information, which is then used to make predictions. This is perhaps the best-known technique. McFarling [M93] referred to it as bimodal branch prediction, and Yeh and Patt [YP91] referred to it as one-level branch prediction. We discuss this scheme in more detail in later sections. Lee and Smith [LS92] evaluated several branch prediction schemes and also addressed how to use branch target buffers to reduce the delay due to target address calculation. McFarling and Hennessy [MH86] compared various hardware and software approaches to reducing branch cost, including the use of profiling information. Fisher and Freudenberger [FF92] studied the stability of profile information across separate runs of a program.
In many programs with intensive control flow, the direction of a branch is often affected by the behavior of other branches. Based on this observation, Pan, So, and Rahmeh [PSR92] and Yeh and Patt [YP91] independently proposed correlated branch prediction schemes, which Yeh and Patt call two-level adaptive branch prediction schemes. Correlation schemes use both the history of the individual conditional branch and the global branch history. Pan, So, and Rahmeh [PSR92] described how both global history and branch address information can be used in one predictor. This approach improved prediction accuracy substantially. Several variations of these dynamic schemes exist, differing in indexing method and buffer organization; Yeh and Patt [YP91] compared these approaches. The 2-bit counter used in many of the dynamic schemes also has several design variations, which Yeh and Patt [YP93] discussed.
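As a rough illustration of the 2-bit saturating counter scheme mentioned above, the following Python sketch keeps one counter per table entry and indexes the table with the low bits of the branch address. The class name, the table size, the initial counter value, and the helper simulate are assumptions made for this example; they are not taken from any of the cited papers.

```python
class BimodalPredictor:
    """Table of 2-bit saturating up/down counters indexed by branch address."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        # Counter values range over 0..3. Starting at 1 ("weakly not taken")
        # is an arbitrary choice made for this sketch.
        self.table = [1] * (1 << index_bits)

    def predict(self, pc):
        # Values 2 and 3 predict taken; values 0 and 1 predict not taken.
        return self.table[pc & self.mask] >= 2

    def update(self, pc, taken):
        i = pc & self.mask
        if taken:
            self.table[i] = min(3, self.table[i] + 1)  # saturate at 3
        else:
            self.table[i] = max(0, self.table[i] - 1)  # saturate at 0


def simulate(predictor, trace):
    """Return the number of correct predictions over (pc, taken) pairs."""
    correct = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            correct += 1
        predictor.update(pc, taken)
    return correct
```

On a loop branch that is taken nine times and then falls through, such a counter mispredicts the loop exit but, because one not-taken outcome only moves it from 3 to 2, it still predicts taken when the loop is entered again.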
McFarling [M93] explored the possibility of combining branch predictors to achieve even higher prediction accuracy. He also presented an index-sharing scheme, referred to as gshare, and a new scheme using combined predictors. Ball and Larus [BL93] described several techniques for guessing the most common branch directions at compile time using static information. Young and M. Smith [YM94] [YM95] introduced the notion of static correlated branch prediction (SCBP). In a recent paper [GSY], Gloy, M. Smith, and Young addressed performance issues of this approach and claimed better performance than some dynamic approaches. Several studies [JW89] [W91] have examined the implications of branches for the available instruction-level parallelism (ILP). These studies show that the branch misprediction rate is a crucial parameter in determining the amount of parallelism that can be exploited.
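To make the gshare idea concrete, the sketch below forms the table index by XORing the branch address with a global history register, so that address and history share one index into a table of 2-bit counters. It is a minimal Python sketch under assumed parameters: the class name, the table size, the history length (set equal to the index width), and the initial counter value are choices made for illustration only.

```python
class GsharePredictor:
    """2-bit counters indexed by (branch address XOR global history)."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        # Counters range over 0..3; the starting value 1 is an assumption.
        self.table = [1] * (1 << index_bits)
        # Outcomes of the most recent branches, newest in the low bit.
        self.history = 0

    def _index(self, pc):
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
        # Shift the outcome into the global history register.
        self.history = ((self.history << 1) | int(taken)) & self.mask
```

With this indexing, a branch that strictly alternates between taken and not taken maps to two different counters depending on the recent history, so after a short warm-up both outcomes can be predicted correctly, which a single per-address counter cannot do.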

