Review

Stochastic rounding: implementation, error analysis and applications

Matteo Croci et al. R. Soc. Open Sci. 2022 Mar 9;9(3):211631. doi: 10.1098/rsos.211631. eCollection 2022 Mar.

Abstract

Stochastic rounding (SR) randomly maps a real number x to one of the two nearest values in a finite precision number system. The probability of choosing either of these two numbers is 1 minus their relative distance to x. This rounding mode was first proposed for use in computer arithmetic in the 1950s and it is currently experiencing a resurgence of interest. If used to compute the inner product of two vectors of length n in floating-point arithmetic, it yields an error bound with constant √n·u with high probability, where u is the unit round-off. This is not necessarily the case for round to nearest (RN), for which the worst-case error bound has constant n·u. A particular attraction of SR is that, unlike RN, it is immune to the phenomenon of stagnation, whereby a sequence of tiny updates to a relatively large quantity is lost. We survey SR by discussing its mathematical properties and probabilistic error analysis, its implementation, and its use in applications, with a focus on machine learning and the numerical solution of differential equations.
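Stagnation is easy to reproduce with any low-precision type. The sketch below is our own illustration in NumPy's binary16, not code from the paper: around 1000 the spacing of binary16 numbers is 0.5, so under RN an update of 0.2 is rounded away after every single addition.

```python
import numpy as np

# Stagnation under round to nearest (RN): in binary16 the spacing of the
# representable numbers around 1000 is 0.5, so 1000 + 0.2 rounds straight
# back to 1000 and the update is lost -- every time.
x = np.float16(1000.0)
for _ in range(1000):
    x = np.float16(float(x) + 0.2)   # round to nearest after each addition
print(float(x))                      # 1000.0: all one thousand updates lost
```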

Keywords: IEEE 754; bfloat16; binary16; floating-point arithmetic; machine learning; rounding error analysis.


Conflict of interest statement

We declare we have no competing interests.

Figures

Figure 1.
Stochastic rounding rounds the real number x to the next smaller or the next larger value in F, which we denote by ⌊x⌋ and ⌈x⌉, respectively. In the example on the left, x is one quarter of the way between ⌊x⌋ and ⌈x⌉, thus RN will round x to ⌊x⌋, while mode 1 SR will round it to ⌈x⌉ with probability q(x) = 0.25 and to ⌊x⌋ with probability 1 − q(x) = 0.75. In the example on the right, w is three quarters of the way between ⌊w⌋ and ⌈w⌉, thus RN will round w to ⌈w⌉, while mode 1 SR will round it to ⌈w⌉ with probability q(w) = 0.75 and to ⌊w⌋ with probability 1 − q(w) = 0.25.
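Mode 1 SR as described in this caption is straightforward to prototype. The sketch below is our own NumPy illustration (the helper name `sr16` is ours): it rounds a float64 value to binary16, rounding up with probability equal to the fractional distance to the lower neighbour, and then checks the left-hand example of figure 1, where q(x) = 0.25.

```python
import numpy as np

rng = np.random.default_rng(0)

def sr16(y):
    """Mode-1 stochastic rounding of a float64 value to binary16."""
    rn = np.float16(y)                     # round to nearest as a starting point
    if float(rn) == y:
        return rn                          # y is representable: nothing to round
    lo = rn if float(rn) < y else np.nextafter(rn, np.float16(-np.inf))
    hi = rn if float(rn) > y else np.nextafter(rn, np.float16(np.inf))
    q = (y - float(lo)) / (float(hi) - float(lo))   # fractional distance to lo
    return hi if rng.random() < q else lo           # round up with probability q

# Figure 1, left: x lies a quarter of the way between its binary16
# neighbours 1 and 1 + 2^-10, so the probability of rounding up is 0.25.
x = 1.0 + 0.25 * 2.0**-10
samples = [float(sr16(x)) for _ in range(20_000)]
up_fraction = sum(s > 1.0 for s in samples) / len(samples)
print(up_fraction)                  # ~0.25: rounds up a quarter of the time
print(sum(samples) / len(samples))  # ~x: mode 1 SR is unbiased in expectation
```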
Figure 2.
Alignment of bits in algorithms for stochastic rounding based on sums. The random bits are added to the significand m_t, which is then truncated. How the bits are generated and added depends on the implementation: we may add only k random bits, aligned with the top k bits of the bottom part of the significand, and then use a carry-out bit to control the rounding of m_r after the truncation; or we may pack the k random bits in a word as long as m_t and add it to m_t using integer arithmetic, so that the propagating carry causes rounding in the top p bits.
Figure 3.
Example: rounding a binary32 number y to a format with p = 11 significant digits (including the implicit bit) using the SR algorithm implemented in QPyTorch. The number n is the integer sum of the two bit strings representing y and m; fl(y) is obtained by zeroing out the 24 − p = 13 trailing bits of n. For floating-point numbers, the three binary strings represent the sign (s), the unbiased exponent (e) and the integer significand without the implicit bit (m); the last group is further divided into a group of p − 1 bits to keep and 24 − p bits to zero out. We also report the hexadecimal strings representing the numbers and, for the floating-point numbers, the corresponding exact decimal representations.
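The add-and-truncate scheme of figures 2 and 3 can be sketched in a few lines for the binary32 → p = 11 case. This is our own illustration of the idea, not QPyTorch's code; it assumes a finite input, uses Python's `struct` and `random` modules, and the function name is ours. Adding 13 uniformly random bits below the kept part makes the carry into the kept bits occur with exactly the mode-1 SR probability.

```python
import struct, random

random.seed(0)

def sr_to_p11(y):
    """Mode-1 SR of a finite binary32 value to p = 11 significant bits.

    Add 13 random bits to the 24 - p = 13 trailing significand bits, let the
    carry propagate into the kept bits, then zero out the trailing bits.
    """
    bits = struct.unpack('<I', struct.pack('<f', y))[0]  # reinterpret as uint32
    n = (bits + random.getrandbits(13)) & 0xFFFFFFFF     # add the random bits
    n &= ~0x1FFF                                         # truncate the low 13 bits
    return struct.unpack('<f', struct.pack('<I', n))[0]

# Rounding 1/3 repeatedly yields only the two p = 11 neighbours of 1/3,
# and the sample mean stays close to 1/3 (unbiasedness).
vals = [sr_to_p11(1 / 3) for _ in range(10_000)]
print(sorted(set(vals)))
print(sum(vals) / len(vals))
```

Note that the input is first rounded to binary32 by `struct.pack`, so the sketch stochastically rounds fl(y) rather than the exact real y, matching the setting of figure 3.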
Figure 4.
Relative errors for computing ∑_{i=1}^{n} 1/i with RN and SR. The densely dashed and dash-dotted lines are the worst-case error bound for RN and the probabilistic error bound for SR (with λ = 1), respectively. Stochastic rounding experiments are repeated 10 times; the solid line represents the average error, the edges of the shaded area the minimum and maximum error. (a) binary16 arithmetic, (b) bfloat16 arithmetic.
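An experiment in the same spirit can be imitated in software-simulated binary16. The sketch below is our own construction (the paper's experiments use their own simulator): it accumulates the harmonic sum with RN, which stagnates once the terms drop below half an ulp of the running sum, and with mode-1 SR, which keeps tracking the sum.

```python
import numpy as np

rng = np.random.default_rng(0)

def sr16(y):
    # mode-1 stochastic rounding of a float64 value to binary16
    rn = np.float16(y)
    if float(rn) == y:
        return rn
    lo = rn if float(rn) < y else np.nextafter(rn, np.float16(-np.inf))
    hi = rn if float(rn) > y else np.nextafter(rn, np.float16(np.inf))
    q = (y - float(lo)) / (float(hi) - float(lo))
    return hi if rng.random() < q else lo

n = 30_000
exact = sum(1.0 / i for i in range(1, n + 1))   # float64 reference

s_rn = np.float16(0.0)
s_sr = np.float16(0.0)
for i in range(1, n + 1):
    s_rn = s_rn + np.float16(1.0 / i)   # RN in binary16: stagnates around 7
    s_sr = sr16(float(s_sr) + 1.0 / i)  # SR: each addition rounded stochastically

err_rn = abs(float(s_rn) - exact) / exact
err_sr = abs(float(s_sr) - exact) / exact
print(err_rn, err_sr)   # RN loses tens of per cent; the SR error is far smaller
```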
Figure 5.
Backward error for computing y = Ax with RN and SR, where A ∈ ℝ^(100×n) has entries drawn from the uniform distribution over [0, 10⁻³] and x ∈ ℝ^n has entries sampled from the uniform distribution over [0, 1]. The densely dashed and dash-dotted lines are the worst-case error bound for RN and the probabilistic error bound for SR (with λ = 1), respectively. Stochastic rounding experiments are repeated 10 times; the solid line represents the average error, and the edges of the shaded area the minimum and maximum error. (a) binary16 arithmetic, (b) bfloat16 arithmetic.
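A vectorized variant of the same comparison for y = Ax; again our own sketch rather than the paper's code, with every product and every accumulation rounded to binary16, and the error measured as a relative forward error in the 2-norm for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)

def sr16v(z):
    """Vectorised mode-1 stochastic rounding of a float64 array to binary16."""
    rn = z.astype(np.float16)
    rn64 = rn.astype(np.float64)
    lo = np.where(rn64 <= z, rn, np.nextafter(rn, np.float16(-np.inf)))
    hi = np.where(rn64 >= z, rn, np.nextafter(rn, np.float16(np.inf)))
    lo64, hi64 = lo.astype(np.float64), hi.astype(np.float64)
    gap = hi64 - lo64
    # probability of rounding up = fractional distance from the lower neighbour
    q = np.divide(z - lo64, gap, out=np.zeros_like(z), where=gap > 0)
    return np.where(rng.random(z.shape) < q, hi, lo)

m, n = 100, 10_000
A = rng.uniform(0.0, 1e-3, (m, n))
x = rng.uniform(0.0, 1.0, n)
y = A @ x                                   # float64 reference

s_rn = np.zeros(m, dtype=np.float16)
s_sr = np.zeros(m, dtype=np.float16)
for j in range(n):
    # RN: both the product and the accumulation round to nearest in binary16
    s_rn = s_rn + A[:, j].astype(np.float16) * np.float16(x[j])
    # SR: round the product and the accumulation stochastically instead
    p = sr16v(A[:, j] * x[j])
    s_sr = sr16v(s_sr.astype(np.float64) + p.astype(np.float64))

err_rn = np.linalg.norm(s_rn.astype(np.float64) - y) / np.linalg.norm(y)
err_sr = np.linalg.norm(s_sr.astype(np.float64) - y) / np.linalg.norm(y)
print(err_rn, err_sr)   # RN stagnates well below the true sums; SR does not
```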
Figure 6.
Absolute errors in the forward Euler method for an ODE with exponentially decaying solutions with different floating-point arithmetics and rounding modes. Stochastic rounding experiments are repeated 10 times; the solid line represents the average error, the edges of the shaded area the minimum and maximum error. The step size is the interval length divided by n. The experiment is adapted from [37]. (a) y′ = −y, y(0) = 2⁻⁶, over [0, 1]; (b) y′ = −y/20, y(0) = 1, over [0, 2⁶].
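The stagnation mechanism behind this experiment can be reproduced with a simplified setup. The sketch below is our own (binary16 instead of the formats in the figure, y(0) = 1, and a step size chosen so that the update h·y falls below half an ulp of y): under RN the solution never moves, while SR follows the exponential decay in expectation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sr16(y):
    # mode-1 stochastic rounding of a float64 value to binary16
    rn = np.float16(y)
    if float(rn) == y:
        return rn
    lo = rn if float(rn) < y else np.nextafter(rn, np.float16(-np.inf))
    hi = rn if float(rn) > y else np.nextafter(rn, np.float16(np.inf))
    q = (y - float(lo)) / (float(hi) - float(lo))
    return hi if rng.random() < q else lo

# Forward Euler for y' = -y, y(0) = 1, over [0, 1] with n = 2^13 steps.
# In binary16 the update h*y is smaller than half an ulp of y, so RN
# rounds every step back to the previous iterate.
n = 2**13
h = 1.0 / n
y_rn = np.float16(1.0)
y_sr = np.float16(1.0)
for _ in range(n):
    y_rn = np.float16(float(y_rn) - h * float(y_rn))  # RN: stays exactly 1.0
    y_sr = sr16(float(y_sr) - h * float(y_sr))        # SR: decays on average

print(float(y_rn))   # 1.0: complete stagnation
print(float(y_sr))   # close to exp(-1) ~ 0.37
```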
Figure 7.
Computed solutions from the forward Euler method with SR and RN for the ODE system (8.4). The exact solution is the unit circle. The x- and y-axes represent u and v, respectively. Note that in (d) and (h) only a small part of the solution computed with round-to-nearest is visible (marked with an arrow), since the ODE solver failed because of stagnation. The experiment is adapted from [37]. Stochastic rounding experiments are repeated 10 times; the solid line represents the average trajectory, the edges of the shaded area the points that are farthest from the exact solution in the Euclidean distance. (a) n = 2⁵, (b) n = 2⁹, (c) n = 2¹¹, (d) n = 2¹³, (e) n = 2⁵, (f) n = 2⁹, (g) n = 2¹⁴, (h) n = 2¹⁶.
Figure 8.
Comparison between the numerical steady-state solutions obtained with RN and SR with forward Euler and the bfloat16 format for different initial conditions. All SR solutions essentially converge to the same steady state; when RN is used, by contrast, different initial conditions lead to different steady-state solutions. The noise term in the initial condition was obtained by sampling independent standard Gaussian random variables at each mesh node.
Figure 9.
Plot of the relative global rounding error in the L2 norm for the solution of (8.5) in 1D (left) and 2D (right) with forward and backward Euler (FE and BE, respectively) and the bfloat16 format. We circled the RN data points for which the solution stagnates at the initial condition. The error behaviour matches the theoretical predictions from [43]. The SR errors are average errors computed to within 2 digits of accuracy, as in [43]. The worst-case SR errors were only a small constant factor larger than the average and are thus omitted.


References

    1. Connolly MP, Higham NJ, Mary T. 2021. Stochastic rounding and its probabilistic backward error analysis. SIAM J. Sci. Comput. 43, A566-A585. (doi:10.1137/20M1334796)
    2. Higham NJ, Pranesh S. 2019. Simulating low precision floating-point arithmetic. SIAM J. Sci. Comput. 41, C585-C602. (doi:10.1137/19M1251308)
    3. Forsythe GE. 1950. Round-off errors in numerical integration on automatic machinery. Bull. Am. Math. Soc. 56, 55-64. (doi:10.1090/S0002-9904-1950-09343-4)
    4. Huskey HD. 1949. On the precision of a certain procedure of numerical integration. With an appendix by Douglas R. Hartree. J. Res. Nat. Bur. Stand. 42, 57-62. (doi:10.6028/jres.042.005)
    5. Barnes RCM, Cooke-Yarborough EH, Thomas DGA. 1951. An electronic digital computor using cold cathode counting tubes for storage. Electron. Eng. 23, 286-291.