Convergence properties of the Goldschmidt algorithm for the square root?

Programming, for all ages and all languages.
nullplan
Member
Posts: 1808
Joined: Wed Aug 30, 2017 8:24 am

Convergence properties of the Goldschmidt algorithm for the square root?

Post by nullplan »

Hi all,

I am currently implementing a libc for fun and profit. Part of that is obviously a square root function. On most systems, this can be implemented by just running the correct opcode, since it is usually implemented in hardware. However, not all of the platforms on my current list necessarily have an FPU, or a hardware square root instruction. On PowerPC, for example, the FPU is optional, and even where present, the platform might be an old one from back before fsqrt was part of the FPU category, and might just not implement it. Also, I don't want to hamper future porting efforts by requiring a hardware square root.

So a software square root must be part of a complete libc. As far as I can tell, the currently favored algorithm for that is the Goldschmidt algorithm. According to Wikipedia, it calculates the square root of a number s like this:

Code:

init:
b = s
t = estimate of 1/√s
x = s*t

iterate:
b *= t²
t = (3 - b) / 2
x *= t

The iteration is continued until b is sufficiently close to 1, or a fixed number of times. And it is precisely this number of times I want to know more about. How many iterations do I need for 24, 53, 64, and 113 bits of convergence? And how close does the initial estimate of t have to be (i.e. how large of a lookup-table do I need)?
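For reference, here is a minimal floating-point sketch of the iteration exactly as written above (the test values and the choice of double precision are mine, not from any particular libc). Each pass roughly doubles the number of correct bits, which is the quadratic convergence the iteration-count question is about.

```c
#include <math.h>

/* Goldschmidt square root as quoted above.  s should already be
 * range-reduced to something sensible, t0 is a rough estimate of
 * 1/sqrt(s), and iters is the fixed iteration count. */
static double goldschmidt_sqrt(double s, double t0, int iters)
{
    double b = s;       /* converges to 1        */
    double t = t0;      /* correction factor     */
    double x = s * t;   /* converges to sqrt(s)  */
    for (int i = 0; i < iters; i++) {
        b *= t * t;
        t = (3.0 - b) / 2.0;
        x *= t;
    }
    return x;
}
```

With a starting estimate good to about 4 bits (e.g. s = 0.7, t0 = 1.2), the error after 1, 2, 3 iterations drops to roughly 2e-5, 1e-9, and below the double-precision rounding floor, consistent with quadratic convergence.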

I looked at Goldschmidt's paper, but that doesn't even talk about the square root. It talks about round-off error, but since I cannot understand how the algorithm was adapted to the square root, I also don't know how that section relates to my quest. I also tried to answer the question myself by starting with t = (1 + ε)/√s and going from there, but that only got me stuck with a huge expression continuously getting larger and never quite looping.

I also tried asking ChatGPT, but it only told me about the Babylonian method, while insisting that was the Goldschmidt algorithm.
Carpe diem!
Octocontrabass
Member
Posts: 5625
Joined: Mon Mar 25, 2013 7:01 pm

Re: Convergence properties of the Goldschmidt algorithm for the square root?

Post by Octocontrabass »

nullplan wrote: Wed Jan 22, 2025 2:18 pm
How many iterations do I need for 24, 53, 64, and 113 bits of convergence? And how close does the initial estimate of t have to be (i.e. how large of a lookup-table do I need)?
According to musl, a 128-entry lookup table with 16-bit approximations is good enough for 24 bits in two iterations, 53 bits in three iterations, and 113 bits in four iterations.
nullplan wrote: Wed Jan 22, 2025 2:18 pm
I also tried to answer the question myself by starting with t = (1 + ε)/√s and going from there, but that only got me stuck with a huge expression continuously getting larger and never quite looping.
I'm pretty sure the Goldschmidt square root algorithm won't converge if the intermediate calculations don't have enough precision. Some papers describing floating-point implementations mention that fused multiply-add operations are required to avoid promotion to a bigger type.
nullplan
Member
Posts: 1808
Joined: Wed Aug 30, 2017 8:24 am

Re: Convergence properties of the Goldschmidt algorithm for the square root?

Post by nullplan »

I will achieve sufficient precision by using fixed-point arithmetic. In a binary FP system, it is trivial to reduce the square root of any number down to the square root of a number between 1/4 and 1. And so the Goldschmidt algorithm can be implemented, with a bit of care, so that no intermediate result is 4 or greater. Therefore it is practical to implement this in fixed-point arithmetic with 2 integer bits, and the normal integer sizes of 32, 64, and 128 bits give me more precision than single, double, and quadruple precision respectively.

Of course this means I'll have to implement expanding 128->256 bit multiplication when my language gives me 64-bit integers at the largest.
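The building block for that is the classic widening multiply from half-width limbs; here is a hedged sketch of a 64x64 -> 128 bit multiply built from 32-bit halves using only 64-bit arithmetic (chaining four of these schoolbook-style gives the 128 -> 256 bit product; the function name is mine).

```c
#include <stdint.h>

/* 64x64 -> 128 bit unsigned multiply using only 64-bit operations.
 * Splits each operand into 32-bit halves and sums the four partial
 * products, tracking the carry out of the middle column. */
static void mul64x64_128(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;

    /* middle column: high half of p0 plus low halves of the cross
     * terms; fits in 64 bits since each addend is below 2^32 */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    *lo = (mid << 32) | (uint32_t)p0;
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}
```

Compilers that offer unsigned __int128 (or MSVC's _umul128) will do this in one or two instructions, so the portable version is only needed as a fallback.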

Alright, I will just use the musl numbers and hope nobody sues me for copyright infringement.
Carpe diem!