I am currently implementing a libc for fun and profit. Part of that is obviously a square root function. On most systems, this can be implemented by just running the correct opcode, since it is usually implemented in hardware. However, not all of the platforms in my current list necessary have an FPU, or a hardware square root instruction. On PowerPC, for example, FPU is optional, and even where present, the platform might be an old one from back before fsqrt was part of the FPU category, and might just not implement it. Also, I don't want to hamper future porting efforts by requiring a hardware square root.
So a software square root must be part of a complete libc. As far as I can tell, the currently favored algorithm for that is the Goldschmidt algorithm. According to Wikipedia, it calculates the square root of a number s like this:
Code: Select all
init:
b = s
t = estimate of 1/√s
x = s*t
iterate:
b *= t²
t = (3 - b) / 2
x *= t
I looked at Goldschmidt's paper, but that doesn't even talk about the square root. It talks about round-off error, but since I cannot understand how the algorithm was adopted to a square root, I also don't know how that section relates to my quest. I also tried to answer the question myself by starting with t = (1 + ε)/√s and going from there, but that only got me stuck with a huge expression continuously getting larger and never quite looping.
I also tried asking ChatGPT, but it only told me about the Babylonian method, but kept insisting that was the Goldschmidt algorithm.