Userland entropy generation + How to use it?

avcado
Member
Posts: 29
Joined: Wed Jan 20, 2021 11:32 am

Userland entropy generation + How to use it?

Post by avcado »

I recently got interested in true RNGs, like RANDOM.org. I spent a bit of time reading their introduction to randomness:
Another suitable physical phenomenon is atmospheric noise, which is quite easy to pick up with a normal radio. This is the approach used by RANDOM.ORG. You could also use background noise from an office or laboratory, but you'll have to watch out for patterns. The fan from your computer might contribute to the background noise, and since the fan is a rotating device, chances are the noise it produces won't be as random as atmospheric noise.
Specifically,
In comparison with PRNGs, TRNGs extract randomness from physical phenomena and introduce it into a computer.
Is this the "entropy" everyone talks about? You just take some source of chaos, e.g. lava lamps or atmospheric noise, and "feed" it into the computer?

My questions are:
  • How does one "feed" this into the computer?
  • What do I use with the entropy once I add it into the entropy pool?
I also read that the source code isn't open source, so there's not much I could look at as an example,
Q1.2: Is the source code for the generator available?

Not currently, no. Maybe we'll make it available as open source some day.
I assume that the entropy pool is just some byte array, e.g.

Code: Select all

uint8_t entropy_pool[65535];

Finally, I read in the Wikipedia entry for entropy that
[...] entropy is the randomness collected by an operating system or application for use in cryptography or other uses that require random data. This randomness is often collected from hardware sources (variance in fan noise or HDD), either pre-existing ones such as mouse movements or specially provided randomness generators.
How would I access hardware sources from userland (assume Linux) so that I can add them to the entropy pool?

Thanks in advance! (:
nullplan
Member
Posts: 1801
Joined: Wed Aug 30, 2017 8:24 am

Re: Userland entropy generation + How to use it?

Post by nullplan »

avcado wrote: Wed Nov 27, 2024 6:20 pm Is this the "entropy" everyone talks about? You just take some source of chaos, e.g. lava lamps or atmospheric noise, and "feed" it into the computer?
Well, I don't know about lava lamps, but essentially yes. For the purposes of computing, entropy is anything that is inherently random. For example, the times at which you receive user inputs, or network packets, or the time a command to the hard disk takes. Interrupt timing is a good source of entropy. You may even want to add the contents of these messages, especially mouse movements.

This, BTW, is why kernel entropy is so useful: the kernel just gets all this entropy for free.
avcado wrote: Wed Nov 27, 2024 6:20 pm My questions are:
  • How does one "feed" this into the computer?
  • What do I use with the entropy once I add it into the entropy pool?
One simple idea (that will have intolerable speed and CPU usage) is to just use a cryptographic hash algorithm. Those have the property of compression; you can add as much data as you want and you still only get a single hash out. They also have the property of entropy distribution; the output is as entropic as the input, until the entropy is as big as the hash. Then it can't rise further.

Notice that due to physics, entropy doesn't go down. This is one thing that things like /dev/random on Linux get wrong. Entropy only ever increases.

Anyway, what you can do with this idea is

Code: Select all

static unsigned char entropy_pool[SHA256_HASH_SIZE];
void add_in_entropy(const void *data, size_t len) {
  sha256_ctx ctx;
  sha256_init(&ctx);
  sha256_add_bytes(&ctx, entropy_pool, sizeof entropy_pool);
  sha256_add_bytes(&ctx, data, len);
  sha256_finalize(&ctx, entropy_pool);
}
Now you have a block of data that gets ever more random the more unpredictable stuff is mixed in.

Now, when the user is requesting random data, you can deliver an infinite stream, but you must use feedback.

Code: Select all

void get_entropy(void *data, size_t len) {
  unsigned char buf[SHA256_HASH_SIZE];
  while (len) {
    sha256_ctx ctx;
    sha256_init(&ctx);
    sha256_add_bytes(&ctx, entropy_pool, sizeof entropy_pool);
    sha256_finalize(&ctx, buf);
    add_in_entropy(buf, sizeof buf);
    size_t tlen = MIN(len, sizeof buf);
    memcpy(data, buf, tlen);
    data = (char *)data + tlen;
    len -= tlen;
  }
}
This protects the entropy pool from being observed. The entropy pool is your only source of randomness, and you don't want it to become known. This is the second reason why you want this stuff in the kernel, so it is difficult to spy on from userspace. And the feedback mechanism hides the hash block size; otherwise the data would start repeating at some point.

Now, this can be optimized: For add_in_entropy(), you don't actually need a cryptographic hashing function. You can use a non-cryptographic one, because in that case you don't need the one-way (preimage-resistance) property. You do need it in get_entropy().
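
To illustrate the optimization described above, here is a minimal sketch of a non-cryptographic mixing step. FNV-1a is chosen here purely as an example; the post does not name a specific function. The names mix_state, mix_in_entropy() and mix_state_peek() are hypothetical, and the state is only 64 bits wide, so it can never hold more than 64 bits of entropy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FNV-1a 64-bit constants (offset basis and prime). */
#define FNV64_OFFSET 0xcbf29ce484222325ULL
#define FNV64_PRIME  0x00000100000001b3ULL

/* Hypothetical 64-bit mixing state; not the 32-byte entropy_pool above. */
static uint64_t mix_state = FNV64_OFFSET;

/* Fold len bytes into the state, one FNV-1a step per byte.
 * This only stirs data in; it gives no protection if the state leaks. */
void mix_in_entropy(const void *data, size_t len) {
  const unsigned char *p = data;
  assert(p != NULL || len == 0);
  for (size_t i = 0; i < len; i++) {
    mix_state ^= p[i];
    mix_state *= FNV64_PRIME;
  }
}

/* Accessor for illustration and testing only. Anything handed to a
 * caller should still pass through a cryptographic hash first. */
uint64_t mix_state_peek(void) {
  return mix_state;
}
```

The output side, get_entropy(), would still need the cryptographic hash so that callers cannot work backwards to the state.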

Anyway, this is basically my rant about why you need this stuff in or close to the kernel. In userland, you just don't get this data, unless you get the kernel to feed you the data.
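
As a sketch of "getting the kernel to feed you the data" on Linux, the getrandom() system call (declared in <sys/random.h> since glibc 2.25) returns bytes from the kernel's generator. The wrapper name kernel_random_bytes() is hypothetical; the loop is there because a call can return fewer bytes than requested or be interrupted by a signal.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/* Fill buf with len bytes from the kernel's random number generator.
 * Returns 0 on success, -1 on error (errno is set by getrandom). */
int kernel_random_bytes(void *buf, size_t len) {
  unsigned char *p = buf;
  assert(p != NULL || len == 0);
  while (len > 0) {
    /* flags = 0: block until the kernel generator has been initialized. */
    ssize_t n = getrandom(p, len, 0);
    if (n < 0) {
      if (errno == EINTR)
        continue; /* interrupted by a signal, try again */
      return -1;
    }
    p += n;
    len -= (size_t)n;
  }
  return 0;
}
```

A caller that wants, say, a 256-bit key would pass a 32-byte buffer and check for a return value of 0.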
Carpe diem!
cardboardaardvark
Posts: 20
Joined: Mon Nov 18, 2024 10:06 pm

Re: Userland entropy generation + How to use it?

Post by cardboardaardvark »

Entropy is the opposite of order. The concept as a whole is rather fascinating. Our whole life is a struggle where we try to decrease the entropy we have to live with while the universe constantly increases the entropy around us. When you make your bed you decrease its entropy. When you wake up and kick the sheets off you increase its entropy. If you get a cut that's an increase in entropy. As you heal your body is decreasing its entropy.

In the domain you are asking about here, random numbers, entropy is considered a measurement of the magnitude of lack of order. For certain applications, such as generating a random number to use as part of a cryptographic key, you want the most entropy you can get. If your random number generator has a low entropy output and some attacker knows this, then they can reduce the space they have to search to guess the basis for your cryptographic key. That is bad.
avcado wrote: Wed Nov 27, 2024 6:20 pm Is this the "entropy" everyone talks about?
You are on the right track but strictly no. Entropy in this context refers to the quality of the randomness. You'll find it qualified as "low entropy" and "high entropy." When feeding your entropy pool you want to give it the highest entropy data you can, though a good quality algorithm can make use of even low entropy data, as any entropy can be used to increase the entropy in the pool.
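
One common way to put a number on "low" versus "high" is Shannon entropy. This is an addition for illustration and not something the post spells out. Here p(x) is the probability of symbol x, and the two worked cases are the extremes: a single repeated symbol, and bytes drawn uniformly at random.

```latex
% Shannon entropy, in bits per symbol
H(X) = -\sum_{x} p(x)\,\log_2 p(x)

% One symbol repeated forever: p = 1 for that symbol
H = -1 \cdot \log_2 1 = 0

% Uniformly random bytes: 256 values, each with p = 1/256
H = -\sum_{x=0}^{255} \tfrac{1}{256}\,\log_2 \tfrac{1}{256} = \log_2 256 = 8
```

Estimating p(x) from a sample only shows how evenly the symbols are spread. A plain counter uses every byte value equally often and still contains nothing an attacker could not predict, so a high estimate is necessary but not sufficient.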
How does one "feed" this into the computer?
You get data, all of which will have some amount of entropy varying from zero to a lot, and you feed it to the entropy pool algorithm.

Code: Select all

const char *low_entropy = "                                                       ";
const char *bit_higher_entropy = "lkjfeakhjfeahfehafhjeak;jfejafaj";
There is basically no entropy at all in low_entropy. I mashed on my keyboard for the bit_higher_entropy. The point is though that the data has some amount of entropy associated with it. On a Linux box:

Code: Select all

echo fjelajklfjea | sudo tee /dev/random > /dev/null
That just ever so slightly increased the entropy in the kernel's pool. (A plain write like this is mixed into the pool, but the kernel does not raise its own entropy estimate for it.) You used to be able to cat /dev/audio on a Linux box and see the stream of PCM coming out of your sound card. If you had a microphone hooked up to your computer and did that in a quiet room you would see a fairly consistent stream of bytes coming out. This is very low entropy. If you hummed a constant tone into your microphone you'd see the data varying, but with a very easy-to-see pattern. This is higher entropy but not great entropy. If you grabbed a washboard and a spoon and rubbed the spoon on the washboard it would get harder to see the patterns. This is better entropy.
Another suitable physical phenomenon is atmospheric noise,
If you hooked an AM radio in the middle of nowhere up to your sound card line in and tuned it to a place between stations so all you hear is hiss and crackling from lightning (possibly hundreds of miles away) you've got a very high quality entropy source. If you took the stream of PCM as bytes coming out of your sound card and fed that into /dev/random you would be introducing very high quality entropy into the kernel's entropy pool. That is what they are talking about.
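
A rough sketch of that last step, assuming the PCM bytes are already available as a file or device node: copy a bounded number of bytes from the source into the destination. The function name feed_bytes() and any source path are hypothetical. Writing to /dev/random this way mixes the bytes in without crediting the kernel's entropy estimate; crediting needs the privileged RNDADDENTROPY ioctl, which is not shown here.

```c
#include <assert.h>
#include <stdio.h>

/* Copy up to max_bytes from src_path into dst_path.
 * Returns the number of bytes written, or -1 on error. */
long feed_bytes(const char *src_path, const char *dst_path, long max_bytes) {
  assert(src_path != NULL && dst_path != NULL && max_bytes >= 0);
  FILE *src = fopen(src_path, "rb");
  if (!src)
    return -1;
  FILE *dst = fopen(dst_path, "wb");
  if (!dst) {
    fclose(src);
    return -1;
  }
  unsigned char buf[4096];
  long total = 0;
  while (total < max_bytes) {
    size_t want = sizeof buf;
    if ((long)want > max_bytes - total)
      want = (size_t)(max_bytes - total);
    size_t got = fread(buf, 1, want, src);
    if (got == 0)
      break; /* end of input or read error */
    if (fwrite(buf, 1, got, dst) != got) {
      total = -1;
      break;
    }
    total += (long)got;
  }
  if (fclose(dst) != 0)
    total = -1; /* the final flush failed */
  fclose(src);
  return total;
}
```

With a raw recording in a hypothetical file noise.pcm, a call such as feed_bytes("noise.pcm", "/dev/random", 65536) would push the first 64 KiB into the kernel's pool.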

I'm not familiar with how the entropy pool algorithms work but in terms of getting high quality entropy data into the computer it's not hard.
cardboardaardvark
Posts: 20
Joined: Mon Nov 18, 2024 10:06 pm

Re: Userland entropy generation + How to use it?

Post by cardboardaardvark »

nullplan wrote: Wed Nov 27, 2024 10:20 pm Well, I don't know about lava lamps,
Cloudflare famously uses an array of lava lamps with a camera pointed at it as their prime source of entropy. The lava lamp is considered to be extremely chaotic. As in my sound card example, you could take the individual frames of video and treat them as pure data and send it into the entropy pool.
avcado
Member
Posts: 29
Joined: Wed Jan 20, 2021 11:32 am

Re: Userland entropy generation + How to use it?

Post by avcado »

nullplan wrote: Wed Nov 27, 2024 10:20 pm One simple idea (that will have intolerable speed and CPU usage) is to just use a cryptographic hash algorithm. Those have the property of compression; you can add as much data as you want and you still only get a single hash out. They also have the property of entropy distribution; the output is as entropic as the input, until the entropy is as big as the hash. Then it can't rise further.

[...]

In userland, you just don't get this data, unless you get the kernel to feed you the data.
I think I understand what you're saying. In userland, could I do something like this?

Code: Select all

get webcam stream (from v4l2 or similar) (single frame)
               -> pass to hash function (e.g. SHA1 or SHA256)
               -> add to entropy pool
What would I then do with the entropy pool? I know about the get_entropy() function, but what would I do with the entropy I get from that? Seed some PRNG?
cardboardaardvark wrote: Wed Nov 27, 2024 10:31 pm If you hooked an AM radio in the middle of nowhere up to your sound card line in and tuned it to a place between stations so all you hear is hiss and crackling from lightning (possibly hundreds of miles away) you've got a very high quality entropy source.
I assume this is what RANDOM.org does. I do have an FM/AM/NOAA radio, though I'm not sure what this means:
tuned it to a place between stations
If, let's say, we have two stations at 1000 AM and 1250 AM, does tuning it between those two stations count as a "place between stations" (this seems like a dumb question)?

Thanks(:
nullplan
Member
Posts: 1801
Joined: Wed Aug 30, 2017 8:24 am

Re: Userland entropy generation + How to use it?

Post by nullplan »

avcado wrote: Thu Nov 28, 2024 8:27 am What would I then do with the entropy pool? I know about the get_entropy() function, but what would I do with the entropy I get from that? Seed some PRNG?
You can use the entropy gained in that way wherever randomness (unpredictability) is required. For example, to create cryptographic keys, or to perform a Diffie-Hellman handshake with someone.
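
As a small illustration of using such bytes where unpredictability is required, the sketch below turns raw random bytes into an unbiased number in a range (for example a die roll) by rejection sampling. All names are hypothetical. The entropy_fn type stands in for something like get_entropy(), and demo_source() is a deterministic stand-in that is not random and exists only so the logic can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte source: fills buf with len bytes, returns 0 on success. */
typedef int (*entropy_fn)(void *buf, size_t len);

/* Return a uniform value in [0, bound), or -1 if the source fails.
 * Bytes at or above the largest multiple of bound that fits in 256
 * are thrown away, which removes the bias of a plain modulo. */
int uniform_below(entropy_fn source, uint8_t bound) {
  assert(source != NULL && bound > 0);
  unsigned limit = 256u - (256u % bound);
  for (;;) {
    uint8_t b;
    if (source(&b, 1) != 0)
      return -1;
    if (b < limit)
      return b % bound;
  }
}

/* Roll a six-sided die: returns 1..6, or -1 if the source fails. */
int roll_die(entropy_fn source) {
  int r = uniform_below(source, 6);
  return r < 0 ? -1 : r + 1;
}

/* Deterministic stand-in for demonstration: yields 250, 251, 252, ...
 * wrapping around after 255. It is NOT a source of entropy. */
static uint8_t demo_next = 250;
int demo_source(void *buf, size_t len) {
  uint8_t *p = buf;
  for (size_t i = 0; i < len; i++)
    p[i] = demo_next++;
  return 0;
}
```

For a six-sided die the cutoff is 252, so the byte values 252 to 255 are discarded and each face corresponds to exactly 42 of the remaining values.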
Carpe diem!
cardboardaardvark
Posts: 20
Joined: Mon Nov 18, 2024 10:06 pm

Re: Userland entropy generation + How to use it?

Post by cardboardaardvark »

avcado wrote: Thu Nov 28, 2024 8:27 am If, let's say, we have two stations at 1000 AM and 1250 AM, does tuning it between those two stations count as a "place between stations" (this seems like a dumb question)?
Yes. The real requirement is to make sure the radio is not tuned to any station. The static you hear with an AM/FM radio (and on old analogue TVs, the snow picture where there is no station tuned in) is some combination of noise from the electronics that is happening down at the atomic level as well as photons coming in from outer space. You are essentially listening to the universe and it is quite chaotic. You want to use AM radio specifically because it is better at being noisy than FM. And you want to be as far away from man-made noise sources as you can be, as the universe is better at being chaotic than we are.
eekee
Member
Posts: 892
Joined: Mon May 22, 2017 5:56 am
Location: Kerbin
Discord: eekee

Re: Userland entropy generation + How to use it?

Post by eekee »

There's CPU execution time jitter too. I just searched for "cpu entropy" and got a lot of results including some libraries. One library's README calls it "a small-scale, yet fast entropy source that is viable in almost all environments and on a lot of CPU architectures."
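
To give a feel for what "execution time jitter" means, here is a minimal sketch that times a small fixed workload over and over and records how long each run took. It is not the algorithm of any particular library, and the names now_ns() and collect_jitter() are hypothetical. The raw deltas are far from uniform, so they would need to be mixed into a pool through a hash rather than used directly.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Current monotonic time in nanoseconds. */
static uint64_t now_ns(void) {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Record n timing deltas (in ns) around a small busy loop.
 * Returns the number of samples written, which is always n. */
size_t collect_jitter(uint64_t *deltas, size_t n) {
  assert(deltas != NULL || n == 0);
  volatile uint32_t sink = 0; /* volatile keeps the loop from being optimized out */
  for (size_t i = 0; i < n; i++) {
    uint64_t t0 = now_ns();
    for (uint32_t k = 0; k < 1000; k++)
      sink += k * k;
    deltas[i] = now_ns() - t0;
  }
  return n;
}
```

Only the low-order bits of each delta vary from run to run, and how much they vary depends on the machine, so the amount of entropy per sample has to be estimated conservatively.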

I recall that when Linux's entropy pool was introduced, you couldn't read about it without learning that it could quite easily run out. It might be an idea to implement all the entropy sources you can. It depends on how often you'll want it, or in what size batches.
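
On Linux, the kernel publishes its own estimate of available entropy in /proc/sys/kernel/random/entropy_avail. The sketch below simply reads that number; the function name kernel_entropy_estimate() is hypothetical, and the path is passed in so the function can be pointed at any file with the same format. What the number means has changed across kernel versions, so treat it as informational only.

```c
#include <assert.h>
#include <stdio.h>

/* Read a single integer (the kernel's entropy estimate, in bits) from path.
 * Returns the value, or -1 if the file cannot be opened or parsed. */
long kernel_entropy_estimate(const char *path) {
  assert(path != NULL);
  FILE *f = fopen(path, "r");
  if (!f)
    return -1;
  long bits = -1;
  if (fscanf(f, "%ld", &bits) != 1)
    bits = -1;
  fclose(f);
  return bits;
}
```

A typical call would be kernel_entropy_estimate("/proc/sys/kernel/random/entropy_avail").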

I'm not too sure about using an antenna. Sure it'll be a good source most of the time, but when Tom Stanton comes past with his electronics-free electric bike (YouTube) you could get a very regular signal from all that contact arcing. :) The same goes for other electrical experimenters of course. Old electric motors and petrol engines may also be a problem if improperly repaired.

I remember a kit from the late 80s which used a noise diode with a counter to make a pair of e-dice. (Cryptographically secure family board games! :lol: ) A diode is a PN junction, and all such junctions produce noise. The amount of noise can be varied to some extent at the design phase. A noise diode is of course designed for high noise especially for TRNG use. Whenever entropy comes up around here, I'm always surprised modern CPUs don't have a noise diode and counter on-chip; I thought they did in the early 00s. Perhaps it's because noise diodes are temperature-sensitive to some extent. However, I'm pretty sure you can get entropy cards.
Kaph — a modular OS intended to be easy and fun to administer and code for.
"May wisdom, fun, and the greater good shine forth in all your work." — Leo Brodie