C++ STL modification + thread scheduling
Posted: Sun Jan 29, 2006 1:03 pm
Uhm... to start, I'm not sure I can explain this clearly and I'm not at all sure whether I'm placing this thread correctly.
I'm modifying the C++ STL (more precisely, the iostream part of the standard library) to support generic typed streams and generic modifying filters between them. In short, the default basic_istream/ostream/iostream now derive from a new base class that supports a small portion of their original functionality, and a new subclass of each (as well as of streambuf) wraps a gen_istream so that it can be used as a basic_istream. This means that every basic_*stream type can be used as a gen_*stream type, and that every gen_*stream type that is a character-based stream can be converted back into a basic_*stream.
I modified it like this because the basic_*stream classes all assume you're using characters. I wanted ordinary typed streams that would intermix closely with the default streams, so I separated out the char-specific part (which was pretty much all of it, in fact). I didn't do this at the streambuf level, since that would require me either to make freely typed streambufs that couldn't be used as streams and that had to implement two-way communication (which would incur loads of overhead), or to split up streambuf, which would make a lot of existing code incompatible, as well as making it very awkward to use and requiring buffering.
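To make the two conversion directions described above a bit more concrete, here is a minimal sketch. It is not the actual library modification: the real change is made by inheritance inside the standard headers, while this sketch uses adapters outside the library. Only the name gen_istream comes from the post; the read() member, istream_adapter, gen_streambuf and round_trip are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical generic typed input stream: yields values of type T one at a time.
template <class T>
class gen_istream {
public:
    virtual ~gen_istream() = default;
    // Stores the next element in `out`; returns false at end of stream.
    virtual bool read(T& out) = 0;
};

// Direction 1: use any std::istream as a gen_istream<char>.
class istream_adapter : public gen_istream<char> {
public:
    explicit istream_adapter(std::istream& in) : in_(in) {}
    bool read(char& out) override { return static_cast<bool>(in_.get(out)); }
private:
    std::istream& in_;
};

// Direction 2: a streambuf that pulls from a gen_istream<char>, so that a
// character-typed generic stream can back an ordinary std::istream.
class gen_streambuf : public std::streambuf {
public:
    explicit gen_streambuf(gen_istream<char>& src) : src_(src) {}
protected:
    int_type underflow() override {
        if (!src_.read(ch_)) return traits_type::eof();
        setg(&ch_, &ch_, &ch_ + 1);  // expose a one-character get area
        return traits_type::to_int_type(ch_);
    }
private:
    gen_istream<char>& src_;
    char ch_ = 0;
};

// Round trip: std::istream -> gen_istream<char> -> std::istream.
// Reads whitespace-separated words and joins them with '|'.
std::string round_trip(const std::string& text) {
    std::istringstream original(text);
    istream_adapter generic(original);
    gen_streambuf buf(generic);
    std::istream wrapped(&buf);
    std::string word, result;
    while (wrapped >> word) result += word + "|";
    assert(wrapped.eof());  // the wrapped stream ends when the generic source does
    return result;
}
```

The one-character get area keeps the sketch short; a real implementation would presumably hand over larger blocks to avoid a virtual call per character.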
Now, I've been working on supporting multiple reads from a single stream at once, which brings me to my current problem. I'm trying to keep this generic for the future, so it must be able to run in multiple threads whatever can be done in parallel. A small example:
Mass-encoding files to OGG files and writing them back
The type flow goes from disk_file -> wavfile -> wav sound object -> ogg object -> ogg file -> saved diskfile. There are three kinds of filter involved: a file reader (ifstream), a file writer (ofstream) and a conversion filter.
The code example constructs a series of filters that, in combination, convert a file from the wav file type to the ogg file type. I'm still missing a few details: at some point I put in a way to pass parameters to each filter, but I think I've lost that somewhere along the way. The real question is how this chain gets executed. Code example:
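Code:
// Plain file streams, viewed through the generic char-stream base class.
gen_istream<char> *in = new ifstream("wavfile");   // renamed: "if" is a keyword
gen_ostream<char> *out = new ofstream("oggfile");
// Each inputfilter wraps the previous stream and converts its element type.
gen_istream<wav_sample> *i = new inputfilter(get_filter("file", "wav"), in);
gen_istream<ogg_unit> *ogg = new inputfilter(get_filter("wav", "ogg", "encode-hq"), i);
gen_istream<char> *afterconv = new inputfilter(get_filter("ogg", "file"), ogg);
// Pull everything through the chain into the output file.
*afterconv >> *out;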
The easiest way to see this is as a series of functions, each of which calls the next and only needs the first N outputs of the function it calls to produce its own output. There are multiple ways to implement this. The simplest is to read in everything from the input, then run each sample through the first filter, then through the second, and so on, all in series. This is slow and not parallel at all. The other extreme is to put a FIFO between each pair of filters (for each stream) and to run threads that fill the FIFOs. This is very scalable, but relatively slow on machines with few CPUs, due to the large amount of thread switching.
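As an illustration of the FIFO extreme, here is a minimal sketch with one thread per stage and a bounded blocking queue between stages. Everything in it is an assumption made for the example: the fifo class, the run_pipeline function, and the decode/encode stand-ins (plain integer arithmetic in place of the real wav and ogg filters).

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Bounded blocking FIFO; an empty optional marks end of stream.
template <class T>
class fifo {
public:
    explicit fifo(std::size_t capacity) : capacity_(capacity) { assert(capacity > 0); }

    void push(std::optional<T> item) {
        std::unique_lock<std::mutex> lock(m_);
        not_full_.wait(lock, [this] { return q_.size() < capacity_; });
        q_.push(std::move(item));
        not_empty_.notify_one();
    }

    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(m_);
        not_empty_.wait(lock, [this] { return !q_.empty(); });
        std::optional<T> item = std::move(q_.front());
        q_.pop();
        not_full_.notify_one();
        return item;
    }

private:
    std::size_t capacity_;
    std::queue<std::optional<T>> q_;
    std::mutex m_;
    std::condition_variable not_full_, not_empty_;
};

// Hypothetical stand-ins for the "file -> wav" and "wav -> ogg" filters.
inline int decode(int raw) { return raw * 2; }
inline int encode(int sample) { return sample + 1; }

// One thread per stage, connected by small FIFOs.
std::vector<int> run_pipeline(const std::vector<int>& input) {
    fifo<int> decoded(4), encoded(4);
    std::vector<int> output;

    std::thread reader([&] {
        for (int raw : input) decoded.push(decode(raw));
        decoded.push(std::nullopt);  // signal end of stream
    });
    std::thread encoder([&] {
        while (auto sample = decoded.pop()) encoded.push(encode(*sample));
        encoded.push(std::nullopt);
    });
    std::thread writer([&] {
        while (auto unit = encoded.pop()) output.push_back(*unit);
    });

    reader.join();
    encoder.join();
    writer.join();
    return output;  // only read after the writer thread has finished
}
```

The small capacity is deliberate: a full queue blocks the producing stage, so a fast reader cannot run arbitrarily far ahead of a slow encoder. It also makes the cost mentioned above visible, since every blocked push or pop is a potential thread switch.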
I'm trying to find a middle way that also works when this mechanism is heavily used, without being too slow on either low-CPU-count machines or N-CPU machines. Right now I'm leaning toward the N-FIFO solution and letting the thread scheduler/switcher handle the problem (which is why this thread would belong in OSdev). I'm not sure whether that's the best way, so I'm wondering what your ideas on this problem are.
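One possible middle way for the mass-encoding case, offered only as a sketch and not as something the library does: run the whole filter chain in series for each file, and parallelise across files with at most one worker per reported core. The names worker_count, convert_all and convert_one are hypothetical, and convert_one is a stand-in for the real conversion chain.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for running the whole filter chain on one file, in series.
inline int convert_one(int file_id) { return file_id * 10; }

// Number of workers: at least 1, at most one per job and one per reported core.
std::size_t worker_count(std::size_t jobs, unsigned cores) {
    std::size_t usable = cores == 0 ? 1 : cores;  // hardware_concurrency() may return 0
    return std::max<std::size_t>(1, std::min(jobs, usable));
}

std::vector<int> convert_all(const std::vector<int>& files,
                             unsigned cores = std::thread::hardware_concurrency()) {
    std::vector<int> results(files.size());
    std::atomic<std::size_t> next{0};

    // Each worker repeatedly claims the next unprocessed file index.
    auto work = [&] {
        for (std::size_t i = next.fetch_add(1); i < files.size(); i = next.fetch_add(1))
            results[i] = convert_one(files[i]);  // distinct slots, so no data race
    };

    std::size_t n = worker_count(files.size(), cores);
    assert(n >= 1);
    std::vector<std::thread> extra;
    for (std::size_t t = 1; t < n; ++t) extra.emplace_back(work);
    work();  // the calling thread acts as worker 0
    for (auto& th : extra) th.join();
    return results;
}
```

On a single-CPU machine this degenerates to the plain serial version with no extra threads, and on an N-CPU machine it keeps N conversions running without any per-sample hand-off between threads. It does nothing for a single large file, which is where the per-stage FIFOs would still be needed.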
PS: when I'm fairly done with this C++ library modification I'll release both the library and the modification into the public domain, so that the idea might actually be used to lift the software industry to another level. Stand on my shoulders, not on my toes.