Posted: Fri Oct 19, 2007 2:14 pm
Why waste effort on something that you can get the computer to handle?
The Place to Start for Operating System Developers
http://forum.osdev.org./
Colonel Kernel wrote: I've been working in the software industry for over a decade. I've seen code written by lots of people, most of whom are very smart and not lazy. They make mistakes. They are human beings, after all. Those mistakes cost time and money, and I would rather have the compiler catch those mistakes than my test team, thank you very much.
Candy wrote: Most typos and mistakes I've seen didn't involve pointer arithmetic of the variant of adding a random number to a pointer, nor have I seen reinterpret_casts used at all. Nor C casts with the same effect.
The example was not one of a typical mistake. It was merely meant to demonstrate that pointers are type-unsafe.
Code: Select all
// Vector is just like std::vector.
Vector v = getAVectorFromSomewhere();
assert( !v.isEmpty() ); // Assume for this example that it's not empty.
someOldCFunction( &v[0] ); // Legal in STL, and for our Vector.
// The above trick is explained in Scott Meyers' Effective STL.
// String is not like std::string -- it encapsulates an immutable buffer
// of characters that is managed by a reference count.
String str = getAStringFromSomewhere();
assert( !str.isEmpty() ); // Again, assume it's not empty...
anotherOldCFunctionThatFillsABuffer( &str[0] ); // Aaaaaarrggh!
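For what it's worth, here is a minimal sketch of how const-correctness could catch the String case at compile time. ImmutableString and legacyFill are hypothetical stand-ins, not the real String class or the real C function; the only point is that an operator[] returning const char& makes &str[0] a const char*, which does not convert to char*.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a legacy C function that writes into its argument.
inline void legacyFill(char* buffer) { buffer[0] = 'x'; }

// Hypothetical immutable string: it exposes only a const element accessor.
class ImmutableString {
public:
    explicit ImmutableString(std::string text) : text_(std::move(text)) {}
    bool isEmpty() const { return text_.empty(); }
    // Returning const char& means &str[0] has type const char*.
    const char& operator[](std::size_t i) const { return text_[i]; }
private:
    std::string text_;
};

// The type of the expression &str[0] for a string-like class S.
template <typename S>
using AddressOfFirst = decltype(&std::declval<S&>()[0]);

// legacyFill(&str[0]) would be rejected: const char* does not convert to char*.
static_assert(std::is_same<AddressOfFirst<ImmutableString>, const char*>::value,
              "&str[0] should be a pointer to const");
static_assert(!std::is_convertible<AddressOfFirst<ImmutableString>, char*>::value,
              "a pointer to const must not silently lose its const");
```

With a class like this, writing legacyFill(&str[0]) fails to compile unless someone adds an explicit const_cast, which at least leaves a visible trace in the source.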
Code: Select all
Foo* foo = getAFooFromSomewhere();
// Assume for the sake of this example that foo really points to an instance of Bar.
Bar* bar = (Bar*) foo; // Bar is derived from Foo.
bar->doSomething(); // May crash in the presence of MI.
Code: Select all
    +-----+  Where bar *should* point to.
    | Baz |
--> +-----+  Where foo points to.
    | Foo |
    +-----+
    | Bar |
    +-----+
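A small sketch of the layout above, assuming Bar derives from Baz first and Foo second and that both bases carry data (so Foo sits at a nonzero offset, as it does on the common ABIs). The struct members and function names are made up for illustration. It compares the addresses produced by static_cast, reinterpret_cast and a C-style cast; a conforming compiler treats the C-style cast between complete, related class types as a static_cast.

```cpp
#include <cassert>

struct Baz { int bazData = 1; };
struct Foo { int fooData = 2; };
// Baz comes first, so the Foo subobject is not at the start of Bar.
struct Bar : Baz, Foo { int barData = 3; };

// Correct downcast: the compiler subtracts the offset of Foo within Bar.
inline Bar* downcastAdjusted(Foo* foo) { return static_cast<Bar*>(foo); }

// Same address relabelled as Bar*: no adjustment happens.
// The result is only compared here, never dereferenced.
inline Bar* downcastUnadjusted(Foo* foo) { return reinterpret_cast<Bar*>(foo); }

// C-style cast: behaves like static_cast when both classes are complete
// and related by inheritance.
inline Bar* downcastCStyle(Foo* foo) { return (Bar*)foo; }
```

Given a Bar object bar and Foo* foo = &bar, downcastAdjusted(foo) and downcastCStyle(foo) both yield &bar, while downcastUnadjusted(foo) still holds the address of the Foo subobject.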
Zekrazey1 wrote: Why waste effort on something that you can get the computer to handle?
Thank you!!
Quote: Why waste effort on something that you can get the computer to handle?
1) To let developers understand what they are doing.
os64dev wrote: note the '' around lazy, which should have given a less hard meaning but ok, lazy might be wrongly chosen here.
Ok. Sorry for the harsh reaction... you're one of many to use that word in this context.
Quote: In some of your examples you already see a potential problem, for instance anotherOldCFunctionThatFillsABuffer(&str[0]). The problem here IMHO is that you have designed/used a function that takes a pointer to the primitive type char *, so what do you expect?
It's a legacy C function that we have to use.
Quote: If you want to do string manipulation then only have string parameters. If you need to link to old C code as the example does, write a wrapper function or, better yet, don't do it unless you rewrite the code to use strings.
The problem is that someone has to write the wrapper, and that's where this kind of thing can happen.
Quote: Why waste effort on something that you can get the computer to handle?
Quote: 1) To let developers understand what they are doing.
I would rather let the developers think about the domain problem and how they're going to solve it rather than the nitty-gritty details of memory management.
Quote: 2) To get flexibility which might lead to better performance.
When you really, really need it, it's good to have that flexibility. However, as I mentioned before, there are advances in static type systems and compiler optimizations (e.g. dependent types, better whole-program optimization) that will mean really good performance and type safety at the same time. I'm interested to see how these things pan out in the coming years.
Quote: PS. If this all sounds vague I am sorry, I have just finished 1.5 liters of beer.
Sounds like a good idea.
Quote: 1) To let developers understand what they are doing.
2) To get flexibility which might lead to better performance.
I see a safe language as one in which you can specify that certain requirements must be met at a lower level, with 'errors' (in quotes because you could incorrectly define what an error is) prevented through some language mechanism when you switch to a higher level.
Colonel Kernel wrote:
Code: Select all
// Vector is just like std::vector.
Vector v = getAVectorFromSomewhere();
assert( !v.isEmpty() ); // Assume for this example that it's not empty.
someOldCFunction( &v[0] ); // Legal in STL, and for our Vector.
// The above trick is explained in Scott Meyers' Effective STL.
// String is not like std::string -- it encapsulates an immutable buffer
// of characters that is managed by a reference count.
String str = getAStringFromSomewhere();
assert( !str.isEmpty() ); // Again, assume it's not empty...
anotherOldCFunctionThatFillsABuffer( &str[0] ); // Aaaaaarrggh!
That should give a compile error, or one out of four people screwed up. In increasing order of likelihood:
Colonel Kernel wrote:
Code: Select all
Foo* foo = getAFooFromSomewhere();
// Assume for the sake of this example that foo really points to an instance of Bar.
Bar* bar = (Bar*) foo; // Bar is derived from Foo.
bar->doSomething(); // May crash in the presence of MI.
I know that down-casting is typically a dubious practice anyway, but imagine for the sake of argument that this is one of the rare instances when it's necessary. Clearly this code should be using dynamic_cast, or at least static_cast if RTTI can't be used for whatever reason (portability, performance, etc.). For whatever reason, the knob who wrote this code didn't know that C-style casts could very well be interpreted as reinterpret_cast in this context. Imagine that Bar is derived from Foo, but also from Baz. What happens if the object layout looks like this?
Your compiler should / must handle this case properly to be C++ compliant. It can never assume that the base class is at the same place if it even barely touches MI.
Quote: static_cast or dynamic_cast will do the appropriate pointer adjustment for you, but reinterpret_cast (or possibly the C-style cast, depending on your compiler) will not. Yes, I've seen this happen (with an older version of GCC).
You should have a check-in script that makes people who use reinterpret_cast ask you personally for agreement. It should never be used unless you're hacking - and when you're hacking you shouldn't be working on a product in an archive.
Quote: Most of the time we're stuck with the equivalent of asking a taxi driver to drive a Formula-1. They're good drivers, but way out of their league. But calling them "lazy" is just stupid IMO.
Did you consider sidewheels?
Candy wrote: That should give a compile error, or one out of four people screwed up. In increasing order of likelihood:
1. The compiler writer, for not checking const correctness.
2. The function writer, for taking an argument of type const char * and stripping const.
3. You, for stripping const off explicitly (which I don't see here, so it's not likely).
4. The author of String, who knows that his buffer is CONST and returns a non-const pointer to it.
You forgot option #5 -- I mis-remembered the example. The C function actually did take a const char* and didn't attempt to modify it. The problem is actually that our String class does not null-terminate its internal buffer because it is optimized for taking sub-strings efficiently. Each instance stores its own length and several instances can share the same buffer but point to different parts of it.
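To make option #5 concrete, here is a rough sketch of the non-terminated sub-string problem. Slice, subSlice, legacyLength and safeLegacyLength are all hypothetical names, not our actual String class or the real C function; Slice just models "a pointer into a shared buffer plus its own length".

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical model of a sub-string: points into a shared buffer and
// stores its own length. Nothing guarantees a '\0' at data[length].
struct Slice {
    const char* data;
    std::size_t length;
};

// Take a sub-string without copying; the caller must keep it in bounds.
inline Slice subSlice(Slice whole, std::size_t offset, std::size_t length) {
    return Slice{whole.data + offset, length};
}

// Hypothetical legacy C function: reads until it finds a terminating '\0'.
inline std::size_t legacyLength(const char* text) { return std::strlen(text); }

// Safe wrapper: copy into an owning std::string so that c_str() is
// null-terminated at exactly the slice's length.
inline std::size_t safeLegacyLength(Slice part) {
    const std::string terminated(part.data, part.length);
    return legacyLength(terminated.c_str());
}
```

Passing part.data straight to legacyLength compiles without complaint, because the type is just const char*, yet the function reads past the end of the slice until it happens to find a '\0' somewhere in the shared buffer.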
Quote: Your compiler should / must handle this case properly to be C++ compliant. It can never assume that the base class is at the same place if it even barely touches MI.
Do you mean it should interpret the C-style cast as a static_cast or dynamic_cast instead of a reinterpret_cast in this case? I would tend to agree. It was a very old version of GCC, and it's probably been fixed by now.
Quote: You should have a check-in script that makes people who use reinterpret_cast ask you personally for agreement. It should never be used unless you're hacking - and when you're hacking you shouldn't be working on a product in an archive.
As I said, we write database drivers. I'm pretty sure it's impossible to do type-unsafe things with buffers without using reinterpret_cast. Your idea is good though... maybe it's time to bring out the handcuffs.
Quote: Did you consider sidewheels?
You mean "training wheels", but yes, I did. There was no time. The API we're implementing (defined by M$, not us) is 16 years old and has several hundred violations of basic type safety baked right in. Next time I will insist on extra time in the schedule for such wrapping though...
Quote: On your "must use library, library is evil" note - wrap it. Basics of OO programming - if you have ANYTHING non-trivial, encapsulate it and hide the complexity. If your library can write to a buffer, wrap it so you can use the string class for holding a result, or some other class. If it offers only a buffer-overflow unsafe function, wrap it with code that makes it impossible (effectively) to overflow the buffer. If it offers only an excruciating interface to use, design your own and wrap the library into that interface.
All good advice, but it doesn't help if the only developers available to do the wrapping make mistakes like the ones I mentioned above...
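For reference, the kind of wrapper being suggested might look roughly like this. legacyGetName and kLegacyMaxName are invented for the sketch (the real API is not shown in this thread); the assumption is that the library documents a maximum number of bytes it will write, terminator included.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Assumed documented upper bound on what the library writes, '\0' included.
const std::size_t kLegacyMaxName = 32;

// Hypothetical legacy C function: fills a caller-supplied buffer and gives
// the caller no way to pass the buffer size.
inline void legacyGetName(char* out) { std::strcpy(out, "driver0"); }

// Wrapper: owns a buffer of the documented maximum size, so callers never
// touch a raw char* and always get a std::string back.
inline std::string getName() {
    std::vector<char> buffer(kLegacyMaxName, '\0');
    legacyGetName(buffer.data());
    // Stop at the first '\0', or at the buffer end if the library left none.
    const void* end = std::memchr(buffer.data(), '\0', buffer.size());
    const std::size_t length = end
        ? static_cast<std::size_t>(static_cast<const char*>(end) - buffer.data())
        : buffer.size();
    return std::string(buffer.data(), length);
}
```

The unsafe call still exists, but only in one place, which is the one place a reviewer has to check carefully.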
Code: Select all
typedef Point = {x:int, y:int};
typedef NullablePointPtr = @Point | null;
Code: Select all
typedef VALUE1 = 0;
typedef VALUE2 = 1;
typedef VALUE3 = 0;
typedef VALUE4 = 1;
typedef foo = struct {
x:VALUE1 | VALUE2,
y:VALUE3 | VALUE4
};
Code: Select all
typedef flip_flop = 0 => 1 => 0;
Code: Select all
type EAX_INTEGER = EAX :> int32;
type EAX_UNSIGNED_INTEGER = EAX :> uint32;
Code: Select all
var x : 10 .. 20 = 15;
fun main(args : string[]) : int {
var y : int = 0;
match (y) {
case 10..20:
x = y;
other:
}
return 0;
}
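As a rough C++ approximation of the range-typed variable in the block above: Ranged and tryAssign are made-up names, and unlike the proposed language, which would enforce the range in the type checker, this sketch can only check the value at run time.

```cpp
#include <cassert>

// Hypothetical bounded integer: holds a value known to lie in [Lo, Hi].
template <int Lo, int Hi>
class Ranged {
    static_assert(Lo <= Hi, "the range must not be empty");
public:
    explicit Ranged(int initial) : value_(initial) { assert(contains(initial)); }

    static bool contains(int candidate) {
        return Lo <= candidate && candidate <= Hi;
    }

    // Plays the role of `case 10..20: x = y;`: the assignment happens only
    // when the candidate is inside the range; otherwise nothing changes.
    bool tryAssign(int candidate) {
        if (!contains(candidate)) {
            return false;
        }
        value_ = candidate;
        return true;
    }

    int get() const { return value_; }

private:
    int value_;
};
```

With Ranged<10, 20> x(15), calling x.tryAssign(0) leaves x at 15 and returns false, which corresponds to falling into the `other:` branch of the match.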
Code: Select all
import foo.bar.cee, a.b.c, mymodules.myunit;
public:
var x : int = 0, y : int = 0;
private:
var data : @int | null = null;
protected foo.bar.cee, a.b.c:
var internals : char = 'a';
Code: Select all
typedef POINT = {x : int, y : int};
fun main(args : string[]) : int {
var p1 : @POINT = new (0xfff0000) POINT;
var io1 : int8 = input (0x60) int8;
var io2 : int8 = output (0x60) int8;
}
Code: Select all
fun new() : @POINT {
var buffer : byte[] | @POINT = alloc(sizeof(POINT));
match (typeof(buffer)) {
case typeof(byte[]):
for(i : uint8 = 0; i < sizeof(POINT); ++i) {
buffer[i] = 0;
}
}
return buffer;
}