Have you ever seen something like the following? Do you immediately see the issue?
This is a memory leak, because the memory
buffer originally pointed to cannot be freed when
realloc fails. There’s no pointer pointing to it anymore, after all.
I’ll try to formulate rules which - when followed - would make the above error impossible to occur:
- The pointer passed in as first argument must not be a temporary.
- The returned pointer must not be assigned to the pointer which was passed as first argument.
Of course I can add such rules to the documentation of the
realloc function, but - sadly - that won’t really help. People just don’t always read documentations. Or they forget what they read. Thus it would be much better if I could specify these rules such that the compiler understands and enforces them.
In C++ I can acutally express rule 1, at least somewhat:
If I now try to use
foo with a temporary, then the compiler will bark:
error: invalid initialization of non-const reference of type 'void*&' from an rvalue of type 'void*'
A drawback here is that this only works with non-const references … and as such allows me to do nasty things:
But, for rule 2, there’s no real way of telling the compiler what is allowed and what isn’t. I think, though, that a feature like this could turn out really helpfull. A completely blue-eyed take:
The syntax for this thought experiment is borrowed from the metaclasses proposal for C++ (if you haven’t read about this, please do now! It’s an excellent idea). The example above though also shows why implementing such a feature could turn out to be somewhere between hard and impossible:
baz, there’s the parameter
buffer which is a pointer to the allocated memory. It doesn’t get assigned, to. Thus everything seems fine. Until I leave
baz. Do I now have a pointer to the allocated memory or not? This depends on the runtime value returned by
rand(). Should the compiler be able to deal with such situations? Can it even? Or will this turn out to be a (another) lost case against the halting problem?
Instead of giving the compiler rules to enforce - which leads to a conservative behaviour … if it cannot be sure that the rules are adhered to, then it’s an error - it could be easier to give the compiler patterns for common errors. These could be handled more permissive: If it cannot match the situation to any of the patterns, then it assumes that everything is fine. Only if a pattern that’s known to lead to an error is found, then issue an error.
This could also catch simpler issues, for example when the return value is not checked against “special” values (like
NULL) inside the calling function and also not passed to any other function nor returned from the calling function:
This could then also catch issues like this one I found in uzbl where
g_io_channel_write_chars was used but the special return value
G_IO_STATUS_AGAIN (which indicates that the function should be called again … due to a non blocking socket not being ready, for example) wasn’t handled in any way.
But is this really the job of the compiler? Shouldn’t this be some kind of code analysis tool? I think in the end, this wouldn’t matter. Type checking is also some kind of code analysis. It’s part of the compiler. Style checking is another kind of code analysis. It’s most often not part of a compiler but a separate tool.
Actually, I think there are already tools capable of finding errors like the ones I presented here. But I think the most important issue with these tools is that they have some kind of a database of “known error-prone patterns”. This database is continuously extended by the users of the analysis tool. It would be better if the developer writing a function could specify the correct handling of its return values in a standardized way, directly next to the function itself.