OSDev.org

Posted: **Thu Jul 25, 2013 9:19 pm**

So I guess some of you have recently read thru' your codebase to spot any funny comment, but how about comment that does not make sense, or it is so poor that it provide zero insight on what's going on?
Sharing this might actually help each other to improve comment skill.

So, here we go, code from a simple SNMP implementation in a rush.

Code: Select all

    //! Send SNMP trap
    //! \param[in] addr       Destination for this trap
    //! \param[in] community  Community string, for example "public"
    //! \param[in] enterprise Object ID of the enterprise identifying the event
    //! \param[in] trapcode   Trap code
    //! \param[in] value      Value of the object
    //! \return 0 on success, -1 on fail.
    int	snmp_trap    ( const struct sockaddr *addr, const char * community, const unsigned char* enterprise, int trapcode, int value );

Explanation:
1. Misleading. The parameter description of enterprise misleads to be an variable Object ID, which mean something else in the context of other functions.
2. Incomplete info: what is trap code? value? Two weeks after I finished working with that component I can't decide what to fill on these fields without stare on the RFC.

Posted: **Fri Jul 26, 2013 1:57 am**

Code: Select all

// Figure this clusterf**k out yourself.

I should redocument that section.

Posted: **Fri Jul 26, 2013 3:09 am**

How to make an unhelpful comment:

Write a method.
Document method perfectly using Doxygen, Javadoc, etc.
Update method parameters ever so slightly (for best results change the type and meaning but not the name of a single parameter) and neglect to update documentation.
Attempt to use IDE autocompletion. It correctly tells you what the type and name are, but gives you bad information as to what it means.

Similar thinking exists with unit tests: "Unit test passes, no need to update it..." eventually leads to 0% code coverage from your unit tests.

Posted: **Fri Jul 26, 2013 5:26 am**

JackScott wrote:Similar thinking exists with unit tests: "Unit test passes, no need to update it..." eventually leads to 0% code coverage from your unit tests.

Well put.

Posted: **Fri Jul 26, 2013 7:19 am**

There is this widespread view that code should be abundant with comments. I always try to convince people of the opposite for two reasons:

There is nothing worse than unhelpful comments, as they not only clutter the code but also require extra maintainability resulting in wasted development time and confusion if and when they happen to go out of sync with said simple code.
Code should strive to be as comprehensible as possible; it should speak for itself. If comments are required, it may be the case that the code is more complicated than it needs to be.

Comments can be useful but can also get in the way. They should be used with care.

Posted: **Sat Jul 27, 2013 9:46 am**

I think there's a pretty big difference between the comments you put in front of a whole function (detailing parameters, return values, error codes, etc.) and comments in front of blocks of code that describe the intent of the next few lines so that it is possible to understand what is happening more quickly; basically inline pseudocode. It is really easy for function comments to get out of sync with functionality, while block comments make it harder because the code is right there when you are editing it. I think block comments are much more valuable, but are often not used, because IDEs will not take advantage of them.

For example, here's a piece of code from one of my projects that allocates thread structures (stripped of all but the function comment):

Code: Select all

// tcb_new - allocate a new thread control block
//
// Returns a pointer to a new, initialized TCB on success, NULL on error.
// The TCB will have a locked mutex, and will be in a suspended state.
// This function can fail if there is no memory available for new TCBs or
// if there are no free TCB table slots.
// 
struct tcb *tcb_new(void) {
    struct tcb *tcb = tcb_alloc();
    if (!tcb) {
        return NULL;
    }
    mutex_acquire(&tcb_table_lock);
    for (uint32_t i = 1; i < TCB_LIMIT; i++) {
        if (!tcb_table[i / 512]) {
            struct tcb **subtab = pge_alloc();
            if (!subtab) {
                mutex_release(&tcb_table_lock);
                tcb_free(tcb);
                return NULL;
            }
            tcb_table[i / 512] = subtab;
            for (int j = 0; j < 512; j++) {
                tcb_table[i / 512][j] = NULL;
            }
        }
        if (!tcb_table[i / 512][i % 512]) {
            tcb_table[i / 512][i % 512] = tcb;
            tcb->state = TCB_STATE_SUSPENDED;
            tcb->id = i;
            tcb->pcx_id = 0;
            tcb->xstate = NULL;
            tcb->cs = 0x08;
            tcb->ss = 0x10;
            tcb->rflags = 0x200200;
            mutex_release(&tcb_table_lock);
            return tcb;
        }
    }
    mutex_release(&tcb_table_lock);
    tcb_free(tcb);
    return NULL;
}

You can get an idea of what it's doing from the function comment, but it's basically impossible to scan or verify that the functionality is correct. Here's the code with block comments:

Code: Select all

// tcb_new - allocate a new thread control block
//
// Returns a pointer to a new, initialized TCB on success, NULL on error.
// The TCB will have a locked mutex, and will be in a suspended state.
// This function can fail if there is no memory available for new TCBs or
// if there are no free TCB table slots.
// 
struct tcb *tcb_new(void) {

    // try to allocate TCB structure space
    // we want to do this before acquiring the lock for latency reasons
    struct tcb *tcb = tcb_alloc();

    if (!tcb) {

        // allocation failed, error
        return NULL;
    }

    mutex_acquire(&tcb_table_lock);

    // search for empty TCB table slot
    for (uint32_t i = 1; i < TCB_LIMIT; i++) {

        // ensure second-level table exists
        if (!tcb_table[i / 512]) {

            // try to allocate new TCB second-level table
            // XXX should we ever allocate while holding the TCB table lock?
            struct tcb **subtab = pge_alloc();

            if (!subtab) {

                // allocation failed, free TCB and error
                mutex_release(&tcb_table_lock);
                tcb_free(tcb);
                return NULL;
            }

            // initialize second-level table
            tcb_table[i / 512] = subtab;
            for (int j = 0; j < 512; j++) {
                tcb_table[i / 512][j] = NULL;
            }
        }

        if (!tcb_table[i / 512][i % 512]) {
            // empty TCB table slot found

            // fill slot
            tcb_table[i / 512][i % 512] = tcb;

            // initialize new TCB
            tcb->state = TCB_STATE_SUSPENDED;
            tcb->id    = i;

            tcb->pcx_id = 0;
            tcb->xstate = NULL;
            tcb->cs     = 0x08;
            tcb->ss     = 0x10;
            tcb->rflags = 0x200200; // XXX magic numbers

            // return new TCB
            mutex_release(&tcb_table_lock);
            return tcb;
        }
    }

    // no free slots, free TCB and error
    mutex_release(&tcb_table_lock);
    tcb_free(tcb);
    return NULL;
}

And to prove the point further, the function minus everything but block comments and control structure:

Code: Select all

struct tcb *tcb_new(void) {
    // try to allocate TCB structure space
    if (?) {
        // allocation failed, error
    }
    // search for empty TCB table slot
    for (?) {
        // ensure second-level table exists
        if (?) {
            // try to allocate new TCB second-level table
            if (?) {
                // allocation failed, free TCB and error
            }
            // initialize second-level table
        }
        if (?) {
            // empty TCB table slot found
            // fill slot
            // initialize new TCB
            // return new TCB
        }
    }
    // no free slots, free TCB and error
}

You can still figure out what the code is doing algorithmically.

Posted: **Sat Jul 27, 2013 10:09 am**

Has your target audience never programmed before? If you need a verbose comment to tell you that "struct tcb *tcb = tcb_alloc();" tries to allocate memory for a TCB structure, you have no business writing code. Next, an if statement that returns NULL when tcb_alloc fails---hmm, I wonder what that could mean... Better check the comment!

Such bad practice simply wastes development time for both the person who initially writes the code (since they need to type in the same thing twice) and the maintainers to follow, who need to read through the silly comments to get to the actual code---and then maintain them.

Posted: **Sat Jul 27, 2013 10:20 am**

Love4Boobies wrote:Has your target audience never programmed before? If you need a verbose comment to tell you that "struct tcb *tcb = tcb_alloc();" tries to allocate memory for a TCB structure, you have no business writing code. Next, an if statement that returns NULL when tcb_alloc fails---hmm, I wonder what that could mean... Better check the comment!

Such bad practice simply wastes development time for both the person who initially writes the code (since they need to type in the same thing twice) and the maintainers to follow, who need to read through the silly comments to get to the actual code---and then maintain them.

Maybe the "error" is to comment the task itself being performed by that line. It would be better to comment about ultimately why that is being done, with what purpose or intention. In this way, the thinking behind the code is more accessible, and it would be very useful, informative and educative even for someone planning to write a very different implementation in a completely different language (we are talking about actual implementation tricks more than raw concepts from a generic specification or document).

I once read that if a programmer doesn't care enough to keep the comments in the source code synchronized with the current code, then most certainly the overall quality of the code produced by that programmer will be equally poor and the programmer would be demonstrating carelessness (I would say that it could even be considered incorrect because if the comments help keep track of, for instance, tens of parameters to use and update very sligthly for different functions -in other words, a very error-prone thing-, it could lead to errors and even make coding too confusing for not needing to improve it).

What I do is an iteration between documentation and code. And when I code, I write down the ideas that appear fast in my mind in the form of codes, of what I will do next.

Then I write documentation and let the current code to become bulky, even with partially wrong lines of code commented out.

After I have a minimally useful program, I archive the whole current code in a ZIP file with the current date/time as its version number. Then I start cleaning up everything (outdated code, commented-out lines of code, inefficient code, some bugs, add some more code...).

Then I leave comments that explain how to use functions, but also group very similar functions with very small variations inside INIT-END blocks of comments (INIT and END are repeated at least in 3 consecutive lines and explaining briefly in 1 line what that block is for).

And then I keep comments for the major, overall, non-obvious parts/"sections" of the code and specially functions right from my most recent documentation. I do this because when I implement what I have documented, almost always many strange quirks start appearing in the code (so the explanation of the concepts from the documentation and the explanation of how to actually implement them are virtually two different things). That's why I leave comments, but at least I update and order them iteratively, and I also accompany my code with full human-readable documentation (of course, even accumulated from separated documents written and corrected over time, often directly useable for different projects) for the concepts and the implementation quirks.

Keeping a more human-readable code structure allows me to catch up on projects I have left for months and remember what I was doing faster in a more precise code section, and at least for me it's specially useful when I'm trying to learn very different things from one another.

Posted: **Sat Jul 27, 2013 11:02 am**

I agree with Love4Boobies.

If a function is so complex that it requires many additional comment to document what's going on, you should break down the function into smaller pieces; except the very limited case that you do special hack and trick inside a function that a few comment would help explain the operations.

And I think document for interface is important, and should always keep in sync with code, if not, it's the programmer's fault.

Posted: **Sat Jul 27, 2013 5:51 pm**

Love4Boobies wrote:Has your target audience never programmed before? If you need a verbose comment to tell you that "struct tcb *tcb = tcb_alloc();" tries to allocate memory for a TCB structure, you have no business writing code. Next, an if statement that returns NULL when tcb_alloc fails---hmm, I wonder what that could mean... Better check the comment!

Such bad practice simply wastes development time for both the person who initially writes the code (since they need to type in the same thing twice) and the maintainers to follow, who need to read through the silly comments to get to the actual code---and then maintain them.

If you need to "get to the actual code to maintain it", you aren't going to be reading the internals of functions first. You are going to be reading high-level docs to figure out which module contains your code, then header and function docs to find which function contains it, then you're going to scan through the function until you find the code you want. You want "indexing" via comments at every level. And if your editor colors comments differently (my color scheme makes them gray against black) you can very easily skip over them visually if you don't want to read them. They just enforce the line blocking structure, which is itself a form of documentation.

Splitting code into different functions is good, but only to a point. When you add a function (or class or whatever), you also add the overhead of locating and reading that function if you are trying to figure out what the code is doing. IMO, when an action only occurs in one place (like finding an empty slot in the TCB table when allocating a TCB) it is better to place it inline and add a comment describing the action than to factor it into a descriptively-named function; it takes up less space and is easier to find all the pieces.

I admit, that line is superfluous. I took it from a more excessively-commented function in my code. But it also doesn't cause damage. Nobody types continuously while programming; thinking and designing are the bottlenecks, and should be if you're doing anything more than mindless. Typing one extra line of semi-redundant information takes zero brain time and forces you to visually structure your code in an accessible way. Also, note what happens if you search for all instances of the word "error" in that code.

And if you are so sure I'm wrong, how would you comment (or not comment) that block of code?

Posted: **Sat Jul 27, 2013 7:02 pm**

Not to offense I but want to share some insight about the code you demonstrated, it got poor naming, and thus require excessive comment.
For instance, you can't be sure which to use, tcb_new() or tcb_alloc(), without actually read not only the header, but also the code body.

Furthermore, you merged data handling with logic implementation, therefore end up with bigger function block, and thus require more comment.

Think about this:

1. you implement an array class (or collection) that grow on demand
2. tcb_table is an instant of array which grow on demand.

you code will then look likes this:

Code: Select all

int tcb_new(struct tcb ** des) {
    if ( des==NULL ) return E_INVAL;

    struct tcb *tcb = tcb_alloc();
    if (tcb==NULL) return E_NOMEM;

    // Fill with defaults
    tcb->state = TCB_STATE_SUSPENDED;
    tcb->id = i;
    tcb->pcx_id = 0;
    tcb->xstate = NULL;
    tcb->cs = 0x08;
    tcb->ss = 0x10;
    tcb->rflags = 0x200200;

    mutex_acquire(&tcb_table_lock);
    if ( tcb_table.add ( tab ) < 0 ) {
        mutex_release(&tcb_table_lock);
        tcb_free(tcb);
        return E_INTERNAL;  // internal error
    }
    mutex_release(&tcb_table_lock);
    *des = tcb;
    return E_OK;
}

Now the code is pretty much self explanatory without intensive comment.

If you worry on function call overhead, ask the compiler do them inline.

Posted: **Sat Jul 27, 2013 7:20 pm**

If I had the facilities of C++, I would agree. And I do agree that that code could be factored better for reuse by using a single resizeable table implementation, but this happened to be in a project where there were only a couple of tables like that in the entire system, and that number was unlikely to grow. But yes, as much as I hate to say it, I admit that code is badly laid out.

I don't, however, think that this negates the usefulness of block comments.

Posted: **Sun Jul 28, 2013 1:54 am**

~ wrote:After I have a minimally useful program, I archive the whole current code in a ZIP file with the current date/time as its version number.

This implies that you are not using a version control system, which diminishes your usefulness as a developer to zero. Didn't read after that sentence.

Posted: **Sun Jul 28, 2013 3:37 am**

The "// allocation failed, error" style comments are the only ones which I would really consider unnecessary (and actually harmful, because they decrease the visibility of the useful comments). I very much like using comments as kind of headlines for the following lines of code. It allows scanning a function quicky and only go into actual code when you're already at the place that you're interested in. You could break up everything into smaller functions and use the function name as the headline, but then you don't have sequential code to read any more but get to gather the pieces.

For comparison, if I had written the code, it might look like this (though this isn't even compile-tested, so details may be wrong):

Code: Select all

/*
 * Returns a newly allocated thread control block or NULL on error.
 * The TCB will have a locked mutex, and will be in a suspended state.
 */
struct tcb *tcb_new(void)
{
    struct tcb *tcb;
    uint32_t i;

    tcb = tcb_alloc();
    if (!tcb) {
        return NULL;
    }

    mutex_acquire(&tcb_table_lock);

    /* Search for free table entry */
    /* XXX Why do we start iterating at 1? */
    for (i = 1; i < TCB_LIMIT; i++) {

        /* Create second-level table if not present */
        if (!tcb_table[i / 512]) {
            struct tcb **subtab = pge_alloc();
            if (!subtab) {
                goto fail;
            }
            tcb_table[i / 512] = subtab;
            for (int j = 0; j < 512; j++) {
                tcb_table[i / 512][j] = NULL;
            }
        }

        /* Second-level table lookup */
        if (!tcb_table[i / 512][i % 512]) {
            goto found;
        }
    }
    goto fail;

found:
    /* Initialise new entry */
    tcb_table[i / 512][i % 512] = tcb;
    *tcb = (struct tcb*) {
        .state  = TCB_STATE_SUSPENDED,
        .id     = i,
        .pcx_id = 0,
        .xstate = NULL,
        .cs     = 0x08,
        .ss     = 0x10,
        .rflags = 0x200200,
    };

    mutex_release(&tcb_table_lock);
    return tcb;

fail:
    mutex_release(&tcb_table_lock);
    tcb_free(tcb);
    return NULL;
}

As you can see, there is a comment that doesn't really help with explaining the code (the "Second-level table lookup" one), but it's there because it makes sense as a headline. These three lines aren't part of "Create second-level table" any more.

Oh, and I'm fully expecting to get flamed about my gotos (especially by the kind of people not using a VCS), but in my opinion this is the proper way to use them.

Posted: **Thu Aug 01, 2013 2:52 pm**

Please have a look at this piece of code (originates from my multitasking scheduler) and tell me if you really need comments for understanding it.

Code: Select all

// Above: task_t current; task_t tasks;

struct cpu_state* kschedule(struct cpu_state* cpu)
{
        if (current == NULL)
        {
                current = tasks;
        }
        else
        {
                current->cpu = cpu;
                do
                {
                        if (current->next != NULL)
                        {
                                current = current->next;
                        }
                        else
                        {
                                current = tasks;
                        }
                } while (current->blocked);
        }
        cpu = current->cpu;

        return cpu;
}

P.S. Code is not really heavily tested, could - possibly - contain smaller mistakes.

OSDev.org

Unhelpful comments in your code?

Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?

Re: Unhelpful comments in your code?