Linus Torvalds's Double Pointer Problem

By: Philip Buuck

519   54   55141

Uploaded on 04/03/2016

In 2012, Linus Torvalds presented an 'intuitive' way to use double pointers to easily remove a node from a linked list. But how intuitive can double pointers even really be? I present a way of visualizing pointers that I do not see often presented, but works very well for me when trying to figure out the jungles of addresses, dereferences, and other commands a complex C program can use.

Comments (4):

By anonymous    2017-09-20

I like this "real world" code example of pointer to pointer usage, in Git 2.0, commit 7b1004b:

Linus once said:

I actually wish more people understood the really core low-level kind of coding. Not big, complex stuff like the lockless name lookup, but simply good use of pointers-to-pointers etc.
For example, I've seen too many people who delete a singly-linked list entry by keeping track of the "prev" entry, and then to delete the entry, doing something like

if (prev)
  prev->next = entry->next;
else
  list_head = entry->next;

and whenever I see code like that, I just go "This person doesn't understand pointers". And it's sadly quite common.

People who understand pointers just use a "pointer to the entry pointer", and initialize that with the address of the list_head. And then as they traverse the list, they can remove the entry without using any conditionals, by just doing a

*pp =  entry->next

http://i.stack.imgur.com/bpfxT.gif

Applying that simplification lets us lose 7 lines from this function even while adding 2 lines of comment.

-   struct combine_diff_path *p, *pprev, *ptmp;
+   struct combine_diff_path *p, **tail = &curr;

Chris points out in the comments to the 2016 video "Linus Torvalds's Double Pointer Problem " by Philip Buuck.


kumar points out in the comments the blog post "Linus on Understanding Pointers", where Grisha Trubetskoy explains:

Imagine you have a linked list defined as:

typedef struct list_entry {
    int val;
    struct list_entry *next;
} list_entry;

You need to iterate over it from the beginning to end and remove a specific element whose value equals the value of to_remove.
The more obvious way to do this would be:

list_entry *entry = head; /* assuming head exists and is the first entry of the list */
list_entry *prev = NULL;

while (entry) { /* line 4 */
    if (entry->val == to_remove)     /* this is the one to remove ; line 5 */
        if (prev)
           prev->next = entry->next; /* remove the entry ; line 7 */
        else
            head = entry->next;      /* special case - first entry ; line 9 */

    /* move on to the next entry */
    prev = entry;
    entry = entry->next;
}

What we are doing above is:

  • iterating over the list until entry is NULL, which means we’ve reached the end of the list (line 4).
  • When we come across an entry we want removed (line 5),
    • we assign the value of current next pointer to the previous one,
    • thus eliminating the current element (line 7).

There is a special case above - at the beginning of the iteration there is no previous entry (prev is NULL), and so to remove the first entry in the list you have to modify head itself (line 9).

What Linus was saying is that the above code could be simplified by making the previous element a pointer to a pointer rather than just a pointer.
The code then looks like this:

list_entry **pp = &head; /* pointer to a pointer */
list_entry *entry = head;

while (entry) {
    if (entry->val == to_remove)
        *pp = entry->next;

    pp = &entry->next;
    entry = entry->next;
}

The above code is very similar to the previous variant, but notice how we no longer need to watch for the special case of the first element of the list, since pp is not NULL at the beginning. Simple and clever.

Also, someone in that thread commented that the reason this is better is because *pp = entry->next is atomic. It is most certainly NOT atomic.
The above expression contains two dereference operators (* and ->) and one assignment, and neither of those three things is atomic.
This is a common misconception, but alas pretty much nothing in C should ever be assumed to be atomic (including the ++ and -- operators)!

Original Thread

By anonymous    2018-02-18

Explanation by video

This problem has been discussed by Philip Buuck in this YouTube video. I recommend you to watch this if you need more detailed explanation.


Explanation by example

If you like learning from examples, I prepared one. Let's say that we have the following single-linked list:

Singly-linked list example

that is represented as follows (click to enlarge):

Singly-linked list representation

We want to delete the node with the value = 8.

Code

Here is the simple code that do this:

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct node_t {
    int value;
    node_t *next;
};

node_t* create_list() {
    int test_values[] = { 28, 1, 8, 70, 56 };
    node_t *new_node, *head = NULL;
    int i;

    for (i = 0; i < 5; i++) {
        new_node = (node_t*)malloc(sizeof(struct node_t));
        assert(new_node);
        new_node->value = test_values[i];
        new_node->next = head;
        head = new_node;
    }

    return head;
}

void print_list(const node_t *head) {
    for (; head; head = head->next)
        printf("%d ", head->value);
    printf("\n");
}

void destroy_list(node_t **head) {
    node_t *next;

    while (*head) {
        next = (*head)->next;
        free(*head);
        *head = next;
    }
}

void remove_from_list(int val, node_t **head) {
    node_t *del, **p = head;

    while (*p && (**p).value != val)
        p = &(*p)->next;  // alternatively: p = &(**p).next

    if (p) {  // non-empty list and value was found
        del = *p;
        *p = del->next;
        del->next = NULL;  // not necessary in this case
        free(del);
    }
}

int main(int argc, char **argv) {
    node_t *head;

    head = create_list();
    print_list(head);

    remove_from_list(8, &head);
    print_list(head);

    destroy_list(&head);
    assert (head == NULL);

    return EXIT_SUCCESS;
}

If you compile and run this code you'll get:

56 70 8 1 28 
56 70 1 28

Explanation of the code

Let's create **p 'double' pointer to *head pointer:

Singly-linked list example #2

Now let's analyze how void remove_from_list(int val, node_t **head) works. It iterates over list as long as *p && (**p).value != val.

Singly-linked list example #3

Singly-linked list example #4

In this example given list contains value that we want to delete (which is 8). (**p).value is 8, so we stop iterating.

Note that *p points to the variable node_t *next within node_t that is before the node_t that we want to delete (which is **p).

Now let's assign the address of the element that we want to remove (del->value == 8) to the *del pointer.

Singly-linked list example #5

We need to fix the *p pointer so that **p was pointing to the one element after *del element that we are going to delete:

Singly-linked list example #6

In the code above we call free(del), thus it's not necessary to set del->next to NULL, but if we would like to return the pointer to the element 'detached' from the list instead of the completely removing it, we would set del->next = NULL:

Singly-linked list example #7

Original Thread

By anonymous    2018-02-18

As @R. Martinho Fernandes pointed out in his answer, using pointer to pointer as an argument in void push(struct node** head, int data) allows you to change the head pointer directly from within push function instead of returning the new pointer.

There is yet another good example which shows why using pointer to pointer instead a single pointer may shorten, simplify and speed up your code. You asked about adding a new node to the list which probably typically doesn't need pointer-to-pointer in contrast to removing the node from the singly-linked list. You can implement removing node from the list without pointer-to-pointer but it is suboptimal. I described the details here. I recommend you also to watch this YouTube video which addresses the problem.

BTW: If you count with Linus Torvalds opinion, you would better learn how to use pointer-to-pointer. ;-)

Linus Torvalds: (...) At the opposite end of the spectrum, I actually wish more people understood the really core low-level kind of coding. Not big, complex stuff like the lockless name lookup, but simply good use of pointers-to-pointers etc. For example, I've seen too many people who delete a singly-linked list entry by keeping track of the "prev" entry, and then to delete the entry, doing something like

if (prev)
prev->next = entry->next;
else
list_head = entry->next;

and whenever I see code like that, I just go "This person doesn't understand pointers". And it's sadly quite common.

People who understand pointers just use a "pointer to the entry pointer", and initialize that with the address of the list_head. And then as they traverse the list, they can remove the entry without using any conditionals, by just doing a "*pp = entry->next". (...)


Other resources that may be helpful:

Original Thread

Submit Your Video

If you have some great dev videos to share, please fill out this form.