Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix potential bad pointer deref in minix fs #1793

Closed
wants to merge 1 commit into from

Conversation

ccoffing
Copy link
Contributor

If minix_bread fails, bh will be NULL, and the loop simply advances to the next directory entry. During the next iteration, index won't be at a block boundary, so won't try to set bh, yet minix_match() will dereference it.

Found by static analysis.

If minix_bread fails, bh will be NULL, and the loop simply advances
to the next directory entry.  During the next iteration, index won't
be at a block boundary, so won't try to set bh, yet minix_match()
will dereference it.
@ghaerr
Copy link
Owner

ghaerr commented Jan 24, 2024

Nice catch! I'm assuming you're using your more sophisticated analysis tools on the kernel now?

While this fix definitely solves the NULL bh->b_data dereference problem, since this is kernel code perhaps we might consider a bit more how an I/O error on a directory entry read could be handled. With the proposed fix, the do/while might be executed another 31 times (sizeof(directory entry) == 2^5 times 2^5 = 2^10 = BLOCK_SIZE). This would fill the screen with I/O errors for the same block, needlessly for each block (re)read. I suggest incrementing the bo directory offset to the next block when an error occurs, which would at least skip the re-read attempts while possibly moving forward with any subsequent directory entries. Given the minor complexity of the do loop, something like the following could work for the first change:

-   goto minix_find;
+   while (bo < dir->i_size) {
        offset = (__u16)bo & (BLOCK_SIZE - 1);
+       if (!offset || !bh) {
            unmap_brelse(bh);
-     minix_find:
            bh = minix_bread(dir, (__u16)(bo >> BLOCK_SIZE_BITS), 0);
            if (!bh) {
+               bo = (bo + BLOCK_SIZE) & ~(BLOCK_SIZE - 1);
                continue;
            }
            map_buffer(bh);
        }
        if (minix_match(namelen, name,
                (struct minix_dir_entry *)(bh->b_data + offset), info->s_namelen)) {
            *res_dir = (struct minix_dir_entry *) (bh->b_data + offset);
            return bh;
        }
+       bo += info->s_dirsize;
+  }

The 2nd change might be a bit easier due to its loop design.

Both bigger changes would likely need to be heavily tested to ensure correctness. If this sounds like more than you'd like to jump into at the moment, we can accept this PR as-is and I'll make a note to perform the above changes at a later date.

@ccoffing
Copy link
Contributor Author

After I opened this PR, I was thinking about the repeated tries as you point out, but also thinking that it's strange that a kernel function could silently fail. I suppose that in some scenario, a user program could end up seeing ENOENT instead of EIO. I'll take your suggestion, and think about this more. This would be fun to make robust.

@ccoffing ccoffing closed this Jan 25, 2024
@ghaerr
Copy link
Owner

ghaerr commented Jan 25, 2024

it's strange that a kernel function could silently fail

a user program could end up seeing ENOENT instead of EIO

Due to the way that the kernel filesystem code interacts with the async- and shared- block I/O subsystem, along with multi-tasking issues around simultaneously accessed blocks, it is very tricky to report back filesystem-related block I/O errors to an individual application. This is because it can't be guaranteed that the application that requested a given block I/O transaction is the same one that first or eventually gets the buffer header, and this a known area of tricky race conditions. I don't recommend going overboard trying to "fix" this, as the entire filesystem codebase has the same issues.

Frankly, given your insight about the possibility of ENOENT rather than EIO above, which I hadn't fully considered, and the already-existing problem of the kernel being on the verge of being too large, it might be worth considering the idea of aborting the directory search immediately upon encountering a getblk I/O error and returning NULL. The user (if watching) would see the kernel error display and accept an incompleted result as OK, rather than continuation across a bad block read.

That said, it's quite tricky to second guess kernel semantics without an extensive test suite that can simulate I/O errors, something we don't yet have.

Looking at the Linux 2.0 source, which I sometimes do in cases like this, I see that the bug was noticed and resolved by attempting continuation, rather than aborting. Perhaps this is under the idea its better to complete a task if possible, rather than aborting early with a particular error code:

Linux 2.0 fs/minix/namei.c:

    bh = NULL;
    block = offset = 0;
    while (block*BLOCK_SIZE+offset < dir->i_size) {
        if (!bh) {
            bh = minix_bread(dir,block,0);
            if (!bh) {
                block++;
                continue;
            }
        }
        *res_dir = (struct minix_dir_entry *) (bh->b_data + offset);
        if (minix_match(namelen,name,bh,&offset,info))
            return bh;
        if (offset < bh->b_size)
            continue;
        brelse(bh);
        bh = NULL;
        offset = 0;
        block++;
    }

I suppose after looking at this code, I would suggest not trying to get into a heavy kernel rewrite for error correctness at the expense of kernel size and continuation of execution if possible if not required; early termination could result in an otherwise usable filesystem becoming unusable for all directory entries past a bad sector, for instance.

What do you think?

@ccoffing ccoffing deleted the minix-fs-ptr-deref branch January 25, 2024 05:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants