BACKPORT: mm: remove gup_flags FOLL_WRITE games from __get_user_pages()
commit 19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619 upstream.

This is an ancient bug that I actually attempted to fix once
(badly) eleven years ago in commit 4ceb5db ("Fix
get_user_pages() race for write access"), but that fix was then undone
due to problems on s390 by commit f33ea7f ("fix get_user_pages bug").

In the meantime, the s390 situation has long been fixed, and we can now
fix it by checking the pte_dirty() bit properly (and do it better).  The
s390 dirty bit was implemented in abf09bed3cce ("s390/mm: implement
software dirty bits") which made it into v3.9.  Earlier kernels will
have to look at the page state itself.

Also, the VM has become more scalable, and what used to be a purely
theoretical race back then has become easier to trigger.

To fix it, we introduce a new internal FOLL_COW flag to mark the "yes,
we already did a COW" case rather than play racy games with the very
fundamental FOLL_WRITE flag, and then use the pte dirty flag to
validate that the FOLL_COW flag is still valid.
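
As a minimal illustration of that flow (assuming nothing beyond what the
two hunks below add), here is a self-contained userspace sketch, not
kernel code: fake_pte, fault_in_cow() and the main() driver are
hypothetical stand-ins, while the flag values and the
can_follow_write_pte() check mirror the patch. A forced write first
fails the check, the simulated COW break leaves a dirty but still
non-writable pte, FOLL_COW is recorded, and the retry is then allowed
only because the dirty bit shows the COW copy is still in place.

/*
 * Simplified userspace model of the fix (not kernel code): fake_pte,
 * fault_in_cow() and main() are illustrative stand-ins; only the flag
 * values and the can_follow_write_pte() check mirror the real patch.
 */
#include <stdbool.h>
#include <stdio.h>

#define FOLL_WRITE 0x01   /* caller wants write access */
#define FOLL_FORCE 0x10   /* caller may write even to a read-only mapping */
#define FOLL_COW   0x4000 /* internal: a COW break has already been done */

struct fake_pte { bool write; bool dirty; };

/* Same check the patch adds to follow_page()/follow_page_mask(). */
static bool can_follow_write_pte(struct fake_pte pte, unsigned int flags)
{
	return pte.write ||
	       ((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte.dirty);
}

/* Stand-in for handle_mm_fault() breaking COW: the private copy may stay
 * non-writable in the page tables, but it is marked dirty. */
static void fault_in_cow(struct fake_pte *pte)
{
	pte->write = false;
	pte->dirty = true;
}

int main(void)
{
	struct fake_pte pte = { false, false };
	unsigned int flags = FOLL_WRITE | FOLL_FORCE;

	/* First lookup fails: not writable, and no COW has happened yet. */
	if (!can_follow_write_pte(pte, flags)) {
		fault_in_cow(&pte);
		flags |= FOLL_COW;	/* what __get_user_pages() now records */
	}

	/* The retry succeeds only because the pte is dirty, i.e. the COW
	 * copy is demonstrably still the page whose COW we broke. */
	printf("forced write allowed after COW: %s\n",
	       can_follow_write_pte(pte, flags) ? "yes" : "no");
	return 0;
}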

Change-Id: Id9bec3722797dff7d0ff0d9f6097c4229e31fd62
Reported-and-tested-by: Phil "not Paul" Oester <[email protected]>
Acked-by: Hugh Dickins <[email protected]>
Reviewed-by: Michal Hocko <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Oleg Nesterov <[email protected]>
Cc: Willy Tarreau <[email protected]>
Cc: Nick Piggin <[email protected]>
Cc: Greg Thelen <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
[wt: s/gup.c/memory.c; s/follow_page_pte/follow_page_mask;
     s/faultin_page/__get_user_page]
Signed-off-by: Willy Tarreau <[email protected]>
torvalds authored and Caio99BR committed Nov 5, 2016
1 parent 4a6da14 commit 0d2aaff
Showing 2 changed files with 13 additions and 2 deletions.
1 change: 1 addition & 0 deletions include/linux/mm.h
@@ -1549,6 +1549,7 @@ struct page *follow_page(struct vm_area_struct *, unsigned long address,
 #define FOLL_MLOCK 0x40 /* mark page as mlocked */
 #define FOLL_SPLIT 0x80 /* don't return transhuge pages, split them */
 #define FOLL_HWPOISON 0x100 /* check page is hwpoisoned */
+#define FOLL_COW 0x4000 /* internal GUP flag */
 
 typedef int (*pte_fn_t)(pte_t *pte, pgtable_t token, unsigned long addr,
 			void *data);
14 changes: 12 additions & 2 deletions mm/memory.c
@@ -1449,6 +1449,16 @@ int zap_vma_ptes(struct vm_area_struct *vma, unsigned long address,
 }
 EXPORT_SYMBOL_GPL(zap_vma_ptes);
 
+/*
+ * FOLL_FORCE can write to even unwritable pte's, but only
+ * after we've gone through a COW cycle and they are dirty.
+ */
+static inline bool can_follow_write_pte(pte_t pte, unsigned int flags)
+{
+	return pte_write(pte) ||
+		((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte));
+}
+
 /**
  * follow_page - look up a page descriptor from a user-virtual address
  * @vma: vm_area_struct mapping @address
@@ -1531,7 +1541,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 	pte = *ptep;
 	if (!pte_present(pte))
 		goto no_page;
-	if ((flags & FOLL_WRITE) && !pte_write(pte))
+	if ((flags & FOLL_WRITE) && !can_follow_write_pte(pte, flags))
 		goto unlock;
 
 	page = vm_normal_page(vma, address, pte);
@@ -1834,7 +1844,7 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 				 */
 				if ((ret & VM_FAULT_WRITE) &&
 				    !(vma->vm_flags & VM_WRITE))
-					foll_flags &= ~FOLL_WRITE;
+					foll_flags |= FOLL_COW;
 
 				cond_resched();
 			}
