Add MADV_COLLAPSE when committing a range? #628

Open
SchrodingerZhu opened this issue Aug 17, 2023 · 1 comment

Comments

@SchrodingerZhu
Collaborator

SchrodingerZhu commented Aug 17, 2023

Linux 6.1 introduced a new madvise flag, MADV_COLLAPSE. I wonder whether it would also be helpful in snmalloc. From the kernel commit message:

Introduce a new madvise mode, MADV_COLLAPSE, that allows users to request a
synchronous collapse of memory at their own expense.

The benefits of this approach are:

  • CPU is charged to the process that wants to spend the cycles for the
    THP
  • Avoid unpredictable timing of khugepaged collapse

An immediate user of this new functionality are malloc() implementations
that manage memory in hugepage-sized chunks, but sometimes subrelease
memory back to the system in native-sized chunks via MADV_DONTNEED;
zapping the pmd. Later, when the memory is hot, the implementation
could madvise(MADV_COLLAPSE) to re-back the memory by THPs to regain
hugepage coverage and dTLB performance. TCMalloc is such an
implementation that could benefit from this[2].
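A minimal sketch of the subrelease-then-collapse flow described in this paragraph, in C. This is not snmalloc or TCMalloc code: `reserve_aligned`, `release_range`, `try_collapse`, and `HUGE_PAGE_SIZE` are hypothetical names, the 2 MiB huge page size is an assumption (x86-64 and arm64 with 4 KiB base pages), and the fallback value 25 for `MADV_COLLAPSE` is the generic Linux ABI value for libc headers that predate 6.1.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* MADV_COLLAPSE arrived in Linux 6.1; older libc headers may not define it.
 * 25 is the value in the generic Linux mman header. */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* Assumption: PMD-sized huge pages are 2 MiB. */
#define HUGE_PAGE_SIZE ((size_t)2 << 20)

/* Hypothetical helper: map `size` bytes of private anonymous memory whose
 * start address is huge-page aligned. `size` should be a multiple of
 * HUGE_PAGE_SIZE. Returns NULL on failure. */
static void *reserve_aligned(size_t size)
{
    size_t padded = size + HUGE_PAGE_SIZE;
    char *raw = mmap(NULL, padded, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    uintptr_t addr = (uintptr_t)raw;
    uintptr_t aligned =
        (addr + HUGE_PAGE_SIZE - 1) & ~(uintptr_t)(HUGE_PAGE_SIZE - 1);
    size_t head = aligned - addr;
    size_t tail = padded - head - size;

    /* Trim the unaligned slack on both sides of the aligned window. */
    if (head != 0)
        munmap(raw, head);
    if (tail != 0)
        munmap((char *)aligned + size, tail);
    return (void *)aligned;
}

/* Subrelease: hand pages back to the kernel. For private anonymous memory
 * the next touch faults in fresh zero pages (native-sized unless THP
 * steps in again). */
static int release_range(void *base, size_t size)
{
    return madvise(base, size, MADV_DONTNEED);
}

/* Best-effort synchronous collapse of the range into huge pages.
 * Returns 0 on success, or -1 with errno set (for example EINVAL on
 * kernels older than 6.1). Callers should treat failure as non-fatal:
 * the memory stays usable, just backed by native-sized pages. */
static int try_collapse(void *base, size_t size)
{
    return madvise(base, size, MADV_COLLAPSE);
}
```

An allocator could call `try_collapse` on a huge-page-sized chunk once it has become hot again after an earlier `release_range`, and ignore the result. Where exactly such a call belongs in a given backend is left open here.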

Only privately-mapped anon memory is supported for now, but it is
expected that file and shmem support will be added later to support the
use-case of backing executable text by THPs. Current support provided
by CONFIG_READ_ONLY_THP_FOR_FS may take a long time on a large system
which might impair services from serving at their full rated load after
(re)starting. Tricks like mremap(2)'ing text onto anonymous memory to
immediately realize iTLB performance prevents page sharing and demand
paging, both of which increase steady state memory footprint. With
MADV_COLLAPSE, we get the best of both worlds: Peak upfront performance
and lower RAM footprints.

This call respects THP eligibility as determined by the system-wide
/sys/kernel/mm/transparent_hugepage/enabled sysfs settings and the VMA
flags for the memory range being collapsed.
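For reference, the sysfs file named above lists every mode and marks the active one with brackets, for example `always [madvise] never`. Below is a small C sketch for reading it. `parse_selected` and `read_thp_mode` are hypothetical helper names. How a particular kernel release weighs this setting for MADV_COLLAPSE is best confirmed in the madvise(2) man page for that release, so the sketch only reports the mode and does not decide eligibility.

```c
#include <stdio.h>
#include <string.h>

/* Copy the bracketed token of a line such as "always [madvise] never"
 * into `out`. Returns 0 on success, -1 if there is no well-formed
 * bracketed token or `out` is too small. */
static int parse_selected(const char *line, char *out, size_t out_len)
{
    const char *open = strchr(line, '[');
    if (open == NULL)
        return -1;
    const char *close = strchr(open + 1, ']');
    if (close == NULL)
        return -1;

    size_t len = (size_t)(close - open - 1);
    if (len + 1 > out_len)
        return -1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/* Read the system-wide THP mode. Returns -1 if the file is missing
 * (for example, a kernel built without THP support) or unreadable. */
static int read_thp_mode(char *out, size_t out_len)
{
    char line[128];
    FILE *f = fopen("/sys/kernel/mm/transparent_hugepage/enabled", "r");
    if (f == NULL)
        return -1;
    char *got = fgets(line, sizeof line, f);
    fclose(f);
    if (got == NULL)
        return -1;
    return parse_selected(line, out, out_len);
}
```

A caller might log the mode once at startup so that a failing collapse can be correlated with the system configuration.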

@mjp41
Member

mjp41 commented Aug 21, 2023

Thanks for highlighting this. It would be interesting to integrate this into the backend, but I am not yet sure how.
