Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue 22306 - scope array variable should be stack allocated #14562

Merged
merged 2 commits into from
Oct 18, 2022

Conversation

dkorpel
Copy link
Contributor

@dkorpel dkorpel commented Oct 14, 2022

This includes a backend change @kinke @ibuclaw. I tried to make it a rewrite like in #11039, but didn't find a way to declare a temporary static array in the parent scope so it wouldn't immediately be destructed after the array literal assignment. So currently, I'm mirroring the way it's done with scope Object o = new Object() and keep an onstack variable.

@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @dkorpel! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the coverage diff by visiting the details link of the codecov check)
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Auto-close Bugzilla Severity Description
22306 enhancement scope array variable should be stack allocated

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#14562"

@Geod24
Copy link
Member

Geod24 commented Oct 14, 2022

Hum, the real MVP here would be to support scope int[] array = new int[](someRuntimeValue);

@dkorpel
Copy link
Contributor Author

dkorpel commented Oct 15, 2022

Hum, the real MVP here would be to support scope int[] array = new int[](someRuntimeValue);

I know two kinds of MVP, do you refer to "Minimum viable" or "Most valuable"? Because the run time value makes it a lot harder. You need to decide on a good size cap to prevent stack overflow, and dynamically switch at runtime. It can't statically be @nogc either.

@dkorpel dkorpel marked this pull request as ready for review October 15, 2022 11:14
@dkorpel dkorpel requested a review from ibuclaw as a code owner October 15, 2022 11:14
@Geod24
Copy link
Member

Geod24 commented Oct 15, 2022

You need to decide on a good size cap to prevent stack overflow, and dynamically switch at runtime.

static arrays already have a limit, how about re-using that ?

It can't statically be @nogc either.

That's the whole point. Instead of dynamically switching at runtime, you just give a hard error. It's easy enough to check beforehand.

I know two kinds of MVP

Actually I was thinking "Most Valuable Player" but not sure my usage was right 🙃

@dkorpel
Copy link
Contributor Author

dkorpel commented Oct 15, 2022

static arrays already have a limit, how about re-using that ?

That limit is far above a typical stack size. I'm basically asking for stack overflow if I write this:

void main() {
    ubyte[32*1024*1024] x; // Initializes 32 Mb of stack memory
}

But here, I don't want the compiler to irresponsibly move this GC allocation to the stack:

void main() {
    scope ubyte[] x = new ubyte[](32*1024*1024);
}

That's the whole point. Instead of dynamically switching at runtime, you just give a hard error. It's easy enough to check beforehand.

When I use the new operator, I assume free space on the heap is available to me, I don't want an unnecessary Out of Memory error.

Actually I was thinking "Most Valuable Player" but not sure my usage was right 🙃

I think it's right, I'm just not used to hearing it in a software context where "Minimum Viable Product" is more prevalent.

@maxhaton
Copy link
Member

A general framework for moving gc allocations either onto the stack or malloc and free or whatever would be good but requires a level of trust in the rest of the frontend that I don't really have

@maxhaton
Copy link
Member

A test involving a scope array and a closure would be good, i.e. to make sure it ends up on the correct "stack"

@thewilsonator
Copy link
Contributor

If you want GC promotions to the stack, use LDC

@nordlow
Copy link
Contributor

nordlow commented Oct 16, 2022

Hum, the real MVP here would be to support scope int[] array = new int[](someRuntimeValue);

Exactly my thought aswell. Such an optimization could have unforeseen (big) benefits to performance of allocation-heavy code. Moreover if the compiler could track uniqueness of array references it could do even more awesome optimizations like understanding that x ~= ... can reallocate x in-place thereby avoiding the need for manual cluttering optimizations like replacing T[] with Appender!(T[]) potentially resulting in even greater speed-ups in allocation-heavy code.

Cases such as

void f()
{
    class C { int x; }
    scope c = new C(); // c is freed at the end of f()
}

already uses malloc+free stack-allocation under the hood. Can we please have arrays utilize the same optimization? Starting with POD-types element types only, I guess, as described above.

@Geod24
Copy link
Member

Geod24 commented Oct 16, 2022

already uses malloc+free under the hood.

No it doesn't. It's stack allocated, not malloced, and has been since D1.

If you want GC promotions to the stack, use LDC

It's an optimization, which is great, but has the major downsides of all optimizations: It cannot be enforced (easily & statically).

That limit is far above a typical stack size.

What's a typical stack size ? There are a myriad of ways to control stack size (ulimit, fibers) and thus there's no "Right default".

When I use the new operator, I assume free space on the heap is available to me, I don't want an unnecessary Out of Memory error.

Out of memory happens when an allocation can't be performed. Where the allocation is, that's an implementation detail. Remember we have different types of GC.

Also you can do this currently:

class Foobar
{
    int[32*1024*1024 - 1] loads;
}

void main () @nogc
{
    scope f = new Foobar;
}

And yes it SEGV.

@nordlow
Copy link
Contributor

nordlow commented Oct 16, 2022

already uses malloc+free under the hood.

No it doesn't. It's stack allocated, not malloced, and has been since D1.

Ok, sorry. My mistake. I've overstriked it.

class Foobar { int[32*1024*1024 - 1] loads; }
void main () @nogc { scope f = new Foobar; }

And yes it SEGV.

Would be worthwhile looking into if GCC/Clang statically diagnoses corresponding C++ code as an error:

Clang allows

class C { int data[1*1024*1024*1024]; };

whereas disallows

class C { int data[2*1024*1024*1024]; };

with output

<stdin>:1:15: error: fields must have a constant size: 'variable length array in structure' extension will never be supported
<stdin>:1:15: warning: private field 'data' is not used

.

@Geod24
Copy link
Member

Geod24 commented Oct 17, 2022

whereas disallows

It doesn't disallow it. It just overflows the integer constant, leading to a negative size, and clang fails to correctly diagnose this, while GCC tells you exactly what the problem is. If you use 2*1024*1024*1024UL it'll happily compile.

@dkorpel
Copy link
Contributor Author

dkorpel commented Oct 17, 2022

What's a typical stack size ?

1 Mb on Windows, 8 Mb on linux.

There are a myriad of ways to control stack size (ulimit, fibers) and thus there's no "Right default".

In any case, you generally want to keep a function's stack frame under 4 Kb. Even if you don't overflow, large allocations can degrade cache performance, so it's not always better to move from the heap to the stack.

Out of memory happens when an allocation can't be performed. Where the allocation is, that's an implementation detail.

Here's a compliant implementation of malloc:

void* malloc(size_t sz) { return null; }

That it actually attempts to allocate memory that the system has is also an implementation detail, but one that's very important to users.

Also you can do this currently:

That's still a static size, so the compiler can (and should) diagnose that. I don't think it's a pressing issue though, classes usually don't have a huge instance size.

@WalterBright
Copy link
Member

Hum, the real MVP here would be to support scope int[] array = new int;

That is an important topic for discussion, but should not delay this PR.

@WalterBright WalterBright merged commit 92299d8 into dlang:master Oct 18, 2022
@WalterBright
Copy link
Member

@dkorpel could you please follow this up with a spec PR?

@nordlow
Copy link
Contributor

nordlow commented Oct 18, 2022

It doesn't disallow it. It just overflows the integer constant, leading to a negative size, and clang fails to correctly diagnose this, while GCC tells you exactly what the problem is. If you use 2*1024*1024*1024UL it'll happily compile.

Oops. Thanks.

@dkorpel dkorpel deleted the scope-array-stack-1 branch October 19, 2022 09:53
@dkorpel
Copy link
Contributor Author

dkorpel commented Oct 19, 2022

@dkorpel could you please follow this up with a spec PR?

dlang/dlang.org#3442

@ntrel
Copy link
Contributor

ntrel commented Oct 19, 2022

class Foobar
{
    int[32*1024*1024 - 1] loads;
}

void main () @nogc
{
    scope f = new Foobar;
}

And yes it SEGV.

@Geod24 interesting, is this filed in bugzilla?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants