[API Proposal]: System.Linq.Shuffle()
#111221
Tagging subscribers to this area: @dotnet/area-system-linq
See #78419 (comment), #73864 (comment).
Even if you say "you can easily implement it yourself," such an implementation is often not efficient, and if people (including me) have implemented it themselves many times, I think that is a reason to include it in the official API. I don't think it's a problem even if If "Random, not the collection, should be responsible", would that be
Yes, the key point is how to implement the shuffle effect with randomness.
The
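I agree, but similar problems aren't always the responsibility of the runtime. As I mentioned above, higher-level developers and libraries can provide their own encapsulation on top of the BCL. You can add the following extension method, which I think covers 99% of usage scenarios.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// The wrapper class name is arbitrary; extension methods must live in a static class.
public static class ListShuffleExtensions
{
    public static void Shuffle<T>(this IList<T> source)
    {
        // Fast paths: arrays and List<T> can be shuffled in place as spans (.NET 8+).
        if (source is T[] array)
        {
            Random.Shared.Shuffle(array);
            return;
        }
        if (source is List<T> list)
        {
            Random.Shared.Shuffle(CollectionsMarshal.AsSpan(list));
            return;
        }
        ShuffleSlow(source);

        // Fallback: Fisher-Yates shuffle through the IList<T> indexer.
        static void ShuffleSlow(IList<T> values)
        {
            int n = values.Count;
            var rand = Random.Shared;
            for (int i = 0; i < n - 1; i++)
            {
                // Pick a random position from the not-yet-fixed tail [i, n).
                int j = rand.Next(i, n);
                if (j != i)
                {
                    (values[j], values[i]) = (values[i], values[j]);
                }
            }
        }
    }
}
```

Even so, you should be aware:
So you can see, the answer depends on your use case. You have to change the implementation as you require.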
How could this be implemented without bias in cases where the length of the enumerable isn't known ahead of time or could potentially be infinite?
To answer my own question, this is typically addressed using reservoir sampling. This necessarily introduces bias in the generated permutations; however, it might be a good-enough compromise for some use cases. One potential implementation could involve implementing the classical shuffling algorithm for sources implementing
As far as I understand, reservoir sampling is an algorithm that extracts only part of the elements, so it may be difficult to apply it to the current application, which requires shuffling the entire array. In my opinion, it would be implemented like Of course, there may be better ideas.
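To make the point about partial extraction concrete, here is a minimal sketch of classic reservoir sampling (Algorithm R). The names `ReservoirSketch` and `Sample` are hypothetical and exist only for this illustration; the sketch assumes .NET 6 or later for `Random.NextInt64`. It is not code from this thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not part of the proposal.
public static class ReservoirSketch
{
    // Algorithm R: keeps a uniform random sample of at most k items from a
    // source of unknown length, in a single pass and O(k) memory.
    public static T[] Sample<T>(IEnumerable<T> source, int k, Random random)
    {
        ArgumentNullException.ThrowIfNull(source);
        ArgumentNullException.ThrowIfNull(random);
        if (k < 0) throw new ArgumentOutOfRangeException(nameof(k));

        var reservoir = new List<T>(k);
        long seen = 0;
        foreach (T item in source)
        {
            seen++;
            if (reservoir.Count < k)
            {
                // The first k items fill the reservoir directly.
                reservoir.Add(item);
            }
            else
            {
                // Item number `seen` replaces a random slot with probability k / seen.
                long j = random.NextInt64(seen); // uniform in [0, seen)
                if (j < k) reservoir[(int)j] = item;
            }
        }
        return reservoir.ToArray();
    }
}
```

The result holds only `k` of the source elements, and their order is not itself a uniform shuffle, so a full `Shuffle()` would still need to buffer every element of a finite source.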
Background and motivation
Shuffling is a common operation, used for example in games and in preprocessing for machine learning.
LINQ includes APIs for changing the order, such as OrderBy() and Reverse(). However, there is no dedicated API for shuffling an IEnumerable<T> yet. (There is System.Random.Shuffle() for Span<T>, but if you want to take an arbitrary IEnumerable<T> sequence, you'll probably need a custom implementation.)
On StackOverflow, we often see examples of shuffling implemented with code like .OrderBy(_ => Guid.NewGuid()). However, this implementation is inefficient, and there are more efficient ways to shuffle.
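For comparison, here is a minimal sketch of the buffer-then-shuffle approach for a finite sequence. The names `ShuffleSketch` and `ShuffleBuffered` are hypothetical, chosen for illustration only, and are not the API proposed in this issue; the sketch assumes .NET 8 or later, where `Random.Shuffle` is available.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration only; not the proposed API shape.
public static class ShuffleSketch
{
    // Buffers the whole source into a new array, then shuffles that array
    // in place in O(n) via Random.Shuffle. The source itself is not modified.
    public static T[] ShuffleBuffered<T>(this IEnumerable<T> source, Random? random = null)
    {
        ArgumentNullException.ThrowIfNull(source);
        T[] buffer = source.ToArray();
        // Fall back to the shared thread-safe instance when no Random is supplied.
        (random ?? Random.Shared).Shuffle(buffer);
        return buffer;
    }
}
```

A call such as `Enumerable.Range(0, 10).ShuffleBuffered(new Random(42))` runs eagerly, unlike most LINQ operators. By contrast, `.OrderBy(_ => Guid.NewGuid())` generates a GUID per element and then sorts in O(n log n).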
Therefore, I propose to implement System.Linq.Shuffle().
API Proposal
API Usage
Alternative Designs
Possible considerations:
- IOrderedEnumerable<T> for OrderBy().
- Random instance, which allows for reproducible shuffling, but requires a fixed implementation.
Besides, Shuffle() allows for an easy (though not optimal) solution to #102229.
Risks
No response