Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issue on BeatmapSet.IsFileUsed(string filePath) #10

Open
Milkitic opened this issue Nov 24, 2021 · 1 comment
Open

Performance issue on BeatmapSet.IsFileUsed(string filePath) #10

Milkitic opened this issue Nov 24, 2021 · 1 comment

Comments

@Milkitic
Copy link

Mapset Verifier will hit a horrible performance issue while reading https://osu.ppy.sh/beatmapsets/1619982 this mapset with storyboard caused by CheckUnusedFiles plugin. It is mainly because that there is a large amount of files in this mapset.

/// <summary> Returns whether the given full file path is used by the beatmapset. </summary>
public bool IsFileUsed(string filePath)
{
string relativePath = PathStatic.RelativePath(filePath, songPath);
string fileName = relativePath.Split(new char[] { '/', '\\' }).Last().ToLower();
string parsedPath = PathStatic.ParsePath(relativePath);
string strippedPath = PathStatic.ParsePath(relativePath, withoutExtension: true);
if (beatmaps.Any(beatmap => beatmap.generalSettings.audioFileName.ToLower() == parsedPath))
return true;
// When the path is "go", and "go.png" is over "go.jpg" in order, then "go.jpg" will be the one used.
// So we basically want to find the last path which matches the name.
string lastMatchingPath = PathStatic.ParsePath(GetLastMatchingPath(parsedPath));
// These are always used, but you won't be able to update them unless they have the right format.
if (fileName.EndsWith(".osu"))
return true;
if (beatmaps.Any(beatmap =>
beatmap.sprites .Any(element => element.path.ToLower() == parsedPath) ||
beatmap.videos .Any(element => element.path.ToLower() == parsedPath) ||
beatmap.backgrounds .Any(element => element.path.ToLower() == parsedPath) ||
beatmap.animations .Any(element => element.path.ToLower() == parsedPath) ||
beatmap.samples .Any(element => element.path.ToLower() == parsedPath)))
return true;
// animations cannot be stripped of their extension
if (beatmaps.Any(beatmap =>
beatmap.sprites .Any(element => element.strippedPath == strippedPath) ||
beatmap.videos .Any(element => element.strippedPath == strippedPath) ||
beatmap.backgrounds .Any(element => element.strippedPath == strippedPath) ||
beatmap.samples .Any(element => element.strippedPath == strippedPath)) &&
parsedPath == lastMatchingPath)
return true;
if (osb != null && (
osb.sprites .Any(element => element.path.ToLower() == parsedPath) ||
osb.videos .Any(element => element.path.ToLower() == parsedPath) ||
osb.backgrounds .Any(element => element.path.ToLower() == parsedPath) ||
osb.animations .Any(element => element.path.ToLower() == parsedPath) ||
osb.samples .Any(anElement => anElement.path.ToLower() == parsedPath)))
return true;
if (osb != null && (
osb.sprites .Any(element => element.strippedPath == strippedPath) ||
osb.videos .Any(element => element.strippedPath == strippedPath) ||
osb.backgrounds .Any(element => element.strippedPath == strippedPath) ||
osb.samples .Any(element => element.strippedPath == strippedPath)) &&
parsedPath == lastMatchingPath)
return true;
if (beatmaps.Any(beatmap => beatmap.hitObjects.Any(hitObject =>
(hitObject.filename != null ? PathStatic.ParsePath(hitObject.filename, true) : null) == strippedPath)))
return true;
if (hitSoundFiles.Any(hsPath => PathStatic.ParsePath(hsPath) == parsedPath))
return true;
if (SkinStatic.IsUsed(fileName, this))
return true;
if (fileName == GetOsbFileName().ToLower() && osb.IsUsed())
return true;
foreach (Beatmap beatmap in beatmaps)
if (IsAnimationPathUsed(parsedPath, beatmap.animations))
return true;
if (osb != null && IsAnimationPathUsed(parsedPath, osb.animations))
return true;
return false;
}

Naxesss added a commit that referenced this issue Nov 24, 2021
Improves performance on mapsets with extreme amounts of files.

In the case of https://osu.ppy.sh/beatmapsets/1619982, we improve loading time by more than 100x (from ~120 seconds to 1 second).

See #10
@Naxesss
Copy link
Owner

Naxesss commented Nov 24, 2021

On top of the above, the following code also quickly becomes slow as the amount of files increases:

private List<string> GetUsedHitSoundFiles()
{
IEnumerable<string> hitSoundFilePaths =
songFilePaths.Select(path => path.Substring(songPath.Length + 1));
// If we input a path here, like "sb/c.ogg", it won't be found since we only check for the name itself.
IEnumerable<string> usedHitSoundFiles =
GetUsedHitSoundFilesOf(hitSoundFilePaths);
return usedHitSoundFiles.ToList();
}

As far as I've diagnosed, the bottleneck for the latter case seems to be the loop of GetLastMatchingPath, which for each file checks all directory paths for any starting with the same name. This is also used in CheckUnusedFiles.

To address the above, I made the following optimizations:

Step Optimization Time to load mapset
0 - 126 seconds
1 Filter for the file name in Directory.EnumerateFiles of GetLastMatchingPath 31 seconds
2 Refactor GetUsedHitSoundFilesOf to make better use of GetLastMatchingPath 1 second

Running Mapset Verifier over the same mapset now takes about 16 seconds instead of 5 minutes. Although I could probably improve this more, I think it's acceptable enough now, especially considering how rare these cases are.

I will want to test this before releasing it, though, in case I've introduced any bugs by doing this. Will leave this open until then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants