Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace serde_json with simd_json #3174

Draft
wants to merge 2 commits into
base: latest
Choose a base branch
from
Draft

Replace serde_json with simd_json #3174

wants to merge 2 commits into from

Conversation

luizirber
Copy link
Member

@luizirber luizirber commented May 25, 2024

I found out the simd-json crate supports the same interface as serde_json, so it is an easy swap for potentially much better perf.

Drawbacks are only supported on x86_64, so need to keep serde_json for all other archs. this is now supported on ARM and wasm too, with (slower) fallbacks when conditions are not met.

This PR also shows how spread the use of serde_json::from_reader across the codebase is, should probably consolidate into a method on signatures... rebased on top of #3443, which greatly limited changes

Copy link

codecov bot commented May 25, 2024

Codecov Report

Attention: Patch coverage is 85.71429% with 1 line in your changes missing coverage. Please review.

Project coverage is 86.39%. Comparing base (b69c960) to head (a35e64b).

Files with missing lines Patch % Lines
src/core/src/signature.rs 66.66% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           latest    #3174   +/-   ##
=======================================
  Coverage   86.38%   86.39%           
=======================================
  Files         137      137           
  Lines       16125    16133    +8     
  Branches     2219     2219           
=======================================
+ Hits        13930    13938    +8     
  Misses       1888     1888           
  Partials      307      307           
Flag Coverage Δ
hypothesis-py 25.43% <ø> (ø)
python 92.40% <ø> (ø)
rust 62.18% <85.71%> (+0.09%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@luizirber luizirber changed the base branch from latest to lirber/sig_from_reader December 19, 2024 18:48
@luizirber luizirber force-pushed the lirber/sig_from_reader branch from 8fe5104 to 32950d7 Compare December 19, 2024 19:02
@luizirber luizirber force-pushed the lirber/simd_json branch 2 times, most recently from eb471c8 to c222848 Compare December 19, 2024 19:25
@luizirber
Copy link
Member Author

Tried out with a big signature (1.3GB compressed):

5m34s -> 3m29s
14.2 GB -> 23.3 GB

Tradeoffs, tradeoffs... =]


This PR:

        Command being timed: "sourmash sig describe /data/wort/wort-sra/sigs/SRR21113408.sig"
        User time (seconds): 160.24
        System time (seconds): 50.81
        Percent of CPU this job got: 100%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 3:29.01
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 23327760
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 359
        Minor (reclaiming a frame) page faults: 25031343
        Voluntary context switches: 64
        Involuntary context switches: 2786
        Swaps: 0
        File system inputs: 2566414
        File system outputs: 22
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

latest:

        Command being timed: "sourmash sig describe /data/wort/wort-sra/sigs/SRR21113408.sig"
        User time (seconds): 292.09
        System time (seconds): 43.21
        Percent of CPU this job got: 100%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 5:34.15
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 14223848
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 4812
        Minor (reclaiming a frame) page faults: 20088817
        Voluntary context switches: 1282
        Involuntary context switches: 3187
        Swaps: 0
        File system inputs: 5159955
        File system outputs: 22
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Base automatically changed from lirber/sig_from_reader to latest December 19, 2024 21:29
@luizirber luizirber changed the title Replace serde_json with simd_json where supported Replace serde_json with simd_json Dec 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant