
Basic Performance Benchmarking #6

Open
yuvipanda opened this issue Nov 16, 2024 · 1 comment
@yuvipanda
Member

I thought it would be useful to run some basic perf benchmarks, particularly against EFS.

fio is a very helpful tool for running performance benchmarks. I made an image with fio installed that can run on JupyterHub: quay.io/yuvipanda/fio-notebook:latest.

At the most basic level, I ran fio --filename=test --size=1GB --direct=1 --rw=randrw --bs=4k --ioengine=libaio --iodepth=256 --runtime=120 --numjobs=4 --time_based --group_reporting --name=iops-test-job --eta-newline=1 against both EFS and this project + EBS:
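For scripted comparisons, fio can also emit machine-readable results with `--output-format=json`. A minimal sketch of pulling IOPS and bandwidth out of that output; the dict below is an abbreviated, hand-written stand-in for real fio output, but the field names follow fio's JSON schema (`jobs[].read.iops`, `jobs[].write.bw`, with bandwidth in KiB/s):

```python
# Sketch: parse fio --output-format=json results for EFS-vs-EBS comparison.
# The JSON here is an abbreviated, illustrative stand-in for real fio output.
import json

sample = json.loads("""
{
  "jobs": [
    {
      "jobname": "iops-test-job",
      "read":  {"iops": 2801.09, "bw": 11204},
      "write": {"iops": 2802.46, "bw": 11209}
    }
  ]
}
""")

for job in sample["jobs"]:
    r, w = job["read"], job["write"]
    print(f'{job["jobname"]}: read {r["iops"]:.0f} IOPS ({r["bw"]} KiB/s), '
          f'write {w["iops"]:.0f} IOPS ({w["bw"]} KiB/s)')
```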

EFS:

iops-test-job: (groupid=0, jobs=4): err= 0: pid=69: Sat Nov 16 04:44:17 2024
  read: IOPS=2801, BW=10.9MiB/s (11.5MB/s)(1315MiB/120182msec)
    slat (usec): min=2, max=138, avg= 7.81, stdev= 3.47
    clat (msec): min=2, max=349, avg=177.56, stdev= 9.68
     lat (msec): min=2, max=349, avg=177.57, stdev= 9.68
    clat percentiles (msec):
     |  1.00th=[  163],  5.00th=[  167], 10.00th=[  169], 20.00th=[  171],
     | 30.00th=[  174], 40.00th=[  176], 50.00th=[  178], 60.00th=[  180],
     | 70.00th=[  182], 80.00th=[  184], 90.00th=[  188], 95.00th=[  192],
     | 99.00th=[  205], 99.50th=[  209], 99.90th=[  232], 99.95th=[  296],
     | 99.99th=[  338]
   bw (  KiB/s): min= 9408, max=13088, per=100.00%, avg=11204.38, stdev=146.81, samples=960
   iops        : min= 2352, max= 3272, avg=2801.09, stdev=36.71, samples=960
  write: IOPS=2802, BW=10.9MiB/s (11.5MB/s)(1316MiB/120182msec); 0 zone resets
    slat (usec): min=2, max=360, avg= 8.18, stdev= 3.62
    clat (msec): min=7, max=359, avg=187.88, stdev=10.19
     lat (msec): min=7, max=359, avg=187.88, stdev=10.19
    clat percentiles (msec):
     |  1.00th=[  171],  5.00th=[  176], 10.00th=[  178], 20.00th=[  182],
     | 30.00th=[  184], 40.00th=[  186], 50.00th=[  188], 60.00th=[  190],
     | 70.00th=[  192], 80.00th=[  194], 90.00th=[  199], 95.00th=[  203],
     | 99.00th=[  218], 99.50th=[  222], 99.90th=[  245], 99.95th=[  300],
     | 99.99th=[  347]
   bw (  KiB/s): min= 9616, max=12472, per=99.99%, avg=11209.85, stdev=124.00, samples=960
   iops        : min= 2404, max= 3118, avg=2802.46, stdev=31.00, samples=960
  lat (msec)   : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.03%, 100=0.04%
  lat (msec)   : 250=99.82%, 500=0.09%
  cpu          : usr=0.65%, sys=1.76%, ctx=576216, majf=0, minf=58
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwts: total=336641,336809,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):
   READ: bw=10.9MiB/s (11.5MB/s), 10.9MiB/s-10.9MiB/s (11.5MB/s-11.5MB/s), io=1315MiB (1379MB), run=120182-120182msec
  WRITE: bw=10.9MiB/s (11.5MB/s), 10.9MiB/s-10.9MiB/s (11.5MB/s-11.5MB/s), io=1316MiB (1380MB), run=120182-120182msec

EBS:

iops-test-job: (groupid=0, jobs=4): err= 0: pid=65: Sat Nov 16 04:42:17 2024
  read: IOPS=2873, BW=11.2MiB/s (11.8MB/s)(1349MiB/120182msec)
    slat (nsec): min=1831, max=3259.2k, avg=8388.23, stdev=12247.11
    clat (usec): min=1628, max=353655, avg=170253.27, stdev=17594.25
     lat (usec): min=1633, max=353685, avg=170261.65, stdev=17594.25
    clat percentiles (msec):
     |  1.00th=[   48],  5.00th=[  157], 10.00th=[  161], 20.00th=[  165],
     | 30.00th=[  167], 40.00th=[  169], 50.00th=[  171], 60.00th=[  174],
     | 70.00th=[  176], 80.00th=[  180], 90.00th=[  182], 95.00th=[  186],
     | 99.00th=[  190], 99.50th=[  194], 99.90th=[  300], 99.95th=[  326],
     | 99.99th=[  347]
   bw (  KiB/s): min= 8000, max=33176, per=100.00%, avg=11495.90, stdev=384.46, samples=960
   iops        : min= 2000, max= 8294, avg=2873.98, stdev=96.11, samples=960
  write: IOPS=2874, BW=11.2MiB/s (11.8MB/s)(1349MiB/120182msec); 0 zone resets
    slat (usec): min=2, max=575, avg= 8.82, stdev=11.05
    clat (usec): min=1669, max=377827, avg=186013.12, stdev=19548.99
     lat (usec): min=1672, max=377834, avg=186021.94, stdev=19549.02
    clat percentiles (msec):
     |  1.00th=[   50],  5.00th=[  171], 10.00th=[  176], 20.00th=[  180],
     | 30.00th=[  182], 40.00th=[  186], 50.00th=[  188], 60.00th=[  190],
     | 70.00th=[  192], 80.00th=[  197], 90.00th=[  201], 95.00th=[  205],
     | 99.00th=[  211], 99.50th=[  215], 99.90th=[  321], 99.95th=[  347],
     | 99.99th=[  359]
   bw (  KiB/s): min= 8656, max=34448, per=99.99%, avg=11496.73, stdev=390.81, samples=960
   iops        : min= 2164, max= 8612, avg=2874.18, stdev=97.70, samples=960
  lat (msec)   : 2=0.01%, 4=0.01%, 10=0.02%, 20=0.03%, 50=1.03%
  lat (msec)   : 100=0.17%, 250=98.50%, 500=0.23%
  cpu          : usr=0.51%, sys=1.40%, ctx=533657, majf=0, minf=51
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwts: total=345360,345439,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):
   READ: bw=11.2MiB/s (11.8MB/s), 11.2MiB/s-11.2MiB/s (11.8MB/s-11.8MB/s), io=1349MiB (1415MB), run=120182-120182msec
  WRITE: bw=11.2MiB/s (11.8MB/s), 11.2MiB/s-11.2MiB/s (11.8MB/s-11.8MB/s), io=1349MiB (1415MB), run=120182-120182msec

EBS is slightly faster, but I think these numbers mostly reflect the limits of NFS itself rather than of EFS or EBS.
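One way to sanity-check that reading: with `--iodepth=256` and `--numjobs=4` there are roughly 1024 I/Os in flight at all times, so by Little's law the average completion latency is pinned at (in-flight I/Os) / (total IOPS) no matter how fast the backend is. A quick sketch using the numbers above:

```python
# Little's law sanity check on the fio results above: average latency
# should equal (in-flight I/Os) / (total IOPS) at a saturated queue depth.
iodepth, numjobs = 256, 4
in_flight = iodepth * numjobs          # 1024 outstanding I/Os

efs_iops = 2801 + 2802                 # read + write IOPS from the EFS run
ebs_iops = 2873 + 2874                 # read + write IOPS from the EBS run

for name, iops in [("EFS", efs_iops), ("EBS", ebs_iops)]:
    latency_ms = in_flight / iops * 1000
    print(f"{name}: predicted avg latency {latency_ms:.0f} ms")
# Predicts ~183 ms for EFS and ~178 ms for EBS, right in line with the
# reported clat averages (~178/188 ms and ~170/186 ms): the latency at this
# depth is mostly queueing, not the storage backend.
```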

We need to test concurrent access next, with X clients running this workload simultaneously.
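A hypothetical sketch of that concurrent test: launch N fio client pods against the same shared filesystem. The pod names, the `/shared` mount path, and reuse of the quay.io/yuvipanda/fio-notebook image are assumptions, not settled plans; the commands are echoed rather than executed so they can be reviewed first.

```shell
# Sketch (assumptions: /shared mount path, pod naming, image reuse).
# Echoes the kubectl commands instead of running them.
N=4
for i in $(seq 1 "$N"); do
  echo kubectl run "fio-client-$i" \
    --image=quay.io/yuvipanda/fio-notebook:latest --restart=Never -- \
    fio --filename="/shared/test-$i" --size=1GB --direct=1 --rw=randrw \
      --bs=4k --ioengine=libaio --iodepth=256 --runtime=120 --numjobs=4 \
      --time_based --group_reporting --name="iops-test-job-$i"
done
```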

@achtsnits

It looks like there's a general trend of moving away from EFS for cloud-based JupyterHub offerings...

I'm curious about the EFS stats here: are you using EFS bursting throughput mode on a relatively empty volume of perhaps 200GB? With more data, or if you switch to elastic throughput mode (resulting in less predictable cloud costs), you'd likely see much higher throughput.

Also, which type of node are you hosting the NFS instance on? That will affect the max EBS throughput. It may not matter for the test above, but it could if the test is repeated with lots of NFS clients (i.e. users), not just a few.

By the way, what's your plan for setting up HA? We're considering moving away from NFS to a distributed (Kubernetes-native) filesystem, but we haven't finalized our decision yet. It's great to see your work and thoughts shared publicly; it's really helpful!
