Pi CM4 NVME performance: Bicool Mini Base vs. IO Board
Purpose
Compare performance of an NVME SSD on the (Waveshare) Bicool Mini Base A vs. the official Compute Module 4 IO Board. Testing is on CM4 H/W that uses direct PCIe/NVME connections (no USB adapter).
Motivation
I configured a recently received CM4 on a Mini Base and performed some initial testing to establish correct operation. Initial tests produced lower than expected disk bandwidth, but there are confounding factors that could affect the tests. The initial test was performed with the SSD installed on an IO Board using a PCIe/NVME adapter and a CM4 with 8GB RAM. The card was then moved to the Mini Base, which had a CM4 with 4GB RAM. Possible confounding factors:
- 4GB vs 8GB CM4
- Cooling on the SSD due to mounting location.
- Debian vs. Ubuntu (Further comparisons were performed between Debian on the CM4/8GB and the CM4/4GB on the respective base boards.)
- Differing ZFS versions on the Debian vs. Ubuntu installations.
Process
- Install vanilla Debian Bookworm to the 256GB NVME SSD. The same SSD will be used for all testing. It will be erased using blkdiscard prior to each installation and trimmed prior to each disk benchmark.
- Perform two benchmarks on the CM4/4GB on the Mini Base. Direct a small fan at the SSD to control temperature and note results including SSD temperature.
- Evaluate.
- Swap the NVME SSD to the IO Board with the CM4/8GB.
time -p iozone -f /home/hbarta/iozone.tst -e -o -a -s 100M -r 4k -i 0 -i 1 -i 2
time -p fio --randrepeat=1 --ioengine=libaio --gtod_reduce=1 --directory=/home/hbarta/ --name="io 256GB SSSTC NVME CM4/4GB Mini Base" --filename=test.dat --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
time -p fio --randrepeat=1 --ioengine=libaio --gtod_reduce=1 --directory=/home/hbarta/ --name="io 256GB SSSTC NVME CM4/8GB IO Board" --filename=test.dat --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
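The trim-then-benchmark sequence described in the process can be sketched as a small script. This is only an illustration, not the script used for these tests: the `run` helper, the `DRY_RUN` switch and the `TESTFILE` variable are hypothetical names introduced here, and timing with `time -p` is left out for simplicity. With the default `DRY_RUN=1` the script only prints the commands it would run.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical variable; the default path matches the iozone command above.
TESTFILE="${TESTFILE:-/home/hbarta/iozone.tst}"

# Trim the root filesystem, then run the iozone benchmark once.
trim_and_iozone() {
    run sudo fstrim -v /
    run iozone -f "$TESTFILE" -e -o -a -s 100M -r 4k -i 0 -i 1 -i 2
}

# Two passes, as in the process description.
for pass in 1 2; do
    echo "pass $pass"
    trim_and_iozone
done
```

Running it with `DRY_RUN=0` would execute the real commands, so it should only be used on a machine where trimming the root filesystem and writing the test file are acceptable.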
Testing on Bicool Mini Base
hbarta@io:~$ time -p iozone -f /home/hbarta/iozone.tst -e -o -a -s 100M -r 4k -i 0 -i 1 -i 2
Iozone: Performance Test of File I/O
Version $Revision: 3.489 $
Compiled for 64 bit mode.
Build: linux
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
Vangel Bojaxhi, Ben England, Vikentsi Lapa,
Alexey Skidanov, Sudhir Kumar.
Run began: Tue Oct 3 18:02:42 2023
Include fsync in write timing
SYNC Mode.
Auto Mode
File size set to 102400 kB
Record Size 4 kB
Command line used: iozone -f /home/hbarta/iozone.tst -e -o -a -s 100M -r 4k -i 0 -i 1 -i 2
Output is in kBytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 kBytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
random random bkwd record stride
kB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
102400 4 2781 4202 996001 1008897 986084 4180
iozone test complete.
real 86.36
user 0.41
sys 7.08
hbarta@io:~$ sudo fstrim -v /
[sudo] password for hbarta:
/: 291.3 MiB (305418240 bytes) trimmed
hbarta@io:~$ time -p iozone -f /home/hbarta/iozone.tst -e -o -a -s 100M -r 4k -i 0 -i 1 -i 2
Iozone: Performance Test of File I/O
Version $Revision: 3.489 $
Compiled for 64 bit mode.
Build: linux
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
Vangel Bojaxhi, Ben England, Vikentsi Lapa,
Alexey Skidanov, Sudhir Kumar.
Run began: Tue Oct 3 18:04:48 2023
Include fsync in write timing
SYNC Mode.
Auto Mode
File size set to 102400 kB
Record Size 4 kB
Command line used: iozone -f /home/hbarta/iozone.tst -e -o -a -s 100M -r 4k -i 0 -i 1 -i 2
Output is in kBytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 kBytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
random random bkwd record stride
kB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
102400 4 2824 4204 972903 1030087 992526 4181
iozone test complete.
real 85.77
user 0.45
sys 6.91
hbarta@io:~$
SSD temperature started at 30°C and ranged up to 35°C.
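These notes do not record how the SSD temperature was read. One possible way, assuming the kernel exposes the drive through the hwmon interface (a device under /sys/class/hwmon whose name is "nvme"), is sketched below. The function names are hypothetical, and on a system without such a device the script prints nothing.

```shell
#!/bin/sh
# Convert a hwmon reading in millidegrees Celsius to degrees with one decimal.
millideg_to_c() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Print the temperature for every hwmon device whose name is "nvme".
# Assumption: the NVMe driver registers a hwmon device; prints nothing otherwise.
nvme_temps() {
    for dir in /sys/class/hwmon/hwmon*; do
        [ -r "$dir/name" ] || continue
        [ "$(cat "$dir/name")" = "nvme" ] || continue
        [ -r "$dir/temp1_input" ] || continue
        printf '%s: %s C\n' "$dir" "$(millideg_to_c "$(cat "$dir/temp1_input")")"
    done
}

nvme_temps
```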
hbarta@io:~$ sudo fstrim -v /
/: 217.6 MiB (228143104 bytes) trimmed
hbarta@io:~$ time -p fio --randrepeat=1 --ioengine=libaio --gtod_reduce=1 --directory=/home/hbarta/ --name="io 256GB SSSTC NVME CM4/4GB Mini Base" --filename=test.dat --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
io 256GB SSSTC NVME CM4/4GB Mini Base: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.33
Starting 1 process
io 256GB SSSTC NVME CM4/4GB Mini Base: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [m(1)][100.0%][r=28.2MiB/s,w=9512KiB/s][r=7220,w=2378 IOPS][eta 00m:00s]
io 256GB SSSTC NVME CM4/4GB Mini Base: (groupid=0, jobs=1): err= 0: pid=3879: Tue Oct 3 18:09:33 2023
read: IOPS=6987, BW=27.3MiB/s (28.6MB/s)(3070MiB/112478msec)
bw ( KiB/s): min= 7048, max=35136, per=100.00%, avg=27977.00, stdev=3829.51, samples=224
iops : min= 1762, max= 8784, avg=6994.25, stdev=957.38, samples=224
write: IOPS=2335, BW=9341KiB/s (9565kB/s)(1026MiB/112478msec); 0 zone resets
bw ( KiB/s): min= 2368, max=11656, per=100.00%, avg=9350.50, stdev=1294.02, samples=224
iops : min= 592, max= 2914, avg=2337.63, stdev=323.50, samples=224
cpu : usr=6.07%, sys=31.19%, ctx=786476, majf=0, minf=19
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=785920,262656,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=27.3MiB/s (28.6MB/s), 27.3MiB/s-27.3MiB/s (28.6MB/s-28.6MB/s), io=3070MiB (3219MB), run=112478-112478msec
WRITE: bw=9341KiB/s (9565kB/s), 9341KiB/s-9341KiB/s (9565kB/s-9565kB/s), io=1026MiB (1076MB), run=112478-112478msec
Disk stats (read/write):
nvme0n1: ios=785907/179450, merge=0/27, ticks=76811/91049, in_queue=167872, util=100.00%
real 127.56
user 9.10
sys 47.59
hbarta@io:~$ sudo fstrim -v /
/: 0 B (0 bytes) trimmed
hbarta@io:~$ time -p fio --randrepeat=1 --ioengine=libaio --gtod_reduce=1 --directory=/home/hbarta/ --name="io 256GB SSSTC NVME CM4/4GB Mini Base" --filename=test.dat --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
io 256GB SSSTC NVME CM4/4GB Mini Base: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [m(1)][100.0%][r=25.2MiB/s,w=8752KiB/s][r=6442,w=2188 IOPS][eta 00m:00s]
io 256GB SSSTC NVME CM4/4GB Mini Base: (groupid=0, jobs=1): err= 0: pid=4012: Tue Oct 3 18:12:22 2023
read: IOPS=6171, BW=24.1MiB/s (25.3MB/s)(3070MiB/127349msec)
bw ( KiB/s): min= 6744, max=28472, per=100.00%, avg=25109.00, stdev=3328.02, samples=250
iops : min= 1686, max= 7118, avg=6277.23, stdev=832.00, samples=250
write: IOPS=2062, BW=8250KiB/s (8448kB/s)(1026MiB/127349msec); 0 zone resets
bw ( KiB/s): min= 2232, max=10184, per=100.00%, avg=8391.55, stdev=1142.27, samples=250
iops : min= 558, max= 2546, avg=2097.87, stdev=285.58, samples=250
cpu : usr=5.29%, sys=29.41%, ctx=786636, majf=0, minf=19
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=785920,262656,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=24.1MiB/s (25.3MB/s), 24.1MiB/s-24.1MiB/s (25.3MB/s-25.3MB/s), io=3070MiB (3219MB), run=127349-127349msec
WRITE: bw=8250KiB/s (8448kB/s), 8250KiB/s-8250KiB/s (8448kB/s-8448kB/s), io=1026MiB (1076MB), run=127349-127349msec
Disk stats (read/write):
nvme0n1: ios=784644/184516, merge=0/32, ticks=89281/97942, in_queue=187238, util=98.44%
real 127.78
user 7.14
sys 37.96
hbarta@io:~$
SSD temperature hit a high of 39°C.
Results were considerably better than when running Ubuntu on ZFS (without cooling). Next step will be to move the SSD to the IO Board (which presently has a CM4/8GB installed) and repeat the benchmark.
Repeat testing using CM4/8GB on IO Board
hbarta@io:~$ time -p iozone -f /home/hbarta/iozone.tst -e -o -a -s 100M -r 4k -i 0 -i 1 -i 2
Iozone: Performance Test of File I/O
Version $Revision: 3.489 $
Compiled for 64 bit mode.
Build: linux
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
Vangel Bojaxhi, Ben England, Vikentsi Lapa,
Alexey Skidanov, Sudhir Kumar.
Run began: Tue Oct 3 22:11:09 2023
Include fsync in write timing
SYNC Mode.
Auto Mode
File size set to 102400 kB
Record Size 4 kB
Command line used: iozone -f /home/hbarta/iozone.tst -e -o -a -s 100M -r 4k -i 0 -i 1 -i 2
Output is in kBytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 kBytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
random random bkwd record stride
kB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
102400 4 2707 4187 1107793 1098466 992342 4187
iozone test complete.
real 87.40
user 0.47
sys 7.17
hbarta@io:~$ sudo fstrim /
[sudo] password for hbarta:
Sorry, try again.
[sudo] password for hbarta:
hbarta@io:~$ time -p iozone -f /home/hbarta/iozone.tst -e -o -a -s 100M -r 4k -i 0 -i 1 -i 2
Iozone: Performance Test of File I/O
Version $Revision: 3.489 $
Compiled for 64 bit mode.
Build: linux
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
Randy Dunlap, Mark Montague, Dan Million, Gavin Brebner,
Jean-Marc Zucconi, Jeff Blomberg, Benny Halevy, Dave Boone,
Erik Habbinga, Kris Strecker, Walter Wong, Joshua Root,
Fabrice Bacchella, Zhenghua Xue, Qin Li, Darren Sawyer,
Vangel Bojaxhi, Ben England, Vikentsi Lapa,
Alexey Skidanov, Sudhir Kumar.
Run began: Tue Oct 3 22:14:17 2023
Include fsync in write timing
SYNC Mode.
Auto Mode
File size set to 102400 kB
Record Size 4 kB
Command line used: iozone -f /home/hbarta/iozone.tst -e -o -a -s 100M -r 4k -i 0 -i 1 -i 2
Output is in kBytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 kBytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
random random bkwd record stride
kB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread
102400 4 2771 4184 1084930 1077146 988608 4175
iozone test complete.
real 86.62
user 0.43
sys 7.26
hbarta@io:~$
SSD temperature rose to about 34°C.
hbarta@io:~$ time -p fio --randrepeat=1 --ioengine=libaio --gtod_reduce=1 --directory=/home/hbarta/ --name="io 256GB SSSTC NVME CM4/8GB IO Board" --filename=test.dat --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
io 256GB SSSTC NVME CM4/8GB IO Board: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [m(1)][100.0%][r=25.5MiB/s,w=8900KiB/s][r=6539,w=2225 IOPS][eta 00m:00s]
io 256GB SSSTC NVME CM4/8GB IO Board: (groupid=0, jobs=1): err= 0: pid=1042: Tue Oct 3 22:20:31 2023
read: IOPS=5991, BW=23.4MiB/s (24.5MB/s)(3070MiB/131174msec)
bw ( KiB/s): min= 6872, max=26364, per=100.00%, avg=23992.80, stdev=3362.60, samples=262
iops : min= 1718, max= 6591, avg=5998.15, stdev=840.66, samples=262
write: IOPS=2002, BW=8009KiB/s (8202kB/s)(1026MiB/131174msec); 0 zone resets
bw ( KiB/s): min= 2272, max= 9328, per=100.00%, avg=8019.03, stdev=1150.93, samples=262
iops : min= 568, max= 2332, avg=2004.74, stdev=287.74, samples=262
cpu : usr=5.06%, sys=26.46%, ctx=786433, majf=0, minf=18
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=785920,262656,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=23.4MiB/s (24.5MB/s), 23.4MiB/s-23.4MiB/s (24.5MB/s-24.5MB/s), io=3070MiB (3219MB), run=131174-131174msec
WRITE: bw=8009KiB/s (8202kB/s), 8009KiB/s-8009KiB/s (8202kB/s-8202kB/s), io=1026MiB (1076MB), run=131174-131174msec
Disk stats (read/write):
nvme0n1: ios=785656/234183, merge=0/36, ticks=95721/121855, in_queue=217591, util=100.00%
real 131.68
user 6.99
sys 35.37
hbarta@io:~$ sudo fstrim /
hbarta@io:~$ time -p fio --randrepeat=1 --ioengine=libaio --gtod_reduce=1 --directory=/home/hbarta/ --name="io 256GB SSSTC NVME CM4/8GB IO Board" --filename=test.dat --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75
io 256GB SSSTC NVME CM4/8GB IO Board: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.33
Starting 1 process
Jobs: 1 (f=1): [m(1)][100.0%][r=25.3MiB/s,w=8596KiB/s][r=6483,w=2149 IOPS][eta 00m:00s]
io 256GB SSSTC NVME CM4/8GB IO Board: (groupid=0, jobs=1): err= 0: pid=1215: Tue Oct 3 22:23:52 2023
read: IOPS=6044, BW=23.6MiB/s (24.8MB/s)(3070MiB/130019msec)
bw ( KiB/s): min= 7344, max=26000, per=100.00%, avg=24581.84, stdev=3220.57, samples=255
iops : min= 1836, max= 6500, avg=6145.43, stdev=805.13, samples=255
write: IOPS=2020, BW=8081KiB/s (8274kB/s)(1026MiB/130019msec); 0 zone resets
bw ( KiB/s): min= 2264, max= 9296, per=100.00%, avg=8216.15, stdev=1117.15, samples=255
iops : min= 566, max= 2324, avg=2054.02, stdev=279.29, samples=255
cpu : usr=5.28%, sys=28.31%, ctx=786591, majf=0, minf=18
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwts: total=785920,262656,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=23.6MiB/s (24.8MB/s), 23.6MiB/s-23.6MiB/s (24.8MB/s-24.8MB/s), io=3070MiB (3219MB), run=130019-130019msec
WRITE: bw=8081KiB/s (8274kB/s), 8081KiB/s-8081KiB/s (8274kB/s-8274kB/s), io=1026MiB (1076MB), run=130019-130019msec
Disk stats (read/write):
nvme0n1: ios=785133/182571, merge=0/30, ticks=92316/93259, in_queue=185591, util=98.26%
real 130.42
user 7.28
sys 37.34
hbarta@io:~$
Using elapsed time as a crude approximation of the benchmark result:
CM4 | Base | Benchmark | Elapsed time (s) |
---|---|---|---|
4GB | Mini Base | iozone | 86.36 |
4GB | Mini Base | iozone | 85.77 |
4GB | Mini Base | fio | 127.56 |
4GB | Mini Base | fio | 127.78 |
8GB | IO Board | iozone | 87.40 |
8GB | IO Board | iozone | 86.62 |
8GB | IO Board | fio | 131.68 |
8GB | IO Board | fio | 130.42 |
Interpretation
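The results are close enough to not warrant further investigation. Averaged over the two runs of each benchmark, elapsed times on the two boards differ by about 1% for iozone and about 3% for fio, with the Mini Base marginally faster in both.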
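To put a number on "close enough", the elapsed times from the table can be averaged per board and benchmark and compared. The sketch below does that with awk; the `mini` and `ioboard` labels and the `summarize` function are names chosen here for illustration, and the values are copied from the table.

```shell
#!/bin/sh
# Elapsed times in seconds, copied from the table: board, benchmark, real time.
results='mini iozone 86.36
mini iozone 85.77
mini fio 127.56
mini fio 127.78
ioboard iozone 87.40
ioboard iozone 86.62
ioboard fio 131.68
ioboard fio 130.42'

# Average the runs for each board/benchmark pair and report how much
# longer the IO Board runs took relative to the Mini Base, in percent.
summarize() {
    printf '%s\n' "$results" | awk '
        { sum[$1, $2] += $3; n[$1, $2]++ }
        END {
            bench[1] = "iozone"; bench[2] = "fio"
            for (i = 1; i <= 2; i++) {
                b = bench[i]
                mini = sum["mini", b] / n["mini", b]
                io   = sum["ioboard", b] / n["ioboard", b]
                printf "%s: Mini Base %.3f s, IO Board %.3f s, difference %.1f%%\n",
                       b, mini, io, (io - mini) / mini * 100
            }
        }'
}

summarize
```

With only two runs per configuration, a difference of this size cannot be separated from run-to-run variation, and elapsed time remains a crude proxy for the benchmark result.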