Testing RoCEv2 Performance with Perftest on 25G Ethernet
Summary
The perftest package is a collection of tests written over uverbs, intended for use as a performance micro-benchmark. The tests may be used for tuning as well as for functional testing.
It includes the following tests:
– InfiniBand / RoCE
ib_send_bw
ib_send_lat
ib_write_bw
ib_write_lat
ib_read_bw
ib_read_lat
ib_atomic_bw
ib_atomic_lat
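The runs below all follow one pattern: the test name, -a to sweep message sizes, -d for the RDMA device, -R to connect through rdma_cm, and the server's hostname on the client side. A minimal sketch of that pattern follows. The helper name build_command and its parameters are hypothetical, and it assumes the server side is started with the same command minus the hostname.

```python
import shlex

# Test names as listed above.
TESTS = [
    "ib_send_bw", "ib_write_bw", "ib_read_bw", "ib_atomic_bw",
    "ib_send_lat", "ib_write_lat", "ib_read_lat", "ib_atomic_lat",
]


def build_command(test, device="mlx5_0", server=None, extra_flags=()):
    """Return the argv list for one perftest run (hypothetical helper).

    With server=None the command is the listening (server) side;
    otherwise the hostname is appended, which makes it the client side.
    """
    argv = [test, "-a", *extra_flags, "-d", device, "-R"]
    if server is not None:
        argv.append(server)
    return argv


for test in TESTS:
    print("server:", shlex.join(build_command(test)))
    print("client:", shlex.join(build_command(test, server="cpn57-eth4")))
```

The ib_send_bw run in this document additionally passes -F, which can be supplied here through extra_flags=("-F",).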
Bandwidth
ib_send_bw
$ ib_send_bw -a -F -d mlx5_0 -R cpn57-eth4
Requested SQ size might be too big. Try reducing TX depth and/or inline size.
Current TX depth is 128 and inline size is 0 .
---------------------------------------------------------------------------------------
Send BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
TX depth : 128
CQ Moderation : 100
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : ON
Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x01a2 PSN 0x8cc3b8
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:58
remote address: LID 0000 QPN 0x01a5 PSN 0xda91dc
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:57
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
2 1000 7.13 7.09 3.719685
4 1000 28.30 28.29 7.414812
8 1000 56.75 56.62 7.421375
16 1000 113.81 108.27 7.095815
32 1000 234.75 234.20 7.674423
64 1000 469.50 468.97 7.683683
128 1000 936.33 934.62 7.656447
256 1000 1841.28 1837.05 7.524538
512 1000 2496.89 2494.85 5.109456
1024 1000 2712.67 2694.83 2.759506
2048 1000 2736.60 2736.28 1.400974
4096 1000 2752.32 2751.69 0.704432
8192 1000 2758.80 2758.75 0.353121
16384 1000 2762.59 2762.44 0.176796
32768 1000 2764.22 2764.15 0.088453
65536 1000 2765.03 2765.01 0.044240
131072 1000 2765.42 2765.41 0.022123
262144 1000 2765.60 2765.59 0.011062
524288 1000 2765.75 2765.75 0.005531
1048576 1000 2765.80 2765.80 0.002766
2097152 1000 2765.83 2765.83 0.001383
4194304 1000 2765.84 2765.84 0.000691
8388608 1000 2765.85 2765.85 0.000346
---------------------------------------------------------------------------------------
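To compare the plateau against the 25 Gbit/s link rate, the MB/sec column has to be converted to bits. The MsgRate column suggests that 1 MB here means 2^20 bytes: in the 2-byte row, 3.719685 Mpps times 2 bytes is about 7.44e6 bytes/s, which equals 7.09 times 2^20. A minimal sketch of the conversion under that assumption (the function name is illustrative):

```python
MIB = 1 << 20          # bytes per "MB" as used in the table (assumption, see above)
LINK_GBPS = 25.0       # nominal link rate of the 25G Ethernet port


def mib_per_s_to_gbps(mib_per_s):
    """Convert a bandwidth in MiB/s to Gbit/s (10^9 bits per second)."""
    return mib_per_s * MIB * 8 / 1e9


plateau_mib_s = 2765.85  # BW average at 8388608 bytes in the table above
gbps = mib_per_s_to_gbps(plateau_mib_s)
print(f"{plateau_mib_s} MB/sec = {gbps:.2f} Gbit/s "
      f"({gbps / LINK_GBPS:.1%} of {LINK_GBPS:.0f} Gbit/s)")
```

The result is payload throughput only. Ethernet, IP, UDP and InfiniBand transport headers on every 1024-byte MTU packet take up part of the remaining link capacity.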
ib_write_bw
$ ib_write_bw -a -d mlx5_0 -R cpn57-eth4
Requested SQ size might be too big. Try reducing TX depth and/or inline size.
Current TX depth is 128 and inline size is 0 .
---------------------------------------------------------------------------------------
RDMA_Write BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
TX depth : 128
CQ Moderation : 100
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 0[B]
rdma_cm QPs : ON
Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x01a6 PSN 0xb1ad95
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:58
remote address: LID 0000 QPN 0x01a9 PSN 0x85d0f3
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:57
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
2 5000 9.26 9.23 4.839865
4 5000 28.37 28.12 7.372481
8 5000 56.90 56.87 7.454676
16 5000 113.50 112.68 7.384878
32 5000 228.25 227.74 7.462513
64 5000 455.23 450.95 7.388370
128 5000 898.06 874.00 7.159775
256 5000 1805.97 1787.43 7.321303
512 5000 2496.89 2495.04 5.109850
1024 5000 2723.88 2721.20 2.786512
2048 5000 2743.72 2743.04 1.404436
4096 5000 2755.19 2754.61 0.705179
8192 5000 2760.24 2760.23 0.353309
16384 5000 2762.77 2762.71 0.176813
32768 5000 2764.49 2764.42 0.088461
65536 5000 2765.17 2765.15 0.044242
131072 5000 2765.51 2765.50 0.022124
262144 5000 2765.68 2765.68 0.011063
524288 5000 2765.77 2765.77 0.005532
1048576 5000 2765.82 2765.81 0.002766
2097152 5000 2765.84 2765.84 0.001383
4194304 5000 2765.85 2765.85 0.000691
8388608 5000 2765.85 2765.85 0.000346
---------------------------------------------------------------------------------------
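The MsgRate column can be recomputed from the message size and the average bandwidth, which is a quick sanity check on a copied table. The sketch below parses a few rows of the ib_write_bw output above and compares the reported rate with the implied one. It assumes 1 MB = 2^20 bytes, and the helper names are illustrative.

```python
# A few rows copied from the ib_write_bw table above:
# bytes, iterations, BW peak[MB/sec], BW average[MB/sec], MsgRate[Mpps]
ROWS = """\
2 5000 9.26 9.23 4.839865
1024 5000 2723.88 2721.20 2.786512
65536 5000 2765.17 2765.15 0.044242
8388608 5000 2765.85 2765.85 0.000346
"""


def parse_row(line):
    """Split one table row into typed fields."""
    size, iters, peak, avg, rate = line.split()
    return int(size), int(iters), float(peak), float(avg), float(rate)


def implied_mpps(size, avg_mib_s):
    """Message rate in Mpps implied by the average bandwidth in MiB/s."""
    return avg_mib_s * (1 << 20) / size / 1e6


for line in ROWS.splitlines():
    size, _, _, avg, rate = parse_row(line)
    print(f"{size:>8} B  reported {rate:.6f} Mpps  "
          f"implied {implied_mpps(size, avg):.6f} Mpps")
```

The two values differ only by the rounding of the printed columns.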
ib_read_bw
$ ib_read_bw -a -d mlx5_0 -R cpn57-eth4
Requested SQ size might be too big. Try reducing TX depth and/or inline size.
Current TX depth is 128 and inline size is 0 .
---------------------------------------------------------------------------------------
RDMA_Read BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
TX depth : 128
CQ Moderation : 100
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Outstand reads : 16
rdma_cm QPs : ON
Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x01a8 PSN 0xda8e33
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:58
remote address: LID 0000 QPN 0x01ab PSN 0x3975f8
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:57
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
2 1000 8.82 8.63 4.526966
4 1000 19.85 19.17 5.025238
8 1000 39.84 39.79 5.214932
16 1000 80.31 80.14 5.251999
32 1000 157.55 157.51 5.161280
64 1000 312.11 311.99 5.111661
128 1000 610.35 609.87 4.996022
256 1000 1168.76 1168.64 4.786759
512 1000 2116.15 2113.60 4.328657
1024 1000 2726.70 2708.98 2.773999
2048 1000 2742.30 2742.29 1.404052
4096 1000 2754.47 2753.80 0.704972
8192 1000 2759.88 2759.76 0.353249
16384 1000 2762.95 2762.83 0.176821
32768 1000 2764.40 2764.31 0.088458
65536 1000 2765.12 2765.12 0.044242
131072 1000 2765.51 2765.50 0.022124
262144 1000 2765.62 2765.62 0.011062
524288 1000 2765.79 2765.78 0.005532
1048576 1000 2765.84 2765.84 0.002766
2097152 1000 2765.86 2765.86 0.001383
4194304 1000 2765.87 2765.87 0.000691
8388608 1000 2765.88 2765.88 0.000346
---------------------------------------------------------------------------------------
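The table shows two regimes: for small messages the message rate stays near 5 Mpps and bandwidth grows with size, while for large messages bandwidth levels off near 2765 MB/sec. A rough way to describe this is an idealised upper bound, bandwidth = min(size x peak message rate, plateau bandwidth). The sketch below applies that simplification, which is not something perftest itself reports, assuming 1 MB = 2^20 bytes.

```python
MIB = 1 << 20


def crossover_bytes(peak_mpps, plateau_mib_s):
    """Message size at which the rate-bound and bandwidth-bound limits meet."""
    return plateau_mib_s * MIB / (peak_mpps * 1e6)


def bound_mib_s(size, peak_mpps, plateau_mib_s):
    """Idealised upper bound on bandwidth (MiB/s) for a given message size."""
    return min(size * peak_mpps * 1e6 / MIB, plateau_mib_s)


# Values taken from the ib_read_bw table above.
PEAK_MPPS = 5.251999      # highest MsgRate (16-byte row)
PLATEAU_MIB_S = 2765.88   # BW average at 8388608 bytes

print(f"crossover at about {crossover_bytes(PEAK_MPPS, PLATEAU_MIB_S):.0f} bytes")
for size in (64, 256, 512, 4096):
    print(f"{size:>5} B  bound {bound_mib_s(size, PEAK_MPPS, PLATEAU_MIB_S):8.2f} MB/sec")
```

Around the crossover the measured values (for example 2113.60 MB/sec at 512 bytes) stay below this bound, so the model only marks where the transition happens, not its exact shape.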
ib_atomic_bw
$ ib_atomic_bw -a -d mlx5_0 -R cpn57-eth4
---------------------------------------------------------------------------------------
Atomic FETCH_AND_ADD BW Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
TX depth : 128
CQ Moderation : 100
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Outstand reads : 16
rdma_cm QPs : ON
Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x01aa PSN 0x58ce
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:58
remote address: LID 0000 QPN 0x01ad PSN 0xbd22ee
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:57
---------------------------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
8 1000 16.07 15.92 2.086695
---------------------------------------------------------------------------------------
Latency
ib_send_lat
$ ib_send_lat -a -d mlx5_0 -R cpn57-eth4
Requested SQ size might be too big. Try reducing TX depth and/or inline size.
Current TX depth is 1 and inline size is 236 .
---------------------------------------------------------------------------------------
Send Latency Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
TX depth : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 236[B]
rdma_cm QPs : ON
Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x01af PSN 0xf1c1aa
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:58
remote address: LID 0000 QPN 0x01b2 PSN 0x211ffb
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:57
---------------------------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec] t_avg[usec] t_stdev[usec] 99% percentile[usec] 99.9% percentile[usec]
2 1000 1.48 3.36 1.53 1.53 0.03 1.59 3.36
4 1000 1.47 3.01 1.51 1.51 0.05 1.57 3.01
8 1000 1.47 2.51 1.51 1.51 0.03 1.55 2.51
16 1000 1.48 2.80 1.51 1.51 0.05 1.55 2.80
32 1000 1.49 2.90 1.52 1.52 0.05 1.57 2.90
64 1000 1.52 2.46 1.56 1.56 0.04 1.59 2.46
128 1000 1.55 2.43 1.59 1.59 0.04 1.62 2.43
256 1000 2.02 3.21 2.07 2.07 0.03 2.18 3.21
512 1000 2.17 3.10 2.22 2.23 0.03 2.32 3.10
1024 1000 2.49 3.67 2.55 2.60 0.09 2.79 3.67
2048 1000 2.91 4.50 2.99 3.01 0.04 3.15 4.50
4096 1000 3.77 4.82 3.90 3.90 0.03 3.94 4.82
8192 1000 5.15 6.26 5.24 5.27 0.08 5.48 6.26
16384 1000 7.95 8.81 8.08 8.15 0.14 8.56 8.81
32768 1000 13.67 14.30 13.78 13.84 0.15 14.25 14.30
65536 1000 24.91 25.66 25.07 25.13 0.14 25.51 25.66
131072 1000 47.50 49.10 47.71 47.75 0.13 48.11 49.10
262144 1000 92.71 93.46 92.94 92.97 0.13 93.29 93.46
524288 1000 183.12 183.79 183.28 183.30 0.08 183.53 183.79
1048576 1000 363.90 364.60 364.09 364.13 0.13 364.48 364.60
2097152 1000 725.45 726.17 725.63 725.67 0.12 726.05 726.17
4194304 1000 1448.64 1451.62 1448.77 1448.82 0.14 1449.28 1451.62
8388608 1000 2894.73 2896.16 2894.92 2894.97 0.14 2895.43 2896.16
---------------------------------------------------------------------------------------
ib_write_lat
$ ib_write_lat -a -d mlx5_0 -R cpn57-eth4
Requested SQ size might be too big. Try reducing TX depth and/or inline size.
Current TX depth is 1 and inline size is 220 .
---------------------------------------------------------------------------------------
RDMA_Write Latency Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
TX depth : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Max inline data : 220[B]
rdma_cm QPs : ON
Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x01b1 PSN 0x9f4265
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:58
remote address: LID 0000 QPN 0x01b4 PSN 0x941b8f
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:57
---------------------------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec] t_avg[usec] t_stdev[usec] 99% percentile[usec] 99.9% percentile[usec]
2 1000 1.44 2.63 1.46 1.46 0.04 1.49 2.63
4 1000 1.43 2.38 1.45 1.45 0.04 1.49 2.38
8 1000 1.43 2.47 1.45 1.45 0.03 1.49 2.47
16 1000 1.44 2.63 1.46 1.46 0.02 1.47 2.63
32 1000 1.47 2.21 1.49 1.49 0.04 1.51 2.21
64 1000 1.47 2.14 1.49 1.50 0.03 1.52 2.14
128 1000 1.53 2.71 1.56 1.56 0.03 1.58 2.71
256 1000 1.96 2.46 1.98 1.98 0.01 2.04 2.46
512 1000 2.19 2.67 2.22 2.23 0.03 2.31 2.67
1024 1000 2.52 3.33 2.59 2.60 0.04 2.68 3.33
2048 1000 2.95 3.75 3.01 3.03 0.05 3.17 3.75
4096 1000 3.74 4.35 3.90 3.90 0.02 3.94 4.35
8192 1000 5.17 6.14 5.28 5.30 0.07 5.45 6.14
16384 1000 8.03 8.62 8.14 8.17 0.10 8.44 8.62
32768 1000 13.66 19.92 13.80 13.83 0.11 14.16 19.92
65536 1000 24.95 25.67 25.05 25.09 0.09 25.40 25.67
131072 1000 47.54 59.08 47.71 47.73 0.10 47.95 59.08
262144 1000 92.73 93.65 92.89 92.92 0.10 93.30 93.65
524288 1000 183.09 183.53 183.28 183.30 0.09 183.49 183.53
1048576 1000 363.91 364.46 364.06 364.09 0.10 364.39 364.46
2097152 1000 725.47 731.55 725.62 725.65 0.12 725.96 731.55
4194304 1000 1448.63 1449.20 1448.77 1448.80 0.10 1449.07 1449.20
8388608 1000 2894.77 2897.00 2894.93 2894.97 0.12 2895.30 2897.00
---------------------------------------------------------------------------------------
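For large messages the latency grows almost linearly with size, so it can be modelled as t = t0 + size / B, where B is the effective bandwidth. Fitting that line to the largest rows of the table above gives a bandwidth estimate that can be compared with the bandwidth tests. The sketch below uses an ordinary least-squares fit on the t_avg column. The linear model is an assumption for illustration, and 1 MB is again taken as 2^20 bytes.

```python
# (message size in bytes, t_avg in microseconds) from the ib_write_lat table above
POINTS = [
    (1048576, 364.09),
    (2097152, 725.65),
    (4194304, 1448.80),
    (8388608, 2894.97),
]


def fit_line(points):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept_us, slope_us_per_byte = fit_line(POINTS)
bytes_per_s = 1e6 / slope_us_per_byte          # slope is in usec per byte
print(f"effective bandwidth: {bytes_per_s / (1 << 20):.1f} MB/sec "
      f"({bytes_per_s * 8 / 1e9:.2f} Gbit/s)")
print(f"fixed cost per message (intercept): {intercept_us:.2f} usec")
```

The fitted bandwidth lands close to the plateau reported by ib_write_bw, which indicates that large-message latency is dominated by serialization time on the wire.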
ib_read_lat
$ ib_read_lat -a -d mlx5_0 -R cpn57-eth4
Requested SQ size might be too big. Try reducing TX depth and/or inline size.
Current TX depth is 1 and inline size is 0 .
---------------------------------------------------------------------------------------
RDMA_Read Latency Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
TX depth : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Outstand reads : 16
rdma_cm QPs : ON
Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x01b3 PSN 0x559f79
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:58
remote address: LID 0000 QPN 0x01b6 PSN 0xabbde3
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:57
---------------------------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec] t_avg[usec] t_stdev[usec] 99% percentile[usec] 99.9% percentile[usec]
2 1000 2.66 4.60 2.70 2.72 0.08 2.91 4.60
4 1000 2.68 4.70 2.73 2.73 0.02 2.79 4.70
8 1000 2.68 4.27 2.73 2.73 0.02 2.78 4.27
16 1000 2.68 4.05 2.72 2.73 0.04 2.88 4.05
32 1000 2.67 6.25 2.74 2.74 0.04 2.87 6.25
64 1000 2.70 3.78 2.75 2.76 0.05 2.91 3.78
128 1000 2.80 4.54 2.85 2.87 0.06 3.15 4.54
256 1000 2.89 4.72 2.95 2.97 0.07 3.29 4.72
512 1000 3.07 4.02 3.12 3.13 0.04 3.45 4.02
1024 1000 3.39 5.71 3.44 3.47 0.09 3.77 5.71
2048 1000 3.83 4.49 3.93 3.96 0.07 4.24 4.49
4096 1000 4.68 5.13 4.84 4.84 0.05 5.02 5.13
8192 1000 6.08 6.69 6.25 6.28 0.11 6.59 6.69
16384 1000 8.89 9.69 9.08 9.13 0.14 9.56 9.69
32768 1000 14.56 15.38 14.76 14.80 0.14 15.21 15.38
65536 1000 25.83 26.57 25.99 26.04 0.13 26.52 26.57
131072 1000 48.33 49.28 48.61 48.68 0.18 49.14 49.28
262144 1000 93.57 94.52 93.76 93.84 0.16 94.32 94.52
524288 1000 183.96 184.85 184.17 184.24 0.17 184.70 184.85
1048576 1000 364.72 365.50 364.96 365.03 0.16 365.46 365.50
2097152 1000 726.30 727.26 726.54 726.60 0.14 727.03 727.26
4194304 1000 1449.43 1450.97 1449.64 1449.70 0.14 1450.15 1450.97
8388608 1000 2895.62 2896.74 2895.84 2895.91 0.16 2896.40 2896.74
---------------------------------------------------------------------------------------
ib_atomic_lat
$ ib_atomic_lat -a -d mlx5_0 -R cpn57-eth4
---------------------------------------------------------------------------------------
Atomic FETCH_AND_ADD Latency Test
Dual-port : OFF Device : mlx5_0
Number of qps : 1 Transport type : IB
Connection type : RC Using SRQ : OFF
TX depth : 1
Mtu : 1024[B]
Link type : Ethernet
GID index : 3
Outstand reads : 16
rdma_cm QPs : ON
Data ex. method : rdma_cm
---------------------------------------------------------------------------------------
local address: LID 0000 QPN 0x01b5 PSN 0xf3ebd6
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:58
remote address: LID 0000 QPN 0x01b8 PSN 0xdd4456
GID: 00:00:00:00:00:00:00:00:00:00:255:255:10:25:01:57
---------------------------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec] t_avg[usec] t_stdev[usec] 99% percentile[usec] 99.9% percentile[usec]
8 1000 2.69 5.85 2.74 2.75 0.05 2.94 5.85
---------------------------------------------------------------------------------------