This is a disclaimer: Using the notes below is dangerous for both your sanity and peace of mind. If you still want to read them, beware of the fact that they may be "not even wrong". Everything I write in there is just a mnemonic device to give me a chance to fix things I badly broke because I'm bloody stupid and think I can tinker with stuff that is way above my head and get away with it. It reminds me of Gandalf's warning: "Perilous to all of us are the devices of an art deeper than we ourselves possess." Moreover, a lot of it I blatantly stole on the net from other obviously cleverer persons than me -- not very hard. Forgive me. My bad. Please consider it and go away. You have been warned!
NetPipe Network Benchmarks
Introduction
A few years ago I performed tests using nttcp and NetPipe to get a baseline on the network performance of a few hosts networked through Cisco Catalyst switches. The goal was to have a reference point against which to test the performance of jumbo frames.
Here is what I was writing in Sept. 2007:
You’ll find below some network benchmarks done with ttcp and NetPipe. For more information on NetPipe have a look at the design site http://www.scl.ameslab.gov/netpipe . Even though NetPipe is typically used to benchmark message-passing APIs in a clustered environment, it can give very useful information on raw TCP performance. My goal using ttcp and NetPipe is just to establish a baseline in order to compare and fine-tune system parameters when the new switch is installed and jumbo frames are enabled.
The benchmarks involved 2 systems (Debian/Etch 64bit, 8-core Xeons, 12GB RAM) with their dual onboard Intel 82546GB Gigabit Ethernet controllers directly connected in a point-to-point configuration. See setup file for details on the hardware and how it was configured. Note: I’ve added benchmarks between one of the Debian/Etch systems above and an Origin3000 (SGI) with 28 CPUs and 28MB RAM running Irix-6.5.19f. The SGI server (yorick) is connected to a Cisco 3500XL switch with a FC Gigabit Ethernet, as are the Debian/Etch boxes (they use copper however).
I baselined the throughput between the 2 p-to-p connected NICs using nttcp and different buffer lengths. The raw TCP performance reported by ttcp was always near saturation for a 1Gbps connection no matter what I used for buffer lengths, from 4KB to 512KB. See ttcp.txt for the gory details. In the case of the ‘switched’ NICs, no matter the buffer length used, nttcp always reported transfer rates of the order of 500Mbps, a disappointing ~50% of the maximum achievable throughput.
Using NetPipe with the onboard NICs, directly connected or switched, provides interesting numbers. Without repeating the discussion and analysis that you can find on the NetPipe site (see above), I’ve plotted the raw TCP throughput vs the data block size transferred through the sockets (important note: all units are expressed in bits), along with what’s called the ‘signature graph’ and the ‘saturation graph’.
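To put "near saturation" into numbers, here is a rough sketch of the best TCP payload rate one can expect on an Ethernet link once framing and header overhead are counted. It assumes plain IPv4 and TCP headers (20 bytes each), no VLAN tag, and the usual 38 bytes of per-frame Ethernet overhead; the 9000-byte jumbo MTU is my assumption, the notes do not give the value. The function name is made up for the illustration.

```python
# Per-frame Ethernet overhead in bytes:
# preamble + SFD (8) + MAC header (14) + FCS (4) + inter-frame gap (12).
ETH_OVERHEAD = 8 + 14 + 4 + 12

# IPv4 header + TCP header, both without options (assumption).
IP_TCP_HEADERS = 20 + 20


def tcp_goodput_fraction(mtu, tcp_options=0):
    """Fraction of the line rate left for TCP payload at a given MTU."""
    payload = mtu - IP_TCP_HEADERS - tcp_options
    wire_bytes = mtu + ETH_OVERHEAD
    return payload / wire_bytes


if __name__ == "__main__":
    for mtu in (1500, 9000):  # 9000 is an assumed jumbo frame MTU
        frac = tcp_goodput_fraction(mtu)
        print(f"MTU {mtu}: {frac:.4f} of line rate, "
              f"about {frac * 1000:.0f} Mbit/s on a 1 Gbit/s link")
```

With a 1500-byte MTU this gives roughly 949 Mbit/s on a 1 Gbps link, so a measured 500 Mbps is closer to 53% of what TCP can actually deliver. Passing `tcp_options=12` models TCP timestamps.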
A few points about NetPipe. The standard way of analysing NetPipe output is to plot 3 graphs:
- The standard throughput graph, i.e., the throughput versus the message block size, usually plotted using a log scale for the block size.
Here is an example:

The maximum throughput is easily read from this graph. However, it is difficult to use it to analyse network latency, a very important network metric.
- The saturation graph, the block size values versus the transfer/elapsed time.

Plotted using a log-log scale, this graph allows us to determine the saturation point, the block size threshold after which an increase of the block size just results in a close-to-linear increase in transfer time. The region from the saturation point until the end of the graph is the saturation interval, the region where throughput cannot be improved by increasing the block size.
- The signature graph, the transfer speed versus the elapsed time.

This graph represents a network acceleration graph. When plotted using a logarithmic scale for the elapsed time, one can easily see that the network latency coincides with the time value where the graph takes off.
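The three graphs are just three views of the same (block size, elapsed time) measurements. Here is a minimal sketch of how the series could be derived, assuming the samples are already available as pairs of block size in bits and elapsed time in seconds. All function names are made up for the illustration, and the saturation point test (log-log slope of at least 0.9) is a heuristic of mine, not something NetPipe defines.

```python
import math


def netpipe_series(samples):
    """samples: list of (block_size_bits, elapsed_seconds) pairs.

    Returns the (x, y) points of the throughput, saturation and
    signature graphs described above.
    """
    throughput = [(b, b / t) for b, t in samples]  # throughput vs block size
    saturation = [(t, b) for b, t in samples]      # block size vs elapsed time
    signature = [(t, b / t) for b, t in samples]   # throughput vs elapsed time
    return throughput, saturation, signature


def estimate_latency(samples):
    """Elapsed time of the smallest block, a rough proxy for latency."""
    return min(samples)[1]


def saturation_point(samples, slope_threshold=0.9):
    """First block size where time grows almost linearly with block size.

    Heuristic (assumption): the slope of log(time) versus log(size)
    between consecutive samples reaches slope_threshold.
    """
    ordered = sorted(samples)
    for (b0, t0), (b1, t1) in zip(ordered, ordered[1:]):
        slope = (math.log(t1) - math.log(t0)) / (math.log(b1) - math.log(b0))
        if slope >= slope_threshold:
            return b1
    return None


def synthetic_samples(latency=50e-6, bandwidth=1e9, steps=21):
    """Made-up data following time = latency + size / bandwidth."""
    return [(8 * 2 ** k, latency + (8 * 2 ** k) / bandwidth)
            for k in range(steps)]
```

Feeding the synthetic samples through `netpipe_series` and plotting each list with a log scale on the relevant axes gives the familiar shapes: the signature graph takes off at about the latency value, and the saturation graph straightens out past the saturation point.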
Hardware
Use 2 hosts (edgar and wart) networked on the actual 1 GbE switched network (Cisco Catalyst 5300XL).
Physical device description:
- — edgar —
This box has 2 onboard Intel GbE interfaces and one dual 10 GbE Intel card (disabled at the moment).
edgar:~# lshw -short -class network
H/W path        Device  Class    Description
========================================================
/0/100/1/0      eth2    network  82598EB 10-Gigabit AF Dual Port Net
/0/100/1/0.1    eth3    network  82598EB 10-Gigabit AF Dual Port Net
/0/100/1c.4/0   eth1    network  82574L Gigabit Network Connection
/0/100/1c.5/0   eth0    network  82574L Gigabit Network Connection
edgar:~# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 00:25:90:39:e6:73
          inet addr:132.206.178.150  Bcast:132.206.178.255  Mask:255.255.255.0
          inet6 addr: fe80::225:90ff:fe39:e673/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:15873265826 errors:0 dropped:179714 overruns:0 frame:0
          TX packets:8092049789 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:23963781369736 (21.7 TiB)  TX bytes:589736966818 (549.2 GiB)
          Interrupt:17 Memory:fabe0000-fac00000
edgar:~# netstat -ina
Kernel Interface table
Iface     MTU Met        RX-OK RX-ERR RX-DRP RX-OVR        TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0     1500   0  15873267284      0 179726      0   8092050170      0      0      0 BMRU
eth1     1500   0     25033235      1 179750      0         4492      0      0      0 BMRU
eth1:0   1500   0      - no statistics available -                                    BMRU
eth2     1500   0            0      0      0      0            0      0      0      0 BM
eth3     1500   0            0      0      0      0            0      0      0      0 BM
lo      16436   0   2041108511      0      0      0   2041108511      0      0      0 LRU
- — wart —
wart:~# lshw -short -class network
H/W path        Device    Class    Description
===========================================================
/0/100/1/0      eth2      network  Intel Corporation
/0/100/1/0.1    eth1      network  Intel Corporation
/0/100/9/0/2    eth0      network  AceNIC Gigabit Ethernet
/1              vboxnet0  network  Ethernet interface
wart:~# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 08:00:69:13:6e:ec
          inet addr:132.206.178.8  Bcast:132.206.178.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:69ff:fe13:6eec/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:198706055033 errors:0 dropped:0 overruns:0 frame:0
          TX packets:101432453119 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:298319406723513 (271.3 TiB)  TX bytes:10555129095400 (9.5 TiB)
          Interrupt:28 Base address:0xc000
wart:~# netstat -ina
Kernel Interface table
Iface       MTU Met         RX-OK RX-ERR RX-DRP RX-OVR         TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0       1500   0  198706056180      0      0      0  101432453387      0      0      0 BMRU
eth0:0     1500   0      - no statistics available -                                      BMRU
eth1       1500   0          8729      0      0      0           227      0      0      0 BMU
eth2       1500   0             0      0      0      0             0      0      0      0 BM
lo        16436   0   24631334609      0      0      0   24631334609      0      0      0 LRU
vboxnet0   1500   0             0      0      0      0             0      0      0      0 BM
The H/W path /1 is a virtual device used by VirtualBox.
Cisco Nexus 5010 10GbE switch
The setup is a Cisco Nexus 5010 10GbE switch along with 2 hosts, each with an Intel 10GbE AF DA Dual Port adapter (8x PCI-E), connected with Twinax copper cables.
Hardware
node17:~# lshw -short -class network
H/W path        Device  Class    Description
========================================================
/0/100/4/0      eth1    network  82599EB 10-Gigabit SFI/SFP+ Network
/0/100/4/0.1    eth2    network  82599EB 10-Gigabit SFI/SFP+ Network
/0/100/1c/0     eth0    network  82576 Gigabit Network Connection
/0/100/1c/0.1   eth3    network  82576 Gigabit Network Connection
/1              bond0   network  Ethernet interface
NetPipe Benchmarks
qlen=1000 (as for a typical 1GbE interface configuration)
- Throughput graph

- Saturation graph

- Signature graph

NTTCP
nttcp runs with different buffer lengths (the -l option): 4k, 16k, 64k, 256k and 512k. The default is 4k, used in the 1st run.
edgar:~# nttcp -v -T -t 10.0.0.229
nttcp-l: nttcp, version 1.47
nttcp-l: Pid=24881
nttcp-l: from 10.0.0.229: "10.0.0.229" (=nttcp-1: Pid=20663, InetPeer= )
nttcp-l: from 10.0.0.229: "10.0.0.229" (=nttcp-1: Optionline="nttcp@-r@)
nttcp-l: from 10.0.0.229: "10.0.0.229" (=dataport: 5038)
nttcp-l: send window size = 16384
nttcp-l: receive window size = 87380
nttcp-l: buflen=4096, bufcnt=2048, dataport=5038/tcp
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: accept from 10.0.0.150
nttcp-1: send window size = 16384
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: receive window size = 87380
nttcp-1: buflen=4096, bufcnt=2048, dataport=5038/tcp
nttcp-l: transmitted 8388608 bytes
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls   Real-C/s   CPU-C/s
l  8388608    0.01    0.00    5226.5470   20134.6727   2048  159501.56  614461.4
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: received 8388608 bytes
nttcp-l: try to get outstanding messages from 1 remote clients
1  8388608    0.01    0.01    4661.3089    6711.5576   2095  145516.43  209521.0
nttcp-1: exiting
edgar:~# nttcp -D -v -t -T -l 16384 10.0.0.229
nttcp -D -v -t -T -l 16384 10.0.0.229
nttcp-l: nttcp, version 1.47
nttcp-l: Pid=25103
nttcp-l: from 10.0.0.229: "10.0.0.229" (=nttcp-1: Pid=20945, InetPeer= )
nttcp-l: from 10.0.0.229: "10.0.0.229" (=nttcp-1: Optionline="nttcp@-r@)
nttcp-l: from 10.0.0.229: "10.0.0.229" (=dataport: 5038)
nttcp-l: send window size = 16384
nttcp-l: receive window size = 87380
nttcp-l: buflen=16384, bufcnt=2048, dataport=5038/tcp
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: accept from 10.0.0.150
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: send window size = 16384
nttcp-1: receive window size = 87380
nttcp-1: buflen=16384, bufcnt=2048, dataport=5038/tcp
nttcp-l: transmitted 33554432 bytes
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls   Real-C/s   CPU-C/s
l 33554432    0.05    0.01    5302.5335   26846.2302   2048   40455.12  204820.5
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: received 33554432 bytes
nttcp-l: try to get outstanding messages from 1 remote clients
1 33554432    0.05    0.03    5131.8241    8053.6274   2257   43148.28   67714.7
nttcp-1: exiting
edgar:~# nttcp -D -v -t -T -l 65536 10.0.0.229
nttcp-l: nttcp, version 1.47
nttcp-l: Pid=25307
nttcp-l: from 10.0.0.229: "10.0.0.229" (=nttcp-1: Pid=20953, InetPeer= )
nttcp-l: from 10.0.0.229: "10.0.0.229" (=nttcp-1: Optionline="nttcp@-r@)
nttcp-l: from 10.0.0.229: "10.0.0.229" (=dataport: 5038)
nttcp-l: send window size = 16384
nttcp-l: receive window size = 87380
nttcp-l: buflen=65536, bufcnt=2048, dataport=5038/tcp
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: accept from 10.0.0.150
nttcp-1: send window size = 16384
nttcp-1: receive window size = 87380
nttcp-1: buflen=65536, bufcnt=2048, dataport=5038/tcp
nttcp-l: transmitted 134217728 bytes
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls   Real-C/s   CPU-C/s
l134217728    0.19    0.03    5706.0205   35793.7804   2048   10883.37   68271.2
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: received 134217728 bytes
nttcp-l: try to get outstanding messages from 1 remote clients
1134217728    0.19    0.13    5660.9261    8053.6274   2975   15684.64   22314.1
nttcp-1: exiting
edgar:~# nttcp -D -v -t -T -l 262144 10.0.0.229
nttcp-l: nttcp, version 1.47
nttcp-l: Pid=25465
nttcp-l: from 10.0.0.229: "10.0.0.229" (=nttcp-1: Pid=20958, InetPeer= )
nttcp-l: from 10.0.0.229: "10.0.0.229" (=nttcp-1: Optionline="nttcp@-r@)
nttcp-l: from 10.0.0.229: "10.0.0.229" (=dataport: 5038)
nttcp-l: send window size = 16384
nttcp-l: receive window size = 87380
nttcp-l: buflen=262144, bufcnt=2048, dataport=5038/tcp
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: accept from 10.0.0.150
nttcp-1: send window size = 16384
nttcp-1: receive window size = 87380
nttcp-1: buflen=262144, bufcnt=2048, dataport=5038/tcp
nttcp-l: transmitted 536870912 bytes
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls   Real-C/s   CPU-C/s
l536870912    0.75    0.14    5744.4846   29967.0485   2048    2739.18   14289.4
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: received 536870912 bytes
nttcp-l: try to get outstanding messages from 1 remote clients
1536870912    0.75    0.54    5731.1621    8003.5766   5961    7954.30   11108.2
nttcp-1: exiting
edgar:~# nttcp -D -v -t -T -l 524288 10.0.0.229
nttcp-l: nttcp, version 1.47
nttcp-l: Pid=25655
nttcp-l: from 10.0.0.229: "10.0.0.229" (=nttcp-1: Pid=20961, InetPeer= )
nttcp-l: from 10.0.0.229: "10.0.0.229" (=nttcp-1: Optionline="nttcp@-r@)
nttcp-l: from 10.0.0.229: "10.0.0.229" (=dataport: 5038)
nttcp-l: send window size = 16384
nttcp-l: receive window size = 87380
nttcp-l: buflen=524288, bufcnt=2048, dataport=5038/tcp
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: accept from 10.0.0.150
nttcp-1: send window size = 16384
nttcp-1: receive window size = 87380
nttcp-1: buflen=524288, bufcnt=2048, dataport=5038/tcp
nttcp-l: transmitted 1073741824 bytes
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls   Real-C/s   CPU-C/s
l1073741824    1.49    0.29    5747.0942   29966.9440   2048    1370.21    7144.7
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: received 1073741824 bytes
nttcp-l: try to get outstanding messages from 1 remote clients
11073741824    1.50    1.13    5740.0088    7579.8512  11913    7960.56   10512.2
nttcp-1: exiting
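Pulling the numbers out of these verbose logs by hand gets old quickly. Here is a small sketch that extracts the summary rows, assuming they always look like the ones above: a one-character tag (`l` for what I take to be the local side, `1` for the first remote host) followed by eight numeric columns. The tag can be glued to the byte count when the count fills its field, as in `l1073741824`. The function names and the 10000 Mbit/s nominal line rate are my own choices, not part of nttcp.

```python
import re

# One summary row: tag, Bytes, Real s, CPU s, Real-MBit/s, CPU-MBit/s,
# Calls, Real-C/s, CPU-C/s. The tag may be fused with the byte count.
SUMMARY_ROW = re.compile(
    r"^([l1])\s*(\d+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)"
    r"\s+(\d+)\s+([\d.]+)\s+([\d.]+)$"
)


def parse_summary(line):
    """Return a dict for a summary row, or None for any other log line."""
    m = SUMMARY_ROW.match(line.strip())
    if m is None:
        return None
    tag, nbytes, real_s, cpu_s, real_mbit, cpu_mbit, calls, _, _ = m.groups()
    return {
        "side": "local" if tag == "l" else "remote",  # assumed meaning of tag
        "bytes": int(nbytes),
        "real_s": float(real_s),
        "cpu_s": float(cpu_s),
        "real_mbit_s": float(real_mbit),
        "cpu_mbit_s": float(cpu_mbit),
        "calls": int(calls),
    }


def fraction_of_line_rate(real_mbit_s, line_rate_mbit_s=10000.0):
    """Measured rate as a fraction of the nominal 10GbE line rate."""
    return real_mbit_s / line_rate_mbit_s


if __name__ == "__main__":
    row = parse_summary(
        "l1073741824 1.49 0.29 5747.0942 29966.9440 2048 1370.21 7144.7"
    )
    print(row["side"], row["bytes"], fraction_of_line_rate(row["real_mbit_s"]))
```

Run over the logs above, the 512k row comes out at about 57% of the nominal 10 Gbit/s.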
qlen=10000
- Throughput graph:

- Saturation graph:

- Signature graph

NTTCP
edgar:~# for i in 4096 8192 16384 65536 262144 524288; do nttcp -D -v -t -T -x 8388608 -l $i 10.0.0.217; done
nttcp-l: nttcp, version 1.47
nttcp-l: Pid=11036
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Pid=28483, InetPeer= )
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Optionline="nttcp@-r@)
nttcp-l: from 10.0.0.217: "10.0.0.217" (=dataport: 5038)
nttcp-l: send window size = 16384
nttcp-l: receive window size = 87380
nttcp-l: buflen=4096, bufcnt=2048, dataport=5038/tcp
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: accept from 10.0.0.150
nttcp-1: send window size = 24040
nttcp-1: receive window size = 87380
nttcp-1: buflen=4096, bufcnt=2048, dataport=5038/tcp
nttcp-l: transmitted 8388608 bytes
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls   Real-C/s   CPU-C/s
l  8388608    0.01    0.00    5753.0102   20134.6727   2048  175567.94  614461.4
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: received 8388608 bytes
nttcp-l: try to get outstanding messages from 1 remote clients
1  8388608    0.01    0.01    5411.5687    6711.5576   2291  184743.17  229122.9
nttcp-1: exiting
nttcp-l: nttcp, version 1.47
nttcp-l: Pid=11037
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Pid=28485, InetPeer= )
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Optionline="nttcp@-r@)
nttcp-l: from 10.0.0.217: "10.0.0.217" (=dataport: 5038)
nttcp-l: send window size = 16384
nttcp-l: receive window size = 87380
nttcp-l: buflen=8192, bufcnt=1024, dataport=5038/tcp
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: accept from 10.0.0.150
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: send window size = 24040
nttcp-1: receive window size = 87380
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: buflen=8192, bufcnt=1024, dataport=5038/tcp
nttcp-l: transmitted 8388608 bytes
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls   Real-C/s   CPU-C/s
l  8388608    0.04    0.00    1855.2197   20134.6727   1024   28308.41  307230.7
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: received 8388608 bytes
nttcp-l: try to get outstanding messages from 1 remote clients
1  8388608    0.04    0.01    1809.3032    5033.6682   1157   31193.55   86783.7
nttcp-1: exiting
nttcp-l: nttcp, version 1.47
nttcp-l: Pid=11038
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Pid=28486, InetPeer= )
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Optionline="nttcp@-r@)
nttcp-l: from 10.0.0.217: "10.0.0.217" (=dataport: 5038)
nttcp-l: send window size = 16384
nttcp-l: receive window size = 87380
nttcp-l: buflen=16384, bufcnt=512, dataport=5038/tcp
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: accept from 10.0.0.150
nttcp-1: send window size = 24040
nttcp-1: receive window size = 87380
nttcp-1: buflen=16384, bufcnt=512, dataport=5038/tcp
nttcp-l: transmitted 8388608 bytes
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls   Real-C/s   CPU-C/s
l  8388608    0.01    0.00    5963.6421   20134.6727    512   45498.98  153615.4
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: received 8388608 bytes
nttcp-l: try to get outstanding messages from 1 remote clients
1  8388608    0.01    0.01    5657.4662    6711.5576    759   63985.84   75907.6
nttcp-1: exiting
nttcp-l: nttcp, version 1.47
nttcp-l: Pid=11039
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Pid=28487, InetPeer= )
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Optionline="nttcp@-r@)
nttcp-l: from 10.0.0.217: "10.0.0.217" (=dataport: 5038)
nttcp-l: send window size = 16384
nttcp-l: receive window size = 87380
nttcp-l: buflen=65536, bufcnt=128, dataport=5038/tcp
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: accept from 10.0.0.150
nttcp-1: send window size = 24040
nttcp-1: receive window size = 87380
nttcp-1: buflen=65536, bufcnt=128, dataport=5038/tcp
nttcp-l: transmitted 8388608 bytes
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls   Real-C/s   CPU-C/s
l  8388608    0.01    0.00    6294.8001   20134.6727    128   12006.38   38403.8
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: received 8388608 bytes
1  8388608    0.01    0.01    5886.7425    6711.5576    504   44210.53   50405.0
nttcp-1: exiting
nttcp-l: nttcp, version 1.47
nttcp-l: Pid=11040
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Pid=28488, InetPeer= )
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Optionline="nttcp@-r@)
nttcp-l: from 10.0.0.217: "10.0.0.217" (=dataport: 5038)
nttcp-l: send window size = 16384
nttcp-l: receive window size = 87380
nttcp-l: buflen=262144, bufcnt=32, dataport=5038/tcp
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: accept from 10.0.0.150
nttcp-1: send window size = 24040
nttcp-1: receive window size = 87380
nttcp-1: buflen=262144, bufcnt=32, dataport=5038/tcp
nttcp-l: transmitted 8388608 bytes
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls   Real-C/s   CPU-C/s
l  8388608    0.01    0.00    5932.5375   20134.6727     32    2828.85    9601.0
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: received 8388608 bytes
1  8388608    0.01    0.01    5568.7382   10067.3363    525   43564.85   78757.9
nttcp-1: exiting
nttcp-l: nttcp, version 1.47
nttcp-l: Pid=11041
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Pid=28489, InetPeer= )
nttcp-l: from 10.0.0.217: "10.0.0.217" (=nttcp-1: Optionline="nttcp@-r@)
nttcp-l: from 10.0.0.217: "10.0.0.217" (=dataport: 5038)
nttcp-l: send window size = 16384
nttcp-l: receive window size = 87380
nttcp-l: buflen=524288, bufcnt=16, dataport=5038/tcp
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: accept from 10.0.0.150
nttcp-1: send window size = 24040
nttcp-1: receive window size = 87380
nttcp-1: buflen=524288, bufcnt=16, dataport=5038/tcp
nttcp-l: transmitted 8388608 bytes
     Bytes  Real s   CPU s  Real-MBit/s   CPU-MBit/s  Calls   Real-C/s   CPU-C/s
l  8388608    0.01    0.00    6303.6694   20134.6727     16    1502.91    4800.5
nttcp-l: try to get outstanding messages from 1 remote clients
nttcp-1: received 8388608 bytes
1  8388608    0.01    0.01    5973.7283    6711.5576    151   13441.34   15101.5
nttcp-1: exiting
Jumbo frames enabled and qlen=10000 (10x of a typical 1GbE interface configuration)
Jumbo frames are enabled on the Nexus-5010 10GbE switch, and the SFP+ interfaces on the 2 hosts have also been modified to allow jumbo frames, along with a txqueuelen of 10000.
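Before trusting a jumbo frame run it is worth checking that both hosts really have the settings one thinks they have. Here is a small sketch that reads the MTU and transmit queue length from the Linux sysfs files `/sys/class/net/<iface>/mtu` and `/sys/class/net/<iface>/tx_queue_len`. The function names are made up, and the expected MTU of 9000 is an assumption (the notes above only say "jumbo frames"); the txqueuelen of 10000 is the value quoted above.

```python
from pathlib import Path


def read_link_settings(iface, sysfs_root="/sys/class/net"):
    """Read the MTU and txqueuelen of an interface from sysfs."""
    base = Path(sysfs_root) / iface
    return {
        "mtu": int((base / "mtu").read_text().strip()),
        "tx_queue_len": int((base / "tx_queue_len").read_text().strip()),
    }


def check_jumbo_setup(settings, want_mtu=9000, want_qlen=10000):
    """Return a list of mismatches; an empty list means all is as expected.

    want_mtu=9000 is an assumed jumbo frame size, adjust to the switch.
    """
    problems = []
    if settings["mtu"] != want_mtu:
        problems.append(f"mtu is {settings['mtu']}, expected {want_mtu}")
    if settings["tx_queue_len"] != want_qlen:
        problems.append(
            f"txqueuelen is {settings['tx_queue_len']}, expected {want_qlen}"
        )
    return problems


if __name__ == "__main__":
    # eth1 is one of the 10GbE ports in the lshw output of node17 above.
    print(check_jumbo_setup(read_link_settings("eth1")))
```

The `sysfs_root` parameter is only there so the function can be pointed at a fake directory tree for testing.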