= Configuration ================================================================
The sock CTP is enabled by default.
= Endpoint naming ==============================================================
Sock endpoint URIs have the form sock://host:port, where host is a hostname or
an IPv4 address:
sock://foo.com:5555
sock://1.2.3.4:5555
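As an illustration of the URI form above, the following sketch splits a sock
URI into its host and port using only the Python standard library. It is not
part of CCI, and the helper name parse_sock_uri is hypothetical.

```python
from urllib.parse import urlsplit


def parse_sock_uri(uri):
    """Split a sock URI such as sock://foo.com:5555 into (host, port).

    Hypothetical helper for illustration only; CCI parses URIs internally.
    """
    parts = urlsplit(uri)
    if parts.scheme != "sock":
        raise ValueError("not a sock URI: %r" % uri)
    # urlsplit leaves hostname or port as None when the URI omits them.
    if parts.hostname is None or parts.port is None:
        raise ValueError("sock URI needs both host and port: %r" % uri)
    return parts.hostname, parts.port
```

A URI without a port, such as sock://foo.com, is rejected with ValueError
because the form described here always carries both parts.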
= Config file options ==========================================================
Sock devices _must_ have these two items in the config file:
transport = sock
ip = 1.2.3.4
where 1.2.3.4 is the valid IPv4 address of the device to bind to _or_
they _must_ have these two items in the config file:
transport = sock
interface = ethN
where ethN is the name of the interface to use. Both ip= and interface= may
be given; if they disagree, the IP address is used.
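The rule above (transport = sock plus at least one of ip= or interface=) can
be checked before starting an application. The sketch below is a standalone
Python helper, not part of CCI. It assumes the config file uses one
[device-name] section per device, and the device names sock0 and sock1 and
the interface eth1 are made up for the example.

```python
import configparser

# Sample config text; section names and eth1 are hypothetical.
SAMPLE = """\
[sock0]
transport = sock
ip = 1.2.3.4

[sock1]
transport = sock
interface = eth1
"""


def check_sock_devices(text):
    """Return the names of sock devices that have neither ip= nor interface=."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    bad = []
    for name in cfg.sections():
        dev = cfg[name]
        if dev.get("transport") != "sock":
            continue  # not a sock device, nothing to check here
        if "ip" not in dev and "interface" not in dev:
            bad.append(name)
    return bad
```

An empty result means every sock device in the text names an address or an
interface; a device that gives both is accepted as well.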
Sock devices may have the following items in the config file:
mtu = 9000
The sock transport will then set the endpoint->max_send_size to this size
less what it needs for headers. If the OS supports MTU discovery (i.e.
recent Linux), then CCI will use the actual MTU (less headers) and ignore
this keyword.
bufsize = 20971520
The sock transport will then set the socket buffers (both send and receive)
to the specified size. Note that you need to ensure that the system
configuration allows the specified size. On Linux, this means that you must
check the values of net.core.rmem_max and net.core.wmem_max using sysctl
(e.g., sysctl -n net.core.rmem_max) and, if necessary, increase them (e.g.,
sudo sysctl -w net.core.rmem_max=30000000).
If the socket send/receive buffers are not big enough, the transport may not
be able to send/receive messages. For sends on unordered/unreliable
connections, if CCI debugging is turned on and a message cannot be sent
because of a lack of system buffers, users will see an error message saying
that a send failed because the resource was temporarily unavailable.
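One way to see whether the system allows a given bufsize is to request it on
a scratch socket and read back what the kernel granted. The sketch below does
this in Python on a UDP socket. It is independent of CCI and assumes the
transport's sockets behave like an ordinary UDP socket in this respect. The
function name granted_buffer_sizes is hypothetical.

```python
import socket


def granted_buffer_sizes(requested):
    """Request `requested` bytes for both buffers of a scratch UDP socket.

    Returns (rcvbuf, sndbuf) as reported back by getsockopt.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
        rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return rcv, snd
```

On Linux the kernel doubles the requested value for bookkeeping and silently
caps requests at net.core.rmem_max and net.core.wmem_max, so a reported value
below twice the request suggests that the limits need to be raised. Other
systems may instead return an error for an oversized request.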
= Run-time notes ===============================================================
1. Most devices that support transports other than sock will also provide an
Ethernet interface. Generally, you will want to use the native transport and
not sock for these devices.
2. If you do not use a config file, the sock driver will start progress
threads even if you do not open a sock endpoint. We will fix this, but in the
meantime, use a config file and do not specify a sock device unless you want
one.
= Known limitations ============================================================
Not implemented:
RMA Fence
RO connections may not be ordered (i.e. RU)
Platform specific limitations:
* MacOS: when using the loopback interface, it is not possible to use the
entire MTU size to send data; much less should be sent. To use the
loopback interface, please set the appropriate MTU in the
configuration file (e.g., mtu = 9000).
= CCI Performance Tuning =======================================================
SOCK_DEFAULT_MSS
The default Maximum Segment Size is the default amount of data that is sent
in a message. This does not include the size of the header and should not
be bigger than the maximum size of a UDP packet. Depending on the
communication pattern, a smaller or larger MSS may be more efficient.
SOCK_EP_RX_CNT
Number of buffers used to receive messages. Directly impacts the memory
footprint of the CCI transport.
SOCK_EP_TX_CNT
Number of buffers used to send messages. Directly impacts the memory
footprint of the CCI transport.
SOCK_PROG_TIME_US
Specifies the interval in microseconds at which progress is made (the
progress thread will wake up every N microseconds). A low progress timeout
decreases latency but increases CPU consumption.
SOCK_RMA_DEPTH
Number of in-flight RMA messages.
ACK_TIMEOUT
The transport can acknowledge messages in blocks. The ACK timeout is
triggered when not enough ACKs are pending within a given period of
time.
PENDING_ACK_THRESHOLD
Maximum number of messages waiting for acknowledgment.
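
The interplay between block acknowledgment, a pending threshold, and a
timeout can be sketched as follows. This is one plausible reading of the two
ACK parameters above, written in Python for illustration; it is not the sock
transport's implementation, and the class and method names are hypothetical.
Time is passed in explicitly so the behavior is easy to follow.

```python
class AckBatcher:
    """Collect message IDs and acknowledge them in blocks.

    threshold and timeout play the roles of PENDING_ACK_THRESHOLD and
    ACK_TIMEOUT; the values and units used by the real transport may differ.
    """

    def __init__(self, threshold, timeout):
        self.threshold = threshold
        self.timeout = timeout
        self.pending = []
        self.oldest = None  # arrival time of the oldest pending ID

    def on_receive(self, msg_id, now):
        """Record a received message; return a block of IDs to ACK, or []."""
        if not self.pending:
            self.oldest = now
        self.pending.append(msg_id)
        if len(self.pending) >= self.threshold:
            return self._flush()
        return []

    def on_tick(self, now):
        """Called periodically; flush a partial block once the timeout expires."""
        if self.pending and now - self.oldest >= self.timeout:
            return self._flush()
        return []

    def _flush(self):
        block, self.pending = self.pending, []
        self.oldest = None
        return block
```

In this sketch a full block is acknowledged as soon as the threshold is
reached, while the timeout bounds how long a partial block can wait. Larger
thresholds mean fewer ACK packets but longer waits before a sender learns
that its messages arrived.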
= System Performance Tuning ====================================================
If the system parameters are not tuned for high-performance communications,
the transport may not be able to send/receive messages using some of the
latest networking hardware. The following page provides a good reference for
system tuning (TCP tuning has also proved to improve performance for the
sock transport): http://fasterdata.es.net/host-tuning/linux/