@@ -7,17 +7,17 @@ weight = 1
7
7
## Troubleshooting Timeouts
8
8
9
9
Client complaining about "timeout errors", but not sure how to track it down?
10
- Here's a simple utility for examining your situation.
10
+ This page will guide you through the most common causes.
11
11
12
- ### First, check listen_disabled_num
12
+ ### Check your firewalls
13
13
14
- Before you go ahead with troubleshooting, you'll want to telnet to your
15
- memcached instance and run `stats`, then look for "listen_disabled_num". This
16
- is a poorly named counter which describes how many times you've reached
17
- maxconns. Each time memcached hits maxconns it will delay new connections,
18
- which means you'll possibly get timeouts.
14
+ You may need to tune or disable firewalls that exist between the client and
15
+ memcached. Most firewalls track the connections in use (connection tracking)
16
+ and hitting the tracking limit will often cause timeout errors as existing TCP
17
+ connections are dropped.
19
18
20
- Also, disable or tune any firewalls you may have in the way.
19
+ You will have to refer to your firewall vendor's documentation to figure this
20
+ out.
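
As a rough illustration of the connection tracking limit described above, here is a minimal sketch for the case where the firewall is Linux netfilter running on a host you control. The two `/proc` paths are an assumption: they only exist when the `nf_conntrack` module is loaded, and other firewalls expose this information differently. The function name `conntrack_usage` is hypothetical.

```python
from pathlib import Path

# Assumed locations of the netfilter counters on Linux. These files are
# only present when the nf_conntrack kernel module is loaded.
COUNT_PATH = "/proc/sys/net/netfilter/nf_conntrack_count"
MAX_PATH = "/proc/sys/net/netfilter/nf_conntrack_max"


def conntrack_usage(count_path=COUNT_PATH, max_path=MAX_PATH):
    """Return tracked connections as a fraction of the limit, or None.

    None means the counters could not be read or parsed, for example
    because connection tracking is not enabled on this host.
    """
    try:
        count = int(Path(count_path).read_text())
        limit = int(Path(max_path).read_text())
    except (OSError, ValueError):
        return None
    if limit <= 0:
        return None
    return count / limit


if __name__ == "__main__":
    usage = conntrack_usage()
    if usage is None:
        print("conntrack counters not available on this host")
    else:
        print(f"conntrack table is {usage:.1%} full")
```

A value that stays close to 100% during the periods when clients report timeouts would be consistent with the tracking table being the bottleneck.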
21
21
22
22
### Then, carefully check the usual suspects
23
23
@@ -27,18 +27,12 @@ memcached to disk periodically.
27
27
Is the machine overloaded? 0% CPU idle with a load of 400 and memcached
28
28
probably isn't getting enough CPU time. You can try `nice` or `renice`, or
29
29
just run less on the machine. If you're severely overloaded on CPU, you might
30
- notice the mc_conn_tester below reporting very high wait times for `set`
30
+ notice the "mc_conn_tester" below reporting very high wait times for `set`
31
31
commands.
32
32
33
- Is the memcached server 32bit? 32bit hosts have less memory available to the
34
- kernel for TCP sockets and friends. We've observed some odd behavior under
35
- large numbers of open sockets and high load with 32bit systems. Strongly
36
- consider going 64bit, as it may help some hard to trace problems go away,
37
- including segfaults due to the 2/4g memory limit.
38
-
39
33
### Next, mc_conn_tester.pl
40
34
41
- Fetch this:
35
+ Download this Perl script:
42
36
43
37
https://www.memcached.org/files/mc_conn_tester.pl
44
38
@@ -55,7 +49,7 @@ Options:
55
49
[...etc...]
56
50
```
57
51
58
- This is a minimal utility for testing a quick routine with a memcached
52
+ This is a small utility for testing a quick routine directly against a memcached
59
53
instance. It will connect, attempt a couple sets, attempt a few gets, then loop and
60
54
repeat.
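
To make the routine concrete, here is a minimal Python sketch of the same idea. It is not the Perl script itself and makes no claim to match its options or output: it performs a single `set` and a single `get` per cycle over the memcached text protocol, and the names `probe` and `probe_key` are hypothetical.

```python
import socket
import time


def probe(host="127.0.0.1", port=11211, timeout=1.0):
    """Run one connect/set/get cycle and time each step.

    Returns (ok, timings). Steps that never completed stay at 0.0, so a
    failure with every timing at zero means the connect itself failed.
    """
    timings = {"conn": 0.0, "set": 0.0, "get": 0.0}

    start = time.monotonic()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:  # refused, unreachable, or timed out
        return False, timings
    timings["conn"] = time.monotonic() - start

    try:
        with sock, sock.makefile("rb") as reader:
            # Store a small value and wait for the server's reply.
            start = time.monotonic()
            sock.sendall(b"set probe_key 0 0 5\r\nhello\r\n")
            if reader.readline() != b"STORED\r\n":
                return False, timings
            timings["set"] = time.monotonic() - start

            # Read it back; the reply ends with an END line.
            start = time.monotonic()
            sock.sendall(b"get probe_key\r\n")
            saw_value = False
            while True:
                line = reader.readline()
                if not line:  # connection closed early
                    return False, timings
                if line.startswith(b"VALUE probe_key "):
                    saw_value = True
                if line == b"END\r\n":
                    break
            timings["get"] = time.monotonic() - start
            return saw_value, timings
    except OSError:  # includes socket timeouts
        return False, timings


if __name__ == "__main__":
    # Loop and repeat, printing one line per cycle.
    for _ in range(5):
        ok, t = probe()
        parts = " ".join(f"({k}: {v:.8f})" for k, v in t.items())
        print("OK:  " if ok else "Fail:", parts)
        time.sleep(1)
```

Running something like this from the same host as the failing client, with the same timeout the client uses, helps separate network problems from client bugs.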
61
55
@@ -102,7 +96,7 @@ with `Fail:` as usual.
102
96
103
97
### You're probably dropping packets.
104
98
105
- In most cases, where listen_disabled_num doesn't apply, you're likely dropping
99
+ In many cases, you're likely dropping
106
100
packets for some reason. Either a firewall is in the way and has run out of
107
101
stateful tracking slots, or your network card or switch is dropping packets.
108
102
@@ -113,7 +107,8 @@ Fail: (timeout: 1) (elapsed: 1.00145602) (conn: 0.00000000) (set: 0.00000000) (g
113
107
```
114
108
115
109
... where `conn:` and the rest are all zero. So the test was not able to
116
- connect to memcached.
110
+ connect to memcached. You may also see random numbers across conn/set/get
111
+ mixed in with total failures.
117
112
118
113
On most systems SYN retries are 3 seconds, which is awfully long. Losing a
119
114
single SYN packet will certainly mean a timeout. This is easily proven:
@@ -166,4 +161,4 @@ them here.
166
161
### But your utility never fails!
167
162
168
163
Odds are good your client has a bug :( Try reaching out to the client author
169
- for help.
164
+ for help. If that doesn't help, please reach out to us for support via GitHub Discussions.