Commit | Line | Data |
---|---|---|
997358a6 MW |
1 | <HTML> |
2 | <HEAD> | |
3 | <TITLE>FreeS/WAN troubleshooting</TITLE> | |
4 | <meta name="keywords" content="Linux, IPSEC, VPN, security, FreeSWAN, troubleshooting, debugging"> | |
5 | <!-- | |
6 | Written by Claudia Schmeing for the Linux FreeS/WAN project | |
7 | Freely distributable under the GNU General Public License | |
8 | ||
9 | More information at www.freeswan.org | |
10 | Feedback to users@lists.freeswan.org | |
11 | ||
12 | CVS information: | |
13 | RCS ID: $Id: trouble.html,v 1.1 2004/03/15 20:35:24 as Exp $ | |
14 | Last changed: $Date: 2004/03/15 20:35:24 $ | |
15 | Revision number: $Revision: 1.1 $ | |
16 | ||
17 | CVS revision numbers do not correspond to FreeS/WAN release numbers. | |
18 | --> | |
19 | ||
20 | </HEAD> | |
21 | <BODY> | |
22 | ||
23 | <H1><A NAME="trouble"></A>Linux FreeS/WAN Troubleshooting Guide</H1> | |
24 | ||
25 | <H2><A NAME="overview"></A>Overview</H2> | |
26 | ||
27 | <P> | |
28 | This document covers several general places where you might have a problem:</P> | |
29 | <OL> | |
30 | <LI><A HREF="#install">During install</A>.</LI> | |
31 | <LI><A HREF="#negotiation">During the negotiation process</A>.</LI> | |
32 | <LI><A HREF="#use">Using an established connection</A>.</LI> | |
33 | </OL> | |
34 | <P>This document also contains <A HREF="#notes">notes</A> which | |
35 | expand on points made in these sections, and tips for | |
36 | <A HREF="#prob.report">problem | |
37 | reporting</A>. If the other end of your connection is not FreeS/WAN, | |
38 | you'll also want to read our | |
39 | <A HREF="interop.html#interop.problem">interoperation</A> document.</P> | |
40 | <H2><A NAME="install"></A>1. During Install</H2> | |
41 | <H3>1.1 RPM install gotchas</H3> | |
42 | <P>With the RPM method:</P> | |
43 | <UL> | |
44 | <LI>Be sure you have installed both the userland tools and the kernel | |
45 | components. One will not work without the other. For example, when using | |
46 | FreeS/WAN-produced RPMs for our 2.04 release, you need both: | |
47 | <PRE> freeswan-userland-2.04_2.4.20_20.9-0.i386.rpm | |
48 | freeswan-module-2.04_2.4.20_20.9-0.i386.rpm | |
49 | </PRE> | |
50 | </LI> | |
51 | </UL> | |
52 | <H3>1.2 Problems installing from source</H3> | |
53 | <P>When installing from source, you may find these problems:</P> | |
54 | <UL> | |
55 | <LI>Missing library. See <A HREF="faq.html#gmp.h_missing">this</A> | |
56 | FAQ.</LI> | |
57 | <LI>Missing utilities required for compile. See this | |
58 | <A HREF="install.html#tool.lib">checklist</A>.</LI> | |
59 | <LI>Kernel version incompatibility. See <A HREF="faq.html#k.versions">this</A> | |
60 | FAQ.</LI> | |
61 | <LI>Another compile problem. Find information in the out.* files, | |
62 | i.e. out.kpatch, out.kbuild, created at compile time in the top-level | |
63 | Linux FreeS/WAN directory. Error messages generated by KLIPS during | |
64 | the boot sequence are accessible with the <VAR>dmesg</VAR> command. | |
65 | <BR> | |
66 | Check the list archives and the List in Brief to see if this is a | |
67 | known issue. If it is not, report it to the bugs list as described | |
68 | in our <A HREF="#prob.report">problem reporting</A> section. In some | |
69 | cases, you may be asked to provide debugging information using gdb; | |
70 | details <A HREF="#gdb">below</A>.</LI> | |
71 | <LI>If your kernel compiles but you fail to install your new | |
72 | FreeS/WAN-enabled kernel, review the sections on <A HREF="install.html#newk">installing | |
73 | the patched kernel</A>, and <A HREF="install.html#testinstall">testing</A> | |
74 | to see if install succeeded.</LI> | |
75 | </UL> | |
76 | <H3><A NAME="install.check"></A>1.3 Install checks</H3> | |
77 | <P><VAR>ipsec verify</VAR> checks a number | |
78 | of FreeS/WAN essentials. Here are some hints on what to do when your | |
79 | system doesn't check out:</P> | |
80 | <P> | |
81 | <TABLE border=1> | |
82 | <TR> | |
83 | <TD><STRONG>Problem</STRONG></TD> | |
84 | <TD><STRONG>Status</STRONG></TD> | |
85 | <TD><STRONG>Action</STRONG></TD> | |
86 | </TR> | |
87 | <TR> | |
88 | <TD><VAR>ipsec</VAR> not on-path</TD> | |
89 | <TD> </TD> | |
90 | <TD><P>Add <VAR>/usr/local/sbin</VAR> to your PATH.</P></TD> | |
91 | </TR> | |
92 | <TR> | |
93 | <TD>Missing KLIPS support</TD> | |
94 | <TD><FONT COLOR="#FF0000">critical</FONT></TD> | |
95 | <TD>See <A HREF="faq.html#noKLIPS">this FAQ.</A></TD> | |
96 | </TR> | |
97 | <TR> | |
98 | <TD>No RSA private key</TD> | |
99 | <TD> </TD> | |
100 | <TD> | |
101 | <P>Follow <A HREF="install.html#genrsakey">these | |
102 | instructions</A> to create an RSA key pair for your host. RSA keys are:</P> | |
103 | <UL> | |
104 | <LI>required for opportunistic encryption, and</LI> | |
105 | <LI>our preferred method to authenticate pre-configured connections.</LI> | |
106 | </UL> | |
107 | </TD> | |
108 | </TR> | |
109 | <TR> | |
110 | <TD><VAR>pluto</VAR> not running</TD> | |
111 | <TD><FONT COLOR="#FF0000">critical</FONT></TD> | |
112 | <TD><PRE>service ipsec start</PRE></TD> | |
113 | </TR> | |
114 | <TR> | |
115 | <TD>No port 500 hole</TD> | |
116 | <TD><FONT COLOR="#FF0000">critical</FONT></TD> | |
117 | <TD>Open UDP port 500 for IKE negotiation.</TD> | |
118 | </TR> | |
119 | <TR> | |
120 | <TD>Port 500 check N/A</TD> | |
121 | <TD> </TD> | |
122 | <TD>Check that UDP port 500 is open for IKE negotiation.</TD> | |
123 | </TR> | |
124 | <TR> | |
125 | <TD>Failed DNS checks</TD> | |
126 | <TD> </TD> | |
127 | <TD>Opportunistic encryption requires information from DNS. | |
128 | To set this up, see <A HREF="quickstart.html#opp.setup">our instructions</A>. | |
129 | </TD> | |
130 | </TR> | |
131 | <TR> | |
132 | <TD>No public IP address</TD> | |
133 | <TD> </TD> | |
134 | <TD>Check that the interface which you want to protect with IPSec is up and | |
135 | running.</TD> | |
136 | </TR> | |
137 | </TABLE> | |
138 | ||
139 | ||
140 | <H3><A NAME="oe.trouble"></A>1.4 Troubleshooting OE</H3> | |
141 | <P>OE should work with no local configuration, if you have posted | |
142 | DNS TXT records according to the instructions in our | |
143 | <A HREF="quickstart.html">quickstart guide</A>. | |
144 | If you encounter trouble, try these hints. | |
145 | We welcome additional hints via the | |
146 | <A HREF="mail.html">users' mailing list</A>.</P> | |
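<P>The log messages in the table below refer to reverse-map names such as
<VAR>13.2.0.192.in-addr.arpa.</VAR> As a minimal sketch (not part of FreeS/WAN),
the following Python shows how such a name is derived from an IP address, which
can help when checking that a record was posted under the name the log complains
about. The function name <VAR>reverse_map_name</VAR> is an illustrative
assumption; <VAR>reverse_pointer</VAR> is from Python's standard
<VAR>ipaddress</VAR> module.</P>

```python
import ipaddress

def reverse_map_name(ip):
    """Return the reverse-map DNS name for an IP address, with trailing dot."""
    # reverse_pointer yields e.g. "13.2.0.192.in-addr.arpa" (no trailing dot)
    return ipaddress.ip_address(ip).reverse_pointer + "."

# 192.0.2.13 is the documentation address used in the log excerpts in this guide.
print(reverse_map_name("192.0.2.13"))
```

<P>Comparing this name with the one in your log message shows which address
the lookup was actually made for.</P>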
147 | ||
148 | <TABLE border=1> | |
149 | <TR> | |
150 | <TD><STRONG>Symptom</STRONG></TD> | |
151 | <TD><STRONG>Problem</STRONG></TD> | |
152 | <TD><STRONG>Action</STRONG></TD> | |
153 | </TR> | |
154 | <TR> | |
155 | <TD> | |
156 | You're running FreeS/WAN 2.01 (or later), | |
157 | and initiating a connection to FreeS/WAN | |
158 | 2.00 (or earlier). | |
159 | In your logs, you see a message like: | |
160 | <pre>no RSA public key known for '192.0.2.13'; | |
161 | DNS search for KEY failed (no KEY record | |
162 | for 13.2.0.192.in-addr.arpa.)</pre> | |
163 | The older FreeS/WAN logs no error. | |
164 | </TD> | |
165 | <TD> | |
166 | <A NAME="oe.trouble.flagday"></A> | |
167 | A protocol level incompatibility between 2.01 (or later) and | |
168 | 2.00 (or earlier) causes this error. It occurs when a FreeS/WAN 2.01 | |
169 | (or later) box for which no KEY record is posted attempts to initiate an OE | |
170 | connection to older FreeS/WAN versions (2.00 and earlier). | |
171 | Note that older versions can initiate to newer versions without this error. | |
172 | </TD> | |
173 | <TD>If you control the peer host, upgrade its FreeS/WAN to 2.01 (or later), and | |
174 | post new style TXT records for it. If not, but if you know its sysadmin, | |
175 | perhaps a quick note is in order. If neither option is possible, you can | |
176 | ease the transition by posting an old style KEY record (created with a | |
177 | command like "ipsec showhostkey --key") to the reverse map for | |
178 | the FreeS/WAN 2.01 (or later) box.</TD> | |
179 | </TR> | |
180 | <TR> | |
181 | <TD>OE host is very slow to contact other hosts.</TD> | |
182 | <TD>Slow DNS service while running OE.</TD> | |
183 | <TD>It's a good idea to run a caching DNS server on your OE host, | |
184 | as outlined in <A HREF="http://lists.freeswan.org/pipermail/design/2003-January/004205.html">this | |
185 | mailing list message</A>. If your DNS servers are elsewhere, | |
186 | put their IPs | |
187 | in the <VAR>clear</VAR> policy group, and | |
188 | re-read groups with <PRE>ipsec auto --rereadgroups</PRE> | |
189 | </TD> | |
190 | </TR> | |
191 | <TR> | |
192 | <TD> | |
193 | <PRE>Can't Opportunistically initiate for | |
194 | 192.0.2.2 to 192.0.2.3: no TXT record | |
195 | for 13.2.0.192.in-addr.arpa.</PRE> | |
196 | </TD> | |
197 | <TD>Peer is not set up for OE.</TD> | |
198 | <TD><P>None. Plenty of hosts on the Internet | |
199 | do not run OE. If, however, you have set OE up on that peer, this may | |
200 | indicate that you need to wait up to 48 hours | |
201 | for its DNS records to propagate.</P></TD> | |
202 | </TR> | |
203 | <TR> | |
204 | <TD><VAR>ipsec verify</VAR> does not find DNS records: | |
205 | <PRE>... | |
206 | Looking for TXT in forward map: | |
207 | xy.example.com...[FAILED] | |
208 | Looking for TXT in reverse map...[FAILED] | |
209 | ...</PRE> | |
210 | ||
211 | You also experience authentication failure:<BR> | |
212 | <PRE>Possible authentication failure: | |
213 | no acceptable response to our | |
214 | first encrypted message</PRE> | |
215 | </TD> | |
216 | ||
217 | <TD>DNS records are not posted or have not propagated.</TD> | |
218 | <TD>Did you post the DNS records necessary for OE? If not, | |
219 | do so using the instructions in our | |
220 | <A HREF="quickstart.html#quickstart">quickstart guide</A>. | |
221 | If so, wait up to 48 hours for the DNS records to propagate.</TD> | |
222 | </TR> | |
223 | <TR> | |
224 | <TD><VAR>ipsec verify</VAR> does not find DNS records, and you experience | |
225 | authentication failure.</TD> | |
226 | <TD>For iOE, your ID | |
227 | does not match location of | |
228 | forward DNS record.</TD> | |
229 | <TD>In <VAR>config setup</VAR>, change | |
230 | <VAR>myid=</VAR> to match the forward DNS where you posted the record. | |
231 | Restart FreeS/WAN. | |
232 | For reference, see our | |
233 | <A HREF="quickstart.html#opp.client">iOE instructions</A>.</TD> | |
234 | </TR> | |
235 | <TR> | |
236 | <TD><VAR>ipsec verify</VAR> finds DNS records, yet there is | |
237 | still authentication failure. ( ? )</TD> | |
238 | <TD>DNS records are malformed.</TD> | |
239 | <TD>Re-create the records and send new copies to your DNS administrator.</TD> | |
240 | </TR> | |
241 | <TR> | |
242 | <TD><VAR>ipsec verify</VAR> finds DNS records, yet there is | |
243 | still authentication failure. ( ? )</TD> | |
244 | <TD>DNS records show different keys for a gateway vs. its subnet hosts.</TD> | |
245 | <TD>All TXT records for boxes protected by an OE gateway must contain the | |
246 | gateway's public key. Re-create and re-post any incorrect records using | |
247 | <A HREF="quickstart.html#opp.incoming">these instructions</A>.</TD> | |
248 | </TR> | |
249 | <TR> | |
250 | <TD>OE gateway loses connectivity to its subnet. The gateway's | |
251 | routing table shows routes to the subnet through IPsec interfaces.</TD> | |
252 | <TD>The subnet is part of the <VAR>private</VAR> or <VAR>block</VAR> | |
253 | policy group on the gateway.</TD> | |
254 | <TD>Remove the subnet from the group, and reread | |
255 | groups with <PRE>ipsec auto --rereadgroups</PRE></TD> | |
256 | </TR> | |
257 | <TR> | |
258 | <TD>OE does not work to hosts on the local LAN.</TD> | |
259 | <TD>This is a known issue.</TD> | |
260 | <TD>See <A HREF="opportunism.known-issues">this list</A> of known issues | |
261 | with OE. | |
262 | </TD> | |
263 | </TR> | |
264 | ||
265 | <TR> | |
266 | <TD>FreeS/WAN does not seem to be executing your default policy. In your | |
267 | logs, you see a message like: | |
268 | <PRE>/etc/ipsec.d/policies/iprivate-or-clear" | |
269 | line 14: subnet "0.0.0.0/0", | |
270 | source 192.0.2.13/32, | |
271 | already "private-or-clear"</PRE> | |
272 | </TD> | |
273 | <TD><A HREF="glossary.html#fullnet">Fullnet</A> in a policy group file defines | |
274 | your default policy. Fullnet should normally be present in only one policy | |
275 | group file. The fine print: you can have two default policies defined so long | |
276 | as they protect different local endpoints (e.g. the FreeS/WAN gateway and a | |
277 | subnet).</TD> | |
278 | <TD> | |
279 | Find all policies which contain fullnet with:<br> | |
280 | <PRE>grep -F 0.0.0.0/0 /etc/ipsec.d/policies/*</PRE> | |
281 | then remove the unwanted occurrence(s). | |
282 | </TD> | |
283 | </TR> | |
284 | ||
285 | </TABLE> | |
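<P>The fullnet check in the last row of the table can also be scripted. The
sketch below is one possible approach, assuming that each policy group file
holds one CIDR entry per line and that <VAR>#</VAR> starts a comment. Unlike
the <VAR>grep -F</VAR> command, it ignores commented-out entries. The function
name and the demo file contents are made up for illustration.</P>

```python
import tempfile
from pathlib import Path

FULLNET = "0.0.0.0/0"

def files_with_fullnet(policy_dir):
    """Return sorted names of policy group files that list fullnet."""
    hits = []
    for path in sorted(Path(policy_dir).iterdir()):
        if not path.is_file():
            continue
        for line in path.read_text().splitlines():
            entry = line.split("#", 1)[0].strip()   # drop comments and whitespace
            if entry == FULLNET:
                hits.append(path.name)
                break
    return hits

# Demo on a throwaway directory; the file contents are invented examples.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "private-or-clear").write_text("0.0.0.0/0\n")
    Path(tmp, "private").write_text("0.0.0.0/0   # second default\n")
    Path(tmp, "clear").write_text("# 0.0.0.0/0 is commented out\n192.0.2.53/32\n")
    demo_hits = files_with_fullnet(tmp)

print(demo_hits)
```

<P>More than one file in the result suggests a conflicting default policy,
unless the two policies protect different local endpoints as described
above.</P>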
286 | ||
287 | ||
288 | <H2><A NAME="negotiation"></A>2. During Negotiation</H2> | |
289 | <P>When you fail to bring up a tunnel, you'll need to find out:</P> | |
290 | <UL> | |
291 | <LI><A HREF="#state">what your connection state is,</A> and often</LI> | |
292 | <LI><A HREF="#find.pluto.error">an error message</A>.</LI> | |
293 | </UL> | |
294 | <P>before you can | |
295 | <A HREF="#interpret.pluto.error">diagnose your problem</A>.</P> | |
296 | <H3><A NAME="state"></A>2.1 Determine Connection State</H3> | |
297 | <H4>Finding current state</H4> | |
298 | <P>You can see connection states (STATE_MAIN_I1 and so on) when you | |
299 | bring up a connection on the command line. If you have missed this, | |
300 | or brought up your connection automatically, use: | |
301 | </P> | |
302 | <PRE>ipsec auto --status</PRE> | |
303 | <P>The most relevant state is the last one reached.</P> | |
304 | <H4><VAR>What's this supposed to look like?</VAR></H4> | |
305 | <P>Negotiations should proceed through various states, in the processes of:</P> | |
306 | <OL> | |
307 | <LI>IKE negotiations (aka Phase 1, Main Mode, STATE_MAIN_*)</LI> | |
308 | <LI>IPSEC negotiations (aka Phase 2, Quick Mode, STATE_QUICK_*)</LI> | |
309 | </OL> | |
310 | <P>These are done and a connection is established when you see messages like:</P> | |
311 | <PRE> 000 #21: "myconn" STATE_MAIN_I4 (ISAKMP SA established)... | |
312 | 000 #2: "myconn" STATE_QUICK_I2 (sent QI2, IPsec SA established)...</PRE><P> | |
313 | The key phrases to look for are "ISAKMP SA established" and "IPsec | |
314 | SA established", with the relevant connection name. Often, this happens | |
315 | at STATE_MAIN_I4 and STATE_QUICK_I2, respectively.</P> | |
316 | <P><VAR>ipsec auto --status</VAR> will tell you what states <STRONG>have | |
317 | been achieved</STRONG>, rather than the current state. Since | |
318 | determining the current state is rather more difficult to do, current | |
319 | state information is not available from Linux FreeS/WAN. If you are | |
320 | actively bringing a connection up, the status report's last states | |
321 | for that connection likely reflect its current state. Beware, though, | |
322 | of the case where a connection was correctly brought up but is now | |
323 | downed: Linux FreeS/WAN will not notice this until it attempts to | |
324 | rekey. Meanwhile, the last known state indicates that the connection | |
325 | has been established.</P> | |
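<P>If you save the output of <VAR>ipsec auto --status</VAR> to a file, the
check for the two key phrases can be automated. This is a rough sketch that
assumes status lines shaped like the two examples above; real output contains
many other lines, which the pattern simply skips. The function name and the
third sample line are illustrative assumptions.</P>

```python
import re

# Matches lines shaped like:
#   000 #21: "myconn" STATE_MAIN_I4 (ISAKMP SA established)...
STATE_LINE = re.compile(r'#(\d+):\s+"([^"]+)"\s+(STATE_\w+)\s+\(([^)]*)\)')

def established(status_text):
    """Return {connection: {"isakmp": bool, "ipsec": bool}}."""
    result = {}
    for line in status_text.splitlines():
        m = STATE_LINE.search(line)
        if not m:
            continue
        _serial, conn, _state, note = m.groups()
        flags = result.setdefault(conn, {"isakmp": False, "ipsec": False})
        note = note.lower()               # tolerate IPsec / IPSec spelling
        if "isakmp sa established" in note:
            flags["isakmp"] = True
        if "ipsec sa established" in note:
            flags["ipsec"] = True
    return result

SAMPLE = '''000 #21: "myconn" STATE_MAIN_I4 (ISAKMP SA established)...
000 #2: "myconn" STATE_QUICK_I2 (sent QI2, IPsec SA established)...
000 #7: "other" STATE_MAIN_I1 (illustrative line, nothing established)...'''

print(established(SAMPLE))
```

<P>Remember that this reports states that have been achieved, so a connection
that has since gone down may still show as established.</P>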
326 | <P>If your connection is stuck at STATE_MAIN_I1, skip straight to | |
327 | <A HREF="#ikepath">here</A>.</P> | |
328 | ||
329 | <H3><A NAME="find.pluto.error"></A>2.2 Finding error text</H3> | |
330 | <P>Solving most errors will require you to find verbose error text, | |
331 | either on the command line or in the logs.</P> | |
332 | <H4>Verbose start for more information</H4> | |
333 | <P> | |
334 | Note that you can get more detail from <VAR>ipsec auto</VAR> using | |
335 | the --verbose flag:</P> | |
336 | <PRE STYLE="margin-bottom: 0.2in"> ipsec auto --verbose --up west-east</PRE><P> | |
337 | More complete information can be gleaned from the <A HREF="#logusage">log | |
338 | files</A>.</P> | |
339 | ||
340 | <H4>Debug levels count</H4> | |
341 | <P>The amount of description you'll get here depends on ipsec.conf debug | |
342 | settings, <VAR>klipsdebug</VAR>= and <VAR>plutodebug</VAR>=. | |
343 | When troubleshooting, set at least one of these to <VAR>all</VAR>, and | |
344 | when done, reset it to <VAR>none</VAR> so your logs don't fill up. | |
345 | Note that you must have enabled the <VAR>klipsdebug</VAR> | |
346 | <A HREF="install.html#allbut">compile-time option</A> for the | |
347 | <VAR>klipsdebug</VAR> configuration switch to work.</P> | |
348 | <P>For negotiation problems <VAR>plutodebug</VAR> is most relevant. | |
349 | <VAR>klipsdebug</VAR> applies mainly to attempts to use an | |
350 | already-established connection. See also <A HREF="ipsec.html#parts">this</A> | |
351 | description of the division of duties within Linux FreeS/WAN.</P> | |
352 | <P>After raising your debug levels, restart Linux FreeS/WAN to ensure | |
353 | that ipsec.conf is reread, then recreate the error to generate | |
354 | verbose logs. | |
355 | </P> | |
356 | <H4><VAR>ipsec barf</VAR> for lots of debugging information</H4> | |
357 | <P> | |
358 | <A HREF="manpage.d/ipsec_barf.8.html"><VAR>ipsec barf (8)</VAR></A> | |
359 | collects a bunch of useful debugging information, including these logs. | |
360 | Use the command</P> | |
361 | <PRE> | |
362 | ipsec barf > barf.west | |
363 | </PRE> | |
364 | <P>to generate one.</P> | |
365 | <H4>Find the error</H4> | |
366 | <P>Search out the failure point in your logs. | |
367 | Are there a handful of lines which succinctly describe how | |
368 | things are going wrong or contrary to your expectation? Sometimes the | |
369 | failure point is not immediately obvious: Linux FreeS/WAN's errors | |
370 | are usually not marked "Error". Have a look in the | |
371 | <A HREF="faq.html">FAQ</A> | |
372 | for what some common failures look like.</P> | |
373 | <P>Tip: problems snowball. | |
374 | Focus your efforts on the first problem, which is likely to be the | |
375 | cause of later errors.</P> | |
376 | <H4>Play both sides</H4> | |
377 | <P>Also find error text on the peer IPSec box. | |
378 | This gives you two perspectives on the same failure.</P> | |
379 | <P>At times you will require information which only one side has. | |
380 | The peer can merely indicate the presence of an error, and its | |
381 | approximate point in the negotiations. If one side keeps retrying, | |
382 | it may be because there is a show stopper on the other side. | |
383 | Have a look at the other side and figure out what it doesn't like.</P> | |
384 | <P>If the other end is not Linux FreeS/WAN, the principle is the | |
385 | same: replicate the error with its most verbose logging on, and | |
386 | capture the output to a file.</P> | |
387 | <H3><A NAME="interpret.pluto.error"></A>2.3 Interpreting a Negotiation Error</H3> | |
388 | <H4><A NAME="ikepath"></A>Connection stuck at STATE_MAIN_I1</H4> | |
389 | <P>This error commonly happens because IKE (port 500) packets, needed | |
390 | to negotiate an IPSec connection, cannot travel freely between your IPSec | |
391 | gateways. See <A HREF="firewall.html#packets">our firewall document</A> | |
392 | for details.</P> | |
393 | <H4>Other errors</H4> | |
394 | <P>Other errors require a bit more digging. Use the following resources:</P> | |
395 | <UL> | |
396 | <LI><A HREF="faq.html">the FAQ</A>. Since this document is | |
397 | constantly updated, the snapshot's FAQ may have a new entry relevant | |
398 | to your problem.</LI> | |
399 | <LI>our <A HREF="background.html">background document</A>. | |
400 | Special considerations which, while not central to Linux FreeS/WAN, | |
401 | are often tripped over. Includes problems with | |
402 | <a href="background.html#MTU.trouble">packet fragmentation</a>, | |
403 | and considerations for | |
404 | testing opportunism.</LI> | |
405 | <LI>the <A HREF="mail.html#lists">list archives</A>. Each of the | |
406 | searchable archives works differently, so it's worth checking each. | |
407 | Use a search term which is generic, but identifies your error, for | |
408 | example "No connection is known for". | |
409 | <BR> | |
410 | Often, you will find that your question has been answered in the | |
411 | past. Finding an archived answer is quicker than asking the list. | |
412 | You may, however, find similar questions without answers. If you do, | |
413 | send their URLs to the list with your trouble report. The additional | |
414 | examples may help the list tech support person find your answer.</LI> | |
415 | <LI>Look into the code where the error is being generated. The | |
416 | pluto code is nicely documented with comments and meaningful | |
417 | variable names.</LI> | |
418 | </UL> | |
419 | <P>If you have failed to solve your problem with the help of these | |
420 | resources, send a detailed problem report to the users list, | |
421 | following these <A HREF="#prob.report">guidelines</A>.</P> | |
422 | <H2><A NAME="use"></A>3. Using a Connection</H2> | |
423 | <H3>3.1 Orienting yourself</H3> | |
424 | <H4><VAR>How do I know if it works?</VAR></H4> | |
425 | <P>Test your connection by sending packets through it. The simplest way | |
426 | to do this is with ping, but the ping needs to <STRONG>test the correct | |
427 | tunnel.</STRONG> See <A HREF="#testgates">this example scenario</A> if | |
428 | you don't understand this.</P> | |
429 | <P>If your ping returns, test any other connections you've brought | |
430 | up. If they all check out, great. You may wish to <A HREF="#bigpacket">test | |
431 | with large packets</A> for MTU problems.</P> | |
432 | <H4><VAR>ipsec barf</VAR> is useful again</H4> | |
433 | <P>If your ping fails to return, generate an ipsec barf debugging | |
434 | report on each IPSec gateway. On a non-Linux FreeS/WAN | |
435 | implementation, gather equivalent information. Use this, and the tips | |
436 | in the next sections, to troubleshoot. Are you sure that both | |
437 | endpoints are capable of hearing and responding to ping?</P> | |
438 | <H3>3.2 Those pesky configuration errors</H3> | |
439 | <P>IPSec may be dropping your ping packets since they do not belong in the | |
440 | tunnels you have constructed:</P> | |
441 | <UL> | |
442 | <LI>Your ping may not test the tunnel you intend to test. For details, see our | |
443 | <A HREF="faq.html#cantping">"I can't ping"</A> FAQ. | |
444 | </LI> | |
445 | <LI> | |
446 | Alternately, you may have a configuration error. | |
447 | For example, you may have configured one of the four possible tunnels between | |
448 | two gateways, but not the one required to secure the important | |
449 | traffic you're now testing. In this case, add and start the tunnel, | |
450 | and try again. | |
451 | </LI> | |
452 | </UL> | |
453 | <P>In either case, you will often see a message like:</P> | |
454 | <PRE>klipsdebug... no eroute</PRE> | |
455 | <P>which we discuss in <A HREF="faq.html#no_eroute">this | |
456 | FAQ</A>.</P> | |
457 | <P>Note:</P> | |
458 | <UL> | |
459 | <LI><A HREF="glossary.html#NAT.gloss">Network Address Translation (NAT)</A> | |
460 | and <A HREF="glossary.html#masq">IP masquerade</A> may have an effect on | |
461 | which tunnels you need to configure.</LI> | |
462 | <LI>When testing a tunnel that protects a multi-node subnet, try several | |
463 | subnet nodes as ping targets, in case one node is routing incorrectly.</LI> | |
464 | </UL> | |
465 | <H3><A NAME="route.firewall"></A>3.3 Check Routing and Firewalling</H3> | |
466 | <P>If you've confirmed your configuration assumptions, the problem is | |
467 | almost certainly with routing or firewalling. Isolate the problem | |
468 | using interface statistics, firewall statistics, or a packet sniffer.</P> | |
469 | <H4>Background:</H4> | |
470 | <UL> | |
471 | <LI>Linux FreeS/WAN supplies all the special routing it needs; | |
472 | you need only route packets out through your IPSec gateway. Verify | |
473 | that on the <VAR>subnetted</VAR> machines you are using for your | |
474 | ping-test, your routing is as expected. I have seen a tunnel | |
475 | "fail" because the subnet machine was sending packets | |
476 | out an alternate gateway (not our IPSec gateway) on their return path. | |
477 | <LI>Linux FreeS/WAN requires particular <A HREF="firewall.html"> | |
478 | firewalling considerations</A>. | |
479 | Check the firewall rules on your IPSec gateways and ensure that they | |
480 | allow IPSec traffic through. Be sure that no other machine - for | |
481 | example a router between the gateways - is blocking your IPSec | |
482 | packets. | |
483 | </UL> | |
484 | <H4><A NAME="ifconfig"></A>View Interface and Firewall | |
485 | Statistics</H4> | |
486 | <P>Interface reports and firewall statistics can help you track down | |
487 | lost packets at a glance. Check any firewall statistics you may be keeping | |
488 | on your IPSec gateways, for dropped packets.</P> | |
489 | ||
490 | <P><STRONG>Tip</STRONG>: You can take a snapshot of the packets processed | |
491 | by your firewall with:</P> | |
492 | ||
493 | <PRE> iptables -L -n -v</PRE> | |
494 | ||
495 | <P>You can get creative with "diff" to find out what happens to a | |
496 | particular packet during transmission.</P> | |
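<P>One way to "get creative with diff" is to save the <VAR>iptables -L -n -v</VAR>
output before and after a test ping and compare the two snapshots. The sketch
below uses Python's standard <VAR>difflib</VAR> module; the function name and
the counter lines in the sample are invented for illustration and are not real
output from your firewall.</P>

```python
import difflib

def changed_lines(before, after):
    """Return the lines that differ between two text snapshots."""
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ "
    return [d for d in diff if d.startswith(("+ ", "- "))]

# Invented snapshot text, shaped loosely like a packet-counter listing.
BEFORE = """Chain INPUT (policy ACCEPT)
 pkts bytes target prot opt in out source destination
    0     0 DROP   all  --  *  *   192.0.2.0/24 0.0.0.0/0"""
AFTER = BEFORE.replace("    0     0 DROP", "    4   336 DROP")

for line in changed_lines(BEFORE, AFTER):
    print(line)
```

<P>A rule whose packet counter grows in step with your pings is a good
candidate for the rule that is catching them.</P>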
497 | ||
498 | <P>Both <VAR>cat /proc/net/dev</VAR> and <VAR>ifconfig</VAR> display | |
499 | interface statistics, and both are included in <VAR>ipsec barf</VAR>. Use | |
500 | either to check if any interface has dropped packets. If you find | |
501 | that one has, test whether this is related to your ping. While you | |
502 | ping continuously, print that interface's statistics several times. | |
503 | Does its drop count increase in proportion to the ping? If so, check | |
504 | why the packets are dropped there.</P> | |
505 | ||
506 | <P>To do this, look at the firewall rules that apply to that interface. If the | |
507 | interface is an IPSec interface, more information may be available in | |
508 | the log. Grep for the word "drop" in a log which was | |
509 | created with <VAR>klipsdebug=all</VAR> as the error happened.</P> | |
510 | <P>See also this <A HREF="#ifconfig1">discussion</A> on interpreting | |
511 | <VAR>ifconfig</VAR> statistics.</P> | |
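<P>The drop-count comparison described above can be sketched in Python. This
assumes the usual <VAR>/proc/net/dev</VAR> layout: two header lines, then one
line per interface with eight receive counters followed by eight transmit
counters, where the fourth counter in each group is the drop count. The
function names and the sample numbers are illustrative assumptions.</P>

```python
def drop_counts(proc_net_dev_text):
    """Map interface name -> (rx_drop, tx_drop) from /proc/net/dev text."""
    counts = {}
    for line in proc_net_dev_text.splitlines()[2:]:   # skip the 2 header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 12:
            continue
        counts[name.strip()] = (int(fields[3]), int(fields[11]))
    return counts

def drop_increase(before, after):
    """Return interfaces whose drop counters grew between two snapshots."""
    grown = {}
    for name, (rx1, tx1) in after.items():
        rx0, tx0 = before.get(name, (0, 0))
        if rx1 > rx0 or tx1 > tx0:
            grown[name] = (rx1 - rx0, tx1 - tx0)
    return grown

# Invented sample snapshots taken before and after a run of pings.
HEADER = (
    "Inter-|   Receive                    |  Transmit\n"
    " face |bytes packets errs drop fifo frame compressed multicast"
    "|bytes packets errs drop fifo colls carrier compressed\n"
)
BEFORE = HEADER + ("  eth0: 1000 10 0 0 0 0 0 0 2000 20 0 0 0 0 0 0\n"
                   "ipsec0:  500  5 0 0 0 0 0 0  600  6 0 2 0 0 0 0\n")
AFTER = HEADER + ("  eth0: 1900 19 0 0 0 0 0 0 2900 29 0 0 0 0 0 0\n"
                  "ipsec0:  500  5 0 0 0 0 0 0  600  6 0 5 0 0 0 0\n")

print(drop_increase(drop_counts(BEFORE), drop_counts(AFTER)))
```

<P>On a live system you would read <VAR>/proc/net/dev</VAR> twice while the
ping runs and pass both readings through the same functions.</P>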
512 | <H3><A NAME="sniff"></A>3.4 When in doubt, sniff it out</H3> | |
513 | <P>If you have checked configuration assumptions, routing, and | |
514 | firewall rules, and your interface statistics yield no clue, it | |
515 | remains for you to investigate the mystery of the lost packet by the | |
516 | most thorough method: with a packet sniffer (providing, of course, | |
517 | that this is legal where you are working).</P> | |
518 | <P>In order to detect packets on the ipsec virtual interfaces, | |
519 | you will need an up-to-date sniffer (tcpdump, ethereal, ksnuffle) on | |
520 | your IPSec gateway machines. You may also find it useful to sniff the ping | |
521 | endpoints.</P> | |
522 | <H4>Anticipate your packets' path</H4> | |
523 | <P>Ping, and examine each interface along the projected path, checking for your | |
524 | ping's arrival. If it doesn't get to the next stop, you have narrowed | |
525 | down where to look for it. In this way, you can isolate a problem area, | |
526 | and narrow your troubleshooting focus.</P> | |
527 | <P>Within a machine running Linux FreeS/WAN, this | |
528 | <A HREF="firewall.html#packets">packet flow diagram</A> will help you | |
529 | anticipate a packet's path.</P> | |
530 | <P>Note that:</P> | |
531 | <UL> | |
532 | <LI> | |
533 | from the perspective of the tunneled packet, the entire tunnel is one hop. | |
534 | That's explained in <A HREF="faq.html#no_trace">this</A> FAQ. | |
535 | </LI> | |
536 | <LI> | |
537 | an encapsulated IPSec packet will look different, when | |
538 | sniffed, from the plaintext packet which generated it. You | |
539 | can see plaintext packets entering an IPSec interface and the | |
540 | resulting cyphertext packets as they emerge from the corresponding | |
541 | physical interface. | |
542 | </LI> | |
543 | </UL> | |
544 | <P>Once you isolate where the packet is lost, take a closer look at | |
545 | firewall rules, routing and configuration assumptions as they affect | |
546 | that specific area. If the packet is lost on an IPSec gateway, comb | |
547 | through <VAR>klipsdebug</VAR> output for anomalies. | |
548 | </P> | |
549 | <P>If the packet goes through both gateways successfully and reaches | |
550 | the ping target, but does not return, suspect routing. Check that the | |
551 | ping target routes packets back to the IPSec gateway.</P> | |
552 | <H3><A NAME="find.use.error"></A>3.5 Check your logs</H3> | |
553 | <P>Here, too, log information can be useful. Start with the | |
554 | <A HREF="#find.pluto.error">guidelines above</A>.</P> | |
555 | <P>For connection use problems, set <VAR>klipsdebug=all</VAR>. Note | |
556 | that you must have enabled the <VAR>klipsdebug</VAR> | |
557 | <A HREF="install.html#allbut">compile-time option</A> to do this. | |
558 | Restart Linux FreeS/WAN so that it rereads <VAR>ipsec.conf</VAR>, | |
559 | then recreate the error condition. When searching through | |
560 | <VAR>klipsdebug</VAR> data, look especially for the keywords | |
561 | "drop" (as in dropped packets) and "error".</P> | |
562 | <P>Often the problem with connection use is not software error, but | |
563 | rather that the software is behaving contrary to expectation. | |
564 | </P> | |
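<P>The keyword search described above is easy to script once the log has been
saved to a file. The following is a minimal sketch, not a FreeS/WAN tool; the
function name is hypothetical and the sample lines are placeholder text rather
than real KLIPS output.</P>

```python
import re

# Keywords suggested in this section, matched case-insensitively.
KEYWORDS = re.compile(r"drop|error", re.IGNORECASE)

def suspicious_lines(log_text):
    """Return (line_number, line) pairs whose text mentions a keyword."""
    return [(n, line)
            for n, line in enumerate(log_text.splitlines(), 1)
            if KEYWORDS.search(line)]

# Placeholder text only; substitute the contents of your own log.
SAMPLE = """line one: nothing to see
line two: packet DROPPED (placeholder text)
line three: some Error (placeholder text)"""

for number, text in suspicious_lines(SAMPLE):
    print(number, text)
```

<P>Keeping the line numbers makes it easier to go back and read the context
around each hit, since the first problem is usually the one that matters.</P>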
565 | <H4><A NAME="interpret.use.error"></A>Interpreting log text</H4> | |
566 | <P>To interpret the Linux FreeS/WAN log text you've found, use the | |
567 | same resources as indicated for troubleshooting | |
568 | connection negotiation: | |
569 | <A HREF="faq.html">the FAQ</A>, our | |
570 | <A HREF="background.html">background document</A>, and the | |
571 | <A HREF="mail.html#lists">list archives</A>. | |
572 | Looking in the KLIPS code is only for the very brave.</P> | |
573 | <P>If you are still stuck, send a <A HREF="#prob.report">detailed | |
574 | problem report</A> to the users' list.</P> | |
575 | <H3><A NAME="bigpacket"></A>3.6 More testing for the truly thorough</H3> | |
576 | <H4>Large Packets</H4> | |
577 | <P>If each of your connections passed the ping test, you may wish to | |
578 | test by pinging with large packets (2000 bytes or larger). If such a ping does | |
579 | not return, suspect MTU issues, and see this <A HREF="background.html#MTU.trouble">discussion</A>.</P> | |
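<P>To see why a 2000-byte ping exercises MTU handling, it helps to work out how
many IPv4 fragments it needs. The sketch below assumes a 20-byte IPv4 header
without options, an 8-byte ICMP echo header, and that the ping size counts
payload bytes only; the 1500-byte MTU in the example is a typical Ethernet
value, not something taken from your network. The function name is
illustrative.</P>

```python
import math

def fragments_needed(icmp_payload, mtu, ip_header=20, icmp_header=8):
    """Number of IPv4 fragments for one echo request with this payload."""
    ip_payload = icmp_payload + icmp_header
    room = mtu - ip_header            # data bytes that fit in a single packet
    if ip_payload <= room:
        return 1                      # fits without fragmentation
    per_fragment = room // 8 * 8      # non-final fragments carry a multiple of 8
    return 1 + math.ceil((ip_payload - room) / per_fragment)

print(fragments_needed(56, 1500))     # small default-sized ping
print(fragments_needed(2000, 1500))   # the large-packet test from this section
```

<P>A small ping fits in one packet, while the 2000-byte test must be split, so
it can fail on a path where fragments are lost even though small pings
succeed.</P>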
580 | <H4>Stress Tests</H4> | |
581 | <P>In most users' view, a simple ping test, and perhaps a | |
582 | large-packet ping test suffice to indicate a working IPSec | |
583 | connection.</P> | |
584 | <P>Some people might like to do additional stress tests prior to | |
585 | production use. They may be interested in this <A HREF="http://www.sandelman.ottawa.on.ca/linux-ipsec/html/2000/12/msg00224.html">testing | |
586 | protocol</A> we use at interoperation conferences, aka "bakeoffs". | |
587 | We also have a <VAR>testing</VAR> directory that ships with the | |
588 | release.</P> | |
589 | <H2><A NAME="prob.report"></A>4. Problem Reporting</H2> | |
590 | <H3>4.1 How to ask for help</H3> | |
591 | <P>Ask for troubleshooting help on the users' mailing list, | |
592 | <A HREF="mailto:users@lists.freeswan.org">users@lists.freeswan.org</A>. | |
593 | While sometimes an initial query with a quick description of your | |
594 | intent and error will twig someone's memory of a similar problem, | |
595 | it's often necessary to send a second mail with a complete problem | |
596 | report. | |
597 | </P> | |
598 | ||
599 | ||
600 | <P>When reporting problems to the mailing list(s), please include: | |
601 | </P> | |
602 | <UL> | |
603 | <LI>a brief description of the problem</LI> | |
604 | <LI>if it's a compile problem, the actual output from make, | |
605 | showing the problem. Try to edit it down to only the relevant part, | |
606 | but when in doubt, be as complete as you can. If it's a kernel | |
607 | compile problem, any relevant out.* files</LI> | |
608 | <LI>if it's a run-time problem, pointers to where we can find the | |
609 | complete output from "ipsec barf" from BOTH ENDS (not just | |
610 | one of them). Remember that it's common outside the US and Canada to | |
611 | pay for download volume, so if you can't post barfs on the web and | |
612 | send the URL to the mailing list, at least compress them with gzip | |
613 | (optionally bundling them with tar first).<BR> | |
614 | If you can, try to simplify the case that is causing the problem. | |
615 | In particular, if you clear your logs, start FreeS/WAN with no other | |
616 | connections running, cause the problem to happen, and then do <VAR>ipsec | |
617 | barf</VAR> on both ends immediately, that gives the smallest and | |
618 | least cluttered output.</LI> | |
619 | <LI>any other error messages, complaints, etc. that you saw. | |
620 | Please send the complete text of the messages, not just a summary.</LI> | |
621 | <LI>what your network setup is. Include subnets, gateway | |
622 | addresses, etc. A schematic diagram is a | |
623 | good format for this information.</LI> | |
624 | <LI>exactly what you were trying to do with Linux FreeS/WAN, and | |
625 | exactly what went wrong</LI> | |
626 | <LI>a fix, if you have one. But remember, you are sending mail to | |
627 | people all over the world; US residents and US citizens in | |
628 | particular, please read doc/exportlaws.html before sending code -- | |
629 | even small bug fixes -- to the list or to us.</LI> | |
630 | <LI>When in doubt about whether to include some seemingly-trivial | |
631 | item of information, include it. It is rare for problem reports to | |
632 | have too much information, and common for them to have too little.</LI> | |
633 | </UL> | |
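<P>Collecting the <VAR>ipsec barf</VAR> output requested above could be scripted roughly as follows. This is only a sketch: the function name and output file name are illustrative, and it assumes that <VAR>gzip</VAR> and <VAR>hostname</VAR> are available and that you run it as root on each gateway.</P>

```shell
# Save a compressed "ipsec barf" report named after this host.
# Prints the name of the file it wrote.
save_barf() {
    out="barf-$(hostname).txt.gz"
    ipsec barf | gzip -9 > "$out" && echo "$out"
}

# Example (run on BOTH ends, right after reproducing the problem):
# save_barf
```

<P>Naming the file after the host makes it easier to tell the two ends apart when you post both reports.</P>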
634 | ||
635 | <P>Here are some good general guidelines on bug reporting: | |
636 | <a href="http://tuxedo.org/~esr/faqs/smart-questions.html">How To Ask Questions | |
637 | The Smart Way</a> and <a | |
638 | href="http://www.chiark.greenend.org.uk/~sgtatham/bugs.html">How to Report | |
639 | Bugs Effectively</a>.</p> | |
640 | ||
641 | ||
642 | <H3>4.2 Where to ask</H3> | |
643 | <P>To report a problem, send mail about it to the users' list. If you | |
644 | are certain that you have found a bug, report it to the bugs list. If | |
645 | you encounter a problem while doing your own coding on the Linux | |
646 | FreeS/WAN codebase and think it is of interest to the design team, | |
647 | notify the design list. When in doubt, default to the users' list. | |
648 | More information about the mailing lists is found <A HREF="mail.html#lists">here</A>.</P> | |
649 | <P>For a number of reasons -- including export-control regulations | |
650 | affecting almost any <STRONG>private</STRONG> discussion of | |
651 | encryption software -- we prefer that problem reports and discussions | |
652 | go to the lists, not directly to the team. Beware that the list goes | |
653 | worldwide; US citizens, read this important information about your | |
654 | <A HREF="politics.html#exlaw">export laws</A>. If you're using this | |
655 | software, you really should be on the lists. To get onto them, visit | |
656 | <A HREF="http://lists.freeswan.org/">lists.freeswan.org</A>.</P> | |
657 | <P>If you do send private mail to our coders or want a private reply | |
658 | from them, please make sure that the return address on your mail | |
659 | (From or Reply-To header) is a valid one. They have more important | |
660 | things to do than to unravel addresses that have been mangled in an | |
661 | attempt to confuse spammers. | |
662 | </P> | |
663 | <H2><A NAME="notes"></A>5. Additional Notes on Troubleshooting</H2> | |
664 | <P>The following sections supplement the Guide: <A HREF="#system.info">information | |
665 | available on your system</A>; <A HREF="#testgates">testing between | |
666 | security gateways</A>; <A HREF="#ifconfig1">ifconfig reports for | |
667 | KLIPS debugging</A>; <A HREF="#gdb">using GDB on Pluto</A>.</P> | |
668 | <H3><A NAME="system.info"></A>5.1 Information available on your | |
669 | system</H3> | |
670 | <H4><A NAME="logusage"></A>Logs used</H4> | |
671 | <P>Linux FreeS/WAN logs to:</P> | |
672 | <UL> | |
673 | <LI>/var/log/secure (or, on Debian, /var/log/auth.log)</LI> | |
674 | <LI>/var/log/messages</LI> | |
675 | </UL> | |
676 | <P>Check both places to get full information. If you find nothing, | |
677 | consult <VAR>syslog.conf(5)</VAR> and check where your | |
678 | /etc/syslog.conf or equivalent is directing <VAR>authpriv</VAR> | |
679 | messages.</P> | |
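<P>One way to see where <VAR>authpriv</VAR> messages are being sent is to list the active rules that mention that facility. The sketch below assumes a classic syslogd configured through /etc/syslog.conf; the function name is illustrative, and you can pass a different file if your system uses an equivalent.</P>

```shell
# List the active (non-comment) syslog rules that mention authpriv.
# Usage: authpriv_rules [CONFIG_FILE]   (default: /etc/syslog.conf)
authpriv_rules() {
    grep -v '^[[:space:]]*#' "${1:-/etc/syslog.conf}" | grep authpriv
}

# Example:
# authpriv_rules
```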
680 | <H4><A NAME="pages"></A>Man pages provided</H4> | |
681 | <DL> | |
682 | <DT><A HREF="manpage.d/ipsec.conf.5.html">ipsec.conf(5)</A> | |
683 | </DT><DD> | |
684 | Manual page for IPSEC configuration file. | |
685 | </DD><DT> | |
686 | <A HREF="manpage.d/ipsec.8.html">ipsec(8)</A> | |
687 | </DT><DD STYLE="margin-bottom: 0.2in"> | |
688 | Primary man page for ipsec utilities. | |
689 | </DD></DL> | |
690 | <P> | |
691 | Other man pages are on <A HREF="manpages.html">this list</A> and in</P> | |
692 | <UL> | |
693 | <LI>/usr/local/man/man3</LI> | |
694 | <LI>/usr/local/man/man5</LI> | |
695 | <LI>/usr/local/man/man8/ipsec_*</LI> | |
696 | </UL> | |
697 | <H4><A NAME="statusinfo"></A>Status information</H4> | |
698 | <DL> | |
699 | <DT>ipsec auto --status | |
700 | </DT><DD> | |
701 | Command to get status report from running system. Displays Pluto's | |
702 | state. Includes the list of connections which are currently "added" | |
703 | to Pluto's internal database; lists state objects reflecting ISAKMP | |
704 | and IPsec SAs being negotiated or installed. | |
705 | </DD><DT> | |
706 | ipsec look | |
707 | </DT><DD> | |
708 | Brief status info. | |
709 | </DD><DT> | |
710 | ipsec barf | |
711 | </DT><DD STYLE="margin-bottom: 0.2in"> | |
712 | Copious debugging info. | |
713 | </DD></DL> | |
714 | <H3> | |
715 | <A NAME="testgates"></A>5.2 Testing between security gateways</H3> | |
716 | <P>Sometimes you need to test a subnet-subnet tunnel. This is a | |
717 | tunnel between two security gateways, which protects traffic on | |
718 | behalf of the subnets behind these gateways. On this network:</P> | |
719 | <PRE> Sunset==========West------------------East=========Sunrise | |
720 | IPSec gateway IPSec gateway | |
721 | local net untrusted net local net</PRE><P> | |
722 | you might name this tunnel sunset-sunrise. You can test this tunnel | |
723 | by having a machine behind one gateway ping a machine behind the | |
724 | other gateway, but this is not always convenient or even possible.</P> | |
725 | <P>Simply pinging one gateway from the other is not useful. Such a | |
726 | ping does not normally go through the tunnel. <STRONG>The tunnel | |
727 | handles traffic between the two protected subnets, not between the | |
728 | gateways</STRONG>. Depending on the routing in place, a ping might</P> | |
729 | <UL> | |
730 | <LI>either succeed by finding an | |
731 | unencrypted route</LI> | |
732 | <LI>or fail by finding no route. Packets without an IPSEC eroute | |
733 | are discarded.</LI> | |
734 | </UL> | |
735 | <P><STRONG>Neither event tells you anything about the tunnel</STRONG>. | |
736 | You can explicitly create an eroute to force such packets through the | |
737 | tunnel, or you can create additional tunnels as described in our | |
738 | <A HREF="config.html#multitunnel">configuration document</A>, but | |
739 | those may be unnecessary complications in your situation.</P> | |
740 | <P>The trick is to explicitly test between <STRONG>both gateways' | |
741 | private-side IP addresses</STRONG>. Since the private-side interfaces | |
742 | are on the protected subnets, the resulting packets do go via the | |
743 | tunnel. Use either ping -I or traceroute -i, both of which allow you | |
744 | to specify a source interface. (Note: these options are unsupported on older Linux versions.) | |
745 | The same principles apply for a road warrior (or other) case where | |
746 | only one end of your tunnel is a subnet.</P> | |
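<P>As a sketch of the private-side test, run on West in the diagram above: the interface name eth1 (West's private-side interface) and the address 10.0.2.1 (East's private-side address) are hypothetical, and the function name is illustrative. It assumes a <VAR>ping</VAR> that supports <VAR>-I</VAR>.</P>

```shell
# Ping the far gateway's private-side address, sourcing the packets from our
# own private-side interface so that they match the subnet-to-subnet tunnel.
# Usage: gateway_tunnel_ping LOCAL_PRIVATE_IF REMOTE_PRIVATE_IP
gateway_tunnel_ping() {
    local_if=$1     # our private-side interface (hypothetical: eth1)
    remote_ip=$2    # far gateway's private-side address (hypothetical: 10.0.2.1)
    ping -c 3 -I "$local_if" "$remote_ip"
}

# Example, on West:
# gateway_tunnel_ping eth1 10.0.2.1
```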
747 | <H3><A NAME="ifconfig1"></A>5.3 ifconfig reports for KLIPS debugging</H3> | |
748 | <P>When diagnosing problems using ifconfig statistics, you may wonder | |
749 | what type of activity increments a particular counter for an ipsecN | |
750 | device. Here's an index, posted by KLIPS developer Richard Guy | |
751 | Briggs:</P> | |
752 | <PRE>Here is a catalogue of the types of errors that can occur for which | |
753 | statistics are kept when transmitting and receiving packets via klips. | |
754 | I notice that they are not necessarily logged in the right counter. | |
755 | . . . | |
756 | ||
757 | Sources of ifconfig statistics for ipsec devices | |
758 | ||
759 | rx_errors: | |
760 | - packet handed to ipsec_rcv that is not an ipsec packet. | |
761 | - ipsec packet with payload length not modulo 4. | |
762 | - ipsec packet with bad authenticator length. | |
763 | - incoming packet with no SA. | |
764 | - replayed packet. | |
765 | - incoming authentication failed. | |
766 | - got esp packet with length not modulo 8. | |
767 | ||
768 | tx_dropped: | |
769 | - cannot process ip_options. | |
770 | - packet ttl expired. | |
771 | - packet with no eroute. | |
772 | - eroute with no SA. | |
773 | - cannot allocate sk_buff. | |
774 | - cannot allocate kernel memory. | |
775 | - sk_buff internal error. | |
776 | ||
777 | ||
778 | The standard counters are: | |
779 | ||
780 | struct enet_statistics | |
781 | { | |
782 | int rx_packets; /* total packets received */ | |
783 | int tx_packets; /* total packets transmitted */ | |
784 | int rx_errors; /* bad packets received */ | |
785 | int tx_errors; /* packet transmit problems */ | |
786 | int rx_dropped; /* no space in linux buffers */ | |
787 | int tx_dropped; /* no space available in linux */ | |
788 | int multicast; /* multicast packets received */ | |
789 | int collisions; | |
790 | ||
791 | /* detailed rx_errors: */ | |
792 | int rx_length_errors; | |
793 | int rx_over_errors; /* receiver ring buff overflow */ | |
794 | int rx_crc_errors; /* recved pkt with crc error */ | |
795 | int rx_frame_errors; /* recv'd frame alignment error */ | |
796 | int rx_fifo_errors; /* recv'r fifo overrun */ | |
797 | int rx_missed_errors; /* receiver missed packet */ | |
798 | ||
799 | /* detailed tx_errors */ | |
800 | int tx_aborted_errors; | |
801 | int tx_carrier_errors; | |
802 | int tx_fifo_errors; | |
803 | int tx_heartbeat_errors; | |
804 | int tx_window_errors; | |
805 | }; | |
806 | ||
807 | of which I think only the first 6 are useful.</PRE><H3> | |
808 | <A NAME="gdb"></A>5.4 Using GDB on Pluto</H3> | |
809 | <P>You may need to use the GNU debugger, gdb(1), on Pluto. This | |
810 | should be necessary only in unusual cases, for example if you | |
811 | encounter a problem which the Pluto developer cannot readily | |
812 | reproduce or if you are modifying Pluto. | |
813 | </P> | |
814 | <P>Here are the Pluto developer's suggestions for doing this: | |
815 | </P> | |
816 | <PRE>Can you get a core dump and use gdb to find out what Pluto was doing | |
817 | when it died? | |
818 | ||
819 | To get a core dump, you will have to set dumpdir to point to a | |
820 | suitable directory (see <A HREF="manpage.d/ipsec.conf.5.html">ipsec.conf(5)</A>). | |
821 | ||
822 | To get gdb to tell you interesting stuff: | |
823 | $ script | |
824 | $ cd dump-directory-you-chose | |
825 | $ gdb /usr/local/lib/ipsec/pluto core | |
826 | (gdb) where | |
827 | (gdb) quit | |
828 | $ exit | |
829 | ||
830 | The resulting output will have been captured by the script command in | |
831 | a file called "typescript". Send it to the list. | |
832 | ||
833 | Do not delete the core file. I may need to ask you to print out some | |
834 | more relevant stuff.</PRE><P> | |
835 | Note that the <VAR>dumpdir</VAR> parameter takes effect only when the | |
836 | IPsec subsystem is restarted, either by a reboot or with <VAR>ipsec setup restart</VAR>.</P> | |
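<P>Before restarting, it can help to confirm that <VAR>dumpdir</VAR> is actually set. The following is a sketch only: the function name is illustrative, and it assumes the configuration lives in /etc/ipsec.conf with <VAR>dumpdir</VAR> written as an indented parameter of the <VAR>config setup</VAR> section.</P>

```shell
# Report the dumpdir setting from an ipsec.conf-style file.
# Usage: show_dumpdir [CONFIG_FILE]   (default: /etc/ipsec.conf)
show_dumpdir() {
    conf=${1:-/etc/ipsec.conf}
    grep -E '^[[:space:]]+dumpdir[[:space:]]*=' "$conf" ||
        echo "dumpdir is not set in $conf"
}

# Example, followed by the restart that makes a new setting take effect:
# show_dumpdir
# ipsec setup restart
```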
837 | <P><BR><BR> | |
838 | </P> | |
839 | </BODY> | |
840 | </HTML> |