1<HTML>
2<HEAD>
3 <TITLE>FreeS/WAN troubleshooting</TITLE>
4 <meta name="keywords" content="Linux, IPSEC, VPN, security, FreeSWAN, troubleshooting, debugging">
5<!--
6 Written by Claudia Schmeing for the Linux FreeS/WAN project
7 Freely distributable under the GNU General Public License
8
9 More information at www.freeswan.org
10 Feedback to users@lists.freeswan.org
11
12CVS information:
13RCS ID: $Id: trouble.html,v 1.1 2004/03/15 20:35:24 as Exp $
14Last changed: $Date: 2004/03/15 20:35:24 $
15Revision number: $Revision: 1.1 $
16
17CVS revision numbers do not correspond to FreeS/WAN release numbers.
18-->
19
20</HEAD>
21<BODY>
22
23<H1><A NAME="trouble"></A>Linux FreeS/WAN Troubleshooting Guide</H1>
24
25<H2><A NAME="overview"></A>Overview</H2>
26
27<P>
28This document covers several general places where you might have a problem:</P>
29<OL>
30 <LI><A HREF="#install">During install</A>.</LI>
31 <LI><A HREF="#negotiation">During the negotiation process</A>.</LI>
32 <LI><A HREF="#use">Using an established connection</A>.</LI>
33</OL>
34<P>This document also contains <A HREF="#notes">notes</A> which
35expand on points made in these sections, and tips for
36<A HREF="#prob.report">problem
37reporting</A>. If the other end of your connection is not FreeS/WAN,
38you'll also want to read our
39<A HREF="interop.html#interop.problem">interoperation</A> document.</P>
40<H2><A NAME="install"></A>1. During Install</H2>
41<H3>1.1 RPM install gotchas</H3>
42<P>With the RPM method:</P>
43<UL>
44<LI>Be sure you have installed both the userland tools and the kernel
45 components. One will not work without the other. For example, when using
46 FreeS/WAN-produced RPMs for our 2.04 release, you need both:
47<PRE> freeswan-userland-2.04_2.4.20_20.9-0.i386.rpm
48 freeswan-module-2.04_2.4.20_20.9-0.i386.rpm
49</PRE>
50</LI>
51</UL>
52<H3>1.2 Problems installing from source</H3>
53<P>When installing from source, you may find these problems:</P>
54<UL>
55 <LI>Missing library. See <A HREF="faq.html#gmp.h_missing">this</A>
56 FAQ.</LI>
57 <LI>Missing utilities required for compile. See this
58 <A HREF="install.html#tool.lib">checklist</A>.</LI>
59 <LI>Kernel version incompatibility. See <A HREF="faq.html#k.versions">this</A>
60 FAQ.</LI>
61 <LI>Another compile problem. Find information in the out.* files,
62 ie. out.kpatch, out.kbuild, created at compile time in the top-level
63 Linux FreeS/WAN directory. Error messages generated by KLIPS during
64 the boot sequence are accessible with the <VAR>dmesg</VAR> command.
65 <BR>
66 Check the list archives and the List in Brief to see if this is a
67 known issue. If it is not, report it to the bugs list as described
68 in our <A HREF="#prob.report">problem reporting</A> section. In some
69 cases, you may be asked to provide debugging information using gdb;
70 details <A HREF="#gdb">below</A>.</LI>
71 <LI>If your kernel compiles but you fail to install your new
72 FreeS/WAN-enabled kernel, review the sections on <A HREF="install.html#newk">installing
73 the patched kernel</A>, and <A HREF="install.html#testinstall">testing</A>
74 to see if install succeeded.</LI>
75</UL>
76<H3><A NAME="install.check"></A>1.3 Install checks</H3>
<P><VAR>ipsec verify</VAR> checks a number
of FreeS/WAN essentials. Here are some hints on what to do when your
system doesn't check out:</P>
80<P>
81<TABLE border=1>
82<TR>
83<TD><STRONG>Problem</STRONG></TD>
84<TD><STRONG>Status</STRONG></TD>
85<TD><STRONG>Action</STRONG></TD>
86</TR>
87<TR>
88<TD><VAR>ipsec</VAR> not on-path</TD>
89<TD>&nbsp;</TD>
90<TD><P>Add <VAR>/usr/local/sbin</VAR> to your PATH.</P></TD>
91</TR>
92<TR>
93<TD>Missing KLIPS support</TD>
94<TD><FONT COLOR="#FF0000">critical</FONT></TD>
95<TD>See <A HREF="faq.html#noKLIPS">this FAQ.</A></TD>
96</TR>
97<TR>
98<TD>No RSA private key</TD>
99<TD>&nbsp;</TD>
100<TD>
101<P>Follow <A HREF="install.html#genrsakey">these
102instructions</A> to create an RSA key pair for your host. RSA keys are:</P>
103<UL>
104<LI>required for opportunistic encryption, and</LI>
105<LI>our preferred method to authenticate pre-configured connections.</LI>
106</UL>
107</TD>
108</TR>
109<TR>
110<TD><VAR>pluto</VAR> not running</TD>
111<TD><FONT COLOR="#FF0000">critical</FONT></TD>
112<TD><PRE>service ipsec start</PRE></TD>
113</TR>
114<TR>
115<TD>No port 500 hole</TD>
116<TD><FONT COLOR="#FF0000">critical</FONT></TD>
117<TD>Open port 500 for IKE negotiation.</TD>
118</TR>
119<TR>
120<TD>Port 500 check N/A</TD>
121<TD>&nbsp;</TD>
122<TD>Check that port 500 is open for IKE negotiation.</TD>
123</TR>
124<TR>
125<TD>Failed DNS checks</TD>
126<TD>&nbsp;</TD>
127<TD>Opportunistic encryption requires information from DNS.
128To set this up, see <A HREF="quickstart.html#opp.setup">our instructions</A>.
129</TD>
130</TR>
131<TR>
132<TD>No public IP address</TD>
133<TD>&nbsp;</TD>
134<TD>Check that the interface which you want to protect with IPSec is up and
135running.</TD>
136</TR>
137</TABLE>
138
139
<H3><A NAME="oe.trouble"></A>1.4 Troubleshooting OE</H3>
141<P>OE should work with no local configuration, if you have posted
142DNS TXT records according to the instructions in our
143<A HREF="quickstart.html">quickstart guide</A>.
144If you encounter trouble, try these hints.
145We welcome additional hints via the
146<A HREF="mail.html">users' mailing list</A>.</P>
147
148<TABLE border=1>
149<TR>
150<TD><STRONG>Symptom</STRONG></TD>
151<TD><STRONG>Problem</STRONG></TD>
152<TD><STRONG>Action</STRONG></TD>
153</TR>
154<TR>
155<TD>
156You're running FreeS/WAN 2.01 (or later),
157and initiating a connection to FreeS/WAN
1582.00 (or earlier).
159In your logs, you see a message like:
160<pre>no RSA public key known for '192.0.2.13';
161DNS search for KEY failed (no KEY record
162for 13.2.0.192.in-addr.arpa.)</pre>
163The older FreeS/WAN logs no error.
164</TD>
165<TD>
166<A NAME="oe.trouble.flagday"></A>
167A protocol level incompatibility between 2.01 (or later) and
1682.00 (or earlier) causes this error. It occurs when a FreeS/WAN 2.01
169(or later) box for which no KEY record is posted attempts to initiate an OE
170connection to older FreeS/WAN versions (2.00 and earlier).
171Note that older versions can initiate to newer versions without this error.
172</TD>
173<TD>If you control the peer host, upgrade its FreeS/WAN to 2.01 (or later), and
174post new style TXT records for it. If not, but if you know its sysadmin,
175perhaps a quick note is in order. If neither option is possible, you can
176ease the transition by posting an old style KEY record (created with a
177command like "ipsec&nbsp;showhostkey&nbsp;--key") to the reverse map for
178the FreeS/WAN 2.01 (or later) box.</TD>
179</TR>
180<TR>
181<TD>OE host is very slow to contact other hosts.</TD>
182<TD>Slow DNS service while running OE.</TD>
183<TD>It's a good idea to run a caching DNS server on your OE host,
184as outlined in <A HREF="http://lists.freeswan.org/pipermail/design/2003-January/004205.html">this
185mailing list message</A>. If your DNS servers are elsewhere,
186put their IPs
187in the <VAR>clear</VAR> policy group, and
188re-read groups with <PRE>ipsec auto --rereadgroups</PRE>
189</TD>
190</TR>
191<TR>
192<TD>
193<PRE>Can't Opportunistically initiate for
194192.0.2.2 to 192.0.2.3: no TXT record
195for 13.2.0.192.in-addr.arpa.</PRE>
196</TD>
197<TD>Peer is not set up for OE.</TD>
198<TD><P>None. Plenty of hosts on the Internet
199do not run OE. If, however, you have set OE up on that peer, this may
200indicate that you need to wait up to 48 hours
201for its DNS records to propagate.</P></TD>
202</TR>
203<TR>
204<TD><VAR>ipsec verify</VAR> does not find DNS records:
205<PRE>...
206Looking for TXT in forward map:
207 xy.example.com...[FAILED]
208Looking for TXT in reverse map...[FAILED]
209...</PRE>
210
211You also experience authentication failure:<BR>
212<PRE>Possible authentication failure:
213no acceptable response to our
214first encrypted message</PRE>
215</TD>
216
217<TD>DNS records are not posted or have not propagated.</TD>
218<TD>Did you post the DNS records necessary for OE? If not,
219do so using the instructions in our
220<A HREF="quickstart.html#quickstart">quickstart guide</A>.
221If so, wait up to 48 hours for the DNS records to propagate.</TD>
222</TR>
223<TR>
224<TD><VAR>ipsec verify</VAR> does not find DNS records, and you experience
225authentication failure.</TD>
226<TD>For iOE, your ID
227does not match location of
228forward DNS record.</TD>
229<TD>In <VAR>config setup</VAR>, change
230<VAR>myid=</VAR> to match the forward DNS where you posted the record.
231Restart FreeS/WAN.
232 For reference, see our
233<A HREF="quickstart.html#opp.client">iOE instructions</A>.</TD>
234</TR>
235<TR>
236<TD><VAR>ipsec verify</VAR> finds DNS records, yet there is
237still authentication failure. ( ? )</TD>
238<TD>DNS records are malformed.</TD>
239<TD>Re-create the records and send new copies to your DNS administrator.</TD>
240</TR>
241<TR>
242<TD><VAR>ipsec verify</VAR> finds DNS records, yet there is
243still authentication failure. ( ? )</TD>
244<TD>DNS records show different keys for a gateway vs. its subnet hosts.</TD>
245<TD>All TXT records for boxes protected by an OE gateway must contain the
246gateway's public key. Re-create and re-post any incorrect records using
247<A HREF="quickstart.html#opp.incoming">these instructions</A>.</TD>
248</TR>
249<TR>
250<TD>OE gateway loses connectivity to its subnet. The gateway's
251routing table shows routes to the subnet through IPsec interfaces.</TD>
252<TD>The subnet is part of the <VAR>private</VAR> or <VAR>block</VAR>
253policy group on the gateway.</TD>
254<TD>Remove the subnet from the group, and reread
255groups with <PRE>ipsec auto --rereadgroups</PRE></TD>
256</TR>
257<TR>
258<TD>OE does not work to hosts on the local LAN.</TD>
259<TD>This is a known issue.</TD>
260<TD>See <A HREF="opportunism.known-issues">this list</A> of known issues
261with OE.
262</TD>
263</TR>
264
265<TR>
266<TD>FreeS/WAN does not seem to be executing your default policy. In your
267logs, you see a message like:
<PRE>"/etc/ipsec.d/policies/private-or-clear"
269line 14: subnet "0.0.0.0/0",
270source 192.0.2.13/32,
271already "private-or-clear"</PRE>
272</TD>
273<TD><A HREF="glossary.html#fullnet">Fullnet</A> in a policy group file defines
274your default policy. Fullnet should normally be present in only one policy
275group file. The fine print: you can have two default policies defined so long
276as they protect different local endpoints (e.g. the FreeS/WAN gateway and a
277subnet).</TD>
278<TD>
279Find all policies which contain fullnet with:<br>
280<PRE>grep -F 0.0.0.0/0 /etc/ipsec.d/policies/*</PRE>
281then remove the unwanted occurrence(s).
282</TD>
283</TR>
284
285</TABLE>
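<P>Several rows above depend on whether TXT records are visible in the
forward and reverse maps. The following is a minimal sketch for checking
this by hand, assuming the <VAR>dig</VAR> utility is installed.
<VAR>reverse_name</VAR> is a hypothetical helper, not part of FreeS/WAN;
it only builds the reverse-map name for an IPv4 address.</P>

```shell
#!/bin/sh
# Build the reverse-map (in-addr.arpa) name for a dotted-quad IPv4 address.
# reverse_name is a hypothetical helper name used for illustration only.
reverse_name() {
    echo "$1" | awk -F. 'NF == 4 {
        printf "%s.%s.%s.%s.in-addr.arpa.\n", $4, $3, $2, $1
    }'
}

# Example lookups (these need dig and network access, so they are
# left commented out here):
#   dig +short TXT "$(reverse_name 192.0.2.13)"   # reverse map
#   dig +short TXT xy.example.com                 # forward map
```

<P>An empty answer from <VAR>dig</VAR> suggests the record is not posted or
has not yet propagated to the server you are querying.</P>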
286
287
288<H2><A NAME="negotiation"></A>2. During Negotiation</H2>
289<P>When you fail to bring up a tunnel, you'll need to find out:</P>
290<UL>
291<LI><A HREF="#state">what your connection state is,</A> and often</LI>
292<LI><A HREF="#find.pluto.error">an error message</A>.</LI>
293</UL>
294<P>before you can
295<A HREF="#interpret.pluto.error">diagnose your problem</A>.</P>
296<H3><A NAME="state"></A>2.1 Determine Connection State</H3>
297<H4>Finding current state</H4>
298<P>You can see connection states (STATE_MAIN_I1 and so on) when you
299bring up a connection on the command line. If you have missed this,
300or brought up your connection automatically, use:
301</P>
302<PRE>ipsec auto --status</PRE>
303<P>The most relevant state is the last one reached.</P>
304<H4><VAR>What's this supposed to look like?</VAR></H4>
<P>Negotiations should proceed through various states, in the process of:</P>
306<OL>
307<LI>IKE negotiations (aka Phase 1, Main Mode, STATE_MAIN_*)</LI>
308<LI>IPSEC negotiations (aka Phase 2, Quick Mode, STATE_QUICK_*)</LI>
309</OL>
310<P>These are done and a connection is established when you see messages like:</P>
311<PRE> 000 #21: &quot;myconn&quot; STATE_MAIN_I4 (ISAKMP SA established)...
312 000 #2: &quot;myconn&quot; STATE_QUICK_I2 (sent QI2, IPsec SA established)...</PRE><P>
The key phrases to look for are &quot;ISAKMP SA established&quot; and &quot;IPsec
SA established&quot;, with the relevant connection name. Often, this happens
at STATE_MAIN_I4 and STATE_QUICK_I2, respectively.</P>
316<P><VAR>ipsec auto --status</VAR> will tell you what states <STRONG>have
317been achieved</STRONG>, rather than the current state. Since
318determining the current state is rather more difficult to do, current
319state information is not available from Linux FreeS/WAN. If you are
320actively bringing a connection up, the status report's last states
321for that connection likely reflect its current state. Beware, though,
322of the case where a connection was correctly brought up but is now
323downed: Linux FreeS/WAN will not notice this until it attempts to
324rekey. Meanwhile, the last known state indicates that the connection
325has been established.</P>
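<P>To pick one connection's states out of a long status report, a small
filter can help. This is a sketch only: <VAR>states_for</VAR> is a
hypothetical helper, and it assumes status lines shaped like the sample
shown above (connection name in double quotes, followed by the state
name).</P>

```shell
#!/bin/sh
# Print, in report order, every state listed for one connection.
# Reads "ipsec auto --status" output on standard input.
# states_for is a hypothetical helper name; the line format is assumed
# to match the sample in this document.
states_for() {
    sed -n "s/.*\"$1\" \(STATE_[A-Z0-9_]*\).*/\1/p"
}

# Typical use (needs a running Pluto, so it is commented out):
#   ipsec auto --status | states_for myconn
```

<P>As noted above, these are states which have been achieved, not
necessarily the connection's current state.</P>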
<P>If your connection is stuck at STATE_MAIN_I1, skip straight to
<A HREF="#ikepath">here</A>.</P>
328
329<H3><A NAME="find.pluto.error"></A>2.2 Finding error text</H3>
330<P>Solving most errors will require you to find verbose error text,
331either on the command line or in the logs.</P>
332<H4>Verbose start for more information</H4>
333<P>
334Note that you can get more detail from <VAR>ipsec auto</VAR> using
335the --verbose flag:</P>
336<PRE STYLE="margin-bottom: 0.2in"> ipsec auto --verbose --up west-east</PRE><P>
337More complete information can be gleaned from the <A HREF="#logusage">log
338files</A>.</P>
339
340<H4>Debug levels count</H4>
341<P>The amount of description you'll get here depends on ipsec.conf debug
342settings, <VAR>klipsdebug</VAR>= and <VAR>plutodebug</VAR>=.
343When troubleshooting, set at least one of these to <VAR>all</VAR>, and
344when done, reset it to <VAR>none</VAR> so your logs don't fill up.
345Note that you must have enabled the <VAR>klipsdebug</VAR>
346<A HREF="install.html#allbut">compile-time option</A> for the
347<VAR>klipsdebug</VAR> configuration switch to work.</P>
348<P>For negotiation problems <VAR>plutodebug</VAR> is most relevant.
349<VAR>klipsdebug</VAR> applies mainly to attempts to use an
350already-established connection. See also <A HREF="ipsec.html#parts">this</A>
351description of the division of duties within Linux FreeS/WAN.</P>
352<P>After raising your debug levels, restart Linux FreeS/WAN to ensure
353that ipsec.conf is reread, then recreate the error to generate
354verbose logs.
355</P>
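<P>Before and after troubleshooting, it is worth confirming which debug
settings are actually in force. A minimal sketch, assuming the
configuration lives in <VAR>/etc/ipsec.conf</VAR> (pass another path if
yours differs); <VAR>debug_settings</VAR> is a hypothetical helper
name.</P>

```shell
#!/bin/sh
# Show the klipsdebug= and plutodebug= lines present in an ipsec.conf file.
# The default path /etc/ipsec.conf is an assumption; pass the real path
# as the first argument if your installation differs.
debug_settings() {
    grep -E '^[[:space:]]*(klipsdebug|plutodebug)[[:space:]]*=' \
        "${1:-/etc/ipsec.conf}"
}

# Typical use:
#   debug_settings                  # check the default file
#   debug_settings ./ipsec.conf     # check a copy
```

<P>If this prints nothing, neither setting is present in the file.</P>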
356<H4><VAR>ipsec barf</VAR> for lots of debugging information</H4>
357<P>
<A HREF="manpage.d/ipsec_barf.8.html"><VAR>ipsec barf (8)</VAR></A>
collects a bunch of useful debugging information, including these logs.
Use the command</P>
361<PRE>
362 ipsec barf &gt; barf.west
363</PRE>
364<P>to generate one.</P>
365<H4>Find the error</H4>
366<P>Search out the failure point in your logs.
367 Are there a handful of lines which succinctly describe how
368things are going wrong or contrary to your expectation? Sometimes the
369failure point is not immediately obvious: Linux FreeS/WAN's errors
370are usually not marked &quot;Error&quot;. Have a look in the
371<A HREF="faq.html">FAQ</A>
372for what some common failures look like.</P>
373<P>Tip: problems snowball.
374Focus your efforts on the first problem, which is likely to be the
375cause of later errors.</P>
376<H4>Play both sides</H4>
377<P>Also find error text on the peer IPSec box.
378This gives you two perspectives on the same failure.</P>
379<P>At times you will require information which only one side has.
380The peer can merely indicate the presence of an error, and its
381approximate point in the negotiations. If one side keeps retrying,
382it may be because there is a show stopper on the other side.
383Have a look at the other side and figure out what it doesn't like.</P>
384<P>If the other end is not Linux FreeS/WAN, the principle is the
385same: replicate the error with its most verbose logging on, and
386capture the output to a file.</P>
387<H3><A NAME="interpret.pluto.error"></A>2.3 Interpreting a Negotiation Error</H3>
388<H4><A NAME="ikepath"></A>Connection stuck at STATE_MAIN_I1</H4>
389<P>This error commonly happens because IKE (port 500) packets, needed
390to negotiate an IPSec connection, cannot travel freely between your IPSec
391gateways. See <A HREF="firewall.html#packets">our firewall document</A>
392for details.</P>
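<P>IKE runs over UDP port 500. As a first, rough check on a gateway that
uses iptables, you can look for a UDP rule naming that port. The sketch
below is illustrative only: <VAR>has_ike_rule</VAR> is a hypothetical
helper, and finding a rule does not prove that packets get through (the
rule may be a DROP, or another machine on the path may block them).</P>

```shell
#!/bin/sh
# Succeed if the firewall listing on standard input contains a UDP rule
# that names port 500 as source or destination port.
# has_ike_rule is a hypothetical helper name.
has_ike_rule() {
    grep -Eq 'udp.*[ds]pt:500([^0-9]|$)'
}

# Typical use (as root, so it is commented out here):
#   iptables -L -n -v | has_ike_rule && echo "found a UDP port 500 rule"
```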
393<H4>Other errors</H4>
394<P>Other errors require a bit more digging. Use the following resources:</P>
395<UL>
396 <LI><A HREF="faq.html">the FAQ</A> . Since this document is
397 constantly updated, the snapshot's FAQ may have a new entry relevant
398 to your problem.</LI>
399 <LI>our <A HREF="background.html">background document</A> .
400 Special considerations which, while not central to Linux FreeS/WAN,
401 are often tripped over. Includes problems with
402 <a href="background.html#MTU.trouble">packet fragmentation</a>,
403 and considerations for
404 testing opportunism.</LI>
405 <LI>the <A HREF="mail.html#lists">list archives</A>. Each of the
406 searchable archives works differently, so it's worth checking each.
407 Use a search term which is generic, but identifies your error, for
408 example &quot;No connection is known for&quot;.
409 <BR>
410 Often, you will find that your question has been answered in the
411 past. Finding an archived answer is quicker than asking the list.
412 You may, however, find similar questions without answers. If you do,
413 send their URLs to the list with your trouble report. The additional
414 examples may help the list tech support person find your answer.</LI>
415 <LI>Look into the code where the error is being generated. The
416 pluto code is nicely documented with comments and meaningful
417 variable names.</LI>
418</UL>
419<P>If you have failed to solve your problem with the help of these
420resources, send a detailed problem report to the users list,
421following these <A HREF="#prob.report">guidelines</A>.</P>
422<H2><A NAME="use"></A>3. Using a Connection</H2>
423<H3>3.1 Orienting yourself</H3>
424<H4><VAR>How do I know if it works?</VAR></H4>
425<P>Test your connection by sending packets through it. The simplest way
426to do this is with ping, but the ping needs to <STRONG>test the correct
427tunnel.</STRONG> See <A HREF="#testgates">this example scenario</A> if
428you don't understand this.<P>
429<P>If your ping returns, test any other connections you've brought
430u all check out, great. You may wish to <A HREF="#bigpacket">test
431with large packets</A> for MTU problems.</P>
432<H4><VAR>ipsec barf</VAR> is useful again</H4>
433<P>If your ping fails to return, generate an ipsec barf debugging
434report on each IPSec gateway. On a non-Linux FreeS/WAN
435implementation, gather equivalent information. Use this, and the tips
436in the next sections, to troubleshoot. Are you sure that both
437endpoints are capable of hearing and responding to ping?</P>
438<H3>3.2 Those pesky configuration errors</H3>
439<P>IPSec may be dropping your ping packets since they do not belong in the
440tunnels you have constructed:</P>
441<UL>
442<LI>Your ping may not test the tunnel you intend to test. For details, see our
443<A HREF="faq.html#cantping">&quot;I can't ping&quot;</A> FAQ.
444</LI>
445<LI>
446Alternately, you may have a configuration error.
447For example, you may have configured one of the four possible tunnels between
448two gateways, but not the one required to secure the important
449traffic you're now testing. In this case, add and start the tunnel,
450and try again.
451</LI>
452</UL>
453<P>In either case, you will often see a message like:</P>
454<PRE>klipsdebug... no eroute</PRE>
455<P>which we discuss in <A HREF="faq.html#no_eroute">this
456FAQ</A>.</P>
457<P>Note:</P>
458<UL>
459<LI><A HREF="glossary.html#NAT.gloss">Network Address Translation (NAT)</A>
460and <A HREF="glossary.html#masq">IP masquerade</A> may have an effect on
461which tunnels you need to configure.</LI>
462<LI>When testing a tunnel that protects a multi-node subnet, try several
463subnet nodes as ping targets, in case one node is routing incorrectly.</LI>
464</UL>
465<H3><A NAME="route.firewall"></A>3.3 Check Routing and Firewalling</H3>
466<P>If you've confirmed your configuration assumptions, the problem is
467almost certainly with routing or firewalling. Isolate the problem
468using interface statistics, firewall statistics, or a packet sniffer.</P>
469<H4>Background:</H4>
470<UL>
471 <LI>Linux FreeS/WAN supplies all the special routing it needs;
472 you need only route packets out through your IPSec gateway. Verify
473 that on the <VAR>subnetted</VAR> machines you are using for your
474 ping-test, your routing is as expected. I have seen a tunnel
475 &quot;fail&quot; because the subnet machine sending packets
476 out an alternate gateway (not our IPSec gateway) on their return path.
477 <LI>Linux FreeS/WAN requires particular <A HREF="firewall.html">
478 firewalling considerations</A>.
479 Check the firewall rules on your IPSec gateways and ensure that they
480 allow IPSec traffic through. Be sure that no other machine - for
481 example a router between the gateways - is blocking your IPSec
482 packets.
483</UL>
484<H4><A NAME="ifconfig"></A>View Interface and Firewall
485Statistics</H4>
486<P>Interface reports and firewall statistics can help you track down
487lost packets at a glance. Check any firewall statistics you may be keeping
488on your IPSec gateways, for dropped packets.</P>
489
490<P><STRONG>Tip</STRONG>: You can take a snapshot of the packets processed
491by your firewall with:</P>
492
493<PRE> iptables -L -n -v</PRE>
494
495<P>You can get creative with "diff" to find out what happens to a
496particular packet during transmission.</P>
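<P>One way to apply the diff idea is to take a snapshot, send some test
traffic, take a second snapshot, and compare the two. A minimal sketch
follows; <VAR>diff_around</VAR> is a hypothetical helper name, and the
ping target is just an example address.</P>

```shell
#!/bin/sh
# Run a snapshot command, then an action, then the snapshot again, and
# print the differences between the two snapshots.
# diff_around is a hypothetical helper name.
#   $1 = snapshot command, $2 = action to run in between
diff_around() {
    before=$(mktemp) || return 1
    after=$(mktemp) || { rm -f "$before"; return 1; }
    sh -c "$1" > "$before"
    sh -c "$2" > /dev/null 2>&1
    sh -c "$1" > "$after"
    diff "$before" "$after"
    rc=$?
    rm -f "$before" "$after"
    return "$rc"
}

# Typical use (as root, so it is commented out here):
#   diff_around 'iptables -L -n -v' 'ping -c 5 192.0.2.3'
```

<P>Counters that change between the snapshots show which rules handled
traffic during the test. Other traffic on a busy gateway will also show
up, so this works best on a quiet machine.</P>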
497
498<P>Both <VAR>cat /proc/net/dev</VAR> and <VAR>ifconfig</VAR> display
499interface statistics, and both are included in <VAR>ipsec barf</VAR>. Use
500either to check if any interface has dropped packets. If you find
501that one has, test whether this is related to your ping. While you
502ping continuously, print that interface's statistics several times.
503Does its drop count increase in proportion to the ping? If so, check
504why the packets are dropped there.</P>
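<P>The repeated check described above can be sketched as a small filter
over <VAR>/proc/net/dev</VAR>. <VAR>iface_drops</VAR> is a hypothetical
helper, and the interface name <VAR>ipsec0</VAR> in the usage note is an
assumption; substitute the interface you are investigating.</P>

```shell
#!/bin/sh
# Print the receive and transmit drop counters for one interface.
# Field positions follow the Linux /proc/net/dev layout: after the
# interface name come 8 receive fields (bytes, packets, errs, drop, ...)
# and then 8 transmit fields in the same order.
# iface_drops is a hypothetical helper name.
#   $1 = interface name, $2 = optional file (default /proc/net/dev)
iface_drops() {
    awk -v ifc="$1" '
        { sub(/:/, " ") }          # separate "eth0:" from the counters
        $1 == ifc { print "rx_drop=" $5, "tx_drop=" $13 }
    ' "${2:-/proc/net/dev}"
}

# Typical use while a ping is running in another terminal:
#   iface_drops ipsec0; sleep 10; iface_drops ipsec0
```

<P>If the counters climb while you ping and stay flat when you stop, the
drops are probably related to your test traffic.</P>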
505
506<P>To do this, look at the firewall rules that apply to that interface. If the
507interface is an IPSec interface, more information may be available in
508the log. Grep for the word &quot;drop&quot; in a log which was
509created with <VAR>klipsdebug=all</VAR> as the error happened.</P>
510<P>See also this <A HREF="#ifconfig1">discussion</A> on interpreting
511<VAR>ifconfig</VAR> statistics.</P>
512<H3><A NAME="sniff"></A>3.4 When in doubt, sniff it out</H3>
513<P>If you have checked configuration assumptions, routing, and
514firewall rules, and your interface statistics yield no clue, it
515remains for you to investigate the mystery of the lost packet by the
516most thorough method: with a packet sniffer (providing, of course,
517that this is legal where you are working).
518<P>In order to detect packets on the ipsec virtual interfaces,
519you will need an up-to-date sniffer (tcpdump, ethereal, ksnuffle) on
520your IPSec gateway machines. You may also find it useful to sniff the ping
521endpoints.</P>
522<H4>Anticipate your packets' path</H4>
<P>Ping, and examine each interface along the projected path, checking for your
ping's arrival. If it doesn't get to the next stop, you have narrowed
525down where to look for it. In this way, you can isolate a problem area,
526and narrow your troubleshooting focus.</P>
527<P>Within a machine running Linux FreeS/WAN, this
528<A HREF="firewall.html#packets">packet flow diagram</A> will help you
529anticipate a packet's path.
530<P>Note that:</P>
531<UL>
532<LI>
533from the perspective of the tunneled packet, the entire tunnel is one hop.
534That's explained in <A HREF="faq.html#no_trace">this</A> FAQ.
535</LI>
536<LI>
537 an encapsulated IPSec packet will look different, when
538sniffed, from the plaintext packet which generated it. You
539can see plaintext packets entering an IPSec interface and the
resulting ciphertext packets as they emerge from the corresponding
541physical interface.
542</LI>
543</UL>
544<P>Once you isolate where the packet is lost, take a closer look at
545firewall rules, routing and configuration assumptions as they affect
546that specific area. If the packet is lost on an IPSec gateway, comb
547through <VAR>klipsdebug</VAR> output for anomalies.
548</P>
549<P>If the packet goes through both gateways successfully and reaches
550the ping target, but does not return, suspect routing. Check that the
551ping target routes packets back to the IPSec gateway.</P>
552<H3><A NAME="find.use.error"></A>3.5 Check your logs</H3>
553<P>Here, too, log information can be useful. Start with the
554<A HREF="#find.pluto.error">guidelines above</A>.</P>
555<P>For connection use problems, set <VAR>klipsdebug=all</VAR>. Note
556that you must have enabled the <VAR>klipsdebug</VAR>
557<A HREF="install.html#allbut">compile-time option</A> to do this.
558Restart Linux FreeS/WAN so that it rereads <VAR>ipsec.conf</VAR>,
559then recreate the error condition. When searching through
560<VAR>klipsdebug</VAR> data, look especially for the keywords
561&quot;drop&quot; (as in dropped packets) and &quot;error&quot;.</P>
562<P>Often the problem with connection use is not software error, but
563rather that the software is behaving contrary to expectation.
564</P>
565<H4><A NAME="interpret.use.error"></A>Interpreting log text</H4>
566<P>To interpret the Linux FreeS/WAN log text you've found, use the
567same resources as indicated for troubleshooting
568connection negotiation:
569<A HREF="faq.html">the FAQ</A> , our
570<A HREF="background.html">background document</A>, and the
571<A HREF="mail.html#lists">list archives</A>.
572Looking in the KLIPS code is only for the very brave.</P>
573<P>If you are still stuck, send a <A HREF="#prob.report">detailed
574problem report</A> to the users' list.</P>
575<H3><A NAME="bigpacket"></A>3.6 More testing for the truly thorough</H3>
576<H4>Large Packets</H4>
<P>If each of your connections passed the ping test, you may wish to
test by pinging with large packets (2000 bytes or larger). If the ping does
not return, suspect MTU issues, and see this <A HREF="background.html#MTU.trouble">discussion</A>.</P>
580<H4>Stress Tests</H4>
581<P>In most users' view, a simple ping test, and perhaps a
582large-packet ping test suffice to indicate a working IPSec
583connection.</P>
584<P>Some people might like to do additional stress tests prior to
585production use. They may be interested in this <A HREF="http://www.sandelman.ottawa.on.ca/linux-ipsec/html/2000/12/msg00224.html">testing
586protocol</A> we use at interoperation conferences, aka &quot;bakeoffs&quot;.
587We also have a <VAR>testing</VAR> directory that ships with the
588release.</P>
589<H2><A NAME="prob.report"></A>4. Problem Reporting</H2>
590<H3>4.1 How to ask for help</H3>
591<P>Ask for troubleshooting help on the users' mailing list,
592<A HREF="mailto:users@lists.freeswan.org">users@lists.freeswan.org</A>.
593While sometimes an initial query with a quick description of your
594intent and error will twig someone's memory of a similar problem,
595it's often necessary to send a second mail with a complete problem
596report.
597</P>
598
599
600<P>When reporting problems to the mailing list(s), please include:
601</P>
602<UL>
603 <LI>a brief description of the problem</LI>
604 <LI>if it's a compile problem, the actual output from make,
605 showing the problem. Try to edit it down to only the relevant part,
606 but when in doubt, be as complete as you can. If it's a kernel
607 compile problem, any relevant out.* files</LI>
608 <LI>if it's a run-time problem, pointers to where we can find the
609 complete output from &quot;ipsec barf&quot; from BOTH ENDS (not just
610 one of them). Remember that it's common outside the US and Canada to
611 pay for download volume, so if you can't post barfs on the web and
612 send the URL to the mailing list, at least compress them with tar or
613 gzip.<BR>
614 If you can, try to simplify the case that is causing the problem.
615 In particular, if you clear your logs, start FreeS/WAN with no other
616 connections running, cause the problem to happen, and then do <VAR>ipsec
617 barf</VAR> on both ends immediately, that gives the smallest and
618 least cluttered output.</LI>
619 <LI>any other error messages, complaints, etc. that you saw.
620 Please send the complete text of the messages, not just a summary.</LI>
621 <LI>what your network setup is. Include subnets, gateway
622 addresses, etc. A schematic diagram is a
623 good format for this information.</LI>
624 <LI>exactly what you were trying to do with Linux FreeS/WAN, and
625 exactly what went wrong</LI>
626 <LI>a fix, if you have one. But remember, you are sending mail to
627 people all over the world; US residents and US citizens in
628 particular, please read doc/exportlaws.html before sending code --
629 even small bug fixes -- to the list or to us.</LI>
630 <LI>When in doubt about whether to include some seemingly-trivial
631 item of information, include it. It is rare for problem reports to
632 have too much information, and common for them to have too little.</LI>
633</UL>
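<P>Capturing a report and compressing it can be done in one step. The
sketch below is one possible way to do it; <VAR>collect_report</VAR> is a
hypothetical helper name, and the file name is only an example.</P>

```shell
#!/bin/sh
# Run a command and store its gzip-compressed output (standard output
# and standard error together) in <name>.gz.
# collect_report is a hypothetical helper name.
#   $1 = output name without the .gz suffix; remaining args = command
collect_report() {
    name=$1
    shift
    "$@" 2>&1 | gzip -c > "$name.gz"
}

# Typical use on each gateway (commented out, since it needs FreeS/WAN):
#   collect_report barf.west ipsec barf
```

<P>Run it on both ends, immediately after reproducing the problem, so the
two reports cover the same failure.</P>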
634
635<P>Here are some good general guidelines on bug reporting:
636<a href="http://tuxedo.org/~esr/faqs/smart-questions.html">How To Ask Questions
637The Smart Way</a> and <a
638href="http://www.chiark.greenend.org.uk/~sgtatham/bugs.html">How to Report
639Bugs Effectively</a>.</p>
640
641
642<H3>4.2 Where to ask</H3>
643<P>To report a problem, send mail about it to the users' list. If you
644are certain that you have found a bug, report it to the bugs list. If
645you encounter a problem while doing your own coding on the Linux
646FreeS/WAN codebase and think it is of interest to the design team,
647notify the design list. When in doubt, default to the users' list.
648More information about the mailing lists is found <A HREF="mail.html#lists">here</A>.</P>
649<P>For a number of reasons -- including export-control regulations
650affecting almost any <STRONG>private</STRONG> discussion of
651encryption software -- we prefer that problem reports and discussions
652go to the lists, not directly to the team. Beware that the list goes
653worldwide; US citizens, read this important information about your
654<A HREF="politics.html#exlaw">export laws</A>. If you're using this
655software, you really should be on the lists. To get onto them, visit
656<A HREF="http://lists.freeswan.org/">lists.freeswan.org</A>.</P>
657<P>If you do send private mail to our coders or want a private reply
658from them, please make sure that the return address on your mail
659(From or Reply-To header) is a valid one. They have more important
660things to do than to unravel addresses that have been mangled in an
661attempt to confuse spammers.
662</P>
663<H2><A NAME="notes"></A>5. Additional Notes on Troubleshooting</H2>
664<P>The following sections supplement the Guide: <A HREF="#system.info">information
665available on your system</A>; <A HREF="#testgates">testing between
666security gateways</A>; <A HREF="#ifconfig1">ifconfig reports for
667KLIPS debugging</A>; <A HREF="#gdb">using GDB on Pluto</A>.</P>
<H3><A NAME="system.info"></A>5.1 Information available on your
system</H3>
<H4><A NAME="logusage"></A>Logs used</H4>
<P>Linux FreeS/WAN logs to:</P>
<UL>
  <LI>/var/log/secure (or, on Debian, /var/log/auth.log)</LI>
  <LI>/var/log/messages</LI>
</UL>
<P>Check both places to get full information. If you find nothing,
consult <VAR>syslog.conf(5)</VAR> and check where your
/etc/syslog.conf or equivalent is directing <VAR>authpriv</VAR>
messages.</P>
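<P>As a rough illustration, these checks might look like the following.
This is a sketch under assumptions: the <VAR>authpriv</VAR> entry shown
is a typical one rather than one taken from your system, and the log
paths are those listed above. Reading the logs usually requires root
privileges.</P>
<PRE>    # See where authpriv messages are being sent
    $ grep authpriv /etc/syslog.conf

    # A typical entry (assumption; yours may differ) looks like:
    #   authpriv.*        /var/log/secure

    # Search both logs for Pluto messages
    $ grep -i pluto /var/log/secure /var/log/messages</PRE>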
<H4><A NAME="pages"></A>man pages provided</H4>
<DL>
  <DT><A HREF="manpage.d/ipsec.conf.5.html">ipsec.conf(5)</A>
  </DT><DD>
  Manual page for the IPSEC configuration file.
  </DD><DT>
  <A HREF="manpage.d/ipsec.8.html">ipsec(8)</A>
  </DT><DD STYLE="margin-bottom: 0.2in">
  Primary man page for ipsec utilities.
  </DD></DL>
<P>
Other man pages are on <A HREF="manpages.html">this list</A> and in</P>
<UL>
  <LI>/usr/local/man/man3</LI>
  <LI>/usr/local/man/man5</LI>
  <LI>/usr/local/man/man8/ipsec_*</LI>
</UL>
<H4><A NAME="statusinfo"></A>Status information</H4>
<DL>
  <DT>ipsec auto --status
  </DT><DD>
  Command to get a status report from a running system. Displays Pluto's
  state. Includes the list of connections which are currently &quot;added&quot;
  to Pluto's internal database; lists state objects reflecting ISAKMP
  and IPsec SAs being negotiated or installed.
  </DD><DT>
  ipsec look
  </DT><DD>
  Brief status info.
  </DD><DT>
  ipsec barf
  </DT><DD STYLE="margin-bottom: 0.2in">
  Copious debugging info.
  </DD></DL>
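<P>For example, a troubleshooting session might collect this
information as follows. The output file names are arbitrary choices
made for illustration, and the commands are normally run as root.</P>
<PRE>    # Save Pluto's view of connections and SAs
    $ ipsec auto --status > status.txt

    # Take a quick look at the current state
    $ ipsec look

    # Capture the full debugging report for later study
    $ ipsec barf > barf.txt</PRE>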
<H3>
<A NAME="testgates"></A>5.2 Testing between security gateways</H3>
<P>Sometimes you need to test a subnet-subnet tunnel. This is a
tunnel between two security gateways, which protects traffic on
behalf of the subnets behind these gateways. On this network:</P>
<PRE>    Sunset==========West------------------East=========Sunrise
               IPSec gateway         IPSec gateway
          local net       untrusted net       local net</PRE>
<P>
you might name this tunnel sunset-sunrise. You can test this tunnel
by having a machine behind one gateway ping a machine behind the
other gateway, but this is not always convenient or even possible.</P>
<P>Simply pinging one gateway from the other is not useful. Such a
ping does not normally go through the tunnel. <STRONG>The tunnel
handles traffic between the two protected subnets, not between the
gateways</STRONG>. Depending on the routing in place, a ping might</P>
<UL>
  <LI>either succeed by finding an unencrypted route</LI>
  <LI>or fail by finding no route. Packets without an IPSEC eroute
  are discarded.</LI>
</UL>
<P><STRONG>Neither event tells you anything about the tunnel</STRONG>.
You can explicitly create an eroute to force such packets through the
tunnel, or you can create additional tunnels as described in our
<A HREF="config.html#multitunnel">configuration document</A>, but
those may be unnecessary complications in your situation.</P>
<P>The trick is to explicitly test between <STRONG>both gateways'
private-side IP addresses</STRONG>. Since the private-side interfaces
are on the protected subnets, the resulting packets do go via the
tunnel. Use either ping -I or traceroute -i, both of which allow you
to specify a source interface. (Note: these options are unsupported on
older Linux systems.)
The same principles apply for a road warrior (or other) case where
only one end of your tunnel is a subnet.</P>
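<P>A sketch of such a test, run on West. The addresses and interface
name are hypothetical placeholders: 10.0.1.1 stands for West's
private-side address, 10.0.2.1 for East's private-side address, and
eth1 for West's private-side interface. Substitute your own values.</P>
<PRE>    # Ping East's private-side address, sourcing the packets from
    # West's private-side address (both addresses are placeholders)
    $ ping -I 10.0.1.1 10.0.2.1

    # Trace the route, taking the source from West's private-side
    # interface (eth1 is an assumed interface name)
    $ traceroute -i eth1 10.0.2.1</PRE>
<P>If the tunnel is up, the ping should be answered from East's
private-side address.</P>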
<H3><A NAME="ifconfig1"></A>5.3 ifconfig reports for KLIPS debugging</H3>
<P>When diagnosing problems using ifconfig statistics, you may wonder
what type of activity increments a particular counter for an ipsecN
device. Here's an index, posted by KLIPS developer Richard Guy
Briggs:</P>
<PRE>Here is a catalogue of the types of errors that can occur for which
statistics are kept when transmitting and receiving packets via klips.
I notice that they are not necessarily logged in the right counter.
. . .

Sources of ifconfig statistics for ipsec devices

rx-errors:
- packet handed to ipsec_rcv that is not an ipsec packet.
- ipsec packet with payload length not modulo 4.
- ipsec packet with bad authenticator length.
- incoming packet with no SA.
- replayed packet.
- incoming authentication failed.
- got esp packet with length not modulo 8.

tx_dropped:
- cannot process ip_options.
- packet ttl expired.
- packet with no eroute.
- eroute with no SA.
- cannot allocate sk_buff.
- cannot allocate kernel memory.
- sk_buff internal error.


The standard counters are:

struct enet_statistics
{
        int     rx_packets;             /* total packets received       */
        int     tx_packets;             /* total packets transmitted    */
        int     rx_errors;              /* bad packets received         */
        int     tx_errors;              /* packet transmit problems     */
        int     rx_dropped;             /* no space in linux buffers    */
        int     tx_dropped;             /* no space available in linux  */
        int     multicast;              /* multicast packets received   */
        int     collisions;

        /* detailed rx_errors: */
        int     rx_length_errors;
        int     rx_over_errors;         /* receiver ring buff overflow  */
        int     rx_crc_errors;          /* recved pkt with crc error    */
        int     rx_frame_errors;        /* recv'd frame alignment error */
        int     rx_fifo_errors;         /* recv'r fifo overrun          */
        int     rx_missed_errors;       /* receiver missed packet       */

        /* detailed tx_errors */
        int     tx_aborted_errors;
        int     tx_carrier_errors;
        int     tx_fifo_errors;
        int     tx_heartbeat_errors;
        int     tx_window_errors;
};

of which I think only the first 6 are useful.</PRE>
<H3><A NAME="gdb"></A>5.4 Using GDB on Pluto</H3>
<P>You may need to use the GNU debugger, gdb(1), on Pluto. This
should be necessary only in unusual cases, for example if you
encounter a problem which the Pluto developer cannot readily
reproduce or if you are modifying Pluto.
</P>
<P>Here are the Pluto developer's suggestions for doing this:
</P>
<PRE>Can you get a core dump and use gdb to find out what Pluto was doing
when it died?

To get a core dump, you will have to set dumpdir to point to a
suitable directory (see <A HREF="manpage.d/ipsec.conf.5.html">ipsec.conf(5)</A>).

To get gdb to tell you interesting stuff:
    $ script
    $ cd dump-directory-you-chose
    $ gdb /usr/local/lib/ipsec/pluto core
    (gdb) where
    (gdb) quit
    $ exit

The resulting output will have been captured by the script command in
a file called &quot;typescript&quot;. Send it to the list.

Do not delete the core file. I may need to ask you to print out some
more relevant stuff.</PRE>
<P>
Note that the <VAR>dumpdir</VAR> parameter takes effect only when the
IPsec subsystem is restarted, either by a reboot or by running
ipsec setup restart.</P>
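<P>For illustration, a minimal sketch of the relevant configuration.
The directory /var/tmp is only an example choice, not a recommendation
from the Pluto developer; any directory with enough free space that
Pluto can write to should serve.</P>
<PRE>    # /etc/ipsec.conf (fragment; /var/tmp is an assumed example)
    config setup
            dumpdir=/var/tmp

    # Restart the IPsec subsystem so that the setting takes effect
    $ ipsec setup restart</PRE>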
<P><BR><BR>
</P>
</BODY>
</HTML>