<!doctype html public "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">

<html>

<head>

<title> Postfix Debugging Howto </title>

<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">

</head>

<body>

<h1><img src="postfix-logo.jpg" width="203" height="98" ALT="">Postfix Debugging Howto</h1>

<hr>

<h2>Purpose of this document</h2>

<p> This document describes how to debug parts of the Postfix mail
system when things do not work according to expectation. The methods
vary from making Postfix log a lot of detail, to running some daemon
processes under control of a call tracer or debugger. </p>

<p> The text assumes that the Postfix main.cf and master.cf
configuration files are stored in directory /etc/postfix. You can
use the command "<b>postconf config_directory</b>" to find out the
actual location of this directory on your machine. </p>

<p> Listed in order of increasing invasiveness, the debugging
techniques are as follows: </p>

<ul>

<li><a href="#logging">Look for obvious signs of trouble</a>

<li><a href="#trace_mail">Debugging Postfix from inside</a>

<li><a href="#no_chroot">Try turning off chroot operation in
master.cf</a>

<li><a href="#debug_peer">Verbose logging for specific SMTP
connections</a>

<li><a href="#sniffer">Record the SMTP session with a network
sniffer</a>

<li><a href="#verbose">Making Postfix daemon programs more verbose</a>

<li><a href="#man_trace">Manually tracing a Postfix daemon process</a>

<li><a href="#auto_trace">Automatically tracing a Postfix daemon
process</a>

<li><a href="#xxgdb">Running daemon programs with the interactive
xxgdb debugger</a>

<li><a href="#gdb">Running daemon programs under a non-interactive
debugger</a>

<li><a href="#unreasonable">Unreasonable behavior</a>

<li><a href="#mail">Reporting problems to postfix-users@postfix.org</a>

</ul>

<h2><a name="logging">Look for obvious signs of trouble</a></h2>

<p> Postfix logs all failed and successful deliveries to a logfile.
The file is usually called /var/log/maillog or /var/log/mail; the
exact pathname is defined in the /etc/syslog.conf file. </p>

<p> When Postfix does not receive or deliver mail, the first order
of business is to look for errors that prevent Postfix from working
properly: </p>

<blockquote>
<pre>
% egrep '(warning|error|fatal|panic):' /some/log/file | more
</pre>
</blockquote>

<p> Note: the most important message is near the BEGINNING of the
output. Error messages that come later are less useful. </p>

<p> The nature of each problem is indicated as follows: </p>

<ul>

<li> <p> "<b>panic</b>" indicates a problem in the software itself
that only a programmer can fix. Postfix cannot proceed until this
is fixed. </p>

<li> <p> "<b>fatal</b>" is the result of missing files, incorrect
permissions, or incorrect configuration file settings that you can
fix. Postfix cannot proceed until this is fixed. </p>

<li> <p> "<b>error</b>" reports a fatal or non-fatal error condition.
Postfix cannot proceed until this is fixed. </p>

<li> <p> "<b>warning</b>" indicates a non-fatal error. These are
problems that you may not be able to fix (such as a broken DNS
server elsewhere on the network) but may also indicate local
configuration errors that could become a problem later. </p>

</ul>
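
<p> The search above can be wrapped in a small shell function. The
following is only a sketch: the function name postfix_log_summary
is made up for this example, and the logfile path is whatever your
syslog configuration uses. It counts the lines for each of the four
keywords and then prints the earliest matching line, since that one
is usually the most informative. </p>

```shell
#!/bin/sh
# Summarize Postfix problem reports in a logfile.
# postfix_log_summary is a hypothetical helper, not part of Postfix.
# Usage: postfix_log_summary /some/log/file
postfix_log_summary() {
    logfile=$1
    # Count matching lines per keyword, most serious keyword first.
    for severity in panic fatal error warning; do
        count=$(grep -c "${severity}:" "$logfile")
        printf '%-8s %s\n' "$severity" "$count"
    done
    # The earliest problem report is usually the most useful one.
    first=$(grep -E '(warning|error|fatal|panic):' "$logfile" | head -n 1)
    printf 'first: %s\n' "${first:-none}"
}
```

<p> A non-zero count for "panic" or "fatal" is the place to start
reading; warnings can usually wait until those are resolved. </p>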

<h2><a name="trace_mail">Debugging Postfix from inside</a> </h2>

<p> With Postfix version 2.1 and later you can ask Postfix to
produce mail delivery reports for debugging purposes. These reports
not only show sender/recipient addresses after address rewriting
and alias expansion or forwarding, they also show information about
delivery to mailbox, delivery to non-Postfix command, responses
from remote SMTP servers, and so on.
</p>

<p> Postfix can produce two types of mail delivery reports for
debugging: </p>

<ul>

<li> <p> What-if: report what would happen, but do not actually
deliver mail. This mode of operation is requested with: </p>

<pre>
$ <b>/usr/sbin/sendmail -bv address...</b>
Mail Delivery Status Report will be mailed to &lt;your login name&gt;.
</pre>

<li> <p> What happened: deliver mail and report successes and/or
failures, including replies from remote SMTP servers. This mode
of operation is requested with: </p>

<pre>
$ <b>/usr/sbin/sendmail -v address...</b>
Mail Delivery Status Report will be mailed to &lt;your login name&gt;.
</pre>

</ul>

<p> These reports contain information that is generated by Postfix
delivery agents. Since these run as daemon processes and do not
interact with users directly, the result is sent as mail to the
sender of the test message. The format of these reports is practically
identical to that of ordinary non-delivery notifications. </p>

<p> For a detailed example of a mail delivery status report, see
the <a href="ADDRESS_REWRITING_README.html#debugging"> debugging</a>
section at the end of the ADDRESS_REWRITING_README document. </p>

<h2><a name="no_chroot">Try turning off chroot operation in master.cf</a></h2>

<p> A common mistake is to turn on chroot operation in the master.cf
file without going through all the necessary steps to set up a
chroot environment. This causes Postfix daemon processes to fail
due to all kinds of missing files. </p>

<p> The example below shows an SMTP server that is configured with
chroot turned off: </p>

<blockquote>
<pre>
/etc/postfix/master.cf:
    # =============================================================
    # service type  private unpriv  <b>chroot</b>  wakeup  maxproc command
    #               (yes)   (yes)   <b>(yes)</b>   (never) (100)
    # =============================================================
    smtp      inet  n       -       <b>n</b>       -       -       smtpd
</pre>
</blockquote>

<p> Inspect master.cf for any services that do not have chroot
operation turned off. If you find any, save a copy of the master.cf
file, and edit the entries in question. After executing the command
"<b>postfix reload</b>", see if the problem has gone away. </p>
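
<p> The inspection step can be sketched with awk. The function name
list_chroot_services is invented for this example. It prints every
service whose chroot column (the fifth field) is anything other than
"n", and it assumes the default shown in the example above, where
"-" means "yes"; on an installation whose chroot default is "no",
an entry with "-" is not actually chrooted. </p>

```shell
#!/bin/sh
# List master.cf services whose chroot column (field 5) is not "n".
# list_chroot_services is a hypothetical helper, not part of Postfix.
# Assumption: "-" in the chroot column means the default "yes".
# Usage: list_chroot_services /etc/postfix/master.cf
list_chroot_services() {
    awk '
        /^[ \t]*#/ { next }    # skip comment lines
        /^[ \t]/   { next }    # skip indented continuation lines
        NF >= 8 && $5 != "n" { print $1, $2, "chroot=" $5 }
    ' "$1"
}
```

<p> An empty result means that no service definition in the file
requests chroot operation. </p>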

<p> If turning off chrooted operation made the problem go away,
then congratulations. Leaving Postfix running in this way is
adequate for most sites. If you prefer chrooted operation, see
the Postfix <a href="BASIC_CONFIGURATION_README.html#chroot_setup">
BASIC_CONFIGURATION_README</a> file for information about how to
prepare Postfix for chrooted operation. </p>

<h2><a name="debug_peer">Verbose logging for specific SMTP
connections</a></h2>

<p> In /etc/postfix/main.cf, list the remote site name or address
in the debug_peer_list parameter. For example, in order to make
the software log a lot of information to the syslog daemon for
connections from or to the loopback interface: </p>

<blockquote>
<pre>
/etc/postfix/main.cf:
    debug_peer_list = 127.0.0.1
</pre>
</blockquote>

<p> You can specify one or more hosts, domains, addresses or
net/masks. To make the change effective immediately, execute the
command "<b>postfix reload</b>". </p>

<h2><a name="sniffer">Record the SMTP session with a network sniffer</a></h2>

<p> This example uses <b>tcpdump</b>. In order to record a conversation
you need to specify a large enough buffer with the "-s" option or
else you will miss some or all of the packet payload. </p>

<blockquote>
<pre>
# tcpdump -w /file/name -s 2000 host example.com and port 25
</pre>
</blockquote>

<p> Run this for a while, stop with Ctrl-C when done. To view the
data use a binary viewer, or <b>ethereal</b> (since renamed
Wireshark), or use my <b>tcpdumpx</b>
utility that is available from ftp://ftp.porcupine.org/pub/debugging/.
</p>

<h2><a name="verbose">Making Postfix daemon programs more verbose</a></h2>

<p> Append one or more "<b>-v</b>" options to selected daemon
definitions in /etc/postfix/master.cf and type "<b>postfix reload</b>".
This will cause a lot of activity to be logged to the syslog daemon.
Example: </p>

<blockquote>
<pre>
/etc/postfix/master.cf:
    smtp      inet  n       -       n       -       -       smtpd -v
</pre>
</blockquote>

<p> This makes the Postfix SMTP server more verbose. To diagnose
problems with address rewriting one would specify a "<b>-v</b>"
option for the cleanup(8) and/or trivial-rewrite(8) daemon, and to
diagnose problems with mail delivery one would specify a "<b>-v</b>"
option for the qmgr(8) or oqmgr(8) queue manager, or for the lmtp(8),
local(8), pipe(8), smtp(8), or virtual(8) delivery agent. </p>
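
<p> If you prefer not to edit master.cf by hand, the change can be
sketched with awk. The function name add_verbose_flag is made up for
this example. It writes a modified copy of the file to standard
output and leaves the original untouched, so you can review the
result before installing it. Both the service name and the service
type are matched, because master.cf commonly contains more than one
entry named "smtp". </p>

```shell
#!/bin/sh
# Print a copy of master.cf with "-v" appended to one service definition.
# add_verbose_flag is a hypothetical helper, not part of Postfix.
# Usage: add_verbose_flag smtp inet /etc/postfix/master.cf > master.cf.new
add_verbose_flag() {
    awk -v name="$1" -v type="$2" '
        # Match only the first line of the wanted service; comment lines
        # and indented continuation lines are passed through unchanged.
        $0 !~ /^[ \t#]/ && $1 == name && $2 == type { print $0 " -v"; next }
        { print }
    ' "$3"
}
```

<p> Compare the output with the original (for example with diff)
before copying it into place and typing "<b>postfix reload</b>".
</p>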

<h2><a name="man_trace">Manually tracing a Postfix daemon process</a></h2>

<p> Many systems allow you to inspect a running process with a
system call tracer. For example: </p>

<blockquote>
<pre>
# trace -p process-id   (SunOS 4)
# strace -p process-id  (Linux and many others)
# truss -p process-id   (Solaris, FreeBSD)
# ktrace -p process-id  (generic 4.4BSD)
</pre>
</blockquote>

<p> Even more informative are traces of system library calls.
Examples: </p>

<blockquote>
<pre>
# ltrace -p process-id  (Linux, also ported to FreeBSD and BSD/OS)
# sotruss -p process-id (Solaris)
</pre>
</blockquote>

<p> See your system documentation for details. </p>

<p> Tracing a running process can give valuable information about
what a process is attempting to do. This is as much information as
you can get without running an interactive debugger program, as
described in a later section. </p>
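
<p> The tracers above need a process ID. One way to look it up is
sketched below; the function name daemon_pids is invented for this
example, and it assumes a ps command that accepts the POSIX
"-o pid= -o comm=" form and reports the bare program name in the
comm column (some systems report a full pathname instead). </p>

```shell
#!/bin/sh
# Print the process IDs of running processes with the given command
# name, for example "smtpd", one per line.
# daemon_pids is a hypothetical helper, not part of Postfix.
daemon_pids() {
    ps -e -o pid= -o comm= | awk -v name="$1" '$2 == name { print $1 }'
}
```

<p> The output can then be handed to a tracer, for example
"strace -p" followed by one of the process IDs that
"daemon_pids smtpd" prints. </p>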

<h2><a name="auto_trace">Automatically tracing a Postfix daemon
process</a></h2>

<p> Postfix can attach a call tracer whenever a daemon process
starts. Call tracers come in several kinds. </p>

<ol>

<li> <p> System call tracers such as <b>trace</b>, <b>truss</b>,
<b>strace</b>, or <b>ktrace</b>. These show the communication
between the process and the kernel. </p>

<li> <p> Library call tracers such as <b>sotruss</b> and <b>ltrace</b>.
These show calls of library routines, and give a better idea of
what is going on within the process. </p>

</ol>

<p> Append a <b>-D</b> option to the suspect command in
/etc/postfix/master.cf, for example: </p>

<blockquote>
<pre>
/etc/postfix/master.cf:
    smtp      inet  n       -       n       -       -       smtpd -D
</pre>
</blockquote>

<p> Edit the debugger_command definition in /etc/postfix/main.cf
so that it invokes the call tracer of your choice, for example:
</p>

<blockquote>
<pre>
/etc/postfix/main.cf:
    debugger_command =
         PATH=/bin:/usr/bin:/usr/local/bin;
         (truss -p $process_id 2&gt;&amp;1 | logger -p mail.info) &amp; sleep 5
</pre>
</blockquote>

<p> Type "<b>postfix reload</b>" and watch the logfile. </p>

<h2><a name="xxgdb">Running daemon programs with the interactive
xxgdb debugger</a></h2>

<p> If you have X Windows installed on the Postfix machine, then
an interactive debugger such as <b>xxgdb</b> can be convenient.
</p>

<p> Edit the debugger_command definition in /etc/postfix/main.cf
so that it invokes <b>xxgdb</b>: </p>

<blockquote>
<pre>
/etc/postfix/main.cf:
    debugger_command =
         PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin
         xxgdb $daemon_directory/$process_name $process_id &amp; sleep 5
</pre>
</blockquote>

<p> Be sure that <b>gdb</b> is in the command search path, and
export <b>XAUTHORITY</b> so that X access control works, for example:
</p>

<blockquote>
<pre>
% setenv XAUTHORITY ~/.Xauthority               (csh syntax)
$ export XAUTHORITY=$HOME/.Xauthority           (sh syntax)
</pre>
</blockquote>

<p> Append a <b>-D</b> option to the suspect daemon definition in
/etc/postfix/master.cf, for example: </p>

<blockquote>
<pre>
/etc/postfix/master.cf:
    smtp      inet  n       -       n       -       -       smtpd -D
</pre>
</blockquote>

<p> Stop and start the Postfix system. This is necessary so that
Postfix runs with the proper <b>XAUTHORITY</b> and <b>DISPLAY</b>
settings. </p>

<p> Whenever the suspect daemon process is started, a debugger
window pops up and you can watch in detail what happens. </p>

<h2><a name="gdb">Running daemon programs under a non-interactive
debugger</a></h2>

<p> If you do not have X Windows installed on the Postfix machine,
or if you are not familiar with interactive debuggers, then you
can try to run <b>gdb</b> in non-interactive mode, and have it
print a stack trace when the process crashes. </p>

<p> Edit the debugger_command definition in /etc/postfix/main.cf
so that it invokes the <b>gdb</b> debugger: </p>

<blockquote>
<pre>
/etc/postfix/main.cf:
    debugger_command =
        PATH=/bin:/usr/bin:/usr/local/bin; export PATH; (echo cont;
        echo where) | gdb $daemon_directory/$process_name $process_id 2&gt;&amp;1
        &gt;$config_directory/$process_name.$process_id.log &amp; sleep 5
</pre>
</blockquote>

<p> Append a <b>-D</b> option to the suspect daemon definition in
/etc/postfix/master.cf, for example: </p>

<blockquote>
<pre>
/etc/postfix/master.cf:
    smtp      inet  n       -       n       -       -       smtpd -D
</pre>
</blockquote>

<p> Type "<b>postfix reload</b>" to make the configuration changes
effective. </p>

<p> Whenever a suspect daemon process is started, an output file
is created, named after the daemon and process ID (for example,
smtpd.12345.log). When the process crashes, a stack trace (with
output from the "<b>where</b>" command) is written to its logfile.
</p>
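
<p> Because one output file is created per daemon process, most of
them will never contain a stack trace. The sketch below lists only
the files that do. The function name find_stack_traces is made up
for this example; it relies on gdb printing stack frames as lines
that start with "#0", "#1", and so on, and on the file naming used
in the debugger_command above. </p>

```shell
#!/bin/sh
# List the per-process debugger logs in a directory that contain a
# stack trace (a line starting with "#0").
# find_stack_traces is a hypothetical helper, not part of Postfix.
# Usage: find_stack_traces /etc/postfix
find_stack_traces() {
    dir=$1
    for log in "$dir"/*.[0-9]*.log; do
        [ -f "$log" ] || continue    # the pattern matched no files
        if grep -q '^#0' "$log"; then
            printf '%s\n' "$log"
        fi
    done
}
```

<p> Pass the directory that the debugger_command writes to, which
in the example above is the Postfix configuration directory. </p>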

<h2><a name="unreasonable">Unreasonable behavior</a></h2>

<p> Sometimes the behavior exhibited by Postfix just does not match the
source code. Why can a program deviate from the instructions given
by its author? There are two possibilities. </p>

<ul>

<li> <p> The compiler has erred. This rarely happens. </p>

<li> <p> The hardware has erred. Does the machine have ECC memory? </p>

</ul>

<p> In both cases, the program being executed is not the program
that was supposed to be executed, so anything could happen. </p>

<p> There is a third possibility: </p>

<ul>

<li> <p> Bugs in system software (kernel or libraries). </p>

</ul>

<p> Hardware-related failures usually do not reproduce in exactly
the same way after power cycling and rebooting the system. There's
little Postfix can do about bad hardware. Be sure to use hardware
that at the very least can detect memory errors. Otherwise, Postfix
will just be waiting to be hit by a bit error. Critical systems
deserve real hardware. </p>

<p> When a compiler makes an error, the problem can be reproduced
whenever the resulting program is run. Compiler errors are most
likely to happen in the code optimizer. If a problem is reproducible
across power cycles and system reboots, it can be worthwhile to
rebuild Postfix with optimization disabled, and to see if optimization
makes a difference. </p>

<p> In order to compile Postfix with optimizations turned off: </p>

<blockquote>
<pre>
% make tidy
% make makefiles OPT=
</pre>
</blockquote>

<p> This produces a set of Makefiles that do not request compiler
optimization. </p>

<p> Once the makefiles are set up, build the software: </p>

<blockquote>
<pre>
% make
% su
# make install
</pre>
</blockquote>

<p> If the problem goes away, then it is time to ask your vendor
for help. </p>

<h2><a name="mail">Reporting problems to postfix-users@postfix.org</a></h2>

<p> The people who participate on the postfix-users@postfix.org
mailing list are very helpful, especially if YOU provide them with
sufficient information. Remember, these volunteers are willing to
help, but their time is limited. </p>

<p> When reporting a problem, be sure to include the following
information. </p>

<ul>

<li> <p> A summary of the problem. Please do not just send some
logging without explanation of what YOU believe is wrong. </p>

<li> <p> Consider using a test email address so that you don't have
to reveal email addresses of innocent people. </p>

<li> <p> If you can't use a test email address, please anonymize
information consistently. Replace each letter by "A", each digit
by "D" so that the helpers can still recognize syntactical errors.
</p>

<li> <p> Complete error messages. Please use cut-and-paste, or use
attachments, instead of reciting information from memory.
</p>

<li> <p> Postfix logging. See the text at the top of the DEBUG_README
document to find out where logging is stored. Please do not frustrate
the helpers by word wrapping the logging. </p>

<li> <p> Output from "postconf -n". Please do not send your main.cf
file. Or better, provide output from the "postfinger" tool. </p>

<li> <p> If the problem is about too much mail in the queue, consider
including output from the qshape tool, as described in the
QSHAPE_README file. </p>

<li> <p> If the problem is protocol related (connections time out
or an SMTP server complains about syntax errors etc.) consider
recording a session with tcpdump, as described in the DEBUG_README
document. </p>

</ul>
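
<p> The anonymization rule in the list above (each letter becomes
"A", each digit becomes "D") can be sketched with sed. The function
name anonymize is made up for this example. Punctuation such as "@"
and "." is left alone, so the shape of an address stays
recognizable. </p>

```shell
#!/bin/sh
# Replace every ASCII letter by "A" and every digit by "D" in the
# argument; all other characters are kept as they are.
# anonymize is a hypothetical helper, not part of Postfix.
anonymize() {
    printf '%s\n' "$1" | LC_ALL=C sed -e 's/[A-Za-z]/A/g' -e 's/[0-9]/D/g'
}
```

<p> For example, "anonymize user42@example.com" prints
AAAADD@AAAAAAA.AAA. Apply it to the addresses and host names you
need to hide, not to whole log lines, so that the rest of the
logging stays readable. </p>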

</body>

</html>