NetBSD/gnu/dist/postfix/DEBUG_README

1 - Purpose of this document
============================
This document describes how to debug parts of the Postfix mail
system, either by making the software log a lot of detail to the
syslog daemon, or by running some daemon processes under control
of an interactive debugger.
2 - Verbose logging for specific SMTP connections
=================================================
In /etc/postfix/main.cf, list the remote site name or address in
the "debug_peer_list" parameter. For example, in order to make the
software log a lot of information to the syslog daemon for connections
from or to the loopback interface:

    debug_peer_list = 127.0.0.1

You can specify one or more hosts, domains, addresses or net/masks.
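
As an illustration of the list form, the fragment below combines an
address, a net/mask and a domain. All three values are made-up
examples. The debug_peer_level parameter, which controls how much the
verbosity is raised for matching peers, is shown with its usual
default; check the postconf output on your own system.

```
# /etc/postfix/main.cf (example values only)
debug_peer_list = 127.0.0.1, 192.168.0.0/24, example.com
debug_peer_level = 2
```
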
2b - Record the SMTP connection with a sniffer
==============================================
This example uses tcpdump. In order to record a conversation you
need to specify a large enough snapshot length (the -s option), or
else you will miss some or all of the packet payload.

    tcpdump -w /file/name -s 2000 host hostname and port 25

Run this for a while, stop with Ctrl-C when done. To view the data
use a binary viewer, or use my tcpdumpx utility that is available
from ftp://ftp.porcupine.org/pub/debugging.
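
The capture and a later replay can be sketched as two small shell
functions. The names record_smtp and show_smtp are hypothetical, and
the -n, -X and -r options used for the replay are assumed to be
supported by your tcpdump version (older releases may lack -X).

```shell
# Sketch only: record_smtp and show_smtp are hypothetical helper names.

# Record SMTP traffic to or from one host into a capture file.
# $1 = remote host name or address, $2 = capture file
record_smtp() {
    tcpdump -w "$2" -s 2000 host "$1" and port 25
}

# Print a saved capture with each packet's payload in hex and ASCII.
# -n skips name lookups; -X is assumed to exist in your tcpdump.
# $1 = capture file
show_smtp() {
    tcpdump -n -X -r "$1"
}
```

Capturing normally requires super-user privileges; reading a saved
file usually does not.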
3 - Making Postfix daemon programs more verbose
===============================================
Append one or more -v options to selected daemon definitions in
/etc/postfix/master.cf and type "postfix reload". This will cause
a lot of activity to be logged to the syslog daemon.
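
For instance, to make only the SMTP server verbose, the master.cf
entry might look like the fragment below. The service line itself is
the one used elsewhere in this document; your own entry may have
different field values.

```
# /etc/postfix/master.cf: -v appended to the smtpd command
smtp      inet  n       -       n       -       -       smtpd -v
```
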
4 - Manually tracing a Postfix daemon process
=============================================
Some systems allow you to inspect a running process with a system
call tracer. For example:

    # trace -p process-id
    # strace -p process-id
    # truss -p process-id
    # ktrace -p process-id

See your system documentation for details.
Tracing a running process can give valuable information about what
a process is attempting to do. This is as much information as you
can get without running an interactive debugger program, as described
in a later section.
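
To find a process-id to trace, one option is to pull it out of the
mail log. The sketch below assumes log lines of the form
"postfix/smtpd[1234]: ...", which is what Postfix produces with the
default syslog_name; daemon_pids is a hypothetical helper name.

```shell
# Sketch only: daemon_pids is a hypothetical helper name.
# Reads syslog lines on standard input and prints the unique process
# ids logged by the named Postfix daemon, one per line.
# $1 = daemon name, for example smtpd
daemon_pids() {
    sed -n "s|.*postfix/$1\[\([0-9][0-9]*\)]:.*|\1|p" | sort -u
}
```

Feed it your mail logfile on standard input (the location differs
between systems), then pass one of the printed numbers to the call
tracer as the process-id.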
5 - Automatically tracing a Postfix daemon process
==================================================
Postfix can attach a call tracer whenever a daemon process starts.
Append a -D option to the suspect command in /etc/postfix/master.cf,
for example:

    smtp      inet  n       -       n       -       -       smtpd -D

Edit the debugger_command definition in /etc/postfix/main.cf so
that it invokes the call tracer of your choice, for example:

    debugger_command =
         PATH=/bin:/usr/bin:/usr/local/bin
         (truss -p $process_id 2>&1 | logger -p mail.info) & sleep 5

The continuation lines must begin with whitespace. Instead of truss
you can use trace or strace, whichever your system provides.
Type "postfix reload" and watch the logfile.
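
On a system that provides strace rather than truss, the same
definition might read as follows. This is a direct substitution into
the example above, not a separately tested configuration; adjust PATH
to wherever strace and logger live on your system.

```
# /etc/postfix/main.cf: continuation lines start with whitespace
debugger_command =
     PATH=/bin:/usr/bin:/usr/local/bin
     (strace -p $process_id 2>&1 | logger -p mail.info) & sleep 5
```
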
6 - Running daemon programs under an interactive debugger
=========================================================
Append a -D option to the suspect command in /etc/postfix/master.cf,
for example:

    smtp      inet  n       -       n       -       -       smtpd -D

Edit the debugger_command definition in /etc/postfix/main.cf so
that it invokes the debugger of your choice, for example:

    debugger_command =
         PATH=/usr/bin:/usr/X11R6/bin
         xxgdb $daemon_directory/$process_name $process_id & sleep 5

If you use xxgdb, be sure that gdb is in the command search path.
Export XAUTHORITY so that X access control works, for example:

    % setenv XAUTHORITY ~/.Xauthority

Stop and start the Postfix system.
Whenever the suspect daemon process is started, a debugger window
pops up and you can watch in detail what happens.
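
The setenv line above is csh syntax. If you start Postfix from a
Bourne-style shell, an equivalent would be the following, assuming
that your X authority file is in the usual place in your home
directory:

```shell
# Bourne-shell equivalent of the csh setenv line.
# Assumes the X authority file lives at $HOME/.Xauthority.
XAUTHORITY="$HOME/.Xauthority"
export XAUTHORITY
```
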
7 - Unreasonable behavior
=========================
Sometimes the behavior exhibited by Postfix just does not match the
source code. Why can a program deviate from the instructions given
by its author? There are two possibilities.
1 - The compiler has messed up.
2 - The hardware has messed up.
In both cases, the program being executed is not the program that
was supposed to be executed, so anything can happen.
There is a third possibility:
3 - Bugs in system software (kernel or libraries).
Hardware-related failures happen erratically, and they usually do
not reproduce after power cycling and rebooting the system. There's
little I can do about bad hardware. Be sure to use hardware that
at the very least can detect memory errors. Otherwise, Postfix will
just be a sitting duck waiting to be hit by a bit error. Critical
systems deserve real hardware.
When a compiler messes up, the problem can be reproduced whenever
the resulting program is run. Compiler errors are most likely to
happen in the code optimizer. If a problem is reproducible across
power cycles and system reboots, it can be worthwhile to rebuild
Postfix with optimization disabled, and to see if optimization
makes a difference.
In order to compile Postfix with optimizations turned off:

    % make tidy
    % make makefiles OPT=

This produces a set of Makefiles that do not request compiler
optimization.
Once the makefiles are set up, build the software:

    % make
    % su
    # make install

Then see whether the problem still reproduces. If the problem goes
away, the code optimizer is the likely culprit; talk to your compiler
vendor.