Bugfix (introduced: 20071004) missing exception handling
in smtp-sink per-command delay feature. Victor Duchovni.
File: smtpstone/smtp-sink.c.
+
+20071117-20
+
+ Revised queue manager with separate mechanisms for
+ per-destination concurrency control and dead destination
+ detection. The concurrency control supports non-integer
+ feedback for more gradual concurrency adjustments, and uses
+ hysteresis to avoid rapid oscillations. A destination is
+ declared "dead" after a configurable number of pseudo-cohorts
+ (number of deliveries equal to a destination's concurrency)
+ report connection or handshake failure. This work began
+ with a discussion that Wietse started with Patrik Rak and
+ Victor Duchovni late January 2004, and that Victor revived
+ late October 2007. To establish a baseline for further
+ improvement, Wietse implemented a few simple mechanisms.
+
+ Configuration parameters: qmgr_concurrency_feedback_debug,
+ qmgr_negative_concurrency_feedback_hysteresis,
+ qmgr_negative_concurrency_feedback_style,
+ qmgr_positive_concurrency_feedback_hysteresis,
+ qmgr_positive_concurrency_feedback_style, qmgr_sacrificial_cohorts.
+ See postconf(5) for detailed information. Right now, the
+ defaults are compatible with older Postfix versions. After
+ further review the number of parameters will be consolidated
+ and the defaults will select the better algorithms. Files:
+ qmgr/qmgr_queue.c, qmgr/qmgr_deliver.c.
If you upgrade from Postfix 2.3 or earlier, read RELEASE_NOTES-2.4
before proceeding.
-Major changes with Postfix snapshot 20071110
+Major changes with Postfix snapshot 20071121
+============================================
+
+Revised queue manager with separate mechanisms for per-destination
+concurrency control and for dead destination detection. The
+concurrency control supports non-integer feedback to allow for more
+gradual concurrency adjustments, and uses hysteresis to avoid rapid
+oscillations. A destination is declared "dead" after a configurable
+number of pseudo-cohorts(*) report connection or handshake failure.
+
+(*) A pseudo-cohort is a number of delivery requests equal to a
+ destination's delivery concurrency.
+
+The drawbacks of the old +/-1 feedback scheduler are a) overshoot
+due to exponential delivery concurrency growth with each pseudo-cohort(*)
+(5-10-20...); b) throttling down to zero concurrency after a single
+pseudo-cohort(*) failure. The second problem was especially an issue
+with low-concurrency channels where a single failure could be
+sufficient to mark a destination as "dead", and suspend further
+deliveries.
+
+The new code is a laboratory model with a multitude of configuration
+parameters, so that developers can experiment with different feedback
+functions and hysteresis values. This is a baseline against which
+further improvements will be measured: a) is the additional improvement
+worth the additional complexity; b) is the design sound, i.e. free
+from arbitrary constants and other tweaks that optimize for a narrow
+range of application.
+
+New main.cf parameters: qmgr_concurrency_feedback_debug,
+qmgr_negative_concurrency_feedback_hysteresis,
+qmgr_negative_concurrency_feedback_style,
+qmgr_positive_concurrency_feedback_hysteresis,
+qmgr_positive_concurrency_feedback_style, qmgr_sacrificial_cohorts.
+See postconf(5) for extensive descriptions.
+
+The default parameter settings are backwards compatible with older
+Postfix versions. However, after a testing period, the number of
+parameters will be consolidated, and the default settings will be
+changed to take advantage of the "better" algorithm.
+
+Major changes with Postfix snapshot 20071111
============================================
Header/body checks are now available in the SMTP client, after the
</p>
+</DD>
+
+<DT><b><a name="qmgr_concurrency_feedback_debug">qmgr_concurrency_feedback_debug</a>
+(default: no)</b></DT><DD>
+
+<p> Make the queue manager's feedback algorithm verbose for performance
+analysis purposes. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. </p>
+
+
</DD>
<DT><b><a name="qmgr_fudge_factor">qmgr_fudge_factor</a>
</p>
+</DD>
+
+<DT><b><a name="qmgr_negative_concurrency_feedback_hysteresis">qmgr_negative_concurrency_feedback_hysteresis</a>
+(default: 1)</b></DT><DD>
+
+<p> The per-destination integer amount of negative concurrency
+feedback that must accumulate between negative adjustments of a
+destination's delivery concurrency. The concurrency adjustment is
+equal in size to the negative hysteresis value, and is applied at
+the <b>beginning</b> of a cycle of (hysteresis / feedback) steps.
+At that same time, the destination's positive feedback hysteresis
+cycle is reset to its beginning. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions. </p>
+
+
+</DD>
+
+<DT><b><a name="qmgr_negative_concurrency_feedback_style">qmgr_negative_concurrency_feedback_style</a>
+(default: fixed_1)</b></DT><DD>
+
+<p> The per-destination amount of negative delivery concurrency
+feedback, after a delivery completes with a connection or handshake
+failure. </p>
+
+<dl>
+
+<dt> <b> inverse_concurrency </b> </dt> <dd> Variable feedback of
+1 / (delivery concurrency). With this setting, and with
+"<a href="postconf.5.html#qmgr_negative_concurrency_feedback_hysteresis">qmgr_negative_concurrency_feedback_hysteresis</a> = 1", the destination's
+delivery concurrency is decremented by 1 after each failed
+pseudo-cohort, and the destination is marked dead (further delivery
+suspended) after the failed pseudo-cohort count reaches
+$<a href="postconf.5.html#qmgr_sacrificial_cohorts">qmgr_sacrificial_cohorts</a>. </dd>
+
+<dt> <b> inverse_sqrt_concurrency </b> </dt> <dd> Variable feedback
+of 1 / (square root of delivery concurrency). This is an intermediate
+form between the other two. It lacks sound justification, and is a
+candidate for removal. </dd>
+
+<dt> <b> fixed_1 </b> </dt> <dd> Constant feedback of 1. This setting
+is compatible with Postfix versions before 2.5, where a destination's
+delivery concurrency is throttled down to zero (and further delivery
+suspended) after a single failed pseudo-cohort. </dd>
+
+</dl>
+
+<p> A pseudo-cohort is a number of deliveries equal to the destination's
+delivery concurrency. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions. </p>
+
+
+</DD>
+
+<DT><b><a name="qmgr_positive_concurrency_feedback_hysteresis">qmgr_positive_concurrency_feedback_hysteresis</a>
+(default: 1)</b></DT><DD>
+
+<p> The per-destination integer amount of positive concurrency
+feedback that must accumulate before positive adjustments of a
+destination's delivery concurrency. The concurrency adjustment is
+equal in size to the positive hysteresis value, and is applied at
+the <b>end</b> of a cycle of (hysteresis / feedback) steps. At that
+same time, the destination's negative feedback hysteresis cycle is
+reset to its beginning. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions. </p>
+
+
+</DD>
+
+<DT><b><a name="qmgr_positive_concurrency_feedback_style">qmgr_positive_concurrency_feedback_style</a>
+(default: fixed_1)</b></DT><DD>
+
+<p> The per-destination amount of positive delivery concurrency
+feedback, after a delivery completes without connection or handshake
+failure. </p>
+
+<dl>
+
+<dt> <b> inverse_concurrency </b> </dt> <dd> Variable feedback of
+1 / (delivery concurrency). With this setting, and with
+"<a href="postconf.5.html#qmgr_positive_concurrency_feedback_hysteresis">qmgr_positive_concurrency_feedback_hysteresis</a> = 1", the destination's
+delivery concurrency is incremented by 1 after each successful
+pseudo-cohort, until it reaches the per-destination maximal concurrency
+limit. </dd>
+
+<dt> <b> inverse_sqrt_concurrency </b> </dt> <dd> Variable feedback
+of 1 / (square root of delivery concurrency). This is an intermediate
+form between the other two. It lacks sound justification, and is a
+candidate for removal. </dd>
+
+<dt> <b> fixed_1 </b> </dt> <dd> Constant feedback of 1. This setting
+is compatible with Postfix versions before 2.5, where a destination's
+delivery concurrency is doubled after each successful pseudo-cohort,
+until it reaches the per-destination maximal concurrency limit.
+</dd>
+
+</dl>
+
+<p> A pseudo-cohort is a number of deliveries equal to the destination's
+delivery concurrency. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions. </p>
+
+
+</DD>
+
+<DT><b><a name="qmgr_sacrificial_cohorts">qmgr_sacrificial_cohorts</a>
+(default: 1)</b></DT><DD>
+
+<p> How many pseudo-cohorts must suffer connection or handshake
+failure before a specific destination is considered unavailable
+(and further delivery is suspended). A pseudo-cohort is a number
+of deliveries equal to a destination's concurrency. The pseudo-cohort
+failure count is reset each time a delivery completes without
+connection or handshake failure for that specific destination. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions. </p>
+
+
</DD>
<DT><b><a name="qmqpd_authorized_clients">qmqpd_authorized_clients</a>
<b><a href="postconf.5.html#default_destination_concurrency_limit">tion_concurrency_limit</a>)</b>
Idem, for delivery via the named message <i>transport</i>.
+ <b><a href="postconf.5.html#qmgr_concurrency_feedback_debug">qmgr_concurrency_feedback_debug</a> (no)</b>
+ Make the queue manager's feedback algorithm verbose
+ for performance analysis purposes.
+
+ <b><a href="postconf.5.html#qmgr_negative_concurrency_feedback_hysteresis">qmgr_negative_concurrency_feedback_hysteresis</a> (1)</b>
+ The per-destination integer amount of negative con-
+ currency feedback that must accumulate between neg-
+ ative adjustments of a destination's delivery con-
+ currency.
+
+ <b><a href="postconf.5.html#qmgr_negative_concurrency_feedback_style">qmgr_negative_concurrency_feedback_style</a> (fixed_1)</b>
+ The per-destination amount of negative delivery
+ concurrency feedback, after a delivery completes
+ with a connection or handshake failure.
+
+ <b><a href="postconf.5.html#qmgr_positive_concurrency_feedback_hysteresis">qmgr_positive_concurrency_feedback_hysteresis</a> (1)</b>
+ The per-destination integer amount of positive con-
+ currency feedback that must accumulate before posi-
+ tive adjustments of a destination's delivery con-
+ currency.
+
+ <b><a href="postconf.5.html#qmgr_positive_concurrency_feedback_style">qmgr_positive_concurrency_feedback_style</a> (fixed_1)</b>
+ The per-destination amount of positive delivery
+ concurrency feedback, after a delivery completes
+ without connection or handshake failure.
+
+ <b><a href="postconf.5.html#qmgr_sacrificial_cohorts">qmgr_sacrificial_cohorts</a> (1)</b>
+ How many pseudo-cohorts must suffer connection or
+ handshake failure before a specific destination is
+ considered unavailable (and further delivery is
+ suspended).
+
<b>RECIPIENT SCHEDULING CONTROLS</b>
<b><a href="postconf.5.html#default_destination_recipient_limit">default_destination_recipient_limit</a> (50)</b>
The default maximal number of recipients per mes-
Idem, for delivery via the named message <i>transport</i>.
<b>OTHER RESOURCE AND RATE CONTROLS</b>
- <b><a href="postconf.5.html#minimal_backoff_time">minimal_backoff_time</a> (version dependent)</b>
+ <b><a href="postconf.5.html#minimal_backoff_time">minimal_backoff_time</a> (300s)</b>
The minimal time between attempts to deliver a
- deferred message.
+ deferred message; prior to Postfix 2.4 the default
+ value was 1000s.
<b><a href="postconf.5.html#maximal_backoff_time">maximal_backoff_time</a> (4000s)</b>
- The maximal time between attempts to deliver a
+ The maximal time between attempts to deliver a
deferred message.
<b><a href="postconf.5.html#maximal_queue_lifetime">maximal_queue_lifetime</a> (5d)</b>
- The maximal time a message is queued before it is
+ The maximal time a message is queued before it is
sent back as undeliverable.
- <b><a href="postconf.5.html#queue_run_delay">queue_run_delay</a> (version dependent)</b>
- The time between <a href="QSHAPE_README.html#deferred_queue">deferred queue</a> scans by the queue
- manager.
+ <b><a href="postconf.5.html#queue_run_delay">queue_run_delay</a> (300s)</b>
+ The time between <a href="QSHAPE_README.html#deferred_queue">deferred queue</a> scans by the queue
+ manager; prior to Postfix 2.4 the default value was
+ 1000s.
<b><a href="postconf.5.html#transport_retry_time">transport_retry_time</a> (60s)</b>
The time between attempts by the Postfix queue man-
This feature is enabled with the helpful_warnings parameter.
.PP
This feature is available in Postfix 2.0 and later.
+.SH qmgr_concurrency_feedback_debug (default: no)
+Make the queue manager's feedback algorithm verbose for performance
+analysis purposes.
+.PP
+This feature is temporarily available in Postfix 2.5; its final
+form is likely to change.
.SH qmgr_fudge_factor (default: 100)
Obsolete feature: the percentage of delivery resources that a busy
mail system will use up for delivery of a large mailing list
the global qmgr_message_recipient_limit and the per transport
_recipient_limit) if necessary. The minimum value allowed for this
parameter is 1.
+.SH qmgr_negative_concurrency_feedback_hysteresis (default: 1)
+The per-destination integer amount of negative concurrency
+feedback that must accumulate between negative adjustments of a
+destination's delivery concurrency. The concurrency adjustment is
+equal in size to the negative hysteresis value, and is applied at
+the \fBbeginning\fR of a cycle of (hysteresis / feedback) steps.
+At that same time, the destination's positive feedback hysteresis
+cycle is reset to its beginning.
+.PP
+This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions.
+.SH qmgr_negative_concurrency_feedback_style (default: fixed_1)
+The per-destination amount of negative delivery concurrency
+feedback, after a delivery completes with a connection or handshake
+failure.
+.IP "\fB inverse_concurrency \fR"
+Variable feedback of
+1 / (delivery concurrency). With this setting, and with
+"qmgr_negative_concurrency_feedback_hysteresis = 1", the destination's
+delivery concurrency is decremented by 1 after each failed
+pseudo-cohort, and the destination is marked dead (further delivery
+suspended) after the failed pseudo-cohort count reaches
+$qmgr_sacrificial_cohorts.
+.IP "\fB inverse_sqrt_concurrency \fR"
+Variable feedback
+of 1 / (square root of delivery concurrency). This is an intermediate
+form between the other two. It lacks sound justification, and is a
+candidate for removal.
+.IP "\fB fixed_1 \fR"
+Constant feedback of 1. This setting
+is compatible with Postfix versions before 2.5, where a destination's
+delivery concurrency is throttled down to zero (and further delivery
+suspended) after a single failed pseudo-cohort.
+.PP
+A pseudo-cohort is a number of deliveries equal to the destination's
+delivery concurrency.
+.PP
+This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions.
+.SH qmgr_positive_concurrency_feedback_hysteresis (default: 1)
+The per-destination integer amount of positive concurrency
+feedback that must accumulate before positive adjustments of a
+destination's delivery concurrency. The concurrency adjustment is
+equal in size to the positive hysteresis value, and is applied at
+the \fBend\fR of a cycle of (hysteresis / feedback) steps. At that
+same time, the destination's negative feedback hysteresis cycle is
+reset to its beginning.
+.PP
+This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions.
+.SH qmgr_positive_concurrency_feedback_style (default: fixed_1)
+The per-destination amount of positive delivery concurrency
+feedback, after a delivery completes without connection or handshake
+failure.
+.IP "\fB inverse_concurrency \fR"
+Variable feedback of
+1 / (delivery concurrency). With this setting, and with
+"qmgr_positive_concurrency_feedback_hysteresis = 1", the destination's
+delivery concurrency is incremented by 1 after each successful
+pseudo-cohort, until it reaches the per-destination maximal concurrency
+limit.
+.IP "\fB inverse_sqrt_concurrency \fR"
+Variable feedback
+of 1 / (square root of delivery concurrency). This is an intermediate
+form between the other two. It lacks sound justification, and is a
+candidate for removal.
+.IP "\fB fixed_1 \fR"
+Constant feedback of 1. This setting
+is compatible with Postfix versions before 2.5, where a destination's
+delivery concurrency is doubled after each successful pseudo-cohort,
+until it reaches the per-destination maximal concurrency limit.
+.PP
+A pseudo-cohort is a number of deliveries equal to the destination's
+delivery concurrency.
+.PP
+This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions.
+.SH qmgr_sacrificial_cohorts (default: 1)
+How many pseudo-cohorts must suffer connection or handshake
+failure before a specific destination is considered unavailable
+(and further delivery is suspended). A pseudo-cohort is a number
+of deliveries equal to a destination's concurrency. The pseudo-cohort
+failure count is reset each time a delivery completes without
+connection or handshake failure for that specific destination.
+.PP
+This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions.
.SH qmqpd_authorized_clients (default: empty)
What clients are allowed to connect to the QMQP server port.
.PP
destination.
.IP "\fItransport\fB_destination_concurrency_limit ($default_destination_concurrency_limit)\fR"
Idem, for delivery via the named message \fItransport\fR.
+.IP "\fBqmgr_concurrency_feedback_debug (no)\fR"
+Make the queue manager's feedback algorithm verbose for performance
+analysis purposes.
+.IP "\fBqmgr_negative_concurrency_feedback_hysteresis (1)\fR"
+The per-destination integer amount of negative concurrency
+feedback that must accumulate between negative adjustments of a
+destination's delivery concurrency.
+.IP "\fBqmgr_negative_concurrency_feedback_style (fixed_1)\fR"
+The per-destination amount of negative delivery concurrency
+feedback, after a delivery completes with a connection or handshake
+failure.
+.IP "\fBqmgr_positive_concurrency_feedback_hysteresis (1)\fR"
+The per-destination integer amount of positive concurrency
+feedback that must accumulate before positive adjustments of a
+destination's delivery concurrency.
+.IP "\fBqmgr_positive_concurrency_feedback_style (fixed_1)\fR"
+The per-destination amount of positive delivery concurrency
+feedback, after a delivery completes without connection or handshake
+failure.
+.IP "\fBqmgr_sacrificial_cohorts (1)\fR"
+How many pseudo-cohorts must suffer connection or handshake
+failure before a specific destination is considered unavailable
+(and further delivery is suspended).
.SH "RECIPIENT SCHEDULING CONTROLS"
.na
.nf
.nf
.ad
.fi
-.IP "\fBminimal_backoff_time (version dependent)\fR"
-The minimal time between attempts to deliver a deferred message.
+.IP "\fBminimal_backoff_time (300s)\fR"
+The minimal time between attempts to deliver a deferred message;
+prior to Postfix 2.4 the default value was 1000s.
.IP "\fBmaximal_backoff_time (4000s)\fR"
The maximal time between attempts to deliver a deferred message.
.IP "\fBmaximal_queue_lifetime (5d)\fR"
The maximal time a message is queued before it is sent back as
undeliverable.
-.IP "\fBqueue_run_delay (version dependent)\fR"
-The time between deferred queue scans by the queue manager.
+.IP "\fBqueue_run_delay (300s)\fR"
+The time between deferred queue scans by the queue manager;
+prior to Postfix 2.4 the default value was 1000s.
.IP "\fBtransport_retry_time (60s)\fR"
The time between attempts by the Postfix queue manager to contact
a malfunctioning message delivery transport.
s;\bqmgr_message_recip[-</bB>]*\n* *[<bB>]*ient_limit\b;<a href="postconf.5.html#qmgr_message_recipient_limit">$&</a>;g;
s;\bqmgr_message_recip[-</bB>]*\n* *[<bB>]*ient_minimum\b;<a href="postconf.5.html#qmgr_message_recipient_minimum">$&</a>;g;
s;\bqmqpd_authorized_clients\b;<a href="postconf.5.html#qmqpd_authorized_clients">$&</a>;g;
+
+ s;\bqmgr_negative_concurrency_feedback_hysteresis\b;<a href="postconf.5.html#qmgr_negative_concurrency_feedback_hysteresis">$&</a>;g;
+ s;\bqmgr_negative_concurrency_feedback_style\b;<a href="postconf.5.html#qmgr_negative_concurrency_feedback_style">$&</a>;g;
+ s;\bqmgr_positive_concurrency_feedback_hysteresis\b;<a href="postconf.5.html#qmgr_positive_concurrency_feedback_hysteresis">$&</a>;g;
+ s;\bqmgr_positive_concurrency_feedback_style\b;<a href="postconf.5.html#qmgr_positive_concurrency_feedback_style">$&</a>;g;
+ s;\bqmgr_sacrificial_cohorts\b;<a href="postconf.5.html#qmgr_sacrificial_cohorts">$&</a>;g;
+ s;\bqmgr_concurrency_feedback_debug\b;<a href="postconf.5.html#qmgr_concurrency_feedback_debug">$&</a>;g;
+
s;\bqmqpd_error_delay\b;<a href="postconf.5.html#qmqpd_error_delay">$&</a>;g;
s;\bqmqpd_timeout\b;<a href="postconf.5.html#qmqpd_timeout">$&</a>;g;
s;\bqueue_directory\b;<a href="postconf.5.html#queue_directory">$&</a>;g;
</p>
<p> This feature is available in Postfix 2.5 and later. </p>
+
+%PARAM qmgr_concurrency_feedback_debug no
+
+<p> Make the queue manager's feedback algorithm verbose for performance
+analysis purposes. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. </p>
+
+%PARAM qmgr_sacrificial_cohorts 1
+
+<p> How many pseudo-cohorts must suffer connection or handshake
+failure before a specific destination is considered unavailable
+(and further delivery is suspended). A pseudo-cohort is a number
+of deliveries equal to a destination's concurrency. The pseudo-cohort
+failure count is reset each time a delivery completes without
+connection or handshake failure for that specific destination. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions. </p>
+
+%PARAM qmgr_negative_concurrency_feedback_hysteresis 1
+
+<p> The per-destination integer amount of negative concurrency
+feedback that must accumulate between negative adjustments of a
+destination's delivery concurrency. The concurrency adjustment is
+equal in size to the negative hysteresis value, and is applied at
+the <b>beginning</b> of a cycle of (hysteresis / feedback) steps.
+At that same time, the destination's positive feedback hysteresis
+cycle is reset to its beginning. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions. </p>
+
+%PARAM qmgr_positive_concurrency_feedback_hysteresis 1
+
+<p> The per-destination integer amount of positive concurrency
+feedback that must accumulate before positive adjustments of a
+destination's delivery concurrency. The concurrency adjustment is
+equal in size to the positive hysteresis value, and is applied at
+the <b>end</b> of a cycle of (hysteresis / feedback) steps. At that
+same time, the destination's negative feedback hysteresis cycle is
+reset to its beginning. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions. </p>
+
+%PARAM qmgr_negative_concurrency_feedback_style fixed_1
+
+<p> The per-destination amount of negative delivery concurrency
+feedback, after a delivery completes with a connection or handshake
+failure. </p>
+
+<dl>
+
+<dt> <b> inverse_concurrency </b> </dt> <dd> Variable feedback of
+1 / (delivery concurrency). With this setting, and with
+"qmgr_negative_concurrency_feedback_hysteresis = 1", the destination's
+delivery concurrency is decremented by 1 after each failed
+pseudo-cohort, and the destination is marked dead (further delivery
+suspended) after the failed pseudo-cohort count reaches
+$qmgr_sacrificial_cohorts. </dd>
+
+<dt> <b> inverse_sqrt_concurrency </b> </dt> <dd> Variable feedback
+of 1 / (square root of delivery concurrency). This is an intermediate
+form between the other two. It lacks sound justification, and is a
+candidate for removal. </dd>
+
+<dt> <b> fixed_1 </b> </dt> <dd> Constant feedback of 1. This setting
+is compatible with Postfix versions before 2.5, where a destination's
+delivery concurrency is throttled down to zero (and further delivery
+suspended) after a single failed pseudo-cohort. </dd>
+
+</dl>
+
+<p> A pseudo-cohort is a number of deliveries equal to the destination's
+delivery concurrency. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions. </p>
+
+%PARAM qmgr_positive_concurrency_feedback_style fixed_1
+
+<p> The per-destination amount of positive delivery concurrency
+feedback, after a delivery completes without connection or handshake
+failure. </p>
+
+<dl>
+
+<dt> <b> inverse_concurrency </b> </dt> <dd> Variable feedback of
+1 / (delivery concurrency). With this setting, and with
+"qmgr_positive_concurrency_feedback_hysteresis = 1", the destination's
+delivery concurrency is incremented by 1 after each successful
+pseudo-cohort, until it reaches the per-destination maximal concurrency
+limit. </dd>
+
+<dt> <b> inverse_sqrt_concurrency </b> </dt> <dd> Variable feedback
+of 1 / (square root of delivery concurrency). This is an intermediate
+form between the other two. It lacks sound justification, and is a
+candidate for removal. </dd>
+
+<dt> <b> fixed_1 </b> </dt> <dd> Constant feedback of 1. This setting
+is compatible with Postfix versions before 2.5, where a destination's
+delivery concurrency is doubled after each successful pseudo-cohort,
+until it reaches the per-destination maximal concurrency limit.
+</dd>
+
+</dl>
+
+<p> A pseudo-cohort is a number of deliveries equal to the destination's
+delivery concurrency. </p>
+
+<p> This feature is temporarily available in Postfix 2.5; its final
+form is likely to change. The default setting is compatible with
+earlier Postfix versions. </p>
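
The difference between the fixed_1 and inverse_concurrency positive feedback styles documented above can be illustrated with a small simulation. This is a minimal sketch and not the code in qmgr/qmgr_queue.c: the names sketch_style, sketch_feedback and sketch_grow_one_cohort are hypothetical, the positive hysteresis is assumed to be 1, and the half-step margin is one assumed way to absorb floating-point rounding error.

```c
#include <assert.h>

/* Hypothetical names for illustration; these are not Postfix identifiers. */
enum sketch_style { SKETCH_FIXED_1, SKETCH_INVERSE_CONCURRENCY };

/* Positive feedback contributed by one successful delivery. */
static double sketch_feedback(enum sketch_style style, int window)
{
    return (style == SKETCH_FIXED_1 ? 1.0 : 1.0 / window);
}

/*
 * Run one successful pseudo-cohort (as many deliveries as the starting
 * concurrency) and return the resulting concurrency. Assumes a positive
 * hysteresis of 1; "limit" is the per-destination maximal concurrency.
 */
static int sketch_grow_one_cohort(enum sketch_style style, int window,
                                  int limit)
{
    const int hysteresis = 1;
    double  success = 0.0;              /* accumulated positive feedback */
    int     deliveries = window;

    assert(window > 0 && limit >= window);
    while (deliveries-- > 0 && window < limit) {
        double  feedback = sketch_feedback(style, window);

        success += feedback;
        /* The feedback/2 margin absorbs floating-point rounding error. */
        while (success + feedback / 2 >= hysteresis) {
            window += hysteresis;       /* adjust at the end of a cycle */
            success -= hysteresis;
        }
    }
    return (window);
}
```

Under these assumptions, a concurrency of 5 grows to 10 after one successful pseudo-cohort with fixed_1 (the 5-10-20 progression of the old scheduler), and only to 6 with inverse_concurrency.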
#define VAR_LMTP_BODY_CHKS "lmtp_body_checks"
#define DEF_LMTP_BODY_CHKS ""
+ /*
+ * Scheduler concurrency feedback algorithms.
+ */
+#define VAR_QMGR_POS_FDBACK "qmgr_positive_concurrency_feedback_style"
+#define DEF_QMGR_POS_FDBACK QMGR_FDBACK_NAME_FIXED_1
+extern char *var_qmgr_pos_feedback;
+
+#define VAR_QMGR_NEG_FDBACK "qmgr_negative_concurrency_feedback_style"
+#define DEF_QMGR_NEG_FDBACK QMGR_FDBACK_NAME_FIXED_1
+extern char *var_qmgr_neg_feedback;
+
+#define QMGR_FDBACK_NAME_FIXED_1 "fixed_1"
+#define QMGR_FDBACK_NAME_INVERSE_1 "inverse_1" /* deprecated */
+#define QMGR_FDBACK_NAME_INVERSE_WIN "inverse_concurrency"
+#define QMGR_FDBACK_NAME_INV_SQRT "inverse_sqrt" /* deprecated */
+#define QMGR_FDBACK_NAME_INV_SQRT_WIN "inverse_sqrt_concurrency"
+
+#define VAR_QMGR_POS_HYST "qmgr_positive_concurrency_feedback_hysteresis"
+#define DEF_QMGR_POS_HYST 1
+extern int var_qmgr_pos_hysteresis;
+
+#define VAR_QMGR_NEG_HYST "qmgr_negative_concurrency_feedback_hysteresis"
+#define DEF_QMGR_NEG_HYST 1
+extern int var_qmgr_neg_hysteresis;
+
+#define VAR_QMGR_SAC_COHORTS "qmgr_sacrificial_cohorts"
+#define DEF_QMGR_SAC_COHORTS 1
+extern int var_qmgr_sac_cohorts;
+
+#define VAR_QMGR_FDBACK_DEBUG "qmgr_concurrency_feedback_debug"
+#define DEF_QMGR_FDBACK_DEBUG 0
+extern bool var_qmgr_feedback_debug;
+
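
The negative hysteresis parameter defined above is documented as applying the concurrency adjustment at the beginning of a cycle of (hysteresis / feedback) steps, and as resetting the positive cycle at the same time. The following is a minimal sketch of that rule, not the queue manager's implementation: struct sketch_neg and sketch_negative_feedback are hypothetical names, the caller is assumed to supply the feedback amount for the configured style, and the countdown is only exact for feedback values that are exactly representable in binary floating point (such as 0.25); real code would need a rounding guard.

```c
#include <assert.h>

/* Hypothetical per-destination state; not the Postfix QMGR_QUEUE. */
struct sketch_neg {
    int     window;                     /* current delivery concurrency */
    double  success;                    /* position in the positive cycle */
    double  failure;                    /* feedback left in negative cycle */
};

/* Account for one delivery with connection or handshake failure. */
static void sketch_negative_feedback(struct sketch_neg *s, double feedback,
                                     int hysteresis)
{
    assert(feedback > 0 && hysteresis > 0);

    /* Beginning of a cycle: adjust first, then start counting down. */
    if (s->failure <= 0) {
        s->window -= hysteresis;
        if (s->window < 0)
            s->window = 0;
        s->failure += hysteresis;
        s->success = 0;                 /* restart the positive cycle */
    }
    s->failure -= feedback;
}
```

With a feedback of 0.25 and a hysteresis of 1, the first failure lowers the concurrency by 1, the next three failures leave it unchanged, and the fifth failure starts a new cycle and lowers it again.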
/* LICENSE
/* .ad
/* .fi
* Patches change both the patchlevel and the release date. Snapshots have no
* patchlevel; they change the release date only.
*/
-#define MAIL_RELEASE_DATE "20071111"
+#define MAIL_RELEASE_DATE "20071121"
#define MAIL_VERSION_NUMBER "2.5"
#ifdef SNAPSHOT
TESTPROG=
PROG = qmgr
INC_DIR = ../../include
-LIBS = ../../lib/libmaster.a ../../lib/libglobal.a ../../lib/libutil.a
+LIBS = ../../lib/libmaster.a ../../lib/libglobal.a ../../lib/libutil.a -lm
.c.o:; $(CC) $(CFLAGS) -c $*.c
qmgr_queue.o: ../../include/mail_params.h
qmgr_queue.o: ../../include/msg.h
qmgr_queue.o: ../../include/mymalloc.h
+qmgr_queue.o: ../../include/name_code.h
qmgr_queue.o: ../../include/recipient_list.h
qmgr_queue.o: ../../include/scan_dir.h
qmgr_queue.o: ../../include/sys_defs.h
/* destination.
/* .IP "\fItransport\fB_destination_concurrency_limit ($default_destination_concurrency_limit)\fR"
/* Idem, for delivery via the named message \fItransport\fR.
+/* .IP "\fBqmgr_concurrency_feedback_debug (no)\fR"
+/* Make the queue manager's feedback algorithm verbose for performance
+/* analysis purposes.
+/* .IP "\fBqmgr_negative_concurrency_feedback_hysteresis (1)\fR"
+/* The per-destination integer amount of negative concurrency
+/* feedback that must accumulate between negative adjustments of a
+/* destination's delivery concurrency.
+/* .IP "\fBqmgr_negative_concurrency_feedback_style (fixed_1)\fR"
+/* The per-destination amount of negative delivery concurrency
+/* feedback, after a delivery completes with a connection or handshake
+/* failure.
+/* .IP "\fBqmgr_positive_concurrency_feedback_hysteresis (1)\fR"
+/* The per-destination integer amount of positive concurrency
+/* feedback that must accumulate before positive adjustments of a
+/* destination's delivery concurrency.
+/* .IP "\fBqmgr_positive_concurrency_feedback_style (fixed_1)\fR"
+/* The per-destination amount of positive delivery concurrency
+/* feedback, after a delivery completes without connection or handshake
+/* failure.
+/* .IP "\fBqmgr_sacrificial_cohorts (1)\fR"
+/* How many pseudo-cohorts must suffer connection or handshake
+/* failure before a specific destination is considered unavailable
+/* (and further delivery is suspended).
/* RECIPIENT SCHEDULING CONTROLS
/* .ad
/* .fi
/* OTHER RESOURCE AND RATE CONTROLS
/* .ad
/* .fi
-/* .IP "\fBminimal_backoff_time (version dependent)\fR"
-/* The minimal time between attempts to deliver a deferred message.
+/* .IP "\fBminimal_backoff_time (300s)\fR"
+/* The minimal time between attempts to deliver a deferred message;
+/* prior to Postfix 2.4 the default value was 1000s.
/* .IP "\fBmaximal_backoff_time (4000s)\fR"
/* The maximal time between attempts to deliver a deferred message.
/* .IP "\fBmaximal_queue_lifetime (5d)\fR"
/* The maximal time a message is queued before it is sent back as
/* undeliverable.
-/* .IP "\fBqueue_run_delay (version dependent)\fR"
-/* The time between deferred queue scans by the queue manager.
+/* .IP "\fBqueue_run_delay (300s)\fR"
+/* The time between deferred queue scans by the queue manager;
+/* prior to Postfix 2.4 the default value was 1000s.
/* .IP "\fBtransport_retry_time (60s)\fR"
/* The time between attempts by the Postfix queue manager to contact
/* a malfunctioning message delivery transport.
int var_proc_limit;
bool var_verp_bounce_off;
int var_qmgr_clog_warn_time;
+char *var_qmgr_pos_feedback;
+char *var_qmgr_neg_feedback;
+int var_qmgr_pos_hysteresis;
+int var_qmgr_neg_hysteresis;
+int var_qmgr_sac_cohorts;
+bool var_qmgr_feedback_debug;
static QMGR_SCAN *qmgr_scans[2];
qmgr_scans[QMGR_SCAN_IDX_DEFERRED] = qmgr_scan_create(MAIL_QUEUE_DEFERRED);
qmgr_scan_request(qmgr_scans[QMGR_SCAN_IDX_INCOMING], QMGR_SCAN_START);
qmgr_deferred_run_event(0, (char *) 0);
+
+ /*
+ * Scheduler initialization.
+ */
+ qmgr_queue_feedback_init();
}
MAIL_VERSION_STAMP_DECLARE;
{
static CONFIG_STR_TABLE str_table[] = {
VAR_DEFER_XPORTS, DEF_DEFER_XPORTS, &var_defer_xports, 0, 0,
+ VAR_QMGR_POS_FDBACK, DEF_QMGR_POS_FDBACK, &var_qmgr_pos_feedback, 1, 0,
+ VAR_QMGR_NEG_FDBACK, DEF_QMGR_NEG_FDBACK, &var_qmgr_neg_feedback, 1, 0,
0,
};
static CONFIG_TIME_TABLE time_table[] = {
VAR_LOCAL_RCPT_LIMIT, DEF_LOCAL_RCPT_LIMIT, &var_local_rcpt_lim, 0, 0,
VAR_LOCAL_CON_LIMIT, DEF_LOCAL_CON_LIMIT, &var_local_con_lim, 0, 0,
VAR_PROC_LIMIT, DEF_PROC_LIMIT, &var_proc_limit, 1, 0,
+ VAR_QMGR_POS_HYST, DEF_QMGR_POS_HYST, &var_qmgr_pos_hysteresis, 1, 0,
+ VAR_QMGR_NEG_HYST, DEF_QMGR_NEG_HYST, &var_qmgr_neg_hysteresis, 1, 0,
+ VAR_QMGR_SAC_COHORTS, DEF_QMGR_SAC_COHORTS, &var_qmgr_sac_cohorts, 1, 0,
0,
};
static CONFIG_BOOL_TABLE bool_table[] = {
VAR_ALLOW_MIN_USER, DEF_ALLOW_MIN_USER, &var_allow_min_user,
VAR_VERP_BOUNCE_OFF, DEF_VERP_BOUNCE_OFF, &var_verp_bounce_off,
+ VAR_QMGR_FDBACK_DEBUG, DEF_QMGR_FDBACK_DEBUG, &var_qmgr_feedback_debug,
0,
};
int todo_refcount; /* queue entries (todo list) */
int busy_refcount; /* queue entries (busy list) */
int window; /* slow open algorithm */
+ double success; /* cumulative positive feedback */
+ double failure; /* cumulative negative feedback */
+ double fail_cohorts; /* pseudo-cohort failure count */
QMGR_TRANSPORT *transport; /* transport linkage */
QMGR_ENTRY_LIST todo; /* todo queue entries */
QMGR_ENTRY_LIST busy; /* messages on the wire */
extern void qmgr_queue_throttle(QMGR_QUEUE *, DSN *);
extern void qmgr_queue_unthrottle(QMGR_QUEUE *);
extern QMGR_QUEUE *qmgr_queue_find(QMGR_TRANSPORT *, const char *);
+extern void qmgr_queue_feedback_init(void);
#define QMGR_QUEUE_THROTTLED(q) ((q)->window <= 0)
if (VSTRING_LEN(dsb->reason) == 0)
vstring_strcpy(dsb->reason, "unknown error");
vstring_prepend(dsb->reason, SUSPENDED, sizeof(SUSPENDED) - 1);
- qmgr_queue_throttle(queue, DSN_FROM_DSN_BUF(dsb));
- if (queue->window == 0)
- qmgr_defer_todo(queue, &dsb->dsn);
+ if (queue->window > 0) {
+ qmgr_queue_throttle(queue, DSN_FROM_DSN_BUF(dsb));
+ if (queue->window == 0)
+ qmgr_defer_todo(queue, &dsb->dsn);
+ }
}
}
/* transport. A null result means that the queue was not found.
/*
/* qmgr_queue_throttle() handles a delivery error, and decrements the
-/* concurrency limit for the destination. When the concurrency limit
-/* for a destination becomes zero, qmgr_queue_throttle() starts a timer
+/* concurrency limit for the destination, with a lower bound of 1.
+/* When the cohort failure bound is reached, qmgr_queue_throttle()
+/* sets the concurrency limit to zero and starts a timer
/* to re-enable delivery to the destination after a configurable delay.
/*
/* qmgr_queue_unthrottle() undoes qmgr_queue_throttle()'s effects.
/* P.O. Box 704
/* Yorktown Heights, NY 10598, USA
/*
-/* Scheduler enhancements:
+/* Pre-emptive scheduler enhancements:
/* Patrik Rak
/* Modra 6
/* 155 00, Prague, Czech Republic
#include <sys_defs.h>
#include <time.h>
+#include <math.h>
/* Utility library. */
#include <mymalloc.h>
#include <events.h>
#include <htable.h>
+#include <name_code.h>
/* Global library. */
#include <mail_params.h>
#include <recipient_list.h>
+#include <mail_proto.h> /* QMGR_LOG_WINDOW */
/* Application-specific. */
int qmgr_queue_count;
+ /*
+ * Lookup tables for main.cf feedback method names.
+ */
+#define QMGR_FDBACK_CODE_BAD 0
+#define QMGR_FDBACK_CODE_FIXED_1 1
+#define QMGR_FDBACK_CODE_INVERSE_WIN 2
+#define QMGR_FDBACK_CODE_INVERSE_1 QMGR_FDBACK_CODE_INVERSE_WIN
+#define QMGR_FDBACK_CODE_INV_SQRT_WIN 3
+#define QMGR_FDBACK_CODE_INV_SQRT QMGR_FDBACK_CODE_INV_SQRT_WIN
+
+NAME_CODE qmgr_feedback_map[] = {
+ QMGR_FDBACK_NAME_FIXED_1, QMGR_FDBACK_CODE_FIXED_1,
+ QMGR_FDBACK_NAME_INVERSE_WIN, QMGR_FDBACK_CODE_INVERSE_WIN,
+ QMGR_FDBACK_NAME_INVERSE_1, QMGR_FDBACK_CODE_INVERSE_1,
+ QMGR_FDBACK_NAME_INV_SQRT_WIN, QMGR_FDBACK_CODE_INV_SQRT_WIN,
+ QMGR_FDBACK_NAME_INV_SQRT, QMGR_FDBACK_CODE_INV_SQRT,
+ 0, QMGR_FDBACK_CODE_BAD,
+};
+static int qmgr_pos_feedback_idx;
+static int qmgr_neg_feedback_idx;
+
+ /*
+ * Choosing the right feedback method at run-time.
+ */
+#define QMGR_FEEDBACK_VAL(idx, window) ( \
+ (idx) == QMGR_FDBACK_CODE_INVERSE_1 ? (1.0 / (window)) : \
+ (idx) == QMGR_FDBACK_CODE_FIXED_1 ? (1.0) : \
+ (1.0 / sqrt(window)) \
+ )
+
+#define QMGR_ERROR_OR_RETRY_QUEUE(queue) \
+ (strcmp(queue->transport->name, MAIL_SERVICE_RETRY) == 0 \
+ || strcmp(queue->transport->name, MAIL_SERVICE_ERROR) == 0)
+
+#define QMGR_LOG_FEEDBACK(feedback) \
+ if (var_qmgr_feedback_debug && !QMGR_ERROR_OR_RETRY_QUEUE(queue)) \
+ msg_info("%s: feedback %g", myname, feedback);
+
+#define QMGR_LOG_WINDOW(queue) \
+ if (var_qmgr_feedback_debug && !QMGR_ERROR_OR_RETRY_QUEUE(queue)) \
+ msg_info("%s: queue %s: limit %d window %d success %g failure %g fail_cohorts %g", \
+ myname, queue->name, queue->transport->dest_concurrency_limit, \
+ queue->window, queue->success, queue->failure, queue->fail_cohorts);
+
+/* qmgr_queue_feedback_init - initialize feedback selection */
+
+void qmgr_queue_feedback_init(void)
+{
+
+ /*
+ * Positive and negative feedback method indices.
+ */
+ qmgr_pos_feedback_idx = name_code(qmgr_feedback_map, NAME_CODE_FLAG_NONE,
+ var_qmgr_pos_feedback);
+ if (qmgr_pos_feedback_idx == QMGR_FDBACK_CODE_BAD)
+ msg_fatal("%s: bad feedback method: %s",
+ VAR_QMGR_POS_FDBACK, var_qmgr_pos_feedback);
+ if (var_qmgr_feedback_debug)
+ msg_info("positive feedback method %d, value at %d: %g",
+ qmgr_pos_feedback_idx, var_init_dest_concurrency,
+ QMGR_FEEDBACK_VAL(qmgr_pos_feedback_idx,
+ var_init_dest_concurrency));
+
+ qmgr_neg_feedback_idx = name_code(qmgr_feedback_map, NAME_CODE_FLAG_NONE,
+ var_qmgr_neg_feedback);
+ if (qmgr_neg_feedback_idx == QMGR_FDBACK_CODE_BAD)
+ msg_fatal("%s: bad feedback method: %s",
+ VAR_QMGR_NEG_FDBACK, var_qmgr_neg_feedback);
+ if (var_qmgr_feedback_debug)
+ msg_info("negative feedback method %d, value at %d: %g",
+ qmgr_neg_feedback_idx, var_init_dest_concurrency,
+ QMGR_FEEDBACK_VAL(qmgr_neg_feedback_idx,
+ var_init_dest_concurrency));
+}
+
/* qmgr_queue_unthrottle_wrapper - in case (char *) != (struct *) */
static void qmgr_queue_unthrottle_wrapper(int unused_event, char *context)
{
const char *myname = "qmgr_queue_unthrottle";
QMGR_TRANSPORT *transport = queue->transport;
+ double feedback;
+ double multiplier;
if (msg_verbose)
msg_info("%s: queue %s", myname, queue->name);
+ /*
+ * Don't restart the negative feedback hysteresis cycle with every
+ * positive feedback. Restart it only when we make a positive concurrency
+ * adjustment (i.e. at the end of a positive feedback hysteresis cycle).
+ * Otherwise negative feedback would be too aggressive: negative feedback
+ * takes effect immediately at the start of its hysteresis cycle.
+ */
+ queue->fail_cohorts = 0;
+
/*
* Special case when this site was dead.
*/
msg_panic("%s: queue %s: window 0 status 0", myname, queue->name);
dsn_free(queue->dsn);
queue->dsn = 0;
- queue->window = transport->init_dest_concurrency;
+ /* Back from the almost grave, best concurrency is anyone's guess. */
+ if (queue->busy_refcount > 0)
+ queue->window = queue->busy_refcount;
+ else
+ queue->window = transport->init_dest_concurrency;
+ queue->success = queue->failure = 0;
+ QMGR_LOG_WINDOW(queue);
return;
}
* Increase the destination's concurrency limit until we reach the
* transport's concurrency limit. Allow for a margin the size of the
* initial destination concurrency, so that we're not too gentle.
+ *
+ * Why is the concurrency increment based on preferred concurrency and not
+ * on the number of outstanding delivery requests? The latter fluctuates
+ * wildly when deliveries complete in bursts (artificial benchmark
+ * measurements), and does not account for cached connections.
+ *
+ * Keep the window within reasonable distance from actual concurrency,
+ * otherwise negative feedback will be ineffective. This expression
+ * assumes that busy_refcount changes gradually. This is invalid when
+ * deliveries complete in bursts (artificial benchmark measurements).
*/
if (transport->dest_concurrency_limit == 0
|| transport->dest_concurrency_limit > queue->window)
- if (queue->window < queue->busy_refcount + transport->init_dest_concurrency)
- queue->window++;
+ if (queue->window < queue->busy_refcount + transport->init_dest_concurrency) {
+ feedback = QMGR_FEEDBACK_VAL(qmgr_pos_feedback_idx, queue->window);
+ QMGR_LOG_FEEDBACK(feedback);
+ queue->success += feedback;
+ /* Prepare for overshoot (feedback > hysteresis, rounding error). */
+ while (queue->success >= var_qmgr_pos_hysteresis) {
+ queue->window += var_qmgr_pos_hysteresis;
+ queue->success -= var_qmgr_pos_hysteresis;
+ queue->failure = 0;
+ }
+ /* Prepare for overshoot. */
+ if (transport->dest_concurrency_limit > 0
+ && queue->window > transport->dest_concurrency_limit)
+ queue->window = transport->dest_concurrency_limit;
+ }
+ QMGR_LOG_WINDOW(queue);
}
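
Illustration only, not part of the patch: a minimal standalone sketch of the positive feedback path with hysteresis, as described in the comments of qmgr_queue_unthrottle() above. The struct and function names are hypothetical, the feedback style is assumed to be inverse-window (1/window), and the busy_refcount margin check of the real code is left out.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the QMGR_QUEUE fields used here. */
struct sketch_pos_queue {
    int     window;			/* current concurrency limit */
    double  success;			/* cumulative positive feedback */
    double  failure;			/* cumulative negative feedback */
};

/*
 * Account for one delivery that completed without connection or handshake
 * failure. Assumes inverse-window feedback; limit == 0 means "no upper
 * bound" for the destination concurrency.
 */
static void sketch_positive_feedback(struct sketch_pos_queue *q,
				             int hysteresis, int limit)
{
    assert(q->window > 0 && hysteresis > 0);

    /* Nothing to do once the window has reached the configured limit. */
    if (limit != 0 && q->window >= limit)
	return;

    q->success += 1.0 / q->window;

    /* Adjust only after a whole hysteresis unit of feedback has accrued. */
    while (q->success >= hysteresis) {
	q->window += hysteresis;
	q->success -= hysteresis;
	q->failure = 0;			/* restart the negative cycle */
    }

    /* Clamp in case the adjustment overshot the limit. */
    if (limit > 0 && q->window > limit)
	q->window = limit;
}
```

With a window of 4 and a hysteresis of 1, the first three successful deliveries only accumulate feedback; under these assumptions the fourth one raises the window to 5 and clears the accumulated negative feedback.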
/* qmgr_queue_throttle - handle destination delivery failure */
void qmgr_queue_throttle(QMGR_QUEUE *queue, DSN *dsn)
{
const char *myname = "qmgr_queue_throttle";
+ double feedback;
/*
* Sanity checks.
myname, queue->name, dsn->status, dsn->reason);
/*
- * Decrease the destination's concurrency limit until we reach zero, at
- * which point the destination is declared dead. Decrease the concurrency
- * limit by one, instead of using actual concurrency - 1, to avoid
- * declaring a host dead after just one single delivery failure.
+ * Don't restart the positive feedback hysteresis cycle with every
+ * negative feedback. Restart it only when we make a negative concurrency
+ * adjustment (i.e. at the start of a negative feedback hysteresis
+ * cycle). Otherwise positive feedback would be too weak (positive
+ * feedback does not take effect until the end of its hysteresis cycle).
*/
- if (queue->window > 0)
- queue->window--;
+
+ /*
+ * This queue is declared dead after a configurable number of
+ * pseudo-cohort failures.
+ */
+ if (queue->window > 0) {
+ queue->fail_cohorts += 1.0 / queue->window;
+ if (queue->fail_cohorts >= var_qmgr_sac_cohorts)
+ queue->window = 0;
+ }
+
+ /*
+ * Decrease the destination's concurrency limit until we reach 1. Base
+ * adjustments on the concurrency limit itself, instead of using the
+ * actual concurrency. The latter fluctuates wildly when deliveries
+ * complete in bursts (artificial benchmark measurements).
+ */
+ if (queue->window > 1) {
+ feedback = QMGR_FEEDBACK_VAL(qmgr_neg_feedback_idx, queue->window);
+ QMGR_LOG_FEEDBACK(feedback);
+ queue->failure -= feedback;
+ /* Prepare for overshoot (feedback > hysteresis, rounding error). */
+ while (queue->failure < 0) {
+ queue->window -= var_qmgr_neg_hysteresis;
+ queue->success = 0;
+ queue->failure += var_qmgr_neg_hysteresis;
+ }
+ /* Prepare for overshoot. */
+ if (queue->window < 1)
+ queue->window = 1;
+ }
/*
* Special case for a site that just was declared dead.
(char *) queue, var_min_backoff_time);
queue->dflags = 0;
}
+ QMGR_LOG_WINDOW(queue);
}
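
Illustration only, not part of the patch: a minimal standalone sketch of how qmgr_queue_throttle() above combines pseudo-cohort accounting (dead destination detection) with negative concurrency feedback. The struct and function names are hypothetical, the feedback style is assumed to be inverse-window (1/window), and the timer that later re-enables delivery is left out.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the QMGR_QUEUE fields used here. */
struct sketch_neg_queue {
    int     window;			/* current concurrency limit */
    double  success;			/* cumulative positive feedback */
    double  failure;			/* cumulative negative feedback */
    double  fail_cohorts;		/* pseudo-cohort failure count */
};

/*
 * Account for one connection or handshake failure. Returns 1 when the
 * destination is declared dead (window set to 0), otherwise 0.
 */
static int sketch_negative_feedback(struct sketch_neg_queue *q,
				            int hysteresis, int sac_cohorts)
{
    assert(q->window > 0 && hysteresis > 0);

    /* One failure counts as 1/window of a pseudo-cohort. */
    q->fail_cohorts += 1.0 / q->window;
    if (q->fail_cohorts >= sac_cohorts) {
	q->window = 0;
	return 1;
    }

    /* Lower the window, but never below 1; only cohorts can reach 0. */
    if (q->window > 1) {
	q->failure -= 1.0 / q->window;
	/* The decrement takes effect at the start of the cycle. */
	while (q->failure < 0) {
	    q->window -= hysteresis;
	    q->success = 0;		/* restart the positive cycle */
	    q->failure += hysteresis;
	}
	if (q->window < 1)
	    q->window = 1;
    }
    return 0;
}
```

Starting from a window of 4 with a hysteresis of 1 and one sacrificial cohort, the first failure immediately lowers the window to 3, the next two failures only accumulate, and under these assumptions the fourth failure pushes fail_cohorts past 1 and the destination is declared dead.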
/* qmgr_queue_done - delete in-core queue for site */
queue->busy_refcount = 0;
queue->transport = transport;
queue->window = transport->init_dest_concurrency;
+ queue->success = queue->failure = queue->fail_cohorts = 0;
QMGR_LIST_INIT(queue->todo);
QMGR_LIST_INIT(queue->busy);
queue->dsn = 0;