mod_firehose: Add a new debugging module able to record traffic

author Graham Leggett <minfrin@apache.org>

Sat, 17 Dec 2011 17:03:59 +0000 (17:03 +0000)

committer Graham Leggett <minfrin@apache.org>

Sat, 17 Dec 2011 17:03:59 +0000 (17:03 +0000)
author Graham Leggett <minfrin@apache.org>
Sat, 17 Dec 2011 17:03:59 +0000 (17:03 +0000)
committer Graham Leggett <minfrin@apache.org>
Sat, 17 Dec 2011 17:03:59 +0000 (17:03 +0000)
diff --git a/CHANGES b/CHANGES

index 77e83895eab37074297a952118ce41f6f4fd890b..12f0e895c7da3abd7b660716bfe836f799377682 100644 (file)
--- a/CHANGES
+++ b/CHANGES
@@ -1,6 +1,10 @@
                                                           -*- coding: utf-8 -*-
  Changes with Apache 2.5.0
  
+  *) mod_firehose: Add a new debugging module able to record traffic
+     passing through the server in such a way that connections and/or
+     requests be reconstructed and replayed. [Graham Leggett]
+
    *) ap_pass_brigade_fchk() added. [Jim Jagielski]
  
    *) error log hook: Pass ap_errorlog_info struct.  [Stefan Fritsch]
diff --git a/docs/manual/mod/allmodules.xml b/docs/manual/mod/allmodules.xml

index 2e0706ca8049deab1da7c46e0eab1439accf7ccc..a26ed7b80e700ab6e7591b5a8eb26ca024585982 100644 (file)
--- a/docs/manual/mod/allmodules.xml
+++ b/docs/manual/mod/allmodules.xml
@@ -49,6 +49,7 @@
    <modulefile>mod_ext_filter.xml</modulefile>
    <modulefile>mod_file_cache.xml</modulefile>
    <modulefile>mod_filter.xml</modulefile>
+  <modulefile>mod_firehose.xml</modulefile>
    <modulefile>mod_headers.xml</modulefile>
    <modulefile>mod_heartbeat.xml</modulefile>
    <modulefile>mod_heartmonitor.xml</modulefile>
diff --git a/docs/manual/mod/mod_firehose.xml b/docs/manual/mod/mod_firehose.xml

new file mode 100644 (file)

index 0000000..df70655
--- /dev/null
+++ b/docs/manual/mod/mod_firehose.xml
@@ -0,0 +1,268 @@
+<?xml version="1.0"?>
+<!DOCTYPE modulesynopsis SYSTEM "../style/modulesynopsis.dtd">
+<?xml-stylesheet type="text/xsl" href="../style/manual.en.xsl"?>
+<!-- $LastChangedRevision: 1174747 $ -->
+
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements.  See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License.  You may obtain a copy of the License at
+
+     http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+<modulesynopsis metafile="mod_firehose.xml.meta">
+
+<name>mod_firehose</name>
+<description>Multiplexes all I/O to a given file or pipe.</description>
+<status>Extension</status>
+<sourcefile>mod_firehose.c</sourcefile>
+<identifier>firehose_module</identifier>
+
+<summary>
+    <p><code>mod_firehose</code> provides a mechanism to record data
+    being passed between the httpd server and the client at the raw
+    connection level to either a file or a pipe in such a way that the
+    data can be analysed or played back to the server at a future date.
+    It can be thought of as "tcpdump for httpd".</p>
+
+    <p>Connections are recorded after the SSL has been stripped, and can
+    be used for forensic debugging.</p>
+
+    <p>The <program>firehose</program> tool can be used to demultiplex
+    the recorded stream back into individual files for analysis, or
+    playback using a tool like <code>netcat</code>.</p>
+
+    <note><title>WARNING</title>This module IGNORES all request level
+    mechanisms to keep data private. It is the responsibility of the
+    administrator to ensure that private data is not inadvertently
+    exposed using this module.
+    </note>
+
+</summary>
+
+<section id="enable">
+    <title>Enabling a Firehose</title>
+
+    <p>To enable the module, it should be compiled and loaded
+    in to your running Apache configuration, and the directives below
+    used to record the data you are interested in.</p>
+    
+    <p>It is possible to record both incoming and outgoing data to
+    the same filename if desired, as the direction of flow is recorded
+    within each fragment.</p>
+
+    <p>It is possible to write to both normal files and fifos (pipes).
+    In the case of fifos, mod_firehose ensures that the packet size is
+    no larger than PIPE_BUF to ensure writes are atomic.</p>
+
+    <p>If a pipe is being used, something must be reading from the pipe
+    before httpd is started for the pipe to be successfully opened for
+    write. If the request to open the pipe fails, mod_firehose will
+    silently stand down and not record anything, and the server will
+    keep running as normal.</p>
+
+    <p>All file writes are non blocking, and buffer overflows will cause
+    debugging data to be lost. The module makes the call that the
+    reliable running of the server takes precedence over the recording
+    of firehose data.</p>
+
+</section>
+
+<section id="format">
+    <title>Stream Format</title>
+
+    <p>The server typically serves multiple connections simultaneously,
+    and as a result requests and responses need to be multiplexed before
+    being written to the firehose.</p>
+    
+    <p>The fragment format is designed as clear text, so that a firehose
+    can be opened with and inspected by a normal text editor.
+    Alternatively, the <program>firehose</program> tool can be used to
+    demultiplex the firehose back into individual requests or
+    connections.</p>
+
+    <p>The size of the multiplexed fragments is governed by PIPE_BUF,
+    the maximum size of write the system is prepared to perform
+    atomically. By keeping the multiplexed fragments below PIPE_BUF in
+    size, the module guarantees that data from different fragments does
+    not interleave. The size of PIPE_BUF varies on different operating
+    systems.</p>
+
+    <p>The BNF for the fragment format is as follows:</p>
+
+    <pre>
+ stream = 0*(fragment)
+
+ fragment = header CRLF body CRLF
+
+ header = length SPC timestamp SPC ( request | response ) SPC uuid SPC count
+
+ length = &lt;up to 16 byte hex fragment length>
+ timestamp = &lt;up to 16 byte hex timestamp microseconds since 1970>
+ request = "&lt;"
+ response = ">"
+ uuid = &lt;formatted uuid of the connection>
+ count = &lt;hex fragment number in the connection>
+
+ body = &lt;the binary content of the fragment>
+
+ SPC = &lt;a single space>
+ CRLF = &lt;a carriage return, followed by a line feed>
+    </pre>
+
+    <p>All fragments for a connection or a request will share the same
+    UUID, depending on whether connections or requests are being recorded.
+    If connections are being recorded, multiple requests may appear within
+    a connection. A fragment with a zero length indicates the end of the
+    connection.</p>
+
+    <p>Fragments may go missing or be dropped if the process reading the
+    fragments is too slow. If this happens, gaps will exist in the
+    connection counter numbering. A warning will be logged in the error
+    log to indicate the UUID and counter of the dropped fragment, so it
+    can be confirmed the fragment was dropped.</p>
+
+    <p>It is possible that the terminating empty fragment may not appear,
+    caused by the httpd process crashing, or being terminated ungracefully.
+    The terminating fragment may be dropped if the process reading the
+    fragments is not fast enough.</p>
+
+</section>
+
+<directivesynopsis>
+
+<name>FirehoseConnectionInput</name>
+<description>Capture traffic coming into the server on each connection</description>
+<syntax>FirehoseConnectionInput <var>filename</var></syntax>
+<default>none</default>
+<contextlist><context>server config</context></contextlist>
+<compatibility>FirehoseConnectionInput is only available in Apache 2.5.0 and
+later.</compatibility>
+
+<usage>
+    <p>Capture traffic coming into the server on each connection. Multiple
+    requests will be captured within the same connection if keepalive is
+    present.</p>
+
+    <example><title>Example</title>
+      FirehoseConnectionInput connection-input.firehose
+    </example>
+</usage>
+
+</directivesynopsis>
+
+<directivesynopsis>
+
+<name>FirehoseConnectionOutput</name>
+<description>Capture traffic going out of the server on each connection</description>
+<syntax>FirehoseConnectionOutput <var>filename</var></syntax>
+<default>none</default>
+<contextlist><context>server config</context></contextlist>
+<compatibility>FirehoseConnectionOutput is only available in Apache 2.5.0 and
+later.</compatibility>
+
+<usage>
+    <p>Capture traffic going out of the server on each connection.
+    Multiple requests will be captured within the same connection if
+    keepalive is present.</p>
+
+    <example><title>Example</title>
+      FirehoseConnectionOutput connection-output.firehose
+    </example>
+</usage>
+
+</directivesynopsis>
+
+<directivesynopsis>
+
+<name>FirehoseRequestInput</name>
+<description>Capture traffic coming into the server on each request</description>
+<syntax>FirehoseRequestInput <var>filename</var></syntax>
+<default>none</default>
+<contextlist><context>server config</context></contextlist>
+<compatibility>FirehoseRequestInput is only available in Apache 2.5.0 and
+later.</compatibility>
+
+<usage>
+    <p>Capture traffic coming into the server on each request. Requests
+    will be captured separately, regardless of the presence of keepalive.</p>
+
+    <example><title>Example</title>
+      FirehoseRequestInput request-input.firehose
+    </example>
+</usage>
+
+</directivesynopsis>
+
+<directivesynopsis>
+
+<name>FirehoseRequestOutput</name>
+<description>Capture traffic going out of the server on each request</description>
+<syntax>FirehoseRequestOutput <var>filename</var></syntax>
+<default>none</default>
+<contextlist><context>server config</context></contextlist>
+<compatibility>FirehoseRequestOutput is only available in Apache 2.5.0 and
+later.</compatibility>
+
+<usage>
+    <p>Capture traffic going out of the server on each request. Requests
+    will be captured separately, regardless of the presence of keepalive.</p>
+
+    <example><title>Example</title>
+      FirehoseRequestOutput request-output.firehose
+    </example>
+</usage>
+
+</directivesynopsis>
+
+<directivesynopsis>
+
+<name>FirehoseProxyConnectionInput</name>
+<description>Capture traffic coming into the back of mod_proxy</description>
+<syntax>FirehoseProxyConnectionInput <var>filename</var></syntax>
+<default>none</default>
+<contextlist><context>server config</context></contextlist>
+<compatibility>FirehoseProxyConnectionInput is only available in Apache 2.5.0 and
+later.</compatibility>
+
+<usage>
+    <p>Capture traffic being received by mod_proxy.</p>
+
+    <example><title>Example</title>
+      FirehoseProxyConnectionInput proxy-input.firehose
+    </example>
+</usage>
+
+</directivesynopsis>
+
+<directivesynopsis>
+
+<name>FirehoseProxyConnectionOutput</name>
+<description>Capture traffic sent out from the back of mod_proxy</description>
+<syntax>FirehoseProxyConnectionOutput <var>filename</var></syntax>
+<default>none</default>
+<contextlist><context>server config</context></contextlist>
+<compatibility>FirehoseProxyConnectionOutput is only available in Apache 2.5.0 and
+later.</compatibility>
+
+<usage>
+    <p>Capture traffic being sent out by mod_proxy.</p>
+
+    <example><title>Example</title>
+      FirehoseProxyConnectionOutput proxy-output.firehose
+    </example>
+</usage>
+
+</directivesynopsis>
+
+</modulesynopsis>
diff --git a/docs/manual/programs/index.xml b/docs/manual/programs/index.xml

index edd42b13e1f46e9d7b426ca91f1fa60fb3d95f1e..1e518e51f61c10fe61b9af9a4b22edc1722b29fd 100644 (file)
--- a/docs/manual/programs/index.xml
+++ b/docs/manual/programs/index.xml
@@ -62,6 +62,10 @@
  
        <dd>Start a FastCGI program</dd>
  
+      <dt><program>firehose</program></dt>
+
+      <dd>Demultiplex a firehose from <module>mod_firehose</module></dd>
+
        <dt><program>htcacheclean</program></dt>
  
        <dd>Clean up the disk cache</dd>
diff --git a/modules/debugging/config.m4 b/modules/debugging/config.m4

index 28441e1cf58cab4c616fc3f41acb78ed4981b9f2..c0d7f53a9f9cc899f4cc902a93be1baaa875dd53 100644 (file)
--- a/modules/debugging/config.m4
+++ b/modules/debugging/config.m4
@@ -3,5 +3,6 @@ APACHE_MODPATH_INIT(debugging)
  
  APACHE_MODULE(bucketeer, buckets manipulation filter.  Useful only for developers and testing purposes., , , no)
  APACHE_MODULE(dumpio, I/O dump filter, , , most)
+APACHE_MODULE(firehose, Firehose dump filter, , , most)
  
  APACHE_MODPATH_FINISH
diff --git a/modules/debugging/mod_firehose.c b/modules/debugging/mod_firehose.c

new file mode 100644 (file)

index 0000000..da7a28e
--- /dev/null
+++ b/modules/debugging/mod_firehose.c
@@ -0,0 +1,709 @@
+/* Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+/*
+ * Originally written @ Covalent by Jim Jagielski
+ * Modified to support writing to non blocking pipes @ BBC by Graham Leggett
+ * Modifications (C) 2011 British Broadcasting Corporation
+ */
+
+/*
+ * mod_firehose.c:
+ *  A request and response sniffer for Apache v2.x. It logs
+ *  all filter data right before and after it goes out on the
+ *  wire (BUT right before SSL encoded or after SSL decoded).
+ *  It can produce a *huge* amount of data.
+ */
+
+#include "httpd.h"
+#include "http_connection.h"
+#include "http_config.h"
+#include "http_core.h"
+#include "http_log.h"
+#include "http_request.h"
+#include "util_ebcdic.h"
+#include "apr_strings.h"
+#include "apr_portable.h"
+#include "apr_uuid.h"
+#include "mod_proxy.h"
+
+#ifdef APR_HAVE_SYS_SYSLIMITS_H
+#include <sys/syslimits.h>
+#endif
+#ifdef APR_HAVE_LINUX_LIMITS_H
+#include <linux/limits.h>
+#endif
+#if APR_HAVE_FCNTL_H
+#include <fcntl.h>
+#endif
+#if APR_HAVE_UNISTD_H
+#include <unistd.h>
+#endif
+
+module AP_MODULE_DECLARE_DATA firehose_module;
+
+typedef enum proxy_enum
+{
+    FIREHOSE_PROXY, FIREHOSE_NORMAL
+} proxy_enum;
+
+typedef enum request_enum
+{
+    FIREHOSE_CONNECTION, FIREHOSE_REQUEST
+} request_enum;
+
+typedef enum direction_enum
+{
+    FIREHOSE_IN = '<', FIREHOSE_OUT = '>'
+} direction_enum;
+
+typedef struct firehose_conn_t
+{
+    const char *filename;
+    apr_file_t *file;
+    proxy_enum proxy;
+    direction_enum direction;
+    request_enum request;
+    int suppress;
+} firehose_conn_t;
+
+typedef struct firehose_conf_t
+{
+    apr_array_header_t *firehoses;
+} firehose_conf_t;
+
+typedef struct firehose_ctx_t
+{
+    firehose_conf_t *conf;
+    firehose_conn_t *conn;
+    apr_bucket_brigade *bb;
+    apr_bucket_brigade *tmp;
+    char uuid[APR_UUID_FORMATTED_LENGTH + 1];
+    apr_uint64_t count;
+    int direction;
+    conn_rec *c;
+    request_rec *r;
+    ap_filter_t *f;
+} firehose_ctx_t;
+
+#define HEADER_LEN (sizeof(apr_uint64_t)*6 + APR_UUID_FORMATTED_LENGTH + 7)
+#define BODY_LEN (PIPE_BUF - HEADER_LEN - 2)
+#define HEADER_FMT "%" APR_UINT64_T_HEX_FMT " %" APR_UINT64_T_HEX_FMT " %c %s %" APR_UINT64_T_HEX_FMT CRLF
+
+apr_status_t logs_cleanup(void *dummy)
+{
+    apr_file_t *file = (apr_file_t *) dummy;
+    apr_file_close(file);
+    return APR_SUCCESS;
+}
+
+apr_status_t filter_output_cleanup(void *dummy)
+{
+    ap_filter_t *f = (ap_filter_t *) dummy;
+    ap_remove_output_filter(f);
+    return APR_SUCCESS;
+}
+
+apr_status_t filter_input_cleanup(void *dummy)
+{
+    ap_filter_t *f = (ap_filter_t *) dummy;
+    ap_remove_input_filter(f);
+    return APR_SUCCESS;
+}
+
+/**
+ * Add the terminating empty fragment to indicate end-of-connection.
+ */
+apr_status_t pumpit_cleanup(void *dummy)
+{
+    firehose_ctx_t *ctx = (firehose_ctx_t *) dummy;
+    apr_status_t rv;
+    apr_size_t hdr_len;
+    char header[HEADER_LEN + 1];
+    apr_size_t bytes;
+
+    if (!ctx->count) {
+        return APR_SUCCESS;
+    }
+
+    hdr_len = apr_snprintf(header, sizeof(header), HEADER_FMT,
+            (apr_uint64_t) 0, (apr_uint64_t) apr_time_now(), ctx->direction,
+            ctx->uuid, ctx->count);
+    ap_xlate_proto_to_ascii(header, hdr_len);
+
+    rv = apr_file_write_full(ctx->conn->file, header, hdr_len, &bytes);
+    if (APR_SUCCESS != rv) {
+        if (ctx->conn->suppress) {
+            /* ignore the error */
+        }
+        else if (ctx->r) {
+            ap_log_rerror(
+                    APLOG_MARK,
+                    APLOG_WARNING,
+                    rv,
+                    ctx->r,
+                    "mod_firehose: could not write %" APR_UINT64_T_FMT " bytes to '%s' for '%c' connection '%s' and count '%0" APR_UINT64_T_HEX_FMT "', bytes dropped (further errors will be suppressed)",
+                    (apr_uint64_t)(hdr_len), ctx->conn->filename, ctx->conn->direction, ctx->uuid, ctx->count);
+        }
+        else {
+            ap_log_cerror(
+                    APLOG_MARK,
+                    APLOG_WARNING,
+                    rv,
+                    ctx->c,
+                    "mod_firehose: could not write %" APR_UINT64_T_FMT " bytes to '%s' for '%c' connection '%s' and count '%0" APR_UINT64_T_HEX_FMT "', bytes dropped (further errors will be suppressed)",
+                    (apr_uint64_t)(hdr_len), ctx->conn->filename, ctx->conn->direction, ctx->uuid, ctx->count);
+        }
+        ctx->conn->suppress = 1;
+    }
+    else {
+        ctx->conn->suppress = 0;
+    }
+
+    ctx->count = 0;
+
+    return APR_SUCCESS;
+}
+
+/*
+ * Pump the bucket contents to the pipe.
+ *
+ * Writes smaller than PIPE_BUF are guaranteed to be atomic when written to
+ * pipes. As a result, we break the buckets into packets smaller than PIPE_BUF and
+ * send each one in turn.
+ *
+ * Each packet is marked with the UUID of the connection so that the process that
+ * reassembles the packets can put the right packets in the right order.
+ *
+ * Each packet is numbered with an incrementing counter. If a packet cannot be
+ * written we drop the packet on the floor, and the counter will enable dropped
+ * packets to be detected.
+ */
+static apr_status_t pumpit(ap_filter_t *f, apr_bucket *b, firehose_ctx_t *ctx)
+{
+    apr_status_t rv = APR_SUCCESS;
+
+    if (!(APR_BUCKET_IS_METADATA(b))) {
+        const char *buf;
+        apr_size_t nbytes, offset = 0;
+
+        rv = apr_bucket_read(b, &buf, &nbytes, APR_BLOCK_READ);
+
+        if (rv == APR_SUCCESS) {
+            while (nbytes > 0) {
+                char header[HEADER_LEN + 1];
+                apr_size_t hdr_len;
+                apr_size_t body_len = nbytes < BODY_LEN ? nbytes : BODY_LEN;
+                apr_size_t bytes;
+                struct iovec vec[3];
+
+                /*
+                 * Insert the chunk header, specifying the number of bytes in
+                 * the chunk.
+                 */
+                hdr_len = apr_snprintf(header, sizeof(header), HEADER_FMT,
+                        (apr_uint64_t) body_len, (apr_uint64_t) apr_time_now(),
+                        ctx->direction, ctx->uuid, ctx->count);
+                ap_xlate_proto_to_ascii(header, hdr_len);
+
+                vec[0].iov_base = header;
+                vec[0].iov_len = hdr_len;
+                vec[1].iov_base = (void *) (buf + offset);
+                vec[1].iov_len = body_len;
+                vec[2].iov_base = CRLF;
+                vec[2].iov_len = 2;
+
+                rv = apr_file_writev_full(ctx->conn->file, vec, 3, &bytes);
+                if (APR_SUCCESS != rv) {
+                    if (ctx->conn->suppress) {
+                        /* ignore the error */
+                    }
+                    else if (ctx->r) {
+                        ap_log_rerror(
+                                APLOG_MARK,
+                                APLOG_WARNING,
+                                rv,
+                                ctx->r,
+                                "mod_firehose: could not write %" APR_UINT64_T_FMT " bytes to '%s' for '%c' connection '%s' and count '%0" APR_UINT64_T_HEX_FMT "', bytes dropped (further errors will be suppressed)",
+                                (apr_uint64_t)(vec[0].iov_len + vec[1].iov_len + vec[2].iov_len), ctx->conn->filename, ctx->conn->direction, ctx->uuid, ctx->count);
+                    }
+                    else {
+                        ap_log_cerror(
+                                APLOG_MARK,
+                                APLOG_WARNING,
+                                rv,
+                                ctx->c,
+                                "mod_firehose: could not write %" APR_UINT64_T_FMT " bytes to '%s' for '%c' connection '%s' and count '%0" APR_UINT64_T_HEX_FMT "', bytes dropped (further errors will be suppressed)",
+                                (apr_uint64_t)(vec[0].iov_len + vec[1].iov_len + vec[2].iov_len), ctx->conn->filename, ctx->conn->direction, ctx->uuid, ctx->count);
+                    }
+                    ctx->conn->suppress = 1;
+                    rv = APR_SUCCESS;
+                }
+                else {
+                    ctx->conn->suppress = 0;
+                }
+
+                ctx->count++;
+                nbytes -= vec[1].iov_len;
+                offset += vec[1].iov_len;
+            }
+        }
+
+    }
+    return rv;
+}
+
+static int firehose_input_filter(ap_filter_t *f, apr_bucket_brigade *bb,
+        ap_input_mode_t mode, apr_read_type_e block, apr_off_t readbytes)
+{
+    apr_bucket *b;
+    apr_status_t rv;
+    firehose_ctx_t *ctx = f->ctx;
+
+    /* just get out of the way of things we don't want. */
+    if (mode != AP_MODE_READBYTES && mode != AP_MODE_GETLINE) {
+        return ap_get_brigade(f->next, bb, mode, block, readbytes);
+    }
+
+    rv = ap_get_brigade(f->next, bb, mode, block, readbytes);
+
+    /* if an error was received, bail out now. If the error is
+     * EAGAIN and we have not yet seen an EOS, we will definitely
+     * be called again, at which point we will send our buffered
+     * data. Instead of sending EAGAIN, some filters return an
+     * empty brigade instead when data is not yet available. In
+     * this case, pass through the APR_SUCCESS and emulate the
+     * underlying filter.
+     */
+    if (rv != APR_SUCCESS || APR_BRIGADE_EMPTY(bb)) {
+        return rv;
+    }
+
+    for (b = APR_BRIGADE_FIRST(bb); b != APR_BRIGADE_SENTINEL(bb); b
+            = APR_BUCKET_NEXT(b)) {
+        rv = pumpit(f, b, ctx);
+        if (APR_SUCCESS != rv) {
+            return rv;
+        }
+    }
+
+    return APR_SUCCESS;
+}
+
+static int firehose_output_filter(ap_filter_t *f, apr_bucket_brigade *bb)
+{
+    apr_bucket *b;
+    apr_status_t rv = APR_SUCCESS;
+    firehose_ctx_t *ctx = f->ctx;
+
+    while (APR_SUCCESS == rv && !APR_BRIGADE_EMPTY(bb)) {
+
+        b = APR_BRIGADE_FIRST(bb);
+
+        rv = pumpit(f, b, ctx);
+        if (APR_SUCCESS != rv) {
+            return rv;
+        }
+
+        /* pass each bucket down */
+        APR_BUCKET_REMOVE(b);
+        APR_BRIGADE_INSERT_TAIL(ctx->bb, b);
+
+        /*
+         * If we ever see an EOS, make sure to FLUSH.
+         */
+        if (APR_BUCKET_IS_EOS(b)) {
+            apr_bucket *flush = apr_bucket_flush_create(f->c->bucket_alloc);
+            APR_BUCKET_INSERT_BEFORE(b, flush);
+        }
+
+        rv = ap_pass_brigade(f->next, ctx->bb);
+
+    }
+
+    return rv;
+}
+
+/**
+ * Create a firehose for each main request.
+ */
+static int firehose_create_request(request_rec *r)
+{
+    firehose_conf_t *conf;
+    firehose_ctx_t *ctx;
+    apr_uuid_t uuid;
+    int set = 0;
+    ap_filter_t *f;
+
+    if (r->main) {
+        return DECLINED;
+    }
+
+    conf = ap_get_module_config(r->connection->base_server->module_config,
+            &firehose_module);
+
+    f = r->connection->input_filters;
+    while (f) {
+        if (f->frec->filter_func.in_func == &firehose_input_filter) {
+            ctx = (firehose_ctx_t *) f->ctx;
+            if (ctx->conn->request == FIREHOSE_REQUEST) {
+                pumpit_cleanup(ctx);
+                if (!set) {
+                    apr_uuid_get(&uuid);
+                    set = 1;
+                }
+                apr_uuid_format(ctx->uuid, &uuid);
+            }
+        }
+        f = f->next;
+    }
+
+    f = r->connection->output_filters;
+    while (f) {
+        if (f->frec->filter_func.out_func == &firehose_output_filter) {
+            ctx = (firehose_ctx_t *) f->ctx;
+            if (ctx->conn->request == FIREHOSE_REQUEST) {
+                pumpit_cleanup(ctx);
+                if (!set) {
+                    apr_uuid_get(&uuid);
+                    set = 1;
+                }
+                apr_uuid_format(ctx->uuid, &uuid);
+            }
+        }
+        f = f->next;
+    }
+
+    return OK;
+}
+
+/* TODO: Make sure the connection directives are enforced global only.
+ *
+ * TODO: An idea for configuration. Let the filename directives be per-directory,
+ * with a global hashtable of filename to filehandle mappings. As each directive
+ * is parsed, a file is opened at server start. By default, all input is buffered
+ * until the header_parser hook, at which point we check if we should be buffering
+ * at all. If not, we dump the buffer and remove the filter. If so, we start
+ * attempting to write the buffer to the file.
+ *
+ * TODO: Implement a buffer to allow firehose fragment writes to back up to some
+ * threshold before packets are dropped. Flush the buffer on cleanup, waiting a
+ * suitable amount of time for the downstream to catch up.
+ *
+ * TODO: For the request firehose, have an option to set aside request buckets
+ * until we decide whether we're going to record this request or not. Allows for
+ * targeted firehose by URL space.
+ *
+ * TODO: Potentially decide on firehose sending based on a variable in the notes
+ * table or subprocess_env. Use standard httpd SetEnvIf and friends to decide on
+ * whether to include the request or not. Using this, we can react to the value
+ * of a flagpole. Run this check in the header_parser hook.
+ */
+
+static int firehose_pre_conn(conn_rec *c, void *csd)
+{
+    firehose_conf_t *conf;
+    firehose_ctx_t *ctx;
+    apr_uuid_t uuid;
+    int i;
+    firehose_conn_t *conn;
+
+    conf = ap_get_module_config(c->base_server->module_config,
+            &firehose_module);
+
+    if (conf->firehoses->nelts) {
+        apr_uuid_get(&uuid);
+    }
+
+    conn = (firehose_conn_t *) conf->firehoses->elts;
+    for (i = 0; i < conf->firehoses->nelts; i++) {
+
+        if (!conn->file || (conn->proxy == FIREHOSE_NORMAL
+                && !c->sbh) || (conn->proxy == FIREHOSE_PROXY && c->sbh)) {
+            conn++;
+            continue;
+        }
+
+        ctx = apr_pcalloc(c->pool, sizeof(firehose_ctx_t));
+        apr_uuid_format(ctx->uuid, &uuid);
+        ctx->conf = conf;
+        ctx->conn = conn;
+        ctx->bb = apr_brigade_create(c->pool, c->bucket_alloc);
+        ctx->c = c;
+        apr_pool_cleanup_register(c->pool, ctx, pumpit_cleanup, pumpit_cleanup);
+        if (conn->direction == FIREHOSE_IN) {
+            ctx->direction = conn->proxy == FIREHOSE_PROXY ? '>' : '<';
+            ctx->f = ap_add_input_filter("FIREHOSE_IN", ctx, NULL, c);
+            apr_pool_cleanup_register(c->pool, ctx->f, filter_input_cleanup,
+                    filter_input_cleanup);
+        }
+        if (conn->direction == FIREHOSE_OUT) {
+            ctx->direction = conn->proxy == FIREHOSE_PROXY ? '<' : '>';
+            ctx->f = ap_add_output_filter("FIREHOSE_OUT", ctx, NULL, c);
+            apr_pool_cleanup_register(c->pool, ctx->f, filter_output_cleanup,
+                    filter_output_cleanup);
+        }
+
+        conn++;
+    }
+
+    return OK;
+}
+
+static int firehose_open_logs(apr_pool_t *p, apr_pool_t *plog,
+        apr_pool_t *ptemp, server_rec *s)
+{
+    firehose_conf_t *conf;
+    apr_status_t rv;
+    void *data;
+    int i;
+    firehose_conn_t *conn;
+
+    /* make sure we only open the files on the second pass for config */
+    apr_pool_userdata_get(&data, "mod_firehose", s->process->pool);
+    if (!data) {
+        apr_pool_userdata_set((const void *) 1, "mod_firehose",
+                apr_pool_cleanup_null, s->process->pool);
+        return OK;
+    }
+
+    while (s) {
+
+        conf = ap_get_module_config(s->module_config,
+                &firehose_module);
+
+        conn = (firehose_conn_t *) conf->firehoses->elts;
+        for (i = 0; i < conf->firehoses->nelts; i++) {
+            /* TODO: make this non blocking behaviour optional, as APR doesn't yet
+             * support non blocking opening of files.
+             * TODO: make this properly portable.
+             */
+            apr_os_file_t file = open(conn->filename, O_WRONLY
+                    | O_CREAT | O_APPEND | O_NONBLOCK, 0777);
+            if (file < 0) {
+                rv = APR_FROM_OS_ERROR(apr_get_os_error());
+                ap_log_error(APLOG_MARK,
+                        APLOG_WARNING,
+                        rv, s, "mod_firehose: could not open '%s' for write, disabling firehose %s%s %s filter",
+                        conn->filename, conn->proxy == FIREHOSE_PROXY ? "proxy " : "",
+                        conn->request == FIREHOSE_REQUEST ? " request" : "connection",
+                        conn->direction == FIREHOSE_IN ? "input" : "output");
+            }
+            else if (APR_SUCCESS != (rv = apr_os_file_put(
+                    &conn->file, &file, APR_FOPEN_WRITE
+                            | APR_FOPEN_CREATE | APR_FOPEN_APPEND, plog))) {
+                close(file);
+                ap_log_error(APLOG_MARK,
+                        APLOG_WARNING,
+                        rv, s, "mod_firehose: could not open '%s' for write, disabling firehose %s%s %s filter",
+                        conn->filename, conn->proxy == FIREHOSE_PROXY ? "proxy " : "",
+                        conn->request == FIREHOSE_REQUEST ? " request" : "connection",
+                        conn->direction == FIREHOSE_IN ? "input" : "output");
+            }
+            else {
+                apr_pool_cleanup_register(plog, conn->file,
+                        logs_cleanup, logs_cleanup);
+            }
+            conn++;
+        }
+
+        s = s->next;
+    }
+
+    return OK;
+}
+
+static void firehose_register_hooks(apr_pool_t *p)
+{
+    /*
+     * We know that SSL is CONNECTION + 5
+     */
+    ap_register_output_filter("FIREHOSE_OUT", firehose_output_filter, NULL,
+            AP_FTYPE_CONNECTION + 3);
+
+    ap_register_input_filter("FIREHOSE_IN", firehose_input_filter, NULL,
+            AP_FTYPE_CONNECTION + 3);
+
+    ap_hook_open_logs(firehose_open_logs, NULL, NULL, APR_HOOK_LAST);
+    ap_hook_pre_connection(firehose_pre_conn, NULL, NULL, APR_HOOK_MIDDLE);
+    ap_hook_create_request(firehose_create_request, NULL, NULL,
+            APR_HOOK_REALLY_LAST + 1);
+}
+
+static void *firehose_create_sconfig(apr_pool_t *p, server_rec *s)
+{
+    firehose_conf_t *ptr = apr_pcalloc(p, sizeof(firehose_conf_t));
+
+    ptr->firehoses = apr_array_make(p, 2, sizeof(firehose_conn_t));
+
+    return ptr;
+}
+
+static void *firehose_merge_sconfig(apr_pool_t *p, void *basev,
+        void *overridesv)
+{
+    firehose_conf_t *cconf = apr_pcalloc(p, sizeof(firehose_conf_t));
+    firehose_conf_t *base = (firehose_conf_t *) basev;
+    firehose_conf_t *overrides = (firehose_conf_t *) overridesv;
+
+    cconf->firehoses = apr_array_append(p, overrides->firehoses,
+            base->firehoses);
+
+    return cconf;
+}
+
+static void firehose_enable_connection(cmd_parms *cmd, const char *name,
+        proxy_enum proxy, direction_enum direction, request_enum request)
+{
+
+    firehose_conn_t *firehose;
+    firehose_conf_t
+            *ptr =
+                    (firehose_conf_t *) ap_get_module_config(cmd->server->module_config,
+                            &firehose_module);
+
+    firehose = apr_array_push(ptr->firehoses);
+
+    firehose->filename = name;
+    firehose->proxy = proxy;
+    firehose->direction = direction;
+    firehose->request = request;
+
+}
+
+static const char *firehose_enable_connection_input(cmd_parms *cmd,
+        void *dummy, const char *name)
+{
+
+    const char *err = ap_check_cmd_context(cmd, NOT_IN_DIR_LOC_FILE
+            | NOT_IN_LIMIT);
+    if (err != NULL) {
+        return err;
+    }
+
+    firehose_enable_connection(cmd, name, FIREHOSE_NORMAL, FIREHOSE_IN,
+            FIREHOSE_CONNECTION);
+
+    return NULL;
+}
+
+static const char *firehose_enable_connection_output(cmd_parms *cmd,
+        void *dummy, const char *name)
+{
+
+    const char *err = ap_check_cmd_context(cmd, NOT_IN_DIR_LOC_FILE
+            | NOT_IN_LIMIT);
+    if (err != NULL) {
+        return err;
+    }
+
+    firehose_enable_connection(cmd, name, FIREHOSE_NORMAL, FIREHOSE_OUT,
+            FIREHOSE_CONNECTION);
+
+    return NULL;
+}
+
+static const char *firehose_enable_request_input(cmd_parms *cmd, void *dummy,
+        const char *name)
+{
+
+    const char *err = ap_check_cmd_context(cmd, NOT_IN_DIR_LOC_FILE
+            | NOT_IN_LIMIT);
+    if (err != NULL) {
+        return err;
+    }
+
+    firehose_enable_connection(cmd, name, FIREHOSE_NORMAL, FIREHOSE_IN,
+            FIREHOSE_REQUEST);
+
+    return NULL;
+}
+
+static const char *firehose_enable_request_output(cmd_parms *cmd, void *dummy,
+        const char *name)
+{
+
+    const char *err = ap_check_cmd_context(cmd, NOT_IN_DIR_LOC_FILE
+            | NOT_IN_LIMIT);
+    if (err != NULL) {
+        return err;
+    }
+
+    firehose_enable_connection(cmd, name, FIREHOSE_NORMAL, FIREHOSE_OUT,
+            FIREHOSE_REQUEST);
+
+    return NULL;
+}
+
+static const char *firehose_enable_proxy_connection_input(cmd_parms *cmd,
+        void *dummy, const char *name)
+{
+
+    const char *err = ap_check_cmd_context(cmd, NOT_IN_DIR_LOC_FILE
+            | NOT_IN_LIMIT);
+    if (err != NULL) {
+        return err;
+    }
+
+    firehose_enable_connection(cmd, name, FIREHOSE_PROXY, FIREHOSE_IN,
+            FIREHOSE_CONNECTION);
+
+    return NULL;
+}
+
+static const char *firehose_enable_proxy_connection_output(cmd_parms *cmd,
+        void *dummy, const char *name)
+{
+
+    const char *err = ap_check_cmd_context(cmd, NOT_IN_DIR_LOC_FILE
+            | NOT_IN_LIMIT);
+    if (err != NULL) {
+        return err;
+    }
+
+    firehose_enable_connection(cmd, name, FIREHOSE_PROXY, FIREHOSE_OUT,
+            FIREHOSE_CONNECTION);
+
+    return NULL;
+}
+
+static const command_rec firehose_cmds[] =
+{
+        AP_INIT_TAKE1("FirehoseConnectionInput", firehose_enable_connection_input, NULL,
+                RSRC_CONF, "Enable firehose on connection input data written to the given file/pipe"),
+        AP_INIT_TAKE1("FirehoseConnectionOutput", firehose_enable_connection_output, NULL,
+                RSRC_CONF, "Enable firehose on connection output data written to the given file/pipe"),
+        AP_INIT_TAKE1("FirehoseRequestInput", firehose_enable_request_input, NULL,
+                RSRC_CONF, "Enable firehose on request input data written to the given file/pipe"),
+        AP_INIT_TAKE1("FirehoseRequestOutput", firehose_enable_request_output, NULL,
+                RSRC_CONF, "Enable firehose on request output data written to the given file/pipe"),
+        AP_INIT_TAKE1("FirehoseProxyConnectionInput", firehose_enable_proxy_connection_input, NULL,
+                RSRC_CONF, "Enable firehose on proxied connection input data written to the given file/pipe"),
+        AP_INIT_TAKE1("FirehoseProxyConnectionOutput", firehose_enable_proxy_connection_output, NULL,
+                RSRC_CONF, "Enable firehose on proxied connection output data written to the given file/pipe"),
+        { NULL }
+};
+
+AP_DECLARE_MODULE(firehose) =
+{
+        STANDARD20_MODULE_STUFF,
+        NULL,
+        NULL,
+        firehose_create_sconfig,
+        firehose_merge_sconfig,
+        firehose_cmds,
+        firehose_register_hooks
+};
diff --git a/support/Makefile.in b/support/Makefile.in

index b840118b58f9f34be0f2cee97e022c798fa9b5c8..02d201b47dc86a99305b0a9b6c6804e163269844 100644 (file)
--- a/support/Makefile.in
+++ b/support/Makefile.in
@@ -3,7 +3,7 @@ DISTCLEAN_TARGETS = apxs apachectl dbmmanage log_server_status \
  
  CLEAN_TARGETS = suexec
  
-PROGRAMS = htpasswd htdigest rotatelogs logresolve ab htdbm htcacheclean httxt2dbm $(NONPORTABLE_SUPPORT)
+PROGRAMS = htpasswd htdigest rotatelogs logresolve ab htdbm htcacheclean httxt2dbm firehose $(NONPORTABLE_SUPPORT)
  TARGETS  = $(PROGRAMS)
  
  PROGRAM_LDADD        = $(UTIL_LDFLAGS) $(PROGRAM_DEPENDENCIES) $(EXTRA_LIBS) $(AP_LIBS)
@@ -73,3 +73,8 @@ httxt2dbm: $(httxt2dbm_OBJECTS)
  fcgistarter_OBJECTS = fcgistarter.lo
  fcgistarter: $(fcgistarter_OBJECTS)
         $(LINK) $(fcgistarter_LTFLAGS) $(fcgistarter_OBJECTS) $(PROGRAM_LDADD)
+
+firehose_OBJECTS = firehose.lo
+firehose: $(firehose_OBJECTS)
+       $(LINK) $(firehose_LTFLAGS) $(firehose_OBJECTS) $(PROGRAM_LDADD)
+
diff --git a/support/config.m4 b/support/config.m4

index 4865e38ec25e1fe92cee71e6f38a4330da25b53f..c3cce10874c5bbbc0fb3182d00bed49cc7c3da29 100644 (file)
--- a/support/config.m4
+++ b/support/config.m4
@@ -8,6 +8,7 @@ checkgid_LTFLAGS=""
  htcacheclean_LTFLAGS=""
  httxt2dbm_LTFLAGS=""
  fcgistarter_LTFLAGS=""
+firehose_LTFLAGS=""
  
  AC_ARG_ENABLE(static-support,APACHE_HELP_STRING(--enable-static-support,Build a statically linked version of the support binaries),[
  if test "$enableval" = "yes" ; then
@@ -21,6 +22,7 @@ if test "$enableval" = "yes" ; then
    APR_ADDTO(htcacheclean_LTFLAGS, [-static])
    APR_ADDTO(httxt2dbm_LTFLAGS, [-static])
    APR_ADDTO(fcgistarter_LTFLAGS, [-static])
+  APR_ADDTO(firehose_LTFLAGS, [-static])
  fi
  ])
  
@@ -114,6 +116,15 @@ fi
  ])
  APACHE_SUBST(fcgistarter_LTFLAGS)
  
+AC_ARG_ENABLE(static-firehose,APACHE_HELP_STRING(--enable-static-firehose,Build a statically linked version of firehose),[
+if test "$enableval" = "yes" ; then
+  APR_ADDTO(firehose_LTFLAGS, [-static])
+else
+  APR_REMOVEFROM(firehose, [-static])
+fi
+])
+APACHE_SUBST(firehose_LTFLAGS)
+
  # Configure or check which of the non-portable support programs can be enabled.
  
  NONPORTABLE_SUPPORT=""
diff --git a/support/firehose.c b/support/firehose.c

new file mode 100644 (file)

index 0000000..b2340ad
--- /dev/null
+++ b/support/firehose.c
@@ -0,0 +1,787 @@
+/**
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ *
+ */
+
+/*
+ * Originally written @ BBC by Graham Leggett
+ * Copyright 2009-2011 British Broadcasting Corporation
+ *
+ */
+
+#include <err.h>
+#include <apr.h>
+#include <apr.h>
+#include <apr_lib.h>
+#include <apr_buckets.h>
+#include <apr_file_io.h>
+#include <apr_file_info.h>
+#include <apr_hash.h>
+#include <apr_poll.h>
+#include <apr_portable.h>
+#include <apr_getopt.h>
+#include <apr_signal.h>
+#include <apr_strings.h>
+#include <apr_uuid.h>
+#if APR_HAVE_STDLIB_H
+#include <stdlib.h>
+#endif
+#if APR_HAVE_STRING_H
+#include <string.h>
+#endif
+
+#include "ap_release.h"
+
+#define DEFAULT_MAXLINES 0
+#define DEFAULT_MAXSIZE 0
+#define DEFAULT_AGE 0 * 1000 * 1000
+#define DEFAULT_PREFIX 0
+#define DEFAULT_NONBLOCK 0
+
+typedef struct file_rec
+{
+    apr_pool_t *pool;
+    apr_file_t *file_err;
+    apr_file_t *file_in;
+    apr_file_t *file_out;
+    const char *directory;
+    apr_bucket_alloc_t *alloc;
+    apr_bucket_brigade *bb;
+    apr_hash_t *request_uuids;
+    apr_hash_t *response_uuids;
+    apr_hash_t *filters;
+    int limit;
+    apr_size_t skipped_bytes;
+    apr_size_t dropped_fragments;
+    apr_time_t start;
+    apr_time_t end;
+} file_rec;
+
+typedef struct uuid_rec
+{
+    apr_pool_t *pool;
+    const char *uuid;
+    file_rec *file;
+    apr_uint64_t count;
+    apr_time_t last;
+    apr_size_t offset;
+    int direction;
+} uuid_rec;
+
+typedef struct filter_rec
+{
+    apr_pool_t *pool;
+    const char *prefix;
+    apr_size_t len;
+} filter_rec;
+
+typedef struct header_rec
+{
+    apr_size_t len;
+    apr_time_t timestamp;
+    int direction;
+    char uuid[APR_UUID_FORMATTED_LENGTH + 1];
+    apr_uint64_t count;
+    uuid_rec *rec;
+} header_rec;
+
+static const apr_getopt_option_t
+        cmdline_opts[] =
+        {
+                /* commands */
+                {
+                        "file",
+                        'f',
+                        1,
+                        "   --file, -f <name>\t\t\tFile to read the firehose from.\n\t\t\t\t\tDefaults to stdin." },
+                {
+                        "output-directory",
+                        'd',
+                        1,
+                        "   --output-directory, -o <name>\tDirectory to write demuxed connections\n\t\t\t\t\tto." },
+                {
+                        "uuid",
+                        'u',
+                        1,
+                        "   --uuid, -u <uuid>\t\t\tThe UUID of the connection to\n\t\t\t\t\tdemultiplex. Can be specified more\n\t\t\t\t\tthan once." },
+                /*                             { "output-host", 'h', 1,
+                 "   --output-host, -h <hostname>\tHostname to write demuxed connections to." },*/
+                /*                             {
+                 "speed",
+                 's',
+                 1,
+                 "   --speed, -s <factor>\tSpeed up or slow down demuxing\n\t\t\t\tby the given factor." },*/
+                { "help", 258, 0, "   --help, -h\t\t\t\tThis help text." },
+                { "version", 257, 0,
+                        "   --version\t\t\t\tDisplay the version of the program." },
+                { NULL } };
+
+#define HELP_HEADER "Usage : %s [options] [prefix1 [prefix2 ...]]\n\n" \
+                    "Firehose demultiplexes the given stream of multiplexed connections, and\n" \
+                    "writes each connection to a file, or to a socket as appropriate.\n" \
+                    "\n" \
+                    "When writing to files, each connection is placed into a dedicated file\n" \
+                    "named after the UUID of the connection within the stream. Separate files\n" \
+                    "will be created if requests and responses are found in the stream.\n" \
+                    "\n" \
+                    "If an optional prefix is specified as a parameter, connections that start\n" \
+                    "with the given prefix will be included. The prefix needs to fit completely\n" \
+                    "within the first fragment for a successful match to occur.\n" \
+                    "\n"
+/*                    "When writing to a socket, new connections\n"
+ *                    "are opened for each connection in the stream, allowing it to be possible to\n"
+ *                    "'replay' traffic recorded by one server to other server.\n"
+ *                    "\n\n"
+ */
+#define HELP_FOOTER ""
+
+/**
+ * Who are we again?
+ */
+static void version(const char * const progname)
+{
+    printf("%s (%s)\n", progname, AP_SERVER_VERSION);
+}
+
+/**
+ * Help the long suffering end user.
+ */
+static void help(const char *argv, const char * header, const char *footer,
+        const apr_getopt_option_t opts[])
+{
+    int i = 0;
+
+    if (header) {
+        printf(header, argv);
+    }
+
+    while (opts[i].name) {
+        printf("%s\n", opts[i].description);
+        i++;
+    }
+
+    if (footer) {
+        printf("%s\n", footer);
+    }
+}
+
+/**
+ * Cleanup a uuid record. Removes the record from the uuid hashtable in files.
+ */
+static apr_status_t cleanup_uuid_rec(void *dummy)
+{
+    uuid_rec *rec = (uuid_rec *) dummy;
+
+    if (rec->direction == '>') {
+        apr_hash_set(rec->file->response_uuids, rec->uuid, APR_HASH_KEY_STRING,
+                NULL);
+    }
+    if (rec->direction == '<') {
+        apr_hash_set(rec->file->request_uuids, rec->uuid, APR_HASH_KEY_STRING,
+                NULL);
+    }
+
+    return APR_SUCCESS;
+}
+
+/**
+ * Create a uuid record, register a cleanup for it's destruction.
+ */
+static apr_status_t make_uuid_rec(file_rec *file, header_rec *header,
+        uuid_rec **ptr)
+{
+    apr_pool_t *pool;
+    uuid_rec *rec;
+    apr_pool_create(&pool, file->pool);
+
+    rec = apr_pcalloc(pool, sizeof(uuid_rec));
+    rec->pool = pool;
+    rec->file = file;
+    rec->uuid = apr_pstrdup(pool, header->uuid);
+    rec->count = 0;
+    rec->last = header->timestamp;
+    rec->direction = header->direction;
+
+    if (header->direction == '>') {
+        apr_hash_set(file->response_uuids, rec->uuid, APR_HASH_KEY_STRING, rec);
+    }
+    if (header->direction == '<') {
+        apr_hash_set(file->request_uuids, rec->uuid, APR_HASH_KEY_STRING, rec);
+    }
+
+    apr_pool_cleanup_register(pool, rec, cleanup_uuid_rec, cleanup_uuid_rec);
+
+    *ptr = rec;
+    return APR_SUCCESS;
+}
+
+/**
+ * Process the end of the fragment body.
+ *
+ * This function renames the completed stream to it's final name.
+ */
+static apr_status_t finalise_body(file_rec *file, header_rec *header)
+{
+    apr_status_t status;
+    char *nfrom, *nto, *from, *to;
+    apr_pool_t *pool;
+    char errbuf[HUGE_STRING_LEN];
+
+    apr_pool_create(&pool, file->pool);
+
+    to = apr_pstrcat(pool, header->uuid, header->direction == '>' ? ".response"
+            : ".request", NULL);
+    from = apr_pstrcat(pool, to, ".part", NULL);
+
+    status = apr_filepath_merge(&nfrom, file->directory, from,
+            APR_FILEPATH_SECUREROOT, pool);
+    if (APR_SUCCESS == status) {
+        status = apr_filepath_merge(&nto, file->directory, to,
+                APR_FILEPATH_SECUREROOT, pool);
+        if (APR_SUCCESS == status) {
+            if (APR_SUCCESS == (status = apr_file_mtime_set(nfrom, file->end, pool))) {
+                if (APR_SUCCESS != (status = apr_file_rename(nfrom, nto, pool))) {
+                    apr_file_printf(
+                            file->file_err,
+                            "Could not rename file '%s' to '%s' for fragment write: %s\n",
+                            nfrom, nto,
+                            apr_strerror(status, errbuf, sizeof(errbuf)));
+                }
+            }
+            else {
+                apr_file_printf(
+                        file->file_err,
+                        "Could not set mtime on file '%s' to '%" APR_TIME_T_FMT "' for fragment write: %s\n",
+                        nfrom, file->end,
+                        apr_strerror(status, errbuf, sizeof(errbuf)));
+            }
+        }
+        else {
+            apr_file_printf(file->file_err,
+                    "Could not merge directory '%s' with file '%s': %s\n",
+                    file->directory, to, apr_strerror(status, errbuf,
+                            sizeof(errbuf)));
+        }
+    }
+    else {
+        apr_file_printf(file->file_err,
+                "Could not merge directory '%s' with file '%s': %s\n",
+                file->directory, from, apr_strerror(status, errbuf,
+                        sizeof(errbuf)));
+    }
+
+    apr_pool_destroy(pool);
+
+    return status;
+}
+
+/**
+ * Check if the fragment matches on of the prefixes.
+ */
+static int check_prefix(file_rec *file, header_rec *header, const char *str,
+        apr_size_t len)
+{
+    apr_hash_index_t *hi;
+    void *val;
+    apr_pool_t *pool;
+    int match = -1;
+
+    apr_pool_create(&pool, file->pool);
+
+    for (hi = apr_hash_first(pool, file->filters); hi; hi = apr_hash_next(hi)) {
+        filter_rec *filter;
+        apr_hash_this(hi, NULL, NULL, &val);
+        filter = (filter_rec *) val;
+
+        if (len > filter->len && !strncmp(filter->prefix, str, filter->len)) {
+            match = 1;
+            break;
+        }
+        match = 0;
+    }
+
+    apr_pool_destroy(pool);
+
+    return match;
+}
+
+/**
+ * Process part of the fragment body, given the header parameters.
+ *
+ * Currently, we append it to a file named after the UUID of the connection.
+ *
+ * The file is opened on demand and closed when done, so that we are
+ * guaranteed never to hit a file handle limit (within reason).
+ */
+static apr_status_t process_body(file_rec *file, header_rec *header,
+        const char *str, apr_size_t len)
+{
+    apr_status_t status;
+    char *native, *name;
+    apr_pool_t *pool;
+    apr_file_t *handle;
+    char errbuf[HUGE_STRING_LEN];
+
+    if (!file->start) {
+        file->start = header->timestamp;
+    }
+    file->end = header->timestamp;
+
+    apr_pool_create(&pool, file->pool);
+
+    name
+            = apr_pstrcat(pool, header->uuid,
+                    header->direction == '>' ? ".response.part"
+                            : ".request.part", NULL);
+
+    status = apr_filepath_merge(&native, file->directory, name,
+            APR_FILEPATH_SECUREROOT, pool);
+    if (APR_SUCCESS == status) {
+        if (APR_SUCCESS == (status = apr_file_open(&handle, native, APR_WRITE
+                | APR_CREATE | APR_APPEND, APR_OS_DEFAULT, pool))) {
+            if (APR_SUCCESS != (status = apr_file_write_full(handle, str, len,
+                    &len))) {
+                apr_file_printf(file->file_err,
+                        "Could not write fragment body to file '%s': %s\n",
+                        native, apr_strerror(status, errbuf, sizeof(errbuf)));
+            }
+        }
+        else {
+            apr_file_printf(file->file_err,
+                    "Could not open file '%s' for fragment write: %s\n",
+                    native, apr_strerror(status, errbuf, sizeof(errbuf)));
+        }
+    }
+    else {
+        apr_file_printf(file->file_err,
+                "Could not merge directory '%s' with file '%s': %s\n",
+                file->directory, name, apr_strerror(status, errbuf,
+                        sizeof(errbuf)));
+    }
+
+    apr_pool_destroy(pool);
+
+    return status;
+}
+
+/**
+ * Parse a chunk extension, detect overflow.
+ * There are two error cases:
+ *  1) If the conversion would require too many bits, a -1 is returned.
+ *  2) If the conversion used the correct number of bits, but an overflow
+ *     caused only the sign bit to flip, then that negative number is
+ *     returned.
+ * In general, any negative number can be considered an overflow error.
+ */
+static apr_status_t read_hex(const char **buf, apr_uint64_t *val)
+{
+    const char *b = *buf;
+    apr_uint64_t chunksize = 0;
+    apr_size_t chunkbits = sizeof(apr_uint64_t) * 8;
+
+    if (!apr_isxdigit(*b)) {
+        return APR_EGENERAL;
+    }
+    /* Skip leading zeros */
+    while (*b == '0') {
+        ++b;
+    }
+
+    while (apr_isxdigit(*b) && (chunkbits > 0)) {
+        int xvalue = 0;
+
+        if (*b >= '0' && *b <= '9') {
+            xvalue = *b - '0';
+        }
+        else if (*b >= 'A' && *b <= 'F') {
+            xvalue = *b - 'A' + 0xa;
+        }
+        else if (*b >= 'a' && *b <= 'f') {
+            xvalue = *b - 'a' + 0xa;
+        }
+
+        chunksize = (chunksize << 4) | xvalue;
+        chunkbits -= 4;
+        ++b;
+    }
+    *buf = b;
+    if (apr_isxdigit(*b) && (chunkbits <= 0)) {
+        /* overflow */
+        return APR_EGENERAL;
+    }
+
+    *val = chunksize;
+
+    return APR_SUCCESS;
+}
+
+/**
+ * Parse what might be a fragment header line.
+ *
+ * If the parse doesn't match for any reason, an error is returned, otherwise
+ * APR_SUCCESS.
+ *
+ * The header structure will be filled with the header values as parsed.
+ */
+static apr_status_t process_header(file_rec *file, header_rec *header,
+        const char *str, apr_size_t len)
+{
+    apr_uint64_t val;
+    apr_status_t status;
+    int i;
+    apr_uuid_t raw;
+    const char *end = str + len;
+
+    if (APR_SUCCESS != (status = read_hex(&str, &val))) {
+        return status;
+    }
+    header->len = val;
+
+    if (!apr_isspace(*(str++))) {
+        return APR_EGENERAL;
+    }
+
+    if (APR_SUCCESS != (status = read_hex(&str, &val))) {
+        return status;
+    }
+    header->timestamp = val;
+
+    if (!apr_isspace(*(str++))) {
+        return APR_EGENERAL;
+    }
+
+    if (*str != '<' && *str != '>') {
+        return APR_EGENERAL;
+    }
+    header->direction = *str;
+    str++;
+
+    if (!apr_isspace(*(str++))) {
+        return APR_EGENERAL;
+    }
+
+    for (i = 0; str[i] && i < APR_UUID_FORMATTED_LENGTH; i++) {
+        header->uuid[i] = str[i];
+    }
+    header->uuid[i] = 0;
+    if (apr_uuid_parse(&raw, header->uuid)) {
+        return APR_EGENERAL;
+    }
+    str += i;
+
+    if (!apr_isspace(*(str++))) {
+        return APR_EGENERAL;
+    }
+
+    if (APR_SUCCESS != (status = read_hex(&str, &val))) {
+        return status;
+    }
+    header->count = val;
+
+    if ((*(str++) != '\r')) {
+        return APR_EGENERAL;
+    }
+    if ((*(str++) != '\n')) {
+        return APR_EGENERAL;
+    }
+    if (str != end) {
+        return APR_EGENERAL;
+    }
+
+    return APR_SUCCESS;
+}
+
+/**
+ * Suck on the file/pipe, and demux any fragments on the incoming stream.
+ *
+ * If EOF is detected, this function returns.
+ */
+static apr_status_t demux(file_rec *file)
+{
+    char errbuf[HUGE_STRING_LEN];
+    apr_size_t len = 0;
+    apr_status_t status = APR_SUCCESS;
+    apr_bucket *b, *e;
+    apr_bucket_brigade *bb, *obb;
+    int footer = 0;
+    const char *buf;
+
+    bb = apr_brigade_create(file->pool, file->alloc);
+    obb = apr_brigade_create(file->pool, file->alloc);
+    b = apr_bucket_pipe_create(file->file_in, file->alloc);
+
+    APR_BRIGADE_INSERT_HEAD(bb, b);
+
+    do {
+
+        /* when the pipe is closed, the pipe disappears from the brigade */
+        if (APR_BRIGADE_EMPTY(bb)) {
+            break;
+        }
+
+        status = apr_brigade_split_line(obb, bb, APR_BLOCK_READ,
+                HUGE_STRING_LEN);
+
+        if (APR_SUCCESS == status || APR_EOF == status) {
+            char str[HUGE_STRING_LEN];
+            len = HUGE_STRING_LEN;
+
+            apr_brigade_flatten(obb, str, &len);
+
+            apr_brigade_cleanup(obb);
+
+            if (len == HUGE_STRING_LEN) {
+                file->skipped_bytes += len;
+                continue;
+            }
+            else if (footer) {
+                if (len == 2 && str[0] == '\r' && str[1] == '\n') {
+                    footer = 0;
+                    continue;
+                }
+                file->skipped_bytes += len;
+            }
+            else if (len > 0) {
+                header_rec header;
+                status = process_header(file, &header, str, len);
+                if (APR_SUCCESS != status) {
+                    file->skipped_bytes += len;
+                    continue;
+                }
+                else {
+                    int ignore = 0;
+
+                    header.rec = NULL;
+                    if (header.direction == '>') {
+                        header.rec = apr_hash_get(file->response_uuids,
+                                header.uuid, APR_HASH_KEY_STRING);
+                    }
+                    if (header.direction == '<') {
+                        header.rec = apr_hash_get(file->request_uuids,
+                                header.uuid, APR_HASH_KEY_STRING);
+                    }
+                    if (header.rec) {
+                        /* does the count match what is expected? */
+                        if (header.count != header.rec->count) {
+                            file->dropped_fragments++;
+                            ignore = 1;
+                        }
+                    }
+                    else {
+                        /* must we ignore unknown uuids? */
+                        if (file->limit) {
+                            ignore = 1;
+                        }
+
+                        /* is the counter not what we expect? */
+                        else if (header.count != 0) {
+                            file->skipped_bytes += len;
+                            ignore = 1;
+                        }
+
+                        /* otherwise, make a new uuid */
+                        else {
+                            make_uuid_rec(file, &header, &header.rec);
+                        }
+                    }
+
+                    if (header.len) {
+                        if (APR_SUCCESS != (status = apr_brigade_partition(bb,
+                                header.len, &e))) {
+                            apr_file_printf(
+                                    file->file_err,
+                                    "Could not read fragment body from input file: %s\n",
+                                    apr_strerror(status, errbuf, sizeof(errbuf)));
+                            break;
+                        }
+                        while ((b = APR_BRIGADE_FIRST(bb)) && e != b) {
+                            apr_bucket_read(b, &buf, &len, APR_READ_BLOCK);
+                            if (!ignore && !header.count && !check_prefix(file,
+                                    &header, buf, len)) {
+                                ignore = 1;
+                            }
+                            if (!ignore) {
+                                status = process_body(file, &header, buf, len);
+                                header.rec->offset += len;
+                            }
+                            if (ignore || APR_SUCCESS != status) {
+                                apr_bucket_delete(b);
+                                file->skipped_bytes += len;
+                                continue;
+                            }
+                            apr_bucket_delete(b);
+                        }
+                        if (!ignore) {
+                            header.rec->count++;
+                        }
+                        footer = 1;
+                        continue;
+                    }
+                    else {
+                        /* an empty header means end-of-connection */
+                        if (header.rec) {
+                            if (!ignore) {
+                                if (!header.count) {
+                                    status = process_body(file, &header, "", 0);
+                                }
+                                status = finalise_body(file, &header);
+                            }
+                            apr_pool_destroy(header.rec->pool);
+                        }
+                    }
+
+                }
+            }
+
+        }
+        else {
+            apr_file_printf(file->file_err,
+                    "Could not read fragment header from input file: %s\n",
+                    apr_strerror(status, errbuf, sizeof(errbuf)));
+            break;
+        }
+
+    } while (1);
+
+    return status;
+}
+
+/**
+ * Start the application.
+ */
+int main(int argc, const char * const argv[])
+{
+    apr_status_t status;
+    apr_pool_t *pool;
+    char errmsg[1024];
+    apr_getopt_t *opt;
+    int optch;
+    const char *optarg;
+
+    file_rec *file;
+
+    /* lets get APR off the ground, and make sure it terminates cleanly */
+    if (APR_SUCCESS != (status = apr_app_initialize(&argc, &argv, NULL))) {
+        return 1;
+    }
+    atexit(apr_terminate);
+
+    if (APR_SUCCESS != (status = apr_pool_create(&pool, NULL))) {
+        return 1;
+    }
+
+    apr_signal_block(SIGPIPE);
+
+    file = apr_pcalloc(pool, sizeof(file_rec));
+    apr_file_open_stderr(&file->file_err, pool);
+    apr_file_open_stdin(&file->file_in, pool);
+    apr_file_open_stdout(&file->file_out, pool);
+
+    file->pool = pool;
+    file->alloc = apr_bucket_alloc_create(pool);
+    file->bb = apr_brigade_create(pool, file->alloc);
+    file->request_uuids = apr_hash_make(pool);
+    file->response_uuids = apr_hash_make(pool);
+    file->filters = apr_hash_make(pool);
+
+    apr_getopt_init(&opt, pool, argc, argv);
+    while ((status = apr_getopt_long(opt, cmdline_opts, &optch, &optarg))
+            == APR_SUCCESS) {
+
+        switch (optch) {
+        case 'f': {
+            status = apr_file_open(&file->file_in, optarg, APR_FOPEN_READ,
+                    APR_OS_DEFAULT, pool);
+            if (status != APR_SUCCESS) {
+                apr_file_printf(file->file_err,
+                        "Could not open file '%s' for read: %s\n", optarg,
+                        apr_strerror(status, errmsg, sizeof errmsg));
+                return 1;
+            }
+            break;
+        }
+        case 'd': {
+            apr_finfo_t finfo;
+            status = apr_stat(&finfo, optarg, APR_FINFO_TYPE, pool);
+            if (status != APR_SUCCESS) {
+                apr_file_printf(file->file_err,
+                        "Directory '%s' could not be found: %s\n", optarg,
+                        apr_strerror(status, errmsg, sizeof errmsg));
+                return 1;
+            }
+            if (finfo.filetype != APR_DIR) {
+                apr_file_printf(file->file_err,
+                        "Path '%s' isn't a directory\n", optarg);
+                return 1;
+            }
+            file->directory = optarg;
+            break;
+        }
+        case 'u': {
+            apr_pool_t *pchild;
+            uuid_rec *rec;
+            apr_pool_create(&pchild, pool);
+            rec = apr_pcalloc(pchild, sizeof(uuid_rec));
+            rec->pool = pchild;
+            rec->uuid = optarg;
+            apr_hash_set(file->request_uuids, optarg, APR_HASH_KEY_STRING, rec);
+            apr_hash_set(file->response_uuids, optarg, APR_HASH_KEY_STRING, rec);
+            file->limit++;
+            break;
+        }
+        case 257: {
+            version(argv[0]);
+            return 0;
+        }
+        case 258: {
+            help(argv[0], HELP_HEADER, HELP_FOOTER, cmdline_opts);
+            return 0;
+
+        }
+        }
+
+    }
+    if (APR_SUCCESS != status && APR_EOF != status) {
+        return 1;
+    }
+
+    /* read filters from the command line */
+    while (opt->ind < argc) {
+        apr_pool_t *pchild;
+        filter_rec *filter;
+        apr_pool_create(&pchild, pool);
+        filter = apr_pcalloc(pchild, sizeof(filter_rec));
+        filter->pool = pchild;
+        filter->prefix = opt->argv[opt->ind];
+        filter->len = strlen(opt->argv[opt->ind]);
+        apr_hash_set(file->filters, opt->argv[opt->ind], APR_HASH_KEY_STRING,
+                filter);
+        opt->ind++;
+    }
+
+    status = demux(file);
+
+    /* warn people if any non blocking writes failed */
+    if (file->skipped_bytes || file->dropped_fragments) {
+        apr_file_printf(
+                file->file_err,
+                "Warning: %" APR_SIZE_T_FMT " bytes skipped, %" APR_SIZE_T_FMT " fragments dropped.\n",
+                file->skipped_bytes, file->dropped_fragments);
+    }
+
+    if (APR_SUCCESS != status) {
+        return 1;
+    }
+
+    return 0;
+}
author	Graham Leggett <minfrin@apache.org>
	Sat, 17 Dec 2011 17:03:59 +0000 (17:03 +0000)
committer	Graham Leggett <minfrin@apache.org>
	Sat, 17 Dec 2011 17:03:59 +0000 (17:03 +0000)
CHANGES		patch \| blob \| blame \| history
docs/manual/mod/allmodules.xml		patch \| blob \| blame \| history
docs/manual/mod/mod_firehose.xml	[new file with mode: 0644]	patch \| blob
docs/manual/programs/index.xml		patch \| blob \| blame \| history
modules/debugging/config.m4		patch \| blob \| blame \| history
modules/debugging/mod_firehose.c	[new file with mode: 0644]	patch \| blob
support/Makefile.in		patch \| blob \| blame \| history
support/config.m4		patch \| blob \| blame \| history
support/firehose.c	[new file with mode: 0644]	patch \| blob