]>
git.ipfire.org Git - thirdparty/squid.git/blob - helpers/log_daemon/DB/log_db_daemon.pl.in
5 use English
qw( -no_match_vars ) ;
15 log_db_daemon - Database logging daemon for Squid
19 log_db_daemon DSN [options]
23 This program writes Squid access.log entries to a database.
24 Presently only accepts the B<squid> native format
30 Database DSN encoded as a path. This is sent as the access_log file path.
33 access_log daemon:/host/database/table/username/password squid
35 to leave a parameter unspecified use a double slash:
36 access_log daemon://database/table/username/password squid
38 Default "DBI:mysql:database=squid"
42 Write debug messages to Squid stderr or cache.log
48 # the first argument to this script is the log file path describing the DSN
51 # others may be options
58 # utility routine to print messages on stderr (so they appear in cache log)
59 # without using warn, which would clutter the log with source line numbers
62 print STDERR
" $msg \n " ;
65 # we use logfile to pass database access information to this script
66 # sample configuration:
67 # access_log daemon:/host/database/table/username/password squid
68 # to let a parmeter unspecified, e.g. the database host, use a double slash:
69 # access_log daemon://database/table/username/password squid
70 my ( $host , $database , $table , $user , $pass ) = $log_file =~ / \/ (.* ?
) \
/(.*?)\/ (.* ?
) \
/(.*?)\/ (.* ?
) \z
/ xms
;
74 log_info
( "Database host not specified. Using $host ." );
78 $database = 'squid_log' ;
79 log_info
( "Database name not specified. Using $database ." );
83 $table = 'access_log' ;
84 log_info
( "Table parameter not specified. Using $table ." );
89 log_info
( "User parameter not specified. Using $user ." );
93 log_info
( 'No password specified. Connecting with NO password.' );
96 # fields that we should have in the table
97 # Order here must match the order of fields in the Log format and parse() output array.
114 # perform db connection
115 my $dsn = "DBI:mysql:database= $database " . ( $host ne "localhost" ?
": $host " : "" );
119 warn "Connecting... dsn=' $dsn ', username=' $user ', password='...'" ;
120 $dbh = DBI
-> connect ( $dsn , $user , $pass , { AutoCommit
=> 1 , RaiseError
=> 1 , PrintError
=> 1 });
123 die "Cannot connect to database: $DBI ::errstr" ;
127 # a simple test to assure the specified table exists
129 my $q = 'SELECT ' . join ( ',' , @db_fields ) . " FROM $table LIMIT 1" ;
130 my $sth = $dbh -> prepare ( $q );
134 # run a query to create the table of required syntax
135 my $create_query = 'CREATE TABLE ' . $table . ' (' .
136 " id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY," .
137 " time_since_epoch DECIMAL(15,3)," .
138 " time_response INTEGER," .
139 " ip_client CHAR(15)," .
140 " ip_server CHAR(15)," .
141 " http_status_code VARCHAR(10)," .
142 " http_reply_size INTEGER," .
143 " http_method VARCHAR(20)," .
145 " http_username VARCHAR(20)," .
146 " http_mime_type VARCHAR(50)," .
147 " squid_request_status VARCHAR(50)," .
148 " squid_hier_status VARCHAR(20)" .
151 $sth = $dbh -> prepare ( $create_query );
153 # test again and fail hard if it is still broken.
155 my $q = 'SELECT ' . join ( ',' , @db_fields ) . " FROM $table LIMIT 1" ;
156 my $sth = $dbh -> prepare ( $q );
160 die "Error initializing database table: $EVAL_ERROR " ;
165 # for better performance, prepare the statement at startup
167 my $q = "INSERT INTO $table (" . join ( ',' , @db_fields ) . ") VALUES(NULL" . ',?' x
( scalar ( @db_fields )- 1 ) . ')' ;
168 #$sth = $dbh->prepare("INSERT INTO $table VALUES(NULL,?,?,?,?,?,?,?,?,?,?,?,?)");
169 $sth = $dbh -> prepare ( $q );
172 die "Error while preparing sql statement: $EVAL_ERROR " ;
177 my ( @t ) = $line =~ /^L(\d+\.\d+) *(\d+?) (.*?) (.*?)\/ ( \d
+ ?
) ( \d
+ ?
) (.* ?
) (.* ?
) (.* ?
) (.* ?
) \
/(.*?) (.*)$/ ;
181 while ( my $line = <>) {
184 my $cmd = substr ( $line , 0 , 1 ); # extract command byte
187 my @log_entry = parse
( $line );
188 eval { # we catch db errors to avoid crashing squid in case something goes wrong...
189 $sth -> execute ( @log_entry ) or die $sth -> errstr
191 if ( $EVAL_ERROR ) { # leave a trace of the error in the logs
192 warn $EVAL_ERROR . " values=(" . join ( ', ' , @log_entry ) . ')' ;
205 This module exploits the new logfile daemon support available in squid 2.7 and 3.2 to store access log entries in a MySQL database.
209 =head2 Squid configuration
211 =head3 access_log directive
213 The path to the access log file is used to provide the database connection parameters.
215 access_log daemon:/mysql_host:port/database/table/username/password squid
217 The 'daemon' prefix is mandatory and tells squid that the B<logfile_daemon> helper is to be used instead of the normal file logging.
219 The last parameter tells squid which log format to use when writing lines to the log daemon.
220 Presently B<squid> format is supported.
224 =item mysql_host:port
226 Host where the mysql server is running. If left empty, 'localhost' is assumed.
230 Name of the database to connect to. If left empty, 'squid_log' is assumed.
234 Name of the database table where log lines are stored. If left empty, 'access_log' is assumed.
238 Username to use when connecting to the database. If left empty, 'squid' is assumed.
242 Password to use when connecting to the database. If left empty, no password is used.
246 To leave all fields to their default values, you can use a single slash:
248 access_log daemon:/ squid
250 To specify only the database password, which by default is empty, you must leave unspecified all the other parameters by using null strings:
252 access_log daemon://///password squid
254 =head3 logfile_daemon directive
256 This is the current way of telling squid where the logfile daemon resides.
258 logfile_daemon /path/to/squid/libexec/logfile-daemon_mysql.pl
260 The script must be copied to the location specified in the directive.
262 =head2 Database configuration
264 Let's call the database 'squid_log' and the log table 'access_log'. The username and password for the db connection will be both 'squid'.
270 CREATE DATABASE squid_log;
276 GRANT INSERT,SELECT,CREATE ON squid_log.* TO 'squid'@'localhost' IDENTIFIED BY 'squid';
279 Note that only CREATE, INSERT and SELECT privileges are granted to the 'squid' user. This ensures that the logfile daemon script cannot change or modify the log entries.
283 The Daemon will attempt to initialize this table if none exists when it starts.
285 The table created should look like:
287 CREATE TABLE access_log (
288 id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
289 time_since_epoch DECIMAL(15,3),
290 time_response INTEGER,
293 http_status_code VARCHAR(10),
294 http_reply_size INTEGER,
295 http_method VARCHAR(20),
297 http_username VARCHAR(20),
298 http_mime_type VARCHAR(50),
299 squid_hier_status VARCHAR(20),
300 squid_request_status VARCHAR(20)
303 =head1 VERSION INFORMATION
305 This document refers to C<log_db_daemon> script version 0.5.
307 The script has been developed and tested in the following environment:
311 =item squid-2.7 Squid-3.2
313 =item mysql 5.0.26 and 5.1
321 =head1 DATA EXTRACTION
323 =head2 Sample queries.
327 =item Clients accessing the cache
329 SELECT DISTINCT ip_client FROM access_log;
331 =item Number of request per day
334 DATE(FROM_UNIXTIME(time_since_epoch)) AS date_day,
335 COUNT(*) AS num_of_requests
340 =item Request status count
342 To obtain the raw count of each request status:
344 SELECT squid_request_status, COUNT(*) AS n
346 GROUP BY squid_request_status
349 To calculate the percentage of each request status:
352 squid_request_status,
353 (COUNT(*)/(SELECT COUNT(*) FROM access_log)*100) AS percentage
355 GROUP BY squid_request_status
358 To distinguish only between HITs and MISSes:
364 WHERE squid_request_status LIKE '%HIT%')
366 (SELECT COUNT(*) FROM access_log)*100
373 WHERE squid_request_status LIKE '%MISS%')
375 (SELECT COUNT(*) FROM access_log)*100
378 =item Response time ranges
382 COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage
384 WHERE time_response >= 0 AND time_response < 500
388 COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage
390 WHERE time_response >= 500 AND time_response < 1000
394 COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage
396 WHERE time_response >= 1000 AND time_response < 2000
400 COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage
402 WHERE time_response >= 2000;
404 =item Traffic by mime type
408 SUM(http_reply_size) as total_bytes
410 GROUP BY http_mime_type
413 =item Traffic by client
417 SUM(http_reply_size) AS total_bytes
426 The MyISAM storage engine is known to be faster than the InnoDB one, so although it doesn't support transactions and referential integrity, it might be more appropriate in this scenario. You might want to append "ENGINE=MYISAM" at the end of the table creation code in the above SQL script.
428 Indexes should be created according to the queries that are more frequently run. The DDL script only creates an implicit index for the primary key column.
434 This script currently implements only the C<L> (i.e. "append a line to the log") command, therefore the log lines are never purged from the table. This approach has an obvious scalability problem.
436 One solution would be to implement e.g. the "rotate log" command in a way that would calculate some summary values, put them in a "summary table" and then delete the lines used to caluclate those values.
438 Similar cleanup code could be implemented in an external script and run periodically independently from squid log commands.
442 This script has only been tested in low-volume scenarios (single client, less than 10 req/s). Tests in high volume environments could reveal performance bottlenecks and bugs.
446 Marcello Romani, marcello.romani@libero.it
447 Amos Jeffries, amosjeffries@squid-cache.org
449 =head1 COPYRIGHT AND LICENSE
451 * Copyright (C) 1996-2014 The Squid Software Foundation and contributors
453 * Squid software is distributed under GPLv2+ license and includes
454 * contributions from numerous individuals and organizations.
455 * Please see the COPYING and CONTRIBUTORS files for details.
457 Copyright (C) 2008 by Marcello Romani
459 This library is free software; you can redistribute it and/or modify
460 it under the same terms as Perl itself, either Perl version 5.8.8 or,
461 at your option, any later version of Perl 5 you may have available.