]>
git.ipfire.org Git - thirdparty/squid.git/blob - helpers/log_daemon/DB/log_db_daemon.pl.in
5 use English
qw( -no_match_vars ) ;
15 log_db_daemon - Database logging daemon for Squid
19 log_db_daemon DSN [options]
23 This program writes Squid access.log entries to a database.
24 Presently only accepts the B<squid> native format
30 Database DSN encoded as a path. This is sent as the access_log file path.
33 access_log daemon:/host/database/table/username/password squid
35 to leave a parameter unspecified use a double slash:
36 access_log daemon://database/table/username/password squid
38 Default "DBI:mysql:database=squid"
42 Write debug messages to Squid stderr or cache.log
46 # the first argument to this script is the log file path describing the DSN
49 # others may be options
56 # utility routine to print messages on stderr (so they appear in cache log)
57 # without using warn, which would clutter the log with source line numbers
60 print STDERR
" $msg \n " ;
63 # we use logfile to pass database access information to this script
64 # sample configuration:
65 # access_log daemon:/host/database/table/username/password squid
66 # to let a parmeter unspecified, e.g. the database host, use a double slash:
67 # access_log daemon://database/table/username/password squid
68 my ( $host , $database , $table , $user , $pass ) = $log_file =~ / \/ (.* ?
) \
/(.*?)\/ (.* ?
) \
/(.*?)\/ (.* ?
) \z
/ xms
;
72 log_info
( "Database host not specified. Using $host ." );
76 $database = 'squid_log' ;
77 log_info
( "Database name not specified. Using $database ." );
81 $table = 'access_log' ;
82 log_info
( "Table parameter not specified. Using $table ." );
87 log_info
( "User parameter not specified. Using $user ." );
91 log_info
( 'No password specified. Connecting with NO password.' );
94 # fields that we should have in the table
95 # Order here must match the order of fields in the Log format and parse() output array.
112 # perform db connection
113 my $dsn = "DBI:mysql:database= $database " . ( $host ne "localhost" ?
": $host " : "" );
117 warn "Connecting... dsn=' $dsn ', username=' $user ', password='...'" ;
118 $dbh = DBI
-> connect ( $dsn , $user , $pass , { AutoCommit
=> 1 , RaiseError
=> 1 , PrintError
=> 1 });
121 die "Cannot connect to database: $DBI ::errstr" ;
125 # a simple test to assure the specified table exists
127 my $q = 'SELECT ' . join ( ',' , @db_fields ) . " FROM $table LIMIT 1" ;
128 my $sth = $dbh -> prepare ( $q );
132 # run a query to create the table of required syntax
133 my $create_query = 'CREATE TABLE ' . $table . ' (' .
134 " id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY," .
135 " time_since_epoch DECIMAL(15,3)," .
136 " time_response INTEGER," .
137 " ip_client CHAR(15)," .
138 " ip_server CHAR(15)," .
139 " http_status_code VARCHAR(10)," .
140 " http_reply_size INTEGER," .
141 " http_method VARCHAR(20)," .
143 " http_username VARCHAR(20)," .
144 " http_mime_type VARCHAR(50)," .
145 " squid_request_status VARCHAR(50)," .
146 " squid_hier_status VARCHAR(20)" .
149 $sth = $dbh -> prepare ( $create_query );
151 # test again and fail hard if it is still broken.
153 my $q = 'SELECT ' . join ( ',' , @db_fields ) . " FROM $table LIMIT 1" ;
154 my $sth = $dbh -> prepare ( $q );
158 die "Error initializing database table: $EVAL_ERROR " ;
163 # for better performance, prepare the statement at startup
165 my $q = "INSERT INTO $table (" . join ( ',' , @db_fields ) . ") VALUES(NULL" . ',?' x
( scalar ( @db_fields )- 1 ) . ')' ;
166 #$sth = $dbh->prepare("INSERT INTO $table VALUES(NULL,?,?,?,?,?,?,?,?,?,?,?,?)");
167 $sth = $dbh -> prepare ( $q );
170 die "Error while preparing sql statement: $EVAL_ERROR " ;
175 my ( @t ) = $line =~ /^L(\d+\.\d+) *(\d+?) (.*?) (.*?)\/ ( \d
+ ?
) ( \d
+ ?
) (.* ?
) (.* ?
) (.* ?
) (.* ?
) \
/(.*?) (.*)$/ ;
179 while ( my $line = <>) {
182 my $cmd = substr ( $line , 0 , 1 ); # extract command byte
185 my @log_entry = parse
( $line );
186 eval { # we catch db errors to avoid crashing squid in case something goes wrong...
187 $sth -> execute ( @log_entry ) or die $sth -> errstr
189 if ( $EVAL_ERROR ) { # leave a trace of the error in the logs
190 warn $EVAL_ERROR . " values=(" . join ( ', ' , @log_entry ) . ')' ;
201 This module exploits the new logfile daemon support available in squid 2.7 and 3.2 to store access log entries in a MySQL database.
205 =head2 Squid configuration
207 =head3 access_log directive
209 The path to the access log file is used to provide the database connection parameters.
211 access_log daemon:/mysql_host:port/database/table/username/password squid
213 The 'daemon' prefix is mandatory and tells squid that the B<logfile_daemon> helper is to be used instead of the normal file logging.
215 The last parameter tells squid which log format to use when writing lines to the log daemon.
216 Presently B<squid> format is supported.
220 =item mysql_host:port
222 Host where the mysql server is running. If left empty, 'localhost' is assumed.
226 Name of the database to connect to. If left empty, 'squid_log' is assumed.
230 Name of the database table where log lines are stored. If left empty, 'access_log' is assumed.
234 Username to use when connecting to the database. If left empty, 'squid' is assumed.
238 Password to use when connecting to the database. If left empty, no password is used.
242 To leave all fields to their default values, you can use a single slash:
244 access_log daemon:/ squid
246 To specify only the database password, which by default is empty, you must leave unspecified all the other parameters by using null strings:
248 access_log daemon://///password squid
250 =head3 logfile_daemon directive
252 This is the current way of telling squid where the logfile daemon resides.
254 logfile_daemon /path/to/squid/libexec/logfile-daemon_mysql.pl
256 The script must be copied to the location specified in the directive.
258 =head2 Database configuration
260 Let's call the database 'squid_log' and the log table 'access_log'. The username and password for the db connection will be both 'squid'.
266 CREATE DATABASE squid_log;
272 GRANT INSERT,SELECT,CREATE ON squid_log.* TO 'squid'@'localhost' IDENTIFIED BY 'squid';
275 Note that only CREATE, INSERT and SELECT privileges are granted to the 'squid' user. This ensures that the logfile daemon script cannot change or modify the log entries.
279 The Daemon will attempt to initialize this table if none exists when it starts.
281 The table created should look like:
283 CREATE TABLE access_log (
284 id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
285 time_since_epoch DECIMAL(15,3),
286 time_response INTEGER,
289 http_status_code VARCHAR(10),
290 http_reply_size INTEGER,
291 http_method VARCHAR(20),
293 http_username VARCHAR(20),
294 http_mime_type VARCHAR(50),
295 squid_hier_status VARCHAR(20),
296 squid_request_status VARCHAR(20)
299 =head1 VERSION INFORMATION
301 This document refers to C<log_db_daemon> script version 0.5.
303 The script has been developed and tested in the following environment:
307 =item squid-2.7 Squid-3.2
309 =item mysql 5.0.26 and 5.1
317 =head1 DATA EXTRACTION
319 =head2 Sample queries.
323 =item Clients accessing the cache
325 SELECT DISTINCT ip_client FROM access_log;
327 =item Number of request per day
330 DATE(FROM_UNIXTIME(time_since_epoch)) AS date_day,
331 COUNT(*) AS num_of_requests
336 =item Request status count
338 To obtain the raw count of each request status:
340 SELECT squid_request_status, COUNT(*) AS n
342 GROUP BY squid_request_status
345 To calculate the percentage of each request status:
348 squid_request_status,
349 (COUNT(*)/(SELECT COUNT(*) FROM access_log)*100) AS percentage
351 GROUP BY squid_request_status
354 To distinguish only between HITs and MISSes:
360 WHERE squid_request_status LIKE '%HIT%')
362 (SELECT COUNT(*) FROM access_log)*100
369 WHERE squid_request_status LIKE '%MISS%')
371 (SELECT COUNT(*) FROM access_log)*100
374 =item Response time ranges
378 COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage
380 WHERE time_response >= 0 AND time_response < 500
384 COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage
386 WHERE time_response >= 500 AND time_response < 1000
390 COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage
392 WHERE time_response >= 1000 AND time_response < 2000
396 COUNT(*)/(SELECT COUNT(*) FROM access_log)*100 AS percentage
398 WHERE time_response >= 2000;
400 =item Traffic by mime type
404 SUM(http_reply_size) as total_bytes
406 GROUP BY http_mime_type
409 =item Traffic by client
413 SUM(http_reply_size) AS total_bytes
422 The MyISAM storage engine is known to be faster than the InnoDB one, so although it doesn't support transactions and referential integrity, it might be more appropriate in this scenario. You might want to append "ENGINE=MYISAM" at the end of the table creation code in the above SQL script.
424 Indexes should be created according to the queries that are more frequently run. The DDL script only creates an implicit index for the primary key column.
430 This script currently implements only the C<L> (i.e. "append a line to the log") command, therefore the log lines are never purged from the table. This approach has an obvious scalability problem.
432 One solution would be to implement e.g. the "rotate log" command in a way that would calculate some summary values, put them in a "summary table" and then delete the lines used to caluclate those values.
434 Similar cleanup code could be implemented in an external script and run periodically independently from squid log commands.
438 This script has only been tested in low-volume scenarios (single client, less than 10 req/s). Tests in high volume environments could reveal performance bottlenecks and bugs.
442 Marcello Romani, marcello.romani@libero.it
443 Amos Jeffries, amosjeffries@squid-cache.org
445 =head1 COPYRIGHT AND LICENSE
447 Copyright (C) 2008 by Marcello Romani
449 This library is free software; you can redistribute it and/or modify
450 it under the same terms as Perl itself, either Perl version 5.8.8 or,
451 at your option, any later version of Perl 5 you may have available.