Frédéric Marchal [Sun, 19 Jun 2011 19:33:48 +0000 (19:33 +0000)]
Alias the host names in the redirector report
The scheme is removed from the URL even for a custom report format, and
the URL is always truncated to keep only the host name. Previously, the
full URL was always reported for a custom log format.
In addition, the reported host name is replaced by the alias if one is
defined.
Identical host names are not grouped because the report lists one
access per line along with the access time.
Frédéric Marchal [Sun, 19 Jun 2011 18:32:18 +0000 (18:32 +0000)]
Don't report a clickable link for an aliased URL
The HTML reports contain A tags to link to the page visited by the users
but if the host name is aliased, the link is meaningless and must not be
reported.
Frédéric Marchal [Sat, 18 Jun 2011 10:32:16 +0000 (10:32 +0000)]
External sort command delimits the columns only on a tab
The default sort command splits the columns on a blank to non-blank
transition but our files only use a tab as the column separator.
Therefore, the calls to the external sort command explicitly require
that the columns be delimited by a tab. It prevents problems
when the fields contain spaces.
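As an illustration of the change above, here is a minimal sketch of how
a sort command restricted to tab-separated columns might be assembled.
The function name build_sort_command, the key number and the file names
are hypothetical and not taken from sarg; only the POSIX sort options
-t, -k and -o are relied upon. File names containing other shell
metacharacters are not handled by this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a sort command that splits the columns on a tab only.
 * Hypothetical helper: returns 0 on success, -1 if the command does
 * not fit in buf or if a file name cannot be quoted safely. */
int build_sort_command(char *buf, size_t size, int key,
                       const char *input, const char *output)
{
    int n;

    assert(buf != NULL && input != NULL && output != NULL);
    /* A double quote in a name would break the quoting used below. */
    if (strchr(input, '"') != NULL || strchr(output, '"') != NULL)
        return -1;
    /* -t "<TAB>": the shell receives a literal tab between the quotes. */
    n = snprintf(buf, size, "sort -t \"\t\" -k %d,%d -o \"%s\" \"%s\"",
                 key, key, output, input);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Without -t, a field such as "John Doe" would be split into two columns
and the key would point at the wrong data.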
Frédéric Marchal [Sat, 18 Jun 2011 10:31:49 +0000 (10:31 +0000)]
Alias host names in URL and group identical names
The user can write a file providing rules to replace the host names
extracted from the URL and displayed in the reports. The rules allow
for one wildcard in the host names to be matched.
Identical aliased host names are grouped together in the reports.
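A minimal sketch of matching a host name against a rule with a single
wildcard follows. The '*' wildcard character, the struct host_alias
layout and both function names are assumptions made for the example;
the actual syntax of the alias file is not described here. Host names
are assumed to be already lowercased.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if host matches pattern, where pattern may contain at most
 * one '*' standing for any run of characters (possibly empty). */
int host_matches(const char *pattern, const char *host)
{
    const char *star;
    size_t prefix_len, suffix_len, host_len;

    assert(pattern != NULL && host != NULL);
    star = strchr(pattern, '*');
    if (star == NULL)
        return strcmp(pattern, host) == 0;

    prefix_len = (size_t)(star - pattern);
    suffix_len = strlen(star + 1);
    host_len = strlen(host);
    /* The host must be long enough to hold both fixed parts. */
    if (host_len < prefix_len + suffix_len)
        return 0;
    return strncmp(host, pattern, prefix_len) == 0 &&
           strcmp(host + host_len - suffix_len, star + 1) == 0;
}

/* Hypothetical rule: a pattern and the alias that replaces the host. */
struct host_alias {
    const char *pattern;
    const char *alias;
};

/* Return the alias of the first matching rule, or host if none matches. */
const char *alias_host(const struct host_alias *rules, size_t count,
                       const char *host)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (host_matches(rules[i].pattern, host))
            return rules[i].alias;
    return host;
}
```

Because every host matching the same rule yields the same alias string,
grouping identical names in the report reduces to comparing the aliases.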
Fix the error messages when parsing a redirector log with custom format
If redirector_log_format is set in sarg.conf, the error messages displayed for
any error encountered while parsing the format string are unclear or wrong.
This patch fixes the messages and explains why the format string could not
be used.
The files and directories are named after the user the report is
about. Therefore, even if the administrator tries to hide the user's
identity with a useratb file, the real identity is still visible in the
URL.
To solve this problem, option anonymous_output_files was added to
sarg.conf. When it is on, each user's file is named using a unique
number that can't be traced back to the real user.
This patch also makes it possible to shorten the URL of the report.
The man page says that the program should try again when getnameinfo
returns EAI_AGAIN but it doesn't say whether the program should wait or
how many attempts it should make. Therefore, we assume the implementation
just wants us to call it again but we won't waste more time than that.
The number of IP addresses to resolve is potentially very big and it
doesn't matter much if a few addresses are not resolved.
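The policy described above (one extra attempt, no delay) can be
sketched as follows. The function name resolve_name_retry_once is
hypothetical; getnameinfo and EAI_AGAIN are the standard POSIX
interface.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Call getnameinfo and, if it reports a temporary failure, call it
 * exactly once more without waiting. Returns the getnameinfo result. */
int resolve_name_retry_once(const struct sockaddr *sa, socklen_t salen,
                            char *host, socklen_t hostlen, int flags)
{
    int rc;

    assert(sa != NULL && host != NULL);
    rc = getnameinfo(sa, salen, host, hostlen, NULL, 0, flags);
    if (rc == EAI_AGAIN)
        rc = getnameinfo(sa, salen, host, hostlen, NULL, 0, flags);
    return rc;
}
```

A caller that still gets a non-zero result after the second attempt can
simply keep the numeric address in the report.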
The time was ignored when parsing a squid log file written with the common
logformat. The consequence was that all the accesses were reported as occurring
at 00:00.
Frédéric Marchal [Fri, 25 Feb 2011 20:32:09 +0000 (20:32 +0000)]
Don't abort for an empty report directory
If sarg fails and leaves an empty report directory, one without a
sarg-date file in it, any subsequent execution of sarg will fail due to
that empty report directory.
This change ignores such an empty directory and issues a simple warning.
Frédéric Marchal [Fri, 25 Feb 2011 08:05:14 +0000 (08:05 +0000)]
Take the date_format into account when converting a file
The date_format parameter read from sarg.conf was taken into account too late
in the program flow and was ignored during the conversion or the splitting of a
file. Only command line option -g was effective.
Don't delete a file twice if -i is given on the command line
If sarg is run with command line option -i, in some circumstances I have
yet to clarify, the ip file is not produced. In that case, the
previously created file (whose name is still in the string buffer)
is deleted a second time. The result is a failure as the file doesn't
exist any more.
Enable a warning in gcc to stop the compilation if an empty body is found after
some control structures. It should detect stray semi-colons at the end of the
control structures such as if, for, while,...
Fix a problem with the attributes passed to ldap_search
The attributes list passed to ldap_search must be terminated by a NULL
pointer. That wasn't the case in sarg and was likely responsible for a
segfault. It should be fixed now.
According to the gettext manual, AM_GNU_GETTEXT_VERSION sets the
minimum gettext version required to build the package but that doesn't
look quite right as my system insists on using that exact version of
gettext to install the po files.
Frédéric Marchal [Mon, 31 Jan 2011 20:17:25 +0000 (20:17 +0000)]
Accept any number of user IDs in the LDAP filter string
The previous code would only accept up to five %s in the LDAP search
string. It is sufficient in most cases but we can do better than that
and accept any number of occurrences as long as the resulting filter
string can fit in the fixed size buffer hard coded in sarg.
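A minimal sketch of such an expansion is shown below: every %s in the
template is replaced by the user ID, and the only limit is the size of
the destination buffer. The function name expand_filter and the sample
filter strings are hypothetical; this is not the code used in sarg.

```c
#include <assert.h>
#include <string.h>

/* Copy tmpl into out, replacing every "%s" with user.
 * Returns 0 on success, -1 if the result does not fit in size bytes. */
int expand_filter(char *out, size_t size, const char *tmpl,
                  const char *user)
{
    size_t user_len, pos = 0;

    assert(out != NULL && tmpl != NULL && user != NULL);
    if (size == 0)
        return -1;
    user_len = strlen(user);

    while (*tmpl != '\0') {
        if (tmpl[0] == '%' && tmpl[1] == 's') {
            /* Keep one byte for the terminating NUL. */
            if (pos + user_len + 1 > size)
                return -1;
            memcpy(out + pos, user, user_len);
            pos += user_len;
            tmpl += 2;
        } else {
            if (pos + 2 > size)
                return -1;
            out[pos++] = *tmpl++;
        }
    }
    out[pos] = '\0';
    return 0;
}
```

For example, the template "(|(uid=%s)(cn=%s))" expanded with the user
"bob" gives "(|(uid=bob)(cn=bob))".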
Frédéric Marchal [Fri, 28 Jan 2011 08:19:27 +0000 (08:19 +0000)]
Change gettext version to suit autopoint 0.17
Autopoint version 0.17 requires the gettext version to be rounded at
the revision number in AM_GNU_GETTEXT_VERSION. Therefore, the previous
minimum required version of 0.15.1 is rounded up to 0.16.
Frédéric Marchal [Thu, 27 Jan 2011 19:51:25 +0000 (19:51 +0000)]
Have a more compatible Makefile?
The patsubst function introduced in sarg 2.3 to avoid the duplicate
file name lists in Makefile.in is a GNU make extension that is not
supported by BSD make. Therefore sarg 2.3 fails to compile on that
system with the default make command.
According to the GNU make documentation, the variable substitution
should be more compatible with other implementations but it isn't
clear whether it is accepted by BSD make or not. Let's try it.
Frédéric Marchal [Thu, 27 Jan 2011 19:51:10 +0000 (19:51 +0000)]
Lower the minimum gettext version requirement
We don't have to ask for the most recent version of gettext. A quick
search through gettext's sources showed that 0.15.1 is likely to
contain all the features we need.
In fact, all the features we need may be available since version 0.12
but I'm not sure about that.
Frédéric Marchal [Thu, 27 Jan 2011 15:27:26 +0000 (15:27 +0000)]
Fix the creation of a report when index is set to only
Some unnecessary files are not created as they won't be used in the
report but too many of them are still created, especially the
temporary files. There is room for future improvements.
The index doesn't contain any link to the details of the user's
connections, visited sites, downloads and so on.
Frédéric Marchal [Thu, 27 Jan 2011 15:27:01 +0000 (15:27 +0000)]
Don't try to produce the users' report if indexonly is set
The creation of the users' report was failing when indexonly was set
because the list of the users to process is taken from memory instead
of being collected from the files in the output directory.
As no temporary file is created when indexonly is set, the output
directory is empty but, as the users' names are still stored in memory,
sarg tried to read the nonexistent files and aborted.
Frédéric Marchal [Thu, 27 Jan 2011 15:26:45 +0000 (15:26 +0000)]
Factorisation of the usage of indexonly
Several functions were called and then decided to return immediately
if indexonly was set. Most of those functions were called once or
twice but never from more than one place.
This change lets the calling function decide whether the call must be
made based on the value of indexonly, which is always known.
The gain is mainly a reduction of the number of parameters passed to a
few functions. It also makes the code more readable as it is not
necessary to dig into the function to discover that it does nothing in
that case.
Frédéric Marchal [Tue, 25 Jan 2011 21:08:34 +0000 (21:08 +0000)]
Split the input log file into several files
Each file contains one day's worth of data. The name of the output file
is made of a user supplied prefix and the date corresponding to the
data in the file. The file may be written in a directory selected with
command line option -o.
Frédéric Marchal [Tue, 25 Jan 2011 21:08:09 +0000 (21:08 +0000)]
Don't try to produce a parsed log if parsed_output_log is none
The correct value to set in parsed_output_log to disable the parsed
log is "no" but if the user enters "none" as is usual with the other
parameters, parsed_output_log is set to an empty string which is not
equivalent to "no". The creation of the parsed log would then proceed
and fail because the path is invalid.
Frédéric Marchal [Tue, 25 Jan 2011 21:07:55 +0000 (21:07 +0000)]
Fix a warning about the type of sizeof as expected by printf
The size returned by sizeof should easily fit in an int. There is no
need to use the more standard %zu, which may not be portable to less
compatible systems.
Frédéric Marchal [Fri, 21 Jan 2011 15:56:05 +0000 (15:56 +0000)]
Resolve IPv6 addresses when creating the datafile
There is an option to resolve the addresses of the visited web sites
into an IP address but the existing code was only capable of
converting host names to IPv4 addresses.
If getaddrinfo is available on the system, it is used to resolve the
host names.
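A minimal sketch of a resolver that handles both IPv4 and IPv6 through
getaddrinfo follows. The function name resolve_host and its extra_flags
parameter are hypothetical; only the first address returned is kept.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Resolve host and write the text form of its first address (IPv4 or
 * IPv6) into out. extra_flags is copied into ai_flags.
 * Returns 0 on success or an EAI_* error code. */
int resolve_host(const char *host, int extra_flags,
                 char *out, socklen_t outlen)
{
    struct addrinfo hints, *res = NULL;
    const void *addr;
    int rc;

    assert(host != NULL && out != NULL);
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* avoid one entry per socket type */
    hints.ai_flags = extra_flags;

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0)
        return rc;

    if (res->ai_family == AF_INET6) {
        addr = &((const struct sockaddr_in6 *)res->ai_addr)->sin6_addr;
    } else if (res->ai_family == AF_INET) {
        addr = &((const struct sockaddr_in *)res->ai_addr)->sin_addr;
    } else {
        freeaddrinfo(res);
        return EAI_FAMILY;
    }
    rc = inet_ntop(res->ai_family, addr, out, outlen) != NULL
             ? 0 : EAI_SYSTEM;
    freeaddrinfo(res);
    return rc;
}
```

Passing AI_NUMERICHOST in extra_flags is a convenient way to exercise
the function without any DNS traffic.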
Frédéric Marchal [Wed, 12 Jan 2011 15:05:10 +0000 (15:05 +0000)]
Fix the month numbers read from a sarg log
The month numbers parsed from a sarg log file name were in the range 1
to 12 but they must be in the range 0 to 11. This problem has been
fixed thanks to Denis Konchekov.
In addition, the numerical values of the days and months parsed from
the sarg log file name are more strictly validated.
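The range conversion and the stricter validation can be sketched as
below. It is an assumption of this example that the values end up in a
struct tm, whose tm_mon field counts months from 0; the function name
set_day_month is hypothetical.

```c
#include <assert.h>
#include <time.h>

/* Validate a day (1 to 31) and a month (1 to 12) as read from a file
 * name and store them in t, with the month converted to 0 to 11.
 * Returns 0 on success, -1 if either value is out of range. */
int set_day_month(struct tm *t, int day, int month)
{
    assert(t != NULL);
    if (day < 1 || day > 31 || month < 1 || month > 12)
        return -1;
    t->tm_mday = day;
    t->tm_mon = month - 1;  /* struct tm counts months from 0 */
    return 0;
}
```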
The parsed log file created when parsed_output_log is set uses a
completely numerical date but, when the log file is read again, sarg
expects the month to be a name and therefore rejects the file as having
an invalid name.
Frédéric Marchal [Mon, 27 Dec 2010 12:30:34 +0000 (12:30 +0000)]
Fix creation of the report's file name from the mangling of the user name
The directory and file name used to store a user's report are produced
by replacing every character of the user name other than letters,
digits and a few safe characters. But that name mangling was not dealing
properly with consecutive invalid characters. The result was, at best,
a truncated file name (without any real consequence) or, at worst, an
invalid character left in the file name.
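A minimal sketch of a mangling loop that treats every character
independently, and therefore copes with consecutive invalid characters,
is given below. The set of safe characters ("-", "_" and ".") and the
replacement character "_" are assumptions made for the example, as is
the function name mangle_user_name.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Copy user into out, replacing every character that is not a letter,
 * a digit or one of "-_." with an underscore.
 * Returns 0 on success, -1 if the result does not fit in size bytes. */
int mangle_user_name(char *out, size_t size, const char *user)
{
    size_t i;

    assert(out != NULL && user != NULL);
    if (size == 0)
        return -1;
    for (i = 0; user[i] != '\0'; i++) {
        unsigned char c = (unsigned char)user[i];

        if (i + 1 >= size)  /* keep room for the terminating NUL */
            return -1;
        if (isalnum(c) || strchr("-_.", c) != NULL)
            out[i] = (char)c;
        else
            out[i] = '_';
    }
    out[i] = '\0';
    return 0;
}
```

Since the output has the same length as the input, the name is never
truncated as long as the buffer is large enough.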
Frédéric Marchal [Sun, 26 Dec 2010 18:28:45 +0000 (18:28 +0000)]
Use internal user list instead of scanning the directory for users
Sarg used to scan the temporary directory for files to process and
extract the user name from the files. But, now, it is a waste of time
to read the directory as a list of the known users is already in
memory.
This change also ignores stray files left by the crash of a previous
run.
Frédéric Marchal [Sun, 26 Dec 2010 13:13:36 +0000 (13:13 +0000)]
Don't rebuild sarg.1 when making all
The building of the manpage requires the xsl stylesheets that are not
located at the same place on every system. Moreover, it is not
necessary to rebuild it as it can be shipped with the sources.
Therefore, "make all" only rebuilds sarg without making sarg.1.
Frédéric Marchal [Fri, 24 Dec 2010 08:47:12 +0000 (08:47 +0000)]
Convert some variables in Makefile to lowercase
The Makefile variables BINDIR, SYSCONFDIR, MANDIR and MAN1DIR are now
written in lowercase as it is the standard way to write such directory
variables with automake.
This change allows the not too uncommon path rewriting during
installation such as "make install bindir=/tmp".
Frédéric Marchal [Mon, 29 Nov 2010 20:22:24 +0000 (20:22 +0000)]
Delete unused files from the directory containing the user report
When the user_report_limit is set and the site_user_time_date report
is requested, the supernumerary time files were not deleted. They were
wasting a lot of disk space.
Now, only the time files that are linked to the user report are kept.
Frédéric Marchal [Sun, 28 Nov 2010 15:41:03 +0000 (15:41 +0000)]
Add an option to sort the topsites by time
This option is not identical to the dynamic sort as it selects the top
sites from the entire list and only displays the requested number of
entries. Then the dynamic sort only offers to sort the truncated list.
Frédéric Marchal [Sun, 28 Nov 2010 15:40:43 +0000 (15:40 +0000)]
Fix a problem in the parsing of the sort option in sarg.conf
The sort criterion of any sort parameter in sarg.conf was not stored
in the internal variable but or'ed with the default value. Therefore,
the sort was not performed if the selected value was tested after the
default value in the code.
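The kind of bug described above can be illustrated with a small sketch.
All the flag names and both functions are hypothetical and do not
reflect the actual constants in sarg; the point is only that or'ing a
new criterion into the default leaves two criteria set, and the one
tested first wins.

```c
#include <assert.h>

/* Hypothetical sort flags: three exclusive criteria and one modifier. */
#define SORT_CONNECT        0x01u
#define SORT_BYTES          0x02u
#define SORT_TIME           0x04u
#define SORT_CRITERION_MASK 0x07u
#define SORT_REVERSE        0x10u

/* Store a new criterion: clear the previous one instead of or'ing. */
unsigned replace_sort_criterion(unsigned current, unsigned criterion)
{
    assert((criterion & ~SORT_CRITERION_MASK) == 0);
    return (current & ~SORT_CRITERION_MASK) | criterion;
}

/* Pick the criterion the way a chain of "if" tests would: the first
 * flag tested wins when several are set. */
unsigned pick_sort_key(unsigned flags)
{
    if (flags & SORT_BYTES)
        return SORT_BYTES;
    if (flags & SORT_TIME)
        return SORT_TIME;
    return SORT_CONNECT;
}
```

With a default of SORT_BYTES, or'ing in SORT_TIME still sorts by bytes,
whereas replace_sort_criterion yields a sort by time.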
Frédéric Marchal [Sat, 27 Nov 2010 21:05:35 +0000 (21:05 +0000)]
Add --lastlog and --keeplogs to the command line options
Those two options supersede the lastlog option from sarg.conf. It is
useful to have them on the command line to change the number of
reports in the daily, weekly or monthly directories.
Fix uninitialized variable when double checking the top users
The configuration option --enable-doublecheck was compiling in an
invalid piece of code that could potentially use an uninitialized
variable when building the list of the top users.
Escape the LDAP search string instead of truncating it.
A few characters must be escaped in an LDAP search string. Sarg used to
truncate the user login name at the first "dubious" character found in
the string and the list of "dubious" characters was much longer than
necessary. Instead of truncating the user login, this patch escapes the
characters.
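A minimal sketch of such an escaping routine follows. It assumes the
escaping rules of RFC 4515, where "*", "(", ")" and "\" are replaced by
a backslash followed by two hexadecimal digits; the exact list used by
sarg is not given here, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copy in to out, escaping the characters that are special in an LDAP
 * search filter value as a backslash and two hexadecimal digits.
 * Returns 0 on success, -1 if the result does not fit in size bytes. */
int escape_ldap_filter_value(char *out, size_t size, const char *in)
{
    size_t pos = 0;

    assert(out != NULL && in != NULL);
    if (size == 0)
        return -1;
    for (; *in != '\0'; in++) {
        if (strchr("*()\\", *in) != NULL) {
            /* Three characters plus the terminating NUL. */
            if (pos + 4 > size)
                return -1;
            snprintf(out + pos, size - pos, "\\%02x",
                     (unsigned)(unsigned char)*in);
            pos += 3;
        } else {
            if (pos + 2 > size)
                return -1;
            out[pos++] = *in;
        }
    }
    out[pos] = '\0';
    return 0;
}
```

With this approach a login such as "j*doe" is searched as "j\2adoe"
instead of being cut down to "j".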
Report an error if the tail command cannot read the last lines of the log file when creating a realtime report.
Surround the log file name with quotes in the command that reads the trailing lines of the log file when creating a realtime report.
Frédéric Marchal [Fri, 27 Aug 2010 13:16:43 +0000 (13:16 +0000)]
Initialize the variables that are used to build the date range when command line option -d is used.
It should prevent a segfault if the complex nesting of "if" fails to set a variable.
Thanks to ItalianPenguin for reporting this bug.
Frédéric Marchal [Tue, 17 Aug 2010 09:00:43 +0000 (09:00 +0000)]
Don't delete the temporary directory after sending the e-mail as it is deleted later by the program and it prevents the purging routine from working properly.
Frédéric Marchal [Tue, 17 Aug 2010 09:00:22 +0000 (09:00 +0000)]
Remove the quotes around the MailUtility command to allow the user to call a script or to add more options to the command.
The description of the mail_utility configuration option has been amended accordingly.