Frédéric Marchal [Thu, 10 Nov 2011 19:59:17 +0000 (19:59 +0000)]
Add missing \n on several debug messages
Some messages were missing the trailing newline. As a result, the
following message was appended to the previous one on the same line,
where it could go unnoticed.
Some directories are created to store the user's data but if they end up
not being used, they are deleted along with their content. It saves
space on the disk.
A nicer fix would be not to create the directories and their content in
the first place but I'll keep that for the next release.
Frédéric Marchal [Sun, 30 Oct 2011 19:04:33 +0000 (19:04 +0000)]
Add links on the Sites & Users page to the user's page
The page listing the links visited and who visited them contains links to
jump directly to the page of each user provided they are on the top
users page.
Frédéric Marchal [Sun, 30 Oct 2011 19:04:12 +0000 (19:04 +0000)]
Use a function to safely copy the strings
Some calls already used strncpy to copy strings. The copy is now
encapsulated in a function, which is also used to copy the arguments
passed to sarg by the user.
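A minimal sketch of what such a copy helper could look like is given here. The name copy_string and its return convention are hypothetical and are not taken from the sarg sources.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dest without writing more than dest_size bytes and always
 * leave dest NUL-terminated. Returns 0 on success, -1 if src had to be
 * truncated or the arguments are unusable.
 * Hypothetical helper: name and return values are assumptions.
 */
static int copy_string(char *dest, size_t dest_size, const char *src)
{
    size_t len;

    if (dest == NULL || dest_size == 0 || src == NULL)
        return -1;
    len = strlen(src);
    if (len >= dest_size) {
        /* Keep what fits and terminate explicitly, unlike a bare strncpy. */
        memcpy(dest, src, dest_size - 1);
        dest[dest_size - 1] = '\0';
        return -1;
    }
    memcpy(dest, src, len + 1);
    return 0;
}
```

Unlike a bare strncpy, the helper reports truncation, so the caller can reject an overlong command line argument instead of silently using a shortened one.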
Frédéric Marchal [Sun, 30 Oct 2011 14:42:42 +0000 (14:42 +0000)]
Display some messages to understand why sarg isn't doing something
One common class of questions from users is to ask why sarg isn't
producing some kind of report. Considering the number of configuration
parameters, it is not surprising that some users get lost.
To help the users help themselves, the -z command line option has been
enhanced to print messages indicating why sarg doesn't produce a report.
Times with a one digit hour must be sorted before the times with two
digits. To fix this issue, a padding zero is prefixed to the hour if it
contains only one digit.
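The padding can be illustrated with a short C sketch. The function name format_time and the hh:mm:ss layout are assumptions made for the example; the point is only that a zero-padded hour makes a plain string comparison agree with the chronological order.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format a time with a zero-padded hour so that "09:05:00" sorts before
 * "10:00:00" in a plain string comparison.
 * Hypothetical helper: the hh:mm:ss layout is an assumption.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int format_time(char *out, size_t out_size, int hour, int minute, int second)
{
    int n = snprintf(out, out_size, "%02d:%02d:%02d", hour, minute, second);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```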
Write a note about the ignored items in the reports
Several parameters of sarg.conf can limit the number of lines written in the
reports but, so far, only the denied report was reporting how many entries were
left out.
Now, the authentication failure, the dansguardian and the redirector reports do
write the number of ignored lines.
That change should spare some headaches to the users trying to understand the
reports.
Correctly append a suffix to the mangled temporary file name
When two users end up with the same mangled temporary file name, a suffix is
supposed to be added at the end of the new file name to make it distinct from the
previous one but the suffix was added one byte too far making it useless.
The result was that the log entries of two or more users were written into the
same file, overwriting each other's data and corrupting the report.
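The kind of off-by-one described above can be sketched as follows. The function name append_suffix and the "_N" suffix layout are hypothetical; sarg's actual naming scheme may differ.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Append a numeric suffix to name, starting exactly on its terminating NUL.
 * Hypothetical helper: the "_%d" layout is an assumption.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int append_suffix(char *name, size_t name_size, int suffix)
{
    size_t len = strlen(name);
    int n;

    if (len >= name_size)
        return -1;
    /*
     * Writing at name + len + 1 instead would leave the old NUL in place,
     * so the string would still end before the suffix.
     */
    n = snprintf(name + len, name_size - len, "_%d", suffix);
    return (n < 0 || (size_t)n >= name_size - len) ? -1 : 0;
}
```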
Protect the columns sorting against missing or invalid dates
Sorttable.js fails on columns containing a date on the first row but not on
subsequent rows. This patch assumes a zero date for any missing or invalid date
sorting those rows at the top of the table.
Add the javascript to dynamically sort the tables in the reports
This is the original script written by Stuart Langridge and downloaded on
http://www.kryogenix.org/code/browser/sorttable/. It is included here as the
script contains some bugs and the author doesn't seem to be supporting his
script any more.
Frédéric Marchal [Sat, 25 Jun 2011 12:49:04 +0000 (12:49 +0000)]
Add support for IPv6 in the aliasing of host names
IPv6 addresses can be defined in the hostalias file and accept the CIDR
notation.
Square brackets are not required around the IP address in the hostalias
file but the log file should enclose the IPv6 address between square
brackets to avoid confusion with the port number.
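As an illustration of matching an IPv6 address against a CIDR prefix, here is a minimal sketch built on the POSIX inet_pton call. The function name ipv6_in_network is hypothetical and the code is not taken from sarg.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Return 1 if addr belongs to network/prefix_len, 0 if it doesn't and
 * -1 if an argument can't be parsed.
 * Hypothetical helper written for this example.
 */
static int ipv6_in_network(const char *addr, const char *network, int prefix_len)
{
    struct in6_addr a, n;
    int full_bytes, rem_bits;

    if (prefix_len < 0 || prefix_len > 128)
        return -1;
    if (inet_pton(AF_INET6, addr, &a) != 1 || inet_pton(AF_INET6, network, &n) != 1)
        return -1;
    full_bytes = prefix_len / 8;
    rem_bits = prefix_len % 8;
    /* Compare the bytes entirely covered by the prefix. */
    if (memcmp(a.s6_addr, n.s6_addr, (size_t)full_bytes) != 0)
        return 0;
    /* Compare the leading bits of the partially covered byte, if any. */
    if (rem_bits != 0) {
        unsigned char mask = (unsigned char)(0xFF << (8 - rem_bits));

        if ((a.s6_addr[full_bytes] & mask) != (n.s6_addr[full_bytes] & mask))
            return 0;
    }
    return 1;
}
```

Any square brackets found around the address in the log file would have to be stripped before calling inet_pton, which only accepts the bare address.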
Frédéric Marchal [Thu, 23 Jun 2011 13:57:33 +0000 (13:57 +0000)]
Increase the limit on the number of days that can be processed
If a log file (possibly restricted to a date range) ends up with more than 90
different dates, sarg aborts and complains that there are too many dates.
That restriction is just a safety and isn't critical so it has been increased
to a more reasonable value.
Frédéric Marchal [Sun, 19 Jun 2011 19:33:48 +0000 (19:33 +0000)]
Alias the host names in the redirector report
The scheme is now removed from the URL and the URL is truncated to keep
only the host name, even for a custom log format. Previously, the full
URL was always reported for a custom log format.
In addition, the reported host name is replaced by the alias if one is
defined.
Identical host names are not grouped as the report lists one access per
line along with the access time.
Frédéric Marchal [Sun, 19 Jun 2011 18:32:18 +0000 (18:32 +0000)]
Don't report clickable link for aliased url
The HTML reports contain A tags to link to the page visited by the users
but if the host name is aliased, the link is meaningless and must not be
reported.
Frédéric Marchal [Sat, 18 Jun 2011 10:32:16 +0000 (10:32 +0000)]
External sort command delimits the columns only on a tabulation
The default sort command splits the columns on a blank to non blank
transition but our files only use a tabulation as the column separator.
Therefore, the calls to the external sort command explicitly require
that the columns be identified by a tabulation. It prevents problems
when the fields contain spaces.
Frédéric Marchal [Sat, 18 Jun 2011 10:31:49 +0000 (10:31 +0000)]
Alias host names in URL and group identical names
The user can write a file providing rules to replace the host names
extracted from the URL and displayed in the reports. The rules allow
for one wildcard in the host names to be matched.
Identical aliased host names are grouped together in the reports.
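A rule with a single wildcard can be matched by comparing the text before and after the wildcard, as in the sketch below. The function name host_matches and the use of '*' as the wildcard character are assumptions, and the comparison is case sensitive for brevity.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if host matches pattern, where pattern contains at most one '*'
 * standing for any run of characters, and 0 otherwise.
 * Hypothetical helper: the '*' syntax is an assumption.
 */
static int host_matches(const char *pattern, const char *host)
{
    const char *star = strchr(pattern, '*');
    size_t prefix_len, suffix_len, host_len;

    if (star == NULL)
        return strcmp(pattern, host) == 0;
    prefix_len = (size_t)(star - pattern);
    suffix_len = strlen(star + 1);
    host_len = strlen(host);
    /* The host must be long enough to hold both fixed parts. */
    if (host_len < prefix_len + suffix_len)
        return 0;
    return strncmp(host, pattern, prefix_len) == 0 &&
           strcmp(host + host_len - suffix_len, star + 1) == 0;
}
```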
Fix the error messages when parsing a redirector log with custom format
If redirector_log_format is set in sarg.conf, the error messages displayed for
any error encountered while parsing the format string are unclear or wrong.
This patch fixes the messages and explains why the format string could not
be used.
The files and directories are named after the user the report is
about. Therefore, even if the administrator tries to hide the user's
identity with a useratb file, the real identity is still visible in the
URL.
To solve this problem, option anonymous_output_files was added to
sarg.conf. When it is on, each user's file is named using a unique
number that can't be traced back to the real user.
This patch also makes it possible to shorten the URL of the report.
The man page tells that the program should try again when getnameinfo
returns EAI_AGAIN but it doesn't say if the program should wait and how
many attempts it should perform. Therefore, we assume the implementation
just wants us to call it again but we won't waste more time than that.
The number of IP addresses to resolve is potentially very big and it
doesn't matter much if a few addresses are not resolved.
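The policy described above, a single extra attempt on EAI_AGAIN, could be written along these lines. The function name resolve_name and its parameters are hypothetical; only getnameinfo and EAI_AGAIN come from the standard API.

```c
#include <netdb.h>
#include <sys/socket.h>

/*
 * Call getnameinfo and call it one more time if it answers EAI_AGAIN.
 * Hypothetical wrapper: the retry count of one follows the text above.
 * Returns the last getnameinfo result (0 on success).
 */
static int resolve_name(const struct sockaddr *sa, socklen_t sa_len,
                        char *host, socklen_t host_size, int flags)
{
    int attempt;
    int rc = EAI_AGAIN;

    /* At most two calls: the initial one and a single retry. */
    for (attempt = 0; attempt < 2 && rc == EAI_AGAIN; attempt++)
        rc = getnameinfo(sa, sa_len, host, host_size, NULL, 0, flags);
    return rc;
}
```

A caller that gets a non-zero result can simply keep the numeric address in the report, which fits the remark that a few unresolved addresses don't matter much.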
The time was ignored when parsing a squid log file written with the common
logformat. The consequence was that all the accesses were reported as occurring
at 00:00.
Frédéric Marchal [Fri, 25 Feb 2011 20:32:09 +0000 (20:32 +0000)]
Don't abort for an empty report directory
If sarg fails and leaves an empty report directory, one without a
sarg-date file in it, any subsequent execution of sarg will fail due to
that empty report directory.
This change ignores such an empty directory and issues a simple warning.
Frédéric Marchal [Fri, 25 Feb 2011 08:05:14 +0000 (08:05 +0000)]
Take the date_format into account when converting a file
The date_format parameter read from sarg.conf was taken into account too late
in the program flow and was ignored during the conversion or the splitting of a
file. Only command line option -g was effective.
Don't delete a file twice if -i is given on the command line
If sarg is run with command line option -i, in some circumstances I have
yet to clarify, the ip file is not produced. In that case, the
previously created file (whose name is still in the string buffer)
is deleted a second time. The result is a failure as the file doesn't
exist any more.
Enable a warning in gcc to stop the compilation if an empty body is found after
some control structures. It should detect stray semi-colons at the end of the
control structures such as if, for, while,...
Fix a problem with the attributes passed to ldap_search
The attributes list passed to ldap_search must be terminated by a NULL
pointer. That wasn't the case in sarg and was likely responsible for a
segfault. It should be fixed now.
According to the gettext manual, AM_GNU_GETTEXT_VERSION sets the
minimum gettext version required to build the package but it doesn't
look quite right as my system insists on using that exact version of
gettext to install the po files.
Frédéric Marchal [Mon, 31 Jan 2011 20:17:25 +0000 (20:17 +0000)]
Accept any number of user id in the LDAP filter string
The previous code would only accept up to five %s in the LDAP search
string. It is sufficient in most cases but we can do better than that
and accept any number of occurrences as long as the resulting filter
string can fit in the fixed size buffer hard coded in sarg.
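One way to accept an arbitrary number of %s is to expand the filter by hand into the fixed size buffer, as sketched below. The function name build_filter is hypothetical, and the sketch doesn't escape the special characters of LDAP filters in the user id.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy filter into out, replacing every "%s" with user_id.
 * Hypothetical helper, not sarg's code. The user id is inserted verbatim.
 * Returns 0 on success, -1 if the result doesn't fit in out_size bytes.
 */
static int build_filter(char *out, size_t out_size, const char *filter, const char *user_id)
{
    size_t id_len = strlen(user_id);
    size_t used = 0;

    if (out_size == 0)
        return -1;
    while (*filter != '\0') {
        if (filter[0] == '%' && filter[1] == 's') {
            /* Keep one byte free for the terminating NUL. */
            if (used + id_len >= out_size)
                return -1;
            memcpy(out + used, user_id, id_len);
            used += id_len;
            filter += 2;
        } else {
            if (used + 1 >= out_size)
                return -1;
            out[used++] = *filter++;
        }
    }
    out[used] = '\0';
    return 0;
}
```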
Frédéric Marchal [Fri, 28 Jan 2011 08:19:27 +0000 (08:19 +0000)]
Change gettext version to suit autopoint 0.17
Autopoint version 0.17 requires the gettext version to be rounded at
the revision number in AM_GNU_GETTEXT_VERSION. Therefore, the previous
minimum required version of 0.15.1 is rounded up to 0.16.
Frédéric Marchal [Thu, 27 Jan 2011 19:51:25 +0000 (19:51 +0000)]
Have a more compatible Makefile?
The patsubst function introduced in sarg 2.3 to avoid the duplicate
file name lists in Makefile.in is a GNU make extension that is not
supported by BSD make. Therefore sarg 2.3 fails to compile on that
system with the default make command.
According to the GNU make documentation, the variable substitution
should be more compatible with other implementations but it isn't
clear whether it is accepted by BSD make or not. Let's try it.
Frédéric Marchal [Thu, 27 Jan 2011 19:51:10 +0000 (19:51 +0000)]
Lower the minimum gettext version requirement
We don't have to ask for the most recent version of gettext. A quick
search through gettext's sources showed that 0.15.1 is likely to
contain all the features we need.
In fact, all the features we need may be available since version 0.12
but I'm not sure about that.
Frédéric Marchal [Thu, 27 Jan 2011 15:27:26 +0000 (15:27 +0000)]
Fix the creation of a report when index is set to only
Some unnecessary files are not created as they won't be used in the
report but too many of them are still created, especially the
temporary files. There is room for future improvements.
The index doesn't contain any link to the details of the user's
connections, visited sites, downloads and so on.
Frédéric Marchal [Thu, 27 Jan 2011 15:27:01 +0000 (15:27 +0000)]
Don't try to produce the users' report if indexonly is set
The creation of the users' report was failing when indexonly was set
because the list of the users to process is taken from memory instead
of collecting it from the files in the output directory.
As no temporary file is created when indexonly is set, the output
directory is empty but as the users' names are still stored in memory,
sarg tried to read the nonexistent file and aborted.
Frédéric Marchal [Thu, 27 Jan 2011 15:26:45 +0000 (15:26 +0000)]
Factorisation of the usage of indexonly
Several functions were called and then decided to return immediately
if indexonly was set. Most of those functions were called once or
twice but never from more than one place.
This change lets the calling function decide if the call must be made
based on the value of indexonly which is always known.
The gain is mainly a reduction of the number of parameters passed to a
few functions. It also makes the code more readable as it is not
necessary to dig into the function to discover that it does nothing in
that case.
Frédéric Marchal [Tue, 25 Jan 2011 21:08:34 +0000 (21:08 +0000)]
Split the input log file into several files
Each file contains one day worth of data. The name of the output file
is made of a user supplied prefix and the date corresponding to the
data in the file. The file may be written in a directory selected with
command line option -o.
Frédéric Marchal [Tue, 25 Jan 2011 21:08:09 +0000 (21:08 +0000)]
Don't try to produce a parsed log if parsed_output_log is none
The correct value to set in parsed_output_log to disable the parsed
log is "no" but if the user enters "none" as is usual with the other
parameters, parsed_output_log is set to an empty string which is not
equivalent to "no". The creation of the parsed log would then proceed
and fail because the path is invalid.
Frédéric Marchal [Tue, 25 Jan 2011 21:07:55 +0000 (21:07 +0000)]
Fix a warning about the type of sizeof as expected by printf
The size returned by sizeof should fit easily in an int. No need to use
the more standard %zu that may not be portable to other less
compatible systems.
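The approach amounts to casting the size to int and printing it with %d, as in this small sketch. The function name describe_buffer and the message text are invented for the example.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Print a size with %d after an explicit cast to int, which is safe as
 * long as the value is known to be small, as buffer sizes are here.
 * Hypothetical helper: name and message are assumptions.
 */
static int describe_buffer(char *out, size_t out_size, size_t buffer_size)
{
    return snprintf(out, out_size, "buffer of %d bytes", (int)buffer_size);
}
```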
Frédéric Marchal [Fri, 21 Jan 2011 15:56:05 +0000 (15:56 +0000)]
Resolve IPv6 addresses when creating the datafile
There is an option to resolve the addresses of the visited web sites
into an IP address but the existing code was only capable of
converting host names to IPv4 addresses.
If getaddrinfo is available on the system, it is used to resolve the
host names.
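A minimal sketch of a resolution that works for both address families is shown here. The function name resolve_host is hypothetical and only the first address returned by getaddrinfo is kept, which is an assumption about the wanted behaviour.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Resolve host and write the first address found, IPv4 or IPv6, as text
 * into ip. Hypothetical helper, not sarg's code.
 * Returns 0 on success or a non-zero EAI_* code on failure.
 */
static int resolve_host(const char *host, char *ip, socklen_t ip_size)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 and IPv6 alike */
    hints.ai_socktype = SOCK_STREAM; /* avoid one entry per socket type */
    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0)
        return rc;
    /* NI_NUMERICHOST converts the binary address to text without a lookup. */
    rc = getnameinfo(res->ai_addr, res->ai_addrlen, ip, ip_size, NULL, 0, NI_NUMERICHOST);
    freeaddrinfo(res);
    return rc;
}
```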
Frédéric Marchal [Wed, 12 Jan 2011 15:05:10 +0000 (15:05 +0000)]
Fix the month numbers read from a sarg log
The month numbers parsed from a sarg log file name were in the range 1
to 12 but they must be in the range 0 to 11. This problem has been
fixed thanks to Denis Konchekov.
In addition, the numerical values of the days and months parsed from
the sarg log file name are more strictly validated.
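The conversion and the validation can be sketched as below, assuming the values end up in a struct tm, whose tm_mon field counts months from 0. The function name set_day_month is hypothetical and the day check is deliberately loose as it ignores the length of each month.

```c
#include <time.h>

/*
 * Store a day (1 to 31) and a month numbered from 1 to 12 into tm.
 * Hypothetical helper: the use of struct tm is an assumption.
 * Returns 0 on success, -1 if a value is out of range.
 */
static int set_day_month(struct tm *tm, int day, int month)
{
    if (day < 1 || day > 31 || month < 1 || month > 12)
        return -1;
    tm->tm_mday = day;
    tm->tm_mon = month - 1; /* struct tm expects 0 for January */
    return 0;
}
```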