Frederic Marchal [Fri, 25 Jan 2013 20:58:31 +0000 (21:58 +0100)]
Ugly patch to link on Windows
Library ws2_32 is required when compiling on Windows but the autoconf macro
to test for a library doesn't work due to the special declaration of the
functions in that library.
Frederic Marchal [Thu, 24 Jan 2013 21:17:23 +0000 (22:17 +0100)]
Write the access times in one html page
Web site access times were written in a file named after the user's name
and the visited site URL. There was one file per site; the files were small
and numerous. In fact, they could be so numerous that they could fill the
disk inode table.
Moreover, with a file name made of the user's name and the visited site
URL, the manufactured file name could be more than 256 characters resulting
in an OS error on Windows and Linux.
To fix these two problems, the access times are all grouped inside one web
page with anchors to directly scroll to the relevant site. The html file
name is fixed (tt.html) and doesn't depend on the user's name or web
site URL.
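The grouping described above can be sketched as follows. This is only an
illustration in Python, not sarg's C code: the function names, the anchor
naming scheme and the HTML layout are all assumptions.

```python
import html
import re


def anchor_for(site, used):
    """Build a unique anchor id from a site name (hypothetical scheme)."""
    base = re.sub(r"[^A-Za-z0-9]+", "-", site).strip("-") or "site"
    candidate = base
    n = 1
    # Two sites may reduce to the same id, so add a counter on collision.
    while candidate in used:
        n += 1
        candidate = f"{base}-{n}"
    used.add(candidate)
    return candidate


def render_access_times(times_by_site):
    """Return one HTML page holding every site, each under its own anchor."""
    used = set()
    parts = ["<html><body>"]
    for site, times in times_by_site.items():
        anchor = anchor_for(site, used)
        parts.append(f'<h2 id="{anchor}">{html.escape(site)}</h2>')
        parts.append("<ul>")
        parts.extend(f"<li>{html.escape(t)}</li>" for t in times)
        parts.append("</ul>")
    parts.append("</body></html>")
    return "\n".join(parts)
```

A link such as tt.html#www-example-com would then scroll to the relevant
site, and the file name no longer grows with the user's name or the URL.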
Frederic Marchal [Sat, 12 Jan 2013 09:20:46 +0000 (10:20 +0100)]
Packaging script accepts a tag with a translation suffix
When a translation is submitted after the version has been tagged, a new
tag is added with the language as a suffix. It must be accepted by the
packaging script.
Due to an error, the file name created by mangling the user ID was not
using the suffix that was supposed to make it unique. Therefore, the file
names of users whose names were made of several consecutive
non-alphanumeric characters were not unique.
The error manifested itself as an error message saying that the
_.user_unsort file could not be sorted. It occurred after the first
user assigned to the _ file name had been sorted and its file deleted.
Therefore any subsequent user sharing the _ file name would fail.
This patch also makes sure that no original user name can end up being the
same as a mangled name.
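One way to obtain both properties is sketched below in Python. It is not
sarg's actual mangling: the class name, the use of _ as replacement
character and the numeric suffix are assumptions made for the illustration.

```python
import re


class UserFileNames:
    """Map user names to unique file names (hypothetical scheme)."""

    def __init__(self):
        self._by_user = {}
        self._counter = 0

    def file_name(self, user):
        if user in self._by_user:
            return self._by_user[user]
        if user.isascii() and user.isalnum():
            # Safe names are kept as they are; they never contain "_".
            name = user
        else:
            # Mangled names always end with "_<counter>", so two users
            # reduced to the same stem still get different names, and no
            # unmangled name can collide with a mangled one.
            self._counter += 1
            stem = re.sub(r"[^A-Za-z0-9]+", "_", user)
            name = f"{stem}_{self._counter}"
        self._by_user[user] = name
        return name
```

With this scheme, the users "--" and "!!" no longer share the _ file name.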
Use the string buffer object to store the strings corresponding to the
users. It takes much less memory because each string takes only the amount
of memory it needs, instead of a fixed size buffer as big as the biggest
expected string.
Frederic Marchal [Wed, 26 Dec 2012 16:23:54 +0000 (17:23 +0100)]
Explain why some input log lines are ignored
When verbose mode is enabled, a listing explaining how many lines have
been excluded is displayed. Every reason to ignore a line is listed. It
should make it easier to figure out why the report is not generated.
Frédéric Marchal [Sun, 16 Dec 2012 17:03:51 +0000 (18:03 +0100)]
Make a translator friendly message out of pieced together words
The "generated by" message written at the page bottom was made out of
words assembled together during the page generation. It was not possible
to translate that message.
Be more thorough when ensuring a file is correctly written
The return status of every written file is checked when the file is closed.
Any incorrectly written file should be detected early and a proper message
should be reported.
Overwrite any existing dansguardian temporary file
When sarg parses a dansguardian log, the temporary file that stores the
parsed values is overwritten if it exists, instead of appending the entries
to any leftover file from a previous run.
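The difference comes down to the mode used to open the file. A minimal
Python sketch follows; the function name and the one-entry-per-line layout
are assumptions, not sarg's format.

```python
def write_parsed_entries(path, entries):
    """Write the entries to path, discarding any previous content."""
    # Mode "w" truncates an existing file; mode "a" would instead keep the
    # entries left over from a previous run and append after them.
    with open(path, "w", encoding="utf-8") as out:
        for entry in entries:
            out.write(entry + "\n")
```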
Frédéric Marchal [Fri, 31 Aug 2012 19:46:41 +0000 (21:46 +0200)]
Remove a message about the redirector log that can't be deleted
If no redirector log was provided, a message was displayed in debug mode to
inform the user about a temporary file that can't be deleted. That message
was unnecessary and misleading. It is now displayed only when appropriate.
Frédéric Marchal [Fri, 31 Aug 2012 16:58:54 +0000 (18:58 +0200)]
Add a safety to prevent the deletion of files that haven't been created
There was a path in the source code where sarg could try to delete the
temporary unsorted files of the denied and authfail reports without
checking that the file names were not empty.
The functions where the guard was added are not supposed to be called if
no reports are to be generated, but that check relies on the caller. If the
caller fails and calls the function to generate the reports, it will try
to delete a file whose name is empty.
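The guard amounts to checking the name before attempting the deletion. A
Python sketch, with a hypothetical function name, could look like this:

```python
import os


def remove_temp_file(path):
    """Delete a temporary file; return True only if a file was removed."""
    if not path:
        # An empty name means the file was never created: nothing to do.
        return False
    try:
        os.remove(path)
    except FileNotFoundError:
        return False
    return True
```

The function is safe to call even when the caller did not check whether a
report was to be generated.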
Frédéric Marchal [Sun, 26 Aug 2012 17:40:10 +0000 (19:40 +0200)]
Don't keep the % in the URL when converted into a file name
When a file name is manufactured from a URL, the percent sign is removed to
prevent the web server or the browser from requesting a file with a % in
it.
The server or the browser would decode the percent sign if the two
subsequent bytes happened to be a valid hexadecimal byte and would request
the wrong file.
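As an illustration, a conversion that drops the percent sign could look
like the Python sketch below. Only the removal of % comes from this commit;
the function name and the replacement of other unsafe characters by _ are
assumptions.

```python
def url_to_file_name(url):
    """Turn a URL into a file name that contains no percent sign."""
    out = []
    for c in url:
        if c == "%":
            # Dropped: "%2F" in a file name would be decoded to "/" by the
            # server or the browser, which would then request another file.
            continue
        if (c.isascii() and c.isalnum()) or c in "-.":
            out.append(c)
        else:
            out.append("_")
    return "".join(out)
```

For instance, url_to_file_name("example.com/a%2Fb") gives
"example.com_a2Fb".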
Frédéric Marchal [Sun, 26 Aug 2012 16:47:25 +0000 (18:47 +0200)]
Keep reading the log files even with a small number of errors
Sarg tolerates a few errors in the input log files. The number of errors
can be configured. Sarg can stop on some consecutive errors and on the
total number of errors.
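The two limits can be tracked with a pair of counters, as in the Python
sketch below. The class and parameter names are hypothetical and are not
sarg's configuration option names.

```python
class ErrorBudget:
    """Track invalid log lines against two limits (hypothetical names)."""

    def __init__(self, max_consecutive, max_total):
        self.max_consecutive = max_consecutive
        self.max_total = max_total
        self.consecutive = 0
        self.total = 0

    def record(self, line_ok):
        """Register one line; return True while reading may continue."""
        if line_ok:
            # A valid line ends the current run of consecutive errors.
            self.consecutive = 0
            return True
        self.consecutive += 1
        self.total += 1
        return (self.consecutive < self.max_consecutive
                and self.total < self.max_total)
```

In this sketch, reading stops as soon as either counter reaches its limit.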
Frédéric Marchal [Sun, 26 Aug 2012 15:37:23 +0000 (17:37 +0200)]
Allow an empty data size in a common log
Any column of the common log format may be - to denote missing or
non-applicable data. In particular, Apache writes a - when the URL is a
redirection. That case is taken into account with this patch.
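A Python sketch of the size handling is shown here. Counting a missing
size as zero bytes is an assumption of the example, as is the function
name.

```python
def parse_size(field):
    """Convert the size column of a common log line to a byte count."""
    if field == "-":
        # "-" means no size was logged; count it as zero bytes (assumed).
        return 0
    return int(field)
```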
Frédéric Marchal [Sun, 26 Aug 2012 15:25:49 +0000 (17:25 +0200)]
Accept common log files without extension column
The sample common log file used to test the program contained additional
columns. The standard common log format doesn't have those columns. Sarg
failed to parse the standard format due to the lack of any supernumerary
column.
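Making the trailing columns optional can be sketched with a regular
expression, as in this Python example. The pattern, the function name and
the field names are assumptions and do not reflect how sarg parses the
line.

```python
import re

# host ident user [date] "request" status size, then optional extra columns
COMMON_LOG = re.compile(
    r'(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\S+) (\S+)(?: (.*))?$'
)


def parse_common_log(line):
    """Return the fields of a common log line, or None if it is invalid."""
    match = COMMON_LOG.match(line.rstrip("\n"))
    if match is None:
        return None
    host, ident, user, date, request, status, size, extra = match.groups()
    return {
        "host": host,
        "ident": ident,
        "user": user,
        "date": date,
        "request": request,
        "status": status,
        "size": size,
        # Empty when the line follows the standard format.
        "extra": extra or "",
    }
```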
Frédéric Marchal [Sun, 26 Aug 2012 13:46:51 +0000 (15:46 +0200)]
Decode extended log formats
Microsoft ISA produces such a log. This change is supposed to handle more
general cases than the previous routine.
The current code successfully decodes the one-line log I have to test
the code. The decoding procedure may not be compatible with *any*
compliant extended log implementation. Sample logs are necessary to improve
the code.
Frédéric Marchal [Sun, 26 Aug 2012 09:18:52 +0000 (11:18 +0200)]
Store the entry time in a structure instead of a pointer
Instead of requiring that the module keeps track of the entry time on
behalf of the main loop, the entry time is stored in the entry structure.
Therefore, there is no need to keep a static variable inside the module
and pass its pointer to the caller.
Frédéric Marchal [Tue, 21 Aug 2012 19:02:33 +0000 (21:02 +0200)]
Don't use strcmp to check strings one or zero characters long
As a side effect, the date format is stored in a single character instead
of a string and df is now the only variable used globally to set the
date format.
Frédéric Marchal [Fri, 10 Aug 2012 18:10:31 +0000 (20:10 +0200)]
Don't show the input log reading percentage
The show_read_percent option shows the percentage of the input log file
that has been read, independently of show_read_statistics. It allows for a
progress indicator without having to read the input log file twice.
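One way to get a percentage in a single pass is to compare the number of
bytes consumed with the file size. The Python sketch below illustrates the
idea; it is an assumption about the approach, not a description of sarg's
code, and the function name is hypothetical.

```python
import os


def read_with_progress(path):
    """Yield (line, percent) pairs while reading the file only once."""
    total = os.path.getsize(path)
    done = 0
    with open(path, "rb") as log:
        for line in log:
            done += len(line)
            # The file size is known up front, so no first pass is needed
            # to count the lines.
            percent = 100 * done // total if total else 100
            yield line, percent
```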
Sort the top sites report by number of users connecting the sites
The top sites report can be sorted according to the number of users
connecting to the visited sites. It shows how popular sites are within your
network.
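The ordering can be sketched in Python as follows. The input layout (a list
of user and site pairs), the function name and the alphabetical tie-break
are assumptions.

```python
def top_sites_by_users(accesses):
    """Rank sites by the number of distinct users that visited them."""
    users_by_site = {}
    for user, site in accesses:
        # A set counts each user once, however many requests were made.
        users_by_site.setdefault(site, set()).add(user)
    counts = [(site, len(users)) for site, users in users_by_site.items()]
    # Most popular first; ties are broken alphabetically by site name.
    return sorted(counts, key=lambda item: (-item[1], item[0]))
```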