At the top of the user agent report, the period covered by the report was
wrong when the entries were coming from an extended log or when several
useragent.log files were provided with the useragent option in sarg.conf.
Thanks to Evgeniy Yakushev for reporting this bug.
Create a user agent report if the input log provides the information
An extended log can contain the user agent identification string. Sarg can
now use it to generate the user agent report.
The new user_agent type is added to report_type to generate a user agent
report. For compatibility with previous versions, that report type is
automatically selected if at least one file name is provided with the
useragent option.
Initialize LDAP connection after generating the useragent log
The useragent log doesn't need the connection to the LDAP server. Let's
exclude this task from the connection timeout and have more time to
resolve the user names.
Thanks to Evgeniy Yakushev for spotting that error.
The usertab option may use LDAP but the connection to the server was opened
before parsing the access and useragent logs. It was possible that the
connection would time out before the first request was sent.
An attempt to fix this problem is to open the connection to the LDAP server
after parsing the log files.
The list of users to include in the report is processed after every aliasing mechanism
It is more logical to filter the users to include in the report after they
have been aliased to the final name. It makes it possible to identify the
final user irrespective of the name she happens to have when connecting to
the proxy.
Frederic Marchal [Fri, 21 Aug 2015 17:59:26 +0000 (19:59 +0200)]
Limit the useragent log period to the same period as the access log
If no explicit date range was provided on the command line, the useragent
report used to contain the content of the whole useragent.log file even if
it covered a wider period than the access log.
Now, the useragent report only covers the same period as the access log
provided they both overlap. If both logs cover distinct periods, the
useragent log is not produced.
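
The rule above amounts to intersecting two date ranges. The following is a
minimal sketch, not sarg's actual code: the struct, the function name and the
YYYYMMDD integer representation of dates are assumptions made for the
illustration.

```c
#include <stdbool.h>

/* Dates are stored as YYYYMMDD integers (a hypothetical representation
   chosen for this sketch). Both bounds are inclusive. */
struct period {
    int start;
    int end;
};

/* Clamp the useragent period to the access log period.
   Returns false when the two periods are distinct, in which case the
   caller should not produce the useragent report at all. */
static bool clamp_period(const struct period *access,
                         const struct period *useragent,
                         struct period *result)
{
    int start = useragent->start > access->start ? useragent->start
                                                 : access->start;
    int end = useragent->end < access->end ? useragent->end : access->end;

    if (start > end)
        return false; /* the logs don't overlap */
    result->start = start;
    result->end = end;
    return true;
}
```

With an access log covering 20150801 to 20150815 and a useragent log covering
20150701 to 20150831, the result is 20150801 to 20150815.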
When a host name is replaced by an alias, the host name is prefixed with a
* to write it with no link to a special function such as a link to block
that url.
There are quite a lot of options to drive the user's ID or IP address
processing, replace the ID, include or exclude the entry from the report
and so on.
That processing is only applied in full to the access.log. The redirector
and dansguardian logs are much more primitive in comparison.
By moving the processing to one function, I intend to use it everywhere.
Fix an error introduced when stripping the user domain
Several reports such as the download and authentication reports would
fail with an error about unknown user "-".
It has been fixed by choosing a new method to strip the domain.
The change partially reverts some changes made in commit
36a0b94cbcaa8a9899fcc878639945e8787d0fec that were responsible for the
proper user name not being propagated to the intermediary report files.
Thanks to Yakushev Evgeniy for reporting this bug.
Read an extended log even if cs-uri is split over several columns
Sarg used to require that the visited URL be stored in column cs-uri of the
extended log format. But the URL can be split over the cs-uri-scheme,
cs-host, cs-uri-port, cs-uri-path and cs-uri-query columns.
Sarg detects the columns and re-creates the full URL if cs-uri is not
found.
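
Rebuilding the URL from the split columns could look like the sketch below.
It is an illustration only: the function names are hypothetical, and treating
a NULL, empty or "-" field as absent, as well as adding the "?" only when the
query doesn't already start with one, are assumptions of this sketch.

```c
#include <stdio.h>
#include <string.h>

/* A NULL, empty or "-" field is considered absent (assumption). */
static int is_absent(const char *s)
{
    return s == NULL || *s == '\0' || strcmp(s, "-") == 0;
}

/* Append prefix followed by text to out. Returns -1 if it doesn't fit. */
static int append(char *out, size_t size, size_t *used,
                  const char *prefix, const char *text)
{
    int n = snprintf(out + *used, size - *used, "%s%s", prefix, text);

    if (n < 0 || (size_t)n >= size - *used)
        return -1;
    *used += (size_t)n;
    return 0;
}

/* Rebuild the URL from the cs-uri-scheme, cs-host, cs-uri-port,
   cs-uri-path and cs-uri-query columns.
   Returns 0 on success, -1 if the host is missing or out is too small. */
static int rebuild_url(char *out, size_t size, const char *scheme,
                       const char *host, const char *port,
                       const char *path, const char *query)
{
    size_t used = 0;

    if (size == 0 || is_absent(host))
        return -1;
    out[0] = '\0';
    if (!is_absent(scheme) && append(out, size, &used, scheme, "://") < 0)
        return -1;
    if (append(out, size, &used, "", host) < 0)
        return -1;
    if (!is_absent(port) && append(out, size, &used, ":", port) < 0)
        return -1;
    if (!is_absent(path) && append(out, size, &used, "", path) < 0)
        return -1;
    if (!is_absent(query) &&
        append(out, size, &used, query[0] == '?' ? "" : "?", query) < 0)
        return -1;
    return 0;
}
```

For instance, the columns "http", "example.com", "8080", "/index.html" and
"a=1" produce http://example.com:8080/index.html?a=1.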
Sarg.conf can include other configuration files with the "include"
directive. It can be used to store common options in one file and create
shorter sarg.conf files dedicated to reporting tasks.
Make it possible to delete an old temporary directory
Sarg prepares the report in the temporary directory. The directory must
be empty before starting the report generation. Deleting an old stray
temporary directory must be made with care to make sure we don't delete
a wrong directory.
The old temporary directory check would not take the email report generated
files into account.
Don't delete anything from the temporary directory if unsure
Sarg must delete any previous stray temporary directory before filling it
with the current log data. But sarg must not just delete any directory and
its content if it isn't ours. Therefore, sarg checks the directory content
to make sure that it only contains files that may belong to us.
There was an error in that routine. It would recursively delete directories
found in the candidate temporary directory before making sure the whole
directory content can be safely deleted.
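
The idea of the fix is to validate everything first and delete afterwards. A
minimal sketch of such a two-pass approach is shown below. It is not sarg's
routine: it only handles a flat directory, the function names are
hypothetical, and the rule in is_ours() (names ending in ".tmp" or ".log")
is an assumption made for the example.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical ownership rule: only *.tmp and *.log files are ours. */
static int is_ours(const char *name)
{
    size_t len = strlen(name);

    return len > 4 && (strcmp(name + len - 4, ".tmp") == 0 ||
                       strcmp(name + len - 4, ".log") == 0);
}

static int is_dot_entry(const char *name)
{
    return strcmp(name, ".") == 0 || strcmp(name, "..") == 0;
}

/* First pass: return 1 if every entry may belong to us, 0 if a foreign
   entry is found, -1 if the directory can't be read. Nothing is deleted. */
static int dir_is_safe(const char *dir)
{
    DIR *d = opendir(dir);
    struct dirent *e;
    int safe = 1;

    if (d == NULL)
        return -1;
    while (safe && (e = readdir(d)) != NULL) {
        if (!is_dot_entry(e->d_name) && !is_ours(e->d_name))
            safe = 0;
    }
    closedir(d);
    return safe;
}

/* Second pass: delete the files, but only once the whole directory
   content has been validated. Returns 0 on success, -1 otherwise. */
static int empty_dir_if_safe(const char *dir)
{
    char path[4096];
    DIR *d;
    struct dirent *e;
    int status = 0;

    if (dir_is_safe(dir) != 1)
        return -1;
    d = opendir(dir);
    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL) {
        int n;

        if (is_dot_entry(e->d_name))
            continue;
        n = snprintf(path, sizeof(path), "%s/%s", dir, e->d_name);
        if (n < 0 || (size_t)n >= sizeof(path) || unlink(path) != 0)
            status = -1;
    }
    closedir(d);
    return status;
}
```

If the directory contains a single foreign file, empty_dir_if_safe() returns
-1 and leaves every file in place, including the ones that look like ours.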
Frederic Marchal [Mon, 22 Jun 2015 20:47:10 +0000 (22:47 +0200)]
Update the configure script and its dependencies
The configure script was rebuilt while trying to find out how to get rid
of gettext 0.19 and use gettext 0.18 instead.
The po/Makefile.in file on my computer now requires gettext 0.18 as
expected but I don't know what I did to achieve that result. I still don't
know what tool is responsible for generating Makefile.in from
Makefile.in.in.
Frederic Marchal [Thu, 11 Jun 2015 16:25:41 +0000 (18:25 +0200)]
Display an error when trying to read a Z file
Z compressed files are not supported any longer because I want to use
libraries instead of relying on external processes to parse compressed
files. I haven't found a library or piece of code to read those files.
It is my understanding that the Z format is not used much these days. If
you think otherwise, please open a bug ticket.
Fail if no file names can be found when file globbing is on
If file globbing is enabled and no files can be found to match the pattern,
sarg must fail as it does when file globbing is disabled (that was how
sarg worked before file globbing was programmed).
It is meant as a safety measure in case there is a problem with the generated log
files.
Don't display the period covered by the logs if it is empty
Sarg outputs a line with the earliest and latest dates found in the logs
but it displays 00/01/00 if no log was found. It is best not to display
anything in that case.
Frederic Marchal [Mon, 25 May 2015 18:14:05 +0000 (20:14 +0200)]
Read the useragent log even if no report date is specified
The useragent log was ignored if the report date was not defined on the
command line. The only way to generate the useragent log was to use
command line option -d.
xgettext needs to know what argument of debuga is a c-format string to add
the proper flag in the po file. The printf-like argument changed place when
the source file and line number were added to debuga.
Function debugaz was not declared as a function taking a printf-like
argument.
Display the source of the message displayed on stderr
Messages that have been merged in previous commits cannot be distinguished
from each other. It makes debugging more difficult.
To solve this problem, when sarg is run with -zzz, the source file name
and the line number are displayed in the message prefix. It makes it easy to
look at the corresponding source code and understand why the message was
printed.
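
A common way to get the source location into a message is to wrap the
function in a macro passing __FILE__ and __LINE__. The sketch below is an
illustration under assumptions: format_message(), MESSAGE() and show_source
are hypothetical names, not sarg's debuga, and the message is written to a
buffer instead of stderr. Note how the two extra arguments shift the position
of the printf-like format string, which the format attribute must reflect.

```c
#include <stdarg.h>
#include <stdio.h>

#if defined(__GNUC__)
#define PRINTF_LIKE(fmt, first) __attribute__((format(printf, fmt, first)))
#else
#define PRINTF_LIKE(fmt, first)
#endif

/* Hypothetical switch standing for "sarg was run with -zzz". */
static int show_source = 0;

/* The format string is argument 5 and its values start at argument 6. */
static int format_message(char *out, size_t size, const char *file,
                          int line, const char *fmt, ...) PRINTF_LIKE(5, 6);

/* Capture the location of the caller automatically. */
#define MESSAGE(out, size, ...) \
    format_message(out, size, __FILE__, __LINE__, __VA_ARGS__)

/* Write the message to out, prefixed with "file:line: " if requested.
   Returns the message length, or -1 if it doesn't fit in out. */
static int format_message(char *out, size_t size, const char *file,
                          int line, const char *fmt, ...)
{
    va_list ap;
    int used = 0;
    int n;

    if (show_source) {
        used = snprintf(out, size, "%s:%d: ", file, line);
        if (used < 0 || (size_t)used >= size)
            return -1;
    }
    va_start(ap, fmt);
    n = vsnprintf(out + used, size - (size_t)used, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= size - (size_t)used)
        return -1;
    return used + n;
}
```

Calling MESSAGE(buf, sizeof(buf), "value %d", 42) stores "value 42" in buf
when show_source is 0, and the same text prefixed with the file name and line
number of the call when show_source is 1.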
There were far too many redundant messages to be translated just to report
errors in the data read from a file. The messages have been merged to
reduce the diversity in vocabulary.
To reduce the number of messages to translate, the index reported in the
message is always an int. There was no reason to use something else in the
first place.
Translated messages that differ only in the file name printed at the
end of the message are now split in two parts. The file name is taken out
of the translated message to leave only the common part to be translated.