From: Kern Sibbald
Date: Tue, 31 Dec 2019 22:17:51 +0000 (+0100)
Subject: Tweak delete unused file
X-Git-Tag: Release-9.6.0~27
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=e868a312b6a0805314b8d1432832f85a42661ada;p=thirdparty%2Fbacula.git

Tweak delete unused file
---

diff --git a/bacula/README.Collector.txt b/bacula/README.Collector.txt
deleted file mode 100644
index 047fd718c..000000000
--- a/bacula/README.Collector.txt
+++ /dev/null
@@ -1,497 +0,0 @@
-
-This is a brief description of a new Statistics collector functionality.
-
-0. Requirements and Assumptions
-
->>>Eric Bollengier at 12/04/2018
->>>
-The main idea would be to generate statistics from our daemon (mainly the
-storage daemon, but FD and Dir can be interested too) and send this data to a
-Graphite daemon.
-
-Graphite is a bit like RRDTool, it is in charge to collect the data and render
-the information.
-
-https://graphiteapp.org/
-
-I think that the solution at the Bacula level should be able to use different
-"drivers" (csv file on disk with a configurable format for example, native
-graphite tcp connection, etc...).
-
-At the Bacula level, we probably need a new Resource in the configuration file
-to configure that (or only new directives I don't know). Once you know what you
-need, Kern will review the name of the Resources and/or Directives.
-
-Each job should be able to send data to the Bacula collector (ie from multiple
-threads). Ideally, a job should not be blocked because the Graphite socket is
-hanging for example.
-
-We need to define interesting metrics that can be reported to the statistic
-collector, few examples:
-- nb of jobs at a given time
-- device statistics (nb read, nb write, nb job...)
-- total network throughput
-- per job network throughput
-- disk throughput
-- current memory usage
-- system information (cpu, load average, swap usage)
-- number of files read for a job
-- (basically what the status command reports)
-
-It might be interesting to let the user choose the metrics they want to see, and
-have a directive such as the destination message.
-
-  Metrics = NbJob, NetworkThroughput, DiskThroughput, MemoryUsage
-
-(this is just an idea).
-
-We can start with few basic statistics and enrich the program later.
-
-1. Statistics Collector Architecture
-
-The most important requirement is to not block a job thread because external
-communication stalled or introduced unexpected latency which can negatively
-affect a running job. This requirement lead to the strict separation in
-Collector functionality and behavior.
-
-The collector was designed as a two separate entities:
-- an internal collector class: bstatcollect
-- an interface collector thread: COLLECTOR
-The first one functions as a metrics cache and the second one is responsible for
-sending collected metrics into an external data target (i.e. Graphite).
-
-1.1. Statistics Collection flow
-
-Any Bacula's metrics are collected in a push architecture where an object code
-is responsible for preparing/generating a metrics and "push" it to internal
-collector for later usage. This "push" operation should be extremely fast which
-allows to avoid unexpected latency in the job.
-
-To save a metrics in internal collector requires two step process: metric
-registration and metrics update. Metrics registration could take some time
-(relatively speaking) with O(n) and return a metrics index used in metrics update
-process which is very fast with O(1).
-
-You should register metrics at the very beginning of the job or daemon start when
-additional latency is not a problem at all. The update process using the metric
-index from registration will be very fast, so it should be no problem setting it
-even at the time critical part of the code, i.e.
-
-metric_index = statcollector->registration_int64("bacula.jobs.all",
-   METRIC_UNIT_JOB, value, "The number of all jobs");
-
-(...)
-
-while (true)
-
-(...)
-
-   statcollector->set_value_int64(metric_index, newvalue);
-
-(...)
-
-statcollector->unregistration(metric_index);
-
-The only latency introduced by a metrics update process is lock/unlock resolution
-used for internal collector synchronization access.
-
-The metrics should be unregistered when no needed any more. After unregistration
-process the metrics index becomes invalid and should not be used to address this
-metric. As the metrics index is a regular integer number it will be reused upon
-next registration of any new metric.
-
-You can get any or all metrics values when required. The return will be always
-the full copy of the metrics so you can process it as you wish.
-
-1.2. Statistics Collector backend thread
-
-The collector background thread (COLLECTOR resource) is responsible for getting
-the copy of the current metrics list and save them to configured destination.
-The save/send process could be a time consuming, i.e. could involve a network
-communication like for a Graphite collector. As this collector thread operates
-on a copy of metrics list it doesn't affect standard job operations. Collector
-thread saves metrics at regular intervals.
-  Note: the current implementation introduced a two backends: CSV file and
-  Graphite which are build in backends.
-
-1.3. Statistics Collector update thread
-
-We have a two metrics types to collect: an easy to count/update and a hard one
-to count/update. The easy to count/update metrics are all statistics which
-corresponds to already available (in memory) counter/variable, so we can update
-metrics every time the counter/variable is updated. We can achieve a perfect
-accuracy here without a problem. On the other hand the hard to count/update
-metrics are all metrics which depends on external data (i.e. almost all
-permanent metrics based on catalog data), metrics which are not directly
-controllable by Bacula (i.e. the size of the heap) or metrics where frequent
-update will bring a huge performance impact (i.e. sm_pool_memory size). For this
-kind of metrics we've develop a dedicated mechanism for updating these metrics.
-
-The main assumption here is that these metrics won't be updated until necessary,
-so as long as nobody would check what is the value we won't update it. We agreed
-that the real value of the metrics (i.e. a number of error jobs) could change a
-dozens of times in the mean time, but we want a perfect value at the time of
-sampling, so i.e. saving/sending to external backend.
-
-For this purpose we run a dedicated collector update thread which will start
-only when any of collector backend threads are started. So, if no collectors
-defined for the daemon, no update thread will be necessary. The collector update
-thread executes a dedicated function to every daemon as every daemon would has a
-different set of hard to count/update metrics. The collector update thread
-updates required metrics as frequent as the minimal interval parameter set at
-defined collector resources. So, for two collector resources which would have
-Interval=5min and Interval=30sec, the update thread should get a 30secs
-interval. Additionally the "collect" command, which display all available
-metrics at any time, is executing the same update function as for update thread
-to get up to date metrics.
-
-2. Architecture implementation and code
-
-2.1. The metrics class
-
-This is a basic building block for metrics collection:
-
-class bstatmetrics : public SMARTALLOC {
-public:
-   char *name; /* this is a metrics name */
-   metric_type_t type; /* this is a metrics type */
-   metric_unit_t unit; /* this is a metrics unit */
-   metric_value_t value; /* this is a metrics value */
-   char *description; /* this is a metrics description */
-   bstatmetric();
-   bstatmetric(char *mname, metric_type_t mtype, metric_unit_t munit,
-      char *descr);
-   bstatmetric(char *mname, metric_unit_t munit, bool mvalue, char *descr);
-   bstatmetric(char *mname, metric_unit_t munit, int64_t mvalue, char *descr);
-   bstatmetric(char *mname, metric_unit_t munit, float mvalue, char *descr);
-   ~bstatmetric();
-   bstatmetric& operator=(const bstatmetric& orig);
-   void render_metric_value(POOLMEM **buf, bool bstr=false);
-   void render_metric_value(POOL_MEM &buf, bool bstr=false);
-   const char *metric_type_str();
-   const char *metric_unit_str();
-   void dump();
-};
-
-You can have a three (technically four) types of metrics in bstatmetric:
-- METRIC_UNDEF - when bstatmetrics is uninitialized
-- METRIC_INT - the metrics stores an integer values (int64_t)
-- METRIC_BOOL - the metrics stores a boolean values (so, True/False)
-- METRIC_FLOAT - the metrics stores a float values
-
-You can define a metrics unit, which shows what entity a value represent, i.e.
-METRIC_UNIT_BYTE, METRIC_UNIT_BYTESEC, METRIC_UNIT_JOB, METRIC_UNIT_CLIENT, etc.
-When a value have no unit and it is a just a value/number you should use:
-METRIC_UNIT_EMPTY.
-
-2.2. The internal collector class
-
-The metrics objects collection is an internal collector class:
-
-class bstatcollect : public SMARTALLOC {
-public:
-   bstatcollect();
-   /* registration return a metrics index */
-   int registration(char *metric, metric_type_t type, metric_unit_t unit,
-      char *descr);
-   int registration_bool(char *metric, metric_unit_t unit, bool value,
-      char *descr);
-   int registration_int64(char *metric, metric_unit_t unit, int64_t value,
-      char *descr);
-   int registration_float(char *metric, metric_unit_t unit, float value,
-      char *descr);
-   /* unregistration */
-   void unregistration(int metric);
-   /* update/set the metrics value */
-   int set_value_bool(int metric, bool value);
-   int set_value_int64(int metric, int64_t value);
-   int add_value_int64(int metric, int64_t value);
-   int add2_value_int64(int metric1, int64_t value1, int metric2,
-      int64_t value2);
-   int sub_value_int64(int metric, int64_t value);
-   int set_value_float(int metric, float value);
-   int inc_value_int64(int metric);
-   int dec_value_int64(int metric);
-   int dec_inc_values_int64(int metricd, int metrici);
-   /* get data */
-   bool get_bool(int metric);
-   int64_t get_int(int metric);
-   float get_float(int metric);
-   alist *get_all();
-   bstatmetrics *get_metric(char *metric);
-   /* utility */
-   void dump();
-};
-
-You can register a metrics of particular type and unit with
-bstatcollect::registration() method. In this case the value will be set
-initially to zero. Using other registration_*() methods you can set other
-initial value.
-
-When you make a next metrics registration in the time when a particular metrics
-already exist in bstatcollect object you will always get the same metrics index.
-When you unregister a metrics and register it later again you could get a
-different metrics index.
-
-Any metrics could have a description string which could be useful to users. You
-can set metrics description at first registration only. Any subsequent metrics
-registration does not update it.
-
-2.2.1. Updating the metrics value
-
-Any metrics value update should be performed as an atomic operation, so the
-internal collector class has a number of a such methods:
-- set_value_*() - will set the metrics value into method argument "value", old
-  metrics value will be overwritten
-- inc_value_int64()/dec_value_int64() - will increment or decrement the metrics
-  value accordingly as a single atomic operation
-- dec_inc_values_int64() - will decrement a first metrics value and increment
-  a second metrics value as a single atomic operation, used to update a related
-  metrics in a single step
-- add_value_int64() - will add the numeric argument to the metrics value as a
-  single atomic operation
-- add2_value_int64() - will add the two numeric arguments to the two metrics
-  values as a single atomic operation, used to update more metrics in a single
-  step
-
-The inc_value_int64()/dec_value_int64()/add_value_int64()/etc. should be used
-when managing a "shared" metrics updated from a different or separate threads.
-
-2.3. Supported utilities
-
-There are a few supported utilities you can use when processing metrics or
-metrics list:
-- bstatmetric::render_metric_value() - render a metrics value as a string into
-  a buffer.
-- bstatmetric::metric_type_str() - return a metrics type as string.
-- bstatmetric::metric_unit_str() - return a metrics unit as string.
-- free_metric_alist() - releases memory for a list of metrics returned from
-  bstatcollect::get_all().
-
-3. Statistics resource configuration
-
-The Statistics resource defines the attributes of the Statistics collector threads
-running on any daemon. You can define any number of Collector resources and every
-single Statistics resource will spawn a single collector thread. This resource can be
-defined for any Daemon (Dir, SD and FD). Resource directives:
-
-Name = The collector name used by the system administrator. This
-directive is required.
-
-Description = The text field contains a description of the Collector that
-will be displayed in the graphical user interface. This directive is optional.
-
-Interval = This directive instruct Collector thread how long it
-should sleep between every collection iteration. This directive is optional and
-when not specified a value 300 seconds will be used instead.
-
-Type = The Type directive specifies the Collector backend, which may
-be one of the following: CSV or Graphite. This directive is required.
-
--> CSV is a simple file level backend which saves all required metrics with the
-following format to the file: