#include "ssp_items.h"
#include <apr_pools.h>
Go to the source code of this file.
Data Structures | |
| struct | ssp_bcount_s |
| Rarely used counter state. More... | |
Typedefs | |
| typedef ssp_bcount_s | ssp_bcount_t |
| The type ssp_bcount_t is the cornerstone of this API. | |
Functions | |
| ssp_bcount_t * | ssp_bcount_create (apr_pool_t *pool, ssp_hash_function_t *h1, ssp_hash_function_t *h2, apr_uint16_t filter_scale, apr_uint16_t filter_scatter) |
| Create a filter for detecting hot symbols. | |
| void | ssp_bcount_tick (ssp_bcount_t *h) |
| Advance the current. | |
| apr_uint32_t | ssp_bcount_inc (ssp_bcount_t *h, void *symbol) |
| Count the occurance of a symbol. | |
| apr_uint32_t | ssp_bcount_lookup (ssp_bcount_t *h, void *symbol) |
| Fetch the estimated count for this symbol. | |
This API defines an opaque type ssp_bcount_t which is then used to observe a stream of symbols. It count, aproximately, the number of times each symbol occurs in the stream during the current window.
The counts are a lower bound on the number of times the symbol appears. A typical use for this abstraction is to filter a huge universe of events to find those events that are happening a lot. For example, you could use it to discover the list of high volume visitors to your web site. You could use it to find the popular phrases in a corpus of text. You could use it to find files that should be moved to a cache.
The user must provide only two operations on symbols, two hashs. Symbols are represented as void * on the interface.
That algorithum used here is a direct decendent of the one described in <>. See also section 8.1 in <> for a overview.
|
|
The type ssp_bcount_t is the cornerstone of this API. It is the structure that accumlates the counts and manages the current window. It is allocated in an apr_pool provided by the user and survives until that pool is cleared. The functions in this API are all operations upon it. |
|
||||||||||||||||||||||||
|
Create a filter for detecting hot symbols. Instantiate a fresh ssp_bcount_t in the pool given. The user provides two hash functions that will be used on the symbols counted. Two parameters filter_scale and filter_scatter configure how accurate our estimated counts can be. The counters are a varient of bloom filters. Unlike a traditional bloom filter, which uses a bit map to manage a set, we allocate a set of counters instead of a set of bits. The scale sets the size of that set. Scale is a small integer defining the number of bits used to index the set. So if you pass in 10, you get a counter set of 2^10th. The scatter is how many counters are incremented when ever a symbol is noted. It is directly analgous to the number of bits set or each member of the set represented in a bloom filter. Scatter must less than 32. |
|
||||||||||||
|
Count the occurance of a symbol. Calls on ssp_bcount_insert notify the counters the passed symbol has just occured. I.e. it occured during the current window. |
|
||||||||||||
|
Fetch the estimated count for this symbol. Does not increment the estimated count. |
|
|
Advance the current. Counts are maintained for the current window only. Given a sequence of calls on this routine c7, c8, c9, c10 the current interval runs from moment of c9 thru the current moment. For example if this is called each hour at the top of the hour and it is currently 10:35 the current interval is 9:00..10:35. You can clear all counters by making two calls on this. |
1.4.4