Main Page | Data Structures | Directories | File List | Data Fields | Globals

ssp_bcount.h File Reference

Silver Spoon's Bloom Counters, a C include file. More...

#include "ssp_items.h"
#include <apr_pools.h>

Go to the source code of this file.

Data Structures

struct  ssp_bcount_s
 Rarely used counter state. More...

Typedefs

typedef ssp_bcount_s ssp_bcount_t
 The type ssp_bcount_t is the cornerstone of this API.

Functions

ssp_bcount_tssp_bcount_create (apr_pool_t *pool, ssp_hash_function_t *h1, ssp_hash_function_t *h2, apr_uint16_t filter_scale, apr_uint16_t filter_scatter)
 Create a filter for detecting hot symbols.
void ssp_bcount_tick (ssp_bcount_t *h)
 Advance the current.
apr_uint32_t ssp_bcount_inc (ssp_bcount_t *h, void *symbol)
 Count the occurance of a symbol.
apr_uint32_t ssp_bcount_lookup (ssp_bcount_t *h, void *symbol)
 Fetch the estimated count for this symbol.


Detailed Description

Silver Spoon's Bloom Counters, a C include file.

This API defines an opaque type ssp_bcount_t which is then used to observe a stream of symbols. It count, aproximately, the number of times each symbol occurs in the stream during the current window.

The counts are a lower bound on the number of times the symbol appears. A typical use for this abstraction is to filter a huge universe of events to find those events that are happening a lot. For example, you could use it to discover the list of high volume visitors to your web site. You could use it to find the popular phrases in a corpus of text. You could use it to find files that should be moved to a cache.

The user must provide only two operations on symbols, two hashs. Symbols are represented as void * on the interface.

That algorithum used here is a direct decendent of the one described in <>. See also section 8.1 in <> for a overview.


Typedef Documentation

typedef struct ssp_bcount_s ssp_bcount_t
 

The type ssp_bcount_t is the cornerstone of this API.

It is the structure that accumlates the counts and manages the current window. It is allocated in an apr_pool provided by the user and survives until that pool is cleared. The functions in this API are all operations upon it.


Function Documentation

ssp_bcount_t* ssp_bcount_create apr_pool_t *  pool,
ssp_hash_function_t h1,
ssp_hash_function_t h2,
apr_uint16_t  filter_scale,
apr_uint16_t  filter_scatter
 

Create a filter for detecting hot symbols.

Instantiate a fresh ssp_bcount_t in the pool given. The user provides two hash functions that will be used on the symbols counted. Two parameters filter_scale and filter_scatter configure how accurate our estimated counts can be.

The counters are a varient of bloom filters. Unlike a traditional bloom filter, which uses a bit map to manage a set, we allocate a set of counters instead of a set of bits. The scale sets the size of that set. Scale is a small integer defining the number of bits used to index the set. So if you pass in 10, you get a counter set of 2^10th.

The scatter is how many counters are incremented when ever a symbol is noted. It is directly analgous to the number of bits set or each member of the set represented in a bloom filter. Scatter must less than 32.

apr_uint32_t ssp_bcount_inc ssp_bcount_t h,
void *  symbol
 

Count the occurance of a symbol.

Calls on ssp_bcount_insert notify the counters the passed symbol has just occured. I.e. it occured during the current window.

apr_uint32_t ssp_bcount_lookup ssp_bcount_t h,
void *  symbol
 

Fetch the estimated count for this symbol.

Does not increment the estimated count.

void ssp_bcount_tick ssp_bcount_t h  ) 
 

Advance the current.

Counts are maintained for the current window only. Given a sequence of calls on this routine c7, c8, c9, c10 the current interval runs from moment of c9 thru the current moment. For example if this is called each hour at the top of the hour and it is currently 10:35 the current interval is 9:00..10:35. You can clear all counters by making two calls on this.


Generated on Thu Jul 28 22:20:52 2005 for HTP - a library for realtime detection of hotspots by  doxygen 1.4.4