Interface CountingBloomFilter
- All Superinterfaces:
BitMapExtractor, BloomFilter<CountingBloomFilter>, CellExtractor, IndexExtractor
- All Known Implementing Classes:
ArrayCountingBloomFilter
A counting Bloom filter is expected to function identically to a standard Bloom filter that is the merge of all the Bloom filters that have been added to and not later subtracted from the counting Bloom filter. The functional state of a CountingBloomFilter at the start and end of a series of merge and subsequent remove operations of the same Bloom filters, irrespective of remove order, is expected to be the same.
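The round-trip property described above can be illustrated with a minimal sketch. RoundTripSketch is a hypothetical array-backed counter written for this example; it is not the library's implementation, and it assumes each call passes unique indices.

```java
/** Hypothetical array-backed counter used only to illustrate the round trip. */
final class RoundTripSketch {
    private final int[] cells;

    RoundTripSketch(int numberOfBits) {
        cells = new int[numberOfBits];
    }

    /** Increments the cell of every index by 1 (indices are assumed unique). */
    void merge(int... indices) {
        for (int index : indices) {
            cells[index]++;
        }
    }

    /** Decrements the cell of every index by 1 (indices are assumed unique). */
    void remove(int... indices) {
        for (int index : indices) {
            cells[index]--;
        }
    }

    /** Returns a copy of the cells so two states can be compared. */
    int[] snapshot() {
        return cells.clone();
    }
}
```

Merging two sets of indices and then removing them in either order leaves the cells exactly as they were before the merges, because each cell receives the same number of increments and decrements.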
Removal of a filter that has not previously been merged results in an invalid state where the cells no longer represent a sum of merged Bloom filters. It is impossible to validate merge and remove exactly without explicitly storing all filters. Consequently, such an operation may go undetected. The CountingBloomFilter maintains a state flag that is used as a warning that an operation was performed that resulted in invalid cells and thus an invalid state. For example, this may occur if a cell for an index was set to negative following a remove operation.
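The following sketch shows why such a removal may go undetected. UndetectedRemovalSketch and its methods are hypothetical helpers over a plain int array, written for this example only; they assume unique indices per call.

```java
/** Hypothetical helpers over a plain cell array; not the library's implementation. */
final class UndetectedRemovalSketch {

    /** Increments the cell of every index by 1. */
    static void merge(int[] cells, int... indices) {
        for (int index : indices) {
            cells[index]++;
        }
    }

    /** Decrements the cell of every index by 1; returns false if any cell went negative. */
    static boolean remove(int[] cells, int... indices) {
        boolean valid = true;
        for (int index : indices) {
            cells[index]--;
            if (cells[index] < 0) {
                valid = false;
            }
        }
        return valid;
    }

    /** Standard Bloom filter membership test: every index must have a non-zero cell. */
    static boolean contains(int[] cells, int... indices) {
        for (int index : indices) {
            if (cells[index] == 0) {
                return false;
            }
        }
        return true;
    }
}
```

If the index sets {1, 4} and {4, 7} are merged and the never-merged set {1, 7} is then removed, no cell becomes negative, so the removal is reported as valid even though the set {1, 4} is no longer found. Removing {2, 3} from the same cells drives two cells negative and is detected.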
Implementations should document the expected state of the filter after an operation that generates invalid cells, and any potential recovery options. An implementation may support a reversal of the operation to restore the state to that prior to the operation. If invalid cells are adjusted to a valid range, it should be documented whether information has been irreversibly lost.
Implementations may choose to throw an exception during an operation that generates invalid cells. Implementations should document the expected state of the filter after such an operation: for example, whether the cells are not updated, partially updated, or updated entirely before the exception is raised.
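As an illustration of one such policy, the sketch below checks every cell before changing any of them, so the cells are not updated when the exception is raised. AllOrNothingSketch, its method, and the choice of IllegalStateException are assumptions made for this example; the interface does not mandate them. The sketch assumes unique indices and non-negative counts.

```java
/** Hypothetical all-or-nothing subtraction over a plain cell array. */
final class AllOrNothingSketch {

    /**
     * Subtracts counts[k] from cells[indices[k]] for every k.
     * Throws before modifying any cell if a cell would become negative.
     */
    static void subtract(int[] cells, int[] indices, int[] counts) {
        // First pass: validate only, no cell is modified.
        for (int k = 0; k < indices.length; k++) {
            if (cells[indices[k]] < counts[k]) {
                throw new IllegalStateException("cell " + indices[k] + " would become negative");
            }
        }
        // Second pass: every update is now known to be safe.
        for (int k = 0; k < indices.length; k++) {
            cells[indices[k]] -= counts[k];
        }
    }
}
```

The two-pass design trades a second traversal for a simple guarantee. An implementation that updates while it checks would instead have to document a partially updated state.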
- Since:
- 4.5.0-M1
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.commons.collections4.bloomfilter.CellExtractor
CellExtractor.CellPredicate
Field Summary
Fields inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
SPARSE
Method Summary
Modifier and Type | Method | Description
boolean | add(CellExtractor other) | Adds the specified CellExtractor to this Bloom filter.
int | getMaxCell() | Gets the maximum allowable value for a cell count in this Counting filter.
default int | getMaxInsert(BitMapExtractor bitMapExtractor) | Determines the maximum number of times the BitMapExtractor could have been merged into this counting filter.
default int | getMaxInsert(BloomFilter<?> bloomFilter) | Determines the maximum number of times the Bloom filter could have been merged into this counting filter.
int | getMaxInsert(CellExtractor cellExtractor) | Determines the maximum number of times the CellExtractor could have been added.
default int | getMaxInsert(Hasher hasher) | Determines the maximum number of times the Hasher could have been merged into this counting filter.
default int | getMaxInsert(IndexExtractor indexExtractor) | Determines the maximum number of times the IndexExtractor could have been merged into this counting filter.
boolean | isValid() | Returns true if the internal state is valid.
default boolean | merge(BitMapExtractor bitMapExtractor) | Merges the specified BitMap extractor into this Bloom filter.
default boolean | merge(BloomFilter<?> other) | Merges the specified Bloom filter into this Bloom filter.
default boolean | merge(Hasher hasher) | Merges the specified Hasher into this Bloom filter.
default boolean | merge(IndexExtractor indexExtractor) | Merges the specified index extractor into this Bloom filter.
default boolean | remove(BitMapExtractor bitMapExtractor) | Removes the specified BitMapExtractor from this Bloom filter.
default boolean | remove(BloomFilter<?> other) | Removes the specified Bloom filter from this Bloom filter.
default boolean | remove(Hasher hasher) | Removes the unique values from the specified hasher from this Bloom filter.
default boolean | remove(IndexExtractor indexExtractor) | Removes the values from the specified IndexExtractor from this Bloom filter.
boolean | subtract(CellExtractor other) | Subtracts the specified CellExtractor from this Bloom filter.
default IndexExtractor | uniqueIndices() | The default implementation is a no-op since the counting Bloom filter returns a unique IndexExtractor by default.
Methods inherited from interface org.apache.commons.collections4.bloomfilter.BitMapExtractor
asBitMapArray, processBitMapPairs, processBitMaps
Methods inherited from interface org.apache.commons.collections4.bloomfilter.BloomFilter
cardinality, characteristics, clear, contains, contains, contains, contains, copy, estimateIntersection, estimateN, estimateUnion, getShape, isEmpty, isFull
Methods inherited from interface org.apache.commons.collections4.bloomfilter.CellExtractor
processCells, processIndices
Methods inherited from interface org.apache.commons.collections4.bloomfilter.IndexExtractor
asIndexArray
-
Method Details
-
add
boolean add(CellExtractor other)
Adds the specified CellExtractor to this Bloom filter.
Specifically, all cells for the indexes identified by other will be incremented by their corresponding values in other.
This method will return true if the filter is valid after the operation.
- Parameters:
- other - the CellExtractor to add.
- Returns:
- true if the addition was successful and the state is valid
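A minimal sketch of the increment-by-value behaviour follows. AddCellsSketch is hypothetical: it represents the cells to add as parallel arrays of indices and counts rather than as a CellExtractor, takes the maximum cell value as a parameter, and assumes unique indices with non-negative counts.

```java
/** Hypothetical cell addition over a plain array; not the library's implementation. */
final class AddCellsSketch {

    /**
     * Adds counts[k] to cells[indices[k]] for every k.
     * Returns false if any cell exceeded maxCell, mirroring the "valid after the operation" result.
     */
    static boolean add(int[] cells, int[] indices, int[] counts, int maxCell) {
        boolean valid = true;
        for (int k = 0; k < indices.length; k++) {
            // Use long arithmetic so an overflow of the int cell can be observed.
            long updated = (long) cells[indices[k]] + counts[k];
            if (updated > maxCell) {
                valid = false;
            }
            // The cell is still updated; on int overflow the stored value wraps to negative.
            cells[indices[k]] = (int) updated;
        }
        return valid;
    }
}
```

This sketch updates the cells even when the result is invalid. Whether an implementation does the same, or leaves the cells untouched, is the kind of policy the interface asks implementations to document.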
-
getMaxCell
int getMaxCell()
Gets the maximum allowable value for a cell count in this Counting filter.
- Returns:
- the maximum allowable value for a cell count in this Counting filter.
-
getMaxInsert
default int getMaxInsert(BitMapExtractor bitMapExtractor)
Determines the maximum number of times the BitMapExtractor could have been merged into this counting filter.
- Parameters:
- bitMapExtractor - the BitMapExtractor to provide the indices.
- Returns:
- the maximum number of times the BitMapExtractor could have been inserted.
-
getMaxInsert
default int getMaxInsert(BloomFilter<?> bloomFilter)
Determines the maximum number of times the Bloom filter could have been merged into this counting filter.
- Parameters:
- bloomFilter - the Bloom filter to check for.
- Returns:
- the maximum number of times the Bloom filter could have been inserted.
-
getMaxInsert
int getMaxInsert(CellExtractor cellExtractor)
Determines the maximum number of times the CellExtractor could have been added.
- Parameters:
- cellExtractor - the extractor of cells.
- Returns:
- the maximum number of times the CellExtractor could have been inserted.
-
getMaxInsert
default int getMaxInsert(Hasher hasher)
Determines the maximum number of times the Hasher could have been merged into this counting filter.
- Parameters:
- hasher - the Hasher to provide the indices.
- Returns:
- the maximum number of times the hasher could have been inserted.
-
getMaxInsert
default int getMaxInsert(IndexExtractor indexExtractor)
Determines the maximum number of times the IndexExtractor could have been merged into this counting filter.
To determine how many times an indexExtractor could have been added, create a CellExtractor from the indexExtractor and check that instead.
- Parameters:
- indexExtractor - the extractor to drive the count check.
- Returns:
- the maximum number of times the IndexExtractor could have been inserted.
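The difference between the merged count and the added count can be sketched as follows. MaxInsertSketch is hypothetical and works on plain arrays. It takes the smallest cell over the given indices for the merged case, and the smallest quotient of cell by count for the added case. That is one plausible reading of the description above, not a statement of how any implementation computes it. Rejecting empty input is a choice made for this sketch.

```java
/** Hypothetical upper-bound computation over a plain cell array. */
final class MaxInsertSketch {

    /** Upper bound on merges: the smallest cell among the (unique) indices. */
    static int maxInsert(int[] cells, int[] indices) {
        if (indices.length == 0) {
            throw new IllegalArgumentException("at least one index is required");
        }
        int max = Integer.MAX_VALUE;
        for (int index : indices) {
            max = Math.min(max, cells[index]);
        }
        return max;
    }

    /** Upper bound on additions: the smallest cells[index] / count over all (index, count) pairs. */
    static int maxInsert(int[] cells, int[] indices, int[] counts) {
        if (indices.length == 0) {
            throw new IllegalArgumentException("at least one index is required");
        }
        int max = Integer.MAX_VALUE;
        for (int k = 0; k < indices.length; k++) {
            // counts[k] is assumed to be positive.
            max = Math.min(max, cells[indices[k]] / counts[k]);
        }
        return max;
    }
}
```

For cells {0, 3, 0, 2, 0, 4}, the indices {1, 3, 5} could have been merged at most 2 times, while the same indices with counts {1, 2, 2} could have been added at most once.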
-
isValid
boolean isValid()
Returns true if the internal state is valid.
This flag is a warning that an addition or subtraction of cells from this filter resulted in an invalid cell for one or more indexes. For example, this may occur if a cell for an index was set to negative following a subtraction operation, or overflows the value specified by getMaxCell() following an addition operation.
A counting Bloom filter that has an invalid state is no longer ensured to function identically to a standard Bloom filter instance that is the merge of all the Bloom filters that have been added to and not later subtracted from this counting Bloom filter.
Note: The change to an invalid state may or may not be reversible. Implementations are expected to document their policy on recovery from an addition or removal operation that generated an invalid state.
- Returns:
- true if the state is valid
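One possible way to maintain such a warning flag is sketched below. StateFlagSketch is hypothetical and assumes int cells with a maximum cell value of Integer.MAX_VALUE and non-negative operands. It ORs every updated cell value into a state field, so the sign bit of the state is set as soon as any cell underflows or overflows, and it stays set afterwards.

```java
/** Hypothetical sticky validity flag; not necessarily how any implementation tracks state. */
final class StateFlagSketch {
    private final int[] cells;
    /** Accumulates the bits of every updated cell; negative once any cell has been negative. */
    private int state;

    StateFlagSketch(int numberOfBits) {
        cells = new int[numberOfBits];
    }

    /** Adds a non-negative amount to one cell; int overflow wraps to a negative value. */
    void add(int index, int addend) {
        int updated = cells[index] + addend;
        state |= updated;
        cells[index] = updated;
    }

    /** Subtracts a non-negative amount from one cell; underflow yields a negative value. */
    void subtract(int index, int subtrahend) {
        int updated = cells[index] - subtrahend;
        state |= updated;
        cells[index] = updated;
    }

    int cell(int index) {
        return cells[index];
    }

    /** Valid as long as no update has ever produced a negative cell. */
    boolean isValid() {
        return state >= 0;
    }
}
```

In this sketch the flag is not reversible: restoring a negative cell to zero with a later addition leaves the filter flagged as invalid, which matches the note that a change to an invalid state may not be reversible.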
-
merge
default boolean merge(BitMapExtractor bitMapExtractor)
Merges the specified BitMap extractor into this Bloom filter.
Specifically: all cells for the indexes identified by the bitMapExtractor will be incremented by 1.
This method will return true if the filter is valid after the operation.
- Specified by:
- merge in interface BloomFilter<CountingBloomFilter>
- Parameters:
- bitMapExtractor - the BitMapExtractor
- Returns:
- true if the merge was successful and the state is valid
-
merge
default boolean merge(BloomFilter<?> other)
Merges the specified Bloom filter into this Bloom filter.
Specifically: all cells for the indexes identified by the other filter will be incremented by 1.
Note: If the other filter is a counting Bloom filter, the other filter's cells are ignored and it is treated as an IndexExtractor.
This method will return true if the filter is valid after the operation.
- Specified by:
- merge in interface BloomFilter<CountingBloomFilter>
- Parameters:
- other - the other Bloom filter
- Returns:
- true if the merge was successful and the state is valid
-
merge
default boolean merge(Hasher hasher)
Merges the specified Hasher into this Bloom filter.
Specifically: all cells for the unique indexes identified by the hasher will be incremented by 1.
This method will return true if the filter is valid after the operation.
- Specified by:
- merge in interface BloomFilter<CountingBloomFilter>
- Parameters:
- hasher - the hasher
- Returns:
- true if the merge was successful and the state is valid
-
merge
default boolean merge(IndexExtractor indexExtractor)
Merges the specified index extractor into this Bloom filter.
Specifically: all cells for the unique indices identified by the indexExtractor will be incremented by 1.
This method will return true if the filter is valid after the operation.
Notes:
- If indices that are returned multiple times should be incremented multiple times, convert the IndexExtractor to a CellExtractor and add that.
- Implementations should throw IllegalArgumentException and no other exception on bad input.
- Specified by:
- merge in interface BloomFilter<CountingBloomFilter>
- Parameters:
- indexExtractor - the IndexExtractor
- Returns:
- true if the merge was successful and the state is valid
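The distinction drawn in the first note can be sketched with plain arrays. UniqueVersusCountedSketch is hypothetical: mergeUnique mimics a merge that increments each distinct index once, while addCounted first collapses repeated indices into (index, count) pairs, which is the role a CellExtractor plays, and then adds the counts.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical comparison of merge-style and add-style updates on a plain cell array. */
final class UniqueVersusCountedSketch {

    /** Merge-style update: every distinct index is incremented exactly once. */
    static void mergeUnique(int[] cells, int[] indices) {
        BitSet seen = new BitSet(cells.length);
        for (int index : indices) {
            if (!seen.get(index)) {
                seen.set(index);
                cells[index]++;
            }
        }
    }

    /** Add-style update: repeated indices are counted, then each cell grows by its count. */
    static void addCounted(int[] cells, int[] indices) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int index : indices) {
            counts.merge(index, 1, Integer::sum);
        }
        counts.forEach((index, count) -> cells[index] += count);
    }
}
```

Given the indices {3, 3, 6}, mergeUnique raises cells 3 and 6 by 1 each, while addCounted raises cell 3 by 2 and cell 6 by 1.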
-
remove
default boolean remove(BitMapExtractor bitMapExtractor)
Removes the specified BitMapExtractor from this Bloom filter.
Specifically, all cells for the indices produced by the bitMapExtractor will be decremented by 1.
This method will return true if the filter is valid after the operation.
- Parameters:
- bitMapExtractor - the BitMapExtractor to provide the indexes
- Returns:
- true if the removal was successful and the state is valid
-
remove
default boolean remove(BloomFilter<?> other)
Removes the specified Bloom filter from this Bloom filter.
Specifically: all cells for the indexes identified by the other filter will be decremented by 1.
Note: If the other filter is a counting Bloom filter, the other filter's cells are ignored and it is treated as an IndexExtractor.
This method will return true if the filter is valid after the operation.
- Parameters:
- other - the other Bloom filter
- Returns:
- true if the removal was successful and the state is valid
-
remove
default boolean remove(Hasher hasher)
Removes the unique values from the specified hasher from this Bloom filter.
Specifically, all cells for the unique indices produced by the hasher will be decremented by 1.
This method will return true if the filter is valid after the operation.
- Parameters:
- hasher - the hasher to provide the indexes
- Returns:
- true if the removal was successful and the state is valid
-
remove
default boolean remove(IndexExtractor indexExtractor)
Removes the values from the specified IndexExtractor from this Bloom filter.
Specifically, all cells for the unique indices produced by the indexExtractor will be decremented by 1.
This method will return true if the filter is valid after the operation.
Note: If indices that are returned multiple times should be decremented multiple times, convert the IndexExtractor to a CellExtractor and subtract that.
- Parameters:
- indexExtractor - the IndexExtractor to provide the indexes
- Returns:
- true if the removal was successful and the state is valid
-
subtract
boolean subtract(CellExtractor other)
Subtracts the specified CellExtractor from this Bloom filter.
Specifically, all cells for the indexes identified by other will be decremented by their corresponding values in other.
This method will return true if the filter is valid after the operation.
- Parameters:
- other - the CellExtractor to subtract.
- Returns:
- true if the subtraction was successful and the state is valid
-
uniqueIndices
default IndexExtractor uniqueIndices()
The default implementation is a no-op since the counting Bloom filter returns a unique IndexExtractor by default.
- Specified by:
- uniqueIndices in interface BloomFilter<CountingBloomFilter>
- Specified by:
- uniqueIndices in interface CellExtractor
- Specified by:
- uniqueIndices in interface IndexExtractor
- Returns:
- this counting Bloom filter.
-