Class | Description |
---|---|
BdbUriUniqFilter | A BDB implementation of an AlreadySeen list. |
BenchmarkUriUniqFilters | BenchmarkUriUniqFilters |
BloomUriUniqFilter | An implementation of an AlreadySeen list based on the MG4J BloomFilter. |
CheckpointUtils | Utilities useful for checkpointing. |
CrawledBytesHistotable | |
DiskFPMergeUriUniqFilter | Crude FPMergeUriUniqFilter using a disk data file of raw longs as the overall FP record. |
FPMergeUriUniqFilter | UriUniqFilter based on merging FP arrays (in memory or from disk). |
FPUriUniqFilter | UriUniqFilter storing 64-bit UURI fingerprints, using an internal LongFPSet instance. |
LogReader | This class contains a variety of methods for reading log files (or other text files containing repeated lines with similar information). |
LogUtils | Logging utils. |
MemFPMergeUriUniqFilter | Crude all-in-memory FP-merging UriUniqFilter. |
MemUriUniqFilter | A purely in-memory UriUniqFilter based on a HashSet, which remembers every full URI string it sees. |
NoopUriUniqFilter | A UriUniqFilter that doesn't actually provide any uniqueness filter on presented items: all are passed through. |
RecoveryLogMapper | |
SetBasedUriUniqFilter | UriUniqFilter based on an underlying UriSet (essentially a Set). |
TopNSet | Counting Set which only remembers the 'top N' of all String values reported (with counts) to it. |
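
The descriptions above contrast three ways of keeping an "already seen" list: remembering every full URI string in a HashSet, remembering only a 64-bit fingerprint per URI, and passing everything through. The following is a minimal, self-contained sketch of that contrast and does not use the Heritrix API. The names `AlreadySeenSketch`, `SeenFilter`, `StringSetFilter`, `FingerprintFilter`, `PassThroughFilter` and `addIfNew` are hypothetical, and the FNV-1a hash is only an assumed stand-in for whatever fingerprint function the real classes use.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

class AlreadySeenSketch {

    /** Hypothetical interface: addIfNew returns true if the URI should be passed on. */
    interface SeenFilter {
        boolean addIfNew(String uri);
        long size();
    }

    /** Remembers every full URI string (the HashSet approach). */
    static class StringSetFilter implements SeenFilter {
        private final Set<String> seen = new HashSet<>();
        public boolean addIfNew(String uri) { return seen.add(uri); }
        public long size() { return seen.size(); }
    }

    /** Remembers only a 64-bit fingerprint of each URI, trading exactness for memory. */
    static class FingerprintFilter implements SeenFilter {
        private final Set<Long> seen = new HashSet<>();
        public boolean addIfNew(String uri) { return seen.add(fingerprint(uri)); }
        public long size() { return seen.size(); }

        // Assumption: FNV-1a (64-bit) used purely as an illustrative fingerprint.
        static long fingerprint(String s) {
            long h = 0xcbf29ce484222325L;
            for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
                h ^= (b & 0xff);
                h *= 0x100000001b3L;
            }
            return h;
        }
    }

    /** Applies no uniqueness check: every presented URI is passed through. */
    static class PassThroughFilter implements SeenFilter {
        private long presented;
        public boolean addIfNew(String uri) { presented++; return true; }
        public long size() { return presented; }
    }

    public static void main(String[] args) {
        SeenFilter[] filters = {
            new StringSetFilter(), new FingerprintFilter(), new PassThroughFilter()
        };
        String[] uris = { "http://example.com/", "http://example.com/a", "http://example.com/" };
        for (SeenFilter f : filters) {
            for (String u : uris) {
                System.out.println(f.getClass().getSimpleName() + " " + u + " new=" + f.addIfNew(u));
            }
        }
    }
}
```

The fingerprint variant stores one `long` per URI instead of the whole string, at the cost of a small chance that two distinct URIs share a fingerprint and the second is wrongly treated as seen.
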
Enum | Description |
---|---|
Logs | Enumerates existing Heritrix logs. |
Exception | Description |
---|---|
SeedUrlNotFoundException | |

Copyright © 2003-2014 Internet Archive. All Rights Reserved.