public interface UriUniqFilter
For efficiency when comparing against a large history of seen URIs, URI objects may not be passed to the receiver immediately unless addNow() is used or a flush is forced (see requestFlush()).
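The deferred behaviour described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the Heritrix implementation: the class name `DeferredUniqSketch` is hypothetical, plain `String` values stand in for `CrawlURI`, a `List` stands in for the receiver, and the return value of `requestFlush()` (number of items processed) and the meaning of `count()` (number of keys remembered) are assumed for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Hypothetical, simplified model of the deferred-filtering contract.
// String stands in for CrawlURI; this is not Heritrix code.
class DeferredUniqSketch {
    private final Set<String> seen = new HashSet<>();
    private final Queue<String[]> pendingItems = new ArrayDeque<>();
    final List<String> received = new ArrayList<>(); // stands in for the receiver

    // add(): queue the item; the already-seen check is deferred until a flush.
    void add(String key, String value) {
        pendingItems.add(new String[] {key, value});
    }

    // addNow(): check the key and deliver immediately, bypassing the queue.
    void addNow(String key, String value) {
        if (seen.add(key)) {
            received.add(value);
        }
    }

    // pending(): items added but not yet filtered in or out.
    long pending() {
        return pendingItems.size();
    }

    // count(): assumed here to be the number of keys remembered as seen.
    long count() {
        return seen.size();
    }

    // requestFlush(): resolve every queued item.
    // Assumption: returns how many queued items were processed.
    long requestFlush() {
        long processed = 0;
        String[] item;
        while ((item = pendingItems.poll()) != null) {
            addNow(item[0], item[1]);
            processed++;
        }
        return processed;
    }
}

class DeferredUniqSketchDemo {
    public static void main(String[] args) {
        DeferredUniqSketch filter = new DeferredUniqSketch();
        // Two values that share one canonical key.
        filter.add("http://example.com/", "http://EXAMPLE.com/");
        filter.add("http://example.com/", "http://example.com:80/");
        System.out.println("pending before flush: " + filter.pending());
        filter.requestFlush();
        System.out.println("received after flush: " + filter.received);
    }
}
```

In this model nothing reaches the receiver until `requestFlush()` runs, at which point only the first value for each key is delivered and the duplicate is dropped.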
Modifier and Type | Interface and Description |
---|---|
static interface | UriUniqFilter.CrawlUriReceiver: URIs that pass the filter (are new / unique / not already-seen) are passed to this object, typically a frontier. |
Modifier and Type | Method and Description |
---|---|
void | add(String key, CrawlURI value): Add given uri, if not already present. |
void | addForce(String key, CrawlURI value): Add given uri, all the way through to underlying destination, even if already present. |
void | addNow(String key, CrawlURI value): Immediately add uri. |
void | close(): Close down any allocated resources. |
long | count() |
void | forget(String key, CrawlURI value): Forget item was seen. |
void | note(String key): Note item as seen, without passing through to receiver. |
long | pending(): Count of items added, but not yet filtered in or out. |
long | requestFlush(): Request that any pending items be added/dropped. |
void | setDestination(UriUniqFilter.CrawlUriReceiver receiver): Receiver of uniq URIs. |
void | setProfileLog(File logfile): Set a File to receive a log for replay profiling. |
long count()
long pending()
void setDestination(UriUniqFilter.CrawlUriReceiver receiver)
receiver - Object that will be passed items. Must implement the UriUniqFilter.CrawlUriReceiver interface.
void add(String key, CrawlURI value)
key - Usually a canonicalized version of value. This is the key used for lookups, forgets and insertions on the already-included list.
value - item to add.
void addNow(String key, CrawlURI value)
key - Usually a canonicalized version of uri. This is the key used for lookups, forgets and insertions on the already-included list.
value - item to add.
void addForce(String key, CrawlURI value)
key - Usually a canonicalized version of uri. This is the key used for lookups, forgets and insertions on the already-included list.
value - item to add.
void note(String key)
key - Usually a canonicalized version of a URI. This is the key used for lookups, forgets and insertions on the already-included list.
void forget(String key, CrawlURI value)
key - Usually a canonicalized version of a URI. This is the key used for lookups, forgets and insertions on the already-included list.
value - item to forget.
long requestFlush()
void close()
void setProfileLog(File logfile)
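The difference between add, addForce, note and forget, as described in the method summary, can be sketched with a plain seen-set. This is an illustrative model only: `SeenSetSketch` is a hypothetical name, `String` stands in for `CrawlURI`, a `List` stands in for the receiver, and filtering is done eagerly for brevity rather than deferred.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of the already-seen bookkeeping.
// String stands in for CrawlURI; this is not Heritrix code.
class SeenSetSketch {
    private final Set<String> seen = new HashSet<>();
    final List<String> received = new ArrayList<>(); // stands in for the receiver

    // add(): deliver the value only if its key has not been seen before.
    void add(String key, String value) {
        if (seen.add(key)) {
            received.add(value);
        }
    }

    // addForce(): record the key and deliver the value even if already seen.
    void addForce(String key, String value) {
        seen.add(key);
        received.add(value);
    }

    // note(): mark the key as seen without delivering anything.
    void note(String key) {
        seen.add(key);
    }

    // forget(): drop the key so that a later add() treats it as new.
    void forget(String key, String value) {
        seen.remove(key);
    }
}

class SeenSetSketchDemo {
    public static void main(String[] args) {
        SeenSetSketch filter = new SeenSetSketch();
        filter.note("a");            // seen, but never delivered
        filter.add("a", "A1");       // dropped: key already noted
        filter.addForce("a", "A2");  // delivered despite being seen
        filter.forget("a", "A2");    // key removed from the seen-set
        filter.add("a", "A3");       // delivered: key is new again
        System.out.println(filter.received);
    }
}
```

Keeping the key separate from the value is what lets differently written URIs that canonicalize to the same string be treated as one item.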
Copyright © 2003-2014 Internet Archive. All Rights Reserved.