Package | Description |
---|---|
org.archive.crawler.framework | |
org.archive.crawler.frontier | |
org.archive.crawler.frontier.precedence | |
org.archive.crawler.prefetch | |
Class | Description |
---|---|
FrontierJournal | Helper class for managing a simple Frontier change-events journal, useful for recovering from crawl problems. |
Class | Description |
---|---|
AbstractFrontier | Shared facilities for Frontier implementations. |
BdbFrontier | A Frontier using several BerkeleyDB JE Databases to hold its record of known hosts (queues) and pending URIs. |
BdbMultipleWorkQueues | A BerkeleyDB-database-backed structure for holding ordered groupings of CrawlURIs. |
CostAssignmentPolicy | Calculates an integer 'cost' value for the given CrawlURI. |
FrontierJournal | Helper class for managing a simple Frontier change-events journal, useful for recovering from crawl problems. |
HostnameQueueAssignmentPolicy | QueueAssignmentPolicy based on the hostname:port evident in the given CrawlURI. |
QueueAssignmentPolicy | Establishes a mapping from CrawlURIs to String keys (queue names). |
SurtAuthorityQueueAssignmentPolicy | QueueAssignmentPolicy based on the SURT form of the hostname. |
UnitCostAssignmentPolicy | A CostAssignmentPolicy that uses a constant value of 1 for all CrawlURIs. |
URIAuthorityBasedQueueAssignmentPolicy | SurtAuthorityQueueAssignmentPolicy based on the SURT form of hostname. |
WorkQueue | A single queue of related URIs to visit, grouped by a classKey (typically "hostname:port" or similar). |
WorkQueueFrontier | A common Frontier base using several queues to hold pending URIs. |
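
The queue-assignment descriptions above (a mapping from each URI to a String key, typically "hostname:port") can be illustrated with a minimal, self-contained sketch. It does not use the Heritrix classes: `QueueKeySketch`, `KeyPolicy`, `hostPortKey`, and `group` are hypothetical names, plain `java.net.URI` stands in for CrawlURI, and the default-port fallback is an assumption made for the example.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class QueueKeySketch {

    /** Hypothetical stand-in for a queue assignment policy: URI -> queue name. */
    public interface KeyPolicy {
        String classKeyFor(URI uri);
    }

    /** Builds a "hostname:port" key; assumes 443 for https and 80 otherwise when no port is given. */
    public static String hostPortKey(URI uri) {
        String host = uri.getHost() == null
                ? "unknown"
                : uri.getHost().toLowerCase(Locale.ROOT);
        int port = uri.getPort();
        if (port == -1) {
            port = "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
        }
        return host + ":" + port;
    }

    /** Groups URIs into one queue per key, keeping arrival order within each queue. */
    public static Map<String, Deque<URI>> group(Iterable<URI> uris, KeyPolicy policy) {
        Map<String, Deque<URI>> queues = new LinkedHashMap<>();
        for (URI uri : uris) {
            queues.computeIfAbsent(policy.classKeyFor(uri), k -> new ArrayDeque<>()).add(uri);
        }
        return queues;
    }

    public static void main(String[] args) {
        Map<String, Deque<URI>> queues = group(
                Arrays.asList(
                        URI.create("http://example.org/a"),
                        URI.create("http://example.org/b"),
                        URI.create("https://example.net/")),
                QueueKeySketch::hostPortKey);
        // Print each queue name with the number of URIs waiting in it.
        queues.forEach((key, queue) -> System.out.println(key + " -> " + queue.size()));
    }
}
```

Keeping the key derivation behind a small interface mirrors the idea of swappable policies: a different key function (for example one based on a reordered form of the authority) could be passed to `group` without changing the grouping code.
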
Class | Description |
---|---|
WorkQueue | A single queue of related URIs to visit, grouped by a classKey (typically "hostname:port" or similar). |
Class | Description |
---|---|
CostAssignmentPolicy | Calculates an integer 'cost' value for the given CrawlURI. |
QueueAssignmentPolicy | Establishes a mapping from CrawlURIs to String keys (queue names). |
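
The idea of assigning an integer cost to each URI, with a unit policy that always returns 1, can be sketched as follows. This is an illustration under stated assumptions, not the Heritrix API: `CostSketch`, `CostPolicy`, `UNIT`, `QUERY_PENALTY`, and `totalCost` are hypothetical names, `java.net.URI` stands in for CrawlURI, and the query-string penalty is invented purely to show a non-constant policy.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;

public class CostSketch {

    /** Hypothetical stand-in for a cost policy: URI -> integer cost. */
    public interface CostPolicy {
        int costOf(URI uri);
    }

    /** Constant cost of 1 for every URI, as in a unit-cost policy. */
    public static final CostPolicy UNIT = uri -> 1;

    /** Illustrative only: charges one extra unit when the URI carries a query string. */
    public static final CostPolicy QUERY_PENALTY = uri -> uri.getQuery() == null ? 1 : 2;

    /** Sums the cost of a batch of URIs under the given policy. */
    public static int totalCost(List<URI> uris, CostPolicy policy) {
        int total = 0;
        for (URI uri : uris) {
            total += policy.costOf(uri);
        }
        return total;
    }

    public static void main(String[] args) {
        List<URI> uris = Arrays.asList(
                URI.create("http://example.org/a"),
                URI.create("http://example.org/search?q=x"));
        System.out.println("unit cost total: " + totalCost(uris, UNIT));
        System.out.println("query-penalty total: " + totalCost(uris, QUERY_PENALTY));
    }
}
```
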
Copyright © 2003-2014 Internet Archive. All Rights Reserved.