Package | Description |
---|---|
org.archive.crawler.postprocessor | |
org.archive.modules | The beginnings of a refactored settings framework. |
org.archive.modules.extractor | |
Modifier and Type | Method and Description |
---|---|
protected int | LinksScoper.getSchedulingFor(CrawlURI curi, Link wref, int preferenceDepthHops) Deprecated. Determine scheduling for the curi. |
Modifier and Type | Field and Description |
---|---|
protected Collection<Link> | CrawlURI.outLinks All discovered outbound Links (navlinks, embeds, etc.). May contain Link instances, CrawlURI instances, or both. |
Modifier and Type | Method and Description |
---|---|
Collection<Link> | CrawlURI.getOutLinks() Returns discovered links. |
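
As an illustration of how a caller might consume the collection returned by CrawlURI.getOutLinks(), the following is a minimal sketch. It does not use the real Heritrix classes: the nested Link and CrawlURI types are hypothetical stand-ins, and the destination field and the destinations helper are assumptions made only so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class OutLinksSketch {

    /** Hypothetical stand-in for org.archive.modules.extractor.Link. */
    static final class Link {
        // Assumed field: the target of the link as a plain string.
        final String destination;

        Link(String destination) {
            this.destination = destination;
        }
    }

    /** Hypothetical stand-in for CrawlURI; models only getOutLinks(). */
    static final class CrawlURI {
        private final Collection<Link> outLinks = new ArrayList<>();

        Collection<Link> getOutLinks() {
            return outLinks;
        }
    }

    /** Collects the destination of every discovered link, in iteration order. */
    static List<String> destinations(CrawlURI curi) {
        List<String> result = new ArrayList<>();
        for (Link link : curi.getOutLinks()) {
            result.add(link.destination);
        }
        return result;
    }

    public static void main(String[] args) {
        CrawlURI curi = new CrawlURI();
        curi.getOutLinks().add(new Link("http://example.com/a"));
        curi.getOutLinks().add(new Link("http://example.com/b"));
        System.out.println(destinations(curi));
    }
}
```

With the real classes, the same loop shape would apply, but the accessor used to read each link's target should be taken from the Link class documentation rather than from this sketch.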
Modifier and Type | Method and Description |
---|---|
CrawlURI | CrawlURI.createCrawlURI(UURI baseUURI, Link link) Utility method for creating CandidateURIs found while extracting links from this CrawlURI. |
CrawlURI | CrawlURI.createCrawlURI(UURI baseUURI, Link link, int scheduling, boolean seed) Utility method for creating CandidateURIs found while extracting links from this CrawlURI. |
Modifier and Type | Field and Description |
---|---|
Link | StringExtractorTestBase.TestData.expectedResult |
Modifier and Type | Method and Description |
---|---|
int | Link.compareTo(Link o) |
protected void | ExtractorURI.extractLink(CrawlURI curi, Link wref) Consider a single Link for internal URIs. |
Constructor and Description |
---|
StringExtractorTestBase.TestData(CrawlURI uri, Link expectedResult) |
Copyright © 2003-2014 Internet Archive. All Rights Reserved.