public class HashCrawlMapper extends CrawlMapper
Modifier and Type | Field and Description |
---|---|
protected long | crawlerCount: Number of crawlers among which to split up the URIs. |
protected Frontier | frontier |
Fields inherited from class CrawlMapper: cache, checkOutlinks, checkUri, diversionDir, diversionLogs, localName, logGeneration, outlinkRule, rotationDigits
Constructor and Description |
---|
HashCrawlMapper(): Constructor. |
Modifier and Type | Method and Description |
---|---|
long | getCrawlerCount() |
Frontier | getFrontier() |
String | getReducePrefixRegex() |
protected String | getReduceRegex(CrawlURI cauri) |
boolean | getUsePublicSuffixesRegex() |
protected String | map(CrawlURI cauri): Look up the crawler node name to which the given CrawlURI should be mapped. |
static String | mapString(String key, String reducePattern, long bucketCount) |
void | setCrawlerCount(long count) |
void | setFrontier(Frontier frontier) |
void | setReducePrefixRegex(String regex) |
void | setUsePublicSuffixesRegex(boolean usePublicSuffixes) |
Methods inherited from class CrawlMapper: decideToMapOutlink, divertLog, getCheckOutlinks, getCheckUri, getDiversionDir, getDiversionLog, getLocalName, getOutlinkRule, getRotationDigits, innerProcess, innerProcessResult, isRunning, setCheckOutlinks, setCheckUri, setDiversionDir, setLocalName, setOutlinkRule, setRotationDigits, shouldProcess, start, stop, updateGeneration
Methods inherited from class Processor: doCheckpoint, finishCheckpoint, flattenVia, fromCheckpointJson, getBeanName, getEnabled, getKeyedProperties, getRecordedSize, getShouldProcessRule, getURICount, hasHttpAuthenticationCredential, innerRejectProcess, isSuccess, process, report, setBeanName, setEnabled, setRecoveryCheckpoint, setShouldProcessRule, startCheckpoint, toCheckpointJson
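The summary above suggests that a key is optionally reduced by a regex and then hashed into one of `crawlerCount` buckets. The following is a minimal, self-contained sketch of that general idea only; it is not the implementation behind `mapString`. The class `HashMappingSketch`, the methods `reduce` and `bucketFor`, the use of CRC32 as the hash, the example pattern, and the comma-separated key format are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.CRC32;

// Hypothetical illustration; names and hash choice are NOT taken from HashCrawlMapper.
final class HashMappingSketch {

    /** Returns the first match of reducePattern in key, or key unchanged if there is no pattern or no match. */
    static String reduce(String key, String reducePattern) {
        if (reducePattern == null || reducePattern.isEmpty()) {
            return key;
        }
        Matcher m = Pattern.compile(reducePattern).matcher(key);
        return m.find() ? m.group() : key;
    }

    /** Hashes the reduced key and folds it into the range [0, bucketCount). */
    static long bucketFor(String key, String reducePattern, long bucketCount) {
        if (bucketCount <= 0) {
            throw new IllegalArgumentException("bucketCount must be positive");
        }
        CRC32 crc = new CRC32(); // assumption: any stable hash would serve here
        crc.update(reduce(key, reducePattern).getBytes(StandardCharsets.UTF_8));
        return crc.getValue() % bucketCount; // CRC32 values are non-negative
    }

    public static void main(String[] args) {
        // Assumed example pattern: keep only the first two comma-terminated segments.
        String pattern = "^[^,]+,[^,]+,";
        long crawlerCount = 4;
        for (String key : new String[] {"com,example,www,", "com,example,images,", "org,archive,"}) {
            System.out.println(key + " -> bucket " + bucketFor(key, pattern, crawlerCount));
        }
    }
}
```

In this sketch, keys that reduce to the same prefix always land in the same bucket, which is the property a reduce pattern would be used for: related hosts stay together on one crawler.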
protected Frontier frontier
protected long crawlerCount
public Frontier getFrontier()
public void setFrontier(Frontier frontier)
public long getCrawlerCount()
public void setCrawlerCount(long count)
public boolean getUsePublicSuffixesRegex()
public void setUsePublicSuffixesRegex(boolean usePublicSuffixes)
public String getReducePrefixRegex()
public void setReducePrefixRegex(String regex)
protected String map(CrawlURI cauri)
Specified by: map in class CrawlMapper
Parameters: cauri - CrawlURI to consider

Copyright © 2003-2014 Internet Archive. All Rights Reserved.