public class DispositionProcessor extends Processor
Modifier and Type | Field and Description |
---|---|
protected CrawlMetadata | metadata - Auto-discovered module providing configured (or overridden) User-Agent value and RobotsHonoringPolicy |
protected ServerCache | serverCache |
Constructor and Description |
---|
DispositionProcessor() |
Modifier and Type | Method and Description |
---|---|
float | getDelayFactor() |
boolean | getForceRetire() |
int | getMaxDelayMs() |
int | getMaxPerHostBandwidthUsageKbSec() |
CrawlMetadata | getMetadata() |
int | getMinDelayMs() |
int | getRespectCrawlDelayUpToSeconds() |
ServerCache | getServerCache() |
protected void | innerProcess(CrawlURI puri) - Actually performs the process. |
protected long | politenessDelayFor(CrawlURI curi) - Update any scheduling structures with the new information in this CrawlURI. |
void | setDelayFactor(float factor) |
void | setForceRetire(boolean force) |
void | setMaxDelayMs(int maxDelay) |
void | setMaxPerHostBandwidthUsageKbSec(int max) |
void | setMetadata(CrawlMetadata provider) |
void | setMinDelayMs(int minDelay) |
void | setRespectCrawlDelayUpToSeconds(int respect) |
void | setServerCache(ServerCache serverCache) |
protected boolean | shouldProcess(CrawlURI puri) - Determines whether the given uri should be processed by this processor. |
Methods inherited from class Processor: doCheckpoint, finishCheckpoint, flattenVia, fromCheckpointJson, getBeanName, getEnabled, getKeyedProperties, getRecordedSize, getShouldProcessRule, getURICount, hasHttpAuthenticationCredential, innerProcessResult, innerRejectProcess, isRunning, isSuccess, process, report, setBeanName, setEnabled, setRecoveryCheckpoint, setShouldProcessRule, start, startCheckpoint, stop, toCheckpointJson
protected ServerCache serverCache
protected CrawlMetadata metadata
public ServerCache getServerCache()
public void setServerCache(ServerCache serverCache)
public float getDelayFactor()
public void setDelayFactor(float factor)
public int getMinDelayMs()
public void setMinDelayMs(int minDelay)
public int getRespectCrawlDelayUpToSeconds()
public void setRespectCrawlDelayUpToSeconds(int respect)
public int getMaxDelayMs()
public void setMaxDelayMs(int maxDelay)
public int getMaxPerHostBandwidthUsageKbSec()
public void setMaxPerHostBandwidthUsageKbSec(int max)
public boolean getForceRetire()
public void setForceRetire(boolean force)
public CrawlMetadata getMetadata()
public void setMetadata(CrawlMetadata provider)
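The accessor names above suggest how the politeness settings might combine into a single wait time. The following is a minimal sketch of one plausible combination, not the class's actual implementation: it assumes the delay is the last fetch duration multiplied by delayFactor, clamped to the range minDelayMs to maxDelayMs, and then raised to a robots.txt Crawl-delay that is itself capped at respectCrawlDelayUpToSeconds. The class PolitenessDelaySketch, the method politenessDelayMs, and the parameters fetchDurationMs and robotsCrawlDelaySeconds are hypothetical names introduced for illustration; the per-host bandwidth limit (maxPerHostBandwidthUsageKbSec) is left out, and the numbers used in main are example values only.

```java
// Hypothetical stand-alone sketch; it does not use or reproduce Heritrix classes.
class PolitenessDelaySketch {

    /**
     * Assumed combination of the politeness settings:
     * 1. scale the last fetch duration by delayFactor,
     * 2. clamp the result to [minDelayMs, maxDelayMs],
     * 3. if still below the respect cap, raise it to the (capped) Crawl-delay.
     */
    public static long politenessDelayMs(long fetchDurationMs,
                                         float delayFactor,
                                         int minDelayMs,
                                         int maxDelayMs,
                                         int respectCrawlDelayUpToSeconds,
                                         double robotsCrawlDelaySeconds) {
        long delay = (long) (delayFactor * fetchDurationMs);
        delay = Math.max(delay, minDelayMs);
        delay = Math.min(delay, maxDelayMs);

        // Crawl-delay is honored only up to this many milliseconds.
        long respectCapMs = respectCrawlDelayUpToSeconds * 1000L;
        if (delay < respectCapMs) {
            long crawlDelayMs = Math.min((long) (robotsCrawlDelaySeconds * 1000), respectCapMs);
            delay = Math.max(delay, crawlDelayMs);
        }
        return delay;
    }

    public static void main(String[] args) {
        // Fast fetch, no Crawl-delay: the minimum delay applies.
        System.out.println(politenessDelayMs(400, 5.0f, 3000, 30000, 300, 0.0));
        // Slow fetch: the maximum delay applies.
        System.out.println(politenessDelayMs(10000, 5.0f, 3000, 30000, 300, 0.0));
        // A 10-second Crawl-delay extends the wait.
        System.out.println(politenessDelayMs(400, 5.0f, 3000, 30000, 300, 10.0));
    }
}
```

Under these assumptions a Crawl-delay can push the wait past maxDelayMs, because the cap on it comes from respectCrawlDelayUpToSeconds instead. Whether the real class orders these steps the same way should be checked against its source.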
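protected boolean shouldProcess(CrawlURI puri)
Description copied from class: Processor
Determines whether the given uri should be processed by this processor.
shouldProcess in class Processor
Parameters: puri - the URI to test

protected void innerProcess(CrawlURI puri)
Description copied from class: Processor
Actually performs the process. Invoked only for a URI that passes the #ENABLED, the #DECIDE_RULES and the #shouldProcess(ProcessorURI) tests.
innerProcess in class Processor
Parameters: puri - the URI to process

protected long politenessDelayFor(CrawlURI curi)
Update any scheduling structures with the new information in this CrawlURI.
Parameters: curi - The CrawlURI

Copyright © 2003-2014 Internet Archive. All Rights Reserved.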