Package | Description |
---|---|
org.archive.modules.canonicalize | |
Modifier and Type | Class and Description |
---|---|
class | FixupQueryString: Strip any trailing question mark. |
class | LowercaseRule: Lowercases the URL. |
class | RegexRule: General conversion rule. |
class | StripExtraSlashes: Strip any extra slashes, '/', found in the path. |
class | StripSessionCFIDs: Strip ColdFusion session IDs. |
class | StripSessionIDs: Strip known session IDs. |
class | StripUserinfoRule: Strip any 'userinfo' found on http/https URLs. |
class | StripWWWNRule: Strip any 'www[0-9]*' found on http/https URLs, if they have some path/query component (content after the third slash). |
class | StripWWWRule: Strip any 'www' found on http/https URLs, if they have some path/query component (content after the third slash). |
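
To make the rule descriptions above more concrete, the following is a minimal standalone sketch of what a few of them might do to a URL string. It does not use the Heritrix classes: `CanonicalizeSketch`, its method names, the regular expressions, and the order in which the steps are chained are all assumptions made for illustration, built only on `java.util.regex`. The actual rules may differ in their patterns and edge cases.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Illustrative stand-in only; these are not the Heritrix rule classes. */
final class CanonicalizeSketch {

    // Assumed pattern: "www." right after the scheme, stripped only when
    // some content follows the third slash (a path or query component).
    private static final Pattern WWW =
            Pattern.compile("(?i)^(https?://)www\\.([^/]*/.+)$");

    // Assumed pattern: "user:password@" between the scheme and the host.
    private static final Pattern USERINFO =
            Pattern.compile("(?i)^(https?://)[^/@]+@(.*)$");

    // Assumed pattern: two or more slashes that are not part of "://".
    private static final Pattern EXTRA_SLASHES = Pattern.compile("(?<!:)/{2,}");

    /** Roughly what FixupQueryString is described as doing. */
    static String stripTrailingQuestionMark(String url) {
        return url.endsWith("?") ? url.substring(0, url.length() - 1) : url;
    }

    /** Roughly what LowercaseRule is described as doing. */
    static String lowercase(String url) {
        return url.toLowerCase(Locale.ROOT);
    }

    /** Roughly what StripUserinfoRule is described as doing. */
    static String stripUserinfo(String url) {
        return USERINFO.matcher(url).replaceFirst("$1$2");
    }

    /** Roughly what StripWWWRule is described as doing. */
    static String stripWww(String url) {
        return WWW.matcher(url).replaceFirst("$1$2");
    }

    /** Roughly what StripExtraSlashes is described as doing (path only). */
    static String stripExtraSlashes(String url) {
        int q = url.indexOf('?');
        String path = q < 0 ? url : url.substring(0, q);
        String query = q < 0 ? "" : url.substring(q);
        return EXTRA_SLASHES.matcher(path).replaceAll("/") + query;
    }

    /** Applies the sketched steps in one assumed order. */
    static String canonicalize(String url) {
        String result = lowercase(url);
        result = stripUserinfo(result);
        result = stripWww(result);
        result = stripExtraSlashes(result);
        return stripTrailingQuestionMark(result);
    }
}
```

With these assumed patterns, `CanonicalizeSketch.canonicalize("HTTP://User:pw@WWW.Example.com//a///b?")` yields `http://example.com/a/b`, while `CanonicalizeSketch.stripWww("http://www.example.com/")` leaves its input unchanged because nothing follows the third slash.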
Copyright © 2003-2014 Internet Archive. All Rights Reserved.