public final class FPGenerator extends Object
This class provides methods that construct fingerprints of strings of bytes via operations in GF[2^d] for 0 < d <= 64. GF[2^d] is represented as the set of polynomials of degree d with coefficients in Z(2), modulo an irreducible polynomial P of degree d. The representation of polynomials is as an unsigned binary number in which the least significant exponent is kept in the most significant bit.
Let S be a string of bytes and g(S) the string obtained by
taking the byte 0x01
followed by eight 0x00
bytes followed by S
. Let f(S) be the polynomial
associated to the string S viewed as a polynomial with
coefficients in the field Z(2). The fingerprint of S is simply
the value f(g(S)) modulo P. Because polynomials are
represented with the least significant coefficient in the most
significant bit, fingerprints of degree d are stored in the
d
most significant bits of a long word.
Fingerprints can be used as a probably unique id for the input string. More precisely, if P is chosen at random among irreducible polynomials of degree d, then the probability that any two strings A and B have the same fingerprint is less than max(|A|,|B|)/2^(d+1) where |A| is the length of A in bits.
The routines named extend[8]
and fp[8]
return reduced results, while extend_[byte/char/int/long]
do not. An unreduced result is a number that is equal (mod
polynomial to the desired fingerprint but may have
degree degree
or higher. The method reduce
reduces such a result to a polynomial of degree less than
degree
. Obtaining reduced results takes longer than
obtaining unreduced results; thus, when fingerprinting long strings,
it's better to obtain irreduced results inside the fingerprinting loop
and use reduce
to reduce to a fingerprint after the loop.
Modifier and Type | Field and Description |
---|---|
int |
degree
The number of bits in fingerprints generated by
this . |
long |
empty
Fingerprint of the empty string of bytes.
|
long |
polynomial
The polynomial used by
this to generate
fingerprints. |
static long[][] |
polynomials
Array of irreducible polynomials.
|
static FPGenerator |
std24
A standard 24-bit fingerprint generator using
polynomials[0][24] . |
static FPGenerator |
std32
A standard 32-bit fingerprint generator using
polynomials[0][32] . |
static FPGenerator |
std40
A standard 40-bit fingerprint generator using
polynomials[0][40] . |
static FPGenerator |
std64
The standard 64-bit fingerprint generator using
polynomials[0][64] . |
Modifier and Type | Method and Description |
---|---|
long |
extend_byte(long f,
int v)
Extends
f with lower eight bits of v
without full reduction. |
long |
extend_char(long f,
int v)
Extends
f with lower sixteen bits of v . |
long |
extend_int(long f,
int v)
Extends
f with (all bits of) v . |
long |
extend_long(long f,
long v)
Extends
f with v . |
long |
extend(long f,
byte v)
Extends fingerprint
f by adding the low eight
bits of "b". |
long |
extend(long f,
byte[] buf,
int start,
int n)
Extends fingerprint
f by adding "n" bytes of
"buf" starting from "buf[start]". |
long |
extend(long f,
char v)
Extends fingerprint
f by adding (all bits of)
"v". |
long |
extend(long f,
char[] buf,
int start,
int n)
Extends fingerprint
f by adding (all bits of) "n"
characters of "buf" starting from "buf[i]". |
long |
extend(long f,
CharSequence s)
Extends fingerprint
f by adding (all bits of)
the characters of "s". |
long |
extend(long f,
int v)
Extends fingerprint
f by adding (all bits of)
"v". |
long |
extend(long f,
int[] buf,
int start,
int n)
Extends fingerprint
f by adding (all bits of) "n"
characters of "buf" starting from "buf[i]". |
long |
extend(long f,
long v)
Extends fingerprint
f by adding (all bits of)
"v". |
long |
extend(long f,
long[] buf,
int start,
int n)
Extends fingerprint
f by adding (all bits of) "n"
characters of "buf" starting from "buf[i]". |
long |
extend8(long f,
char[] buf,
int start,
int n)
Extends fingerprint
f by adding the lower eight
bits of "n" characters of "buf" starting from "buf[i]". |
long |
extend8(long f,
String s)
Extends fingerprint
f by adding the lower eight
bits of the characters of "s". |
long |
fp(byte[] buf,
int start,
int n)
Compute fingerprint of "n" bytes of "buf" starting from
"buf[start]".
|
long |
fp(char[] buf,
int start,
int n)
Compute fingerprint of (all bits of) "n" characters of "buf"
starting from "buf[i]".
|
long |
fp(CharSequence s)
Compute fingerprint of (all bits of) the characters of "s".
|
long |
fp(int[] buf,
int start,
int n)
Compute fingerprint of (all bits of) "n" characters of "buf"
starting from "buf[i]".
|
long |
fp(long[] buf,
int start,
int n)
Compute fingerprint of (all bits of) "n" characters of "buf"
starting from "buf[i]".
|
long |
fp8(char[] buf,
int start,
int n)
Compute fingerprint of the lower eight bits of "n" characters
of "buf" starting from "buf[i]".
|
long |
fp8(String s)
Compute fingerprint of the lower eight bits of the characters
of "s".
|
static FPGenerator |
make(long polynomial,
int degree)
Return a fingerprint generator.
|
long |
reduce(long fp)
Return a value equal (mod
polynomial ) to
fp and of degree less than degree . |
public final long empty
public final int degree
this
.public long polynomial
this
to generate
fingerprints.public static final long[][] polynomials
d
between 1 and 64 (inclusive),
polynomials[d][i]
is an irreducible polynomial of
degree d
. There are at least two irreducible
polynomials for each degree.public static final FPGenerator std64
polynomials[0][64]
.public static final FPGenerator std32
polynomials[0][32]
.public static final FPGenerator std40
polynomials[0][40]
.public static final FPGenerator std24
polynomials[0][24]
.public static FPGenerator make(long polynomial, int degree)
degree
and will be generated by
polynomial
. If a generator based on
polynomial
has already been created, it will be
returned. Requires that polynomial
is an
irreducible polynomial of degree degree
(the
array polynomials
contains some irreducible
polynomials).public long reduce(long fp)
polynomial
) to
fp
and of degree less than degree
.public long extend_byte(long f, int v)
f
with lower eight bits of v
without full reduction. In other words, returns a
polynomial that is equal (mod polynomial
) to the
desired fingerprint but may be of higher degree than the
desired fingerprint.public long extend_char(long f, int v)
f
with lower sixteen bits of v
.
Does not reduce.public long extend_int(long f, int v)
f
with (all bits of) v
.
Does not reduce.public long extend_long(long f, long v)
f
with v
.
Does not reduce.public long fp(byte[] buf, int start, int n)
public long fp(char[] buf, int start, int n)
public long fp(CharSequence s)
public long fp(int[] buf, int start, int n)
public long fp(long[] buf, int start, int n)
public long fp8(String s)
public long fp8(char[] buf, int start, int n)
public long extend(long f, byte v)
f
by adding the low eight
bits of "b".public long extend(long f, char v)
f
by adding (all bits of)
"v".public long extend(long f, int v)
f
by adding (all bits of)
"v".public long extend(long f, long v)
f
by adding (all bits of)
"v".public long extend(long f, byte[] buf, int start, int n)
f
by adding "n" bytes of
"buf" starting from "buf[start]".
Result is reduced.
Requires "[i, i+n)" is in bounds.public long extend(long f, char[] buf, int start, int n)
f
by adding (all bits of) "n"
characters of "buf" starting from "buf[i]".
Result is reduced.
Requires "[i, i+n)" is in bounds.public long extend(long f, CharSequence s)
f
by adding (all bits of)
the characters of "s".
Result is reduced.public long extend(long f, int[] buf, int start, int n)
f
by adding (all bits of) "n"
characters of "buf" starting from "buf[i]".
Result is reduced.
Requires "[i, i+n)" is in bounds.public long extend(long f, long[] buf, int start, int n)
f
by adding (all bits of) "n"
characters of "buf" starting from "buf[i]".
Result is reduced.
Requires "[i, i+n)" is in bounds.public long extend8(long f, String s)
f
by adding the lower eight
bits of the characters of "s".
Result is reduced.public long extend8(long f, char[] buf, int start, int n)
f
by adding the lower eight
bits of "n" characters of "buf" starting from "buf[i]".
Result is reduced.
Requires "[i, i+n)" is in bounds.Copyright © 2003-2014 Internet Archive. All Rights Reserved.