INTRODUCTION
The files ijs.h, ijsio.cpp, ijs.cpp, and WelchBerlekamp.cpp contain an
implementation of the "Improved Juels-Sudan Secure Sketch" (Construction 5)
from "Fuzzy Extractors: How to Generate Strong Keys from Biometrics and
Other Noisy Data" by Dodis, Ostrovsky, Reyzin and Smith (SIAM Journal on
Computing, volume 38, number 1, pages 97-139, 2008; on-line at
http://eprint.iacr.org/2003/235 and http://arxiv.org/abs/cs/0602007). The
implementation is by Soren Johnson and Leonid Reyzin (reyzin@cs.bu.edu).
It uses Victor Shoup's NTL (A Library for doing Number Theory), which must
be installed before it will compile; see http://www.shoup.net.
Given a parameter t, a prime n, and input set A of s integers between 0 and
n-1, "ijs -s" will produce as output a sketch of A containing t integers
between 0 and n-1. Then, if B is also a set of s integers between 0 and
n-1 and the size of the symmetric difference between A and B is at most t,
running "ijs -r" will find A given only B and the sketch of A (thus, in
particular, recovering elements of A that are not in B without seeing A).
These programs provide an efficient way to find differences between two
sets without having to communicate the sets. They also allow for
information-reconciliation with minimum information leakage. For more
details and applications, see the aforementioned "Fuzzy Extractors" paper.
Note that t must be even (because symmetric difference between two sets of
equal size is always even) and no more than s (otherwise it is easier to
simply store A).
Even though the secure sketch as described in the paper works for any prime
power n, the current implementation is limited to prime n.
See also companion implementation of PinSketch, another secure sketch from
the same paper, at http://www.cs.bu.edu/~reyzin/code/fuzzy.html. Note that
the user interface for the two implementations is slightly different (in
particular, because PinSketch can work with either two sketches or a sketch
and a set). The PinSketch implementation is significantly more efficient
than this implementation, and does not require two sets to be of the same
size. The efficiency of this implementation may be improved considerably,
as discussed in the comments of ijs.cpp.
The mathematical meat of the implementation is in ijs.cpp and
WelchBerlekamp.cpp. In particular, WelchBerlekamp.cpp implements the
Welch-Berlekamp algorithm for decoding Reed-Solomon codes, and may be of
independent interest. ijsio.cpp contains a simple command-line-parser and
file input-output routines that provide a user interface.
WARNINGS AND LIMITATIONS
The programs will not behave correctly if any of the following occurs.
1) There are duplicate elements in A or duplicate elements in B (i.e., if A
or B are multisets).
2) B has a different parameter n or size s from A
3) n is not a prime
4) The input or sketch files do not follow the prescribed format.
5) The size of the symmetric difference between the two sets is greater
than t. (In that case, "ijs -r" will output an error message unless there is a set
C that has the same sketch as A and whose symmetric difference with B has
no more than t elements, in which case C will be output.)
Note that cases 1, 2, 3, and 4 can be taken care of at the input stage (but
currently are not). Case 5 is impossible to fix completely, although there
are techniques that will reduce the likelihood of an incorrect output at
the expense of a larger sketch.
USAGE DETAILS
To compile, make sure you have NTL (A Library for doing Number Theory)
installed (see http://www.shoup.net) and issue command
make
If the compilation fails, see the comments in the file called Makefile.
Once everything compiles, invoking
./ijs -s A.set
will produce a file sketch.out (the output filename can be changed by supplying
it as a separate argument, e.g., ./ijs -s A.set A.sketch). The format of A.set is
t=
n=
[ ... ]
where is an integer between 0 and n.
For example,
t=4
n=17
[2 4 7 16 8 9 12 15]
(note that no spaces around the equal signs are allowed; otherwise, spaces
and linebreaks can be added anywhere).
The sketch file will contain t integers between 0 and n-1.
Invoking
./ijs -r B.set sketch.out
will output a file rec.out (again, the output filename can be changed by
supplying it as the last argument) that contains the set that was used to produce
the sketch file, if the difference between B and that set is at most t.
More precisely, rec.out will contain the unique set
whose sketch is equal to the sketch in sketch.out and whose symmetric
difference with B is of size at most t, if such a set exists; if it doesn't
exist, an error message will be printed. Note that A.set, B.set and
rec.out are treated as sets and not as sequences: the order of elements is
arbitrary, and the order in rec.out may not match the order in A.set. The
format of B.set is the same as of A.set.
Invoking
./ijs -p bitlength
produces a file mod.out (again, the filename can be changed by supplying it
as a last argument) that contains a prime of the specified length. This
may be useful for generating .set files.
Invoking
./ijs -h
prints out a help message.
Sample files A.set, B.set A.sketch, M.set, N.set, M.sketch are included
with the implementation.