National Science Foundation
Award Number: CT-ISG-0831281
Award Number: CT-ISG-0831278
PIs: George Kollios and Leonid Reyzin (BU), Feifei Li (FSU)
This is a collaborative project with the Database Lab at Florida State University, led by Prof. Feifei Li.
Answers to database queries often form the basis for critical
decision-making. One way to ensure that the answers are correct is to ask
a trusted data owner directly, over an authenticated channel. This approach,
however, does not scale well: the data owner may not have the capacity
to respond to a high volume of requests because of bandwidth or processing
limitations. The clients may not have the
connectivity for low-latency communication to the single owner, particularly if they are
geographically dispersed. Moreover, the approach only assures the client
who formulated the query
that the answer is correct; it does not provide a way to transfer that
assurance to a third party. For example, it may allow a driver to look up
her record at a state registry of motor vehicles, but does not allow her
to then pass the record on to an insurance broker, and the broker to then
pass it on to the insurer. These entities would have to issue their
own queries to the data owner in order to be assured of the
authenticity of the answer, which would lead to additional
connectivity and processing requirements.
A desirable alternative is
to have database answers come with their own proofs of correctness. Thus, one would
trust an answer to the database query not based on where the answer comes
from, but based on the information the answer carries. This would enable a
rich variety of communication paths for obtaining verifiable database answers.
In this project we plan to investigate authentication
and verification methods for answers to database queries. Although most existing solutions
can authenticate simple relational queries, such as range selection and projection queries, efficient methods for
authenticating more complex relational queries (e.g., queries that involve joins and/or nested queries) do not exist yet.
Besides relational queries, spatial and spatio-temporal queries for supporting location-based services and
OLAP (Online Analytical Processing) queries over large data warehouses are expected to be important
forms of outsourced queries, and therefore in need of authentication.
Authentication in these settings can be even more challenging since these queries are usually defined
over multi-dimensional and complex data or are computed over very large volumes of data.
Furthermore, an important scenario is when multiple clients use a third-party server to exchange and integrate data.
We plan to design efficient verification methods under this model. Finally, data confidentiality in terms of both data privacy
and access control is an important security measure in these systems.
We plan to integrate query authentication techniques to work with methods that guarantee data confidentiality as well.
The long-term goal is to pave the way towards trustworthy database systems that will allow users
to verify the soundness and completeness of their query results without compromising the efficiency of these systems.
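The idea of an answer that carries its own proof of correctness can be illustrated with a Merkle hash tree, a standard building block in the query authentication literature. The sketch below is only an illustration under stated assumptions, not the method developed in this project: the function names (`build_levels`, `prove`, `verify`) are hypothetical, the example records are made up, and the root digest is assumed to have been signed by the data owner and obtained by the verifier through some authentic means. It shows membership (soundness) for a single record only; proving completeness of a range answer would additionally require boundary records, which are omitted here.

```python
import hashlib


def leaf_hash(record: bytes) -> bytes:
    # Domain-separate leaves from internal nodes (prefix 0x00).
    return hashlib.sha256(b"\x00" + record).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Internal node digest over the two child digests (prefix 0x01).
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(records):
    """Return all tree levels, leaves first; the last level holds the root."""
    if not records:
        raise ValueError("at least one record is required")
    level = [leaf_hash(r) for r in records]
    levels = []
    while True:
        # Pad odd-sized levels by duplicating the last digest (a simplification).
        if len(level) > 1 and len(level) % 2 == 1:
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            break
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return levels


def prove(levels, index):
    """Collect sibling digests on the path from leaf `index` up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))  # (digest, sibling_is_left)
        index //= 2
    return proof


def verify(record: bytes, proof, root: bytes) -> bool:
    """Recompute the root from the record and its proof; compare to the trusted root."""
    digest = leaf_hash(record)
    for sibling, sibling_is_left in proof:
        digest = node_hash(sibling, digest) if sibling_is_left else node_hash(digest, sibling)
    return digest == root


# Hypothetical example: the owner builds the tree, an untrusted server answers.
records = [b"driver-1:clean", b"driver-2:1 ticket", b"driver-3:clean",
           b"driver-4:2 tickets", b"driver-5:clean"]
levels = build_levels(records)
root = levels[-1][0]          # assumed to be signed and published by the data owner
proof = prove(levels, 1)      # produced by the untrusted server
print(verify(records[1], proof, root))
```

Because verification needs only the record, the proof, and the owner's root digest, anyone holding these three items can repeat the check. In the motor vehicle example, the driver could forward the record and its proof to the broker, and the broker to the insurer, without any of them contacting the data owner again.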
Any opinions, findings, and conclusions or recommendations expressed
here are those of the author(s) and do not necessarily reflect the
views of the National Science Foundation.
- A new paper has been accepted to the ACM Transactions on Information and System Security (ACM TISSEC) journal:
Feifei Li, Marios Hadjieleftheriou, George Kollios, and Leonid Reyzin. Authenticated Index Structures for Aggregation Queries. To appear in ACM TISSEC, 2010.
- A new paper has been accepted to VLDB 2010:
Tomasz Nykiel, Michalis Potamias, Chaitanya Mishra, George Kollios, and Nick Koudas. MRShare: Sharing Across Multiple Queries in MapReduce.
In VLDB, September 2010, Singapore.