CT-ISG: Collaborative Research: Towards Trustworthy Database Systems




Software and Datasets

Supported by National Science Foundation

Award Number: CT-ISG-0831281

Award Number: CT-ISG-0831278

PIs: George Kollios and Leonid Reyzin (BU), Feifei Li (FSU)

This is a collaborative project with the Database Lab at Florida State University, led by Prof. Feifei Li.

Answers to database queries often form the basis for critical decision-making. One way to ensure that the answers are correct is to ask a trusted data owner directly, over an authenticated channel. This approach, however, does not scale well: the data owner may not have the infrastructure to respond to a high volume of requests because of bandwidth or processing requirements. The clients may not have the connectivity for low-latency communication with the single owner, particularly if they are geographically dispersed. Moreover, the approach only assures the client who formulated the query that the answer is correct; it does not provide a way to transfer that assurance to a third party. For example, it may allow a driver to look up her record at a state registry of motor vehicles, but does not allow her to then pass the record on to an insurance broker, and the broker to then pass it on to the insurer. These entities would have to issue their own queries to the data owner in order to be assured of the authenticity of the answer, which would lead to additional connectivity and processing requirements.

A desirable alternative is to have database answers come with their own proofs of correctness. Thus, one would trust an answer to the database query not based on where the answer comes from, but based on the information the answer carries. This would enable a rich variety of communication paths for obtaining verifiable database answers.
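As an illustration of an answer that carries its own proof, the sketch below uses a Merkle hash tree, a common building block in the query authentication literature. It is not presented as this project's method. It assumes the client already holds the tree root from the data owner (for example, as a signed value) and that the record list is non-empty. All function names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(record: bytes) -> bytes:
    # Domain-separate leaves from internal nodes.
    return _h(b"\x00" + record)


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def build_levels(records):
    """Return all tree levels, leaves first, root level last.

    Assumes `records` is non-empty. An odd-sized level pairs its
    last node with itself (a simplification for this sketch).
    """
    level = [leaf_hash(r) for r in records]
    levels = [level]
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 == 1 else level
        level = [node_hash(padded[i], padded[i + 1])
                 for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def prove(levels, index):
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # last node of an odd-sized level
        node_is_right = index % 2 == 1
        proof.append((level[sibling], node_is_right))
        index //= 2
    return proof


def verify(record: bytes, proof, root: bytes) -> bool:
    """Recompute the root from the record and its proof; compare to the trusted root."""
    acc = leaf_hash(record)
    for sibling, node_is_right in proof:
        acc = node_hash(sibling, acc) if node_is_right else node_hash(acc, sibling)
    return acc == root


# Usage: the owner builds the tree; any server can return (record, proof).
records = [b"driver:alice", b"driver:bob", b"driver:carol"]
levels = build_levels(records)
root = levels[-1][0]          # assumed to reach clients via a trusted channel
proof = prove(levels, 2)
print(verify(b"driver:carol", proof, root))
print(verify(b"driver:mallory", proof, root))
```

Because verification depends only on the record, the proof, and the trusted root, the same (record, proof) pair can be passed along to a third party, who can check it without contacting the data owner.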

In this project we plan to investigate authentication and verification methods for answers to database queries. Although most existing solutions can authenticate simple relational queries, such as range selection and projection queries, efficient methods for authenticating more complex relational queries (e.g., queries that involve joins and/or nested queries) do not yet exist. Besides relational queries, spatial and spatio-temporal queries for supporting location-based services and OLAP (Online Analytical Processing) queries over large data warehouses are expected to be important forms of outsourced queries, and are therefore in need of authentication. Authentication in these settings can be even more challenging, since these queries are usually defined over multi-dimensional and complex data or are computed over very large volumes of data. Furthermore, an important scenario is when multiple clients use a third-party server to exchange and integrate data. We plan to design efficient verification methods under this model. Finally, data confidentiality, in terms of both data privacy and access control, is an important security measure in these systems. We plan to integrate query authentication techniques with methods that guarantee data confidentiality as well.

The long-term goal is to pave the way towards trustworthy database systems that will allow users to verify the soundness and completeness of their query results without compromising the efficiency of these systems.


  • A new paper has been accepted in the ACM Transactions on Information and System Security (ACM TISSEC) journal:
    Feifei Li, Marios Hadjieleftheriou, George Kollios, and Leonid Reyzin. Authenticated Index Structures for Aggregation Queries. To appear in ACM TISSEC, 2010.
  • A new paper has been accepted in VLDB 2010:
    Tomasz Nykiel, Michalis Potamias, Chaitanya Mishra, George Kollios, and Nick Koudas. MRShare: Sharing Across Multiple Queries in MapReduce. In VLDB, September 2010, Singapore.

Any opinions, findings, and conclusions or recommendations expressed here are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.