Searching in encrypted data

Richard Brinkman

Research output: ThesisPhD Thesis - Research UT, graduation UT

245 Downloads (Pure)


This PhD thesis addresses the problem of securing data stored on an untrusted server. There are situations in which personal data or other sensitive information has to be stored on an untrusted system. For instance, if someone else has a cheaper means to store large amounts of data or offers a better network connectivity, it is beneficial to outsource your data to that system. In the literature we find different approaches to secure data. Some approaches use access control while others use encryption. In this thesis we focus on the latter approach. We do not assume that the storage system itself is secure. In this PhD thesis we envisage the following scenario. There exists a server with a large storage capacity and a large bandwidth. This server is considered honest but curious. This means that on the one hand we trust that it stores the data correctly and follows the protocols. On the other hand it cannot be trusted to refuse access to unauthorised people. Since the security of the system itself cannot be trusted, the data should be stored in encrypted form at the server. Authorised people should still be able to query the encrypted database efficiently. The goal of the search process is to perform the majority of workload at the server, allowing low power devices to connect to the database. Three solutions are presented. The first solution uses a trapdoor mechanism. The data is encrypted in such a way that it is possible to search for a certain word. The server is given a key that is specific for that particular word. With this key the server is able to scan the encrypted text and find occurrences of the word. Although the server does not know which word it is being asked for, it will learn the location where the word can be found, if it can be found at all. The server does not learn anything else about the text. The second solution uses secret sharing. The text to be stored is split in two (or more) shares. Both shares are needed to reconstruct the original text. The text is split in such a way that it is possible for the data owner to regenerate his own share, so that he does not actually have to store it. The other share is stored at the server. The search process consists of an interactive protocol between the data owner and the server. The server does not learn the location where the answer can be found, as in the first solution, but the client has more work to do. A third category of solutions uses homomorphic encryption functions. Homomorphic encryption makes it possible to perform simple operations like addition and multiplication directly on the encrypted data, without the need to decrypt it first. We explore possibilities to use this type of encryption functions to search in encrypted data. The thesis ends with a storage technique, based on the principles of a lucky dip, in which the security not solely relies on the computational complexity, like in standard cryptography, but also on information theoretic security. The information will be torn into shreds by using secret sharing before they are are mixed with similar shreds from other documents. The security of the lucky dip containing all those shreds is based on the fact that many combinations of shreds result in correctly readable texts. The number of combinations increases dramatically with the number of shreds. Enumerating all possible combinations is not feasible in practice. Even if we assume that an attacker has unlimited time or has unlimited computational power, the attacker still does not have certainty which messages are stored in the lucky dip. Although the attacker finds all the stored messages, he also `finds' almost every other text imaginable. The attacker does not know which combination results in a readable text by accident and which one is stored deliberately. Summarising, we offer three approaches for a client to query a database such that the server neither learns the query nor the stored data: -- An encrypted XML text can be searched efficiently for the occurrences of a word. The search takes place entirely at the server. The server learns only the locations of the word, if it occurs at all, but nothing more about the text or the search word. -- Using a secure protocol between the server and the client and data represented as shared polynomials, we can secure the data, the query as well as the answer, at the cost of more work for the client. -- The simple operations that homomorphic encryption can perform on encrypted data, are sufficient to search the encrypted data.
Original languageUndefined
Awarding Institution
  • University of Twente
  • Jonker, Willem, Supervisor
  • Hartel, Pieter Hendrik, Supervisor
Thesis sponsors
Award date1 Jun 2007
Place of PublicationEnschede
Print ISBNs978-90-365-2488-9
Publication statusPublished - 1 Jun 2007


  • METIS-241644
  • IR-57852
  • EWI-9951

Cite this