Secure outsourcing of XML data
Barbara Carminati
University of Insubria at Varese
[email protected]
http://www.dicom.uninsubria.it/~barbara.carminati

Software as a Service
- Get what you need, when you need it
- Pay for what you use
- Don't worry about:
  - deployment, installation, maintenance, upgrades
  - hiring, training, and retaining people

Emerging trend: data outsourcing
Database as a Service (DBaaS), why?
- Most organizations need efficient data management
- DBMSs are extremely complex to deploy, set up, and maintain
- They require skilled DBAs (at very high cost!)
- The trend is driven by faster, cheaper, and more accessible networks

Traditional architecture
[Figure: a Client connected to a DBMS Server]

Third-party architecture
[Figure: Data owner, Internet, Data Provider holding the outsourced db, and a Client exchanging Queries and Results with the provider]

Research issues
- Distributed query management
- Consistency
- Security & Privacy; main requirements: confidentiality, integrity, authenticity, completeness, etc.

Security & Privacy
- Naïve solution: data providers are trusted, i.e., they always operate according to the owner's security and privacy policies

Security & Privacy
- To be satisfied even in the presence of an untrusted provider that:
  - can modify/delete the data
  - can access sensitive/private information
  - can send data to non-authorized users
  - can withhold from a user part of the information he/she is authorized to access
  - can be attacked from outside
- To be satisfied while incurring minimal computation and bandwidth overhead

Main requirements
- Confidentiality
- Authenticity/integrity
- Completeness

Confidentiality
- Data are disclosed only to authorized users
- Usually, confidentiality requirements are expressed through a set of access control policies

Access control
[Figure labels: Users, Access request, Reference Monitor, Authorizations, Access control policies, SAs, Access granted (partially or totally), Access denied]

Confidentiality
When data are outsourced, confidentiality has a twofold meaning:
- Confidentiality wrt users: protect data against unauthorized users' read accesses
- Confidentiality wrt providers: protect the Owner's
data from read accesses by untrusted providers

Integrity
Integrity refers to protecting information from modifications; it involves several goals:
- Assuring the integrity of information with respect to the original information, often referred to as authenticity
- Protecting information from unauthorized modifications

Integrity/authenticity
- Usually enforced through signature techniques
- When data are outsourced:
  - traditional signature techniques are not enough
  - a user may be returned only selected portions of the data signed by the owner

Completeness
Completeness means ensuring that users receive all the information they are entitled to access, according to the owner's policies

Secure outsourcing of XML data: our proposal

Scenario
- We focus on XML
- The Owner is the producer of information. It specifies access control policies
- The Provider is responsible for managing (a portion of) the Owner's information and answering user queries according to the access control policies specified by the Owner
[Figure labels: Owner, XML Source, Credential base, Policy Base, XML docs, Provider]

Scenario
- We focus on XML data
- The Owner specifies access control policies according to an access control model supporting:
  - fine-grained and credential-based access control
  - an XML-based language to express access control policies and credentials (X-Sec)

Example
X-Sec credential of Alice:
<x_profile>
  <secretary level='7'>
    <name>Alice Rossi</name>
    <department>marketing</department>
    <type>administrative</type>
    <email>[email protected]</email>
  </secretary>
</x_profile>

Access control policies (encoded in the X-Sec language):
Cred expression | Target | Path | M | P
secretary[@level>='4'] | organization.xml | department[@dept='Marketing']/employee[@level<10] | R | F
secretary[@level>='9'] | organization.xml | department[@dept='Internet']/employee | R | F

Example
Alice submits this XPath: //organization/department/employee[@level>4]
<?xml version="1.0" encoding="UTF-8"?>
<Organization>
  <department dept='Marketing'>
    <employee>
      <name>Alice Rossi</name>
      <salary>80K</salary>
      <level>7</level>
    </employee>
    <employee>
      <name>Bob Red</name>
      <salary>50K</salary>
      <level>5</level>
    </employee>
    <employee>
      <name>Tom Black</name>
      <salary>170K</salary>
      <level>12</level>
    </employee>
  </department>
  <department dept='HR'>
    <employee>
      <name>Kim</name>
      <salary>150K</salary>
      <level>11</level>
    </employee>
    <employee>
      <name>Ann</name>
      <salary>80K</salary>
      <level>7</level>
    </employee>
  </department>
</Organization>
The access control policy authorizes Alice to see department[@dept='Marketing']/employee[@level<10]; the other three employees (Tom Black, Kim, and Ann) are marked as denied.

Problem
Strategies for ensuring authenticity, confidentiality, and completeness even if the provider is not trusted
[Figure labels: Owner with XML Source, Credential base, and Policy Base; XML docs held by untrusted Providers 1 to 4]

Proposed solution: overall idea
The owner outsources to providers a Security Enhanced Encryption of the original XML docs, where:
- Authenticity and integrity are enforced by an alternative digital signature devised for XML docs, i.e., the Merkle Signature
- Confidentiality is ensured by the properties of well-formed encryption
- The encryption contains security information that enables the providers to evaluate queries
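
As a rough illustration of the access control example above (Alice's credential, the first policy in the table, and the Organization document), the following Python sketch computes which employees Alice's query may return. It is not the X-Sec engine: the dictionary used for the credential, the helper names policy_applies and authorized_view, and the choice to read the level from the <level> child element (the slide's XPath writes @level) are assumptions made only for illustration.

```python
import xml.etree.ElementTree as ET

# The Organization document from the example, with whitespace normalized.
ORGANIZATION_XML = """
<Organization>
  <department dept="Marketing">
    <employee><name>Alice Rossi</name><salary>80K</salary><level>7</level></employee>
    <employee><name>Bob Red</name><salary>50K</salary><level>5</level></employee>
    <employee><name>Tom Black</name><salary>170K</salary><level>12</level></employee>
  </department>
  <department dept="HR">
    <employee><name>Kim</name><salary>150K</salary><level>11</level></employee>
    <employee><name>Ann</name><salary>80K</salary><level>7</level></employee>
  </department>
</Organization>
"""


def employee_level(employee):
    """Read the employee level from the <level> child element (assumption)."""
    return int(employee.findtext("level").strip())


def policy_applies(credential):
    """Credential expression of the first policy: secretary[@level>='4']."""
    return credential["role"] == "secretary" and credential["level"] >= 4


def authorized_view(root, credential, min_level):
    """Names of employees that match the query (level > min_level) and that
    the policy target department[@dept='Marketing']/employee[level<10] covers."""
    if not policy_applies(credential):
        return []
    granted = []
    for emp in root.findall("./department[@dept='Marketing']/employee"):
        if min_level < employee_level(emp) < 10:
            granted.append(emp.findtext("name").strip())
    return granted


root = ET.fromstring(ORGANIZATION_XML)
alice = {"role": "secretary", "level": 7}  # hypothetical encoding of Alice's credential
print(authorized_view(root, alice, min_level=4))  # ['Alice Rossi', 'Bob Red']
```

Tom Black is filtered out by the level<10 condition and the HR employees by the department condition, which matches the three entries marked as denied in the example.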
The owner also provides users with auxiliary data structures (query templates) that enable them to submit queries directly to providers and to verify the query results they obtain

Owner-side processing
[Figure: the XML document is processed with the Merkle Signature (authenticity information), partitioning information, and well-formed encryption with keys K1, Kj, Km, Kp, producing the SE-ENC document with its security information; removal of the encrypted content produces the Query Template]

System architecture
[Figure labels: OWNER, Decryption keys, SE-ENC document, credentials, PROVIDER, CLIENT, User, Query, Answer]

System architecture
[Figure labels: OWNER, Query Template, SE-ENC document, PROVIDER, CLIENT, User, Query, Answer, XML query, Reply Document]

Confidentiality enforcement

Confidentiality issues
Secure data outsourcing implies two different confidentiality issues:
- Confidentiality with respect to users
- Confidentiality with respect to providers

Confidentiality
- Problem: providers must be able to evaluate queries and enforce access control policies on XML documents while respecting confidentiality requirements
- Solution: based on encryption techniques

Well-Formed Encryption
- The idea is that, before sending a document to a provider, the owner encrypts it using well-formed encryption
- The approach is based on encrypting with the same key all document portions to which the same set of access control policies applies

Well-Formed Encryption
Figure, shown in successive frames: a document tree with nodes &1 to &16, some of them labeled with the policies that apply to them (P1,P3; P2; P3). The frames highlight in turn the node encrypted with key K1, the nodes encrypted with key K2, the nodes encrypted with key K3, and the nodes encrypted with
key Kd.

Well-Formed Encryption
[Figure, final frame: the same tree with keys shown next to the policies; the labels read P1 K2, P2 K1, and K2, K3 P3]

Well-Formed Encryption: key management
- The owner does not supply any key to providers
- Keys are stored by the owner in the user entries of the directory server
- Each user entry contains the key(s) corresponding to the access control policies satisfied by the user
- A hierarchical key management scheme minimizes the number of keys to be permanently stored

Well-Formed Encryption: pros
- Each node of the resulting encrypted document is accessible only by authorized users
- It prevents provider access to the managed data
- Well-formed encryption ensures confidentiality wrt both users and providers

Well-Formed Encryption: cons
- Issue: how can the Provider evaluate queries on encrypted XML data?

Querying XML encrypted data
- Querying encrypted documents is a difficult issue and greatly depends on the kinds of queries that are submitted to providers.
- In our scenario, we assume users submit XPath expressions

Querying XML encrypted data
XPath expressions:
- Queries that impose conditions only on the structure of the XML document (structure queries)
- Queries that impose conditions also on data content (content-dependent queries)

Well-Formed Encryption
A well-formed encryption is encoded as an XML document that preserves the structure of the original XML document
[Figure: an original tree with labels tg1, tg2, tg3, tg2, tg3, Att, Att next to its encryption with labels Enc(tg1,K1), Enc(tg2,K2), Enc(tg3,K1), Enc(Att,K1), Enc(tg2,K2), Enc(tg3,K3), Enc(Att,K3)]

Well-Formed Encryption
- Preserving the original document structure greatly facilitates the evaluation of structure queries over the encrypted document
- But it implies some security threats: data dictionary attacks by providers and users
  - at schema level (tag/attribute names)
  - on element data contents/attribute values

Well-Formed Encryption
To prevent data dictionary attacks we adopt the encryption scheme proposed by Song, Wagner and Perrig for textual data (IEEE Symposium on Security and Privacy, 2000):
- Different occurrences of the same word, encrypted with the same key, result in different encryptions
- It is possible to perform keyword-based searches on the encrypted textual data without knowing the decryption keys

Querying XML encrypted data: structure queries
- XPath expressions specify only the location path
- Since we preserve the structure, the client simply generates the corresponding encrypted query
  - Ex: //tag1/tag2/tag3//
  - Ex: //Enc(tag1,K1)/Enc(tag2,K2)/Enc(tag3,K1)//
- Providers are able to evaluate the encrypted query directly on the encrypted document

Querying XML encrypted data
XPath expressions:
- Queries that impose conditions only on the structure of the XML document (structure queries)
- Queries that impose conditions also on data content
(content-dependent queries)

Querying XML encrypted data: content-dependent queries
- To enable a provider to evaluate conditions on encrypted data, we provide it with additional information
- In particular, depending on the data domain, we use two different strategies:
  - non-textual data: Hacigumus et al. (SIGMOD 2002)
  - textual data: Song et al. (IEEE Symposium on Security and Privacy, 2000)

Querying XML encrypted data: content-dependent queries
Proposed solution for non-textual data, based on previous research on querying encrypted relational databases (H. Hacigumus et al.):
- Given a relation R, the data owner divides the domain of each attribute into distinct partitions and assigns a different id to each partition
- For each encrypted tuple, the provider also receives the partition ids of each of its attributes
- The provider is able to evaluate queries directly on the encrypted tuples by exploiting the partition ids

Querying XML encrypted data: content-dependent queries
Employee relation:
Eid | Name | Dip | Salary
0945 | Alice | 98 | 275
7903 | Bob | 93 | 436
8239 | John | 93 | 380

Salary partitions:
Partition | ID
251-300 | 46
301-350 | 29
351-400 | 30
401-450 | 41

Relation stored at the Provider:
etuple | ID Eid | ID Name | ID Dip | ID Salary
#%& | … | ... | ... | 46
@#% | ... | … | … | 41
@#% | … | … | … | 30

The query
SELECT * FROM Employee WHERE Salary=275
is rewritten for the provider as
SELECT * FROM Employee WHERE ID_salary=46

Querying XML encrypted data: content-dependent queries
Owner: a tree with labels tg1, tg2, tg3, Salary, tg2, tg3, Salary, together with the salary partition table above
Provider: the well-formed encryption plus node partition ids:
Enc(tg1,K1)
  Enc(tg2,K2)
    Enc(tg3,K1): Enc(380,K1); 30
  Enc(tg2,K2)
    Enc(tg3,K3): Enc(275,K3); 46
The query //tg1/tg2/tg3[@Salary=275]// is translated into //Enc(tg1,K1)/Enc(tg2,K2)/Enc(tg3,K1)[@Enc(Salary,K1)=46]//

Querying XML encrypted data: content-dependent queries
Proposed solution for textual data:
- A first phase, during which the Owner preprocesses the textual data contained in an attribute/element and extracts from them a set of meaningful keywords
- A second phase, in which each keyword is encrypted according to the Song et al.
scheme

Querying XML encrypted data: content-dependent queries
Owner: a tree with labels tg1, tg2, tg3, tg2, tg4; keywords: XML, DB
Provider: the well-formed encryption plus encrypted keywords:
Enc(tg1,K1)
  Enc(tg2,K2)
    Enc(tg3,K1): Enc(tg3.content,K1); Enc(XML,k1), Enc(DB,k1)
  Enc(tg2,K2)
    Enc(tg4,K3): Enc(tg3.content,K3); …..
The query //tg1/tg2/tg3[contains(.,'DB')]// is translated into //Enc(tg1,K1)/Enc(tg2,K2)/Enc(tg3,K1)[contains(., Enc('DB',k1))]//

Authenticity and Integrity enforcement

Authenticity/integrity
- To ensure authenticity in two-party architectures, traditional digital signatures work well
[Figure labels: User, query, Signed view, Owner]
- But traditional digital signatures have some problems in third-party architectures
[Figure labels: User, Xpath, Provider, Owner]

Merkle Signature
An alternative way to sign an XML doc
- By applying a single digital signature to an XML doc it is possible to ensure the authenticity of:
  - the whole document
  - any portion of it
- Example on a tree with nodes N1 to N7, where the digest of N3 includes those of N6 and N7:
  - MhX(N7) = h(h(N7.content) || h(N7))
  - MhX(N3) = h(h(N3.content) || h(N3) || MhX(N6) || MhX(N7))
- The figure shows a sample signature value: J8ygVS8nqtl F5HP3FBj9e ZU/KYY=
- It uses a different
way to compute the digest of XML docs, based on the Merkle tree authentication mechanism

Merkle hash paths
How can a user validate the Merkle signature computed on the whole XML document when holding only a portion of it? Through Merkle hash paths.

Merkle hash paths for a leaf node
The Merkle hash path between v' and v consists of:
- the Merkle hash values of all the siblings of the nodes belonging to the path connecting v' to v
[Figure: a tree with nodes 1 to 17 illustrating MhPath(4,1)]

Completeness enforcement

Completeness
- Completeness is verified through the use of query templates
- The query template consists of the SE-ENC document (i.e., the well-formed encryption plus the additional information) without data content
- By executing on the query template the queries submitted to the provider, a user is able to verify the completeness of the query answer without accessing information he/she is not allowed to see

Completeness
- The encrypted data structure enables users to verify the completeness of structure queries
- By exploiting partition information and encrypted keywords, a user is able to verify the completeness of content-dependent queries

Conclusion

Owner-side processing
[Figure: the XML document is processed with the Merkle Signature (authenticity information), partitioning information, and well-formed encryption with keys K1, Kj, Km, Kp, producing the SE-ENC document with its security information; removal of the encrypted content produces the Query Template]

Provider-side processing
[Figure labels: Query evaluation, SE-ENC document, Create Reply document, Insert Merkle Signature, Insert information needed for authenticity verification, Reply document]

User-side processing
[Figure labels: Reply document, Confidentiality verification, Authenticity verification, Completeness verification, Query Template]

System architecture
Figure labels: Users (Alice, Bob, Frank); Providers; SE-ENC XML documents; Directory server; User Policy Configuration + encryption Keys; Query Template documents; OWNER; Subscription Request; User_ID; Users entry; Key; Query; Answer; User; CLIENT; Provider entry key; User Policy
Configuration + Query; Reply Document; PROVIDER

References

Papers on XML
B. Carminati, E. Ferrari. Confidentiality Enforcement for XML Outsourced Data. In Proc. of the Second International EDBT Workshop on Database Technologies for Handling XML Information on the Web, Munich, Germany, March 2006.
B. Carminati, E. Ferrari, E. Bertino. Assuring Security Properties in Third Party Architecture. In Proc. of the International Conference on Data Engineering (ICDE'05), poster paper.
B. Carminati, E. Ferrari. Trusted Privacy Manager: A System for Privacy Enforcement on Outsourced Data. In Proc. of the International Workshop on Privacy Data Management, Tokyo, Japan, April 2005.
E. Bertino, B. Carminati, E. Ferrari, B. Thuraisingham, A. Gupta. Selective and Authentic Third-party Distribution of XML Documents. IEEE Transactions on Knowledge and Data Engineering, 16(10): 1263-1278, 2004.
E. Bertino, B. Carminati, E. Ferrari. A Flexible Authentication Method for UDDI Registries. In Proc. of the 2003 International Conference on Web Services (ICWS'03), Las Vegas, June 2003.
E. Bertino, B. Carminati, E. Ferrari. A Temporal Key Management Scheme for Secure Broadcasting of XML Documents. In Proc. of the 9th ACM Conference on Computer and Communications Security, Washington, November 2002.
E. Bertino, E. Ferrari. Secure and Selective Dissemination of XML Documents. ACM Transactions on Information and System Security (TISSEC), 5(3): 290-331, 2002.

Papers on relational data
H. Hacigumus, B. Iyer, C. Li, S. Mehrotra. Executing SQL over Encrypted Data in the Database Service Provider Model. In Proc. of the SIGMOD Conference, 2002.
D. X. Song, D. Wagner, A. Perrig. Practical Techniques for Searches on Encrypted Data. In Proc. of the IEEE Symposium on Security and Privacy, Oakland, California, 2000.
B. Chor, O. Goldreich, E. Kushilevitz, M. Sudan. Private Information Retrieval. In Proc. of the Symposium on Foundations of Computer Science, 1995.
P. Devanbu, M. Gertz, C. Martel, S.G. Stubblebine.
Authentic Third-party Data Publication. In Proc. of the 14th Annual IFIP WG 11.3 Working Conference on Database Security, Schoorl, the Netherlands, 2000.
E. Goh. Secure Indexes. Cryptology ePrint Archive, Report 2003/216, 2003.
P. Golle, J. Staddon, B. Waters. Secure Conjunctive Keyword Search Over Encrypted Data. In Proc. of the Applied Cryptography and Network Security Conference, 2004.
E. Mykletun, M. Narasimha, G. Tsudik. Authentication and Integrity in Outsourced Databases. In Proc. of the 11th Annual Symposium on Network and Distributed System Security, San Diego, California, 2004.
H. Pang, A. Jain, K. Ramamritham, K. Tan. Verifying Completeness of Relational Query Results in Data Publishing. In Proc. of the ACM SIGMOD International Conference on Management of Data, Baltimore, Maryland, 2005.