A Deep-Dive with Azure DocumentDB: Partitioning, Data Modelling, and Geo Replication
Andrew Liu
[email protected]

Session objectives and takeaways
Yesterday's Session:
Objectives for Today:

A brief recap for those who missed yesterday…

Gartner’s 3Vs of Big Data
Volume, Velocity, Variety
• How can my app deal with massive volume of data & throughput?
• How do I write responsive apps?
• How do I deal with schema changes?
• How do I elastically scale my database?
• How do I make data available where my users are?
• How do I write highly available apps?
• How do I iterate rapidly?
• What data models work at scale?

Common scenarios + use cases
Retail
• Product Catalog
• Ordering and Payment Pipelines
• Personalization
• Customer 360 View
Gaming
• Multiplayer Games
• Social Gameplay
• Leaderboards
• Game Analytics
IoT / Sensor Data
• Telemetry + Event Store
• Telematics
• Device Registry
Ad Technology + Social Analytics
• User behavior telemetry
• Recommendations

DocumentDB Capabilities
Elastic and limitless global scale
• Independently scale throughput and storage, locally and globally
• Transparent partition management and routing
Guaranteed low latency
• <10ms reads/<15ms writes @ P99
• Requests are served from local region
• Write optimized, latch-free database engine designed for SSDs and low latency access
• Synchronous and automatic document indexing at sustained ingestion rates
SQL and JavaScript – schema free
• Automatic tree path based indexing
• No schemas or secondary indices required upfront
• SQL and JavaScript language integrated queries
• Hash, range, and spatial
• Multi-document, JavaScript language integrated transactions
Multiple consistency levels
• Multiple well defined consistency levels
• Intuitive programming model for relaxed consistency models
• Clear PACELC tradeoffs and 99.99% availability SLAs

DocumentDB 101 (ish)

Architecture (Behind the Scenes)
[Figure: physical hierarchy (region, datacenter, federation, FD) alongside logical hierarchy (resource, partition set, partition, replica)]
• DocumentDB service is manifested as an overlay network with ring topology (aka federation)
• Resources are partitioned; they span federations, datacenters and regions
• Partitions are made highly available by replica sets
• A replica in turn hosts the DocumentDB database engine and implements the replication protocol and local persistence

Resource Model
• Resources identified by their logical and stable URI
• Represented as JSON documents
• Partitioned and span across machines, clusters and regions
Resource model
• Stateless interaction (HTTP and TCP)
• Hierarchical overlay atop partitioning model
Partitioning Model
• Grid Partitioning: horizontal based on hash/range and vertical across regions
• Each partition made highly available via a replica set
[Figure: a DocumentDB collection as a partition set of replica sets; global distribution across US-East, US-West and N Europe; local distribution within a region]

Let’s talk about…
Everything you need to know to build blazing fast, planet-scale applications!

Collections != Tables
• Collections do NOT enforce schema
• Co-locate multiple types in a collection
• Annotate documents with a "type" property

Co-locating types in the same collection
• Ability to query across multiple entity types with a single network request.
For example, we have two types of documents: cat and person.

    { "id": "Andrew", "type": "Person", "familyId": "Liu", "worksOn": "DocumentDB" }
    { "id": "Ralph", "type": "Cat", "familyId": "Liu", "fur": { "length": "short", "color": "brown" } }

We can query both types of documents without needing a JOIN simply by running a query without a filter on type:

    SELECT * FROM c WHERE c.familyId = "Liu"

If we wanted to filter on type = "Person", we can simply add a filter on type to our query:

    SELECT * FROM c WHERE c.familyId = "Liu" AND c.type = "Person"

Co-locating types in the same collection
• Ability to query across multiple entity types with a single network request.
• Ability to perform transactions across multiple types
• Cost: every collection has one or more physical partitions underneath

Let's talk about partitioning.

Two Dimensions: Throughput and Storage

Measuring Throughput (Request Units)
[Figure: incoming requests against documents consume % CPU, % memory and % IOPS on a replica; a band between Min RU/sec and Max RU/sec with no throttling, and a rate limit above Max RU/sec; the replica is otherwise quiescent]
• Request Unit/sec (RU) is the normalized currency
• Operations consume request units (RUs)
• Replica gets a fixed budget of request units
• Requests get rate limited if they exceed the SLA
• Customers pay for reserved request units by the hour
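The request unit budget described under "Measuring Throughput" can be pictured with a small toy model. This is only a sketch under stated assumptions: the class name, the one-second window, and the RU costs are hypothetical illustrations, not how the DocumentDB service meters or throttles requests.

```python
class RequestUnitBudget:
    """Toy model of a fixed per-second request unit (RU) budget (hypothetical)."""

    def __init__(self, max_ru_per_sec):
        self.max_ru_per_sec = max_ru_per_sec
        self.window = None    # the one-second window currently being tracked
        self.consumed = 0.0   # RUs consumed inside that window

    def try_consume(self, ru_cost, now):
        """Return True if the operation fits in the budget, False if rate limited."""
        window = int(now)
        if window != self.window:
            # A new second has started, so the budget is replenished.
            self.window = window
            self.consumed = 0.0
        if self.consumed + ru_cost > self.max_ru_per_sec:
            return False
        self.consumed += ru_cost
        return True


# Assumed numbers: a budget of 10 RU/sec and operations costing 4 RUs each.
budget = RequestUnitBudget(max_ru_per_sec=10)
results = [
    budget.try_consume(4, now=0.1),
    budget.try_consume(4, now=0.5),
    budget.try_consume(4, now=0.9),  # would bring the second to 12 RUs
    budget.try_consume(4, now=1.2),  # next second, fresh budget
]
print(results)
```

In this model the third operation is rate limited because it would exceed the budget for that second, while the fourth succeeds once the next second begins.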
Partitioning Model
Partition Key = city
[Figure: a collection split into Partition 1, Partition 2, … Partition i, … Partition n; city values such as Houston, London, Chicago, New Delhi, Mumbai, Paris, New York, Boston and Berlin are spread across the partitions]

Partition Key Design Goals
• Overall request volume should scale across Partition Keys
• Individual queries should minimize cross-partition lookups

Choosing a Partition Key

Let’s talk about object model
"With great power comes great responsibility" - Uncle Ben

How do approaches differ?
• Data normalization
• Come as you are

Modeling Data: The Relational Way
[Figure: tables Person (Id), PersonContactDetailLnk (PersonId, ContactDetailId), ContactDetail (Id), Address (Id), ContactDetailType (Id)]

Modeling Data: The Document Way
Person

    {
      "id": "0ec1ab0c-de08-4e42-a429-...",
      "addresses": [
        { "street": "1 Redmond Way", "city": "Redmond", "state": "WA", "zip": 98052 }
      ],
      "contactDetails": [
        { "type": "home", "detail": "555-1212" },
        { "type": "email", "detail": "[email protected]" }
      ],
      ...
    }

[Figure: the Person document as a tree: Id; Addresses (Address …, Address …); ContactDetails (ContactDetail …)]

To embed, or to reference, that is the question

Data modeling with denormalization

    {
      "id": "1",
      "firstName": "Thomas",
      "lastName": "Andersen",
      "addresses": [
        { "line1": "100 Some Street", "line2": "Unit 1", "city": "Seattle", "state": "WA", "zip": 98012 }
      ],
      "contactDetails": [
        { "email": "[email protected]" },
        { "phone": "+1 555 555-5555", "extension": 5555 }
      ]
    }

Try to model your entity as a self-contained document.
Generally, use embedded data models when:
• the relationship is "contains"
• the relationship is one-to-few
• the embedded data changes infrequently
• the embedded data won’t grow without bounds
• the embedded data is integral to the document
• you want better read performance

Data modeling with referencing

    { "id": "xyz", "username": "user xyz" }
    { "id": "address_xyz", "userid": "xyz", "address": { … } }
    { "id": "contact_xyz", "userid": "xyz", "email": "[email protected]", "phone": "555 5555" }

In general, use normalized data models when:
• the relationship is one-to-many
• the relationship is many-to-many
• the related data changes frequently
Normalizing typically provides better write performance.

Hybrid models
No magic bullet

    {
      "id": "1",
      "firstName": "Thomas",
      "lastName": "Andersen",
      "countOfBooks": 3,
      "books": [1, 2, 3],
      "images": [
        { "thumbnail": "http://....png" },
        { "profile": "http://....png" }
      ]
    }

    {
      "id": 1,
      "name": "DocumentDB 101",
      "authors": [
        { "id": 1, "name": "Thomas Andersen", "thumbnail": "http://....png" },
        { "id": 2, "name": "William Wakefield", "thumbnail": "http://....png" }
      ]
    }

• Model on a property-level (as opposed to record-level)
• Optimize your data model for your workload…
• Hybrid Approach: segment data based on mutability (as opposed to blindly following types)

Query and Indexing

Documents as Trees
• JSON serializable values (aka JSON Infoset)
• JavaScript Object Literals

    {
      "locations": [
        { "country": "Germany", "city": "Berlin" },
        { "country": "France", "city": "Paris" }
      ],
      "headquarter": "Belgium",
      "exports": [ { "city": "Moscow" }, { "city": "Athens" } ]
    }

[Figure: JSON document as tree. The root has children locations (0: country Germany, city Berlin; 1: country France, city Paris), headquarter (Belgium) and exports (0: city Moscow; 1: city Athens)]
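To make the "Documents as Trees" view concrete, here is a minimal sketch that walks the company document and lists every root-to-leaf path. The function name leaf_paths and the "/"-joined path format are assumptions chosen for illustration; they are not the path encoding the DocumentDB index uses internally.

```python
def leaf_paths(node, prefix=()):
    """Yield one (path, value) pair per leaf of a JSON-like tree."""
    if isinstance(node, dict):
        # Object properties become labeled child nodes.
        for key, child in node.items():
            yield from leaf_paths(child, prefix + (key,))
    elif isinstance(node, list):
        # Array elements become child nodes labeled by their index.
        for index, child in enumerate(node):
            yield from leaf_paths(child, prefix + (str(index),))
    else:
        # Scalars are the leaves (the instance values).
        yield "/".join(prefix), node


company = {
    "locations": [
        {"country": "Germany", "city": "Berlin"},
        {"country": "France", "city": "Paris"},
    ],
    "headquarter": "Belgium",
    "exports": [{"city": "Moscow"}, {"city": "Athens"}],
}

paths = list(leaf_paths(company))
for path, value in paths:
    print(path, "->", value)
```

Interior nodes (property names and array indexes) supply the structure, and the scalar values sit at the leaves.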
Query

SQL:

    SELECT C.locations FROM company C WHERE C.headquarter = "Belgium"

JavaScript:

    function businessLogic() {
      var country = "Belgium";
      __.filter(function(x) { return x.headquarter === country; });
    }

Input documents:

    {
      "locations": [
        { "country": "Germany", "city": "Berlin" },
        { "country": "France", "city": "Paris" }
      ],
      "headquarter": "Belgium",
      "exports": [ { "city": "Moscow" }, { "city": "Athens" } ]
    }

    {
      "locations": [ { "country": "Germany", "city": "Bonn", "revenue": 200 } ],
      "headquarter": "Italy",
      "exports": [
        { "city": "Berlin", "dealers": [ { "name": "Hans" } ] },
        { "city": "Athens" }
      ]
    }

Query result:

    {
      "results": [
        {
          "locations": [
            { "country": "Germany", "city": "Berlin" },
            { "country": "France", "city": "Paris" }
          ]
        }
      ]
    }

Query with a UDF

UDF:

    {
      "id": "GermanTax",
      "body": "function GermanTax(income) { if(income < 1000) return income * 0.1; else if(income < 10000) return income * 0.2; return income * 0.4; }"
    }

Query:

    SELECT location.city, GermanTax(location.revenue) AS Tax
    FROM location IN company.locations
    WHERE location.revenue > 100

Input documents: the same two company documents as in the previous query.
Query result:

    { "results": [ { "city": "Bonn", "Tax": 20 } ] }

Schema Agnostic Indexing
• Logically the index is a union of all the document trees
• Structure contributed by the interior nodes, instance values are the leaves
• Columnar index for fast scans
• Support for rich hierarchical, relational and analytical queries
• Different path encodings depending on index type
• Support for multi-tenancy requires fixed upper bound on index size
• Structural information and instance values are normalized into a unifying concept of JSON-Path
[Figure: common structure shared by the document trees (location, 0, country, Germany, coordinates)]

Terms | Postings List
$/location/0/ | 1, 2
location/0/country/ | 1, 2
location/0/city/ | 1, 2
0/country/Germany | 1, 2
1/country/France | 2
… | …
0/city/Moscow | 2
0/dealers/0 | 2

Dynamic Encoding of Postings List (E-WAH/differential)

Queries that use the index
• Range (>, <, !=) & ORDERBY queries
• Wildcard queries
• Spatial queries

Indexing Policies

Configuration | Level | Options
Automatic | Per collection | True (default) or False. Override with each document write
Indexing Mode | Per collection | Consistent or Lazy. Lazy for eventual updates/bulk ingestion
Included and excluded paths | Per path | Individual path or recursive includes (? and *)
Indexing Type | Per path | Support Hash (Default) and Range. Hash for equality, range for range queries
Indexing Precision | Per path | Supports 3 – 7 per path. Tradeoff storage, query RUs and write RUs

Indexing Paths

Path | Description/use case
/ | Default path for collection. Recursive and applies to whole document tree.
/"prop"/? | Serve queries like the following (with Hash or Range types respectively): SELECT * FROM collection c WHERE c.prop = "value" and SELECT * FROM collection c WHERE c.prop > 5
/"prop"/* | All paths under the specified label.
/"prop"/"subprop"/ | Used during query execution to prune documents that do not have the specified path.
/"prop"/"subprop"/? | Serve queries (with Hash or Range types respectively): SELECT * FROM collection c WHERE c.prop.subprop = "value" and SELECT * FROM collection c WHERE c.prop.subprop > 5

Global Distribution

Multi-region DocumentDB databases
Total RUs = Provisioned RUs x Number of regions
In this example: 2M RUs x 3 regions = 6M RUs
[Figure: a DocumentDB collection as a partition set of replica sets; one primary and two secondary replica sets at 2M RUs each, distributed globally across US-East, US-West and India, with local distribution inside each region]

Programmable data consistency
[Figure: a spectrum running from "Strong consistency, High latency" to "Eventual consistency, Low latency"]
"It's hard to write distributed apps."

Consistency Levels
• PACELC Theorem and the associated tradeoffs
• Strong, Bounded Staleness, Session, and Eventual
• Left to right: weaker consistency, better read scalability, lower write latency
• Bounded Staleness: consistent prefix reads. Reads lag behind writes by K prefixes or T interval
• Session: monotonic reads, monotonic writes and read-your-writes guarantee
[Figure: clients reading from primary (P) and secondary (S) replicas under each consistency level]

General Tips

General Tips: Low latency

    void ServerStart() { ...
      await _client.OpenAsync();
    }

    return new DocumentClient(endpoint, key, policy);

[Figure: three server instances, each holding a single DocumentClient _client]
• Create a singleton instance of DocumentClient for an app server instance
• Warm up the DocumentClient cache by calling DocumentClient.OpenAsync() upon start of your app server
• Use Direct Connectivity and TCP for .NET SDK
• Use Direct Connectivity and HTTPS for Java SDK

    ConnectionPolicy policy = new ConnectionPolicy
    {
        // Property names as exposed by the .NET SDK's ConnectionPolicy class
        ConnectionProtocol = Protocol.Tcp,
        ConnectionMode = ConnectionMode.Direct
    };

General Tips: Throughput
[Figure: throughput chart, axis from 0 to 100]
• Use relaxed consistency levels for efficient utilization of provisioned throughput
• Subscribe for changes via change feed APIs instead of polling and reading the entire feed

    GET https://.../docs
    x-ms-max-item-count: 1
    If-None-Match: "28535"
    A-IM: Incremental feed
    x-ms-documentdb-partitionkeyrangeid: 16
    ...

• If you intend to use DocumentDB as a KV store, you can tell the system to drop the secondary indexes. This will also save storage.

    POST .../colls
    {
      ...
      indexingPolicy : { IndexingMode : "None" … }
    }
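The change feed tip above (read incrementally instead of re-reading the entire feed) can be illustrated with an in-memory toy. This is a sketch only: ToyChangeFeed and its methods are hypothetical and do not call the DocumentDB API. The integer continuation stands in for the value a client would send back in the If-None-Match header shown in the request above.

```python
class ToyChangeFeed:
    """In-memory stand-in for a change feed (hypothetical, not the DocumentDB API)."""

    def __init__(self):
        self._changes = []  # documents in the order they were written

    def write(self, document):
        self._changes.append(document)

    def read_changes(self, continuation=0):
        """Return (documents written after `continuation`, new continuation)."""
        new_changes = self._changes[continuation:]
        return new_changes, len(self._changes)


feed = ToyChangeFeed()
feed.write({"id": "a"})
feed.write({"id": "b"})

first, token = feed.read_changes()        # initial read sees everything so far
feed.write({"id": "c"})
second, token = feed.read_changes(token)  # only the document written since
third, token = feed.read_changes(token)   # nothing new, so an empty batch

print(first, second, third)
```

Each read returns only what changed since the previous continuation, so a cache or downstream store can stay current without rescanning the collection.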
Roadmap 2017

Change Feed
• Distributed replication log
• Keep your cache or data warehouse up to date
• Perform notifications on changes
• Perform streaming aggregation
• Lambda pattern with significantly lower TCO
• Single scalable database solution for both ingestion and query

Aggregates at global scale
• Low latency aggregates at any scale
• Supported via updatable, column store index at global scale
• Deeply integrated with latch free, log structured database engine
• Preview now available

Spark connector for DocumentDB
• RDD and Dataset-based connectors available
• Native integration with Spark SQL
• Direct mapping to DocumentDB partitions
• Natively leverage DocumentDB index
• Predicate pushdown
• Public release in H1 CY2017

Pricing and scaling improvements
• Enable bursting up to 10x for spiky workloads
• Reduced starting price for partitioned collections (4x)
• Create up to 10 TB collections without support ticket
• Deprecating S1 – S3 offers
• Bursting available H1 2017

Graph APIs
• SQL and Gremlin query
• Independently scalable graph engine using TinkerPop
• Optimized query engine for relationship traversals
• Schema freedom for ad-hoc expansion of attributes on nodes & edges
• Limitless scale to support massive graphs
• Same NoSQL stack

Continue your Ignite learning path
• Visit Channel 9 to access a wide range of Microsoft training and event recordings: https://channel9.msdn.com/
• Head to the TechNet Eval Centre to download trials of the latest Microsoft products: http://Microsoft.com/en-us/evalcenter/
• Visit Microsoft Virtual Academy for free online training: https://www.microsoftvirtualacademy.com

Microsoft Ignite