Markets for (Big) Data
Aija Leiponen, Cornell University
Joint work with Pantelis Koutroumpis and Llewellyn Thomas, Imperial College London

2013: everybody is talking about big data but… what is it?
Gartner: Emerging Technologies Hype Cycle 2013

Web 3.0?
• Capacity to capture and store information growing exponentially
• Sensor networks, social networks, admin data, health records
• Boon for social science… and business innovation?

Communication revolutions
• Printing press
• Steam engine
• Telegraph
• Telephone
• Radio
• Television
• Networked data?

Agenda for today
1. How are data different from other intangible/digital assets?
2. How are data currently being traded and how do the economic features of data influence the trading mechanisms?
3. How will the Internet of Things emerge considering the economic features of data and the available and emerging trading mechanisms?

1. Creation of data value
• "22.7": Observational data point from some instrument. Value?
• "ºC": Units – what is being measured. Metadata is crucial
• "sensor 2292334": Which instrument made the observation. Inalienability & provenance.
• "24032017", "60.1699"°N "24.9384"°E, "18.6 22.1 25.3 24.0 22.7 19.9": When and where observed; time series. Connected data
• "Is it a lot or a little?": What is the environment in which observation is made? How does it matter? Who cares about this? Judgment (models, analyses) & context (who, how)

Data value capture
How to appropriate value (profit) from data? I.e. what aspects of data can generate market power?
• Control the data resource
• Control the metadata
• Control the connected data
• Control the analytical tools, models, intelligence
• Control the enabling platform

How to control the data resource AND maximize its value?
NO Intellectual Property Rights for data
• Secrecy – embed data in a service
 – Can't license data itself
• Database right (EU) – prevent others from selling the whole database
 – Doesn't apply to subsets
• Contracts – license the data via contractual agreement
 – Can sue for contractual breach; cannot prevent third parties from using data
• Verification technologies – attach a Distributed Ledger to the data and track its trading
 – Works 100% with parties who care about provenance. Maybe not others
• Closed network of partners – share data within a consortium through a combination of contracts, trust, reputation effects, monitoring, consortium rules
 – Small network/market in order to effectively monitor & govern
 – No broader legal recourse in case of breach

Differences between data and other intangible assets
Record | Data | Content | Software | Currency | Invention
Information Type | Raw records or structured databases | Knowledge (insights) | Knowledge (instructions) | Pure value | Knowledge (instructions)
Good Type | Intermediate/Final | Final | Final | Final | Intermediate
Alienability | Low/medium | Medium | High | High | High
Inferability | High | Low | Low | Zero | Zero
Excludability | Limited | Variable | Variable | High | Variable
Protection Method | Secrecy or timing | Copyright or timing | Copyright or patents in some cases | Distributed ledgers or other verification tech | Patents or secrecy
Protection Aspect | Reuse | Expression (patterns) | Expression (patterns) or insight (invention) | Transaction value | Insight
Fungibility | Variable | Low | Low | Low ? | High

Characteristics of different data sources
Source of data | Confidentiality | Duration/useful life | Alienability | Fungibility | Inferability
Health care | High | >50 years | Low (health, retail, social network, locational) | High? | Medium?
Public sector administration | Medium | >50 years | Medium | Low | Low
Manufacturing/Operations (sensor networks) | Medium | 10-20 years | Medium | Low | Medium?
Individual behavior (health, retail, social network) | High | 1-5 years | Low | High | High
Personal Location Data | Medium | 1-5 years | Medium | High | High

Summary (1)
The economics of data goods depend on an analysis of data characteristics
• Data are heterogeneous across contexts
• Description and classification of data and its institutional framework are necessary for understanding its commercialization potential
Data goods substantially differ from other intangible goods in terms of how their value is affected by:
• Excludability (protection)
• Provenance (metadata)
• Alienability (ongoing implications for subjects)
• Inferability (implications of data integration for subjects)

(2) Data Market Design
Market efficiency requires (A. Roth)
• Thickness/liquidity
• Low transaction costs
• Limited strategic behavior by participants
 – Provenance
 – Excludability
• Stable matching: there are no more preferred potential matches
• Lack of “repugnance” (appropriateness/fairness)

Types of market matching mechanisms
Matching | Marketplace design | Terms of Exchange | Examples
One-to-one | 1. Bilateral | Negotiated | Data brokers
One-to-many | 2. Dispersal | Standardized | Twitter API
Many-to-one | 3. Harvest | Implicit barter | Google Services
Many-to-many | 4. Multilateral | Standardized or negotiated | InfoChimps, Microsoft Azure
“The (unfulfilled) promise of Data Marketplaces”, P. Koutroumpis, A. Leiponen, L. Thomas

1. Bilateral: Proprietary data vs. other IP licenses
 | Data | Patents | Trademarks | Copyrights
License duration | 1-2 years | 10-20 years | Up to 20 years | 1-5 years
Exclusivity | Rare | Frequent | Often regional | Rare
Confidentiality | Frequent | Rare | Rare | Rare
Use restrictions | Abundant | Concise | Specific | Concise
Warranty | ‘As is’ | Frequent | -- | --
Obligation & remedy | Correct/refund/replace/update | -- | -- | --
Audit | Frequent | -- | -- | --
Modal fee schedule | Annual subscription | % of sales or flat fee | NA | Per device
“Data Contracts”, P. Koutroumpis, A. Leiponen, L. Thomas & J. Wu (2016)

2. Dispersal: 366 Open Data Contracts (T&C)
[Pie chart: Academic 37%, Government 21%, Commercial 19%, Non-Profit 17%, Personal 4%, International 2%]
[Bar chart, 0% to 100%: Contract type. Legend: Proprietary License, Open Database Commons, GNU, FOI / Open Government]
[Bar chart, 0% to 100%: Data sharing. Legend: Sharing Permitted, Share Alike, Not Noted, No Sharing]
[Bar chart, 0% to 100%: Commercial use. Legend: Not Noted, No Commercial Use Permitted, Commercial Use Permitted]

3. Harvest
Facebook Terms; Data Policy

Sharing Your Content and Information
You own all of the content and information you post on Facebook, and you can control how it is shared through your privacy and application settings.
For content that is covered by intellectual property rights, like photos and videos (IP content), you specifically give us the following permission: you grant us a non-exclusive, transferable, sub-licensable, royalty-free, worldwide license to use any IP content that you post on or in connection with Facebook (IP License).
When you publish content or information using the Public setting, it means that you are allowing everyone, including people off of Facebook, to access and use that information, and to associate it with you (i.e., your name and profile picture).

About Advertisements and Other Commercial Content Served or Enhanced by Facebook
You give us permission to use your name, profile picture, content, and information in connection with commercial, sponsored, or related content (such as a brand you like) served or enhanced by us.
You permit a business or other entity to pay us to display your name and/or profile picture with your content or information, without any compensation to you. If you have selected a specific audience for your content or information, we will respect your choice when we use it.
We do not give your content or information to advertisers without your consent.
You understand that we may not always identify paid services and communications as such.
https://www.facebook.com/legal/terms/update

4. Multilateral: Data Platform
[Diagram: Data Marketplace platform. Supply: Data Providers. Complements: Algorithm Providers, Expert Advice. Demand: Customers.]
• Selling data through a platform
• Platform provider takes the risk, provides services, takes a cut
• Technical challenges in standardization, rights management
• Strategic challenges in revenue sharing, chicken & egg (switching costs); loss of control etc

Summary (2): Data marketplaces meet Roth
Marketplace design | Liquidity | Transaction costs | Provenance
Bilateral | Low | High | Clear
Dispersal | High | Low | Unclear
Harvest | High | Low | Unclear
Multilateral | High | Low | Medium
Excludability, Stability of Matching: Medium, Low?, Low?, Low?, Low, High?

Market liquidity and stability inversely related to transaction costs and excludability (strategic behavior)
With current data market mechanisms, you can achieve large markets with little control or small markets with greater control

(3) How do we build an Industrial Internet?
(a) isolated industrial clusters/data pools (cf. patent pools)
(b) adopt verification technologies such as Distributed Ledgers

(a) Common Pool Resources (Ostrom 1990)
Costly but not impossible to exclude potential beneficiaries from obtaining benefits from use
CPR: Tragedy of the Commons (TOTC)
Collective action resolves TOTC and maintains resource if
• Clearly defined boundaries identify legitimate users
• Rules define how CPR should be used; metarules to change rules
• Effective monitoring to enforce rules, boundaries

Smart steel data consortium
SSAB recently finalised an R&D project exploring SmartSteel, a digital platform enabling steel to be ‘loaded with knowledge’. Unique identity code in the steel plate connecting the plate and information provides customers and their machinery with appropriate data and instructions to help them select and use SSAB steels.
The idea is to share expert knowledge in steel. A platform built on cloud-based data that contains instructions for different stakeholders in the value chain on how to use the steel.
“By accessing and adding data on the platform, our customers would be able to make optimal use of the steel and avoid costly and time-consuming failures and misuse.”
Pilot R&D project: SSAB, Meyer Turku, Cajo Technologies, Aalto U, VTT & DIMECC
If steel could provide all the data accumulated during the manufacturing and transportation chain, it would help us significantly and would be the first step towards transparent value chains.
SSAB invites customers, process equipment manufacturers and other actors to join the development work.

(b) Decentralized Data Platform – distributed ledger for data?
[Diagram: Decentralized marketplace. User content & sensor data; Tagging & Cleaning; Trading; Aggregators; Processing; nodes A to I; Public Ledger listing transactionXX1 to transactionXX5.]
• “Bottom-up” approach in information exchange
• Users and sensors collect data
• Aggregators can buy/sell data for profit; data owners get paid and have control over future uses
• Processing, analysis and insights are separate

Multilateral marketplaces meet Roth
Marketplace design | Liquidity | Transaction costs | Provenance | Excludability | Stability of Matching
Bilateral | Low | High | Clear | |
Dispersal | High | Low | Unclear | |
Harvest | High | Low | Unclear | |
Multilateral Centralized | High | Low | Medium | Low | High?
Multilateral Decentralized with DLT | High | Low/Medium | Clear | High? | High?
Collective action/consortium | Medium/low | High | Clear | Medium | Low?
Excludability, Stability of Matching (Bilateral, Dispersal, Harvest): Medium, Low?, Low?, Low?

Distributed Ledger Technologies could conceivably enable large-scale, anonymous multilateral data markets by enforcing excludability
Data consortia can enable small-scale markets based on identity and reputation but will they be sufficiently valuable and stable?
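The idea of attaching a ledger to data and tracking its trading can be illustrated with a minimal sketch. This is only a single-process hash chain, not a distributed ledger: there is no network, consensus, or payment, and it does not by itself enforce excludability. The class `Ledger`, its methods, and the party names "aggregator A" and "analyst B" are hypothetical; the sensor ID and time series are reused from the data-point example earlier in the deck.

```python
import hashlib
import json
from dataclasses import dataclass, field


def sha256_hex(payload: str) -> str:
    """Hex SHA-256 digest of a string."""
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


GENESIS = "0" * 64  # placeholder hash preceding the first entry


@dataclass
class Ledger:
    """Append-only list of data trades; each entry commits to the previous one."""
    entries: list = field(default_factory=list)

    def record(self, data_id: str, seller: str, buyer: str) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        body = {"data_id": data_id, "seller": seller,
                "buyer": buyer, "prev_hash": prev_hash}
        # The entry hash covers the trade and the link to the previous entry.
        entry = dict(body, hash=sha256_hex(json.dumps(body, sort_keys=True)))
        self.entries.append(entry)
        return entry

    def is_valid(self) -> bool:
        """Recompute every hash and link; any edited entry breaks the chain."""
        prev_hash = GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != sha256_hex(json.dumps(body, sort_keys=True)):
                return False
            prev_hash = entry["hash"]
        return True

    def provenance(self, data_id: str) -> list:
        """Chain of (seller, buyer) pairs recorded for one data item."""
        return [(e["seller"], e["buyer"])
                for e in self.entries if e["data_id"] == data_id]


ledger = Ledger()
# Identify the data item by a hash of its content, not by the content itself.
data_id = sha256_hex("18.6 22.1 25.3 24.0 22.7 19.9")
ledger.record(data_id, "sensor 2292334", "aggregator A")
ledger.record(data_id, "aggregator A", "analyst B")

trail = ledger.provenance(data_id)
chain_ok_before = ledger.is_valid()

# Rewriting history (changing who bought the data) is detectable.
ledger.entries[0]["buyer"] = "someone else"
chain_ok_after = ledger.is_valid()
```

The sketch covers only the provenance side: parties who check the chain can see who traded which data item and detect altered records. Whether that translates into excludability for parties who do not care about provenance is the open question raised in the slides.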
http://hackingdistributed.com/2016/08/04/byzcoin/
https://www.technologyreview.com/s/600781/technical-roadblock-might-shatter-bitcoin-dreams/

Multilateral marketplaces meet Ostrom
Marketplace design | Boundaries | Rules | Monitoring | Types of data
Bilateral | Clear | Strong | Effective | High value/high confidentiality
Dispersal | Unclear | Weak | Minimal | Low value/low confidentiality
Harvest | Unclear | Weak | Minimal | Low value/low confidentiality
Centralized multilateral | Medium | Medium | Weak | Medium value? Med. confidentiality?
Decentralized multilateral | Unnecessary with DLT | Strong | Effective | High value/high confidentiality
Collective action/consortium | Clear | Strong | Effective | High value/high confidentiality, few sources

Collective governance is feasible in small settings; verification tech required to achieve large scale for high-value data

Summary (3)
• Data really is a different kind of an intellectual asset
 – Careful attention to technical, institutional detail
• Trading regimes: secrecy & trust or verification technology (DLT) – or ‘FREE’
 – Bilateral trading sets up a complex relationship with remedies, audits, subscriptions as contractual features
 – Decentralized Multilateral based on verification tech: anonymous and one-off – probably for high-value data due to computing cost
 – Collective data pooling can resolve control problems but will not create market liquidity