Peer-to-peer Networks : promise and trouble

Bart Dhoedt
Ghent University - Faculty of Applied Sciences
Department of Information Technology (INTEC)
e-mail : [email protected]
phone : ++32 9 264 99 66

Presentation at NORDUnet Network Conference, August 24-27, Reykjavik, 2003
Tuesday, August 27, 2003

OUTLINE
1. Introduction
2. Taxonomy of P2P-systems
3. Issues in P2P-systems
4. P2P-trends
5. Concluding remarks

Defining P2P
• about sharing
  - content
  - disk space
  - bandwidth
  - software resources
  - computer cycles
  - liability
• symmetric (architectural view)
• creating an application-level overlay network
• decentralized
• application critical infrastructure owned by many

Sharing resources ?
- estimate of edge resources
    total number of Internet hosts : 150 M
    average disk capacity : 10 GB
    average available memory : 128 MB
    average processing power : 1 GFLOPS
    average BW : 100 Kb/s
- available for P2P-network
    1% hosts
    50% processing power
    50% memory
    10% disk space
    25% network bandwidth
- result
    1.5 M processors
    disk storage : 1.5 PB
    processing power : 1.5 PFLOPS
    BW/link : 25 Kb/s

Sharing resources ?
• What about supercomputers ?

                     IBM ASCI White                        P2P-supercomputer (> x 10 !)
processing power     12.3 TFLOPS                           1.5 PFLOPS
processors           8192 (512 RS/6000 processing nodes)   1.5 M
memory storage       6.2 TB                                92 TB
disk storage         160 TB                                1.5 PB
cost                 110 M$                                ? M$
weight               106 tons                              ? tons

P2P @ edge ?
• How to unleash the power of the “Internet’s dark matter” ?

P2P popularity
2003 summer download hit parade [www.download.com]

                                      [Last week]       [Total]
 1. Kazaa Media Desktop                 2 644 777   261 405 295   P2P
 2. ICQ Lite                              588 141    25 423 064   P2P
 3. AOL Instant Messenger (AIM)           532 897    17 521 190   P2P
 4. iMesh                                 392 703    55 145 269   P2P
 5. WinZip                                351 865   100 741 790
 6. ICQ Pro 2003a beta                    332 624   233 204 712   P2P
 7. Spybot – Search & Destroy             232 993     2 764 380
 8. Ad-aware                              224 720    19 078 555
 9. Morpheus                              179 347   114 140 262   P2P
10. DownloadAccelerator Plus              119 601    36 355 895

P2P popularity
Napster : the early days …
[Figure: Internet Applications Adoption Rate; users (millions, 0 to 70) versus month (1 to 23); series: Hotmail, ICQ, Napster]
Gnutella network : up to 400 000 nodes operating world wide

Architectural view
Mediated P2P : Napster, Audiogalaxy
Pure P2P : Early Gnutella, FreeNet
Hybrid P2P : Gnutella, FastTrack, Kazaa

P2P-architectures
mediated
    data traffic : P2P
    control traffic : client-server
    efficiency : + efficient search, + efficient control
    scalability : - control hot spot (mirrors needed ?)
    robustness : - single point of failure, - easy to attack
    accountability : easy
pure P2P
    data traffic / control traffic : P2P
    efficiency : - inefficient search, - BW consuming
    scalability : - BW needed grows rapidly
    robustness : + graceful degradation, + difficult to attack
    accountability : difficult
hybrid P2P
    traffic : local : client-server, long distance : P2P
    efficiency / scalability / robustness : +/- good compromise ?
    accountability : difficult

P2P taxonomy
by application area : content sharing, distributed computing, instant messaging, collaborative working
by architecture : mediated, pure, hybrid

File Sharing performance
1.6 M downloads/day
150 M searches/day
10 TB data transfer/day
1-2 TB data transfer/day
100 servers
15000 servers

Distributed computing performance
SETI = “Search for Extraterrestrial Intelligence”
• started in 1998 as a 2 year project (but still running)
• 4 M users signed up so far
• Radio telescope data sent to clients for digital signal analysis
• Nodes process data when cycles are available (works as screen saver)
• Using resources to allow better signal analysis
Data flow : 35 GB/tape, 16 hours recorded data; 10 tapes/week, 350 GB; 10 000 0.3 MB work units

Distributed computing performance
computations per work unit : 3.1 x 10^12 FP-operations
work unit throughput : 700 000/day
total : 22 x 10^17 FLOP/day, i.e. > 25 TFLOPS

              SETI@home     ASCI White@DoE
Processing    25 TFLOPS     12.3 TFLOPS
Cost          1 M USD       110 M USD

Scaling problems
Mechanisms in GNUTELLA to limit traffic
• Network horizon set by TTL
• Descriptor IDs avoid cyclic routing
• PONG/QueryHIT/Push NOT flooded
BUT ...
“1 Gnutella request would cause 90MB data traffic on Napster scale network”
[Figure: Bandwidth (KB/PING, 0 to 10000) versus Horizon (0 to 8)]

Scaling answers
1. Reduce network horizon to reduce f
2. Use of reflectors = node with high BW available
   - mimics peer sharing all files of its “clients”
   - reflector (high BW access) : handles all PING/PONG, QUERY/QUERYHIT traffic
   - clients (low access BW) : handle ONLY download traffic
3. Use of UltraPeers = same principle as reflector, but chosen dynamically

Robustness
• self-organization leads to power-law networks (1% of servents shows server-like behaviour …)
• very robust to random node failure
• more vulnerable to targeted attacks
Simulation result for FreeNet peers
[T. Hong, “Performance”, Chapter 14 in “Peer-to-peer : Harnessing the Benefits of a Disruptive Technology”, ISBN 0-596-00110-X, O’Reilly, March 2001.]

Free-riding on Gnutella
[Figure: Network size since Jan 2002 [www.limewire.com]]
- only 30 % of nodes offering content
- 50% of queries satisfied by 1% of servents

Overlay mismatch
Mismatch between application layer network and physical network
based on network traffic analysis
• 40% Gnutella clients belong to top 10% AS
• only 2-5% links within AS
based on domain names
• Gnutella’s clustering logic shows no/little correlation with domain name based clustering
[M. Ripeanu, A. Iamnitchi, I. Foster, “Mapping the Gnutella Network”, IEEE Internet Computing, January-February 2002.]

Business Models
How to monetise P2P ?
• authors agree on “P2P business models are unclear”
• reality : few companies make money on P2P
• current situation : File sharing application sponsored by advertisement (banners)
• some other possibilities
  - micropayment mechanisms
  - indirect mechanisms (P2P will increase BW-need and hence …)
  - tip based strategy (cf. US-model …)
  - make “low”-quality content available to get people interested in specific content
  - make use of end users’ devices to reduce cost !
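
Returning to the Gnutella scaling slides, the interplay of horizon (TTL), descriptor IDs and flooding can be made concrete with a small simulation. This is a minimal sketch under stated assumptions, not the Gnutella protocol itself: the random overlay, the node count, the degree of 3 and the function names `build_overlay` and `flood` are all illustrative choices that do not come from the talk.

```python
import random
from collections import deque


def build_overlay(n_nodes, degree, seed=0):
    """Random undirected overlay (assumption): each node opens `degree` links."""
    rng = random.Random(seed)
    neighbours = {node: set() for node in range(n_nodes)}
    for node in range(n_nodes):
        for peer in rng.sample(range(n_nodes), degree):
            if peer != node:
                neighbours[node].add(peer)
                neighbours[peer].add(node)
    return neighbours


def flood(neighbours, origin, ttl):
    """Flood one descriptor from `origin`; return (nodes_reached, messages_sent).

    A node forwards a descriptor it has not seen before to every neighbour
    except the one it came from, while the remaining TTL is positive.
    Duplicates are still transmitted (and counted) but dropped on arrival,
    which is the role the descriptor ID plays in avoiding routing cycles.
    """
    seen = {origin}  # nodes that already hold this descriptor ID
    messages = 0
    queue = deque([(origin, None, ttl)])
    while queue:
        node, came_from, remaining = queue.popleft()
        if remaining == 0:
            continue  # horizon reached: do not forward any further
        for peer in neighbours[node]:
            if peer == came_from:
                continue
            messages += 1
            if peer not in seen:
                seen.add(peer)
                queue.append((peer, node, remaining - 1))
    return len(seen) - 1, messages


if __name__ == "__main__":
    overlay = build_overlay(n_nodes=2000, degree=3)
    for horizon in range(1, 8):
        reached, messages = flood(overlay, origin=0, ttl=horizon)
        print(f"horizon={horizon} reached={reached} messages={messages}")
```

Running the loop for increasing horizons shows the message count growing quickly until the overlay is saturated, which is the qualitative behaviour the bandwidth-versus-horizon figure describes. The absolute numbers depend entirely on the assumed topology.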

Problems/issues/barriers/challenges

Problems                                           Solutions
node/link transient nature, robustness             File-sharing : content redundancy
                                                   Cycle-sharing : checkpointing
scalability, bandwidth consumption                 Hybrid approach
                                                   Avoid floodings (e.g. FreeNet : intelligent routing)
                                                   Content/Query caching
                                                   TTL
                                                   Avoid routing cycles
Network discontinuities                            (Ab)use of port 80
(firewalls, (dynamic) NAT)                         Rendez-vous servers

Problems/issues/barriers/challenges

Problems                                           Solutions
Privacy/trust, Anonymity                           Encryption techniques (e.g. FreeNet : plausible
                                                   deniability for node operators)
application redesign                               P2P-frameworks
free-riding, accountability                        micro-payment
asymmetric bandwidth in access (ADSL, HFC)         combine uplink capacity (e-donkey)
inefficient overlay                                Network/infrastructure aware routing
business models ?                                  ???

P2P-trends
• emergence of platforms
• convergence between Grid-computing and P2P-technology
• enhance P2P-performance
• semantic searches (Tapestry, Content Addressable Networks …)
• Query/result caching

Platform emergence
Application areas : File sharing, Distributed computing, Instant Messaging, Collaboration
Dedicated Application Programs and Protocols (Freenet, eDonkey, Gnutella, SETI@home, Groove)
• for 1 application area
• non-generic
• 1 application class
• 1 specific problem
• network interoperability ?
Platforms / Frameworks (?)
• offer generic services
• support the P2P paradigm
• used to build P2P applications

JXTA
• developed by Sun Microsystems
• set of 6 XML based open protocols
• Java API offered
[Figure: JXTA layered architecture [http://www.jxta.org]]
    JXTA Applications : JXTA Community Applications, Sun JXTA Applications (e-mail, auctioning, data storage)
    JXTA Services : JXTA Community Services, Sun JXTA Services (indexing, searching, file sharing), JXTA Shell, Peer Commands
    JXTA Core : Peer Groups, Peer Pipes, Peer Monitoring, Security (peer establishment, communication management, routing)

BOINC
• Berkeley Open Infrastructure for Network Computing
• lets participants contribute to solving selected problems
• = “generic SETI@Home”
[http://boinc.berkeley.edu]

Conclusions
For network operators
P2P applications can be very BW-consuming
• extremely popular (and addictive)
• use of inefficient strategies (broadcast, flooding, …)
• “tragedy of the commons”
Danger of bottlenecks
• overlay network has little relation to physical infrastructure
• symmetric relations between peers
Change in user behaviour
• “always” online
• information provider AND information consumer

Conclusions
For application developers
People are (extremely) interested in digital content
People are willing to share resources for free (and even want to spend money …)
• make people feel they participate in a large project
• give some credit to users (competition) (top 10 list, eternal fame if solution is found, …)
To avoid digging one’s own grave
• avoid BW-consuming strategies
• include micropayment/trust mechanisms to
  - encourage participation
  - avoid free-riding
  - avoid DoS attacks

Conclusions
For application developers
Hacker danger
• need for encryption mechanisms
High performance P2P-platforms are emerging
• reuse of efforts
• reuse of user community
Make sure your application has some scaling effect
• the more users, the more interesting to join !
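
As a closing check, the back-of-the-envelope figures from the “Sharing resources ?” and SETI@home slides can be recomputed in a few lines. This is a sketch under assumptions: decimal unit prefixes, an 86 400 second day, and the hypothetical helper names `edge_resources` and `seti_tflops`. Only the input numbers come from the slides.

```python
HOSTS = 150e6                # total number of Internet hosts (slide figure)
P2P_HOSTS = HOSTS * 0.01     # 1% of hosts assumed to join the P2P network


def edge_resources():
    """Aggregate edge resources using the per-host averages and shares from the slide."""
    return {
        "processors": P2P_HOSTS,
        # 10 GB per host, 10% shared; GB -> PB with decimal prefixes (assumption)
        "disk_PB": P2P_HOSTS * 10 * 0.10 / 1e6,
        # 128 MB per host, 50% shared; MB -> TB with decimal prefixes (assumption)
        "memory_TB": P2P_HOSTS * 128 * 0.50 / 1e6,
        # 1 GFLOPS per host, 50% shared; GFLOPS -> PFLOPS
        "processing_PFLOPS": P2P_HOSTS * 1 * 0.50 / 1e6,
        # 100 Kb/s per host, 25% shared
        "bw_per_link_Kbps": 100 * 0.25,
    }


def seti_tflops(flop_per_unit=3.1e12, units_per_day=700_000):
    """Sustained TFLOPS implied by the SETI@home work-unit throughput."""
    seconds_per_day = 86_400
    return flop_per_unit * units_per_day / seconds_per_day / 1e12


if __name__ == "__main__":
    for name, value in edge_resources().items():
        print(f"{name}: {value:g}")
    print(f"SETI@home: {seti_tflops():.1f} TFLOPS")
```

Under these assumptions the sketch gives 1.5 M processors, 1.5 PB of disk and 25 Kb/s per link, matching the slide. Memory comes out at 96 TB with decimal prefixes; the slide's 92 TB is consistent with a binary MB-to-TB conversion. Processing comes out at 0.75 PFLOPS when the 50% share is applied, while the slide's 1.5 PFLOPS corresponds to counting the full 1 GFLOPS per host. The SETI@home figure is roughly 25 TFLOPS, in line with the slide.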