NEXPReS WP8
Provisioning High-Bandwidth, High-Capacity Networked Storage on Demand
Ari Mujunen
Board Meeting, 20-Sep-2010, Manchester

Research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement n° RI261525. This presentation reflects only the author's views. The European Union is not liable for any use that may be made of the information contained therein.

WP8 – High-bw+cap Storage on Demand
• Participants: JIVE, ASTRON, INAF, UMAN, OSO, PSNC, AALTO
• Total person-months: 163.2
• Deliverables: 12 (11 reports and one demonstration test)

WP8 – GANTT Chart with % of 1 FTE over Task Durations
[Gantt chart not reproduced in this text version]

Partner Focus Areas
• AALTO: Coordination, basic technologies
• ASTRON: Long-term archival & reprocessing
• INAF: Global/local allocation/deallocation schemes
• JIVE: Augmenting correlation capabilities /w buffering
• OSO: Trial-site performance & applicability testing
• PSNC: Role & trials of computing-center buffering
• UMAN: Trial-site performance & applicability testing

Start at Partners, First Deliverables
• AALTO: Jul-2010 .. Dec-2010 (D8.1), .. Feb-2011 (D8.2), ..
• ASTRON: Jul-2010/Oct-2011 .. Mar-2013 (D8.9)
• INAF: Jul-2010 .. Apr-2011 (D8.3), .. May-2012 (D8.6)
• JIVE: Jul-2010 .. Aug-2012 (D8.8), .. Feb-2013 (D8.10)
• OSO: Feb-2011 .. Sep-2012 (in D8.4), .. Mar-2013 (in D8.5+7)
• PSNC: Feb-2011 .. Sep-2012 (in D8.4), .. Mar-2013 (in D8.5+7)
• UMAN: Feb-2011 .. Sep-2012 (in D8.4), .. Mar-2013 (in D8.5+7)
ACTION: Send your POC's email to '[email protected]' for WP8 deliverables and execution of work!

Objective
• Determine the best practical mix of solutions:
  – What kind of storage
    • HDDs, SSDs, memory buffers, others
  – Where located & packaged
    • Geographically (stations, correlators, computing centers, clouds, ...)
    • Locally (enclosures, racks, packaging, net topologies, ...)
  – Connected in which ways
    • Locally (interface types, net equipment, ...); globally (shipping, net, lightpaths, ...)
  – How storage is allocated/deallocated and accessed
    • Algorithms, APIs, sw structure; strategies for bookkeeping, ...
• ...which will serve the needs of evolving (>1 Gbps) VLBI data acquisition and processing

Model / Mindset Framework
• VLBI is globally, geographically distributed data acquisition, data storage, and data processing
  – Data from a given global observation in time must be brought to one place to be compared/correlated
  – => Implies global, geographical data transfers
• Model the global VLBI network as a (potentially hierarchically) connected network of “nodules”
  – Which have capabilities like connectivity (BW, interface types, ...), storage (size, BW, per-direction BW limits, ...), computing, etc.
  – Which can be remodeled and replaced with new (hierarchical) “nodule” designs without affecting the “big picture” (too much)

Nodules
• Pretty much any piece of equipment
  – (Or a larger collection of such equipment, a “system”)
• Which can be described with a small set of capabilities (see the sketch below)
  – Connectivity options and capabilities
    • Interface types, bandwidths, per-direction bandwidth limitations
  – Storage options and capabilities
    • Device types, r/w bandwidths, per-direction bandwidth limitations, sw access methods
  – Internal CPU, RAM buffering, and data “pumping” power
  – Packaging options
  – Price, power consumption, longevity, ...
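The “small set of capabilities” above suggests a machine-readable record per nodule. Below is a minimal sketch of such a record in Python; the field names are illustrative assumptions (not a WP8-defined schema), and the Mark 5B example reuses the ~1.6 Gbps software I/O figure quoted later in this deck.

# Minimal sketch of a "nodule" capability record; field names are
# illustrative assumptions, not a WP8-defined schema.
from dataclasses import dataclass, field

@dataclass
class Interface:
    kind: str           # e.g. "1GE", "10GE", "SATA II", "shipping"
    bw_in_gbps: float   # sustained inbound bandwidth
    bw_out_gbps: float  # sustained outbound bandwidth
    duplex: bool        # usable in both directions simultaneously?

@dataclass
class Nodule:
    name: str
    connectivity: list[Interface] = field(default_factory=list)
    storage_tb: float = 0.0          # capacity
    storage_read_gbps: float = 0.0   # sustained read bandwidth
    storage_write_gbps: float = 0.0  # sustained write bandwidth

    def max_in_gbps(self) -> float:
        """Recording rate is capped by both net input and storage writes."""
        net_in = sum(i.bw_in_gbps for i in self.connectivity)
        return min(net_in, self.storage_write_gbps)

# Example: a Mark 5B using the ~1.6 Gbps software I/O figure from the
# "Existing Nodules" slide; the 16 TB capacity is an assumed value.
mark5b = Nodule(
    name="Mark 5B",
    connectivity=[Interface("1GE", 1.6, 1.6, duplex=False)],
    storage_tb=16.0,
    storage_read_gbps=1.6,
    storage_write_gbps=1.6,
)
print(mark5b.max_in_gbps())  # -> 1.6

The point of such a record is the “jigsaw” matching exercised on the following slides: a nodule is balanced when no single capability (network, storage, internal pumping power) starves the others.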
Connectivity
• All sorts of methods used to transfer data from one place to another:
  – Physical shipping
  – Networking (both local and global)
  – Device interfaces (e.g. SATA II)
  – Internal buses within equipment
  – VLBI interfaces (e.g. the legacy Mark IV formatter interface)
• Connectivity has a given bandwidth and its restrictions
  – Direction, simultaneous use, less-than-theoretical performance in a given interconnect, ...

Existing “Nodules”
• Variants of Mark 5
  – 5A, 5B: 1 Gbps in or out /w shipping; 1.6 Gbps in or out /w sw
  – 5B+: 1/2 Gbps in or out /w shipping; 3.2(?) Gbps in or out /w sw
  – 5C: 4 Gbps in only /w shipping; 3.2(?) Gbps in or out /w sw
• Metsähovi 20-disk pack /w 10GE PC
  – 6 Gbps in or out /w shipping(?); 6 Gbps in or out /w sw; in & out simultaneously /w sw not yet tested
• BackBlaze 45-disk 4U rackmount /w 1(!)GE PC
• Emerging high-end 2/4/6U rackmounts
  – Claim “up to” 16-24 Gbps r/w at a premium price

Nodule Jigsaw Puzzle
• For instance, try to find a balanced match of storage, connectivity, and packaging options to accompany a suitable CPU
• Connectivity options:
  – 2× 1GE ports: 1 Gbps (or a little more)
  – 3× 1GE ports: 2 Gbps(?)
  – 1× 10GE port: 6 Gbps, maybe 8 Gbps
  – 2× 10GE ports: 8-10-?? Gbps
• Storage options:
  – 2-4 SATA II disks
  – 6 SATA II disks
  – 4-6 SATA II disks + 5 /w PM = ~10 SATA II disks: 4 Gbps(?)
  – 20 SATA II disks /w PM: 6 Gbps
  – 20-45 SATA II disks /w many controllers: ?

Packaging Puzzles
• Single unit
  – Tends to become bulky; problems of (semi)custom construction
• Small-scale rack installation
• Full-size rack
  – Rack connectivity: switches such as 24/48× 1GE + 2× 10GE (cheap), 8× 10GE, or 24× 10GE (rare, expensive, needs a 10GE CX->T transition)
• Google-style “racks”
  – Very economical for a “20 or more small PCs” configuration
  – But becomes trash in a couple of years and must be thrown away and replaced with a new set...
• Rack farms

Simultaneous Read and Write
• We want to observe (and store a copy of the data) and at the same time already start processing
• Frequently dictated by the need to use the same (maybe special) connectivity for both directions
• Two problems:
  – HDD seek time slows down the use of more than one “spot” on a disk
  – Even without seeks, double streaming bandwidth is required throughout the internal data paths
• Seeking is alleviated by multiplexing HDDs (sketched below)
  – Means more HDDs are needed than the bare minimum
  – Multiplexing is typically done in time, in time chunks >> the HDD seek time
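As mentioned above, here is a minimal sketch of the time-chunk multiplexing idea in Python: each disk streams sequentially for one long chunk and pays roughly one seek per chunk boundary, so chunks much longer than the seek time keep efficiency high. All figures are illustrative assumptions.

# Sketch of time-chunk multiplexing over an HDD pool: half the disks
# write (record) while the other half read (play back), swapping roles
# each chunk. One seek per chunk boundary; all figures are assumptions.
SEEK_S = 0.01        # ~10 ms average HDD seek time
DISK_GBPS = 0.8      # assumed sustained per-disk streaming rate

def effective_rate(chunk_s: float, disks_per_role: int) -> float:
    """Aggregate one-direction rate when each chunk costs one seek."""
    efficiency = chunk_s / (chunk_s + SEEK_S)
    return disks_per_role * DISK_GBPS * efficiency

# Six disks: three writing while three read, roles swapping each chunk.
for chunk_s in (0.05, 0.5, 5.0):  # chunk length in seconds
    print(f"chunk {chunk_s:4}s -> {effective_rate(chunk_s, 3):.2f} Gbps per direction")

# With chunks >> seek time the pool approaches its full 2.4 Gbps
# streaming rate in each direction, at the cost of ~2x the disk count.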
Imagining the NEXPReS WP8 Nodule...
• We want more than a trivial single-PC system
  – But not any large-scale rack system (no money for that!)
• Something that would retain its topology in 2015
  – But go from 4-8 Gbps (NEXPReS) to 16-32 Gbps (2015)
• The most obvious nodule would be a configuration of six 1GE PCs and one 24× 1GE-to-(1 or 2)× 10GE switch
  – Could do 4 Gbps in or out, 1-2 Gbps in and out simultaneously
  – Can exercise multiplexing in time and in IP, and multiple nets per PC
  – The obvious upgrade in 2015 would be to 100% 10GE
    • Which means replacing everything (except the software!)
    • Might get up to 32 Gbps in or out...

Imagining the NEXPReS WP8 Nodule...
• On the other hand, a station nodule could be a configuration of two 10GE PCs and one small 10GE switch
  – Could do 8 Gbps in or out, 4 Gbps in and out simultaneously
  – Can exercise multiplexing in time and in IP
  – The obvious upgrade in 2015 would be to buy more similar PCs
    • But then: PCs of 2015 will be completely different; a mixed configuration might look weird and make use (= software) more complicated, and the 10GE switch might prove too small and outdated
    • So we might end up buying all new stuff anyway...
  – Will quite likely cost more now than the “six small PCs” scenario
• Well, this should be settled in the Dec-2010 D8.1 deliverable...

“The Inconvenient Truths” :-)
• About e-VLBI:
  – “A given station cannot really sustain a recording bandwidth larger than its e-VLBI connectivity, unless given an unlimited disk buffer.” (See the sketch at the end for rough numbers.)
  – “A single slow (or high-latency, e.g. shipping) connection in a given e-VLBI network will force the others (or some buffering party) to buffer most of the VLBI data, if not all of it.”
• About buffers and archives:
  – “Huge disk buffers with thousands of disks (whether distributed or centralized) will cost a fortune, age rapidly, be fragile (even with the highest-end equipment), and be in constant need of (hw) maintenance.”

“The Inconvenient Truths” :-)
• About Mark 5s:
  – “The existing 8-packs of PATA disks will never be accessed for simultaneous read and write, unless Conduant dramatically changes the StreamStor firmware.”
  – “No variant of Mark 5 will ever feed the Mark IV correlator faster than 1 Gbps. While a given Mark 5 unit is feeding the correlator, no new data can be fed into that Mark 5 at the same time.”
  – “At 1.6 Gbps, and maybe 3.2 Gbps in pairs, the existing 8-packs of PATA disks make little sense for >=4 Gbps buffering. 8-packs will continue to be useful only for storing data certainly destined to be shipped physically.”
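To put rough numbers on the first e-VLBI “truth” above, a back-of-the-envelope sketch in Python; all rates and durations below are assumed example figures, not measurements.

# Buffer growth when recording outpaces the e-VLBI link; every figure
# below is an illustrative assumption.
def buffer_needed_tb(rec_gbps: float, net_gbps: float, hours: float) -> float:
    """TB of buffer accumulated over one observing session."""
    deficit_gbps = max(rec_gbps - net_gbps, 0.0)
    return deficit_gbps * hours * 3600 / 8 / 1000  # Gb -> GB -> TB

# Example: recording 4 Gbps over a 1 Gbps link for a 12-hour session
print(buffer_needed_tb(4.0, 1.0, 12))  # -> 16.2 TB left to buffer or ship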