Meeting the Challenge of Large Data

The IT industry is abuzz with advice on what to do about Big Data, but some organizations are facing another important topic that receives much less attention: Large Data. Large Data as a concept is the need to store individual pieces of data that are massive in size, such as audio, high-definition videos, and graphics (see Figure 1). Some examples of Large Data include an archive of videos, such as security videos for a department store or a movie production company; seismic maps generated by oil and gas companies for exploration; or high-resolution digital X-rays and 3D ultrasound scans stored by a hospital.

Large Data Characteristics

Some important characteristics of Large Data are:

• Large total size for the archive (over 200TB)
• Large individual pieces of data stored on the system
• Reliable but relatively infrequent access to stored resources
• Potential need for future expansion and scale

The problems in Large Data are compounded when it comes to the total size of the storage system. A 10GB video file is almost ten thousand times bigger than a standard Microsoft Word document, which means you would need ten thousand times more capacity to store an equal number of files.

SUSE Enterprise Storage is a comprehensive storage solution tailored for the Large Data environment. You get what you need, and you don't pay for what you don't need. Behind the scenes, SUSE Enterprise Storage adds additional savings with fault-tolerant, self-healing Ceph technology that minimizes administration and maintenance. The result is a powerful solution optimized for Large Data customers.

Conventional block storage solutions cost around $1-2 per GB for installations in the 100TB to 1PB range. With SUSE Enterprise Storage, the cost is often only 30 cents per GB, dropping to as low as 10 cents per GB for installations above 10PB.
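
The capacity and price figures quoted above can be turned into a rough back-of-the-envelope calculation. The following Python sketch is illustrative only: the per-GB prices come from the text, while the 1 MB document size, the 20,000-file archive, the helper names, and the use of decimal units (1 GB = 1,000 MB) are assumptions made for the example, not vendor figures.

```python
# Illustrative arithmetic only. Prices are the per-GB figures quoted in the
# text; the 1 MB document size, the 20,000-file count, and decimal units
# (1 GB = 1,000 MB) are assumptions made for this sketch.

MB_PER_GB = 1_000


def archive_size_gb(file_count: int, file_size_mb: int) -> int:
    """Capacity in whole GB needed to hold file_count files of file_size_mb each."""
    return file_count * file_size_mb // MB_PER_GB


def archive_cost_dollars(capacity_gb: int, cents_per_gb: int) -> int:
    """Cost in whole dollars of capacity_gb at a flat price in cents per GB."""
    return capacity_gb * cents_per_gb // 100


FILE_COUNT = 20_000                              # hypothetical archive size
docs_gb = archive_size_gb(FILE_COUNT, 1)         # ~1 MB documents (assumed)
videos_gb = archive_size_gb(FILE_COUNT, 10_000)  # 10 GB video files

print(f"documents: {docs_gb} GB, videos: {videos_gb} GB")

# Flat per-GB prices in cents, taken from the ranges quoted in the text.
for label, cents in [
    ("conventional block storage, low end", 100),
    ("conventional block storage, high end", 200),
    ("SUSE Enterprise Storage, typical", 30),
]:
    print(f"{label}: ${archive_cost_dollars(videos_gb, cents):,}")
```

Under these assumptions, 20,000 documents need about 20 GB, while 20,000 videos need 200,000 GB (200 TB). At the quoted prices, that video archive would cost roughly $200,000 to $400,000 on conventional block storage and about $60,000 at 30 cents per GB. Real quotes depend on hardware, redundancy overhead, and support terms, so treat the output as an order-of-magnitude comparison.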

If you attempt to address your Large Data storage needs with a conventional storage solution, the total cost of deploying and operating the archive increases with the storage capacity, which is infeasible for many IT budgets. Scalable storage to deal with Large Data emerged because data vendors and customers were not satisfied with the prospect of letting storage costs scale linearly to very large implementations. As you will learn in this paper, a storage solution specifically designed for the needs of Large Data can deliver savings that aren't possible if you depend on conventional storage technologies.

Figure 1: Large Data scenarios encompass a wide range of use cases, but they have one common feature: files are large. One file might be bigger than hundreds of the spreadsheets and still-shot images you find on a conventional disk.

Low-Latency/High Overhead

When hard disk technology began to emerge many years ago, disks were small, and system administrators spent a lot of time tending to the storage system to make sure it didn't fill up. Users had to make choices about which files to save and which to delete. Files that might need occasional access even if they weren't in active use were relegated to offline storage.

Offline storage usually consisted of some kind of archival storage medium, which could be anything from a floppy disk, to a magnetic tape, to a DVD. Offline storage was quite cumbersome and inefficient, typically requiring a human to store and classify disks in some kind of library-like storage collection, but offline storage served an important function: preserving the expensive, low-latency block storage tier for files that required frequent access.

As storage capacities increased over time, storage vendors were all too happy to offer solutions that provided more and bigger hard disks so that all the data could simply reside in conventional, low-latency block storage. This solution was not especially efficient from a cost viewpoint, but rising disk capacities obscured the cost penalty.

The emergence of Large Data as a distinctive market puts the block storage cost penalty in the foreground. Large Data buyers need a solution that offers a significantly lower cost than pouring all the storage resources into conventional block storage. At the same time, a Large Data solution must provide:

• Convenient, always-on live access to all data
• Fault tolerance
• A self-healing architecture that minimizes maintenance

An alternative solution that can meet these needs and deliver big savings for your Large Data customers is object storage.

Understanding Object Storage

Conventional block storage breaks a data file into fixed-sized blocks, placing the blocks at different (typically discontiguous) places on the disk (Figure 2). A file saved through block storage in a Large Data storage environment might consist of hundreds of blocks. Managing and manipulating the storage blocks requires system resources. Perhaps more importantly, some component of the system must serve as the central source for information on where all the blocks are stored, which can cause a processing bottleneck.

Object storage, on the other hand, saves a file without breaking it into separate blocks. The location is derived using a hash of the filename and metadata, which means the system is not dependent on a central lookup table. Object storage simplifies the handoff from the client to the storage system, creating the potential for innovation and automation within the storage system. Ceph is a powerful and versatile solution based on object storage.

Figure 2: Block storage systems divide the file into blocks, storing the blocks separately on the disk.

Discover Ceph

Ceph builds the promise of object storage into a real-world, self-healing, fault-tolerant framework. Unlike other enterprise-ready storage solutions, which often require the purchase of expensive, proprietary hardware, Ceph runs in a cluster formation on commodity hardware. You decide what hardware you wish to use based on a price you negotiate with your hardware vendor. The Ceph object store is accessible from a variety of storage client technologies (see the box entitled “Interfaces”), which means you can deploy Ceph in a number of ways depending on your needs and your network configuration.

The Ceph object store is a self-managing, self-contained system. Once you get your Ceph cluster up and running, it stores and retrieves files with little or no intervention. Ceph has been called “valet parking” for file storage. A file shows up at the front door, and Ceph does the rest, managing the process automatically and invisibly. And Ceph doesn't just stuff the data on a disk; it stores the data redundantly within the cluster, either as multiple copies of the file or with a fault-tolerance technology known as erasure coding, to ensure the seamless recovery of your data if a disk is lost.

If one of the nodes in the cluster goes down, Ceph senses the loss and redistributes the data to other systems, restoring full fault tolerance and providing continuous operation. If your Large Data archive grows beyond its capacity, simply add another node to the cluster. Ceph integrates the new node automatically.

The automation and native fault tolerance built into Ceph result in huge savings. A good rule of thumb for conventional block storage is that it requires around 1 storage admin per 500TB of data. Because a Ceph cluster manages itself, a Ceph storage admin can administer as much as 3-4PB of data, and some networks report up to 10PB per admin.

SUSE Enterprise Storage: The Best Way to Leverage Ceph

SUSE Enterprise Storage takes the promise of Ceph and packages it into a form that is easy to use and implement. The storage experts at SUSE bring value to Ceph by adding:

• Easy deployment: SUSE assembles and integrates a helpful collection of deployment and management tools to simplify administration and extend productivity
• Hardware certification: SUSE tests and certifies common hardware to minimize hardware headaches and IT overhead
• Support: the SUSE team provides several options for comprehensive, inexpensive customer support that will keep your systems running and minimize the need for on-site expertise

The nature of the Large Data environment means that storage archives tend to grow with time. Unlike other Ceph-based options, which charge based on the amount of data in your archive, SUSE offers per-node licensing, which means the price won't go up just because you store more data.

Interfaces

Ceph is based on object storage, but a robust and versatile system of interfaces makes the Ceph object store accessible from many different kinds of storage systems (see Figure 3). In addition to an object storage interface, your Ceph cluster offers interfaces for block storage, a RESTful gateway for web services, and a network filesystem for access from file service clients.

Figure 3: Ceph supports a number of different file storage interfaces and therefore can serve in a number of different roles.

Questions for your Storage Vendor

If you are comparing storage solutions to address your Large Data problem, keep the following questions in mind:

• What is the total cost of the solution in $ per GB?
• How many storage admins will you need to administer the system?
• How easy is it to expand the system after you deploy it? Does the cost rise when you add more data?
• What is the cost of migrating the data to newer hardware when the warranty of the storage being procured expires?
• Is the solution based on Open Source technology, or is it subject to vendor lock-in and proprietary control?
• Does the solution work with commodity hardware?

If you consider all these questions carefully, Ceph and SUSE Enterprise Storage emerge as a sensible solution for your Large Data storage needs.

Conclusion

Pouring 100% of your storage resources into conventional block storage is inefficient and counter-productive if you work in a Large Data environment. Ceph is a powerful alternative built on commodity hardware and object storage technology. SUSE Enterprise Storage puts the power of Ceph into a form that is easy to implement and deploy. The SUSE team adds hardware certification and support, and a per-node licensing scheme means you won't pay more when you add more data to your archive. Ceph and SUSE Enterprise Storage bring the total cost of your Large Data archive down to as low as 40 cents per GB, one third the cost of using conventional block storage for the same scenario.

If your environment has a need for storing jumbo-sized files in a Large Data setting, and your storage solution requires (or might someday require) more than 200TB in storage capacity, consider the benefits of Ceph and SUSE Enterprise Storage.

For more information
800-796-3700 U.S./Canada
801-861-4500 Worldwide
Learn more at suse.com

SUSE
Maxfeldstrasse 5
90409 Nuremberg
Germany

Part number 163-000006-001

© 2016 SUSE. All rights reserved. SUSE and the SUSE logo are registered trademarks of SUSE, LLC in the United States and other countries. All third-party trademarks are the property of their respective owners.