Micron® SolidScale™ Platform Architecture for Cassandra™

Big Data Needs Big Performance

Today it is no longer enough to simply store data. You need to be able to glean actionable information from it, whether it is sensor data from thousands of IoT devices distributed around the world, sales transaction data from some of the world’s largest ecommerce websites, or data that may lead to finding the newest energy reserve. As data grows at an ever-increasing rate, it becomes harder to process that data and convert it into useful information.

Apache Cassandra™ software is one of the industry’s most popular NoSQL database solutions, powering some of the world’s most precise data analytics applications. It is designed to provide fast, scalable, and optimized access to dynamic, unstructured data. Deployed as part of larger analytic solutions such as Apache Spark, Elasticsearch® and Apache Hadoop® MapReduce, Cassandra can put an incredible demand on a storage infrastructure.

Cassandra is designed to take advantage of a scale-out, “share nothing” resource infrastructure where a portion of the data is stored in each local server. When your administrators need additional performance, they add additional servers (nodes) to the cluster. This expansion can lead to unused storage (that you are paying for) and can require additional administration resources.

Analytics solutions need exceedingly fast access to all types of data (structured, unstructured and semi-structured), and Cassandra is a proven, high-performance solution, especially when paired with fast, low-latency NVMe™ SSDs. SSDs are traditionally installed inside the servers in the cluster to gain the performance advantages of PCIe®. Ideally, it would be more efficient for administrators to share these devices and manage them as a single pool of capacity and IOPS, but this is difficult to do with server-local storage.
Administrators can turn to storage area networks (SANs) to share storage, but traditional SANs carry their own limitations and costs when used with scale-out application solutions like Cassandra, and there are currently no SAN options that provide the full performance benefits of NVMe at scale. This is where a high-speed, low-latency infrastructure such as the new Micron® SolidScale™ architecture can provide the flexibility you need to take full advantage of those NVMe SSD resources.

Get Aboard the Data Express for Your Cassandra Solution

SolidScale architecture is the next-generation, intelligent infrastructure that provides scale-out applications, like Cassandra, with all of the raw performance that administrators, users and automated systems demand from server-local storage. It has all of the flexibility, manageability and scalability of SANs, and it uses the latest NVMe and PCIe standards with a low-latency, high-bandwidth RDMA over Converged Ethernet (RoCE) fabric to connect compute and storage.

By decoupling your application servers and storage, the SolidScale architecture gives you the flexibility to deploy compute and storage resources to meet your needs over time. The SolidScale infrastructure can be deployed in a storage-centric configuration (centralized, dedicated storage nodes), in a compute-centric configuration (nodes that can both run applications and provide storage services to the data center), or in a combination of both deployment strategies at the same time.

By combining all of your NVMe SSD storage into a single pool, the SolidScale architecture unlocks all of the capacity and IOPS of the aggregated NVMe SSDs. Allowing more SSDs to contribute to each Cassandra server’s storage needs potentially enables you to run the same workload with fewer storage devices.
Simply configure logical volumes of any desired size, using the capacity of one or more physical NVMe devices in one or more SolidScale nodes, to suit the needs of your Cassandra deployment. A basic SolidScale solution consists of three storage nodes configured as a clustered storage resource in a high-availability configuration to prevent data loss. For Cassandra, SolidScale Infrastructure Manager allows you to create logical volumes in a striped (RAID 0) configuration to scale the performance of your data volumes to more SSD devices than you could in a typical in-server storage Cassandra deployment. The SolidScale infrastructure gives you access to thousands of NVMe SSDs across hundreds of SolidScale nodes in a single storage cluster.

The SolidScale architecture is built on a low-latency, 100Gb RoCE fabric to connect application servers to their NVMe SSDs. RoCE networks are the latest high-performance interconnect technology specifically designed for memory and storage data transfer between independent server resources. While it’s expected that any networked solution will introduce some additional latency to the end-to-end I/O, our initial testing shows that the RoCE network introduces an average of only 5µs additional latency.

Early Testing

We tested two Cassandra cluster configurations: the first used server-local NVMe SSDs and the second used a SolidScale storage cluster for the database repository. With each configuration we used the Yahoo Cloud Serving Benchmark (YCSB) workloads A–D and F to reflect a broad set of cloud application workload types and their respective I/O read and write profiles. Table 1 shows the YCSB workload definitions.
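Before turning to the workload table, the striped (RAID 0) volume layout mentioned earlier can be illustrated with a minimal sketch. It shows the generic RAID 0 address mapping only, not how SolidScale Infrastructure Manager actually lays out data; the function name, the stripe size and the device count are all assumptions chosen for illustration.

```python
# Generic RAID 0 address mapping (illustrative only; not SolidScale's
# actual on-disk layout). All names and sizes here are hypothetical.

def stripe_location(lba: int, stripe_blocks: int, num_devices: int) -> tuple[int, int]:
    """Map a logical block address to (device index, block offset on that device)."""
    stripe_index = lba // stripe_blocks          # which stripe unit the block falls in
    device = stripe_index % num_devices          # stripe units rotate across devices
    device_stripe = stripe_index // num_devices  # stripe units already placed on that device
    offset = device_stripe * stripe_blocks + lba % stripe_blocks
    return device, offset


if __name__ == "__main__":
    # A volume striped over 3 devices with 4-block stripe units (assumed values).
    for lba in (0, 4, 11, 13):
        print(lba, stripe_location(lba, stripe_blocks=4, num_devices=3))
```

Because consecutive stripe units land on different devices, a large sequential request is spread over several SSDs at once, which is the reason striping lets a single volume draw on more devices than one server could hold locally.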
Workload¹   Operation Mix (R/W)  Description
A           50% / 50%            Session store recording actions in a user session
B           95% / 5%             Photo tagging
C           100% / 0%            User profile cache (typical for Hadoop solutions)
D           95% / 5%             Insert records and read latest inserted records multiple times
E           95% / 5%             Threaded conversation perusal
F           50% / 50%            Read, modify, write database activity

Table 1: YCSB Workload Definitions

1. We did not test workload E (short ranges) as it is not universally supported on NoSQL-type databases.

The baseline Cassandra solution used six application servers configured as described in Table 2 and deployed as shown in Figure 1. We compared these results to those obtained by running the same YCSB test matrix against a six-node Cassandra solution using a three-node SolidScale storage configuration, with each server using a single SolidScale architecture-provided NVMe SSD device. Using six application servers distributed across the SolidScale cluster meant each SolidScale node hosted the data for two of the Cassandra servers, as shown in Figure 2.

The early testing shows that a SolidScale infrastructure can provide performance that is nearly that of server-local deployments.
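The placement described above (six Cassandra servers, three SolidScale nodes, two servers' data per node) can be sketched as a simple round-robin assignment. This is only an illustration: the text does not say how volumes were assigned to nodes, so the round-robin policy, the function name and the host names are all assumptions.

```python
# Hypothetical placement sketch: spread Cassandra servers' data volumes
# over SolidScale nodes round-robin. The policy and names are assumed.
from collections import defaultdict


def assign_volumes(servers: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Return a mapping of storage node -> application servers whose data it hosts."""
    placement = defaultdict(list)
    for i, server in enumerate(servers):
        placement[nodes[i % len(nodes)]].append(server)
    return dict(placement)


servers = [f"cassandra-{n}" for n in range(1, 7)]  # six application servers
nodes = [f"solidscale-{n}" for n in range(1, 4)]   # three storage nodes
placement = assign_volumes(servers, nodes)

if __name__ == "__main__":
    for node, hosted in placement.items():
        print(node, hosted)
```

With six servers and three nodes, any even distribution leaves each node serving two Cassandra servers, which matches the tested layout.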
Figure 1: Cassandra test configuration using Micron SolidScale infrastructure

Figure 2: Cassandra test configuration using server local SSDs

                                               Baseline (Figure 1)                  SolidScale (Figure 2)
# Application servers                          6                                    6
Application server configuration               • 2X Intel® Xeon® 26xxv4 @ 2.1GHz    • 2X Intel Xeon 26xxv4 @ 2.1GHz
                                               • 8X 32GB RDIMMs                     • 8X 32GB RDIMMs
                                               • 2X 500GB M500DC SSDs RAID1 (boot)  • 2X 500GB M500DC SSDs RAID1 (boot)
                                               • 2X 10GbE local area network ports  • 2X 10GbE local area network ports
                                                                                    • 2X 25Gb Mellanox ConnectX®-4 RoCE storage network ports
# Data drives per server                       1X 2.4TB Micron 9100 NVMe SSD        0
# Physical drives per logical volume           1                                    1
Logical volume capacity                        2.4TB                                2.4TB
# Logical data volumes per application server  1                                    1
Total Cassandra database size                  1TB                                  1TB
Cassandra replication factor                   3                                    3

Table 2: Cassandra test configurations

The Results

The results of our testing with YCSB workloads show that the SolidScale infrastructure performance, in terms of operations per second and latency, is less than 10 percent slower than local NVMe storage performance.

While our testing was done on an early technology preview configuration of hardware and software, we are already seeing significant performance increases from our SolidScale infrastructure. These tests used only a single 25Gb RoCE port and a single NVMe SSD for each Cassandra server, with no network optimization, so we expect the measured performance to be lower than what we should be able to achieve with additional, redundant data paths, advanced multipath software and network quality of service, none of which were used in these early tests. When compared to a server-local SSD solution, some additional latency will be incurred because of the network, but our early numbers are very encouraging.

The Bottom Line
As data continues to grow, and as analyzing that data at an ever faster pace becomes more essential to your business success, you need advanced, high-performance data center infrastructures that can power next-generation applications. As we have shown, SolidScale can provide the capacity, performance, and scalability that you will need to be successful. Cassandra performance on SolidScale is extremely fast, demonstrating a scalable, centrally managed solution with more performance potential than traditional Cassandra deployments that use server-local storage.

Want to Learn More?

Interested in participating in the SolidScale platform early access program? Or are you an OEM company interested in partnering with Micron to extend the SolidScale architecture across your hardware platforms? Visit micron.com/solidscale or send an email to [email protected].

www.micron.com

No hardware, software or system can provide absolute security and protection of data under all conditions. Micron assumes no liability for lost, stolen or corrupted data arising from the use of any Micron product. Products are warranted only to meet Micron’s production data sheet specifications. Products, programs and specifications are subject to change without notice. Dates are estimates only. ©2017 Micron Technology, Inc. All rights reserved. All information is provided on an “AS IS” basis without warranties of any kind. Micron, the Micron logo and SolidScale are trademarks of Micron Technology, Inc. Apache, Apache Cassandra, and Apache Hadoop are registered trademarks of The Apache Software Foundation. NVMe is a trademark of NVM Express, Inc. Elasticsearch is a trademark of Elasticsearch BV, registered in the U.S. and in other countries. PCIe is a registered trademark of PCI-SIG. Intel and Xeon are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.
Mellanox and ConnectX are registered trademarks of Mellanox Technologies, Ltd. Rev. C 6/17 CCMMD-676576390-10708