Liu JL, Zhang YL, Yang L et al. SAC: Exploiting stable set model to enhance CacheFiles. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 29(2): 293–302 Mar. 2014. DOI 10.1007/s11390-014-1431-z

SAC: Exploiting Stable Set Model to Enhance CacheFiles

Jian-Liang Liu1,2 (刘建亮), Yong-Le Zhang3 (张永乐), Lin Yang1,2 (杨琳), Ming-Yang Guo1 (郭明阳), Zhen-Jun Liu1 (刘振军), and Lu Xu1 (许鲁)

1 Data Storage and Management Technology Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
2 University of Chinese Academy of Sciences, Beijing 100049, China
3 Department of Electrical and Computer Engineering, University of Toronto, Toronto M5S 3G4, Canada

E-mail: [email protected]; [email protected]; {yanglin, guomingyang, liuzhenjun, xulu}@nrchpc.ac.cn

Received November 15, 2013; revised January 8, 2014.

Abstract Client cache is an important technology for the optimization of distributed and centralized storage systems. As a representative client cache system, the performance of CacheFiles is limited by transition faults. Furthermore, CacheFiles supports only a simple LRU policy with a tightly-coupled design. To overcome these limitations, we propose to employ the Stable Set Model (SSM) to improve CacheFiles and design an enhanced CacheFiles, SAC. SSM assumes that data access can be decomposed into accesses on some stable sets, in which elements are always repeatedly accessed, or not accessed, together. Using SSM methods can improve cache management and reduce the effect of transition faults. We also adopt loosely-coupled methods to design prefetch and replacement policies. We implement our scheme on Linux 2.6.32 and measure the execution time of the scheme with various file I/O benchmarks. Experiments show that SAC can significantly improve I/O performance and reduce execution time by up to 84%, compared with the existing CacheFiles.

Keywords Stable Set Model, cache management, CacheFiles

1 Introduction

With the era of big data and increasing computing power, data centers, global enterprises, and cloud storage providers all need to share massive amounts of data. Conventional distributed and centralized storage systems are confronted with severe challenges in performance and scalability. Client cache is an important technology for the optimization of distributed and centralized storage systems[1-4]. With local cache storage, it can reduce access latency and server load, and smooth data access traffic. The rapid development of SSDs (solid-state drives) further increases the importance of client cache.

With a directory on disks, CacheFiles[2]① can be used as a caching file system layer for Linux to enhance the performance of a distributed file system (e.g., NFS, AFS). Blue Whale Cluster File System[5] (BWFS) is based on SAN architecture and adopts an out-of-band transfer mode. BWFS's clients can use CacheFiles as local caching. However, the performance of CacheFiles is limited by two problems: CacheFiles lacks the ability to efficiently exchange data during phase transitions; furthermore, CacheFiles supports only a simple file-level LRU policy with a tightly-coupled design. Larger cache capacity leads to fewer and fewer phase faults[6], which represent cache misses in stable phases. But transition faults, which happen in transition periods, account for the majority of cache faults, and this leads to the inefficiency of "data exchange" between phases. How to improve the performance of CacheFiles during transition periods is an important problem.
Conventional research on cache management algorithms[7-9] mainly focuses on phase faults and ignores "data exchange" during transitions. In order to solve the transition fault problem, our previous study presented the Stable Set Model (SSM)[10]. SSM considers that data access streams can be decomposed into a number of access streams on different stable sets, and there are durable relationships between data in each stable set.

The most important contribution of this paper is that we use SSM-based methods to manage CacheFiles and apply SSM to a real cache system for the first time. After devising a general framework where different policies can be easily plugged in, we present the overall design and implementation of SSM-based CacheFiles — SAC, and take an incremental approach to evaluate it with representative testing tools and file system benchmarks. Experiments show SSM-based prefetch can reduce response time by up to 54.3% and SSM-based replacement can reduce execution time by 21.76%. The overall effectiveness can be improved by up to 84%.

The remainder of the paper is organized as follows. Section 2 provides a brief review of the Stable Set Model. We describe system design details in Section 3 and experiments in Section 4. Section 5 discusses the limitations of our work and presents some future work. Section 6 describes related work and Section 7 concludes the paper.

Regular Paper
This work was supported by the National Basic Research 973 Program of China under Grant No. 2011CB302304, the National High Technology Research and Development 863 Program of China under Grant Nos. 2011AA01A102, 2013AA013201 and 2013AA013205, the Strategic Priority Research Program of the Chinese Academy of Sciences under Grant No. XDA06010401, and the Chinese Academy of Sciences Key Deployment Project under Grant No. KGZD-EW-103-5(7).
The work was done while the second author was an M.S. student at the Institute of Computing Technology, Chinese Academy of Sciences.
① https://www.kernel.org/doc/Documentation/filesystems/caching/cachefiles.txt, Jan. 2014.
©2014 Springer Science + Business Media, LLC & Science Press, China

2 Stable Set Model

2.1 Definition

SSM is the first macro model used in managing cache of storage systems. It comes from graphical observation of a large amount of data access. The majority of data access at the macroscopic level has two things in common: 1) just like program behavior, data access presents a phase-transition feature, that is, temporal-locality behavior coexisting with mutations; 2) the dataset accessed in a phase is not randomly composed, and there are elementary sets that form it. SSM is defined based on these two commonalities.

Definition 1 (Stable Set Model). The Stable Set Model considers that any data access stream RE can be decomposed into a number of stable set access streams, as follows:

    RE = Σ_{i=1}^{n} (Si, Ti),    (1)

where (Si, Ti) represents the access stream on stable set Si. Stream (Si, Ti) proceeds in a phase-based manner, and is independent of streams on other stable sets. Si represents the dataset, and is disjoint with other stable sets. Ti represents the sequence of time phases during which stream (Si, Ti) happens:

    Ti = (ti1, ti2), (ti3, ti4), (ti5, ti6), ....    (2)

We use the working set model[6] as a base to give the precise definition. Because the working set model ignores the access order information inside a working set, the complexity of our definition is approximately O(L), where L is the length of data accesses.

Because a stable set is a dataset in which all data are always repeatedly accessed together, we need two steps to define the stable set via working sets.

Step 1: forming repeatedly accessed sets (defined as the stable access sets in an early stage). We get all repeatedly accessed sets by intersecting two contiguous working sets. R is a repeatedly accessed set, i.e.,

    R = W(t, T) ∩ W(t + T, T),    (3)

where W(t, T) and W(t + T, T) are working sets.

Step 2: forming always-repeatedly-accessed-together sets (so-called stable sets). We get stable sets by intersecting all the repeatedly accessed sets we got. So a stable set is an elementary set of the repeatedly accessed sets. S is a stable set, i.e.,

    ∀R, S ∩ R = S or S ∩ R = Ø.    (4)

Another important concept is the stable set's active time, defined as follows.

Definition 2 (Active Time). If repeatedly accessed set R = W(t, T) ∩ W(t + T, T), and stable set S satisfies S ∩ R = S, then the time section (t − T, t + T) is called the active time of S, and S in (t − T, t + T) is considered to be in the active state.
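For illustration, the following C sketch (ours, not the paper's offline mining program; the window size, toy trace, and names are assumptions) derives stable sets exactly as Steps 1 and 2 prescribe: consecutive working sets are intersected to obtain repeatedly accessed sets, and blocks with identical membership patterns across all of those sets are grouped into one stable set.

/* Illustrative mining sketch (not the paper's miner): a toy universe
 * of at most 64 blocks lets every set live in a uint64_t bitmask. */
#include <stdio.h>
#include <stdint.h>

#define T 4  /* working-set window, in accesses (assumed) */

/* W(t, T): blocks accessed in the last T references before t. */
static uint64_t working_set(const int *trace, int len, int t)
{
    uint64_t w = 0;
    for (int i = t - T; i < t; i++)
        if (i >= 0 && i < len)
            w |= 1ULL << trace[i];
    return w;
}

int main(void)
{
    /* Blocks 0,1 recur in every phase; 2,3 only early; 4,5 only late. */
    int trace[] = {0,1,2,3, 0,1,2,3, 0,1,4,5, 0,1,4,5};
    int len = (int)(sizeof(trace) / sizeof(trace[0]));

    /* Step 1: R = W(t,T) intersected with W(t+T,T), per (3). */
    uint64_t rset[16];
    int nr = 0;
    for (int t = T; t + T <= len; t += T) {
        uint64_t r = working_set(trace, len, t) &
                     working_set(trace, len, t + T);
        if (r)
            rset[nr++] = r;
    }

    /* Step 2: blocks with identical membership across all R sets are
     * "always repeatedly accessed together": one stable set, per (4). */
    uint64_t done = 0;
    for (int b = 0; b < 64; b++) {
        if (done & (1ULL << b))
            continue;
        uint32_t sig = 0;
        for (int i = 0; i < nr; i++)
            if (rset[i] & (1ULL << b))
                sig |= 1u << i;
        if (!sig)
            continue;  /* block never repeatedly accessed */
        printf("stable set:");
        for (int c = b; c < 64; c++) {
            uint32_t csig = 0;
            for (int i = 0; i < nr; i++)
                if (rset[i] & (1ULL << c))
                    csig |= 1u << i;
            if (csig == sig) {
                printf(" %d", c);
                done |= 1ULL << c;
            }
        }
        printf("\n");
    }
    return 0;
}

Run on the toy trace, the sketch reports the stable sets {0, 1}, {2, 3} and {4, 5}, matching (4): each of them is either contained in a given repeatedly accessed set or disjoint from it.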
2.2 Cache Block Size

The cache block size (B) is not only the basic unit of cache resource allocation, but also the basic unit of backend load. It is very important for cache resources, backend load, and the locality of data access. For cache resources, too small a B will lead to insufficient use of cache capacity and high overhead for cache management; too large a B will lead to too much cache pollution, thus greatly increasing the number of cache misses. For backend load, too small a B could undermine the continuity of data requests, thus decreasing the maximum throughput of storage systems; too large a B may cause too much wasted I/O, and also reduce the maximum service ability of network storage systems. Using a larger granularity for perceiving data access streams will enhance the locality of data access, and reduce the miss ratio of SSM. Based on the three reasons above, we use SSM to choose a proper B for cache management.

Definition 3 (Stable Set Granularity). B is the stable set granularity (SSG) of a data access stream if and only if using B as the cache granularity will not lead to phase faults, and the backend load of its footprint is minimal.

2.3 Cache Prefetch

Using SSM for prefetch has three advantages: 1) it can further reduce I/O response time; 2) it can bring more opportunities for merging requests, so as to further reduce backend load; 3) prefetch makes more data requests asynchronous and provides more scheduling space for storage systems, so as to smooth bursty I/O and improve the scalability of storage systems.

The core idea of stable set prefetch is to perceive the state change of stable sets, and prefetch all data blocks of a stable set when it turns active, so as to reduce compulsory misses. But the active time of SSM is a posterior definition, which is hard to use in cache management. In order to detect the state change of stable sets, we still need the following concept — MAI.

Definition 4 (Maximum Active Interval). The largest access interval on a stable set S when it is active is called the maximum active interval (MAI) of the stable set S.

When any stable set is active, all of its elements will be accessed simultaneously and sustainably, but when it is inactive, there will be no access to it. So there is a wide difference of data access intervals between these two states. This is why we can use data access intervals to detect the state change of stable sets. MAI is the boundary of state change detection. If a stable set is accessed twice successively with an interval smaller than MAI, we can think that it is in an active state. If a stable set has not been accessed for a time longer than MAI, we can think that it has turned inactive. The formal definition of stable set spatial-locality is as follows.

Definition 5 (Stable Set Spatial-Locality). If the last access interval of a stable set is smaller than MAI, then all the data of the stable set are likely to be accessed in the near future. This is called stable set spatial-locality.
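The detection rule reduces to one interval comparison per stable set. The following sketch (hypothetical names and structure, not SAC's code) shows both transitions: an access interval below MAI marks the set active and triggers a whole-set prefetch (Definition 5), while idling beyond MAI marks it inactive and degrades its data (stable set temporal-locality, formalized as Definition 6 in Subsection 2.4).

/* Illustrative sketch of MAI-based state detection (names are
 * hypothetical, not SAC's actual code). Times are in seconds. */
#include <stdio.h>

struct stable_set {
    int    id;
    double mai;           /* maximum active interval of this set */
    double last_access;   /* time of the most recent access */
    int    active;        /* 1 = active, 0 = inactive */
};

static void prefetch_whole_set(struct stable_set *s)
{
    printf("prefetch all blocks of set %d\n", s->id);  /* stub */
}

static void degrade_whole_set(struct stable_set *s)
{
    printf("degrade all blocks of set %d\n", s->id);   /* stub */
}

/* Called when any block of the set is accessed at time `now`: an
 * interval below MAI signals the set is turning active (Definition 5),
 * so the whole set is prefetched. */
void sset_on_access(struct stable_set *s, double now)
{
    double interval = now - s->last_access;
    s->last_access = now;
    if (!s->active && interval < s->mai) {
        s->active = 1;
        prefetch_whole_set(s);
    }
}

/* Called periodically: idle longer than MAI signals the set has turned
 * inactive (Definition 6 below), so its data is degraded. */
void sset_on_scan(struct stable_set *s, double now)
{
    if (s->active && now - s->last_access > s->mai) {
        s->active = 0;
        degrade_whole_set(s);
    }
}

int main(void)
{
    struct stable_set s = { .id = 1, .mai = 5.0, .last_access = 0.0, .active = 0 };
    sset_on_access(&s, 2.0);   /* interval 2.0 < MAI: set turns active */
    sset_on_scan(&s, 10.0);    /* idle 8.0 > MAI: set turns inactive */
    return 0;
}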
2.4 Cache Replacement

Stable set prefetch needs a lot of cache resources to load the whole set. According to [11], prefetched data is usually treated as one-time access data or given even lower priority. Some stable sets may be no longer active, but because their data has been accessed repeatedly, we cannot replace them from the cache. Consequently, this wastes cache space. The core idea of the stable set replacement algorithm is to perceive the state change of a stable set, and degrade all its data when it becomes inactive, so as to make it replaceable. The way to perceive the state change is also to monitor access intervals on stable sets. If a stable set is not accessed for a time longer than its MAI, we will consider that it has turned to the inactive state, and degrade all its data to a low priority. The formal definition of stable set temporal-locality is given as follows.

Definition 6 (Stable Set Temporal-Locality). If a stable set is not accessed for a time longer than its MAI, all the data of the stable set are unlikely to be accessed in the near future. This is called stable set temporal-locality.

2.5 Limitation of SSM Cache Management

The premise of using SSM in cache management is that the cache must be big enough that there is a chance to eliminate all phase faults. Therefore we define large-capacity cache as follows.

Definition 7 (Large-Capacity Cache). A cache system is a large-capacity cache if and only if it is bigger than the largest phase set when using the smallest granularity.

3 System Design

3.1 SAC System Architecture

In this design, SAC is used in the client of BWFS and utilizes a large-capacity SSD with the ext4 file system as the client cache's physical media. Fig.1 shows the architecture of our client cache: cachefilesd is a daemon that manages the cache in user space, while CacheFiles runs in kernel space. In order to support SSM, we add a stable set mining module. We design each component as follows.

Fig.1. SAC system architecture.

SSM Mining. This component in user space is for mining stable sets. It first collects online traces captured by the kernel into a temporary trace file, and then mines the trace file with the mining program to produce the most critical stable set file.

SSM Cache Manager. This component in user space contains the replacement and prefetch policies. These policies are configured on demand.

Cache-Kernel-Module. This kernel part is implemented in CacheFiles, and consists of three parts.
• Prefetch-Module. This module analyzes command parameters passed by cachefilesd and does asynchronous prefetch.
• Release-Module. This module receives and analyzes command parameters from cachefilesd, and executes ext4's fallocate operation with the FALLOC_FL_PUNCH_HOLE flag to release cache blocks (see the sketch after this list).
• Tracking-Module. This module is for tracking the access-record trace. Its main goal is to provide access records to the SSM mining and cache manager modules.
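For reference, hole punching as used by the Release-Module is exposed to user space through the fallocate(2) system call; on Linux the FALLOC_FL_PUNCH_HOLE flag must be combined with FALLOC_FL_KEEP_SIZE. The minimal user-space sketch below (file name and block size are assumptions) illustrates releasing one cache block; SAC's kernel module performs the equivalent operation inside CacheFiles.

/* Minimal user-space illustration of cache-block release via hole
 * punching (SAC's Release-Module does the equivalent in kernel space). */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <linux/falloc.h>

/* Release one cache block at block number `blk` in file `fd`. */
static int release_block(int fd, long blk, long block_size)
{
    /* PUNCH_HOLE must be ORed with KEEP_SIZE on Linux. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  blk * block_size, block_size) < 0) {
        perror("fallocate");
        return -1;
    }
    return 0;
}

int main(void)
{
    int fd = open("cachefile", O_RDWR);  /* hypothetical cache file */
    if (fd < 0) { perror("open"); return 1; }
    release_block(fd, 3, 64 * 1024);     /* free the 4th 64 KB block */
    close(fd);
    return 0;
}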
Overall, cachefilesd is responsible for collecting the kernel state (via the read call), and then sends all commands (via the write call) to CacheFiles; CacheFiles does the real jobs of prefetch and replacement. That is to say, cachefilesd is to CacheFiles as policy is to mechanism. Fig.2 shows the pseudo-code of cache management in cachefilesd (policy initializations are omitted). Here, we isolate the prefetch algorithm from the replacement algorithm, unlike conventional prefetch triggered by misses. This benefits the module design and simplifies the inner procedure.

cachefilesd() {
    n = read cache state to buffer;
    do {
        S1: collect system information;
    } while (buffer is not empty);
    for (prefetched records)
        S2: add block to cache;
    for (access records) {
        S3: access cache;
        S4: do the prefetch;
    }
    S5: cache replacing;
}

Fig.2. Simplified cache management in cachefilesd.

S1. This step processes the kernel information returned by the read call, including the prefetch trace, the data access trace, and the state of cache resources.

S2. Because prefetch is asynchronous, this step adds successfully prefetched pages to the cache, and when all pages belonging to the same block (an SSG block may contain multiple pages) are called back, SAC removes the corresponding node from the prefetch tree.

S3. This step replays the data access trace, updating the cache state (and, if needed, the states of all stable sets).

S4. This step checks whether to prefetch, and if allowed, SAC issues the prefetch and inserts a node generated from the prefetch information into a prefetch tree for quick lookup, so as to avoid repeatedly prefetching a block.

S5. The final step, according to the cache state and the physical state of the cache directory (via the statvfs call), triggers the replacement operation to remove the right data blocks.

In essence, SAC is a combination of SSM and CacheFiles, but two challenging issues need to be addressed in the design. The first issue is to find a data organization to describe cache management. The second issue is how cache policies interact with SSM. The next two subsections address these two design issues respectively.

3.2 File-Block Management

CacheFiles is a file-level cache, only supporting cache replacement at file granularity, while cache management based on SSM needs block-level caching. In order to make the best use of cache management, we need to implement the cache management mechanism at block granularity. A combination of file-level and block-level cache management performs better: File-Aware-Cache[12] has explored file-level information to optimize a block-level cache, which can avoid caching one-time sequential accesses to large files by using file information. That research has shown that cache management combining file level and block level can perform better than simple file-level or block-level management.

To achieve a file-block-level cache, we combine the file identification and the offset together. Generally, we can find a file via a full path name, a file inode, or an NFS file handle. As the length of a full path name is uncertain and an inode may expire, we use the file handle as the file identifier. File handles and file block offsets are combined into logical block numbers. As the size of a file handle can be up to 128 B, using it directly as part of the logical block number is bound to introduce a lot of time and space overhead. In order to reduce the overhead, we map the file handle to a 32-bit number, and then combine it with a 32-bit in-file offset to form the logical block number, as shown in Fig.3.

Fig.3. A file handle is encoded into visible characters by the base64 algorithm, and then a hash function over the visible characters completes the mapping from a file handle to a 32-bit number.

On the one hand, the hash function ensures SAC runs with a small overhead; on the other hand, the 32-bit in-file offset allows file sizes up to 2 TB, guaranteeing large-file support.
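A sketch of this mapping is shown below (ours and purely illustrative: SAC first base64-encodes the handle as in Fig.3, while for brevity this sketch hashes the raw handle bytes with FNV-1a, an arbitrary 32-bit hash). The 32-bit handle hash occupies the high half of the 64-bit logical block number, and the 32-bit in-file block offset the low half.

/* Illustrative sketch of the file-block addressing described above
 * (hash choice and names are hypothetical, not SAC's own). */
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* FNV-1a: a simple 32-bit hash over the opaque file-handle bytes. */
static uint32_t fh_hash32(const unsigned char *fh, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= fh[i];
        h *= 16777619u;
    }
    return h;
}

/* Logical block number: 32-bit handle hash in the high half,
 * 32-bit in-file block offset in the low half. */
static uint64_t logical_block(const unsigned char *fh, size_t fh_len,
                              uint32_t block_offset)
{
    return ((uint64_t)fh_hash32(fh, fh_len) << 32) | block_offset;
}

int main(void)
{
    unsigned char fh[] = { 0xde, 0xad, 0xbe, 0xef };  /* fake handle */
    printf("lbn = %016llx\n",
           (unsigned long long)logical_block(fh, sizeof(fh), 7));
    return 0;
}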
3.3 SSM-Based Cache Management

To make use of SSM for guiding cache prefetch and cache replacement, we need to define the way SSM interacts with the cache (including cache replacement and cache prefetch). Unlike conventional designs where prefetch and cache replacement are mixed together, we isolate the prefetch interface from the replacement interface. Both SSM-based prefetch and replacement are able to access stable sets to guide their optimizations through the interfaces of SSM. Cache management is described as three distinct operations: cache access, cache replacement, and cache prefetch.

3.3.1 Cache Access

When a data block is accessed, we first determine whether it is in the cache. If it is, the data is returned. It should be noted that the search operation is done automatically by NFS, and we just record the access to the data (FH, offset, and length). When the user daemon replays the access trace, we update the cache state, check state transitions of stable sets, and do data prefetch. Finally, we update the last access time of the stable set if the data block belongs to any stable set. Moreover, our data management uses a hash list to reduce space overhead.

3.3.2 Cache Replacement

Stable set temporal-locality can be used to enhance any replacement algorithm, because stable set temporal-locality is orthogonal to the theoretical bases of all replacement algorithms, including temporal locality, access frequency and so on. The replacement procedure is divided into three interfaces.

The cache hit interface (S3 in Fig.2) only cares about hits. It first updates the cache management metadata according to the hit information, and then updates the states of all stable sets. If the interval from the last update time of a stable set is more than the maximum active interval, the active set turns inactive. When a miss occurs, we simply leave it to prefetch.

The cache allocate interface (S2 in Fig.2) processes data blocks successfully prefetched, which have been cached in the underlying physical cache. If a block belongs to a stable set, we insert it into the corresponding stable set. According to the current design of prefetch, an NFS read does not write to the CacheFiles cache disk, so CacheFiles caches only the prefetched data.

The cache release interface (S5 in Fig.2) does cache resource recycling, based on the cull flag from CacheFiles. It first uses the statvfs call to detect free space in the cache; then, according to the threshold of the cache system settings, it selects inactive data blocks to release; and finally it uses the CacheFiles cull interface to release the physical resource blocks. Based on SSM, the replacement policy will only remove data blocks of inactive stable sets, providing more space for active stable sets and improving cache utilization.
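The release path can be pictured as follows (a sketch under our own naming; the threshold and the helper are hypothetical, not SAC's configuration): statvfs(3) reports the free space of the cache directory, and when it falls below the threshold, only blocks of inactive stable sets are culled.

/* Sketch of the cache release path (illustrative; helper name and
 * threshold are hypothetical). */
#include <sys/statvfs.h>
#include <stdio.h>

#define FREE_THRESHOLD_PCT 10  /* assumed cull threshold: 10% free */

static void cull_inactive_sets(void)
{
    printf("releasing blocks of inactive stable sets\n");  /* stub */
}

/* S5: check the physical state of the cache directory and trigger
 * replacement when free space drops below the threshold. */
void cache_release(const char *cache_dir)
{
    struct statvfs st;
    if (statvfs(cache_dir, &st) < 0) {
        perror("statvfs");
        return;
    }
    /* Percentage of free blocks in the cache file system. */
    unsigned long long free_pct =
        (unsigned long long)st.f_bavail * 100 / st.f_blocks;
    if (free_pct < FREE_THRESHOLD_PCT)
        cull_inactive_sets();  /* only inactive sets become victims */
}

int main(void)
{
    cache_release("/var/cache/sac");  /* hypothetical cache directory */
    return 0;
}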
3.3.3 Cache Prefetch

Stable set prefetch will fetch the entire stable set of data blocks, and any access not belonging to any stable set will fetch one stable set granularity of data (S4 in Fig.2). The SSM-based prefetch method monitors the access distance within a stable set to predict state transitions of stable sets. If the access interval of an inactive stable set is less than its maximum active interval, the prediction is that the stable set is turning into the active state, and we need to prefetch all data blocks of the stable set.

SSM-based prefetch decides how to issue prefetch based on the state of a stable set, rather than on a single access. This principle is the biggest difference from conventional prefetch. Prefetching the whole stable set in the active state into the cache reflects the very nature of stable sets — the always-accessed-together feature. The whole stable set of data will be repeatedly accessed together during a relatively long period.

4 Experimental Evaluation

To evaluate the performance of SAC, we implemented a prototype in Linux 2.6.32, ported ext4's punch-hole function from Linux 3.0, and compared our scheme with the original CacheFiles. The results are shown below.

4.1 Experimental Setup

The experiment platform is based on BWFS, including three nodes: the data storage server (DSS), the metadata server (MDS), and the client. Though our design assumes an SSD, as the client machine has 12 GB memory, we simply use a RAM disk of 4 GB as the cache space. The experimental setup is detailed in Table 1.

Table 1. Experimental Setup

         CPU                             Memory (GB)   Network    OS
Client   Intel® Xeon E5620 2.40 GHz      12            Gigabit    Linux 2.6.32
MDS      Intel® Xeon E5620 2.40 GHz       2            Gigabit    Linux 2.6.32
DSS      Intel® Pentium E6500 2.93 GHz    2            Gigabit    Linux 2.6.32

We choose several conventional testing tools[13] to measure their execution time. These testing tools represent different access modes: 1) Cat is a Linux tool to output the content of a file; 2) Grep② is a Linux tool to search a collection of files for lines containing a match to a given regular expression; 3) Diff is a Linux tool that compares two files for differences: it sequentially reads two files and then compares them; 4) Cp sequentially reads from one file while writing to another; 5) Cscope③ is an interactive utility that allows users to view and edit parts of the source code relevant to specified program items with the aid of an index database; 6) Glimpse④ is a text information retrieval tool, searching for keywords through large collections of text documents; it builds approximate indices for words and searches relatively fast with small index files; 7) Fio⑤ is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user; 8) IOzone⑥ is a file system benchmark tool that generates and measures a variety of file operations.

First we mount a directory of BWFS to the client, and then run all experiments. The first six tools are used in Subsections 4.2 and 4.3 as basic experiments to show the advantages of SSM. As Fio reports all sorts of I/O performance information, including complete I/O latencies, we use it to test random read in Subsection 4.4. IOzone is useful for performing a broad file system analysis, so we use it in the overall file system test in Subsection 4.5.

② http://www.gnu.org/software/grep/, Jan. 2014.
③ http://cscope.sourceforge.net, Jan. 2014.
④ http://freecode.com/projects/glimpse, Jan. 2014.
⑤ http://freshmeat.com/projects/fio, Jan. 2014.
⑥ http://www.iozone.org, Jan. 2014.
Because SSM needs a stable set file, we first run each test to produce stable sets. Between any two consecutive runs, the buffer cache is emptied to ensure that the second run does not benefit from cached data. Then we run all experiments again. Every test is run three times to get an average value.

4.2 Effectiveness of File Block

The original CacheFiles (orig) uses file-level management in cachefilesd and cannot prefetch data on its own, just passively accepting requests from NFS at page granularity. In SAC, SSG is the basic unit of cache management (orig/ssg). We first compare the management methods of orig and orig/ssg. Table 2 lists all the experiments.

Table 2. Basic Experiments

Tool      Experiment
Cat       Traverse a glibc source code directory. The size of the source code is 142 MB.
Grep      Traverse a glibc source code directory, and search 32 system keywords. The size of the source code is 142 MB.
Diff      Compare two source code directories of glibc. The size of the dataset is 288 MB.
Cp        Copy a Linux source code directory to the local disk. The size of the source code is 545 MB.
Cscope    Traverse three glibc source code directories, and search 32 keywords. The size of the source code and index is 500 MB.
Glimpse   Traverse three glibc source code directories, and search 32 type keywords. The size of the source code is 393 MB.

Note: all source code directories are in the mounted directory.

Fig.4 shows that the effectiveness of orig/ssg is better: response time is reduced by 17%∼70%. Because the size of SSG (obtained by mining) is 64 KB, it is bigger than the 4 096-byte page size, which provides better locality. So the management of SAC is feasible.

Fig.4. Execution time of different managements. Orig/ssg can fetch a block of SSG size. SSG means a bigger size that can effectively capture the long-term block referencing behavior.

4.3 Performance of Prefetch

We design stable set prefetch (ssetpref) to enhance CacheFiles (orig/ssetpref) and compare orig/ssetpref with orig/ssg. All the experiments are listed in Table 2, the same as the experiments of Subsection 4.2. Fig.5 shows that the effectiveness of orig/ssetpref is better: the response time is reduced by 13.7%∼54.3%.

Fig.5. Execution time of the six benchmarks. For a request, the ssg method just fetches a block of SSG size; the ssetpref method can prefetch a set of blocks.

In each test, we find only one stable set, which covers the whole amount of data, and the whole stable set is repeatedly accessed. Essentially, Fig.5 shows the effectiveness of SSM-based prefetch. On the one hand, the overall optimization comes from aggregating a set of blocks into one read request instead of issuing a single request at a time, which helps reduce the number of RPC interactions. On the other hand, the ssetpref method can prefetch data in time and reduce the miss ratio. Therefore, SSM-based prefetch is reasonable.

4.4 Performance of Replacement

When the cache lacks free data blocks, we need to recycle cache blocks. To demonstrate the design of the SSM-based replacement algorithm, we choose the Fio random read test, which can produce enough data to fill the cache. Although the original cachefilesd uses file-level LRU management, we directly use SSM to guide the LIRS algorithm, given the advantages of LIRS[9]. Adding SSM temporal-locality to the LIRS replacement algorithm, we get the SSLIRS replacement algorithm, whose main principle remains the same as LIRS.

Fio randomly reads 5 GB of file data, 2 MB per file. First Fio reads a directory of 3 GB of data (phase 1), during which there is no replacement; then it sleeps 10 seconds (phase 2), and reads another directory of 2 GB of data (phase 3). At around 600 s SAC triggers replacement. Fig.6 shows the result: LIRS plus prefetch consumes 1 043 s, while SSLIRS plus prefetch consumes 816 s, reducing time by 21.76%.

Fig.6. Comparison of execution time. These two executions are the same until replacement occurs.

SSLIRS can adjust the cache state in time and pro-actively move inactive sets to HIR, so that when the available cache is insufficient, the candidate set to be replaced can be found. The original LIRS is not aware of such a situation; LIRS replacement is only reactive. As soon as the candidate set is determined, SSM-based replacement can free up space in time, which is a radical change in the process, different from the gradual process of conventional replacement algorithms.

Conventional replacement algorithms are not aware of the SSM-based data access rule, so victim selection is based only on the current state of cache data and history information. SAC can quickly swap out inactive sets without affecting the active sets, so SSM-based replacement is reasonable.
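The degrade step that SSLIRS adds to LIRS can be sketched as follows (our simplified rendering, not the authors' implementation): a periodic scan demotes every block of a stable set from LIR to HIR once the set has been idle longer than its MAI, so ordinary LIRS victim selection finds the candidates immediately.

/* Simplified sketch of the SSM-guided degrade step added to LIRS
 * (illustrative; structures and names are ours, not SAC's code). */
#include <stdio.h>

struct block { struct block *next; int lir; /* 1 = LIR, 0 = HIR */ };

struct sset {
    struct block *blocks;        /* blocks belonging to this stable set */
    double last_access, mai;
};

/* Move every block of an inactive set to HIR so LIRS can evict it. */
void sslirs_degrade(struct sset *s, double now)
{
    if (now - s->last_access <= s->mai)
        return;                  /* still active: leave it alone */
    for (struct block *b = s->blocks; b; b = b->next)
        b->lir = 0;              /* demote to HIR: eviction candidate */
}

int main(void)
{
    struct block b2 = { 0, 1 }, b1 = { &b2, 1 };
    struct sset s = { &b1, /*last_access=*/0.0, /*mai=*/5.0 };
    sslirs_degrade(&s, 10.0);    /* idle 10 s > MAI 5 s: demote both */
    printf("b1.lir=%d b2.lir=%d\n", b1.lir, b2.lir);
    return 0;
}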
4.5 Performance of File System Benchmark

To test the overall effectiveness of SSM-based cache management, we run the IOzone file system benchmark. IOzone measures file I/O performance by generating particular types of operations in batches: it creates a single large file and performs a series of operations on that file. We measure performance by running two IOzone random read tests back to back, varying the request block size from 4 KB to 5 MB. The size of each file is 3 GB.

Fig.7(a) shows that during phase 1 SAC improves throughput by 8.67%∼936%, with the 4 096-byte case being the most remarkable. Fig.7(b) shows that while phase 2 is going on, replacement occurs and SAC swaps out inactive stable sets accessed only in phase 1, resulting in improvements of 1.93%∼644%. We also measure the execution time: Fig.7(c) shows the overall execution time is reduced by 0.3%∼84%.

Fig.7. Throughput of IOzone with different request block sizes and the whole execution time. (a) Throughput of phase 1. (b) Throughput of phase 2. (c) Execution time of all phases.

The overall effectiveness is as follows: when no replacement occurs, SSM-based prefetch reduces network overhead by aggregating related reads; replacement can swap out inactive data in time, without affecting data being accessed. So SSM-based cache management can bring effective optimization to the client cache. Because throughput mainly depends on the sequentiality of disk requests, with increasing request block size there are many sequential accesses to disks and the effectiveness of SSM becomes limited.

4.6 Evaluation of Overhead

To illustrate the integration overhead caused by SSM, we do not cache any data and compare SAC with the original CacheFiles system. Although SAC has no data cached, it still maintains the cache state and replays the access trace. Here, as an example, we present the results of the experiments in Subsection 4.3. Fig.8 shows that compared with the original CacheFiles, the overheads of SAC are 8.8%, 0.695%, 0.659%, 0.811% and 1.77%, respectively.
So we think that, with faster computing speeds, the integration of SSM does not bring too much negative impact.

Fig.8. Five applications' execution time under SAC and CacheFiles.

5 Discussion

In all experiments, the effectiveness of SSM-based prefetch is the most obvious. SSM-based replacement plays a significant role when large numbers of stable sets become inactive.

In our implementation, the mining algorithm is offline and we must run applications to get stable sets. Because of the repeatedly-accessed feature captured by stable set temporal-locality, there is no need to mine the access trace continuously; re-mining stable sets at some interval is a feasible solution. We leave research on the time regularity of stable sets as future work.

There are several limitations in our work. First, prefetch in this paper is sometimes aggressive, and prefetch throttling is simple. The prefetched data may be useless if applications read slowly. It is worthwhile to evaluate prefetch's utility, so we will focus on actual control of prefetch in future work. Second, CacheFiles is only a read-only cache, because FS-Cache for NFS in Linux 2.6.32 does not support write operations. Third, IOzone supports multiple threads, but because obtaining data early for some thread can result in early termination of IOzone, and SSM-based prefetch does get data early, we choose just one thread.

6 Related Work

Cache Systems. Caching has been used widely in distributed file systems to improve performance. AFS[3] implements a cache manager called Venus at the client to improve system performance. Venus duplicates entire files from the remote file server to the local disk of the client as replicas and carries out all client read and write operations on the local replicas. Nache[1] is an NFSv4-oriented caching proxy system based on the NFSv4 implementation of proxy support. Nache includes the Nache server and the Nache client. Nache enables the cache and the delegation to be shared among a set of local clients, thereby reducing conflicts and improving performance. In Nache, CacheFS[2] is used primarily to act as an on-disk extension of the buffer cache. Bwcc[14] is a cooperative caching system for network storage systems, using disks as its cache media through FS-Cache[2]. Dm-cache[15], a general block-level disk cache, can be transparently plugged into a client for any storage system, and supports dynamic customization for policy-guided optimizations. In order to reduce the idle power caused by main memory in web server platforms, FlashCache[16] uses a two-level file buffer cache composed of a relatively small DRAM and a flash memory, and it needs DMA transactions to transfer flash memory content from/to DRAM.

Cache Algorithms. There is also a lot of research on cache algorithms[7-9], but it lacks consideration of transition faults, which are important in large-capacity cache management. Traditional prefetch and replacement algorithms cannot predict data access across transitions, leading to low efficiency of cache systems when there is a lot of data to be exchanged. DiskSeen[17] performs more accurate prefetch and achieves more continuous streaming of data from disk by efficiently tracking disk accesses. However, compared with the always-repeatedly-accessed feature of stable sets, DiskSeen just exploits tuned linear sequences of address accesses.
7 Conclusions

The cache management of the existing CacheFiles cannot be aware of radical changes in data references and is limited by phase transitions. The performance of CacheFiles can be significantly improved by making good use of SSM, which in turn proves the effect of phase transitions. SSM-based prefetch and replacement both benefit from the prediction of cache state by SSM. SSM is orthogonal to traditional cache management algorithms and can provide enough information about data correlations to guide cache management. Our implementation of the SAC scheme shows the effectiveness and feasibility of SSM in real systems and provides a preliminary basis for managing other large-capacity client caches for big data and cloud computing.

In the future, we will focus on the time regularity of stable sets, which can provide a more accurate prediction of the states of stable sets. Prefetch control and system optimization will also be included in our future work.

References

[1] Gulati A, Naik M, Tewari R. Nache: Design and implementation of a caching proxy for NFSv4. In Proc. the 5th USENIX Conf. File and Storage Technologies, Feb. 2007, pp.199-214.
[2] Howells D. FS-Cache: A network filesystem caching facility. In Proc. the Linux Symposium, July 2006, pp.424-440.
[3] Howard J H, Kazar M L, Menees S G et al. Scale and performance in a distributed file system. ACM Transactions on Computer Systems, 1988, 6(1): 51-81.
[4] Satyanarayanan M, Kistler J J, Kumar P et al. Coda: A highly available file system for a distributed workstation environment. IEEE Trans. Computers, 1990, 39(4): 447-459.
[5] Yang D, Huang H, Zhang J et al. BWFS: A distributed file system with large capacity, high throughput and high scalability. J. Computer Research and Development, 2005, 42(6): 1028-1033. (In Chinese)
[6] Denning P J. Working sets past and present. IEEE Transactions on Software Engineering, 1980, 6(1): 64-84.
[7] Megiddo N, Modha D S. ARC: A self-tuning, low overhead replacement cache. In Proc. the 2nd USENIX Conference on File and Storage Technologies, March 2003, pp.115-130.
[8] Johnson T, Shasha D. 2Q: A low overhead high performance buffer management replacement algorithm. In Proc. the 20th Int. Conf. Very Large Data Bases, Sept. 1994, pp.439-450.
[9] Jiang S, Zhang X D. LIRS: An efficient low inter-reference recency set replacement policy to improve buffer cache performance. In Proc. the 2002 ACM SIGMETRICS, June 2002, pp.31-42.
[10] Guo M Y, Liu L, Zhang Y L et al. Stable set model based methods for large-capacity client cache management. In Proc. the 14th HPCC, June 2012, pp.681-690.
[11] Butt A R, Gniady C, Hu Y C. The performance impact of kernel prefetching on buffer cache replacement algorithms. In Proc. ACM SIGMETRICS Int. Conf. Measurement and Modeling of Computer Systems, June 2005, pp.157-168.
[12] Sivathanu M, Prabhakaran V, Popovici F I et al. Semantically-smart disk systems. In Proc. the 2nd USENIX Conference on File and Storage Technologies, March 2003, pp.73-88.
[13] Traeger A, Zadok E, Joukov N et al. A nine year study of file system and storage benchmarking. ACM Transactions on Storage, 2008, 4(2): Article No.5.
[14] Shi L, Liu Z J, Xu L. BWCC: A FS-cache based cooperative caching system for network storage system. In Proc. the 2012 IEEE CLUSTER, September 2012, pp.546-550.
[15] Van Hensbergen E, Zhao M. Dynamic policy disk caching for storage networking. Technical Report RC24123, IBM Research Division, Austin Research Laboratory, http://citeseerx.ist.psu.edu/showciting?cid=19808002, Jan. 2014.
[16] Kgil T, Mudge T. FlashCache: A NAND flash memory file cache for low power Web servers. In Proc. the 2006 CASES, October 2006, pp.103-112.
[17] Ding X N, Jiang S, Chen F et al. DiskSeen: Exploiting disk layout and access history to enhance I/O prefetch. In Proc. USENIX Annual Technical Conference, June 2007, Article No.20.

Jian-Liang Liu received his M.S. degree in computer science from China University of Geosciences, Beijing, in 2010. He is currently a Ph.D. candidate at the Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS), Beijing. His research interests include block-level storage and cache management.

Yong-Le Zhang received his M.S. degree in computer science from ICT, CAS, in 2013. He is now a Ph.D. candidate at the University of Toronto, Canada. His research interests include network storage and cache management.

Lin Yang received her B.S. degree in computer science from Beijing Language and Culture University, China, in 2011. She is now an M.S. candidate at ICT, CAS. Her research interests include network storage and cache management.

Ming-Yang Guo received his Ph.D. degree in computer architecture from ICT, CAS, Beijing, in 2012. He is now an assistant professor at the Data Storage and Management Technology Research Center, ICT, CAS. His research interests include distributed RAID and cache management.

Zhen-Jun Liu received his Ph.D. degree in computer architecture from ICT, CAS, Beijing, in 2006. He is currently an associate professor at the Data Storage and Management Technology Research Center, ICT, CAS. His research interests include WAN storage, distributed file systems, and RAID management.

Lu Xu received his M.S. degree in computer architecture from the University of Tokyo, Japan, in 1989 and his Ph.D. degree in computer systems software from Purdue University, USA, in 1995. He is currently a professor at the Data Storage and Management Technology Research Center, ICT, CAS. His research interests include computer architecture, high performance network storage, and computer system software.