Journal of Visual Communication and Image Representation 10, 197–218 (1999)
Article ID jvci.1999.0420, available online at http://www.idealibrary.com

Data Allocation and Dynamic Load Balancing for a Distributed Video Storage Server∗

Shiao-Li Tsao,†,‡ Meng Chang Chen,†,1 Ming-Tat Ko,† Jan-Ming Ho,† and Yueh-Min Huang‡

†Institute of Information Science, Academia Sinica, Taiwan; ‡Department of Engineering Science, National Cheng Kung University, Taiwan
E-mail: [email protected]

Received January 20, 1998; accepted March 8, 1999

In this paper, a novel initial video allocation scheme and a dynamic load balancing strategy are proposed for a distributed video storage server in order to increase availability and reduce operation cost. The initial allocation scheme determines the placement of video replicas on the servers so as to achieve static load balance and to obtain a configuration amenable to efficient dynamic load adjustment. Simulation results show that the proposed load shifting algorithm reduces the request fail rate by up to 50% compared with the same initialization algorithm without load shifting. The proposed initial allocation with load shifting also reduces the request fail rate by 25% to 60% relative to the least-load-first initial allocation scheme with load shifting and, relative to the DASD dancing method, reduces the request fail rate by 5% to 10% and the number of shifting steps by 5% to 25%. Moreover, a prototype is implemented on Windows NT to examine the correctness and practicability of the proposed schemes. © 1999 Academic Press

1. INTRODUCTION

Recent advances in computing and communication technologies enable distributed multimedia applications in which video storage servers play an important role. One of the important issues in designing video servers is scalability [1–3]. Recently, distributed video (storage) servers composed of low-end computers have been proposed to address this issue [4–8]. Even though the video servers as a whole have sufficient computing power to serve all the current user requests, some user requests may fail because the computers holding the desired videos do not have the available computing power to handle them. As the low-end computers have limited storage space and computing power, it is extremely important to properly allocate the video files and balance the user requests among them. A statistical result of long-term observation of user behaviors blended with domain expertise, called the expected request pattern, is used to guide the video file allocation process. However, as user requests for videos are dynamic in nature, the actual request pattern is skewed from the expected request pattern [9–11], which results in an increase of request failures. One possible data allocation solution is to stripe every video file over all servers [12, 13]. The advantage of this approach is that it achieves maximum load balance, since user requests can be served by any available server. However, the system is not reliable, since the failure of one server results in the failure of the whole system.

∗ This research is partially supported by NSC under Grant NSC86-2213-E-001-022. This article was originally part of the Special Section on Multimedia Storage and Archiving Systems, which appeared in the Journal of Visual Communication and Image Representation, Vol. 9, No. 4, December 1998.
1 Corresponding author.
System synchronization overhead and a complex system control mechanism are also drawbacks of this approach. Alternatively, we can replicate videos on several servers so that requests can be migrated to other servers during server failure or for load balance purposes [14, 3]. The cost of this approach is extra disk space [7]. The fundamental policies of data replication have been explored in a previous study [11]. Serpanos et al. [15] proposed a data replication scheme for distributed multimedia servers to achieve both load and storage balance. Under the assumption that the actual requests are identical to the expected load, they focused on the initial allocation of video replicas to achieve static load balance. However, the actual requests may not be identical to the expected demands for two reasons. First, the expected load is calculated and forecast from long-term statistical data that may not represent the short-term request pattern. Second, the access probabilities of some videos may not be accurately predicted (e.g., the actor in the movie won an award yesterday). The discrepancy between actual and expected request patterns induces request failures. Wolf et al. [16] proposed the DASD dancing algorithm to balance the user requests of a multidisk server. They modeled disks as nodes of a graph, and a pair of replicas of the same video on two disks as an edge between the corresponding nodes. Their initial allocation algorithm tries to reduce the diameter of the graph. They then proposed an on-line load balance scheme for hard disks, called DASD dancing, which migrates progressing requests among disks. However, the DASD dancing scheme assumes each replica of a video has the same access probability to simplify the problem, so the graph they construct is undirected and unweighted. Moreover, it does not consider reducing the number of shifting steps (i.e., request migrations) when performing dancing, as the cost of migration is negligible in a single-server environment. In a distributed environment, each migration of a user request from one server to another pays the cost of control message passing, admission control exercising, and job rescheduling. The cost varies from environment to environment, and sometimes it is too large to be ignored. In this paper, we elaborate a novel initial data allocation algorithm to obtain high connectivity between servers, together with a dynamic load balancing strategy, called load shifting, to efficiently migrate progressing requests in a distributed environment. The major differences between DASD dancing [16] and our approach are that our approach considers the access probability of video replicas to optimize the connectivity between servers, and it also reduces the number of shifting steps. We discuss the implementation issues and prototype our scheme on Windows NT to prove its practicability. The rest of the paper is organized as follows. The architecture of a distributed video server, the basic concept of the load shifting procedure, and the notation used in the paper are described in Section 2. The initial data allocation and reallocation algorithms are proposed and discussed in Section 3. The load shifting algorithm is presented in Section 4. In Section 5, the simulation results and implementation issues are discussed. Finally, we conclude the paper in Section 6.

FIG. 1. The environment of a video service system.

2. BASIC CONCEPTS OF LOAD SHIFTING

2.1. System Architecture
The environment of a distributed video service system is depicted in Fig. 1. The elements of the system include tertiary storage servers, disk-based video storage server clusters, a backbone network (e.g., ATM), an access network (e.g., HFC), and end-user equipment. All of the video data are stored in the tertiary storage servers, and hot videos are cached in the disk-based video server clusters. In this paper, we focus on the disk-based video server cluster. We apply file replication among servers within a cluster. A video file is the basic unit of replication and may be striped over all the disks within a video server. In other words, a video file is never striped across the network. A cluster of servers shares a single name, called the cluster name, which can be an IP address or a domain name. Clients request video services by referring to this unique cluster name. The request is received by all servers, but only one server within the cluster responds to it. A progressing request may be migrated to other servers within the same cluster dynamically. In this research, we extend the OneIP approach proposed by Damani et al. [17], a technique by which the servers within a cluster share a single IP address. We describe OneIP in detail and present its extension to support our scheme in Section 5. Before giving a formal description of the problem and our solution, we first illustrate the complete processing flow of a user request in the distributed video service system. A user first requests the video service by using the cluster name. As described above, the request to a cluster is received by all servers, and only one server within the cluster handles it. In order to achieve load balancing, progressing requests may be migrated to other servers within the same cluster. As the migration is transparent to the user, the user uses the same cluster name during the entire service. Occasionally, the hand-off procedure of a migrated request may not execute smoothly due to the timing difference between the two servers, so a certain amount of buffer on the client side is required. Lee studied the synchronization problems between multiple servers delivering data to a client and classified the architectures into three approaches [18]: proxy-at-server, independent proxy [19], and proxy-at-client. Our design for the synchronization problem is essentially a proxy-at-client solution.

2.2. Concepts of Load Shifting

In a video storage server cluster, when a request for a particular video arrives and no server holding the desired video has sufficient computing power to serve it, the request is blocked. A blocked request can only be served when one of the servers holding the desired video reclaims sufficient resources previously allocated to other requests. The basic concept of the load shifting scheme is that a blocked request on a server can be admitted by shifting a progressing request on that server to another server within the same cluster; the shifted request can be admitted by the other server immediately or after one or more further shifts. Note that shift operations can act as a chain reaction until all the shifted requests are admitted. In Fig. 2, the cluster consists of four video servers; each server can store two videos and serve up to four requests simultaneously. Assume that all servers except Server 4 are fully utilized.
Meanwhile, a request for Video 1 arrives. Video 1 is stored only on Server 1, but Server 1 has no resources to serve the request. Server 1 therefore migrates a request for Video 2 to Server 2, Server 2 migrates a request for Video 3 to Server 3, and Server 3 in turn migrates a request for Video 4 to Server 4. Finally, the new request for Video 1 can be admitted without dropping any request.

FIG. 2. Video i/j means j requests on video i.

Before describing our proposed initial allocation strategy and shifting scheme, we fix some notation. Let $S$ and $M$ denote the number of servers and the number of videos in the system, respectively. For a video server cluster, all the video files are stored in the tertiary storage servers, and a limited number of videos are cached on the video servers within a cluster. We assume that the servers are identical, each with disk storage capacity of $C$ videos and the computing capability to serve a maximum of $A_{\max}$ users. For a large-scale system, $M$ is usually much larger than $C \cdot S$, i.e., $M \gg C \cdot S$, and only the $K$ hottest videos are cached in the cluster. In order to improve the availability of a video server cluster, some videos may be replicated several times and stored on different servers. We define $V_{i,j}$ as

$$V_{i,j} = \begin{cases} 1, & \text{if a copy of video } i \text{ is stored on server } j; \\ 0, & \text{otherwise.} \end{cases}$$

$P_{i,j}$ denotes the number of progressing streams of video $i$ on server $j$, with the following properties: for any server $j$, $\sum_{i=1}^{M} P_{i,j} \le A_{\max}$, and if $P_{i,j} \ge 1$, then $V_{i,j} = 1$. The first property means that the number of requests being served on any server cannot exceed its total service capacity. The second property means that video $i$ can have progressing streams on server $j$ only if server $j$ holds video $i$.

When a new request arrives at the cluster and none of the servers holding the desired video is available, the load-shifting procedure tries to adjust the load of the servers to accommodate the new request. For example, suppose a new request for video $V_{req}$ arrives while no server holding $V_{req}$ is available; the request is blocked. The shifting algorithm then proceeds to find a feasible shifting path in order to admit the request. A request is called failed if it is blocked and no feasible shifting path can be found. We define a shifting path for a new request on video $V_{req}$ as

$$SP(V_{req}) = \{(NULL, V_{req}, S_{i_1}), (S_{i_1}, V_{j_1}, S_{i_2}), (S_{i_2}, V_{j_2}, S_{i_3}), \ldots, (S_{i_{d-1}}, V_{j_{d-1}}, S_{i_d})\},$$

where the triple $(S_i, V_j, S_{i'})$ is a shifting step denoting that a progressing stream of video $V_j$ on server $S_i$ is migrated to server $S_{i'}$. The first tuple $(NULL, V_{req}, S_{i_1})$ in the shifting path shows that the new request is assigned to server $S_{i_1}$. A feasible shifting path satisfies the following properties:

• For every tuple $(S_i, V_j, S_{i'})$ in the shifting path, $V_{j,i} = 1$, $P_{j,i} \ge 1$, and $V_{j,i'} = 1$. That is, for any shifting step, both the server migrating out a request and the server taking over the request hold the video of the migrated request.
• $\sum_{i=1}^{M} P_{i,S_{i_d}} < A_{\max}$. That is, the number of progressing streams on the last server in the shifting path must be less than the maximum number of streams supported by a server.
• The numbers of streams being served on the servers, excluding the last one in the shifting path, are unchanged after the shifting procedure is performed.
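Section 4 selects the shortest feasible shifting path by breadth-first search over the connection graph. As a concrete illustration of the definition above, the following minimal Python sketch (our own rendering; the function name, matrix layout, and the omission of the run-time connectivity tie-break are assumptions, not the paper's exact procedure) searches for a shortest feasible shifting path:

```python
from collections import deque

def find_shifting_path(V, P, A_max, v_req):
    """Breadth-first search for a shortest feasible shifting path.

    V[i][j] = 1 if video i has a replica on server j.
    P[i][j] = number of progressing streams of video i on server j.
    Returns a list of tuples (src_server, video, dst_server); the first
    tuple has src_server = None, meaning the new request is assigned to
    that server. Returns None if the request must fail.
    """
    M, S = len(V), len(V[0])
    load = [sum(P[i][j] for i in range(M)) for j in range(S)]

    # Start from every server that holds the requested video.
    queue, visited = deque(), set()
    for j in range(S):
        if V[v_req][j]:
            queue.append((j, [(None, v_req, j)]))
            visited.add(j)

    while queue:
        j, path = queue.popleft()
        if load[j] < A_max:          # spare capacity: the path is feasible
            return path
        # Otherwise try to push one progressing stream of some video
        # from server j to another server holding a replica of it.
        for i in range(M):
            if P[i][j] < 1:
                continue
            for k in range(S):
                if k != j and V[i][k] and k not in visited:
                    visited.add(k)
                    queue.append((k, path + [(j, i, k)]))
    return None                      # blocked request fails
```

Because BFS explores servers in order of distance from the assigned server, the first server found with spare capacity terminates a shortest path, matching the second feasibility property; every intermediate server sheds one stream and gains one, matching the third.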
3. INITIAL ALLOCATION STRATEGY

To increase the service availability of a distributed video storage server, the first issue is to design an initial data allocation algorithm that achieves static load balance among the servers and improves the possibility of finding feasible shifting paths for blocked requests. The next issue is to develop an efficient load shifting algorithm that reduces the number of shifting steps. Suppose each video, say video $V_i$, has an expected access probability $Pr_i$. We want to determine the number of replicas for each video, given the expected access probabilities of the $K$ cached videos and the storage capacity of the video server cluster, i.e., $C \cdot S$. Intuitively, the number of replicas of a video should be proportional to its access probability, since a video with more replicas on different servers has higher availability. Since storing more than one copy of a video on the same server cannot improve its availability, an obvious constraint is that the number of replicas of a video must not exceed the number of video servers, $S$. Another constraint is the storage limitation of the servers. The numbers of replicas of videos are integers, so the apportionment of the $C \cdot S$ units of storage space to the $K$ cached videos by their access probabilities is an integer resource allocation problem [20]. Several fair integer resource allocation algorithms were investigated in [20, 21]; they determine the number of replicas of each video under the above constraints. Roughly, the number of replicas of video $V_i$ can be calculated as

$$R_i = \frac{Pr_i}{\sum_{j=1}^{K} Pr_j} \cdot C \cdot S.$$

For the details of the resource allocation algorithms, readers can refer to [20]. After deciding the number of replicas of each video, we further apportion the access probability of the video among its replicas. Let $Pr_i^j$ denote the access probability of the $j$th copy of the $i$th video. The sum of the access probabilities of all the replicas equals the access probability of the video, i.e., $Pr_i = \sum_{j=1}^{R_i} Pr_i^j$. The initial data allocation can be represented as a directed graph, called the connection graph. Each server is a node in the graph, and an edge (or a connection) between two servers indicates that there is a video with replicas stored on both servers. Each edge has a weight, defined as the access probability of the replica stored at the out-going node of the edge. In other words, the weight of an edge is the probability that a request will be shifted from the out-going node to the in-coming node. The edges always occur in pairs, but the two edges of a pair may have different weights if the replicas have different access probabilities. For example, the connection graph in Fig. 3 shows the access probability allocation of the videos in Table 1. A blocked request can be admitted only if there is a feasible shifting path from the designated server. To increase the possibility of finding feasible shifting paths for a blocked request, it is favorable for the designated server, as well as its neighbors, to have many out-going edges with high access probabilities. We define the connectivity of a server as the total access probability carried by its out-going edges; i.e., the connectivity $C_l$ of the $l$th server is defined as

$$C_l = \sum_{i=1}^{M} \sum_{k=1, k \ne l}^{S} V_{i,l} \cdot V_{i,k} \cdot Pr_i^{n(i,l)},$$

where $n(i, l)$ is a mapping function denoting that the $n(i, l)$th replica of the $i$th video is stored on the $l$th server.
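To make the definition concrete, the sketch below (our own illustration; the names and matrix layout are assumed) computes $C_l$ for every server, counting one out-going edge per peer replica. The example data are patterned after Table 1, where every video has one 0.03 replica and one 0.01 replica.

```python
def connectivities(V, pr_rep):
    """Compute the connectivity C_l of each server.

    V[i][l]      = 1 if video i has a replica on server l.
    pr_rep[i][l] = access probability of the replica of video i on
                   server l (0.0 if no copy), i.e. Pr_i^{n(i,l)}.
    """
    M, S = len(V), len(V[0])
    C = [0.0] * S
    for l in range(S):
        for i in range(M):
            if not V[i][l]:
                continue
            # One out-going edge to every *other* server holding video i.
            peers = sum(V[i][k] for k in range(S) if k != l)
            C[l] += peers * pr_rep[i][l]
    return C

# Replica probabilities patterned after Table 1 (4 videos, 4 servers).
pr = [[0.03, 0.01, 0.00, 0.00],
      [0.00, 0.03, 0.01, 0.00],
      [0.00, 0.00, 0.03, 0.01],
      [0.01, 0.00, 0.00, 0.03]]
V = [[1 if p > 0 else 0 for p in row] for row in pr]
print(connectivities(V, pr))   # every server: 0.03 + 0.01 = 0.04, balanced
```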
As the total connectivity of all the servers is fixed for a given $\{R_i, 1 \le i \le M\}$ and $\{Pr_i^j, 1 \le i \le M \text{ and } 1 \le j \le R_i\}$, it is desirable to balance the connectivity of the servers. We therefore take as the objective function the maximization of the product $C_1 \cdot C_2 \cdot C_3 \cdots C_S$, since this product is maximal when $C_1 \approx C_2 \approx C_3 \approx \cdots \approx C_S$.

TABLE 1
The Access Probabilities of Video Replicas

            Server #1   Server #2   Server #3   Server #4   Total access probability
Video 1       0.03        0.01                                       0.04
Video 2                   0.03        0.01                           0.04
Video 3                               0.03        0.01               0.04
Video 4       0.01                                0.03               0.04

FIG. 3. The connection graph for the initial allocation shown in Table 1.

Once the number of replicas of each video has been determined by an apportionment algorithm [20] and the access probability of each replica has been obtained by an access probability allocation algorithm, we can allocate these replicas to servers. An access probability allocation algorithm apportions the access probability of one video among its replicas. Several such algorithms have been proposed, such as the uniform apportionment of access probability over all replicas, i.e., $Pr_i^j = Pr_i / R_i$, by [16], or the nonuniform apportionment algorithm proposed by [15]. Our initial allocation algorithm can obtain a balanced connectivity for a video storage server cluster under any access probabilities produced by any apportionment algorithm. Our proposed initial allocation strategy allocates the videos from the hottest to the coldest. For a video $V_i$ with $R_i$ copies, we allocate the $R_i$ copies to the $R_i$ distinct servers that maximize $C_1 \cdot C_2 \cdot C_3 \cdots C_S$ at the moment. After the video replica with access probability $Pr_i^{n(i,j)}$ is allocated to server $j$, the server gains $Pr_i^{n(i,j)}$ access probability, and the connectivity of the server is increased by $Pr_i^{n(i,j)} \cdot (R_i - 1)$, since the server creates $R_i - 1$ edges, each with weight $Pr_i^{n(i,j)}$, to the other $R_i - 1$ servers. If more than one set of $R_i$ servers maximizing $C_1 \cdot C_2 \cdot C_3 \cdots C_S$ is found, the set of $R_i$ servers with the lowest total access probability is selected. Videos with only one copy do not contribute to the connectivity and are assigned to the server with minimal access probability. That is, videos with only one copy are allocated to servers in a least-load-first manner. Figure 4 shows the proposed initial allocation algorithm.

FIG. 4. The proposed initial allocation algorithm.

To support dynamic changes of the system, including the insertion of new servers and files and changes in the access probabilities of videos, we propose two video reallocation policies, both of which reuse the greedy placement of Fig. 4 (a sketch follows below). The first is the partial reallocation policy, which requires fewer updates on the servers but cannot guarantee optimal connectivity between servers. The other is the total reallocation policy, which reallocates all videos on the servers to obtain the optimal connectivity by rerunning the initial allocation algorithm presented above. While it obtains the optimal file allocation, the total reallocation policy incurs the recomputation overhead of the initial allocation algorithm, as well as expensive file copies within the cluster. The partial reallocation policy, on the other hand, takes the original video locations into account and only considers the newly inserted or changed videos.
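The following sketch is a loose, unoptimized rendering of the greedy loop of Fig. 4 under uniform apportionment; the exhaustive search over server subsets, the tie-break key, and all names are our simplifications rather than the paper's exact procedure.

```python
from itertools import combinations
from math import prod

def greedy_allocate(replicas, probs, S, C):
    """Greedy replica placement, hottest video first (cf. Fig. 4).

    replicas[i] = R_i, the number of copies of video i
    probs[i]    = per-replica probability (uniform apportionment Pr_i / R_i)
    S, C        = number of servers and per-server capacity (in videos)
    Returns placement[i] = tuple of servers holding video i.
    """
    conn = [0.0] * S            # running connectivity C_l of each server
    space = [C] * S             # free storage slots per server
    placement = {}
    # Hottest first: total probability of video i is probs[i] * replicas[i].
    for i in sorted(range(len(replicas)), key=lambda v: -probs[v] * replicas[v]):
        R, p = replicas[i], probs[i]
        free = [s for s in range(S) if space[s] > 0]
        best_key, best_combo = None, None
        for combo in combinations(free, R):   # assumes R <= len(free)
            # Each chosen server gains R-1 out-going edges of weight p.
            trial = [conn[s] + (p * (R - 1) if s in combo else 0.0)
                     for s in range(S)]
            key = (prod(t + 1e-12 for t in trial),    # maximize the product
                   -sum(conn[s] for s in combo))      # tie: least-loaded set
            if best_key is None or key > best_key:
                best_key, best_combo = key, combo
        placement[i] = best_combo
        for s in best_combo:
            conn[s] += p * (R - 1)
            space[s] -= 1
    return placement
```

The small epsilon keeps the product comparable while some servers still have zero connectivity; for single-copy videos the product term is unchanged, so the tie-break sends them to the least-loaded feasible set, mirroring the least-load-first rule stated above.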
The partial reallocation policy first determines the number of replicas of each file according to the storage space and the new access probabilities. If the number of replicas of a video changes, the affected replicas, whose locations are yet to be determined, are called undetermined replicas; the set of undetermined replicas is called the undetermined set. The undetermined replicas are first sorted by their access probabilities, and the policy then runs our proposed initial allocation algorithm to determine their new locations. The allocation algorithm may not always find a feasible solution, e.g., for a file with five replicas when only four different servers have available storage space. In this situation, the policy removes the video with the least access probability from the servers, inserts all its replicas into the undetermined set, and resumes the initial allocation algorithm. The flow chart of the partial reallocation algorithm is depicted in Fig. 5. Clearly, the result is not guaranteed to be optimal.

FIG. 5. Flowchart of the partial reallocation policy.

Table 2 shows an example. Originally, we have three servers and three files. Suppose we insert a new file, say Video 4, and a new server, say Server D, into the cluster, and the access probabilities also change. The new file locations obtained by applying the partial reallocation and total reallocation policies are shown in Table 2. The allocation obtained by the partial reallocation policy introduces no video copies on the original servers, but it achieves a connectivity value of only $6.65 \times 10^{-3}$. The total reallocation policy, on the other hand, copies two files on the original servers and obtains a higher connectivity value, $9.59 \times 10^{-3}$.

TABLE 2
An Example of File Locations by Applying Different Reallocation Policies

                         Server A      Server B      Server C      Server D (new)   Total access   Number of
                         (space = 2)   (space = 2)   (space = 2)   (space = 2)      probability    replicas
Original file allocation
  Video 1                   ×             ×             ×                               0.5            3
  Video 2                   ×             ×                                             0.3            2
  Video 3                                               ×                               0.2            1
New file allocation by partial reallocation policy
  Video 1                   ×             ×             ×                               0.4            3
  Video 2                   ×             ×                                             0.3            2
  Video 3                                               ×             ×                 0.2            2
  Video 4 (new video)                                                 ×                 0.1            1
New file allocation by total reallocation policy
  Video 1                   ×             ×             ×                               0.4            3
  Video 2                   ×                                         ×                 0.3            2
  Video 3                                 ×                           ×                 0.2            2
  Video 4 (new video)                                   ×                               0.1            1

In order not to interrupt the video services, the reallocation must be performed using otherwise unused system resources. Techniques for performing the reallocation job on-line can be found in [22]; both the total and partial reallocation policies can be applied to a system in operation.

4. LOAD SHIFTING SCHEME

The goal of load shifting is to make room to accommodate the difference between the actual and expected access probabilities of videos. When a new request arrives, it is assigned to one of the servers holding the desired video, following the access probabilities of the replicas. For example, a request for Video 1 in Table 1 is assigned to Server 1 with probability 0.75 and to Server 2 with probability 0.25. If the access probabilities of all the replicas are equal, the least loaded server holding the desired video is selected. In the proposed load shifting algorithm, the shifting procedure is activated under either of the following two conditions, where $S_j$ is the server selected for the new request:

• $\sum_{i=1}^{M} P_{i,j} - \min\{\sum_{i=1}^{M} P_{i,j'}\} \ge T_1$, where $S_{j'}$ ranges over the neighbors of $S_j$, or
• $\sum_{i=1}^{M} P_{i,j} \ge A_{\max} - T_2$,

where $T_1$ and $T_2$ are predefined thresholds.
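The two trigger conditions translate directly into code; a minimal sketch with assumed names:

```python
def should_shift(load, j, neighbors, A_max, T1, T2):
    """Decide whether the shifting procedure is activated for the
    server j selected to serve a new request.

    load[j]   = current number of progressing streams on server j
    neighbors = servers adjacent to j in the connection graph
    T1, T2    = administrator-chosen thresholds
    """
    cond1 = bool(neighbors) and load[j] - min(load[k] for k in neighbors) >= T1
    cond2 = load[j] >= A_max - T2
    return cond1 or cond2
```

With `T1 = float('inf')` and `T2 = 0`, `should_shift` fires exactly when the selected server is already full, i.e., when the new request would otherwise be blocked.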
The first condition is to compare the load of the selected server with its neighbors to decide if load shifting is needed, while the second condition is only checking the load of the selected server against the threshold value. Both checks can be performed as a background or a real-time job. The system administrator can set values of the two thresholds to obtain the desired operating policy. For instance, if T1 = ∞ and T2 = 0, the load shifting process is performed in real time when the new request is blocked. If 0 ≤ T1 , T2 ≤ Amax , the load shifting process can be implemented as a background job to reduce request delays. The shifting procedure finds feasible shifting paths from the assigned server servers with system load less than Amax . In order to reduce the shifting steps to accomodate the requests, the shortest feasible shifting path is selected. Since there may be more than one shortest feasible shifting path, some criteria need to be added to distinguish them. As mentioned in the previous section, the initial allocation is to maximize the connectivity C1 ·C2 ·C3 · · · · ·C S , based on the expected access probabilities. For the load shifting, the run-time connectivity denoted as Cicur for the server Si , is applied to select the shifting path from the shortest feasible shifting paths which maximize the product C1cur · C2cur · C3cur · · · · · C Scur . Since we model the allocation of video replicas on servers as a directed graph, it is obvious that breadth-first search can be used to find the optimal shifting path. Figure 6 shows the proposed load-shifting algorithm. 5. SIMULATION RESULTS AND IMPLEMENTATION ISSUES 5.1. Simulations In this section, we compare the performance of the proposed approach and others via simulation. The user request is assumed to be Poisson process with 0.83 request/minute arrival rate. The simulation environment of the video server cluster consists of 10 identical video servers that each server can support up to ten 6-Mbps MPEG-2 streams simultaneously. LOAD BALANCING FOR DISTRIBUTED VIDEO SERVER 207 FIG. 6. The proposed load-shifting algorithm. The video file length is assumed 5.4 GB for 2 h playback. The top 100 popular videos are cached in the disks that at least one copy of the videos can be found on one of the servers. The number of replicas of each video is determined by applying the Hamilton method which is one of the frequently used integer resource allocation algorithms [20], and we also adopt the uniform apportionment to allocate the access probabilities of videos to the replicas. The distribution of requests on videos are modeled as Zipf’s distribution [23], which is defined as Pi = c i , (1−θ ) , where c = 1 N X 1 i 1−θ , i=1 i ∈ {1, 2, . . . , N }. Figure 7 shows the access probabilities of videos, based on Zipf’s distributions with various θ values. In the first simulation, we allocate 64.8 GB disks for each server. In total, 120 copies of video files can be stored. I.e., 20 copies are for replicas. We forecast the expected access distribution to be Zipf’s distribution with θ = 0.1, while we let the actual access probabilities range from θ = 0.1 to θ = 0.9 of Zipf’s distribution. Note that the initial video allocation is based on the expected access distribution. 
We examine four different initial data allocation and load balancing approaches: the least-load-first initial allocation without load shifting (LLF without LS), least-load-first with load shifting (LLF with LS), the DASD dancing method with dancing threshold ∞ (DASD), and the proposed initial allocation algorithm with load shifting (CO with LS). The least-load-first (LLF) initial data allocation strategy sorts the videos in descending order of access probability and allocates the video replicas in that order to the server with the minimal load. In order to compare the number of shifting steps fairly, we set the thresholds of the proposed approach to $T_1 = \infty$ and $T_2 = 0$, and we also set the dancing threshold of the DASD dancing method to infinity, which implies that request dancing is performed only when a new request is blocked.

FIG. 7. The access probabilities of videos sorted by rank.

Figure 8 depicts the request fail rate over the distribution of actual demands. From the figure, we learn that the request fail rate increases as the actual requests move from $\theta = 0.1$ to $\theta = 0.9$. The request fail rate is minimal when the actual demand distribution fully matches the forecast, i.e., $\theta = 0.1$ in this case. The figure also shows that LLF initial allocation with load shifting reduces the request fail rate by 30% to 50% relative to the LLF method without load shifting. The improvement in the service capacity of a video server cluster from employing the load shifting procedure is thus quite significant. Compared with the other approaches, our initial allocation scheme further reduces the request fail rate by 25% to 60% relative to the LLF method with load shifting and by around 5% relative to DASD dancing. Our approach obtains the lowest request fail rate because our definition of connectivity considers both the number of video replicas and their access probabilities, which better balances the loads of the servers.

FIG. 8. The request fail rate over the actual demands under the expected parameter (θ = 0.1).

FIG. 9. The number of shifting steps per 100 requests over the actual demand under the expected parameter (θ = 0.1).

Figure 9 shows the number of shifting steps per 100 requests over the distribution of actual demands with the expected parameter ($\theta = 0.1$). The figure should be viewed with caution, as failed requests are counted in the total requests; a lower shifting-step count may therefore be due either to a better algorithm or to a higher fail rate. While the proposed approach reduces the request fail rate by 25% to 60% relative to the LLF approach, it introduces 8% to 30% more shifting steps. Compared with the DASD approach, our approach reduces the number of shifting steps by 10% to 25%, and a 5% reduction in fail rate can still be seen in Fig. 8. To obtain a clear view of the combined effects of shifting ability and system availability derived from the initial data allocation schemes, we define a reward function to evaluate their performance. We assume that the system earns $RE_a$ units of reward when accepting a new request, while paying a penalty of $RE_p$ units for each shifting step. The reward function can then be used as a basis for comparing different initial data allocation approaches.
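The paper does not write the reward function out explicitly; under the stated assumptions it takes the natural linear form (with $N_{acc}$ and $N_{shift}$ as our own notation for the numbers of accepted requests and shifting steps):

$$RE = RE_a \cdot N_{acc} - RE_p \cdot N_{shift}.$$

With $RE_a = 5$ and $RE_p = 1$, a shifting path of up to four steps is still worth taking whenever it admits a request that would otherwise fail.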
Figure 10 shows the outcomes of the reward function with $RE_a = 5$ and $RE_p = 1$ for the different initial allocation schemes under various actual request probabilities. From the figure, it can be seen that our proposed method obtains 100% and 10% more reward units than the LLF method and the DASD method with load shifting, respectively.

FIG. 10. The reward obtained by different initial allocation schemes over different distributions of actual demands under the expected parameter (θ = 0.1).

In the next simulation, we change the forecast demand on videos to Zipf's distribution with $\theta = 0.9$ and repeat the previous simulation with all other parameters unchanged. The simulation results are shown in Figs. 11, 12, and 13, which parallel Figs. 8, 9, and 10. Figures 11, 12, and 13 show that our proposed approach again performs better than the others, just as in Figs. 8, 9, and 10. It can also be observed that when the initial allocation of any approach is based on a near-uniform distribution, such as Zipf's distribution with $\theta = 0.9$ in this simulation, the connection graph has lower connectivity, which results in lower shifting capability and a higher fail rate when the actual demands for videos differ from the forecast.

FIG. 11. The request fail rate over the actual demand under the expected parameter (θ = 0.9).

FIG. 12. The number of shifting steps per 100 requests over the actual demand under the expected parameter (θ = 0.9).

FIG. 13. The reward obtained by different initial allocation schemes over different distributions of actual demands under the expected parameter (θ = 0.9).

In the next simulation, we explore the relation between the number of replicas and system performance. We adopt the same simulation parameters and increase the percentage of replication from 10% to 100%, i.e., up to 108 GB per server. We examine four situations: (1) expected demand $\theta = 0.1$ and actual requests $\theta = 0.1$, denoted (EP = 0.1/P = 0.1); (2) expected demand $\theta = 0.1$ and actual requests $\theta = 0.9$, denoted (EP = 0.1/P = 0.9); (3) expected demand $\theta = 0.9$ and actual requests $\theta = 0.1$, denoted (EP = 0.9/P = 0.1); (4) expected demand $\theta = 0.9$ and actual requests $\theta = 0.9$, denoted (EP = 0.9/P = 0.9). Figure 14 illustrates the request fail rate over the percentage of replication for the DASD dancing approach and our approach. We find that the request fail rate decreases as the percentage of replication increases, and our proposed method obtains a lower request fail rate than the DASD method, from around 5% lower at 10% replication to 10% lower at 100% replication. The reason is that, as the percentage of replication increases, our approach can better allocate replicas to the servers; a higher percentage of replication means more edges in the connection graph, so a more sophisticated algorithm gains an advantage. Figure 14 also shows that EP = 0.1/P = 0.1 and EP = 0.1/P = 0.9 result in the best and poorest performance, respectively, among the four situations.

FIG. 14. The request fail rate over different percentages of replication.

Figure 15 shows the number of shifting steps per 100 requests for the simulation of Fig. 14. The shifting steps increase with the percentage of replication due to the decrease in fail rate. In the last simulation, we examine the relation between the request fail rate and the service capacity of a storage server, i.e., the number of streams supported by a server. Figure 16 shows the results. We see that the request fail rate decreases as the number of streams supported by a server increases.
For the two situations EP = 0.1/P = 0.1 and EP = 0.9/P = 0.1, the request fail rates drop dramatically, since the skew in the access pattern is alleviated by increasing the number of users in the system. Moreover, DASD and the proposed method come closer together as the number of served users grows, since the requests on the servers then become naturally balanced.

FIG. 15. The number of shifting steps per 100 requests under different percentages of replication.

FIG. 16. The request fail rate under different service capacities of a storage server.

5.2. Implementation Issues

To realize the proposed load-shifting scheme, we adopt and extend the broadcast dispatching scheme of OneIP presented by Damani et al. [17]. The basic concept behind the broadcast dispatching scheme of OneIP is that a cluster of servers publishes a single cluster IP address to users. All requests to the cluster use the cluster IP as their destination, and only one server within the cluster responds to each request. Figure 17 illustrates the request/response packet flows when OneIP is applied to a video server cluster.

FIG. 17. Applying the OneIP technique to a video server cluster.

First, we assign a unique IP address and an ID to each server, and a shared cluster IP to all servers within a cluster. Request packets use the cluster IP as their destination IP address, and the destination medium access control (MAC) address of each request packet is replaced by the broadcast MAC address. In this way, all servers in the cluster receive the request packets. The servers then run an election algorithm to determine which server handles the request. For example, a fair election algorithm can be a MOD operation on the source IP address of the request packet, with the result of the MOD operation determining the ID of the server that will handle the request. The selected server then responds to the request using the cluster IP address. If the source IP addresses of the incoming request packets are uniformly distributed, the load of the servers will be balanced. The details of OneIP can be found in [17]. OneIP suggests using a simple server election algorithm such as the MOD operation in order to reduce computation complexity and the memory space of a dispatching table. We extend OneIP to support video server clusters and the load shifting operation by maintaining a file location table and the load information of each server within the cluster. When a request arrives, the servers determine which server handles it by looking up the file location table and the load information. Suppose the server with ID = k is assigned to respond to the request; the server then replies to the client with the handling server ID = k. From then on, the subsequent request packets for the same video service from that client always carry the handling server ID. The handling server ID can be carried in the protocol, e.g., in real-time streaming protocol (RTSP) or DSM-CC packets. Hence, the servers can easily determine whether an incoming packet is assigned to them by checking the handling server ID in the packet. Before migrating a progressing request between servers, the system notifies the client to update the handling server ID in its request packets.
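As an illustration of the dispatching rule just described, every server can decide locally whether to process a broadcast packet. This sketch (our own naming, not OneIP source code) covers both the initial MOD election and the later handling-ID match:

```python
import ipaddress

def owns_packet(my_id, num_servers, src_ip, handling_id=None):
    """Decide whether this server should process a broadcast request.

    Before a handling server has been assigned, every server applies the
    same MOD election over the packet's source IP; afterwards the packet
    carries the handling server ID and is matched against it directly."""
    if handling_id is not None:
        return handling_id == my_id
    return int(ipaddress.ip_address(src_ip)) % num_servers == my_id

# Server 2 of a 4-server cluster checks an incoming request.
print(owns_packet(2, 4, "192.168.1.10"))   # True for this source address
```

Because every server evaluates the same deterministic function on the same packet, exactly one server claims each request without any coordination messages.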
When a new request is blocked, the servers run the load-shifting algorithm to identify a shifting path. If a feasible shifting path is found, the first server in the path initiates the shifting operation by sending a hand-off packet to all the servers in the path. The hand-off packet contains the local time, the expected hand-off time point, all videos to be shifted, and the servers in the shifting path. Once a server receives the hand-off packet, it calculates the expected hand-off point of the video file that will be shifted from it and sends the information to its neighboring server in the path. After all the servers in the path have received the hand-off packets, the new handling server ID is sent to the clients viewing the to-be-shifted videos. The next step is to perform the hand-off operation when the expected hand-off time is reached. Since the downstream is connectionless and the cluster IP address remains the same, the hand-off procedure is transparent to clients. Clock skew and timer drift between servers may hinder smooth migration, so a small buffer at the client site is necessary to absorb the timing difference between the stop of video transmission from the previous server and the start of transmission from the new server. According to our experiments, if the hand-off time is set to 1 s and the service is a 1.5-Mbps MPEG-1 video, around 100 KB of buffer is sufficient.

We implemented the extended OneIP as an intermediate driver on Windows NT to support the operations of a distributed file server. Figure 18 shows the Windows NT network architecture and the packet flows of the extended OneIP.

FIG. 18. Packet flows of the extended OneIP driver on Windows NT.

The intermediate driver works between the network layer and the MAC layer. When it receives request packets, the driver discards those with a wrong handling server ID and forwards the packets destined for its server to the upper layer. The driver also performs the load-shifting procedure, sending and processing the hand-off packets itself, in order to reduce operating system overhead and to improve delay latency. According to the experimental results, the driver introduces less than 3% overhead in sending and receiving load-shifting related packets, and requires around 1 s of hand-off time, which varies with the interconnection network of the servers.
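As a rough sanity check of the 100 KB client-buffer figure quoted above (our own back-of-the-envelope, not a calculation from the paper), riding out a timing gap of $\Delta t$ seconds at bit rate $r$ requires a buffer of roughly

$$B \approx \frac{r \cdot \Delta t}{8} = \frac{1.5 \times 10^6 \cdot \Delta t}{8} \text{ bytes} \approx 187.5 \, \Delta t \text{ KB},$$

so a 100 KB buffer absorbs a little over half a second of inter-server skew for a 1.5-Mbps MPEG-1 stream.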
6. CONCLUSIONS

In this paper, a novel initial allocation of videos on a distributed video storage server was proposed. With the proposed load-shifting algorithm, the system can reduce both the request fail rate and the number of shifting steps. According to the simulation results, the scheme reduces the request fail rate by up to 50% relative to operation without the load-shifting procedure, by around 25% to 60% relative to the LLF initial allocation scheme with load shifting, and by 5% to 10%, with 5% to 25% fewer shifting steps, relative to the DASD dancing method. Moreover, a kernel-level driver on Windows NT was prototyped to examine the practicability of the proposed load-shifting scheme.

REFERENCES

1. C. Freedman and D. DeWitt, The SPIFFI scalable video-on-demand server, in ACM SIGMOD, 1995.
2. D. Pegler, N. Yeadon, D. Hutchison, and D. Shepherd, Incorporating scalability into networked multimedia storage systems, in SPIE Conference on Multimedia Computing and Networking, 1997, pp. 118–134.
3. P. Lougher, R. Lougher, D. Shepherd, and D. Pegler, A scalable hierarchical video storage architecture, in SPIE Conference on Multimedia Computing and Networking, 1996, pp. 18–29.
4. W. Tetzlaff, M. Kienzle, and D. Sitaram, A methodology for evaluating storage systems in distributed and hierarchical video servers, in Spring COMPCON '94, 1994.
5. C. Federighi and L. A. Rowe, A distributed hierarchical storage manager for a video-on-demand system, in Proceedings of Symp. on Elec. Imaging Sci. and Tech., Feb. 1994.
6. D. N. Serpanos and T. Bouloutas, Centralized vs Distributed Multimedia Servers, Technical Report RC 20411, IBM Research Division, T. J. Watson Research Center, March 1996.
7. D. Pegler, D. Hutchison, and D. Shepherd, Scalability issues for a networked multimedia storage architecture, in Proceedings of 1995 IEE Data Transmission, 1995.
8. W. J. Bolosky, J. S. Barrera, R. P. Draves, R. P. Fitzgerald, G. A. Gibson, M. B. Jones, S. P. Levi, N. P. Myhrvold, and R. F. Rashid, The Tiger video fileserver, in Proceedings of the Sixth International Workshop on Network and Operating System Support for Digital Audio and Video, 1996.
9. Y. N. Doganata and A. N. Tantawi, Making a cost-effective video server, IEEE Multimedia 1, 1994, 22–30.
10. Y. Doganata and A. Tantawi, A cost/performance study of video servers with hierarchical storage, in 1st International Conference on Multimedia Computing and Systems, 1994.
11. T. D. C. Little and D. Venkatesh, Probability-based assignment of videos to storage devices in a video-on-demand system, ACM/Springer Multimedia Systems 2, 1994, 280–287.
12. R. Tewari, D. Dias, R. Mukherjee, and H. Vin, High Availability in Clustered Video Servers, Technical Report RC 20108, IBM Research Division, T. J. Watson Research Center, June 1995.
13. R. Tewari, R. Mukherjee, D. Dias, and H. Vin, Design and performance tradeoffs in clustered video servers, in 3rd IEEE International Conference on Multimedia Computing and Systems, 1996, 144–150.
14. A. Dan, M. Kienzle, and D. Sitaram, A dynamic policy of segment replication for load balancing in video-on-demand servers, ACM/Springer Multimedia Systems 3, 1995, 93–103.
15. D. N. Serpanos, L. Georgiadis, and T. Bouloutas, MMPacking: A Load and Storage Balancing Algorithm for Distributed Multimedia Servers, Technical Report RC 20410, IBM Research Division, T. J. Watson Research Center, March 1996.
16. J. L. Wolf, P. S. Yu, and H. Shachnai, Disk load balancing for video-on-demand systems, ACM/Springer Multimedia Systems 5, 1997, 358–370.
17. O. P. Damani, P. Y. Chung, Y. Huang, C. M. Kintala, and Y. M. Wang, ONE-IP: Techniques for hosting a service on a cluster of machines, in Sixth International World Wide Web Conference, 1997.
18. J. Y. Lee, Parallel video servers: A tutorial, IEEE Multimedia, April/June 1998, 20–28.
19. M. Kienzle, A. Dan, D. Sitaram, and W. Tetzlaff, The effect of video server topology on contingency capacity requirements, in Proceedings of Multimedia Computing and Networking, 1996.
20. T. Ibaraki and N. Katoh, Resource Allocation Problems: Algorithmic Approaches, MIT Press, Cambridge, MA, 1988.
21. A. Federgruen and H. Groenevelt, The greedy procedure for resource allocation problems: Necessary and sufficient conditions for optimality, Oper. Res. 34, 1986, 909–918.
22. A. Dan and D. Sitaram, An online video placement policy based on bandwidth to space ratio (BSR), in Proceedings of SIGMOD '95, 1995.
23. G. Zipf, Human Behavior and the Principle of Least Effort, Addison-Wesley, Reading, MA, 1949.

SHIAO-LI TSAO received the B.S., M.S., and Ph.D. degrees in engineering science from National Cheng Kung University, Tainan, Taiwan, in 1995, 1996, and 1999, respectively.
His research interests include multimedia systems, computer networks, operating systems, and mobile computing. He has published about 30 international journal and conference papers in the area of multimedia storage servers, and holds or has filed four U.S. patents.

MENG CHANG CHEN received the B.S. and M.S. degrees in computer science from National Chiao Tung University, Taiwan, in 1979 and 1981, respectively, and the Ph.D. degree in computer science from UCLA in 1989. He joined AT&T Bell Labs in 1989 as a member of the technical staff. He became an associate professor in the Department of Information Management, National Sun Yat-Sen University, Taiwan, in 1992. Since 1993, he has been with the Institute of Information Science, Academia Sinica, Taiwan. His current research interests include multimedia systems and networking, QoS networking, operating systems, database and knowledge-base systems, and Internet document management and access.

MING-TAT KO received the B.S. and M.S. degrees in mathematics from National Taiwan University in 1979 and 1982, respectively. He received a Ph.D. in computer science from National Tsing Hua University, Taiwan, in 1988, and then joined the Institute of Information Science as an associate research fellow. Dr. Ko's major research interests include the design and analysis of algorithms, computational geometry, graph algorithms, multimedia systems, and computer graphics.

JAN-MING HO received his Ph.D. degree in electrical engineering and computer science from Northwestern University in 1989. He received his B.S. in electrical engineering from National Cheng Kung University in 1978 and his M.S. from the Institute of Electronics of National Chiao Tung University in 1980. He joined the Institute of Information Science, Academia Sinica, Taiwan, as an associate research fellow in 1989 and was promoted to research fellow in 1994. He visited the IBM T. J. Watson Research Center in the summers of 1987 and 1988; the Leonardo Fibonacci Institute for the Foundations of Computer Science, Italy, in summer 1992; and the Dagstuhl Seminar on Combinatorial Methods for Integrated Circuit Design, Schloss Dagstuhl, Universität des Saarlandes, Germany, in October 1993. His research interests include real-time operating systems with applications to continuous media systems (e.g., video on demand and video conferencing), computational geometry, combinatorial optimization, VLSI design algorithms, and the implementation and testing of VLSI algorithms on real designs.

YUEH-MIN HUANG received the B.S. degree in engineering science from National Cheng Kung University, Taiwan, in 1982, and the M.S. and Ph.D. degrees in electrical engineering from the University of Arizona in 1988 and 1991, respectively. He has been with National Cheng Kung University since 1991 and is currently an associate professor in the Department of Engineering Science. His research interests include video-on-demand systems, real-time operating systems, multimedia, and data mining.