Master Thesis
Electrical Engineering
March 2012

Network Performance of a video Application in the Cloud

Shravan Kumar Narisetty
Sravan Kumar Nampally

School of Computing
Blekinge Institute of Technology
371 79 Karlskrona
Sweden

This thesis is submitted to the School of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering. The thesis is equivalent to 20 weeks of full time studies.
Contact Information:

Author(s):
Shravan Kumar Narisetty
E-mail: [email protected]
Sravan Kumar Nampally
E-mail: [email protected]

University advisor(s):
David Erman
School of Computing
Blekinge Institute of Technology
371 79 Karlskrona, Sweden
Email: [email protected]

University Examiner(s):
Patrik Arlos
School of Computing
Blekinge Institute of Technology
371 79 Karlskrona, Sweden
Email: [email protected]

School of Computing
Blekinge Institute of Technology
371 79 Karlskrona
Sweden
Internet: www.bth.se/com
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57

ABSTRACT

In recent years cloud computing has been growing rapidly. Among the different cloud services, Infrastructure as a Service (IaaS) enables a company to grow very fast. Both small and large-scale companies are shifting their applications to the cloud. With the expansion of the Internet all over the world, video applications are growing in number and popularity. On smart phones, a large amount of data is transferred over wireless networks. The mobile browser plays an important role when a video application is accessed from the cloud. The purpose of this work is to help cloud providers assess whether their applications work effectively on laptops and smart phones. First, a systematic literature review (SLR) is conducted on the performance issues of cloud Infrastructure as a Service. Second, the performance metrics jitter, Round Trip Time (RTT) and page loading time are analyzed while accessing a video streaming application from the cloud. Finally, the results are analyzed for various browsers on a smart phone and a laptop. The findings can help users achieve a better experience while surfing the Internet.

Keywords: Infrastructure as a service, Jitter, Page Load Time, Round Trip Time, Cloud.

ACKNOWLEDGEMENT

We would like to express our gratitude to our supervisor, Dr. David Erman, for his expertise, understanding and patience throughout the thesis work.
We appreciate his vast knowledge and skill in many areas, which have helped us improve our areas of research. Very special thanks go to the Telecom city and Logica members for their support throughout the work. We would also like to thank Mats Barvesten, Daniel Gustafsson and Emil Olofsson. We are grateful to our parents for the support and encouragement they have provided throughout our lives. We must also acknowledge our roommates and best friends for their support in completing the thesis work. Last but not least, we would like to thank Dr. Patrik Arlos for assigning David Erman as our supervisor.

Regards,
Shravan and Sravan.

Table of Contents

ABSTRACT
ACKNOWLEDGEMENT
LIST OF FIGURES
LIST OF TABLES
ACRONYMS
1 INTRODUCTION
  1.1 Aims and objectives
  1.2 Survey of Related works
  1.3 Research Questions
  1.4 Research Methodology
  1.5 Motivation
  1.6 Contributions
  1.7 Thesis Outline
2 Cloud Computing
  2.1 Introduction
  2.2 Systematic Literature Review
    2.2.1 Features of Systematic Literature Reviews
  2.3 Defining the Research Questions
  2.4 Defining Keywords
  2.5 Study Quality Assessment
    2.5.1 Review Protocol
    2.5.2 Data Extraction
  2.6 Selection Criteria and Procedures
    2.6.1 Inclusion Criteria
    2.6.2 Exclusion Criteria
  2.7 Results of SLR
  2.8 Cloud infrastructure
    2.8.1 GoGrid
    2.8.2 Cloud.com
    2.8.3 IBM Smart Cloud
    2.8.4 Rackspace
    2.8.5 Eucalyptus
3 EXPERIMENTAL SETUP
  3.1 Experimental Setup for Cloud Performance
  3.2 Experimental Procedure
    3.2.1 Right Scale
    3.2.2 Server Template
    3.3.3 Amazon Web Services (AWS)
    3.3.4 Amazon Elastic Compute Cloud (Amazon EC2)
    3.3.5 EC2 Instances
    3.3.6 Amazon Simple Storage Service (Amazon S3)
    3.3.7 Jw Player
    3.3.8 TShark tool
    3.3.9 Wireshark tool (1.6.5)
    3.3.10 Network Time Protocol (NTP)
    3.3.11 Mobile Browsers
    3.3.12 Super User and Shark Root
  3.4 Results and Analysis
    3.4.1 PCAP to text conversion
    3.4.2 Jitter Calculation
    3.4.3 MATLAB and graph analysis
  3.5 Results
    3.5.1 Jitter performance in Smart Phone Browsers
    3.5.2 Jitter Performance in Laptop Browsers
  4.1 Round Trip Time (RTT)
  4.2 Results
    4.2.1 RTT for cloud server
    4.2.2 RTT for Apache Server
    4.2.3 RTT for Nginx Server
  4.3 Page Loading Time
  4.4 Results
    4.4.1 Page Loading Time for cloud
    4.4.2 Page loading time for Apache Server
    4.3.3 Page loading time for Nginx Server
Conclusion
Future Work
References
APPENDIX

LIST OF FIGURES

Fig. 1 Cloud computing services
Fig. 2 Step by step process of systematic literature review
Fig. 3 Number of papers per year
Fig. 4 Experimental setup for cloud performance
Fig. 5 Block diagram for analysis of jitter from cloud
Fig. 6 CDF graph of packet size
Fig. 7 Semi-log graph for the calculation of jitter in smart phone for various browsers
Fig. 8 Semi-log graph for jitter calculation for various browsers
Fig. 9 Experimental setup for local server
Fig. 10 Block diagram for local server
Fig. 11 Round trip time for various browsers in cloud
Fig. 12 Round trip time for Apache local server
Fig. 13 Round trip time for Nginx local server
Fig. 14 Page loading time for various browsers in cloud
Fig. 15 Page loading time for various browsers in Apache local server
Fig. 16 Page loading time for various browsers in Nginx local server
Fig. 17 Jitter calculation for various browsers in laptop
Fig. 18 Jitter calculation for various browsers in smart phone

LIST OF TABLES

Table 1 SLR search string
Table 2 Quality assessment checklist
Table 3 Data extraction strategy
Table 4 SLR results
Table 5 Average page loading time in local servers
Table 6 Specification table for experiments
Table 7 SLR search string
ACRONYMS

AaaS - Application as a Service
ACK - Acknowledgement
AWS - Amazon Web Services
CDF - Cumulative Distribution Function
CDN - Content Delivery Network
CIFS - Cloud Infrastructure Frameworks
CPU - Central Processing Unit
EC2 - Elastic Compute Cloud
EIP - Elastic Internet Protocol
HTML - HyperText Markup Language
HTTP - Hypertext Transfer Protocol
IaaS - Infrastructure as a Service
ICMP - Internet Control Message Protocol
IP - Internet Protocol
Mbps - Megabits per second
NTP - Network Time Protocol
P2P - Peer to Peer
PaaS - Platform as a Service
PHP - Hypertext Preprocessor
RAM - Random Access Memory
RTT - Round Trip Time
S3 - Simple Storage Service
SaaS - Software as a Service
SLR - Systematic Literature Review
SOAP - Simple Object Access Protocol
SRB - Service Oriented Resource Broker
SSH - Secure Shell
SYN - Synchronize
TCP - Transmission Control Protocol
TPM - Trusted Platform Module
UDP - User Datagram Protocol
VM - Virtual Machine
XML - Extensible Markup Language

1 INTRODUCTION

Cloud is the new business model for the computing world. Cloud computing is a metaphor for remotely accessing computing resources through a network [2]. It provides on-demand network access to shared resources that can be physically located anywhere in the world, and it is being designed and deployed in major locations worldwide. New cloud services will soon be available as established IT and telecom providers such as Microsoft, IBM, Accenture, Fujitsu, China Mobile and SingTel join cloud pioneers like Google, Amazon and salesforce.com [6]. Cloud computing provides greater flexibility and cost savings, although it also raises authentication issues. Today, cloud computing covers several kinds of services:

- Software as a service: cloud-based applications.
- Infrastructure as a service: processing and storing data.
- Platform as a service: developing, testing and running applications for clouds.
- Anything as a service: an increasing number of services delivered over the Internet.

Nowadays, mobile cloud computing is emerging as one of the most important branches of cloud computing and is becoming a massive force in the mobile world. It is still in its infancy, but it is expected to become the dominant model for mobile applications. Most mobile applications still use the data storage and processing capacity of the mobile device itself. In mobile cloud computing, the mobile browser plays an important role because it gives mobile phones an open door to the Internet. The mobile browser is optimized to display web content effectively on the small screens of portable devices. On wireless handheld devices, mobile browser software must be small, use little bandwidth and be efficient in order to accommodate low memory capacity. The mobile browser usually connects to the server via wireless LAN or a cellular network using the standard Hyper Text Transfer Protocol (HTTP) over TCP/IP, and it displays web pages written in XML, HTML, and SOAP. At the user end, performance should be fast, easy and reliable.

A video application in the cloud can be accessed with the various browsers available on a smart phone. Of the mobile web browsers available on an Android mobile, those chosen for this work are Firefox, Xscope, Opera, Dolphin and the Android internal browser.

In the future, the cloud may replace the traditional office setup. Various cloud services are replacing desktop computing, and access via the cloud will increase traffic exponentially. The traffic generated by mobiles consists mostly of User Datagram Protocol (UDP), Transmission Control Protocol (TCP) and Internet Control Message Protocol (ICMP) datagrams [63]. Many more applications, such as video streaming, online chat and file transfer, are shifting to cloud infrastructure.
As video streaming applications increase rapidly, traffic increases as well. Growth in traffic leads to network congestion and a higher per-flow loss rate. Because of this loss rate, network performance degrades on a fully (100%) utilized link, and clients experience delays when they access the web application. The increase in delay affects the overall Round Trip Time (RTT) and page loading time. Delay and retransmission can also cause jitter, which is problematic for video streaming applications.

This thesis mainly focuses on the network performance analysis of a video application in the cloud, accessed from a smart phone and a laptop with various browsers. The metrics chosen for the experiments are Round Trip Time, page loading time and jitter. A systematic literature review has been conducted on performance issues in cloud infrastructure.

RTT is defined as the time elapsed from the transmission of a message to a remote host until the arrival of its reply back at the source. This metric indicates the delay a client experiences between sending a query to a web application and receiving its output. The page loading time [18] [19] is defined as the time taken to load a web page from the server in a mobile browser or laptop browser. Jitter is the variation in packet transit delay caused by queuing, congestion and serialization effects on the path through the network. Jitter also expresses the degree of unpredictability in delay, which is one reason wireless links are considered unreliable. The term is associated with the loss of data packets in a real-time data stream. Because the transmission rate of the channel varies over time, the video display may be interrupted if the data are not delivered on time. Jitter reduces the perceived video quality and is therefore a problem for video streaming [11].
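As a rough illustration of the jitter definition above, the sketch below treats jitter as the change between consecutive packet inter-arrival gaps. This is one common formulation and is only an assumption here; the thesis describes its own jitter calculation in Section 3.4.2, which may differ. The function name `interarrival_jitter` and the sample timestamps are hypothetical.

```python
from statistics import mean


def interarrival_jitter(arrival_times):
    """Return the absolute change between consecutive inter-arrival gaps.

    arrival_times: packet arrival timestamps in seconds, in capture order.
    Assumption: jitter is taken as |gap[i+1] - gap[i]|; other definitions exist.
    """
    # Gap between each packet and the one before it.
    gaps = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    # How much each gap differs from the previous gap.
    return [abs(g2 - g1) for g1, g2 in zip(gaps, gaps[1:])]


# Hypothetical arrival timestamps (seconds); not data from the thesis.
arrivals = [0.0, 0.5, 1.0, 2.0, 2.5]
jitter = interarrival_jitter(arrivals)
mean_jitter = mean(jitter)
```

With real traces, the arrival times would come from the capture timestamps of the video stream packets, and a steady stream would give values close to zero.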
1.1 Aims and objectives

The aims of this thesis project are:

- To analyze the network performance issues in various browsers while accessing a video application from the cloud using a smart phone.
- To calculate the RTT and page loading time using TCP packets while accessing a video-based application launched on a local server and in the cloud. To do this, the shark tool needs to be installed on the Android mobile. Time stamps are collected for the various browsers available on the Android mobile in order to identify the impact of the browser on the smart phone while accessing a video streaming application.
- To see the impact of jitter on laptops and smart phones while accessing a video streaming application from the Amazon cloud service using Right Scale. For this, network traces need to be collected with shark tools at both ends.

Research Question   Method
1                   SLR
2                   Experiment
3                   Experiment
4                   Experiment

1.2 Survey of Related works

Cloud computing has begun to spread across the public and private markets. The infrastructure and performance of cloud computing have attracted both research and the business world to adopt it for different applications. The work in [22] analyzes EC2's management and security facilities while also measuring Amazon S3 and SQS, and finds EC2 best when considering the cost-time tradeoff. The paper in [65] shows that cloud performance depends on dynamic load balancing, security and independently running applications. IBM is also developing its own cloud platforms and is gaining a large share of the market. According to [5], cloud technologies have limited support for market-oriented resource management and for negotiating Quality of Service between users and providers. Extensive research has been done on the time synchronization problem, which affects the measurement of delays across nodes.
The study in [8] uses 3G authentication traces from a provider to measure the correlations between location, time of day and application usage. In [7] the authors monitored the devices of 43 users and found that browsing contributes the most traffic and that lower-layer protocols incur high overhead due to small transfer sizes. They also found that current server-side transfer buffers and radio power management are not well tuned for smart phone workloads. The work in [9] analyzes different providers such as Amazon EC2 using measured performance metrics like waiting time and response time, with experiments conducted on many-task scientific computing workloads.

In the cloud, sharing of compute and storage resources has become a popular solution for a number of key enterprise applications. High workloads and critical data are distributed between sites, which minimizes the risk of failure. The cloud is transforming current Internet practices by providing multiple search engine facilities, traditional services and applications that run over the Internet or broadband to deliver services to the end user. In [21] network performance is measured between different zones in Amazon Web Services. The work also evaluates network Quality of Service in different Amazon zones. The authors conclude that quality of service can be improved compared to traditional P2P and CDN systems by deploying a hybrid P2P and cloud streaming network. In [66] solutions based on caching entire dynamic pages are explained. In [11][12][13] the RTT is calculated using the SYN and SYN-ACK packets of a TCP connection. Some papers compare the network performance of smart phones using application-based measurement software. The work in [16] suggests which mobile operating system is most suitable for users of mobile games and applications.
In [15] different operators and network protocols are compared on various smart phone operating systems with the help of measurement application software.

1.3 Research Questions

The following research questions related to the mobile cloud are identified.

1. What are the performance issues that influence cloud infrastructure?
2. How does jitter vary between a laptop and a smart phone while accessing a video streaming service in the cloud?
3. Does Round Trip Time (RTT) depend on the type of mobile browser?
4. How does page loading time vary with the browser used on the smart phone?

1.4 Research Methodology

This thesis follows the research methods described in [4].

1. Literature study: This phase includes a thorough analysis of journal and conference papers obtained from reputed scientific databases for specified search criteria. It provides sufficient depth in the domain of network performance issues related to cloud infrastructure. White papers are included in the search to add real-world and corporate knowledge about performance analysis in cloud computing. Thus, the first research question is addressed using a Systematic Literature Review (SLR).
2. To answer the second research question, an experimental test bed was designed. A video application is deployed in the cloud and accessed from a smart phone and a laptop. While the service is accessed, network performance measures such as jitter are calculated and analyzed using the collected traces.
3. For the third and fourth research questions, various browsers are considered, and the round trip time and page loading time are calculated with the help of packet sniffing tools. The dependence of round trip time and page loading time on the mobile browser, and the underlying factors, are then analyzed.
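The RTT measurement cited in the related work (using the SYN and SYN-ACK of a TCP connection) can be sketched as follows. This is a minimal illustration under stated assumptions, not the tooling used in the thesis: the `Packet` record, its field names and the addresses are hypothetical, the trace is assumed to be captured on the client side, and retransmitted SYNs are not handled.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    time: float  # capture timestamp in seconds
    src: str     # source address
    dst: str     # destination address
    syn: bool    # TCP SYN flag set
    ack: bool    # TCP ACK flag set


def handshake_rtt(packets, client, server):
    """Time from the client's first SYN to the server's SYN-ACK, or None."""
    syn_time = None
    for p in packets:
        if p.src == client and p.dst == server and p.syn and not p.ack:
            # Remember only the first SYN (retransmissions are ignored).
            if syn_time is None:
                syn_time = p.time
        elif (p.src == server and p.dst == client
              and p.syn and p.ack and syn_time is not None):
            return p.time - syn_time
    return None


# Hypothetical client-side trace of one TCP handshake.
client, server = "192.0.2.10", "198.51.100.20"
trace = [
    Packet(10.0, client, server, syn=True, ack=False),    # SYN
    Packet(10.125, server, client, syn=True, ack=True),   # SYN-ACK
    Packet(10.126, client, server, syn=False, ack=True),  # ACK
]
rtt = handshake_rtt(trace, client, server)
```

Running this per browser over repeated captures would give one RTT sample per connection, which can then be compared across browsers.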
To validate the experiments, each experiment is repeated multiple times (around 25 times) on different browsers on a smart phone to ensure that the results are not affected by the time of day.

1.5 Motivation

The main motivation is to evaluate the network performance of smart phones based on applications that use hardware and operating system clock time stamps [15, 16 and 17]. Cloud providers around the world are developing more infrastructure and facilities because of the increasing number of customers who want to launch their applications in the cloud. Most large companies are now trying to build their own clouds so that they do not lose their customer base. In our study, the collected time stamps are crucial for analyzing network metrics such as RTT, page loading time and jitter on a smart phone while accessing video from the cloud in real time. Our work also compares browser behavior on an Android smart phone and a laptop. This thesis helps to show the network performance of a laptop and a smart phone for a video streaming application launched in the cloud in a real-time environment.

1.6 Contributions

This thesis gives an overview of a video application deployed in the cloud. It shows how network performance differs between the cloud and a local server when the same video application is launched and the video is accessed from a smart phone with various browsers. It also provides knowledge about cloud providers and the services they offer to users on their cloud infrastructure.

1.7 Thesis Outline

The thesis is organized as follows.
Chapter 2 explains cloud computing, the systematic literature review (SLR) and its results.
Chapter 3 describes the experimental setup for jitter in the cloud and its results.
Chapter 4 describes the experimental setup for round trip time and page loading time on the local server and the results.
Chapter 5 presents the conclusion and future work.
2 Cloud Computing
2.1 Introduction
“Cloud Computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., Networks, servers, storage, applications and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction” [60]. The three service models are Software as a Service, Platform as a Service and Infrastructure as a Service.
Fig. 1 Cloud Computing Services
Software as a Service (SaaS): It provides industry-standard functions when and where they are needed, without any capital investment [5] [6]. It allows users to run existing online applications.
Platform as a Service (PaaS): It gives ready-made platforms for building new, unique applications faster, which helps the business to grow. Furthermore, it allows users to create their own cloud applications using supplier-specific tools and languages. According to [2] [6], PaaS provides an environment and tools for creating new online applications rapidly and at low cost.
Infrastructure as a Service (IaaS): It provides environments on demand, extra computing power to handle spikes, and very low capital investment. According to [6] [21], it allows applications to run online on the provider's hardware, which means that existing applications in data centers can be migrated to the cloud to reduce IT cost. Virtualization allows users to share the same physical server without interfering with each other's applications over the internet. It allows users to run any applications of their own choice on the cloud hardware.
All these services enable users to run applications and store data online. However, each offers a different level of user flexibility and control. IaaS comes in four categories [6] [60]:
1. Private cloud: This is the most secure and most costly option. A specific number of physical servers is dedicated to one customer.
2.
Dedicated hosting: Physical servers are provided on demand, matching all the requirements of the customer.
3. Hybrid hosting: Physical servers and virtual server instances are provided on demand in an effort to reduce cost and increase accessibility.
4. Cloud hosting: Virtual server instances are provided on demand and offered on an hourly basis.
Amazon Web Services operates on the IaaS model. For example, one of Amazon's products is Amazon Elastic Compute Cloud (EC2), which offers a variety of instance types purchased on an hourly basis. It offers a choice of servers, memory (RAM), CPU, storage, power, firewall, security, hardware load balancing and other network equipment. IaaS offers enterprise customers a highly secure, highly resilient and highly available solution for business applications.

2.2 Systematic Literature Review
A systematic literature review (often referred to as a systematic review) is a means of identifying, evaluating and interpreting all available research relevant to a particular research question, topic area, or phenomenon of interest [20]. There are many reasons for undertaking a systematic literature review. The most common reasons are:
1. To summarize the existing evidence concerning a technology.
2. To identify gaps in current research in order to suggest areas for further investigation.
3. To provide a background in order to position new research activities.

2.2.1 Features of Systematic Literature Reviews
Some of the features that differentiate a systematic review from a conventional expert literature review [20] are:
Systematic reviews start by defining a review protocol that specifies the research question being addressed and the methods that will be used to perform the review.
Systematic reviews are based on a defined search strategy that aims to find the most relevant literature.
Systematic reviews require explicit inclusion and exclusion criteria to assess each potential primary study.
A systematic review is a prerequisite for quantitative meta-analysis.
A systematic review is conducted mainly in three phases:
Planning the review: At this stage, the need for a review is identified and the review protocol is developed.
Conducting the review: This phase includes primary study selection, data extraction and data analysis.
Reporting the review: Finally, this phase covers reporting the results and documenting the process.
The systematic literature review is one of the leading research methodologies. The principal reason for conducting the systematic literature review here is to collect the data necessary to answer RQ1 by studying relevant research published in articles, journals and conference proceedings from different publication sources, following the predefined review protocol. Systematic reviews are based on distinct search strategies that aim to collect as much of the relevant literature as possible.
Planning the Review
At the planning stage, a review protocol is defined, which includes the search strategy, search string formulation, data sources used, selection criteria, data extraction and quality assessment strategies. The major keywords in the search string were formulated from the research questions, and their synonyms and alternative terms were identified.
(("cloud infrastructure") AND ("performance" OR "practice" OR "operation" OR "efficiency") AND ("issue*" OR "risk*" OR "challenge*" OR "problem*"))
Table 1 SLR Search string.

2.3 Defining the Research Questions
What are the performance issues that are influencing cloud infrastructure?
The purpose of conducting the SLR is to evaluate and interpret all available issues relevant to cloud infrastructure as a service.
The SLR is conducted on databases, articles, journals and conference papers from distinct publishers, following a predefined review protocol. The details and the preliminary results are explained below.

2.4 Defining Keywords
In [20], the PICO (Population, Intervention, Comparison, Outcome) criteria are used to frame research questions with the help of keywords.
Population: An industry group such as telecommunications companies or small IT companies. Here, it refers to a very specific area, and “Cloud Computing” is chosen for this research.
Intervention: The intervention is the software tool that addresses a specific issue. “Cloud infrastructure” is the intervention for this research.
Comparison: This is the tool or procedure with which the intervention is being compared. The comparison is usually the commonly used technology, but the authors do not compare any technology in this research.
Outcomes: Outcomes should relate to factors such as reliability and reduced production costs, and the relevant outcomes should be specified.

2.5 Study Quality Assessment
Researchers should develop quality checklists to assess the individual studies. The checklist developed for the quality assessment [20] [61] is shown in Table 2.
No. | Quality Assessment | Yes/No
1 | Are the aims and objectives clearly stated? | Yes
2 | Is the data collection method described? | Yes
3 | Are the citations in the paper explained? | Yes
Table 2 Quality Assessment Checklist.

2.5.1 Review Protocol
A review protocol specifies the method that will be used to undertake a specific systematic review [61]. The SLR is used to extract papers related to the performance issues of cloud computing infrastructure associated with infrastructure as a service. Papers published in recent years are considered.

2.5.2 Data Extraction
The data extraction process is used to collect accurate and necessary information from the primary studies with minimum bias, in order to address the research question [20].
The data is extracted based on cloud infrastructure issues. The necessary information was collected from popular and well-known databases using the inclusion and exclusion criteria. The extraction mainly focused on performance, issues and challenges in cloud infrastructure.
Title of the paper/article
Name of the author
Publication: Journal article, Conference paper, Book
Database: Engineering Village, IEEE Xplore, Science Direct
Research method: Survey, Case study, Experiment, Experience report
The model proposed
Context: Industry, Academic
Table 3 Data Extraction Strategy.

2.6 Selection Criteria and Procedures
The selection criteria follow Kitchenham [20]. The search strategy includes papers that are appropriate for the research work. Based on the inclusion and exclusion criteria, papers that are not relevant to the research question are filtered out. The inclusion and exclusion criteria are applied to ensure that the interpreted data is identified correctly.

2.6.1 Inclusion Criteria
1. Papers on cloud infrastructure that give information about the research question.
2. Papers on the performance of cloud infrastructure, collected from different databases using the online library provided by Blekinge Tekniska Högskola.
3. Search string: the search string needs to be formed first. Only papers from recent years that are published in English and available in full text are extracted.
4. Title and abstract: studies covering cloud infrastructure in relation to cloud computing, and papers that present the infrastructure issues in cloud computing.

2.6.2 Exclusion Criteria
The exclusion criteria describe the removal of unwanted material and papers by the search strategy.
1. Papers that do not relate to cloud infrastructure in cloud computing.
2. Papers that are not available online or are not published.
3. Papers that are not in English.
4. Duplicates.
Papers whose full text is not available are also removed. The search string is used to extract recently published papers from distinct databases. The selection procedure follows the steps shown in Fig. 2.
Step 1: To identify the papers, both authors simultaneously performed the search in three different databases using the search string. The papers that are relevant to our research question are considered in our research work.
[Fig. 2 flow diagram. Step 1, different databases: IEEE 22, Compendex and Inspec 81, Science Direct 37. Step 2, total papers of the 4 databases: 140. Step 3, 44 repeated or duplicate papers and papers without full text removed: 96 remain. Step 4, screening by title, 41 not relevant removed: 55 remain. Step 5, screening by abstract, 22 removed: 33 remain. Step 6, screening of full text, 19 papers not discussing cloud infrastructure as a service removed: 14 remain.]
Fig. 2 Step by Step process of Systematic Literature Review.
IEEE Xplore: IEEE Xplore is a simple, flexible and convenient database. It has many advanced search options, including logical operators such as AND/OR/NOT for combining keywords easily. With the help of the keywords, a search string is framed to extract the papers relevant to the research, yielding 22 papers (Fig. 2). Papers from the last decade are considered; because cloud computing has become more popular over the past three years, most papers were published in 2008-2011.
Compendex and Inspec: Engineering Village hosts two different databases, Compendex and Inspec. It offers several search modes such as quick search, expert search and thesaurus search. Using the search string, the authors extracted 81 papers published in English in 2008-2011 from Compendex and Inspec together.
Science Direct: Science Direct is another popular database with its own search engine, from which the authors extracted journal papers using the search string.
Step 2: The papers obtained in step 1 by applying the search string to the four databases are filtered by relevance to the research question and by year. After removing papers that are not relevant to the subject of cloud computing, 140 papers remain.
Step 3: In this step, duplicates and papers that are not available in full text are filtered out; 44 papers are removed in this process. Here, the inclusion criterion of full-text availability is applied, and papers repeated across the four databases are also removed. 96 papers remain after this step.
Step 4: Papers on cloud infrastructure performance are identified. Bringing all the databases together gives 96 papers. The papers are then screened by title, and 41 are found not to be relevant.
Step 5: The 55 papers from step 4 are screened for relevance to the research question after studying the title, abstract and conclusion of each paper. The authors identified 33 papers related to the research work. The lists of both authors are then compared; if the selected lists differ, both authors discuss and agree on a single list.
Step 6: Both authors individually read the entire text of the papers obtained from step 5. The authors then analyzed all the information related to cloud infrastructure performance issues in current research to answer the research question. This resulted in 14 papers that are most relevant to the performance of cloud infrastructure as a service.

2.7 Results of SLR
[Fig. 3 bar chart: number of papers (y-axis, 0 to 12) per year (x-axis, 2008 to 2011).]
Fig. 3 Number of papers per Year Wise.
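As a consistency check on the selection procedure of Section 2.6 (Fig. 2), the following minimal Python sketch recomputes the number of papers remaining after each screening step. The counts are taken from the text and the figure; the variable and function names are illustrative only and are not part of the thesis tooling.

```python
# Papers found per database in step 1 (counts from Fig. 2).
found = {"IEEE Xplore": 22, "Compendex and Inspec": 81, "Science Direct": 37}

# Papers removed in steps 3 to 6, in order (counts from Fig. 2).
removed = [
    ("duplicates or no full text", 44),
    ("not relevant by title", 41),
    ("not relevant by abstract", 22),
    ("not about cloud infrastructure as a service", 19),
]


def screening_funnel(found, removed):
    """Return the papers remaining after step 2 and after each later step."""
    remaining = [sum(found.values())]
    for _reason, count in removed:
        remaining.append(remaining[-1] - count)
    return remaining


print(screening_funnel(found, removed))  # [140, 96, 55, 33, 14]
```

The printed list matches the totals reported for steps 2 to 6, which suggests that the counts in the text and in Fig. 2 are mutually consistent.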
Cloud computing research has been growing very rapidly in recent years. The authors conducted an SLR on performance in cloud infrastructure. 14 relevant papers from recent years were selected from different databases. Most of the selected papers are from 2010 and 2011. Fig. 3 shows the number of papers published per year. The SLR results obtained from the different databases are shown in Table 4.
S.no | Ref | Performance issues | Main contribution | Conclusion
1 | [23] | Hardware and network failure | Explains hardware and network MapReduce with different algorithms for fault tolerance in a computing framework. | A new Twister algorithm is developed for fault tolerance.
2 | [24] | Security and hardware | Explains a private data protecting service used for protecting data before storage. | A symmetric key is generated before upload to the cloud.
3 | [25] | Networking | A cloud network as a service is implemented for network failure. | The novel cloud network is implemented for network failure.
4 | [26] | Networking and hardware infrastructure | A thiaific model is designed for monitoring failures in IaaS. Scalability and robustness are increased due to the thiaific model. | It monitors a network of millions of people in less than a second.
5 | [27] | Security | Explains denial-of-service (DoS) attacks in cloud infrastructure. | A novel estimation tool is designed for controlling a DoS attack.
6 | [28] | Large scale applications | GridBatch gives complete control over how data are partitioned and how computation is distributed so that applications can achieve higher performance. There are challenges both in the programming model and in the underlying infrastructure for analyzing data in the cloud. | GridBatch allows writing parallel programs for data-intensive batch applications.
7 | [29] | Data storage | The focus is on cloud infrastructure that can be seamlessly integrated into an enterprise architecture. | An approach towards automated integration of the open source EAM tool iteraplan with private or public infrastructure clouds via push and pull protocols.
8 | [30] | Network and data transfer | To guarantee the service of bulk data transfer in cloud computing. | A cloud infrastructure framework (CISF) and a service oriented broker (SRB) to transfer data to the cloud.
9 | [31] | Network performance | Characteristics of the FutureGrid distributed cloud at the network, transport and application levels. | A flexible sensor-centric grid framework with cloud infrastructure such as FutureGrid.
10 | [32] | Parallel and distributed simulation techniques | Core architecture and simulation as a service are emerging in the public cloud. | An approach to implement ARTIS/GAIA+ simulation based on a multi-agent system.
11 | [33] | Security | Benefits of a Eucalyptus cloud, based on the design and deployment of a trusted Eucalyptus cloud architecture using remote attestation via trusted platform modules (TPM). | Eucalyptus ensures the integrity and confidentiality of user data and computation, addressing security and privacy issues in cloud infrastructure.
12 | [34] | Virtual layer | The virtual layer identifies the effect of services and analyzes the middleware self-managed services. | It provides automated services, security and privacy.
13 | [35] | Industrial | To deploy large scale enterprises on cloud infrastructure implemented within the framework of the IRMOS EU project. | It shows the evaluation, validation and optimization of the implemented service mechanisms.
14 | [36] | Hardware | Data distribution across the cloud is managed by security, quality of service and resource management. | It improves security during storage, as well as storage and cost efficiency.
Table 4 SLR results.

2.8 Cloud infrastructure
This section gives information on the cloud infrastructure of some companies. The information is collected from their websites.
2.8.1 GoGrid
GoGrid [48] is a cloud infrastructure service that offers Linux and Windows virtual machines managed through a control panel, as well as an API for hosting. GoGrid is a private company that competes with Rackspace in dedicated hosting and cloud hosting. The current version of the GoGrid API is 1.8. GoGrid provides powerful, easy-to-use tools for a variety of cloud infrastructure, allowing users to monitor, manage and scale infrastructure in real time. GoGrid also provides the data centers on which the GoGrid infrastructure runs. This makes it easy for a business to operate at multiple locations using a single infrastructure as a service (IaaS) provider. The GoGrid hosted private cloud is secure and dedicated, and offers infrastructure on demand and cost savings. The minimum cost to start is $68.50 per month [49].

2.8.2 Cloud.com
Cloud.com provides a powerful platform that enables enterprise users and service providers to quickly create, manage and deploy cloud computing. According to [50], Cloud.com positions itself as an infrastructure partner for companies that seek to build a cloud computing data center according to their requirements. The benefits include “unlimited” resources that can be accessed on demand, increased business agility because a company invests only in the areas needed to make the business successful, and reduced costs because resources are used only as needed through a public service provider. Cloud.com also follows a pay-as-you-go policy. It is an open source cloud computing platform for building and managing private and public cloud infrastructure [51]. Cloud.com provides three benefits for private clouds: end user self-administration, service offering management and virtual data center deployments.
Citrix: Citrix Systems has acquired Cloud.com, which is a cloud computing provider [52]. Citrix provides software, infrastructure and platform as a service for cloud providers. CloudStack is used to implement simple and cost-effective services.
Citrix offers a secure, scalable and open design for cloud management. Through this acquisition, Citrix aims to grow rapidly as a market leader in infrastructure for cloud providers.

2.8.3 IBM Smart Cloud
IBM Smart Cloud [53] [54] [55] is a cloud computing solution and IBM's brand for its cloud ecosystem. It is a growing part of IBM's cloud computing products. IBM Smart Cloud includes Infrastructure as a Service (IaaS), Software as a Service (SaaS) and Platform as a Service (PaaS) delivery models, provided through public, private and hybrid clouds. IBM offers these in three forms: Smart Cloud Foundation, Smart Cloud Services and Smart Cloud Solutions. IBM offers a flexible approach to the cloud, so that a business can start working on the cloud depending on its needs. The aim is to balance short-term goals with being ready to take advantage of future opportunities. As business growth puts pressure on IT infrastructure, companies are looking to cloud computing to provide IT services. With IBM, the transformative power of cloud computing can change the way business is done. IBM cloud solutions can help to:
Create new business value.
Improve speed and dexterity.
Deliver IT without boundaries.
IBM's High-Performance Computing (HPC) cloud offerings provide methods and management tools for applying cloud computing technology to HPC. The offerings, designed for both privately hosted and private HPC clouds, include:
IBM Intelligent Cluster.
IBM HPC Management Suite for Cloud.
HPC Cloud service from IBM.
IBM provides different charge plans named copper, bronze, silver and gold, each varying in price and services, as well as a pay-per-use plan.

2.8.4 Rackspace
Rackspace is a cloud web hosting provider that bills on a utility computing basis. It is also one of the commercial computing services. The Rackspace [57] clouds are simple, scalable and pay per use.
The Rackspace cloud has been designed and built with a goal in mind: to provide cost-effective, scalable solutions with great service and support. Rackspace cloud servers comprise both Windows and Linux virtual servers that can be deployed in minutes and paid for on an hourly basis. Payment is per use, that is, from the start of a server until it is terminated. The price per hour of each cloud server depends on the selected RAM, data storage type, operating system and server type, so prices vary according to the configuration and requirements [56] [57]. Rackspace extends its service to managed cloud hosting. This product provides cloud monitoring, operating system and application layer infrastructure support, and technical guidance for cloud servers.

2.8.5 Eucalyptus
Eucalyptus provides a software platform for deploying infrastructure as a service clouds, backed by a worldwide development community and professional support. Eucalyptus is interface-compatible with AWS, so there is flexibility to expand across hybrid, private and public clouds, and to use AWS cloud resources for network and storage [59]. Using virtualization software on modern infrastructure, a Eucalyptus cloud can be dynamically scaled up or down according to the application workload. Eucalyptus is specifically designed for hybrid clouds using the industry-standard Amazon Web Services API [58]. The advantages are high efficiency and scalability, and increased confidence and control for IT as a service. For IT infrastructure and data center management focused on cost benefit, Eucalyptus offers a variety of interfaces and a single framework for managing resources. Hardware, network and storage can be consolidated in a Eucalyptus cloud, which hides heterogeneity in hardware, software stacks, policies and configurations.
3 EXPERIMENTAL SETUP
3.1 Experimental Setup for Cloud Performance
Fig. 4 below shows the experimental setup for the calculation of jitter from the cloud to the smartphone and laptop using a RightScale account.
[Fig. 4 diagram: the video application runs in the cloud on the Ubuntu image of the TLC Apache/PHP micro template, with traces captured by TShark (.pcap file). It is accessed over the Internet and Wi-Fi by a laptop (different browsers, Wireshark .pcap file) and a smartphone (different browsers, SharkRoot .pcap file). An NTP server is also part of the setup.]
Fig. 4 Experimental Setup for Cloud performance.
To study the network performance of the cloud, a video-based application embedded within JW Player is deployed in the Amazon AWS EC2 cloud using the RightScale TLC Apache/PHP micro server template. TLC Apache/PHP micro consists of ubuntu_10.04_i386 on a 32-bit micro EC2 instance, and an Apache server is installed in Amazon AWS in the US-East zone. TShark is installed in the cloud for collecting traces on the server side. On the receiver side, experiments are conducted at Blekinge Institute of Technology, Sweden. The Wi-Fi bandwidth usually varies around 12 Mbps. Both the sender and receiver clocks are synchronized using the Network Time Protocol (NTP). Experiments are conducted on both the smartphone and the laptop on the receiver side. The SharkRoot tool is used on the Android HTC Desire smartphone to collect traces at the receiver side. Experiments are repeated 25 times on various browsers on the Android mobile: the Android built-in browser, Firefox, Opera, Xscope and Dolphin. On the laptop, the Firefox, Opera and Safari browsers are considered. Wireshark 1.6.5 is used to collect traces on a Toshiba laptop running 32-bit Windows 7 with an Intel Core 2 Duo processor and 3 GB RAM. The jitter analysis is performed using the collected traces, and the results are analyzed from the obtained graphs.

3.2 Experimental Procedure
The block diagram in Fig. 5 gives the experimental procedure in detail. The steps are explained below.
Fig. 5 Block diagram for analysis of Jitter from Cloud.
The services of the RightScale cloud management provider are used. The Logica company and Telecom City of Karlskrona have a subscription to RightScale, and they provided the cloud services for this thesis. Amazon Web Services and EC2 are accessed through RightScale.

3.2.1 Right Scale
RightScale is a web-based platform for managing cloud infrastructure from multiple providers [37]. RightScale manages all three kinds of cloud: public, private and hybrid. Workloads across private and public clouds run on distinct services such as Amazon Web Services (AWS), Rackspace, Logicworks, SoftLayer and Tata. The public cloud has fundamentally sped up enterprise IT. Expectations are changing day by day: because of the pay-per-use model, developers and lines of business are showing more interest in it. RightScale cloud management is a platform that brings together an entire ecosystem for cloud-based IT [38]. Its partners develop different server template software to serve a large number of users while providing security and handling large data. Rather than purchasing servers and networks, clients obtain those resources through a platform such as RightScale. RightScale cloud management is the bridge between an application and the cloud infrastructure. RightScale provides portability and automation, and controls user permissions, audit entries and version control. It supports multiple public and private clouds. RightScale is a leading provider in cloud infrastructure management. It now also provides software for VMs, storage and networking, and it provides the tools to create one's own data center that is automated, reliable and secure.

3.2.2 Server Template
RightScale provides many server templates that are ready to configure, built by the RightScale team and other partners.
Server templates allow users to configure servers starting from a base image and adding scripts that run during the boot, operational and shutdown phases. The main idea of a server template is to boot a server from one of a set of images and configure it at boot time using a list of scripts that install and configure all the software [39].
TLC Apache/PHP Micro
This is the server template available in the RightScale MultiCloud Marketplace that is supported by Amazon Web Services. This template is used in the thesis work. It is an AWS t1.micro all-in-one server with a 32-bit Ubuntu operating system, Apache, PHP, common PHP modules and site-enabling links manageable through RightScale. The contents used to configure the server template are:
Multi Cloud Image: Ubuntu_10.04_i386_micro
Right Script: Apache Ubuntu vhost configure
Right Script: SYS SYSLOG remote logging client - 11H1
Right Script: SYS Time Zone Set - 11H1
Right Script: MAIL Postfix local delivery - 11H1
Right Script: WEB Apache (re) starts - 11H1
Right Script: WEB Apache base install - 11H1
Right Script: WEB PHP installs - 11H1

3.3.3 Amazon Web Services (AWS)
In 2006, AWS began to provide IT infrastructure services to all types of business in the form of web services, now called cloud computing [41]. The advantage of cloud computing is the chance to replace up-front capital infrastructure expenses with low variable costs, which helps the business. There is no need to set up one's own servers and infrastructure. Thousands of servers can be provided within minutes as required, so results are delivered faster. Amazon is a partner of RightScale, so the authors chose the Amazon cloud. Today, AWS is a highly scalable, reliable, efficient, open, flexible and low-cost infrastructure platform in the cloud that serves hundreds of companies and businesses around the world [41].
AWS covers the world by providing data centers at different locations in the U.S., Europe, Singapore and Japan. It has now also launched servers in Oregon and São Paulo. The main concept of AWS cloud computing is pay-as-you-go pricing with no up-front investment or expenses and no long-term plans or commitments. AWS provides a flexible, cost-effective, secure, scalable and easy-to-use cloud computing platform for businesses of all sizes [41]. It is convenient to deploy applications and services with greater flexibility, scalability and reliability in AWS. The application for this research work is deployed in US-East.

3.3.4 Amazon Elastic Compute Cloud (Amazon EC2)
Amazon EC2 [42] is a web service that provides resizable compute capacity in the cloud for developers. EC2 offers a new approach to web hosting by allowing the number of servers to be increased or decreased within minutes according to the service required. EC2's simple web service interface allows capacity to be obtained and configured so that applications can run easily in the cloud. Amazon EC2 charges only for what is used. Features such as Elastic Load Balancing, Auto Scaling and Amazon CloudWatch monitoring are provided. EC2 has different instances, images, security groups, SSH keys, Elastic IPs and placement options, which are configured as required. The following steps are performed using Amazon EC2:
Select a pre-defined template if one is available in the marketplace, then import and run it immediately; or create your own image according to the application requirements, with the libraries, data and configuration settings installed.
Set up security and network access on the EC2 instance.
Decide whether to run the application in different zones or locations.
Pay per use for what you actually consume, such as per-hour charges or data transfer.
3.3.5 EC2 Instances
Instances free the user from the cost of planning, purchasing and maintaining hardware, so this setup provides a low cost. EC2 offers both 32-bit and 64-bit instance types, to be chosen according to the application requirements. Applications that need high-performance network interconnects along with a high-performance CPU can use cluster compute instances.
SSH Keys
Before a deployment is launched, an image is selected and a Secure Shell (SSH) key is specified to link to that image. It is better to create one's own SSH key from the RightScale dashboard. The SSH key is passed into the new instance to allow root login access to the instance via SSH. This is an accepted and secure way to communicate with instances.
Security Groups
Amazon has developed security groups, which are essentially firewalls for EC2 servers. Traffic is filtered based on IP address, packet type and port. A security group specifies the incoming ports that are opened for connections to the instance. At the launch of an EC2 server, at least one security group needs to be assigned. Multiple security groups are usually required if there are multiple deployments that require different levels of accessibility. All security groups must have port 22 open in order to support root-level access to the machine via SSH.
Elastic IPs
Once an instance is launched, an Elastic IP (EIP) is associated with the running instance, so that the application is linked to that EIP address. For example, the web page can be viewed using that IP address.

3.3.6 Amazon Simple Storage Service (Amazon S3)
Amazon S3 is simple storage for the internet. It provides a web service interface to store and retrieve any amount of data at any time and from anywhere on the web. Each data object is stored in a bucket and is retrieved through a developer-assigned key. A bucket can be stored in one of the regions.
Objects stored in a particular region remain in that region unless they are explicitly moved to another region. Authentication is required to ensure that data is kept secure. S3 gives the developer access to the reliable, secure, fast and inexpensive infrastructure that Amazon uses to run its own global network of websites [43].

3.3.7 JW Player

JW Player version 5.8 is an open-source embeddable video player [46] that supports both audio and video formats. With the help of a JW Player script, a video application is deployed in the cloud. JW Player is designed to work across browsers on both old and new devices. Its embed script supports both Flash and HTML-based JavaScript playback. This work focuses on streaming an .mp4 file from the cloud.

3.3.8 TShark tool

TShark (version 1.6.5) is the command-line version of Wireshark. It is a network protocol analyzer designed for capturing and displaying packets in the terminal. TShark's native capture file format is pcap, which is also supported by tcpdump and other tools [45]. TShark is installed in the cloud to collect traces on the server side in the form of pcap files. The command used for collecting a pcap file is:

tshark -i <interface> -w filename.pcap

The traces are then downloaded to the local machine using secure copy (scp) over SSH:

scp filename.pcap root@ipaddress:/home/path

3.3.9 Wireshark tool (1.6.5)

Wireshark is an open-source network packet analyzer [45] and a validated tool for capturing network packets and displaying their contents. It is cross-platform, using the GTK+ toolkit for its user interface and pcap for packet capture. It runs on various operating systems and is free software. Wireshark is installed on the laptops, and traces are collected while using various browsers such as Opera, Firefox and Chrome. The captured files are in pcap format.
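To illustrate what the pcap traces contain, the following Python sketch reads the packet timestamps and captured lengths from a pcap byte string. It is only an illustration and not part of the tool chain used in this work. It assumes the classic libpcap file layout with microsecond timestamps (a 24-byte global header followed by 16-byte record headers); the function names read_pcap_records and build_demo_pcap are hypothetical.

```python
import struct

PCAP_MAGIC_USEC = 0xA1B2C3D4  # classic pcap magic number, microsecond timestamps


def read_pcap_records(data: bytes):
    """Return a list of (timestamp_in_seconds, captured_length) tuples.

    Assumes the classic libpcap layout (not pcapng).
    """
    if len(data) < 24:
        raise ValueError("too short for a pcap global header")
    # The magic number tells us the byte order used by the capturing host.
    if struct.unpack("<I", data[:4])[0] == PCAP_MAGIC_USEC:
        endian = "<"
    elif struct.unpack(">I", data[:4])[0] == PCAP_MAGIC_USEC:
        endian = ">"
    else:
        raise ValueError("not a classic microsecond-resolution pcap file")

    records = []
    offset = 24  # skip the global header
    while offset + 16 <= len(data):
        # Record header: seconds, microseconds, captured length, original length.
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16]
        )
        if offset + 16 + incl_len > len(data):
            break  # truncated last record
        records.append((ts_sec + ts_usec / 1_000_000, incl_len))
        offset += 16 + incl_len
    return records


def build_demo_pcap() -> bytes:
    """Build a tiny synthetic capture with two packets (demo data only)."""
    # Global header: magic, version 2.4, timezone offset, sigfigs,
    # snapshot length, link type (1 = Ethernet).
    header = struct.pack("<IHHiIII", PCAP_MAGIC_USEC, 2, 4, 0, 0, 65535, 1)
    body = b""
    for ts_sec, ts_usec, size in [(10, 0, 60), (10, 500000, 1514)]:
        payload = b"\x00" * size
        body += struct.pack("<IIII", ts_sec, ts_usec, size, size) + payload
    return header + body


if __name__ == "__main__":
    for timestamp, length in read_pcap_records(build_demo_pcap()):
        print(timestamp, length)
```

The timestamps returned by such a reader are the raw material for the jitter, round trip time and page loading time calculations described later.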
3.3.10 Network Time Protocol (NTP)

NTP is a protocol, with accompanying software, for synchronizing computer clocks over packet-switched, variable-latency networks. The pool.ntp.org project consists of a large number of time servers that provide an easy-to-use NTP service for a large number of clients. An NTP application for Android phones is available in the Android Market.

3.3.11 Mobile Browsers

An HTC Desire smartphone running Android version 2.2.1 is used. Five browsers are considered for the experiments: the Android built-in browser, Opera, xScope, Dolphin and Firefox [47].

3.3.12 Superuser and Shark Root

Superuser is an application available in the Android Market [47]. It gives access to su (root privileges), which has permission to modify any data on an Android device. Superuser, in conjunction with the Shark Root tool, makes it possible to capture network packets. Shark Root is a traffic sniffer that works on both 3G and Wi-Fi and is similar to Wireshark. It monitors all network activity on the Android phone and stores the captured information in .pcap format. To validate the tool, different experiments are performed. In one experiment, traces are collected at both the server and the client side. In another, the Android phone is set up as a router and connected to a laptop over Wi-Fi; Wireshark is then used to collect traces on the laptop, and the two sets of traces are compared to validate the tool.

3.4 Results and Analysis

The pcap files are used to analyze the network performance metrics.

3.4.1 PCAP to text conversion

The .pcap files are converted to text using TShark, for convenient and easy analysis. The command used is:

tshark -r file.pcap > file.txt

3.4.2 Jitter Calculation

Jitter ($J_n$) is calculated as the difference between the inter-arrival times of consecutive captured packets [14].
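The definition above can be illustrated with a short Python sketch. It is not the MATLAB script used in this work; the function names are hypothetical, and the arrival times are assumed to be given in milliseconds and already restricted to the packets of interest.

```python
def inter_arrival_times(arrival_times):
    """Differences between the arrival times of consecutive packets."""
    return [later - earlier
            for earlier, later in zip(arrival_times, arrival_times[1:])]


def jitter(arrival_times):
    """Jitter as the difference between consecutive inter-arrival times."""
    deltas = inter_arrival_times(arrival_times)
    return [later - earlier for earlier, later in zip(deltas, deltas[1:])]


if __name__ == "__main__":
    # Four packets arriving at 0, 10, 25 and 30 ms (made-up values).
    print(jitter([0.0, 10.0, 25.0, 30.0]))
```

For the made-up arrival times in the example, the inter-arrival times are 10, 15 and 5 ms, so the jitter values are 5 and -10 ms. A trace with N packets yields N - 2 jitter values.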
Jitter is calculated from the obtained traces, from the end of the three-way handshake (SYN, ACK) until the last packet (FIN, ACK) from the server. Let $T_{R,n}$ be the time at which the $n$th packet is received and $T_{R,n-1}$ the time at which the $(n-1)$th packet is received. The jitter is then calculated as

$\Delta T_{R,n} = T_{R,n} - T_{R,n-1}$
$J_n = \Delta T_{R,n} - \Delta T_{R,n-1}$

3.4.3 MATLAB and graph analysis

MATLAB (MATrix LABoratory, version 7.12.1) is a tool for visualization and numerical computation [64], and is convenient for statistical analysis. A script is written that reads the text files, filters by IP address, and calculates packet sizes and jitter. To validate the script, the metrics are first calculated by hand and compared with the MATLAB output. Different graphs are then plotted, such as jitter with respect to sequence number, packet size, semi-log and CDF graphs.

3.5 Results

The experiments are repeated 25 times with each browser on the smartphone and the laptop. All experiments are carried out during different time slots at BTH. Before every experiment, the caches on the smartphone and laptop are cleared, and the traces are then collected with the packet capture tools. Fig. 6 shows the CDF of packet size for the experiments with various browsers on the HTC Desire and the Toshiba laptop. From the obtained traces, it is observed that approximately 99% of the packets sent from the cloud server to the client smartphone and laptop have a size of 1514 bytes.

[Figure: empirical CDF F(x) versus packet size.]
Fig. 6 CDF graph of Packet Size.

3.5.1 Jitter performance in Smart Phone Browsers

The experiments were performed on five leading Android browsers: the Android browser, xScope, Firefox, Opera and Dolphin. The experiments were repeated 25 times with each browser.
The traces are collected using Shark Root on the HTC Desire phone. From the obtained traces, jitter is calculated as described in Section 3.4.2. The performance of the video application depends upon the network and the browser. From the jitter values of the various browsers, the semi-log CDF graph shown in Fig. 7 is plotted. The Opera browser shows better jitter performance than the other browsers. This is because fewer retransmissions occur in the packets captured with Opera, whereas more resets are transmitted with the other browsers. Opera therefore performs well in terms of jitter. Detailed graphs of jitter for the various browsers are given in the Appendix.

[Figure: semi-log CDF of jitter (ms) for the Android, Firefox, Opera, Dolphin and xScope browsers.]
Fig. 7 Semi-log graph for the calculation of jitter in Smart phone for various Browsers.

3.5.2 Jitter Performance in Laptop Browsers

The experiments were carried out with three browsers on a laptop: Chrome, Firefox and Opera. The experiments were repeated 25 times with each browser. The traces are collected using Wireshark on the Toshiba laptop. From the obtained traces, jitter is calculated as described in Section 3.4.2. From the jitter values of the various browsers, the CDF graph shown in Fig. 8 is plotted. According to the plotted graph, Firefox is the best browser for the video application in the cloud.

[Figure: semi-log CDF of jitter (ms) for the Firefox, Opera and Chrome browsers.]
Fig. 8 Semi-log graph for jitter calculation for various browsers.

4. Experimental Setup for Local Server

[Figure: setup with the elements NTP server, laptop Wi-Fi spot, Wi-Fi of smartphone, smartphone and laptop.]
Fig. 9 Experimental setup for Local Server.

The experimental setup for the local server is shown in Fig. 9.
For this experiment, an HP laptop running 64-bit Windows 7, with an Intel Core i3 processor and 4 GB RAM, acts as the server. WAMP Server (version 2.2) or Nginx (version 1.1.13) is installed on the laptop to act as a local server. First, the video application is deployed on the Apache server, and a Wi-Fi hotspot is set up on the laptop. The laptop then acts as a router for the smartphone, which gets network access through the laptop's Wi-Fi hotspot. Both laptop and smartphone are synchronized to an NTP server. The pcap files are captured using Shark Root while the video streaming application is accessed from the smartphone. The experiments are repeated 25 times for each browser available on the smartphone. From the captured traces, page loading time and round trip time are calculated for the local server with the various browsers. The experimental setup for the cloud server is explained in Section 3.1.

[Figure: block diagram with the elements Video Application (Laptop), Apache Server (WAMP) and nginx server, Laptop Wi-Fi Hotspot, Smart phone Wi-Fi, Different Browsers, NTP Server, Wireshark, Sharkroot and the resulting .pcap files.]
Fig. 10 Block Diagram for Local Server.

Experiments are conducted on the local machine with two different servers, as shown in Fig. 10.

Wamp Server

WAMP Server (2.2C) is freely available open-source software that is easy to configure. It manages the Apache and MySQL services [61]. Apache is a freely available open-source web server platform.

Nginx Server

Nginx is an open-source, high-performance HTTP server. It provides high performance, reliability, scalability and security, and it is consistently efficient. The version of Nginx used is 1.1.13 [62].

Some optimizations are considered before conducting the experiments. Page weight is important for the page loading time in mobile browsers, so unnecessary comments, white space and timeline are removed from the script. Placing the video file, CSS and JavaScript in separate files is also considered.
To reduce the time required to send the request and receive the response, the page size is minimized, which also reduces the number of HTTP lookups. The caches are cleared before every run; if cached content is present, the browser displays the last cached version of the page. The embedded JavaScript player is used so that the web page appears quickly and responds quickly to the user. The experiments are repeated to determine which browser performs better with respect to page loading time and round trip time.

4.1 Round Trip Time (RTT)

Round trip time is the time between the user sending a request and the response arriving from the server. It is one of the most important parameters in cloud computing. The aim here is to find out how the round trip time varies with the local server. RTT is measured as the time between the SYN and SYN-ACK packets of the three-way handshake. The time between SYN and SYN-ACK is a good predictor of the minimum RTT and gives an estimate of the average RTT, but it is a poor predictor of the maximum RTT [11] [12]. The network monitoring tool is placed on both the server and the client side to capture the TCP packets. From the captured packets, RTT is calculated using the SYN and SYN-ACK packets.

4.2 Results

4.2.1 RTT for cloud server

[Figure: round trip time (ms) versus experiment number for the Android, Dolphin, Firefox, Opera and xScope browsers.]
Fig. 11 Round trip time for various browsers in Cloud.

This experiment is carried out to calculate the round trip time to the cloud server. Packets are collected 25 times with each of the smartphone browsers. The time between the captured SYN and SYN-ACK packets is taken as the round trip time. Fig. 11 shows the RTT for the five browsers while accessing a video from the cloud server. The RTT values are similar for almost every browser, so according to this analysis RTT does not depend on the mobile browser. Instead, RTT depends on network latency, server response time, CPU and hardware.
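The RTT measurement described in Section 4.1, the time between the client's SYN and the server's SYN-ACK, can be sketched in Python as follows. This is a minimal illustration, not the script used in this work. The Packet record and the function name handshake_rtt are hypothetical, and the sketch assumes that the packets of one TCP connection have already been extracted from the trace in capture order. If the SYN is retransmitted, the first SYN is used, which is one possible choice among several.

```python
from typing import NamedTuple, Optional, Sequence


class Packet(NamedTuple):
    """Simplified view of one captured TCP packet (hypothetical record)."""
    time_ms: float
    src: str
    dst: str
    syn: bool
    ack: bool


def handshake_rtt(packets: Sequence[Packet], client: str, server: str) -> Optional[float]:
    """Time between the client's first SYN and the server's SYN-ACK, in ms."""
    syn_time = None
    for p in packets:
        if p.syn and not p.ack and p.src == client and p.dst == server:
            if syn_time is None:
                syn_time = p.time_ms  # keep the first SYN only
        elif p.syn and p.ack and p.src == server and p.dst == client:
            if syn_time is not None:
                return p.time_ms - syn_time
    return None  # handshake not found in the trace


if __name__ == "__main__":
    trace = [
        Packet(100.0, "phone", "cloud", syn=True, ack=False),   # SYN
        Packet(231.5, "cloud", "phone", syn=True, ack=True),    # SYN-ACK
        Packet(232.0, "phone", "cloud", syn=False, ack=True),   # ACK
    ]
    print(handshake_rtt(trace, "phone", "cloud"))
```

With the made-up timestamps in the example, the result is 131.5 ms. Because only the handshake packets are used, the value reflects network and server behavior at connection setup and not any later processing in the browser.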
Experiments are therefore conducted in a controlled environment to minimize network latency. To assess the browser performance, the experiments are conducted on two different servers, Apache and Nginx.

4.2.2 RTT for Apache Server

[Figure: round trip time (ms) versus experiment number for the Android, Opera, Firefox, Dolphin and xScope browsers.]
Fig. 12 Round Trip Time for Apache local server.

Fig. 12 shows the results for the various smartphone browsers when the video is deployed on the Apache server and accessed from the smartphone. The packets are collected at both the client and the server side. From the captured packets, RTT is calculated from the SYN and SYN-ACK packets. The experiment is conducted for 25 runs with each browser. According to the RTT values for the Apache server, Firefox shows better performance than the other smartphone browsers.

4.2.3 RTT for Nginx Server

[Figure: round trip time (ms) versus experiment number for the Android, Firefox, Opera, Dolphin and xScope browsers.]
Fig. 13 Round trip time for Nginx local server.

The experiment is conducted in the same way as for the Apache server. Here, the Apache server is replaced with the Nginx server to find out whether RTT depends on the browser. The video is deployed on the Nginx server and accessed from the smartphone using the various browsers. According to the RTT values calculated for the Nginx server, Firefox shows the poorest performance of all the browsers in Fig. 13. From Sections 4.2.2 and 4.2.3, the same Firefox browser shows the best RTT performance with the Apache server and the poorest with the Nginx server. According to this analysis, RTT therefore does not depend on the browser; it depends mostly on the server, memory and CPU.

4.3 Page Loading Time

The page loading time [18] [19] is defined as the time taken to load the web page from the server in a mobile browser or laptop browser.
Page loading time is calculated from the first HTTP GET request packet sent by the smartphone or laptop to the last FIN-ACK response sent by the cloud server to the smartphone or laptop while the video is accessed from the cloud server. The page loading time of a mobile browser depends upon the hardware, the software, the server response time, the network and the bandwidth.

Mobile browsers play a vital role on smartphones, PDAs, tablets and similar devices. They are also known as mini browsers, micro browsers or wireless Internet browsers. A mobile browser displays the content of web pages and is designed around the hardware, the operating system, low power consumption and low bandwidth. Hardware development has progressed rapidly to increase the performance of mobile browsers. A mobile browser gives the end user access to the data services platform provided by the mobile operator. For the end user, a mobile browser should be fast, effective, reliable and secure.

The performance of a mobile browser depends on the user interface, browser engine, JavaScript interpreter, networking subsystem, XML parser, UI back end and data persistence subsystem. The user interface provides features such as content display, toolbar, page loading and download options. The browser engine is the software that takes a URL and supports browsing actions such as forward, backward and reload; it loads the web page content and displays it on the screen. The rendering engine displays the content of the given URL, including HTML and XML documents styled with CSS. The networking subsystem implements transfer protocols such as HTTP and FTP, and maintains a cache of recently retrieved resources. The JavaScript interpreter executes the JavaScript code embedded in web pages. JavaScript is an object-oriented programming language developed by Netscape (Netscape, 2008).
The XML parser translates an XML document into an XML DOM object. The display back-end subsystem depends mainly on the operating system and on the user interface widgets. The data persistence subsystem stores data across browser sessions, such as cookies, bookmarks and toolbar settings. The performance of a mobile browser depends not only on page loading time but also on factors such as network latency, server response time and hardware. To avoid network latency, experiments are conducted in a controlled environment on two different servers, WAMP Server and Nginx.

Experiments on page loading performance are conducted on both the local servers and the cloud server, and the page loading time is calculated for each. A video application is deployed on all the servers, and the video is accessed through the various browsers available on the Android phone and on the laptop. In this work, the page loading time is the time taken for the video to load from the server and play on the smartphone or laptop in each browser. The traces are collected on the smartphone and the laptop.

4.4 Results

4.4.1 Page Loading Time for cloud

Page load performance is measured as the total time taken for the web browser to display the whole web page content after the user's request is sent to the server. Page loading time is calculated from the first HTTP GET request packet from the smartphone to the last FIN-ACK response from the cloud server or the local server to the smartphone, while the video is accessed from that server through the various browsers.

[Figure: page loading time (s) versus experiment number for the Android, Dolphin, Firefox, Opera and xScope browsers.]
Fig. 14 Page loading time for various browsers in cloud.

From Fig.
14, the page loading time of the various browsers while accessing the video from the cloud server can be seen. The experiments are repeated 25 times for each browser. The experiment number is on the x-axis and the time in seconds on the y-axis. The graph shows how long the video takes to load and play in each browser on the Android device. Opera is the best browser with respect to page loading time. According to the obtained traces, Opera performs better because fewer packet retransmissions occur as a result of a full TCP window. The page loading time depends upon the network latency and the server response time. To reduce the network latency, experiments are conducted on a local server in a controlled environment.

4.4.2 Page loading time for Apache Server

[Figure: page loading time (s) versus experiment number for the Android, Opera, Dolphin, xScope and Firefox browsers.]
Fig. 15 Page loading time for various browsers in Apache local server.

The experiments are conducted in a controlled environment to calculate the page loading time for the Apache server. They are repeated 25 times for each browser while a video application is accessed on the smartphone. The time taken by the various browsers is shown in Fig. 15, and the average page loading times are given in Table 5. The caches are cleared before every run in order to measure the exact time needed to load the video. For a video with a playing time of 322 seconds, each browser takes approximately 266-270 seconds. The time taken is similar for all the smartphone browsers above.

4.4.3 Page loading time for Nginx Server

[Figure: page loading time (s) versus experiment number for the Android, Opera, Firefox, Dolphin and xScope browsers.]
Fig. 16 Page loading time for various browsers in nginx local server.
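As defined in Section 4.4.1, the page loading time is the interval from the first HTTP GET request of the client to the last FIN-ACK of the server. The following Python sketch illustrates this calculation. It is not the MATLAB script used in this work; the Packet record and the function name page_loading_time are hypothetical, and the sketch assumes that each captured packet has already been classified (HTTP GET, FIN and ACK flags) when the trace was converted to text.

```python
from typing import NamedTuple, Optional, Sequence


class Packet(NamedTuple):
    """Simplified view of one captured packet (hypothetical record)."""
    time_s: float
    src: str
    dst: str
    http_get: bool = False
    fin: bool = False
    ack: bool = False


def page_loading_time(packets: Sequence[Packet], client: str, server: str) -> Optional[float]:
    """Seconds from the client's first HTTP GET to the server's last FIN-ACK."""
    first_get = None
    last_fin_ack = None
    for p in packets:
        if p.http_get and p.src == client and p.dst == server:
            if first_get is None or p.time_s < first_get:
                first_get = p.time_s
        elif p.fin and p.ack and p.src == server and p.dst == client:
            if last_fin_ack is None or p.time_s > last_fin_ack:
                last_fin_ack = p.time_s
    if first_get is None or last_fin_ack is None or last_fin_ack < first_get:
        return None  # the trace does not contain a complete page load
    return last_fin_ack - first_get


if __name__ == "__main__":
    trace = [
        Packet(1.25, "phone", "server", http_get=True),          # page request
        Packet(3.0, "phone", "server", http_get=True),           # video request
        Packet(120.0, "server", "phone", fin=True, ack=True),    # first connection closes
        Packet(267.5, "server", "phone", fin=True, ack=True),    # last connection closes
    ]
    print(page_loading_time(trace, "phone", "server"))
```

With the made-up timestamps in the example, the result is 266.25 seconds. Taking the earliest GET and the latest FIN-ACK means that all connections opened for the page, including the one carrying the video, are covered by the measurement.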
The experiments are repeated in the same way as for the Apache server, and the results are shown in Fig. 16. Before every experiment, all caches, the browser history and saved cookies are removed. Here, all the smartphone browsers show similar performance with the Nginx server.

Browsers          Apache server (s)   Nginx server (s)
Android browser   267.265             266.017
Opera             266.118             265.730
Firefox           267.033             266.639
Dolphin           270.084             266.954
xScope            268.241             266.778

Table 5 Average Page Loading time in local servers.

The average page loading times for the two servers are shown in Table 5. According to the results, the page loading times for the Apache server and the Nginx server are approximately equal. Page loading performance is therefore strongly influenced by hardware capacity, processor and RAM.

Conclusion

We have presented a systematic literature review (SLR) on the performance of cloud infrastructure in cloud computing. To answer research question 1, an SLR on cloud infrastructure was carried out to examine cloud performance. The related papers are summarized in tabular form, detailing the security, hardware, power consumption and network issues of cloud infrastructure. To answer the second research question, a video application was deployed in the cloud and the jitter performance was observed while the video loaded and played on a smartphone and a laptop in various browsers. On the smartphone, the Opera browser shows better jitter performance than the other browsers in the cloud. On the laptop, Firefox shows the best jitter performance. The third research question concerns the round trip time of mobile browsers while accessing a video application in the cloud. The RTT results are similar for all browsers, as shown in the graph, so it can be concluded that RTT does not depend on the mobile browser. RTT depends upon CPU, memory, server response time and network latency.
The experiments were then conducted in a controlled environment on two different servers. The RTT graphs show that Firefox performs best on one server and worst on the other. Hence we can conclude from these experiments that RTT does not depend on the browser; it depends on the server response time. The fourth research question concerns the time taken to load and display the whole web page content on the screen. Page loading time was measured with the smartphone on three servers: one cloud server and two local servers. The experiments were conducted with various browsers, but the page loading time is similar for all browsers on the same server. Page loading time therefore depends upon hardware, CPU and memory.

Future Work

This work can be extended in many ways. Firstly, it can be linked to Quality of Experience (QoE). Additional parameters such as throughput, packet loss, and CPU and memory utilization can be measured. The experiments can be repeated with different smartphones, and the same performance issues can be studied over 3G. It would be interesting to compare the performance of various cloud providers; by increasing the load, the scalability of different cloud providers can be analyzed. Browser performance can be evaluated on more advanced hardware and smartphones. Cloud network performance might also be improved by different algorithms.

References

1. Z. Ganon, I. E. Zilbershtein, “Cloud-based Performance Testing of Network Management Systems,” in PROC. 14th International conf. On Computer Aided modeling and Design of Communication links and Networks, 2009, pp. 1-6. 2. A. Miha, D. Amrhein and P. Anderson, Cloud Computing Use Cases White Paper. Version 4, July 2010. 3. Amazon Elastic Compute Cloud (Amazon EC2) http://aws.amazon.com/ec2/. Retrieved [2011-05-18]. 4. Barber. A. Kitchenham et.al, "Preliminary Guidelines for Empirical Research in software Engineering," in PROC. IEEE transactions on software Engineering, vol. 28, no. 8, Aug. 2002. 5. B.
Raj Kumar, Y. S. Chee, S. Venugopal, “Cloud Computing and Emerging IT Platforms: Vision, hype, and reality for delivering computing as the 5th utility,” in PROC. IEEE International conf. On Computer systems, 2009, pp. 599-616. 6. V. T. Anthony, V. J. Toby and E. Robert, Cloud Computing: A Practical Approach, Cambridge, McGraw Hill, 2010. 7. H. Falaki, D. Lymberopouls and R. Mahajan, “A First look at Traffic on Smart Phones,” in PROC. 10TH Annual conf. on Internet Measurement, Newyork, USA, 2010. 8. I. Trestian, S. Ranjan, A. Kuzmanovic, and A. Nucci. Measuring serendipity: Connecting people, locations and interests in a mobile 3G network. In IMC, 2009. 9. A. Iosup, S. Ostermann, M. N. Yigitbasi, T. Fahringer and D. H. J. Epema, “Performance Analysis of Cloud Computing Services for Many Tasks Scientific Computing,” in PROC. IEEE Transactions on Parallel and Distributed systems, vol. 22, no. 6, 2011, pp. 931-945. 10. S.Y. Park, H. S. Ahn and W. Yu “Round-Trip time based Wireless Positioning without Time Synchronization,” in PROC. International conf. on control, automation and systems,2007, pp. 2323-2326. 11. Yolanda Tsang; Yildiz, M.; Barford, P.; Nowak, R, "On the Performance of Round Trip Time Network Tomography," Communications, 2006. ICC'06. IEEE International Conference on, vol.2, pp.483-488, June 2006. 12. S. Phillipa and A. Mahanti, “Observations on Round- Trip Times of TCP Connections”, Canada. 13. Yujie Pei; Hongbo Wang; Shiduan Cheng, "A passive method to estimate TCP round trip time from nonsender-side," Computer Science and Information Technology, 2009. ICCSIT 2009, 2nd IEEE International Conference on, pp.43-47, 8-11 Aug. 2009. 14. S. Ickin, K. D. Vogeleer, M. Fiedler, D. Erman., "On the Choice of Performance Metrics for User-Centric Seamless Communication," in Third Euro-NF IA. 7.5 Workshop on socio economic Issues of Networks of the Future, Ghent, Belgium, 2010. 15. J. Huang, Q. Xu, Z. M. Mao, M. Zhang, P. 
Bahl, “Anatomizing Application Performance Differences on smartphones,” from Microsoft Research, University of Michigan, 2010. 16. W. Michael, N. Corey, C. Hsin-Ping and C. Jui-Hung, “Comprehensive Analysis of Smartphone OS Capabilities and Performance,” Wireless Internet and Pervasive Computing, April 20, 2009. 17. W. Jonatan, O.Mikael, “Comparison of CPU management in Symbian,” from OS team and Microsoft windows, November 19, 2006. 36 18. S. S. Regmi, S.M.S. Adhikari “ Network Performance of HTML 5 Web Application in Smartphone,” MSc Thesis, Dept. of Telecommunication System at School of Computing (COM), Blekinge Institute of Technology (BTH), Karlskrona, Sweden, Nov 2011. 19. F. Hossein, L. Dimitros, M. Ratul, S. Kandula, E. Deboran, A First Look on Traffic on Smartphone, [Online] Available http://www.cs.ucla.edu/~falaki/pub/imc153s-falaki.pdf 20. Kitchenham, B.; Charters, S.;, "Guidelines for performing Systematic Literature Reviews in Software Engineering," Keele University and Durham University Joint Report EBSE 2007001, 2007. 21. Cervino, J.; Rodriguez, P.; Trajkovska, I.; Mozo, A.; Salvachua, J.; , "Testing a Cloud Provider Network for Hybrid P2P and Cloud Streaming Architectures," Cloud Computing (CLOUD), 2011 IEEE International Conference on , vol., no., pp.356-363, 4-9 July 2011(jitter, cloud) 22. S. L. Garfinkel, “An evaluation of amazon’s grid computing services: Ec2, s3 and sqs,” Center for, Tech. Rep., 2007. (10, jitter) 23. Srirama .S.N, Jakovits .P, Vainikko, E, "Adapting scientific computing problems to clouds using MapReduce," Future Generation Computer Systems, vol. 28, no. 1, pp. 184-192, 2012. 24. X. Yang, Q. Shen and Y. Yang, “A way of key management in Cloud storage based on trusted computing,” in 8th IFIP International Conference on Network and Parallel Computing, 2011. 25. Benson. T, Akella . A and Shaikh. A, "CloudNaaS: A cloud networking platform for enterprise applications," in 2nd ACM Symposium on Cloud Computing, 2011. 26. Adya. 
A, Cooper. G and Myers. D, "Thialfi: A client notification service for internet-scale applications," in 23rd ACM Symposium on Operating Systems Principles, United States, 2011. 27. Eshete. B, Villafiorita. A and Weldemariam. K, "A new form of dos attack in a cloud and its avoidance mechanism," in ACM Workshop on cloud computing security , 2010. 28. Huan Liu; Orban, D, "GridBatch: Cloud Computing for Large-Scale Data-Intensive Batch Applications," Cluster Computing and the Grid, 2008. CCGRID'08. 8th IEEE International Symposium on, vol., no., pp.295-305, 19-22 May 2008 29. Farwick, M.; Agreiter, B.; Breu, R.; Häring, M.; Voges, K.; Hanschke, I, "Towards Living Landscape Models: Automated Integration of Infrastructure Cloud in Enterprise Architecture Management," Cloud Computing (CLOUD), 2010 IEEE 3rd International Conference on, vol., no., pp.35-42, 5-10 July 2010 30. Yichao Yang; Yanbo Zhou; Lei Liang; Dan He; Zhili Sun, "A Sevice-Oriented Broker for Bulk Data Transfer in Cloud Computing," Grid and Cooperative Computing (GCC), 2010 9th International Conference on , vol., no., pp.264-269, 1-5 Nov. 2010 31. Fox, G.C.; Ho, A.; Chan, E, "Measured characteristics of futuregrid clouds for scalable collaborative sensor-centric grid applications," Collaboration Technologies and Systems (CTS), 2011 International Conference on, vol., no., pp.151-160, 23-27 May 2011 32. D'Angelo, G, "Parallel and distributed simulation from many cores to the public cloud," High Performance Computing and Simulation (HPCS), 2011 International Conference on , pp.1423, 4-8 July 2011 33. Khan, I.; Rehman, H.; Anwar, Z, "Design and Deployment of a Trusted Eucalyptus Cloud," Cloud Computing (CLOUD), 2011 IEEE International Conference on, vol., no., pp.380-387, 4-9 July 2011 34. Abbadi, I.M, "Middleware Services at Cloud Virtual Layer," Computer and Information Technology (CIT), 2011 IEEE 11th International Conference on, vol., no., pp.115-120, Aug. 31 2011-Sept. 2 2011 35. 
Voulodimos, A.S.; Kyriazis, D.P.; Gogouvitis, S.V.; Doulamis, A.D.; Kosmopoulos, D.I.; Varvarigou, T.A, "QoS-oriented Service Management in clouds for large scale industrial activity recognition," Soft Computing and Pattern Recognition (SoCPaR), 2011 International Conference of , vol., no., pp.556-560, 14-16 Oct. 2011 36. Schnjakin. M, Alnemr. R and Meinel. C, "A security and high availability layer for cloud storage," in 1st International Symposium on Web Intelligent Systems and Services, 2010. 37. Rightscale. [Online]. Available: http://www.rightscale.com. [Accessed september 2011]. 38. Rightscale. [Online]. Available: http://www.rightscale.com/products/rightscale-forenterprise.php. [Accessed December 2011]. 39. Rightscale. [Online]. Available: http://blog.rightscale.com/2010/03/22/rightscaleservertemplates-explained/. [Accessed august 2011]. 40. Rightscale. [Online]. Available: http://support.rightscale.com/09-Clouds/AWS/01AWS_Basics/Amazon_Web_Services_%28AWS%29. [Accessed january 2012]. 41. Amazon. [Online]. Available: http://aws.amazon.com/what-is-aws/. [Accessed 2011 july]. 42. Amazon. [Online]. Available: http://aws.amazon.com/ec2/. [Accessed july 2011]. 43. Amazon. [Online]. Available: http://aws.amazon.com/s3/. [Accessed july 2011]. 44. Wireshark. [Online]. Available: http://dictionary.sensagent.com/wireshark/en-en/. [Accessed August 2011]. 45. Wireshark. [Online]. Available: http://www.wireshark.org/. [Accessed June 2011]. 46. Jw player. [Online]. Available: http://www.longtailvideo.com/. 47. Android market. [Online]. Available: https://market.android.com/ 48. GoGrid. [Online]. Available: http://www.gogrid.com/. [Accessed september 2011]. 49. GoGrid. [Online]. Available: http://www.gogrid.com/cloud-hosting/managing-cloudinfrastructure.php. [Accessed september 2011]. 50. Cloud. [Online]. Available: http://www.cloud.com/. [Accessed september 2011]. 51. Cloud. [Online].
Available: http://www.cloud.com/index.php?option=com_k2&view=item&layout=item&id=87&Itemid=389. [Accessed september 2011]. 52. Citrix. [Online]. Available: http://www.citrix.com/lang/English/lp/lp_2313912.asp?ntref=hp_promo_cloud_change. [Accessed september 2011]. 53. IBM SmartCloud. [Online]. Available: http://www.ibm.com/cloud-computing/us/en/. [Accessed 2011 september]. 54. IBM Smartcloud. [Online]. Available: http://www-03.ibm.com/systems/cloud/?link=ovr_fndplr. 55. IBM Smartcloud. [Online]. Available: http://www-03.ibm.com/systems/cloud/?link=ovr_fndplr. 56. Rackspace. [Online]. Available: http://www.rackspace.co.uk/cloud-hosting/cloud-products/. [Accessed August 2011]. 57. Rackspace. [Online]. Available: http://www.rackspace.co.uk/cloud-hosting/cloudproducts/managed-cloud/prices/. [Accessed August 2011]. 58. Eucalyptus. [Online]. Available: http://www.eucalyptus.com/products/eee. [Accessed September 2011]. 59. Eucalyptus. [Online]. Available: http://www.eucalyptus.com/resources/whitepapers. [Accessed September 2011]. 60. NIST [online]. Available: http://www.nist.gov/itl/cloud/upload/cloud-def-v15.pdf [Accessed June 2011]. 61. WAMP [online]. Available: http://www.wampserver.com/en/ 62. Nginx [online]. Available: http://nginx.org/. 63. B. Lee, K. Kim, T. geun Kwon, and Y. Lee, “Content classification of wap traffic in korean cellular networks," in Network Operations and Management Symposium Workshops (NOMS Wksps), 2010 IEEE/IFIP, April 2010, pp. 22 -27. 64. Matlab 7 homepage. [Online]. Available: http://www.mathworks.se/products/matlab/index.html 65. G. Singh, S. Sood and A. Sharma, “CM- Measurement Facets for Cloud Performance,” in proc. International journal of Computer Applications, vol 23, no 3, June 2011. 66. Challenger, J.R.; Dantzig, P.; Arun Iyengar; Squillante, M.S.; Li Zhang; , "Efficiently serving dynamic data at highly accessed web sites," Networking, IEEE/ACM Transactions on , vol.12, no.2, pp. 233- 246, April 2004.
APPENDIX

Wireshark: Version 1.6.5
Tshark: Version 1.6.5

Type: Android 2.2
Model: HTC Desire
Processor: 1 GHz
RAM: 576 MB
Operating system: Android 2.2

Android browsers (layout engine):
  Dolphin (WebKit): Version 7.3.0
  Firefox (Gecko): Version 9.0
  Opera (Presto): Version 11.5.3
  xScope (WebKit): Version 6.50

Bandwidth for client (Application: http://speedtest.net/):
  Laptop: ~10 Mbps
  Smart phone: ~10 Mbps

JW Player with embedded player: Version 5.8
Wi-Fi: 802.11g
Video type: MP4
Video:
  Length: 05.22
  Frame width: 480
  Frame height: 368
  Data rate: 429 kbps
  Total bit rate: 537 kbps
  Frame rate: 29 frames/second
Audio:
  Bit rate: 108 kbps
  Channels: 2 (stereo)
  Audio sample rate: 44 kHz

Apache server for cloud: Version 2.2
Apache server for local server: Wampserver 2.2A P1 (32 bits), Apache 2.2.21, PHP 5.3.8, MySQL 5.5.16, XDebug 2.1.2, XDC 1.5, PhpMyAdmin 3.4.5, SQLBuddy 1.3.3, webGrind 1.0
Nginx server for local server: Version 1.1.13

HP Laptop:
  OS: Windows 7
  RAM: 4 GB
  Processor: Intel Core i3
Toshiba Laptop:
  OS: Windows 7
  RAM: 3 GB
  Processor: Intel Core 2 Duo (32 bit)

Cloud: Amazon AWS EC2
Zone: US East
Server Template: Rightscale TLC Apache/PHP Micro
Image: Ubuntu_10.04_i386_micro[rev1]
Bit: 32 bit
Configuration: SSH, Security, Elastic IPs, S3
MATLAB: Version 7.0.12

Table 6 Specification table for experiments.
Database: IEEE
Search string: (("Abstract":"cloud infrastructure") AND ("Abstract":performance OR "Abstract":practice OR "Abstract":operation OR "Abstract":efficiency) AND ("Abstract":issue* OR "Abstract":risk* OR "Abstract":challenge* OR "Abstract":problem*))

Database: EI VILLAGE
Search string: (( (({cloud infrastructure}) WN AB) AND (English WN LA) AND (2008-2012 WN YR)) AND ( ((((((($performance) WN AB) OR (($practice) WN AB)) OR (($efficiency) WN AB)) AND (English WN LA) AND (2008-2012 WN YR)) OR ((($OPERATION) WN AB) AND (English WN LA) AND (2008-2012 WN YR)))))) AND ( ((((((($issue) WN AB) OR (($risk) WN AB)) OR (($challenge) WN AB)) AND (English WN LA) AND (2008-2012 WN YR)) OR ((($problem) WN AB) AND (English WN LA) AND (2008-2012 WN YR)))))

Database: SCIENCE DIRECT
Search string: (("Abstract":"cloud infrastructure") AND ("Abstract":performance OR "Abstract":practice OR "Abstract":operation OR "Abstract":efficiency) AND ("Abstract":issue* OR "Abstract":risk* OR "Abstract":challenge* OR "Abstract":problem*))

Table 7 SLR Search string.

[Figure: three panels plotting jitter (ms) against sequence number (x 10^4) for the Opera, Firefox and Chrome browsers.]
Fig. 17 jitter calculation for various browsers in Laptop.

[Figure: five panels plotting jitter (ms) against sequence number (x 10^4) for the Android, Dolphin, Firefox, xScope and Opera browsers.]
Fig. 18 jitter calculation for various browsers in smart phone.
Master Thesis
Electrical Engineering
Thesis no: MEE 2011: 36918
Jan 2012

Comparative Study of Virtual Machine Software Packages with Real Operating System

Arunkumar Jayaraman
Pavankumar Rayapudi

School of Computing
Blekinge Institute of Technology
371 79 Karlskrona
Sweden

This thesis is submitted to the School of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering. The thesis is equivalent to 20 weeks of full time studies.
Contact Information:
Author(s): Arunkumar Jayaraman
Address: Stenbocksvägen 8, 37237, Ronneby, Sweden
E-mail: [email protected]
Author(s): Pavankumar Rayapudi
Address: Valhallavägen, 371 41, Karlskrona, Sweden
E-mail: [email protected]

University Advisor(s): Prof. Lars Lundberg, School of Computing, Blekinge Institute of Technology
University Examiner: Dr. Patrik Arlos, School of Computing, Blekinge Institute of Technology

School of Computing
Blekinge Institute of Technology
371 79 Karlskrona
Sweden
Internet: www.bth.se/com
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57

ABSTRACT

Virtualization is one of the main research areas in the field of computing technology. The virtualization concept was introduced by IBM several decades ago. Virtualization allows computer users to utilize their resources efficiently and effectively. Making full use of network resources without acquiring new ones is a bottleneck for all organizations, and maintaining those resources is also a major challenge. Virtualization allows organizations to manage their resources effectively. An operating system that runs on top of a virtual machine or hypervisor is called a guest OS. The virtual machine is an abstraction of the real physical machine. The main intention of this thesis work was to analyze different kinds of virtualization software packages and to investigate their advantages and disadvantages. In addition, we analyzed the performance of the virtualization software packages against a real operating system in terms of web services. Web servers play an important role on the Internet. The performance and throughput of a web server are not the same on all virtual machines and real machines. In this thesis, we analyzed web server performance on a real operating system, Linux. We also examined the performance of the web server on a guest OS running on virtual machines.
The performance measurement results clearly indicate that the real machine performs best when compared with the virtual machines. The performance measures from the web services help in choosing an appropriate platform to run web services in an organization.

Keywords: Virtualization, Virtual Machine, Performance, Web Server

ACKNOWLEDGEMENTS

CONTENTS
ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
ACRONYMS
1 INTRODUCTION
1.1 AIMS AND OBJECTIVES
1.2 RESEARCH QUESTIONS
1.3 EXPERIMENTAL MODEL
2 BACKGROUND
2.1 VIRTUALIZATION
2.2 VIRTUALIZATION METHODS
2.2.1 Hosted Virtualization
2.2.2 Para Virtualization
2.2.3 Partial Virtual Machine
2.2.4 Desktop Virtualization
2.2.5 Host Virtualization Desktop
2.2.6 Client Virtualization Desktop
2.2.7 Memory Virtualization
2.2.8 Data Virtualization
2.2.9 Storage Virtualization
2.2.10 Network Virtualization
2.3 VIRTUALIZATION SOFTWARE PACKAGES
2.3.1 VMware
2.3.2 Virtual Box
2.3.3 QEMU
2.4 THE BENEFITS OF THE VIRTUALIZATION
2.4.1 Hardware Reducibility and Reusability
2.4.2 Cost Reduction
2.4.3 Disaster Recovery
2.4.4 Server Migration
2.4.5 Power Consumption
2.5 CHALLENGES IN VIRTUALIZATION
2.5.1 Security Issues
2.5.2 Physical Machine Failure
2.5.3 Input/output Request
3 RESEARCH METHODOLOGY
3.1 QUALITATIVE RESEARCH
3.1.1 Interview
3.2 QUANTITATIVE RESEARCH
3.2.1 Literature Review
3.2.2 Experimental Model
4 RESULT AND ANALYSIS
4.1 EXPERIMENTAL RESULTS
4.1.1 Web Request
4.1.2 Test Total Time
4.1.3 Minimum Connection Time
4.1.4 Maximum Connection Time
4.1.5 CPU I/O Wait
4.1.6 CPU User Utilization
4.1.7 Analysis of experiment observations
4.2 RESULTS FROM THE INTERVIEWS
4.2.1 Most Used Virtualization Software Packages
4.2.2 Virtualization Architecture
4.2.3 Advantages of Virtualization
4.2.4 Problems in virtualizations software
4.2.5 Virtualization software future
4.2.6 Performance virtualization
4.2.7 Software Packages Types
5 DISCUSSION
5.1 VALIDITY THREATS
5.1.1 Internal Validity
5.1.2 External Validity
5.1.3 Construct Validity
5.1.4 Conclusion Validity
6 CONCLUSION
7 FUTURE WORK
8 REFERENCES
APPENDIX A
APPENDIX B
LIST OF FIGURES
Figure 1 - Type 1 Hypervisor
Figure 2 - Type 2 Hypervisor
Figure 3 - Research Methodology
Figure 4 - Experiment Model
Figure 5 - Load 1 Total Test Time (ms)
Figure 6 - Load 2 Total Test Time (ms)
Figure 7 - Load 1 Maximum Connection Time
Figure 8 - Load 2 Maximum Connection Time
Figure 9 - Load 1 CPU I/O Wait Percentage
Figure 10 - Load 2 CPU I/O Wait Percentage
Figure 11 - Load 1 CPU User Utilization Percentage
Figure 12 - Load 2 CPU User Utilization Percentages

LIST OF TABLES
Table 1 - Request
Table 2 - Test Total Time
Table 3 - Minimum Connection Time
Table 4 - Maximum Connection Time
Table 5 - CPU I/O Wait Percentage
Table 6 - CPU User Utilization Percentage

ACRONYMS
I/O: Input/output
VM: Virtual Machine
OS: Operating System
AFS: Andrew File System
NFS: Network File System
SAN: Storage Area Network
SQL: Structured Query Language
CPU: Central Processing Unit
RAM: Random Access Memory
IBM: International Business Machines
IEEE: Institute of Electrical and Electronics Engineers
ACM: Association for Computing Machinery
HTTP: Hypertext Transfer Protocol
PC: Personal Computer
GB: Gigabyte
GHz: Gigahertz
USB: Universal Serial Bus
LAN: Local Area Network
WLAN: Wireless Local Area Network
GUI: Graphical User Interface

1 INTRODUCTION

This chapter introduces the thesis: its background, problem statement, motivation, purpose, aims, objectives, research questions, research methodology and risk analysis.

The virtual machine (VM) is one of the main research areas in the telecommunication industry. The most important function of a VM is to run multiple operating systems on the same computer, with each operating system functioning separately, without interfering with the other operating systems on the host. The VM executes the instruction set architecture, yet it is distinct from the real physical host machine. The VM functions like a real operating system and provides high-level functions to the end user [1].

Devi Prasad has conducted a study to analyze the performance of sequential programs in virtual machines. This study clearly shows the behavior of two different virtual machines while executing sequential programs. VMware and QEMU virtual machines were installed on the host OS, and the host OS and the guest OS were connected through a bridged network interface [2].
The performance of storage systems on real and virtual machines has been analyzed with three different storage methods: direct-attached disk, SAN and a RAID array. The virtual machine performs well when compared with real machine storage systems, and its I/O interface throughput is the same as that of the real server [3].

Roxana Geambasu has conducted a study of VM performance over network file systems. She reported that remote access to a VM is conventionally accomplished by using a network file system to access the VM image. The Andrew File System (AFS) and the Network File System (NFS) both handle VM images well. When network and power management conditions are poor, AFS uses a large block size, which suits networks with high latency. She suggested using either NFS or a tuned AFS-based scheme, depending on the network conditions [4].

Workload plays a vital role in the design of computer architecture; performance evaluation is usually carried out with some sort of workload. Evaluating a workload on a specific computer model can provide valid results about the computer architecture [5].

The performance of reading a file from disk differs between a real machine disk and a virtual machine disk. Reading of large files has been analyzed on real machines and VMs. Large files are particularly important for scientific and multimedia development. The results indicate that performance changes with access mode, data size and request size [6].

Guest OS and real OS performance have been evaluated with Windows XP, Windows Vista and Windows 7 as the real (host) operating systems and Windows Vista as the guest OS. Across the three host operating systems, the guest OS performs best with Windows 7 as the host OS [7].

Server performance tests have been conducted with real machines and virtual machines.
This was done by installing multiple servers on virtual machines hosted by a single host machine. The SQL server functions well on virtual machines in place of a real physical machine, and the virtual machine performs well with SQL Server without any special tuning [8].

1.1 Aims and objectives

The main aim of this thesis work is to study different virtualization software packages. Our focus is on the performance of web services in virtualization software packages compared with a real operating system.
1. Investigate and identify virtualization software packages.
2. Analyze the advantages and disadvantages of the virtualization software.
3. Analyze the web service performance of virtualization packages against a real operating system.

1.2 Research Questions

The following are the thesis research questions.
1. What are the most-used virtualization software packages?
2. What are the advantages and disadvantages of virtualization software packages?
3. What is the web service performance, in terms of throughput, of these software packages compared with the real operating system?

1.3 Experimental Model

This section briefly describes our experimental setup. We chose a computer with the following configuration: Intel Core 2 Quad Q6600 processor, 4 GB RAM, 2.4 GHz and dual core CPU. We installed and configured Linux as the real operating system. We chose a remote web server and used request status, connection time, response time, CPU I/O wait percentage and CPU user utilization percentage as measures. Next we installed and configured the virtualization software packages, which were chosen based on the literature review and the interview results. We installed and configured the Linux operating system on these virtualization software packages. Later we analyzed the performance of the web services in the guest OS.
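The measures named in this section can be derived from raw samples along the following lines. This is a minimal Python sketch, not the tooling used in the thesis. It assumes that per-request connection times are already available in milliseconds, that requests are issued one after another (so the total test time is the sum of the samples), and that the CPU figures come from two readings of the aggregate `cpu` line of `/proc/stat` on Linux. All function and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class CpuSnapshot:
    """Cumulative CPU time counters from the aggregate 'cpu' line of /proc/stat."""
    user: int
    nice: int
    system: int
    idle: int
    iowait: int
    irq: int = 0
    softirq: int = 0
    steal: int = 0

    @property
    def total(self) -> int:
        return (self.user + self.nice + self.system + self.idle
                + self.iowait + self.irq + self.softirq + self.steal)


def parse_cpu_line(line: str) -> CpuSnapshot:
    """Parse a line such as 'cpu 100 0 50 800 50 0 0 0 0 0'."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Keep the first eight counters; guest time is already included in user time.
    return CpuSnapshot(*[int(v) for v in fields[1:9]])


def cpu_percentages(before: CpuSnapshot, after: CpuSnapshot) -> dict:
    """CPU user and I/O wait percentages over the interval between two snapshots."""
    delta_total = after.total - before.total
    if delta_total <= 0:
        raise ValueError("snapshots must be taken in chronological order")
    return {
        "user": 100.0 * (after.user - before.user) / delta_total,
        "iowait": 100.0 * (after.iowait - before.iowait) / delta_total,
    }


def summarize_connections(times_ms: list) -> dict:
    """Request count and minimum, maximum and total connection time."""
    if not times_ms:
        raise ValueError("at least one sample is required")
    return {
        "requests": len(times_ms),
        "min_ms": min(times_ms),
        "max_ms": max(times_ms),
        # Assumption: sequential requests, so the total is the plain sum.
        "total_ms": sum(times_ms),
    }
```

Taking one snapshot before a load run and one after it gives the percentages for that run; the same functions can be applied unchanged on the real machine and inside each guest OS, which keeps the comparison like for like.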
2 BACKGROUND

2.1 Virtualization

Virtualization technology is one of the main research areas in the field of computing. It was introduced and developed by IBM Corporation, which created several virtual machines on one physical mainframe. By using virtualization technology, organizations can provide different services without increasing their network resources [9]. Virtualization provides the same services as a real machine. Computer hardware can run only one operating system at a time, and because of this vast amounts of resources are underutilized. With virtualization technology, the same computer hardware can run different types of operating system on one physical machine. The virtual machines run on top of the host machine and share its resources. Each virtual machine functions without interfering with the other virtual machines on the same host [10].

A virtualization software package produces a software representation of the hardware. The represented hardware contains static memory, dynamic memory and other resources, like a real hardware machine. An OS can be installed on this represented hardware. It works like a physical machine, and the OS functions like a real OS.

Take the case where one CPU can run only one operating system at a time. Running a second OS on the same computer at the same time is impossible on that machine, even though the real operating system does not fully utilize the CPU. In this scenario virtualization provides a better solution through the represented hardware supplied by the virtualization software: different OSs can run on the single CPU without interrupting one another [11]. With virtualization technology the operating system runs on virtual hardware. The virtual hardware is emulated from x86 processors.
It contains all the resources that real computing hardware has, so it is possible to run on this hardware an operating system that provides services like a real computer.

The virtual machine is controlled by the hypervisor, also called the virtual machine manager. The hypervisor manages the guest OSs and their storage, memory and other resources. It is the backbone of virtualization technology. There are two types of hypervisor: type 1 and type 2 [12].

Type 1 Hypervisor: In this method, the hypervisor runs directly on the hardware, between the guest OS and the real machine, so the virtual machines are closely coupled with the physical machine. This performs better than a type 2 hypervisor.

Type 2 Hypervisor: In this method, the hypervisor runs on top of a host operating system, and each VM runs like an application on the host machine. We can run different types of OS regardless of the real operating system on the host machine. The performance of a guest operating system is lower than that of the real host operating system, but this approach provides portability.

[Figure: layers from top to bottom: OS, OS, OS; Hypervisor; Hardware]
Figure 1- Type 1 Hypervisor

[Figure: layers from top to bottom: OS, OS, OS; Hypervisor; Operating System; Hardware]
Figure 2- Type 2 Hypervisor

2.2 Virtualization methods

1. Hosted Virtualization
2. Para Virtualization
3. Partial Virtualization
4. Desktop Virtualization
5. Software Virtualization
6. Memory Virtualization
7. Data Virtualization
8. Storage Virtualization
9. Network Virtualization

2.2.1 Hosted Virtualization

In hosted (full) virtualization, the virtual machine is a complete abstraction of the real physical machine. It has all the features of a real physical machine, including memory, operations and storage. Software that runs on the real machine must also run on this virtual machine. Full virtualization is achieved by emulating an abstraction of the underlying real host configuration. The operating system running on the virtual machine is called the guest OS. The operating system running on the real machine is called the host OS.
VMware, Virtual Box and Microsoft Virtual Server are a few examples of full virtualization [13].

2.2.2 Para Virtualization

In para virtualization the guest operating system does not run on a fully virtualized machine. Instead, the guest has a direct interface to the hypervisor or virtual machine manager. This technique allows the guest operating system to be enhanced to improve its functionality, but it has the drawback that it does not provide the compatibility of full virtualization: para virtualization does not work unless the guest OS has been modified to be compatible. It is achieved through a paravirtualization interface provided by the hypervisor [14].

2.2.3 Partial Virtual Machine

A partial virtual machine is a virtual machine that implements certain kinds of environment without providing full virtualization of the hardware. It depends on the application environment; for example, applications that share an address space use a partial virtual machine. The partial virtual machine technique existed before full virtualization and led to the development of full virtualization technology.

2.2.4 Desktop Virtualization

Desktop virtualization provides access to a desktop environment through a remote client, so the client can access network resources from anywhere. The virtual desktop environment does not require a compatible system or any particular hardware resources on the client side; it requires only a network connection. Through this connection the user can use a personalized desktop from a remote location [15].

The main types of desktop virtualization:
1. Host virtualization desktop
2. Client virtualization desktop

2.2.5 Host Virtualization Desktop

The client host connects to a virtual machine on the server, using either a personalized desktop or a randomly assigned desktop.
The client connects to the data center through a remote connection and logs in to the hosted virtual machine. Because the desktop applications and services run on the server, this method provides greater security for data. These virtual machines are also easier to manage than traditional desktops [16].

2.2.6 Client Virtualization Desktop

In the client virtualization model, the operating system runs on a portable device, so it can be carried around and run on a single host. This method is similar to full virtualization. It provides greater security for data and makes the network easier to manage. Protecting data and confidential information from hackers is important to any organization; this method provides greater reliability for data and makes it easy to monitor each client's activity [17].

Disadvantages of desktop virtualization:
1. Providing a high-quality graphical interface to the client is a major problem with this technology.
2. It requires dedicated bandwidth for services to the client.
3. If there is a problem with the network connection or bandwidth, the service is difficult to manage.

These issues can be mitigated by providing a high-bandwidth, reliable network connection [18].

2.2.7 Memory Virtualization

Memory virtualization is the use of virtualized memory to run virtual applications. Running multiple VMs on one system requires each VM to share and map its memory without overlapping with the others. Application performance depends on memory performance: an application that can access a large amount of memory will perform better [19].

2.2.8 Data Virtualization

Data virtualization brings together different data stores from different places.
It provides a logical structure, like a front-end application, through which data from different data sources can be accessed from a single place. With this method the database is portable and easy to manage, and the user can access data without interruption [20].

2.2.9 Storage Virtualization

Storage virtualization is an abstraction of pooled storage, presented as a single storage unit on the network. Virtualized storage appears to a server as a single storage device managed from a central point. The technique is easy to manage: the amount of storage can be increased without changing the network configuration, and storage can be allocated dynamically [21].

2.2.10 Network Virtualization

Network virtualization combines the entire network into one unit and allocates its bandwidth, channels and other resources based on workload. Every device in the network has resources allocated to it, so the overall network is easier to manage and the reliability of computing increases. This technology provides scalability to each group in the network and also increases security and the reliability of the resources available to all participating devices [11].

2.3 Virtualization software packages

VMware
Virtual Box
QEMU

2.3.1 VMware

VMware is a corporation that delivers the VMware products; VM stands for Virtual Machine. The company was founded over a decade ago and develops virtualization software packages for the x86 architecture. VMware built this technology by combining binary translation with direct execution on the processor, which provides the basis for its virtual machine software. With these packages, multiple guest operating systems can run simultaneously on the same physical host [22].
VMware delivers several software packages to the IT market, listed below:

1. VMware View
2. VMware ThinApp
3. VMware Workstation
4. VMware vSphere
5. VMware vCenter Server
6. VMware Studio
7. VMware vFabric Product Family
8. VMware vCenter Operations Management Suite
9. VMware Go

2.3.2 Virtual Box

Virtual Box, distributed as Oracle VM VirtualBox, was previously owned by Sun Microsystems; its development is now handled by Oracle Corporation. Virtual Box is an open-source virtualization package for the x86 architecture. It is installed on a host operating system and runs guest operating systems on top of it. It supports a wide range of platforms, including Windows (for example Windows 7 and Windows XP), Solaris, Linux and Mac OS. Virtual Box supports both software and hardware virtualization. At the time of writing, the current version of Virtual Box is 4.1.6 [23].

2.3.3 QEMU

QEMU is an open-source virtualization software package written by Fabrice Bellard. It is based on dynamic binary translation, which allows it to run guest operating systems on host machines. QEMU operates in two modes:

1. User emulation mode
2. System emulation mode

In user emulation mode, a binary built for one CPU is executed on another CPU under the same operating system; all system calls are executed directly on the host machine. System emulation mode emulates a whole machine, including its peripherals. A system emulated this way behaves like a real host machine and provides good performance. QEMU works on different architectures, including x86, x86_64, PowerPC BookE and PowerPC Book3s (KVM) and s390x. Because of this, QEMU can be used on different machines with different operating systems, depending on the end user's needs.
QEMU emulates all the peripherals required to run an operating system, and its virtual machine images are stored on the host storage device [24].

2.4 The Benefits of the virtualization

2.4.1 Hardware Reducibility and Reusability

Virtualization allows an organization to use its existing hardware without adding hardware resources. Without virtualization, an organization that needs to support more users has to upgrade its software and hardware. Virtualization uses the existing hardware resources more effectively, without waste, which reduces the need to buy new hardware. A network built on virtual infrastructure also reduces the physical space occupied by machines [9].

Allocating hardware to each server or application is time consuming and costly for companies. Maintaining hardware and allocating resources are among the main responsibilities of network administrators. Virtualization addresses these problems by using hardware resources to the maximum extent while delivering, through the virtual machine, performance close to that of a real machine [25].

2.4.2 Cost Reduction

Expanding an existing IT infrastructure is expensive for organizations: it requires adding servers and staff, which means investing more money. Virtualization generally reduces infrastructure investment costs, and it also provides a way to use a variety of operating systems, service applications, storage methods and servers [26]. Assigning an individual server to every application increases an organization's costs. Virtualized network infrastructure can separate servers and applications from the hardware.
It uses the servers as a pool for all services while keeping each service separate [27].

2.4.3 Disaster Recovery

Traditional disaster recovery needs proprietary hardware resources, skilled operators, complex configurations and a complex testing process. These require a high investment, which limits the ability of organizations to implement disaster recovery. Virtualization is therefore a good approach to disaster recovery for companies. The failure of a main server, or a crash on a remote server, is a big challenge for IT organizations, and recovering information quickly is one of the main goals in a disaster situation. Virtualization allows data to be recovered from the servers with less downtime and with minimal or no loss of information. After a disaster, restoring services depends mainly on the quality of the backups or virtual server images. In this scenario virtualization copies the existing images from the servers quickly, with less downtime [28].

Double-Take protection software offers protection, migration and recovery options for IT infrastructure. It provides full failover and data replication for business servers such as SQL and Microsoft Exchange [29]. The Double-Take technique strengthens existing recovery methods: it improves data security and reduces data loss, with little impact on downtime. To eliminate downtime on failover, the administrator can access real-time copies of the data of protected applications such as databases and email. The Double-Take service constantly monitors the primary data. If a failure occurs, it automatically switches to the secondary real-time backup servers, so that services are not affected and the end-user service is not interrupted. Double-Take protection thus provides full protection for the server in real time [30][29].
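The monitor-and-switch behaviour described above can be illustrated with a small sketch. It is a generic heartbeat-based failover loop written under stated assumptions, not Double-Take's actual implementation or API: the class name, the health-check callable and the failure threshold are all hypothetical.

```python
from typing import Callable


class FailoverMonitor:
    """Hypothetical monitor: serves from the primary until it misses
    `max_failures` consecutive health checks, then switches to the secondary."""

    def __init__(self, check_primary: Callable[[], bool], max_failures: int = 3):
        self.check_primary = check_primary
        self.max_failures = max_failures
        self.consecutive_failures = 0
        self.active = "primary"

    def poll(self) -> str:
        """Run one health check and return the server that should serve traffic."""
        if self.active == "secondary":
            # Stay failed over; failing back is left to the operator.
            return self.active
        if self.check_primary():
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.max_failures:
                self.active = "secondary"
        return self.active


if __name__ == "__main__":
    # Simulated health checks: the primary answers twice, then stops responding.
    results = iter([True, True, False, False, False, True])
    monitor = FailoverMonitor(lambda: next(results), max_failures=3)
    print([monitor.poll() for _ in range(6)])
```

Requiring several consecutive missed checks before switching is a common design choice that avoids failing over on a single dropped heartbeat; the threshold of three used here is arbitrary.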
2.4.4 Server Migration

Server virtualization allows quick portability. Server migration is common practice in companies. Physically moving a server from one place to another is still practiced, but it takes a lot of time and increases server downtime. Virtualization allows a server to be migrated from a remote site, offering a simple solution with less downtime. When server load increases during processing, the data center manager can move the running virtual machine to another hypervisor in order to increase the server's processing capability [31].

2.4.5 Power Consumption

This is one of the main advantages of virtualization technology: operating a virtual network infrastructure consumes less power, which reduces the company's electricity bills. Cooling data centers full of traditional physical machines also consumes a lot of power, and virtualization reduces the cost of cooling [32].

2.5 Challenges in virtualization

2.5.1 Security Issues

Virtualization is widely used by many organizations. Running multiple virtual machines on the same physical hardware enables service isolation, so that services on the same machine do not interfere with one another, but the physical machine must itself be secured against threats. If the security of the physical machine is breached, all virtual machines hosted on it are at high risk. The security of the virtual machines also depends on the network infrastructure: the VM network mode should be secure in order to keep data safe from intruders [33]. Virtualization presents a larger number of target nodes to malicious software and intruders, so information security providers need to understand the network more deeply in order to design security policies inside it.
2.5.2 Physical Machine Failure

A virtualized network consists of several virtual LANs on a single physical machine. If a virtual machine in a LAN fails, it does not affect the other virtual machines in the same LAN. If the physical machine fails, however, all the virtual machines hosted on it are affected, which leads to a massive shutdown of all the services offered from that physical machine [34].

2.5.3 Input/output Request

I/O requests are managed through storage shared from the physical machine. Every I/O command has to go through the host machine and uses host resources when it executes. The virtualization-layer device communicates via the host device layer, so when a virtual machine performs a read or write operation, the command is executed through the host device. In a server environment the I/O access speed should therefore be good in order to achieve good performance [35][34].

3 RESEARCH METHODOLOGY

Methodology is defined as the way a researcher takes a theoretical research question and implements it in practice, in order to evaluate it and obtain results based on the hypothesis or expected outcome of the research. It describes the way in which the researcher drives the research in order to achieve its goals.

Figure 3 - Research Methodology (Research Question 1: interview/literature review; Research Question 2: interview/literature review; Research Question 3: experiment; followed by the conclusion)

A conceptual model is defined as the theoretical information collected on the research topic and the way this information is transformed to construct a design [36]. Example: a world map is a collection of geographical information about continents, countries, land area, etc.

There are two main research approaches:
1. Qualitative research
2. Quantitative research
3.1 Qualitative Research

Qualitative research is based on understanding and analyzing a process; there is no numerical model in this research method. Quantitative research, by contrast, collects theoretical information, builds a model based on the theoretical study, and bases its analysis on the results of the design [37]. Both approaches play an important role in our thesis. For the qualitative part, we will conduct interviews with industrial experts about the expectations and experiences of virtualization software users. For the quantitative part, we will discuss our experimental setup with experts and make changes if required.

3.1.1 Interview

Interviewing is a qualitative research process. It helps us gather information about virtualization software and its usage in organizational environments; this information does not come from the quantitative approach. There are three main types of interview:

1. Structured interview
2. Semi-structured interview
3. Unstructured interview

In a structured interview, the same set of questions is asked of all interviewees, and the answers can be rated as good, bad or average. A semi-structured interview is not limited to a fixed set of questions: the interviewer can ask different types of questions depending on the interviewee's role and experience, and the new questions raised by the interviewer give a clearer understanding of current trends in virtualization software and of its advantages and disadvantages with respect to the research questions. In an unstructured interview, the interviewer can ask new questions during the interview and follow up on questions from previous interviews. No rating is required for unstructured interviews [38].

3.2 Quantitative Research

3.2.1 Literature Review

Literature review is a key entry point to research work.
It helps to find current research and related work on the technology. We chose a literature review to find the current trends in virtualization software, to identify the virtualization software most used in organizations, and to investigate its advantages and disadvantages. We found resources in databases such as IEEE (Institute of Electrical and Electronics Engineers) and ACM (Association for Computing Machinery), in journals and articles, and through Google Scholar. We studied the concepts behind virtualization software packages and their recent development. This information was collected through our university library database and through internet search engines. Based on this technical information, we designed our experimental model in order to answer our research questions. Later, we discussed the experimental setup with industrial experts as part of the interviews. The model was then evaluated and the results observed. We conducted interviews with industrial experts and discussed real-world situations related to our research questions. We collected the data from the interviews in order to validate our research.

3.2.2 Experimental Model

Figure 4 - Experiment Model (the work load is applied to the host system and to the QEMU, VMware and VirtualBox virtual machines; the output is then analyzed)

The results of the interviews and literature study identified the software most familiar to organizations. We conducted an experiment to analyze the performance of web services on virtual machines and on a real host machine. We installed the Fedora 16 operating system on all the virtual machines and on the real machine. The performance parameters used to analyze web service load are total test time, connection time, number of requests, CPU I/O wait percentage and CPU user utilization percentage. The Apache ab tool is used to measure connection time, completed requests, minimum connection time and maximum connection time.
We chose this tool to analyze these parameters [39]. Load one consists of 1000 requests and load two consists of 2000 requests.

To find the CPU I/O wait percentage and the CPU user utilization percentage, we used the Sysstat monitoring package, which reports both values for the machine. We ran this tool simultaneously with the ab tool.

CPU user utilization denotes the percentage of CPU time used at the user level. CPU usage is not constant over time; it varies with the load on the system. Nor is it the same for all systems: it depends on the physical configuration and the software architecture [40]. When a system is busy with a certain load, it cannot process another load until the CPU has completed the work for the current one. CPU utilization therefore has a big impact on the performance of application services.

CPU I/O wait is one of the important factors that influence the performance of a machine, and it depends solely on the system. A high CPU I/O wait can degrade the performance of the machine. Virtual machines access CPU services through the host machine, and the CPU I/O wait percentage varies from one virtual machine to another [41].

Completed requests is a parameter we use to check whether the web requests were processed successfully. Every HTTP request is initiated from the client side, and this parameter indicates the status of the requests [39].

Connection time gives details about the time taken to establish connections. Each request is sent from the client machine, and the connection time varies from machine to machine.
This parameter is helpful when we analyze the performance of the web services on the system.

4 RESULT AND ANALYSIS

4.1 Experimental Results

4.1.1 Web Request

System         Completed Requests           Completed Requests
               (No. of Requests - 1000)     (No. of Requests - 2000)
Real Host      1000                         2000
QEMU           1000                         2000
VMware         1000                         2000
Virtual Box    1000                         2000

Table 1 - Request

Table 1 represents the load one and load two scenarios: the number of web requests sent from the client system and the number completed. The two scenarios were run on the real host machine and on the virtual machines, and their responses were recorded. The real host and all the virtual machines (QEMU, VMware and Virtual Box) completed all their requests successfully, so by this measure the host machine and the virtual machines perform the same under both loads.

4.1.2 Test Total Time

System         Load 1 Total Time (Sec)    Load 2 Total Time (Sec)
Real Host      160                        337
QEMU           173                        360
VMware         166                        355
Virtual Box    188                        370

Table 2 - Test Total Time

Table 2 shows the total time taken for the two loads. Each machine has a different total time. In the load one scenario the real machine's total time is lower than that of the virtual machines, so the real machine is a step ahead of them. Among the virtual machines, VMware has a lower total time than Virtual Box and QEMU. The load two scenario shows the same ordering among the virtual machines and the real machine. The values are higher than in load one because the number of requests is doubled. Again, the real machine's total time is lower than that of the virtual machines, and VMware performs better than Virtual Box and QEMU.
On total test time, the real machine performs best among all machines under the same loads, and VMware performs best among the virtual machines.

Figure 5 - Load 1 Total Test Time (ms)

Figure 6 - Load 2 Total Test Time (ms)

4.1.3 Minimum Connection Time

System         Minimum Connection Time (ms)
               Load 1    Load 2
Real Host      0         0
QEMU           0         0
VMware         1         1
Virtual Box    0         0

Table 3 - Minimum Connection Time

Table 3 shows the minimum connection time for HTTP web requests in load one and load two. The minimum connection time of the real host, QEMU and Virtual Box is zero in both scenarios, whereas that of VMware is 1 ms for both loads.

4.1.4 Maximum Connection Time

System         Maximum Connection Time (ms)
               Load 1    Load 2
Real Host      1         1
QEMU           33        30
VMware         1         3
Virtual Box    24        32

Table 4 - Maximum Connection Time

Table 4 shows the maximum web request connection time for the load one and load two scenarios. In load one, the maximum connection time varies from machine to machine: it is 1 ms for the real host and VMware, highest for QEMU (33 ms), and in between for Virtual Box (24 ms). In load two, the real host's maximum connection time is again 1 ms. Virtual Box has the highest value of all machines (32 ms), QEMU is slightly lower (30 ms), and VMware has the lowest value among the virtual machines (3 ms). In both scenarios the real host performs best of all machines, and VMware performs best among the virtual machines.

Figure 7 - Load 1 Maximum Connection Time

Figure 8 - Load 2 Maximum Connection Time

4.1.5 CPU I/O Wait

System         Load 1 CPU I/O Wait %    Load 2 CPU I/O Wait %
Real Host      0.06                     0.18
QEMU           1.71                     2.89
VMware         0.98                     2.17
Virtual Box    2.33                     3.51

Table 5 - CPU I/O Wait Percentage

Table 5 shows the CPU I/O wait percentage of each machine for load one and load two.
In the load one scenario the real host has a lower CPU I/O wait percentage than the virtual machines. Among the virtual machines, VMware has the lowest CPU I/O wait percentage and Virtual Box the highest. In the load two scenario the real host again performs best; VMware's CPU I/O wait percentage is lower than that of Virtual Box and QEMU, and Virtual Box's is the highest among the virtual machines. The real host therefore has the best CPU I/O wait percentage in both scenarios, and VMware the best among the virtual machines.

Figure 9 - Load 1 CPU I/O Wait Percentage

Figure 10 - Load 2 CPU I/O Wait Percentage

4.1.6 CPU User Utilization

System         Load 1 CPU User Utilization %    Load 2 CPU User Utilization %
Real Host      0.19                             0.42
QEMU           2.16                             3.14
VMware         1.79                             2.9
Virtual Box    3.62                             5.3

Table 6 - CPU User Utilization Percentage

Table 6 shows the CPU user utilization percentage for the load one and load two scenarios. In load one the real host has the lowest CPU user utilization of all machines. VMware's percentage is lower than that of Virtual Box and QEMU, and Virtual Box's is the highest of all machines. Load two shows the same pattern: the real host has the lowest CPU user utilization, and VMware's is lower than that of Virtual Box and QEMU. Under both loads the real host provides better performance than all other machines, and VMware provides the best performance among the virtual machines.

Figure 11 - Load 1 CPU User Utilization Percentage

Figure 12 - Load 2 CPU User Utilization Percentage

4.1.7 Analysis of experiment observations

For the analysis of the experimental results, we chose the following parameters: connection time, total time, number of requests sent, completed requests, CPU I/O wait percentage and CPU user utilization percentage. These parameters were observed from the performance of each system over the consecutive load tests.
The results of the two load tests clearly indicate that the real machine outperforms the virtual machines. Although virtual machines do not match the real machine, they allow system resources to be used more effectively. For total test time under both loads, the real host machine performs best. The total time of the virtual machines is higher because virtual machine resources depend directly on the real machine: every operation has to be executed via the real host's resources, so the parameter values are not the same as on a real machine. Among the virtual machines, VMware performs well and took less time than Virtual Box and QEMU. The total time for load two is not exactly double that for load one: according to Table 2 it is between about 1.97 and 2.14 times the load one value, depending on the machine. This difference is caused by the system's network speed, CPU utilization, I/O wait time and a few other factors.

The connection time parameters also show that the real machine performs very well compared with the virtual machines. VMware has a lower maximum connection time than QEMU and Virtual Box, but a higher minimum connection time. The number of completed web requests is the same for the real host machine and the virtual machines.

CPU I/O wait is one of the factors that strongly affect the performance of a system. The real host machine's CPU I/O wait is far lower than that of the virtual machines, and VMware's is lower than that of Virtual Box and QEMU.

The CPU user utilization percentage also clearly indicates that the real machine's usage is lower than that of the virtual machines. Virtual Box has the highest user utilization percentage among the virtual machines, and VMware has a lower one than Virtual Box and QEMU.
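The total-time comparison in this analysis can be checked directly against the measurements in Table 2. The following is a minimal sketch and not part of the original experiment: the dictionary and function names are illustrative assumptions, and only the numbers are taken from Table 2. It computes each machine's overhead relative to the real host and the ratio of load two to load one total time.

```python
# Total test time in seconds, copied from Table 2
# (load 1 = 1000 requests, load 2 = 2000 requests).
# All names below are illustrative and do not come from the thesis.
TOTAL_TIME_SEC = {
    "Real Host": (160, 337),
    "QEMU": (173, 360),
    "VMware": (166, 355),
    "Virtual Box": (188, 370),
}


def overhead_percent(machine: str, load_index: int) -> float:
    """Extra total time of `machine` relative to the real host, in percent."""
    baseline = TOTAL_TIME_SEC["Real Host"][load_index]
    return 100.0 * (TOTAL_TIME_SEC[machine][load_index] - baseline) / baseline


def load_ratio(machine: str) -> float:
    """Load 2 total time divided by load 1 total time."""
    load1, load2 = TOTAL_TIME_SEC[machine]
    return load2 / load1


if __name__ == "__main__":
    for name in TOTAL_TIME_SEC:
        print(
            f"{name:12s} overhead L1 {overhead_percent(name, 0):5.1f}% "
            f"L2 {overhead_percent(name, 1):5.1f}% "
            f"load2/load1 {load_ratio(name):.2f}"
        )
```

Expressing the virtual machines' total time as a percentage over the real host makes the ordering reported above (VMware, then QEMU, then Virtual Box) easier to compare across the two loads than the raw seconds.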
4.2 Results from the interviews

4.2.1 Most Used Virtualization Software Packages

Most of the interviewees named VMware, a virtualization software package delivered by VMware Inc. VMware offers different products, and organizations choose among them mainly based on their requirements.

4.2.2 Virtualization Architecture

The architectures most used by organizations are hosted virtualization and bare-metal virtualization. Organizations choose the virtualization method depending on their needs and services.

4.2.3 Advantages of Virtualization

Virtualization software is easy to manage compared with real networking resources. Virtualization also reduces the cost of IT infrastructure management and allows physical resources to be used more effectively and efficiently. Virtual machines are easy to migrate from one place to another, so recovery is easier in a disaster situation if the organization uses virtualization software. Virtualization software is easy to upgrade from an existing version, and virtual software packages provide ownership of the products at lower cost. One physical machine can run multiple guest operating systems on one processor. The physical memory can be divided into several parts and used for other applications. If one guest OS causes a problem, it does not affect the other guest OSs on the same processor, and it can be discarded without interrupting them.

Virtualization is also used to test software before it is introduced into real-world scenarios; such tests make it possible to analyze the performance of the software and improve its reliability. Without virtualization, servers need to be allocated to each service separately.
Virtualization, however, allows servers to be built on a single physical machine, so organizations can use several servers at low cost. It allows networks to be designed at low cost while providing uninterrupted services to the organization. Server maintenance becomes more flexible with virtualization technology: a server can be serviced without interrupting the other servers in the same pool.

4.2.4 Problems in virtualization software

Virtualization software is not compatible with all platforms. A failure of the physical host can interrupt all the guest machines installed on it, which is one of the major threats when using virtualization technology. The performance of a virtual machine is not the same as that of a real host machine. The structure of a virtual machine is sometimes difficult to understand. It requires a dedicated bandwidth connection to communicate over networks. Some of the interviewees mentioned security threats to virtualization packages; virtual machines need to be monitored very carefully for intruders.

4.2.5 Virtualization software future

This is one of the fastest developing technologies in the field of computing. It supports various platforms, which points to a bright future for this field.

4.2.6 Performance virtualization

Virtualization software provides good support and service for web services. Its performance depends mainly on the services being run, and it comes close to real machine performance.

4.2.7 Software Packages Types

Most of the interviewees replied that they prefer commercial virtualization software packages. Commercial packages provide many features, and the vendors support their products within the required time.

5 DISCUSSION

5.1 Validity Threats

Threats are common in all projects. There are several threats affecting the research findings.
Here we mention a few threats we faced while conducting our thesis work:

1. Internal validity
2. External validity
3. Construct validity
4. Conclusion validity

5.1.1 Internal Validity

Internal validity concerns the intervention of the researchers while conducting the research study and experiments. Our interview questions were mostly open-ended; we analyzed and noted down the main points from each interviewee. After the interviews, we went through the points from all of them and discussed them with the experts again to confirm that the information was valid. All the interviews relate to our core research area, and we also conducted the literature review and the experiment. We analyzed all the factors before finally drawing conclusions. These steps reduce the internal threats effectively [42].

5.1.2 External Validity

External validity concerns whether the research approach and findings, usually from empirical and experimental data, can be generalized to other research. We chose the interview participants based on the core area of our thesis work and conducted our interviews at organizations relevant to it. Based on their views, we modified our implementation. This reduces the level of external threats and makes the results more general [43].

5.1.3 Construct Validity

Construct validity is the relationship between the theory and the observed results.

Data Triangulation: In our study, we collected data from the literature review, the interviews and the experiment results. It is quite difficult to analyze results from these three sources together. To reduce this threat, we analyzed the results from all three and drew our conclusions very carefully.
Primary Studies: The main motive of the literature review was to learn more about virtualization software. We then shortlisted virtualization software packages by type and used the related studies in this thesis work [44].

5.1.4 Conclusion Validity
Conclusion validity concerns whether the research results are reliable and reasonable. Before conducting the interviews, we discussed the interview questions with our supervisor and, after receiving his feedback, made the required changes. We also discussed the interview pattern and the experiment model with him. These factors reduce the conclusion threat as much as possible [44].

6 CONCLUSION
RQ1. What are the most-used virtualization software packages?
We answered RQ1 through the literature review and interviews. From the literature review, we found the available virtualization methods and their software packages. In the interviews, we approached professionals currently working with virtualization technology. They shared their views and experiences on the most-used virtualization software in the industry and the reasons for choosing it in their organizations. From the interviews and literature review, we found that the most-used virtualization software packages are VMware, Virtual Box and QEMU.

RQ2. What are the advantages and disadvantages with virtualization software packages?
We answered RQ2 based on the literature review and interviews. We used the literature review to find the major advantages and disadvantages of virtualization software packages, and we discussed the packages and their advantages and disadvantages with professionals. Virtualization software is easy to manage and reduces the cost of IT infrastructure management. It allows physical resources to be used more effectively and efficiently. Virtual machines are easy to migrate from one place to another.
Disaster recovery is easier if the organization uses virtualization software. Machines are easy to upgrade. It provides ownership of products at lower cost. One physical machine can run multiple guest OSes on its processor. Physical memory can easily be divided into several parts and made available to other applications. If one guest OS develops a problem, the other guest OSes on the same host are not affected; the faulty guest can be discarded without interrupting the others. Virtualization is also used to test software performance before introducing it to the real world. The organization can run several servers at low cost, and the servers can be maintained without interrupting current services. On the other hand, virtualization software is not compatible with all platforms. A host machine failure affects all its virtual machines. The performance of a virtual machine is not the same as that of a real host machine. It requires dedicated bandwidth.

RQ3. What is the web service performance in terms of throughput for these software packages along with the real operating systems?
We analyzed the following parameters of the web service application: number of requests; total, minimum and maximum connection time; CPU I/O wait percentage; and CPU user utilization percentage. Every request is initiated from the client machine to the remote machine. The number-of-requests parameter shows that all the machines completed their requests successfully for load one and load two; in this respect the real machine and the virtual machines perform the same. The total connection time results show that the real machine took less total time to complete load one and load two. The virtual machines' total time is higher than the real machine's, but VMware's total time is lower than that of the other virtual machines, QEMU and Virtual Box. The real machine's connection time is also lower than that of the virtual machines.
Among the virtual machines, VMware's connection time is better than that of QEMU and Virtual Box. The CPU I/O wait percentage for the two loads indicates that the real machine spent less time in CPU I/O wait than the virtual machines, and VMware consumed the lowest percentage among the virtual machines. CPU user utilization for load one and load two is lower for the real machine than for the virtual machines, and VMware's CPU user utilization is lower than that of Virtual Box and QEMU. The analysis clearly shows that the web service performance of the virtualization software packages does not match that of the real host. Among the virtual machines, however, VMware performs better than Virtual Box and QEMU. VMware does not reach the real host's performance, but it performs better than the other virtual machines.

7 FUTURE WORK
This thesis identified the most-used virtualization software packages and their advantages and disadvantages. The web service performance of these packages was evaluated by accessing the remote servers from virtual machines, and we recorded the parameters obtained from the experiments. The thesis provides a detailed view of the virtualization software packages currently used in industry. To continue this work, network tests can be performed on these software packages and their performance analyzed while varying the bandwidth capacity of the network. Another continuation is to implement a paravirtualization platform for the virtualization software packages and analyze web services on it.
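Follow-up tests of this kind would report the same parameters that were analyzed for RQ3. As a rough illustration of how those parameters can be derived from raw samples, the minimal Python sketch below reduces a list of per-request connection times to the request count and the total, minimum and maximum connection time, and computes CPU user and I/O wait percentages from two cumulative counter samples. It is not the tooling used in the experiments: the function names are hypothetical, the sample values are placeholders rather than measurements, and the counter layout is assumed to follow the field order of the aggregate "cpu" line in Linux /proc/stat (user, nice, system, idle, iowait).

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class ConnectionSummary:
    requests: int     # number of completed requests
    total_ms: float   # sum of all connection times
    min_ms: float     # fastest connection
    max_ms: float     # slowest connection


def summarize_connections(times_ms: Sequence[float]) -> ConnectionSummary:
    """Reduce per-request connection times to count, total, min and max."""
    if not times_ms:
        raise ValueError("at least one connection time is required")
    return ConnectionSummary(
        requests=len(times_ms),
        total_ms=sum(times_ms),
        min_ms=min(times_ms),
        max_ms=max(times_ms),
    )


def cpu_percentages(before: Sequence[int], after: Sequence[int]) -> Tuple[float, float]:
    """Return (user %, iowait %) between two cumulative CPU counter samples.

    Assumption: each sample lists user, nice, system, idle, iowait in that
    order, as in the aggregate "cpu" line of Linux /proc/stat.
    """
    if len(before) < 5 or len(after) < 5:
        raise ValueError("each sample needs at least five counters")
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    user_pct = 100.0 * deltas[0] / total
    iowait_pct = 100.0 * deltas[4] / total
    return user_pct, iowait_pct


if __name__ == "__main__":
    # Placeholder values only; these are NOT measurements from the thesis.
    print(summarize_connections([12.0, 8.0, 20.0]))
    print(cpu_percentages([100, 0, 50, 800, 50], [150, 0, 70, 900, 80]))
```

Running the same two functions on the samples collected from the real machine and from each virtual machine would give directly comparable figures for each load.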
8 REFERENCES
[1] Deshi Ye, Qinming He, Hua Chen, and Jianhua Che, “A Framework to Evaluate and Predict Performances in Virtual Machines Environment,” in IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, 2008. EUC ’08, 2008, vol. 2, pp. 375-380.
[2] D. P. Bhukya, S. Ramachandram, and A. L. Reeta Sony, “Evaluating performance of sequential programs in virtual machine environments using design of experiment,” in 2010 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), 2010, pp. 1-4.
[3] I. Ahmad, J. M. Anderson, A. M. Holler, R. Kambo, and V. Makhija, “An analysis of disk performance in VMware ESX server virtual machines,” in 2003 IEEE International Workshop on Workload Characterization, 2003. WWC-6, 2003, pp. 65-76.
[4] R. Geambasu and J. P. John, “Study of Virtual Machine Performance over Network File Systems.”
[5] L. K. John, P. Vasudevan, and J. Sabarinathan, “Workload characterization: motivation, goals and methodology,” in Workload Characterization: Methodology and Case Studies, 1998, 1999, pp. 3-14.
[6] Youhui Zhang, Gelin Su, Liang Hong, and Weimin Zheng, “On Virtual-Machine-Based Windows File Reads: A Performance Study,” in Pacific-Asia Workshop on Computational Intelligence and Industrial Application, 2008. PACIIA ’08, 2008, vol. 2, pp. 944-948.
[7] G. Martinovic, J. Balen, and S. Rimac-Drlje, “Impact of the host operating systems on virtual machine performance,” in 2010 Proceedings of the 33rd International Convention MIPRO, 2010, pp. 613-618.
[8] S. Bose, P. Mishra, P. Sethuraman, and R. Taheri, “Performance Evaluation and Benchmarking,” R. Nambiar and M. Poess, Eds. Berlin, Heidelberg: Springer-Verlag, 2009, pp. 167-182.
[9] Yunfa Li, Wanqing Li, and Congfeng Jiang, “A Survey of Virtual Machine System: Current Technology and Future Trends,” in 2010 Third International Symposium on Electronic Commerce and Security (ISECS), 2010, pp. 332-336.
[10] Qiang Li, Qinfen Hao, Limin Xiao, and Zhoujun Li, “VM-based Architecture for Network Monitoring and Analysis,” in Young Computer Scientists, 2008. ICYCS 2008. The 9th International Conference for, 2008, pp. 1395-1400.
[11] Zhitao Wan, “A Network Virtualization Approach in Many-core Processor Based Cloud Computing Environment,” in 2011 Third International Conference on Computational Intelligence, Communication Systems and Networks (CICSyN), 2011, pp. 304-307.
[12] Y. Chubachi, T. Shinagawa, and K. Kato, “Hypervisor-based prevention of persistent rootkits,” in Proceedings of the 2010 ACM Symposium on Applied Computing, New York, NY, USA, 2010, pp. 214-220.
[13] Wei Chen, Hongyi Lu, Li Shen, Zhiying Wang, and Nong Xiao, “DBTIM: An Advanced Hardware Assisted Full Virtualization Architecture,” in IEEE/IFIP International Conference on Embedded and Ubiquitous Computing, 2008. EUC ’08, 2008, vol. 2, pp. 399-404.
[14] J. LeVasseur et al., “Pre-virtualization: Soft layering for virtual machines,” in Computer Systems Architecture Conference, 2008. ACSAC 2008. 13th Asia-Pacific, 2008, pp. 1-9.
[15] Li Yan, “Development and application of desktop virtualization technology,” in 2011 IEEE 3rd International Conference on Communication Software and Networks (ICCSN), 2011, pp. 326-329.
[16] “Remote Desktop Virtualization Host.” http://technet.microsoft.com/en-us/library/dd560648(WS.10).aspx.
[17] Youhui Zhang, Xiaoling Wang, Gelin Hong Liang Su, and Dongsheng Wang, “Portable desktop applications based on user-level virtualization,” in Computer Systems Architecture Conference, 2008. ACSAC 2008. 13th Asia-Pacific, 2008, pp. 16.
[18] R. Perez, L. van Doorn, and R. Sailer, “Virtualization and Hardware-Based Security,” IEEE Security & Privacy, vol. 6, no. 5, pp. 24-31, Oct. 2008.
[19] A. Agne, M. Platzner, and E. Lubbers, “Memory Virtualization for Multithreaded Reconfigurable Hardware,” in 2011 International Conference on Field Programmable Logic and Applications (FPL), 2011, pp. 185-188.
[20] L. Weng, Gagan Agrawal, U. Catalyurek, T. Kur, S. Narayanan, and J. Saltz, “An approach for automatic data virtualization,” in 13th IEEE International Symposium on High performance Distributed Computing, 2004. Proceedings, 2004, pp. 24-33.
[21] Zhang Qiang, Wu Yunlong, Cui Dong, and Dang Zhuang, “Research on the security of storage virtualization based on trusted computing,” in 2010 2nd International Conference on Networking and Digital Society (ICNDS), 2010, vol. 2, pp. 237-240.
[22] J. Smith and R. Nair, Virtual Machines: Versatile Platforms for Systems and Processes, 1st ed. Morgan Kaufmann, 2005.
[23] “VirtualBox.” https://www.virtualbox.org/.
[24] “QEMU.” http://wiki.qemu.org/Main_Page.
[25] R. Uhlig et al., “Intel virtualization technology,” Computer, vol. 38, no. 5, pp. 48-56, May 2005.
[26] C. Weltzin and S. Delgado, “Using virtualization to reduce the cost of test,” 2009, pp. 439-442.
[27] A. A. Semnanian, J. Pham, B. Englert, and X. Wu, “Virtualization Technology and its Impact on Computer Hardware Architecture,” 2011, pp. 719-724.
[28] T. Adeshiyan et al., “Using virtualization for high availability and disaster recovery,” IBM Journal of Research and Development, vol. 53, no. 4, pp. 1-11, Jul. 2009.
[29] “vmware_doubletake.pdf.” www.vmware.com/pdf/vmware_doubletake.pdf.
[30] “DR_VMware_DoubleTake.pdf.” http://www.vmware.com/files/pdf/DR_VMware_DoubleTake.pdf.
[31] G. Khanna, K. Beaty, G. Kar, and A. Kochut, “Application Performance Management in Virtualized Server Environments,” 2006, pp. 373-381.
[32] M. Pedram and I. Hwang, “Power and Performance Modeling in a Virtualized Server System,” 2010, pp. 520-526.
[33] A. van Cleeff, W. Pieters, and R. J. Wieringa, “Security Implications of Virtualization: A Literature Study,” 2009, pp. 353-358.
[34] M. Rosenblum and T. Garfinkel, “Virtual machine monitors: current technology and future trends,” Computer, vol. 38, no. 5, pp. 39-47, May 2005.
[35] J. Kirkland, D. Carmichael, C. L. Tinker, and G. L. Tinker, Linux Troubleshooting for System Administrators and Power Users, 1st ed. Prentice Hall, 2006.
[36] “The Essence of Research Methodology.” ISBN 978-3-540-71658-7.
[37] P. I. Newman, P. C. S. Ridenour, and C. Ridenour, Qualitative-Quantitative Research Methodology: Exploring the Interactive Continuum, 1st ed. Southern Illinois University Press, 1998.
[38] S. E. Hove and B. Anda, “Experiences from conducting semi-structured interviews in empirical software engineering research,” in Software Metrics, 2005. 11th IEEE International Symposium, 2005, p. 10 pp.-23.
[39] “Welcome to The Apache Software Foundation!” http://www.apache.org/.
[40] “Linux Administration Handbook : Evi Nemeth, Garth Snyder, Trent R. Hein: Books.” ISBN 0130084662.
[41] H. Takahashi, H. F. Ahmad, and K. Mori, “Layered memory architecture for high IO intensive information services to achieve timeliness,” in 2008 IEEE 11th High-Assurance Systems Engineering Symposium, 3-5 Dec. 2008, Piscataway, NJ, USA, 2008, pp. 343-9.
[42] “Internal Validity.” http://www.socialresearchmethods.net/kb/intval.php.
[43] “External Validity -- Educational Research -- Del Siegle.” http://www.gifted.uconn.edu/siegle/research/Samples/externalvalidity.html.
[44] “Conclusion Validity.” http://www.socialresearchmethods.net/kb/concval.php.

APPENDIX A Interview Questions
Beginning of the Interview
1. What is your name?
2. What is your company name?
3. What is your designation at your company?
4. What is your designation role in your company?
Interview questions relevant to thesis work
1. Does your organization use virtualization software?
2. What is the name of the virtualization software?
3. What kind of virtualization architecture is implemented in your organization?
4. What are the main reasons for using virtualization software?
5. How can virtualization software be helpful to an organization?
6. What are the major problems in virtualization software?
7. How can this affect the organization?
8. What is the future of virtualization software?
9. What is the performance of virtualization software in terms of services?
10. Which type of virtualization software package do you prefer?
A. Open Source: Why do you prefer open source?
B. Non Open Source: Why do you prefer non open source?
11. What is the performance of virtualization software services compared to real host operating system services?

APPENDIX B Interview Transcription
Interview 1
The interview was conducted with Balagi of IBM Corporation, Singapore. His designation in the company is advisory IT specialist, and his role in the organization is optimizing virtualization technology. The interview had two main motives. In the first section, we discussed the various virtualization software packages used in his organization. The interviewee named the VMware package and the kernel-based virtual machine QEMU. He described VMware as one of the main software packages used by most organizations and said it is the most widely used product in his organization. We then asked about the architecture used with VMware at IBM. He said that different types of virtualization architecture are available, but they implement full virtualization and paravirtualization. When we asked about the services offered to clients or end users, he said that, based on this architecture, the organization provides virtual servers to end users. Later we discussed the main reasons for using virtualization technology in his organization. He answered: “We are using the virtual software packages in our organization to reduce cost of the physical resources, it is easy to maintain and manage the network infrastructure, and also it is easy to migrate from one place to another place.
If any disaster occurs in the network infrastructure, virtualization allows usage of the physical resources effectively. These are the main reasons to use the virtual machines.” He said that the major drawback of virtualization software is that a high-end server cannot be implemented as a virtual machine. He also said that virtual machines are weak from a security standpoint. On the future of virtualization software, he said that virtualization has a good future in IT organizations: the technology is still being developed, provides many services, and can be used without worrying about real machine resources. In his view, virtualization is set to boom. He also said that virtual machines perform really well in terms of services as long as they are properly designed. If the system tuning and design procedure are optimized, they provide good services to end users. The major advantages he cited are that virtualization software reduces resource costs, manpower and power consumption, and makes data centers easy to manage. He prefers commercial software packages to open-source packages; he recommended them because they provide support and services on time and have more features than non-commercial virtualization software packages.

Interview 2
This interview was conducted with Mullapudi, an employee of HCL Corporation. He works as a senior IT specialist; his role is system administrator, and he also manages UNIX servers and virtual machines. The interview was conducted in two parts. In the first part, we discussed the most-used virtualization software packages in organizations.
In the second part, we discussed the advantages and disadvantages of virtualization software packages. On the most-used packages, he said that his organization uses VMware for most applications, on both the client side and the server side. We then discussed the virtualization architecture used in his organization. He said that they use host-based virtualization and do not use bare-metal virtualization. When we asked about the main reasons for using virtualization software in his company, he said: “We are using the virtualization software packages because it consumes less space and cost of the packages is also less when compared with real resources. Virtual software is easy to maintain and manage. We can increase the number of servers based on the requirement without considering much about the physical resources. The end user point of view they did not have any clue while they are using virtual server or real server also compatibility wise and their performance wise it is very good.” He said that virtualization software also has problems: if anything goes wrong on the host machine server, it affects the whole organization and can force it to shut down the whole process until the problem is rectified. In his view, virtualization software packages have good prospects because new versions keep optimizing them. From the customer's point of view, they reduce the cost of resources. He said that virtualization software packages have a good impact on the IT sector. He considers open-source software packages good.
Open-source software can be changed according to a company's needs; he also mentioned that commercial packages are good for getting product support when customers demand services. Mullapudi said that virtualization software is good to use and has nice features. Virtualization leads to a network infrastructure that is flexible and needs less maintenance. He is very satisfied with the VMware product: although various other virtualization products are available, he finds VMware more comfortable and convenient to use.

Interview 3
This interview was conducted with Mohan Karuppanan of IBM Corporation, Chennai, India. He works as a system administrator; his role is to support and maintain the virtual servers. The interview focused on two major aspects: the most-used virtualization software in organizations, and the advantages and disadvantages of these software packages. On the most-used packages, he named VMware and said it is used by most organizations. He mentioned that bare-metal virtualization architecture is used by the clients of his organization. We then discussed the reasons for adopting virtualization technology in IT organizations. He said that it reduces the cost of resources, so a company can purchase more products for less money, and that virtualization software packages are easier to maintain than real physical resources. As an example, he mentioned that virtualization products are easily portable in a disaster situation: virtual machines can be migrated from one place to another, which provides uninterrupted services to clients. It simplifies the network infrastructure of the IT organization.
When we discussed the problems that arise with virtualization software packages, he said that the technology has a few flaws. For instance, if any host server goes down, the services of its virtual machines are affected; in his view this is one of the main problems of virtualization technology. On the future of virtualization products, he said that virtualization technology will have a good impact on the computing field. It has some strong advantages, for example in management, cost, service and power consumption. Because of these advantages, he expects it to keep growing in the field of computing. He told us that virtual machines perform well, and that services such as web services and database services perform well when configured to a certain standard. We then discussed commercial and open-source virtualization packages. He strongly emphasized that commercial packages are better than non-commercial ones and provide excellent services by comparison. In organizational services, product support plays an important part in choosing a product. He told us that the performance of a real machine and a virtual machine is not the same: a virtual machine cannot replace a real machine, and a real physical machine's performance is unbeatable, but some virtual machine products come close to real machine performance. His overall opinion is that virtual machines are worth using, since they provide many features and virtualization software packages cost less than real physical resources.
He believes virtual machines will make a large impact on the IT sector.

Interview 4
This interview was conducted with Srinivas Rao of Vijay Electricals, India. He works as a system administrator; his responsibility is to manage the network systems in the organization. The interview covered two main aspects:
1. To find the most-used virtualization software packages in his organization.
2. The advantages and disadvantages of the virtualization package.
When we asked about the virtualization software in his organization, he said that they use the VMware product. On virtualization architecture, he said that they use host-based virtualization. He said that virtualization software packages make network management easy. Virtualization products cost less than real resources. They reduce the physical resources a company needs, decrease power usage, and require less manpower to maintain. He said that virtualization software is more helpful than physical resources; for instance, if a disaster occurs, it is easy to recover, and virtual machines can be migrated from one workplace to another. When we asked about the problems associated with virtualization packages, he said that every piece of software has advantages and disadvantages, and virtualization software has problems too: if the host machine goes down during operation, it has a large impact on company services. Virtual machines are also less secure than real machines. He said that the future of virtualization is very good. It is one of the booming technologies in computing, and many companies are adopting virtualization software packages, which clearly shows that the computing sector is adopting virtualization.
Later, we discussed virtualization performance. The interviewee said that virtual machines perform well and, in his experience, provide good services. He said that virtual machines perform well in the real world but do not deliver performance equal to real machines; the impact is acceptable in the computing field. He strongly emphasized that commercial virtualization software packages are good and provide better services and good features. Virtualization technology has many advantages and a few disadvantages, but it makes it possible to own virtual machines at lower cost, and they occupy less space. In his view, real machines cannot match these features.

Interview 5
The interview was conducted with Sundar of Patni Organization, India. He works as a test engineer; his role is to maintain and monitor the network resources in the company. The interview had two main purposes: first, to identify the most-used virtualization software packages in organizations, and second, to identify the major advantages and disadvantages of these packages. When we asked about virtualization software, he named VMware and QEMU; his organization uses VMware. He said that they use paravirtualization architecture. He said that organizations use virtualization software because it reduces the investment cost of company resources. It requires less space to implement. Network resources can be increased without worrying about the company's physical space or investment cost. The network infrastructure is easy to maintain.
When an organization needs to expand its network or add servers, virtualization provides better functionality. If a problem occurs in one virtual machine, it does not affect the other virtual machines residing on the same host machine. Later, we asked him about the disadvantages of virtualization software. He said that if any problem occurs on the host machine, it severely impacts the virtual machines hosted on it. From a security point of view, he added, the network resources must be monitored carefully for intruders. On performance, he said that virtual machines perform well and provide good services for the applications that run on them. The performance of virtualization software is not the same as real host performance, but the difference is small. The features of virtual machines give organizations a reason to choose virtualization software instead of real physical resources. On the type of virtualization package, he prefers commercial packages, because they provide services when the product requires them and have many features. He supports the use of virtualization software in IT organizations; because of the features he mentioned, it is one of the technologies IT organizations are adopting.

Interview 6
The interview was conducted with Ramakrishna Raveela of CGI, India. He works as a senior software engineer; his role in the organization is developer and system maintainer. In the interview, we discussed two main areas of virtualization technology.
In the first part of the interview, we discussed the most-used virtualization software packages in his organization; in the second part, we discussed the advantages and disadvantages of these packages. He said that his organization uses VMware, which is used by most companies. On virtualization architecture, he said that they use host-based virtualization and bare-metal virtualization. He went on to give a few reasons for using virtualization technology in his organization. Physical resources are expensive, and virtualization software packages cost less than real resources. Real resources occupy more physical space in the company, and virtualization reduces the number of physical hosts needed at the workplace. Less manpower is required to maintain the resources when virtualization is used. It is portable, so it provides a flexible environment for users. Later, we discussed the problems of virtualization software. He said that virtualization software is weak from a security point of view, and security engineers must carefully monitor its activities. A virtual machine performs well but cannot match a physical machine's performance. The future of virtualization technology is good: it is being accepted and implemented by most organizations, and if the security issues are fixed, its growth will be huge. He prefers commercial software packages to open-source ones. He said that the features of commercial packages are better organized than those of non-commercial products, and he personally recommends choosing commercial packages.
When we asked for his personal summary of virtual machines, he said that virtualization technology is good and provides more features for using IT services effectively. In his experience, the VMware products are good.

Interview 7
The interview was conducted with Suscheel of IBM, Singapore. It had two aspects: in the first part we discussed the most-used virtualization packages, and in the second part the advantages and disadvantages of virtualization software packages. He named the Virtual Box and VMware packages. In his work experience, VMware is the most-used virtualization package at his organization. When we discussed the reasons for using virtualization technology, he mentioned the following:
1. Virtualization allows resources to be used effectively and efficiently.
2. Virtualization reduces the investment cost of resources.
3. It is very convenient to implement, and the network is flexible to maintain.
4. It provides ownership of the product at lower cost.
5. The virtual machine is easy to upgrade.
6. The performance of the virtual machine is good.
Later, we discussed the problems that arise when using virtual machines. He said that virtual machines are not capable of running high-processing application servers, and that a virtual machine depends entirely on the resources of the host machine. He said that virtual machines perform well for web service and database service applications. Commercial virtualization packages are good and convenient to use in an organization, since they have many good features and are more application-oriented than non-commercial packages. He said that the use of virtualization software will grow gradually because of its benefits.
Companies that use virtualization technology show good progress in terms of services when customer satisfaction is considered. Virtualization fulfills almost all the requirements. When we discussed virtual machine performance compared with real machine performance, he said that virtual machine performance is not equal to that of real machines.
Master Thesis
Electrical Engineering
Thesis no: MEEyy:xx
Month Year

Response Time Effects on Quality of Security Experience
Asad Muhammad (840713-7218)
Wajahat Ali (850723-2638)

School of Computing
Blekinge Institute of Technology
371 79 Karlskrona
Sweden

This thesis is submitted to the School of Computing at Blekinge Institute of Technology in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering. The thesis is equivalent to 20 weeks of full time studies.

Contact Information:

Author(s):
Wajahat Ali
Address: Lindblomsvägen 106, Ronneby
E-mail: [email protected]

Asad Muhammad
Address: Kungsgatan 98, Lgh 0801, 37438, Karlshamn
E-mail: [email protected]

University advisor(s):
Charlott Lorentzen, Ph.D. Student
COM/BTH

School of Computing
Blekinge Institute of Technology
371 79 Karlskrona
Sweden

Internet: www.bth.se/com
Phone: +46 455 38 50 00
Fax: +46 455 38 50 57

ABSTRACT

The recent decade has witnessed an enormous development in Internet technology worldwide. Initially, the Internet was designed for applications such as electronic mail and file transfer. With the technology evolving and becoming popular, people use the Internet for e-banking, e-shopping, social networking, e-gaming, voice and many other applications. Most Internet traffic is generated by the activities of end users when they request a specific web page or web based application. The high demand for Internet applications has driven service operators to provide reliable services to the end user, and user satisfaction has now become a major challenge. Quality of Service is a measure of the performance of a particular service. Quality of Experience is a subjective measure of the user's perception of the overall performance of the network. The high demand for Internet usage in everyday life has made people concerned about the security of information on web pages that require authentication. User perceived Quality of Security Experience depends on Quality of Experience and Response Time for web page authentication.
Different factors such as jitter, packet loss, delay, network speed, supply chains and the type of security algorithm play a vital role in the response time for authentication. In this work we perform a qualitative and quantitative analysis of user perceived security and Quality of Experience with increasing and decreasing Response Times for a web page authentication, and we try to derive a relationship between Quality of Experience of security and Response Time.

Keywords: Quality of Experience, Quality of Service, Response Time and Security.

ACKNOWLEDGEMENTS

We would like to thank our advisor Charlott Lorentzen. Without her generous support and guidance this thesis work would have been impossible. We are also thankful to Prof. Markus Fiedler for his valuable suggestions and opinions for this thesis. We are grateful to our parents for their endless support and love.

Asad Muhammad
Wajahat Ali

Contents

CHAPTER 1 INTRODUCTION
1.1 OBJECTIVE
1.2 RESEARCH QUESTIONS
1.3 DOCUMENT STRUCTURE
CHAPTER 2 BACKGROUND
2.1 QUALITY OF EXPERIENCE
2.2 RELATED WORK
2.3 RESEARCH METHODOLOGY
CHAPTER 3 EXPERIMENT SETUP
3.1 DESIGN
3.2 EXPERIMENT DESCRIPTION
CHAPTER 4 RESULTS
4.1 QUANTITATIVE ANALYSIS
4.2 QUALITATIVE ANALYSIS
4.3 DISCUSSION
CONCLUSION & FUTURE WORK
5.1 CONCLUSION
5.2 FUTURE WORK
BIBLIOGRAPHY
APPENDIX A
APPENDIX B
APPENDIX C

Chapter 1
Introduction

The Internet plays a vital role in everyday life in this modern era of technology. It has become a medium for the exchange of information and communication. People use the Internet for e-mail, e-banking, social networking, e-books, voice and data exchange and many other applications. Most web pages require a username and a password for user authentication. After entering this information, the user has to wait for the authentication procedure to complete and for the information to be fetched from the server and displayed. The Response Time (RT) for retrieving a particular web page or Internet service depends on the type of authentication procedure, the network conditions and the security protocols running in the background, of which the user is unaware. An authentication procedure consists of a chain of messages that must be exchanged before it is completed. The user's perception is based upon the whole RT. If the greatest contributor to the RT within a network is found, it can be minimized or made scalable for large network delays, with the aim of preserving good Quality of Experience (QoE) [2].
Authentication solutions are designed to protect users by keeping undesired and unauthorized people out. Because of authentication, the end user has to wait some extra time. If the response time increases, the end user becomes less interested in the service. Studies have shown that a user notices a response time of 100 ms, gets bored after 4 s and is at risk of leaving a web page at 10 s [6]. Users do become concerned about a login on a particular website when the authentication procedure takes too little or too much time to grant full access, and they start judging the service and its level of security.

Within the last decade Internet traffic has increased drastically. With an increasing number of users, user satisfaction has become a major challenge for service providers. In the present situation, service providers must offer fast and reliable services to meet the demands of the users in order to run their businesses in a competitive market. The performance of any web page depends on Quality of Service (QoS). QoS includes factors such as delay, packet loss, throughput and jitter. User satisfaction, or QoE, is subjective in nature and depends on QoS parameters. The service provider should ensure that the service is safe and available all the time.

It is important to understand how end users feel about the performance and the level of security of a service. Through qualitative and quantitative analysis of user perception of a web authentication procedure, the effect of RT on QoE of Security can be studied. This can help service providers to judge user perception and the level of security of their service. In most cases the end user is unaware of the technical problems within a network and judges the service based on RT, whereas the service provider knows the technical issues within the network and analyzes the problems by monitoring QoS parameters.
Based on an experiment it has been shown that the user interaction time with a web site and the method of page loading affect the QoS [10]. Tolerance of delay depends on users' conceptual models of how a system works. Poor web performance creates a poor corporate image, and users feel less secure while using the website. The user perception can be integrated into server design, which results in QoS that reflects the user's perception of quality [10].

As the use of web based technology grows, with more and more users uploading their personal data to the web, authentication plays a vital role in the Internet world in ensuring the security of data. Based on the level of security, the RT for the web authentication can be changed, which can be useful for service providers that want to deliver secure services to end users and stay competitive in the global market.

1.1 OBJECTIVE

One of the main objectives of this thesis work is to study the effect of response time on user perceived security and to derive a relationship between them. As user perceived security also affects QoE, the other objective is to derive a relationship between user perceived security and QoE. To achieve this we carried out a web login experiment with different response times. Students at Blekinge Tekniska Högskola (BTH) took part in this experiment and answered a survey questionnaire. After collecting the data from the survey questionnaires, we performed a qualitative and quantitative analysis of the data to study user behavior in terms of the perceived level of security of the web login authentication procedure. We visualized the data in Microsoft Excel to show the relationship between RT and performance of the web page, and the relationship between RT and user perceived security.

1.2 RESEARCH QUESTIONS

1. Do the users feel more secure if the response time for a web login is longer?
2. How does user perceived security relate to increased and decreased response times for web authentication?
3. What is the relationship between user perceived security and QoE?

1.3 DOCUMENT STRUCTURE

The remainder of the report is organized as follows. Chapter 2 presents the technical background and the related work that has been done in this field. Chapter 3 describes the experimental setup, and Chapter 4 describes the qualitative and quantitative analysis of the results. Chapter 5 concludes the report and presents future work.

Chapter 2
Background

The Internet was initially designed for simple applications such as the World Wide Web (www), email and file transfer. With the passage of time, the Internet has become the backbone of most existing technology. It is therefore important to understand and analyze networks in order to build more robust and secure web services in the future. Different studies and experiments have been carried out to understand which elements of networks can be improved to provide security to the end user, as most of the user's private information is available on the Internet today. There has always been a threat of unwanted users accessing others' private information. As a result, with the developing technology, end users are more concerned about the security of a web page [18].

Service providers try to make sure that users get fast and reliable services. The user perception of a particular web page or service (QoE) reflects the overall performance and security of the service. This can help service providers to update their services to satisfy the end user. Different service providers compete with each other to provide services. The main aim of each company or service provider is to capture a large share of the market, which is only possible if the users are satisfied with the service. Performance and security are two important parameters that can indicate a user's satisfaction level for web page authentication.
Depending on RT, users get access to a particular web page after authentication. A user's satisfaction level can therefore be estimated through RT. RT itself depends on different network conditions, i.e. QoS, the type of security algorithm, supply chains etc. To study the effect of RT on QoE of Performance and QoE of Security, we performed an experiment and a survey in this research work with a web login procedure. The main aim of this experiment is to study the behavior of different users towards the web login procedure.

2.1 QUALITY OF EXPERIENCE

The concept of QoE is used to measure the user satisfaction level, as shown in Fig. 1. QoE is defined as the overall acceptability of an application or service as perceived subjectively by the end user [5]. QoE includes the complete end-to-end system, ranging from users, terminal, customer premises network, and core and access network to service infrastructure [5]. In this thesis work QoE refers to the user's experience based on the end-to-end RT for web logins.

Fig. 1. Relationship between QoS and QoE [12]. (Diagram labels: End User, Application/Service, Network, QoS, QoE.)

2.1.1 QUALITY OF SECURITY EXPERIENCE

When the user perception is based on applications or services that involve security, e.g. web applications or web services that require authentication, security plays a vital role in QoE. If users do not feel secure enough while using a particular service, the service provider might lose customers. It is important to know whether users really care about the security of a specific service or not. If the users do not care, then the aspect of security might be compromised [6]. Since security plays a crucial role in services today, we carried out an experiment with a web login whose authentication procedure has varying RT, in order to study user behavior with respect to security and QoE.
2.1.2 QUALITY OF EXPERIENCE MEASUREMENT

For statistical analysis and quantitative measurement of QoE, a group of users of an appropriate size must participate in the experiment and then give their ratings by answering a survey. When measuring QoE, users are asked questions with a particular service or application in mind. The questions must relate solely to the service or application and should not be open to misunderstanding. The questions must be generic and specific for all the users participating in the experiment. If these measures are not taken into account, the experiment might lead to biased results.

2.1.3 CHALLENGES

Users' subjective emotions and past experience play an important role in measuring QoE. While measuring QoE, users must be tested with an experiment that is close to a real life scenario. If the experiment is unrealistic, the users may get confused and give biased ratings, which may lead to faulty results. Based on previous experience, users who have used a slow Internet connection may answer the questions differently from users who use a high speed Internet connection. It may also be that not many users are willing to participate in the experiment. The users may not find the experiment interesting, and this may lead to users not giving honest ratings. Sometimes users do not care and give a rating only as a formality of participating in the experiment. Sometimes users' subjective emotions play a negative role in the rating, e.g. if users have too much on their mind or are busy. To get honest user ratings and good results, these challenges should be met by performing an experiment close to the real scenario. In this thesis work we carried out an experiment that is close to the real scenario in order to get good user ratings.

2.2 RELATED WORK

Quality of Experience (QoE) is a widely discussed topic in the modern era of Internet systems and communications.
Assessing user perceived security together with QoE is a new research area and not much work has been done on it. User perceived security has been evaluated in different ways with the help of experiments. The discussion below covers the current research in the area of QoE and user perceived security.

Defining and measuring QoE is difficult and involves studies from different disciplines. QoE involves many factors, of which some are subjective and non-controllable while others are objective and controllable [6]. Subjective factors include user emotions, experience and expectations, whereas objective factors include technical and non-technical factors which can be either application dependent or terminal dependent [6].

A model for user perception of security in web pages has been developed with the help of OpenID web login experiments and MOS (Mean Opinion Score) for quantitative analysis [1]. Previous experiments indicate that, when there is a delay in the service, opinions about web pages that require a login for authentication or security differ from opinions about normal web pages. Users show slightly higher patience towards web pages when security is involved [1]. The identification and quantification of decisive factors for QoE of the Extensible Authentication Protocol Method for GSM Subscriber Identity Modules (EAP-SIM) with OpenID authentication has been studied to find the parts of the EAP-SIM authentication that contribute most to the RT [2]. Based on these experiments, future optimization of user perception towards safety can be analyzed [2].

Society's behavior towards getting rid of anxiety and achieving a greater sense of safety has been studied with the help of an experiment using a nursing care robot for security evaluation [7]. Reference [8] discusses an approach to diminishing anxiety in people's minds and judging the sense of safety towards science and technology from the standpoint of interface engineering.
The effect of color, voice and information presentation on user perceived safety has been studied [8]. Similarly, [9] discusses the user's sense of security in terms of safety, anger and disgust by using a humanoid robot's pick and place motion. Different experiments designed to estimate user tolerance of QoS in the area of e-commerce have been presented, and the design of web servers based on users' conceptual models of web tasks and on user tolerance has been discussed [10].

The research work above has examined QoE and the sense of security from different angles. In these papers, the sense of security has been taken as feelings of happiness, fear, anger or disgust. The sense of security in terms of privacy of information for web logins, as a function of response time, has not been addressed. The current literature lacks a study of the relationship between response time and user perceived security and of their overall effect on QoE. The main aim of this thesis is to study the response time involved in authentication procedures and its effect on user perceived security and overall QoE.

2.3 RESEARCH METHODOLOGY

For this thesis work, we performed a local web login experiment with users based on various RTs. For qualitative and quantitative analysis of the effects of RT on QoE of performance and QoE of security, we designed a survey questionnaire for the users participating in the experiment. The survey questionnaire helps us analyze how users feel about the performance and the security of the web login at different RTs. For the ratings we chose the Continuous Rating Scale (CRS) methodology. CRS is used for subjective user ratings. With this methodology the users are asked to give a rating by placing a mark on a continuous line at a position corresponding to their perception of the observed phenomenon. The line is usually labeled at each end.
The main advantage of this scale is that the user's immediate reaction to a changing level of QoS, which affects QoE, can be judged quantitatively [13]. This assessment is applicable to systems with variable QoS or tasks of low cognitive load [13]. CRS was developed to allow users to assess both audio and video in video conferencing applications [14], [15]. In this thesis work we used this scale for a data transfer application in the form of a web page.

For the quantitative analysis of user perceived security, we divided the ratings into three categories to keep the analysis simple. The first category corresponds to the users who gave high ratings to 0 s RT for web page security. The second category corresponds to the users who gave high ratings to 1 s RT for web page security, and the third category corresponds to the users who gave low ratings to 0 s RT and high ratings to 8 s RT for web page security. These categories are indicated in Appendix B.

Chapter 3
Experiment Setup

In this chapter we discuss the experiment setup and how the users undertook the experiment. The experiment includes adding various RTs to a web page login created in PHP and MySQL, and developing a survey questionnaire for user ratings.

3.1 DESIGN

The main idea behind the experiment setup is that the users are given a platform, in this case a local web page with a user login system. As our work aims at judging the user's sense of security and QoE, the web page takes the username and password from the user for authentication and fetches the information from the web server. To bring our experiment close to real life situations, we introduced various RTs in the login procedure. For quantitative and qualitative analysis of the user's sense of security and QoE we designed a questionnaire. By analyzing the user ratings, the relationship between response times, user perceived security and QoE has been derived.
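The core idea of this design, a login step that waits for a fixed response time before answering, can be sketched as follows. This is a minimal illustration in Python rather than the PHP and MySQL page used in the experiment; the `USERS` dictionary, the `login` function and the injectable `sleep` parameter are assumptions made for the example.

```python
import time

# Hypothetical in-memory credential store (plain text for illustration only);
# the experiment itself kept its users in a database.
USERS = {"alice": "secret"}

# The four response-time cases of the experiment, in seconds.
RESPONSE_TIMES = (0, 1, 4, 8)


def login(username, password, delay_s, sleep=time.sleep):
    """Wait for the artificial response time, then check the credentials.

    `sleep` is injectable so the delay can be replaced by a stub when
    the real waiting time is not wanted.
    """
    if delay_s < 0:
        raise ValueError("delay must be non-negative")
    sleep(delay_s)  # simulated response time
    return username in USERS and USERS[username] == password
```

Calling `login("alice", "secret", rt)` once for each `rt` in `RESPONSE_TIMES` would reproduce the four cases, with each call blocking for roughly `rt` seconds before it returns.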
3.2 EXPERIMENT DESCRIPTION

This section gives a detailed description of the experiment. We set up the experiment in a Windows environment. For this we installed the Apache web server and a MySQL database on the Windows 7 operating system. The web page interface consisted of simple username and password fields, as shown in Figure 3.1. Response time plays an important role in the authentication procedure, and it depends on network conditions, the type of security algorithm and supply chains. To simulate the response time, we introduced a delay in the login procedure with the help of the sleep(x) command in PHP, where x is the required delay in seconds. We created four cases by introducing response times of 0 s, 1 s, 4 s and 8 s.

For the qualitative and quantitative analysis of response time and its effects on user perceived security and QoE, we designed a survey questionnaire which consisted of two questions. These questions were repeated for each of the response times mentioned above. 28 different students with engineering backgrounds participated in the experiment and answered the survey questionnaire. CRS was used for the user ratings. Each user performed the web login experiment for each RT and then gave a rating on the questionnaire. After completing the survey, users also participated in a discussion to give their opinion about their experience of web logins in normal life and their way of thinking about the sense of security and QoE. After the survey we translated the user ratings into percentages for analysis. For visualization we plotted the results as graphs using Microsoft Excel.

Fig. 3.1 Web page for user experiments

Chapter 4
Results

4.1 QUANTITATIVE ANALYSIS

In this section we show the quantitative analysis of the results in the form of graphs. The users first performed the web login experiment with RTs of 0 s, 1 s, 4 s and 8 s and then gave their ratings on the questionnaire.
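The quantitative analysis in this chapter compares trend lines by their R² values. As a rough illustration of that comparison, the sketch below fits a linear and an exponential trend line to mean ratings per RT and computes R² for each. The function names and any ratings passed to them are hypothetical, the exponential model is fitted as a straight line through ln(y), and R² is computed on the original rating scale, so the numbers need not match what a spreadsheet tool reports. Logarithmic and power trend lines are left out because they need strictly positive x values.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def compare_trends(xs, ys):
    """Fit linear and exponential trend lines and report R^2 for each."""
    a, b = linear_fit(xs, ys)
    linear_pred = [a * x + b for x in xs]

    # Exponential model y = c * exp(k*x), fitted as a line through ln(y).
    # This requires every rating in ys to be strictly positive.
    k, ln_c = linear_fit(xs, [math.log(y) for y in ys])
    c = math.exp(ln_c)
    exp_pred = [c * math.exp(k * x) for x in xs]

    return {
        "linear": {"a": a, "b": b, "r2": r_squared(ys, linear_pred)},
        "exponential": {"c": c, "k": k, "r2": r_squared(ys, exp_pred)},
    }
```

For example, `compare_trends([0, 1, 4, 8], [85, 78, 52, 30])`, with made-up mean ratings in percent, returns the fitted coefficients and R² for both models; the model with the larger R² fits that data better.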
4.1.1 PERFORMANCE OF WEB LOGIN

First we needed to study the effect of RT on the performance of the web page login. Based on the RT for the authentication procedure, 28 users gave ratings, which can be seen in Figure 4.1 with RT on the x-axis and user perceived web login performance on the y-axis. We used exponential, linear, logarithmic and power regression lines for the quantitative analysis. We compared the regression lines by their R² values to find the best fitting trend line for our experiment, as shown in Table 1. The exponential regression line gave a better R² value than the linear, logarithmic and power regression lines for predicting the trend. This regression line is shown in Figure 4.1.

Table 1. Regression Lines with Coefficient of Determination for QoE of Performance

Regression Line    R² Value    Regression Line Equation
Exponential        0.992       y = 86.74e^(-0.13x)
Linear             0.956       y = -7.169x + 84.37
Logarithmic        0.567       y = -2.01ln(x) + 53.52
Power              0.479       y = 49.68x^(-0.03)

Figure 4.1 indicates that the performance of the web login decreases as the RT increases. Almost all the users gave high ratings for an RT of 0 s, which is the ideal case, and then for an RT of 1 s. An RT of 8 s is considered the worst for authentication procedures. This shows that performance decreases with increasing RT. In other words, with increasing RT the user perception of the performance of authentication procedures decreases, i.e. the QoE decreases. The exponential regression line shown in Figure 4.1 indicates the decreasing trend with increasing RT. RT affects user satisfaction, and with increasing RT there is a greater risk that the user might stop using a service that has high response times.

Fig. 4.1 RT vs. Performance of web page login

Figure 4.1 indicates that with the increase in RT, the user perceived performance of the web login decreases. Almost all the users gave high ratings for an RT of 0 s, which is the ideal case, and then for an RT of 1 s.
An RT of 8 s is considered the worst for authentication procedures. It can be seen in the graph that with increasing RT the user perception of the performance of authentication procedures decreases, i.e. the QoE of performance decreases. The exponential regression line shown in Figure 4.1 indicates the decreasing trend with increasing RT.

4.1.2 SECURITY OF WEB LOGIN

For the quantitative analysis of security, the users performed the same experiment with the RTs mentioned above. Different users gave different ratings for the sense of security. Since there was a large variation in security perception among the users who participated in the experiment, the quantitative results were divided into categories to keep the analysis simple. For each category we plotted linear, logarithmic, exponential and power regression lines to compare their R² values. The R² values of these regression lines for each category are shown in Table 2.

Category 1 shows the perceived security for 53.5 % of all users who participated in the experiment, as indicated in Figure 4.2. For this category, the exponential regression line gave the best R² value compared with the linear, logarithmic and power regression lines, as indicated in Table 2. The users in this category gave an average rating of 80 % for perceived security at 0 s RT. With the increase in RT the user perceived security decreases. These users felt less secure at an RT of 8 s. The decreasing trend in user perceived security can be seen in Figure 4.2 with the exponential regression line.

Table 2. Regression Lines with Coefficient of Determination for QoE of Security

Category    Regression Line    R² Value    Regression Line Equation
1           Exponential        0.982       y = 75.17e^(-0.11x)
1           Linear             0.936       y = -5.595x + 73.68
1           Logarithmic        0.695       y = -1.76ln(x) + 48.90
1           Power              0.571       y = 46.64x^(-0.03)
2           Linear             0.996       y = -4.756x + 80.23
2           Exponential        0.983       y = 83.53e^(-0.08x)
2           Logarithmic        0.895       y = -14.9ln(x) + 76.89
2           Power              0.847       y = 78.23x^(-0.26)
3           Power              0.874       y = 64.16x^(0.018)
3           Logarithmic        0.835       y = 1.021ln(x) + 64.57
3           Linear             0.799       y = 2.733x + 51.86
3           Exponential        0.756       y = 51.59e^(0.046x)

Fig. 4.2 RT vs. Average user perceived security

Figure 4.3 indicates the results for category 2. This category shows the results for 25 % of all users. These users felt 53 % secure at an RT of 0 s and said that this is the ideal case, and that an RT of 0 s is never possible under varying network conditions. They gave preference to an RT of 1 s and rated the security level of the web page login at approximately 75 %. They felt less secure at RTs of 4 s and 8 s. For this category, the linear regression line gave the best R² value compared with the exponential, logarithmic and power regression lines, as shown in Table 2.

Fig. 4.3 RT vs. Average user perceived security

Figure 4.4 indicates the results for category 3. This category shows the perceived security for 21.5 % of all users who participated in the experiment. The users in this category gave entirely different ratings compared with the categories above. They rated the security of the web login at 46 % for an RT of 0 s, felt more secure at an RT of 8 s, and rated the security of the web login at 71 % there. For this category the power regression line gave the best R² value, as indicated in Table 2. With the increase in RT there is an increasing trend in user perceived security, as indicated in Figure 4.4. The reason that users gave for this behavior was that authentication should require more RT.
If there is a security check running in the background, then the network should spend some time on providing authentication for this web page login.

Fig. 4.4 RT vs. Average user perceived security

4.2 QUALITATIVE ANALYSIS

In this section we present the qualitative results. Each user participated in a short discussion after the experiment, in which they described how they perceive the security and performance of web logins in everyday life. As the user group consisted of international students with different cultural backgrounds, they presented different thoughts about security and QoE based on their past experiences.

Users said that if the RT is 0 s then the service is better performance-wise, but that for the security of a web login there should be some more waiting time. One user said: "If the RT is from 1 s to 3 s then I think the service is very good and the web login is safe, and I would not bother about security for this RT." Users also said that if it is a trusted web login, e.g. Hotmail, Gmail or Yahoo, then they do not worry about security. But if it takes more than 6 s then they might think about leaving that Internet service provider. Two users who use high-speed Internet connections said that 200–300 ms is enough for a good service and for the authentication procedure of a web login to complete. One user said that an RT of 0 s is unrealistic: the service might be good, but the web login may not be safe, as there would be no security algorithm running in the background. Six users said that if it takes a very long time to log in then they think there is something wrong with their system or router. One user said: "If it takes more than 7 s to authenticate myself to a web page then first I think that there is something wrong with my system or router. If they are working all right then I have second thoughts about the security of the web page."
Another user said: "If the RT is 2 s to 4 s then there might be a complex security algorithm running in the background, or there might be proxy servers in the way that are responsible for that. In that case I think the service is good and the RT of 2 s to 4 s is introduced by the extra security of the web page." Eight users who had past experience of slow network speeds said that authentication normally takes a long time and that they feel safe that way. They therefore rated an RT of 0 s as not safe and preferred an RT of 8 s as the safest. One user said: "If the RT is above 4 s then my perception will be that some third party is trying to access my information, unless it is some trustworthy website, but at the same time the service might still be good."

4.3 DISCUSSION

After analyzing and visualizing all the data, it can be seen that users have different satisfaction levels based on their past experiences or network conditions. RT plays a vital role in how users judge the performance and security of a service. As users keep most of their private information on the Internet, they require a good service to access that information and at the same time demand security. The analysis indicates that QoE and perceived security both depend on RT. If the RT is large, users rate the performance of a web page very low. 53.5 % of the users who participated in the experiment rated the security of the web login low for high RTs. 25 % of the users gave their highest security ratings for an RT of 1 s and low ratings for RTs higher than 1 s. 21.5 % of the users had a different view and rated the security of the web login high for high RTs. In their opinion, authentication should take some time, and a higher RT indicates a more complex security algorithm in the background. From Figure 4.1 it can be seen that almost all users judged the QoE of performance as good at small RTs and gave low performance ratings for high RTs.
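As a worked illustration of the trend in Figure 4.1, the exponential regression line reported in Table 1 can be evaluated at the four tested RTs. The values below are points on the fitted curve, rounded to one decimal; they are not the measured user ratings.

\[
\begin{aligned}
y(x) &= 86.74\,e^{-0.13x} \\
y(0) &= 86.74 \\
y(1) &= 86.74\,e^{-0.13} \approx 76.2 \\
y(4) &= 86.74\,e^{-0.52} \approx 51.6 \\
y(8) &= 86.74\,e^{-1.04} \approx 30.7
\end{aligned}
\]

Under this model each additional second of RT multiplies the predicted rating by e^{-0.13} ≈ 0.88, i.e. a relative drop of roughly 12 % per second.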
From Figure 4.2 and Figure 4.3 it can be seen that with increasing RT the majority of the users feel less secure about a particular web login. So if the RT increases, QoE of performance and QoE of security both decrease for most users. If the RT is high due to network conditions, then service providers should give better services to their customers. If the RT is high for the authentication of a particular website while it is low for other websites, then either the complexity of the security algorithm is responsible for the added RT or there might be a problem with the security of that website. In this case users might think that there is no problem with the performance of the service but that there is a problem with the security of that particular web login.

Chapter 5 Conclusion & Future Work

5.1 CONCLUSION

In this thesis work we have presented a qualitative and quantitative analysis of the Quality of Security Experience, which depends both on QoE and on user-perceived security. With the help of a web login experiment and an analysis of the users' survey ratings, we have evaluated the user QoE of performance and of security of a web page login for different RTs. We performed an experiment and a survey to study user behavior towards increasing and decreasing RTs for web authentication. The experiment consisted of a web page login where users entered a username and password for authentication. We defined four authentication cases with RTs of 0 s, 1 s, 4 s and 8 s. The survey consisted of two questions for each case: the first question concerned the performance of the web page and the second concerned the security of the web page. Users first performed the experiment and then answered the survey questionnaire for each case. Based on the user ratings and discussions, we analyzed the results and plotted the relationship between RT, performance of the web page and security of the web page.
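The trend-line comparison behind these plots (fit several regression families, then compare their R^2 values) can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the tooling used in the thesis, which the text does not name. The function names are hypothetical, the fits use ordinary least squares on log-transformed data, and R^2 is computed on the original rating scale. The input ratings are synthetic points evaluated from the exponential equation in Table 1, not the measured survey data. Because ln(0) is undefined, the sketch skips the logarithmic and power families whenever an RT of 0 s is present; how the thesis handled that point is not stated.

```python
import math


def _least_squares(u, v):
    """Ordinary least squares for v = slope * u + intercept."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    sxy = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_u


def _r_squared(ys, predictions):
    """Coefficient of determination on the original rating scale."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def fit_trend_lines(xs, ys):
    """Fit the trend-line families; return {name: (params, r_squared)}."""
    fits = {}
    positive_x = all(x > 0 for x in xs)
    positive_y = all(y > 0 for y in ys)

    # Linear: y = m*x + c
    m, c = _least_squares(xs, ys)
    fits["linear"] = ((m, c), _r_squared(ys, [m * x + c for x in xs]))

    # Exponential: y = a*exp(b*x), fitted as ln(y) = ln(a) + b*x
    if positive_y:
        b, ln_a = _least_squares(xs, [math.log(y) for y in ys])
        a = math.exp(ln_a)
        preds = [a * math.exp(b * x) for x in xs]
        fits["exponential"] = ((a, b), _r_squared(ys, preds))

    if positive_x:
        log_x = [math.log(x) for x in xs]
        # Logarithmic: y = a*ln(x) + c
        a, c = _least_squares(log_x, ys)
        preds = [a * lx + c for lx in log_x]
        fits["logarithmic"] = ((a, c), _r_squared(ys, preds))
        # Power: y = a*x**b, fitted as ln(y) = ln(a) + b*ln(x)
        if positive_y:
            b, ln_a = _least_squares(log_x, [math.log(y) for y in ys])
            a = math.exp(ln_a)
            preds = [a * x ** b for x in xs]
            fits["power"] = ((a, b), _r_squared(ys, preds))

    return fits


# Synthetic ratings generated from the Table 1 exponential equation
# (an assumption for illustration; these are NOT the measured ratings).
rts = [0, 1, 4, 8]
synthetic = [86.74 * math.exp(-0.13 * rt) for rt in rts]

fits = fit_trend_lines(rts, synthetic)
best = max(fits, key=lambda name: fits[name][1])
print(best, fits[best])
```

Selecting the family with the largest R^2 mirrors the comparison reported in Table 1 and Table 2. With only four RT values per fit, such a ranking is sensitive to each individual point.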
After performing the experiment and analyzing the results, we conclude that users differ in their perception of quality and security for web page logins. The results suggest that with increasing RT the users perceive the performance of the service as worse, so the QoE of performance decreases as RT increases. In the survey ratings and discussions about security, users expressed different opinions based on their past experiences and network speeds. 53.5 % of the users who participated in the experiment indicated that they feel less secure with the authentication procedure for web login as RT increases. 25 % of the users preferred an RT of 1 s over the other RTs and felt secure at that RT. They said that if the RT is higher than 3 s then a third party might be trying to access their information. 21.5 % felt more secure as RT increased and think that the complexity of the security checks plays a major role in adding extra RT.

Therefore, for better QoE of performance and better QoE of security of web pages that require authentication, the RT should be small. The security algorithm for web authentication should be designed so that it is secure and at the same time does not increase the RT. Service providers should improve their service by controlling QoS parameters that help them reduce the RT, so that users feel more secure while using the service. Both the security algorithms for web authentication and the network conditions influence the RT. For better QoE of security, both need to work in a way that reduces RT.

5.2 FUTURE WORK

In this thesis work we have not used any security algorithm for user authentication. We have only varied the RT to observe user behavior with respect to QoE of security. Future work should investigate QoE of security and QoE of performance with an actual security algorithm introduced in the experiment, and then study user behavior with surveys and interviews.
The role of the complexity of security algorithms in producing increasing RTs, and its effect on the users' sense of security and QoE, needs to be investigated. This way the users might give different ratings based on their knowledge and expectations. The results of this thesis work should be compared with those of an experiment that includes a security algorithm for authentication, to find the differences in user behavior.

Appendix A PHP Code

<?php
// Start the session before any output is sent so the session cookie can be set.
session_start();
error_reporting(E_ALL ^ E_NOTICE);

// Initialise the session state on the first visit.
if (!isset($_SESSION['aktiv'])) {
    $_SESSION['aktiv'] = false;   // true while the user is logged in
}
if (!isset($_SESSION['cnt'])) {
    $_SESSION['cnt'] = 0;         // number of completed test runs
}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>BTH information</title>
<link href="style.css" rel="stylesheet" type="text/css" />
</head>
<body>
<div id="maincontainerwrapper">
<div id="header"><img src="images/head_logo_en.png" width="568" height="85" alt="header" /></div>
<?php
// Response-time cases in seconds.
$delay = array('0', '1', '4', '8');
//$delay = array('0.5', '1', '2', '5', '10', '10', '5', '2', '1', '0.5');

// Logging out ends the current test run and advances the run counter.
if (isset($_POST['logout'])) {
    $_SESSION['aktiv'] = false;
    $_SESSION['cnt']++;
}

// Any submitted username/password pair is accepted; no credentials are checked.
if (isset($_POST['name']) && isset($_POST['pass'])) {
    $_SESSION['aktiv'] = true;
}

// Show the login form until ten test runs have been completed.
if ($_SESSION['aktiv'] == false && $_SESSION['cnt'] < 10) {
    echo '
    <h1> Welcome to BTH info login</h1>
    <form action="index.php" method="post">
    Username: <input name="name" type="text" /><br/>
    Password: <input name="pass" type="password" /><br/>
    <input name="submit" type="submit" />
    </form>
    ';
}

// After the last run, list the configured delays
// ('test klart' is Swedish for 'test done').
if ($_SESSION['aktiv'] == false && $_SESSION['cnt'] >= 10) {
    echo 'test klart<br>';
    foreach ($delay as $key => $value) {
        echo '# '.($key + 1).' '.$value.'<br>';
    }
}

// Logged-in page.
if ($_SESSION['aktiv'] == true) {
    // The per-test delay lookup is commented out in this listing;
    // sleep(0) adds no server-side delay.
    //sleep($delay[$_SESSION['cnt']]);
    sleep(0);

    echo "Welcome to BTH info<br>";
    echo 'Test no. '.($_SESSION['cnt'] + 1);
    echo "<form action=\"index.php\" method=\"post\">
    <input hidden=\"1\" name=\"logout\">
    <input name=\"submit\" type=\"submit\" value=\"logout\" />
    </form>
    <br> <br> LIFE AT BTH <br> <br>
    Blekinge Institute of Technology (BTH) is one of Sweden's most interesting and beautiful places for higher education! BTH is also the most distinctly profiled institute in Sweden, thanks to our strong emphasis on applied information technology and innovation for sustainable growth.
    BTH was founded in 1989 which means that we are a young institute who manage education and research in new ways, but still with good quality. The humanities, social sciences, management and health sciences are all integrated into an applied IT profile that enables technology and the humanities to develop in exciting new directions.
    Teaching and research at BTH are of a high international standard, with practical learning serving as the focal point for students, teachers and researchers. The emphasis on research, especially cutting edge research, in all our degree programmes is designed to preserve the vital link between education and research. In addition to the large number of nationalities represented on the faculty level, international students from all over the world give us a truly international environment.
    ";
}
?>
<div id="footer">This is an experiment.Wajahat</div>
</div>
</body>
</html>

Appendix B GRAPHS

Appendix C SURVEY QUESTIONNAIRE

Age:
Gender: Male / Female
Nationality:
Time Spent in Sweden:
University Program:
Internet Usage (Web Browsing)/Week (Hours):
Any education about security at university level? Yes / No. If yes then how much?
For what do you use internet mostly?

Case 1
1. How would you rate the performance of this web page, considering response time?
   Worst | - - - - - - - - - | - - - - - - - - - | Best
2. How would you rate your own perception of safety with regards to the response time for this web page log in?
   Not Safe At All | - - - - - - - - - | - - - - - - - - - | Totally Safe

Case 2
1. How would you rate the performance of this web page, considering response time?
   Worst | - - - - - - - - - | - - - - - - - - - | Best
2. How would you rate your own perception of safety with regards to the response time for this web page log in?
   Not Safe At All | - - - - - - - - - | - - - - - - - - - | Totally Safe

Case 3
1. How would you rate the performance of this web page, considering response time?
   Worst | - - - - - - - - - | - - - - - - - - - | Best
2. How would you rate your own perception of safety with regards to the response time for this web page log in?
   Not Safe At All | - - - - - - - - - | - - - - - - - - - | Totally Safe

Case 4
1. How would you rate the performance of this web page, considering response time?
   Worst | - - - - - - - - - | - - - - - - - - - | Best
2. How would you rate your own perception of safety with regards to the response time for this web page log in?
   Not Safe At All | - - - - - - - - - | - - - - - - - - - | Totally Safe

Discussion