Instructions for Isolating One Median from the Sea of Data Introduction A print includes, at a minimum, quadruplicates of each antigen so we need a way of compressing all this information into one numerical value for statistical analyses. Inevitably, compressing the data leads to loss of information so we use a scheme that minimizes the loss of useful information. To this end, we take the median value of a unique antigen. The median is an excellent representation of the average value for data that is not normally distributed and reduces skewing do to outliers in the data. Which Column? GenePix already calculated the median, mean, standard deviation, ratio, and many other statistical parameters during the analysis section of adding a grid to the slide. In addition, GenePix 5.0 outputs information from the GAL file, settings, and flags. In all, it’s a sea of information, and for a person doing their first analysis, swimming may not look like an attractive option. Fortunately, we use a simple strategy in our statistical analyses. This strategy uses the median value of the fluorescent color used with the secondary antibody. For example, if a cy3-conjugated antibody was used, then we want to collect the F532 Median – B532 column (column AX in Excel). Alternatively, if a cy5conjugated antibody was used, then we want to collect the F635 Median – B635 column (column AW in Excel). Consolidating All the Slides into One Sheet Launch Microsoft Excel Select File, Open… (Ctrl+O) Select a slide from the GenePix analysis (see Howto Grid µArray Images.doc) Select the Name column (column G in Excel) Select Edit, Copy (Ctrl+C) Select File, New… (Ctrl+N) I’ll call the new spreadsheet, spreadsheet A, and the old spreadsheet, spreadsheet X. Select Edit, Paste (Ctrl+V) in the first column (column A). Rename the label Name to something such as Unique ID (later programs don’t like the header Name). Return to spreadsheet X. Select the appropriate FXXX Median – BXXX column (depends on color) Select Edit, Copy (Ctrl+C) Return to spreadsheet A. Select Edit, Paste (Ctrl+V) in the second column (column B). Rename the label FXXX Median – BXXX to something more descriptive of the sample and slide #. Select File, Open… (Ctrl+O) Select the next slide from the GenePix analysis Repeat the procedure for adding the Median column from spreadsheet X to spreadsheet A until you have one file containing all the slides from the scan/analysis. Select File, Save As Name the Excel file with a description that represents the state of the analysis. For example, you might name the file slXX-YY_print_YY_MM_DD_medians.xls The first part in the data compression process is complete. Sorting All the Median Values Select File, New… (Ctrl+N) I’ll call the new spreadsheet, spreadsheet B, and the spreadsheet with all the slides, spreadsheet A. Highlight the entire data set (Ctrl+A) from spreadsheet A. Select Edit, Copy (Ctrl+C) Return to spreadsheet B. Select Edit, Paste (Ctrl+V) See the images below for the final product of the next few steps*. Delete empty rows above headers. Insert a row between 1 & 2 (beneath the row of sample/slide names) Number the empty cells beneath each slide with a number that starts with one and increases to n, where n is the total number of slides in the experiment. Move the Unique ID to cell A2. Highlight cells A3 – MN (where M is the final data column and N is the final data row) Select Data, Sort… Select Column A, Check Ascending, Check No Header Row, and Click OK This step groups all the replicates together (see sorted image). Unsorted Image * Sorted Image I reference cells by their row (numbers) and their column (letters). For example, A1 is the cell for the first row and first column. Consolidating All the Median Values into One Median The algorithm to consolidate the one or two thousand total antigens into only the unique antigens uses functions that the average Excel user may or may not be familiar with. I’ll go slowly and clarify each step along the way. Please refer to the image below because I rely on the cell names and numbers in this explanation. This particular experiment contains 26 samples, spanning columns B – AA. I hid columns E – AA to fit all the relevant information onto a reasonably sized image. Highlight cells A1 – AA2 Select Edit, Copy (Ctrl+C) Select cell AD1 Select Edit, Paste (Ctrl+V) Enter the number 1 in cell AB2 Enter the following algorithm in cell AB3 =IF((EXACT($A3,$A4)),1+AB2,1) Select cell AB3 Select Edit, Copy (Ctrl+C) Highlight cells AB3 – ABn† Select Edit, Paste (Ctrl+V) Select cell AC3 Enter the number 1 in cell AC3 Select cell AC4 Enter the following algorithm in cell AC4 =AC3+1 Select cell AC4 Select Edit, Copy (Ctrl+C) Highlight cells AC4 – ACn Select Edit, Paste (Ctrl+V) Select cell AD3 Enter the following algorithm in cell AD3 =IF((EXACT($A3,$A4))," ",A3) Select cell AD4 Select Edit, Copy (Ctrl+C) Highlight cells AD4 – ADn Select Edit, Paste (Ctrl+V) Select cell AE3 Enter the following algorithm in cell AE3 =IF((EXACT($A3,$A4))," ",MEDIAN(OFFSET($A$3,($AC3-$AB2),AE$2,1,1):OFFSET($A$3,$AC2,AE$2,1,1))) Select cell AE4 Select Edit, Copy (Ctrl+C) Highlight cells AE4 – BDn Select Edit, Paste (Ctrl+V) The second part in the data compression process is complete. Eliminating the Extra Space Highlight cells AD1 – BDn Select Edit, Copy (Ctrl+C) Select Sheet2 Right click on cell A1 and select Paste Special… Select Values and click OK Highlight cells A3 – AAn Select Data, Sort… Select Column A, Check Descending, Check No Header Row, and Click OK Scroll down to row n and make sure all the antigens are grouped continuously. If they aren’t, then delete the intervening rows. Save this file and the data compression is now complete! † n represents the final row of the complete data set Brief Explanation of the Algorithm There are two main parts to this algorithm. The first part searches through the list of antigen names looking for duplications. The second part determines the number of replicates for a particular antigen then computes the median value of all the replicates. The first algorithm uses the EXACT function to search for exact string matches. I ask the question, is this name the same as the previous one. If the answer is yes, then I add a blank to the cell and move on. If the answer is no, then I add the name to the cell and move on. The second algorithm uses the EXACT function in the same manner as the first algorithm. In addition, the second algorithm also uses the OFFSET and MEDIAN functions. The OFFSET function allows me to include all the cells that particular group of replicates occupy. The MEDIAN function computes the median value of this set.
© Copyright 2026 Paperzz