The kinship2 R Package for Pedigree Data: Supplementary Material Jason P. Sinnwell, Terry M. Therneau and Daniel J. Schaid Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905 This supplementary document provides additional details and examples for the methods described in the application note entitled ``kinship2: An R package for pedigree data''. The examples below make use of Figure 1 of the main document, a 17-subject pedigree containing four generations, identical twins, and a consanguineous marriage. PEDIGREE PLOTS As a brief example, let us assume we have a data frame in R named fam1 with columns: subjid as a unique id; dad and mom for subject id of each subject's parents (0 if a founder, i.e., no parents in the pedigree); sexcode as 1 or 2 for male or female; and disease and vitalstatus indicator variables for disease and vital status. Below shows an example of how to construct the pedigree object and then plot it: R> ped1 <- with(fam1, pedigree(id=subjid, dadid=dad, momid=mom, sex=sexcode, affected=disease, status=vitalstatus)) R> plot(ped1) The status argument expects the vital status indicator and the affected argument expects an indicator of disease status. The affected argument can also be a matrix, which provides a way to store multiple indicator variables for each subject that can be used in plot symbols or simply used later in analyses. Figure S1 shows a detailed pedigree plot created from subjects 1-10 of the full pedigree shown in Figure 1 of the main document. The plot shapes are split into as many portions as there are indicator variables in the affected matrix of the pedigree object. This example has three indicators, only one of which indicates disease status. The pedigree.legend function created the legend in the top right; indicating which portion of the shape shading indicates a positive status for the three variables in the affected matrix: disease, smoker, and availstatus. Additionally, the plot method allows the subject id to be alternate text. Supplementary figure S1 has the subject identifier concatenated with a character string created from the indicator available in the object's affected matrix, separated by a line-break character. Finally, though not shown here, the plot allows subject-specific colors, which applies to the plot symbol and identifier. PEDIGREE TRIMMING The pedigree.shrink function iteratively removes subjects given their availability status (e.g., genetic data available) and an affected status, from a pedigree in the following order, until the pedigree is of a desired bit size. 1. Uninformative subjects, i.e., unavailable (no genetic data available) with no available descendants 2. Available terminal subjects with unknown affected status if both parents are available 3. Unkown affected status 4. Unaffected 5. Affected The first two steps are always run to shrink the pedigree, regardless of the desired bit size. If after these steps the bit size is not met, the algorithm iteratively shrinks the pedigree by preferentially removing subjects, in the order of steps 3 to 5. If there are multiple subjects that meet the condition under consideration, one of them is randomly chosen to be removed. The method’s default bit size of 16 was chosen because some linkage software has memory requirements that increase exponentially by pedigree size, but a different bit size may be chosen for determining informative subjects in genome-wide association and sequencing studies where cost may be more of a limiting factor. Given the affected information shown in Figure 1, and an availability status that is TRUE for all affected individuals, the pedigree in Figure 1 shrinks to the pedigree in supplementary Figure S1 after just the first two steps of pedigree.shrink, with a bit size of (2*6)-4=8. The command to shrink the pedigree in Figure 1 to the pedigree in Figure S1 is shown below, followed by a call to the pedigree.shrink print method. R> shrink1 <- pedigree.shrink(ped1, avail=availstatus) R> print(shrink1) Pedigree Size: Original Only Informative Trimmed N.subj Bits 17 19 10 8 10 8 Unavailable subjects trimmed: 13 15 16 17 Non-informative subjects trimmed: 11 12 14 Trimming a pedigree can also be carried out by the built-in R sub-setting utilities, using the indices of the subjects that are to be either kept or removed. For example, the pedigree in Figure S1 can be created from the pedigree in Figure 1 with either of the following commands. R> pedS1 <- ped1[-c(11,12,13,14,15,16,17)] R> pedS1 <- ped1[1:10] KINSHIP MATRIX EXAMPLES Supplementary Table S1 shows the kinship matrix for the pedigree shown in supplementary Figure S1. Ignoring for a moment the fourth generation, subject 10, the kinship coefficients for the first three generations follow a simple pattern, where self kinship coefficients are 0.50, parent-offspring coefficients are 0.25, and grandparent-grandchild coefficients are 0.125. Avuncular coefficients for subject pairs 4-7, 3-8, and 3-9 are also 0.125, and cousins, pairs 7-8 and 7-9, are half that at 0.0625. Also, because subjects 8 and 9 are identical twins, the kinship coefficients with each other are the same as their self kinship coefficient. Subject 10 in the fourth generation introduces a few scenarios handled seamlessly by the kinship function. First, subject 10 is equally-related to her uncle as to her father because subjects 8 and 9 are identical twins. Furthermore, subject 10’s self kinship coefficient is greater than 0.5 because of inbreeding; that is, there is a 1/32=0.03125 chance that the two alleles at a given locus are identical by descent from either of her great-grandparents. The inbreeding coefficient for any subject i can be obtained by hi=2Ki,i-1; therefore, h10=0.0625. The X chromosome kinship matrix for the same 10-subject pedigree is in Table S2, where we have included the sex (M=Male, F=Female) for each subject with the identifiers in rows and columns so it is clear to follow the differences from Table S1. Table S1 Kinship matrix on autosomes for 10-member pedigree in Figure S1 ID(sex) 1(M) 2(F) 3(M) 4(F) 5(F) 6(M) 7(F) 8(M) 9(M) 10(F) 1(M) 0.50 0.00 0.25 0.25 0.00 0.00 0.13 0.13 0.13 0.13 2(F) 0.00 0.50 0.25 0.25 0.00 0.00 0.13 0.13 0.13 0.13 3(M) 0.25 0.25 0.50 0.25 0.00 0.00 0.25 0.13 0.13 0.19 4(F) 0.25 0.25 0.25 0.50 0.00 0.00 0.13 0.25 0.25 0.19 5(F) 0.00 0.00 0.00 0.00 0.50 0.00 0.25 0.00 0.00 0.13 6(M) 0.00 0.00 0.00 0.00 0.00 0.50 0.00 0.25 0.25 0.13 7(F) 0.13 0.13 0.25 0.13 0.25 0.00 0.50 0.06 0.06 0.28 8(M) 0.13 0.13 0.13 0.25 0.00 0.25 0.06 0.50 0.50 0.28 9(M) 0.13 0.13 0.13 0.25 0.00 0.25 0.06 0.50 0.50 0.28 10(F) 0.13 0.13 0.19 0.19 0.13 0.13 0.28 0.28 0.28 0.53 Table S2 Kinship matrix on X chromosome for 10-member pedigree in Figure S1 ID(sex) 1(M) 2(F) 3(M) 4(F) 5(F) 6(M) 7(F) 8(M) 9(M) 10(F) 1(M) 1.00 0.00 0.00 0.50 0.00 0.00 0.00 0.50 0.50 0.25 2(F) 0.00 0.50 0.50 0.25 0.00 0.00 0.25 0.25 0.25 0.25 3(M) 0.00 0.50 1.00 0.25 0.00 0.00 0.50 0.25 0.25 0.38 4(F) 0.50 0.25 0.25 0.50 0.00 0.00 0.13 0.50 0.50 0.31 5(F) 0.00 0.00 0.00 0.00 0.50 0.00 0.25 0.00 0.00 0.13 6(M) 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00 0.00 0.00 7(F) 0.00 0.25 0.50 0.13 0.25 0.00 0.50 0.13 0.13 0.31 8(M) 0.50 0.25 0.25 0.50 0.00 0.00 0.13 1.00 1.00 0.56 9(M) 0.50 0.25 0.25 0.50 0.00 0.00 0.13 1.00 1.00 0.56 10(F) 0.25 0.25 0.38 0.31 0.13 0.00 0.31 0.56 0.56 0.56 FIGURE LEGENDS: Figure S1: A detailed pedigree plot with text added to the subject identifier. Subject symbols show the status for three affected indicators, with a legend in the top right giving the portions attributed to each variable.
© Copyright 2026 Paperzz