Investigating patterns of prehistoric dispersal in Eastern Polynesia: a commensal approach using complete ancient and modern mitochondrial genomes of the Pacific rat, Rattus exulans. Katrina West A thesis submitted in fulfillment of the requirements for the Degree of Master of Science in Genetics at the University of Otago, Dunedin, New Zealand June 2016 Abstract The pattern of prehistoric human dispersal through Eastern Polynesia, representing one of the last major migrations in human history, remains highly contested with conflicting archaeological evidence. The commensal approach, using the phylogenies of commensal animal and plant species has been increasingly applied since the late 1990s as a proxy for human movement. Previous genetic commensal research using the Pacific rat, Rattus exulans, a species transported across Remote Oceania as part of the Polynesian expansion, has been largely unsuccessful in distinguishing finer-scale movements between Remote Oceanic islands, with the majority of rat samples adhering to the same mitochondrial control region haplotype. It was anticipated that with greater molecular resolution provided by complete mitochondrial genome sequencing, a greater number of Pacific rat lineages would be distinguished and could be used as a proxy to investigate the origins and dispersals of Eastern Polynesian people. Archaeological rat specimens were obtained from the earliest occupational contexts across Western and Eastern Polynesia. Complete mitochondrial sequencing of both ancient and modern specimens of the Pacific rat was undertaken. Nine ancient and twenty-five modern haplotype lineages were identified. A central haplotype, derived from an ancestral haplotype in Western Polynesia, is ancestral to all Eastern Polynesian rat populations and reflects a previously proposed central East Polynesian homeland region from which eastern expansion occurred. An Easter Island and Tubuai (Austral Islands) grouping of related haplotypes suggests that both islands were established by the same colonisation wave, proposed to have originated in the central homeland region before dispersing through the south-eastern corridor of Eastern Polynesia. The application of second-generation sequencing in generating complete and credible mitochondrial genomes from archaeological and modern commensal specimens provides greater molecular resolution to investigate prehistoric dispersal and interaction spheres across the Pacific. i Acknowledgements It is with immense gratitude that I would like to acknowledge the people who have made this research project possible. Firstly, I would like to thank my supervisor, Lisa Matisoo-Smith, for your invaluable advice and support over the last year. It has been an absolute pleasure to have worked with you and to have been part of your lab. To Olga Kardailsky and Catherine Collins I am extremely appreciative of all the time you have spent with me, imparting your knowledge from sample processing to bioinformatics. You are both great teachers and I am deeply grateful for the skills you have taught me, of which I am sure I will carry with me into my next endeavour. I would also like to mention James Boocock - your assistance in bioinformatics was greatly appreciated. I am also grateful to all the archaeologists and anthropologists who excavated/trapped the Pacific rat samples for which I used in this project. To Jessica, Kate, Edana and Denise without you. You guys are amazing and I look forward to future catch-ups around the world! Most importantly to my family your continued support and faith in me, even from afar, has been instrumental to my goals and success. Of course, none of my work would have been possible without financial support from the university. I am incredibly grateful for the University of Otago Masters Scholarship, which has supported me during my time in New Zealand. ii Table of Contents Abstract....................................................................................................................................... i Acknowledgements .............................................................................................................. ii Table of Contents .................................................................................................................. iii List of Tables ............................................................................................................................ v List of Figures .........................................................................................................................vi List of Abbreviations .......................................................................................................... vii 1 Introduction ......................................................................................................................1 1.1 Overview of Pacific colonisation and dispersal models.......................................... 1 1.2 Interaction spheres in post-settlement Polynesia ................................................. 10 1.3 The commensal approach ............................................................................................... 12 1.4 aDNA: its use and limitations......................................................................................... 19 1.4.1 Preservation................................................................................................................................ 19 1.4.2 Contamination............................................................................................................................ 21 1.4.3 DNA Damage ............................................................................................................................... 22 1.4.4 New methodologies ................................................................................................................. 23 1.5 Research Objectives .......................................................................................................... 26 2 Methods ........................................................................................................................... 27 2.1 Modern DNA sequencing ................................................................................................. 27 2.1.1 Sample Collection ..................................................................................................................... 27 2.1.2 DNA Extraction .......................................................................................................................... 29 2.1.3 PCR amplification ..................................................................................................................... 29 2.1.4 Library preparation & sequencing .................................................................................... 29 2.2 Ancient DNA sequencing ................................................................................................. 32 2.2.1 Sample Collection ..................................................................................................................... 32 2.2.2 DNA Extraction .......................................................................................................................... 35 2.2.3 Library preparation ................................................................................................................. 36 2.2.4 Bait production & preparation ............................................................................................ 39 2.2.5 Hybridisation capture & sequencing ................................................................................ 40 2.3 Bioinformatics processing .............................................................................................. 41 2.3.1 Modern DNA processing ........................................................................................................ 41 2.3.2 Ancient DNA processing ........................................................................................................ 42 iii 2.4 Phylogenetic Analysis....................................................................................................... 42 3 Results .............................................................................................................................. 46 3.1 DNA preservation and sequence recovery ............................................................... 46 3.2 Sequence authenticity ...................................................................................................... 46 3.3 Phylogenetic analysis ....................................................................................................... 50 4 Discussion ....................................................................................................................... 64 4.1 DNA preservation in Pacific archaeological contexts ........................................... 64 4.2 The application of complete mitochondrial genome sequencing .................... 65 4.3 Implications for Pacific rat dispersal.......................................................................... 68 5 Conclusion....................................................................................................................... 76 5.1 Summary ............................................................................................................................... 76 5.2 Future Directions ............................................................................................................... 77 6 Supplementary material ............................................................................................ 78 7 Reference List ................................................................................................................ 89 iv List of Tables 2.1 Tissue samples collected for modern Pacific rat DNA 27 processing. 2.2 a) Long-range PCR amplification for modern samples. 30 b) Long-range PCR amplification for bait. 2.3 Archaeological samples collected for ancient mitochondrial 33-34 DNA processing. 3.1 Summary statistics for successful library amplification and 47-48 post-sequencing processing of ancient R. exulans samples. 3.2 High-coverage ancient samples used in the phylogenetic 49 analyses. 6.1 Variable positions across 203bp of the control region 78-81 (from position 15,392 15,595). 6.2 Archaeological sample information. 82-85 6.3 Modern sample information. 86-88 v List of Figures 1.1 Map depicting Near and Remote Oceania. 2 1.2 The geographical boundaries of Melanesia, Micronesia 3 and Polynesia. 1.3 A neighbour-joining tree and corresponding map depicting 15 the distribution of the mitochondrial control region haplogroups I, II, IIIA and IIIB. 2.1 Geographical distribution of modern and ancient samples 28 collected for DNA processing. 2.2 Targeted overlapping fragments for PCR amplification. 31 2.3 Archaeological sample of Pacific rat (R. exulans) femur. 35 2.4 Overview of the hybridisation capture on beads method. 37 3.1 51 to the ancient sample MS10592. 3.2 Median-joining network of control region haplotypes. 52 3.3 Distribution of nucleotide diversity ( ) across a) modern 54 and b) ancient complete mitochondrial sequences. 3.4 Rooted maximum likelihood (ML) tree inferred from the 56 modern (complete mitochondrial) dataset. 3.5 Median-joining network of the modern (complete 57 mitochondrial) dataset. 3.6 Rooted maximum likelihood (ML) tree inferred from the 59 ancient dataset. 3.7 Median-joining network of the ancient (complete 60 mitochondrial) dataset. 3.8 Median-joining network of the modern and ancient (complete mitochondrial) datasets, including acquired GenBank sequences. vi 63 List of Abbreviations ~ Approximately ± Plus-minus Five prime Three prime l Microlitre M Micromolar aDNA Ancient DNA BAM Binary version of SAM file bp Base pairs B.P. Years before Present COI or COX1 Cytochrome c oxidase subunit I COII Cytochrome c oxidase subunit II COIII Cytochrome c oxidase subunit III Cyt b Cytochrome b DNA Deoxyribonucleic acid dNTP Deoxyribonucleotide triphosphates EDTA Ethylenediaminetetraacetic acid disodium salt dihydrate g Gram HVR I Hypervariable I region ISEA Island Southeast Asia kb Kilo base pairs km Kilometre M Molar mg Milligram MJ Median-joining ml Millilitre ML Maximum-likelihood mtDNA Mitochondrial DNA NADH Nicotinamide adenine dinucleotide ND4L-4 NADH Dehydrogenase subunit 4L SGS Second generation sequencing vii NJ Neighbour-joining PCR Polymerase chain reaction qPCR Quantitative PCR rpm Rotations per minute rRNA Ribosomal RNA SNP Single nucleotide polymorphism sq Square TE Tris-EDTA TET Tris-EDTA with Tween tRNA Transfer RNA U Unit VCF Variant call file viii 1 Introduction 1.1 Overview of Pacific colonisation and dispersal models The expansion of modern humans into the Pacific, commencing with the mid-late Pleistocene migration through south-east Asia to Australia and New Guinea, and culminating with the colonisation of Eastern Polynesia within the last 1000 years, represents one of the earliest and one of the last major migrations in human history (Matisoo-Smith 2015; Duggan et al. 2014). Following the out of Africa migration, a group of early modern humans in Sundaland (Southeast Asia) migrated onto the island continent of Sahul (Pleistocene New Guinea and Australia) approximately 55-40,000 B.P. (Summerhay Allen 2004; Groube et al. 1986). This would have required crossing Wallacea, a sizeable biogeographical ocean barrier separating Sundaland from Sahul. Multiple open-water crossings, each over a minimum distance of 70km (Matisoo-Smith 2015), would only have been possible with the construction of seaworthy watercraft. The crossing onto Sahul is arguably the earliest example of purposeful, longSahul, humans quickly dispersed over the continent reaching southern Australia and north-eastern New Guinea by 50with successive water crossings reached the Bismarck Archipelago by 40,000 B.P. (Leavesley et al. 2002) and the Solomon Island chain by 28,000 B.P. (Wickler & Spriggs 1988). This region of early human settlement across what is now Australia, New Guinea, the Bismarck Archipelago and the Solomon archipelago is increasingly referred to 1991). This is to distinguish the region much later period and are culturally distinct from those in Near Oceania (see Figure 1.1). Historically, Oceania was classified into three cultural/geographical regions 1.2). The Polynesian classification has been frequently used, as it is a meaningful unit for a culturally homogenous region. The use of Melanesia and to a lesser 1 ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! Figure' 1.1' Map' depicting' Near' and'Remote' Oceania.!These!regions!are!separated!by! the!red!dotted!line.!The!Polynesian!Triangle!is!overlaid!over!parts!of!Remote!Oceania!that! are! affiliated! with! the! Polynesian! culture.! Within! Polynesia,! there! is! a! distinct! cultural! separation! between!Western!and!Eastern!Polynesia,!which!is!denoted!geographically!by! the!blue!dotted!line.!Base!map!courtesy!of!Les!O’Neill.! 2! ! ! ! ! Figure' 1.2' The' geographical' boundaries' of' Melanesia,' Micronesia' and' Polynesia.! These! regions!are!denoted!in!green,!yellow!and!blue,!respectively.!Base!map!courtesy!of!Les!O’Neill.! ! 3! extent Micronesia however, has been heavily criticised as the communities across both regions are culturally, biologically, and linguistically diverse (Green 1991); a reflection of population development over a 36,000 year time span. The use of Remote and Near Oceania as both a cultural and geographical divide in Oceania -island distances. The extensive ocean distance separating Near and Remote Oceania was never breached by the original inhabitants of the Solomon Islands. This crossing, instigated by a new wave of human migration, was not to occur for another 25,000 years. Around 3,400 B.P., Austronesian speakers from Island Southeast Asia (ISEA) moved into the Bismarck Archipelago (Summerhayes et al. 2010), settling alongside the original inhabitants. A mixed culture developed into what is now known as the Lapita Cultural Complex (Kirch 2000). Lapita settlements are commonly distinguished by unique earthenware ceramics and the introduction and utilisation of new plant and animal species, such as the Pacific rat, Rattus exulans (Matisoo-Smith et al. 2004). New watercraft technologies such as the invention of outrigger canoes enabled the Lapita people to voyage out beyond the Solomon Islands and make long distance open-sea crossings into the wider Pacific (Irwin 2008). Their success instigated a phase of long-distance voyaging and migration over the next few centuries. Eastward expansion beyond Near Oceania led to the colonisation of the Reef and Santa Cruz groups by 3200 B.P. (Green et al. 2008), Vanuatu and Fiji by 3100 and 3000 B.P. respectively (Denham et al. 2012), Tonga by 2800 B.P. (Burley et al. 2015) and Samoa by 2750 B.P. (Petchey 2001; Clark et al. 2016). After some period of time, the descendants of these Lapita migrants proceeded to move further east into what is now denoted as Polynesian Triangle sq. km stretch of ocean comprising of over 500 islands scattered between Hawaii, Easter Island (Rapa Nui) and New Zealand (Matisoo-Smith 2015). Nearly every habitable island in this region was colonised, some subsequently used for permanent settlements and others abandoned (Kirch 2000). The timing and sequence of settlement across many of the major archipelagos of Eastern Polynesia 4 remain largely unresolved, with conflicting estimates of initial colonisation varying by more than 1000 years (Wilmshurst et al. 2008). relied upon comparative material, cultural and linguistic analyses to develop early dispersal models in Eastern Polynesia (Burrows 1938; Buck 1944; Emory 1946). The prevailing scenario proposed that the Society Islands were the major dispersal centre leading to the colonisation of the rest of the Polynesian Triangle. Preliminary radiocarbon dating of Pacific samples during the late 20th century however, gave way to new theories on Polynesian settlement. The Marquesas Islands were identified as the first group of East Polynesian islands to be settled, after they produced the earliest radiocarbon dates at approximately 1650 B.P. (Emory & Sinoto 1965; Sinoto 1970, 1983). Subsequent radiocarbon dating indicated that Easter Island was settled as early as 1550 B.P., the Hawaiian Islands by 1200 B.P., the Society Islands by 1150 B.P., the Cook Islands by 1050 B.P., and New Zealand between 1150-950 B.P. (Sinoto 1970, 1983; Simmons 1973; Bellwood 1978; Jennings 1979). Artefact assemblages characterised by diagnostic types of wooden hand clubs, basalt adzes, compound-shank fishhooks and pearl pendants from early deposits, indicated a direct cultural link between the Marquesas and the Society Islands (Sinoto 1983). Early material types in the Henderson Islands, Mangareva and Cook Islands also had similar characteristics to artefacts affiliated with the early Marquesan culture. Other archaic artefacts found in the Society Islands, such as whale-tooth pendants and diagnostic harpoon heads were likewise uncovered in early sites in New Zealand, signifying a direct cultural link between these two regions (Sinoto 1983). Reversed-triangular adzes from a later phase in the Society Islands occupation, dated to around 600 B.P., were discovered in Hawaii, indicating a later link between Hawaii and the Society Islands (Emory & Sinoto 1965; Sinoto 1983). In association with the early radiocarbon dates and cultural/material analyses, an orthodox scenario emerged whereby 1) an ancestral Polynesian society developed in the West Polynesian archipelagos of Tonga and Samoa over a 1500 year period; 2) the Marquesas Islands were colonised from Western 5 served as a major dispersal centre towards other East Polynesian islands; and 3) the Society Islands served as a secondary dispersal centre, leading to the colonisation of New Zealand and initiating a secondary movement to Hawaii (Emory & Sinoto 1965; Bellwood 1978; Jennings 1979; Sinoto 1983). The orthodox scenario however attracted substantial criticism. Excavation sampling error as a result of limited and patchy archaeological activity on other Eastern Polynesian islands alluded to the prematurity of the orthodox scenario as an accurate East Polynesian dispersal model (Irwin 1981). A lack of shared artefacts between the earliest assemblages uncovered in the Marquesas and those in West Polynesia (i.e. Samoa and Tonga) also challenged the idea that the Marquesas was the first group of islands to be colonised in Eastern Polynesia (Bellwood 1970; Davidson 1976). If the Marquesas was directly colonised via Western Polynesia, early assemblages from the Marquesas should resemble those from a phase in Western Polynesian culture (Davidson 1976). Given that Eastern Polynesian languages share many lexical and phonological variants, and that Eastern Polynesian culture in general shares many similarities, Kirch (1986) argued that these variants must have developed after an East/West Polynesian split, but prior to further splits within the Eastern Polynesian community. If the Marquesas was settled by 1650 B.P. and was succeeded with a rapid migration to Easter Island and the Society Islands, it is unlikely that that an ancestral East Polynesian language and other cultural innovations would have had the time to develop prior to the rapid migration and establishment of communities across the Polynesian Triangle (Kirch 1986). Alternatively, sustained inter-island groups (Hunt 1979), which would have allowed shared language and cultural innovations to develop concurrently (Kirch 1986). Similar artefact assemblages, such as those between the Marquesas and the Society Islands, may therefore not be representative of a one-way migration and settlement of early Marquesan culture on the Society Islands, but of a shared culture between the archipelagos, sustained by frequent exchange. The settlement sequence depicted in the orthodox 6 scenario therefore may not be as simplistic, but rather involved numerous interisland migrations and interactions (Biggs 1972). Kirch (1986) also noted that the absence of diagnostic artefacts associated with -tooth pendants, harpoon heads and compound-shank fishhooks) in Hawaiian deposits, indicated that Hawaii may have been settled prior to the development of archaic material culture in Central Polynesia. Several C14 age determinations, suggesting Hawaiian occupation as early as the fourth century A.D (~1600-1700 B.P.) (Kirch 1985) during which the Marquesas was allegedly colonised, placed further strain on the orthodox scenario. Kirch (1986) considered it unlikely that Hawaii was rapidly colonised after the purported settlement of the Marquesas, given that there would not have been enough time for the development of a shared ancestral East Polynesian language. Kirch (1986) re-evaluated a number of radiocarbon dates and suggested that the Marquesas was colonised during the late 1st millennium B.C (prior to 1950 B.P.), and both Easter Island and Hawaii by 1550 B.P. He argued that the shared artefact assemblages between the Marquesas, Society Islands and New Zealand are not representative of an archaic material culture, but of a consistently dated between 1250-850 B.P., which was after the settlement of Easter Island and Hawaii. Spriggs & Anderson (1993) conducted a meta-analysis of 147 previously published radiocarbon dates in an attempt to revise the conflicting chronologies. Samples and their corresponding age approximations were rejected as part of a context, materials were associated with short Pacific chronologies or high inbuilt age, samples were of mixed isotopic fractionation, if a stratigraphic context provided no overlap in dating, and if samples were inadequately treated prior to dating. The results from the Spriggs & Anderson (1993) analysis suggested a later chronology for the settlement of East Polynesia, with the Marquesas settled by 1650-1350 B.P., Hawaii after 1350 B.P., Society Islands after 1200 B.P., the Cook Islands by 1150 B.P. and Easter Island towards the end of the 1st millennium A.D 7 (prior to 950 B.P). This built on previous work by Anderson (1991) focusing on the initial settlement of New Zealand, which was dated to between 950-750 B.P. Subsequent radiocarbon studies using short-lived materials, e.g. short-lived plants chronometric hygiene protocol, have generally supported a short and rapid colonisation of East Polynesia from 1050 B.P. to 750 B.P. The short chronology model, incorporating an 1800 year pause between the settlement of Western and Eastern Polynesia, suggests the colonisation of the Society Islands and Marquesas to have commenced between 1050-750 B.P. (Anderson & Sinoto 2002), 950-550 B.P. across the Cook Islands (Kirch et al. 1995; Allen & Wallace 2007), and by 750 B.P. in both Easter Island and New Zealand (Hunt & Lipo 2006; Wilmshurst et al. 2008). Conve model for an earlier and gradual colonisation of Eastern Polynesian islands from 2500 B.P. to 750 B.P. The paleoenvironmental approach uses palynological, paleobotanical and sedimentological analyses of island ecosystems to identify human disturbance in relation to forest clearance, erosion and alluvial deposition, and the introduction and extinction of species, as a proxy for human occupation. Stratigraphic and geochemical evidence from Mangaia, Cook Islands, reveal dramatic changes to the erosional/depositional cycle and a decline in the organic content in sediments, attributed to human occupation as early as 2000 B.P. (Kirch & Ellison 1994). Palynological changes such as an increase of charcoal, in conjunction with a reduction in forest taxa concentrations attributed to slash-andburn horticulture, is dated to around 2500 B.P. (Kirch 1996). In Hawaii, changes in vegetation, the presence of charcoal, the introduction of Polynesian plants and the Pacific rat, Rattus exulans, have been consistently dated to the period 900-1000 B.P. (Burney & Kikuchi 2006; Burney et al. 2001). Rat-gnawed seeds, reflecting the anthropogenic introduction of Rattus exulans in New Zealand, were dated to approximately 750 B.P. (Wilmshurst & Higham 2004). 8 The long or short chronologies for human colonisation in Eastern Polynesia offer conflicting estimates that vary by more than 1000 years. A recent meta-analysis incorporating 1,434 published radiocarbon dates from across the major archipelagos of Eastern Polynesia, indicated that arguments for colonisation prior to the first millennium A.D (before 1950 B.P.), have largely been based on materials providing imprecise calibrations (Wilmshurst et al. 2011). Materials such as long-lived plant, unidentified charcoal, bone and marine shell were often unreliable, introducing substantial in-built age and measurement errors. Dates from these types of materials extended back to 2450 B.P. and showed a larger spread of ages from all island sites. In comparison, dates from short-lived plant and terrestrial bird eggshell provided a much narrower and reliable calibration range from 925-430 B.P. (Wilmshurst et al. 2011). Exclusion of unreliable materials produced a shortened chronology and a two-phase sequence of settlement establishment in the Society Islands ~925-830 B.P. before a widespread radiation towards the Marquesas, ~750-673 B.P.; Easter Island, ~750697 B.P.; Tokelau, 750-550 B.P.; Hawaii, ~731-684 B.P.; New Zealand, ~720-668 B.P.; Southern Cooks, ~700-669 B.P.; and the Line Islands, ~675-657 B.P. (Wilmshurst et al. 2011; Petchey 2010). Recent radiocarbon dating research has focused on selecting short-lived materials (Rieth et al. 2011; Molle & Conte 2011; Kahn 2012), which have generally supported the model of Eastern Polynesian expansion in the early second millennium A.D (after 950 B.P.). A robust chronology and dispersal model for Eastern Polynesian expansion is crucial in order to address initial colonisation events, social development and interactions, trade, resource use and the anthropogenic impact on Pacific ecosystems (Rieth et al. 2011). As discussed above, the use of radiocarbon dating in conjunction with associated material assemblages often forms the basis for dispersal models. However, these chronologies are heavily reliant upon finding suitable samples from the earliest occupation layers across different sites. Furthermore, coastal submergence can greatly affect the visibility of archaeological sites (Dickinson & Green 1998). With many Polynesian islands lacking sustained archaeological research, island groups may be inaccurately represented in dispersal models. 9 1.2 Interaction spheres in post-settlement Polynesia Following the initial settlement of the Polynesian Triangle, widespread voyaging and interaction was evident within island groups and between remote archipelagos (Rolett 2002). Early inter-island interactions may have been crucial to widen marriage spheres, develop political and economic influence, and facilitate the trade of plants and animals that some island populations lacked (Kirch 1988; Clark et al. 2014). The unequal distribution of important tool materials, such as basalt a volcanic rock used for making adzes, may also have facilitated initial trade (Best et al. 1992). High-precision methods such as X-ray fluorescence, electron microprobe analysis and petrography have been used to assess the composition and subsequently the source location of traded basalt in Polynesia. These analyses have provided conclusive evidence for the regional trade of basalt adzes, from the Austral Islands and Society Islands into Rarotonga and Mangaia (Cook Islands) (Sheppard et al. 1997), Eiao (Marquesas Islands) into Moorea (Society Islands) and Mangareva (Gambier Islands) (Weisler 1998), and Pitcairn Island into Mangareva (Conte & Kirch 2004). Basalt and ceramic was also widely exchanged between the Western Polynesian archipelagos of Fiji, Tonga and Samoa (Clark et al. 2014; Dye & Dickinson 1996), with an eastern extension into the Cook Islands (Walter 1998; Kirch 1995; Allen & Johnson 1997). However, artefact transfer between Fiji, Tonga and Samoa was relatively low for the first 1500 years post-colonisation, until 1200 B.P. when inter-archipelago basalt trading began to increase in frequency, peaking around 800-900 B.P (Clark et al. 2014, Cochrane & Reich 2016). Interestingly, this coincides with the expansion into Eastern Polynesia. However, the maintenance of a single language (that eventually evolved into Proto-Polynesian) in the Samoa-Tongan region for the first 1000 years postcolonisation does signify some level of continuous interaction between these two archipelagos (Pawley 2015). A reduction in inter-archipelago interaction between 1700 B.P. and 1200 B.P. likely led to the regional development of the dialects Tongic and Nuclear Polynesian. Rolett (2002) postulated a model of interaction spheres across central-east Polynesia, with prehistoric contact occurring between most islands, except for the 10 most isolated. Rolett argued that based on current evidence, a core interaction zone existed between the Southern Cooks, Society Islands and Austral Islands. Trade links also connected Samoa and Tonga with the Southern Cooks, indicating interaction across the Western and Eastern Polynesian cultural divide. A smaller interaction sphere existed between the Marquesas and the Society Islands, and between Mangareva and the Pitcairn group on the eastern fringes of Polynesia (Rolett 2002). Linguistic evidence also suggests that a secondary interaction sphere may have existed between the Austral Islands, Mangareva (Gambier Islands) and the Pitcairn islands to the eastern Tuamotus and the Marquesas (Kirch 2000). However, given limited archaeological fieldwork in this region of Eastern Polynesia, there has been no material analysis linking these islands to support this claim. An abrupt decline in imported materials reported in later deposits, indicated that widespread trade in interaction spheres contracted after 500 B.P. (Rolett 2002). As populations grew larger and more self-sufficient, the benefits of long-distance trading may have grown smaller (Kirch 1988). Island deforestation as a result of human settlement may also have limited the ability to construct voyaging canoes, as the availability of timber declined over time. It is argued that this contributed to the isolation of Mangareva (Weisler 1994) and Easter Island (Van Tilburg 1994). Heightened hostilities between Tahiti and Moorea in the Society Islands (Haddon & Hornell 1975) may have contributed to a breakdown of the Societies as a regional hub in the core interaction sphere, which would have had indirect repercussions on the outlying archipelagos (Rolett 2002). Ethnohistoric evidence also suggests that inter-chiefdom hostilities were present between Marquesan islands, constraining inter-island voyaging both within the Marquesas and with other island groups (Robarts 1974). The investigation of Polynesian interaction spheres and the subsequent breakdown of these exchange networks after 500 B.P. is critical for understanding the development, through both sustained interaction and isolation, of Polynesian chiefdoms. However, like the chronological research on colonisation, the research on Polynesian trade and interaction relies heavily upon archaeological fieldwork 11 and the subsequent retrieval of comparative material assemblages. The commensal approach, utilising recent advances in genome sequencing and the construction of comparative databases, is increasingly being applied to study dispersal and interaction patterns across the Pacific. 1.3 The commensal approach Molecular research has been applied to the study of Pacific origins and settlement exhibited a high frequency of globin gene mutations, which was also found in populations in Near Oceania. The first mitochondrial research in the Pacific itochondrial DNA (mtDNA) markers found in high frequency in Polynesia) within populations in Island Southeast Asia (ISEA) (Melton et al. 1995; Redd 1995). This provided support for a Polynesian origin in ISEA. Later research found that modern Polynesian populations showed a high frequency of a Near Oceanic derived Y-chromosome (that developed prior to the Austronesian intrusion) even in populations exhibiting a near fixation of ISEA derived mtDNA lineages (Hage and Marck 2003; Kayser et al. 2000; Kayser 2010). This indicated that substantial mixing between the indigenous non-Austronesians and Austronesian-speaking people took place as the latter moved into the Bismarck Archipelago. Hage and Marck (2003) argued that the high frequency of an indigenous non-Austronesian Y-chromosome in modern Polynesian populations is indicative of matrilocal residence patterns and a matrilineal descent structure in the ancestral Austronesian-speaking people. It was proposed that Indigenous non-Austronesian males would have moved in with their s that eventually moved into Remote Oceania. This is consistent with evidence of matrilineal descent in some modern populations found across Micronesia and Island Melanesia (Jordan et al. 2009; Allen 1984). Polynesian populations are however, not well represented in many Pacific-wide genetic studies. Technical and cultural concerns towards the sequencing, storage and interpretation of indigenous human DNA in the Pacific have limited human 12 genetic research in this region. Subsequently this has led to the use of commensal species (plant and animal species transported by humans) as a proxy for human migration (Matisoo-Smith 2015). Introduced by the Lapita people, animal species such as the dog, chicken, pig and the Pacific rat have been found in the earliest archaeological deposits across Polynesia (Matisoo-Smith 2015). Plant species such as kava, breadfruit, taro, yam, ti and the paper mulberry were also introduced (Whistler 2009). Phylogeographic analyses of these commensal species have provided invaluable information on geographic origins and dispersals in conjunction with human movement across the Pacific (Matisoo-Smith 1994; Matisoo-Smith et al. 1998, 2004; Matisoo-Smith & Robins 2008; Barnes et al. 2006; Larson et al. 2007; Storey et al. 2007, 2010; Oskarsson et al. 2012; Roullier et al. 2013; Thomson et al. 2014; Greig et al. 2015). The Pacific rat (Rattus exulans) was the first commensal species investigated in relation to human migration in Oceania (Matisoo-Smith 1994). Argued to have originated in Flores, Indonesia (Thomson et al. 2014), Pacific rats were transported by ancestral Polynesians and are now widely distributed across the region. Large deposits of skeletal remains in early Lapita sites signify that Pacific rats may have been exploited as a food item, and therefore intentionally carried across the Pacific. Unintentional transport as a stowaway however cannot be excluded (Matisoo-Smith et al. 1998). Rattus exulans inability to interbreed with the recently introduced European rats (Rattus rattus and Rattus norvegicus) makes it an ideal species to study models of Lapita origins and dispersals, as its genetic history has not been compromised. Furthermore, there is little evidence of transportation in historic vessels; therefore it has been proposed that the modern Pacific rat populations are direct descendants of the original rats carried by the Polynesian colonists (Matisoo-Smith 1994). Conversely, the dogs, pigs and chickens introduced by the Lapita peoples were the same species as those later introduced by Europeans (Matisoo-Smith et al. 2004). Substantial admixture with the European introduced populations obscures genetic patterns in modern populations that are solely associated with the Lapita movement. 13 Initial research investigating genetic variation in 480bp (base pairs) of the hypervariable mitochondrial control region in modern Pacific rats, identified a central East Polynesian interaction sphere comprised of the southern Cooks and Society Islands (Matisoo-Smith et al. 1998). This broad interaction sphere was central to a northern (i.e. the Marquesas and Hawaii), and a southern Polynesian sphere (i.e. Kermadec Islands and New Zealand). The authors suggested that this likely reflects a central homeland region from which major Eastern Polynesian islands were colonised. This signified that that Marquesas was not part of the central homeland as previously suggested, and given the monophyletic nature of the Marquesan samples in this study, was considered relatively isolated postcolonisation. The Marquesas was however a stepping-stone towards the colonisation of Hawaii (Matisoo-Smith et al. 1998), which coincides with previous linguistic and archaeological research (Green 1966; Elbert 1982; Allen 2014). The appearance of non-Marquesan Pacific rat lineages in Hawaii also suggests multiple introductions either from an early secondary colonisation or later trade links with the central homeland. Within the southern sphere, there is evidence of multiple introductions into New Zealand, both from the Kermadecs and directly from the central homeland. The monophyletic nature of the Pacific rats sourced from the Chatham Islands however, indicate long-term isolation post-colonisation from New Zealand (Matisoo-Smith et al. 1998). This is also consistent with previous linguistic and archaeological research (Harlow 1979; Sutton 1980). A later phylogeographic analysis targeted 240bp of the mitochondrial control region in both ancient and modern Pacific rats across the Pacific (Matisoo-Smith et al. 2004). This study indicated the existence of three major control region haplogroups; the first, an isolated group within Southeast Asia; the second, a dispersed group from Southeast Asia to Near Oceania; and the third, a primarily Remote Oceanic group (see Figure 1.3.). The isolation of Haplogroup I consisting of samples from the Philippines, Borneo and Sulawesi is indicative of an interaction sphere confined within Southeast Asia. This group has no relation to the Oceanic haplogroups following an ancestral split. Haplogroup II is distributed across both Southeast Asia and Near Oceania from the Philippines, through the Bismarck Archipelago and into the Solomon Islands and Santa Cruz Islands. The distribution 14 Figure 1.3 A neighbour-joining tree and corresponding map depicting the distribution of the mitochondrial control region haplogroups I, II, IIIA and IIIB (Matisoo-Smith et al. 2004). Bootstrap values are presented on the main branches. The map symbols are as follows: haplogroup I, filled circle; haplogroup II, asterisk; haplogroup IIIA, open triangle; haplogroup IIIB, filled triangle. Haplotype names are given on the radiating branches of the NJ tree. 15 of Haplogroup II is suggestive of an encompassing interaction sphere that coincides with previous archaeological research indicating the existence of obsidian trading, commensal species translocations and post-Lapita interactions in this region (Flannery 1995; Kirch 1997; Friedlaender et al. 2002). Haplogroup III has been largely found in Remote Oceanic populations and is associated with the Lapita/Polynesian movement through Remote Oceania. Its distribution extends from Vanuatu across to Central and East Polynesia, to Hawaii, New Zealand, and from back migrations, into the Caroline and Marshall Islands. The only western (Near Oceanic) samples belonging to Haplogroup III are from Halamhera, Wallacea, which has been identified as the most likely point of origin for this haplogroup. The lack of Near Oceanic samples in Haplogroup III was surprising, given evidence for the Lapita movement through both Near and Remote Oceania. The authors noted however, that their Near Oceanic samples largely came from large islands (e.g. New Guinea and Bougainville) that were previously occupied and therefore not targeted by the Lapita populations as they moved through New Oceania. If rats of Haplogroup III exist in Near Oceania, it is likely that they are residing on the smaller, offshore island regions that were uninhabited pre-Lapita (Matisoo-Smith et al. 2004). Incomplete sampling may therefore have undetected two major introductions into Oceania the first from an initial pre- Lapita introduction into Near Oceania, and the second with the Lapita movement through offshore island regions in Near Oceania before a further dispersal into Remote Oceania. Further research targeting islands in the Bismarck Archipelago found some overlap between Haplogroup II and III (Matisoo-Smith & Robins 2008). Haplogroup III is further divided into subgroups IIIA and IIIB, the latter derived from IIIA. These subgroups overlap in Western Polynesia, however only subgroup IIIB has been found in Eastern Polynesia (Matisoo-Smith et al. 2004). Samples from the haplogroup IIIB either belong or are derived from the haplotype, R9, which has been found across central East Polynesia. Unfortunately, given that most East Polynesian samples adhere to this R9 haplotype, there has been limited success in determining the dispersal route across this region. This was evident in a study on 16 ancient Pacific rat samples from Anakena, Easter Island (Barnes at al. 2006). All samples presented the R9 haplotype and therefore could not indicate precisely where the Easter Island population originated in Eastern Polynesia. However, the limited variation in the ancient Easter Island samples did indicate a single or limited introduction of the species, followed by extreme isolation (Barnes et al. 2006). Interestingly, the introduction of the Pacific rat onto Easter Island ecologically devastated the island, which in turn is theorised to have contributed to the isolation and cultural collapse of the Rapa Nui people (Hunt & Lipo 2012). Other research of commensal species has provided an insight into the complexity of human migration and commensal translocations across Southeast Asia and the Pacific. Larson et al. (2007) found evidence for three introductions of pig, the first involving the translocation of Sus celebensis to Flores, Timor and Sulawesi by Neolithic settlers; the second involving the introduction of another species (Sus scrofa) originating from mainland Southeast Asia and now found in Taiwan, the Philippines and Micronesia; and the third, the translocation of an independent Pacific lineage of Sus scrofa by the Lapita people across Near and Remote Oceania. This Pacific lineage was traced to Vietnam. However, the absence of the Pacific lineage in Taiwan, the Philippines, Borneo and Sulawesi, suggests that if the model, the Austronesians did not initially transport this lineage. However it was picked up along the way and carried by their Lapita descendants across the Pacific (Larson et al. 2007). Ancient mtDNA analyses investigating the translocation of dogs (Canis familiaris) into the Pacific indicate at least two introductions, which are distinct from the New Guinea Singing dog (NGSD) found in Papua New Guinea, and the dingo lineage that was already established in Australia by 3500 B.P. (Savolainen et al. 2004; Milham & Thomson 1976). Archaeological evidence of dogs in early Lapita contexts is relatively rare, although they are found in many early sites across Central and East Polynesia, except for Easter Island. Two distinct lineages are evident in ancient Polynesian dogs - Arc1, which outside of Oceania is predominately found in mainland and western ISEA; and Arc2, found across all of ISEA (Oskarsson et al. 17 2012; Savolainen et al. 2004). Given that neither of these haplotypes were found in Taiwan or the Philippines, the translocation of these lineages must not have passed through the northeastern route into the Pacific. The presence of a specific haplotype found in ancient New Zealand dogs and in modern dogs largely from Indonesia, provides evidence for a southwestern dispersal through Indonesia (Greig et al. 2015). When and how these dogs dispersed across the Pacific however will require greater sampling and phylogenetic resolution. Chickens are a commensal species that have been found in the earliest Lapita sites across both Near and Remote Oceania (Storey et al. 2010). Recent ancient DNA (aDNA) analyses targeting a 150bp segment of the mitochondrial D-loop, have found evidence for two introductions of chicken into the Pacific, from mainland and Southeast Asia (Storey et al. 2007). Interestingly, this coincides with the two introductions of dogs from similar locations in Southeast Asia. A chicken bone uncovered in a Pre-Columbian archaeological site in El Arenal, Chile produced a sequence identical to prehistoric chickens uncovered in Tonga and American Samoa. The bone sample was radiocarbon dated to 622 ± 35 B.P., which in conjunction with the genetic data provides firm evidence for a Polynesian migration and subsequent translocation of chickens from Oceania to South America (Storey et al. 2007). Thomson et al. (2014) indicated a strong reservation towards this finding, claiming modern chicken contamination of the El Arenal sample. Storey and Matisoo-Smith (2014) insist however that their samples had not been compromised by contamination and that Thomson et al. (2014) did not provide any proof to justifiably dispute their evidence. Further evidence for prehistoric contact between South America and Polynesia is in the widespread presence of sweet potato (Ipomoea batatas), a plant originating from South America that has been found in many pre-European archaeological sites across Polynesia. Targeting both chloroplast DNA and nuclear microsatellites in modern and herbarium samples collected from the Americas, Oceania and Southeast Asia, Roullier et al. (2013) provided strong support for a pre-historic translocation of sweet potato from the Peru-Ecuador region of South America into Polynesia. Whether the multiple lineages in prehistoric Polynesia are 18 representative of one of more introductions, is at the moment unknown. Roullier et al. (2013) did note however that sharing of prehistoric clones between Eastern Polynesian and the Western Pacific was quite rare, reflecting a cultural and subsequent trading boundary between these regions. Conversely in the Western Pacific, a lack of geographic structure suggests a wide circulation, presumably facilitated by trade of several clones. Sharing of some clones within Eastern Polynesia also indicate circulation within this region. Later European contact and modern movements led to multiple new lineages that have recombined and reshuffled the prehistoric sweet potato lineages across Polynesia (Roullier et al. 2013). 1.4 aDNA: its use and limitations The field of ancient DNA (aDNA) is a relatively new area of molecular research, increasingly applied in fields such as anthropology, zoology and plant biology. The sequencing of ancient genetic material sourced from multiple geographic and temporal populations has provided invaluable insights into past demographic history and evolutionary processes (Slatkin & Racimo 2016; Ramakrishnan & Hadly 2009). There are however, many technical issues that arise in working with ancient DNA. Low preservation, contamination and post-mortem damage can affect the amplification, sequencing and authenticity of genetic data. Improved extraction/amplification methodologies and sequencing technologies are however addressing many of the limitations that have plagued previous aDNA research. 1.4.1 Preservation Ancient DNA refers to any DNA retrieved from preserved tissue (e.g. bones, teeth, hair and skin) and other materials (e.g. seeds and pollen). DNA degradation in preserved tissue is largely affected by age, density, environmental factors (e.g. temperature fluctuation and acidity) and post-excavation procedures (Robins et al. 2001). Across the Pacific, Pacific rat samples recovered from older archaeological sites (e.g. from sites in the Santa Cruz Islands, Tonga and Niue dated between 2000-3000 B.P.) have previously produced a low polymerase chain reaction (PCR) amplification success rate for mtDNA (Robins et al. 2001). In comparison, samples 19 from contexts dated to within 1000 B.P. (e.g. Cook Islands, Hawaii, New Zealand and Easter Island) have produced PCR success rates between 60-100%. The site context (i.e. whether it is open, within a rock shelter, or cave) also influences aDNA preservation, with protected sites, such as caves providing a stable temperature and calcareous environment that limits the growth of microbial activity and subsequent degradation of bone apatite (Nielsen-Marsh et al. 2000). Conversely, open sites such as dirt mounds and dunes, in association with higher moisture, acidity and temperature can lead to faster DNA degradation. Pacific islands vary from tropical to temperate climates. All-year round warm-hot temperatures and humidity in tropical parts of the Pacific suggests that DNA preservation will be poor. However, there has been very little research in assessing temperature at various depths in archaeological sites across the Pacific which with lower temperatures would be more conductive to DNA preservation (Robins et al. 2001). Unfortunately, given that many archaeological sites within Remote Oceania are situated within beach dunes (Robins et al. 2001), low temperatures are likely meaningless given that water percolation from the nearby ocean dissolves bone apatite, leading to the growth of microorganisms that metabolise DNA (Bollongino et al. 2008). In conclusion, Robins et al. (2001) indicated that protected sites (i.e. caves and rock shelters) from more temperate islands (e.g. New Zealand, New Caledonia and Easter Island) have produced samples with the highest success rate of DNA recovery, regardless of contextual age, but within the last 4000 years. In tropical environments, both site protection and age have an influence on DNA degradation, with a higher PCR success rate on samples from rock shelters and caves, dated to within the last 2000 years. Samples recovered from open and/or coastal environments produced the lowest PCR success rates and are highly influenced by contextual age (Robins et al. 2001). Post-excavation procedures also influence DNA degradation, as the environmental conditions radically change, and in hot excavations, samples may be left exposed to direct sunlight for hours (Smith et al. 2003). UV light can induce free radical damage to DNA (Bollongino et al. 2008). An analysis on the effect of postexcavation treatments on DNA amplification indicated that washing, treatment 20 with chemicals and long-term storage at room temperature (typical in museum collections) is detrimental to DNA preservation (Pruvost et al. 2007). Recently excavated and unwashed samples provided twice as much authentic DNA than samples that underwent post-excavation treatment (Pruvost et al. 2007). If samples are to be stored long-term, they should be kept unwashed and ideally be placed in a cold, dry environment, such as a freezer (Yang et al. 2005). 1.4.2 Contamination The incorporation of microorganisms into a sample from the site environment, touching of samples without gloves and washing of samples with tap water, both on an excavation and in the laboratory, are the biggest sources of exogenous DNA contamination. Well-preserved samples can be decontaminated in the laboratory, however degraded samples such as porous bone and teeth, are more vulnerable to the penetration of contaminants into the tissue structure itself (Bollongino et al. 2008). The hypersensitivity of PCR reactions whereby amplifying minute amounts of DNA will indiscriminately amplify large amounts of contaminant DNA, can potentially overwhelm the amplification of the authentic DNA template. A number of guidelines are now widely implemented to minimise contamination in the field and in the laboratory, and to decontaminate samples prior to DNA extraction in order to reduce the amplification of exogenous PCR products. The wearing of gloves and the changing of tools is encouraged when handling material in the field to minimise human contamination (Yang et al. 2005). It is also recommended that samples are no longer washed in the field to reduce contamination from the water source. Sample decontamination and DNA extraction must be conducted in a dedicated and isolated laboratory, away from routine DNA amplifications, such as in a modern genetics laboratory. Gloves, face masks, hair nets and in some laboratories are routinely used in the laboratory to minimise contamination from the researcher. There are a number of decontamination strategies to remove exogenous DNA, such as removing surface layers, employing UV irradiation, washing the sample in ethanol, bleach and/or purified water. The immersion of 21 samples into 6% bleach for a period of 15min has been assessed as the most practical and cost-effective method for removing surface contamination (Kemp & Smith 2005). The inclusion of blanks (a control treated as a real sample) should be processed in conjunction with extraction and amplification steps to evaluate crosscontamination. Independent replication is also necessary to discount intralaboratory contamination (Cooper & Poinar 2000). 1.4.3 DNA Damage Enzymatic repair mechanisms that maintain genome stability in living cells, cease to function after death (Lindahl 1993). This leaves DNA vulnerable to strand fragmentation by endonuclease activity and microorganisms that inhabit decaying tissue. Successful sequencing of samples aged between 4 to 13,000 years generally shows DNA fragmentation of between 40-500 bp (Paabo 1989). Preservation in a cold environment can inhibit substantial destruction by these degradative processes, however other chemical processes, such as hydrolytic deamination and oxidative damage can impair the integrity and the ability to amplify and sequence aDNA (Dabney et al. 2013). Hydrolytic damage to both phosphodiester and glycosidic bonds can result in single-stranded nicks and abrasic sites where the base has been de-attached from the sugar-phosphate backbone (Hofreiter et al. 2001). This in turn, can lead to greater strand breakage and aDNA fragmentation, which is the prominent form of aDNA damage (Mitchell et al. 2005). The effect of hydrolytic deamination (i.e. the removal of amine groups) of cytosine bases, is commonly found on the ends of aDNA fragments. Deamination of cytosine produces a uracil base that during DNA replication will incorporate adenine (A), an analogue of thymine (T). This will result in ancient DNA fragments characterised by transitions of cytosine (C) to thymine (T) and on the complementary strand, guanine (G) to adenine (A) (Dabney et al. 2013). C to T transitions can originate in hot spots across both ends of ancient DNA library preparation, as the exonucleolytic activity of T4 DNA 22 own C to T transitions) are removed and the new ends produced during blunt end complemented by a G to A transition after rebuilding complementary strand (Sawyer et al. 2012; Briggs et al. 2007; Brotherton et al. 2007). The development of miscoding lesions from hydrolytic deamination alters the DNA template, leading to nucleotide misincorporation during amplification and sequencing. Oxidative damage, alkylation and Maillard reactions (between reducing sugars and amino groups) can also lead to the development of nucleotide modifications and crosslinks, formed between DNA strands or different DNA fragments. These blocking lesions obstruct DNA polymerases, preventing amplification and subsequent sequencing (Dabney et al. 2013; Willerslev & Cooper 2005). This can lead to a proportionally larger amount of amplified contaminant DNA over endogenous DNA. 1.4.4 New methodologies The initiated early aDNA research as it became possible for the first time to target and amplify small, specific aDNA sequences (Golenberg et al. 1990; DeSalle et al 1992). Many of these early studies were however later disregarded, due to the unregulated amplification of exogenous contaminants (Rizzi et al. 2012). Stricter protocols to minimise contamination during both sampling and processing in the laboratory (Poinar & Cooper 2000), and in assessing the authenticity of sequences in bioinformatics processing have since been widely implemented. Ancient DNA is usually quite short and fragmented, constraining amplification to fragments no larger than 500bp (Willserslev & Cooper 2005). This is quite small, considering that the human mitochondrial and nuclear genomes are approximately 16,500 bp and 3x109 bp, respectively. The traditional aDNA methodology involved amplifying 23 small target fragments (typically up to 200bp), generating several clones for each amplified fragment (applying Sanger sequencing) and meticulously aligning and comparing any overlap from different clones to construct a finalised consensus sequence (Rizzi et al. 2012). The recent emergence of second-generation sequencing however has revolutionised the field of ancient DNA. Secondgeneration sequencing in conjunction with developments in bioinformatics processing have made it possible to simultaneously generate billions of reads from multiple samples in parallel, which are then overlapped and assembled into larger DNA sequences, i.e. complete mitochondrial genomes or nuclear genes (Knapp & Hofreiter 2010). Second-generation sequencing was first implemented in ancient DNA research in 2006, with the generation of a 13 million bp nuclear sequence from an extinct woolly mammoth (Poinar et al. 2006). Prior to the incorporation of this sequencing technology, the largest ancient dataset obtained was a 27,000 bp nuclear sequence from a Pleistocene cave bear (Noonan et al. 2005). Another increasingly applied methodology in ancient DNA research, is the use of hybridisation capture for target DNA enrichment prior to second-generation sequencing (Ng et al. 2009). This new method, limiting the overuse of PCR amplification (which can amplify large amount of exogenous contaminant DNA), involves hybridising target DNA fragments to a known genomic region (alternatively referred to as bait) that has been immobilised on magnetic, streptavidin covered beads. After washing (to remove any non-target libraries e.g. contaminants) the target libraries are eluted from the streptavidin beads and amplified using ligated adapters, prior to sequencing. If libraries are to be pooled for multiplex sequencing (simultaneous sequencing of multiple samples), then a barcode (unique sequencing tag) can be added to each sample during library preparation. The sequencing reads pertaining to each individual sample can then be identified during the bioinformatics processing (Knapp & Hofreiter 2010). Second-generation sequencing can provide direct measures of deamination, depurination, fragmentation and blocking lesions, contributing new insights into degradation and providing a means to authenticate aDNA (Overballe-Peterson et al. 2012; Dabney et al. 2013). The development of polymerase extension profiling, 24 using the high-throughput of second generation sequencing, can identify the locations of blocking lesions, such as crosslinks (Heyn et al. 2010), providing valuable information on biochemical processes in aDNA. Second-generation sequence datasets can be analysed for aDNA deamination patterns using the program mapDamage 2.0 (Jonsson et al. 2013), which provides a quantitative estimate of damage parameters. The appearance of cytosine deamination, characterised by transition across sequence reads and can be used in conjunction with other measures to authenticate aDNA. Previous research has recorded a maximum rate of 0.05 for misincorporation within modern samples (<117 years of age) and a minimum frequency of 0.11 for samples older than 500 years (Sawyer et al. 2012). Increasing coverage during sequencing or applying enzymatic treatments to deaminated residues during blunt-end repair can reduce nucleotide misincorporation as a result of deamination (Briggs et al. 2010; Overballe-Peterson et al. 2012). Direct estimates of exogenous contamination can also be calculated by mapping acquired reads to reference genomes (i.e. from humans, pigs, chickens, cows) (Green et al. 2009). A lack of contamination in conjunction with sequences exhibiting typical aDNA damage patterns can be used to authenticate endogenous DNA sequences. Third-generation sequencing technologies performing single-molecule DNA sequencing will likely be widely utilised in the near future. Unlike secondgeneration technologies, single-molecule DNA sequencing does not require the preparation of DNA libraries (that can increase the level of nucleotide misincorporations) and PCR amplification (that can unintentionally amplify contaminant DNA) (Orlando et al. 2011; Derrington et al. 2010; Pushkarev et al. 2009). Preliminary research has indicated a higher yield of endogenous DNA using single-molecule DNA sequencing on Helicos HeliScope platforms, than using second-generation technologies on Illumina sequencing platforms (Orlando et al. 2011). It is anticipated that the development of third-generation sequencing will outcompete current technologies, providing greater yield, sensitivity and accuracy of aDNA sequencing. 25 1.5 Research Objectives The main aim of this thesis was to investigate the origins and dispersals of Eastern Polynesian people using complete ancient and modern Pacific rat (Rattus exulans) mitochondrial phylogenies. Hybridisation capture, second-generation sequencing and subsequent bioinformatics processes were applied to recover complete ancient Pacific rat mitochondrial sequences, obtained from early archaeological deposits across Western and Eastern Polynesia. A modern complete mitochondrial Pacific rat database was also generated from contemporary tissue samples collected across Eastern Polynesia. Specific objectives were to: 1. Assess whether complete mitochondrial sequencing provides a greater molecular resolution (than control region analyses) to distinguish island populations/lineages across Eastern Polynesia. 2. Examine how the distribution of prehistoric mtDNA haplotypes corresponds with archaeological and linguistic evidence of human dispersal across Polynesia. 3. Identify the immediate origin of Easter Island Pacific rats. 4. Assess whether modern Pacific rat populations are representative of prehistoric populations. 5. Evaluate prehistoric Pacific rat DNA preservation across Polynesia. 26 2 Methods 2.1 Modern DNA sequencing 2.1.1 Sample Collection A total of 97 modern Pacific rat tissue samples from the Society Islands, Cook Islands and Samoa were amassed for mitochondrial genome extraction and sequencing (see Table 2.1; Figure 2.1). These samples were collected by Lisa Matisoo-Smith, James Russell, Lucie Falquier, Gerald McCormack and Terry Hunt between 1992-2009. Rats were trapped using basic snap-type rat traps. Heart, liver and tail samples were removed and either immersed in ethanol at room temperature or placed within a -80°C freezer for long-term storage. Table 2.1 Tissue samples collected for modern Pacific rat DNA processing. Location Island Society Islands Huahine Date of Collected by: collection 1993 Lisa Matisoo-Smith Tahiti 1993 Lisa Matisoo-Smith 7 Raiatea 1993 Lisa Matisoo-Smith 16 Tetiaroa 2009 Aitutaki 1992 Gerald McCormack 4 Pukapuka 1992 Gerald McCormack 7 Atiu 1992 Gerald McCormack 3 Apia, Upolu 1992 Terry Hunt 3 Cook Islands Samoa James Russell and Lucie Falquier (n) 21 36 97 TOTAL 27 ! Figure'2.1'Geographical'distribution'of'modern'and'ancient'samples'collected'for'DNA'processing.!Blue!and!green!filled! triangles!denote!modern!and!ancient!samples,!respectively.! ! 28! 2.1.2 DNA Extraction The extraction and library preparation of modern Pacific rat samples was conducted in a modern DNA laboratory in the Anatomy Department at the University of Otago. DNA was extracted from heart and liver tissue samples using a MagJET Genomic DNA Kit (Thermo Fisher Scientific, Inc), according to the manufacturers instructions. Purified DNA samples were quantified using a NanoDrop 2000 (Thermo Fisher Scientific, Inc) and stored at 4°C until further use. 2.1.3 PCR amplification Whole mitochondrial genomes were amplified using a modified PCR protocol; the original outlined in Robins et al. (2010). Four overlapping fragments, ranging from 4-6kb in length, were amplified separately using targeted primers (see Table 2.2a and Figure 2.2). Long-range polymerase chain reactions (PCR) were conducted using a KAPA LongRange HotStart PCR Kit. PCR amplification reactions in 30 l contained: 1x KAPA LongRange Buffer (without Mg2+ ); 1.75-3mM MgCl2; 0.3mM dNTPs; 0.5 M of each forward and reverse primers; 0.75U KAPA LongRange HotStart DNA Polymerase; and 2 l of DNA template. The following amplification protocol was used: 94°C, 3 min, followed by 10 cycles of: 94°C, 30 s; various annealing temperatures (47°C - 50°C) (see Table 2.2a), 30s; 68°C, 8min, 25 cycles of: 94°C, 30 s; various annealing temperatures (47°C - 50°C), 30s; 68°C, 8min with a 10 s time increment per cycle, followed by 1 cycle of 72°C, 10 min and a hold at 10°C. Amplification of PCR products was analysed by electrophoresis on a GelRed (Biotium) stained 1% agarose gel using a KAPA Universal DNA Ladder (KAPA Biosystems) for comparison. PCR products that successfully amplified were purified using Agencourt AMPure XP beads (Beckman Coulter Inc) and quantified using a NanoDrop 2000 (Thermo Fisher Scientific, Inc). 2.1.4 Library preparation & sequencing Double-stranded barcoded libraries were constructed from the modern PCR fragments for paired end sequencing on an Illumina MiSeq platform (San Diego, CA, USA). Modern libraries were built to contain primer-binding sequences, a 29 Table 2.2 a) Long-range PCR amplification for modern samples. b) Long-range PCR amplification for bait. a) Region amplified 12S COI COI COIII COIII Cyt b b) Cyt b 16S Region amplified COI Cyt b Cyt b COI Primer pairs Av175312SF AAACTGGGATTAGATACCCCACTAT & R6036R ACTTCTGGGTGTCCAAAGAATCA BatL5310 CCTACTCRGCCATTTTACCTATG & LR2RB CTGATTGGAAGTCAGTTGTATTTTT L10647F TTTGAAGCAGCAGCCTGATAYTG & RCb9H TACACCTAGGAGGTCTTTAATTG RGlu2L CAGCATTTAACTGTGACTAATGAC & Long 16SR TGATTATGCTACCTTTGCACGGTCAGGATACC Primer pairs BatL5310 CCTACTCRGCCATTTTACCTATG & RCb9H TACACCTAGGAGGTCTTTAATTG RGlu2L CAGCATTTAACTGTGACTAATGAC & R6036R ACTTCTGGGTGTCCAAAGAATCA 30 Amplicon Annealing length temperature 5.5 kb 50°C 4.1 kb 47°C 5.5 kb 50°C 4.2 kb 50°C Amplicon Annealing length temperature 9.5 kb 47°C 8.3 kb 49°C ! Figure' 2.2! Targeted' overlapping' fragments' for' PCR' amplification.! The! red! arrows! indicate! where! four! overlapping! fragments! were! amplified! for! modern! mitochondrial!DNA!sequencing.!!The!blue!arrows!indicate!where!two!overlapping! fragments!were!amplified!as!part!of!the!R.#norvegicus!bait!preparation,!for!ancient! hybridisation!capture.! ! barcode! sequence! for! sample! identification! in! bioinformatics! processing,! and! adapter!sequences!that!will!bind!to!the!MiSeq!flow!cell.!For!each!sample,!the!four! overlapping!PCR!fragments!were!pooled!into!equimolar!ratios!with!the!addition!of! PCR! grade! water! up! to! 100μl.! Pooled! DNA! fragments! were! then! sonicated! for! 9! cycles! of! 15sec! on,! 45sec! off,! using! a! Picoruptor! (Nippongene).! Sonication! was! necessary! to! break! the! longMrange! DNA! fragments! into! smaller! sizes,! prior! to! ! 31! further processing. DNA ends were blunted using T4 DNA Polymerase (New England BioLabs Inc), phosphorylated using T4 Polynucleotide Kinase (New England BioLabs Inc) and purified using Agencourt AMPure XP beads (Beckman Coulter Inc), in preparation for A-tailing and adapter ligation. Fragments underwent A-tailing using the enzyme Klenow (New England BioLabs Inc), which subsequent ligation of adapters. DNA fragments were once again purified using the AMPure XP beads prior to adapter ligation. Illumina compatible double stranded adapters (ds_AdapterP5 and ds_AdapterP7) were prepared by hybridising each d reverse complementary strands, using the following thermal profile: 95°C, 10s; ramp to 14°C (-0.1°C/s). The adapter sequences (ds_AdapterP5 and dsAdapterP7) were then ligated to the ends of fragments, followed by purification using the AMPure XP beads. A high fidelity PCR was conducted using a KAPA HiFi PCR Kit (KAPA Biosystems). PCR amplification reactions in 30 l contained: 1x KAPA HiFi Buffer; 0.3mM dNTPs; 0.3 M of the primer P5_extension; 0.3 M of a barcoded index primer P7_extension that contains a short, unique sequence allowing sample identification in subsequent bioinformatics processing; 0.6U KAPA HiFi DNA Polymerase; and 19 l of DNA template. The following amplification protocol was used: 94°C, 3 min, 20 cycles of: 94°C, 25 s; 60°C, 15s; 72°C, 15s; followed by a final extension of 72°C for 15 min. This was followed by another purification using the AMPure XP beads. Libraries were quantified using a Qubit 2.0 Fluorometer (Invitrogen) and pooled in equimolar amounts (40ng of DNA per library). Libraries were sequenced using 2 x 250 base pair-end runs on an Illumina MiSeq platform, conducted by New Zealand Genomics Limited (NZGL) at Massey University, New Zealand. 2.2 Ancient DNA sequencing 2.2.1 Sample Collection A total of 77 ancient Pacific rat archaeological samples from Tonga, Tokelau, American Samoa, Easter Island, Austral Islands, Cook Islands and the Society Islands were amassed for mitochondrial genome extraction and sequencing (see Table 2.3; Figure 2.1). These samples were collected by Dave Burley, Simon Best, 32 Table 2.3 Archaeological samples collected for ancient mitochondrial DNA processing. Age of material or associated dates: 1 = Present 500 B.P., 2 = 500 1000 B.P., 3 = 1000 2000 B.P., 4 = 2000 3000 B.P. * refers to unknown. Location Site, Island Age Site type Date of collection Collected by: Excavation report (n) Tonga Pukotala 4 Open 1997 Dave Burley Burley et al. 1999 2 Faleloa, Foa 4 Open 1997 Dave Burley Burley et al. 1999 2 Mele Havea, 4 Open 1997 Dave Burley Burley et al. 1999 1 Fakaofo 2 Open 1986 Simon Best Best 1988 6 Ofu Village, Ofu 4 Open (coastal) 2012 Jeffrey T. Clark Quintus et al. 2015 6 Olosega Village, Olosega 4 Open 2015 Jeffrey T. Clark Unpublished 6 Anakena 2 Open (coastal) 2004 Terry Hunt Hunt & Lipo 2006 11 2 Open (coastal) 2007 Robert Bollt Tokelau American Samoa Easter Island Austral Islands Atiahara, Tubuai 33 Worthy & Bollt 2011 5 Location Site, Island Age Site type Date of collection Collected by: Cook Islands Ureia, Aitutaki 3 Open 1987 Melinda Allen Mangaia * Open (coastal) * Kazumichi Katayama Leach et al. 1994 4 Terre Teahutapu, , Maupiti 1 2012 Jennifer Kahn Kahn et al. 2015 14 Terre Namotu, Maupiti 1, 2 2012 Jennifer Kahn Kahn et al. 2015 4 Haranai Valley, Maupiti 1, 2 Open 2012 Jennifer Kahn Kahn et al. 2015 4 Haumi archaeological site, Moorea 1, 2 * 2014/15 Jennifer Kahn Unpublished 6 Society Islands Open (coastal) Open (coastal) TOTAL Excavation report Allen & Steadman 1990 (n) 6 77 34 Jeffrey T. Clark, Terry Hunt, Robert Bollt, Melinda Allen, Kazumichi Katayama and Jennifer Kahn from archaeological excavations between 1986 and 2015. These archaeological samples were unwashed and placed within a dark and dry environment for long-term storage. 2.2.2 DNA Extraction All aDNA extractions and library building prior to PCR amplifications was conducted in a purpose-built Ancient DNA laboratory in the Richardson Building at the University of Otago. Access and use of the aDNA laboratory requires adherence to strict protocols in place to minimise contamination risks, particularly from areas where amplified DNA is a present, i.e. modern DNA laboratory. Femurs were common in the acquired Pacific rat bone assemblage (see Figure 2.3) and were targeted for DNA extraction. Figure 2.3 Archaeological sample of Pacific rat (R. exulans) femur. 35 Femurs are the largest bone in the Pacific rat skeleton, weighing on average 0.1g (Matisoo-Smith et al. 1997). Given it is not appropriate to pool bone samples in order to prevent more than one individual from being analysed, the femur provided the best chance of retrieving preserved DNA from an individual rat. Prior to DNA extractions, bone samples were soaked in 5% bleach for 10 minutes, rinsed several times with ultrapure water and left to dry overnight. Each bone sample was ground into a fine powder using a sterile mortar and pestle and digested overnight in an extraction buffer containing 0.45M of EDTA and 0.25mg/ml of Proteinase K (Rohland & Hofreiter 2007). Samples were processed in sets of nine in conjunction with one extraction blank. Extraction blanks were processed alongside samples throughout extraction and library preparation, and sequenced in a separate run consisting of all blanks. This was necessary to identify any crosscontamination between samples that would jeopardise the integrity of each sequence in a set. Following the overnight digestion, the supernatant from each digested sample (and blank) was added to 2.5ml of binding buffer and then combined with 100 l of silica suspension. Samples were rotated for 3hrs in the dark at room temperature to bind the DNA to the silica surface. After binding, the silica was extracted, repeatedly cleaned with a wash buffer to remove PCR-inhibiting agents and DNA eluted in 1x TE buffer. Each sample was then spun down and the supernatant, containing the DNA extract, was transferred to siliconised tubes and stored at 20°C (Rohland et al. 2010). 2.2.3 Library preparation Double-stranded barcoded libraries were generated from aDNA extracts (Knapp et al. 2012) in preparation for hybridisation capture and paired end sequencing on an Illumina MiSeq platform (see Figure 2.4a). Blunt-end repair was performed using T4 DNA Polymerase (New England Biolabs Inc) and phosphorylated using T4 Polynucleotide Kinase (New England Biolabs Inc), allowing for subsequent adaptor ligation. Post-blunt end repair, DNA samples were purified with a MinElute PCR 36 ! ! ! Figure' 2.4! Overview' of' the' hybridisation' capture' on' beads' method.! a)! The! production! of!indexed! (barcoded)! libraries!from! individual!bone!samples.!b)!The! production!of!immobilised!bait!using!a!fresh!tissue!sample!from!the!species!Rattus& norvegicus! c)' Hybridisation! of! indexed! libraries! onto! baited! beads.! Captured! libraries! are! then! washed! before! being! denatured! from! the! beads,! eluted! and! amplified!prior!to!multiplex!sequencing.! ! 37! Purification Kit (QIAGEN), with the following modification: DNA eluted in 0.1x TE + 0.05% Tween. Illumina compatible adaptors, Sol_adap_P5 and Sol_adap_P7-BIO, were ligated to the ends of the target sequences and purified with the MinElute PCR Purification Kit (QIAGEN) with the following modifications: 2x PE wash steps and elution of DNA in 25 l of 1xTE. The Sol_adap_P7- nd, allowing the DNA library to immobilise onto streptavidin-coated beads. The beads were then washed to remove unincorporated adaptors, before being re-suspended in a master mix containing Bst Polymerase, an enzyme that fills in any nicks to complete adaptor attachment. A quantitative PCR (qPCR) was then performed to determine the number of cycles required for the sufficient amplification of each library in the immortalisation step. qPCR reagents and target libraries were prepared in the ancient DNA lab, before being transferred to the modern DNA lab for PCR amplification. Each qPCR amplification reaction in 25 l contained: 1 l of adaptor ligated library; 10 M each of the universal adaptors, Sol_quant_P5 and Sol_quant_P7; and 12.5 l of SYBR Green Master Mix. The following qPCR conditions were used: 95°C, 10 min, 40 cycles of: 94°C, 30 s; 58°C, 30s; 72°C, 1min, followed by a final extension at 72°C, 10 min on a Stratagene MxPro 3000P. Each qPCR amplicon was then visualised by electrophoresis on a GelRed (Biotium) stained 1% agarose gel using a KAPA Universal DNA Ladder (KAPA Biosystems) for comparison to check the success of library preparation and for the presence of adapter dimers. If libraries presented adaptor dimers, a second attempt at library preparation was made or they were removed from further processing. Preparation of libraries for immortalisation was conducted in the ancient DNA laboratory, before being transferred to the modern DNA laboratory for PCR amplification. Each immortalisation amplification reaction in 50 l contained: 19 l of adaptor ligated library; 1xTaq Buffer; 2.5mM MgCl2; 1mM of dNTPs; 3.75U of AmpliTaq Gold polymerase (ThermoFisher Scientific); 0.2 M of extension primer, Sol_ext_P5; and 10 M of a barcoded index primer, Sol_ext_P7. The following immortalisation conditions were used: 95°C, 12 min, # cycles (determined by when the plateau was reached during the qPCR) of: 94°C, 30 s; 58°C, 30s; 72°C, 38 1min, followed by a final extension of 72°C for 10 min. Following immortalisation, samples were purified with a MinElute PCR Purification Kit (QIAGEN) with the following modification: 2x PE washes. Immortalised libraries were then amplified using a KAPA HiFi PCR Kit (KAPA Biosystems) in preparation for the capture, each 50 l reaction containing: 1 l of immortalised library; 1x KAPA HiFi Buffer; 0.3mM dNTPs; 0.3mM each of amplification primers, Sol_amp_p5 and Sol_amp_p7; and 1U of KAPA HiFi DNA Polymerase. The following amplification conditions were used: 94°C, 5 min, 10 cycles of: 94°C, 20 s; 55°C, 55s; 72°C, 15s, followed by a final extension of 72°C for 5 min. Following amplification, samples were purified with a MinElute PCR Purification Kit (QIAGEN) with the following modification: 2x PE washes. 2.2.4 Bait production & preparation Baited beads were produced for hybridisation capture (see Figure 2.4b) using tissue from a laboratory rat (Rattus norvegicus), sourced from the Department of Anatomy, University of Otago. A preserved tissue sample of an R. exulans, originally sourced from Near Oceania, was initially used for bait production however the sample provided heavily degraded mitochondrial DNA that could not be amplified in two overlapping fragments. The R. norvegicus sample was therefore used to mass-produce the 8-9.5kb overlapping fragments. DNA extractions for the bait production methodology. Whole mitochondrial genomes were amplified using a modified PCR protocol as outlined in the modern DNA amplifications above. The mitochondrial genome was amplified in two overlapping fragments using the following targeted primers (see Table 2.2b and Figure 2.2). PCR products were analysed by electrophoresis and if successfully amplified, were purified with a MinElute PCR Purification Kit (QIAGEN) and quantified using a NanoDrop 2000 (Thermo Fisher Scientific, Inc). Bait preparation was conducted following the protocol detailed in Maricic et al. (2010). The two long-range fragments were pooled into equimolar amounts and sonicated for 9 cycles of 15sec on, 45sec off, using a Picoruptor (Nippongene). 39 Sonication was necessary to break the long-range DNA fragments into smaller sizes, prior to further processing. Sonicated bait was then aliquoted into small PCR Each bait sample underwent blunt end repair followed by purification with a MinElute PCR Purification Kit (QIAGEN). Each blunt-ended sample was then -T/B adapters, purified with a MinElute PCR Purification Kit (QIAGEN) and quantified with a NanoDrop 2000 (Thermo Fisher Scientific). Five hundred nanograms of Bio-T/B adapter-ligated bait DNA was then aliquoted into PCR tubes and denatured at 98°C. Bait DNA was combined with purified streptavidin beads and rotated at room temperature for 20 minutes to allow binding. Post-binding, the baited beads were collected and re-suspended in TET buffer until ancient samples were ready for hybridisation capture. 2.2.5 Hybridisation capture & sequencing Ancient libraries were captured using a modified hybridisation protocol, the original outlined in Maricic et al. (2010). A hybridisation mixture containing: 2 g of barcoded library; 1 M each of the blocking oligonucleotides: BO_p5_f, BO_p5_r, BO_p7_f(1), BO_p7_r(1), BO_p7_f(2), BO_p7_r(2); 0.6x Agilent blocking agent; and 0.6x Agilent hybridisation buffer was prepared for each sample and treated to the following conditions: 95°C, 3 min; 37°C, 30 min. The hybridisation mixture was then added to the baited beads and rotated at 12rpm for 48hrs in a 65°C hybridisation oven (see Figure 2.4c). Following the capture, the baits beads, now containing captured libraries, were isolated and washed as described by Maricic et al. (2010), before being re-suspended in 15ul of 1x TE solution and heated at 95°C, 3 min, denaturing the captured libraries from the beads. The supernatant containing the captured libraries was transferred to a new siliconised tube and stored at -20°C. Captured libraries were then re-amplified using a KAPA HiFi PCR Kit (KAPA 15 l of post-captured library; 1x KAPA Buffer; 0.3mM dNTPs; 0.3mM each of the amplification primers, Sol_amp_p5 and Sol_amp_p7; and 1U of KAPA HiFi DNA Polymerase. The following 40 amplification conditions were used: 94°C, 5 min, 20 cycles of: 94°C, 20 s; 55°C, 55s; 72°C, 15s, followed by a final extension of 72°C for 5 min. Following amplification, samples were purified with a MinElute PCR Purification Kit (QIAGEN), quantified Qubit 2.0 Fluorometer (Invitrogen) and stored at -20°C. Samples were pooled in equimolar concentrations per run, resulting in 80-150ng of each library being sequenced. Libraries were sequenced using 2 x 75 base pair-end runs on Illumina MiSeq, conducted by New Zealand Genomics Limited (NZGL) at Massey University, New Zealand. A negative run containing all the blank samples was also sequenced. 2.3 Bioinformatics processing 2.3.1 Modern DNA processing The raw reads from the modern samples were mapped to a reference genome (R. exulans, GenBank NC_012389.1) (Robins et al. 2008) in BWA v.0.7.12 (Li & Durbin 2009), implementing the maximal exact matches (MEM) command (Li 2013). PCR (http://broadinstitute.github.io/picard/). Variant calling was conducted using the GATK Haplotype Caller (McKenna et al. 2010) and the subsequent variant call file (vcf) converted to a (https://github.com/theboocock/ancient_dna_pipeline, FASTQ Boocock file 2014) for phylogenetic analyses. To assess the impact of contamination from laboratory reagents, reads were also mapped to mitochondrial reference genomes from cow (Bos Taurus, GenBank accession NC_006853.1), pig (Sus scrofa, NC_0012095.1), human (Homo sapiens, GenBank NC_012920.1), chicken (Gallus gallus, GenBank NC_001323.1) and dog (Canis lupus familiaris, GenBank NC_002008.4). A contamination ratio was calculated by the ratio of all mapped reads to a reference (quality > =20) in comparison to those mapping to the R. exulans mitochondrial genome. Sequencing coverage was evaluated in Geneious (v.8.1.8) (http://www.geneious.com, Kearse et al. 2012) and only sequences exhibiting more than 90% coverage were retained. 41 2.3.2 Ancient DNA processing The raw reads from the ancient samples were initially processed in AdapterRemoval (v.2) (Lindgreen 2012) to remove adapters, merge paired-end fragments (overlapping by at least 11 base pairs), and remove stretches of Ns, bases with low quality scores (<30) and short reads (<25). Reads were then mapped to a reference genome (R. exulans, GenBank KJ530564.1) (Tsangaras et al. 2014) in BWA v.0.7.12 (Li & Durbin 2009), following recommended settings for ancient DNA, i.e. n 0.03 (allow for more substitutions), -o 2 (allow more gaps at the beginning) and l 1024 (deactivate seed mapping) (Schubert et al. 2012). As outlined previously, reads were also mapped to the mitochondrial genomes of cow, human, pig, chicken and dog to calculate a contamination ratio for each sample. PCR dup MarkDuplicates tool and from merged reads using script from PaleoMix (Schubert et al. 2014). Merged and unmerged reads were combined into one BAM file for each sample using Samtools (Li et al. 2009). The program mapDamage (v.2.0) (Jónsson et al. 2013) was then implemented to assess damage patterns and lower characteristic aDNA damage patterns were produced for each sample. INDEL realignment was conducted using the GATK Best Practices (McKenna et al. 2010). A variant call file was generated and a consensus sequence produced, that was SN Viewer (IGV) (Robinson et al. 2011; Thorvaldsdóttir et al. 2013). Sequencing coverage was evaluated in Geneious (v.8.1.8) (http://www.geneious.com, Kearse et al. 2012) and only sequences exhibiting more than 80% coverage were used in subsequent phylogenetic analyses. 2.4 Phylogenetic Analysis Prior to the phylogenetic analyses of the complete mitochondrial genomes, a control region analysis was conducted to categorise each sample into the three major control region haplogroups I, II, III (A & B) as described in Matisoo-Smith 42 et al. (2004). Complete modern and ancient mitochondrial sequences were aligned to four 203bp control region fragments (from position 15,392 15,595) using MUSCLE (Edgar 2004) in Geneious (v.8.1.8) (Kearse et al. 2012). These four control region fragments were obtained from GenBank (AY604204, AY604209, AY604231, AY604228; Matisoo-Smith et al. 2004) and represent haplogroups I (haplotype R1), II (R2), IIIA (R15) and IIIB (R9). Three publicly available complete mitochondrial Pacific rat sequences sampled from Thailand, Papua New Guinea and New Zealand (Genbank EU273709.1, EU273710.1, EU273711.1; Robins et al. 2008) were also added to the alignment, as they are analysed in conjunction with the Polynesian assemblage in downstream complete mitochondrial phylogenetic analyses. After the full mitochondrial sequence alignment, base positions outside of 15,392 15,595 were removed to export a complete 203bp sequence alignment. A median-joining network was generated in PopArt (Leigh & Bryant 2015; Bandelt et al. 1999). However as PopArt removes variable sites where one or more sequences provide alignment gaps or missing data, incomplete sequences were removed from the median-joining network analysis. Instead, these samples were individually assessed for variable positions in Geneious (v.8.1.8) and if possible assigned to their respective control region haplogroups. Estimates of nucleotide diversity, ( ), the average number of nucleotide variations per site between two samples, were calculated across complete modern and ancient mitochondrial sequences in DnaSP (v.5.10.1) (Librado & Rozas 2009). This was conducted to evaluate the level of variation outside of the control region and therefore assess the benefit of sequencing full mitochondrial genomes for phylogenetic analyses. Separate modern and ancient alignments were generated using MUSCLE in Geneious (v.8.1.8). Only sequences exhibiting more than 90% coverage were used for this analysis, as the program removes sites where one or more sequences provide alignment gaps or missing data. For the complete mitochondrial phylogenetic analyses, three alignments (modern, ancient and modern + ancient) were generated using MUSCLE in Geneious (v.8.1.8). The best-fit model for nucleotide substitution in the modern and ancient datasets was determined 43 using jModeltest2 (https://github.com/ddarriba/jmodeltest2, Guindon & Gascuel 2003; Darriba et al. 2012) with 11 substitution schemes. Model selection was computed using the Bayesian and Akaike information criteria (BIC and AIC). A Bayesian inference analysis of phylogeny was conducted using the MrBayes plugin (v.2.2.2: Huelsenbeck et al. 2001; Ronquist & Huelsenbeck 2003) implemented in Geneious (v.8.1.8). This consisted of running four chains with the heated chain temperature set at 0.2 to allow sufficient chain-swapping over 20 million generations, with a burn-in length of 100,000 and sampling every 2000 generations. The F81 substitution model (Nst=1) (Felsenstein 1981) was applied for the modern sequence analysis and the HKY85 substitution model (Nst=2) (Hasegawa et al. 1985) with invariable rate variation (rate variation command BIC and AIC calculations conducted in jModeltest2. Gaps or missing/ambiguous characters are treated as missing data in MrBayes. Unlike PopArt and DnaSP, sites with valid characters are retained for analysis, regardless of whether another sequence produces missing data at that site. Complete mitochondrial sequences from Thailand, Papua New Guinea and New Zealand (Genbank EU273710.1, EU273709.1, EU273711.1; Robins et al. 2008) were added for the purposes of outgrouping (Papua New Guinea) and also added to assess where these sequences group in relation to the Polynesian assemblage. The resulting Bayesian consensus trees were visualised in Geneious (v.8.1.8). Convergence was assessed using the Trace tool in Geneious; there were no obvious trend lines to suggest that the MCMC was still converging and there were no large-scale fluctuations that would suggest poor mixing. Effective sample sizes (ESS) for all traces exceeded 100. Maximum likelihood (ML) analyses (Felsenstein 1981) were conducted using the PhyML plugin (v.3.0: Guindon et al. 2010) implemented in Geneious (v.8.1.8). Substitution rate categories were set to 4 and bootstrap support was estimated using 100 replicates. The F81 substitution model (Felsenstein 1981), with the proportion of invariable sites fixed to zero and software-optimised estimates of the gamma shape parameter and the transition/transversion ratio was applied to the modern dataset. The HKY85 substitution model (Hasegawa et al. 1985), with 44 software-optimised estimates of the proportion of invariable sites, the gamma shape parameter and the transition/transversion ratio was applied to the ancient dataset. Like MrBayes, PhyML does not exclude sites where one or more sequences provides missing data, but will ignore missing data for a particular sequence. The complete mitochondrial sequences from Thailand, Papua New Guinea and New Zealand (Genbank EU273710.1, EU273709.1, EU273711.1; Robins et al. 2008) were also added. The Papua New Guinea sample was used as an outgroup. Median-joining networks were generated using the aligned modern, ancient and modern + ancient datasets in PopArt. The combined modern + ancient haplotype analysis was conducted to visually assess how all the sequences associate in a spatial and temporal sense. The sequences from Thailand, Papua New Guinea and New Zealand (Genbank EU273710.1, EU273709.1, EU273711.1; Robins et al. 2008) were added as part of the modern + ancient haplotype analysis. 45 3 Results 3.1 DNA preservation and sequence recovery A total of 47/97 modern samples produced adequately preserved DNA for fragment amplification and subsequent sequencing. The overall PCR success rate across modern samples was 48.5%. This low PCR rate could be attributed to the degradation of DNA in some of the tissue samples, given all tissue samples were retrieved from various states of preservation implemented over the last few decades. Some samples were preserved in ethanol at room temperature, whilst others had been placed in an -80°C freezer for long-term storage. Fifteen modern samples were removed post-sequencing as a result of low coverage (<90%) and read depth (<2) when mapped to a Rattus exulans mitochondrial reference genome. Thirty-two samples remained, providing an overall success rate of 33% from modern DNA extraction to downstream data analyses. These high coverage modern samples all originate from the Cook Islands and Society Islands. A total of 40/77 ancient libraries produced adequately preserved DNA that successfully amplified (success rate: 52%; see Table 3.1). These amplified libraries were then subjected to hybridisation capture and sequencing. Twenty-seven samples were removed post-sequencing as a result of low coverage (<80%) and read depth (<2) when mapped to a Rattus exulans mitochondrial reference genome. Thirteen ancient samples remained (see Table 3.1 and 3.2), providing an overall success rate of 17% from ancient DNA extraction to downstream data analyses. 3.2 Sequence authenticity To estimate the degree of reagent contamination, reads from each sample were mapped to mitochondrial reference genomes from a pig, human, cow, chicken and dog. This was compared to reads mapped to a rat mitochondrial genome. In the ancient sequences, only one sample (MS10605) exhibited a small amount of human contamination equating to approximately 0.03%. A lack of exogenous DNA, particularly from cow, pig and chicken, indicates that reagents used in this study were free from contamination. It has been previously asserted that laboratory 46 Table 3.1 Summary statistics for successful library amplification and post-sequencing processing of ancient R. exulans samples. Successful DNA amplification of (n) libraries, prepared from ancient bone samples, is given by the amplification success rate. Amplified libraries were then subjected to hybridisation capture and Illumina sequencing. The overall success rate is of (n) remaining samples, after low coverage removal during bioinformatics processing. Age of material or associated dates: 1 = Present 500 B.P., 2 = 500 1000 B.P., 3 = 1000 2000 B.P., 4 = 2000 3000 B.P. * refers to unknown. Amplification success rate (n) after low coverage removal Overall success rate Site Type Age (n) libraries Pukotala Open 4 2 50% 0 0% Faleloa, Foa Open 4 2 100% 1 50% Mele Havea, Open 4 1 100% 0 0% Tokelau Fakaofo Open 2 6 66.7% 2 33.3% American Samoa Ofu Village, Ofu Open (coastal) 4 6 33.3% 0 0% Open 4 6 0% 0 0% Open (coastal) 2 11 90.9% 6 54.5% Location Site Tonga Olosega Village, Olosega Easter Island Anakena 47 Site Type Age (n) libraries Amplification success rate (n) after low coverage removal Overall success rate Open (coastal) 2 5 60% 2 40% Open 3 6 0% 0 0% Mangaia Open (coastal) * 4 0% 0 0% Terre Teahutapu, , Maupiti Open (coastal) 1, 2 14 64.3% 0 0% Terre Namotu, Maupiti Open (coastal) 1, 2 4 50% 1 25% Haranai Valley, Maupiti Open 1, 2 4 75% 0 0% * 1, 2 6 50% 1 16.7% 77 52% 13 17% Location Site Austral Islands Atiahara, Tubuai Cook Islands Ureia, Aitutaki Society Islands Society Islands Haumi archaeological site, Moorea TOTAL 48 Table 3.2 High-coverage ancient samples used in the phylogenetic analyses. Associated layers dates are as reported in archaeological excavation publications (see Table 2.3), except for the Society Islands (Jennifer Kahn, personal communication). Site Tonga Faleloa, Foa MS10596 2600 ± 50 B.P. Tokelau Fakaofo MS10592 1090 ± 60 B.P. MS10591 1090 ± 60 B.P. MS10606 610 560 B.P. MS10610 610 560 B.P. MS10609 610 560 B.P. MS10602 610 560 B.P. MS10605 610 560 B.P. MS10604 610 560 B.P. MS10601 769 675 B.P. MS10600 769 675 B.P. Terre Namotu, Maupiti MS10502 495 320 B.P. Haumi archaeological site, Moorea MS10523 750 550 B.P. Easter Island Austral Islands Society Islands Sample code Associated Location Anakena Atiahara, Tubuai 49 layer date reagents, such as dNTPs, can contain DNA from domestic and laboratory animals that may obscure endogenous DNA amplification (Leonard et al. 2007; Thomson et al. 2014). No blank processed with any of the modern and ancient samples exhibited contamination, providing no indication of cross-contamination during the processing of samples. The deamination patterns exhibited in the reads from each ancient sample were consistent with expected damage in ancient DNA sequences. For example, sample MS10592 exhibited an elevated frequency of T within the first ten and of A nucleotide misi the maximum rate of 0.05 for samples less than 117 years old, and is consistent within samples older than 500 years (Sawyer et al. 2012). Conclusively, this indicates that the sequences are endogenous mtDNA of rats obtained from early archaeological deposits across Polynesia. 3.3 Phylogenetic analysis An analysis of the control region was undertaken to categorise samples into the three major haplogroups I, II, III as described in Matisoo-Smith et al. (2004). The majority of Polynesian samples (both modern and ancient) clustered within Haplogroup IIIB, in particular adhering to the R9 haplotype (see Figure 3.2). The samples that were missing sites and subsequently removed from the medianjoining network analysis, were analysed individually for variable positions that would indicate haplogroup membership. For each sample, variable positions across the 203bp fragment of the control region are shown in the Appendix Table 6.1. The samples MS10602, 606, 609 (Easter Island); MS10591, 592 (Fakaofo, Tokelau); MS10596 (Foa, Tonga); MS10600, 601 (Tubuai, Austral Islands) and MS10523 (Moorea, Society Islands) contained missing sites in the control region and were therefore assessed individually. MS10606, 609 and 592 were assigned to haplogroup IIIB. MS10600, 602 and 523 were missing some key variable sites but could be assigned to haplogroup III. MS10596, 591 and 601 were missing a 50 a) b) Figure 3.1 of reads pertaining to the ancient sample MS10592. a) The grey brackets, indicative of strand breaks, incorporate the start and end of the mitochondrial sequence. The frequencies of bases A, G, C and T are shown for the first and last ten nucleotide sites. b) Nucleotide misincorporations across the first and last 25 nucleotide sites. The coloured lines are indicative of the following nucleotides - red T; grey C; blue A; purple G. 51 substantial amount of variable sites and could not be assigned haplogroup membership. As depicted in Figure 3.2, the samples Raia28, Raia45 and Raia16 (Raiatea, Society Islands) radiated by one mutation off the haplotype R9. It was unclear whether Raia28 and Raia45 were part of haplogroup IIIA or IIIB, however after individual assessment all three samples were assigned to haplogroup IIIB. Therefore all samples, except for those with substantial damage and the Puk001 (Pukapuka, Cook Islands) sample, were assigned to haplogroup IIIB. The Puk001 sample was clustered within haplogroup II, alongside the Papua New Guinean Figure 3.2 Median-joining network of control region haplotypes. Colours represent one of four central haplotypes chosen to represent each haplogroup (I, II, IIIA and IIIB). The diameter of each circle is proportional to the frequency of a specific haplotype. 52 sequence (GenBank EU273709.1). Interestingly, the Thailand sequence (GenBank EU273710.1), clustered outside all three major haplogroups, indicating the existence of potentially another major R. exulans haplogroup. Nucleotide variation across both complete modern and ancient mitochondrial genome datasets (>90% coverage) was evaluated using the program DnaSP (Librado & Rozas 2009). In the modern dataset, a total of 113 variable sites were identified. The average number of nucleotide differences (k; Tajima 1983) between two modern sequences was 9.2 ± 18.92. Figure 3.3a illustrates the variation in nucleotide diversity ( ; Nei 1987) across the modern mitochondrial dataset, using overlapping windows of 100 bp (step size: 25) that are centred at the midpoint. In the modern dataset, the highest nucleotide diversity was found at the end of the tRNA-pro coding region and the beginning of the control region from the nucleotide positions 15339-15690, peaking at 0.00402. Conversely, the hypervariable I region (HVR I) of the control region (positions 16024-16383) produced a lower level of nucleotide variation, peaking at 0.00156. Across the coding region in modern samples, the highest nucleotide diversity was found in the protein-coding genes COX1 (peak of 0.00286) and ND4L-4 (peak of 0.00261). In the ancient dataset, a total of 20 variable sites were identified. The average number of nucleotide differences (k; Tajima 1983) between two ancient sequences was 6.5 ± 11.5. Figure 3.3b illustrates the variation in nucleotide diversity ( ; Nei 1987) across the ancient mitochondrial dataset. In ancient sequences, the highest level of diversity was found at the beginning of the protein-coding region ND5 (from nucleotide position 11713-12130), with a peak of 0.01. The control region in comparison shows a lower level of nucleotide variation, with a peak of 0.002 in the HVR I region. The consensus maximum likelihood tree for modern samples, supported by Bayesian posterior probabilities, indicated two major groupings the first consisting of sequences from Papua New Guinea and Pukapuka (indicative of control region haplogroup II) and Thailand, and the second consisting of rats from 53 a)! b)! ! ! Figure' 3.3! Distribution' of' nucleotide' diversity' (π)' across' a)' modern' and' b)' ancient' complete' mitochondrial' sequences.' Nucleotide! diversity,! (π),! represents! the! average! number! of! nucleotide! variations! per! site! between! two! samples.! This! is! distributed!in!windows!of!100bp!(step!size!of!25!and!centered!at!the!midpoint)!across! the!entire!mitochondrial!genome,!represented!in!a!linearised!genetic!map.! ! ! ! 54! across Remote Oceania (haplogroup III) (see Figure 3.4). Using the substitution rate on the maximum likelihood tree, approximately 100 mutational steps separate the modern Near Oceanic and Remote Oceanic populations. Across the Remote Oceanic populations, there was high Bayesian posterior probability support for a group containing New Zealand and Huahine (Society Islands) sequences, distinct from the rest of the Polynesian samples. Other Huahine lineages however, were present in the wider Central Polynesian assemblage. Overall there was very little bootstrapping and posterior probability support for phylogeographic structure. Whilst many islands have produced multiple lineages, there are no supported island groupings whereby all lineages are grouped together and are distinct from the rest of the Polynesian assemblage. The median-joining network of modern sequences revealed a major central East Polynesian haplogroup, consisting of many Cook Island and Society Island linages radiating from an ancestral haplotype comprising of five samples from Tahiti, Raiatea (Society Islands) and Aitutaki (Cook Islands) (see Figure 3.5). The radiating lineages are distinct from the ancestral haplotype by one to nine mutational steps. The Pukapuka sequence is widely distinct from the central East Polynesian haplogroup, which is to be expected given that it has been confirmed as a haplogroup II rat. The consensus maximum likelihood (ML) tree for the ancient samples, supported by Bayesian posterior probabilities, also indicated the same two major groupings, indicative of Papua New Guinea (haplogroup II) and Thailand, and all other Remote Oceanic (haplogroup III) rats (see Figure 3.6). Using the substitution rate on the maximum likelihood tree, the Near Oceanic populations are separated by 73 mutational steps from the deepest branch in the ancient Remote Oceanic group. Tonga and Tokelau, which have previously been identified as part of an overlapping region between the control region haplogroups IIIA and IIIB (MatisooSmith et al. 2004), produced ancient samples (MS10596 & MS10592) that clustered on the deeper branches within the Polynesian assemblage. The Tongan sample was unfortunately too damaged in the control region to evaluate whether it belonged to either IIIA or IIIB. The Tokelau sample (MS10592) however was assigned to haplogroup IIIB. The maximum-likelihood analysis indicated that 9 mutational steps separate the Tonga and Tokelau samples. Conversely, the median 55 ! ! ! ! ! ! Figure!3.4!Rooted!maximum!likelihood!(ML)!tree!inferred!from!the!modern! (complete! mitochondrial)! dataset.!Values!above!each!node!represent!Bayesian! posterior! probabilities! (obtained! using! MrBayes)! and! ML! bootstrap! values! (obtained!from!PhyML).!Only!posterior!probabilities/bootstrap!values,!where!one! or!both!are!above!0.7!and!70!respectively,!are!included.!Individual!sample!names! are! included! and! geographic! populations! defined! by! colour.! Scale! bar! represents! amino!acid!substitutions!per!site.!Additional!R.#exulans!sequences!(Thailand,!Papua! New! Guinea! (outgroup)! and! New! Zealand;! Genbank! EU273709.1,! EU273710.1,! EU273711.1)!(Robins!et!al.!2008)!were!included.!! ! 56! ! ! ! Figure!3.5!Median?joining!network!of!the!modern!(complete!mitochondrial)! dataset.! Colours! represent! geographic! sampling! locations! across! central! east! Polynesia.!The!diameter!of!each!circle!is!proportional!to!the!frequency!of!a!specific! haplotype.! ! ! 57! -joining (MJ) network indicates they are only separated by one mutational step (see Figure 3.7). However PopArt does remove varying sites from the MJ analysis if one or more sequences provide missing data and therefore cannot be considered a reliable source for mutational steps between sequences. Median-joining networks do not identify ancestor-descendent relationships but instead are based on overall similarity. However the ancient median-joining analysis has produced a network that shows a direct link between Western (Tonga) and far Eastern Polynesia (Easter Island), where the Tokelau sample (MS10592) could be considered a transitional haplotype. It can therefore be argued that that the central haplotype corresponding to MS10592 (and thereafter all Polynesian haplotypes) are directly descended from this ancestral haplotype uncovered in Tonga. This coincides with what we know of Pacific rat and associated human dispersal across the Pacific, which has overwhelming headed in an easterly direction. A shallower branch on the ancient maximum likelihood tree revealed a Society Island grouping (Moorea & Maupiti islands), with high bootstrap and Bayesian support. This is distinct from the rest of the Polynesian assemblage that included samples from New Zealand, Tokelau (MS10591), Easter Island and the Austral Islands. The median-joining network indicates the existence of two haplotypes corresponding to the samples from Maupiti and Moorea. These haplotypes radiate off the central haplotype (MS10592) by several mutations, according to the ML tree. Given their MJ network position, it would seem that these two Society Island haplotypes are not directly ancestral to the rest of the Polynesian assemblage. Whilst we cannot exclude the possibility that the central haplotype corresponding to MS10592 could also exist in the Society Islands, there is no conclusive evidence from this study to indicate an Eastern Polynesian expansion from the Society Islands. The other ancient sample from Fakaofo, Tokelau (MS10591) clustered with the rest of the Polynesian assemblage (New Zealand, Easter Island and Austral Islands) with high Bayesian support on the maximum-likelihood tree. According to the ML tree, MS10591 is separated by 14 mutational steps from the central haplotype. 58 ! ! Figure' 3.6!Rooted' maximum' likelihood' (ML)' tree' inferred' from' the' ancient' (complete' mitochondrial)' dataset.!Values!above!each!node!represent!Bayesian! posterior! probabilities! (obtained! using! MrBayes)! and! ML! bootstrap! values! (obtained!from!PhyML).!Only!posterior!probabilities/bootstrap!values,!where!one! or!both!are!above!0.7!and!70!respectively,!are!included.!Individual!sample!names! are! included! and! geographic! populations! defined! by! colour.! Scale! bar! represents! amino!acid!substitutions!per!site.!Additional!R.#exulans!sequences!(Thailand,!Papua! New! Guinea! (outgroup)! and! New! Zealand;! Genbank! EU273709.1,! EU273710.1,! EU273711.1)!(Robins!et!al.!2008)!were!included.! ! 59! Figure'3.7!Median?joining' network' of'the' ancient'(complete'mitochondrial)' dataset.' Colours! represent! geographic! sampling! locations! across! West! and! East! Polynesia.!The!diameter!of!each!circle!is!proportional!to!the!frequency!of!a!specific! haplotype.' ! ! ! ! ! ! 60! On the shallowest branches of the ancient maximum-likelihood tree, there is high Bayesian support for a distinct Easter Island/Tubuai (Austral Islands) grouping. The corresponding median-joining network revealed a large far Eastern Polynesian haplotype (consisting of 5 Easter Island samples) that radiates off the central haplotype (MS10592). According to the ML tree, they are separated by approximately 8 mutations from the central haplotype. Another Easter Island (MS10606) and two Tubuai haplotypes radiate from this large East Polynesian haplotype. The combined ancient and modern complete mitochondrial median-joining network is illustrated in Figure 3.8. The GenBank sequences corresponding to Papua New Guinea, Thailand and New Zealand were also included in this network analysis. Coinciding with the modern median-joining network, many of the modern Society Island and Cook Island (Aitutaki) samples form a large grouping of haplotypes, consisting of a central haplotype (of eight samples) and 15 haplotypes that directly radiate off this node. The ancient Society Island samples form part of this group, with the Maupiti sample adhering to the central haplotype and the Moorea sample radiating off the central node. The ancient Society Island populations could therefore be considered directly ancestral to the modern Society Island populations. The modern and ancient Society Islands/Cook Islands haplotypes are directly connected to the ancient sample from Fakaofo, Tokelau (MS10592), which is the central haplotype from which all East Polynesian samples are derived. The central haplotype is derived from the ancient Tongan haplotype, which then links back by 73 mutations to the haplogroup II rats (Papua New Guinea and Pukapuka) and the Thailand sample. The modern New Zealand haplotype directly radiates off the central haplotype (MS10592), inferring direct ancestry from this haplotype. The other sequence obtained from Fakaofo, Tokelau (MS10591) also radiates off the central haplotype as described previously. The large far Eastern Polynesian group consisting of samples from Easter Island and Tubuai, Austral Islands directly radiates off the central haplotype. Unlike in the ancient only MJ tree, an Austral Island sample has been incorporated into the larger haplotype. This can be 61 attributed to the use of a larger sample assemblage in PopArt. As PopArt removes variable sites from consideration where one or more samples may be missing a site or produce a gap, it is likely that the mutation that previously distinguished the Austral Island sample from the larger Easter Island-only haplotype has been removed from the analysis. Interestingly, two modern Huahine (Society Island) samples directly radiate off the shared ancient Easter Island/Austral Island haplotype. These two samples (Hua27 and Hua24) grouped with the New Zealand sequence in the modern-maximum likelihood analysis. 62 ! Figure' 3.8' Median/joining' network' of' the' modern' and' ancient' (complete' mitochondrial)' datasets,' including' acquired' GenBank' sequences.' Colours! represent! geographic! sampling! locations! across! the! Pacific.! The! diameter! of! each! circle!is!proportional!to!the!frequency!of!a!specific!haplotype.! ! 63! 4 Discussion 4.1 DNA preservation in Pacific archaeological contexts DNA preservation and recovery from post-mortem samples is largely dependent on environmental factors (i.e. temperature, moisture and acidity of the surrounding context) and age of the sample. As discussed in Chapter 1, Pacific preservation varies across tropical to temperate climates, and whether the sample was recovered from a protected, open and/or coastal context (Robins et al. 2001). Bones recovered from open sites and that fall within the age range 1000-3000 B.P., have generally produced low quality DNA. This includes early samples from the previously produced 0% PCR success rates (Robins et al. 2001). In this study, assemblages obtained from open sites dated to between 1000-3000 B.P. produced a range of library amplification success rates (prior to hybridisation capture) from 0 to 100%. Interestingly, assemblages from Foa, amplification success rates of 100, 100 and 50% respectively. However after the removal of low coverage genomes during bioinformatics processing, this success rate abruptly declined. Only one genome (obtained from Foa, Tonga) from the age range 1000-3000 B.P. was retained for phylogenetic analyses. Robins et al. (2001) observed that open site samples, dated up to 1000 B.P., produced higher PCR success rates between 60-100%. This included samples from the Cook Islands, New Zealand, Chatham Islands, Norfolk Island and Hawaii. PCR success in these open sites however, was generally higher in islands under a temperate climate than a tropical one. In this study, assemblages from open sites dated up to 1000 B.P. produced a higher library amplification success rate ranging from 50 to 90.9%. Easter Island produced both the highest amplification success rate (90.9%) and overall success rate (54.5%), which can likely be attributed to its younger contextual age and temperate climate. The other assemblages, all obtained across tropical Polynesia, produced overall success rates between 0 to 40%. 64 Previous commensal studies using archaeological specimens across the Pacific have produced varying PCR success rates. Larson et al. (2007) targeted three 120bp fragments of the mitochondrial control region in ancient pig samples collected across Island Southeast Asia and Oceania. Amplifiable DNA was present in 23/57 ancient bones and teeth. Of these, only 5 archaeological specimens were obtained from Pacific prehistoric sites, which dated from 925 to 300 B.P. These five samples originated from Tubuai, Hanamiai (Marquesas), the Reef Islands (Solomon Islands), Tanagatatau Rock Shelter on Mangaia (Cook Islands) and Mussau Island (Papua New Guinea). Many of the samples that failed to produce amplifiable DNA were from prehistoric contexts ranging from ~9000-600 B.P. All of these sites were situated in tropical/humid climates i.e. Micronesia, Taiwan, China, Indonesia, Philippines and Papua New Guinea. Conversely, prehistoric dog remains retrieved from across Polynesia have consistently produced amplifiable DNA (Savolainen et al. 2004, Greig et al. 2015). Savolainen et al. (2004) successfully sequenced 263bp of the mitochondrial control region from all 19 collected prehistoric archaeological samples. These were obtained from the Cook Islands (sites dated from 1000 B.P.), New Zealand (sites dated from 700-400 B.P.) and Hawaii (undated prehistoric site). A high level of DNA preservation has earliest archaeological site (Greig et al. 2015). All 16 dog samples produced enough endogenous DNA for hybridisation capture and complete mitochondrial sequencing. Fourteen samples produced high coverage (over 95%) and were retained for phylogenetic analyses. DNA preservation in archaeological deposits across the Pacific seem to be highly influenced by the surrounding environment, attributed to both climate and site context. In turn, the susceptibility to degradation, particularly in humid open-site contexts, is compounded by age. 4.2 The application of complete mitochondrial genome sequencing The majority of commensal studies in the Pacific, particularly those using archaeological specimens, have targeted fragments of 100-300bp in the mitochondrial control region. Mitochondrial hypervariable regions accumulate mutations at a faster rate than other mitochondrial regions (Stoneking 2000), producing a greater amount of population variation that can be utilised for 65 phylogenetic research. Pacific commensal studies in the last decade have generally followed aDNA protocols such as Matisoo-Smith et al. (1997) and Shapiro et al. (2004), where short overlapping fragments are amplified and pieced together after Sanger sequencing by capillary electrophoresis. The cloning of PCR products was also used to assess template damage, detect numts (nuclear copies of mtDNA) and evaluate accuracy and contamination. As ancient DNA is highly fragmented, it was practical to target fragments of the mitochondrial control region that produced the most usable variation. The emergence of second-generation sequencing however has made it viable to generate longer DNA sequences, e.g. complete mitochondrial genomes, from fragmented aDNA (Knapp & Hofreiter 2010). The use of hybridisation capture for target DNA enrichment prior to second-generation sequencing (Ng et al. 2009), is also beginning to supersede the traditional primerbased PCR amplification that can often amplify large amounts of exogenous contaminant DNA (Greig et al. 2015; Li et al. 2013; Knapp et al. 2012; Burbano et al. 2010; Hodges et al. 2007). Recent Pacific commensal studies using complete mitochondrial genomes have reported a higher level of molecular resolution that can distinguish phylogenetic groups and individuals with greater accuracy (Greig et al. 2015; Miao et al. 2013; Duggan et al. 2014). Matisoo-Smith et al. (2004) study of modern and ancient Pacific rat phylogenies across ISEA and the Pacific targeted ~240bp of the mitochondrial control region. This produced 32 haplotypes from 27 variable sites. Notably, the majority of Remote Oceanic samples (Haplogroup IIIB) either belonged to or radiated off the R9 haplotype, making it very difficult to distinguish dispersal routes. A follow up study by Barnes et al. (2006) of Easter Island rats, found that all ancient Pacific rat samples belonged to the R9 haplotype. It was therefore impossible to establish the specific source population for Easter Island rats. The control region analysis conducted in this study found that the majority of the Polynesian samples (both modern and ancient) clustered within Haplogroup IIIB, in particular adhering to the R9 haplotype. The use of this control region fragment alone did not produce enough variation to distinguish island groups or specific lineages. 66 Nucleotide diversity across the complete mitochondrial genomes varied between modern and ancient samples. In the ancient dataset (comprising of 1 Western and 12 Eastern Polynesian samples), 20 variable sites were identified from which 9 haplotypes were determined. Ancient nucleotide diversity was highest in protein coding regions, outside of the control region. In the modern dataset (comprising 31 Eastern Polynesian samples and a haplogroup II sample), 113 variable sites were identified from which 24 haplotypes were present in the median-joining analysis. In the modern dataset, the highest nucleotide diversity was found at the beginning of the control region, but notably outside of the hypervariable I region. The low variation in the hypervariable regions of both modern and ancient Pacific rat genomes may reflect a number of bottlenecks in the establishment of populations. Pacific rats, carried by migrating humans, would have been established on Pacific islands in a stepping-stone manner. A subset of the population (and therefore haplotypes) would have been carried on each migratory journey, as humans moved eastwards into the far reaches of Polynesia. Furthermore, as populations were established on islands, gene flow would have been heavily restricted and subsequently prolonged low variation. As nucleotide diversity in the HVR I region of both ancient and modern samples is similarly low, it would seem that hypervariable variation has not yet recovered. However, overall there was more hypervariable variation in the modern sequences than in the ancient, signifying that mutations have accumulated since the initial establishment of island populations. The varying distribution of nucleotide diversity across complete ancient and modern mitochondrial genomes indicates that targeting one region (e.g. HVR I) may not provide enough variation across all samples to sufficiently distinguish populations. Furthermore, the effect of bottlenecks would have significantly decreased genetic variation. In conjunction with previous Pacific commensal research using complete mitochondrial genomes (Duggan et al. 2014; Greig et al. 2015; Miao et al. 2013), my results suggest that complete mitochondrial sequencing is necessary to assess all mitochondrial variation and gain the highest molecular resolution for subsequent phylogenetic analyses. 67 4.3 Implications for Pacific rat dispersal The three major control region haplogroups, first identified by Matisoo-Smith et al. (2004) show clear geographic patterning associated with various Pacific rat dispersal and interaction spheres across Southeast Asia and the Pacific. As discussed previously, Haplogroup I rats are restricted to the Southeast Asian region, Haplogroup II is distributed across both Southeast Asia and Near Oceania, whilst Haplogroup III has been largely confined to Remote Oceania and is associated with the Lapita/Polynesian movement through this region. An analysis of a 203bp fragment of the control region was conducted to categorise all samples to their respective haplogroups prior to further phylogenetic analyses. A complete mitochondrial sequence from Thailand clustered outside of the three haplogroups, revealing the existence of a potentially new major Pacific rat haplogroup in mainland Southeast Asia. The identification of a modern Pukapukan sample as part of Haplogroup II was surprising, given no sample further east than the Santa Cruz Islands has been associated with this haplogroup (Matisoo-Smith et al. 2004). As the Pukapuka sample is part of the modern assemblage, it is possible that its represents a recent translocation of rats from ISEA/Near Oceania. This could have occurred as a result of modern shipping/travel between islands in Near and Remote Oceania, however it has been previously asserted that Pacific rats do not travel on modern ships, nor have been found in ports (Matisoo-Smith 1996). It is also uncertain whether modern translocations of rats can gain a foothold in previously colonised islands unless original populations have undergone extinction. Under ideal conditions, rats are capable of doubling their population every 47 days (Hunt & Lipo 2012). Within several months, populations can reach over one million, which would make it very difficult for a newly translocated haplotype to gain in frequency. Furthermore, wild rats are highly territorial (Barnett 1955) using olfactory cues to distinguish familiar from foreign conspecific rats (Alberts & Galef 1973). This would assert a large influence on the reproductive success of newly introduced rats. Alternatively, the Pukapuka sample could be representative of a prehistoric Haplogroup II population established as a result of early interaction and trade with 68 Western Polynesia that has links into Near Oceania. Whilst Pukapuka, as part of the Cook Island group, technically falls within the Eastern Polynesian region, the island seems to be linked to Western Polynesia. For example, the language spoken on Pukapuka does not form part of the Central Eastern Polynesian linguistic group from which the Rarotongan and Penrhyn (Cook Island) languages are derived (Trudgill 2004). Instead, Pukapuka is one of the main Samoic languages related to Tokelauan, Tuvaluan, Uvean, Futunan and Samoan, found across Western Polynesia. The Samoan and Eastern Polynesian group (which further subdivided into Central Eastern and Rapanui, Easter Island) are said to have divided from Nuclear Polynesian, an ancestral language that originated after the human settlement of Samoa (Trudgill 2004). There has been limited archaeological research on the colonisation and post-settlement interactions regarding Pukapuka. However, an analysis on the composition of basalt adzes retrieved from across Polynesia, found that a Pukapuka sample clustered with basalt samples from Western Samoa, Fiji, Tonga, Tokelau, Solomon Islands and Tuvalu (Best et al 1992). This may be representative of basalt trade between Pukapuka and Western Western Polynesia, it is possible that a Haplogroup II rat population originating from Near Oceania and transferred through post-settlement interactions in Western Polynesia was established on Pukapuka. It was not possible to assign the ancient Tongan haplotype (sample MS10596) to a control region haplogroup, as a result of DNA deterioration/damage. As the sample MS10596 is only separated by ten mutational steps from MS10592 (Fakaofo, Tokelau) however, it was safely assumed that MS10596 forms part of haplogroup III that is found across Remote Oceania. MS10596 was uncovered from an early archaeological context on Foa, dated to 2600 ± 50 B.P. (Burley et al. 1999). Presently, the colonisation of Tonga and Samoa is conceded to have occurred by 2800 and 2750 B.P. respectively (Burley et al. 2015; Petchey 2001; Clark et al. 2016). Given only a 50 year time span between the colonisation of these two major Western Polynesian archipelagos, it would be expected that the original haplotype/s carried by the rats into both Tonga and Samoa would be the similar and likely contain the haplotype corresponding to MS10596. 69 Archaeological and linguistic evidence suggests that Tonga and Samoa form the Western Polynesian homeland from which Eastern Polynesian populations are derived (Green 1966, 1981; Pawley 1966; Groube 1971; Emory 1946). MatisooSmith et al. (1998) study of modern Pacific rat variation across Near and Remote Oceania, found that a Western Polynesian (Samoan) sample clustered with Society Island and Cook Island samples on the deeper branches of a neighbor-joining tree. analysis, the Foa (Tonga) sequence (MS10596) occupied the deepest branch, outside of all other Remote Oceanic samples. The clustering of the Samoan and Tongan sequences on the deeper branches of their respective analyses signifies that they are sister lineages to those in Eastern Polynesia. In conjunction with the ancient median joining network, illustrating a direct Eastern Polynesian radiation off MS10596, it is possible that the Tongan haplotype uncovered in this study is ancestral to all those found in Eastern Polynesia. All East Polynesian sample/groups identified as part of haplogroup IIIB, radiate off the central haplotype (relating to the sample MS10592) that was uncovered in Fakaofo, Tokelau. Given that this haplotype directly radiates off the Tongan haplotype in the median joining network, it is likely a descendent haplotype carried by Tongan and/or Samoan populations into central East Polynesia. The identification of this haplotype in Tokelau alone however, does not suggest that all East Polynesian populations are descended from an early Tokelauan population. Further ancient sampling across Central Polynesia would be required to assess the distribution of this central haplotype. Given that the earliest radiocarbon dates associated with human settlement in central East Polynesia are found in the Society Islands (~925-830 B.P.; Wilmshurst et al. 2011), it is highly probable that this central haplotype would be found in this region. Furthermore, Matisoo-Smith et al. 1998 identified that the southern Cooks/Society Islands formed a central-east interaction sphere central to a northern and a southern sphere. This central-east interaction sphere, linked to all major East Polynesian island groups, likely reflects a central homeland region from which eastern expansion occurred. 70 The central haplotype is separated from another Tokelauan haplotype (sample MS10591) by 14 mutational steps. The samples MS10591 and MS10592 were uncovered from the same early occupation layer, although from different trenches at the archaeological site (Best 1988). This indicates that the central and diverged haplotypes existed concurrently. The derived haplotype may have diverged off the central haplotype either prior or after human establishment on Fakaofo. If it diverged prior to migration and was part of a central-east population that contained both haplotypes, then it would be expected that both haplotypes could be found in the earliest cultural layers in Fakaofo. If the derived haplotype diverged after establishment, then the layer context in which both samples were found may not be representative of an early human cultural layer. The ancient Society Island sequences obtained from Maupiti and Moorea (MS10502 and MS10592) are derived from the central haplotype but differ by several mutational steps. These samples and their associated layers were dated to 495-320 B.P. and 750-550 B.P. respectively (Jennifer Kahn, personal communication). Therefore the samples from Maupiti and Moorea do not represent the oldest archaeological layers that date to initial colonisation in the Society Islands. However, widespread radiation from the Society Islands may not have occurred until 750 B.P., given that no Eastern Polynesian archipelago, except for the Society Islands, have been dated prior to this (Wilmshurst et al. 2011; Petchey 2010). As no other recovered ancient central East Polynesian haplotypes are derived from the haplotype found on Moorea, which would have existed at the time of widespread expansion, or Maupiti, it could be argued that the continued East Polynesian expansion did not originate from these islands. The haplotype of the ancient sample from Maupiti however, is the central node of which many modern Society Island and Cook Island samples share or are derived. Further sampling of ancient Pacific rats across the Society Islands and Cook Islands would be required to assess whether the central haplotype was established in these central island groups and whether it perhaps existed concurrently with derived haplotypes, such as those found in Moorea and Maupiti. If early archaeological contexts in the Society Islands are devoid of samples carrying the central 71 haplotype, then the Society Islands may not be the central hub from which all other Eastern Polynesians populations are derived. The ancient maximum-likelihood analysis provided high Bayesian support for a distinct Easter Island/Tubuai (Austral Islands) grouping, separated by 8 mutational steps from the central haplotype. Previous research that targeted 239bp of the mitochondrial hypervariable region in ancient Easter Island Pacific rats, were unsuccessful in identifying an immediate origin for this population (Barnes et al. 2006). This was because all ancient Easter Island samples belonged to the R9 haplotype, which is widely found across Remote Oceania. The absence of variation was attributed to a single or limited introduction of rats, followed by long-term isolation. Hagelberg et al. (1994) targeted the mitochondrial DNA from ancient human remains, confirming a Polynesian origin for the Rapanui people. This is consistent with linguistic evidence of an ancestral Eastern Polynesian language group that split into Central Eastern Polynesian languages (e.g. (Easter Island) (Trudgill 2004). This early divergence of the Rapanui language would be consistent with an early Easter Island colonisation followed by extreme isolation from all other Eastern Polynesian groups. The immediate origin of the prehistoric people of Easter Island is however still unknown. Wilmshurst et al. (2011) re-analysis of Eastern Polynesian radiocarbon dates, indicated that Easter Island was colonised by ~750-697 B.P. Following establishment in the Society Islands around 925-830 B.P, there was a fast and widespread radiation to all major Eastern Polynesian archipelagos between ~750650 B.P. The grouping of the Tubuai (Austral Is) and Easter Island haplotypes in this study suggests that both island groups were established by the same colonisation wave, which carried the haplotype from which Easter Island and Tubuai rats share or are diverged. Whilst haplotype networks, such as the medianjoining network do not indicate ancestor-descendent relationships and therefore the direction of migration, it is unlikely that the Tubuai haplotypes are derived from the Easter Island population. Geographically, Easter Island is situated in the eastern-most corner of Polynesia. Migratory voyages are likely to have passed 72 through other island archipelagos in a stepping-stone manner before reaching Easter Island. Therefore it is highly likely that the Austral Islands were colonised prior to Easter Island. Preliminary radiocarbon dating on Tubuai suggests that it had been occupied earlier (Worth & Bollt 2011), contemporaneous with sites in the southern Cooks (Walter 1998). This is the first commensal study to indicate a potential immediate origin for the Easter Island population, providing support for the colonisation of Easter Island through a south-eastern voyaging corridor, originating from the Austral Islands and potentially moving through the Gambier Islands and Pitcairn Islands. Further sampling would be required in the Gambier and Pitcairn Islands however to establish haplotype continuity in this southeastern corridor. As discussed above, Rapanui diverged from the ancestral Eastern Polynesian language group, distinct from all other Central Eastern Polynesian languages. This likely reflects the isolation of the Rapa Nui people post-colonisation. The adherence of all six ancient Easter Island rats sampled in this project to two haplotypes, representing the initial and a derived haplotype, suggests a limited introduction of rats onto the island. This coincides with Barnes et al. (2006) finding of limited variation, which was attributed to extreme isolation after the initial colonisation. However given that the data is based on rats and not humans, it is possible that there were human arrivals post-colonisation that did not carry Pacific rats with them. The progression of a shared Central East Polynesian language, exempting Rapanui, is suggestive of one or more interaction spheres that effectively maintained language continuity across Central East Polynesia. As discussed in Chapter 1, there is archaeological, linguistic and genetic evidence for a number of interaction spheres across Eastern Polynesia (Matisoo-Smith 1998; Sheppard et al. 1997; Conte & Kitch 2004; Rolett 2002; Kirch 2000). Rolett (2002) argued that a core interaction zone existed between the Southern Cooks, Society Island and Austral Islands, central to many other smaller interaction spheres. Kirch (2000) also suggested that based on linguistic evidence, one of these smaller interaction spheres may have existed between the Austral Islands, Mangareva (Gambier 73 Islands) and the Pitcairn islands to the eastern Tuamotus and the Marquesas. The inclusion of the Austral Islands into both the central and south-eastern interaction spheres indicates that it was potentially the link between these two regions. Interestingly, two modern Huahine (Society Island) sequences were found to radiate off the ancient Eastern (Austral Is/Easter Is) Polynesian haplotype. These two lineages are not derived from any of the haplotypes currently found in the Society Islands. The establishment of these unusual lineages on Huahine may reflect a post-colonisation translocation of rats from the Austral Islands into the Society Islands as part of the core interaction zone. The Society Island and Cook Island populations were the only groups that successfully produced both ancient and modern data. The ancient Maupiti sample represents a haplotype from which nearly all modern and ancient Society Island/Cook Island populations share or are derived. Modern Pacific rat populations in the Society Islands and Cook Islands are therefore largely representative of prehistoric populations that were established with initial human settlement. It has been previously reported that modern Pacific rat populations in the Chatham Islands are representative of prehistoric populations (Matisoo-Smith 2002). In New Zealand however, not all ancient lineages were found in the modern populations. Widespread extinction on mainland New Zealand likely lead to the loss of ancient lineages that are no longer represented in extant populations on the offshore islands (Matisoo-Smith 2002). Extinction of lineages may also have occurred in other island groups and subsequently are no longer represented in modern populations. Conversely, modern populations may have incorporated lineages that are not represented in prehistoric assemblages. This may have resulted from post-colonisation trade movements, however it has not yet been critically assessed whether these movements were associated with further rat dispersal. Given the alleged difficulty of foreign lineages to integrate and increase in frequency within established populations (Barnett 1955; Alberts & Galef 1973), introduced lineages after initial colonisation may have remained at a low frequency or went extinct. 74 Without the use of both ancient and modern specimens concurrently, it is difficult to make accurate interpretations on origins, migrations and post-colonisation interactions. Whilst the results of this project can confirm that the modern Society Island/Cook Island populations are representative of their prehistoric ancestors and as such could be used in further analyses as a proxy, there is limited information about temporal haplotype continuity from other regions in Oceania. It is recommended that any further research involving the sampling of Pacific rats in Oceania incorporate the use of both prehistoric and modern Pacific rat samples. 75 5 Conclusion 5.1 Summary Differences in the variation of nucleotide diversity across complete ancient and modern Pacific rat mitochondrial genomes indicated that targeting one region (e.g. hypervariable I region) of the mitochondrial genome, may not provide enough variation across both ancient and modern samples to sufficiently distinguish haplotype lineages. Complete mitochondrial sequencing provided greater molecular resolution, distinguishing new haplotype lineages within the encompassing control region R9 haplotype of Eastern Polynesia. Recent advancements in target capture and second-generation sequencing have made it accessible and more reliable to retrieve complete mitochondrial genomes from ancient specimens. There is great opportunity for the wider application of complete mitochondrial sequencing as part of commensal research across the Pacific. Varying rates of DNA preservation in archaeological deposits across the Pacific however does limit the ability to extract amplifiable DNA. Ancient library amplification success rates in this study coincide with previous reports of low DNA preservation in hot, humid environments, particular if the archaeological site is open and older than 1000 B.P. The highest overall success rate (after low coverage removal) was produced from the archaeological site of Anakena, Easter Island. Notably, Easter Island falls under a more temperate climate than most Polynesian islands and has only been colonised within the last 750 years. This is consistent with high success rates reported in other temperate islands such as New Zealand and the Chatham Islands, which were also colonised within the last 750 years. The ancient Eastern Polynesian haplotypes, representative of samples sourced from some of the earliest archaeological layers across Eastern Polynesia, indicate descent from a Western haplotype uncovered in Foa, Tonga. This coincides with archaeological and linguistic evidence of a Western Polynesian homeland in Tonga and Samoa, from which Eastern Polynesian populations are derived. A central Eastern Polynesian haplotype that was uncovered in Fakaofo, Tokelau, is ancestral 76 to all other haplotypes found across Eastern Polynesia. The central haplotype reflects a previously proposed central East Polynesian homeland region from which eastern expansion occurred. Radiocarbon evidence indicates that the Society Islands is the most likely candidate, however the ancient Society Island haplotypes reported in this project are not ancestral to other Eastern Polynesian populations. The ancient Maupiti (Society Islands) haplotype is however ancestral to nearly all modern lineages found across both the Society Islands and the Cook Islands. The modern Society Island/Cook Island populations are therefore largely representative of prehistoric populations in this region. Easter Island and Tubuai (Austral Islands) share related haplotypes that are derived from the central haplotype in central East Polynesia. This suggests that both islands were established by the same colonisation wave that originated in the central homeland region and moved through the south-eastern corridor, culminating with the colonisation of Easter Island. This is the first commensal study to provide a potential immediate origin for the settlement of Easter Island. 5.2 Future Directions Further sampling of ancient specimens across central East Polynesia would be required to assess the distribution of the central haplotype. Given archaeological evidence that posit that the Southern Cooks/Society Island were the first Eastern Polynesian islands to have been colonised from Samoa/Tonga, it is likely that the central haplotype exists in this region. Further sampling in the south-eastern corridor, from the Austral Islands through to the Pitcairn and Gambier Islands would also be recommended to assess the continuity of the Easter Is/Austral Is haplotype, distributed as part of a proposed south-east colonisation wave out of central East Polynesia. Given the high molecular resolution afforded by complete mitochondrial sequencing, it would be interesting to expand this study into Hawaii and New Zealand to investigate multiple reported colonisations and where they originate. An expansion of modern sampling outside of the Society Islands/Cook Islands would also be useful in investigating multiple post-colonisation interaction spheres across the region. Commensal studies can provide invaluable genetic evidence that used in conjunction with archaeological, linguistic and ethnographic data can evaluate models of human origins and dispersals across the Pacific. 77 6 Supplementary material Table 6.1 Variable positions across 203bp of the control region (from position 15,392 15,595). Colour coding: Adenine, red; Guanine, Blue; Cytosine, yellow; Thymine, Green. Amplified Haplo- Haplo- Papua Puk001 Thaifragment group group New land position I (R1) II (R2) Guinea 15,435 15,478 15,498 15,508 15,510 15,528 15,544 15,552 15,553 15,555 15,562 15,572 15,579 15,582 15,583 C T T A C C A T A T T C T C A T C T A C T A C A C A C T T A T T A C A A C A A A A C A C A A A Amplified fragment position 15,435 15,478 15,498 15,508 15,510 15,528 15,544 15,552 15,553 15,555 15,562 15,572 15,579 15,582 15,583 MS10596 MS10591 MS10601 MS10600 C N N N N N N N N N N N N N N N N N N N N N N N N N N N T C N N N N N N N N N N N N C A N N N N N N N N N N MS10602 N N N N N N N N N N 78 Haplogroup IIIA (R15) C C T G C T G C A T A C T T C Haplogroup IIIB (R9) C C T G C T G C G T A C T T C Raia45 Tah8 C C T A C T G C G T A C T T C MS10606 C N N Raia28 A Amplified fragment position 15,435 15,478 15,498 15,508 15,510 15,528 15,544 15,552 15,553 15,555 15,562 15,572 15,579 15,582 15,583 MS10523 Raia16 MS10592 N C C T G T N G C G T A C T T C Amplified fragment position 15,435 15,478 15,498 15,508 15,510 15,528 15,544 15,552 15,553 15,555 15,562 15,572 15,579 15,582 15,583 Tah3 Tah4 N N N MS10609 MS10610 MS10604 MS10605 MS10502 N N N Hua40 Tah2 C C T G C T G C G T A C T T C 79 Raia27 Hua24 Tet60.6 Hua23 Amplified fragment position 15,435 15,478 15,498 15,508 15,510 15,528 15,544 15,552 15,553 15,555 15,562 15,572 15,579 15,582 15,583 Hua19 Hua27 Tah5 Tah1 Rino1 Hua32 Hua12 Raia21 Amplified fragment position 15,435 15,478 15,498 15,508 15,510 15,528 15,544 15,552 15,553 15,555 15,562 15,572 15,579 15,582 15,583 Raia47 Hua47 Hua44 Hua43 Hua41 Raia17 Rama 24 C C T G C T G C G T A C T T C C C T G C T G C G T A C T T C 80 Rama 3 Amplified fragment position 15,435 15,478 15,498 15,508 15,510 15,528 15,544 15,552 15,553 15,555 15,562 15,572 15,579 15,582 15,583 Rama 14 Hua13 Hua29 New Zealand C C T G C T G C G T A C T T C 81 Table 6.2 Archaeological sample information. If samples were successfully amplified for sequencing they were given a database code. Only samples above 80% coverage were retained for phylogenetic analysis. Location Society Islands Site Terre Teahutapu, Plage Terre Namotu, Maupiti Haranai Valley, Maupiti Archaeological Sample Code Database Code MAU-1 N99E100 B1 UB 80 MS10451 MAU-1 N99E160 C1 UB 827-828 MAU-1 N99E100 A2 UB 59 MAU-1 N97E98 B5 UB 347 MAU-1 N100E100 B2 590-592 MAU-1 N99E100 MAU-1 N99E96 B1 MAU-1 N100E100 B3 MAU-1 N99E96 B3 MAU-1 N98E98 C1 MAU-1 N96E98 B3 MAU-1 N96E98 B4 MAU-1 N98E98 B3 MAU-1 N97E98 B7 MAU-5 N105 E98.5 C2 MAU-5 N100 E100 C2 MAU-5 N105 E98.50 C4 MAU-5 N105 E98.5 C1 MAU-11 TP1 B4 MS10452 MS10453 MS10454 MS10455 MS10457 82 Coverage > 80% MS10463 MS10478 MS10482 MS10503 MS10502 MS10506 Yes Haumi archaeological site, Moorea American Samoa Ofu Village, Ofu Olosega Village, Olosega Cook Islands Aitutaki MAU-11 TP2 A4 MAU-11 TP1 B8 MAU-11 TP2 A3 MS10511 MS10507 SCMO350 N95 E113 A2 MS10531 SCMO350 N95 E118 A1 SCMO350 N95 E116 C2 SCMO350 N95 E120 C4 SCMO350 TP2 A3 SCMO350 N95 E119 C2 AS-13-41 XU-4 Layer V #57 AS-13-41 XU-4 Layer III Level 6 #63 AS-13-41 XU-4 Layer V Level 3 #46 AS-13-41 XU-3 Layer X Level 1 #7 AS-13-41 XU-4 Layer VI Level 7 #67 AS-13-41 XU-3 Level VI Level 4 #15 AS-12-3 13W/10N Layer V Level 10 AS-12-3 14W/10N Layer V Level 9 AS-12-3 11W/10N Layer V Level 8 AS-12-3 1E/21N Layer IV Level 6 AS-12-3 14W/10N Layer VI Level 12 AS-12-3 IE/21N Layer V Level 8 Ait10 TP7 VII/7 #232 Ait10 TP1 VI,VII/6 #22 Ait10 TP5 IV/4 #146 Ait10 TP10 V15 #277 83 MS10523 MS10518 MS10549 MS10543 Yes Mangaia Tokelau Fakaofo Tonga Faleloa, Foa Mele Havea, Ha'afeva Austral Islands Atiahara, Tubuai Ait10 TP3 V16 #102.1 Ait10 TP8 VII/6 #245 AI 737 AI 740 AI 739 AI 742 Fakaofo Sq.4 Layer D26 Fakaofo Sq.4 Layer C1 Fakaofo Sq.2 Layer C4 Fakaofo Sq.2 Layer C2 Fakaofo Sq.4 Layer D5 Fakaofo Sq.4 Layer D4 Tonga:Ha'ano:Ha1 Pukotala site unit 14 Level 7 Tonga:Ha'ano:Ha1 Pukotala site unit 14 Level 6 Tonga:Foa:Fo1 Faleloa site Unit 18 Level 8 Tonga:Foa:Fo1 Faleloa site Unit 18 Level 2 Tonga:Ha'afena:Hf1 Mele Havea site Unit 2 Level 7 Tub452 AT1.1 Sq.B6 Layer 4 134cm BD Tub452 AT1.1 Section B Layer 7 AB14 166 BD Tub452 AT1.1 Section A Sq.09 Layer 5 123-30 BD Tub452 AT1.1 Sq.D5 L4 137.46cm BD Tub452 AT1.1 Section A Sq.M5 152 BD 84 MS10592 MS10593 MS10591 MS10594 Yes Yes MS10595 MS10596 MS10597 Yes MS10598 MS10600 Yes MS10601 Yes MS10599 Easter Island Anakena Layer 5 Anakena North Unit 1 UH ID#0033 Anakena North Unit UH ID#0015 Anakena North Unit 1 UH ID#0005 Anakena North Unit 1 UH ID#0030 Anakena North Unit 1 UH ID#0028 Anakena North Unit 1 UH ID#0022 Anakena North Unit 1 UH ID#0013 Anakena North Unit 1 UH ID#0018 Anakena North Unit 1 UH ID#0024 Anakena North Unit 1 UH ID#0010 Anakena North Unit 1 UH ID#0020 85 MS10602 MS10603 MS10604 MS10605 MS10606 MS10607 MS10608 MS10609 MS10610 MS10611 Yes Yes Yes Yes Yes Yes Table 6.3 Modern sample information. Only samples that were successfully amplified in 4 fragments were sequenced. Samples that exhibited high coverage above 90% were retained for phylogenetic analysis. Location Site Cook Islands Pukapuka Atiu Aitutaki Society Islands Tetiaroa Modern sample code Puk001 Puk002 Puk003 Puk004 Puk005 Puk006 Puk007 Atiu1A Atiu2A Atiu3A Rino1 Rama24 Rama3 Rama14 Tet27.6 Tet60.6 Tet13.6 Tet4.6 Tet11.6 Tet9.6 Tet24.6 Tet3.6 Tet12.6 Tet74.6 Tet5.6 Tet10.6 Tet6.6 Tet15.6 Tet29.6 Tet25.6 Tet14.6 Tet28.6 Tet65.6 Tet78.6 Tet7.6 Tet64.6 Tet18.6 86 Yes Coverage > 90% Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Sequenced Huahine Tahiti Raiatea Tet2.6 Tet75.6 Tet31.6 Tet21.6 Tet16.6 Tet26.6 Tet1.6 Tet23.6 Tet57.6 Tet8.6 Tet22.6 Tet30.6 Tet17.6 Hua23 Hua19 Hua40 Hua43 Hua13 Hua41 Hua18 Hua29 Hua24 Hua47 Hua12 Hua36 Hua44 Hua11 Hua30 Hua39 Hua26 Hua42 Hua28 Hua27 Hua32 Tah5 Tah1 Tah2 Tah3 Tah4 Tah8 Tah6 Raia28 Raia16 Raia21 Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes 87 Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Raia47 Raia6 Raia23 Raia25 Raia45 Raia22 Raia3 Raia26 Raia29 Raia8 Raia46 Raia27 Raia17 Samoa Apia, Upolu Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Apia001 Apia002 Togitogigh4 88 Yes Yes Yes Yes 7 Reference List Alberts, J. R., & Galef, B. G. (1973). Olfactory cues and movement: Stimuli mediating intraspecific aggression in the wild Norway rat. Journal of Comparative and Physiological Psychology, 85(2), 233-242. doi: 10.1037/h0035050. Allen, M. (1984). Elders, Chiefs, and Big Men: Authority legitimation and political evolution in Melanesia. American Ethnologist, 11(1), 20-41. doi: 10.1525/ae.1984.11.1.02a00020. Allen, M., & Steadman, D. (1990). Excavations at the Ureia site, Aitutaki, Cook Islands: preliminary results. Archaeology in Oceania, 25(1), 24-37. Allen, M. S. (2014). Marquesan colonisation chronologies and post-colonisation hypothesis. Journal of Pacific Archaeology, 5(2). Allen, M. S., & Johnson, K. T. (1997). Tracking ancient patterns of interaction: Recent geochemical studies in the southern Cook Islands. In M. Weisler (Ed.), Prehistoric Long-distance Interaction in Oceania: an interdisciplinary approach (Vol. 21, pp. 111-113): New Zealand Archaeological Association Monograph. Allen, M. S., & Wallace, R. (2007). New evidence from the East Polynesian gateway: Substantive and methodological results from Aitutaki, southern Cook Islands. Radiocarbon, 49(3), 1163-1180. Anderson, A., & Sinoto, Y. H. (2002). New Radiocarbon Ages of Colonization Sites in East Polynesia. Asian Perspectives, 41(2), 242-257. Bandelt, H. J., Forster, P., & Röhl, A. (1999). Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution, 16(1), 37-48. Barnes, S. S., Matisoo-Smith, E., & Hunt, T. L. (2006). Ancient DNA of the Pacific rat (Rattus exulans) from Rapa Nui (Easter Island). Journal of Archaeological Science, 33(11), 1536-1540. doi: 10.1016/j.jas.2006.02.006. Barnett, S. A. (1955). Competition among Wild Rats. Nature, 175(4446), 126. doi: 10.1038/175126b0. Bellwood, P. S. (1970). Dispersal centers in East Polynesia, with special reference to the Society and Marquesas Islands. In R. C. Green & M. Kelly (Eds.), Studies in 89 Oceanic culture history. Bernice Pauahi Bishop Museum. Bellwood, P. S. (1978). The Polynesians: prehistory of an island people. London: Thames and Hudson. Best, S. (1988). Tokelau archaeology: a preliminary report of an initial survey and excavations. Bulletin of the Indo-Pacific Prehistory Association, 8, 104-118. Best, S., Sheppard, P., Green, R., & Parker, R. (1992). Necromancing the stone: Archaeologists and adzes in Samoa. The Journal of the Polynesian Society, 101(1), 45-85. Biggs, B. (1972). Implications of Linguistic Subgroup with Special Reference to Polynesia. Pacific Anthropological Records, 13, 143-152. Bollongino, R., Tresset, A., & Vigne, J. D. (2008). Environment and excavation: Pre-lab impacts on ancient DNA analyses. Comptes Rendus Palevol, 7(2), 91-98. Boocock, J. (2014). Ancient DNA Pipeline. from GitHub Repository, https://github.com/theboocock/ancient_dna_pipeline Briggs, A. W., Stenzel, U., Johnson, P. L., Green, R. E., Kelso, J., Prüfer, K., . . . Lachmann, M. (2007). Patterns of damage in genomic DNA sequences from a Neandertal. Proceedings of the National Academy of Sciences, 104(37), 14616-14621. Briggs, A. W., Stenzel, U., Meyer, M., Krause, J., Kircher, M., & Pääbo, S. (2010). Removal of deaminated cytosines and detection of in vivo methylation in ancient DNA. Nucleic Acids Research, 38(6), e87. Brotherton, P., Endicott, P., Sanchez, J. J., Beaumont, M., Barnett, R., Austin, J., & Cooper, A. (2007). Novel high-resolution characterization of ancient DNA reveals C> U-type base modification events as the sole cause of post mortem miscoding lesions. Nucleic Acids Research, 35(17), 5717-5728. Buck, P. H. (1944). Arts and crafts of the Cook Islands. Honolulu, Hawaii: Bernice P. Bishop Museum. Burley, D., Edinborough, K., Weisler, M., & Zhao, J.-X. (2015). Bayesian modeling and chronological precision for Polynesian settlement of Tonga. PLoS One, 10(3), e0120795. doi: 10.1371/journal.pone.0120795. Burley, D. V., Nelson, D. E., & Shutler, R. (1999). A radiocarbon chronology for the Eastern Lapita frontier in Tonga. Archaeology in Oceania, 34(2), 59-70. doi: 10.1002/j.1834-4453.1999.tb00429.x. 90 Burney, D., & Kikuchi, W. (2006). A Millennium of Human Activity at Makauwahi Cave, Human Ecology, 34(2), 219-247. doi: 10.1007/s10745006-9015-3. Burney, D. A., James, H. F., Burney, L. P., Olson, S. L., Kikuchi, W., Wagner, W. L., . . . Nishek, R. (2001). Fossil evidence for a diverse biota from Kaua'i and its transformation since human arrival. Ecological Monographs, 71(4), 615-641. Burrows, E. G. (1938). Western Polynesia: a study of cultural differentiation (Vol. 7). Goteburg: Etnologiska Studier. (Reprinted 1970 with same pagination by University Bookshop, Dunedin, NZ). Clark, G. R., Reepmeyer, C., Melekiola, N., Woodhead, J., Dickinson, W. R., & Martinsson-Wallin, H. (2014). Stone tools from the ancient Tongan state reveal prehistoric interaction centers in the Central Pacific. Proceedings of the National Academy of Sciences, 111(29), 10491-10496. Clark, J. T., Quintus, S., Weisler, M., St Pierre, E., Nothdurft, L., & Feng, Y. (2016). Refining the chronology for West Polynesian colonization: New data from the Samoan archipelago. Journal of Archaeological Science: Reports, 6, 266-274. doi: 10.1016/j.jasrep.2016.02.011. artefact transfer within and beyond the archipelago. Archaeology in Oceania. doi: 10.1002/arco.5090. Connell, J. F., & Allen, J. (2004). Dating the colonization of Sahul (Pleistocene Australia New Guinea): a review of recent research. Journal of Archaeological Science, 31(6), 835-853. doi: 10.1016/j.jas.2003.11.005. Conte, E., & Kirch, P. V. (2004). Archaeological investigations in the Mangareva Islands (Gambier Archipelago), French Polynesia. Berkeley, Calif.: Archaeological Research Facility, University of California, Berkeley. Dabney, J., Meyer, M., & Pääbo, S. (2013). Ancient DNA damage. Cold Spring Harbor Perspectives in Biology, 5(7), a012567. Darriba, D., Taboada, G. L., Doallo, R., & Posada, D. (2012). jModelTest 2: more models, new heuristics and parallel computing. Nature Methods, 9(8), 772-772. Davidson, J. (1976). The Prehistory of Western Polynesia. In J. Garanger (Ed.), La Prehistorie Océanienne (pp. 27-51). Nice: Union Internationale des Sciences 91 Prehistoriques et Protohistoriques. Denham, T., Ramsey, C. B., & Specht, J. (2012). Dating the appearance of Lapita pottery in the Bismarck Archipelago and its dispersal to Remote Oceania. Archaeology in Oceania, 47(1), 39-46. doi: 10.1002/j.1834-4453.2012.tb00113.x. Derrington, I. M., Butler, T. Z., Collins, M. D., Manrao, E., Pavlenok, M., Niederweis, M., & Gundlach, J. H. (2010). Nanopore DNA sequencing with MspA. Proceedings of the National Academy of Sciences of the United States of America, 107(37), 16060. doi: 10.1073/pnas.1001831107. DeSalle, R., Gatesy, J., Wheeler, W., & Grimaldi, D. (1992). DNA Sequences from a Fossil Termite in Oligo-Miocene Amber and Their Phylogenetic Implications. American Journal of Physical Anthropology, 87, 291. Dickinson, W. R., & Green, R. C. (1998). Geoarchaeological context of Holocene subsidence at the ferry berth Lapita site, Mulifanua, Upolu, Samoa. Geoarchaeology, 13(3), 239-263. Duggan, Ana T., Evans, B., Friedlaender, Françoise r., Friedlaender, Jonathan s., Koki, G., Merriwether, D. a., . . . Stoneking, M. (2014). Maternal History of Oceania from Complete mtDNA Genomes: Contrasting Ancient Diversity with Recent Homogenization Due to the Austronesian Expansion. The American Journal of Human Genetics, 94(5), 721-733. doi: 10.1016/j.ajhg.2014.03.014. Dumont d'Urville, J. (1832). Sur les îles du Grand Océan. Bulletin de la Société de Géographie, 17, 1-21. Dye, T. S., & Dickinson, W. R. (1996). Sources of sand tempers in prehistoric Tongan pottery. Geoarchaeology, 11(2), 141-164. Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32(5), 1792-1797. Edward, M. G., David, E. G., Michael, T. C., Charles, J. S., Mary, D., David, H., & Gerard, Z. (1990). Chloroplast DNA sequence from a Miocene Magnolia species. Nature, 344(6267), 656. doi: 10.1038/344656a0. Elbert, S. H. (1982). Lexical diffusion in Polynesia and the Marquesan-Hawaiian relationship. The Journal of the Polynesian Society, 91(4), 499-517. Emory, K. P. (1946). Eastern Polynesia: its cultural relationships. (PhD), Yale University. Emory, K. P., & Sinoto, Y. H. (1965). Preliminary report on the archaeological 92 investigations in Polynesia : field work in the Society and Tuamotu Islands, French Polynesia, and American Samoa in 1962, 1963, 1964. . Honolulu: Department of Anthropology, Bernice P. Bishop Museum. Felsenstein, J. (1981). Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution, 17(6), 368-376. Flannery, T. F. (1995). Mammals of the south-west Pacific & Moluccan Islands: Cornell University Press. Friedlaender, J. S., Gentz, F., Green, K., & Merriwether, D. A. (2002). A Cautionary Tale on Ancient Migration Detection: Mitochondrial DNA Variation in Santa Cruz Islands, Solomon Islands. Human Biology, 74(3), 453-471. Green, R. (1966). Linguistic subgrouping within Polynesia: the implications for prehistoric settlement. The Journal of the Polynesian Society, 75(1), 6-38. Green, R. C. (1981). Location of the Polynesian homeland: a continuing problem. In J. Hollyman & A. Pawley (Eds.), Studies in Pacific languages and cultures in honour of Bruce Biggs (pp. 133-158). Auckland: Linguistic Society of New Zealand. Green, R. C. (1991). Near and remote Oceania history. In A. Pawley (Ed.), Man and a half: essays in Pacific anthropology and ethnobiology in honour of Ralph Bulmer (pp. 491-502). Auckland. Green, R. C., Jones, M., & Sheppard, P. (2008). The reconstructed environment and Archaeology in Oceania, 43(2), 49-61. doi: 10.1002/j.1834- 4453.2008.tb00030.x. Green, R. E., Briggs, A. W., Krause, J., Prüfer, K., Burbano, H. A., Siebauer, M., . . . Pääbo, S. (2009). The Neandertal genome and ancient DNA authenticity. EMBO Journal, 28(17), 2494-2502. doi: 10.1038/emboj.2009.222. Greig, K., Boocock, J., Prost, S., Horsburgh, K. A., Jacomb, C., Walter, R., & MatisooSmith, E. (2015). Complete Mitochondrial Genomes of New Zealand's First Dogs. PLoS One, 10(10), e0138536. doi: 10.1371/journal.pone.0138536. Groube, L., Chappell, J., Muke, J., & Price, D. (1986). A 40,000 year-old human occupation site at Huon Peninsula, Papua New Guinea. Nature, 324(6096), 453. doi: 10.1038/324453a0. Groube, L. M. (1971). Tonga, Lapita pottery, and Polynesian origins. The Journal of the 93 Polynesian Society, 80(3), 278-316. Guindon, S., Dufayard, J.-F., Lefort, V., Anisimova, M., Hordijk, W., & Gascuel, O. (2010). New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Systematic Biology, 59(3), 307-321. Guindon, S., & Gascuel, O. (2003). A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Systematic Biology, 52(5), 696-704. Haddon, A. C. (1975). Canoes of Oceania. Honolulu, Hawaii: Bishop Museum Press. Hage, P., & Marck, J. (2003). Matrilineality and the Melanesian Origin of Polynesian Y Chromosomes. Current Anthropology, 44(S5), S121-S127. doi: 10.1086/379272. Hagelberg, E., Quevedo, S., Turbon, D., & Clegg, J. (1994). DNA from ancient Easter Islanders. Nature, 369(6475), 25-26. Harlow, R. (1979). Regional variation in Maori. New Zealand Journal of Archaeology, 1, 123-138. Hasegawa, M., Kishino, H., & Yano, T. A. (1985). Dating of the human-ape splitting by a molecular clock of mitochondrial DNA. Journal of Molecular Evolution, 22(2), 160-174. Heyn, P., Stenzel, U., Briggs, A. W., Kircher, M., Hofreiter, M., & Meyer, M. (2010). Road blocks on paleogenomes- polymerase extension profiling reveals the frequency of blocking lesions in ancient DNA. Nucleic Acids Research, 38(16), e161. doi: 10.1093/nar/gkq572. Hill, A., Gentile, B., Bonnardot, J., Roux, J., Weatherall, D., & Clegg, J. (1987). Polynesian origins and affinities: globin gene variants in eastern Polynesia. American Journal of Human Genetics, 40(5), 453. Hofreiter, M., Serre, D., Poinar, H. N., Kuch, M., & Pääbo, S. (2001). Ancient DNA. Nature Reviews Genetics, 2(5), 353-359. Huelsenbeck, J. P., & Ronquist, F. (2001). MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics, 17(8), 754-755. Hunt, T. L. (1979). Alternative Perspectives for Early East Polynesian prehistory. Paper presented at the New Zealand Archaeological Association Biennial Conference, Auckland. Hunt, T. L., & Lipo, C. P. (2006). Late colonization of Easter Island. Science, 311(5767), 94 1603. Hunt, T. L., & Lipo, C. P. (2012). Ecological Catastrophe and Collapse: The Myth of 'Ecocide' on Rapa Nui (Easter Island). PERC Research Paper (12/3). Irwin, G. (1981). How Lapita lost its pots: the question of continuity in the colonisation of Polynesia. The Journal of the Polynesian Society, 90(4), 481-494. Irwin, G. (1992). The Prehistoric Exploration and Colonization of the Pacific. Cambridge: Cambridge University Press. Irwin, G. (2008). Pacific Seascapes, Canoe Performance, and a Review of Lapita Voyaging with Regard to Theories of Migration. Asian Perspectives, 47(1), 1227. Jennings, J. D. (1979). The Prehistory of Polynesia. Cambridge, Mass.: Harvard University Press. Jónsson, H., Ginolhac, A., Schubert, M., Johnson, P. L. F., & Orlando, L. (2013). mapDamage2.0: Fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics, 29(13), 1682-1684. doi: 10.1093/bioinformatics/btt193 Jordan, F. M., Gray, R. D., Greenhill, S. J., & Mace, R. (2009). Matrilocal residence is ancestral in Austronesian societies. Proceedings of the Royal Society of London B: Biological Sciences, 276(1664), 1957. doi: 10.1098/rspb.2009.0088 Kahn, J. G. (2012). Coastal Occupation at the GSIslands. Journal of Pacific Archaeology, 3(2), 52-61. Kahn, J. G., Dotte-Sarout, E., Molle, G., & Conte, E. (2015). Mid-to late prehistoric landscape change, settlement histories, and agricultural practices on Maupiti, Society Islands (Central Eastern Polynesia). The Journal of Island and Coastal Archaeology, 10(3), 363-391. Kayser, M. (2010). The Human Genetic History of Oceania: Near and Remote Views of Dispersal. Current Biology, 20(4), R194-R201. doi: 10.1016/j.cub.2009.12.004 Kayser, M., Brauer, S., Weiss, G., Underhill, P., Roewer, L., Schiefenhövel, W., & Stoneking, M. (2000). Melanesian origin of Polynesian Y chromosomes. Current Biology, 10(20), 1237-1246. doi: 10.1016/S0960-9822(00)00734-X. Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., . . . Duran, C. (2012). Geneious Basic: an integrated and extendable desktop software 95 platform for the organization and analysis of sequence data. Bioinformatics, 28(12), 1647-1649. Kemp, B. M., & Smith, D. G. (2005). Use of bleach to eliminate contaminating DNA from the surface of bones and teeth. Forensic Science International, 154(1), 5361. Kirch, P., & Ellison, J. (1994). Palaeoenvironmental evidence for human colonization of remote Oceanic islands. Antiquity, 68(259), 310. Kirch, P. V. (1986). Rethinking East Polynesian Prehistory. The Journal of the Polynesian Society, 95(1), 9-40. Norwegian Archaeological Review, 21(2), 103-117. Kirch, P. V. (1996). Late Holocene human-induced modifications to a central Polynesian island ecosystem. Proceedings of the National Academy of Sciences, 93(11), 5296-5300. Kirch, P. V. (1997). The Lapita Peoples: Ancestors of the Oceanic World. The Peoples of South-East Asia and the Pacific: Blackwell, Cambridge, MA. Kirch, P. V. (2000). On the road of the winds : an archaeological history of the Pacific Islands before European contact. Berkeley, California: University of California Press. Kirch, P. V., Steadman, D. W., Butler, V. L., Hather, J., & Weisler, M. I. (1995). Prehistory and human ecology in Eastern Polynesia: Excavations at Tangatatau Rockshelter, Mangaia, Cook Islands. Archaeology in Oceania, 30(2), 47-65. doi: 10.1002/j.1834-4453.1995.tb00330.x Knapp, M., & Hofreiter, M. (2010). Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives. Genes, 1(2), 227. doi: 10.3390/genes1020227. Knapp, M., Stiller, M., & Meyer, M. (2012). Generating barcoded libraries for multiplex high-throughput sequencing. In B. Shapiro & M. Hofreiter (Eds.), Ancient DNA: Methods and Protocols (pp. 155-170): Humana Press. Larson, G., Cucchi, T., Fujita, M., Matisoo-Smith, E., Robins, J., Anderson, A., . . . Dobney, K. (2007). Phylogeny and ancient DNA of Sus provides insights into neolithic expansion in Island Southeast Asia and Oceania. Proceedings of the 96 National Academy of Sciences of the United States of America, 104, 4834-4839. Leach, B. F., Davidson, J. M., & Horwood, L. M. (1994). Analaysis of faunal material from Ngaaitutaki, Tepaopao and Erua, Mangaia, Cook Islands Museum of New Zealand Te Papa Tongarewa Technical Report (Vol. 1, pp. 43). Leavesley, M. G., Bird, M. I., Fifield, L. K., Hausladen, P. A., Santos, G. M., & Di Tada, M. L. (2002). Buang Merabak: Early Evidence for Human Occupation in the Bismarck Archipelago, Papua New Guinea. Australian Archaeology (54), 55-57. plotype network construction. Methods in Ecology and Evolution, 6(9), 1110-1116. Leonard, J. A., Shanks, O., Hofreiter, M., Kreuz, E., Hodges, L., Ream, W., . . . Fleischer, R. C. (2007). Animal DNA in PCR reagents plagues ancient DNA research. Journal of Archaeological Science, 34(9), 1361-1366. doi: 10.1016/j.jas.2006.10.023 Li, H. (2013). Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv preprint arXiv:1303.3997. Li, H., & Durbin, R. (2009). Fast and accurate short read alignment with Burrows Wheeler transform. Bioinformatics, 25(14), 1754-1760. Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., . . . Durbin, R. (2009). The sequence alignment/map format and SAMtools. Bioinformatics, 25(16), 2078-2079. Librado, P., & Rozas, J. (2009). DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics, 25(11), 1451-1452. Lindahl, T. (1993). Instability and decay of the primary structure of DNA. Nature, 362(6422), 709-715. Lindgreen, S. (2012). AdapterRemoval: easy cleaning of next-generation sequencing reads. BMC Research Notes, 5(1), 1. Maricic, T., Whitten, M., & Pääbo, S. (2010). Multiplexed DNA sequence capture of mitochondrial genomes using PCR products. PLoS One, 5(11), e14004. Matisoo-Smith, E. (1994). The human colonisation of Polynesia. A novel approach: genetic analyses of the Polynesian rat (Rattus exulans). The Journal of the Polynesian Society, 103(1), 75-87. Matisoo-Smith, E. (2002). Something Old, Something New: Do Genetic Studies of 97 Contemporary Populations Reliably Represent Prehistoric Populations of Pacific Rattus exulans? Human Biology, 74(3), 489-496. Matisoo-Smith, E. (2015). Ancient DNA and the human settlement of the Pacific: A review. Journal of Human Evolution, 79, 93-104. doi: 10.1016/j.jhevol.2014.10.017. Ancient DNA from Polynesian rats: extraction, amplification and sequence from single small bones. Electrophoresis, 18(9), 1534-1537. Matisoo-Smith, E., Roberts, R. M., Irwin, G. J., Allen, J. S., Penny, D., & Lambert, D. M. (1998). Patterns of Prehistoric Human Mobility in Polynesia Indicated by mtDNA from the Pacific Rat. Proceedings of the National Academy of Sciences of the United States of America, 95(25), 15145-15150. Matisoo-Smith, E., & Robins, J. (2008). Mitochondrial DNA evidence for the spread of Pacific rats through Oceania. Biological Invasions, 11(7), 1521-1527. doi: 10.1007/s10530-008-9404-1. Matisoo-Smith, E., Robins, J. H., & Green, R. C. (2004). Origins and Dispersals of Pacific Peoples: Evidence from mtDNA Phylogenies of the Pacific Rat. Proceedings of the National Academy of Sciences of the United States of America, 101(24), 9167-9172. Matisoo-Smith, L. (1996). No hea te kiore: MtDNA variation in Rattus exulans: a model for human colonisation and contact in prehistoric Polynesia. (PhD), The University of Auckland. McKenna, A., Hanna, M., Banks, E., Sivachenko, A., Cibulskis, K., Kernytsky, A., . . . Daly, M. (2010). The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research, 20(9), 1297-1303. Melton, T., Peterson, R., Redd, A. J., Saha, N., Sofro, A., Martinson, J., & Stoneking, M. (1995). Polynesian genetic affinities with Southeast Asian populations as identified by mtDNA analysis. American Journal of Human Genetics, 57(2), 403. Milham, P., & Thompson, P. (1976). Relative antiquity of human occupation and extinct fauna at Madura Cave, southeastern Western Australia. Mankind, 10(3), 175-180. Mitchell, D., Willerslev, E., & Hansen, A. (2005). Damage and repair of ancient DNA. 98 Mutation Research: Fundamental and Molecular Mechanisms of Mutagenesis, 571(1), 265-276. Molle, G., & Conte, E. (2011). New Perspectives on the Occupation of Hatuana Dune Site, Ua Huka, Marquesas Islands. Journal of Pacific Archaeology, 2(2), 103-108. Nei, M. (1987). Molecular evolutionary genetics. New York: Columbia University Press. Nielsen-Marsh, C. M., & Hedges, R. E. (2000). Patterns of diagenesis in bone I: the effects of site environments. Journal of Archaeological Science, 27(12), 11391150. Noonan, J. P., Hofreiter, M., Smith, D., Priest, J. R., Rohland, N., Rabeder, G., . . . Rubin, E. M. (2005). Genomic sequencing of Pleistocene cave bears. Science (New York, N.Y.), 309(5734), 597. Connell, J. F., & Allen, J. (2004). Dating the colonization of Sahul (Pleistocene Australia New Guinea): a review of recent research. Journal of Archaeological Science, 31(6), 835-853. doi:10.1016/j.jas.2003.11.005 seafaring. In A. Anderson, J. H. Barrett & K. V. Boyle (Eds.), The global origins and development of seafaring (pp. 57-68). Cambridge, UK: McDonald Institute for Archaeological Research, University of Cambridge. Orlando, L., Ginolhac, A., Raghavan, M., Vilstrup, J., Rasmussen, M., Magnussen, K., . . . Willerslev, E. (2011). True single-molecule DNA sequencing of a Pleistocene horse bone. Genome Research, 21(10), 1705-1719. Oskarsson, M. C. R., Klütsch, C. F. C., Boonyaprakob, U., Wilton, A., Tanabe, Y., & Savolainen, P. (2012). Mitochondrial DNA data indicate an introduction through Mainland Southeast Asia for Australian dingoes and Polynesian domestic dogs. Proceedings of the Royal Society of London B: Biological Sciences, 279(1730), 967-974. doi: 10.1098/rspb.2011.1395. Overballe-Petersen, S., Orlando, L., & Willerslev, E. (2012). Next-generation sequencing offers new insights into DNA degradation. Trends in Biotechnology, 30(7), 364368. doi: 10.1016/j.tibtech.2012.03.007. Pääbo, S. (1989). Ancient DNA: extraction, characterization, molecular cloning, and enzymatic amplification. Proceedings of the National Academy of Sciences, 86(6), 1939-1943. 99 Pawley, A. (1966). Polynesian languages: A subgrouping based on shared innovations in morphology. The Journal of the Polynesian Society, 75(1), 39-64. Pawley, A. (2015). Linguistic Evidence as a Window into the Prehistory of Oceania. In E. Cochrane & T. L. Hunt (Eds.), The Oxford Handbook of Prehistoric Oceania. Petchey, F. (2001). Radiocarbon determinations from the Mulifanua Lapita site, Upolu, western Samoa. Radiocarbon, 43(1), 63-68. Petchey, F., Addison, D. J., & McAlister, A. (2010). Re-interpreting old dates: Radiocarbon determinations from the Tokelau Islands (South Pacific). Journal of Pacific Archaeology, 1(2), 161-167. Poinar, H., Schwarz, C., & Shapiro, B. (2006). Metagenomics to Paleogenomics: LargeScale Sequencing of Mammoth DNA. Science, 311(5759), 392-394. Poinar, H. N., & Cooper, A. (2000). Ancient DNA: do it right or not at all. Science, 5482, 1139. Pruvost, M., Schwarz, R., Correia, V. B., Champlot, S., Braguier, S., Morel, N., . . . Geigl, E.-M. (2007). Freshly excavated fossil bones are best for amplification of ancient DNA. Proceedings of the National Academy of Sciences, 104(3), 739744. Pushkarev, D., Neff, N. F., & Quake, S. R. (2009). Single-molecule sequencing of an individual human genome. Nature Biotechnology, 27(9), 847. doi: 10.1038/nbt.1561. Quintus, S., Clark, J. T., Day, S. S., & Schwert, D. P. (2016). Landscape Evolution and Human Settlement Patterns on Ofu Island, Manu'a Group, American Samoa. Asian Perspectives, 54(2), 208-237. Ramakrishnan, U., & Hadly, E. A. (2009). Using phylochronology to reveal cryptic population histories: review and synthesis of 29 ancient DNA studies. Molecular Ecology, 18, 1310-1330. doi: 10.1111/j.1365-294X.2009.04092.x. Redd, A. J., Takezaki, N., Sherry, S. T., McGarvey, S. T., Sofro, A., & Stoneking, M. (1995). Evolutionary history of the COII/tRNALys intergenic 9 base pair deletion in human mitochondrial DNAs from the Pacific. Molecular Biology and Evolution, 12(4), 604-615. Rieth, T. M., Hunt, T. L., Lipo, C., & Wilmshurst, J. M. (2011). The 13th century Journal of Archaeological Science, 100 38(10), 2740-2749. Rizzi, E., Lari, M., Gigli, E., De Bellis, G., & Caramelli, D. (2012). Ancient DNA studies: new perspectives on old samples. Genetics Selection Evolution, 44(1), 1. Robarts, E. (1974). The Marquesan journal of Edward Robarts, 1797-1824. Canberra: Australian National University Press. Robins, J., Matisoo-Smith, E., & Furey, L. (2001). Hit or miss? Factors affecting DNA preservation in Pacific archaeological material. In M. Jones & P. Sheppard (Eds.), Australasian Connections and New Directions: Proceedings of the 7th Australasian Archaeometry Conference (Vol. 5, pp. 295-305). Robins, J. H., McLenachan, P. A., Phillips, M. J., Craig, L., Ross, H. A., & Matisoo-Smith, E. (2008). Dating of divergences within the Rattus genus phylogeny using whole mitochondrial genomes. Molecular Phylogenetics and Evolution, 49(2), 460466. Robins, J. H., McLenachan, P. A., Phillips, M. J., McComish, B. J., Matisoo-Smith, E., & Ross, H. A. (2010). Evolutionary relationships and divergence times among the native rats of Australia. BMC Evolutionary Biology, 10, 375-375. doi: 10.1186/1471-2148-10-375. Robinson, J. T., Thorvaldsdóttir, H., Winckler, W., Guttman, M., Lander, E. S., Getz, G., & Mesirov, J. P. (2011). Integrative genomics viewer. Nature Biotechnology, 29(1), 24-26. Rohland, N., & Hofreiter, M. (2007). Ancient DNA extraction from bones and teeth. Nature Protocols, 2(7), 1756-1762. extraction method for increased sample throughput. Molecular Ecology Resources, 10(4), 677-683. Rolett, B. V. (2002). Voyaging and interaction in ancient East Polynesia. Asian Perspectives, 41(2), 182-194. Ronquist, F., & Huelsenbeck, J. P. (2003). MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics, 19(12), 1572-1574. Roullier, C., Benoit, L., McKey, D. B., & Lebot, V. (2013). Historical collections reveal patterns of diffusion of sweet potato in Oceania obscured by modern plant movements and recombination. Proceedings of the National Academy of 101 Sciences, 110, 2205-2210. doi: 10.1073/pnas.1211049110. Sarah, B. N., Emily, H. T., Peggy, D. R., Steven, D. F., Abigail, W. B., Choli, L., . . . Jay, S. (2009). Targeted capture and massively parallel sequencing of 12 human exomes. Nature, 461(7261), 272. doi: 10.1038/nature08250. Savolainen, P., Leitner, T., Wilton, A. N., Matisoo-Smith, E., & Lundeberg, J. (2004). A detailed picture of the origin of the Australian dingo, obtained from the study of mitochondrial DNA. Proceedings of the National Academy of Sciences of the United States of America, 101(33), 12387-12390. Sawyer, S., Krause, J., Guschanski, K., Savolainen, V., & Pääbo, S. (2012). Temporal patterns of nucleotide misincorporations and DNA fragmentation in ancient DNA. PLoS One, 7(3), e34131. Schubert, M., Ermini, L., Der Sarkissian, C., Jónsson, H., Ginolhac, A., Schaefer, R., . . . McCue, M. (2014). Characterization of ancient and modern genomes by SNP detection and phylogenomic and metagenomic analysis using PALEOMIX. Nature Protocol, 9(5), 1056-1082. Schubert, M., Ginolhac, A., Lindgreen, S., Thompson, J. F., Al-Rasheid, K. A., Willerslev, E., . . . Orlando, L. (2012). Improving ancient DNA read mapping against modern reference genomes. BMC Genomics, 13(1), 1. Sheppard, P. J., Walter, R., & Parker, R. J. (1997). Basalt sourcing and the development of Cook Island exchange systems. In M. Weisler (Ed.), Prehistoric Long-distance Interaction in Oceania: an interdisciplinary approach (pp. 85-110): New Zealand Archaeological Association Monograph. Simmons, D. (1973). Suggested periods in South Island prehistory. Records of the Auckland Institute and Museum, 1-58. Sinoto, Y. H. (1970). An archaeologically based assessment of the Marquesas Islands as a dispersal center in East Polynesia. Studies in Oceanic culture history, 1, 105132. Sinoto, Y. H. (1983). An analysis of Polynesian migrations based on the archaeological assessments. Journal de la Société des Océanistes, 39(76), 57-67. Slatkin, M., & Racimo, F. (2016). Ancient DNA and human history. Proceedings of the National Academy of Sciences of the United States of America, 113(23), 6380. doi:10.1073/pnas.1524306113 102 Smith, C. I., Chamberlain, A. T., Riley, M. S., Stringer, C., & Collins, M. J. (2003). The thermal history of human fossils and the likelihood of successful DNA amplification. Journal of Human Evolution, 45(3), 203-217. Spriggs, M., & Anderson, A. (1993). Late colonization of East Polynesia. Antiquity, 67(255), 200. Stoneking, M. (2000). Hypervariable Sites in the mtDNA Control Region Are Mutational Hotspots. The American Journal of Human Genetics, 67(4), 1029-1032. doi: 10.1086/303092. Storey, A. A., & Matisoo-Smith, E. A. (2014). No evidence against Polynesian dispersal of chickens to pre-Columbian South America. Proceedings of the National Academy of Sciences of the United States of America, 111, E3583. Storey, A. A., Ramírez, J. M., Quiroz, D., Burley, D. V., Addison, D. J., Walter, R., . . . Matisoo-Smith, E. A. (2007). Radiocarbon and DNA evidence for a preColumbian introduction of Polynesian chickens to Chile. Proceedings of the National Academy of Sciences, 104, 10335-10339. Storey, A. A., Spriggs, M., Bedford, S., Hawkins, S. C., Robins, J. H., Huynen, L., & Matisoo-Smith, E. (2010). Mitochondrial DNA from 3000-year old chickens at the Teouma site, Vanuatu. Journal of Archaeological Science, 37(10), 24592468. doi: 10.1016/j.jas.2010.05.006. Summerhayes, G. R., Leavesley, M., Fairbairn, A., Mandui, H., Field, J., Ford, A., & Fullagar, R. (2010). Human adaptation and plant use in highland New Guinea 49,000 to 44,000 years ago. Science (New York, N.Y.), 330(6000), 78. doi: 10.1126/science.1193130. Sutton, D. G. (1980). A culture history of the Chatham Islands. The Journal of the Polynesian Society, 89(1), 67-93. Tajima, F. (1983). Evolutionary relationship of DNA sequences in finite populations. Genetics, 105(2), 437. Thomson, V. A., Lebrasseur, O., Austin, J. J., Hunt, T. L., Burney, D. A., Denham, T., . . . Cooper, A. (2014). Using ancient DNA to study the origins and dispersal of ancestral Polynesian chickens across the Pacific. Proceedings of the National Academy of Sciences of the United States of America, 111(13), 4826. doi: 10.1073/pnas.1320412111. 103 Thorvaldsdóttir, H., Robinson, J. T., & Mesirov, J. P. (2013). Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Briefings in Bioinformatics, 14(2), 178-192. Trudgill, P. (2004). Linguistic and social typology: The Austronesian migrations and phoneme inventories. Linguistic Typology, 8(3), 305-320. doi: 10.1515/lity.2004.8.3.305. Tsangaras, K., Wales, N., Sicheritz-Pontén, T., Rasmussen, S., Michaux, J., Ishida, Y., . . . Greenwood, A. D. (2014). Hybridization Capture Using Short PCR Products Enriches Small Genomes by Capturing Flanking Sequences (CapFlank). PLoS One, 9(10), e109101. Van Tilburg, J. A. (1994). Easter Island: archaeology, ecology and culture: British Museum Press. Walter, R. (1998). Anai'o: the archaeology of a fourteenth century Polynesian community in the Cook Islands. Auckland [N.Z.]: New Zealand Archaeological Association. Weisler, M. I. (1994). The settlement of marginal Polynesia: New evidence from Henderson Island. Journal of Field Archaeology, 21(1), 83-102. Weisler, M. I. (1998). Hard evidence for prehistoric interaction in Polynesia. Current Anthropology, 39(4), 521-532. Whistler, W. A. (2009). Plants of the canoe people: an ethnobotanical voyage through Polynesia Wickler, S., & Spriggs, M. (1988). Pleistocene human occupation of the Solomon Islands, Melanesia. Antiquity, 62(237), 703. Willerslev, E., & Cooper, A. (2005). Ancient DNA. Proceedings of the Royal Society of London B: Biological Sciences, 272(1558), 3-16. Wilmshurst, J., & Higham, T. F. G. (2004). Using rat-gnawed seeds to independently date the arrival of Pacific rats and humans in New Zealand. The Holocene, 14(6), 801-806. doi: 10.1191/0959683604hl760ft. Wilmshurst, J., Hunt, T., Lipo, C., & Anderson, A. (2011). High-precision radiocarbon dating shows recent and rapid initial human colonization of East Polynesia. Proceedings of the National Academy of Sciences of the United States of America, 108(5), 1815-1820. 104 Wilmshurst, J. M., Anderson, A. J., Higham, T. F. G., & Worthy, T. H. (2008). Dating the late prehistoric dispersal of Polynesians to New Zealand using the commensal Pacific rat. Proceedings of the National Academy of Sciences of the United States of America, 105(22), 7676-7680. doi: 10.1073/pnas.0801507105. Worthy, T. H., & Bollt, R. (2011). Prehistoric Birds and Bats from the Atiahara Site, Tubuai, Austral Islands, East Polynesia. Pacific Science, 65(1), 69-85. Yang, D. Y., & Watt, K. (2005). Contamination controls when preparing archaeological remains for ancient DNA analysis. Journal of Archaeological Science, 32(3), 331336. 105
© Copyright 2026 Paperzz