PREFACE

In 2001, in a massive collaborative effort of scientists working across a multitude of disciplines, including biology, biochemistry, chemistry, genetics, engineering, and computer science, one of the tremendous feats in the history of science was accomplished: the sequencing of the human genome. Today, we can imagine a not-too-distant future in which our personal genomes are entirely known to us. We will download our genetic data and know by our sequences whether we are susceptible to particular diseases such as diabetes, cancer, and stroke. We’ll modify our behaviors and mitigate these risks; our lives will change. For some of us, a poor genetic profile will affect our outlook on life, or the economics of our lives. How will medicine adapt to common knowledge of the genome?

We do not quite know yet what this world looks like, but some of its weightiest questions are already being asked, debated, and studied by a rapidly expanding field of genomics and bioinformatics research. These are questions about the modern world, the modern person, and the future of biological science. Welcome to the world of bioinformatics.

THE APPROACH

Concepts in Bioinformatics and Genomics takes a conceptual approach to its subject, balancing biology, mathematics, and programming, while highlighting relevant real-world applications. Topics are developed from the fundamentals up, as in an introductory textbook. This is a comprehensive book for students enrolled in their first course in bioinformatics. A compelling case study gene, the TP53 gene, a human tumor suppressor with strong clinical applications, runs throughout, engaging students with a continuously relevant example. The textbook thoroughly describes basic principles of probability as they lead up to the concept of Expect value (E-value) and its use in sequence alignment programs.
Concepts in Bioinformatics and Genomics also describes, from a mathematical perspective, the development of the hidden Markov model and how it can be used to align sequences in multiple sequence alignment programs. Finally, it introduces students to programming exercises directly related to bioinformatics problems. Thought-provoking exercises stretch the students’ imaginations and learning, giving them a deeper understanding of the software programs, molecular biology, basic probability, and program-coding methodology underpinning the discipline. The material covered in this book provides students with the fundamental tools necessary to analyze biological data.

ORGANIZATION

Introduction to Bioinformatics: Chapters 1–5

Chapter 1 is an overview of molecular biology. It provides the essential biology vocabulary for understanding bioinformatics. Chapter 2 introduces GenBank, the database that stores the vast amounts of DNA and RNA sequence data crucial for bioinformatics research. Chapter 3 discusses molecular evolution, which explains the diversity of sequences and how mutations get passed to progeny. Chapter 4 delves into the derivation of amino acid substitution matrices, the basis of sequence comparison programs, which help us connect molecular evolution to protein structure and function. Chapter 5 discusses amino acid substitution matrices and pairwise sequence comparison programs. Here, we begin to get into the nuts and bolts of algorithms that use data from evolution and protein domain conservation to infer whether two genes are homologs.

Biology: Chapters 6–10

Chapter 6 further develops the topic of pairwise sequence comparison by describing the Basic Local Alignment Search Tool (BLAST) and discusses multiple sequence alignment programs with an emphasis on ClustalW, the first popular program of this class. Chapter 7 is devoted to protein structure prediction programs.
This chapter provides strong foundational knowledge of protein structures and the Protein Data Bank. Chapter 8 introduces phylogenetics with a discussion of DNA, protein sequence information, and the construction of phylogenetic trees. Chapter 9 presents genomics analysis with an emphasis on next-generation sequencing (NGS) and the annotation of bacterial genomes. Chapter 10 is all about gene expression. Approximately half of this chapter is devoted to methods for measuring transcript levels, with an emphasis on microarrays and RNA-seq. The other half is devoted to proteomics, where we describe how mass spectrometry is used to identify proteins isolated from 2D gels.

Mathematics: Chapters 11–12

Chapter 11 introduces you to probability, a requisite component of bioinformatics research, with an emphasis on counting methods, dependence, Bayesian inference, and random variables. In Chapter 12, the subject of continuous random variables, introduced in the previous chapter, is developed further into a discussion of the extreme value distribution and its use in analyzing the significance of an alignment. We conclude the chapter with stochastic processes, specifically Markov chains and hidden Markov models, as well as a mathematical derivation of the Jukes-Cantor model.

Programming: Chapters 13–14

Chapter 13 focuses on Python, a popular bioinformatics programming language. The Kyte-Doolittle Hydropathy sliding window program (one of the first popular bioinformatics programs) is used to illustrate Python fundamentals and to introduce you to the program design process. Chapter 14 follows this design process and steps you through the development of a pairwise sequence alignment tool.

FOR PROFESSORS

Approach and Rationale

The bioinformatics discipline has matured to the point where there is general agreement on the software programs and databases that are standards in the field.
The algorithms that form the foundations of these software programs will not significantly change within the next three to four years. Similarly, databases that are bulwarks of the field will not vanish in the foreseeable future. Understanding the rationale for the basis of these bioinformatics tools is critical for students pursuing molecular life science or bioinformatics careers.

Flexible Organization

Overall, biology, mathematics, and computer science are presented in an order that systematically develops a student’s understanding of the area. To highlight relevant connections between the three, we include cross-references in the main text and in footnotes. Those who wish to teach the course with the biology-heavy chapters in the beginning may consider presenting the chapters in the order listed in the table of contents. In this order, the biology-heavy chapters (Chapters 1 through 10) are followed by two mathematics-heavy chapters (Chapters 11 and 12) and two computer science-heavy chapters (Chapters 13 and 14).

If instructors wish to integrate computer programming early into the course, they may want to consider presenting the chapters in the following order: 1–5, 13, 14, and 6–12. Chapters 1 through 5 provide the biological rationale for pairwise sequence alignment, and Chapters 13 and 14 provide the computer programming background so that students can create their own software tools to align sequences. The programming concepts in Chapters 13 and 14 reinforce the biological principles covered in Chapters 1 through 5. To provide students with more time to learn the Python programming basics, instructors may wish to intersperse topics from Chapters 13 and 14 among topics covered in Chapters 1 through 5. After covering Chapters 1 through 5, 13, and 14, material from the more biology-heavy chapters (Chapters 6–10) and the mathematics-heavy chapters (Chapters 11–12) can be covered.
Some bioinformatics and genomics courses are taught in a format consisting of a lecture section and a separate computer lab section. If this is the case, the lecture section can focus on Chapters 1 through 12 and the lab section on Chapters 13 and 14. The lab section may allow more time for students to work through small coding assignments that together provide a foundation for a more extensive programming project (described in Chapter 14) to be completed by the end of the lab course. Another way of dividing the material between lecture and lab sections is to focus the lecture on the biology-heavy chapters (Chapters 1–10) and include Chapters 11–14 in the lab.

SUGGESTED ALTERNATIVE PRESENTATIONS OF TEXTBOOK
(alternate chapter presentation order: type of integration)
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14: Biology-heavy chapters first with cross-references to mathematics- and computer science-heavy chapters.
1, 2, 3, 4, 5, 13, 14, 6, 7, 8, 9, 10, 11, 12: Biology-foundation chapters first with computer science-heavy chapters more integrated.
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 (lecture); 13, 14 (lab): Biology-heavy and mathematics-heavy lecture section with a lab focused on computer science.
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 (lecture); 11, 12, 13, 14 (lab): Biology-heavy lecture section with a lab focused on mathematics and computer science.
1, 2, 3, 4, 5, 11, 12, 6, 7, 8, 9, 10, 13, 14: Biology-foundation chapters first with mathematics-heavy chapters more integrated.

If instructors would like to integrate mathematics earlier in the course, they may consider covering Chapters 11 and 12 just prior to Chapter 6. The introductory basic probability segment of Chapter 11, followed by the explicit derivation of the extreme value distribution in Chapter 12, provides a strong foundation for the discussion of E-value, an important component of the BLAST program discussed in Chapter 6.
The segment of Chapter 12 that introduces hidden Markov models will strengthen the students’ understanding of multiple sequence alignment discussed in Chapter 6. The table above shows our suggestions for alternative sequences of the textbook chapters that can be tailored to your particular needs.

THE FEATURES

Balance of Biology, Mathematics, and Programming

Concepts in Bioinformatics and Genomics strikes a balance of topics for all students, no matter their background. Biology students will appreciate the reinforcement of the molecular life science topics and the gradual introduction to basic probability and programming concepts. Basic probability and programming use examples in biology to help biology students see the relevance of these concepts to molecular life science. Mathematics is expertly interwoven with bioinformatics concepts. Students with a background in computer programming will appreciate the basic biology primer in the first chapter. For students who already know how to program in another language, this textbook offers the opportunity to learn the fundamentals of a new language, Python.

Genomics

Genomics is a field that studies the entire sequenced genomes of organisms. Bioinformatics programs and databases are highly applicable to genomics because of the critical need to analyze and store a large amount of sequence data. Without bioinformatics, we cannot fully assess the genomics data we have collected. Chapters that emphasize genomics are Chapter 8 (“Phylogenetics”), Chapter 9 (“Genomics”), and Chapter 10 (“Transcript and Protein Expression Analysis”).

Case Studies of TP53, the Tumor Suppressor Gene

The TP53 tumor suppressor is mutated in virtually all cancer types, and there is wide interest in using this knowledge to develop better cancer therapies.
In Chapter 1, we discuss how p53 was discovered as a protein bound to a monkey virus oncoprotein, and in the last chapter, we show students how to create sequence alignment programs that quantify the similarities between p53 and its paralogs, p63 and p73. By the end of this textbook, students and instructors will have a deep understanding of the molecular biology of this gene and how bioinformatics can be used to further research progress in the fight against cancer.

Scientist Spotlight

Scientists who made significant contributions to the bioinformatics field are highlighted in “Scientist Spotlight” boxed sections. The scientists who created the first widely applicable amino acid substitution matrices (Margaret Dayhoff), the first global sequence alignment program (Christian Wunsch), the first local sequence alignment program (Michael Waterman), and the first program that successfully predicted protein membrane spanning regions (Russell Doolittle) are just a few of the brilliant minds featured, along with their discoveries.

A Closer Look

From the TP53 gene to DNA fingerprinting and the Neanderthal genome, this boxed material examines in detail some of the most important elements of Concepts in Bioinformatics and Genomics. Replete with figures, photographs, and excerpts from published texts, “A Closer Look” provides the background and clarity needed to fully grasp the relevance of bioinformatics.

Thought Questions

Interspersed throughout the text, “Thought Questions” ask the important conceptual questions and prompt students to problem-solve and apply their knowledge on the fly. These questions give students opportunities to self-test and better engage with their reading. Answers are found at the end of the chapter.

End-of-Chapter Exercises

Additionally, a robust list of end-of-chapter exercises encourages students to apply their bioinformatics knowledge holistically. Exercises are qualitative and quantitative, specific and comprehensive.
Glossary Terms

Glossary terms are highlighted and defined the first time they appear in the text. Concise explanations of the terms are also provided in the glossary section at the end of the book.

SUPPORT PACKAGE

Oxford University Press offers a comprehensive ancillary package for instructors and students using Concepts in Bioinformatics and Genomics.

For Students

Companion website (www.oup.com/us/momand): Resources and links to bioinformatics software, tools, and databases are available on the companion website. These are stable resources, such as Dotter, BLAST, GenBank, and many more, that have matured with the discipline into the essential tools for the bioinformatician. The companion site also provides downloadable programming tools that are necessary for students to complete the programming projects and end-of-chapter exercises.

For Instructors

The Ancillary Resource Center (ARC), located at www.oup-arc.com/momand, contains the following teaching tools:
• Digital Image Library includes electronic files in PowerPoint format of every illustration, photo, graph, figure caption, and table from the text, in both labeled and unlabeled versions.
• Answers to End-of-Chapter Questions includes detailed solutions to all of the many exercises provided at the end of each chapter.
• Editable Lecture Notes in PowerPoint format for each chapter help make preparing lectures faster and easier than ever. Each chapter’s presentation includes a succinct outline of key concepts and incorporates the graphics from the chapter.