AutoAssembler Version 2.0 User’s Manual

© Copyright 2000, Applied Biosystems

For Research Use Only. Not for use in diagnostic procedures.

ABI PRISM and its Design, Applied Biosystems, SeqEd and Sequence Navigator are registered trademarks of PE Corporation. ABI, AutoAssembler, BioLIMS, Factura, Inherit, and Applied Biosystems are trademarks of PE Corporation or its subsidiaries in the U.S. and certain other countries. Collections Manager is a trademark of Molecular Informatics, Inc. AppleScript and Macintosh are registered trademarks of Apple, Inc. All other trademarks are the sole property of their respective owners.

P/N 904947B

Software License and Warranty

Applied Biosystems Software License and Limited Product Warranty

PURCHASER, CAREFULLY READ THE FOLLOWING TERMS AND CONDITIONS (THE “AGREEMENT”), WHICH APPLY TO THE SOFTWARE ENCLOSED (THE “SOFTWARE”). YOUR OPENING OF THIS PACKAGE INDICATES YOUR ACCEPTANCE OF THESE TERMS AND CONDITIONS. IF YOU DO NOT ACCEPT THEM, PROMPTLY RETURN THE COMPLETE PACKAGE AND YOUR MONEY WILL BE RETURNED. THE LAW PROVIDES FOR CIVIL AND CRIMINAL PENALTIES FOR ANYONE WHO VIOLATES THE LAWS OF COPYRIGHT.

Copyright
The SOFTWARE, including its structure, organization, code, user interface, and associated documentation, is a proprietary product of Applied Biosystems and is protected by international laws of copyright. Title to the SOFTWARE, and to any and all portion(s) of the SOFTWARE shall at all times remain with Applied Biosystems.

License
1. You may use the SOFTWARE on a single computer (or on a single network, if your software is designated as a network version). You may transfer the SOFTWARE to another single computer (or network, if a network version), so long as you first delete the SOFTWARE from the previous computer or network. You may never have operational SOFTWARE on more than one computer (or more than one network, if a network version) per original copy of the SOFTWARE at any time.
2. You may make one copy of the SOFTWARE for backup purposes.
3. You may transfer the SOFTWARE to another party, but only if the other party agrees in writing with Applied Biosystems to accept the terms and conditions of this Agreement. If you transfer the SOFTWARE to another party, you must immediately transfer all copies to that party, or destroy those not transferred. Any such transfer terminates your license.

Restrictions
1. You may not copy, transfer, rent, modify, use, or merge the SOFTWARE, or the associated documentation, in whole or in part, except as expressly permitted in this Agreement.
2. You may not reverse assemble, decompile, or otherwise reverse engineer the SOFTWARE.

Limited Warranty
For a period of 90 days after purchase of the SOFTWARE, Applied Biosystems warrants that the SOFTWARE will function substantially as described in the documentation supplied by Applied Biosystems with the SOFTWARE. If you discover an error which causes substantial deviation from that documentation, send a written notification to Applied Biosystems. Upon receiving such notification, if Applied Biosystems is able to reliably reproduce that error at its facility, then Applied Biosystems will do one of the following at its sole option: (i) correct the error in a subsequent release of the SOFTWARE, which shall be supplied to you free of charge, or (ii) accept a return of the SOFTWARE from you, and refund the purchase price received for the SOFTWARE. Applied Biosystems does not warrant that the SOFTWARE will meet your requirements, will be error-free, or will conform exactly to the documentation. Any sample or model used in connection with this Agreement is for illustrative purposes only, is not part of the basis of the bargain, and is not to be construed as a warranty that the SOFTWARE will conform to the sample or model.

Limitation Of Liability
EXCEPT AS SPECIFICALLY STATED IN THIS AGREEMENT, THE SOFTWARE IS PROVIDED AND LICENSED “AS IS”.
THE ABOVE WARRANTY IS GIVEN IN LIEU OF ALL OTHER WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING THOSE OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. NOTWITHSTANDING ANY FAILURE OF THE CENTRAL PURPOSE OF ANY LIMITED REMEDY, APPLIED BIOSYSTEMS LIABILITY FOR BREACH OF WARRANTY SHALL BE LIMITED TO A REFUND OF THE PURCHASE PRICE FOR SUCH PRODUCT. IN NO EVENT WILL APPLIED BIOSYSTEMS BE LIABLE FOR ANY OTHER DAMAGES, INCLUDING INCIDENTAL OR CONSEQUENTIAL DAMAGES, EVEN IF APPLIED BIOSYSTEMS HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

Term
You may terminate this Agreement by destroying all copies of the SOFTWARE and documentation. Applied Biosystems may terminate this Agreement if you fail to comply with any or all of its terms, in which case you agree to return to Applied Biosystems all copies of the SOFTWARE and associated documentation.

Miscellaneous
1. Failure to enforce any of the terms and conditions of this Agreement by either party shall not be deemed a waiver of any rights and privileges under this Agreement.
2. In case any one or more of the provisions of this Agreement for any reason shall be held to be invalid, illegal, or unenforceable in any respect, such invalidity, illegality, or unenforceability shall not affect any other provisions of this Agreement, and this Agreement shall be construed as if such invalid, illegal, or unenforceable provisions had never been contained herein.
3. This Agreement shall be construed and governed by the laws of the State of California.
4. This Agreement and the Applied Biosystems Sales Quotation constitute the entire agreement between Applied Biosystems and you concerning the SOFTWARE.

Contents

Software License and Warranty
  Applied Biosystems Software License and Limited Product Warranty
    Copyright
    License
    Restrictions
    Limited Warranty
    Limitation Of Liability
    Term
    Miscellaneous

1 Introduction
  Overview
    Introduction
    In This Chapter
  Registering Your Copy of AutoAssembler
    Introduction
    How to Register
  About AutoAssembler
    Introduction
    Using AutoAssembler
    Optional AutoAssembler Versions
      BioLIMS Option
      CAP Remote Option
      Server Option
    New Features
    Compatibility with Previous Releases
    Related Software Packages
      Factura
      Sequence Navigator
  Using This Manual
    Introduction
    Conventions Used in This Manual
  Technical Support
    To Reach Us on the Web
    Hours for Telephone Technical Support
    To Reach Us by Telephone or FAX…
    Documents on Demand
    To Reach Us by E-Mail
    Regional Offices, Sales and Services

2 System Requirements and Installation
  Overview
    Introduction
    In This Chapter
  Hardware and Software Requirements
    Introduction
    Required Computer System
    Supplied with AutoAssembler
    AutoAssembler Options
  Installing AutoAssembler Only
    Introduction
    Using the AutoAssembler Installer
    Files Installed
  Installing the BioLIMS Client Package, Including the AutoAssembler Software
    Introduction
    Before You Install
    To Install the BioLIMS Client Package
    To Do a Custom Installation
    To Remove the Installed Package
    Files Installed
      Application Files Installed
      System Files Installed
  Starting AutoAssembler
    Introduction
    To Start AutoAssembler for the First Time
    Allocating More Memory
  Configuring BioLIMS Access
    Introduction
    Configuring for Server Connection

3 Creating and Assembling a Project
  Overview
    Introduction
    In This Chapter
  Organizing Your Project
    Introduction
    Organizing a From Files Project
      Organizing a Large Project With Several Project Files
      Organizing a Networked Project
    Organizing a BioLIMS Project
      Naming a BioLIMS Project
    Missing Files
  Opening and Closing a Project
    Starting AutoAssembler
    Viewing the Project Window
      Contig List
      Sequence List
      Project Views
    Opening a New Project
    Opening an Existing Project
      From the Finder
      While Starting the AutoAssembler Program
      From Within the AutoAssembler Program
    Closing a Project
  Adding Sequences From Files
    Introduction
    From File and BioLIMS Projects
    Adding Sequences to a From Files Project
    Removing Sequences from a Project
  Adding Sequences From the BioLIMS Database
    Introduction
    In This Section
    Opening BioLIMS Access
    Displaying the Sequence Chooser Window
    Parts of the Window
    Collection Search Criteria
    Sequence Search Criteria
    Searching the BioLIMS Database
    Adding Sequences From BioLIMS
    Removing Sequences from a Project
  Viewing the Sequence List
    Introduction
    Changing the Information Displayed in the Sequence List
    Changing the Sort Order in the Sequence List
  Assembling Sequences
    Introduction
    Assembling by Local Algorithm
    Assembling Projects Using the Engine Options
      Assembly Parameters
      Engine Assembly
    Using a Server to Assemble Project
    FDF Parameters
    Setting Minimum Overlap and Percent Error
      Server Algorithm
      Local Algorithm
    The Assembled Project
      Contig Names
      Sequence Names
    Importing Assembled Projects
  Setting Up for AutoUpdating
    Introduction
    Opening BioLIMS Access
    Configuring AutoUpdating
    Changing and Adding Sequences
      Adding Sequences using AutoUpdating
    While the Project is Being Updated
    Turning Off AutoUpdating

4 Viewing the Consensus
  Overview
    Introduction
    In This Chapter
  Understanding the Project Window Views
    Introduction
    Layout View
      Identifying Sequences
      Displaying File Names
      Zooming In
    Alignment View
      Consensus Characters
      Viewing Sequences
    The Statistics View
      Displaying the Consensus
      Statistic View Parameters
    The Zoom Command
  Displaying Electropherograms
    Introduction
    Opening Electropherogram Displays
    Hiding Electropherogram Displays
    Changing Electropherogram Appearance
      Changing Horizontal Scale
      Changing Vertical Scale
      Changing Row Height
  Changing the Display Parameters
    Introduction
    Opening the Settings
    Changing Row Height and Vertical Scale
    Changing Minimum Separation
    Selecting Base Color
    Changing Consensus Characters
    Changing Threshold Value
    Changing Orientation Parameters
    Changing Network Parameters
  Manipulating Window Displays
    Introduction
    Arranging Multiple Windows
      Tiling
      Stacking
    Cloning the Project Window to See Multiple Views of the Data
  Locating Sequences
    Introduction
    Finding Sequences and Patterns
    Searching for Sequences

5 Editing the Project
  Overview
    Introduction
    In This Chapter
  Locating and Controlling Ambiguity in the Consensus
    Introduction
    Using the Views to Locate Problem Areas
    Finding Ambiguities Quickly
    Controlling Ambiguity in the Consensus
    Complementing a Contig
    Translating the Consensus to Protein Sequences
    Using an Electropherogram to Resolve Ambiguities
    Finding Ambiguous Areas
  Resolving Ambiguity in the Project Window
    Introduction
    Editing in the Consensus
    What Gets Saved When You Edit
    Keeping Track of Your Edits
    Selecting Bases or Sequence Segments
    Adding Bases
    Deleting Bases
    Replacing Bases
    Shifting Bases
    Editing Examples
    Editing the Valid Range of Data Used for Assembly
  Verifying Orientation and Redundancy
    Introduction
    Changing Statistic View Parameters
    Checking the Consensus

6 Viewing and Editing Sequences
  Overview
    Introduction
    In This Chapter
  Viewing and Editing Individual Sequences in Sequence Windows
    Introduction
    Opening the Sequence Window
    Viewing the Sequence Window
    Editing in the Sequence Window versus the Project Window
    Closing the Sequence Window
  Using the Annotation View
    The Annotation View
  Using the Electropherogram View
    Introduction
    Editing in the Electropherogram View
      Moving the Selection
      Changing Bases
      Adding Bases
  Using the Sequence View
    Introduction
    Editing Sequences
      Adding Bases
      Deleting Bases
      Changing Bases
  Using the Feature View
    Introduction
    Editing Feature Ranges and Markings
      Changing Features

7 Reassembling a Project
  Overview
    Introduction
    In This Chapter
  Reassembling with New or Changed Sequences
    Introduction
    Reassembling with New Sequences
    Reassembling with Changed Sequences
  Reassembling to Achieve Different Results
    Introduction
    Reassembling After Editing
    Reassembling After Changing Constraints
      Resetting Overlap Relationships
      Assembling Projects Without Constraints
    Reassembling After Changing Minimum Overlap and Percent Error
    Reassembling After Changing Engine Parameters

8 Saving and Printing in AutoAssembler
  Overview
    Introduction
    In This Chapter
  Saving your Work
    Introduction
    Saving the Project
    Project and Sequence Relationships
      Project Files
      Sequence Files
    Saving Individual Sequences
      Saving Sequences From the Project Window
      Saving Sequences From the Sequence Window
  Printing and Saving Assembly Reports
    Introduction
    Project Summary
    The Contig Summary
    Project Report
    Viewing Assembly Reports
    Saving Reports
    Printing Assembly Reports
  Printing and Copying the Views for Presentations
    Introduction
    Printing Project Window Views
    Printing Sequence Window Views
    Copying Project Window Views to Other Programs
    Copying a Sequence from the Sequence Window
  Creating Files for Use with Other Applications
    Introduction
    Building a Consensus Sequence
    Exporting a Consensus Sequence
    Exporting Sequences to Text Format
    AutoAssembler Layout Files

A AppleScript Dictionary
  Appendix Overview
    Introduction
    In This Appendix
  AppleScript Commands
    AutoAssembler Suite
    BioLIMS Scripts

B References
  Appendix Overview
    Introduction
    In This Appendix
  Algorithm References
    Sequence Alignment Algorithms
    Feature Tables

C Key Codes
  Appendix Overview
    Introduction
    In This Appendix
  Translation Tables
    Introduction
    IUPAC/IUB Codes
    Complements
    Universal Genetic Code
    Amino Acid Abbreviations

Glossary
Index

1 Introduction

Overview

Introduction
This chapter provides information on:
♦ The AutoAssembler™ software
♦ Using this manual
♦ How to get help if you need it

Before you begin, you should be familiar with the Product License and Warranty in the front of the manual.

In This Chapter
This chapter contains the following topics:

Topic                                     See Page
Registering Your Copy of AutoAssembler    1-2
About AutoAssembler                       1-3
Using This Manual                         1-9
Customer Support                          1-10

Registering Your Copy of AutoAssembler

Introduction
When you register your copy of the AutoAssembler software, you become eligible for telephone and field service support from Applied Biosystems that lasts for 90 days from the date of the first telephone support call. Registering also allows you to purchase upgrades to the software at a lower price than it would cost you to purchase new software. These privileges are only available if you return your registration card.

How to Register
To register your copy of the AutoAssembler software, fill out the registration card included in this package and return it to Applied Biosystems.

About AutoAssembler

Introduction
The AutoAssembler software allows you to quickly and efficiently assemble small pieces of data from ABI PRISM™ DNA Sequencing Analysis software (as well as data from other sources) into larger segments of data. The AutoAssembler software provides powerful tools for editing the sequences, including the ability to display constantly spaced electropherograms with the assembled sequences. You can build a consensus from the assembled sequences and export that consensus to use with other programs such as Sequence Navigator® software.

Use the AutoAssembler software in conjunction with the Factura program to clean up sequence data for analysis and alignment by identifying features you specify (such as vector and ambiguity ranges that are not to be used in assembly). Factura processes sequences in batches, speeding the cleanup process. Factura also provides tools for editing sequences and marking the identified features with color and underscoring.

Using AutoAssembler
Using AutoAssembler is an iterative process. You can build a project, assemble the sequences, view and edit the contig, and then add more sequences and reassemble. Figure 1-1 shows a typical path for using the AutoAssembler software.

Figure 1-1 AutoAssembler user path (flowchart boxes, in order): Create a project; Add sequences; Assemble the project; Review and edit the contig and sequences; Reassemble; Print or export the consensus.

Optional AutoAssembler Versions
AutoAssembler software is available in three optional configurations that expand the capabilities of the program through the use of a server. These options are purchased separately.

Note The client sides of the three options are included with the AutoAssembler installation disk as custom options. However, they will not work without the purchased server options.

BioLIMS Option
The BioLIMS system provides a relational database for sequences created by ABI PRISM DNA Sequencing Analysis software. This database accommodates multiple users and editions while preserving the original data.

With the BioLIMS option, you can store sequences for use in your AutoAssembler projects. Using AutoAssembler’s AutoUpdating feature, the BioLIMS database allows you to build a project from sequences stored on the server. AutoAssembler automatically updates and reassembles the project as database sequences are added or edited.

CAP Remote Option
You may also purchase the Remote Contig Assembly Program (CAP) version of AutoAssembler. Like the Server option, CAP Remote allows you to assemble projects on a UNIX server, making assembly faster for larger projects, and freeing your Macintosh® computer for use during assembly.

Server Option
The AutoAssembler software supports the Server with Fast Data Finder® (FDF) as a separate purchase option. In this optional configuration, AutoAssembler sends projects to a server for assembly.

New Features
For those who are familiar with prior versions of the AutoAssembler software, AutoAssembler 2.0 contains the major new features shown in Table 1-1.

Table 1-1 New AutoAssembler Features
Item: Description
AutoUpdating: With BioLIMS, AutoAssembler allows you to automatically update designated projects to include new or edited sequences as they are entered into the database.
AppleScripting Support: AutoAssembler now supports a wide variety of AppleScript® commands (see Appendix A for a list of supported commands).
User-Configurable Assembly Engines: With the engine assembly option, you can update AutoAssembler with new assembly algorithms by plugging in new algorithm engines.

Compatibility with Previous Releases
AutoAssembler 2.0 is fully compatible with projects created with AutoAssembler versions 1.0 and 1.4. This allows you to apply the improved features of AutoAssembler 2.0 to projects you have already created. However, projects created with AutoAssembler 2.0 are not compatible with earlier versions of AutoAssembler.

Related Software Packages
The following software packages improve the capabilities of AutoAssembler.

Factura
Before you assemble sequences, you should remove vector sequences and ambiguities from the sequence data. The Factura software uses the parameters you specify to identify the vector and ambiguous sequence areas, assign International Union of Biochemistry (IUB) codes to ambiguous bases, and mark the confidence range. The program performs these operations on batches of sequences, speeding sequence cleanup time considerably.

After you have identified clean data using the Factura program, you can import the sequences into an AutoAssembler project, which determines the valid range of data for assembly based on the features identified in Factura. In the project window, you can edit and assemble multiple sequences and view their electropherograms simultaneously.
When you are satisfied with the assembled project, you can build a consensus and export it to a file for use with other applications. You can also print (or copy and paste) various parts of the project windows for reports or presentations.

Figure 1-2 shows how the AutoAssembler and Factura programs work together to create completed project files.

Figure 1-2 AutoAssembler and Factura interaction (Factura/AutoAssembler flowchart). Box labels: Sequence files; Process Sequences (Identify vector sequence, Identify ambiguous regions, Identify confidence range, Identify heterozygotes); Print (Print sequences, including electropherograms; Print features, annotation); Save a copy; Sequences; BioLIMS Database; Import to Project (Newly created or imported); Assemble; Edit; Update/Reassemble; Build/Save consensus; Save/Export to; Print; Save; Sequence files.

Sequence Navigator
The Sequence Navigator program runs on Macintosh computers and addresses the unique needs of researchers who compare sequences to identify interesting sequence variations. It is used for mutation identification/heterozygote screening of sequences (such as p53 and HIV) and mitochondrial DNA. The program incorporates five powerful algorithms for pairwise or multiple alignment of DNA and protein sequences. Sequence Navigator software can be used in conjunction with Factura to identify heterozygote base positions and quickly clean up sequences before aligning them.

Using This Manual

Introduction
This manual includes an index, glossary, list of topics for each section, and numerous cross-references to help you find the information you need.

Conventions Used in This Manual
The following words and styles draw your attention to specific details of the information presented in this manual:

Note This is used to call attention to useful information.

IMPORTANT This information is indicated because it is necessary for proper operation of the software.

CAUTION This word informs you that damage to the application or loss of data could occur if you do not comply with this information.

Technical Support

To Reach Us on the Web
The Applied Biosystems web site address is:
http://www.appliedbiosystems.com/techsupport
We strongly encourage you to visit our web site for answers to frequently asked questions, and to learn more about our products. You can also order technical documents and/or an index of available documents and have them faxed or e-mailed to you through our site (see the “Documents on Demand” section below).

Hours for Telephone Technical Support
In the United States and Canada, technical support is available at the following times.

Product               Hours
Chemiluminescence     9:00 a.m. to 5:00 p.m. Eastern Time
LC/MS                 9:00 a.m. to 5:00 p.m. Pacific Time
All Other Products    5:30 a.m. to 5:00 p.m. Pacific Time

See the “Regional Offices Sales and Service” section below for how to contact local service representatives outside of the United States and Canada.

To Reach Us by Telephone or Fax in North America
Call Technical Support at 1-800-831-6844, and select the appropriate option (below) for support on the product of your choice at any time during the call. (To open a service call for other support needs, or in case of an emergency, press 1 after dialing 1-800-831-6844.)

For Support On This Product: Dial 1-800-831-6844, and...; FAX
ABI PRISM® 3700 DNA Analyzer: Press 8; FAX 650-638-5981
ABI PRISM® 3100 Genetic Analyzer: Press 26; FAX 650-638-5891
DNA Synthesis: Press 21; FAX 650-638-5981
Fluorescent DNA Sequencing: Press 22; FAX 650-638-5891
Fluorescent Fragment Analysis (includes GeneScan® applications): Press 23; FAX 650-638-5891
Integrated Thermal Cyclers: Press 24; FAX 650-638-5891
BioInformatics (includes BioLIMS™, BioMerge™, and SQL GT™ applications): Press 25; FAX 505-982-7690
PCR and Sequence Detection: Press 5, or call 1-800-762-4001, and press 1 for PCR, or 2 for Sequence Detection; FAX 240-453-4613
FMAT: Telephone 1-800-899-5858, and press 1, then press 6; FAX 508-383-7855
Peptide and Organic Synthesis: Press 31; FAX 650-638-5981
Protein Sequencing: Press 32; FAX 650-638-5981
Chemiluminescence: Telephone 1-800-542-2369 (U.S. only), or 781-275-8581 (Tropix), 9:00 a.m. to 5:00 p.m. ET; FAX 1-781-271-0045 (Tropix)
LC/MS: Telephone 1-800-952-4716, 9:00 a.m. to 5:00 p.m. PT; FAX 650-638-6223

Documents on Demand
Free 24-hour access to Applied Biosystems technical documents, including MSDSs, is available by fax or e-mail. You can access Documents on Demand through the internet or by telephone:

If you want to order through the internet, then:
Use http://www.appliedbiosystems.com/techsupport
You can search for documents to order using keywords. Up to five documents can be faxed or e-mailed to you by title.

If you want to order by phone from the United States or Canada, then:
a. Call 1-800-487-6809 from a touch-tone phone. Have your fax number ready.
b. Press 1 to order an index of available documents and have it faxed to you. Each document in the index has an ID number. (Use this as your order number in step “d” below.)
c. Call 1-800-487-6809 from a touch-tone phone a second time.
d. Press 2 to order up to five documents and have them faxed to you.

If you want to order by phone from outside the United States or Canada, then:
a. Dial your international access code, then 1-858-712-0317, from a touch-tone phone. Have your complete fax number and country code ready (011 precedes the country code).
b. Press 1 to order an index of available documents and have it faxed to you. Each document in the index has an ID number. (Use this as your order number in step “d” below.)
c. Call 1-858-712-0317 from a touch-tone phone a second time.
d. Press 2 to order up to five documents and have them faxed to you.

To Reach Us by E-Mail
Contact technical support by e-mail for help in the following product areas.

For this product area: Use this e-mail address
Chemiluminescence: [email protected]
Genetic Analysis: [email protected]
LC/MS: [email protected]
PCR and Sequence Detection: [email protected]
Protein Sequencing, Peptide and DNA Synthesis: [email protected]

Regional Offices Sales and Service
If you are outside the United States and Canada, you should contact your local Applied Biosystems service representative.

The Americas
United States (Applied Biosystems, 850 Lincoln Centre Drive, Foster City, California 94404): Tel: (650) 570-6667, (800) 345-5224; Fax: (650) 572-2743
Latin America (Del.A. Obregon, Mexico): Tel: (305) 670-4350; Fax: (305) 670-4349

Europe
Austria (Wien): Tel: 43 (0)1 867 35 75 0; Fax: 43 (0)1 867 35 75 11
Hungary (Budapest): Tel: 36 (0)1 270 8398; Fax: 36 (0)1 270 8288
Belgium: Tel: 32 (0)2 712 5555; Fax: 32 (0)2 712 5516
Italy (Milano): Tel: 39 (0)39 83891; Fax: 39 (0)39 838 9492
Czech Republic and Slovakia (Praha): Tel: 420 2 61 222 164; Fax: 420 2 61 222 168
The Netherlands (Nieuwerkerk a/d IJssel): Tel: 31 (0)180 331400; Fax: 31 (0)180 331409
Denmark (Naerum): Tel: 45 45 58 60 00; Fax: 45 45 58 60 01
Norway (Oslo): Tel: 47 23 12 06 05; Fax: 47 23 12 05 75
Finland (Espoo): Tel: 358 (0)9 251 24 250; Fax: 358 (0)9 251 24 243
Poland, Lithuania, Latvia, and Estonia (Warszawa): Tel: 48 (22) 866 40 10; Fax: 48 (22) 866 40 20
France (Paris): Tel: 33 (0)1 69 59 85 85; Fax: 33 (0)1 69 59 85 00
Portugal (Lisboa): Tel: 351 (0)22 605 33 14; Fax: 351 (0)22 605 33 15
Germany (Weiterstadt): Tel: 49 (0) 6150 101 0; Fax: 49 (0) 6150 101 101
Russia (Moskva): Tel: 7 095 935 8888; Fax: 7 095 564 8787
Spain (Tres Cantos): Tel: 34 (0)91 806 1210; Fax: 34 (0)91 806 1206
South Africa (Johannesburg): Tel: 27 11 478 0411; Fax: 27 11 478 0349
Sweden (Stockholm): Tel: 46 (0)8 619 4400; Fax: 46 (0)8 619 4401
United Kingdom (Warrington, Cheshire): Tel: 44 (0)1925 825650; Fax: 44 (0)1925 282502
Switzerland (Rotkreuz): Tel: 41 (0)41 799 7777; Fax: 41 (0)41 790 0676
South East Europe (Zagreb, Croatia): Tel: 385 1 34 91 927; Fax: 385 1 34 91 840
Middle Eastern Countries and North Africa (Monza, Italia): Tel: 39 (0)39 8389 481; Fax: 39 (0)39 8389 493
Africa (English Speaking) and West Asia (Fairlands, South Africa): Tel: 27 11 478 0411; Fax: 27 11 478 0349
All Other Countries Not Listed (Warrington, UK): Tel: 44 (0)1925 282481; Fax: 44 (0)1925 282509

Japan
Japan (Hatchobori, Chuo-Ku, Tokyo): Tel: 81 3 5566 6100; Fax: 81 3 5566 6501

Eastern Asia, China, Oceania
Australia (Scoresby, Victoria): Tel: 61 3 9730 8600; Fax: 61 3 9730 8799
Malaysia (Petaling Jaya): Tel: 60 3 758 8268; Fax: 60 3 754 9043
China (Beijing): Tel: 86 10 6238 1156; Fax: 86 10 6238 1162
Singapore: Tel: 65 896 2168; Fax: 65 896 2147
Hong Kong: Tel: 852 2756 6928; Fax: 852 2756 6968
Taiwan (Taipei Hsien): Tel: 886 2 2698 3505; Fax: 886 2 2698 3405
Korea (Seoul): Tel: 82 2 593 6470/6471; Fax: 82 2 593 6472
Thailand (Bangkok): Tel: 66 2 719 6405; Fax: 66 2 319 9788

2 System Requirements and Installation

Overview

Introduction
This chapter provides:
♦ Hardware and software requirements for use of the AutoAssembler software
♦ Instructions for installing the AutoAssembler software
♦ Instructions for installing the BioLIMS Client Package and the AutoAssembler software from the BioLIMS CD-ROM (optional)
♦ Information about using your registration code and increasing the memory available for AutoAssembler
♦ Instructions for connecting to the BioLIMS database (optional)

In This Chapter
This chapter contains the following topics:

Topic                                                                          See Page
Hardware and Software Requirements                                             2-2
Installing AutoAssembler Only                                                  2-3
Installing the BioLIMS Client Package, Including the AutoAssembler Software    2-8
Starting AutoAssembler                                                         2-14
Configuring BioLIMS Access                                                     2-17

Hardware and Software Requirements

Introduction
This section describes the minimum hardware and software requirements for running AutoAssembler.

Required Computer System
Table 2-1 describes the computer system required to run AutoAssembler. These are the minimum requirements. In general, the more memory, the larger the screen size, and the more processing power you have, the better.

Table 2-1 Required Computer System
System Component: Requirements
CPU: A PowerPC Mac OS computer. You will benefit from using the fastest computer available.
Operating System: Mac OS version 7.5.3 or later with Open Transport 1.1 or later.
Monitor: A 17-inch monitor or larger is recommended, although a monitor size of 640 x 480 pixels can be used. You will benefit from having a larger monitor.
Disk Space: A minimum of 5.6 MB free disk space.
Memory: The suggested memory allocation is 9.9 MB of random-access memory (RAM).

Supplied with AutoAssembler
The AutoAssembler installation disk contains the AutoAssembler program and the client sides of the AutoAssembler options. You may install the client sides of any option, but they will not run without the purchased server side of the option packages.

AutoAssembler Options
If you purchase an AutoAssembler option, the package may also include one of the following server applications:
♦ AutoAssembler Server Install
♦ AutoAssembler CAP Remote Install

Note The BioLIMS Client Package (including the AutoAssembler software) is installed from a CD-ROM disc (see “Installing the BioLIMS Client Package, Including the AutoAssembler Software” on page 2-8).

Installing AutoAssembler Only

Introduction
It is important that you disable any virus protection software during the installation process. After installation is complete, you should restart your computer to re-enable any virus protection software.

Using the AutoAssembler Installer
If you are installing AutoAssembler for the Macintosh computer only, follow the steps described below. If you are installing AutoAssembler software with an AutoAssembler option, follow the directions provided in the Installation Procedure you received with the software.

To install AutoAssembler:

Step  Action
1  If you have not yet done so, disable any virus protection software on your hard disk.
2  Insert the AutoAssembler Install disk into the 3.5-inch disk drive of your computer. The files on the disk are displayed on your screen.
3  Click the Installer icon.
4  When the Installer splash screen appears, click Continue. The following dialog box appears:
5  Use the pop-up menu in the lower section of the dialog box to select the hard drive and folder on which to install AutoAssembler.
6  Click Install to install AutoAssembler.
   If you need to perform a custom installation, select Custom from the pop-up menu in the upper-left corner of the dialog box. The following options appear:
   Select the checkboxes of the options you wish to install, and click Install.
   Note Clicking the “I” buttons to the right of the installation options provides information on the particular option. Clicking the Read Me button accesses the Read Me file included with the software.
7  When prompted, insert the remaining disks. When the installation is complete, the following dialog box appears:
8  Click the Restart button, unless you want to perform additional installations.
   Note You do not need to restart in order to use the AutoAssembler program. Restarting reinstates your virus protection and cleans up any temporary files created by the installation procedure.

IMPORTANT Before you use the programs, open the AutoAssembler folder and read the Read Me files.
To open a Read Me file, double-click the icon.

Files Installed
The folders and files should now be installed on your hard disk as shown in Figure 2-1. The files are briefly described in Table 2-2 and Table 2-3.

Note Some of these files are only installed with custom installation options (see Table 2-3).

Figure 2-1 Location of installed files for the AutoAssembler program

Table 2-2 lists the files contained in the AutoAssembler folder after installation.

Table 2-2 Files Installed in the AutoAssembler Folder
Item: Description
AutoAssembler 2.0: The AutoAssembler program. Double-click the icon shown in Figure 2-1 to start the AutoAssembler program.
Engines Folder: Contains CAP assembly engine.
About AutoAssembler 2.0: Contains information about the AutoAssembler program. Read the file before starting AutoAssembler.
Assembly Data: A folder used by the AutoAssembler program to store temporary files that result from assembling sequences by the Engine option. This folder is empty when the program is installed, and will contain no more than five temporary files of a single type at any time.

Table 2-3 shows the additional files and folders that will be added if you ordered one of the AutoAssembler options.

Table 2-3 Optional Installation
Option: Additional Folders or Files
SAServer: ABI Folder located in the system folder contains the SAServer.config file.
CAP Remote: CAP Remote engine added to the Engines folder located in the AutoAssembler folder.

Installing the BioLIMS Client Package, Including the AutoAssembler Software

Introduction
The AutoAssembler application is shipped on a CD-ROM disc as part of the BioLIMS Client Package. You must have purchased one BioLIMS Client Package for each Macintosh on which AutoAssembler is installed.
This section describes:
♦ Complete installation of the BioLIMS Client Package (“To Install the BioLIMS Client Package” below)
♦ Custom installation (“To Do a Custom Installation” on page 2-9), for example, to install the AutoAssembler application alone
♦ Removal of the installation (“To Remove the Installed Package” on page 2-10)

Before You Install
♦ Check that you have at least 56 MB of free disk space to accommodate the BioLIMS applications.
♦ Quit from all applications that you may have open.
♦ Turn off any virus protection software that you may have running.

To Install the BioLIMS Client Package
Follow these steps to install all of the BioLIMS Client Package onto your Macintosh:

To install the BioLIMS Client Package:

Step  Action
1  Insert the BioLIMS Client Package CD-ROM disc. The BioLIMS Client Package window opens automatically.
2  Find the BioLIMS Client Installer icon in the BioLIMS Client Package window and double-click to open the BioLIMS Client Installer.
3  Click Continue.
4  This dialog box contains important information that you should read. After you have read it, click Continue to open the BioLIMS Client Installer window. You may print or save the contents if you want.
5  To install the whole BioLIMS Client Package, use the default Easy Install described here. For information about custom installation, see “To Do a Custom Installation” on page 2-9. For information about removing an installed package, see “To Remove the Installed Package” on page 2-10.
6  Use the Switch Disk button or the Install Location pop-up menu to choose the disk on which to install the BioLIMS Client Package. If the software cannot be installed on the chosen disk, a warning appears in the Installer window.
7  Choose the Select Folder item on the Install Location pop-up menu. A Macintosh browser box appears.
8  Use the browser box to select a folder in which to install the BioLIMS Client Package applications.
9  Click Install to begin the installation.
10  At the conclusion of the installation, you should Restart your computer.

To Do a Custom Installation
You may not want to install all of the BioLIMS Client Package. For example, you might want to install only the AutoAssembler application on your Macintosh.

To complete a custom installation:

Step  Action
1  Follow steps 1 to 5 in the procedure “To Install the BioLIMS Client Package” on page 2-8.
2  Select Custom Install from the pop-up menu at the top left of the window.
3  Check the names of all the applications that you want to install. For information about the individual applications, click the information button to the right of the application name to display an information dialog box.
4  Use the Switch Disk button or the Install Location pop-up menu to choose the disk on which to install the selected applications. Be sure that there is enough space on the disk to accommodate your chosen applications. The Installer window reports both the space available on the disk and the approximate disk space required for the selected applications.
5  Choose the Select Folder item on the Install Location pop-up menu. A Macintosh browser box appears.
6  Use the browser box to select a folder in which to install the selected applications.
7  Click Install to begin the installation of the selected applications.
8  At the conclusion of the installation, you should Restart your computer.

To Remove the Installed Package
If you decide to remove the BioLIMS Client Package from your Macintosh, follow these steps. The Remove process deletes all the applications installed in the BioLIMS folder and also the files and folders placed in the System folder by the installer.

Note If you have moved BioLIMS files or folders from their original installed locations, they may not be found and deleted by the remove operation. Also, any files that have been added to the application folders, such as those created when the applications are run, are not deleted by the remove operation.

CAUTION If you have installed both the BioLIMS Instrument Package and the BioLIMS Client Package on the same Macintosh, you should not use Remove unless you intend to delete both the Client and the Instrument Packages. This is because the Remove process deletes files common to both packages, including files that are in the System Folder.

To remove the BioLIMS Client Package files:

Step  Action
1  Follow steps 1 to 5 in the procedure “To Install the BioLIMS Client Package” on page 2-8.
2  Select Remove from the pop-up menu at the top left of the window.
3  Choose the Select Folder item on the Install Location pop-up menu. A Macintosh browser box appears.
4  Use the browser box to locate the folder that contains the BioLIMS folder.
5  Click Remove to begin the removal of the BioLIMS Client Package applications on your disk.
6  At the conclusion of the remove operation, an alert box appears telling you whether or not the remove was successful.
   Note If files have been moved or added to the BioLIMS folder, the remove operation will be reported as unsuccessful; you should then examine and delete the remaining files in the BioLIMS folder yourself.

Files Installed
The BioLIMS Client Package installs files in a folder called BioLIMS and also installs some files in your System Folder.

Application Files Installed
The BioLIMS Client applications are placed in four folders in the main BioLIMS folder:

This folder…: Contains…
Sequencing Analysis: the Sequencing Analysis and Basecaller applications, the About Sequencing Analysis text file, and other folders associated with the Sequencing Analysis application
Factura: the Factura application, the About Factura text file, and other files and folders associated with the Factura application
AutoAssembler: the AutoAssembler application, the About AutoAssembler text file, the Assembly Data folder, and the Engines folder
BioLIMS Extras: the Sample2DB, Collections Manager, and SimpleText applications, the About Sample2DB and About Collections Manager text files, the Scripts folder, and the Sybase folder containing the interfaces and other database-related files

IMPORTANT Before running an application for the first time, read the About text file for the application. Important information not contained in the manual may be found in the About text file.

System Files Installed
The installer places these files in the Macintosh System Folder:

Item                  Folder Location   Description
Sybase Config         Control Panels    SybaseConfig control panel (see page 2-17)
libblk                Extensions        Sybase library extension file
libcomn               Extensions        Sybase library extension file
libcs                 Extensions        Sybase library extension file
libct                 Extensions        Sybase library extension file
libctb                Extensions        Sybase library extension file
libintl               Extensions        Sybase library extension file
libsybdb              Extensions        Sybase library extension file
libtcl                Extensions        Sybase library extension file
libtcp                Extensions        Sybase library extension file
SequenceChooserLib    Extensions        BioLIMS library extension file
ABI Folder            System Folder     Mobility, comb, & matrix files

Starting AutoAssembler

Introduction
Each AutoAssembler package contains a card with a unique registration code.
The first time you use the AutoAssembler program, you are asked to enter this code. AutoAssembler then verifies the code. If you use the program on a different computer, you must re-enter the code.

IMPORTANT You cannot use the same registration code on more than one computer at a time.

To Start AutoAssembler for the First Time
This procedure is only necessary the first time you open AutoAssembler on a particular Macintosh computer.

To open AutoAssembler for the first time:

Step  Action
1  In the Finder, double-click the AutoAssembler icon. The first time you do so, the following registration dialog box appears:
2  Enter your name, organization, and registration code (located on the product registration card).
3  Click OK.

Allocating More Memory
When you start AutoAssembler, the program sets aside a certain amount of RAM for its own use. AutoAssembler’s default RAM size allows you to assemble a project containing as many as 1000 sequences with an average length of 500 bases. If your projects are considerably bigger than this, you may want to give AutoAssembler and the CAP engine bigger memory partitions. When you assign the CAP engine extra memory, you may speed up assembly.

To allocate more memory:

Step  Action
1  In the Finder, click the AutoAssembler icon and choose Get Info from the File menu.
   Note Do not double-click the icon. The program must remain closed.
   The following dialog box appears:
2  Type a larger number in the “Preferred size” entry field in the lower-right corner.
   Note Add memory in 1 MB increments until your memory problem is solved.
3  Close the Info dialog box. When you start the program, the Finder will allocate the amount of memory you have indicated, if it is available.
Configuring BioLIMS Access

Introduction

The BioLIMS system provides a database for sequences created by ABI PRISM DNA Sequencing Analysis software. This database is located on a server, and accommodates multiple users and editions while preserving the original data.

Configuring for Server Connection

Before you can access the BioLIMS database, you must configure the SybaseConfig control panel.

IMPORTANT  Anytime you change the BioLIMS database server name, its IP address or host and domain name, or the port number, you must repeat this procedure.

To configure the SybaseConfig control panel:

Step  Action
1     Find the interfaces file in the Sybase folder in the BioLIMS Extras folder.
2     Open the file with SimpleText, or a similar text editing application.
3     Find the lines:

          SYBASE
              query MacTCP mac_ether neuron.apldbio.com 2500

      and edit them:
      ♦ Replace SYBASE with the name of the database server.
      ♦ Replace neuron.apldbio.com with the IP address or host and domain name of the server machine.
      ♦ Replace 2500 with the port number.
      You can find this information in the interfaces file on the Sybase server, or your BioLIMS database administrator can provide you with the information.
4     If you have access to more than one server, duplicate the two lines and edit them for the other servers.
      For example, for two servers, one called SYBASE and one called SERVER2, the interfaces file might look like this:

          SYBASE
              query MacTCP mac_ether neuron.apldbio.com 2500
          SERVER2
              query MacTCP mac_ether 192.135.191.128 2025

5     Save and close the interfaces file.
6     Open the SybaseConfig control panel.
      This control panel is found in the Control Panels folder in the System folder.
7     The first time you open the SybaseConfig control panel, a file browser opens automatically.
      If a file browser does not open immediately, click the Interfaces Files button to open a file browser.
8     Use the file browser to locate and open the interfaces file that you edited in the steps above.
9     Set the Default Language pop-up menu to us_english.
10    Close the SybaseConfig control panel.

3  Creating and Assembling a Project

Overview

Introduction

To assemble sequences using the AutoAssembler software, you must create a project, which maintains information about sequences and the contigs that result when sequences are assembled. The project is displayed in the project window, which allows you to easily edit and assemble the sequences. Saving the project (described on page 8-2) stores the information in a project file for future use.

In This Chapter

This chapter contains the following topics:

Topic                                          See Page
Organizing Your Project                        3-2
Opening and Closing a Project                  3-6
Adding Sequences From Files                    3-9
Adding Sequences From the BioLIMS Database     3-12
Viewing the Sequence List                      3-25
Assembling Sequences                           3-29
Setting Up for AutoUpdating                    3-43

Organizing Your Project

Introduction

AutoAssembler uses the following two types of projects:
♦ From Files–Contain only sequences from the computer AutoAssembler is running on, or from a non-BioLIMS server.
♦ BioLIMS–Contain only sequences from BioLIMS database collections.

Organizing a From Files Project

When you start a From Files project, you should consider how to store the sequences so that they remain accessible to the project at all times. Make sure you keep your sequences in the same relative position to the project file with which they are associated.
Otherwise, AutoAssembler may not be able to locate the sequences when you try to open them from within the project.

If your assembly project requires only one project file and a few related sequences, maintain the project sequences in a folder inside the project folder, as shown in Figure 3-1. If you move or archive the project folder, the project file and sequence files remain in the same relationship to each other.

Figure 3-1 Example of simple project organization

In this configuration, the project file and related sequences move together if you move or archive the project.

Organizing a Large Project With Several Project Files

If you have a large number of sequences and want to create several projects to assemble them, store all the related projects, along with their sequences, in a single folder (see Figure 3-2).

Figure 3-2 Large project organization

In this example, any of the four project files can contain sequences from any of the sequence folders. If you move or archive the Cosmid folder, all the sequences and project files remain in the same relative position.

Organizing a Networked Project

If you are working on a network server other than the BioLIMS database and share sequences with other people, it is important that the sequences remain on the same volume. If the sequences are moved to another disk drive, another server, or another partition of the same disk, AutoAssembler will not be able to locate them when you open a related project file.

If you are using AutoAssembler with the BioLIMS database, the sequences always remain accessible to the respective project. In addition, new sequences can be automatically added to the project (see “Setting Up for AutoUpdating” on page 3-43).

Organizing a BioLIMS Project

The BioLIMS database keeps track of all sequences and changes made to the sequences by all users connected to the server.
For this reason, no special precautions are necessary to maintain links to BioLIMS sequences. However, the BioLIMS access must be open in order to view electropherograms or edit sequence data. If a connection has not been established, the BioLIMS access dialog box opens automatically when you attempt to access a sequence (see “Opening BioLIMS Access” on page 3-13).

Note  You cannot mix local files and sequences from BioLIMS.

Naming a BioLIMS Project

The AutoAssembler AutoUpdating feature relies on the name of the project to identify the collection that contains the correct sequences. For example, a project named “Project 1” assigned to be autoupdated will have all sequences in a collection named “Project 1” automatically added and updated.

Note  A BioLIMS project must have the same name as a collection on the database if you want to use AutoAssembler’s AutoUpdating feature.

Missing Files

If you move your sequences out of position relative to the project file with which they are associated, you can still open the project and assemble it. However, if you try to open the sequence in the sequence window, or display a sequence’s electropherogram, the following dialog box appears:

This dialog box indicates that the sequence is no longer in the same place in relation to the project file. Use this dialog box to find the sequence. If you cannot find the sequence, click Cancel, and the following dialog appears:

Click Yes to open the project without electropherogram data from the missing file.

To re-establish the link between the project and the sequences, re-add the sequences to the project. This provides the AutoAssembler software with the new relative path between the project and the sequence files.

Opening and Closing a Project

Starting AutoAssembler

To start AutoAssembler, double-click the AutoAssembler icon.

Note  The first time you start AutoAssembler, you must enter a registration code.
Refer to “Starting AutoAssembler” on page 2-14 for specific instructions about starting AutoAssembler for the first time.

Viewing the Project Window

With AutoAssembler open, you can create a new, blank project window by selecting New from the File menu (Figure 3-3). To change the shape and size of the window, drag the size box in the bottom right corner.

Figure 3-3 The empty project window

Callouts in the figure:
♦ Indicates whether or not the sequences in the project are from the BioLIMS database (once the first sequence is added, the project type is assigned, and cannot be changed)
♦ Sequence names and information appear in the sequence list
♦ After assembly, the contig names appear in the contigs list
♦ After assembly, a graphic display of assembly results appears in the lower pane of the project window; use these buttons to change the graphic view

Contig List

After assembly, the upper-left pane of the project window lists each contig in the project, as well as an Unassembled list. The Unassembled list contains the names of sequences that have just been added to the project, that do not have any overlaps, or that have only weak relationships with other sequences in the project.

Sequence List

The upper-right pane of the project window identifies sequences associated with the project. In an assembled project, you can select a contig or the Unassembled list in the upper-left pane to see the relevant sequences in the right pane.

Project Views

After assembly, the lower pane of the project window shows a graphic display of the results. See “Understanding the Project Window Views” on page 4-2.

Opening a New Project

To open a new project, select New from the File menu. A new, blank project opens. You can have several projects open at a time.

Opening an Existing Project

You can open a previously created project in one of the following three ways:

From the Finder

Project files are distinguished with the icon shown here.
When you double-click a project file icon, the AutoAssembler program automatically starts, if it is not already running. The program displays a project window showing the project just as it was last saved to the file.

While Starting the AutoAssembler Program

If you press the Option key as you double-click the AutoAssembler icon, the program starts and a standard file dialog box automatically appears, allowing you to select the file you want to open.

From Within the AutoAssembler Program

If you are currently working in the AutoAssembler program, choose Open from the File menu. A standard dialog box allows you to select the file you want to open. Alternatively, go to the Finder desktop and double-click the project file icon.

Closing a Project

Save the project to a project file before you close the project window. See Chapter 8, “Saving and Printing in AutoAssembler,” for instructions on saving.

To close the project window:

Step  Action
1     Close the project window in one of the following three ways:
      ♦ Click the Close box in the upper-left corner
      ♦ Press Command-W
      ♦ Choose Close from the File menu
2     If you have modified the project and have not saved the changes, a dialog box prompts you to save the changes into the project file:
      ♦ Click Don’t Save to continue the close operation without saving changes. In this case, the project file remains as it was when you last saved it.
      ♦ Click Cancel to discontinue the close operation.
      ♦ Click Save to save the changes.
Adding Sequences From Files

Introduction

You can add sequences to a project from several types of files:
♦ Text files that you have created or exported from other applications
♦ Files created by ABI PRISM DNA Sequencing Analysis software
♦ Files from existing Inherit-accessed databases

The AutoAssembler software copies a minimum amount of information from the sequence source file into the project and maintains a reference to the source file. AutoAssembler preserves the integrity of the data in the project by checking the system modification date of each source file in the project.

Note  A single sequence can be included in more than one project, but the file name of each sequence included in a single project must be unique within the project.

From File and BioLIMS Projects

If you purchased the BioLIMS option with AutoAssembler, projects may be designated as “From Files” or “BioLIMS.” A project acquires this designation based on the first sequence that is added to it. After that, only files of that type may be added. For example, if you added a sequence from BioLIMS, then the project will be designated a “BioLIMS” project, and only sequences from BioLIMS can be added. All command selections change to reflect this. For example, the command Add Sequences in the Project menu becomes Add Sequences from BioLIMS, and so forth.

If you do not have the BioLIMS option, all your projects will be From Files.

To add sequences from the BioLIMS database, see “Adding Sequences From the BioLIMS Database” on page 3-12.

Adding Sequences to a From Files Project

When you add a sequence to a From Files project, the sequence’s name and information are displayed below those of any other sequences in the upper-right pane of the project window. You can add a single sequence, individual sequences from various folders, or a group of sequences from one folder.
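The two rules stated above (the first sequence added fixes the project type, and file names must be unique within a project) can be pictured with a small model. This is only a sketch of the behavior the manual describes, not AutoAssembler code; the class, method, and field names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Project:
    """Toy model of a project; the class and its fields are hypothetical."""
    kind: Optional[str] = None                       # set by the first sequence added
    names: Set[str] = field(default_factory=set)     # sequence file names in the project

    def add_sequence(self, name: str, source: str) -> None:
        if source not in ("From Files", "BioLIMS"):
            raise ValueError(f"unknown source: {source}")
        # Once the type is assigned, only sequences of that type may be added.
        if self.kind is not None and source != self.kind:
            raise ValueError(f"a {self.kind} project cannot accept a {source} sequence")
        # Names must be unique within a single project.
        if name in self.names:
            raise ValueError(f"sequence name already in project: {name}")
        self.kind = source
        self.names.add(name)


project = Project()
project.add_sequence("clone01", "From Files")
print(project.kind)  # From Files

try:
    project.add_sequence("clone02", "BioLIMS")
except ValueError as err:
    print(err)  # a From Files project cannot accept a BioLIMS sequence
```

The same sequence name could still be added to a second `Project` instance, which mirrors the note that one sequence can belong to more than one project.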
To add a single sequence or a group of sequences:

Step  Action
1     Choose Add Sequence(s) from the Project menu.
      The following dialog box appears:
      Note  Choose Add Multiple from the Project menu to select multiple files from different folders.
2     Select the “File type” checkboxes (“3XX” sample files, “TEXT,” or “Inherit”).
      Note  The file list shows only files of the type selected.
3     Add a file or files in one of the following ways:
      ♦ To add only one file, double-click the filename, or select the file and click Add.
      ♦ To add all files of the chosen types that are in the open folder, click Add All.
      A progress indicator appears while the sequences are being added:
      If necessary, repeat Step 2 and Step 3 to add additional files.

Note  Save the project file at this point in order to preserve a copy of the unassembled project.

The upper-right pane of the new project shows information about the individual sequences. To specify how much information is displayed and the sort order of the sequences, see “Introduction” on page 3-25.

Removing Sequences from a Project

If necessary, you can remove an extraneous sequence from the list of sequences in a project. The sequence is not deleted from your hard disk; the file is simply removed from the current project.

To remove a sequence from the project:

Step  Description
1     Select a contig (or the Unassembled list).
2     In the sequence list, select the sequence you want to remove.
3     Choose Remove Sequence from the Project menu.

IMPORTANT  This command cannot be undone. Once you have removed a sequence from the project window, you cannot use Undo to replace it. You must add it again. The ID number assigned to a removed sequence is not used again in the same project.

Note  Using Cut, Delete, or Clear removes characters from the selected sequence, but does not remove the sequence itself.
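The statement that a removed sequence's ID number is not used again can be illustrated with a counter that only moves forward. The sketch below is a toy model under stated assumptions: the class name is hypothetical, and the choice of IDs starting at 1 and increasing by one is assumed rather than documented.

```python
import itertools


class SequenceList:
    """Toy model of a project's sequence list; all names here are hypothetical."""

    def __init__(self):
        self._next_id = itertools.count(1)   # assumption: IDs start at 1 and increase
        self.sequences = {}                  # ID number -> sequence name

    def add(self, name):
        seq_id = next(self._next_id)
        self.sequences[seq_id] = name
        return seq_id

    def remove(self, seq_id):
        # Removal drops the entry from the project only. The counter is never
        # rewound, so the removed ID is not handed out again.
        del self.sequences[seq_id]


seqs = SequenceList()
first = seqs.add("clone01")
second = seqs.add("clone02")
seqs.remove(second)
again = seqs.add("clone02")    # re-adding the same sequence

print(first, second, again)    # 1 2 3
```

Re-adding a removed sequence therefore gives it a new ID, which is consistent with the instruction that a removed sequence must be added again rather than restored with Undo.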
Adding Sequences From the BioLIMS Database

Introduction

Projects that are to be populated with sequences from the BioLIMS database must be designated as BioLIMS projects (see “Organizing a BioLIMS Project” on page 3-4). If you have not already configured the SybaseConfig control panel, you must do so before establishing a connection with the database (see “Configuring BioLIMS Access” on page 2-17).

The interface you use to access the BioLIMS database is called the Sequence Chooser window. The Sequence Chooser window is common to the following BioLIMS applications:
♦ Sample2DB
♦ Factura
♦ AutoAssembler
♦ Sequencing Analysis

Using the Sequence Chooser, you can search the BioLIMS database for specific collections and sequences. Table 3-2 on page 3-17 lists the five collection criteria and Table 3-3 on page 3-18 lists the nine sequence criteria by which you can search.

In This Section

This section includes the following topics:

For this topic                              See page
Opening BioLIMS Access                      3-13
Displaying the Sequence Chooser Window      3-15
Parts of the Window                         3-16
Collection Search Criteria                  3-17
Sequence Search Criteria                    3-18
Searching the BioLIMS Database              3-20

Opening BioLIMS Access

The Edit Session Information dialog box contains session information for establishing a connection to the BioLIMS database.

To configure the BioLIMS access:

Step  Action
1     Choose BioLIMS Access from the Edit menu.
      The Edit Session dialog box appears.
2     In the text boxes, enter
      ♦ Your user name on the server
      ♦ The password for your server account
      ♦ The name of the database on the server (You may have access to more than one database on the server.)
      ♦ The server name
      IMPORTANT  All these text boxes are case sensitive.
3     Click the checkbox labeled Save Password if you want your password saved so that you do not have to enter it every time you open the connection.
      Note  If you plan on opening the connection via AppleScript, you should select this checkbox. Saving the password here eliminates the need to have the password included as part of the AppleScript.
4     If you want the database to open automatically when you start the AutoAssembler application, click the checkbox labeled Open on Launch.
5     If you intend to use more than one database or user account, enter an alias name for this session information.
      Use the pop-up menu to change, add, or remove aliases.
      If you have more than one alias, select the checkbox labeled Make Default to choose which one appears when you first open the Edit Session dialog box.
6     Click Open to open the connection to the database.
      Once the connection is established, you may add sequences to your project using the Sequence Chooser window (see the following section).
      If the connection fails, an alert dialog appears. Check the following:
      ♦ All the logon information was entered correctly and in the correct case.
      ♦ Your interfaces file is correctly configured. For more information, see “Configuring BioLIMS Access” on page 2-17.
      ♦ Consult your BioLIMS database administrator or the BioLIMS System Administration manual.

Displaying the Sequence Chooser Window

To display the Sequence Chooser window, choose Add Sequences from BioLIMS from the Project menu.

The Sequence Chooser window appears (Figure 3-4).

Figure 3-4 Sequence Chooser window

Callouts in the figure:
♦ Criteria pop-up menu
♦ Search button
♦ Collection search criteria pop-up menus and text boxes
♦ Sequence search criteria pop-up menus and text boxes
♦ Split bar
♦ Search results
♦ Status line

Parts of the Window

Table 3-1 describes the parts of the Sequence Chooser window that were labeled in Figure 3-4.
Table 3-1 Sequence Chooser Window Parts

Criteria pop-up menu
    Use this pop-up menu to specify the search criteria visible on the Sequence Chooser.
    Note  If you only intend to use a subset of criteria, setting only those visible helps reduce clutter in the window. However, the search results are the same whether a criterion is invisible or blank and visible.

Search button
    Click this button to query the BioLIMS database.
    Note  You can also press the Return key to begin a search.

Collection search criteria pop-up menus and text boxes
    Use these pop-up menus and text boxes to define the collection criteria of the search.
    IMPORTANT  Only those sequences that match each and every criterion you specify are returned. That is, search criteria are combined using the logical AND operation.
    For more information, see “Collection Search Criteria” on page 3-17.

Sequence search criteria pop-up menus and text boxes
    Use these pop-up menus and text boxes to define the sequence criteria of the search.
    IMPORTANT  A collection is returned if one or more of the sequences contained in it fulfill all of the specified sequence criteria.
    For more information, see “Sequence Search Criteria” on page 3-18.

Split bar
    Drag this bar to alter the relative amount of space allocated to the top and bottom portions of the Sequence Chooser window.

Search results
    After a successful query, found collections are listed in this area as Name, Modification date, and Creator.

Status line
    Error messages and other important information are reported here. For example, the Status Line lists how many collections were returned in a search.

Collection Search Criteria

Table 3-2 shows the collection search criteria. The collections returned by the Sequence Chooser must match all of the collection criteria and contain at least one sequence that matches all of the sequence criteria.
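The matching rule just stated can be sketched in a few lines. This is an illustration of the logic only, not the Sequence Chooser's implementation: the function names, the dictionary layout, and the sample records are hypothetical, and returning every collection when no sequence criteria are given is an assumption based on the note that an empty search lists everything.

```python
def matches_all(record, criteria):
    """True when the record satisfies every predicate (logical AND)."""
    return all(test(record) for test in criteria)


def search(collections, collection_criteria, sequence_criteria):
    """Return collections that pass all collection criteria and contain
    at least one sequence passing all sequence criteria."""
    return [
        c for c in collections
        if matches_all(c, collection_criteria)
        # Assumption: with no sequence criteria, the sequence test is skipped.
        and (not sequence_criteria
             or any(matches_all(s, sequence_criteria) for s in c["sequences"]))
    ]


# Hypothetical sample data
collections = [
    {"name": "Cosmid A", "creator": "lee",
     "sequences": [{"name": "A01", "length": 620}, {"name": "A02", "length": 310}]},
    {"name": "Cosmid B", "creator": "lee",
     "sequences": [{"name": "B01", "length": 280}]},
    {"name": "Plasmid C", "creator": "kim",
     "sequences": [{"name": "C01", "length": 700}]},
]

collection_criteria = [
    lambda c: c["name"].startswith("Cosmid"),   # Collection Name "starts with"
    lambda c: c["creator"] == "lee",            # Collection Creator "is"
]
sequence_criteria = [lambda s: s["length"] > 500]   # Length "is greater than"

found = search(collections, collection_criteria, sequence_criteria)
print([c["name"] for c in found])  # ['Cosmid A']
```

Cosmid B is excluded because none of its sequences is longer than 500 bases, and Plasmid C is excluded because it fails the collection criteria even though its sequence would match.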
Table 3-2 Collection Search Criteria

Collection Creator
    Pop-up menu choices: is, starts with, ends with, contains
    Allowed text: up to 255 characters
    Description: Name of the creator/owner of the collection

Collection Name
    Pop-up menu choices: is, starts with, ends with, contains
    Allowed text: up to 255 characters
    Description: Name of the collection

Collection Type
    Pop-up menu choices: any, run, project, other
    Allowed text: NA
    Description: Collection type, default is any

Creation Date
    Pop-up menu choices: is any, is, is before, is after, is between
    Allowed text: date only, set with arrow buttons
    Description: Date the collection was created

Modification Date
    Pop-up menu choices: is any, is, is before, is after, is between
    Allowed text: date only, set with arrow buttons
    Description: Date the collection was last modified

Sequence Search Criteria

Table 3-3 shows the sequence search criteria. The collections returned by the Sequence Chooser must contain at least one sequence that matches all of the specified sequence criteria.

Table 3-3 Sequence Search Criteria

Sequence Creator
    Pop-up menu choices: is, starts with, ends with, contains
    Allowed text: up to 255 characters including letters, numbers, and punctuation
    Description: Name of the person responsible for the run

Sequence Name
    Pop-up menu choices: is, starts with, ends with, contains
    Allowed text: up to 255 characters including letters, numbers, and punctuation
    Description: Name of the sequence

Sample Name
    Pop-up menu choices: is, starts with, ends with, contains
    Allowed text: up to 255 characters including letters, numbers, and punctuation
    Description: Sample name from the Sample Sheet

Gel Path
    Pop-up menu choices: is, starts with, ends with, contains
    Allowed text: up to 255 characters including letters, numbers, and punctuation
    Description: The full path name to the original gel file, for example, Hard Disk:Data:GelRuns:L28t

Length
    Pop-up menu choices: is any, is, is less than, is greater than, is between
    Allowed text: number
    Description: The length of the most recent version of the sequence in the database

Status
    Pop-up menu choices: any, nascent, prepare, collect, analysis, cleanup, assembly
    Allowed text: NA
    Description: Status of the sequence; there are six stages of collection and analysis
Instrumentation
    Pop-up menu choices: any, gel, capillary
    Allowed text: NA
    Description: Whether the sample was run on a gel or capillary instrument

Start Collect Time
    Pop-up menu choices: is any, is, is before, is after, is between
    Allowed text: date only, set with arrow buttons
    Description: Date data collection began

End Collect Time
    Pop-up menu choices: is any, is, is before, is after, is between
    Allowed text: date only, set with arrow buttons
    Description: Date data collection ended

Searching the BioLIMS Database

Follow these steps to use the Sequence Chooser to search the BioLIMS database for specific collections and sequences.

To find sequences using the Sequence Chooser:

Step  Action
1     Choose Add Sequences from BioLIMS from the Project menu.
      The Sequence Chooser window appears.
2     Use the items from the Find Collection with Criteria pop-up menu (below) to define your search.
      Note  To list all of the items in the BioLIMS database, perform the search with no criteria specified. For large databases, this process may be slow.
3     To use the pop-up menu:

      Choose menu items...          To define the search for...
      above the horizontal line     Collection criteria
      below the horizontal line     Sequence criteria

      Note  As you choose items from the pop-up menu, a black dot appears next to the item on the menu and the item is added to either the collection criteria or the sequence criteria section of the window.
      The following is an example of the Sequence Chooser window showing four collection search criteria and five sequence search criteria:
4     Use the pop-up menus and text fields to define your search query.
      When you are satisfied with the search, click Search.
      The results of the search appear in the lower portion of the window.
      Note  Collections returned by the Sequence Chooser must match all of the collection criteria and contain at least one sequence that matches all of the sequence criteria.
5     To view the sequences contained in the collections, click the small triangle to the left of the collection name.
6     You can take the following action.

      If you want to...                       Then...
      add a sequence                          a. Select a sequence.
                                                 Note  You can select multiple sequences by selecting the first sequence and then, while pressing the Shift key, Control key, Option key, or Command key (⌘), selecting the additional sequences.
                                              b. Click the Select button.
      close the Sequence Chooser window       Click the Close button.

Adding Sequences From BioLIMS

In BioLIMS, sequences are organized in collections. Sequences in collections contain both the changed data and copies of the original sequences, which remain on the database.

BioLIMS-based assembly projects can contain sequences from one or more collections, and can contain some or all of the sequences from a particular collection. However, when you use the AutoUpdating feature, the project will contain all of the sequences in only one collection. In order for the AutoUpdating feature to work, the project must have the same name as the collection. If you give the project a different name, autoupdating will not work.

Note  An alternate way to add files to a BioLIMS project is to name the project after a collection, and then assign that project to be autoupdated. All sequences in the collection will be added to the project. This is only useful if you want every sequence from the designated collection.

To add files to a BioLIMS project:

Step  Action
1     Open the project to which you want to add sequences.
      Note  The project must either be a BioLIMS project (see page 3-9) or an empty project.
      IMPORTANT  In order for autoupdating to function, the project must have the same name as the collection file from which you want to add sequences.
2     If BioLIMS access is not already open, select BioLIMS Access from the Edit menu.
      If necessary, modify any of the session information (see “Opening BioLIMS Access” on page 3-13).
3     Click Open, then OK.
4     Select Add Sequences from BioLIMS from the Project menu.
      The Sequence Chooser window appears:
5     Highlight a collection folder, or open a collection folder and highlight individual sequences within that folder.
      Note  If necessary, you can search for selected files on the server by using the commands in the upper panes of the Sequence Chooser window (see “Searching the BioLIMS Database” on page 3-20).
6     Click Select to add the sequences to your project.

Removing Sequences from a BioLIMS Project

If necessary, you can remove an extraneous sequence from the list of sequences in a project. The sequence is not deleted from the BioLIMS database; the file is simply removed from the current project.

To remove a sequence from the project:

Step  Description
1     Select a contig (or the Unassembled list).
2     In the sequence list, select the sequence you want to remove.
3     Choose Remove Sequence from the Project menu.

IMPORTANT  This command cannot be undone. Once you have removed a sequence from the project window, you cannot use Undo to replace it. You must add it again. The ID number assigned to a removed sequence is not used again in the same project.

Note  Using Cut, Delete, or Clear removes characters from the selected sequence, but does not remove the sequence itself.

Viewing the Sequence List

Introduction

You can change the sequence list by specifying what information is displayed for each sequence, and by sorting the list using varied criteria.

If the upper-right pane contains more information columns than you can see, use the size box in the lower-right corner of the project window to stretch the window to the right.
Changing the Information Displayed in the Sequence List

You can choose to display any of 13 fields containing information about each sequence in the sequence list.

To change the information displayed in the sequence list:

Step  Action
1     Choose Format from the Project menu.
      The following dialog box appears:
      Table 3-4 on page 3-26 describes the various fields.
2     Make changes as follows:
      ♦ To add more columns of information, drag the fields you want to display from the Fields Available list to the Fields Displayed list.
      ♦ To remove columns of information, drag the appropriate fields from the Fields Displayed list to the Fields Available list.
      ♦ To change the order in which the fields are displayed, drag them up or down in the list.
      ♦ To see the effect of any changes you make without closing the dialog box, click Apply.
3     When you are satisfied with your changes, click Done.

Table 3-4 contains a list of the available sequence list fields and their definitions.

Note  You may add as many fields to the “Fields Displayed” list as you want, but you may not be able to see all the displayed fields without increasing the size of the Project window.

Table 3-4 Project Sequence List Fields

Item          Description
Ambiguity     The percentage of ambiguities in the data.
Begin         The starting position of the sequence along the consensus of the contig.
Chemistry     The type of chemistry that was used for the run that produced the data (Sample files only).
DocID         An ID number assigned by the Server algorithm during Server assembly. Used for technical support purposes.
End           The ending position of the sequence along the consensus.
File          The name of the sequence file. Use the Show Names command to display it.
Gapped Len    The length of the sequence, including gaps added in assembly.
ID            The sequence ID number assigned when the sequence is added to the project.
Length        The number of bases in the sequence.
Orientation   The orientation of the sequence, displayed as an arrow. This column is filled in after the sequence is in a contig.
Run Date      The date of the ABI sequencer run that produced the data, or the creation date for the file.
Sample        The name embedded in an ABI Sample file that was assigned when the data was sequenced.
Source        The file type (Sample, Inherit, Text, BioLIMS). ABII denotes an Inherit file.

Changing the Sort Order in the Sequence List

You can use the Sort command to change the order in which the sequences are displayed in the upper-right pane of the project window.

To change the sort order in the sequence list:

Step  Action
1     Choose Sort from the Project menu.
      The following dialog box appears:
      Table 3-5 describes each of the sorting options.
2     Click the radio button beside the option you want to apply.
      If you want to view the effect of your selection before closing the dialog box, click the Apply button.
3     Click Done.

Table 3-5 describes the project sequence list sorting options.

Table 3-5 Project Sequence List Sorting Options

Option           Sort performed
Name             Sorts by the sequence filenames in numerical, then alphabetical order.
Date             Sorts by run date, from earliest to latest.
Begin            Sorts by the starting positions of the sequences along the consensus, from far-left to far-right.
End              Sorts by the ending positions of the sequences along the consensus, from far-left to far-right.
Length           Sorts by the number of bases in the sequences, from least to most.
Gapped Length    Sorts by the gapped length, from lowest to highest.
Orientation      Sorts by the orientation of the sequence, normal orientation first.
Chemistry        Sorts by the chemistry type, in alphabetical order.
Sample Name      Sorts by the sample names in numerical, then alphabetical order.
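Several of the sorting options in Table 3-5 amount to ordering the list by a single field in ascending order. The sketch below shows that idea for three of them. It is not AutoAssembler code: the record layout, field names, and sample values are hypothetical. The Name option is left out because the manual does not spell out exactly how "numerical, then alphabetical" ordering is applied.

```python
from datetime import date

# Hypothetical records; the keys loosely mirror the sequence list columns.
sequences = [
    {"name": "cosmid7", "run_date": date(1999, 3, 2), "begin": 410, "length": 520},
    {"name": "cosmid2", "run_date": date(1999, 1, 15), "begin": 1, "length": 655},
    {"name": "cosmid5", "run_date": date(1999, 2, 9), "begin": 198, "length": 480},
]

# One key function per sorting option; each sorts in ascending order.
SORT_KEYS = {
    "Date": lambda s: s["run_date"],    # earliest to latest
    "Begin": lambda s: s["begin"],      # far-left to far-right along the consensus
    "Length": lambda s: s["length"],    # least to most
}


def sort_sequences(seqs, option):
    """Return a new list ordered by the chosen option."""
    return sorted(seqs, key=SORT_KEYS[option])


print([s["name"] for s in sort_sequences(sequences, "Date")])
# ['cosmid2', 'cosmid5', 'cosmid7']
print([s["name"] for s in sort_sequences(sequences, "Length")])
# ['cosmid5', 'cosmid7', 'cosmid2']
```

Keeping the options in a lookup table makes it easy to see that choosing a different radio button changes only the field used for comparison, not the list contents.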
Assembling Sequences

Introduction

After adding sequences to a project, you can assemble them automatically (after selecting the assembly parameters) by choosing Assemble from the Project menu. You can choose one of the following assembly options:

♦ Local–Conducts assembly by using a local algorithm and parameters that you select
♦ Engine–Conducts assembly by using either a CAP or CAP Remote algorithm (if you have purchased the CAP Remote option)
♦ Server–Conducts assembly by using an algorithm based on a server, leaving you free to work on your Macintosh while the project is being assembled (only available if you purchased the Server Option)

Note To read more about the algorithms, see Appendix B.

Note Each assembly method may produce slightly different results.

These three options can be selected from the Assembly Setup dialog box prior to assembling the project.

Assembling by Local Algorithm

The Local assembly option allows you to assemble data without the use of server software or an assembly engine. Local assembly is faster than the server for small projects, since the extra time required for moving the project over a network is eliminated.

To assemble a project locally:

Step    Action
1    Choose Assembly Setup from the Project menu. The following dialog box appears:
2    Click the Local icon in the Assemble box.
3    Set the Minimum Overlap and Percent Error (see page 3-38).
4    Click OK to set assembly parameters and close the dialog box.
     or
     Click Submit to assemble the project using the parameters you selected.
5    Select Assemble from the Project menu. The following dialog box appears while the project is being assembled:

Assembling Projects Using the Engine Options

The second method of assembling a project is by using an installed assembly engine. The AutoAssembler software comes with the following engine:

♦ CAP–Macintosh-based Contig Assembly Program

If you purchased the CAP Remote option, your engine options include the following:

♦ CAP Remote–UNIX-based Contig Assembly Program

Both algorithms deliver the same results, but the CAP Remote option is much faster for large projects, while also allowing you to work on your Macintosh during assembly.

You may add additional engines by placing them in the Engines folder located in the AutoAssembly folder. The new engines then appear in the pop-up menu in Assembly Setup. You can also enter Assembly Engine Parameters in the Assembly Setup dialog box.

Assembly Parameters

Table 3-6 shows the parameters that can be modified in the CAP engine included with the AutoAssembler software. (These parameters also apply to the optional CAP Remote engine.) If you have installed additional engines, the parameters you can modify may be different.

Note These parameters must be entered in all caps and preceded by a hyphen (as shown in Table 3-6).

Table 3-6 Engine Assembly Parameters

Parameter    Default    Description
-OVERLEN    20    The minimum length of valid overlap required to join two sequences. Increasing this value will speed assembly and decrease the possibility of false overlaps. Decrease this value if you are assembling short sequences.
-FLEVEL    0.70    Minimum percentage of matching bases in a valid overlap. Increasing this value can speed assembly and reduce false overlaps.
-PERCENT    0.86    Minimum percentage of matching bases in the “best” part of any overlap. The “best” part of an overlap refers to the highest quality section of the overlapping bases. This value rarely needs to be modified.
-POS5    20    The number of bases in the beginning of a sequence which may be of lower quality than the following bases. Typically, sequences have more ambiguity towards the beginning and end of their length. By designating an area of lower certainty at the beginning of a sequence, the algorithm will assign lower penalties to mismatches or gaps that occur in these bases.
-POS3    450    Bases from this value to the end of a sequence which may be of lower quality than the preceding bases. Typically, sequences have more ambiguity towards the beginning and end of their length. By designating an area of lower certainty at the end of a sequence, the algorithm will assign lower penalties to mismatches or gaps that occur in these bases.
-WORDSIZE    9    The size of a group of bases (word) that the engine uses to find potential matches. Increasing this value can greatly speed up the assembly engine. For example, a word size of 11 may increase assembly speed by 5 to 10 times. However, increasing word size also greatly increases memory requirements. Typical memory requirements for the assembly engine are 5 raised to the power of the word size, in bytes, so a word size of 9 means that 5^9 = 1,953,125 bytes (about 1.9 MB) of free memory is required by the assembly engine. If you want to increase -WORDSIZE, first increase the memory allocated to the assembly engine (not the AutoAssembler program itself). For an example of how to increase the memory allocated to an application, see “Allocating More Memory” on page 2-15.

The following parameters should only be modified by expert users. These parameters modify the score the assembly engine assigns to matches, mismatches, and gaps in overlapping sequences. The total score must be greater than the OVERLEN value multiplied by the MATCH value for the engine to consider an overlap valid (with the default values, 20 × 20 = 400).

Parameter    Default    Description
-MATCH    20    Score assigned to a correctly matched base in a potential overlap. Increasing this score will make the assembly engine more likely to consider overlaps valid.
-MISMAT    -40    Score assigned to an incorrectly matched base in a potential overlap. Increasing this score will make it harder for the assembly engine to find valid overlaps.
-LTMISM    -30    Score assigned to an incorrectly matched base residing in the area specified by the POS5 or POS3 parameters (defined above).
-OPEN    60    Penalty assigned to the first gap character in a run of gap characters in a potentially overlapping sequence.
-EXTEND    43    Penalty assigned to each subsequent gap (after the OPEN penalty has been assigned) in a potentially overlapping sequence.
-LTEXTEN    20    Penalty assigned to a gap in an area specified by the POS5 or POS3 parameters (defined above).

Engine Assembly

Note Assembling a project with the Engine option also creates a temporary file that can be imported and read by AutoAssembler (see page 3-41).

To assemble a project using the Engine option:

Step    Action
1    Select Assembly Setup from the Project menu. The following dialog box appears:
2    Click the Engine icon in the Assemble box. The following options appear:
3    Select either Cap or CapRemote from the pop-up menu. If you are using an engine that supports user-entered parameters, you may enter them now. See Table 3-6 on page 3-31 for a list of parameters for the included assembly engine.
4    Click OK to set Assembly parameters and close the dialog box.
     or
     Click Submit to assemble the project using the parameters you selected.
5    Select Assemble from the Project menu. The following dialog box appears while the project is being assembled:

Using a Server to Assemble Project

The Server option is based on the Myers-Kececioglu model. This model handles repeat sequences more efficiently and can be faster for large projects than the Local option (but not the CAP Engines).
Like the CAP Remote option, the Server option allows you full use of your Macintosh computer during assembly, since the computations are performed on a remote server. Assembling on the Server allows you to use the optional Fast Data Finder (FDF).

Note The following procedure assumes that you have already logged on to a server.

To assemble a project using the Server option:

Step    Action
1    Select Assembly Setup from the Project menu. The following dialog box appears:
2    Click the Server icon in the Assemble box. The following additional options appear:
     The FDF filter parameters that appear in the expanded Assembly Setup dialog box (when the “More” checkbox is selected) are the parameters computed for a given set of Fragment Assembly (FA) parameters, and are shown for your information only. This manual does not describe how to directly set the FDF parameters (see “FDF Parameters” on page 3-38).
3    Check the “Submit As New” checkbox.
4    Click OK to set Assembly parameters and close the dialog box.
     or
     Click Submit to assemble the project using the parameters you selected.
5    Select Assemble from the Project menu.

FDF Parameters

The More checkbox in the Assembly Setup dialog box does not appear if you purchased the AutoAssembler software without the FDF.

When the FDF option is installed on your server, AutoAssembler uses the Minimum Threshold and Error Rate parameters (also known as the FA parameters) to automatically compute the FDF filter parameters. These parameters are then used to perform an assembly in the FDF version of the AutoAssembler program.

Although this manual does not provide instructions for setting the FDF parameters, Table 3-7 lists parameters and their definitions.
Table 3-7 More Checkbox Parameters

Parameter    Definition
Window    Size of the FDF query
Offset    Number of bases skipped between two queries
Tolerance    Error tolerance applied to the FDF query
Overlap    Length of a sequence used by the FDF to extract queries
Hit Count    Number of contiguous hits used by the FDF filter to decide potential edges

Setting Minimum Overlap and Percent Error

The results of an assembly using the Local or Server options depend heavily on the settings you enter for Minimum Overlap and Percent Error. The two assembly algorithms use these parameters in slightly different ways.

The number of contigs that result from assembly is dependent upon the overlaps that occur in the source sequence files and the assembly parameters you set. If the parameters are too stringent (Minimum Overlap high and Percent Error low), sequences that belong together may not be put into the same contig. If the parameters are too loose (Minimum Overlap low and Percent Error high), sequences that do not belong in the same contig may be put together anyway.

Server Algorithm

The Server algorithm calculates a statistical score that measures similarity between overlaps and reduces a given overlap score for errors (insertions, deletions, or mismatches) in the overlapping segments. If the resulting value is less than the value you entered as Minimum Overlap, or if the number of errors in the overlap exceeds the number allowed by the Percent Error parameter, the algorithm ignores the overlap.

To calculate the number of allowed errors, the program sums the lengths of the two sequences being compared, and applies the Percent Error value.
Example: Two sequences of comparable length

Assuming two sequences that are 200 base pairs and 300 base pairs long, respectively, a Percent Error value of 10% yields 50 allowed errors, calculated as follows:

(200 + 300) × 10% = 500 × 10% = 50 errors allowed in the overlap

IMPORTANT Be careful if you are assembling sequences of diverse lengths. Anything other than a very small Percent Error will allow a short sequence to overlap completely with a long sequence, since the long sequence determines that a large number of errors are allowed.

Example: Two sequences of diverse length

Assume that you want to assemble Sequence A (500 bp) and Sequence B (10,000 bp). If you use a Percent Error value of 10% and a Minimum Overlap of 10, the following calculation applies:

(500 + 10,000) × 10% = 10,500 × 10% = 1,050 errors allowed

The number of errors allowed is greater than the length of Sequence A (500 bp), which means that the program could align the two sequences if they had 10 overlapping bases, regardless of the number of insertions, deletions, or mismatches. In this case, a Percent Error value of 2% or less might be more appropriate.

Note If you set the Percent Error value to zero, the Minimum Overlap value describes the number of bases required for an overlap. As you enter larger Minimum Overlap values, the time required for assembly decreases. The default value, 10, is a conservative starting point for this parameter.

Local Algorithm

The Local algorithm uses the Minimum Overlap parameter simply as the minimum number of bases allowed in the overlap. The Percent Error parameter specifies the percentage of errors allowed in the overlap.

Example: Local algorithm

If the overlap consists of 10 bases and the Percent Error value is 10%, an overlap would be allowed with 9 matching bases and 1 error.
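
The arithmetic in the examples above can be checked with a few lines of Python. This is a minimal sketch of the calculations as the manual describes them, not the program's actual code. The function names are hypothetical, and the manual does not say how fractional error counts are rounded, so the sketch leaves them unrounded. The Server algorithm's statistical overlap score is not modeled here.

```python
def server_allowed_errors(len_a, len_b, percent_error):
    """Server description: Percent Error is applied to the SUM of both sequence lengths."""
    return (len_a + len_b) * percent_error / 100


def local_allowed_errors(overlap_len, percent_error):
    """Local description: Percent Error is applied to the length of the overlap itself."""
    return overlap_len * percent_error / 100


def local_overlap_ok(overlap_len, errors, min_overlap, percent_error):
    """Local check: enough overlapping bases, and no more errors than allowed."""
    long_enough = overlap_len >= min_overlap
    few_enough_errors = errors <= local_allowed_errors(overlap_len, percent_error)
    return long_enough and few_enough_errors


# Sequences of comparable length: (200 + 300) x 10%
comparable = server_allowed_errors(200, 300, 10)

# Sequences of diverse length: (500 + 10,000) x 10%, which exceeds the 500 bp of Sequence A
diverse = server_allowed_errors(500, 10_000, 10)

# The same pair at 2%, the value the manual suggests might be more appropriate
diverse_strict = server_allowed_errors(500, 10_000, 2)

# Local example: 10-base overlap, 10% error, 9 matches and 1 error
local_example = local_overlap_ok(overlap_len=10, errors=1, min_overlap=10, percent_error=10)
```

The contrast between the two functions is the point of the IMPORTANT note: under the Server description the allowance grows with the total length of both sequences, so one long sequence inflates the number of errors tolerated in a short overlap.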
IMPORTANT Using either algorithm, if you set the Minimum Overlap value too low or the Percent Error value too high for a particular set of sequences, random similarities can produce false overlaps. If you set the Minimum Overlap value too high or the Percent Error value too low, ambiguities and sequencing artifacts nested in the overlapping regions at the ends of the sequences might cause the algorithm to miss real overlaps.

The Assembled Project

The time required to generate overlaps and multiple alignments can vary depending on the number of sequences and the amount of overlap between the sequences. When assembly is complete, the status message dialog box disappears and the assembled results appear in the project window (see Figure 3-5).

Figure 3-5 Assembled project (in the Layout view)
♦ After assembly, the sequence list displays the sequences associated with the contig selected in the contig list.
♦ The graphic display defaults to the Layout view.

Contig Names

The contig name appears in the upper-left pane of the window. When you select a name, the component sequences of the contig appear in the sequence list. Contigs are named after the first sequence in the project and numbered incrementally each time you assemble the project. For example, a contig titled “ox208.Contig.4” is the result of the fourth assembly of a project containing sequence ox208 as its first sequence.

Sequence Names

The diamond shapes no longer appear beside the sequence names, since the sequences have been assembled.

The bottom pane of the window is a graphic display of the aligned sequences. The views are described in “Understanding the Project Window Views” on page 4-2.

Note This is a good time to save the project. If you saved it before assembly, and want to preserve the unassembled file, use the Save As command and give the assembled file a different filename.
Importing Assembled Projects

When you assemble a project using the CAP engine, two temporary text files are created and saved in the Assembly Data folder (located in the AutoAssembler folder). These two files are

♦ Engine input file
♦ Engine output file (identified by the suffix “.asmg”)

AutoAssembler can import and read the engine export files (that is, files whose names end with “.asmg”) as if they were assembled project files. Export files are smaller than an assembled project file, and can be more easily sent over relatively slow network connections (for example, the Internet).

When imported to AutoAssembler, these files can be read, saved, or printed. However, you should not attempt to edit and reassemble these files, because they do not have the full range of sequence data.

To import assembly output files:

Step    Action
1    Select Import from the File menu. The following dialog box appears:
2    Open the Assembly Data folder and select an export file (a file ending with “.asmg”).
3    Click Open. The file opens in a new project window.

Setting Up for AutoUpdating

Introduction

With the BioLIMS option, you can set a project to be automatically updated via a network connection every time sequences are added or changed. Sequences are assigned to an AutoAssembler project with the Add Sequences from BioLIMS command (see page 3-12).

Opening BioLIMS Access

The Edit Session Information window controls how your Macintosh communicates with the BioLIMS database (see “Opening BioLIMS Access” on page 3-13 to configure the connection).

Configuring AutoUpdating

The names of the projects to be updated are determined by preferences you set in the AutoUpdate Settings dialog box. While the project is being updated, you will not be able to use your Macintosh computer. AutoAssembler will not cause other programs on the Macintosh to fail, but will override them during autoupdating.
To configure AutoUpdating:

Step    Action
1    Select AutoUpdate Settings from the Edit menu. The following dialog box appears:
2    Click the Add Project button and select the project(s) you want to be automatically updated in the Open File dialog box.
3    To remove a file from the list, select it and click Remove.
4    Select the “Activate Automated Update” checkbox.
     Note Choose at least a 10-minute wait before starting the update. Once updating has started, the AutoAssembler software will periodically “take control” of the Macintosh computer in order to update the sequences. While this will not cause any programs you are running to fail, it may become disruptive to your work.
5    Click OK to accept the settings.

Changing and Adding Sequences

AutoAssembler maintains links between sequences on the BioLIMS database and the projects to which they have been added. Each time those sequences are modified, the modified version of the sequence is sent to the project, updating it.

Sequences that have been processed by Factura and are added to a project’s collection will be automatically added to the project and the project will be automatically assembled.

Adding Sequences using AutoUpdating

An alternate way to add files to a BioLIMS project is to name the project after a collection, and then assign the project to be autoupdated. All sequences in the collection that have been processed in Factura will be added to the project and the project will be automatically assembled.

While the Project is Being Updated

Typically, if you are using BioLIMS to automatically update an AutoAssembler project, the Macintosh the project resides on is left unattended during the updating process. Turn off programs that periodically display user prompts (for example, a calendar program reminding you of an upcoming meeting) to prevent interference with AutoAssembler.
Turning Off AutoUpdating

When a project no longer needs to be updated, or you need to use the Macintosh on which the project is stored, turn off AutoUpdating.

To turn off AutoUpdating:

Step    Action
1    Select AutoUpdate Settings from the Edit menu. The following dialog box appears:
2    Deselect the “Activate Automated Update” checkbox.
3    Click OK.

4 Viewing the Consensus

Overview

Introduction

After assembling sequences, you can display the consensus and the underlying sequences in one of three project window views. This chapter discusses the views, and how to change the parameters that affect how the views appear in the project window.

In This Chapter

This chapter contains the following topics:

Topic    See Page
Understanding the Project Window Views    4-2
Displaying Electropherograms    4-9
Changing the Display Parameters    4-12
Manipulating Window Displays    4-18
Locating Sequences    4-21

Understanding the Project Window Views

Introduction

After you submit a project for assembly, the contig that appears in the lower pane of the project window can be displayed in three different views. It is important that you understand the different views so that you can efficiently edit the sequences. Table 4-1 provides a brief description of each view.

Table 4-1 Project Window Views

View    Description
Layout    The default that appears after assembly. Arrows display the orientations and relative positions of assembled sequences. Click the button shown at left to return to the Layout view from any other view. Zooming in from this view shows individual nucleotides as colored bars, and the consensus displays half-height bars at positions of lower certainty, to facilitate editing. Double-click a sequence to see its electropherogram.
Alignment    Shows the specific nucleotide order of each sequence in a region of the contig. In the Alignment view, you can show a constantly spaced electropherogram for each sequence, providing a useful editing tool. Click the button shown at left to change to the Alignment view from any other view. Double-click a sequence to see its electropherogram.
Statistics    Shows redundancy plotted against consensus base. You can set criteria for the level of redundancy or orientation you consider acceptable. User-definable colors identify certain areas of the sequence. Click the button shown at left to change to the Statistics view from any other view.

All views have an axis that represents the consensus sequence and indicates the positions of bases in the currently selected contig. In the Layout and Alignment views, the consensus sequence axis shows a user-specified color at positions representing ambiguous bases. The Statistics view shows level of redundancy along the consensus.

Layout View

After you assemble sequences, the Layout view is the default that appears in the lower pane of the project window. This view graphically represents the sequence orientations and relative positions in the contig (see Figure 4-1). You can zoom in from this view to observe the nucleotides of the individual sequences.

Figure 4-1 The Layout view in the project window
♦ The selected sequence is highlighted in
♦ A box shows the position of the selected sequence in the consensus
♦ The consensus sequence is represented by this axis
♦ Arrows show orientation of sequences

In the Layout view, each arrow represents a single sequence, and the direction of the arrow indicates the sequence’s relative orientation. The axis across the top of the lower pane in Figure 4-1 represents the full consensus of the contig. Its length is marked in bases, and gray (the default color) shows the positions of ambiguities.
Identifying Sequences

Sequence names are synchronized with the display in the Layout view, so clicking either an arrow (in the Layout view) or a name (in the sequence list) will highlight the corresponding sequence.

To determine the name of a sequence in the Layout view:

Step    Action
1    Click the sequence in the Layout view. The corresponding sequence name is highlighted in the sequence list.
     or
     Click a sequence name in the sequence list and the corresponding sequence is highlighted in the Layout view.

Displaying File Names

To make it easier to identify files in the Layout view, you can display the sequences with their sample file names.

To display the file names of sequences:

Step    Action
1    To display file names in the Layout view, choose Show Names from the Project menu. The project window should now look like this:

Note Displaying file names in the sequence list is described under “Viewing the Sequence List” on page 3-25.

Zooming In

You can obtain closer resolution of the Layout view by zooming in, which displays individual nucleotides as colored bars.

To zoom in from the Layout view:

Step    Action
1    Select a region you want to examine more closely by clicking it or dragging the cursor over a range of the consensus axis.
2    Choose Zoom In (⌘-=) from the Window menu.
     Note If you continue to Zoom In, the view switches to the Alignment view.

Figure 4-2 shows the Layout view after two Zoom In commands. You can also get to this display by using the Zoom Out command from the Alignment view.

Figure 4-2 Layout view after zooming in
♦ Bases are shown as colored bars, and lowercase bases appear as half-height bars in the consensus sequence
♦ Ambiguity characters appear below the consensus sequence
♦ Characters other than upper- or lowercase A, C, G, and T appear as short black bars

The Zoom command can be used to facilitate editing. For example, in Figure 4-2, positions at which the consensus base calls are less certain appear as half-height bars in the ambiguity color (the default is gray). Locations identified with codes other than upper- or lowercase A, C, G, and T (for instance, other IUB codes or gaps) appear as black markers that are one-fourth the height of the normal bars. The ambiguity character appears below each ambiguous position in the consensus sequence.

Alignment View

The Alignment view allows fast and easy editing of ambiguities in the sequences. You can quickly show, zoom, and scale synchronized electropherograms that are constantly spaced so that their peaks match the base positions in the sequences.

Note You can switch directly to this view from the Layout view by choosing Actual Size (⌘-]) in the Sequence menu, or by clicking the button at left.

Figure 4-3 The Alignment view
♦ Lowercase characters in the consensus indicate positions of lower certainty
♦ Arrows in the sequence list indicate sequence orientations
♦ Ambiguity characters
♦ When you click a sequence in the Alignment view, the corresponding electropherogram is displayed below

In the Alignment view, you can easily observe specific nucleotide sequences and the nature of marked ambiguities. The consensus sequence appears at the top of the pane, and ambiguity characters below it mark ambiguous base positions. The individual overlapping sequences appear below the ambiguity characters.

Consensus Characters

The characters of the consensus sequence vary with the composition of the underlying sequences. A lowercase character in the consensus sequence indicates a position of lower certainty. Such characters are marked with the ambiguity color you specify using the Settings command (see page 4-15).
♦ A lowercase a, c, g, or t in the consensus sequence indicates that at least the threshold value (but less than 100 percent) of the aligned bases at that position are called as A, C, G, or T, respectively (see page 4-16). Therefore, a possibility exists that the base at that position might not be correctly called.
♦ A lowercase IUPAC (or IUB) code letter in the consensus sequence indicates that no single base is represented by the threshold value or more of the calls at that position in the underlying sequences (see page 4-16). See Appendix C, “Key Codes,” for a table of the IUPAC/IUB codes.

Viewing Sequences

The orientation of each sequence is recorded by arrows in the sequence list. As in the Layout view, you can click a sequence name in the sequence list to identify the graphic representation of the sequence in the lower pane. You can also click a sequence in the lower pane to highlight the corresponding sequence name in the upper-right pane.

The Statistics View

The Statistics view displays redundancy and orientation information about the contig, and is useful for finding regions of the consensus that do not have enough underlying data. You select the minimum number and orientation of underlying sequences across the consensus, and the Statistics view highlights areas that do not meet your standards.

Click the button shown at left to display the Statistics view (see Figure 4-4).

Figure 4-4 Project window with the Statistics view displayed
♦ Horizontal line at 3 shows minimum redundancy

Displaying the Consensus

The Statistics view displays the consensus sequence as redundancy plotted against consensus base. You can set criteria for the level of redundancy or orientation you consider acceptable. User-definable colors identify certain areas of the sequence. The default colors are as follows:

♦ Red indicates where the data falls below the minimum orientation settings.
♦ Blue indicates where the data falls below the minimum redundancy.
♦ Gray indicates where the data is acceptable according to the specified settings.

To locate the region representing a particular sequence, select the sequence name in the sequence list. A highlight in the lower pane indicates the range of the sequence that is highlighted in the sequence list.

Statistic View Parameters

The information that appears in the Statistics view is based on parameters you set (see “Changing Statistic View Parameters” on page 5-20). Your settings determine the minimum number of overlapping sequences to be considered acceptable, and the proportion of the total that must be either one orientation or the other.

The default setting is the 2+1 rule. This means that for a consensus base to meet a certain quality standard, a minimum of three underlying sequences must exist at that position. At least two of the underlying sequences must be one orientation, and at least one must be the opposite orientation.

Note You do not need to specify the actual sequence orientations. For example, when the values are 2+1, this stipulates either two forward and one reverse, or two reverse and one forward.

The Zoom Command

The Zoom command allows you to closely examine an area of the consensus, or the underlying sequences. The Zoom command is available from both the Layout and Alignment views, and can be used to transition between the two views (for example, zooming in from the Layout view three times will switch the project window to the Alignment view). See “Zooming In” on page 4-5 and “Using an Electropherogram to Resolve Ambiguities” on page 5-6 for specific procedures and uses of the Zoom command.

Displaying Electropherograms

Introduction

Sequence electropherograms can be displayed in either the Layout or Alignment views of the project window. In the Alignment view, the electropherograms are constantly spaced, so they can be used to resolve ambiguous base calls.
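
As a supplement to the Statistic View Parameters described earlier, the 2+1 rule can be written out as a small check on the number of forward and reverse sequences covering one consensus position. The Python sketch below is illustrative only and does not come from the AutoAssembler software. The function names are hypothetical. Taking the minimum redundancy to be the sum of the two values (3 for the 2+1 rule) and testing redundancy before orientation when choosing a color are both assumptions that the manual does not confirm.

```python
def meets_rule(forward, reverse, major=2, minor=1):
    """True if coverage satisfies the 'major+minor' rule in either orientation.

    With the default 2+1 rule this accepts two forward and one reverse,
    or two reverse and one forward, as the Note describes.
    """
    return (forward >= major and reverse >= minor) or (
        reverse >= major and forward >= minor
    )


def default_color(forward, reverse, major=2, minor=1):
    """Map coverage at one position to the default Statistics view colors.

    Assumption: minimum redundancy is major + minor, and the redundancy
    test is applied before the orientation test.
    """
    if forward + reverse < major + minor:
        return "blue"    # below the minimum redundancy
    if not meets_rule(forward, reverse, major, minor):
        return "red"     # enough sequences, but the orientation mix is not met
    return "gray"        # acceptable under the specified settings
```

For instance, three forward sequences with no reverse sequence have enough redundancy but fail the orientation requirement, while one forward and one reverse sequence fall short on redundancy.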
Opening Electropherograms can be displayed in the Layout and Alignment views Electropherogram of the project window, for both single or multiple sequences. Displays To display electropherograms: Step 1 Action Display electropherograms in one of the following ways: ♦ Choose Show Electropherogram(s) from the Sequence menu (displays an electropherogram for the selected sequence(s)). ♦ Double-click a sequence in the Layout or Alignment view. ♦ Double-click an area of the consensus sequence. This will display electropherograms for the individual sequences in the selected area of the consensus. Hiding You can also choose to hide electropherograms once they are Electropherogram displayed in either view. Displays To hide electropherograms: Step 1 Action Hide electropherograms in one of the following ways: ♦ Choose Hide All Electropherograms from the Sequence menu. ♦ Double-click a sequence in the Layout or Alignment view that has an electropherogram displayed. ♦ Select a sequence from the sequence window that has its electropherogram displayed, and choose Hide Electropherogram from the Sequence window. continued on next page Viewing the Consensus 4-9 Changing You can change the horizontal and vertical spacing of the Electropherogram electropherogram peaks by zooming and scaling. You can change the Appearance spacing with the mouse, or by changing the settings in the Settings dialog box (see “Changing Row Height and Vertical Scale” on page 4-14). Changing Horizontal Scale When you change the horizontal scale of an electropherogram, the character spacing of the sequence changes as well. Since the character size is global to the Alignment view, changing the horizontal scale changes the spacing of all sequences in the Alignment view. To scale electropherograms horizontally: Step Action 1 Place the cursor over an electropherogram. 2 Press and hold down the Shift-Option keys. 
      The cursor changes to a peak shape with a horizontal arrow.
3     As you hold down the Shift-Option keys, click a peak and drag to the left or right. When you release the mouse button, all electropherograms are rescaled to the new width.

Note  If you reduce the electropherogram to its minimum horizontal spacing, the project window shifts to the Layout view.

Changing Vertical Scale

When you change the vertical scale of an electropherogram, the vertical scale of all other displayed electropherograms changes as well. The rows containing the peaks, however, do not change size, and the peak tops are clipped to the row height.

To scale electropherograms vertically:

Step  Action
1     Place the cursor over an electropherogram.
2     Press and hold down the Option key. The cursor changes to a peak shape with a vertical arrow.
3     As you hold down the Option key, click a peak and drag up or down. When you release the mouse button, all electropherograms are rescaled to the new height.

Changing Row Height

Change the row height to compensate for changes to the vertical scale.

To change the row height:

Step  Action
1     Move the cursor to the bottom of an electropherogram. When it is near the horizontal line that marks the bottom of the row, the cursor changes to a bidirectional arrow.
2     Click the horizontal line and drag up or down. When you release the mouse button, all electropherograms are drawn in proportion to the new row height.
Changing the Display Parameters

Introduction

You can use the Settings dialog box to specify the following:
♦ Row height and vertical scale of displayed electropherograms
♦ The minimum separation between sequences that are displayed in the same line
♦ The color of base sequences
♦ The color of ambiguous base sequences
♦ The characters that represent insertions in the consensus sequence, or ambiguity between the component sequences of the consensus
♦ The minimum threshold for consensus bases
♦ The manner in which forward and reverse strands are displayed
♦ Network connection preferences

IMPORTANT  If you are using a network version of the AutoAssembler software, the network parameters appear at the bottom of the dialog box. These were set up when your system was installed or by your system administrator. If your Macintosh computer is unable to communicate with the server, see your system administrator. Do not change the Host or Service parameter values unless you are instructed to do so by your system administrator.

Opening the Settings

To display the Settings dialog box (see Figure 4-5), choose Settings from the Edit menu.

Figure 4-5 The Settings dialog box. The figure callouts identify the following areas of the dialog box:
♦ Sets the height of the rows that display electropherograms
♦ Sets the relative height of the electropherogram in the row
♦ Click any of the color boxes to display a color picker and change the default color
♦ Choose the insertion and ambiguity characters, and the base threshold
♦ Choose how forward and reverse strands will be displayed
♦ ID numbers for a connected Server

Changing Row Height and Vertical Scale

You can change the appearance of electropherograms by using the Settings command. Changes that you make to displayed electropherograms using the mouse (see page 4-10) are reflected in the Settings dialog box.
To scale electropherograms vertically, or to change row height, using the Settings command:

Step  Action
1     Choose Settings from the Edit menu. The Settings dialog box appears.
2     Type a new number in the Vertical Scale entry field and the Row Height entry field. The number in the Vertical Scale entry field expresses the peak height relative to the number (in inches) in the Row Height entry field. For example, if you set the Row Height to 1 and the Vertical Scale to 0.5, the peaks are scaled to half the height of the row that displays them.
3     Click OK.

Note  Clicking Default Settings resets the fields to the program defaults.

Changing Minimum Separation

Sequences are often displayed in the same line in the project views in order to conserve vertical space and increase the readability of the views. How close together the sequences are depends on the settings you enter.

To change the distance between sequences that are displayed on the same line:

Step  Action
1     Change the value in the entry field labeled “Min Separation.” This parameter defines the minimum length (number of characters) that must exist between two sequences if they are to be displayed on the same line.

Selecting Base Color

You can change the appearance of any of the bases in order to make them easier to see on the monitor you are using. Changing the ambiguous base color may make these bases easier to see in the Layout view.

To change the color used to mark bases:

Step  Action
1     Select Settings from the Edit menu.
2     Click any of the base color buttons. The color picker appears.
3     Select a new color by clicking on the color wheel, using the scroll bar, or scrolling fields.
4     Click OK when you are finished.

Note  Do not duplicate any colors that are used to distinguish bases.

Changing Consensus Characters

To change the insertion or ambiguity characters, enter a new character in the Insertion Char or Ambiguity Char entry field.
♦ The Insertion Char field defines the character that indicates insertions in the consensus sequence.
♦ The Ambiguity Char field specifies the character that depicts ambiguity at a particular base position of the consensus. This character is placed on a separate line below the consensus sequence in the Alignment view. For an example, see Figure 4-3 on page 4-6.

Changing Threshold Value

The value in the Threshold field determines when ambiguity characters are displayed in the consensus sequence. The value refers to the percentage of bases in the underlying sequences of the consensus that are the same. For example, if there are three As and one G in the four sequences underlying the consensus at a particular base position, an A is displayed in the consensus if the threshold value is 75 percent or lower. If not, a lowercase (lower certainty) or ambiguity character is displayed.

Changing Orientation Parameters

By default, forward and reverse sequences are displayed the same way in the Alignment view. If, however, you must make the different directions easy to distinguish, you can change the text styles in which the bases are displayed.

To change the way sequences are displayed:

Step  Action
1     Select Settings from the Edit menu.
2     Select the Forward or Reverse radio button.
3     Select one of the three checkboxes to determine how forward or reverse strands will be displayed.
4     Click OK when you are finished.

Changing Network Parameters

These parameters should have been set when the program was installed or by your system administrator. Do not attempt to change these settings without the permission of your system administrator.
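
The arithmetic behind the Threshold field (see “Changing Threshold Value”) can be sketched in a few lines of Python. This is only an illustration of the percentage comparison, not the AutoAssembler implementation: the function name call_consensus is hypothetical, and returning the majority base in lowercase when the threshold is missed is an assumption (the manual says only that a lowercase or ambiguity character is displayed).

```python
from collections import Counter

def call_consensus(column, threshold_percent):
    """Return the consensus character for one alignment column.

    column: the bases from the underlying sequences at one position.
    threshold_percent: minimum percentage of identical bases required.
    """
    counts = Counter(base.upper() for base in column)
    base, count = counts.most_common(1)[0]
    percent = 100.0 * count / len(column)
    if percent >= threshold_percent:
        return base          # enough agreement: show the base in uppercase
    return base.lower()      # assumption: flag lower certainty with lowercase

# Three As and one G, as in the manual's example.
print(call_consensus("AAAG", 75))   # A
print(call_consensus("AAAG", 80))   # a
```

With three of four bases in agreement the column scores 75 percent, so it passes a threshold of 75 but not the default of 80.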
Manipulating Window Displays

Introduction

The AutoAssembler software allows you to manipulate the project and sequence windows in order to get a better look at your data. You can also clone the project window in order to see multiple views of the same data.

Arranging Multiple Windows

When you have opened more than one sequence window, you can quickly organize the open windows by tiling or stacking them, using the respective commands in the Window menu.

Tiling

To arrange the windows so they do not overlap and a good-sized portion of each is visible, choose Tile from the Window menu. This method is useful when you have only a few windows open. Figure 4-6 shows an example of tiled windows.

Figure 4-6 Tiled windows (callouts: Project window, Sequence windows)

Stacking

Choose Stack from the Window menu to arrange a large number of open windows so they are reduced in size, and stacked from back to front so that an edge of each is visible. When the windows are stacked, you can bring any window to the front by clicking the exposed edge of that window or selecting the project file name from the Window menu. Figure 4-7 shows an example of stacked windows.

Figure 4-7 Stacked windows (callouts: Project window, Sequence windows)

Cloning the Project Window to See Multiple Views of the Data

If you want to look at your data in more than one format, or if assembly has resulted in more than one contig and you want to compare them, you can create more than one window of the project by cloning the original window. Each clone displays the data independently, so you can look at several levels of data. For example, you can display the Layout and the Alignment views at one time, or you can display a different contig in each window.

To clone the project window:

Step  Action
1     Click the project window to make it the active window.
2     Choose Clone from the Window menu.

Figure 4-8 shows an example of a cloned project window displaying two different views.
Figure 4-8 Cloned project window showing the Layout and Alignment views (callout: When you select a range of bases, the same range is selected in the cloned window)

Since both windows display the same project, they have the same name in the Window menu. The name with a checkmark beside it in the Window menu is the frontmost window on the screen.

Locating Sequences

Introduction

The AutoAssembler software provides tools for locating particular sequences in each contig in the project file, and for locating specific patterns in a sequence. These tools are:
♦ Find (Again): Finds sequences in the project window, and patterns in the Sequence view of the sequence window
♦ Search (Again): Searches for patterns within specific sequences in all contigs of the project

Finding Sequences and Patterns

You can use the Find and Find Again commands in the Edit menu to search either for text (or a specific sequence) in the project window, or for a pattern of bases in the sequence window. In the Sequence view of the sequence window, you can use the Find command to search for gaps or any string of characters in a sequence. You can also quickly repeat a search operation using the Find Again command, which locates the same information as in the previous Find command.

To find a pattern of bases in the Sequence view:

Step  Action
1     Place the insertion point at the location where you want the search to begin.
2     Choose Find (⌘-F) from the Edit menu. A dialog box appears. If the insertion point is at the end of a sequence, you must specify “Wrap around” in the dialog box, or move the insertion point.
3     Type or paste the pattern for which you want to search (up to 255 characters) in the “Find What?” entry field.
4     Click the appropriate radio buttons and checkboxes to specify the parameters of the search.
      ♦ Select “Literal” to specify that all characters be matched exactly as you entered them.
      ♦ Select “IUPAC/IUB” if you have entered an IUB character as part of the pattern. If you select IUPAC/IUB, the Find command locates all possible matches. For example, if the pattern you enter is ATM, the command locates either ATA or ATC.
      ♦ Select “Grep” to set your own codes to represent a wildcard or part of a sequence. Table 4-2 shows a list of the available options.
      ♦ Select “Offset” to move the cursor to a specified position or range. If you simply enter a number in the “Find What?” entry field, the insertion point is moved to that base position. If you enter a range of numbers, the whole range is highlighted (for example, 123…230).
      ♦ Select “Case sensitive” if you want uppercase and lowercase variants of a letter to be recognized as different symbols.
      ♦ Select “Wrap around” if you want the search to start again at the beginning of the sequence after it has reached the end. If the “Wrap around” checkbox is not selected, the search stops at the end of the sequence.
5     Click Find.

Table 4-2 provides special expressions for use with the “Grep” option of the Find command.

Table 4-2 Selection Expressions for “Grep” Option

Expression:       [ ] (brackets)
Match Performed:  Any character inside the brackets
Example:          AA[AC][GT] matches AAAG, AAAT, AACG or AACT. [AGC] matches A, G or C.

Expression:       [¬ ] (brackets with ¬ (Option-L) as first character inside)
Match Performed:  Any character except the character(s) inside the brackets
Example:          A[¬AG]C matches ACC or ATC.

Expression:       * after character
Match Performed:  Zero or more such characters
Example:          AT[CG]*T matches ATT or ATCT or ATGGT, and so on.

Expression:       . (period)
Match Performed:  Any character
Example:          AA.A matches AAAA, AACA, AAGA, AATA, AANA, and so on.

Expression:       – (dash) enclosed by brackets
Match Performed:  A range of characters
Example:          AA[A-z] matches AAA, AAC, AAG, AAz, and so on.
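
The “IUPAC/IUB” and “Grep” options behave much like regular expressions, so their matching rules can be tried outside the program. The following Python sketch is an illustration under stated assumptions, not the AutoAssembler search engine: the function names are hypothetical, the code table is the standard IUPAC/IUB nucleotide nomenclature, and the manual's ¬ negation marker is translated to the regular-expression ^.

```python
import re

# Standard IUPAC/IUB nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def iupac_to_regex(pattern):
    """Expand each IUPAC code into a bracketed character class."""
    parts = []
    for code in pattern.upper():
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)

def grep_to_regex(expression):
    """Translate the manual's negation marker (Option-L) to regex syntax."""
    return expression.replace("[¬", "[^")

def find_all(regex, sequence):
    """Return (position, matched text) pairs using 1-based base positions."""
    return [(m.start() + 1, m.group()) for m in re.finditer(regex, sequence)]

print(iupac_to_regex("ATM"))                               # AT[AC]
print(find_all(iupac_to_regex("ATM"), "GATACCATCG"))       # [(2, 'ATA'), (7, 'ATC')]
print(find_all(grep_to_regex("A[¬AG]C"), "AACACCATCAGC"))  # [(4, 'ACC'), (7, 'ATC')]
```

The first call reproduces the manual's ATM example (M stands for A or C); the last reproduces the A[¬AG]C row of Table 4-2.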
The AutoAssembler software finds the first instance of the pattern you specified and marks its position in the summary graphic at the top of the Sequence view.

Note  If you only want to find a pattern in the valid range, place the insertion point just before this range in the sequence.

To find other occurrences of the same pattern:

Step  Action
1     Choose Find Again (⌘-G) from the Edit menu to bypass the dialog box and use the pattern defined in the previous Find command. Each time you use this command, the next occurrence of the specified pattern is located.

Searching for Sequences

Use the Search command to search for sequences or patterns within specific sequences. This command searches across all contigs in the project. When you use the Search command, the AutoAssembler software looks for literal matches and does not search the consensus sequences. When the program finds a match, it selects that sequence in the sequence list, and, if you specified a pattern, it highlights the pattern in the Alignment view.

To search for sequences or patterns:

Step  Action
1     Choose Search from the Edit menu. A dialog box appears.
2     Enter the filename or the sample name or both.
      Note  You only need to enter enough of the name for it to be distinguishable from the other samples or files.
3     Enter a simple pattern if you want to search by pattern.
4     Click Search.
5     To continue the search, choose Search Again from the Edit menu.

Chapter 5 Editing the Project

Overview

Introduction

After assembling sequences, you may need to examine and edit ambiguous areas in the consensus resulting from the assembly. To do this efficiently, you should understand the various views available in the project window and in the sequence window.
(See Chapter 4, “Viewing the Consensus.”) You can edit the sequences in either window, but edits made to sequences by changing the consensus in the project window are saved only to the project, not to the individual sequences (see Chapter 8, “Saving and Printing in AutoAssembler”). To edit an individual sequence, see Chapter 6, “Viewing and Editing Sequences.”

In This Chapter

This chapter contains the following topics:
♦ Locating and Controlling Ambiguity in the Consensus
♦ Resolving Ambiguity in the Project Window
♦ Verifying Orientation and Redundancy

Locating and Controlling Ambiguity in the Consensus

Introduction

The AutoAssembler software displays ambiguity in the consensus so that you can easily find and correct problems with the underlying sequences or their assembly. Ambiguities are displayed as special characters, and positions of lower confidence are represented by lowercase characters.

Once you find ambiguous areas, you can edit the contig and the underlying sequences in the project window (see “Resolving Ambiguity in the Project Window” on page 5-10). You can also alter the consensus in the following ways:
♦ Use the Threshold value in the Settings dialog box to control ambiguity in the consensus
♦ Complement the contig to view all project window and sequence data as a complementary strand of DNA
♦ Convert the consensus to a three-frame protein translation

Using the Views to Locate Problem Areas

The project window provides a quick and useful way to edit assembled sequences. You can easily locate problem areas using the Layout view or the Alignment view. If you want to view graphical sequence information for files created by ABI PRISM DNA Sequencing Analysis software in order to clarify the base calls, you can cause synchronized electropherograms to drop down from the sequences. You can then edit either the consensus sequence or any of the component sequences in the Alignment view.
Your edits are immediately reflected in the related components or the consensus sequence.

Note  To read about displaying and scaling electropherograms, see “Displaying Electropherograms” on page 4-9.

Each of the project window views displays problem or ambiguous areas, so you have several ways of locating bases or regions to edit.
♦ The Layout view shows a comprehensive view of a contig. Ambiguous areas are highlighted with the default color (gray). By using the Zoom command, you can view increasingly detailed ambiguous regions of the contig.
   – Zooming in from the Layout view shows ambiguities, as well as a compressed view of the data, and is particularly useful for locating ambiguous sequence ends that can be deleted from the valid range of data used for assembly.
♦ The Alignment view provides progressively focused views of individual ambiguities and positions of low confidence. It shows ambiguities marked with color in the consensus, and displays ambiguity characters that indicate positions in the consensus where ambiguities or insertions exist.
   – Zooming in from the Alignment view is useful for comparing ambiguous sequence bases with their corresponding electropherograms (see “Using an Electropherogram to Resolve Ambiguities” on page 5-6).
♦ The Statistics view allows you to locate areas of the contig that do not have enough sequences (or enough in each orientation) to provide adequate data redundancy (see “Verifying Orientation and Redundancy” on page 5-20).

Finding Ambiguities Quickly

Use the Tab key to find ambiguities quickly in the consensus or any of the underlying sequences. Table 5-1 shows the various options available.
Table 5-1 Key Commands for Locating Ambiguities

Key Command         Action Performed
Tab                 Find next ambiguity (character other than ACGT)
Shift-Tab           Find previous ambiguity (character other than ACGT)
Option-Tab          Find next ambiguity excluding gaps (character other than ACGT or gap character)
Shift-Option-Tab    Find previous ambiguity excluding gaps (character other than ACGT or gap character)

When you reach the last ambiguity in a sequence, the AutoAssembler software sounds an alert. Use Shift-Tab to move backwards through the sequence.

Note  You can also find bad regions of the consensus quickly using the Select Next Bad Region (⌘-H) command in the Edit menu.

Controlling Ambiguity in the Consensus

Because different projects may require different levels of ambiguity, you can specify a threshold that determines the percentage of bases below which an ambiguity character appears in the consensus. This setting applies globally to all projects.

To control ambiguity in the consensus:

Step  Action
1     Choose Settings from the Edit menu. The Settings dialog box appears.
2     Enter a threshold value in the Base Threshold entry field. For a base to appear in the consensus, it must appear at the same position in at least this percentage of the underlying sequences.
      Note  The default value is 80 percent.
3     Click OK.

Complementing a Contig

The AutoAssembler software allows you to display a contig as though it is from the complementary strand of DNA. When you do so, the data is complemented in all views of the project window and in the sequence window.

To complement a contig:

Step  Action
1     Select a contig in the upper-left pane of the project window.
2     Choose Complement from the Edit menu. The entire contig is complemented.

Note  When the selected contig is complemented, a checkmark appears beside the Complement command in the Edit menu.
To revert the contig, choose Complement again (the checkmark disappears).

Translating the Consensus to Protein Sequences

To facilitate editing decisions, you can display a three-frame protein translation of the consensus. This can be useful if you are looking for sequencing errors that cause a potential frame-shift. The Translation view appears as three lines of text below the consensus (see Figure 5-1).

Figure 5-1 Protein translation of consensus sequence

The single-character amino acid code aligns with the third position of each codon. For example, for the sequence ATGCCA, the code M (for methionine) aligns with the G, and the code P (for proline) aligns with the final A. There are three rows of amino acids, one for each possible reading frame.

You cannot print, copy, or save the protein translation. You also cannot directly edit it, although the protein codes update when the underlying consensus changes.

The protein translation uses a universal codon table when translating ambiguities in the consensus sequence (see Appendix C).

Using an Electropherogram to Resolve Ambiguities

The Alignment view is particularly useful for comparing sequence calls with their associated electropherogram (see Figure 5-2).

Figure 5-2 Alignment view

In this example, you could use the electropherogram to resolve the lowercase t (circled in Figure 5-2). The electropherogram uses the same colors as the corresponding bases, allowing you to pick out the strongest signal at any given point.

Finding Ambiguous Areas

Use the Layout and Alignment views to find problem areas in the consensus or in the underlying sequences.

To locate ambiguous areas in the project window:

Step  Action
1     In the Layout view, click each of the file names in the sequence list, and observe the positions or regions marked by the ambiguity color in the boxed area of the consensus sequence axis.
      or
      Use the Select Next Bad Region (⌘-H) command from the Edit menu.
2     Choose Zoom In (⌘-=) from the Window menu until you can clearly see the ambiguous area.
3     Locate ambiguities by looking for ambiguity characters under the consensus sequence or short black bars in the component sequences. A substantial number of ambiguous characters at either end of a sequence could indicate sufficient ambiguity to warrant removing that region from the valid range of data used for assembly. You can do so using Delete From Valid Range in the Edit menu (refer to “Editing the Valid Range of Data Used for Assembly” on page 5-18).
4     Change to the Alignment view by clicking the Alignment view button.
5     Locate a region that shows several ambiguity characters (the default shows bullets) under the consensus sequence.
      or
      Use the Select Next Bad Region command in the Edit menu.
      If necessary, zoom in (⌘-=) to the area in question (see “Using an Electropherogram to Resolve Ambiguities” above). If necessary, double-click the sequences to display the underlying electropherograms.
6     Examine the region, and, if you determine that it should be edited, proceed with the steps described in “Resolving Ambiguity in the Project Window” on page 5-10.

Resolving Ambiguity in the Project Window

Introduction

The AutoAssembler software allows you to add, delete, replace, and shift bases, either individually or in groups. AutoAssembler handles sequence editing and consensus editing slightly differently, and the differences are described with each procedure.
The following procedures briefly describe options for editing sequences and the consensus, and provide several examples.

Note  You should always edit with lowercase characters (in the consensus, newly entered characters will still appear uppercase, but underlying sequences will be lowercase). Doing so makes it easy to locate areas you have edited. In both the Alignment view and when you zoom in from the Layout view, lowercase bases appear as half-height bars.

Editing in the Consensus

If you edit the consensus, all component sequences that overlap at the edited position are changed to match your edit, as described in the editing procedures in this chapter. If you edit one of the underlying sequences, the consensus immediately reflects the change. Sequences that overlap with the edited sequence do not change.

When you add bases (A, C, G, or T) to the consensus, the added bases always appear in uppercase because they reflect your input to the consensus. However, if you enter them as lowercase, the underlying sequences will remain lowercase to show less than 100 percent certainty in the edit. If you enter a base in the consensus in uppercase, the underlying sequences will be uppercase as well.

Note  This only applies to A, C, G, or T. Other characters imply less than 100 percent certainty, and will appear as lowercase in the consensus.

What Gets Saved When You Edit

If you are editing either individual sequences or the consensus from the project window, your edits are saved into the project file only when you save the project.

Note  Changes to the sequences made in the project window are not saved to the individual sequence when you save the project. You must either use the Save Sequences command, or save from the sequence window (see “Project and Sequence Relationships” on page 8-3).
Keeping Track of Your Edits

When you have edited a sequence, a triangle appears beside the sequence name in the sequence list (in the upper-right pane of the project window).

To keep track of edited bases, a simple rule is to edit with lowercase characters (they will still appear uppercase in the consensus). When you do, the edited bases appear in Alignment view sequences as lowercase characters. All other sequence characters are uppercase, so the edits are easy to locate. They are also easy to find using the Zoom command, where the lowercase bases appear as half-height bars.

Selecting Bases or Sequence Segments

The keystrokes listed in Table 5-2 allow you to quickly select a single character, a range of bases representing a segment of a sequence, or an entire sequence.

Table 5-2 Keystrokes for Selecting Sequences in the Project Window

Keystroke                         Selection performed
Left-Arrow (←)                    Moves the cursor to the left one base.
Right-Arrow (→)                   Moves the cursor to the right one base.
Shift-Left-Arrow (⇑ ←)            Selects the next base to the left. Holding down the Shift key and pressing the arrow key additional times extends the selection.
Shift-Right-Arrow (⇑ →)           Selects the next base to the right. Holding down the Shift key and pressing the arrow key additional times extends the selection.
Option-Left-Arrow (⌥ ←)           Moves the cursor to the left end of the current sequence.
Option-Right-Arrow (⌥ →)          Moves the cursor to the right end of the current sequence.
Shift-Option-Left-Arrow (⇑ ⌥ ←)   Selects a range from the cursor position to the left end of the sequence.
Shift-Option-Right-Arrow (⇑ ⌥ →)  Selects a range from the cursor position to the right end of the sequence.
Up-Arrow (↑)                      Moves the cursor up one sequence in both the sequence list and the Alignment view. This operates in a circular manner. If the cursor is in the top sequence, it moves to the bottom sequence.
Down-Arrow (↓)                    Moves the cursor down one sequence in both the sequence list and the Alignment view. This operates in a circular manner. If the cursor is in the bottom sequence, it moves to the top sequence.
Option-Up-Arrow (⌥ ↑)             Moves the cursor to the consensus sequence.
Option-Down-Arrow (⌥ ↓)           Moves the cursor to the bottom sequence in the contig.

Adding Bases

When you insert a base to the left of a gap, the base replaces the gap in the sequence. For example, typing c to the left of the gap in the sequence AA–CT results in the sequence AAcCT.

When you insert a base to the right of another base, place-marker gaps are inserted in the overlapping sequences to maintain the downstream alignment. If there was a gap to the right of the character you typed, it remains.

Example: Adding a base to the right of a gap

Step  Action
1     Insert a t to the right of the gap in the middle of the following sequence:

         AATCT
         AA–CT
         AATCT

      The following sequence results:

         AAT–CT
         AA–tCT
         AAT–CT

      You can select the t and choose Shift Left from the Edit menu, or type ⌘-Shift-Left-Arrow, to shift the t to the left and align the gaps.

Note  When you enter lowercase characters in the consensus, the new bases will still appear as uppercase characters because they reflect your input to the consensus. The underlying sequences will appear in lowercase.

To add bases:

Step  Action
1     Click to place the cursor at the position where you want to add a base or multiple bases.
2     Type the new character or characters you want to insert.

Deleting Bases

When you delete bases, gap characters maintain downstream alignment of the sequence with the contig. For example, deleting the N from the sequence AANCT results in the sequence AA–CT. When you delete a range of bases, place-marker gaps replace each of the deleted bases.
For example, deleting NC from the sequence AANCT results in AA––T.

When you replace a base or range of bases in the consensus, the corresponding bases in the underlying sequences change to match the consensus.

Example: Deleting a gap in the consensus

Step  Action
1     Delete the gap in the following alignment (the first row is the consensus, which is italicized in the printed manual):

         AAGc–TCT
         AAG––TCT
         AAGCTTCT
         AAGC–TCT

      The following sequence results:

         AAGcTCT
         AAG–TCT
         AAGCTCT
         AAGCTCT

      Two gaps and one T in the underlying sequences are deleted.

To delete bases:

Step  Action
1     Delete a base or bases in one of the following ways:
      ♦ Click to the right of the desired base, or select a range of bases, and press Delete.
      ♦ Click to the left of the desired base and press the Forward Delete key (⌦).
      ♦ Use the standard Macintosh Cut (⌘-X) command.
      ♦ Replace the base or bases as described below.

Note  To delete a range of bases from either end of a sequence in order to change the valid range of data used for assembly, see the procedure on page 5-18. It allows you to alter the valid range without losing the sequence data.

Replacing Bases

When you replace a base or range of bases in the consensus, the corresponding bases in the underlying sequences change to match the consensus. If you type lowercase characters, the bases that are changed in the underlying sequences appear as lowercase characters so you can locate them easily. Underlying sequences that do not change remain as uppercase characters.

Note  When you enter lowercase characters in the consensus, the new bases will still appear as uppercase characters because they reflect your input to the consensus. The underlying sequences will appear in lowercase.
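
The two deletion behaviors described under “Deleting Bases” can be sketched as follows. This is a minimal illustration, not the program's own logic: the function names are hypothetical, a plain hyphen stands in for the gap character, and the column-deletion function covers only the documented case of deleting a gap from the consensus.

```python
GAP = "-"  # assumption: plain hyphen standing in for the manual's gap character

def delete_in_sequence(sequence, start, length=1):
    """Delete bases from one sequence, leaving place-marker gaps behind."""
    return sequence[:start] + GAP * length + sequence[start + length:]

def delete_consensus_gap(rows, column):
    """Delete a gap from the consensus (rows[0]) and that column from every row."""
    if rows[0][column] != GAP:
        raise ValueError("this sketch only handles deleting a consensus gap")
    return [row[:column] + row[column + 1:] for row in rows]

print(delete_in_sequence("AANCT", 2))      # AA-CT
print(delete_in_sequence("AANCT", 2, 2))   # AA--T

# The alignment from the example "Deleting a gap in the consensus".
alignment = ["AAGc-TCT", "AAG--TCT", "AAGCTTCT", "AAGC-TCT"]
print(delete_consensus_gap(alignment, 4))
# ['AAGcTCT', 'AAG-TCT', 'AAGCTCT', 'AAGCTCT']
```

Deleting inside a single sequence keeps the sequence length constant, while deleting the consensus gap shortens every row by one column, which is why two gaps and one T disappear from the underlying sequences.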
Example: Replacing bases in the consensus

Step  Action
1     Replace ct in the consensus in the following alignment (the first row is the consensus, which is italicized in the printed manual) by selecting the characters and typing ct:

         AActTT
         AAGTTT
         AACCTT
         AACCTT
         AACTTT
         AAGTTT

      The following alignment results:

         AACTTT
         AAcTTT
         AACtTT
         AACtTT
         AACTTT
         AAcTTT

      In the third position, a c replaces G in the top and bottom sequences. In the fourth position, t replaces C in the second and third sequences.

Note  Whenever you enter characters in the consensus, they are displayed as uppercase, regardless of whether or not you entered them in uppercase. However, as in this example, entering in lowercase does affect the underlying sequences.

To replace bases:

Step  Action
1     Drag to select the desired base or bases.
2     Type a new character or characters to replace the selected ones. The change is reflected immediately.

Note  When you highlight a range of bases, then type one character, the first base is replaced by the character, and the other selected bases are replaced by gaps. If you continue to type other characters, the gaps are replaced.

Note  To replace a single base, you can also click to the right of the base you want to edit, backspace to remove the character, and then type the new character. When you backspace, a gap character maintains the alignment of the sequence in the contig. Typing a new character replaces the gap character with that character.

Shifting Bases

Instead of using Cut and Paste commands, you can shift bases to the left or right in the consensus sequence.

To shift bases or sequence segments:

Step  Action
1     Select the base or segment. See Table 5-2 on page 5-11 for a list of keyboard shortcuts you can use to select bases or sequence regions.
2     Choose Shift Left or Shift Right from the Edit menu.
      or
      Press ⌘-Shift-Left-Arrow or ⌘-Shift-Right-Arrow.
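
The way a consensus replacement propagates to the underlying sequences (see “Replacing Bases”) can be modeled with a short Python sketch. It is an approximation for illustration only: the function name is hypothetical, every underlying sequence is assumed to span the edited positions, and a base is rewritten only when it differs from the typed base regardless of case.

```python
def replace_in_consensus(consensus, sequences, start, typed):
    """Replace consensus bases and propagate the edit to the underlying rows."""
    end = start + len(typed)
    # The consensus always shows the typed bases in uppercase.
    new_consensus = consensus[:start] + typed.upper() + consensus[end:]
    new_sequences = []
    for seq in sequences:
        chars = list(seq)
        for offset, new_base in enumerate(typed):
            pos = start + offset
            # Only bases that disagree with the edit change; they keep the typed case.
            if chars[pos].upper() != new_base.upper():
                chars[pos] = new_base
        new_sequences.append("".join(chars))
    return new_consensus, new_sequences

consensus = "AActTT"
underlying = ["AAGTTT", "AACCTT", "AACCTT", "AACTTT", "AAGTTT"]
print(replace_in_consensus(consensus, underlying, 2, "ct"))
# ('AACTTT', ['AAcTTT', 'AACtTT', 'AACtTT', 'AACTTT', 'AAcTTT'])
```

The output matches the alignment in the replacement example: the lowercase c and t mark exactly the bases that were changed, and the fourth sequence, which already agreed with the edit, is untouched.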
Editing Examples

This section uses a specific overlap to demonstrate some different editing options you might choose. Assume that, after assembly, you have the following three fragment overlaps:

AAGC–ACT
AAGNCACT
AAGC–ACT

In this case, you have chosen to edit the overlap by removing the N and realigning the sequences, although in other circumstances the character you edit might depend on what appears in the electropherogram data.

Example 1:

Step  Action
1     Drag to select the NC in the middle sequence.
2     Type C to replace the two selected bases. This creates the following alignment:

      AAGC–ACT
      AAGC–ACT
      AAGC–ACT

Note Reassembling removes the unnecessary gaps. If you want to remove them without reassembling, select the corresponding gap in the consensus and press Delete.

Example 2:

Step  Action
1     Click to the left of the N.
2     Press the Forward Delete key (⌦). The alignment looks like:

      AAGC–ACT
      AAG–CACT
      AAGC–ACT

      Reassembling removes the unnecessary gaps. If you want to align the Cs and gaps, follow Step 3 and Step 4.
3     Press Shift-Right-Arrow. This selects the C.
4     Press ⌘-Shift-Left-Arrow. This shifts the C to the left to align it:

      AAGC–ACT
      AAGC–ACT
      AAGC–ACT

Example 3:

Step  Action
1     Click to the right of the N.
2     Press Delete. This creates the following alignment, which is the same as that created in Step 2 of the previous example:

      AAGC–ACT
      AAG–CACT
      AAGC–ACT

These three examples demonstrate that many methods exist for editing any one base or region of data. Try different options to discover what is most comfortable or efficient for your own editing purposes.

Editing the Valid Range of Data Used for Assembly

AutoAssembler uses a feature called *ABI_ValidRange to determine the range of sequence data used for assembly. By changing this feature, you can increase or decrease the amount of data used for assembly without altering the contents of the sequence data files. This is useful if the vector or ambiguity range defined by Factura is either longer or shorter than necessary for a given sequence.

You can change the feature in two ways:
♦ Edit the range in the Feature view of the sequence window to make it longer or shorter. Editing a feature in the Feature view of the sequence window is described in “Editing Feature Ranges and Markings” on page 6-14.
♦ Use the Delete From Valid Range command in the Edit menu. You can easily do this in the Alignment view of the project window, as the following procedure shows. This procedure preserves the data in your sequence file, but removes it from the valid range of data used for assembly.

To quickly remove either end of a sequence from the valid range:

Step  Action
1     From the Alignment view of the project window, select the range of bases (at either end of the sequence) you want to delete.
2     Choose Delete From Valid Range from the Edit menu. The selected region is automatically deleted from the *ABI_ValidRange feature, which defines the valid range of the data used for alignment.

Deleting a region from the valid range does not delete it from the sequence. It simply hides the region without moving the sequence, and ensures that all sequence data is preserved.

Verifying Orientation and Redundancy

Introduction

The Statistics view allows you to rapidly locate areas of the consensus that do not have a specified number of sequences. This feature is particularly useful for finding areas that require more sequence data. The parameters the Statistics view uses to check the consensus can be modified in the Configure Statistics dialog box.

Changing Statistic View Parameters

The 2+1 rule is generally accepted as the minimum requirement for good quality data.
This provides one sequence in each orientation and one extra sequence in one of the orientations for verification. For a more stringent redundancy of five, you could specify orientation ratios of 4+1 or 2+3.

To set parameters for the Statistics view:

Step  Action
1     Choose Statistics Settings from the Edit menu. The following dialog box appears:
2     Enter numbers in the two entry fields to specify the number of sequences required in each orientation. You do not need to specify the actual sequence orientations.
3     To change one of the colors that identify failure or compliance, click the color field to the right of the description to display the color picker.
4     Ensure that the “Show average redundancy line” checkbox is selected if you want to show a line that represents average redundancy.
5     Click OK.

Checking the Consensus

The Statistics view provides an overall view of the problem areas in a given contig. Once you locate a potential problem, you can use the Layout view to verify the underlying sequences. You can then add new sequences to attain desired redundancy, or extend the range of the existing sequences (see “Editing Feature Ranges and Markings” on page 6-14).

To locate problem areas in the Statistics view:

Step  Action
1     Change to the Statistics view by clicking the Statistics view button in the lower-left corner of the project window:
2     Locate areas in the consensus that appear to fail your redundancy criteria.
      or
      Choose Select Next Bad Region (⌘–H) from the Edit menu. The program highlights the next entire region of the consensus that does not meet the criteria set in the Statistics Settings, as in the following example:
3     Click the Layout view button. The area selected in the Statistics view is displayed. In the figure above, there are four sequences in the ambiguous area, but all are in the same orientation, failing the orientation test.
4     In the Layout view, verify that the underlying sequences for this position include the minimum number of sequences you specified for one orientation and for the opposite orientation. For example, if you specified 1+2 in the Statistics settings, there should be at least two sequences in one orientation, and at least one in the opposite orientation. If necessary, you could then change the range of a sequence’s valid data to extend a sequence and attain your specified redundancy (see “Editing Feature Ranges and Markings” on page 6-14).

6 Viewing and Editing Sequences

Overview

Introduction

While you are reviewing or editing the consensus, you may find it necessary to view or edit the underlying sequences (for example, in order to extend the range of valid data). Using the sequence window allows you to isolate a particular sequence and view its electropherogram, annotation, and feature data. When you make changes and save in the sequence window, your changes are saved directly to the sequence’s sample file.

In This Chapter

This chapter contains the following topics:

Topic                                                          See Page
Viewing and Editing Individual Sequences in Sequence Windows   6-2
Using the Annotation View                                      6-7
Using the Electropherogram View                                6-8
Using the Sequence View                                        6-11
Using the Feature View                                         6-13

Viewing and Editing Individual Sequences in Sequence Windows

Introduction

The AutoAssembler sequence window displays information about an individual sequence in up to four views:
♦ Annotation view
♦ Sequence view
♦ Feature view
♦ Electropherogram view

You can use this window to view the native (variable) peak spacing in sample file electropherograms generated by ABI PRISM DNA Sequencing Analysis software, or to edit the individual sequences.
Although you will probably do most of your editing in the project window because of the ease with which you can compare sequences and electropherograms, you might occasionally want to use individual sequence windows to view the electropherograms with their native spacing. The electropherograms in the Alignment view are displayed with constant spacing, so they line up properly with the corresponding nucleotide sequences.

You might also want to change the features that are defined in individual sequences by changing the color marking or range of a certain feature. You must perform such changes in the sequence window.

Note If you are using the BioLIMS option, the connection to the database must be open for you to view or edit a sequence.

Opening the Sequence Window

You can open the sequence window in several ways to view individual sequences. You can also open several sequence windows and view them simultaneously.

To open a sequence window:

Step  Action
1     Open the sequence window in one of the following ways:
      ♦ Double-click the name of the sequence in the sequence list.
      ♦ Select the sequence in the sequence list or in the current view by clicking the sequence once, then choose Show Sequence (⌘-D) from the Sequence menu.
      ♦ Select a region of interest in a sequence in the lower pane, then press ⌘-D. The bases you select in the project window are displayed in the sequence window when it opens.

Note If the sequence was produced on ABI PRISM DNA Sequencing Analysis software, the sequence window opens in the Electropherogram view.

Note If you have changed the physical location of the sequence file, you may not be able to view electropherogram data (see “Organizing a From Files Project” on page 3-2).
Viewing the Sequence Window

If your sequence file was produced on ABI PRISM DNA Sequencing Analysis software, the sequence window opens in the Electropherogram view (see Figure 6-1).

Figure 6-1 The sequence window in Electropherogram view. Callouts: lock image; buttons used to change view; the valid range is marked green in the summary graphic; use the size box to change the size or shape of the window.

Immediately below the standard Macintosh computer title line and close box is a display window to the right of a lock image. The horizontal line in this summary graphic represents the length of the sequence, and reflects the cursor position as you move it to different places in the sequence. The valid range of data used for assembly is marked green.

If you click the lock image, the sequence is protected from edits. You cannot Cut from or Paste to the sequence (using the Edit menu) as long as the lock is closed. Click the image a second time to unlock it.

You can display up to four different views by using the buttons located in the bottom-left corner of the window. Each button’s function is described in the following sections.

Note The Electropherogram view is only available for files containing ABI PRISM DNA sequencing and analysis software electropherogram data. If you open a database sequence saved in the Inherit Analysis program, a sequence created using the New Sequence command in Factura, or a Text sequence entered on a word processor, the sequence window opens in Sequence view.

Editing in the Sequence Window versus the Project Window

Edits made in the sequence window can be saved to the original sequence file (which retains the original data).
Edits made to sequences in the project window are only stored in the assembled project, and are not saved to the individual sequences unless you use the Save Sequence command (see “Saving Individual Sequences” on page 8-4).

Closing the Sequence Window

Save any editing before you close the sequence window. See “Saving Individual Sequences” on page 8-4 for instructions on saving. To print any of the four views of the sequence window, see “Printing Sequence Window Views” on page 8-12.

Note See “Project and Sequence Relationships” on page 8-3 for a description of the relationships between project and sequences.

To close the sequence window:

Step  Action
1     Close the sequence window in one of three ways:
      ♦ Click the close box.
      ♦ Choose Close from the File menu while the window is active.
      ♦ Press ⌘-W while the window is active.
2     If you modified the sequence and have not saved it, the following alert box appears, allowing you to save the changes if you want.
      ♦ To save changes only to the project file, click Save. Changes are stored in the project file, but not in the original sequence file.
      ♦ To store changes in the project file and the original sequence file, click Update. The changes are saved to the project file and the sequence file (which also retains a copy of the original data).
      ♦ To cancel closing the window, click Cancel.
      ♦ To continue to close the window without saving features for the named sequence, click Don’t Save. This reverts the sequence to the last saved data.

Using the Annotation View

The Annotation View

The Annotation view shows information stored in the file about the ABI PRISM DNA sequencing and analysis instrument run that produced the sequence data, as well as annotations from a database entry (text files do not have annotations). Click the button shown at left to display the Annotation view.
Figure 6-2 shows an example of the Annotation view.

Figure 6-2 The sequence window in Annotation view

Information in the Annotation view cannot be edited in AutoAssembler.

Using the Electropherogram View

Introduction

The Electropherogram view (Figure 6-3) is available only with ABI PRISM DNA sequencing and analysis software data files. It is useful for viewing electropherograms with their native spacing, or for displaying original base calls while you are editing. Click the button shown at left to return to Electropherogram view from any of the other views.

If you click the sequence in the Sequence view and then switch to the Electropherogram view, the electropherogram shows a range of bases in the region of the sequence where you placed the insertion point. You can zoom in or out, and display the original sequence for comparison if you are editing.

Editing in the Electropherogram View

In the Electropherogram view, you can keep track of your edits by choosing Show Original from the Sequence menu. When you do so, a second line of data that represents the original appears at the top of the window (see Figure 6-3). The line below it represents the data you can edit.

Figure 6-3 Electropherogram view with original data. Callouts: original data; data you can edit.

In the Electropherogram view, the Edit menu commands are not available, and you can only edit one base at a time. You can add, delete, or change bases in much the same way as described for the Sequence view on page 6-11. However, the spacing of the characters is much more precise.

If you use the sequence window Electropherogram view while you are editing, you can choose Show Original from the Sequence menu to display the original data directly above the edited data for reference. See Figure 6-3 for an example.

Moving the Selection

Multiple base positions (approximately ten) are available between the displayed bases in the Electropherogram view.
If you place the insertion point between two characters and click, a position is selected. Following are some hints about moving the selection from one position to another:
♦ To move from base to base, use the Left-Arrow and Right-Arrow keys.
♦ To move from position to position (often pixel-by-pixel), press the Option key while you use the Left-Arrow key. Pressing Option and the Right-Arrow key moves the cursor to the end of the sequence.

IMPORTANT Because the available base positions are so close together, it is possible to select a position very close to one of the bases when you are actually trying to select the base itself. If you do so, you might insert a character when you intend to change an existing character. Use the Zoom command (⌘–=) to make it easier to see and edit individual bases.

Changing Bases

To change a base:

Step  Action
1     Place the insertion point to the right of the character you want to select, and click the mouse button. If necessary, use the Zoom command (⌘–=) to see the bases more clearly.
2     Press the Right–Arrow or Left–Arrow key to move to the base you want to select.
      Note Using the Right–Arrow or Left–Arrow keys moves the cursor base by base only. To select the gaps between bases, press the Option-Left-Arrow key combination.
3     Enter the new base.

Adding Bases

If you add bases in Sequence view and then switch to the Electropherogram view, the new bases are spaced as evenly as possible between the two previously existing bases.

To add a base in the Electropherogram view:

Step  Action
1     Place the insertion point to the right of the point at which you want to insert the character and click the mouse button.
2     To move the insertion point, hold down the Option key and use the Left–Arrow key to move to the position where you want to insert the base.
3     Type the new character.
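The even spacing mentioned under Adding Bases can be pictured with a small calculation. The Python sketch below is an illustration only: it assumes that each base has a numeric position along the electropherogram, which the manual does not describe, and the function name evenly_spaced_positions is hypothetical.

```python
def evenly_spaced_positions(left, right, count):
    """Return positions for `count` new bases placed evenly between two
    existing bases located at `left` and `right` (assumed numeric
    coordinates along the electropherogram)."""
    # Dividing the interval into count + 1 equal steps leaves the same
    # distance between every pair of neighboring bases.
    step = (right - left) / (count + 1)
    return [left + step * (i + 1) for i in range(count)]


# Two bases added between existing bases at positions 100 and 130:
positions = evenly_spaced_positions(100, 130, 2)
# positions is [110.0, 120.0]
```

Under this assumption, adding more bases between the same two neighbors simply shortens the step between them.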
Using the Sequence View

Introduction

To change to Sequence view from any of the other views, click the button shown at left. The Sequence view shows the nucleotide sequence in the center of the window, with the base positions at the beginning and end of each row (see Figure 6-4). The valid range of data is marked green with a bold green underline.

Figure 6-4 The sequence window in Sequence view

In the Sequence view, you can search for specified patterns or use any of the standard Macintosh operating system editing commands (Cut, Copy, Paste, Undo/Redo) to change the bases.

Note To find a specified pattern, see “Finding Sequences and Patterns” on page 4-21.

Editing Sequences

In the Sequence view, you can use the standard editing commands found in the Edit menu to cut, copy, paste, and clear bases or ranges of the sequence in the active window. The Edit menu commands operate as described in the Apple System Software User’s Guide.

Note To select the entire sequence (including marked features), choose Select All in the Edit menu.

Adding Bases

To add a base or range of bases in the Sequence view:

Step  Action
1     Place the insertion point at the position in the sequence where you want to add one or more bases.
2     Type the characters you want to insert.

Deleting Bases

You can delete a base or range of bases by using standard Macintosh editing commands.

To delete a base or range of bases from the sequence:

Step  Action
1     Select the base or range of bases.
2     Press the Delete key or choose Clear or Cut (⌘–X) from the Edit menu.

Changing Bases

You can also change bases in the sequence by highlighting and replacing them in the same way you would replace text in a word processing program.

To change a base in the sequence:

Step  Action
1     Select the base you want to change.
2     Type the new character you want in that position.
Note You can also place the insertion point to the right of the character you want to replace, press the Delete key, then type the character you want in that position.

Using the Feature View

Introduction

The Feature view displays Factura-identified features, as well as features for a database entry (see Figure 6-5). To display Feature view, click the button shown at left.

After you have updated the sequence files with the results of batch worksheet processing in the Factura program, feature ranges are added to the view, identifying portions of the data that represent vector, ambiguity, and confidence range. When you import these sequences into the AutoAssembler program, the vector and ambiguity ranges are used to determine the valid range of the data, effectively eliminating poor-quality data.

Note All the information is maintained in the original data. The data used by the AutoAssembler program is identified by the *ABI_ValidRange feature.

Figure 6-5 The sequence window in Feature view

In the Feature view, you can modify features by changing their ranges, or changing the colors and borders that mark the features.

Editing Feature Ranges and Markings

You can change the range, description, or color marking of any feature in a sequence feature table using the sequence window Feature view.

Changing Features

A sequence file will only have feature information if feature information was entered in Factura.

To change a feature:

Step  Action
1     From the sequence window, click the Feature view button.
2     Double-click the feature you want to change.
      or
      Select a feature and choose Modify Feature in the Sequence menu.
      The following dialog box appears:
3     Make any desired changes to the range as follows:
      a. Select either the beginning or ending value in the “Feature range(s)” entry fields.
      b. Type a new value in the entry field.
      c. Click Replace.
4     Change the feature description as follows:
      a. Select the text in the Description entry field.
      b. Type the new description.
5     Make desired changes to the color marking by choosing one of the eight marking styles in the Style pull-down menu (see Table 6-1).
6     When you have finished making changes in the dialog box, click OK.

Table 6-1 provides a complete list of marking styles available in the Feature view’s Add/Edit Feature dialog.

Table 6-1 Default Marking Styles

Style Name        Color          Border
Blue              Blue           No underline
Red Single        Red            Light underline
Green Bold        Bright Green   Heavy underline
Gray Double       Gray           Double underline
Brown             Brown          No underline
D Green Single    Dark Green     Light underline
D Blue Bold       Dark Blue      Heavy underline
Purple Double     Purple         Double underline

7 Reassembling a Project

Overview

Introduction

Assembling sequences using the AutoAssembler software is an iterative process. You can assemble, edit, and reassemble multiple times until you achieve a satisfactory result. You might reassemble a project for several reasons:
♦ You have added new sequences to the project.
♦ The project has been automatically updated from the BioLIMS database.
♦ You have edited the sequences in the project and want to obtain a clean calculation of the gaps and overlaps.
♦ The previous assembly created more than one contig, and you have either edited the sequences, or changed assembly parameters in such a way as to join the contigs into one.
♦ The previous assembly created what appear to be incorrect overlaps because of repetitive sequence regions, and you have set constraints to correct the overlaps.

Note When you reassemble a project, the number at the end of each contig name increments to reflect the number of times you have assembled.
In This Chapter

This chapter includes the following topics:

Topic                                          See Page
Reassembling with New or Changed Sequences     7-2
Reassembling to Achieve Different Results      7-5

Reassembling with New or Changed Sequences

Introduction

After assembling a project, you might want to add more sequences to create a larger contig, or re-add sequences that you have modified using a different program. To reassemble a project with new or modified sequences, you can simply choose Assemble from the Project menu. The parameters you set for the original assembly are maintained.

Reassembling with New Sequences

When you add new sequences to an assembled project and reassemble, the new sequences are incorporated into the contig, creating a larger consensus for use with other programs.

If you are using the BioLIMS option to autoupdate a project, this procedure is unnecessary. During autoupdating, AutoAssembler automatically adds new sequence files in each designated collection on the BioLIMS database and reassembles the project.

To reassemble with new sequences:

Step  Action
1     Choose Add Sequence(s) from the Project menu. The following dialog box appears:
      Note Choose Add Multiple from the Project menu to select multiple files from different folders.
2     Select the File type checkboxes (“3XX,” “TEXT,” or “Inherit”) to filter for the type of file.
      Note The file list shows only files of the type selected.
3     Add a file or files in one of the following ways:
      ♦ To add only one file, double-click the filename, or select the file and click Add.
      ♦ To add all files of the chosen types that are in the open folder, click Add All.
      A progress indicator appears while the sequences are being added:
4     If necessary, repeat Step 2 and Step 3 to add additional files.
5     Click Unassembled in the contig list of the open project window to see the newly added sequences.
      New sequences are denoted by diamond symbols.
6     Choose Assemble from the Project menu. The diamond symbols disappear and the new sequences are included in the sequence list.

Reassembling with Changed Sequences

If you assemble sequences and then modify the information in the parent disk files (for example, if you assembled sequences and subsequently processed them in Factura), update the sequences associated with the project by re-adding them.

If you use the BioLIMS option to autoupdate a project, this procedure is unnecessary. During autoupdating, AutoAssembler automatically replaces older versions of sequences in the designated collection on the BioLIMS database and reassembles the project.

To reassemble a project with modified sequences:

Step  Action
1     Choose Re-Add Modified Sequences from the Project menu. AutoAssembler checks the modification dates of the sequence files associated with the project, and re-adds sequences that have been modified since they were last changed in AutoAssembler. The re-added sequences disappear from the contig sequence list and appear in the Unassembled sequence list until you reassemble the project.
2     Choose Assemble from the Project menu.

Note It is recommended that you edit your sequences in AutoAssembler. The fast and efficient editing tools provided by AutoAssembler should make editing with outside editing programs unnecessary.

Reassembling to Achieve Different Results

Introduction

If assembly results in two separate contigs or if the sequences appear to be improperly aligned because of repeat regions, you can make the following changes to encourage proper overlaps:
♦ Edit the data
♦ Change constraints (Server option only)
♦ Change the assembly parameters

Once you have made these changes, you may need to reassemble the project. The following sections discuss considerations you should make regarding reassembly.
Reassembling After Editing

When you edit sequences in the AutoAssembler program, the sequence alignment pane of the project window reflects your changes in the consensus sequence, so you do not need to reassemble the project to see editing changes. (See Chapter 6, “Viewing and Editing Sequences,” for editing procedures.) You might, however, want to reassemble in the following circumstances:
♦ If your edits have created unnecessary gaps or made substantial changes to the sequence lengths, you might want to reassemble to obtain clean and consistent overlaps.
♦ When you assemble a project, more than one contig might result if some of the sequences do not meet the overlap criteria specified by the Assembly Setup parameters you set. If this happens, and you edit the resulting contigs in such a way as to overlap them, you can reassemble the project to join them into a single contig.

To reassemble a project after editing the sequences:

Step  Action
1     Choose Assemble from the Project menu. The parameters you set for the original assembly are maintained. If you wish to change the assembly parameters, see Chapter 3, “Creating and Assembling a Project.”

Reassembling After Changing Constraints

After you have used the Server option to assemble sequences, the AutoAssembler software allows you to adjust the relationships between a selected sequence and each of the sequences with which it overlaps. The assembly engine tries to put only the sequences with the largest overlaps within a contig, but sometimes you must override these relationships. For example, you might need to change constraints to resolve incorrectly assembled repeat regions.

To change assembly constraints:

Step  Action
1     Select the sequence of interest.
2     Choose Constrain Overlaps from the Project menu. The following dialog box appears:
      The sequences that overlap with the sequence you selected are listed in the dialog box.
3     Click the name of an overlapping sequence for which you want to change the constraint.
4     Change the constraint by clicking the appropriate radio button.
      ♦ To use the default setting used by the Server algorithm, click Automatic (a bullet appears under the “a”).
      ♦ To strengthen or create an overlap with the selected sequence, click Enhance (a bullet appears under the “e”).
      ♦ To remove the overlap with the selected sequence, click Inhibit (a bullet appears under the “i”).
      Using this procedure, you can modify the relationship between your selected sequence and each of the overlapping sequences.
5     When you are finished, click OK.
6     Choose Assembly Setup from the Project menu. The following dialog box appears:
7     Select the Server icon.
8     Select the checkbox labeled “Use Constraints.”
9     Click Submit to reassemble the project.

Note You must reassemble to see the effect of the changes you have made. When you reassemble the project with only a change in constraints, the assembly takes a small fraction of the time required for the initial assembly.

Resetting Overlap Relationships

If you have changed assembly constraints as described above and want to reset all or some of the overlap relationships, you can reset all relationships, or the relationships of individual sequences.

To reset overlap relationships:

Step  Action
1     Choose Remove Constraints from the Project menu. This resets all constraints in the project to the automatic option.
      or
      Use the Constrain Overlaps command. Follow the same procedure you used to change the constraints originally.

Assembling Projects Without Constraints

You may also choose to assemble a project without constraints. This procedure maintains the constraint settings.

To assemble the project without constraints:

Step  Action
1     Choose Assembly Setup from the Project menu.
2     In the Assembly Setup dialog box, deselect the checkbox labeled “Use Constraints.”
3     Click Submit.

Reassembling After Changing Minimum Overlap and Percent Error

You might want to change the Minimum Overlap or Percent Error parameters after you see the overlaps resulting from a particular set of sequences and parameters. You can do so by choosing Assembly Setup from the Project menu. The procedure is the same as for original assembly (see “Assembling Sequences” on page 3-29).

Note To lessen the number of sequences included in an overlap, try increasing the Minimum Overlap parameter or decreasing the percentage of errors allowed.

Reassembling After Changing Engine Parameters

If you are using an engine that supports user-entered parameters, you might want to change the parameters after viewing the consensus. You can do so by choosing Assembly Setup from the Project menu. The procedure is the same as for original assembly (see “Assembling Projects Using the Engine Options” on page 3-31).

Note The parameters mentioned in this section apply only to the CAP and CAP Remote engines. Other engines may or may not support these parameters.

In particular, if you find that sequences are being incorrectly excluded from the contig, you should consider modifying the following CAP engine parameters and reassembling:
♦ -OVERLEN: Decreasing this value reduces the number of bases required to establish overlap.
♦ -FLEVEL: Decreasing this value reduces the number of matches required in an overlapping sequence.

Note Decreasing these values increases the possibility of incorrectly matched sequences.

If you find that your contig contains excessive areas of ambiguity or incorrectly overlapped sequences, you should consider modifying the following engine parameters and reassembling:
♦ -OVERLEN: Increasing this value means that more bases must match before an overlap will be considered valid.
♦ -FLEVEL: Increasing this value raises the percentage of bases that must match before an overlap is considered valid.
♦ -POS3: If your sequences contain valid data after the default number of bases (450), increase this value.

For more information on all user-entered parameters, see “Assembling Projects Using the Engine Options” on page 3-31.

8  Saving and Printing in AutoAssembler

Overview

Introduction
This chapter provides information on
♦ Printing and saving your work for various purposes
♦ Exporting and importing sequences to and from other programs

IMPORTANT  You should save your work during and after making any significant change in the project or an individual sequence.

In This Chapter
This chapter contains the following topics:

Topic                                                See Page
Saving your Work                                     8-2
Printing and Saving Assembly Reports                 8-7
Printing and Copying the Views for Presentations     8-11
Creating Files for Use with Other Applications       8-16

Saving your Work

Introduction
To prevent losing your work, make sure to save your project after making significant changes in a sequence or in the project window. You can save the entire project or individual sequences.

Saving the Project
IMPORTANT  To preserve the information in a previously created file and create a new file containing changes, either save the open file under a new filename, or save a copy of the open file.

To save a project:
1  You can save a project in one of three ways:
♦ Choose Save from the File menu. If you previously saved the project, it is automatically saved under the same filename. If you have not saved the project before, a standard file dialog box appears so that you can select a location and enter the filename for your file.
♦ Choose Save As from the File menu. When the standard file dialog box appears, type a name for your file in the entry field, select a location for it, and click Save.
♦ Choose Save a Copy In from the File menu. A standard file dialog box appears, allowing you to assign the filename and location. A copy of your current project is saved to the file you name, but the original remains on the screen.

Note  This procedure does not save changes made to sequences to the individual sequence files. These changes are saved to the project only (see “Project and Sequence Relationships” on page 8-3).

Project and Sequence Relationships
When you are ready to save changes to your project, it is important to understand exactly what you are saving.

Project Files
A project file contains the following:
♦ Consensus sequence
♦ Editable sequence data
♦ Links to the original sequences

The original sequences themselves are not part of a project file. This is why moving the sequences can prevent you from displaying electropherograms or opening the sequence window for a particular sequence (see “Organizing a From Files Project” on page 3-2).

When you save a project, changes you have made to a sequence are saved in the project only; they are not written to the original sequence file. For example, if you add the same sequence to another project, none of the edits you made in the original project are visible.

Sequence Files
Sequences contain the following:
♦ Original Sample file data
♦ Editable data
♦ Electropherogram data (only if the sequence was produced by ABI PRISM DNA Sequencing Analysis software)

Original sequence data is stored in the Sample file. In the Factura program, you can revert the sequences to the original data. However, if you do, the edited data is overwritten with the original data, and any edits you have made are lost.

To make a permanent change to a sequence file’s editable data, you must save the sequence in one of two ways:
♦ Use the Save Sequences command from the project window (see “Saving Sequences From the Project Window” on page 8-4).
♦ Use the Save command from the sequence window (see “Saving Sequences From the Sequence Window” on page 8-6).

Saving Individual Sequences
Each sequence in the project has an associated data file containing the characters that make up the sequence. ABI PRISM DNA Sequencing Analysis software Sample files include electropherogram information that defines the four-color electropherogram display of the data.

If you edit sequences in the project window, the changes are not stored in the associated data file until you save to the individual sequence files. Because the editing tools in the project window are so powerful, you might not need to use the sequence window for editing. You can save changes to a Sample file from the project window, or from the sequence window, if you have opened it.

Note  If you are using the BioLIMS option, the connection to the database must be open for you to save a sequence.

Saving Sequences From the Project Window
To save changes to sequences from the project window:
1  Choose Save Sequences from the File menu. The following dialog box appears:
   The sequences you have edited are marked with a checkmark.
2  Click to deselect any sequences you do not want to save. If you want to save additional sequences, click next to them to select them.
   Deselect the “Save sequences with gap characters” checkbox to save sequences without gap characters.
3  Click Save to save the sequence(s).

Note  If the modification dates of your sequences are later than those remembered by the project with which you are working, the program displays the following alert box for each modified sequence after you click Save:
Click OK to save sequences.
Click Force Save in the Save dialog box to save the sequence without checking modification dates.
Force Save is useful if you have another program (such as a backup program) that might change the modification dates of your sequence files while a project is open.

Saving Sequences From the Sequence Window
To save sequences from the sequence window:
1  From any sequence view, choose Save from the File menu.

Note  If the modification dates of your sequences are later than those remembered by the project with which you are working, and you try to save information to the sequences, the program displays the following alert box for each modified sequence after you click Save:

Printing and Saving Assembly Reports

Introduction
After assembling a project, you can view, save, and print the following types of project assembly reports:
♦ Project Summary
♦ Contig Summary
♦ Project Report

Saved reports are in tab-delimited format, so you can open them in many word processing, spreadsheet, and database application programs.

In all the print procedures, if you want to print only one copy or if you do not want to change the print range, choose Print One rather than Print. This carries out your request immediately, bypassing the standard print dialog box.

Project Summary
The Project Summary report summarizes the current status of the entire project, and includes the following information:
♦ The last time it was saved to a project file
♦ The last time it was assembled
♦ A summary of the sequences and bases in the total project
♦ A summary of the sequences and bases in each contig

Figure 8-1 shows an example of the Project Summary.

Figure 8-1  Project Summary

The Contig Summary
The Contig Summary contains detailed information for a single selected contig or for the Unassembled sequence list.
It includes the following:
♦ Sequence lengths, orientations, and project ID numbers
♦ Starting and ending positions along the consensus sequence
♦ Last modification date for each sequence
♦ Chemistry used to produce each sequence (ABI PRISM DNA Sequencing Analysis software)

Figure 8-2 shows an example of the Contig Summary.

Figure 8-2  Contig Summary report (callouts: general project information; assembly parameters used; totals for project; contig information, which includes redundancy, sequence lengths, and information about individual sequences from the sequence list)

Project Report
The Project Report is the most complete report. It contains the information from the Project Summary and detailed information for each contig in the project. It lists the sequences in the Unassembled list and indicates the source format, but it does not compute orientation and offset values.

Figure 8-3 shows an example of the Project Report.

Figure 8-3  Project Report (callouts: Project Summary information; Contig Summary information, which appears for each contig in the project)

Viewing Assembly Reports
All three reports are accessed through the Project menu.

To view an assembly report:
1  Choose the name of the report from the Project menu. The report appears in a report window on the screen.

Saving Reports
Saved reports are in tab-delimited format, so you can open them in many word processing, spreadsheet, and database applications.

To save an assembly report:
1  Choose the type of report you want to save from the Project menu.
2  With the report window active, choose Save (⌘-S) from the File menu. A standard file dialog box appears.
3  Type a name for the file in the entry field.
4  Click Save.

Printing Assembly Reports
All assembly report formats can be printed. In each case, the printed format is the same as the screen format.
To print an assembly report:
1  Choose the type of report from the Project menu. The report window opens, with your selected report displayed in it.
2  Choose Print (⌘-P) from the File menu. The Print dialog box appears.
3  Click Print.

Printing and Copying the Views for Presentations

Introduction
In addition to printing project reports, you can also print from the project window, or from individual sequence windows. You can copy the views to the Clipboard and paste them into other applications, such as files in word processing or presentation programs.

In all the print procedures, if you want to print only one copy or do not want to change the print range, choose Print One rather than Print. This carries out your request immediately, bypassing the standard print dialog box.

Printing Project Window Views
When you print the Layout, Statistics, or Alignment views from the project window for a selected contig, the printed copy shows the name of the project and a ruler to identify the base positions in the consensus.

Note  If you want to insert representations of the Layout or Statistics views into word processing or presentation application files, see “Copying Project Window Views to Other Programs” on page 8-14.

To print a project window view:
1  Make sure the project window is active.
2  Click the button of the view you want to print.
3  Choose Print (⌘-P) from the File menu. The Print dialog box appears.
4  Click OK.

Note  To increase the amount of data printed per page, choose Page Setup from the File menu, and set the parameters for landscape orientation and a reduced size.

If you print the Layout view with file names displayed, the names appear on the printed copy.

When you print the Alignment view, the sequence names appear on the printout. The contig wraps down the page as many times as the length allows, printing on several pages.
Printing Sequence Window Views
You can print any one of the four sequence window views, or all of them at once. When you print the views separately, or print only Annotation, Feature, and Sequence views together, they print in portrait orientation. If you request all of the views at once, AutoAssembler prints Annotation, Sequence, and Feature views on a single page in landscape orientation, and Electropherogram view on several pages, as necessary (also in landscape orientation).

A color printer is recommended for printing the Electropherogram view.

To print a view in the sequence window:
1  Click the sequence window to make it active.
2  Choose Page Setup from the File menu. The Page Setup dialog appears, with additional options (electropherogram settings).
3  If you are only printing the Electropherogram view, click the Landscape option to the far right of Orientation.
   Change the Electropherogram Settings options as follows:
   ♦ Select the radio button labeled “Single Page” to print the entire electropherogram on one page.
   ♦ Select the radio button labeled “Variable Size” to print the electropherogram on several pages. Results will vary depending on the settings in the two entry fields. In most cases, the default settings are sufficient, although you can fine-tune the print by changing the entry fields. These fields specify the number of times the electropherogram wraps down the page and the number of data points displayed within each wrap.
4  Choose Print (⌘-P) from the File menu. The following dialog box appears:
5  Select the checkboxes next to the views you want to print.
   If you select all four checkboxes, the views are printed together in landscape orientation, as described earlier. If you select only one, or the first three, they print separately in portrait orientation.
   If you want to punch holes in the page and file it in a three-ring binder, select the checkbox labeled “Allow for 3-hole punch.” Doing so gives the printout a slightly wider left margin.
6  Click OK to start printing.

Copying Project Window Views to Other Programs
If you want to use graphics from the project window in a word processing or related program to create a report or a presentation, you can copy graphics from the project window to the Macintosh Clipboard, and paste them from the Clipboard into your file in the other program.

To copy graphics from the project window for use in another program:
1  Click the project window to make it active.
2  Click a button to display the view you want to copy.
3  From the view you have opened, select the area you want to copy to the Clipboard.
4  Select Copy (⌘-C) from the Edit menu.
   Note  Only text can be copied from the Alignment view.
5  Select Show Clipboard from the Edit menu to see what you have copied.

Copying a Sequence from the Sequence Window
You can also copy a sequence from the sequence window. The following are two possible uses for a copied sequence:
♦ Create a new sequence file for use with other sequencing-related applications
♦ Incorporate the sequence into a text file as part of a report, article, or presentation

In the sequence window, you can only copy from the Sequence view, and the copied sequence is in text format, rather than graphic format.

To copy an individual sequence from the sequence window:
1  Make sure the sequence window displaying the sequence of interest is active.
   Note  The selected window can display a single sequence fragment or a consensus (see “Building a Consensus Sequence” on page 8-16).
2  Select the entire sequence or a desired range.
3  Choose Copy (⌘-C) from the Edit menu.
   The sequence or range is copied to the Clipboard. You can use the Paste command in another program to paste the contents of the Clipboard into a file associated with that program.

Creating Files for Use with Other Applications

Introduction
The AutoAssembler software can create two types of files for use with other programs.
♦ Consensus files: The consensus file is the end result of AutoAssembler. You can either archive the consensus, or translate the file for use with other programs.
♦ Layout files: Files of a single contig, which can be used with Sequence Navigator, SeqEd, or EditView software.

Building a Consensus Sequence
When you are satisfied with the consensus produced as part of a contig, you can build and save a special consensus sequence file for use in another program, such as Inherit Analysis.

To build a consensus sequence for use with other programs:
1  Select the contig of interest by clicking its name in the upper-left pane of the project window.
2  Choose Build Consensus from the Project menu. The following dialog box appears, with the file’s name in the Name entry field:
3  Use the pop-up menu to choose the case of the characters in the consensus in one of the following ways:
   ♦ To retain the case the characters have in the project window (lowercase characters for ambiguous base positions and uppercase characters for all others), use the default (Mixed).
   ♦ To create a consensus sequence with all uppercase characters, choose “UPPER.”
   ♦ To create a consensus with all lowercase characters, choose “lower.”
   Note  It is easier to identify ambiguous base positions in the consensus if you choose Mixed case.
4  Select the “Delete insertion (gap) characters” checkbox to eliminate gap characters from the consensus.
5  Click OK.
   A sequence window with the consensus sequence appears:
   Note  In a mixed consensus (such as the consensus shown here), lowercase characters denote ambiguities.

This window allows you to switch to either Feature view or Annotation view, but both are empty. If you want to add features, you can do so using Factura (see the Factura User’s Manual).

Note  The consensus sequence does not have an Electropherogram view, since it was not produced by ABI PRISM DNA Sequencing Analysis software.

Exporting a Consensus Sequence
Once you have built the consensus sequence, you have several options for transporting it to other applications:
♦ Copy the consensus sequence to the Clipboard and paste it into a new sequence file in another application.
♦ Save the consensus sequence for future use with another application. It is saved to a Sample file format without an electropherogram.
♦ Export the consensus sequence to text format, as described in the next section, “Exporting Sequences to Text Format.”

To copy the consensus sequence via the Clipboard:
1  With the consensus sequence window active, choose Select All (⌘-A) from the Edit menu.
2  Choose Copy (⌘-C) from the Edit menu.
3  Open a new sequence file in the other application.
4  With the window of the new sequence file active, choose Paste (⌘-V) from the Edit menu.

If you try to close the sequence window without saving, a dialog box asks you to verify whether or not you want to save it.

To save the consensus sequence:
1  Choose Save from the File menu. The standard Save dialog box appears, with the name of the contig in the entry field.
2  Enter another name for the file, if you want to change it.
3  Click Save.

Note  This procedure saves the consensus to a Sample file format without an electropherogram. If you want to save it to a different file format, use the Export command (see the next section).
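
The case and gap options of the Build Consensus dialog box can also be reproduced on a consensus that has already been copied or exported as text. The following Python sketch is only an illustration and is not part of AutoAssembler: the function names, the option names, and the use of “-” as the gap character are assumptions made for this example, and the sample sequence is invented.

```python
def format_consensus(consensus, case="mixed", delete_gaps=False, gap_char="-"):
    """Adjust a consensus string in the spirit of the Build Consensus options.

    case        -- "mixed" leaves characters unchanged; "upper" or "lower"
                   converts every character.
    delete_gaps -- when True, remove every gap character.
    gap_char    -- assumed gap symbol (hypothetical; check your own export).
    """
    if case not in ("mixed", "upper", "lower"):
        raise ValueError("case must be 'mixed', 'upper', or 'lower'")
    if delete_gaps:
        consensus = consensus.replace(gap_char, "")
    if case == "upper":
        return consensus.upper()
    if case == "lower":
        return consensus.lower()
    return consensus


def ambiguous_positions(consensus):
    """Return 1-based positions of lowercase characters in a mixed consensus."""
    return [i for i, ch in enumerate(consensus, start=1) if ch.islower()]


example = "ACGTn-ACrT"  # invented mixed-case consensus, not real data
print(format_consensus(example, case="upper", delete_gaps=True))
print(ambiguous_positions(example))
```

Locating the ambiguous positions before converting the case matters, because an all-uppercase or all-lowercase consensus no longer carries the lowercase markers.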
Exporting Sequences to Text Format
AutoAssembler allows you to export a consensus into a text file. A text file simply contains a string of characters, and can be easily opened in word processing applications.

Note  You can also export the contig to a layout format for use with the SeqEd and Sequence Navigator programs (see the next section).

To export a consensus sequence to another format:
1  Make sure the sequence window containing the consensus is the active window.
2  Choose Export from the File menu, and Text from the submenu that appears.
3  Choose the appropriate file type.
4  The standard Macintosh Save dialog box appears, with the contig name as the default file name. Select the destination folder and click Save.

AutoAssembler Layout Files
Although you can open a layout generated by AutoAssembler in the SeqEd, EditView, and Sequence Navigator programs, it is highly recommended that you edit assembled sequences in AutoAssembler.

IMPORTANT  SeqEd and EditView do not recognize feature table information, and saving edits to a Sample file from either of these programs can invalidate the feature table in the file.

Note  If you find an invalid feature table after such editing and saving, run the sequences in Factura again, using the same settings, but do not revert the sequences to original data. This should re-establish the feature table without overwriting your edits.

If you open an AutoAssembler-generated layout in the Sequence Navigator program, a dialog box appears, indicating that the file will be converted. In order to maintain compatibility with SeqEd and EditView, the layout created by AutoAssembler uses System 6-compatible file references. Sequence Navigator converts the references to System 7-compatible aliases.
Although the Sequence Navigator program does recognize feature tables, some information is lost when you edit sequences in Sequence Navigator, save back to the original sequence files, and then re-add the sequences to AutoAssembler.

Use AutoAssembler to edit assembled sequences. The powerful editing features in AutoAssembler make editing with outside programs unnecessary in most cases.

To export a contig to a layout:
1  Select a contig from the project window and choose Export from the File menu.
2  Choose Layout from the Export submenu. A standard file dialog box appears.
3  Enter a filename for the layout file.
4  Click Save.

Appendix A  AppleScript Dictionary

Overview

Introduction
This appendix provides a complete list of the AppleScript commands supported by the AutoAssembler program. For instructions regarding the use of AppleScript, see Apple’s AppleScript User’s Guide.

In This Appendix
This appendix contains the following topics:

Topic                   See Page
AppleScript Commands    A-2

AppleScript Commands

AutoAssembler Suite
Table A-1 contains events that are specific to the AutoAssembler software.

Table A-1  AutoAssembler Suite

Zoom
   Increases the magnification of the target window.
   Zoom (reference): The window to zoom
   ♦ to (real): The new magnification

Tile Windows
   Arrange open windows so that all are visible.
   Tile windows (reference)

Stack Windows
   Stack open windows.
   Stack windows (reference)

Show
   Show sequence window for selected sequence.
   Show (reference)

Assemble
   Assemble project.
   Assemble (reference): The project to assemble

Show Project Report
   Show the project report window.
   Show project report (reference): The project

Show Project Summary
   Show the project summary window.
   Show project summary (reference): The project

Show Contig Summary
   Show the contig summary window.
   Show contig summary (reference): The contig

Show Consensus
   Show the contig consensus window.
   Show consensus (reference): The contig
   ♦ using title (string): The name of the consensus window
   ♦ gaps (Boolean): Include insertion (gap) characters
   ♦ in (mixed/uppercase/lowercase): Alphabetic case

Add To
   Add fragments to the specified project.
   Add to (reference): The project to contain the fragment
   ♦ fragments (alias): The fragment files to add
   ♦ with ID (string): The identifier of the fragment on the database
   ♦ from database (string): The name of the BioLIMS database to use
   ♦ on server (string): The name of the database’s server

Select Next Bad Region
   Select the next bad region in the project window.
   Select new bad region (reference): The project window

Select All
   Select everything in the project window.
   Select all (reference): The project window

Class Application
   AutoAssembler application
   Properties
   ♦ autoupdate (Boolean): Enable automatic updating
   ♦ update delay (integer): Idle minutes to wait before starting automatic update
   ♦ update list (a list of file): List of projects to be automatically updated

Class Document
   Project document
   Elements
   ♦ contig (by numeric index/by name)
   ♦ sequence (by numeric index/by name)
   ♦ unassembled sequence (by numeric index/by name)

Class Contig(s)
   Contiguous alignment of sequences
   Elements
   ♦ sequence (by numeric index/by name)
   Properties
   ♦ consensus (text r/o): The consensus
   ♦ name (string r/o): The name
   ♦ length (integer r/o): The length of the consensus
   ♦ orientation (original/complementary r/o): The orientation

Class Sequence(s)
   Sequence of bases
   Elements
   ♦ feature (by numeric index/by name)
   Properties
   ♦ bases (text r/o): The bases
   ♦ name (string r/o): The name
   ♦ orientation (original/complementary r/o): The orientation
   ♦ length (integer r/o): The length
   ♦ sequence type (unknown/DNA/RNA/protein r/o): The type
   ♦ alphabet (unknown/IUB/gapped IUB/protein r/o): The alphabet
   ♦ annotation (text r/o): The annotation

Class Unassembled Sequence(s)
   A sequence which does not belong in a contig
   ♦ <Inheritance> (sequence): All properties and elements of the given class are inherited by this class.

Class Feature(s)
   Features of a sequence
   Properties
   ♦ key (text r/o): The key
   ♦ name (string r/o): Synonym for key

BioLIMS Scripts
Table A-2 contains AppleScript commands for the BioLIMS access suite.

Table A-2  AutoAssembler BioLIMS Access Suite

Open Connection
   Opens a connection with the database using the current connection or makes a new one with the parameters provided.
   Open connection (reference): The application
   ♦ using connectionID (small integer): ID number to identify the data for opening this connection
   ♦ using alias (string): Alias to identify the data for opening this connection
   ♦ with username (string): User name to be used for opening this connection
   ♦ to database (string): Database name to be used for opening this connection
   ♦ on server (string): Server to be used for opening this connection
   ♦ with password (string): Password to be used for opening this connection
   ♦ with alias (string): Alias to identify the data for opening this connection
   ♦ with database (string): Database name to be used for opening this connection
   ♦ with server (string): Server to be used for opening this connection
   Result (small integer): ID number of the opened connection

Open Default Connection
   Opens a connection using the default data from the session manager dialog.
   Open default connection (reference): The selected application
   Result (small integer): The ID number of the opened connection

Make New Connection
   Creates a new set of data for connecting to a database, and makes it the current set.
   Make new connection (reference): The application
   ♦ with username (string): User name to be used for opening this connection
   ♦ with database (string): Database name to be used for opening this connection
   ♦ with server (string): Server to be used for opening this connection
   ♦ with password (string): Password to be used for opening this connection
   ♦ with alias (string): Alias to be used for identifying this connection in the future
   ♦ to database (string): Database name to be used for opening this connection
   ♦ on server (string): Server to be used for opening this connection
   Result (small integer): The ID number of the selected connection

Select Connection
   Make this connection the selected one.
   Select connection (reference): The application
   ♦ with connectionID (small integer): ID number to identify this connection
   ♦ with alias (string): Alias to identify this connection
   ♦ using connection (small integer): ID number to identify this connection
   ♦ using alias (string): Alias to identify this connection
   Result (small integer): The ID number of the selected connection

Close Connection
   Close the channel used in this connection.
   Close connection (reference): The application
   ♦ with connectionID (small integer): ID number to identify this connection
   ♦ with alias (string): Alias to identify this connection
   ♦ using connection (small integer): ID number to identify this connection
   ♦ using alias (string): Alias to identify this connection

Delete Connection
   Discard the designated connection permanently.
   Delete connection (reference): The application
   ♦ with connectionID (small integer): ID number to identify this connection
   ♦ with alias (string): Alias to identify this connection
   ♦ using connection (small integer): ID number to identify this connection
   ♦ using alias (string): Alias to identify this connection

Delete All Connections
   Permanently discard all connections made through AppleScript.
   Delete all connections (reference): The application

Class Session Manager
   The session manager
   Properties
   ♦ selected alias (text): The alias of the currently selected connection
   ♦ selected username (text): The user name of the currently selected connection
   ♦ selected database (text): The database name of the currently selected connection
   ♦ selected server (text): The server name of the currently selected connection
   ♦ selected password (text): The password of the currently selected connection
   ♦ selected connectionID (small integer r/o): The ID number of the connection
   ♦ current alias (text): The alias of the currently selected connection
   ♦ current username (text): The user name of the currently selected connection
   ♦ current database (text): The database name of the currently selected connection
   ♦ current server (text): The server name of the currently selected connection
   ♦ current password (text): The password of the currently selected connection
   ♦ current connectionID (small integer r/o): The ID number for the connection
   ♦ user intervention (Boolean): Whether or not the user is asked to help connect

Appendix B  References

Overview

Introduction
This appendix provides a list of references for information about the algorithms used by the AutoAssembler software and its server option.
In This Appendix
This appendix contains the following topics:

Topic                   See Page
Algorithm References    B-2

Algorithm References

Sequence Alignment Algorithms
The following references might be useful to you for a more complete understanding of the sequence alignment algorithms used by AutoAssembler:
♦ Applied Biosystems Division of Perkin Elmer. 1993. Sequence Analysis Toolbook. Foster City: Applied Biosystems Division of Perkin Elmer.
♦ Dear, S. and Staden, R. 1991. A sequence assembly and editing program for efficient management of large projects. Nucleic Acids Research. 19(14):3907-3911.
♦ Huang, X. 1992. A Contig Assembly Program Based on Sensitive Detection of Fragment Overlaps. Genomics. 14:18-25.
♦ Kececioglu, J.D. and Myers, E. 1994. Exact and Approximate Algorithms for the Sequence Reconstruction Problem. Algorithmica. 12:4.
♦ Kececioglu, J.D. Exact and Approximation Algorithms for DNA Sequence Reconstruction. University of Arizona. TR91-26.

Feature Tables
The following provides more information about feature tables:
♦ DNA Data Bank of Japan, Mishima, Japan; EMBL Data Library, Heidelberg, Federal Republic of Germany; GenBank, Los Alamos, NM, and Mountain View, CA, USA. 1993. The DDBJ/EMBL/GenBank Feature Table: Definition.
  This can be obtained by anonymous FTP to ncbi.nlm.nih.gov. Use “anonymous” as your login ID and your e-mail address as your password.

Appendix C  Key Codes

Overview

Introduction
This appendix provides translations for codes used in the AutoAssembler program.

In This Appendix
This appendix contains the following topics:

Topic                   See Page
Translation Tables      C-2

Translation Tables

Introduction
This section provides the following translation tables:
♦ IUPAC/IUB Codes
♦ Complements
♦ Universal Genetic Code
♦ Amino Acid Abbreviations

IUPAC/IUB Codes
Table C-1 provides translations for IUPAC/IUB codes used in the AutoAssembler software.

Table C-1 IUPAC/IUB Codes

Code   Translation
A      Adenosine
C      Cytidine
G      Guanosine
T      Thymidine
B      C, G, or T
D      A, G, or T
H      A, C, or T
V      A, C, or G
R      A or G (puRine)
Y      C or T (pYrimidine)
K      G or T (Keto)
M      A or C (aMino)
S      G or C (Strong, 3 H bonds)
W      A or T (Weak, 2 H bonds)
N      aNy base

Complements
Table C-2 provides complements for reference.

Table C-2 Complement Table

Code   Complement
A      T
C      G
G      C
T      A
R      Y
Y      R
K      M
M      K
S      S
W      W
B      V
V      B
D      H
H      D
N      N

Universal Genetic Code
Table C-3 provides the Universal Genetic Code for use with the AutoAssembler software.

Table C-3 Universal Genetic Code

5' End   2nd Position              3' End
         T     C     A     G
T        Phe   Ser   Tyr   Cys     T
         Phe   Ser   Tyr   Cys     C
         Leu   Ser   OCH   OPA     A
         Leu   Ser   AMB   Trp     G
C        Leu   Pro   His   Arg     T
         Leu   Pro   His   Arg     C
         Leu   Pro   Gln   Arg     A
         Leu   Pro   Gln   Arg     G
A        Ile   Thr   Asn   Ser     T
         Ile   Thr   Asn   Ser     C
         Ile   Thr   Lys   Arg     A
         Met   Thr   Lys   Arg     G
G        Val   Ala   Asp   Gly     T
         Val   Ala   Asp   Gly     C
         Val   Ala   Glu   Gly     A
         Val   Ala   Glu   Gly     G

Amino Acid Abbreviations
Table C-4 provides amino acid abbreviations for use with AutoAssembler’s Show Protein Translation feature.

Table C-4 Amino Acid Abbreviations

Amino Acid       Three Letters   One Letter
Alanine          Ala             A
Arginine         Arg             R
Asparagine       Asn             N
Aspartic Acid    Asp             D
Cysteine         Cys             C
Glutamic Acid    Glu             E
Glutamine        Gln             Q
Glycine          Gly             G
Histidine        His             H
Isoleucine       Ile             I
Leucine          Leu             L
Lysine           Lys             K
Methionine       Met             M
Phenylalanine    Phe             F
Proline          Pro             P
Serine           Ser             S
Threonine        Thr             T
Tryptophan       Trp             W
Tyrosine         Tyr             Y
Valine           Val             V
Any Amino Acid                   X

Stop Codes: AMBer, OCHre, OPAl

Glossary

This section defines special terminology used in the AutoAssembler software. The terms are listed in alphabetical order. Many terms are defined in the text of this manual. If you do not find a term here, check the index to see if you can locate it in the manual.

ambiguity character
A character that appears in the Alignment view of the project window to indicate an ambiguous base position or an insertion in the consensus of the displayed contig.
The character appears just below the consensus sequence. You can specify the character by choosing Settings from the Edit menu. The default is a black bullet (•).

assemblage
A term used interchangeably with “contig.”

autoupdating
A command that, in conjunction with BioLIMS, adds modified and new sequences from a collection to a designated project. With autoupdating, multiple users on different computers can add new sequences or edit existing sequences in a BioLIMS collection. These new or edited sequences are then added to the project automatically.

BioLIMS
A database that stores sequences used in AutoAssembler projects, allowing multiple users to edit or add sequences to collections. These sequences can then be automatically added to a project using the AutoUpdating feature.

chromatogram
A four-color picture of a sequence, showing peaks that represent the bases or amino acids. The term is used interchangeably with “electropherogram” in AutoAssembler.

collection
A group of sequences residing in the BioLIMS database.

consensus sequence
A linear series of characters that represents the multiple sequence alignment of a contig. Individual base positions in the consensus are represented either by capital letters, lowercase letters in color, or insertion characters.

contig
A group of overlapping sequences resulting from assembly. Unlike traditional sequence alignment methods used in assembling sequences, AutoAssembler generates consensus sequences on the basis of primary data only, not on the basis of interim consensus sequences. Groups of overlapping sequences are dynamically computed with each iteration of AutoAssembler. The chief advantage of this approach is that initial sequences do not introduce a bias into the consensus.

contig list
A listing of contigs that appears in the upper-left pane of the project window. When you select a contig in the contig list, the associated sequences appear in the sequence list.

editable data
A copy of the original data produced by the ABI PRISM DNA Sequencing Analysis software, stored in the sample file. All changes saved to sequence files are stored in the editable data copy, so the original data remains unmodified. Editable data is displayed in the AutoAssembler project window and sequence window.

electropherogram
A four-color picture of a sequence, showing peaks that represent the bases or amino acids. The term is used interchangeably with “chromatogram” in AutoAssembler.

exporting
Storing the contents of selected sequences in a file other than the associated data file. You can export sequences as text files for use with word processing applications. You can also export all sequences in a contig into a layout for use with the SeqEd or Sequence Navigator application programs.

feature
A defined region in a sequence. You can define features in Factura using the Feature–Add command. Features are also the regions identified when you process a sequence using Factura. The sequence window Feature view in Factura shows Factura-identified features only after you have saved them using the Save to Sequence command in the Worksheet menu.

gap character
A character inserted into a sequence to indicate a missing region. In AutoAssembler, the gap character is a hyphen or dash (–). For example, the sequence of nucleotides GCTA– contains 5 characters. The last character is a gap.

identification parameters
The settings you specify for vector, ambiguity, confidence range, and IUB code (heterozygote) calling that are used to identify those features during Factura processing.

ID numbers
Numbers that identify sequences in the Layout view of the AutoAssembler project window when you choose Show IDs from the Project menu. The numbers are assigned sequentially as sequences are added to the project, and are not reused if the corresponding sequences are removed from the project.
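
The ID numbers entry describes sequential assignment without reuse. A minimal Python sketch of that bookkeeping follows. The `Project` class, its method names, the sample sequence names, and the choice to start numbering at 1 are all hypothetical illustrations, not part of AutoAssembler or its AppleScript dictionary.

```python
class Project:
    """Hypothetical stand-in for a project's sequence bookkeeping."""

    def __init__(self):
        self._next_id = 1      # assumption: numbering starts at 1
        self.sequences = {}    # ID number -> sequence name

    def add(self, name):
        """Assign the next sequential ID number to a newly added sequence."""
        seq_id = self._next_id
        self._next_id += 1     # only ever increases, so IDs are never reused
        self.sequences[seq_id] = name
        return seq_id

    def remove(self, seq_id):
        """Remove a sequence; its ID number is retired, not recycled."""
        del self.sequences[seq_id]


project = Project()
first = project.add("clone_A")
second = project.add("clone_B")
project.remove(second)
third = project.add("clone_C")
print(first, second, third)       # 1 2 3
print(sorted(project.sequences))  # [1, 3]
```

Because the counter never moves backward, a number shown in the Layout view would keep referring to the same sequence for the life of the project under this model.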

index
The index of the first base or amino acid of a sequence is the same as the sequence offset, and each succeeding character has an index one greater than the character before it. The index numbers are shown on a ruler at the top of the lower panel of the project window.

insertion character
A character that appears in the consensus sequence in the Alignment view of the project window. This character indicates an insertion in the consensus of the displayed contig. You can specify the character by choosing Settings from the Edit menu. The default is ~.

IUB code
Alphabetic character representing the occurrence of mixed bases at a given position in a sequence. Originally defined by the International Union of Biochemistry.

IUPAC
International Union of Pure and Applied Chemistry. IUB codes are also referred to with this acronym, since IUPAC adopted the codes as a standard.

layout
A two-dimensional display in the lower panel of the project window that uses arrows to show the relationships between sequences in a contig. A layout is also the main window, or worksheet, that displays multiple sequences in the SeqEd and Sequence Navigator applications.

length
The length of a sequence is the number of characters it contains, including gap characters. For example, GAATTC has a length of 6. GAA–TTC has a length of 7.

mark style
A pre-defined style that can be applied in Factura so you can visually identify a feature in the sequence window.

offset
The relative distance between the origin and the beginning of a given sequence in the Alignment view of the project window. The leftmost sequence starts at the origin (position 1), and each other sequence in the contig is offset a certain number of bases to the right of that point, each being positioned to provide the best alignment of the data.

origin
An imaginary vertical line in the Alignment view of the project window between index positions zero and minus one. The origin is where the far-left sequence in a contig starts.
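
The length, index, and offset entries above can be sketched in a few lines of Python. This is only an illustration of the definitions. The function names are hypothetical, and treating both the ASCII hyphen and the en dash as gap characters is an assumption made so that either form of the glossary's examples can be typed.

```python
# Assumption: both "-" and the en dash count as the gap character.
GAP_CHARS = {"-", "\u2013"}


def sequence_length(seq):
    """Length as the glossary defines it: every character counts, gaps included."""
    return len(seq)


def ungapped_length(seq):
    """Number of residues only, with gap characters skipped (for comparison)."""
    return sum(1 for ch in seq if ch not in GAP_CHARS)


def index_positions(seq, offset):
    """Pair each character with its index: the first character takes the
    sequence offset, and each succeeding character is one greater."""
    return [(offset + i, ch) for i, ch in enumerate(seq)]


print(sequence_length("GAATTC"))    # 6
print(sequence_length("GAA-TTC"))   # 7
print(ungapped_length("GAA-TTC"))   # 6
print(index_positions("GCTA-", 3))  # [(3, 'G'), (4, 'C'), (5, 'T'), (6, 'A'), (7, '-')]
```

The last call shows the five-character example from the gap character entry placed at an arbitrary offset of 3; the gap occupies an index position like any other character.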

original data
The sequence data produced by the ABI PRISM DNA Sequencing Analysis software. This data is maintained in its original state in a sample file. An editable copy of the data is stored in the same sample file, and changes when you save edits to the file. See also editable data, sample files.

protein translation
An editing function that displays a single-character amino acid code beneath the third character of each three-base codon. This command uses a universal codon table when translating ambiguities in the consensus sequence.

residue
An amino acid or a nucleotide.

ruler
A scale displaying index numbers, located above the consensus in the lower panel of the project window.

sample files
Files produced by the ABI PRISM DNA Sequencing Analysis software. These files contain data produced by the instrument: a sequence of base calls, peak locations, and an electropherogram. The original data in sample files is always maintained in its original state (as it came from the ABI PRISM DNA Sequencing Analysis software). When you save changes you have made to the sequence, they are stored to a copy of the original data called the “editable data.” Editable data is displayed in the AutoAssembler project and sequence windows.

selected sequence
A sequence that you have specified by clicking its identification on the project window in AutoAssembler.

sequence
A linear series of characters. The characters are displayed in rows from left to right. More specifically, a sequence is a series of nucleotide base characters that represent a linear DNA sequence, or a series of amino acid characters that represent a protein sequence.

sequence list
A listing of sequence names that is displayed in the upper-right pane of the project window. The sequence list shows the names of the sequences included in the contig that is selected in the upper-left pane of the project window.
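
To make the protein translation entry and the Appendix C tables concrete, the following Python sketch complements a sequence and translates it with the universal genetic code. It is an illustration under stated assumptions, not AutoAssembler's implementation: the function names are hypothetical, stop codons are written as `*`, and an ambiguous codon is resolved by expanding its IUPAC/IUB codes and reporting `X` (Any Amino Acid) unless every expansion gives the same result. The manual does not say how the program itself resolves ambiguities.

```python
from itertools import product

# IUPAC/IUB codes expanded to the bases they stand for (Table C-1).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complements; S (G or C) and W (A or T) each complement to themselves.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "K": "M", "M": "K", "S": "S", "W": "W",
    "B": "V", "V": "B", "D": "H", "H": "D", "N": "N",
    "-": "-", "\u2013": "\u2013",  # gap characters pass through unchanged
}

# Universal genetic code in T, C, A, G order: 5' end, 2nd position, 3' end.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"   # 5' T ("*" = stop: OCH, AMB, OPA)
         "LLLLPPPPHHQQRRRR"   # 5' C
         "IIIMTTTTNNKKSSRR"   # 5' A
         "VVVVAAAADDEEGGGG")  # 5' G
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Complement each character and reverse the order."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def translate_codon(codon):
    """One-letter amino acid for a codon; X if gapped or truly ambiguous."""
    try:
        choices = [IUPAC[base] for base in codon]
    except KeyError:  # gap or unknown character in the codon
        return "X"
    aminos = {CODON_TABLE[a + b + c] for a, b, c in product(*choices)}
    return aminos.pop() if len(aminos) == 1 else "X"


def translate(seq):
    """Translate complete codons from the first base; trailing bases are ignored."""
    seq = seq.upper()
    return "".join(translate_codon(seq[i:i + 3]) for i in range(0, len(seq) - 2, 3))


print(reverse_complement("GAATTC"))  # GAATTC
print(translate("ATGGCN"))           # MA
print(translate("ATGTAR"))           # M*
```

In the second call, GCN translates to alanine because all four expansions (GCT, GCC, GCA, GCG) code for Ala; in the third, TAR covers only TAA and TAG, both stop codons.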

settings
Choices that you specify in AutoAssembler about the parameters used to identify features in the project window views.

statistics settings
Choices that you specify in AutoAssembler about the parameters used to display the consensus sequence in the Statistics view.

statistics view
A view in the project window that plots redundancy versus consensus base. This view is useful if you need to verify that you have minimum redundancy and orientation throughout the consensus. You can choose the parameters for this view using the Statistics Settings command in the Edit menu.

summary graphic
A horizontal line displayed in the top part of the sequence window. This line represents the length of the sequence that is displayed in the window, and reflects the cursor position as you move it to different places in the sequence. The line also shows colored regions to represent marked features in the sequence.

symbol
Usually a character, such as G, A, *, or –. Often represents a base or amino acid in a sequence.

text files
Files produced by all Macintosh word processing programs, and many other programs. Each contains a string of characters and can be created when you save files.

upper-left pane, upper-right pane
The two panes in the upper portion of the project window. The left pane displays the names of contigs and the unassembled sequence list. The right pane displays a list of the sequences of the contig you select in the left pane.

views
Various displays provided by the project window and the sequence window in AutoAssembler.

Worldwide Sales Offices

Applied Biosystems' vast distribution and service network, composed of highly trained support and applications personnel, reaches into 150 countries on six continents. For international office locations, please call our local office or refer to our web site at www.appliedbiosystems.com.

Headquarters
850 Lincoln Centre Drive
Foster City, CA 94404 USA
Phone: +1 650.638.5800
Toll Free: +1 800.345.5224
Fax: +1 650.638.5884

Technical Support
For technical support:
Toll Free: +1 800.831.6844 ext 23
Fax: +1 650.638.5891
www.appliedbiosystems.com

PE Corporation is committed to providing the world’s leading technology and information for life scientists. PE Corporation consists of the Applied Biosystems and Celera Genomics businesses.

Printed in the USA, 09/2000
Part Number 904947B