User's Manual - Thermo Fisher Scientific

AutoAssembler
Version 2.0
User’s Manual
© Copyright 2000, Applied Biosystems
For Research Use Only. Not for use in diagnostic procedures.
ABI PRISM and its Design, Applied Biosystems, SeqEd and Sequence Navigator are registered trademarks of PE Corporation. ABI,
AutoAssembler, BioLIMS, Factura, Inherit, and Applied Biosystems are trademarks of PE Corporation or its subsidiaries in the U.S.
and certain other countries.
Collections Manager is a trademark of Molecular Informatics, Inc.
AppleScript and Macintosh are registered trademarks of Apple, Inc.
All other trademarks are the sole property of their respective owners.
P/N 904947B
Software License and Warranty
Applied Biosystems Software License and Limited Product Warranty
PURCHASER, CAREFULLY READ THE FOLLOWING TERMS AND
CONDITIONS (THE “AGREEMENT”), WHICH APPLY TO THE
SOFTWARE ENCLOSED (THE “SOFTWARE”). YOUR OPENING OF
THIS PACKAGE INDICATES YOUR ACCEPTANCE OF THESE TERMS
AND CONDITIONS. IF YOU DO NOT ACCEPT THEM, PROMPTLY
RETURN THE COMPLETE PACKAGE AND YOUR MONEY WILL BE
RETURNED. THE LAW PROVIDES FOR CIVIL AND CRIMINAL
PENALTIES FOR ANYONE WHO VIOLATES THE LAWS OF
COPYRIGHT.
Copyright
The SOFTWARE, including its structure, organization, code, user
interface, and associated documentation, is a proprietary product of
Applied Biosystems and is protected by international laws of copyright.
Title to the SOFTWARE, and to any and all portion(s) of the
SOFTWARE shall at all times remain with Applied Biosystems.
License
1. You may use the SOFTWARE on a single computer (or on a single
network, if your software is designated as a network version). You may
transfer the SOFTWARE to another single computer (or network, if a
network version), so long as you first delete the SOFTWARE from the
previous computer or network. You may never have operational
SOFTWARE on more than one computer (or more than one network, if
a network version) per original copy of the SOFTWARE at any time.
2. You may make one copy of the SOFTWARE for backup purposes.
3. You may transfer the SOFTWARE to another party, but only if the
other party agrees in writing with Applied Biosystems to accept the
terms and conditions of this Agreement. If you transfer the SOFTWARE
to another party, you must immediately transfer all copies to that party,
or destroy those not transferred. Any such transfer terminates your
license.
Restrictions
1. You may not copy, transfer, rent, modify, use, or merge the
SOFTWARE, or the associated documentation, in whole or in part,
except as expressly permitted in this Agreement.
2. You may not reverse assemble, decompile, or otherwise reverse
engineer the SOFTWARE.
Limited Warranty
For a period of 90 days after purchase of the SOFTWARE, Applied
Biosystems warrants that the SOFTWARE will function substantially as
described in the documentation supplied by Applied Biosystems with
the SOFTWARE. If you discover an error which causes substantial
deviation from that documentation, send a written notification to Applied
Biosystems. Upon receiving such notification, if Applied Biosystems is
able to reliably reproduce that error at its facility, then Applied
Biosystems will do one of the following at its sole option: (i) correct the
error in a subsequent release of the SOFTWARE, which shall be
supplied to you free of charge, or (ii) accept a return of the SOFTWARE
from you, and refund the purchase price received for the SOFTWARE.
Applied Biosystems does not warrant that the SOFTWARE will meet
your requirements, will be error-free, or will conform exactly to the
documentation. Any sample or model used in connection with this
Agreement is for illustrative purposes only, is not part of the basis of the
bargain, and is not to be construed as a warranty that the SOFTWARE
will conform to the sample or model.
Limitation Of Liability
EXCEPT AS SPECIFICALLY STATED IN THIS AGREEMENT, THE
SOFTWARE IS PROVIDED AND LICENSED “AS IS”. THE ABOVE
WARRANTY IS GIVEN IN LIEU OF ALL OTHER WARRANTIES,
EXPRESSED OR IMPLIED, INCLUDING THOSE OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
NOTWITHSTANDING ANY FAILURE OF THE CENTRAL PURPOSE
OF ANY LIMITED REMEDY, APPLIED BIOSYSTEMS' LIABILITY FOR
BREACH OF WARRANTY SHALL BE LIMITED TO A REFUND OF
THE PURCHASE PRICE FOR SUCH PRODUCT. IN NO EVENT WILL
APPLIED BIOSYSTEMS BE LIABLE FOR ANY OTHER DAMAGES,
INCLUDING INCIDENTAL OR CONSEQUENTIAL DAMAGES, EVEN
IF APPLIED BIOSYSTEMS HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
Term
You may terminate this Agreement by destroying all copies of the
SOFTWARE and documentation. Applied Biosystems may terminate
this Agreement if you fail to comply with any or all of its terms, in which
case you agree to return to Applied Biosystems all copies of the
SOFTWARE and associated documentation.
Miscellaneous
1. Failure to enforce any of the terms and conditions of this Agreement
by either party shall not be deemed a waiver of any rights and privileges
under this Agreement.
2. In case any one or more of the provisions of this Agreement for any
reason shall be held to be invalid, illegal, or unenforceable in any
respect, such invalidity, illegality, or unenforceability shall not affect any
other provisions of this Agreement, and this Agreement shall be
construed as if such invalid, illegal, or unenforceable provisions had
never been contained herein.
3. This Agreement shall be construed and governed by the laws of the
State of California.
4. This Agreement and the Applied Biosystems Sales Quotation
constitute the entire agreement between Applied Biosystems and you
concerning the SOFTWARE.
Contents
Software License and Warranty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Applied Biosystems Software License and Limited Product Warranty . . . iii
Copyright . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
License . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
Restrictions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Limited Warranty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Limitation Of Liability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
Term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
Miscellaneous . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
In This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-1
Registering Your Copy of AutoAssembler. . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
How to Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-2
About AutoAssembler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-3
Using AutoAssembler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
Optional AutoAssembler Versions . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
BioLIMS Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-4
CAP Remote Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
Server Option . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
New Features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-5
Compatibility with Previous Releases . . . . . . . . . . . . . . . . . . . . . . . . 1-6
Related Software Packages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
Factura . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-6
Sequence Navigator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-8
Using This Manual . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9
Conventions Used in This Manual . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-9
Technical Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
To Reach Us on the Web . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-10
Hours for Telephone Technical Support. . . . . . . . . . . . . . . . . . . . . . 1-10
To Reach Us by Telephone or FAX… . . . . . . . . . . . . . . . . . . . . . . . 1-10
Documents on Demand . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-13
To Reach Us by E-Mail . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1-14
Regional Offices, Sales and Services . . . . . . . . . . . . . . . . . . . . . . . . 1-14
2 System Requirements and Installation . . . . . . . . . . . 2-1
Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
In This Chapter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-1
Hardware and Software Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Required Computer System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Supplied with AutoAssembler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
AutoAssembler Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
Installing AutoAssembler Only. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
Using the AutoAssembler Installer . . . . . . . . . . . . . . . . . . . . . . . . . . 2-3
Files Installed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
Installing the BioLIMS Client Package, Including the AutoAssembler Software . . . . . . . . . . . . . . 2-8
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
Before You Install. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-8
To Install the BioLIMS Client Package . . . . . . . . . . . . . . . . . . . . . . . 2-8
To Do a Custom Installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-9
To Remove the Installed Package . . . . . . . . . . . . . . . . . . . . . . . . . . 2-10
Files Installed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-12
Application Files Installed . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-12
System Files Installed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-13
Starting AutoAssembler. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-14
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-14
To Start AutoAssembler for the First Time. . . . . . . . . . . . . . . . . . . . 2-14
Allocating More Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-15
Configuring BioLIMS Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-17
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-17
Configuring for Server Connection. . . . . . . . . . . . . . . . . . . . . . . . . . 2-17
3 Creating and Assembling a Project . . . . . . . . . . . . . 3-1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
In This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-1
Organizing Your Project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
Organizing a From Files Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
Organizing a Large Project With Several Project Files . . . . . . 3-3
Organizing a Networked Project . . . . . . . . . . . . . . . . . . . . . . . 3-3
Organizing a BioLIMS Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
Naming a BioLIMS Project . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
Missing Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-4
Opening and Closing a Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
Starting AutoAssembler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
Viewing the Project Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
Contig List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-6
Sequence List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
Project Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
Opening a New Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
Opening an Existing Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
From the Finder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
While Starting the AutoAssembler Program . . . . . . . . . . . . . . 3-7
From Within the AutoAssembler Program. . . . . . . . . . . . . . . . 3-7
Closing a Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
Adding Sequences From Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9
From File and BioLIMS Projects. . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-9
Adding Sequences to a From Files Project . . . . . . . . . . . . . . . . . . . 3-10
Removing Sequences from a Project . . . . . . . . . . . . . . . . . . . . . . . . 3-11
Adding Sequences From the BioLIMS Database . . . . . . . . . . . . . . . . . . . . 3-12
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
In This Section . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-12
Opening BioLIMS Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-13
Displaying the Sequence Chooser Window . . . . . . . . . . . . . . . . . . . 3-15
Parts of the Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-16
Collection Search Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-17
Sequence Search Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-18
Searching the BioLIMS Database . . . . . . . . . . . . . . . . . . . . . . . . . . 3-20
Adding Sequences From BioLIMS . . . . . . . . . . . . . . . . . . . . . . . . . 3-22
Removing Sequences from a Project . . . . . . . . . . . . . . . . . . . . . . . . 3-24
Viewing the Sequence List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-25
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-25
Changing the Information Displayed in the Sequence List . . . . . . . 3-25
Changing the Sort Order in the Sequence List. . . . . . . . . . . . . . . . . 3-27
Assembling Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-29
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-29
Assembling by Local Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-30
Assembling Projects Using the Engine Options . . . . . . . . . . . . . . . 3-31
Assembly Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-31
Engine Assembly. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-34
Using a Server to Assemble Project . . . . . . . . . . . . . . . . . . . . . . . . . 3-36
FDF Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-38
Setting Minimum Overlap and Percent Error . . . . . . . . . . . . . . . . . 3-38
Server Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-38
Local Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-40
The Assembled Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-40
Contig Names. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-41
Sequence Names . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-41
Importing Assembled Projects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-41
Setting Up for AutoUpdating. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-43
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-43
Opening BioLIMS Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-43
Configuring AutoUpdating . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-43
Changing and Adding Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-44
Adding Sequences using AutoUpdating . . . . . . . . . . . . . . . . 3-44
While the Project is Being Updated . . . . . . . . . . . . . . . . . . . . . . . . . 3-44
Turning Off AutoUpdating . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-45
4 Viewing the Consensus. . . . . . . . . . . . . . . . . . . . . . . 4-1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
In This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-1
Understanding the Project Window Views . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
Layout View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
Identifying Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
Displaying File Names. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-4
Zooming In. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
Alignment View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Consensus Characters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-6
Viewing Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
The Statistics View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
Displaying the Consensus . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
Statistic View Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
The Zoom Command . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-8
Displaying Electropherograms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-9
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-9
Opening Electropherogram Displays . . . . . . . . . . . . . . . . . . . . . . . . . 4-9
Hiding Electropherogram Displays. . . . . . . . . . . . . . . . . . . . . . . . . . . 4-9
Changing Electropherogram Appearance . . . . . . . . . . . . . . . . . . . . 4-10
Changing Horizontal Scale . . . . . . . . . . . . . . . . . . . . . . . . . . 4-10
Changing Vertical Scale . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-11
Changing Row Height . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-11
Changing the Display Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-12
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-12
Opening the Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-13
Changing Row Height and Vertical Scale . . . . . . . . . . . . . . . . . . . . 4-14
Changing Minimum Separation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-15
Selecting Base Color . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-15
Changing Consensus Characters . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-16
Changing Threshold Value . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-16
Changing Orientation Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . 4-16
Changing Network Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-17
Manipulating Window Displays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-18
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-18
Arranging Multiple Windows. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-18
Tiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-18
Stacking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-18
Cloning the Project Window to See Multiple Views of the Data . . . 4-19
Locating Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-21
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-21
Finding Sequences and Patterns. . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-21
Searching for Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-24
5 Editing the Project. . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
In This Chapter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-1
Locating and Controlling Ambiguity in the Consensus . . . . . . . . . . . . . . . . 5-2
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
Using the Views to Locate Problem Areas . . . . . . . . . . . . . . . . . . . . . 5-2
Finding Ambiguities Quickly. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
Controlling Ambiguity in the Consensus . . . . . . . . . . . . . . . . . . . . . . 5-4
Complementing a Contig . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
Translating the Consensus to Protein Sequences . . . . . . . . . . . . . . . . 5-5
Using an Electropherogram to Resolve Ambiguities . . . . . . . . . . . . . 5-6
Finding Ambiguous Areas. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-7
Resolving Ambiguity in the Project Window . . . . . . . . . . . . . . . . . . . . . . . 5-10
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10
Editing in the Consensus. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10
What Gets Saved When You Edit . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-10
Keeping Track of Your Edits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11
Selecting Bases or Sequence Segments . . . . . . . . . . . . . . . . . . . . . . 5-11
Adding Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-12
Deleting Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-13
Replacing Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-14
Shifting Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-16
Editing Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-17
Editing the Valid Range of Data Used for Assembly . . . . . . . . . . . . 5-18
Verifying Orientation and Redundancy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-20
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-20
Changing Statistic View Parameters . . . . . . . . . . . . . . . . . . . . . . . . . 5-20
Checking the Consensus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-21
6 Viewing and Editing Sequences. . . . . . . . . . . . . . . . 6-1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
In This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-1
Viewing and Editing Individual Sequences in Sequence Windows. . . . . . . . 6-2
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-2
Opening the Sequence Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-3
Viewing the Sequence Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-4
Editing in the Sequence Window versus the Project Window . . . . . . 6-5
Closing the Sequence Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-5
Using the Annotation View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
The Annotation View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-7
Using the Electropherogram View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-8
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-8
Editing in the Electropherogram View . . . . . . . . . . . . . . . . . . . . . . . . 6-8
Moving the Selection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-9
Changing Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-9
Adding Bases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-10
Using the Sequence View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-11
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-11
Editing Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-11
Adding Bases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
Deleting Bases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
Changing Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-12
Using the Feature View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-13
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-13
Editing Feature Ranges and Markings . . . . . . . . . . . . . . . . . . . . . . . 6-14
Changing Features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-14
7 Reassembling a Project . . . . . . . . . . . . . . . . . . . . . . . 7-1
Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
In This Chapter. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-1
Reassembling with New or Changed Sequences . . . . . . . . . . . . . . . . . . . . . 7-2
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
Reassembling with New Sequences. . . . . . . . . . . . . . . . . . . . . . . . . . 7-2
Reassembling with Changed Sequences . . . . . . . . . . . . . . . . . . . . . . 7-4
Reassembling to Achieve Different Results . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
Reassembling After Editing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
Reassembling After Changing Constraints . . . . . . . . . . . . . . . . . . . . 7-6
Resetting Overlap Relationships . . . . . . . . . . . . . . . . . . . . . . . 7-8
Assembling Projects Without Constraints. . . . . . . . . . . . . . . . 7-8
Reassembling After Changing Minimum Overlap and Percent Error . . . . . . . . . . . . . . 7-8
Reassembling After Changing Engine Parameters . . . . . . . . . . . . . . 7-9
8 Saving and Printing in AutoAssembler . . . . . . . . . . 8-1
Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
In This Chapter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-1
Saving your Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
Saving the Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-2
Project and Sequence Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3
Project Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3
Sequence Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-3
Saving Individual Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-4
Saving Sequences From the Project Window. . . . . . . . . . . . . . 8-4
Saving Sequences From the Sequence Window. . . . . . . . . . . . 8-6
Printing and Saving Assembly Reports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
Project Summary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-7
The Contig Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-8
Project Report . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-9
Viewing Assembly Reports. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-9
Saving Reports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-10
Printing Assembly Reports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-10
Printing and Copying the Views for Presentations . . . . . . . . . . . . . . . . . . . 8-11
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-11
Printing Project Window Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-11
Printing Sequence Window Views . . . . . . . . . . . . . . . . . . . . . . . . . . 8-12
Copying Project Window Views to Other Programs. . . . . . . . . . . . . 8-14
Copying a Sequence from the Sequence Window . . . . . . . . . . . . . . 8-14
Creating Files for Use with Other Applications . . . . . . . . . . . . . . . . . . . . . 8-16
Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-16
Building a Consensus Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-16
Exporting a Consensus Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-17
Exporting Sequences to Text Format . . . . . . . . . . . . . . . . . . . . . . . . 8-19
AutoAssembler Layout Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8-19
A AppleScript Dictionary . . . . . . . . . . . . . . . . . . . . . . A-1
Appendix Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
In This Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
AppleScript Commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
AutoAssembler Suite . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
BioLIMS Scripts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-6
B References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1
Appendix Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1
In This Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1
Algorithm References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
Sequence Alignment Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
Feature Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
C Key Codes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1
Appendix Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1
In This Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-1
Translation Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-2
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-2
IUPAC/IUB Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-2
Complements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3
Universal Genetic Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-3
Amino Acid Abbreviations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C-4
Glossary
Index
1 Introduction
Overview
Introduction This chapter provides information on:
♦ The AutoAssembler™ software
♦ Using this manual
♦ How to get help if you need it
Before you begin, you should be familiar with the Product License and
Warranty in the front of the manual.
In This Chapter This chapter contains the following topics:
Topic | See Page
Registering Your Copy of AutoAssembler | 1-2
About AutoAssembler | 1-3
Using This Manual | 1-9
Customer Support | 1-10
Introduction 1-1
Registering Your Copy of AutoAssembler
Introduction When you register your copy of the AutoAssembler software, you
become eligible for telephone and field service support from Applied
Biosystems that lasts for 90 days from the date of the first telephone
support call. Registering also allows you to purchase upgrades to the
software at a lower price than it would cost you to purchase new
software. These privileges are only available if you return your
registration card.
How to Register To register your copy of the AutoAssembler software, fill out the
registration card included in this package and return it to Applied
Biosystems.
About AutoAssembler
Introduction The AutoAssembler software allows you to quickly and efficiently
assemble small pieces of data from ABI PRISM™ DNA Sequencing
Analysis software (as well as data from other sources) into larger
segments of data. The AutoAssembler software provides powerful tools
for editing the sequences, including the ability to display constantly
spaced electropherograms with the assembled sequences. You can
build a consensus from the assembled sequences and export that
consensus to use with other programs such as Sequence Navigator®
software.
Use the AutoAssembler software in conjunction with the Factura
program to clean up sequence data for analysis and alignment by
identifying features you specify (such as vector and ambiguity ranges
that are not to be used in assembly). Factura processes sequences in
batches, speeding the cleanup process. Factura also provides tools for
editing sequences and marking the identified features with color and
underscoring.
Using AutoAssembler
Using AutoAssembler is an iterative process. You can build a project,
assemble the sequences, view and edit the contig, and then add more
sequences and reassemble. Figure 1-1 shows a typical path for using
the AutoAssembler software.
Figure 1-1 AutoAssembler user path (flowchart steps, in order: Create a project; Add sequences; Assemble the project; Review and edit the contig and sequences; Reassemble; Print or export the consensus)
Optional AutoAssembler Versions
AutoAssembler software is available in three optional configurations
that expand the capabilities of the program through the use of a server.
These options are purchased separately.
Note
The client sides of the three options are included with the
AutoAssembler installation disk as custom options. However, they will not work
without the purchased server options.
BioLIMS Option
The BioLIMS system provides a relational database for sequences
created by ABI PRISM DNA Sequencing Analysis software. This
database accommodates multiple users and editions while preserving
the original data.
With the BioLIMS option, you can store sequences for use in your
AutoAssembler projects. Using AutoAssembler’s AutoUpdating feature,
the BioLIMS database allows you to build a project from sequences
stored on the server. AutoAssembler automatically updates and
reassembles the project as database sequences are added or edited.
CAP Remote Option
You may also purchase the Remote Contig Assembly Program (CAP)
version of AutoAssembler. Like the Server option, CAP Remote allows
you to assemble projects on a UNIX server, making assembly faster for
larger projects, and freeing your Macintosh® computer for use during
assembly.
Server Option
The AutoAssembler software supports the Server with Fast Data
Finder® (FDF) as a separate purchase option. In this optional
configuration, AutoAssembler sends projects to a server for assembly.
New Features For those who are familiar with prior versions of the AutoAssembler
software, AutoAssembler 2.0 contains the major new features shown in
Table 1-1.
Table 1-1 New AutoAssembler Features
Item | Description
AutoUpdating | With BioLIMS, AutoAssembler allows you to automatically update designated projects to include new or edited sequences as they are entered into the database.
AppleScripting Support | AutoAssembler now supports a wide variety of AppleScript® commands (see Appendix A for a list of supported commands).
User-Configurable Assembly Engines | With the engine assembly option, you can update AutoAssembler with new assembly algorithms by plugging in new algorithm engines.
Compatibility with Previous Releases
AutoAssembler 2.0 is fully compatible with projects created with
AutoAssembler versions 1.0 and 1.4. This allows you to apply the
improved features of AutoAssembler 2.0 to projects you have already
created. However, projects created with AutoAssembler 2.0 are not
compatible with earlier versions of AutoAssembler.
Related Software Packages
The following software packages improve the capabilities of
AutoAssembler.
Factura
Before you assemble sequences, you should remove vector sequences
and ambiguities from the sequence data. The Factura software uses the
parameters you specify to identify the vector and ambiguous sequence
areas, assign International Union of Biochemistry (IUB) codes to
ambiguous bases, and mark the confidence range. The program
performs these operations on batches of sequences, speeding
sequence cleanup time considerably.
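To illustrate what assigning an IUB code to an ambiguous base means, the following Python sketch maps a set of candidate bases at one position to the standard IUPAC/IUB nucleotide code (see Appendix C for the code tables). It is an illustration only, not Factura's implementation; the names IUB_CODES and iub_code are hypothetical.

```python
# Standard IUPAC/IUB nucleotide codes, keyed by the set of bases each code stands for.
# Illustration only: this is not how Factura is implemented.
IUB_CODES = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def iub_code(bases):
    """Return the IUB code for the candidate bases seen at one position (hypothetical helper)."""
    key = frozenset(base.upper() for base in bases)
    if not key or not key <= set("ACGT"):
        raise ValueError("expected one or more of A, C, G, T")
    return IUB_CODES[key]
```

For example, a position that could be either A or G is reported as R, and a position where all four bases are possible is reported as N.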
After you have identified clean data using the Factura program, you can
import the sequences into an AutoAssembler project, which determines
the valid range of data for assembly based on the features identified in
Factura. In the project window, you can edit and assemble multiple
sequences and view their electropherograms simultaneously.
When you are satisfied with the assembled project, you can build a
consensus and export it to a file for use with other applications. You can
also print (or copy and paste) various parts of the project windows for
reports or presentations.
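The idea of a consensus can be sketched in a few lines of Python. This manual section does not describe the rule AutoAssembler uses, so the sketch assumes a simple majority vote per column over already aligned, equal-length sequences, with "-" as the gap character and "N" for ties. The function name build_consensus is hypothetical.

```python
from collections import Counter


def build_consensus(aligned, gap="-"):
    """Majority-vote consensus over aligned, equal-length sequences.

    Assumption for illustration: ties are written as "N" and columns that
    contain only gaps stay as gaps. This is not AutoAssembler's algorithm.
    """
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*aligned):
        counts = Counter(base.upper() for base in column if base != gap)
        if not counts:
            consensus.append(gap)
            continue
        ranked = counts.most_common()
        best = ranked[0][1]
        winners = [base for base, count in ranked if count == best]
        consensus.append(winners[0] if len(winners) == 1 else "N")
    return "".join(consensus)
```

With the three aligned fragments "ACGT-", "ACGTA", and "AGGTA", each column has a clear majority, so the sketch returns "ACGTA".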
Figure 1-2 shows how the AutoAssembler and Factura programs work
together to create completed project files.
Figure 1-2 AutoAssembler and Factura interaction (Factura/AutoAssembler flowchart). Factura steps shown: Sequence files; Process Sequences (identify vector sequence, identify ambiguous regions, identify confidence range, identify heterozygotes); Print (sequences, including electropherograms; features; annotation); Save a copy. AutoAssembler steps shown: Sequences (newly created or imported, from sequence files or the BioLIMS database); Import to Project; Assemble; Edit; Update/Reassemble; Build/Save consensus; Save/Export to sequence files; Print; Save.
Sequence Navigator
The Sequence Navigator program runs on Macintosh computers and
addresses the unique needs of researchers who compare sequences to
identify interesting sequence variations. It is used for mutation
identification/heterozygote screening of sequences (such as p53 and
HIV) and mitochondrial DNA. The program incorporates five powerful
algorithms for pairwise or multiple alignment of DNA and protein
sequences. Sequence Navigator software can be used in conjunction
with Factura to identify heterozygote base positions and quickly clean
up sequences before aligning them.
Using This Manual
Introduction This manual includes an index, glossary, list of topics for each section,
and numerous cross-references to help you find the information you
need.
Conventions Used in This Manual
The following words and styles draw your attention to specific details of
the information presented in this manual:
Note: This is used to call attention to useful information.
IMPORTANT: This information is indicated because it is necessary for
proper operation of the software.
CAUTION: This word informs you that damage to the application or loss
of data could occur if you do not comply with this information.
Technical Support
To Reach Us on the Web
The Applied Biosystems web site address is:
http://www.appliedbiosystems.com/techsupport
We strongly encourage you to visit our web site for answers to
frequently asked questions, and to learn more about our products. You
can also order technical documents and/or an index of available
documents and have them faxed or e-mailed to you through our site
(see the “Documents on Demand” section below).
Hours for Telephone Technical Support
In the United States and Canada, technical support is available at the
following times.
Product | Hours
Chemiluminescence | 9:00 a.m. to 5:00 p.m. Eastern Time
LC/MS | 9:00 a.m. to 5:00 p.m. Pacific Time
All Other Products | 5:30 a.m. to 5:00 p.m. Pacific Time
See the “Regional Offices Sales and Service” section below for how
to contact local service representatives outside of the United States and
Canada.
To Reach Us by Telephone or Fax in North America
Call Technical Support at 1-800-831-6844, and select the appropriate option
(below) for support on the product of your choice at any time during the call. (To
open a service call for other support needs, or in case of an emergency, press 1
after dialing 1-800-831-6844.)
For Support On This Product | Dial 1-800-831-6844, and... | FAX
ABI PRISM® 3700 DNA Analyzer | Press 8 | 650-638-5981
ABI PRISM® 3100 Genetic Analyzer | Press 26 | 650-638-5891
DNA Synthesis | Press 21 | 650-638-5981
Fluorescent DNA Sequencing | Press 22 | 650-638-5891
Fluorescent Fragment Analysis (includes GeneScan® applications) | Press 23 | 650-638-5891
Integrated Thermal Cyclers | Press 24 | 650-638-5891
BioInformatics (includes BioLIMS™, BioMerge™, and SQL GT™ applications) | Press 25 | 505-982-7690
PCR and Sequence Detection | Press 5, or call 1-800-762-4001, and press 1 for PCR, or 2 for Sequence Detection | 240-453-4613
FMAT | Telephone 1-800-899-5858, and press 1, then press 6 | 508-383-7855
Peptide and Organic Synthesis | Press 31 | 650-638-5981
Protein Sequencing | Press 32 | 650-638-5981
Chemiluminescence | Telephone 1-800-542-2369 (U.S. only), or 781-275-8581 (Tropix), 9:00 a.m. to 5:00 p.m. ET | 1-781-271-0045 (Tropix)
LC/MS | Telephone 1-800-952-4716, 9:00 a.m. to 5:00 p.m. PT | 650-638-6223
Documents on Demand
Free 24-hour access to Applied Biosystems technical documents,
including MSDSs, is available by fax or e-mail.
You can access Documents on Demand through the internet or by
telephone:
If you want to order through the internet:
Use http://www.appliedbiosystems.com/techsupport
You can search for documents to order using keywords.
Up to five documents can be faxed or e-mailed to you
by title.
If you want to order by phone from the United States or Canada:
a. Call 1-800-487-6809 from a touch-tone phone. Have
your fax number ready.
b. Press 1 to order an index of available documents
and have it faxed to you. Each document in the
index has an ID number. (Use this as your order
number in step “d” below.)
c. Call 1-800-487-6809 from a touch-tone phone a
second time.
d. Press 2 to order up to five documents and have
them faxed to you.
If you want to order by phone from outside the United States or Canada:
a. Dial your international access code, then
1-858-712-0317, from a touch-tone phone.
Have your complete fax number and country code
ready (011 precedes the country code).
b. Press 1 to order an index of available documents
and have it faxed to you. Each document in the
index has an ID number. (Use this as your order
number in step “d” below.)
c. Call 1-858-712-0317 from a touch-tone phone a
second time.
d. Press 2 to order up to five documents and have
them faxed to you.
To Reach Us by E-Mail
Contact technical support by e-mail for help in the following product
areas.
For this product area | Use this e-mail address
Chemiluminescence | [email protected]
Genetic Analysis | [email protected]
LC/MS | [email protected]
PCR and Sequence Detection | [email protected]
Protein Sequencing, Peptide and DNA Synthesis | [email protected]
Regional Offices Sales and Service
If you are outside the United States and Canada, you should contact
your local Applied Biosystems service representative.
The Americas
Region | Tel | Fax
United States (Applied Biosystems, 850 Lincoln Centre Drive, Foster City, California 94404) | (650) 570-6667, (800) 345-5224 | (650) 572-2743
Latin America (Del.A. Obregon, Mexico) | (305) 670-4350 | (305) 670-4349
Europe
Region | Tel | Fax
Austria (Wien) | 43 (0)1 867 35 75 0 | 43 (0)1 867 35 75 11
Hungary (Budapest) | 36 (0)1 270 8398 | 36 (0)1 270 8288
Belgium | 32 (0)2 712 5555 | 32 (0)2 712 5516
Italy (Milano) | 39 (0)39 83891 | 39 (0)39 838 9492
Czech Republic and Slovakia (Praha) | 420 2 61 222 164 | 420 2 61 222 168
The Netherlands (Nieuwerkerk a/d IJssel) | 31 (0)180 331400 | 31 (0)180 331409
Denmark (Naerum) | 45 45 58 60 00 | 45 45 58 60 01
Norway (Oslo) | 47 23 12 06 05 | 47 23 12 05 75
Finland (Espoo) | 358 (0)9 251 24 250 | 358 (0)9 251 24 243
Poland, Lithuania, Latvia, and Estonia (Warszawa) | 48 (22) 866 40 10 | 48 (22) 866 40 20
France (Paris) | 33 (0)1 69 59 85 85 | 33 (0)1 69 59 85 00
Portugal (Lisboa) | 351 (0)22 605 33 14 | 351 (0)22 605 33 15
Germany (Weiterstadt) | 49 (0) 6150 101 0 | 49 (0) 6150 101 101
Russia (Moskva) | 7 095 935 8888 | 7 095 564 8787
Spain (Tres Cantos) | 34 (0)91 806 1210 | 34 (0)91 806 1206
South Africa (Johannesburg) | 27 11 478 0411 | 27 11 478 0349
Sweden (Stockholm) | 46 (0)8 619 4400 | 46 (0)8 619 4401
United Kingdom (Warrington, Cheshire) | 44 (0)1925 825650 | 44 (0)1925 282502
Switzerland (Rotkreuz) | 41 (0)41 799 7777 | 41 (0)41 790 0676
South East Europe (Zagreb, Croatia) | 385 1 34 91 927 | 385 1 34 91 840
Middle Eastern Countries and North Africa (Monza, Italia) | 39 (0)39 8389 481 | 39 (0)39 8389 493
Africa (English Speaking) and West Asia (Fairlands, South Africa) | 27 11 478 0411 | 27 11 478 0349
All Other Countries Not Listed (Warrington, UK) | 44 (0)1925 282481 | 44 (0)1925 282509
Japan
Region | Tel | Fax
Japan (Hatchobori, Chuo-Ku, Tokyo) | 81 3 5566 6100 | 81 3 5566 6501
Eastern Asia, China, Oceania
Region | Tel | Fax
Australia (Scoresby, Victoria) | 61 3 9730 8600 | 61 3 9730 8799
Malaysia (Petaling Jaya) | 60 3 758 8268 | 60 3 754 9043
China (Beijing) | 86 10 6238 1156 | 86 10 6238 1162
Singapore | 65 896 2168 | 65 896 2147
Hong Kong | 852 2756 6928 | 852 2756 6968
Taiwan (Taipei Hsien) | 886 2 2698 3505 | 886 2 2698 3405
Korea (Seoul) | 82 2 593 6470/6471 | 82 2 593 6472
Thailand (Bangkok) | 66 2 719 6405 | 66 2 319 9788
2 System Requirements and Installation
Overview
Introduction This chapter provides:
♦ Hardware and software requirements for use of the AutoAssembler
software
♦ Instructions for installing the AutoAssembler software
♦ Instructions for installing the BioLIMS Client Package and the
AutoAssembler software from the BioLIMS CD-ROM (optional)
♦ Information about using your registration code and increasing the
memory available for AutoAssembler
♦ Instructions for connecting to the BioLIMS database (optional)
In This Chapter This chapter contains the following topics:
Topic | See Page
Hardware and Software Requirements | 2-2
Installing AutoAssembler Only | 2-3
Installing the BioLIMS Client Package, Including the AutoAssembler Software | 2-8
Starting AutoAssembler | 2-14
Configuring BioLIMS Access | 2-17
System Requirements and Installation 2-1
Hardware and Software Requirements
Introduction This section describes the minimum hardware and software
requirements for running AutoAssembler.
Required Computer System
Table 2-1 describes the computer system required to run
AutoAssembler. These are the minimum requirements. In general, the
more memory, the larger the screen size, and the more processing
power you have, the better.
Table 2-1 Required Computer System
System Component | Requirements
CPU | A PowerPC Mac OS computer. You will benefit from using the fastest computer available.
Operating System | Mac OS version 7.5.3 or later with Open Transport 1.1 or later.
Monitor | A 17-inch monitor or larger is recommended, although a monitor size of 640 x 480 pixels can be used. You will benefit from having a larger monitor.
Disk Space | A minimum of 5.6 MB free disk space.
Memory | The suggested memory allocation is 9.9 MB of random-access memory (RAM).
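The requirements in Table 2-1 can be expressed as a simple checklist. The Python sketch below is only an illustration of how the listed minimums compare against a described machine; it does not query a real Macintosh, and the names MacSystem and unmet_requirements are hypothetical. The 9.9 MB figure is the suggested memory allocation from the table, treated here as a threshold for the sake of the example.

```python
from dataclasses import dataclass

# Thresholds taken from Table 2-1.
MIN_OS = (7, 5, 3)              # Mac OS 7.5.3 or later
MIN_OPEN_TRANSPORT = (1, 1)     # Open Transport 1.1 or later
MIN_FREE_DISK_MB = 5.6
SUGGESTED_RAM_MB = 9.9          # suggested allocation, used as a threshold here


@dataclass
class MacSystem:
    """Hypothetical description of a computer, filled in by hand."""
    cpu: str
    os_version: tuple
    open_transport: tuple
    free_disk_mb: float
    available_ram_mb: float


def unmet_requirements(system):
    """Return a list of Table 2-1 requirements that the described system misses."""
    problems = []
    if system.cpu.lower() != "powerpc":
        problems.append("CPU: a PowerPC Mac OS computer is required")
    if system.os_version < MIN_OS:
        problems.append("Operating System: Mac OS 7.5.3 or later is required")
    if system.open_transport < MIN_OPEN_TRANSPORT:
        problems.append("Operating System: Open Transport 1.1 or later is required")
    if system.free_disk_mb < MIN_FREE_DISK_MB:
        problems.append("Disk Space: at least 5.6 MB free is required")
    if system.available_ram_mb < SUGGESTED_RAM_MB:
        problems.append("Memory: 9.9 MB of RAM is the suggested allocation")
    return problems
```

An empty list means the described system meets every item in the table; otherwise each entry names the component that falls short.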
Supplied with AutoAssembler
The AutoAssembler installation disk contains the AutoAssembler
program and the client sides of the AutoAssembler options. You may
install the client sides of any option, but they will not run without the
purchased server side of the option packages.
AutoAssembler Options
If you purchase an AutoAssembler option, the package may also
include one of the following server applications:
♦ AutoAssembler Server Install
♦ AutoAssembler CAP Remote Install
Note: The BioLIMS Client Package (including the AutoAssembler software)
is installed from a CD-ROM disc (see “Installing the BioLIMS Client Package,
Including the AutoAssembler Software” on page 2-8).
Installing AutoAssembler Only
Introduction It is important that you disable any virus protection software during the
installation process. After installation is complete, you should restart
your computer to re-enable any virus protection software.
Using the AutoAssembler Installer
If you are installing AutoAssembler for the Macintosh computer only,
follow the steps described below. If you are installing AutoAssembler
software with an AutoAssembler option, follow the directions provided in
the Installation Procedure you received with the software.
To install AutoAssembler:
1. If you have not yet done so, disable any virus protection software on
   your hard disk.
2. Insert the AutoAssembler Install disk into the 3.5-inch disk drive of
   your computer.
   The files on the disk are displayed on your screen.
3. Click the Installer icon.
4. When the Installer splash screen appears, click Continue. The
   following dialog box appears:
5. Use the pop-up menu in the lower section of the dialog box to select
   the hard drive and folder on which to install AutoAssembler.
6. Click Install to install AutoAssembler.
   If you need to perform a custom installation, select Custom from the
   pop-up menu in the upper-left corner of the dialog box. The
   following options appear:
   Select the checkboxes of the options you wish to install, and click
   Install.
   Note: Clicking the “I” buttons to the right of the installation
   options provides information on the particular option. Clicking the
   Read Me button accesses the Read Me file included with the
   software.
7. When prompted, insert the remaining disks.
   When the installation is complete, the following dialog box appears:
8. Click the Restart button, unless you want to perform additional
   installations.
   Note: You do not need to restart in order to use the
   AutoAssembler program. Restarting reinstates your virus protection
   and cleans up any temporary files created by the installation
   procedure.
   IMPORTANT: Before you use the programs, open the
   AutoAssembler folder and read the Read Me files. To open a Read
   Me file, double-click the icon.
Files Installed
The folders and files should now be installed on your hard disk as
shown in Figure 2-1. The files are briefly described in Table 2-2 and
Table 2-3.
Note: Some of these files are only installed with custom installation options
(see Table 2-3).
Figure 2-1 Location of installed files for the AutoAssembler program
Table 2-2 lists the files contained in the AutoAssembler folder after
installation.
Table 2-2 Files Installed in the AutoAssembler Folder
Item | Description
AutoAssembler 2.0 | The AutoAssembler program. Double-click the icon shown in Figure 2-1 to start the AutoAssembler program.
Engines Folder | Contains CAP assembly engine.
About AutoAssembler 2.0 | Contains information about the AutoAssembler program. Read the file before starting AutoAssembler.
Assembly Data | A folder used by the AutoAssembler program to store temporary files that result from assembling sequences by the Engine option. This folder is empty when the program is installed, and will contain no more than five temporary files of a single type at any time.
Table 2-3 shows the additional files and folders that will be added if you
ordered one of the AutoAssembler options.
Table 2-3 Optional Installation
Option | Additional Folders or Files
SAServer | ABI Folder located in the system folder contains the SAServer.config file.
CAP Remote | CAP Remote engine added to the Engines folder located in the AutoAssembler folder.
Installing the BioLIMS Client Package,
Including the AutoAssembler Software
Introduction The AutoAssembler application is shipped on a CD-ROM disc as part of
the BioLIMS Client Package. You must have purchased one BioLIMS
Client Package for each Macintosh on which AutoAssembler is
installed.
This section describes
♦ Complete installation of the BioLIMS Client Package (“To Install the
BioLIMS Client Package” below)
♦ Custom installation (“To Do a Custom Installation” on page 2-9), for
example, to install the AutoAssembler application alone
♦ Removal of the installation (“To Remove the Installed Package” on
page 2-10)
Before You Install
♦ Check that you have at least 56 MB of free disk space to
accommodate the BioLIMS applications.
♦ Quit from all applications that you may have open.
♦ Turn off any virus protection software that you may have running.
To Install the BioLIMS Client Package
Follow these steps to install all of the BioLIMS Client Package onto your
Macintosh:
To install the BioLIMS Client Package:
1. Insert the BioLIMS Client Package CD-ROM disc.
   The BioLIMS Client Package window opens automatically.
2. Find the BioLIMS Client Installer icon in the BioLIMS Client
   Package window and double-click to open the BioLIMS Client
   Installer.
3. Click Continue.
4. This dialog box contains important information that you should
   read.
   After you have read it, click Continue to open the BioLIMS Client
   Installer window. You may print or save the contents if you want.
5. To install the whole BioLIMS Client Package, use the default Easy
   Install described here.
   For information about custom installation, see “To Do a Custom
   Installation” on page 2-9.
   For information about removing an installed package, see “To
   Remove the Installed Package” on page 2-10.
6. Use the Switch Disk button or the Install Location pop-up menu to
   choose the disk on which to install the BioLIMS Client Package.
   If the software cannot be installed on the chosen disk, a warning
   appears in the Installer window.
7. Choose the Select Folder item on the Install Location pop-up menu.
   A Macintosh browser box appears.
8. Use the browser box to select a folder in which to install the
   BioLIMS Client Package applications.
9. Click Install to begin the installation.
10. At the conclusion of the installation, you should Restart your
   computer.
To Do a Custom Installation
You may not want to install all of the BioLIMS Client Package. For
example, you might want to install only the AutoAssembler application
on your Macintosh.
To complete a custom installation:
1. Follow steps 1 to 5 in the procedure “To Install the BioLIMS Client
   Package” on page 2-8.
2. Select Custom Install from the pop-up menu at the top left of the
   window.
3. Check the names of all the applications that you want to install.
   For information about the individual applications, click the
   information button to the right of the application name to display an
   information dialog box.
4. Use the Switch Disk button or the Install Location pop-up menu to
   choose the disk on which to install the selected applications.
   Be sure that there is enough space on the disk to accommodate
   your chosen applications. The Installer window reports both the
   space available on the disk and the approximate disk space
   required for the selected applications.
5. Choose the Select Folder item on the Install Location pop-up menu.
   A Macintosh browser box appears.
6. Use the browser box to select a folder in which to install the
   selected applications.
7. Click Install to begin the installation of the selected applications.
8. At the conclusion of the installation, you should Restart your
   computer.
To Remove the Installed Package
If you decide to remove the BioLIMS Client Package from your
Macintosh, follow these steps. The Remove process deletes all the
applications installed in the BioLIMS folder and also the files and folders
placed in the System folder by the installer.
Note: If you have moved BioLIMS files or folders from their original installed
locations, they may not be found and deleted by the remove operation. Also,
any files that have been added to the application folders, such as those created
when the applications are run, are not deleted by the remove operation.
CAUTION: If you have installed both the BioLIMS Instrument Package
and the BioLIMS Client Package on the same Macintosh, you should not
use Remove unless you intend to delete both the Client and the
Instrument Packages. This is because the Remove process deletes files
common to both packages, including files that are in the System Folder.
To remove the BioLIMS Client Package files:
1. Follow steps 1 to 5 in the procedure “To Install the BioLIMS Client
   Package” on page 2-8.
2. Select Remove from the pop-up menu at the top left of the window.
3. Choose the Select Folder item on the Install Location pop-up menu.
   A Macintosh browser box appears.
4. Use the browser box to locate the folder that contains the BioLIMS
   folder.
5. Click Remove to begin the removal of the BioLIMS Client Package
   applications on your disk.
6. At the conclusion of the remove operation, an alert box appears
   telling you whether or not the remove was successful.
   Note: If files have been moved or added to the BioLIMS folder,
   the remove operation will be reported as unsuccessful; you should
   then examine and delete the remaining files in the BioLIMS folder
   yourself.
Files Installed The BioLIMS Client Package installs files in a folder called BioLIMS and
also installs some files in your System Folder.
Application Files Installed
The BioLIMS Client applications are placed in four folders in the main
BioLIMS folder:
This folder…
Contains…
Sequencing
Analysis
the Sequencing Analysis and Basecaller applications,
the About Sequencing Analysis text file, and other
folders associated with the Sequencing Analysis
application
Factura
the Factura application, the About Factura text file,
and other files and folders associated with the Factura
application
AutoAssembler
the AutoAssembler application, the About
AutoAssembler text file, the Assembly Data folder,
and the Engines folder
BioLIMS Extras
the Sample2DB, Collections Manager, and
SimpleText applications, the About Sample2DB and
About Collections Manager text files, the Scripts
folder, and the Sybase folder containing the interfaces
and other database-related files
IMPORTANT
Before running an application for the first time, read the About
text file for the application. Important information not contained in the manual
may be found in the About text file.
2-12 System Requirements and Installation
System Files Installed
The installer places these files in the Macintosh System Folder:
Item                  Folder Location   Description
Sybase Config         Control Panels    SybaseConfig control panel (see page 2-17)
libblk                Extensions        Sybase library extension file
libcomn               Extensions        Sybase library extension file
libcs                 Extensions        Sybase library extension file
libct                 Extensions        Sybase library extension file
libctb                Extensions        Sybase library extension file
libintl               Extensions        Sybase library extension file
libsybdb              Extensions        Sybase library extension file
libtcl                Extensions        Sybase library extension file
libtcp                Extensions        Sybase library extension file
SequenceChooserLib    Extensions        BioLIMS library extension file
ABI Folder            System Folder     Mobility, comb, & matrix files
Starting AutoAssembler
Introduction Each AutoAssembler package contains a card with a unique
registration code. The first time you use the AutoAssembler program,
you are asked to enter this code. AutoAssembler then verifies the code.
If you use the program on a different computer, you must re-enter the
code.
IMPORTANT
You cannot use the same registration code on more than one
computer at a time.
To Start AutoAssembler for the First Time
This procedure is only necessary the first time you open AutoAssembler
on a particular Macintosh computer.
To open AutoAssembler for the first time:
Step  Action
1     In the Finder, double-click the AutoAssembler icon.
      The first time you do so, the following registration dialog box
      appears:
2     Enter your name, organization, and registration code (located on
      the product registration card).
3     Click OK.
Allocating More Memory
When you start AutoAssembler, the program sets aside a certain
amount of RAM for its own use. AutoAssembler’s default RAM size
allows you to assemble a project containing as many as 1000
sequences with an average length of 500 bases. If your projects are
considerably bigger than this, you may want to give AutoAssembler and
the CAP engine bigger memory partitions.
When you assign the CAP engine extra memory, you may speed up
assembly.
To allocate more memory:
Step  Action
1     In the Finder, click the AutoAssembler icon and choose Get Info
      from the File menu.
      Note Do not double-click the icon. The program must remain
      closed.
      The following dialog box appears:
2     Type a larger number in the “Preferred size” entry field in the
      lower-right corner.
      Note Add memory in 1 MB increments until your memory
      problem is solved.
3     Close the Info dialog box.
      When you start the program, the Finder will allocate the amount of
      memory you have indicated, if it is available.
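The default capacity quoted above (as many as 1000 sequences with an average length of 500 bases) can be turned into a rough pre-check. The sketch below is only an illustration: it assumes that the sequence count and the total base count are reasonable yardsticks for memory demand, which this manual does not state, and the function name is hypothetical.

```python
# Rough capacity check based on the default figure quoted in this section.
# Assumption: sequence count and total bases approximate memory demand.
DEFAULT_MAX_SEQUENCES = 1000
DEFAULT_AVG_LENGTH = 500
DEFAULT_TOTAL_BASES = DEFAULT_MAX_SEQUENCES * DEFAULT_AVG_LENGTH  # 500,000 bases


def exceeds_default_capacity(sequence_lengths):
    """Return True if the project is larger than the quoted default capacity."""
    lengths = list(sequence_lengths)
    return (len(lengths) > DEFAULT_MAX_SEQUENCES
            or sum(lengths) > DEFAULT_TOTAL_BASES)


small_project = [500] * 800    # 800 sequences of 500 bases each
large_project = [650] * 1500   # 1500 sequences of 650 bases each
print(exceeds_default_capacity(small_project))  # False
print(exceeds_default_capacity(large_project))  # True
```

If the check returns True, raising the “Preferred size” value in 1 MB increments, as described in the procedure, is the adjustment this section recommends.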
Configuring BioLIMS Access
Introduction The BioLIMS system provides a database for sequences created by
ABI PRISM DNA Sequencing Analysis software. This database is
located on a server, and accommodates multiple users and editions
while preserving the original data.
Configuring for Server Connection
Before you can access the BioLIMS database, you must configure the
SybaseConfig control panel.
IMPORTANT
Anytime you change the BioLIMS database server name, its
IP address or host and domain name, or the port number, you must repeat this
procedure.
To configure the SybaseConfig control panel:
Step  Action
1     Find the interfaces file in the Sybase folder in the BioLIMS Extras
      folder.
2     Open the file with SimpleText, or a similar text editing application.
3     Find the lines:
        SYBASE
        query MacTCP mac_ether neuron.apldbio.com 2500
      and edit them:
      ♦ Replace SYBASE with the name of the database server.
      ♦ Replace neuron.apldbio.com with the IP address or host and
        domain name of the server machine.
      ♦ Replace 2500 with the port number.
      You can find this information in the interfaces file on the Sybase
      server, or your BioLIMS database administrator can provide you
      with the information.
4     If you have access to more than one server, duplicate the two lines
      and edit them for the other servers. For example, for two servers,
      one called SYBASE and one called SERVER2, the interfaces file
      might look like this:
        SYBASE
        query MacTCP mac_ether neuron.apldbio.com 2500
        SERVER2
        query MacTCP mac_ether 192.135.191.128 2025
5     Save and close the interfaces file.
6     Open the SybaseConfig control panel.
      This control panel is found in the Control Panels folder in the
      System Folder.
7     The first time you open the SybaseConfig control panel, a file
      browser opens automatically.
      If a file browser does not open immediately, click the Interfaces
      Files button to open a file browser.
8     Use the file browser to locate and open the interfaces file that you
      edited in the steps above.
9     Set the Default Language pop-up menu to us_english.
10    Close the SybaseConfig control panel.
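The two-line server entries described in the procedure above can also be generated and checked programmatically. The following is a minimal sketch, assuming the file consists only of name/query line pairs in exactly the layout shown in the example; the helper names are hypothetical and any whitespace rules beyond that layout are an assumption.

```python
# Hypothetical helpers for the two-line interfaces entries shown above.
# Assumption: each server is a name line followed by one "query" line.
def interfaces_entry(server_name, host, port):
    """Return the two-line entry for one database server."""
    if not 0 < int(port) < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"{server_name}\nquery MacTCP mac_ether {host} {int(port)}\n"


def interfaces_text(servers):
    """Join the entries for several (name, host, port) tuples."""
    return "".join(interfaces_entry(*server) for server in servers)


def parse_interfaces(text):
    """Read the text back into {server_name: (host, port)}."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    servers = {}
    for name, query in zip(lines[0::2], lines[1::2]):
        fields = query.split()
        # A well-formed query line has exactly five fields.
        if len(fields) != 5 or fields[0] != "query":
            raise ValueError(f"unexpected query line for {name}: {query!r}")
        servers[name] = (fields[3], int(fields[4]))
    return servers


text = interfaces_text([
    ("SYBASE", "neuron.apldbio.com", 2500),
    ("SERVER2", "192.135.191.128", 2025),
])
print(text)
print(parse_interfaces(text))
```

Parsing the file back is a cheap way to catch slips before opening the SybaseConfig control panel: a stray space inside a field changes the field count and is rejected.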
Chapter 3 Creating and Assembling a Project
Overview
Introduction To assemble sequences using the AutoAssembler software, you must
create a project, which maintains information about sequences and the
contigs that result when sequences are assembled. The project is
displayed in the project window, which allows you to easily edit and
assemble the sequences. Saving the project (described on page 8-2)
stores the information in a project file for future use.
In This Chapter This chapter contains the following topics:
Topic                                           See Page
Organizing Your Project                         3-2
Opening and Closing a Project                   3-6
Adding Sequences From Files                     3-9
Adding Sequences From the BioLIMS Database      3-12
Viewing the Sequence List                       3-25
Assembling Sequences                            3-29
Setting Up for AutoUpdating                     3-43
Creating and Assembling a Project 3-1
Organizing Your Project
Introduction AutoAssembler uses the following two types of projects:
♦ From Files: contains only sequences from the computer
  AutoAssembler is running on, or from a non-BioLIMS server.
♦ BioLIMS: contains only sequences from BioLIMS database
  collections.
Organizing a From Files Project
When you start a From Files project, you should consider how to store
the sequences so that they remain accessible to the project at all times.
Make sure you keep your sequences in the same relative position to the
project file with which they are associated. Otherwise, AutoAssembler
may not be able to locate the sequences when you try to open them
from within the project.
If your assembly project requires only one project file and a few related
sequences, maintain the project sequences in a folder inside the project
folder, as shown in Figure 3-1.
Figure 3-1 Example of simple project organization (figure callout: if you
move or archive the project folder, the project file and sequence files
remain in the same relationship to each other)
In this configuration, the project file and related sequences move
together if you move or archive the project.
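The idea of keeping sequences in the same relative position to the project file can be illustrated with ordinary relative paths. This is only a sketch of the general technique, not AutoAssembler's internal file-reference format, which this manual does not describe; the folder names, file names, and helper functions are hypothetical.

```python
import os
import shutil
import tempfile


def relative_reference(project_file, sequence_file):
    """Path of the sequence file relative to the project file's folder."""
    return os.path.relpath(sequence_file, start=os.path.dirname(project_file))


def resolve_reference(project_file, reference):
    """Rebuild the sequence path from the project's current location."""
    folder = os.path.dirname(project_file)
    return os.path.normpath(os.path.join(folder, reference))


with tempfile.TemporaryDirectory() as root:
    # Layout from Figure 3-1: a sequences folder inside the project folder.
    project_dir = os.path.join(root, "Project Folder")
    os.makedirs(os.path.join(project_dir, "Sequences"))
    project_file = os.path.join(project_dir, "MyProject")
    sequence_file = os.path.join(project_dir, "Sequences", "clone01.seq")
    for path in (project_file, sequence_file):
        open(path, "w").close()

    reference = relative_reference(project_file, sequence_file)

    # Move (or archive and restore) the whole project folder.
    moved_dir = os.path.join(root, "Archive", "Project Folder")
    os.makedirs(os.path.dirname(moved_dir))
    shutil.move(project_dir, moved_dir)
    moved_project = os.path.join(moved_dir, "MyProject")

    # The stored relative reference still resolves from the new location.
    still_found = os.path.exists(resolve_reference(moved_project, reference))
    print(reference, still_found)
```

Because the reference is relative to the project file, moving the enclosing folder leaves it valid; moving only the Sequences folder would break it, which corresponds to the missing-file situation described later in this section.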
Organizing a Large Project With Several Project Files
If you have a large number of sequences and want to create several
projects to assemble them, store all the related projects, along with their
sequences, in a single folder (see Figure 3-2).
Figure 3-2 Large project organization
In this example, any of the four project files can contain sequences from
any of the sequence folders. If you move or archive the Cosmid folder,
all the sequences and project files remain in the same relative position.
Organizing a Networked Project
If you are working on a network server other than the BioLIMS
database and share sequences with other people, it is important that
the sequences remain on the same volume. If the sequences are
moved to another disk drive, another server, or another partition of the
same disk, AutoAssembler will not be able to locate them when you
open a related project file.
If you are using AutoAssembler with the BioLIMS database, the
sequences always remain accessible to the respective project. In
addition, new sequences can be automatically added to the project (see
“Setting Up for AutoUpdating” on page 3-43).
Organizing a BioLIMS Project
The BioLIMS database keeps track of all sequences and changes
made to the sequences by all users connected to the server. For this
reason, no special precautions are necessary to maintain links to
BioLIMS sequences. However, BioLIMS access must be open in
order to view electropherograms or edit sequence data. If a connection
has not been established, the BioLIMS access dialog box opens
automatically when you attempt to access a sequence (see “Opening
BioLIMS Access” on page 3-13).
Note
You cannot mix local files and sequences from BioLIMS.
Naming a BioLIMS Project
The AutoAssembler AutoUpdating feature relies on the name of the
project to identify the collection that contains the correct sequences. For
example, a project named “Project 1” assigned to be autoupdated will
have all sequences in a collection named “Project 1” automatically
added and updated.
Note
A BioLIMS project must have the same name as a collection on the
database if you want to use AutoAssembler’s AutoUpdating feature.
Missing Files If you move your sequences out of position relative to the project file
with which they are associated, you can still open the project and
assemble it. However, if you try to open the sequence in the sequence
window, or display a sequence’s electropherogram, the following dialog
box appears:
This dialog box indicates that the sequence is no longer in the same
place in relation to the project file. Use this dialog box to find the
sequence. If you cannot find the sequence, click Cancel, and the
following dialog appears:
Click Yes to open the project without electropherogram data from the
missing file.
To re-establish the link between the project and the sequences, re-add
the sequences to the project. This provides the AutoAssembler software
with the new relative path between the project and the sequence files.
Opening and Closing a Project
Starting AutoAssembler
To start AutoAssembler, double-click the AutoAssembler icon.
Note The first time you start AutoAssembler, you must enter a registration
code. Refer to “Starting AutoAssembler” on page 2-14 for specific instructions
about starting AutoAssembler for the first time.
Viewing the Project Window
With AutoAssembler open, you can create a new, blank project window
by selecting New from the File menu (Figure 3-3). To change the shape
and size of the window, drag the size box in the bottom right corner.
Figure 3-3 The empty project window. Figure callouts:
♦ Indicates whether or not the sequences in the project are from the
  BioLIMS database (once the first sequence is added, the project type
  is assigned, and cannot be changed)
♦ Sequence names and information appear in the sequence list
♦ After assembly, the contig names appear in the contigs list
♦ After assembly, a graphic display of assembly results appears in the
  lower pane of the project window; use these buttons to change the
  graphic view
Contig List
After assembly, the upper-left pane of the project window lists each
contig in the project, as well as an Unassembled list. The Unassembled
list contains the names of sequences that have just been added to the
project, that do not have any overlaps, or that have only weak
relationships with other sequences in the project.
Sequence List
The upper-right pane of the project window identifies sequences
associated with the project. In an assembled project, you can select a
contig or the Unassembled list in the upper-left pane to see the relevant
sequences in the right pane.
Project Views
After assembly, the lower pane of the project window shows a graphic
display of the results. See “Understanding the Project Window Views”
on page 4-2.
Opening a New Project
To open a new project, select New from the File menu. A new, blank
project opens. You can have several projects open at a time.
Opening an Existing Project
You can open a previously created project in one of the following three
ways:
From the Finder
Project files are distinguished with the icon shown here. When you
double-click a project file icon, the AutoAssembler program
automatically starts, if it is not already running. The program displays a
project window showing the project just as it was last saved to the file.
While Starting the AutoAssembler Program
If you press the Option key as you double-click the AutoAssembler icon,
the program starts and a standard file dialog box automatically appears,
allowing you to select the file you want to open.
From Within the AutoAssembler Program
If you are currently working in the AutoAssembler program, choose
Open from the File menu. A standard dialog box allows you to select the
file you want to open. Alternatively, go to the Finder desktop and
double-click the project file icon.
Closing a Project Save the project to a project file before you close the project window.
See Chapter 8, “Saving and Printing in AutoAssembler,” for instructions
on saving.
To close the project window:
Step  Action
1     Close the project window in one of the following three ways:
      ♦ Click the Close box in the upper-left corner
      ♦ Press Command-W (⌘-W)
      ♦ Choose Close from the File menu
2     If you have modified the project and have not saved the changes, a
      dialog box prompts you to save the changes into the project file:
      ♦ Click Don’t Save to continue the close operation without saving
        changes. In this case, the project file remains as it was when
        you last saved it.
      ♦ Click Cancel to discontinue the close operation.
      ♦ Click Save to save the changes.
Adding Sequences From Files
Introduction You can add sequences to a project from several types of files:
♦ Text files that you have created or exported from other applications
♦ Files created by ABI PRISM DNA Sequencing Analysis software
♦ Files from existing Inherit-accessed databases
The AutoAssembler software copies a minimum amount of information
from the sequence source file into the project and maintains a reference
to the source file. AutoAssembler preserves the integrity of the data in
the project by checking the system modification date of each source file
in the project.
Note
A single sequence can be included in more than one project, but the
file name of each sequence included in a single project must be unique within
the project.
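The modification-date check mentioned above can be pictured as follows. This is a minimal sketch of the general technique, assuming the project stores each source file's modification time when the sequence is added and compares it later; how AutoAssembler stores the date and what it does on a mismatch are not described in this manual, and the function and file names are hypothetical.

```python
import os
import tempfile


def record_source(path):
    """Store the reference plus the modification time seen when added."""
    return {"path": path, "mtime": os.path.getmtime(path)}


def source_changed(record):
    """True if the source file's modification time differs from the record."""
    return os.path.getmtime(record["path"]) != record["mtime"]


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "clone01.seq")
    with open(path, "w") as handle:
        handle.write("ACGT")

    record = record_source(path)
    unchanged = source_changed(record)

    # Simulate a later edit by pushing the modification time forward.
    later = record["mtime"] + 60
    os.utime(path, (later, later))
    changed = source_changed(record)

    print(unchanged, changed)  # False True
```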
From File and BioLIMS Projects
If you purchased the BioLIMS option with AutoAssembler, projects may
be designated as “From Files” or “BioLIMS.”
A project acquires this designation based on the first sequence that is
added to it. After that, only files of that type may be added. For example,
if you added a sequence from BioLIMS, then the project will be
designated a “BioLIMS” project, and only sequences from BioLIMS can
be added. All command selections change to reflect this. For example,
the command Add Sequences in the Project menu becomes Add
Sequences from BioLIMS, and so forth.
If you do not have the BioLIMS option, all your projects will be From
Files. To add sequences from the BioLIMS database, see “Adding
Sequences From the BioLIMS Database” on page 3-12.
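The designation rule described above (the first sequence added fixes the project type, after which only sequences of that type may be added) can be sketched as a small model. The class and method names are hypothetical and this is not AutoAssembler's data model; the uniqueness check reflects the earlier note that each sequence file name must be unique within a project.

```python
class Project:
    """Toy model of the project-type rule; names are illustrative only."""

    def __init__(self, name):
        self.name = name
        self.project_type = None   # assigned by the first sequence added
        self.sequences = []

    def add_sequence(self, sequence_name, source):
        if source not in ("From Files", "BioLIMS"):
            raise ValueError(f"unknown source: {source}")
        if self.project_type is not None and source != self.project_type:
            raise ValueError(
                f"{self.name} is a {self.project_type} project; "
                f"cannot add a {source} sequence")
        if sequence_name in self.sequences:
            raise ValueError(f"{sequence_name} is already in {self.name}")
        # The first successful add fixes the type for the project's lifetime.
        self.project_type = source
        self.sequences.append(sequence_name)


project = Project("Cosmid 12")
project.add_sequence("c12_001", "BioLIMS")
print(project.project_type)        # BioLIMS

try:
    project.add_sequence("local_read.seq", "From Files")
except ValueError as error:
    print(error)
```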
Adding Sequences to a From Files Project
When you add a sequence to a From Files project, the sequence’s
name and information are displayed below those of any other
sequences in the upper-right pane of the project window. You can add a
single sequence, individual sequences from various folders, or a group
of sequences from one folder.
To add a single sequence or a group of sequences:
Step  Action
1     Choose Add Sequence(s) from the Project menu. The following
      dialog box appears:
      Note Choose Add Multiple from the Project menu to select
      multiple files from different folders.
2     Select the “File type” checkboxes (“3XX” sample files, “TEXT,” or
      “Inherit”).
      Note The file list shows only files of the type selected.
3     Add a file or files in one of the following ways:
      ♦ To add only one file, double-click the filename, or select the file
        and click Add.
      ♦ To add all files of the chosen types that are in the open folder,
        click Add All.
      A progress indicator appears while the sequences are being added:
      If necessary, repeat Step 2 and Step 3 to add additional files.
Note
Save the project file at this point in order to preserve a copy of the
unassembled project.
The upper-right pane of the new project shows information about the
individual sequences. To specify how much information is displayed and
the sort order of the sequences, see the Introduction to “Viewing the
Sequence List” on page 3-25.
Removing Sequences from a Project
If necessary, you can remove an extraneous sequence from the list of
sequences in a project. The sequence is not deleted from your hard
disk; the file is simply removed from the current project.
To remove a sequence from the project:
Step  Description
1     Select a contig (or the Unassembled list).
2     In the sequence list, select the sequence you want to remove.
3     Choose Remove Sequence from the Project menu.
      IMPORTANT This command cannot be undone.
Once you have removed a sequence from the project window, you
cannot use Undo to replace it. You must add it again. The ID number
assigned to a removed sequence is not used again in the same project.
Note
Using Cut, Delete, or Clear removes characters from the selected
sequence, but does not remove the sequence itself.
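The statement that a removed sequence's ID number is not used again in the same project amounts to a counter that never steps backward. The sketch below illustrates that behavior only; the class name is hypothetical, and numbering from 1 is an assumption, since the manual does not say where IDs start.

```python
import itertools


class SequenceList:
    """Toy sequence list whose ID numbers are never reused."""

    def __init__(self):
        self._next_id = itertools.count(1)   # assumption: IDs start at 1
        self.by_id = {}

    def add(self, name):
        seq_id = next(self._next_id)
        self.by_id[seq_id] = name
        return seq_id

    def remove(self, seq_id):
        # Only the project entry is dropped; the counter keeps advancing.
        del self.by_id[seq_id]


sequences = SequenceList()
first = sequences.add("clone01")
second = sequences.add("clone02")
sequences.remove(second)
readded = sequences.add("clone02")   # same sequence, added again
print(first, second, readded)        # 1 2 3
```

Re-adding a removed sequence therefore gives it a new ID, which is consistent with the instruction above that a removed sequence must be added again rather than restored with Undo.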
Adding Sequences From the BioLIMS Database
Introduction Projects that are to be populated with sequences from the BioLIMS
database must be designated BioLIMS projects (see “Organizing a
BioLIMS Project” on page 3-4).
If you have not already configured the SybaseConfig control panel, you
must do so before establishing a connection with the database (see
“Configuring BioLIMS Access” on page 2-17).
The interface you use to access the BioLIMS database is called the
Sequence Chooser window. The Sequence Chooser window is
common to the following BioLIMS applications:
♦ Sample2DB
♦ Factura
♦ AutoAssembler
♦ Sequencing Analysis
Using the Sequence Chooser, you can search the BioLIMS database
for specific collections and sequences. Table 3-2 on page 3-17 lists the
five collection criteria and Table 3-3 on page 3-18 lists the nine
sequence criteria by which you can search.
In This Section This section includes the following topics:
For this topic                                  See page
Opening BioLIMS Access                          3-13
Displaying the Sequence Chooser Window          3-15
Parts of the Window                             3-16
Collection Search Criteria                      3-17
Sequence Search Criteria                        3-18
Searching the BioLIMS Database                  3-20
Opening BioLIMS Access
The Edit Session Information dialog box contains session information
for establishing a connection to the BioLIMS database.
To configure the BioLIMS access:
Step  Action
1     Choose BioLIMS Access from the Edit menu. The Edit Session
      dialog box appears.
2     In the text boxes, enter
      ♦ Your user name on the server
      ♦ The password for your server account
      ♦ The name of the database on the server (You may have access
        to more than one database on the server.)
      ♦ The server name
      IMPORTANT All these text boxes are case sensitive.
3     Click the checkbox labeled Save Password if you want your
      password saved so that you do not have to enter it every time you
      open the connection.
      Note If you plan on opening the connection via AppleScript, you
      should select this checkbox. Saving the password here eliminates
      the need to have the password included as part of the AppleScript.
4     If you want the database to open automatically when you start the
      AutoAssembler application, click the checkbox labeled Open on
      Launch.
5     If you intend to use more than one database or user account, enter
      an alias name for this session information.
      Use the pop-up menu to change, add, or remove aliases.
      If you have more than one alias, select the checkbox labeled Make
      Default to choose which one appears when you first open the Edit
      Session dialog box.
6     Click Open to open the connection to the database. Once a
      connection is established, you may add sequences to your project
      using the Sequence Chooser window (see the following section).
      If the connection fails, an alert dialog appears. Check the following:
      ♦ All the logon information was entered correctly and in the
        correct case.
      ♦ Your interfaces file is correctly configured.
        For more information, see “Configuring BioLIMS Access” on
        page 2-17.
      If the connection still fails, consult your BioLIMS database
      administrator or the BioLIMS System Administration manual.
Displaying the Sequence Chooser Window
To display the Sequence Chooser window, choose Add Sequences
from BioLIMS from the Project menu.
The Sequence Chooser window appears (Figure 3-4).
Figure 3-4 Sequence Chooser window. Labeled parts: Criteria pop-up
menu; Search button; Collection search criteria pop-up menus and text
boxes; Sequence search criteria pop-up menus and text boxes; Split bar;
Search results; Status line.
Parts of the Window
Table 3-1 describes the parts of the Sequence Chooser window that
were labeled in Figure 3-4.
Table 3-1 Sequence Chooser Window Parts
Item                      Description
Criteria pop-up menu      Use this pop-up menu to specify the search criteria
                          visible on the Sequence Chooser.
                          Note If you only intend to use a subset of criteria,
                          setting only those visible helps reduce clutter in the
                          window. However, the search results are the same
                          whether a criterion is hidden or is visible but blank.
Search button             Click this button to query the BioLIMS database.
                          Note You can also press the Return key to begin a
                          search.
Collection search         Use these pop-up menus and text boxes to define the
criteria pop-up menus     collection criteria of the search.
and text boxes            IMPORTANT Only those sequences that match
                          each and every criterion you specify are returned.
                          That is, search criteria are combined using the logical
                          AND operation.
                          For more information, see “Collection Search Criteria”
                          on page 3-17.
Sequence search           Use these pop-up menus and text boxes to define the
criteria pop-up menus     sequence criteria of the search.
and text boxes            IMPORTANT A collection is returned if one or
                          more of the sequences contained in it fulfill all of the
                          specified sequence criteria.
                          For more information, see “Sequence Search Criteria”
                          on page 3-18.
Split bar                 Drag this bar to alter the relative amount of space
                          allocated to the top and bottom portions of the
                          Sequence Chooser window.
Search results            After a successful query, found collections are listed in
                          this area as Name, Modification date, and Creator.
Status line               Error messages and other important information are
                          reported here.
                          For example, the Status Line lists how many
                          collections were returned in a search.
Collection Search Criteria
Table 3-2 shows the collection search criteria. The collections returned
by the Sequence Chooser must match all of the collection criteria and
contain at least one sequence that matches all of the sequence criteria.
Table 3-2 Allowed Collection Search Criteria
Criterion           Pop-up Menu Choices     Allowed Text              Description
Collection Creator  is, starts with,        up to 255 characters      Name of the creator/owner of
                    ends with, contains                               the collection
Collection Name     is, starts with,        up to 255 characters      Name of the collection
                    ends with, contains
Collection Type     any, run, project,      NA                        Collection type; default is any
                    other
Creation Date       is any, is, is before,  date only, set with       Date the collection was created
                    is after, is between    arrow buttons
Modification Date   is any, is, is before,  date only, set with       Date the collection was last
                    is after, is between    arrow buttons             modified
Sequence Search Criteria
Table 3-3 shows the sequence search criteria. The collections returned
by the Sequence Chooser must contain at least one sequence that
matches all of the specified sequence criteria.
Table 3-3 Sequence Search Criteria
Criterion           Pop-up Menu Choices     Allowed Text              Description
Sequence Creator    is, starts with,        up to 255 characters      Name of the person
                    ends with, contains     including letters,        responsible for the run
                                            numbers, and punctuation
Sequence Name       is, starts with,        up to 255 characters      Name of the sequence
                    ends with, contains     including letters,
                                            numbers, and punctuation
Sample Name         is, starts with,        up to 255 characters      Sample name from the
                    ends with, contains     including letters,        Sample Sheet
                                            numbers, and punctuation
Gel Path            is, starts with,        up to 255 characters      The full path name to the
                    ends with, contains     including letters,        original gel file, for example,
                                            numbers, and punctuation  Hard Disk:Data:GelRuns:L28t
Length              is any, is,             number                    The length of the most recent
                    is less than,                                     version of the sequence in the
                    is greater than,                                  database
                    is between
Status              any, nascent, prepare,  NA                        Status of the sequence; there
                    collect, analysis,                                are six stages of collection
                    cleanup, assembly                                 and analysis
Instrumentation     any, gel, capillary     NA                        Whether the sample was run on
                                                                      a gel or capillary instrument
Start Collect Time  is any, is, is before,  date only, set with       Date data collection began
                    is after, is between    arrow buttons
End Collect Time    is any, is, is before,  date only, set with       Date data collection ended
                    is after, is between    arrow buttons
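The way the criteria combine (every collection criterion must match, and at least one sequence in the collection must satisfy every sequence criterion) can be sketched in a few lines. This is an illustration of the stated logic only, not the Sequence Chooser's implementation: the record layout and function names are hypothetical, only the four text operators are modeled, case-sensitive matching is an assumption, and an empty list of sequence criteria is assumed to place no restriction on a collection.

```python
# Text operators from the pop-up menus in Tables 3-2 and 3-3.
OPERATORS = {
    "is": lambda value, text: value == text,
    "starts with": lambda value, text: value.startswith(text),
    "ends with": lambda value, text: value.endswith(text),
    "contains": lambda value, text: text in value,
}


def matches_all(record, criteria):
    """criteria: list of (field, operator, text); all must hold (AND)."""
    return all(OPERATORS[op](record[name], text) for name, op, text in criteria)


def search(collections, collection_criteria, sequence_criteria):
    """Return the names of the collections that satisfy the search."""
    hits = []
    for collection in collections:
        if not matches_all(collection, collection_criteria):
            continue
        # At least one sequence must fulfill all of the sequence criteria.
        if sequence_criteria and not any(
                matches_all(sequence, sequence_criteria)
                for sequence in collection["sequences"]):
            continue
        hits.append(collection["Collection Name"])
    return hits


collections = [
    {"Collection Name": "Cosmid 12", "Collection Creator": "jsmith",
     "sequences": [{"Sequence Name": "c12_001", "Sample Name": "A1"},
                   {"Sequence Name": "c12_002", "Sample Name": "A2"}]},
    {"Collection Name": "Cosmid 14", "Collection Creator": "jsmith",
     "sequences": [{"Sequence Name": "c14_001", "Sample Name": "B1"}]},
    {"Collection Name": "Plasmid 3", "Collection Creator": "mlee",
     "sequences": [{"Sequence Name": "p3_001", "Sample Name": "A1"}]},
]

print(search(collections,
             [("Collection Name", "starts with", "Cosmid")],
             [("Sample Name", "is", "A1")]))   # ['Cosmid 12']
print(search(collections, [], []))             # all three collections
```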
Searching the BioLIMS Database
Follow these steps to use the Sequence Chooser to search the
BioLIMS database for specific collections and sequences.
To find sequences using the Sequence Chooser:
Step  Action
1     Choose Add Sequences from BioLIMS from the Project menu.
      The Sequence Chooser window appears.
2     Use the items from the Find Collection with Criteria pop-up menu
      (below) to define your search.
      Note To list all of the items in the BioLIMS database, perform
      the search with no criteria specified. For large databases, this
      process may be slow.
3     To use the pop-up menu:
      Choose menu items...         To define the search for...
      above the horizontal line    Collection criteria
      below the horizontal line    Sequence criteria
      Note As you choose items from the pop-up menu, a black dot
      appears next to the item on the menu and the item is added to
      either the collection criteria or the sequence criteria section of the
      window.
      The following is an example of the Sequence Chooser window
      showing four collection search criteria and five sequence search
      criteria:
4     Use the pop-up menus and text fields to define your search query.
      When you are satisfied with the search, click Search.
      The results of the search appear in the lower portion of the window.
      Note Collections returned by the Sequence Chooser must
      match all of the collection criteria and contain at least one
      sequence that matches all of the sequence criteria.
5     To view the sequences contained in the collections, click the small
      triangle to the left of the collection name.
6     You can take the following actions:
      If you want to...                    Then...
      add a sequence                       a. Select a sequence.
                                              Note You can select multiple
                                              sequences by selecting the first
                                              sequence and then, while pressing
                                              the Shift key, Control key, Option
                                              key, or Command key (⌘),
                                              selecting the additional sequences.
                                           b. Click the Select button.
      close the Sequence Chooser window    Click the Close button.
Adding Sequences From BioLIMS
In BioLIMS, sequences are organized in collections. Sequences in
collections contain both the changed data and copies of the original
sequences, which remain on the database.
BioLIMS-based assembly projects can contain sequences from one or
more collections, and can contain some or all of the sequences from a
particular collection. However, when you use the AutoUpdating feature,
the project will contain all of the sequences in only one collection.
In order for the AutoUpdating feature to work, the project must have the
same name as the collection. If you give the project a different name,
autoupdating will not work.
Note
An alternate way to add files to a BioLIMS project is to name the
project after a collection, and then assign that project to be autoupdated. All
sequences in the collection will be added to the project. This is only useful if you
want every sequence from the designated collection.
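The name-matching rule behind AutoUpdating can be pictured as a lookup keyed on the project name. The sketch below only illustrates that rule as described here; the function name and the dictionary of collections are hypothetical, exact case-sensitive name matching is an assumption, and the real feature is set up as described in “Setting Up for AutoUpdating” on page 3-43.

```python
def autoupdate_additions(project_name, project_sequences, collections):
    """Return the sequences to add so the project mirrors its collection.

    collections maps each collection name to its list of sequence names.
    """
    if project_name not in collections:
        # No collection shares the project's name: nothing can be updated.
        return []
    present = set(project_sequences)
    return [name for name in collections[project_name] if name not in present]


collections = {
    "Project 1": ["seq_001", "seq_002", "seq_003"],
    "Other Runs": ["run_A", "run_B"],
}

print(autoupdate_additions("Project 1", ["seq_001"], collections))
# ['seq_002', 'seq_003']
print(autoupdate_additions("My Assembly", [], collections))
# []
```

The second call shows why a project with a different name is never updated: no collection can be identified for it.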
To add files to a BioLIMS project:
Step  Action
1     Open the project to which you want to add sequences.
      Note The project must either be a BioLIMS project (see
      page 3-9) or an empty project.
      IMPORTANT In order for autoupdating to function, the project
      must have the same name as the collection file from which you
      want to add sequences.
2     If BioLIMS access is not already open, select BioLIMS Access from
      the Edit menu.
      If necessary, modify any of the session information (see “Opening
      BioLIMS Access” on page 3-13).
3     Click Open, then OK.
4     Select Add Sequences from BioLIMS from the Project menu. The
      Sequence Chooser window appears:
5     Highlight a collection folder, or open a collection folder and highlight
      individual sequences within that folder.
      Note If necessary, you can search for selected files on the
      server by using the commands in the upper panes of the Sequence
      Chooser window (see “Searching the BioLIMS Database” on
      page 3-20).
6     Click Select to add the sequences to your project.
Removing Sequences from a Project
If necessary, you can remove an extraneous sequence from the list of
sequences in a project. The sequence is not deleted from the BioLIMS
database; the file is simply removed from the current project.
To remove a sequence from the project:
Step  Description
1     Select a contig (or the Unassembled list).
2     In the sequence list, select the sequence you want to remove.
3     Choose Remove Sequence from the Project menu.
      IMPORTANT This command cannot be undone.
Once you have removed a sequence from the project window, you
cannot use Undo to replace it. You must add it again. The ID number
assigned to a removed sequence is not used again in the same project.
Note
Using Cut, Delete, or Clear removes characters from the selected
sequence, but does not remove the sequence itself.
Viewing the Sequence List
Introduction You can change the sequence list by specifying what information is
displayed for each sequence, and by sorting the list using varied
criteria.
If the upper-right pane contains more information columns than you can
see, use the size box in the lower-right corner of the project window to
stretch the window to the right.
Changing the Information Displayed in the Sequence List
You can choose to display any of 13 fields containing information about
each sequence in the sequence list.
To change the information displayed in the sequence list:
Step  Action
1     Choose Format from the Project menu. The following dialog box
      appears:
      Table 3-4 on page 3-26 describes the various fields.
2     Make changes as follows:
      ♦ To add more columns of information, drag the fields you want
        to display from the Fields Available list to the Fields Displayed
        list.
      ♦ To remove columns of information, drag the appropriate fields
        from the Fields Displayed list to the Fields Available list.
      ♦ To change the order in which the fields are displayed, drag
        them up or down in the list.
      ♦ To see the effect of any changes you make without closing the
        dialog box, click Apply.
3     When you are satisfied with your changes, click Done.
Table 3-4 contains a list of the available sequence list fields and their
definitions.
Note
You may add as many fields to the “Fields Displayed” list as you want,
but you may not be able to see all the displayed fields without increasing the
size of the Project window.
Table 3-4 Project Sequence List Fields
Item          Description
Ambiguity     The percentage of ambiguities in the data.
Begin         The starting position of the sequence along the consensus of the contig.
Chemistry     The type of chemistry that was used for the run that produced the data (Sample files only).
DocID         An ID number assigned by the Server algorithm during Server assembly. Used for technical support purposes.
End           The ending position of the sequence along the consensus.
File          The name of the sequence file. Use the Show Names command to display it.
Gapped Len    The length of the sequence, including gaps added in assembly.
ID            The sequence ID number assigned when the sequence is added to the project.
Length        The number of nucleotides in the sequence.
Orientation   The orientation of the sequence, displayed as an arrow. This column is filled in after the sequence is in a contig.
Run Date      The date of the ABI sequencer run that produced the data, or the creation date for the file.
Sample        The name embedded in an ABI Sample file that was assigned when the data was sequenced.
Source        The file type (Sample, Inherit, Text, BioLIMS). ABII denotes an Inherit file.
Changing the Sort Order in the Sequence List
You can use the Sort command to change the order in which the
sequences are displayed in the upper-right pane of the project window.
To change the sort order in the sequence list:
Step
Action
1
Choose Sort from the Project menu. The following dialog box
appears:
Table 3-5 describes each of the sorting options.
2
Click the radio button beside the option you want to apply.
If you want to view the effect of your selection before closing the
dialog box, click the Apply button.
3
Click Done.
Table 3-5 describes the project sequence list sorting options.
Table 3-5 Project Sequence List Sorting Options
Option          Sort performed
Name            Sorts by the sequence filenames in numerical, then alphabetical order.
Date            Sorts by run date, from earliest to latest.
Begin           Sorts by the starting positions of the sequences along the consensus, from far-left to far-right.
End             Sorts by the ending positions of the sequences along the consensus, from far-left to far-right.
Length          Sorts by the number of nucleotides in the sequences, from least to most.
Gapped Length   Sorts by the gapped length, from lowest to highest.
Orientation     Sorts by the orientation of the sequence, normal orientation first.
Chemistry       Sorts by the chemistry type, in alphabetical order.
Sample Name     Sorts by the sample names in numerical, then alphabetical order.
Assembling Sequences
Introduction After adding sequences to a project, you can assemble them
automatically (after selecting the assembly parameters) by choosing
Assemble from the Project menu.
You can choose one of the following assembly options:
♦
Local–Conducts assembly by using a local algorithm and
parameters that you select
♦
Engine–Conducts assembly by using either a CAP or CAP Remote
algorithm (if you have purchased the CAP Remote option)
♦
Server–Conducts assembly by using an algorithm based on a
server, leaving you free to work on your Macintosh while the project
is being assembled (only available if you purchased the Server
Option)
Note
To read more about the algorithms, see Appendix B.
Note
Each assembly method may produce slightly different results.
These three options can be selected from the Assembly Setup dialog
box prior to assembling the project.
Assembling by Local Algorithm
The Local assembly option allows you to assemble data without the use
of server software or an assembly engine. Local assembly is faster than
the server for small projects, since the extra time required for moving
the project over a network is eliminated.
To assemble a project locally:
Step
Action
1
Choose Assembly Setup from the Project menu. The following
dialog box appears:
2
Click the Local icon in the Assemble box.
3
Set the Minimum Overlap and Percent Error (see page 3-38).
4
Click OK to set assembly parameters and close the dialog box.
or
Click Submit to assemble the project using the parameters you
selected.
5
Select Assemble from the Project menu. The following dialog box
appears while the project is being assembled:
Assembling Projects Using the Engine Options
The second method of assembling a project is by using an installed
assembly engine. The AutoAssembler software comes with the
following engine:
♦
CAP–Macintosh-based Contig Assembly Program
If you purchased the CAP Remote option, your engine options include
the following:
♦
CAP Remote–UNIX-based Contig Assembly Program
Both algorithms deliver the same results, but the CAP Remote option is
much faster for large projects, while also allowing you to work on your
Macintosh during assembly.
You may add additional engines by placing them in the Engines folder
located in the AutoAssembler folder. The new engines then appear in the
pop-up menu in Assembly Setup. You can also enter Assembly Engine
Parameters in the Assembly Setup dialog box.
Assembly Parameters
Table 3-6 shows the parameters that can be modified in the CAP engine
included with the AutoAssembler software. (These parameters also
apply to the optional CAP Remote engine.) If you have installed
additional engines, the parameters you can modify may be different.
Note
These parameters must be entered in all caps and preceded by a
hyphen (as shown in Table 3-6).
Table 3-6 Engine Assembly Parameters
Parameter    Default   Description
-OVERLEN     20        The minimum length of valid overlap required to join two sequences. Increasing this value will speed assembly and decrease the possibility of false overlaps. Decrease this value if you are assembling short sequences.
-FLEVEL      0.70      Minimum percentage of matching bases in a valid overlap. Increasing this value can speed assembly and reduce false overlaps.
-PERCENT     0.86      Minimum percentage of matching bases in the “best” part of any overlap. The “best” part of an overlap refers to the highest quality section of the overlapping bases. This value rarely needs to be modified.
-POS5        20        The number of bases at the beginning of a sequence that may be of lower quality than the following bases. Typically, sequences have more ambiguity towards the beginning and end of their length. By designating an area of lower certainty at the beginning of a sequence, the algorithm will assign lower penalties to mismatches or gaps that occur in these bases.
-POS3        450       Bases from this position to the end of a sequence may be of lower quality than the preceding bases. Typically, sequences have more ambiguity towards the beginning and end of their length. By designating an area of lower certainty at the end of a sequence, the algorithm will assign lower penalties to mismatches or gaps that occur in these bases.
-WORDSIZE    9         The size of a group of bases (word) that the engine uses to find potential matches. Increasing this value can greatly speed up the assembly engine. For example, a word size of 11 may increase assembly speed by 5 to 10 times. However, increasing word size also greatly increases memory requirements. For example, typical memory requirements for the assembly engine are 5 raised to the power of the word size, so a word size of 9 means 1.9 MB of free memory is required by the assembly engine. If you want to increase -WORDSIZE, first increase the memory allocated to the assembly engine (not the AutoAssembler program itself). For an example of how to increase the memory allocated to an application, see “Allocating More Memory” on page 2-15.
The following parameters should only be modified by expert users. These parameters modify the
score the assembly engine assigns to matches, mismatches, and gaps in overlapping sequences.
The total score must be greater than the OVERLEN value multiplied by the MATCH value for the
engine to consider an overlap valid.
Parameter    Default   Description
-MATCH       20        Score assigned to a correctly matched base in a potential overlap. Increasing this score will make the assembly engine more likely to consider overlaps valid.
-MISMAT      -40       Score assigned to an incorrectly matched base in a potential overlap. Increasing this score will make it harder for the assembly engine to find valid overlaps.
-LTMISM      -30       Score assigned to an incorrectly matched base residing in the area specified by the POS5 or POS3 parameters (defined above).
-OPEN        60        Penalty assigned to the first gap character in a run of gap characters in a potentially overlapping sequence.
-EXTEND      43        Penalty assigned to each subsequent gap (after the OPEN penalty has been assigned) in a potentially overlapping sequence.
-LTEXTEN     20        Penalty assigned to a gap in an area specified by the POS5 or POS3 parameters (defined above).
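The word-size memory rule and the validity threshold described in Table 3-6 can be illustrated with a short Python sketch. This is not the CAP engine's code: the function names are hypothetical, memory is assumed to be counted in bytes, and the way match, mismatch, and gap values are summed into a total score is an assumption. The low-quality-region parameters (-POS5, -POS3, -LTMISM, -LTEXTEN) are left out for brevity.

```python
# Illustrative sketch only. Defaults are copied from Table 3-6; the scoring
# model (a simple sum of per-base scores minus gap penalties) is an assumption.
DEFAULTS = {"OVERLEN": 20, "MATCH": 20, "MISMAT": -40, "OPEN": 60, "EXTEND": 43}


def wordsize_memory_bytes(wordsize):
    """Typical engine memory: 5 raised to the power of the word size.

    The unit (bytes) is assumed from the manual's example of 1.9 MB for 9.
    """
    return 5 ** wordsize


def overlap_score(matches, mismatches, gap_runs, params=DEFAULTS):
    """Sum a hypothetical total score for one candidate overlap.

    gap_runs is a list of gap-run lengths, for example [2] for one run of
    two gap characters. The first gap in a run costs OPEN and each later
    gap in the same run costs EXTEND.
    """
    score = matches * params["MATCH"] + mismatches * params["MISMAT"]
    for run_length in gap_runs:
        score -= params["OPEN"] + (run_length - 1) * params["EXTEND"]
    return score


def is_valid_overlap(score, params=DEFAULTS):
    """The total score must be greater than OVERLEN multiplied by MATCH."""
    return score > params["OVERLEN"] * params["MATCH"]


if __name__ == "__main__":
    for size in (9, 11):
        print(size, wordsize_memory_bytes(size) / 1_000_000, "MB (approx.)")
    candidate = overlap_score(matches=30, mismatches=1, gap_runs=[2])
    print(candidate, is_valid_overlap(candidate))
```

Under these assumptions, a word size of 11 needs roughly 25 times the memory of a word size of 9, which is why the engine's memory allocation should be raised before -WORDSIZE is increased.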
Engine Assembly
Note
Assembling a project with the Engine option also creates a temporary
file that can be imported and read by AutoAssembler (see page 3-41).
To assemble a project using the Engine option:
Step
Action
1
Select Assembly Setup from the Project menu. The following dialog
box appears:
2
Click the Engine icon in the Assemble box. The following options
appear:
3
Select either Cap or CapRemote from the pop-up menu.
If you are using an engine that supports user-entered parameters,
you may enter them now. See Table 3-6 on page 3-31 for a list of
parameters for the included assembly engine.
4
Click OK to set Assembly parameters and close the dialog box.
or
Click Submit to assemble the project using the parameters you
selected.
5
Select Assemble from the Project menu. The following dialog box
appears while the project is being assembled:
Using a Server to Assemble Project
The Server option is based on the Myers-Kececioglu model. This model
handles repeat sequences more efficiently and can be faster for large
projects than the Local option (but not the CAP Engines). Like the CAP
Remote option, the Server option allows you full use of your Macintosh
computer during assembly, since the computations are performed on a
remote server. Assembling on the Server allows you to use the optional
Fast Data Finder (FDF).
Note
The following procedure assumes that you have already logged on to a
server.
To assemble a project using the Server option:
Step
Action
1
Select Assembly Setup from the Project menu. The following dialog
box appears:
2
Click the Server icon in the Assemble box. The following
additional options appear:
The FDF filter parameters that appear in the expanded Assembly
Setup dialog box (when the “More” checkbox is selected) are the
parameters computed for a given set of Fragment Assembly (FA)
parameters, and are shown for your information only. This manual
does not describe how to directly set the FDF parameters (see
“FDF Parameters” on page 3-38).
3
Check the “Submit As New” checkbox.
4
Click OK to set Assembly parameters and close the dialog box.
or
Click Submit to assemble the project using the parameters you
selected.
5
Select Assemble from the Project menu.
FDF Parameters The More checkbox in the Assembly Setup dialog box does not appear
if you purchased the AutoAssembler software without the FDF. When
the FDF option is installed on your server, AutoAssembler uses the
Minimum Threshold and Error Rate parameters (also known as the FA
parameters) to automatically compute the FDF filter parameters. These
parameters are then used to perform an assembly in the FDF version of
the AutoAssembler program.
Although this manual does not provide instructions for setting the FDF
parameters, Table 3-7 lists parameters and their definitions.
Table 3-7 More Checkbox Parameters
Parameter   Definition
Window      Size of the FDF query
Offset      Number of bases skipped between two queries
Tolerance   Error tolerance applied to the FDF query
Overlap     Length of a sequence used by the FDF to extract queries
Hit Count   Number of contiguous hits used by the FDF filter to decide potential edges
Setting Minimum Overlap and Percent Error
The results of an assembly using the Local or Server options depend
heavily on the settings you enter for Minimum Overlap and Percent
Error. The two assembly algorithms use these parameters in slightly
different ways.
The number of contigs that result from assembly is dependent upon the
overlaps that occur in the source sequence files and the assembly
parameters you set. If the parameters are too stringent (Minimum
Overlap high and Percent Error low), sequences that belong together
may not be put into the same contig. If the parameters are too loose
(Minimum Overlap low and Percent Error high), sequences that do not
belong in the same contig may be put together anyway.
Server Algorithm
The Server algorithm calculates a statistical score that measures
similarity between overlaps and reduces a given overlap score for
errors (insertions, deletions, or mismatches) in the overlapping
segments. If the resulting value is less than the value you entered as
Minimum Overlap, or if the number of errors in the overlap exceeds the
number allowed by the Percent Error parameter, the algorithm ignores
the overlap.
To calculate the number of allowed errors, the program sums the
lengths of the two sequences being compared, and applies the Percent
Error value.
Example: Two sequences of comparable length
Assuming two sequences that are 200 base pairs and 300 base pairs
long, respectively, a Percent Error value of 10% yields 50 allowed
errors, calculated as follows:
(200 + 300) × 10% = 500 × 10% = 50 errors allowed in the overlap
IMPORTANT
Be careful if you are assembling sequences of diverse
lengths. Anything other than a very small Percent Error will allow a short
sequence to overlap completely with a long sequence, since the long sequence
determines that a large number of errors are allowed.
Example: Two sequences of diverse length
Assume that you want to assemble Sequence A (500 bp) and
Sequence B (10,000 bp). If you use a Percent Error value of 10% and a
Minimum Overlap of 10, the following calculations apply:
(500 + 10,000) × 10% = 10,500 × 10% = 1,050 errors allowed
The number of errors allowed is greater than the length of Sequence A,
which means that the program could align the two sequences if they
had 10 overlapping bases, regardless of the number of insertions,
deletions, or mismatches. In this case, a percent error value of 2% or
less might be more appropriate.
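The two Server examples above can be reproduced with a short Python sketch. The function name is hypothetical, and the sketch covers only the allowed-error arithmetic described in this section; how the Server algorithm rounds the result or computes its statistical score is not described here and is not modeled.

```python
def server_allowed_errors(length_a, length_b, percent_error):
    """Allowed errors as described for the Server algorithm.

    The lengths of the two sequences are summed and the Percent Error
    value (given here as a percentage, for example 10 for 10%) is applied.
    Rounding behavior is unknown, so the raw value is returned.
    """
    return (length_a + length_b) * percent_error / 100


if __name__ == "__main__":
    # Two sequences of comparable length (200 bp and 300 bp, 10%).
    print(server_allowed_errors(200, 300, 10))
    # Two sequences of diverse length (500 bp and 10,000 bp), at 10% and 2%.
    print(server_allowed_errors(500, 10_000, 10))
    print(server_allowed_errors(500, 10_000, 2))
```

Comparing the result with the length of the shorter sequence is a quick way to spot a Percent Error value that is too permissive for sequences of very different lengths.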
Note
If you set the Percent Error value to zero, the Minimum Overlap value
describes the number of bases required for an overlap.
As you enter larger Minimum Overlap values, the time required for
assembly decreases. The default value, 10, is a conservative starting
point for this parameter.
Local Algorithm
The Local algorithm uses the Minimum Overlap parameter simply as
the minimum number of bases allowed in the overlap. The Percent Error
parameter specifies the percentage of errors allowed in the overlap.
Example: Local algorithm
If the overlap consists of 10 bases and the Percent Error value is 10%,
an overlap would be allowed with 9 matching bases and 1 error.
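The Local rule can be sketched in Python as follows. The function name is hypothetical, and the inclusive comparison (an error rate exactly equal to Percent Error is accepted) is inferred from the example above rather than stated by the manual.

```python
def local_overlap_allowed(overlap_length, errors, min_overlap, percent_error):
    """Check a candidate overlap against the Local algorithm's two settings.

    min_overlap is the minimum number of bases allowed in the overlap, and
    percent_error is the percentage of errors allowed within that overlap
    (for example 10 for 10%).
    """
    if overlap_length < min_overlap:
        return False
    # Integer arithmetic avoids floating-point surprises at the boundary.
    return errors * 100 <= percent_error * overlap_length


if __name__ == "__main__":
    # The example from the text: 10 bases, 9 matches and 1 error, at 10%.
    print(local_overlap_allowed(10, 1, min_overlap=10, percent_error=10))
    # A second error pushes the overlap over the limit.
    print(local_overlap_allowed(10, 2, min_overlap=10, percent_error=10))
```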
IMPORTANT
Using either algorithm, if you set the Minimum Overlap value
too low or the Percent Error value too high for a particular set of sequences,
random similarities can produce false overlaps. If you set the Minimum Overlap
value too high or the Percent Error value too low, ambiguities and sequencing
artifacts nested in the overlapping regions at the ends of the sequences might
cause the algorithm to miss real overlaps.
The Assembled Project
The time required to generate overlaps and multiple alignments can
vary depending on the number of sequences and the amount of overlap
between the sequences.
When assembly is complete, the status message dialog box disappears
and the assembled results appear in the project window (see
Figure 3-5).
Figure 3-5 Assembled project (in the Layout view)
Callouts in the figure:
♦ After assembly, the sequence list displays the sequences associated
with the contig selected in the contig list.
♦ The graphic display defaults to the Layout view.
Contig Names
The contig name appears in the upper-left pane of the window. When
you select a name, the component sequences of the contig appear in
the sequence list. Contigs are named after the first sequence in the
project and numbered incrementally each time you assemble the
project. For example, a contig titled “ox208.Contig.4” is the result of the
fourth assembly of a project containing sequence ox208 as its first
sequence.
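The naming pattern can be expressed as a small Python sketch, which may be useful when sorting or filtering exported contig names outside AutoAssembler. Both function names are hypothetical, and the pattern is inferred from the single example given above.

```python
def contig_name(first_sequence_name, assembly_count):
    """Build a contig name from the first sequence and the assembly count."""
    return f"{first_sequence_name}.Contig.{assembly_count}"


def parse_contig_name(name):
    """Split a contig name back into (first sequence name, assembly count)."""
    base, separator, count = name.rpartition(".Contig.")
    if not separator or not count.isdigit():
        raise ValueError(f"not a contig name of the expected form: {name!r}")
    return base, int(count)


if __name__ == "__main__":
    print(contig_name("ox208", 4))
    print(parse_contig_name("ox208.Contig.4"))
```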
Sequence Names
The diamond shapes no longer appear beside the sequence names,
since the sequences have been assembled. The bottom pane of the
window is a graphic display of the aligned sequences. The views are
described in “Understanding the Project Window Views” on page 4-2.
Note
This is a good time to save the project. If you saved it before assembly,
and want to preserve the unassembled file, use the Save As command and give
the assembled file a different filename.
Importing Assembled Projects
When you assemble a project using the CAP engine, two temporary
text files are created and saved in the Assembly Data folder (located in
the AutoAssembler folder). These two files are:
♦
Engine input file
♦
Engine output file (identified by the suffix “.asmg”)
AutoAssembler can import and read the engine export files (that is, files
whose names end with “.asmg”) as if they were assembled project files.
Export files are smaller than an assembled project file, and can be more
easily sent over relatively slow network connections (for example, the
Internet).
When imported to AutoAssembler, these files can be read, saved, or
printed. However, you should not attempt to edit and reassemble these
files, because they do not have the full range of sequence data.
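If engine output files are collected on another machine before being sent on, a short Python sketch like the one below can list them by their “.asmg” suffix. The function names and the folder path passed in are assumptions for illustration; only the suffix comes from this manual.

```python
from pathlib import Path

ENGINE_OUTPUT_SUFFIX = ".asmg"  # suffix of engine output files, per the manual


def is_engine_output_name(filename):
    """Return True if the file name ends with the engine output suffix."""
    return filename.endswith(ENGINE_OUTPUT_SUFFIX)


def find_engine_output_files(assembly_data_folder):
    """List engine output files directly inside the given folder, sorted."""
    folder = Path(assembly_data_folder)
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and is_engine_output_name(path.name)
    )


if __name__ == "__main__":
    # "." stands in for a copy of the Assembly Data folder (hypothetical path).
    for path in find_engine_output_files("."):
        print(path.name)
```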
To import assembly output files:
Step
Action
1
Select Import from the File menu. The following dialog box appears:
2
Open the Assembly Data folder and select an export file (a file
ending with “.asmg”).
3
Click Open.
The file opens in a new project window.
Setting Up for AutoUpdating
Introduction With the BioLIMS option, you can set a project to be automatically
updated via a network connection every time sequences are added or
changed. Sequences are assigned to an AutoAssembler project with the
Add Sequences from BioLIMS command (see page 3-12).
Opening BioLIMS Access
The Edit Session Information window controls how your Macintosh
communicates with the BioLIMS database (see “Opening BioLIMS
Access” on page 3-13 to configure the connection).
Configuring AutoUpdating
The names of the projects to be updated are determined by
preferences you set in the AutoUpdate Settings dialog box. While the
project is being updated, you will not be able to use your Macintosh
computer. AutoAssembler will not cause other programs on the
Macintosh to fail, but will override them during autoupdating.
To configure AutoUpdating:
Step
Action
1
Select AutoUpdate Settings from the Edit menu. The following
dialog box appears:
2
Click the Add Project button and select the project(s) you want to be
automatically updated in the Open File dialog box.
3
To remove a file from the list, select it and click Remove.
4
Select the “Activate Automated Update” checkbox.
Note
Choose at least a 10-minute wait before starting the
update. Once updating has started, the AutoAssembler software
will periodically “take control” of the Macintosh computer in order to
update the sequences. While this will not cause any programs you
are running to fail, it may become disruptive to your work.
5
Click OK to accept the settings.
Changing and Adding Sequences
AutoAssembler maintains links between sequences on the BioLIMS
database and the projects to which they have been added. Each time
those sequences are modified, the modified version of the sequence is
sent to the project, updating it. Sequences that have been processed by
Factura and are added to a project’s collection will be automatically
added to the project and the project will be automatically assembled.
Adding Sequences using AutoUpdating
An alternate way to add files to a BioLIMS project is to name the project
after a collection, and then assign the project to be autoupdated. All
sequences in the collection that have been processed in Factura will be
added to the project and the project will be automatically assembled.
While the Project is Being Updated
Typically, if you are using BioLIMS to automatically update an
AutoAssembler project, the Macintosh the project resides on is left
unattended during the updating process. Turn off programs that
periodically display user prompts (for example, a calendar program
reminding you of an upcoming meeting) to prevent interference with
AutoAssembler.
Turning Off AutoUpdating
When a project no longer needs to be updated, or you need to use the
Macintosh on which the project is stored, turn off AutoUpdating.
To turn off AutoUpdating:
Step
Action
1
Select AutoUpdate Settings from the Edit menu. The following
dialog box appears:
2
Deselect the “Activate Automated Update” checkbox.
3
Click OK.
4 Viewing the Consensus
Overview
Introduction After assembling sequences, you can display the consensus and the
underlying sequences in one of three project window views. This
chapter discusses the views, and how to change the parameters that
affect how the views appear in the project window.
In This Chapter This chapter contains the following topics:
Topic                                      See Page
Understanding the Project Window Views     4-2
Displaying Electropherograms               4-9
Changing the Display Parameters            4-12
Manipulating Window Displays               4-18
Locating Sequences                         4-21
Viewing the Consensus 4-1
Understanding the Project Window Views
Introduction After you submit a project for assembly, the contig that appears in the
lower pane of the project window can be displayed in three different
views. It is important that you understand the different views so that you
can efficiently edit the sequences. Table 4-1 provides a brief description
of each view.
Table 4-1 Project Window Views
View         Description
Layout       The default that appears after assembly. Arrows display the orientations and relative positions of assembled sequences. Click the button shown at left to return to the Layout view from any other view. Zooming in from this view shows individual nucleotides as colored bars, and the consensus displays half-height bars at positions of lower certainty, to facilitate editing. Double-click a sequence to see its electropherogram.
Alignment    Shows the specific nucleotide order of each sequence in a region of the contig. In the Alignment view, you can show a constantly spaced electropherogram for each sequence, providing a useful editing tool. Click the button shown at left to change to the Alignment view from any other view. Double-click a sequence to see its electropherogram.
Statistics   Shows redundancy plotted against consensus base. You can set criteria for the level of redundancy or orientation you consider acceptable. User-definable colors identify certain areas of the sequence. Click the button shown at left to change to the Statistics view from any other view.
All views have an axis that represents the consensus sequence and
indicates the positions of bases in the currently selected contig. In the
Layout and Alignment views, the consensus sequence axis shows a
user-specified color at positions representing ambiguous bases. The
Statistics view shows level of redundancy along the consensus.
Layout View After you assemble sequences, the Layout view is the default that
appears in the lower pane of the project window. This view graphically
represents the sequence orientations and relative positions in the
contig (see Figure 4-1). You can zoom in from this view to observe the
nucleotides of the individual sequences.
Figure 4-1 The Layout view in the project window
Callouts in the figure:
♦ The selected sequence is highlighted.
♦ A box shows the position of the selected sequence in the consensus.
♦ The consensus sequence is represented by an axis.
♦ Arrows show orientation of sequences.
In the Layout view, each arrow represents a single sequence, and the
direction of the arrow indicates the sequence’s relative orientation. The
axis across the top of the lower pane in Figure 4-1 represents the full
consensus of the contig. Its length is marked in bases, and gray (the
default color) shows the positions of ambiguities.
Identifying Sequences
Sequence names are synchronized with the display in the Layout view,
so clicking either an arrow (in the Layout view) or a name (in the
sequence list) will highlight the corresponding sequence.
To determine the name of a sequence in the Layout view:
Step
Action
1
Click the sequence in the Layout view. The corresponding
sequence name is highlighted in the sequence list.
or
Click a sequence name in the sequence list and the corresponding
sequence is highlighted in the Layout view.
Displaying File Names
To make it easier to identify files in the Layout view, you can display the
sequences with their sample file names.
To display the file names of sequences:
Step
Action
1
To display file names in the Layout view, choose Show Names from
the Project menu.
The project window should now look like this:
Note
Displaying file names in the sequence list is described under “Viewing
the Sequence List” on page 3-25.
Zooming In
You can obtain closer resolution of the Layout view by zooming in,
which displays individual nucleotides as colored bars.
To zoom in from the Layout view:
Step
Action
1
Select a region you want to examine more closely by clicking it or
dragging the cursor over a range of the consensus axis.
2
Choose Zoom In (⌘-=) from the Window menu.
Note
If you continue to Zoom In, the view switches to the
Alignment view.
Figure 4-2 shows the Layout view after two Zoom In commands. You
can also get to this display by using the Zoom Out command from the
Alignment view.
Figure 4-2 Layout view after zooming in
Callouts in the figure:
♦ Bases are shown as colored bars, and lowercase bases appear as
half-height bars in the consensus sequence.
♦ Ambiguity characters appear below the consensus sequence.
♦ Characters other than upper- or lowercase A, C, G, and T appear as
short black bars.
The Zoom command can be used to facilitate editing. For example, in
Figure 4-2, positions at which the consensus base calls are less certain
appear as half-height bars in the ambiguity color (the default is gray).
Locations identified with codes other than upper- or lowercase A, C, G,
and T (for instance, other IUB codes or gaps) appear as black markers
that are one-fourth the height of the normal bars. The ambiguity
character appears below each ambiguous position in the consensus
sequence.
Alignment View The Alignment view allows fast and easy editing of ambiguities in the
sequences. You can quickly show, zoom, and scale synchronized
electropherograms that are constantly spaced so that their peaks match
the base positions in the sequences.
Note
You can switch directly to this view from the Layout view by choosing
Actual Size (⌘-]) in the Sequence menu, or by clicking the button at left.
Figure 4-3 The Alignment view
Callouts in the figure:
♦ Lowercase characters in the consensus indicate positions of lower
certainty.
♦ Arrows in the sequence list indicate sequence orientations.
♦ Ambiguity characters.
♦ When you click a sequence in the Alignment view, the corresponding
electropherogram is displayed below.
In the Alignment view, you can easily observe specific nucleotide
sequences and the nature of marked ambiguities. The consensus
sequence appears at the top of the pane, and ambiguity characters
below it mark ambiguous base positions. The individual overlapping
sequences appear below the ambiguity characters.
Consensus Characters
The characters of the consensus sequence vary with the composition of
the underlying sequences. A lowercase character in the consensus
sequence indicates a position of lower certainty. Such characters are
marked with the ambiguity color you specify using the Settings
command (see page 4-15).
♦ A lowercase a, c, g, or t in the consensus sequence indicates that at
least the threshold value (but less than 100 percent) of the aligned
bases at that position are called as A, C, G, or T, respectively (see
page 4-16). Therefore, a possibility exists that the base at that
position might not be correctly called.
♦ A lowercase IUPAC (or IUB) code letter in the consensus sequence
indicates that no single base is represented by the threshold value
or more of the calls at that position in the underlying sequences
(see page 4-16). See Appendix C, “Key Codes,” for a
table of the IUPAC/IUB codes.
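The two rules above can be illustrated with a Python sketch that picks a consensus character for one aligned position. This is not AutoAssembler's algorithm: the function name is hypothetical, the uppercase result for unanimous calls is inferred from the “less than 100 percent” wording, gaps and existing ambiguity codes in the underlying sequences are simply ignored, and the choice of IUPAC code (one covering every base observed at the position) is an assumption.

```python
from collections import Counter

BASES = frozenset("ACGT")

# Standard IUPAC/IUB nucleotide codes, keyed by the set of bases they cover.
IUPAC_CODES = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def consensus_char(column, threshold):
    """Pick a consensus character for one aligned position.

    column is an iterable of single-character base calls and threshold is
    a fraction between 0 and 1 (for example 0.75).
    """
    calls = [c.upper() for c in column if c.upper() in BASES]
    if not calls:
        raise ValueError("no A, C, G, or T calls at this position")
    counts = Counter(calls)
    base, top = counts.most_common(1)[0]
    if top == len(calls):
        return base  # every call agrees: full certainty, uppercase
    if top / len(calls) >= threshold:
        return base.lower()  # threshold met, but not unanimous
    # No single base reaches the threshold: lowercase ambiguity code.
    return IUPAC_CODES[frozenset(counts)].lower()


if __name__ == "__main__":
    for column in ("AAAA", "AAAG", "AAGG", "ACGT"):
        print(column, consensus_char(column, threshold=0.75))
```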
Viewing Sequences
The orientation of each sequence is recorded by arrows in the
sequence list. As in the Layout view, you can click a sequence name in
the sequence list to identify the graphic representation of the sequence
in the lower pane. You can also click a sequence in the lower pane to
highlight the corresponding sequence name in the upper-right pane.
The Statistics View The Statistics view displays redundancy and orientation information
about the contig, and is useful for finding regions of the consensus that
do not have enough underlying data. You select the minimum number
and orientation of underlying sequences across the consensus, and the
Statistics view highlights areas that do not meet your standards.
Click the button shown at left to display the Statistics view
(see Figure 4-4).
Figure 4-4 Project window with the Statistics view displayed
Callout in the figure: Horizontal line at 3 shows minimum redundancy.
Displaying the Consensus
The Statistics view displays the consensus sequence as redundancy
plotted against consensus base. You can set criteria for the level of
redundancy or orientation you consider acceptable.
User-definable colors identify certain areas of the sequence. The
default colors are as follows:
♦
Red indicates where the data falls below the minimum orientation
settings.
♦
Blue indicates where the data falls below the minimum redundancy.
♦
Gray indicates where the data is acceptable according to the
specified settings.
To locate the region representing a particular sequence, select the
sequence name in the sequence list. A highlight in the lower pane
indicates the range of the sequence that is highlighted in the sequence
list.
Statistics View Parameters
The information that appears in the Statistics view is based on
parameters you set (see “Changing Statistic View Parameters” on
page 5-20).
Your settings determine the minimum number of overlapping
sequences to be considered acceptable, and the proportion of the total
that must be either one orientation or the other.
The default setting is the 2+1 rule. This means that for a consensus
base to meet a certain quality standard, a minimum of three underlying
sequences must exist at that position. At least two of the underlying
sequences must be one orientation, and at least one must be the
opposite orientation.
Note
You do not need to specify the actual sequence orientations. For
example, when the values are 2+1, this stipulates either two forward and one
reverse, or two reverse and one forward.
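The 2+1 rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the AutoAssembler implementation; the function name, its parameters, and the order in which the blue and red conditions are checked are assumptions.

```python
def classify_position(forward, reverse, major=2, minor=1):
    """Classify one consensus position under a major+minor rule (default 2+1).

    forward and reverse are the numbers of underlying sequences in each
    orientation. Which orientation supplies the larger share does not matter.
    The returned color names follow the default colors listed in this section.
    """
    total = forward + reverse
    if total < major + minor:
        return "blue"   # below the minimum redundancy
    if max(forward, reverse) < major or min(forward, reverse) < minor:
        return "red"    # enough sequences, but not in both orientations
    return "gray"       # acceptable


print(classify_position(2, 1))  # two forward, one reverse
print(classify_position(1, 2))  # one forward, two reverse
print(classify_position(3, 0))  # three sequences, all one orientation
```

Two forward plus one reverse and one forward plus two reverse are treated identically, which matches the Note above.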
The Zoom Command
The Zoom command allows you to closely examine an area of the
consensus or the underlying sequences. The Zoom command is
available from both the Layout and Alignment views, and can be used to
transition between the two views (for example, zooming in from the
Layout view three times switches the project window to the Alignment
view). See “Zooming In” on page 4-5 and “Using an Electropherogram
to Resolve Ambiguities” on page 5-6 for specific procedures and uses
of the Zoom command.
Displaying Electropherograms
Introduction
Sequence electropherograms can be displayed in either the Layout or
Alignment views of the project window. In the Alignment view, the
electropherograms are constantly spaced, so they can be used to
resolve ambiguous base calls.
Opening Electropherogram Displays
Electropherograms can be displayed in the Layout and Alignment views
of the project window, for either single or multiple sequences.
To display electropherograms:
Step  Action
1     Display electropherograms in one of the following ways:
      ♦ Choose Show Electropherogram(s) from the Sequence menu
        (displays an electropherogram for the selected sequence(s)).
      ♦ Double-click a sequence in the Layout or Alignment view.
      ♦ Double-click an area of the consensus sequence. This
        displays electropherograms for the individual sequences in the
        selected area of the consensus.
Hiding Electropherogram Displays
You can also hide electropherograms once they are displayed in
either view.
To hide electropherograms:
Step  Action
1     Hide electropherograms in one of the following ways:
      ♦ Choose Hide All Electropherograms from the Sequence menu.
      ♦ Double-click a sequence in the Layout or Alignment view that
        has an electropherogram displayed.
      ♦ Select a sequence from the sequence window that has its
        electropherogram displayed, and choose Hide
        Electropherogram from the Sequence menu.
Changing Electropherogram Appearance
You can change the horizontal and vertical spacing of the
electropherogram peaks by zooming and scaling. You can change the
spacing with the mouse, or by changing the settings in the Settings
dialog box (see “Changing Row Height and Vertical Scale” on
page 4-14).
Changing Horizontal Scale
When you change the horizontal scale of an electropherogram, the
character spacing of the sequence changes as well. Since the
character size is global to the Alignment view, changing the horizontal
scale changes the spacing of all sequences in the Alignment view.
To scale electropherograms horizontally:
Step  Action
1     Place the cursor over an electropherogram.
2     Press and hold down the Shift-Option keys.
      The cursor changes to a peak shape with a horizontal arrow.
3     As you hold down the Shift-Option keys, click a peak and drag to
      the left or right.
      When you release the mouse button, all electropherograms are
      rescaled to the new width.
      Note  If you reduce the electropherogram to its minimum
      horizontal spacing, the project window shifts to the Layout view.
Changing Vertical Scale
When you change the vertical scale of an electropherogram, the vertical
scale of all other displayed electropherograms changes as well. The
rows containing the peaks, however, do not change size, and the peak
tops are clipped to the row height.
To scale electropherograms vertically:
Step  Action
1     Place the cursor over an electropherogram.
2     Press and hold down the Option key.
      The cursor changes to a peak shape with a vertical arrow.
3     As you hold down the Option key, click a peak and drag up or down.
      When you release the mouse button, all electropherograms are
      rescaled to the new height.
Changing Row Height
Change the row height to compensate for changes to the vertical scale.
To change the row height:
Step  Action
1     Move the cursor to the bottom of an electropherogram.
      When it is near the horizontal line that marks the bottom of the row,
      the cursor changes to a bidirectional arrow.
2     Click the horizontal line and drag up or down.
      When you release the mouse button, all electropherograms are
      drawn in proportion to the new row height.
Changing the Display Parameters
Introduction
You can use the Settings dialog box to specify the following:
♦ Row height and vertical scale of displayed electropherograms
♦ The minimum separation between sequences that are displayed in
  the same line
♦ The color of base sequences
♦ The color of ambiguous base sequences
♦ The characters that represent insertions in the consensus
  sequence, or ambiguity between the component sequences of the
  consensus
♦ The minimum threshold for consensus bases
♦ The manner in which forward and reverse strands are displayed
♦ Network connection preferences
IMPORTANT
If you are using a network version of the AutoAssembler
software, the network parameters appear at the bottom of the dialog box. These
were set up when your system was installed or by your system administrator. If
your Macintosh computer is unable to communicate with the server, see your
system administrator. Do not change the Host or Service parameter values
unless you are instructed to do so by your system administrator.
Opening the Settings
To display the Settings dialog box (see Figure 4-5), choose Settings
from the Edit menu.
Figure 4-5 The Settings dialog box
Callouts in the figure:
♦ Sets the height of the rows that display electropherograms
♦ Sets the relative height of the electropherogram in the row
♦ Click any of the color boxes to display a color picker and change the
  default color
♦ Choose the insertion and ambiguity characters, and the base threshold
♦ Choose how forward and reverse strands will be displayed
♦ ID numbers for a connected Server
Changing Row Height and Vertical Scale
You can change the appearance of electropherograms by using the
Settings command. Changes that you make to displayed
electropherograms using the mouse (see page 4-10) are reflected in
the Settings dialog box.
To scale electropherograms vertically, or to change row height using the
Settings command:
Step  Action
1     Choose Settings from the Edit menu. The Settings dialog box
      appears.
2     Type a new number in the Vertical Scale entry field and the Row
      Height entry field.
      The number in the Vertical Scale entry field expresses the peak
      height relative to the number (in inches) in the Row Height entry
      field. For example, if you set the Row Height to 1 and the Vertical
      Scale to 0.5, the peaks are scaled to half the height of the row that
      displays them.
3     Click OK.
      Note  Clicking Default Settings resets the fields to the program
      defaults.
Changing Minimum Separation
Sequences are often displayed in the same line in the project views
in order to conserve vertical space and increase the readability of the
views. How close together the sequences can be depends on the
settings you enter.
To change the distance between sequences that are displayed on the
same line:
Step  Action
1     Change the value in the entry field labeled “Min Separation.”
      This parameter defines the minimum length (number of characters)
      that must exist between two sequences if they are to be displayed
      on the same line.
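The effect of the Min Separation value can be pictured with a simple greedy layout. The following Python sketch is only an illustration: the manual does not describe the actual layout algorithm, so the first-fit strategy, the function name, and the (start, end) representation of a sequence are all assumptions.

```python
def pack_rows(spans, min_separation):
    """Assign (start, end) spans to display rows using a first-fit strategy.

    A span may share a row with the span before it only if at least
    min_separation positions lie between them.
    """
    rows = []  # each row is a list of spans, ordered left to right
    for span in sorted(spans):
        start, _end = span
        for row in rows:
            previous_end = row[-1][1]
            if start - previous_end >= min_separation:
                row.append(span)
                break
        else:
            rows.append([span])  # no existing row has room
    return rows


spans = [(0, 100), (105, 200), (150, 300)]
print(pack_rows(spans, min_separation=10))
print(pack_rows(spans, min_separation=5))
```

With a separation of 10, the second sequence starts too close to the first and is pushed to a new row; with a separation of 5 it fits on the same row.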
Selecting Base Color
You can change the appearance of any of the bases in order to make
them easier to see on the monitor you are using. Changing the
ambiguous base color may make these bases easier to see in the
Layout view.
To change the color used to mark bases:
Step  Action
1     Select Settings from the Edit menu.
2     Click any of the base color buttons. The color picker appears.
3     Select a new color by clicking on the color wheel, using the scroll
      bar, or scrolling fields.
4     Click OK when you are finished.
Note  Do not duplicate any colors that are used to distinguish bases.
Changing Consensus Characters
To change the insertion or ambiguity characters, enter a new character
in the Insertion Char or Ambiguity Char entry field.
♦ The Insertion Char field defines the character that indicates
  insertions in the consensus sequence.
♦ The Ambiguity Char field specifies the character that depicts
  ambiguity at a particular base position of the consensus. This
  character is placed on a separate line below the consensus
  sequence in the Alignment view. For an example, see Figure 4-3 on
  page 4-6.
Changing Threshold Value
The value in the Threshold field dictates when ambiguity
characters are displayed in the consensus sequence. The value is the
minimum percentage of bases in the underlying sequences of the
consensus that must be the same at a given position.
For example, if there are three As and one G in the four sequences
underlying the consensus at a particular base position (75 percent
agreement), an A is displayed in the consensus if the threshold value is
75 percent or lower. If the threshold is higher, a lowercase (lower
certainty) or ambiguity character is displayed instead.
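The threshold test can be sketched as follows. This is a simplified illustration, not the program's algorithm: the function name is hypothetical, "*" is used as a stand-in for the ambiguity character, and the sketch always returns the ambiguity character when the threshold is not met, because the manual does not say when a lowercase base is shown instead.

```python
from collections import Counter


def consensus_call(column, threshold=80.0, ambiguity_char="*"):
    """Return the consensus character for one alignment column.

    column is a non-empty string holding the base from each underlying
    sequence at this position. threshold is a percentage; 80 matches the
    default value given in this manual.
    """
    counts = Counter(column.upper())
    base, count = counts.most_common(1)[0]
    agreement = 100.0 * count / len(column)
    return base if agreement >= threshold else ambiguity_char


print(consensus_call("AAAG", threshold=75))  # three As and one G, 75 percent
print(consensus_call("AAAG", threshold=80))  # same column, stricter threshold
```

The two calls correspond to the example in the text: three As and one G pass a 75 percent threshold but fail an 80 percent threshold.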
Changing Orientation Parameters
By default, forward and reverse sequences are displayed the same way
in the Alignment view. If, however, you must make the different
directions easy to distinguish, you can change the text styles in which
the bases are displayed.
To change the way sequences are displayed:
Step  Action
1     Select Settings from the Edit menu.
2     Select the Forward or Reverse radio button.
3     Select one of the three checkboxes to determine how forward or
      reverse strands will be displayed.
4     Click OK when you are finished.
Changing Network Parameters
These parameters should have been set when the program was
installed or by your system administrator. Do not attempt to change
these settings without the permission of your system administrator.
Manipulating Window Displays
Introduction
The AutoAssembler software allows you to manipulate the project and
sequence windows in order to get a better look at your data. You can
also clone the project window in order to see multiple views of the same
data.
Arranging Multiple Windows
When you have opened more than one sequence window, you can
quickly organize the open windows by tiling or stacking them with
the corresponding commands in the Window menu.
Tiling
To arrange the windows so they do not overlap and a good-sized
portion of each is visible, choose Tile from the Window menu. This
method is useful when you have only a few windows open. Figure 4-6
shows an example of tiled windows.
Figure 4-6 Tiled windows (figure callouts: project window, sequence windows)
Stacking
Choose Stack from the Window menu to arrange a large number of
open windows so they are reduced in size, and stacked from back to
front so that an edge of each is visible. When the windows are stacked,
you can bring any window to the front by clicking the exposed edge of
that window or selecting the project file name from the Window menu.
Figure 4-7 shows an example of stacked windows.
Figure 4-7 Stacked windows (figure callouts: project window, sequence windows)
Cloning the Project Window to See Multiple Views of the Data
If you want to look at your data in more than one format, or if assembly
has resulted in more than one contig and you want to compare them,
you can create more than one window of the project by cloning the
original window. Each clone displays the data independently, so you
can look at several levels of data. For example, you can display the
Layout and the Alignment views at one time, or you can display a
different contig in each window.
To clone the project window:
Step  Action
1     Click the project window to make it the active window.
2     Choose Clone from the Window menu.
Figure 4-8 shows an example of a cloned project window displaying two
different views.
Figure 4-8 Cloned project window showing the Layout and Alignment views (figure callout: when you select a range of bases, the same range is selected in the cloned window)
Since both windows display the same project, they have the same
name in the Window menu. The name with a checkmark beside it in the
Window menu is the frontmost window on the screen.
Locating Sequences
Introduction
The AutoAssembler software provides tools for locating particular
sequences in each contig in the project file, and for locating specific
patterns in a sequence. These tools are:
♦ Find (Again): finds sequences in the project window, and patterns
  in the Sequence view of the sequence window
♦ Search (Again): searches for patterns within specific sequences in
  all contigs of the project
Finding Sequences and Patterns
You can use the Find and Find Again commands in the Edit menu to
search either for text (or a specific sequence) in the project window, or
for a pattern of bases in the sequence window.
In the Sequence view of the sequence window, you can use the Find
command to search for gaps or any string of characters in a sequence.
You can also quickly repeat a search operation using the Find Again
command, which locates the same information as in the previous Find
command.
To find a pattern of bases in the Sequence view:
Step  Action
1     Place the insertion point at the location where you want the search
      to begin.
2     Choose Find (⌘-F) from the Edit menu. A dialog box appears.
      If the insertion point is at the end of a sequence, you must select
      “Wrap around” in the dialog box, or move the insertion point.
3     Type or paste the pattern for which you want to search (up to 255
      characters) in the “Find What?” entry field.
4     Click the appropriate radio buttons and checkboxes to specify the
      parameters of the search.
      ♦ Select “Literal” to specify that all characters be matched
        exactly as you entered them.
      ♦ Select “IUPAC/IUB” if you have entered an IUB character as
        part of the pattern. With IUPAC/IUB selected, the Find
        command locates all possible matches. For example, if the
        pattern you enter is ATM, the command locates either ATA or
        ATC.
      ♦ Select “Grep” to set your own codes to represent a wildcard or
        part of a sequence. Table 4-2 shows a list of the available
        options.
      ♦ Select “Offset” to move the cursor to a specified position or
        range. If you simply enter a number in the “Find What?” entry
        field, the insertion point is moved to that base position. If you
        enter a range of numbers, the whole range is highlighted (for
        example, 123…230).
      ♦ Select “Case sensitive” if you want uppercase and lowercase
        variants of a letter to be recognized as different symbols.
      ♦ Select “Wrap around” if you want the search to start again at
        the beginning of the sequence after it has reached the end. If
        the “Wrap around” checkbox is not selected, the search stops
        at the end of the sequence.
5     Click Find.
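The IUPAC/IUB option can be understood as expanding each ambiguity code into the set of bases it stands for. The following Python sketch shows that expansion using the standard IUPAC nucleotide codes. It is an illustration only: the function names are hypothetical, and it assumes that the sequence being searched contains plain bases, which the manual does not specify.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def iupac_to_regex(pattern):
    """Expand an IUPAC pattern such as ATM into a regular expression."""
    return "".join("[" + IUPAC[code] + "]" for code in pattern.upper())


def find_iupac(pattern, sequence, start=0, wrap=False):
    """Return the index of the first match at or after start, or -1.

    With wrap=True the search restarts at the beginning of the sequence
    when nothing is found between start and the end.
    """
    regex = re.compile(iupac_to_regex(pattern))
    text = sequence.upper()
    match = regex.search(text, start)
    if match is None and wrap:
        match = regex.search(text)
    return match.start() if match else -1


print(iupac_to_regex("ATM"))
print(find_iupac("ATM", "GGATCGG"))
print(find_iupac("ATM", "ATCGG", start=3, wrap=True))
```

The pattern ATM expands so that both ATA and ATC match, as in the example in step 4.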
Table 4-2 provides special expressions for use with the “Grep” option of
the Find command.
Table 4-2 Selection Expressions for “Grep” Option
Expression                  Match Performed               Example
[a] (brackets)              Any character inside the      AA[AC][GT] matches AAAG, AAAT, AACG or AACT.
                            brackets                      [AGC] matches A, G or C.
[¬] (brackets with ¬        Any character except the      A[¬AG]C matches ACC or ATC.
(Option-L) as first         character(s) inside the
character inside)           brackets
* after character           Zero or more such characters  AT[CG]*T matches ATT or ATCT or ATGGT, and so on.
. (period)                  Any character                 AA.A matches AAAA, AACA, AAGA, AATA, AANA, and so on.
– (dash) enclosed by        A range of characters         AA[A-z] matches AAA, AAC, AAG, AAz, and so on.
brackets
The AutoAssembler software finds the first instance of the pattern you
specified and marks its position in the summary graphic at the top of the
Sequence view.
Note
If you only want to find a pattern in the valid range, place the insertion
point just before this range in the sequence.
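The expressions in Table 4-2 correspond closely to ordinary regular expressions, so they can be tried out in Python. The sketch below assumes that the only syntactic difference is the negation marker (¬ in the manual, ^ in Python); the function names are hypothetical, and the matching details of the Find command itself (for example, how overlapping matches are handled) are not documented here.

```python
import re

NOT_SIGN = "\u00ac"  # the ¬ character typed with Option-L


def grep_to_regex(expression):
    """Convert a Table 4-2 style expression to Python regex syntax."""
    return expression.replace("[" + NOT_SIGN, "[^")


def grep_find(expression, sequence):
    """Return all non-overlapping matches of the expression in the sequence."""
    return [m.group(0) for m in re.finditer(grep_to_regex(expression), sequence)]


print(grep_find("AA[AC][GT]", "TTAACGTT"))
print(grep_find("A[" + NOT_SIGN + "AG]C", "GGATCGG"))
print(grep_find("AT[CG]*T", "CCATGGTCC"))
```

Each call exercises one row of the table: a bracketed set, a negated set, and the * repetition.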
To find other occurrences of the same pattern:
Step  Action
1     Choose Find Again (⌘-G) from the Edit menu to bypass the dialog
      box and use the pattern defined in the previous Find command.
      Each time you use this command, the next occurrence of the
      specified pattern is located.
Searching for Sequences
Use the Search command to search for sequences or patterns within
specific sequences. This command searches across all contigs in the
project.
When you use the Search command, the AutoAssembler software
looks for literal matches and does not search the consensus
sequences. When the program finds a match, it selects that sequence
in the sequence list, and, if you specified a pattern, it highlights the
pattern in the Alignment view.
To search for sequences or patterns:
Step  Action
1     Choose Search from the Edit menu. A dialog box appears.
2     Enter the filename or the sample name or both.
      Note  You only need to enter enough of the name for it to be
      distinguishable from the other samples or files.
3     Enter a simple pattern if you want to search by pattern.
4     Click Search.
5     To continue the search, choose Search Again from the Edit menu.
Chapter 5 Editing the Project
Overview
Introduction
After assembling sequences, you may need to examine and edit
ambiguous areas in the consensus resulting from the assembly. To do
this efficiently, you should understand the various views available in the
project window and in the sequence window. (See Chapter 4, “Viewing
the Consensus.”)
You can edit the sequences in either window, but edits made to
sequences by changing the consensus in the project window will only
be saved to the project, not the individual sequences (see Chapter 8,
“Saving and Printing in AutoAssembler.”). To edit an individual
sequence, see Chapter 6, “Viewing and Editing Sequences.”
In This Chapter
This chapter contains the following topics:
Topic                                                  See Page
Locating and Controlling Ambiguity in the Consensus    5-2
Resolving Ambiguity in the Project Window              5-10
Verifying Orientation and Redundancy                   5-20
Editing the Project 5-1
Locating and Controlling Ambiguity in the Consensus
Introduction
The AutoAssembler software displays ambiguity in the consensus so
that you can easily find and correct problems with the underlying
sequences or their assembly. Ambiguities are displayed as special
characters, and positions of lower confidence are represented by
lowercase characters.
Once you find ambiguous areas, you can edit the contig and the
underlying sequences in the project window (see “Resolving Ambiguity
in the Project Window” on page 5-10).
You can also alter the consensus in the following ways:
♦ Use the Threshold value in the Settings dialog to control ambiguity
  in the consensus
♦ Complement the contig to view all project window and sequence
  data as a complementary strand of DNA
♦ Convert the consensus to a three-frame protein translation
Using the Views to Locate Problem Areas
The project window provides a quick and useful way to edit assembled
sequences. You can easily locate problem areas using the Layout view
or the Alignment view. If you want to view graphical sequence
information for files created by ABI PRISM DNA Sequencing Analysis
software in order to clarify the base calls, you can cause synchronized
electropherograms to drop down from the sequences. You can then edit
either the consensus sequence or any of the component sequences in
the Alignment view. Your edits are immediately reflected in the related
components or the consensus sequence.
Note
To read about displaying and scaling electropherograms, see
“Displaying Electropherograms” on page 4-9.
Each of the project window views displays problem or ambiguous areas,
so you have several ways of locating bases or regions to edit.
♦ The Layout view shows a comprehensive view of a contig.
  Ambiguous areas are highlighted with the default color (gray). By
  using the Zoom command, you can view increasingly detailed
  ambiguous regions of the contig.
  – Zooming in from the Layout view shows ambiguities, as well as
    a compressed view of the data, and is particularly useful for
    locating ambiguous sequence ends that can be deleted from
    the valid range of data used for assembly.
♦ The Alignment view provides progressively focused views of
  individual ambiguities and positions of low confidence. It shows
  ambiguities marked with color in the consensus, and displays
  ambiguity characters that indicate positions in the consensus where
  ambiguities or insertions exist.
  – Zooming in from the Alignment view is useful for comparing
    ambiguous sequence bases with their corresponding
    electropherograms (see “Using an Electropherogram to
    Resolve Ambiguities” on page 5-6).
♦ The Statistics view allows you to locate areas of the contig that do
  not have enough sequences (or enough in each orientation) to
  provide adequate data redundancy (see “Verifying Orientation and
  Redundancy” on page 5-20).
Finding Ambiguities Quickly
Use the Tab key to find ambiguities quickly in the consensus or any of
the underlying sequences. Table 5-1 shows the various options
available.
Table 5-1 Key Commands for Locating Ambiguities
Key Command        Action Performed
Tab                Find next ambiguity (character other than ACGT)
Shift–Tab          Find previous ambiguity (character other than ACGT)
Option–Tab         Find next ambiguity excluding gaps (character other
                   than ACGT or gap character)
Shift–Option–Tab   Find previous ambiguity excluding gaps (character
                   other than ACGT or gap character)
When you reach the last ambiguity in a sequence, the AutoAssembler
software sounds an alert. Use Shift–Tab to move backwards through
the sequence.
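The behavior of Tab and Option–Tab can be sketched as a scan for the next character that is not A, C, G, or T. This is an illustration only: the function name is hypothetical, "-" is used as a stand-in for the gap character, and the sketch compares case-insensitively, although the manual does not say whether lowercase (low-confidence) bases count as ambiguities.

```python
def next_ambiguity(sequence, position, skip_gaps=False, gap_char="-"):
    """Return the index of the next ambiguity after position, or -1.

    An ambiguity is any character other than A, C, G, or T. With
    skip_gaps=True (the Option-Tab behavior) gap characters are ignored.
    """
    allowed = "ACGT" + (gap_char if skip_gaps else "")
    for index in range(position + 1, len(sequence)):
        if sequence[index].upper() not in allowed:
            return index
    return -1  # no further ambiguity; the program sounds an alert here


sequence = "ACNGT-AR"
print(next_ambiguity(sequence, -1))                 # first ambiguity (N)
print(next_ambiguity(sequence, 2))                  # next one is the gap
print(next_ambiguity(sequence, 2, skip_gaps=True))  # skips the gap, finds R
```

Searching backwards (Shift–Tab) would follow the same logic with the scan direction reversed.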
Note  You can also find bad regions of the consensus quickly using the
Select Next Bad Region (⌘-H) command in the Edit menu.
Controlling Ambiguity in the Consensus
Because different projects may require different levels of ambiguity, you
can specify a threshold: the percentage of agreeing bases below
which an ambiguity character appears in the consensus. This setting
applies globally to all projects.
To control ambiguity in the consensus:
Step  Action
1     Choose Settings from the Edit menu. The Settings dialog box
      appears.
2     Enter a threshold value in the Base Threshold entry field.
      For a base to appear in the consensus, it must appear in at least
      this percentage of the underlying sequences at that position.
      Note  The default value is 80 percent.
3     Click OK.
Complementing a Contig
The AutoAssembler software allows you to display a contig as though it
is from the complementary strand of DNA. When you do so, the data is
complemented in all views of the project window and in the sequence
window.
To complement a contig:
Step  Action
1     Select a contig in the upper-left pane of the project window.
2     Choose Complement from the Edit menu.
      The entire contig is complemented.
      Note  When the selected contig is complemented, a checkmark
      appears beside the Complement command in the Edit menu. To
      revert the contig, select Complement again (the checkmark
      disappears).
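For reference, viewing a sequence as the complementary strand is conventionally done by taking its reverse complement. The Python sketch below shows that operation; it assumes that the Complement command also reverses the display order, which the manual does not state explicitly, and the function name is hypothetical. Gap and ambiguity characters are passed through unchanged.

```python
# Translation table for the four bases, preserving upper and lower case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the sequence as read from the complementary strand."""
    return sequence.translate(_COMPLEMENT)[::-1]


print(reverse_complement("AAGcT"))
print(reverse_complement(reverse_complement("AAGcT")))  # back to the original
```

Applying the function twice returns the original sequence, in the same way that choosing Complement a second time reverts the contig.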
Translating the Consensus to Protein Sequences
To facilitate editing decisions, you can display a three-frame protein
translation of the consensus. This can be useful if you are looking for
sequencing errors that cause a potential frame-shift. The Translation
view appears as three lines of text below the consensus (see
Figure 5-1).
Figure 5-1 Protein translation of consensus sequence
The single-character amino acid aligns with the third position of each
codon. For example, for the sequence ATGCCA, the code M (for
methionine) aligns with the G, and the code P (for proline) aligns with
the final A. Note that there will be three rows of amino acids to reflect
each possible three-base codon.
You cannot print, copy, or save the protein translation. You also cannot
directly edit it, although the protein codes update when the underlying
consensus changes.
The protein translation uses a universal codon table when translating
ambiguities in the consensus sequence (see Appendix C).
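A three-frame translation laid out in this way can be sketched in Python using the standard genetic code. This is an illustration, not the program's implementation: the function name is hypothetical, and any codon containing an ambiguity or gap character is shown as X rather than being resolved with the rules in Appendix C.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def three_frame_rows(consensus):
    """Return three text rows, one per reading frame.

    Each amino acid code is placed under the third base of its codon,
    as described above. Unrecognized codons are shown as X.
    """
    sequence = consensus.upper()
    rows = []
    for frame in range(3):
        row = [" "] * len(sequence)
        for start in range(frame, len(sequence) - 2, 3):
            codon = sequence[start:start + 3]
            row[start + 2] = CODON_TABLE.get(codon, "X")
        rows.append("".join(row))
    return rows


print("ATGCCA")
for row in three_frame_rows("ATGCCA"):
    print(row)
```

For ATGCCA, the first row places M under the G and P under the final A, matching the example in the text.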
Using an Electropherogram to Resolve Ambiguities
The Alignment view is particularly useful for comparing sequence calls
with their associated electropherogram (see Figure 5-2).
Figure 5-2 Alignment view
In this example, you could use the electropherogram to resolve the
lowercase t (circled in Figure 5-2). The electropherogram uses the
same colors as the corresponding bases, allowing you to pick out the
strongest signal at any given point.
Finding Ambiguous Areas
Use the Layout and Alignment views to find problem areas in the
consensus or in the underlying sequences.
To locate ambiguous areas in the project window:
Step  Action
1     In the Layout view, click each of the file names in the sequence list,
      and observe the positions or regions marked by the ambiguity color
      in the boxed area of the consensus sequence axis.
      or
      Use the Select Next Bad Region (⌘-H) command from the Edit
      menu.
      [Illustration: a highlighted bad region]
2     Choose Zoom In (⌘-=) from the Window menu until you can
      clearly see the ambiguous area.
      [Illustration: the zoomed-in view]
3     Locate ambiguities by looking for ambiguity characters under the
      consensus sequence or short black bars in the component
      sequences.
      A substantial number of ambiguous characters at either end of a
      sequence could indicate sufficient ambiguity to warrant removing
      that region from the valid range of data used for assembly (you can
      do so using Delete From Valid Range in the Edit menu; refer to
      “Editing the Valid Range of Data Used for Assembly” on page 5-18).
4     Change to the Alignment view by clicking the Alignment view button.
5     Locate a region that shows several ambiguity characters (the
      default shows bullets) under the consensus sequence.
      or
      Use the Select Next Bad Region command in the Edit menu.
      [Illustration: an ambiguous region]
      If necessary, zoom in (⌘-=) to the area in question (see “Using an
      Electropherogram to Resolve Ambiguities” above).
      If necessary, double-click the sequences to display the underlying
      electropherograms.
6     Examine the region, and, if you determine that it should be edited,
      proceed with the steps described in “Resolving Ambiguity in the
      Project Window” on page 5-10.
Resolving Ambiguity in the Project Window
Introduction
The AutoAssembler software allows you to add, delete, replace, and
shift bases, either individually or in groups. The way AutoAssembler
handles sequence and consensus editing is slightly different, and the
differences are described with each procedure. The following
procedures briefly describe options for editing sequences and the
consensus, and provide several examples.
Note
You should always edit with lowercase characters (in the consensus,
newly entered characters will still appear uppercase, but underlying sequences
will be lowercase). Doing so makes it easy to locate areas you have edited. In
both the Alignment view and when you zoom in from the Layout view, lowercase
bases appear as half-height bars.
Editing in the Consensus
If you edit the consensus, all component sequences that overlap at the
edited position are changed to match your edit, as described in the
editing procedures in this chapter. If you edit one of the underlying
sequences, the consensus immediately reflects the change.
Sequences that overlap with the edited sequence do not change.
When you add bases (A, C, G, or T) to the consensus, the added bases
always appear in uppercase because they reflect your input to the
consensus. However, if you enter them as lowercase, the underlying
sequences will remain lowercase to show less than 100 percent
certainty in the edit. If you enter a base in the consensus in uppercase,
the underlying sequences will be uppercase as well.
Note
This only applies to A,C, G, or T. Other characters imply less than
100% certainty, and will appear as lowercase in the consensus.
What Gets Saved When You Edit
If you are editing either individual sequences or the consensus from the
project window, your edits are saved into the project file only when you
save the project.
Note
Changes to the sequences made in the project window are not saved
to the individual sequence when you save the project. You must either use the
Save Sequences command, or save from the sequence window (see “Project
and Sequence Relationships” on page 8-3).
Keeping Track of Your Edits
When you have edited a sequence, a triangle appears beside the
sequence name in the sequence list (in the upper-right pane of the
project window).
To keep track of edited bases, a simple rule is to edit with lowercase
characters (they will still appear uppercase in the consensus). When
you do, the edited bases appear in Alignment view sequences as
lowercase characters. All other sequence characters are uppercase, so
the edits are easy to locate. They are also easy to find using the zoom
command, where the lowercase bases appear as half-height bars.
Selecting Bases or Sequence Segments
The keystrokes listed in Table 5-2 allow you to quickly select a single
character, a range of bases representing a segment of a sequence, or
an entire sequence.
Table 5-2 Keystrokes for Selecting Sequences in the Project Window
Keystroke                         Selection performed
Left-Arrow (←)                    Moves the cursor to the left one base.
Right-Arrow (→)                   Moves the cursor to the right one base.
Shift-Left-Arrow (⇑←)             Selects the next base to the left. Holding down the Shift
                                  key and pressing the arrow key additional times
                                  extends the selection.
Shift-Right-Arrow (⇑→)            Selects the next base to the right. Holding down the
                                  Shift key and pressing the arrow key additional times
                                  extends the selection.
Option-Left-Arrow (⌥←)            Moves the cursor to the left end of the current
                                  sequence.
Option-Right-Arrow (⌥→)           Moves the cursor to the right end of the current
                                  sequence.
Shift-Option-Left-Arrow (⇑⌥←)     Selects a range from the cursor position to the left end
                                  of the sequence.
Shift-Option-Right-Arrow (⇑⌥→)    Selects a range from the cursor position to the right end
                                  of the sequence.
Up-Arrow (↑)                      Moves the cursor up one sequence in both the
                                  sequence list and the Alignment view. This operates
                                  in a circular manner. If the cursor is in the top
                                  sequence, it moves to the bottom sequence.
Down-Arrow (↓)                    Moves the cursor down one sequence in both the
                                  sequence list and the Alignment view. This operates in
                                  a circular manner. If the cursor is in the bottom
                                  sequence, it moves to the top sequence.
Option-Up-Arrow (⌥↑)              Moves the cursor to the consensus sequence.
Option-Down-Arrow (⌥↓)            Moves the cursor to the bottom sequence in the contig.
Adding Bases
When you insert a base to the left of a gap, the base replaces the gap in
the sequence.
For example, typing c to the left of the gap in the sequence AA–CT
results in the sequence AAcCT.
When you insert a base to the right of another base, place-marker gaps
are inserted in the overlapping sequences to maintain the downstream
alignment. If there was a gap to the right of the character you typed, it
remains.
Example: Adding a base to the right of a gap
Step  Action
1     Insert a t to the right of the gap in the middle sequence of the
      following alignment:
      AATCT
      AA–CT
      AATCT
      The following alignment results:
      AAT–CT
      AA–tCT
      AAT–CT
      You can select the t and choose Shift Left from the Edit menu, or
      press Command-Shift-Left-Arrow, to shift the t to the left and align
      the gaps.
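The two insertion rules described above can be sketched in Python. This is a simplified model, not AutoAssembler's implementation: the function name, the use of an ASCII hyphen for the gap character, and the assumption that every row spans the insertion column are all illustrative.

```python
GAP = "-"  # assumption: an ASCII hyphen stands in for the gap character


def insert_base(rows, row, col, base):
    """Insert `base` into rows[row] so that it occupies column `col`.

    If the character currently at `col` is a gap, the base replaces the
    gap and no other row changes. Otherwise the base is inserted and every
    other row receives a place-marker gap at `col`, which keeps the
    downstream columns aligned.
    """
    seqs = [list(r) for r in rows]
    if col < len(seqs[row]) and seqs[row][col] == GAP:
        seqs[row][col] = base
    else:
        for i, seq in enumerate(seqs):
            seq.insert(col, base if i == row else GAP)
    return ["".join(seq) for seq in seqs]
```

Calling insert_base(["AA-CT"], 0, 2, "c") models typing c to the left of the gap, and insert_base(["AATCT", "AA-CT", "AATCT"], 1, 3, "t") models the example of typing t to the right of the gap.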
Note
When you enter lowercase characters in the consensus, the new
bases will still appear as uppercase characters because they reflect your input
to the consensus. The underlying sequences will appear in lowercase.
To add bases:
Step  Action
1     Click to place the cursor at the position where you want to add a
      base or multiple bases.
2     Type the new character or characters you want to insert.
Deleting Bases When you delete bases, gap characters maintain downstream
alignment of the sequence with the contig.
For example, deleting the N from the sequence AANCT results in the
sequence AA–CT.
When you delete a range of bases, place-marker gaps replace each of
the deleted bases.
For example, deleting NC from the sequence AANCT results in AA– –T.
When you replace a base or range of bases in the consensus, the
corresponding bases in the underlying sequences change to match the
consensus.
Example: Deleting a gap in the consensus
Step  Action
1     Delete the gap in the following alignment (the top sequence is
      the consensus):
      AAGc–TCT
      AAG––TCT
      AAGCTTCT
      AAGC–TCT
      The following alignment results:
      AAGcTCT
      AAG–TCT
      AAGCTCT
      AAGCTCT
      Two gaps and one T in the underlying sequences are deleted.
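The deletion behavior can be modeled with two small Python functions. This is a sketch under stated assumptions: the names are hypothetical, an ASCII hyphen stands in for the gap character, and the second function models only the case shown in the example (deleting a gap column from the consensus).

```python
GAP = "-"  # assumption: an ASCII hyphen stands in for the gap character


def delete_range(seq, start, end):
    """Delete seq[start:end] from one sequence.

    Each deleted base is replaced by a place-marker gap, so the length of
    the sequence and its downstream alignment are unchanged.
    """
    return seq[:start] + GAP * (end - start) + seq[end:]


def delete_consensus_column(rows, col):
    """Remove column `col` from the consensus and every underlying row."""
    return [row[:col] + row[col + 1:] for row in rows]
```

Here delete_range("AANCT", 2, 3) gives AA-CT, and delete_range("AANCT", 2, 4) gives AA--T, matching the examples in the text.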
To delete bases:
Step  Action
1     Delete a base or bases in one of the following ways:
      ♦ Click to the right of the desired base, or select a range of
        bases, and press Delete.
      ♦ Click to the left of the desired base and press the Forward
        Delete key (⌦).
      ♦ Use the standard Macintosh Cut (Command-X) command.
      ♦ Replace the base or bases as described below.
Note
To delete a range of bases from either end of a sequence in order to
change the valid range of data used for assembly, see the procedure on
page 5-18. It allows you to alter the valid range without losing the sequence
data.
Replacing Bases When you replace a base or range of bases in the consensus, the
corresponding bases in the underlying sequences change to match the
consensus. If you type lowercase characters, the bases that are
changed in the underlying sequences appear as lowercase characters
so you can locate them easily. Underlying sequences that do not
change remain as uppercase characters.
Note
When you enter lowercase characters in the consensus, the new
bases will still appear as uppercase characters because they reflect your input
to the consensus. The underlying sequences will appear in lowercase.
Example: Replacing bases in the consensus
Step  Action
1     Replace ct in the consensus in the following alignment (the
      top sequence is the consensus) by selecting the characters
      and typing ct:
      AActTT
      AAGTTT
      AACCTT
      AACCTT
      AACTTT
      AAGTTT
      The following alignment results:
      AACTTT
      AAcTTT
      AACtTT
      AACtTT
      AACTTT
      AAcTTT
      In the third position, a c replaces G in the top and bottom
      underlying sequences. In the fourth position, t replaces C in the
      second and third underlying sequences.
Note
Whenever you enter characters in the consensus, they are
displayed as uppercase, regardless of whether or not you entered
them in uppercase. However, as in this example, entering in
lowercase does affect the underlying sequences.
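The way a consensus replacement propagates to the underlying sequences can be sketched in Python. The function below is a hypothetical model, not the program's code. It assumes every underlying sequence spans the edited columns and that a base counts as changed only when it differs from the typed character, ignoring case.

```python
def replace_in_consensus(consensus, rows, start, typed):
    """Replace consensus columns beginning at `start` with `typed`.

    The consensus always shows the typed characters in uppercase. In each
    underlying row, a base that differs from the typed character is
    overwritten using the case that was typed, so lowercase input marks
    the changed bases. Bases that already match are left untouched.
    """
    end = start + len(typed)
    new_consensus = consensus[:start] + typed.upper() + consensus[end:]
    new_rows = []
    for row in rows:
        chars = list(row)
        for offset, ch in enumerate(typed):
            col = start + offset
            if chars[col].upper() != ch.upper():
                chars[col] = ch
        new_rows.append("".join(chars))
    return new_consensus, new_rows
```

Applied to the example above with start set to 2 and typed set to "ct", this returns the consensus AACTTT and the underlying rows AAcTTT, AACtTT, AACtTT, AACTTT, and AAcTTT.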
To replace bases:
Step  Action
1     Drag to select the desired base or bases.
2     Type a new character or characters to replace the selected ones.
      The change is reflected immediately.
      Note  When you highlight a range of bases, then type one
      character, the first base is replaced by the character, and the other
      selected bases are replaced by gaps. If you continue to type other
      characters, the gaps are replaced.
Note
To replace a single base, you can also click to the right of the base you
want to edit, backspace to remove the character, and then type the new
character. When you backspace, a gap character maintains the alignment of the
sequence in the contig. Typing a new character replaces the gap character with
that character.
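The effect of typing over a selected range can be sketched as follows. The function name and the ASCII hyphen gap are assumptions, and the sketch does not model what happens if you type more characters than were selected.

```python
GAP = "-"  # assumption: an ASCII hyphen stands in for the gap character


def type_over_selection(seq, start, end, typed):
    """Replace the selection seq[start:end] with the typed characters.

    Typed characters fill the selection from the left. Any selected
    positions not yet typed over are held by gaps, so the sequence keeps
    its length and alignment.
    """
    width = end - start
    if not 0 < len(typed) <= width:
        raise ValueError("this sketch only models typing within the selection")
    filled = typed + GAP * (width - len(typed))
    return seq[:start] + filled + seq[end:]
```

For instance, selecting NC in AAGNCACT and typing C yields AAGC-ACT. Typing a second character would then replace the remaining gap.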
Shifting Bases Instead of using Cut and Paste commands, you can shift bases to the
left or right in the consensus sequence.
To shift bases or sequence segments:
Step  Action
1     Select the base or segment.
      See Table 5-2 on page 5-11 for a list of keyboard shortcuts you can
      use to select bases or sequence regions.
2     Choose Shift Left or Shift Right from the Edit menu.
      or
      Press Command-Shift-Left-Arrow or Command-Shift-Right-Arrow.
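A left shift can be pictured as the selected segment trading places with the gap immediately to its left. The Python sketch below rests on that assumption. The manual does not say what happens when no gap is adjacent, so the hypothetical function simply rejects that case.

```python
GAP = "-"  # assumption: an ASCII hyphen stands in for the gap character


def shift_left(seq, start, end):
    """Move the segment seq[start:end] one column to the left.

    The gap that was immediately left of the segment ends up immediately
    to its right, so the overall length is unchanged.
    """
    if start == 0 or seq[start - 1] != GAP:
        raise ValueError("this sketch only shifts into an adjacent gap")
    return seq[:start - 1] + seq[start:end] + GAP + seq[end:]
```

Shifting the t in AA-tCT one column to the left gives AAt-CT. A right shift would be the mirror image.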
Editing Examples This section uses a specific overlap to demonstrate some different
editing options you might choose.
Assume that, after assembly, you have the following three fragment
overlaps:
AAGC–ACT
AAGNCACT
AAGC–ACT
In this case, you have chosen to edit the overlap by removing the N and
realigning the sequences, although in other circumstances the
character you edit might depend on what appears in the
electropherogram data.
Example 1:
Step  Action
1     Drag to select the NC in the middle sequence.
2     Type C to replace the two selected bases.
      This creates the following alignment:
      AAGC–ACT
      AAGC–ACT
      AAGC–ACT
Note
Reassembling removes the unnecessary gaps. If you want to remove
them without reassembling, select the corresponding gap in the consensus and
press Delete.
Example 2:
Step  Action
1     Click to the left of the N.
2     Press the Forward Delete key (⌦).
      The alignment looks like:
      AAGC-ACT
      AAG-CACT
      AAGC-ACT
      Reassembling removes the unnecessary gaps. If you want to align
      the Cs and gaps, follow Step 3 and Step 4.
3     Press Shift-Right-Arrow.
      This selects the C.
4     Press Command-Shift-Left-Arrow.
      This shifts the C to the left to align it:
      AAGC-ACT
      AAGC-ACT
      AAGC-ACT
Example 3:
Step  Action
1     Click to the right of the N.
2     Press Delete.
      This creates the following alignment, which is the same as that
      created in Step 2 of the previous example:
      AAGC-ACT
      AAG-CACT
      AAGC-ACT
These three examples demonstrate that many methods exist for editing
any one base or region of data. Try different options to discover what is
most comfortable or efficient for your own editing purposes.
Editing the Valid Range of Data Used for Assembly
AutoAssembler uses a feature called *ABI_ValidRange to determine the
range of sequence data used for assembly. By changing this feature,
you can increase or decrease the amount of data used for assembly
without altering the contents of the sequence data files. This can be a
handy tool if the vector or ambiguity range defined by Factura is either
longer or shorter than necessary for a given sequence.
You can change the feature in two ways:
♦ Edit the range in the Feature view of the sequence window to make
  it longer or shorter. Editing a feature in the Feature view of the
  sequence window is described in “Editing Feature Ranges and
  Markings” on page 6-14.
♦ Use the Delete From Valid Range command in the Edit menu. You
  can easily do this in the Alignment view of the project window, as
  the following procedure shows.
This procedure preserves the data in your sequence file, but removes it
from the valid range of data used for assembly.
To quickly remove either end of a sequence from the valid range:
Step  Action
1     From the Alignment view of the project window, select the range of
      bases (at either end of the sequence) you want to delete.
2     Choose Delete From Valid Range from the Edit menu.
      The selected region is automatically deleted from the
      *ABI_ValidRange feature, which defines the valid range of the data
      used for alignment. Deleting a region from the valid range does not
      delete it from the sequence. It simply hides the region without
      moving the sequence, and ensures that all sequence data is
      preserved.
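The idea that the valid range is a marker on the data, rather than a copy of it, can be sketched in Python. The class, its field names, and the half-open 0-based coordinates are illustrative assumptions. They do not describe how the *ABI_ValidRange feature is actually stored.

```python
from dataclasses import dataclass


@dataclass
class Read:
    bases: str        # full sequence data; never modified by trimming
    valid_start: int  # 0-based, inclusive (assumed convention)
    valid_end: int    # 0-based, exclusive (assumed convention)

    def delete_from_valid_range(self, sel_start, sel_end):
        """Shrink the valid range by a selection that touches one end of it."""
        if sel_start <= self.valid_start < sel_end <= self.valid_end:
            self.valid_start = sel_end   # trim the left end
        elif self.valid_start <= sel_start < self.valid_end <= sel_end:
            self.valid_end = sel_start   # trim the right end
        else:
            raise ValueError("selection must include one end of the valid range")

    def assembled_bases(self):
        """Return only the bases inside the valid range."""
        return self.bases[self.valid_start:self.valid_end]
```

After trimming, the bases field still holds every original character. Only the two range markers move, which is why the trimmed data can later be restored by extending the range again.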
Verifying Orientation and Redundancy
Introduction The Statistics view allows you to rapidly locate areas of the consensus
that do not have a specified number of sequences. This feature is
particularly useful for finding areas that require more sequence data.
The parameters the Statistics view uses to check the consensus can
be modified in the Configure Statistics dialog box.
Changing Statistic View Parameters
The 2+1 rule is generally accepted as the minimum requirement for
good quality data. This provides one sequence in each orientation and
one extra sequence in one of the orientations for verification. For a
more stringent redundancy of five, you could specify orientation ratios
of 4+1 or 2+3.
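The orientation rule can be expressed as a small check. In this Python sketch the function name and arguments are hypothetical. It assumes, as the procedure below indicates, that it does not matter which orientation supplies the larger count.

```python
def meets_redundancy(forward, reverse, required=(2, 1)):
    """Return True if the coverage at one position satisfies an m+n rule.

    `forward` and `reverse` are the numbers of sequences in each
    orientation at the position. The larger requirement must be met by
    one orientation and the smaller requirement by the other.
    """
    return (max(forward, reverse) >= max(required)
            and min(forward, reverse) >= min(required))
```

Under the 2+1 rule, two forward sequences and one reverse sequence pass, while four sequences in a single orientation fail because the opposite orientation has none.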
To set parameters for the Statistics view:
Step  Action
1     Choose Statistics Settings from the Edit menu. The following dialog
      box appears:
2     Enter numbers in the two entry fields to specify the number of
      sequences required in each orientation.
      You do not need to specify the actual sequence orientations.
3     To change one of the colors that identify failure or compliance, click
      the color field to the right of the description to display the color
      picker.
4     Ensure that the “Show average redundancy line” checkbox is
      selected if you want to show a line that represents average
      redundancy.
5     Click OK.
Checking the Consensus
The Statistics view provides an overall view of the problem areas in a
given contig. Once you locate a potential problem, you can use the
Layout view to verify the underlying sequences. You can then add new
sequences to attain desired redundancy, or extend the range of the
existing sequences (see “Editing Feature Ranges and Markings” on
page 6-14).
To locate problem areas in the Statistics view:
Step  Action
1     Change to the Statistics view by clicking the Statistics view button in
      the lower-left corner of the project window:
2     Locate areas in the consensus that appear to fail your redundancy
      criteria.
      or
      Choose Select Next Bad Region (Command-H) from the Edit menu.
      The program highlights the next entire region of the consensus that
      does not meet the criteria set in the Statistics Settings, as in the
      following example:
3     Click the Layout view button.
      The area selected in the Statistics view is displayed.
      In the figure above, there are four sequences in the ambiguous
      area, but all are in the same orientation, failing the orientation test.
4     In the Layout view, verify that the underlying sequences for this
      position include the minimum number of sequences in one
      orientation, and in the opposite orientation.
      For example, if you specified 1+2 in the Statistics settings, there
      should be at least two sequences in one orientation, and at least
      one in the opposite orientation. If necessary, you could then change
      the range of a sequence’s valid data to extend a sequence and
      attain your specified redundancy (see “Editing Feature Ranges and
      Markings” on page 6-14).
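Locating the next region that fails the redundancy criteria can be sketched as a scan over per-position coverage. The Python functions below are an illustrative model only. The names, the list-of-pairs input, and the choice to stop at the end of the consensus instead of wrapping around are all assumptions.

```python
def bad_regions(coverage, required=(2, 1)):
    """Return half-open (start, end) runs of positions that fail the rule.

    `coverage` holds one (forward, reverse) pair of sequence counts per
    consensus position.
    """
    regions = []
    start = None
    for i, (fwd, rev) in enumerate(coverage):
        ok = (max(fwd, rev) >= max(required)
              and min(fwd, rev) >= min(required))
        if not ok and start is None:
            start = i                      # a failing run begins here
        elif ok and start is not None:
            regions.append((start, i))     # the failing run ended before i
            start = None
    if start is not None:
        regions.append((start, len(coverage)))
    return regions


def next_bad_region(coverage, cursor, required=(2, 1)):
    """Return the first failing region starting at or after `cursor`, or None."""
    for region in bad_regions(coverage, required):
        if region[0] >= cursor:
            return region
    return None
```

Each call with the cursor set to the end of the previous result steps through the failing regions in order, which is the behavior the Select Next Bad Region command suggests.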
6 Viewing and Editing Sequences
Overview
Introduction While you are reviewing or editing the consensus, you may find it
necessary to view or edit the underlying sequences (for example, in
order to extend the range of valid data). Using the sequence window
allows you to isolate a particular sequence and view its
electropherogram, annotation, and feature data. When you make
changes and save in the sequence window, your changes are saved
directly to the sequence’s sample file.
In This Chapter This chapter contains the following topics:
Topic                                                          See Page
Viewing and Editing Individual Sequences in Sequence Windows   6-2
Using the Annotation View                                      6-7
Using the Electropherogram View                                6-8
Using the Sequence View                                        6-11
Using the Feature View                                         6-13
Viewing and Editing Sequences 6-1
Viewing and Editing Individual Sequences in Sequence Windows
Introduction The AutoAssembler sequence window displays information about an
individual sequence in up to four views:
♦ Annotation view
♦ Sequence view
♦ Feature view
♦ Electropherogram view
You can use this window to view the native (variable) peak spacing in
sample file electropherograms generated by ABI PRISM DNA
Sequencing Analysis software, or to edit the individual sequences.
Although you will probably do most of your editing in the project window
because of the ease with which you can compare sequences and
electropherograms, you might occasionally want to use individual
sequence windows to view the electropherograms with their native
spacing. The electropherograms in the Alignment view are displayed
with constant spacing, so they line up properly with the corresponding
nucleotide sequences.
You might also want to change the features that are defined in individual
sequences by changing the color marking or range of a certain feature.
You must perform such changes in the sequence window.
Note
If you are using the BioLIMS option, the connection to the database
must be open for you to view or edit a sequence.
Opening the Sequence Window
You can open the sequence window in several ways to view individual
sequences. You can also open several sequence windows and view
them simultaneously.
To open a sequence window:
Step  Action
1     Open the sequence window in one of the following ways:
      ♦ Double-click the name of the sequence in the sequence list.
      ♦ Select the sequence in the sequence list or in the current view
        by clicking the sequence once, then choose Show Sequence
        (Command-D) from the Sequence menu.
      ♦ Select a region of interest in a sequence in the lower pane,
        then press Command-D. The bases you select in the project
        window are displayed in the sequence window when it opens.
      Note  If the sequence was produced on ABI PRISM DNA
      Sequencing Analysis software, the sequence window opens in the
      Electropherogram view.
      Note  If you have changed the physical location of the sequence
      file, you may not be able to view electropherogram data (see
      “Organizing a From Files Project” on page 3-2).
Viewing the Sequence Window
If your sequence file was produced on ABI PRISM DNA Sequencing
Analysis software, the sequence window opens in the
Electropherogram view (see Figure 6-1).
Figure 6-1 The sequence window in Electropherogram view (callouts: lock
image; buttons used to change view; the valid range is marked green in the
summary graphic; use the size box to change the size or shape of the window)
Immediately below the standard Macintosh computer title line and close
box is a display window to the right of a lock image. The horizontal line
in this summary graphic represents the length of the sequence, and
reflects the cursor position as you move it to different places in the
sequence. The valid range of data used for assembly is marked green.
If you click the lock image, the sequence is protected from edits. You
cannot Cut from or Paste to the sequence (using the Edit menu) as long
as the lock is closed. Click the image a second time to unlock it.
You can display up to four different views by using the buttons located in
the bottom-left corner of the window. Each button’s function is
described in the following sections.
Note
The Electropherogram view is only available for files containing
ABI PRISM DNA sequencing and analysis software electropherogram data. If
you open a database sequence saved in the Inherit Analysis program, a
sequence created using the New Sequence command in Factura, or a Text
sequence entered on a word processor, the sequence window opens in
Sequence view.
Editing in the Sequence Window versus the Project Window
Edits made in the sequence window can be saved to the original
sequence file (which retains the original data). Edits made to
sequences in the project window are only stored in the assembled
project, and are not saved to the individual sequences unless you use
the Save Sequence command (see “Saving Individual Sequences” on
page 8-4).
Closing the Sequence Window
Save any editing before you close the sequence window. See “Saving
Individual Sequences” on page 8-4 for instructions on saving. To print
any of the four views of the sequence window, see “Printing Sequence
Window Views” on page 8-12.
Note
See “Project and Sequence Relationships” on page 8-3 for a
description of the relationships between project and sequences.
To close the sequence window:
Step  Action
1     Close the sequence window in one of three ways:
      ♦ Click the close box.
      ♦ Choose Close from the File menu while the window is active.
      ♦ Press Command-W while the window is active.
2     If you modified the sequence and have not saved it, the following
      alert box appears, allowing you to save the changes if you want.
      ♦ To save changes only to the project file, click Save. Changes
        are stored in the project file, but not in the original sequence
        file.
      ♦ To store changes in the project file and the original sequence
        file, click Update. The changes are saved to the project file and
        the sequence file (which also retains a copy of the original
        data).
      ♦ To cancel closing the window, click Cancel.
      ♦ To continue to close the window without saving features for the
        named sequence, click Don’t Save. This reverts the sequence
        to the last saved data.
Using the Annotation View
The Annotation View
The Annotation view shows information stored in the file about the
ABI PRISM DNA sequencing and analysis instrument run that produced
the sequence data, as well as annotations from a database entry (text
files do not have annotations).
Click the button shown at left to display the Annotation view. Figure 6-2
shows an example of the Annotation view.
Figure 6-2 The sequence window in Annotation view
Information in the Annotation view cannot be edited in AutoAssembler.
Using the Electropherogram View
Introduction The Electropherogram view (Figure 6-3) is available only with
ABI PRISM DNA sequencing and analysis software data files. It is useful
for viewing electropherograms with their native spacing, or for
displaying original base calls while you are editing.
Click the button shown at left to return to Electropherogram view from
any of the other views.
If you click the sequence in the Sequence view and then switch to the
Electropherogram view, the electropherogram shows a range of bases
in the region of the sequence where you placed the insertion point. You
can zoom in or out, and display the original sequence for comparison if
you are editing.
Editing in the Electropherogram View
In the Electropherogram view, you can keep track of your edits by
choosing Show Original from the Sequence menu. When you do so, a
second line of data that represents the original appears at the top of the
window (see Figure 6-3). The line below it represents the data you can
edit.
Figure 6-3 Electropherogram view with original data (callouts: original
data; data you can edit)
In the Electropherogram view, the Edit menu commands are not
available, and you can only edit one base at a time. You can add, delete,
or change bases in much the same way as described for the Sequence
view on page 6-11. However, the spacing of the characters is much
more precise.
If you use the sequence window Electropherogram view while you are
editing, you can choose Show Original from the Sequence menu to
display the original data directly above the edited data for reference.
See Figure 6-3 for an example.
Moving the Selection
Multiple base positions (approximately ten) are available between the
displayed bases in the Electropherogram view. If you place the insertion
point between two characters and click, a position is selected. Following
are some hints about moving the selection from one position to another:
♦ To move from base to base, use the Left-Arrow and Right-Arrow
  keys.
♦ To move from position to position (often pixel-by-pixel), press the
  Option key while you use the Left-Arrow key. Pressing the Option
  and Right-Arrow key moves the cursor to the end of the sequence.
IMPORTANT
Because the available base positions are so close together, it
is possible to select a position very close to one of the bases when you are
actually trying to select the base itself. If you do so, you might insert a character
when you intend to change an existing character. Use the Zoom command
(Command-=) to make it easier to see and edit individual bases.
Changing Bases
To change a base:
Step  Action
1     Place the insertion point to the right of the character you want to
      select, and click the mouse button.
      If necessary, use the Zoom command (Command-=) to see the bases
      more clearly.
2     Press the Right-Arrow or Left-Arrow key to move to the base you
      want to select.
      Note  Using the Right-Arrow or Left-Arrow keys moves the
      cursor base by base only. To select the gaps between bases, press
      the Option-Left-Arrow key combination.
3     Enter the new base.
Adding Bases
If you add bases in Sequence view and then switch to the
Electropherogram view, the new bases are spaced as evenly as
possible between the two previously existing bases.
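The even spacing of newly added bases can be illustrated with a simple calculation. In this Python sketch the function name and the use of horizontal coordinates are assumptions. The program may round to its own internal base positions, which is not modeled here.

```python
def evenly_spaced(left_x, right_x, n_new):
    """Return positions for `n_new` bases spaced evenly between two bases.

    `left_x` and `right_x` are the horizontal positions of the existing
    bases on either side of the insertion.
    """
    step = (right_x - left_x) / (n_new + 1)
    return [left_x + step * (i + 1) for i in range(n_new)]
```

For example, two bases added between existing bases at positions 100 and 130 would be placed at 110 and 120.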
To add a base in the Electropherogram view:
Step  Action
1     Place the insertion point to the right of the point at which you want
      to insert the character and click the mouse button.
2     To move the insertion point, hold down the Option key and use the
      Left-Arrow key to move to the position where you want to insert the
      base.
3     Type the new character.
Using the Sequence View
Introduction To change to Sequence view from any of the other views, click the
button shown at left.
The Sequence view shows the nucleotide sequence in the center of the
window, with the base positions at the beginning and end of each row
(see Figure 6-4). The valid range of data is marked green with a bold
green underline.
Figure 6-4 The sequence window in Sequence view
In the Sequence view, you can search for specified patterns or use any
of the standard Macintosh operating system editing commands (Cut,
Copy, Paste, Undo/Redo) to change the bases.
Note
To find a specified pattern, see “Finding Sequences and Patterns” on
page 4-21.
Editing Sequences In the Sequence view, you can use the standard editing commands
found in the Edit menu to cut, copy, paste, and clear bases or ranges of
the sequence in the active window. The Edit menu commands operate
as described in the Apple System Software User’s Guide.
Note
To select the entire sequence (including marked features), choose
Select All in the Edit menu.
Adding Bases
To add a base or range of bases in the Sequence view:
Step  Action
1     Place the insertion point at the position in the sequence where you
      want to add one or more bases.
2     Type the characters you want to insert.
Deleting Bases
You can delete a base or range of bases by using standard Macintosh
editing commands.
To delete a base or range of bases from the sequence:
Step  Action
1     Select the base or range of bases.
2     Press the Delete key or choose Clear or Cut (Command-X) from the
      Edit menu.
Changing Bases
You can also change bases in the sequence by highlighting and
replacing them in the same way you would replace text in a word
processing program.
To change a base in the sequence:
Step  Action
1     Select the base you want to change.
2     Type the new character you want in that position.
Note
You can also place the insertion point to the right of the character you
want to replace, press the Delete key, then type the character you want in that
position.
Using the Feature View
Introduction The Feature view displays Factura-identified features, as well as
features for a database entry (see Figure 6-5).
To display Feature view, click the button shown at left.
After you have updated the sequence files with the results of batch
worksheet processing in the Factura program, feature ranges are added
to the view, identifying portions of the data that represent vector,
ambiguity, and confidence range. When you import these sequences
into the AutoAssembler program, the vector and ambiguity ranges are
used to determine the valid range of the data, effectively eliminating
poor-quality data.
Note
All the information is maintained in the original data. The data used by
the AutoAssembler program is identified by the *ABI_ValidRange feature.
Figure 6-5 The sequence window in Feature view
In the Feature view, you can modify features by changing their ranges,
or changing the colors and borders that mark the features.
Editing Feature Ranges and Markings
You can change the range, description, or color marking of any feature
in a sequence feature table using the sequence window Feature view.
Changing Features
A sequence file will only have feature information if feature information
was entered in Factura.
To change a feature:
Step  Action
1     From the sequence window, click the Feature view button.
2     Double-click the feature you want to change.
      or
      Select a feature and choose Modify Feature in the Sequence menu.
      The following dialog box appears:
3     Make any desired changes to the range as follows:
      a. Select either the beginning or ending value in the “Feature
         range(s)” entry fields.
      b. Type a new value in the entry field.
      c. Click Replace.
4     Change the feature description as follows:
      a. Select the text in the Description entry field.
      b. Type the new description.
5     Make desired changes to the color marking by choosing one of the
      eight marking styles in the Style pull-down menu (see Table 6-1).
6     When you have finished making changes in the dialog box, click
      OK.
Table 6-1 provides a complete list of marking styles available in the
Feature view’s Add/Edit Feature dialog.
Table 6-1 Default Marking Styles
Style Name       Color          Border
Blue             Blue           No underline
Red Single       Red            Light underline
Green Bold       Bright Green   Heavy underline
Gray Double      Gray           Double underline
Brown            Brown          No underline
D Green Single   Dark Green     Light underline
D Blue Bold      Dark Blue      Heavy underline
Purple Double    Purple         Double underline
7 Reassembling a Project
Overview
Introduction Assembling sequences using the AutoAssembler software is an
iterative process. You can assemble, edit, and reassemble multiple
times until you achieve a satisfactory result.
You might reassemble a project for several reasons:
♦ You have added new sequences to the project.
♦ The project has been automatically updated from the BioLIMS
  database.
♦ You have edited the sequences in the project and want to obtain a
  clean calculation of the gaps and overlaps.
♦ The previous assembly created more than one contig, and you
  have either edited the sequences, or changed assembly
  parameters in such a way as to join the contigs into one.
♦ The previous assembly created what appear to be incorrect
  overlaps because of repetitive sequence regions, and you have set
  constraints to correct the overlaps.
Note
When you reassemble a project, the number at the end of each contig
name increments to reflect the number of times you have assembled.
In This Chapter This chapter includes the following topics:
Topic                                            See Page
Reassembling with New or Changed Sequences       7-2
Reassembling to Achieve Different Results        7-5
Reassembling a Project 7-1
Reassembling with New or Changed Sequences
Introduction After assembling a project, you might want to add more sequences to
create a larger contig, or re-add sequences that you have modified
using a different program. To reassemble a project with new or modified
sequences, you can simply choose Assemble from the Project menu.
The parameters you set for the original assembly are maintained.
Reassembling with New Sequences
When you add new sequences to an assembled project and
reassemble, the new sequences are incorporated into the contig,
creating a larger consensus for use with other programs.
If you are using the BioLIMS option to autoupdate a project, this
procedure is unnecessary. During autoupdating, AutoAssembler
automatically adds new sequence files in each designated collection on
the BioLIMS database and reassembles the project.
To reassemble with new sequences:
Step  Action
1     Choose Add Sequence(s) from the Project menu. The following
      dialog box appears:
      Note  Choose Add Multiple from the Project menu to select
      multiple files from different folders.
2     Select the File type checkboxes (“3XX,” “TEXT,” or “Inherit”) to filter
      for the type of file.
      Note  The file list shows only files of the type selected.
3     Add a file or files in one of the following ways:
      ♦ To add only one file, double-click the filename, or select the file
        and click Add.
      ♦ To add all files of the chosen types that are in the open folder,
        click Add All.
      A progress indicator appears while the sequences are being added:
4     If necessary, repeat Step 2 and Step 3 to add additional files.
5     Click Unassembled in the contig list of the open project window to
      see the newly added sequences.
      New sequences are denoted by diamond symbols.
6     Choose Assemble from the Project menu.
      The diamond symbols disappear and the new sequences are
      included in the sequence list.
Reassembling with Changed Sequences
If you assemble sequences and then modify the information in the
parent disk files (for example, if you assembled sequences and
subsequently processed them in Factura), update the sequences
associated with the project by re-adding them.
If you use the BioLIMS option to autoupdate a project, this procedure is
unnecessary. During autoupdating, AutoAssembler automatically
replaces older versions of sequences in the designated collection on
the BioLIMS database and reassembles the project.
To reassemble a project with modified sequences:
Step  Action
1     Choose Re-Add Modified Sequences from the Project menu.
      AutoAssembler checks the modification dates of the sequence files
      associated with the project, and re-adds sequences that have been
      modified since they were last changed in AutoAssembler.
      The re-added sequences disappear from the contig sequence list
      and appear in the Unassembled sequence list until you reassemble
      the project.
2     Choose Assemble from the Project menu.
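The modification-date check can be illustrated with a short Python sketch. It is not AutoAssembler's code: the function name and the idea of a dictionary of recorded timestamps are assumptions. The only library call used is the standard os.path.getmtime function.

```python
import os


def files_to_re_add(recorded_times):
    """Return the paths whose files changed after the recorded time.

    `recorded_times` maps each sequence file path to the modification
    timestamp (in seconds) that was noted when the project last used
    that file.
    """
    return [path for path, recorded in recorded_times.items()
            if os.path.getmtime(path) > recorded]
```

Any path returned by this function corresponds to a sequence that would move to the Unassembled list until the project is reassembled.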
Note
It is recommended that you edit your sequences in AutoAssembler.
The fast and efficient editing tools provided by AutoAssembler should make
editing with outside editing programs unnecessary.
Reassembling to Achieve Different Results
Introduction If assembly results in two separate contigs or if the sequences appear
to be improperly aligned because of repeat regions, you can make the
following changes to encourage proper overlaps:
♦ Edit the data
♦ Change constraints (Server option only)
♦ Change the assembly parameters
Once you have made these changes, you may need to reassemble the
project. The following sections discuss considerations you should make
regarding reassembly.
Reassembling After Editing
When you edit sequences in the AutoAssembler program, the
sequence alignment pane of the project window reflects your changes
in the consensus sequence, so you do not need to reassemble the
project to see editing changes. (See Chapter 6, “Viewing and Editing
Sequences,” for editing procedures.)
You might, however, want to reassemble in the following circumstances:
♦ If your edits have created unnecessary gaps or made substantial
  changes to the sequence lengths, you might want to reassemble to
  obtain clean and consistent overlaps.
♦ When you assemble a project, more than one contig might result if
  some of the sequences do not meet the overlap criteria specified by
  the Assembly Setup parameters you set. If this happens, and you
  edit the resulting contigs in such a way as to overlap them, you can
  reassemble the project to join them into a single contig.
To reassemble a project after editing the sequences:
Step
Action
1
Choose Assemble from the Project menu.
The parameters you set for the original assembly are maintained. If
you wish to change the assembly parameters, see Chapter 3,
“Creating and Assembling a Project.”
Reassembling After Changing Constraints
After you have used the Server option to assemble sequences, the
AutoAssembler software allows you to adjust the relationships between
a selected sequence and each of the sequences with which it overlaps.
The assembly engine tries to put only the sequences with the largest
overlaps within a contig, but sometimes you must override these
relationships. For example, you might need to change constraints to
resolve incorrectly assembled repeat regions.
To change assembly constraints:
Step
Action
1
Select the sequence of interest.
2
Choose Constrain Overlaps from the Project menu. The following
dialog box appears:
The sequences that overlap with the sequence you selected are
listed in the dialog box.
3
Click the name of an overlapping sequence for which you want to
change the constraint.
4
Change the constraint by clicking the appropriate radio button.
♦
To use the default setting used by the Server algorithm, click
Automatic (a bullet appears under the “a”).
♦
To strengthen or create an overlap with the selected sequence,
click Enhance (a bullet appears under the “e”).
♦
To remove the overlap with the selected sequence, click Inhibit
(a bullet appears under the “i”).
Using this procedure, you can modify the relationship between your
selected sequence and each of the overlapping sequences.
5
When you are finished, click OK.
6
Choose Assembly Setup from the Project menu. The following
dialog box appears:
7
Select the Server icon.
8
Select the checkbox labeled “Use Constraints.”
9
Click Submit to reassemble the project.
Note
You must reassemble to see the effect of the changes you have made.
When you reassemble the project with only a change in constraints, the
assembly takes a small fraction of the time required for the initial
assembly.
Resetting Overlap Relationships
If you have changed assembly constraints as described above, you can
reset all of the overlap relationships at once, or reset only the
relationships of individual sequences.
To reset overlap relationships:
Step
Action
1
Choose Remove Constraints from the Project menu. This resets all
constraints in the project to the automatic option.
or
Use the Constrain Overlaps command. Follow the same procedure
you used to change the constraints originally.
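One way to picture the three constraint settings and the two reset paths is the Python sketch below. It is a simplified model, not the Server engine's data structure: the class names are hypothetical, and treating a constraint as symmetric between two sequences is an assumption.

```python
from enum import Enum


class Constraint(Enum):
    AUTOMATIC = "a"  # default: the assembly engine decides
    ENHANCE = "e"    # strengthen or create the overlap
    INHIBIT = "i"    # remove the overlap


class OverlapConstraints:
    """Hypothetical store of constraints between pairs of sequences."""

    def __init__(self):
        self._pairs = {}

    def set(self, seq_a, seq_b, constraint):
        # Assumption: the relationship is the same in both directions.
        key = frozenset((seq_a, seq_b))
        if constraint is Constraint.AUTOMATIC:
            self._pairs.pop(key, None)  # resetting one pair
        else:
            self._pairs[key] = constraint

    def get(self, seq_a, seq_b):
        return self._pairs.get(frozenset((seq_a, seq_b)), Constraint.AUTOMATIC)

    def remove_all(self):
        # Counterpart of Remove Constraints: everything back to automatic.
        self._pairs.clear()
```

Setting a pair back to `Constraint.AUTOMATIC` corresponds to resetting an individual relationship with Constrain Overlaps, while `remove_all` corresponds to the Remove Constraints command.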
Assembling Projects Without Constraints
You may also choose to assemble a project without constraints. This
procedure maintains the constraint settings.
To assemble the project without constraints:
Step
Action
1
Choose Assembly Setup from the Project menu.
2
In the Assembly Setup dialog box, deselect the checkbox labeled
“Use Constraints.”
3
Click Submit.
Reassembling After Changing Minimum Overlap and Percent Error
You might want to change the Minimum Overlap or Percent Error
parameters after you see the overlaps resulting from a particular set of
sequences and parameters. You can do so by choosing Assembly
Setup from the Project menu. The procedure is the same as for original
assembly (see “Assembling Sequences” on page 3-29).
Note
To lessen the number of sequences included in an overlap, try
increasing the Minimum Overlap parameter or decreasing the percentage of
errors allowed.
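The effect of the two parameters can be illustrated with a small Python sketch. This manual does not describe how the assembly engine scores overlaps, so the function below is an assumption: it takes two already aligned overlap regions of equal length and accepts the overlap only if it is long enough and its mismatch percentage is within the allowed error.

```python
def overlap_is_accepted(region_a, region_b, min_overlap, percent_error):
    """Apply hypothetical Minimum Overlap and Percent Error criteria.

    region_a and region_b are the aligned overlapping portions of two
    sequences; they must have the same length.
    """
    if len(region_a) != len(region_b):
        raise ValueError("aligned regions must have the same length")
    length = len(region_a)
    if length == 0 or length < min_overlap:
        return False  # too short to count as an overlap
    mismatches = sum(1 for x, y in zip(region_a, region_b) if x != y)
    return 100.0 * mismatches / length <= percent_error
```

With a 10-base overlap containing one mismatch, the overlap passes at a minimum overlap of 8 and 10 percent error, but fails if the minimum overlap is raised to 12 or the percent error is lowered to 5.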
Reassembling After Changing Engine Parameters
If you are using an engine that supports user-entered parameters, you
might want to change the parameters after viewing the consensus. You
can do so by choosing Assembly Setup from the Project menu. The
procedure is the same as for original assembly (see “Assembling
Projects Using the Engine Options” on page 3-31).
Note
The parameters mentioned in this section apply only to the CAP and
CAP Remote engines. Other engines may or may not support these
parameters.
In particular, if you find that sequences are being incorrectly excluded
from the contig, you should consider modifying the following CAP
engine parameters and reassembling:
♦
-OVERLEN–Decreasing this value reduces the number of bases
required to establish overlap.
♦
-FLEVEL–Decreasing this value reduces the number of matches
required in an overlapping sequence.
Note
Decreasing these values increases the possibility of incorrectly
matched sequences.
If you find that your contig contains excessive areas of ambiguity or
incorrectly overlapped sequences, you should consider modifying the
following engine parameters and reassembling:
♦
-OVERLEN–Increasing this value means that more bases must
match before an overlap will be considered valid.
♦
-FLEVEL–By increasing this value, you raise the percentage of
bases that must match before an overlap is considered valid.
♦
-POS3–If your sequences contain valid data after the default
number of bases (450), you should increase this value.
For more information on all user-entered parameters, see “Assembling
Projects Using the Engine Options” on page 3-31.
Chapter 8
Saving and Printing in AutoAssembler
Overview
Introduction
This chapter provides information on
♦
Printing and saving your work for various purposes
♦
Exporting and importing sequences to and from other programs
IMPORTANT
You should save your work during and after making any
significant change in the project or an individual sequence.
In This Chapter
This chapter contains the following topics:
Topic                                              See Page
Saving your Work                                   8-2
Printing and Saving Assembly Reports               8-7
Printing and Copying the Views for Presentations   8-11
Creating Files for Use with Other Applications     8-16
Saving your Work
Introduction
To prevent losing your work, make sure to save your project after
making significant changes in a sequence or in the project window. You
can save the entire project or individual sequences.
Saving the Project
IMPORTANT
To preserve the information in a previously created file and
create a new file containing changes, either save the open file under a new
filename, or save a copy of the open file.
To save a project:
Step
Action
1
You can save a project in one of three ways:
♦
Choose Save from the File menu.
If you previously saved the project, it is automatically saved
under the same filename.
If you have not saved the project before, a standard file dialog
box appears so that you can select a location and enter the
filename for your file.
♦
Choose Save As from the File menu.
When the standard file dialog box appears, type a name for
your file in the entry field, select a location for it, and click Save.
♦
Choose Save a Copy In from the File menu.
A standard file dialog box appears, allowing you to assign the
filename and location. A copy of your current project is
saved to the file you name, but the original remains on the
screen.
Note
This procedure will not save changes made to sequences
to the individual sequence files. These changes will be saved to the
project only (see “Project and Sequence Relationships” on
page 8-3).
Project and Sequence Relationships
When you are ready to save changes to your project, it is important to
understand exactly what it is that you are saving.
Project Files
A project file contains the following:
♦
Consensus sequence
♦
Editable sequence data
♦
Links to the original sequences
The original sequences themselves are not part of a project file. This is
why moving the sequences can prevent you from being able to display
electropherograms or open the sequence window for a particular
sequence (see “Organizing a From Files Project” on page 3-2).
When you save a project, any changes you have made to a sequence
are saved in the project only, not in the sequence file itself. For example,
if you add the same sequence to another project, none of the edits you
made in the original project are visible.
Sequence Files
Sequences contain the following:
♦
Original Sample file data
♦
Editable data
♦
Electropherogram data (only if the sequence was produced by
ABI PRISM DNA Sequencing Analysis software)
Original sequence data is stored in the Sample file. In the Factura
program, you can revert the sequences to the original data. However, if
you do, the edited data is overwritten with the original data, and any
edits you have made are lost.
To make a permanent change to a sequence file’s editable data, you
must save the sequence in one of two ways:
♦
Use the Save Sequences command from the project window (see
“Saving Sequences From the Project Window” on page 8-4).
♦
Use the Save command from the sequence window (see “Saving
Sequences From the Sequence Window” on page 8-6).
Saving Individual Sequences
Each sequence in the project has an associated data file containing the
characters that make up the sequence. ABI PRISM DNA Sequencing
Analysis software Sample files include electropherogram information
that defines the four-color electropherogram display of the data. If you
edit sequences in the project window, the changes are not stored in the
associated data file until you save to the individual sequence files.
Since the editing tools in the project window are so powerful, you might
not need to use the sequence window for editing. You can save changes
to a Sample file from the project window, or from the sequence window,
if you have opened it.
Note
If you are using the BioLIMS option, the connection to the database
must be open for you to save a sequence.
Saving Sequences From the Project Window
To save changes to sequences from the project window:
Step
Action
1
Choose Save Sequences from the File menu. The following dialog
box appears:
The sequences you have edited are marked with a checkmark.
2
Click to deselect any sequences you do not want to save.
If you want to save additional sequences, click next to them to
select them.
Deselect the “Save sequences with gap characters” checkbox to
save sequences without gap characters.
3
Click Save to save the sequence(s).
Note
If the modification dates of your sequences are later than
those remembered by the project with which you are working, the
program displays the following alert box for each modified
sequence after you click Save:
Click OK to save sequences.
Click Force Save in the Save dialog box to save the sequence
without checking modification dates.
Force Save is useful if you have another program (such as a
backup program) that might change the modification dates of your
sequence files while a project is open.
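The difference between a normal save and Force Save can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the program's implementation: the function name and its parameters are hypothetical, and where AutoAssembler displays an alert box the sketch simply declines to write and returns None.

```python
import os


def save_sequence(path, data, remembered_mtime, force=False):
    """Write data to path unless the file changed after remembered_mtime.

    Returns the file's new modification time, or None when the file on disk
    is newer than the project remembers and force is False.
    """
    newer_on_disk = os.path.exists(path) and os.path.getmtime(path) > remembered_mtime
    if newer_on_disk and not force:
        return None  # stand-in for the alert box described above
    with open(path, "w") as handle:
        handle.write(data)
    return os.path.getmtime(path)
```

Passing `force=True` skips the date comparison, which mirrors the case where a backup program has touched the files but their contents are unchanged.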
Saving Sequences From the Sequence Window
To save sequences from the sequence window:
Step
Action
1
From any sequence view, select Save from the File menu.
Note
If the modification dates of your sequences are later than
those remembered by the project with which you are working, and
you try to save information to the sequences, the program displays
the following alert box for each modified sequence after you click
Save:
Printing and Saving Assembly Reports
Introduction
After assembling a project, you can view, save, and print the following
types of project assembly reports:
♦
Project Summary
♦
Contig Summary
♦
Project Report
Saved reports are in tab-delimited format, so you can open them in
many word processing, spreadsheet, and database application
programs.
In all the print procedures, if you want to print only one copy or if you do
not want to change the print range, choose Print One rather than Print.
This carries out your request immediately, bypassing the standard print
dialog box.
Project Summary
The Project Summary report summarizes the current status of the
entire project, and includes the following information:
♦
The last time it was saved to a project file
♦
The last time it was assembled
♦
A summary of the sequences and bases in the total project
♦
A summary of the sequences and bases in each contig
Figure 8-1 shows an example of the Project Summary.
Figure 8-1 Project Summary
The Contig Summary
The Contig Summary contains detailed information for a single selected
contig or for the Unassembled sequence list. It includes the following:
♦
Sequence lengths, orientations, and project ID numbers
♦
Starting and ending positions along the consensus sequence
♦
Last modification date for each sequence
♦
Chemistry used to produce each sequence (ABI PRISM DNA
Sequencing Analysis software)
Figure 8-2 shows an example of the Contig Summary.
Figure 8-2 Contig Summary report (callouts: general project information;
assembly parameters used; totals for project; contig information, which
includes redundancy, sequence lengths, and information about individual
sequences from the sequence list)
Project Report
The Project Report is the most complete report. It contains the
information from the Project Summary and detailed information for each
contig in the project. It lists the sequences in the Unassembled list and
indicates the source format, but it does not compute orientation and
offset values.
Figure 8-3 shows an example of the Project Report.
Figure 8-3 Project Report (callouts: Project Summary information;
Contig Summary information, which appears for each contig in the project)
Viewing Assembly Reports
All three reports are accessed through the Project menu.
To view an assembly report:
Step
Action
1
Choose the name of the report from the Project menu. The report
appears in a report window on the screen.
Saving Reports
Saved reports are in tab-delimited format, so you can open them in
many word processing, spreadsheet, and database applications.
To save an assembly report:
Step
Action
1
Choose the type of report you want to save from the Project menu.
2
With the report window active, choose Save (⌘-S) from the File
menu.
A standard file dialog box appears.
3
Type a name for the file in the entry field.
4
Click Save.
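Because saved reports are tab-delimited, they can also be read by a script. The Python sketch below shows the general idea using the standard csv module. The column names and values in the usage example are invented for illustration, since the exact report layout is not reproduced in this chapter.

```python
import csv
import io


def read_report(text):
    """Split tab-delimited report text into a list of rows (lists of fields).

    Blank lines are skipped. The caller supplies the report contents as text.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if row]


# Hypothetical report contents, not an actual AutoAssembler report.
sample = "Contig\tSequences\tBases\nContig 1\t12\t5400\n"
rows = read_report(sample)
```

Each row is a plain list of strings, so numeric columns would need to be converted explicitly before any arithmetic.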
Printing Assembly Reports
All assembly report formats can be printed. In each case, the printed
format is the same as the screen format.
To print an assembly report:
Step
Action
1
Choose the type of report from the Project menu.
The report window opens, with your selected report displayed in it.
2
Choose Print (⌘-P) from the File menu. The Print dialog box
appears.
3
Click Print.
Printing and Copying the Views for Presentations
Introduction
In addition to printing project reports, you can also print from the project
window, or from individual sequence windows. You can copy the views
to the Clipboard and paste them into other applications, such as files in
word processing or presentation programs.
In all the print procedures, if you want to print only one copy or do not
want to change the print range, choose Print One rather than Print. This
carries out your request immediately, bypassing the standard print
dialog box.
Printing Project Window Views
When you print the Layout, Statistics, or Alignment views from the
project window for a selected contig, the printed copy shows the name
of the project and a ruler to identify the base positions in the consensus.
Note
If you want to insert representations of the Layout or Statistics views
into word processing or presentation application files, see “Copying Project
Window Views to Other Programs” on page 8-14.
To print a project window view:
Step
Action
1
Make sure the project window is active.
2
Click the button of the view you want to print.
3
Choose Print (⌘-P) from the File menu.
The Print dialog box appears.
4
Click OK.
Note
To increase the amount of data printed per page, choose
Page Setup from the File menu, and set the parameters for
landscape orientation and a reduced size.
If you print the Layout view with file names displayed, the names appear
on the printed copy.
When you print the Alignment view, the sequence names appear on the
printout. The contig wraps down the page as many times as the length
allows, printing on several pages.
Printing Sequence Window Views
You can print any one of the four sequence window views, or all of them
at once.
When you print the views separately, or print only Annotation, Feature,
and Sequence views together, they print in portrait orientation.
If you request all of the views at once, AutoAssembler prints Annotation,
Sequence, and Feature views on a single page in landscape
orientation, and Electropherogram view on several pages, as necessary
(also in landscape orientation).
A color printer is recommended for printing the Electropherogram view.
To print a view in the sequence window:
Step
Action
1
Click the sequence window to make it active.
2
Choose Page Setup from the File menu. The Page Setup dialog
appears, with additional options (electropherogram settings).
3
If you are only printing the Electropherogram view, click the
Landscape option to the far right of Orientation.
Change the Electropherogram Settings options as follows:
♦
Select the radio button labeled “Single Page” to print the entire
electropherogram on one page.
♦
Select the radio button labeled “Variable Size” to print the
electropherogram on several pages. Results will vary depending
on the settings in the two entry fields.
In most cases, the default settings are sufficient, although you
can fine-tune the print by changing the entry fields. These fields
specify the number of times the electropherogram wraps down
the page and the number of data points displayed within each
wrap.
4
Choose Print (⌘-P) from the File menu. The following dialog box
appears:
5
Select the checkboxes next to the views you want to print.
If you select all four checkboxes, the views are printed together in
landscape orientation, as described earlier. If you select only one, or
the first three, they print separately in portrait orientation.
If you want to punch holes in the page and file it in a three-ring
binder, select the checkbox labeled “Allow for 3-hole punch.” Doing
so causes the print to have a slightly wider left margin.
6
Click OK to start printing.
Copying Project Window Views to Other Programs
If you want to use graphics from the project window in a word
processing or related program to create a report or a presentation, you
can copy graphics from the project window to the Macintosh Clipboard,
and paste them from the Clipboard into your file in the other program.
To copy graphics from the project window for use in another program:
Step
Action
1
Click the project window to make it active.
2
Click a button to display the view you want to copy.
3
From the view you have opened, select the area you want to copy
to the Clipboard.
4
Select Copy (⌘-C) from the Edit menu.
Note
Only text can be copied from the Alignment view.
5
Select Show Clipboard from the Edit menu to see what you have
copied.
Copying a Sequence from the Sequence Window
You can also copy a sequence from the sequence window. The
following are two possible uses for a copied sequence:
♦
Create a new sequence file for use with other sequencing-related
applications
♦
Incorporate the sequence into a text file as part of a report, article,
or presentation
In the sequence window, you can only copy from the Sequence view,
and the copied sequence is in text format, rather than graphic format.
To copy an individual sequence from the sequence window:
Step
Action
1
Make sure the sequence window displaying the sequence of
interest is active.
Note
The selected window can display a single sequence
fragment or a consensus (see “Building a Consensus Sequence”
on page 8-16).
2
Select the entire sequence or a desired range.
3
Choose Copy (⌘-C) from the Edit menu.
The sequence or range is copied to the Clipboard. You can use the
Paste command in another program to paste the contents of the
Clipboard into a file associated with that program.
Creating Files for Use with Other Applications
Introduction
Two types of files can be created by the AutoAssembler software for use
with other programs.
♦
Consensus files–The consensus file is the end result of
AutoAssembler. You can either archive the consensus, or translate
the file for use with other programs.
♦
Layout files–Files of a single contig, which can be used with
Sequence Navigator, SeqEd, or EditView software.
Building a Consensus Sequence
When you are satisfied with the consensus produced as part of a
contig, you can build and save a special consensus sequence file for
use in another program, such as Inherit Analysis.
To build a consensus sequence for use with other programs:
Step
Action
1
Select the contig of interest by clicking its name in the upper-left
pane of the project window.
2
Choose Build Consensus from the Project menu. The following
dialog box appears, with the file’s name in the Name entry field:
3
Use the pop-up menu to choose the case of the characters in the
consensus in one of the following ways:
♦
To retain the case the characters have in the project window
(lowercase characters for ambiguous base positions and
uppercase characters for all others), use the default (Mixed).
♦
To create a consensus sequence with all upper-case
characters, choose “UPPER.”
♦
To create a consensus with all lower-case characters, choose
“lower.”
Note
It is easier to identify ambiguous base positions in the
consensus if you choose Mixed case.
4
Select the “Delete insertion (gap) characters” checkbox to eliminate
gap characters from the consensus.
5
Click OK. A sequence window with the consensus sequence
appears:
Note
In a mixed consensus (such as the consensus shown
here), lowercase characters denote ambiguities.
This window allows you to switch to either Feature view or Annotation
view, but both are empty. If you want to add features, you can do so
using Factura (see the Factura User’s Manual).
Note
The consensus sequence does not have Electropherogram view, since
it was not produced by ABI PRISM DNA Sequencing Analysis software.
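The case and gap options in the Build Consensus dialog box amount to simple text transformations, sketched below in Python. The function name and its parameters are hypothetical, and using "-" as the gap character is an assumption. The sketch only mirrors the choices described above (Mixed, UPPER, lower, and deleting gap characters).

```python
def format_consensus(consensus, case="mixed", delete_gaps=False, gap_char="-"):
    """Return the consensus text with the chosen case and gap handling.

    case: "mixed" keeps the characters as they are (lowercase marks
    ambiguous positions), "upper" and "lower" convert the whole sequence.
    """
    if case not in ("mixed", "upper", "lower"):
        raise ValueError("case must be 'mixed', 'upper', or 'lower'")
    if delete_gaps:
        consensus = consensus.replace(gap_char, "")
    if case == "upper":
        return consensus.upper()
    if case == "lower":
        return consensus.lower()
    return consensus
```

Note that converting to a single case discards the lowercase markers, which is why Mixed case makes ambiguous positions easier to find.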
Exporting a Consensus Sequence
Once you have built the consensus sequence, you have several options
for transporting it to other applications:
♦
Copy the consensus sequence to the Clipboard and paste it into a
new sequence file in another application.
♦
Save the consensus sequence for future use with another
application. It is saved to a Sample file format without an
electropherogram.
♦
Export the consensus sequence to text format, as described in the
next section, “Exporting Sequences to Text Format.”
To copy the consensus sequence via the Clipboard:
Step
Action
1
With the consensus sequence window active, choose Select All
(⌘-A) from the Edit menu.
2
Choose Copy (⌘-C) from the Edit menu.
3
Open a new sequence file in the other application.
4
With the window of the new sequence file active, choose Paste
(⌘-V) from the Edit menu.
If you try to close the sequence window without saving, a dialog box
asks you to verify whether or not you want to save it.
To save the consensus sequence:
Step
Action
1
Choose Save from the File menu.
The standard Save dialog box appears, with the name of the contig
in the entry field.
2
Enter another name for the file, if you want to change it.
3
Click Save.
Note
This procedure saves the consensus to a Sample file format without
an electropherogram. If you want to save it to a different file format, use the
Export command (see the next section).
Exporting Sequences to Text Format
AutoAssembler allows you to export a consensus into a text file. A text
file simply contains a string of characters, and can be easily opened
in word processing applications.
Note
You can also export the contig to a layout format for use with the
SeqEd and Sequence Navigator programs (see the next section).
To export a consensus sequence to another format:
Step
Action
1
Make sure the sequence window containing the consensus is the
active window.
2
Choose Export from the File menu, and Text from the submenu that
appears.
3
Choose the appropriate file type.
4
The standard Macintosh Save dialog box appears, with the contig
name as the default file name.
Select the destination folder and click Save.
AutoAssembler Layout Files
Although you can open a layout generated by AutoAssembler in the
SeqEd, EditView, and Sequence Navigator programs, it is highly
recommended that you edit assembled sequences in AutoAssembler.
IMPORTANT
SeqEd and EditView do not recognize feature table
information, and saving edits to a Sample file from either of these programs can
invalidate the feature table in the file.
Note
If you find an invalid feature table after such editing and saving, run the
sequences in Factura again, using the same settings, but do not revert the
sequences to original data. This should re-establish the feature table without
overwriting your edits.
If you open an AutoAssembler-generated layout in the Sequence
Navigator program, a dialog box appears, indicating that the file will be
converted. In order to maintain compatibility with SeqEd and EditView,
the layout created by AutoAssembler uses System 6-compatible file
references. Sequence Navigator converts the references to System 7-compatible aliases.
Although the Sequence Navigator program does recognize feature
tables, some information is lost when you edit sequences in Sequence
Navigator, save back to the original sequence files, and then re-add the
sequences to AutoAssembler.
Use AutoAssembler to edit assembled sequences. The powerful editing
features in AutoAssembler make editing with outside programs
unnecessary in most cases.
To export a contig to a layout:
Step
Action
1
Select a contig from the project window and choose Export from
the File menu.
2
Choose Layout from the Export submenu.
A standard file dialog box appears.
3
Enter a filename for the layout file.
4
Click Save.
Appendix A
AppleScript Dictionary
Appendix Overview
Introduction
This appendix provides a complete list of the AppleScript commands
supported by the AutoAssembler program. For instructions regarding
the use of AppleScript, see Apple’s AppleScript User’s Guide.
In This Appendix
This appendix contains the following topics:
Topic                    See Page
AppleScript Commands     A-2
AppleScript Commands
AutoAssembler Suite
Table A-1 contains events that are specific to the AutoAssembler
software.
Table A-1 AutoAssembler Suite
Command
Description
Zoom
Increases the magnification of the target window.
Zoom (reference)–The window to zoom
♦
to (real)–The new magnification
Tile Windows
Arrange open windows so that all are visible.
Tile windows (reference)
Stack Windows
Stack open windows.
Stack windows (reference)
Show
Show sequence window for selected sequence.
Show (reference)
Assemble
Assemble project.
Assemble (reference)–The project to assemble
Show Project Report
Show the project report window.
Show project report (reference)–The project
Show Project Summary
Show the project summary window.
Show project summary (reference)–The project
Show Contig Summary
Show the contig summary window.
Show contig summary (reference)–The contig
Show Consensus
Show the contig consensus window.
Show consensus (reference)–The contig
♦
using title (string)–The names of the consensus
window
♦
gaps (Boolean)–Include insertion (gap) characters
♦
in (mixed/uppercase/lowercase)–Alphabetic case
Add To
Add fragments to the specified project.
Add to (reference)–The project to contain the fragment
♦
fragments (alias)–The fragment files to add
♦
with ID (string)–The identifier of the fragment on the
database
♦
from database (string)–The name of the BioLIMS
database to use
♦
on server (string)–The name of the database’s
server
Select Next Bad Region
Select the next bad region in the project window.
Select new bad region (reference)–The project window
Select All
Select everything in the project window.
Select all (reference)–The project window
Class Application
AutoAssembler application
♦
autoupdate (Boolean)–Enable automatic updating
♦
update delay (integer)–Idle minutes to wait before
starting automatic update
♦
update list (a list of file)–List of projects to be
automatically updated
Class Document
Project document
Elements
♦
contig (by numeric index/by name)
♦
sequence (by numeric index/by name)
♦
unassembled sequence (by numeric index/by
name)
Class Contig(s)
Contiguous alignment of sequences
Elements
♦
sequence (by numeric index/by name)
♦
consensus (text r/o)–The consensus
♦
name (string r/o)–The name
♦
length (integer r/o)–The length of the consensus
♦
orientation (original/complementary r/o)–The
orientation
Class Sequence(s)
Sequence of bases
Elements
♦
feature (by numeric index/by name)
♦
bases (text r/o)–The bases
♦
name (string r/o)–The name
♦
orientation (original/complementary r/o)–The
orientation
♦
length (integer r/o)–The length
♦
sequence type (unknown/DNA/RNA/protein
r/o)–The type
♦
alphabet (unknown/IUB/gapped IUB/protein
r/o)–The alphabet
♦
annotation (text r/o)–The annotation
Class Unassembled Sequence(s)
A sequence which does not belong in a contig
♦
<Inheritance> (sequence)–All properties and
elements of the given class are inherited by this
class.
Class Feature(s)
Features of a sequence
♦
key (text r/o)–The key
♦
name (string r/o)–Synonym for key
BioLIMS Scripts
Table A-2 contains AppleScript commands for the BioLIMS access
suite.
Table A-2 AutoAssembler BioLIMS Access Suite
Command
Description
Open Connection
Opens a connection with the database using the current
connection or makes a new one with the parameters
provided.
Open connection (reference)–The application
♦
using connectionID (small integer)–ID number to
identify the data for opening this connection
♦
using alias (string)–Alias to identify the data for
opening this connection
♦
with username (string)–User name to be used for
opening this connection
♦
to database (string)–Database name to be used for
opening this connection
♦
on server (string)–Server to be used for opening
this connection
♦
with password (string)–Password to be used for
opening this connection
♦
with alias (string)–Password to be used for opening
this connection
♦
with database (string)–Database name to be used
for opening this connection
♦
with server (string)–server to be used for opening
this connection
Result (small integer)–ID number of the opened
connection
Open Default Connection
Opens a connection using the default data from the
session manager dialog.
Open default connection (reference)–The selected
application
Result (small integer)–the ID number of the opened
connection
Make New Connection
Creates a new set of data for connecting to a database,
and makes it the current set.
Make new connection (reference)–The application
♦
with username (string)–user name to be used for
opening this connection
♦
with database (string)–database name to be used
for opening this connection
♦
with server (string)–server to be used for opening
this connection
♦
with password (string)–password to be used for
opening this connection
♦
with alias (string)–alias to be used for identifying
this connection in the future
♦
to database (string)–database name to be used for
opening this connection
♦
on server (string)–server to be used for opening
this connection
Result (small integer)–the ID number of the selected
connection
Select Connection
Make this connection the selected one.
Select connection (reference)–The application
♦
with connectionID (small integer)–ID number to
identify this connection
♦
with alias (string)–Alias to identify this connection
♦
using connection (small integer)–ID number to
identify this connection
♦
using alias (string)–Alias to identify this connection
Result (small integer)–The ID number of the selected
connection
Close Connection
Close the channel used in this connection.
Close connection (reference)–The application
♦
with connectionID (small integer)–ID number to
identify this connection
♦
with alias (string)–Alias to identify this connection
♦
using connection (small integer)–ID number to
identify this connection
♦
using alias (string)–Alias to identify this connection
Delete Connection
Discard the designated connection permanently.
Delete connection (reference)–The application
♦
with connectionID (small integer)–ID number to
identify this connection
♦
with alias (string)–Alias to identify this connection
♦
using connection (small integer)–ID number to
identify this connection
♦
using alias (string)–Alias to identify this connection
Delete all connections
Discard all connections made through AppleScript
permanently.
Delete all connections (reference)–The application
Class Session Manager
The session manager
♦
selected alias (text)–The alias of the currently
selected connection
♦
selected username (text)–The user name of the
currently selected connection
♦
selected database (text)–The database name of the
currently selected connection
♦
selected server (text)–The server name of the
currently selected connection
♦
selected password (text)–The password of the
currently selected connection
♦
selected connectionID (small integer r/o)–The ID
number of the connection
♦
current alias (text)–The alias of the currently
selected connection
♦
current username (text)–The user name of the
currently selected connection
♦
current database (text)–The database name of the
currently selected connection
♦
current server (text)–The server name of the
currently selected connection
♦
current password (text)–The password of the
currently selected connection
♦
current connectionID (small integer r/o)–The ID
number for the connection
♦
user intervention (Boolean)–Whether or not the
user is asked to help connect
Appendix B References
Appendix Overview
Introduction This appendix provides a list of references for information about the
algorithms used by the AutoAssembler software and its server option.
In This Appendix This appendix contains the following topics:
Topic                   See Page
Algorithm References    B-2
Algorithm References
Sequence Alignment Algorithms The following references might be useful to you for a more complete
understanding of the sequence alignment algorithms used by
AutoAssembler:
♦
Applied Biosystems Division of Perkin Elmer. 1993. Sequence
Analysis Toolbook. Foster City: Applied Biosystems Division of
Perkin Elmer.
♦
Dear, S. and Staden, R. 1991. A sequence assembly and editing
program for efficient management of large projects. Nucleic Acids
Research. 19:3907-3911.
♦
Huang, X. 1992. A Contig Assembly Program Based on Sensitive
Detection of Fragment Overlaps. Genomics. 14:18-25.
♦
Kececioglu, J.D. and Myers, E. 1994. Exact and Approximate
Algorithms for the Sequence Reconstruction Problem.
Algorithmica. 12:4.
♦
Kececioglu, J.D. Exact and Approximation Algorithms for DNA
Sequence Reconstruction. University of Arizona. TR91-26.
Feature Tables The following provides more information about feature tables:
♦
DNA Data Bank of Japan, Mishima, Japan; EMBL Data Library,
Heidelberg, Federal Republic of Germany; GenBank, Los Alamos,
NM, and Mountain View, CA, USA. 1993. The
DDBJ/EMBL/GenBank Feature Table: Definition. This can be
obtained by anonymous FTP to ncbi.nlm.nih.gov.
Use “anonymous” as your login ID and your e-mail address as your
password.
Appendix C Key Codes
Appendix Overview
Introduction This appendix provides translations for codes used in the
AutoAssembler program.
In This Appendix This appendix contains the following topics:
Topic                 See Page
Translation Tables    C-2
Translation Tables
Introduction This section provides the following translation tables:
♦
IUPAC/IUB Codes
♦
Complements
♦
Universal Genetic Code
♦
Amino Acid Abbreviations
IUPAC/IUB Codes Table C-1 provides translations for IUPAC/IUB codes used in the
AutoAssembler software.
Table C-1 IUPAC/IUB Codes
Code    Translation
A       Adenosine
C       Cytidine
G       Guanosine
T       Thymidine
B       C, G, or T
D       A, G, or T
H       A, C, or T
V       A, C, or G
R       A or G (puRine)
Y       C or T (pYrimidine)
K       G or T (Keto)
M       A or C (aMino)
S       G or C (Strong: 3 H bonds)
W       A or T (Weak: 2 H bonds)
N       aNy base
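As an illustration of how the codes in Table C-1 can be used, the following is a minimal Python sketch that maps each IUB code to the bases it stands for and checks whether two codes can describe the same base. It is not part of AutoAssembler; the names IUB_CODES, expand, and compatible are hypothetical, and the entry for V (A, C, or G) is taken from the standard IUB code set.

```python
# Each IUPAC/IUB code mapped to the bases it can stand for.
# Illustrative only; this is not AutoAssembler code.
IUB_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT", "N": "ACGT",
}


def expand(code):
    """Return the set of bases represented by a single IUB code."""
    return set(IUB_CODES[code.upper()])


def compatible(code_a, code_b):
    """Return True if the two codes share at least one possible base."""
    return bool(expand(code_a) & expand(code_b))


if __name__ == "__main__":
    print(sorted(expand("R")))   # the bases covered by puRine
    print(compatible("R", "Y"))  # purine versus pyrimidine
    print(compatible("B", "D"))  # both allow G or T
```

Representing each code as a set makes the comparison a simple set intersection, which is one straightforward way to test whether an ambiguous position could agree with another base call.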
Complements Table C-2 provides complements for reference.
Table C-2 Complement Table
Code    Complement
A       T
S       S
W       W
B       V
D       H
C       G
G       C
T       A
R       Y
H       D
Y       R
V       B
K       M
N       N
M       K
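The complement pairs can be applied mechanically to produce the reverse complement of a sequence. The sketch below is a hedged Python illustration, not the routine AutoAssembler itself uses; COMPLEMENT and reverse_complement are hypothetical names, and passing a hyphen gap character through unchanged is an assumption. S and W map to themselves because the complement of "G or C" is "C or G", and likewise for "A or T".

```python
# Complement of each IUB code. S (G or C) and W (A or T) are their own
# complements. The "-" entry is an assumption so that gapped sequences
# can be processed; it does not come from Table C-2.
COMPLEMENT = {
    "A": "T", "T": "A", "C": "G", "G": "C",
    "R": "Y", "Y": "R", "K": "M", "M": "K",
    "B": "V", "V": "B", "D": "H", "H": "D",
    "S": "S", "W": "W", "N": "N", "-": "-",
}


def reverse_complement(seq):
    """Complement every character and reverse the order of the result."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    print(reverse_complement("GAATTC"))  # a palindromic site reads the same
    print(reverse_complement("ACGTRY"))
```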
Universal Genetic Code Table C-3 provides the Universal Genetic Code for use with the
AutoAssembler software.
Table C-3 Universal Genetic Code
5' End    2nd Position                  3' End
          T       C       A       G
T         Phe     Ser     Tyr     Cys     T
          Phe     Ser     Tyr     Cys     C
          Leu     Ser     OCH     OPA     A
          Leu     Ser     AMB     Trp     G
C         Leu     Pro     His     Arg     T
          Leu     Pro     His     Arg     C
          Leu     Pro     Gln     Arg     A
          Leu     Pro     Gln     Arg     G
A         Ile     Thr     Asn     Ser     T
          Ile     Thr     Asn     Ser     C
          Ile     Thr     Lys     Arg     A
          Met     Thr     Lys     Arg     G
G         Val     Ala     Asp     Gly     T
          Val     Ala     Asp     Gly     C
          Val     Ala     Glu     Gly     A
          Val     Ala     Glu     Gly     G
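To show how Table C-3 is read, here is a minimal Python sketch that builds the same 64-codon table and translates a DNA string into one-letter amino acid codes. It is an illustration under stated assumptions rather than AutoAssembler's own translation routine: the names CODON_TABLE, STOP_NAMES, and translate are hypothetical, stop codons are written as "*", and any codon containing a character outside T, C, A, and G is reported as "X".

```python
from itertools import product

BASES = "TCAG"  # same order as the rows and columns of Table C-3
# One-letter amino acids for all 64 codons, third position varying fastest.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# The three stop codons and the abbreviations used for them in Table C-3.
STOP_NAMES = {"TAA": "OCH", "TAG": "AMB", "TGA": "OPA"}


def translate(seq):
    """Translate complete codons in frame 1; unknown codons become 'X'."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
    print(sorted(STOP_NAMES.items()))
```

Trailing bases that do not fill a complete codon are ignored in this sketch.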
Amino Acid Abbreviations Table C-4 provides amino acid abbreviations for use with
AutoAssembler’s Show Protein Translation feature.
Table C-4 Amino Acid Abbreviations
AMINO ACID        THREE LETTERS    ONE LETTER
Alanine           Ala              A
Arginine          Arg              R
Asparagine        Asn              N
Aspartic Acid     Asp              D
Cysteine          Cys              C
Glutamic Acid     Glu              E
Glutamine         Gln              Q
Glycine           Gly              G
Histidine         His              H
Isoleucine        Ile              I
Leucine           Leu              L
Lysine            Lys              K
Methionine        Met              M
Phenylalanine     Phe              F
Proline           Pro              P
Serine            Ser              S
Threonine         Thr              T
Tryptophan        Trp              W
Tyrosine          Tyr              Y
Valine            Val              V
Any Amino Acid                     X
Stop Codes: AMBer, OCHre, OPAl
Glossary
This section defines special terminology used in the AutoAssembler software. The terms are
listed in alphabetical order. Many terms are defined in the text of this manual. If you do not find
a term here, check the index to see if you can locate it in the manual.
ambiguity character A character that appears in the Alignment view of the project window to indicate an
ambiguous base position or an insertion in the consensus of the displayed contig. The character
appears just below the consensus sequence. You can specify the character by choosing Settings
from the Edit menu. The default is a black bullet (•).
assemblage A term used interchangeably with “contig.”
autoupdating A command that, in conjunction with BioLIMS, adds modified and new sequences from a
collection to a designated project. With autoupdating, multiple users on different computers can add
new sequences or edit existing sequences in a BioLIMS collection. These new or edited files will
then be automatically added to the project.
BioLIMS A database that stores sequences used in AutoAssembler projects, allowing multiple users to
edit or add sequences to collections. These sequences can then be automatically added to a project
using the AutoUpdating feature.
collection A group of sequences residing in the BioLIMS database.
chromatogram A four-color picture of a sequence, showing peaks that represent the bases or amino
acids. The term is used interchangeably with “electropherogram” in AutoAssembler.
consensus sequence A linear series of characters that represents the multiple sequence alignment of a
contig. Individual base positions in the consensus are represented either by capital letters,
lowercase letters in color, or insertion characters.
contig A group of overlapping sequences resulting from assembly. Unlike traditional sequence-alignment methods used in assembling sequences, AutoAssembler generates consensus
sequences on the basis of primary data only, not on the basis of interim consensus sequences.
Groups of overlapping sequences are dynamically computed with each iteration of the
AutoAssembler. The chief advantage of this approach is that initial sequences do not introduce a
bias into the consensus.
contig list A listing of contigs that appears in the upper-left pane of the project window. When you select
a contig in the contig list, the associated sequences appear in the sequence list.
editable data A copy of the original ABI PRISM DNA Sequencing Analysis software-produced data that
is stored in the sample file. All changes saved to sequence files are stored in the editable data copy,
so the original data is maintained in its unmodified (original) condition. Editable data is displayed in
the AutoAssembler project window and sequence window.
electropherogram A four-color picture of a sequence, showing peaks that represent the bases or amino
acids. The term is used interchangeably with “chromatogram” in AutoAssembler.
exporting Storing the contents of selected sequences in a file other than the associated data file. You
can export sequences as text files for use with word processing applications. You can also export all
sequences in a contig into a layout for use with the SeqEd or Sequence Navigator application
programs.
feature A defined region in a sequence. You can define features in Factura using the Feature–Add
command. Features are also the regions identified when you process a sequence using Factura.
The sequence window Feature view in Factura shows Factura-identified features only after you have
saved them using the Save to Sequence command in the Worksheet menu.
gap character A character inserted into a sequence to indicate a missing region. In AutoAssembler, the
gap character is a hyphen or dash (–). For example, the sequence of nucleotides GCTA– contains 5
characters. The last character is a gap.
identification parameters The settings you specify for vector, ambiguity, confidence range, and IUB
code (heterozygote) calling that are used to identify those features during Factura processing.
ID numbers Numbers that identify sequences in the Layout view of the AutoAssembler project window
when you choose Show IDs from the Project menu. The numbers are assigned sequentially as
sequences are added to the project, and are not re-used if the corresponding sequences are
removed from the project.
index The index of the first base or amino acid of a sequence is the same as the sequence offset, and
each succeeding character has an index of one greater. The index numbers are shown on a ruler at
the top of the lower panel of the project window.
insertion character A character that appears in the consensus sequence in the Alignment view of the
project window. This character indicates an insertion in the consensus of the displayed contig. You
can specify the character by choosing Settings from the Edit menu. The default is ~.
IUB code Alphabetic character representing the occurrence of mixed bases at a given position in a
sequence. Originally defined by the International Union of Biochemistry.
IUPAC International Union of Pure and Applied Chemistry. IUB codes are also referred to with this
acronym, since IUPAC adopted the codes as a standard.
layout A two-dimensional display in the lower panel of the project window that uses arrows to show the
relationships between sequences in a contig. A layout is also the main window, or worksheet, that
displays multiple sequences in the SeqEd and Sequence Navigator applications.
length The length of a sequence is the number of characters it contains, including gap characters. For
example, GAATTC has a length of 6. GAA–TTC has a length of 7.
mark style A pre-defined style that can be applied in Factura so you can visually identify a feature in the
sequence window.
offset The relative distance between the origin and beginning of a given sequence in the Alignment
view of the project window. The leftmost sequence starts at the origin (position 1), and each other
sequence in the contig is offset a certain number of bases to the right of that point, each being
positioned to provide the best alignment of the data.
origin An imaginary vertical line in the Alignment view of the project window between index positions
zero and minus one. The origin is where the far-left sequence in a contig starts.
original data The sequence data produced by the ABI PRISM DNA Sequencing Analysis software. This
data is maintained in its original state in a sample file. An editable copy of the data is stored in the
same sample file, and changes when you save edits to the file. See also editable data, sample files.
protein translation An editing function that displays a single-character amino acid code
beneath the third character of each three-base codon. This command uses a universal codon
table when translating ambiguities in the consensus sequence.
residue An amino acid or a nucleotide.
ruler A scale displaying index numbers, located above the consensus in the lower panel of the project
window.
sample files Files produced by the ABI PRISM DNA Sequencing Analysis software. These files contain
data produced by the instrument: a sequence of base calls, peak locations, and an
electropherogram. The original data in sample files is always maintained in its original state (as it
came from the ABI PRISM DNA Sequencing Analysis software). When you save changes you have
made to the sequence, they are stored to a copy of the original data called the “editable data.”
Editable data is displayed in the AutoAssembler project and sequence windows.
selected sequence A sequence that you have specified by clicking its identification in the project window
in AutoAssembler.
sequence A linear series of characters. The characters are displayed in rows from left to right. More
specifically, a sequence is a series of nucleotide base characters that represent a linear DNA
sequence, or a series of amino acid characters that represent a protein sequence.
sequence list A listing of sequence names that is displayed in the upper-right pane of the project
window. The sequence list shows the names of the sequences included in the contig that is selected
in the upper-left pane of the project window.
settings Choices that you specify in AutoAssembler about the parameters used to identify features in
the project window views.
statistics settings Choices that you specify in AutoAssembler about the parameters used to display the
consensus sequence in the Statistics view.
statistics view A view in the project window that plots redundancy versus consensus base. This view is
useful if you need to verify that you have minimum redundancy and orientation throughout the
consensus. You can choose the parameters for this view using the Statistics Settings command in
the Edit menu.
summary graphic A horizontal line displayed in the top part of the sequence window. This line
represents the length of the sequence that is displayed in the window, and reflects the cursor
position as you move it to different places in the sequence. The line also shows colored regions to
represent marked features in the sequence.
symbol Usually a character, such as G, A, *, or –. Often represents a base or amino acid in a sequence.
text files Files produced by all Macintosh word processing programs, and many other programs. Each
contains a string of characters and can be created when you save files.
upper-left pane The left of the two panes in the upper portion of the project window. It displays the
names of contigs and the unassembled sequence list.
upper-right pane The right of the two panes in the upper portion of the project window. It displays a
list of the sequences of the contig you select in the left pane.
views Various displays provided by the project window and the sequence window in
AutoAssembler.
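The glossary entries for "gap character" and "length" can be illustrated with a short Python sketch. It assumes a plain hyphen as the gap character; sequence_length and residue_count are hypothetical names, and "residue count" is used here only as a convenient label for the number of non-gap characters, not as a term defined by AutoAssembler.

```python
GAP = "-"  # assumption: gaps are stored as a plain hyphen


def sequence_length(seq):
    """Length as the glossary defines it: gap characters are counted."""
    return len(seq)


def residue_count(seq):
    """Number of characters that are not gaps."""
    return len(seq) - seq.count(GAP)


if __name__ == "__main__":
    for seq in ("GAATTC", "GAA-TTC", "GCTA-"):
        print(seq, sequence_length(seq), residue_count(seq))
```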
Index
Symbols
*ABI_ValidRange feature 5-18
Numerics
373 or 377
sequence data. see data
A
adding
bases in Electropherogram view 6-10
sequences to a BioLIMS project 3-12
sequences to a From Files project 3-10
adjusting relationships between sequences 7-6
Alignment view
customizing characters 4-12
described 4-2
editing in 5-18
example 4-6
allocating more memory 2-15
ambiguity
color marking 4-5, 4-15
in Alignment view 4-6
in Layout view 4-3, 4-5
in project window views 5-2
ambiguity characters
changing 4-16
defined G-1
in Alignment view 4-5, 4-6, 5-3
Analysis software
sample files defined G-3
Annotation view
button to display 6-7
described 6-7
example 6-7
in text, Inherit, new sequences 6-4
AppleScripting A-1
new feature of version 2.0 1-5
arranging multiple windows 4-18
Arrow keys
Down-Arrow key 5-12
Left-Arrow key. see Left-Arrow key
Right-Arrow key. see Right-Arrow key
Up-Arrow key 5-11
arrows (Layout view) 4-2, 4-3
assembling
adjusting overlaps between sequences 7-6
automatically 3-29
by engine algorithm 3-31
by local algorithm 3-30
by server algorithm 3-36
constraining overlaps 7-6
editing valid range 5-18
increasing or decreasing amount of data
used 5-18
lessening the number of sequences included
in an overlap 7-8
resolving incorrectly assembled repeat
regions 7-6
sequences 3-29
sequences of diverse lengths 3-39
assembly engines
adding engines 3-31
CAP, CAPRemote 3-31
new feature of version 2.0 1-5
output files 3-41
parameters 3-31
reassembling after changing
parameters 7-9
assembly reports
printing 8-10
saving 8-10
viewing 8-9
AutoAssembler
compatibility with previous versions 1-6
files installed 2-6
flowchart 1-7
getting started 2-14
icon 3-6
installation options 2-2
installing from 3.5 disks 2-3
installing from the BioLIMS Client
Package 2-9
memory requirements 2-15
new features in version 2.0 1-5
optional configurations 1-4
registration 1-2
related software 1-6
software description 1-3
software installation disks 2-2
system requirements 2-2
AutoUpdating
configuring 3-43
host machine 3-44
new feature of version 2.0 1-5
setting up 3-43
turning off 3-45
B
bars
colored 4-5
half-height 4-5
bases
adding in Electropherogram view 6-10
adding in project window 5-12
changing in Electropherogram view 6-8
changing in Sequence view 6-12
colored bars 4-5
deleting in project window 5-12
deleting in Sequence view 6-12
editing in Electropherogram view 6-8
editing in project window 5-10–5-19
editing in sequence window 6-8–6-12
editing with lower case characters 5-10
half-height bars 4-5
keeping track of edits 5-11
keyboard shortcuts for selecting 5-11
multiple positions in electropherograms 6-9
original data in electropherograms 6-8
replacing in contig 5-16
shifting left or right 5-16
BioLIMS database
accessing BioLIMS database 3-14
adding sequences to a project 3-12
Client Package
custom installation 2-9
installing 2-8
removing 2-10
configuring the server
connection 2-17–2-19
files installed in System folder 2-13
installation option 1-4
installing the Client Package 2-8–2-13
interfaces file 2-17
opening access 3-13
organizing and naming projects 3-4
project file 3-2
removing sequences from a project 3-24
Sequence Chooser window 3-18
displaying the window 3-15
parts of the window 3-16–3-17
using 3-20–3-22
setting up for AutoUpdating 3-43
SybaseConfig control panel 2-18
borders, marking features 6-15
buttons
to display Statistics view 5-21
C
CAP
assembling projects with 3-31
engine parameters 3-31
CAP Remote
assembling projects with 3-31
engine parameters 3-31
installation option 1-5
changing
ambiguity character in the project
window 4-16
assembly constraints 7-6
bases in Electropherogram view 6-8
bases in Sequence view 6-12
feature appearance 6-14
feature range 6-14
insertion character in the project
window 4-16
marking style 6-15
scale of electropherograms in project
window 4-10–4-14
characters
ambiguity 4-5, 4-6
lower case 4-5, 5-11
upper case 4-5, 5-11
using lower case to edit 5-10
chromatogram defined G-1
clipboard 8-11, 8-14, 8-15
cloning the project window 4-19
example 4-20
color
ambiguous bases in consensus 5-3
changing marking style of features 6-14
marking ambiguity 4-5, 4-15, 5-3
marking features 6-14
specifying ambiguity color for project window
views 4-15
color picker 4-15
colored bars 4-5
complementing a contig 5-5
compressed view. see Layout view
consensus sequence
ambiguous bases shown in color 4-2, 5-3
changing color of ambiguous bases 4-15
characters in 4-6
defined G-1
described 4-2, 4-6
editing 5-10
example of replacing bases 5-13–5-14
half-height bars 4-2, 4-5
IUB codes in 4-7
no electropherogram view 8-17
reassembling with new sequences 7-2
window described 8-17
constraining overlaps
procedure 7-6
reassemble to see results 7-7
resetting relationships 7-8
contig
building and saving a consensus 8-16
complementing 5-5
defined G-1
exporting to layout format 8-19
locating ambiguous regions 5-2
names 3-41
printing from the project window 8-11
viewing more than one simultaneously 4-19
Contig list 3-6
defined G-1
Contig Summary
described 8-8
example 8-8
printing 8-10
saving 8-10
viewing 8-9
copying
a sequence from the sequence
window 8-14
project window views to other
programs 8-14
CPU 2-2
creating
files for use with other applications 8-16
graphics from project window 8-14
project 3-6
customizing characters in the Alignment
view 4-12
D
data
editable data G-1
editing valid range used for assembly 5-18
from ABI 373 or ABI PRISM 377 1-3
increasing or decreasing amount used for
assembly 5-18
original data G-3
saving to sequence files 8-4
seeing multiple views in cloned project
window 4-19
showing original on electropherogram 6-8
shown as colored bars 4-5
decreasing amount of data used for
assembly 5-18
defaults
graphic in project window 4-3
marking styles 6-15
deleting
bases in project window 5-12
bases in Sequence view 6-12
deletions
and server assembly algorithm 3-38
diagram
Factura/AutoAssembler data flow 1-7
diamond symbols 3-41
dictionary A-1
Disk Drive 2-2
displaying
complement of a contig 5-5
sequence windows 6-3
double-clicking
to show sequence 6-3
Down-Arrow key 5-12
E
Edit menu
not available in Electropherogram view 6-8
editable data G-1
editing
bases in Electropherogram view 6-8
bases in project window 5-12–5-19
bases in sequence window 6-8–6-12
consensus 5-10
editable data in sample files G-1
example of replacing a range of bases 5-16
features 6-14
keeping track of edits 5-11
marking style 6-15
original data in sample files G-3
protecting sequence from edits 6-4
saving changes 8-4
saving changes in the sequence
window 6-6
specific examples using one situation 5-17,
5-18
using lower case characters 5-10, 5-11
valid range of data for assembly 5-18
what happens on the screen 5-18
Electropherogram view
button to display 6-8
electropherograms
adding bases 6-10
changing scaling in project
window 4-10–4-14
defined G-2
editing 6-8
Electropherogram view 6-8
in sample files only 6-8
multiple base positions 6-9
printing on a color printer 8-12
showing original data 6-8
viewing native (variable) peak spacing 6-2
zooming in project window 4-10–4-14
engine option
assembly using 3-31
enhancing sequence overlaps 7-6
Export command 8-20
exporting
contig to layout format 8-19
defined G-2
sequences to text format 8-19
F
Factura
flowchart 1-7
interrelation with AutoAssembler 1-6
software description 1-3
false overlaps 3-40
Fast Data Finder
and server assembly algorithm 3-36
feature tables
marking features 6-14
Feature view
button to display 6-13
described 6-13
example 6-13
in text, Inherit, new sequences 6-4
marking features 6-14
features
changing appearance 6-14
defined G-2
editing 6-14
marking 6-14
valid range of data for assembly 5-18, 6-13
files
creating files for use with other
applications 8-16
displaying names 4-4
installed by AutoAssembler 2-6
keeping sequence files with project 3-2
moving sequence files with respect to project
file 3-2
removing sequences from a BioLIMS
project 3-24
removing sequences from a project 3-11
saving as text 8-19
finding
Find Again command 4-23
Find command 4-21
IUB codes 4-22
patterns 4-21
selection expressions 4-22
flowchart, Factura/AutoAssembler 1-7
folders
keeping sequence and project files
together 3-2
formatting the sequence list 3-25
forward delete key 5-14
From Files project file 3-2
G
gap characters defined G-2
gaps
changing characters 4-16
deleting from multiple sequences 5-17
replacing in sequences 4-21
Get Info command 2-15
graphics
copying to other programs 8-14
grep 4-22
H
half-height bars 4-5
I
icons
AutoAssembler program 3-6
button for Alignment view 4-2
button for Layout view 4-2
button for Statistics view 4-2
diamond shapes by sequence names 3-41
ID numbers
defined G-2
not reused in project 3-11
identification parameters
defined G-2
importing
command 3-41
engine output files 3-41
Inherit files 3-10, 7-2
sample files 3-10, 7-2
text files 3-10, 7-2
incorrectly assembled repeat regions 7-6
increasing amount of data used for
assembly 5-18
index defined G-2
Inherit Analysis program 8-16
Inherit files
importing 3-10, 7-2
inhibiting sequence overlaps 7-6
insertion character defined G-2
insertions
and assembly algorithm 3-38
changing character 4-16
installing AutoAssembler from 3.5 disks 2-3
interfaces file 2-17
IUB codes
defined G-2
finding 4-22
in consensus sequence 4-7
when zooming in 4-5
IUPAC 4-22, G-2
K
Kececioglu algorithm 3-36
keyboard keys
Arrow keys. see Arrow keys
forward delete key 5-14
Option key. see Option key
Shift key 5-11
keyboard shortcuts
selecting bases or sequence
segments 5-11
L
layout
exporting contig to 8-19
Layout view
defined G-2
described 4-2
example 4-3
Left-Arrow key
moving in electropherograms 6-9
selecting bases 5-11
length defined G-2
link between project and sequence files 3-5
list of sequences in a contig 3-25
local assembly algorithm
advantages 3-30
how to use 3-30
setting minimum overlap and percent
error 3-40
when to use 3-30
lock image 6-4
lower case characters
editing with 5-10, 5-11
half-height bars 4-5
lower pane of project window 3-7
M
Macintosh
system software needed 2-2
manual, user's
about 1-9
conventions used in 1-9
mark style
choosing 6-15
defaults 6-15
defined G-2
memory
allocating more 2-15
suggested memory allocation 2-2
minimum overlap parameter
setting too high or too low 3-40
mismatches
and server assembly algorithm 3-38
missing files 3-4
Monitor 2-2
More checkbox
only with FDF 3-38
table of parameters 3-38
moving sequence file location 3-2
multiple base positions 6-9
multiple views of data in cloned project
window 4-19
multiple windows, arranging 4-18
Myers-Kececioglu model 3-36
N
native spacing of electropherograms 6-2
network parameters 4-12
networked project, organizing 3-3
O
offset defined G-3
Open command 3-7
opening
project 3-6
sequence window 6-3
Operating System 2-2
Option key
for selecting bases 5-11
moving cursor in electropherograms 6-9
opening AutoAssembler program 3-7
organizing
networked project 3-3
project files 3-2
the sequence list 3-27
origin defined G-3
original data
defined G-3
preserved in sample files 6-13
showing in Electropherogram view 6-8
P
panes
lower in project window 3-7
upper right and left of project window 3-7
parameters
identification G-2
Statistics view
Statistics Settings 5-20
patterns
finding in sequences 4-21
peak shape cursor 4-10
peaks
viewing electropherograms with variable
spacing 6-2
percent error parameter
setting too high or too low 3-40
portrait orientation (printed sequence window
views) 8-12
presentations
copying graphics from project window 8-14
copying sequence from sequence
window 8-14
printing sequence window views 8-12
printing the project window 8-11
Print command 8-11
printing
assembly reports 8-10
color printer for electropherograms 8-12
contig 8-11
only one copy 8-7, 8-11
project views 8-11
project window 8-11
sequence window views 8-12
project
adding sequences 3-10
arranging multiple windows 4-18
closing 3-8
complementing contig 5-5
creating 3-6
described 3-2
example of window after assembly 3-40
file types 3-2
files from BioLIMS database 3-12
keeping file with sequence files 3-2
opening 3-6
organizing 3-2
organizing with several project files 3-3
printing a contig 8-11
removing sequences 3-11
removing sequences from BioLIMS
project 3-24
saving 8-2
project folder 3-2
Project Report
described 8-9
example 8-9
printing 8-10
saving 8-10
viewing 8-9
Project Summary
described 8-7
example 8-7
printing 8-10
saving 8-10
viewing 8-9
project window
cloning 4-19
closing 3-8
copying views for presentations 8-11
copying views to other programs 8-14
described 3-6
locating ambiguities 5-2
lower pane 3-7
opening 3-6
sequences in upper right pane 3-7
upper panes 3-7
R
random access memory (RAM) 2-15
range
changing range of a feature 6-14
valid range for assembly. see valid range
re-adding modified sequences 7-4
reassembling
after changing assembly parameters 7-8
after changing constraints 7-6
after changing engine parameters 7-9
after editing 7-5
contig name increments 7-1
lessening the number of sequences included
in an overlap 7-8
resolving incorrectly assembled repeat
regions 7-6
to obtain clean and consistent overlaps 7-5
with changed sequences 7-4
with new sequences 7-2
registering your software 1-2
registration code 3-6
removing
BioLIMS Client Package 2-10
sequences from BioLIMS projects 3-24
sequences from project 3-11
replacing
bases 5-16
bases in consensus 5-13–5-14
gaps 4-21
residues defined G-3
resolving incorrectly assembled repeat
regions 7-6
Right-Arrow key
moving in electropherograms 6-9
selecting bases 5-11
ruler defined G-3
ruler origin defined G-3
S
sample files
consensus sequence saved as 8-17
defined G-3
editable data G-1
importing 3-10, 7-2
original data G-3
see also data; sequences
viewing information from run 6-7
saving
assembly reports 8-10
consensus sequence 8-17, 8-18
modifications made in sequence
window 6-6
project 8-2
to sequence files 8-4
scaling
electropherograms in the project
window 4-10–4-14
selected sequence defined G-3
selecting
bases or sequence segments 5-11
characters 5-11
views in the project window 4-2
selection expressions, tables 4-22
SeqEd program 8-19
Sequence Chooser window (BioLIMS)
closing 3-22
displaying the window 3-15
parts of the window 3-16–3-17
searching database 3-18
using 3-20–3-24
sequence list
changing information in 3-25
defined G-3
fields available 3-25
fields displayed 3-25
formatting 3-25
sorting 3-27
table of fields 3-26
table of sorting options 3-27
viewing options 3-25
Sequence Navigator program 1-8, 8-19
Sequence view
button to display 6-11
described 6-11
example 6-11
in text, Inherit, new sequences 6-4
sequence windows
changing bases 6-12
consensus sequence 8-17
copying sequences to other programs 8-14
deleting bases 6-12
printing 8-12
saving modifications 6-6
views 6-4
sequences
adding to a BioLIMS project 3-12
adding to a From Files project 3-10
adjusting overlaps 7-6
assembling 3-29
assembling sequences with diverse lengths 3-39
changing bases in Sequence view 6-12
closing a project 3-8
constraining overlaps 7-6
copying from the sequence window 8-14
defined G-3
deleting bases in Sequence view 6-12
determining name in project window 4-4
distance when displayed on same line 4-15
editing component sequences versus editing consensus 5-10
editing in Electropherogram view 6-8
editing in Sequence view 6-11
editing valid range 5-18
exporting to text format 8-19
finding patterns 4-21
half-height bars in 4-5
identifying in Layout view 4-4
identifying in project window 3-7
identifying in Statistics view 4-8
keeping with project 3-2
lessening the number included in an overlap 7-8
locking 6-4
moving with respect to project file 3-2
offset defined G-3
orientation and position in Layout view 4-3
protecting from edits 6-4
re-adding modified sequences 7-4
removing ends from valid range 5-18
removing from BioLIMS projects 3-24
removing from project 3-11
resolving incorrectly assembled repeat regions 7-6
sample files defined G-3
saving to sample file 8-4
selected sequence defined G-3
sequence list 3-7
shifting left or right 5-16
showing original 6-8
viewing simultaneously in sequence windows 6-3
server assembly algorithm
based on Myers-Kececioglu model 3-36
hardware-based comparison 3-36
reducing for deletions 3-38
reducing for insertions 3-38
reducing for mismatches 3-38
setting minimum overlap and percent error 3-38
when to use 3-36
Server Option
files installed 2-7
optional configuration 1-5
setting up
assembly constraints 7-6
settings
defined G-3
specifying ambiguity characters 4-16
specifying ambiguity color 4-15
specifying height of electropherograms in project window 4-13
specifying row height for displaying electropherograms in project window 4-13
Shift key 5-11
shifting bases or sequence segments 5-16
shortcuts
for selecting bases 5-11
Show Original command 6-8
software
supplied with AutoAssembler 2-2
to run AutoAssembler 2-2
virus protection 2-3
sorting the sequence list 3-27
Stack command 4-18
stacked windows example 4-19
Statistics view
button to display 5-21
changing parameters 5-20
described 4-2
display legend 4-7
displaying the consensus 4-7
example 4-7
identifying sequences 4-8
locating problem areas to edit 5-21
verifying orientation and redundancy 5-20
styles
defaults for marking features 6-15
summary graphic 4-23, 6-4
defined G-3
SybaseConfig control panel 2-18
where located 2-13
symbols
defined G-4
synchronized electropherograms
changing row height 4-11
character size global to Alignment view 4-10
peak height relative to row height 4-14
scaling horizontally 4-10
scaling vertically 4-11
T
tables
default marking styles 6-15
fields in sequence list 3-26
keyboard shortcuts for selecting bases or sequences 5-11
More checkbox parameters 3-38
selection expressions for Find command 4-22
sequence list sorting options 3-27
technical support duration 1-2
text files
described G-4
exporting to 8-19
importing 3-10, 7-2
Tile command 4-18
example 4-18
U
Unassembled sequence list 3-6, 7-4, 8-8
Up-Arrow key 5-11
updating a sequence file 6-6
upper case characters 4-5, 5-11
upper panes of project window 3-7
defined G-4
user's manual
about 1-9
conventions used in 1-9
V
valid range
determined from Factura features 6-13
editing 5-18
marked green in sequence window 6-4
viewing
assembly reports 8-9
electropherograms with variable peak spacing 6-2
multiple contigs 4-19
multiple views of data 4-19
the sequence list 3-25
views
copying project window views to other programs 8-14
defined G-4
printing project window 8-11
printing sequence window views 8-12
sequence window views 6-4
using for presentations 8-11
virus protection 2-3
volume (computer) 3-3
W
windows
arranging 4-18
cloning the project window 4-19
stacking 4-18
tiling 4-18
word processing
copying graphics from project window 8-14
Wrap checkbox 4-22
Z
zooming
between project window views 4-5
electropherograms in the project window 4-10–4-14
to change project window views 4-2
Worldwide Sales Offices
Applied Biosystems' vast distribution and
service network, composed of highly trained
support and applications personnel, reaches
into 150 countries on six continents. For
international office locations, please call our
local office or refer to our web site at
www.appliedbiosystems.com.
Headquarters
850 Lincoln Centre Drive
Foster City, CA 94404 USA
Phone: +1 650.638.5800
Toll Free: +1 800.345.5224
Fax: +1 650.638.5884
Technical Support
For technical support:
Toll Free: +1 800.831.6844 ext 23
Fax: +1 650.638.5891
www.appliedbiosystems.com
PE Corporation is committed to providing
the world’s leading technology and
information for life scientists. PE Corporation
consists of the Applied Biosystems and
Celera Genomics businesses.
Printed in the USA, 09/2000
Part Number 904947B