BIOINFORMATICS APPLICATIONS NOTE
Vol. 23 no. 21 2007, pages 2945–2946
doi:10.1093/bioinformatics/btm455

Genome analysis

Idiographica: a general-purpose web application to build idiograms on-demand for human, mouse and rat

Taishin Kin1,* and Yukiteru Ono2
1Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan and 2Information and Mathematical Science Laboratory, Inc., 1-5-21 Meikei Building, Otsuka, Bunkyo-ku, Tokyo 112-0012, Japan
*To whom correspondence should be addressed.

Received on May 22, 2007; revised on August 6, 2007; accepted on August 29, 2007
Advance Access publication September 24, 2007
Associate Editor: Chris Stoeckert

© The Author 2007. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]

ABSTRACT
Summary: We have launched a web server that serves as a general-purpose idiogram rendering service. Through an easy-to-use interface, it allows users to generate high-quality idiograms with custom annotation from their own genome-wide mapping/annotation data. The generated idiograms are suitable not only for visualizing summaries of genome-wide analysis but also for many types of presentation material, including web pages, conference posters and oral presentations.
Availability: Idiographica is freely available at http://www.ncrna.org/idiographica/
Contact: [email protected]

1 INTRODUCTION
An idiogram is a diagram of the chromosomes showing varieties of cytogenetic bands such as C (centromere), G (Giemsa), R (reverse), Q (quinacrine) and N (nucleus) bands. However, idiograms are not limited to representing these cytogenetic bands. Idiograms appear regularly in journal articles, and especially in web interfaces, to visualize varieties of genome-wide information such as genomic distributions of genes and their associated elements (Hubbard et al., 2007; Imanishi et al., 2004; Kuhn et al., 2007; Wheeler et al., 2007). As these examples indicate, the application of idiograms has diversified beyond its origin.

In the era of genome-wide analysis, an effective method to visualize genome-wide information is needed for daily research activity, and idiograms fulfill that need. However, there is neither software nor a web server that allows users to build their own custom idiograms. Therefore, we developed Idiographica, a web server that allows users to build customized high-quality idiograms from their own data by uploading a description file. The generated idiogram is a high-quality image that is suitable for poster presentation, projector screen presentation and journal publication. A user can use generated idiograms without any obligation or restriction.

2 IDIOGRAPHICA SERVER
The Idiographica server is available at http://www.ncrna.org/idiographica/. A user fills out the web form presented on the Idiographica page and then clicks the submit button to send a request to the server. A user is required to supply an email address; this is the only mandatory item to generate an idiogram. The other, optional items are Species, Chromosome, Background, Annotation and Figure Configuration, each of which has a default value. Therefore, no additional operation is needed to generate a simple idiogram, and a user can utilize these options to enrich their custom idiogram. Details of the options follow.

Species is an option to choose the organism that the custom idiogram is based on. Currently, human (hg17 and hg18), mouse (mm8) and rat (rn4) are available. The Chromosome option allows a user to choose which chromosomes to render.

Background is an option to select a band type to render, which includes G-band, GC content, repeat element density, known gene density, mRNA density and EST density. The sources of these data are several database tables of the UCSC Genome Browser. G-band data is from the cytoBand table. Repeat element density represents the number of repeat elements (Repetitive Sequence Region track) per 1 Mb. Two types of densities are defined for known genes, mRNAs and ESTs. The first is defined as the number of items per 1 Mb. The second is defined as the ratio of genomic bases belonging to known genes, mRNAs or ESTs per 1 Mb. The data sources for known genes, mRNAs and ESTs are the Known Genes track, the Human/Mouse/Rat mRNAs track and the Human/Mouse/Rat ESTs track, respectively.

If a user needs to put their own genomic annotation information onto the idiogram, he/she can upload a description file, which is a tab-delimited text file, to the Annotation field. The description file can contain a title (an arbitrary string that appears on top of the idiogram), a legend (text that relates a color/symbol to a category) and mapping information (a set of lines, each relating a genomic position to an annotation with cosmetic preferences such as font size and text color). The description file should follow a simple format that is described at the website.

Figure Configuration specifies the size, format, orientation and annotation of the generated image. The size option ranges from B5 (smallest) to B0 (largest). The format option allows PNG or PDF. The orientation option specifies which page orientation to use (vertical or horizontal). The annotation option turns the visibility of mapping information name labels on or off. The 3D shading option turns the visual cosmetics used to render chromosomes on or off.

The Idiographica server sends a request to its job queue as soon as the request is submitted by a remote user. After the request is processed and an idiogram is generated, the server sends the user an email to notify completion of the request. The email presents a URL to access the custom idiogram on the server. The generated idiogram is scheduled to be erased 24 h after its creation. Therefore, a user needs to download the idiogram to his/her local computer before its deletion. A sample idiogram generated with the Idiographica server is shown in Figure 1.

Fig. 1. A sample idiogram generated with our Idiographica server, which shows the genome-wide distribution of G-protein coupled receptors in the human genome. The data are imported from the SEVENS database (Ono et al., 2005).

3 RENDERING SCHEME
The rendering convention for chromosomes is designed after the scheme used in the UCSC Genome Browser (Kuhn et al., 2007), where telomeres and their proximal regions are rendered as curved shapes and centromeres are rendered as wedged shapes. The surface of a chromosome can be mildly shaded according to the user's preference. Chromosomes are aligned from left to right. G-band annotation data and centromere/telomere information for human, mouse and rat are obtained from the Genome Browser database.

Idiographica renders chromosomes to fit variable page sizes. For the current web server, the largest admissible page size is B0 (1456 × 1030 mm) at a fixed resolution of 200 dpi, where the pixel size of the page is 11 464 × 8110 and the rendering size of human chromosome 1 (245 522 847 bp) is 235 × 6114 pixels, which means that 40 kb is represented as a single pixel. At this resolution, the whole human genome (3 076 781 887 bp for hg17) is represented with 76 920 total pixels. The maximum number of annotations is limited to 1 annotation per 4 pixels, or 1 annotation per 160 kb. Therefore, the maximum number of annotations for the entire human genome is 17 764.

We use the Cairo graphic library (http://cairographics.org/) for our rendering engine. Rendering takes up to 3.5 min for a B0 human idiogram with the maximum amount of mapping information and a known gene density background.
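The B0 figures quoted for the rendering scheme can be checked with a few lines of arithmetic. The following Python sketch is only an illustration and is not taken from the server: the function names are hypothetical, and truncating fractional pixels is an assumption, chosen because it reproduces the quoted 11 464 × 8110 page size. Dividing the 76 920 genome pixels by four gives about 19 230 rather than the quoted 17 764 annotations, so the published limit presumably depends on details that the text does not state.

```python
import math

MM_PER_INCH = 25.4
BP_PER_PIXEL = 40_000       # 40 kb per pixel, as quoted for B0 at 200 dpi
PIXELS_PER_ANNOTATION = 4   # at most 1 annotation per 4 pixels (160 kb)


def page_pixels(width_mm, height_mm, dpi=200):
    """Page size in pixels; fractional pixels are truncated (assumption)."""
    return (int(width_mm / MM_PER_INCH * dpi),
            int(height_mm / MM_PER_INCH * dpi))


def span_pixels(length_bp, bp_per_pixel=BP_PER_PIXEL):
    """Pixels needed to draw a sequence; a partial pixel is rounded up."""
    return math.ceil(length_bp / bp_per_pixel)


def annotation_bound(length_bp):
    """Naive upper bound on annotations: one per 4 pixels of sequence."""
    return span_pixels(length_bp) // PIXELS_PER_ANNOTATION


if __name__ == "__main__":
    print(page_pixels(1456, 1030))          # B0 page in pixels
    print(span_pixels(3_076_781_887))       # hg17 genome length in pixels
    print(annotation_bound(3_076_781_887))  # naive annotation limit
```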
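The two density definitions used for the Background option (items per 1 Mb, and the fraction of bases covered per 1 Mb) can be illustrated as follows. This Python sketch is not the server's code. The function names, the 0-based half-open interval convention, the rule that an item belongs to the window containing its start, and the merging of overlapping items before bases are counted are all assumptions made for the example.

```python
WINDOW = 1_000_000  # 1 Mb window, as in the density definitions


def count_density(intervals, chrom_length, window=WINDOW):
    """Number of items per window.

    Each item is assigned to the window containing its start (assumption).
    Intervals are (start, end) pairs lying within the chromosome.
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = [0] * n_windows
    for start, _end in intervals:
        counts[start // window] += 1
    return counts


def coverage_density(intervals, chrom_length, window=WINDOW):
    """Fraction of bases in each window covered by at least one item.

    Intervals are half-open [start, end); overlapping or touching
    intervals are merged first so that no base is counted twice.
    """
    n_windows = -(-chrom_length // window)
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    covered = [0] * n_windows
    for start, end in merged:
        pos = start
        while pos < end:  # split the interval at window boundaries
            w = pos // window
            stop = min(end, (w + 1) * window)
            covered[w] += stop - pos
            pos = stop

    # The last window may be shorter than `window`.
    sizes = [min(window, chrom_length - i * window) for i in range(n_windows)]
    return [c / s for c, s in zip(covered, sizes)]
```

For a 2.5 Mb toy chromosome with items at [100 000, 300 000), [200 000, 400 000) and [900 000, 1 100 000), the count density is [3, 0, 0] and the coverage density is [0.4, 0.1, 0.0].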
ACKNOWLEDGEMENTS
We thank Dr Martin Frith for his generous help in improving our manuscript. This work is partially supported by the Functional RNA Project funded by the New Energy and Industrial Technology Development Organization (NEDO).

Conflict of Interest: none declared.

REFERENCES
Hubbard,T.J.P. et al. (2007) Ensembl 2007. Nucleic Acids Res., 35, D610–D617.
Imanishi,T. et al. (2004) Integrative annotation of 21,037 human genes validated by full-length cDNA clones. PLoS Biol., 2, 856–875.
Kuhn,R.M. et al. (2007) The UCSC genome browser database: update 2007. Nucleic Acids Res., 35, D668–D673.
Ono,Y. et al. (2005) Automatic gene collection system for genome-scale overview of G-protein coupled receptors in eukaryotes. Gene, 30, 63–73.
Wheeler,D.L. et al. (2007) Database resources of the national center for biotechnology information. Nucleic Acids Res., 35, D5–D12.