bwtool: A tool for bigWig files

Andy Pohl and Miguel Beato
Centre de Regulació Genòmica (CRG) i UPF, Dr. Aiguader 88, 08003 Barcelona

Abstract

BigWig files [1] are a compressed, indexed, binary format for genome-wide, per-base signal data from a variety of experiments, e.g. ChIP-seq read depth or GC percent. bwtool is designed to read bigWig files rapidly and efficiently, and provides functionality for extracting data and summarizing it in several ways, either globally or at specific regions. The tool also converts the positions of signal data from one genome assembly to another, a process known as "lifting". We believe bwtool can be very useful for analysts who frequently work with bigWig data, which is becoming a standard format for representing functional signals along genomes.

Operations

bwtool's functionality is subdivided into subprograms that fall roughly into three categories: data extraction, analysis, and data modification.

Data Extraction
matrix: Given m regions, makes an m x n matrix around defined center points from the regions.
paste: Aligns multiple bigWigs base by base and outputs them line by line.
window: Outputs data in a sliding window, one slide per line.
extract: Outputs data in irregularly sized intervals.
random: Outputs intervals of a defined size from random loci in the bigWig.
sax: Discretizes bigWig data and outputs it as a mock FASTA file.

Analysis
aggregate: Similar in usage to matrix, but oriented around creating plots of averaged bigWig profiles around regions of interest.
chromgraph: Makes a file suitable for visualization by UCSC Genome Graphs.
distribution: Counts the occurrences of values in the data.
find: Finds regions of the bigWig based on thresholds or local extrema.
summary: Provides summary statistics for given loci in the bigWig.

Data Modification
fill: Fills missing regions in the bigWig with a desired value.
remove: Creates missing regions in the bigWig based on a given threshold or a file that masks specific regions.
shift: Moves data along the chromosomes by a given amount and direction.
lift: Maps bigWig data onto another genome's coordinates using a special alignment file called a liftOver chain.

Examples

To highlight the aggregate operation, we produced plot data from ENCODE [4] bigWigs containing various ChIP-seq data from MCF7 cells and plotted the signal around GENCODE gene TSSs (cromatina.crg.cat/bwtool/ex1.html).

Another example uses the chromgraph operation with schnurri ChIP data from the Berkeley Drosophila Transcription Network Project (cromatina.crg.cat/bwtool/ex2.html).

Availability

bwtool is open-source software, licensed under the GPL v3. Version control is hosted by github.com, where contributions to the software may also be made.

References

[1] Kent, WJ. et al., Bioinformatics, 2010 (26)17: pp. 2204-2207.
[2] Shin, H. et al., Bioinformatics, 2009 (25)19: pp. 2605-2606.
[3] Harrow, J. et al., Genome Research, 2012 (22): pp. 1760-1774.
[4] ENCODE Project Consortium, PLoS Biology, 2011 (9)4: p. e1001046.

Performance

Formal benchmarking and comparison with other software are complicated by the dearth of available programs with bigWig functionality and by the variety of operations bwtool provides. As an anecdote, however, we ran CEAS [2] using WIG files generated from a human Pol2 ChIP-seq experiment against 20,318 protein-coding genes from GENCODE v18 [3]. The run took around two days and produced a “metagene” plot along with several other plots of varying usefulness. bwtool aggregate was then run on the same genes with a bigWig version of the WIG data, and it finished creating the plot data in under 4 min on the same machine. Granted, bwtool calculated the data for only a single plot, and it took a few more minutes to make the plot in R, but we nevertheless tend to save a lot of time in situations like this.
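
As a rough illustration of the kind of calculation behind an aggregate-style plot, the sketch below averages per-base signal column by column in a fixed window around a set of center points. It is a conceptual Python sketch only, not bwtool's implementation: the function name `aggregate_profile`, the dictionary-based signal, and the toy values are all hypothetical, and treating absent positions as missing data (rather than zero) is an assumption made for the example.

```python
from typing import Dict, List, Optional


def aggregate_profile(signal: Dict[int, float], centers: List[int],
                      left: int, right: int) -> List[Optional[float]]:
    """Average signal at each offset in [-left, right) around the centers.

    `signal` maps a 0-based position to its value. Positions absent from
    the mapping are treated as missing data and skipped, not counted as 0.
    """
    width = left + right
    sums = [0.0] * width
    counts = [0] * width
    for center in centers:
        for col, offset in enumerate(range(-left, right)):
            value = signal.get(center + offset)
            if value is not None:
                sums[col] += value
                counts[col] += 1
    # Columns with no data at all are reported as None.
    return [s / n if n else None for s, n in zip(sums, counts)]


# Toy per-base signal (hypothetical values); position 18 is missing.
signal = {8: 1.0, 9: 2.0, 10: 6.0, 11: 2.0, 19: 3.0, 20: 8.0, 21: 4.0}
profile = aggregate_profile(signal, centers=[10, 20], left=2, right=2)
print(profile)  # [1.0, 2.5, 7.0, 3.0]
```

The resulting list holds one mean per offset from the center, which is the shape of data one would then hand to a plotting tool such as R.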
Acknowledgements

Thanks to Daniel Soronellas, João Curado, Alessandra Breschi, Roderic Guigó, Jakob Skou Pedersen, Brian Raney, and Jim Kent for testing the program and providing feedback and advice prior to release.

More Information

http://cromatina.crg.cat/bwtool
If you have questions, you can find me in attendance or e-mail [email protected]