When investigating protein expression changes between experimental conditions, the quality of the input data largely determines how robust the analysis is and how confidently true changes in spot patterns and expression levels can be measured. For this reason Progenesis PG240 provides a range of data quality measures. They fall into two broad categories: INCA-related measures, and measures derived by applying a bootstrapping methodology to the data.

What is INCA?

The Intelligent Noise Correction Algorithm (INCA) specifically identifies and quantifies noise within a 2D gel image. INCA is highly discriminatory and can distinguish true signal from high levels of noise. Noise can vary considerably across an image, so a ‘one size fits all’ approach to the noise level is not enough; instead, each pixel is statistically assessed and identified as either true signal or non-specific noise.

INCA identifies two types of noise. The first is low-level, Gaussian-distributed sensor noise, generated by any capture device that uses an electrical current. The second is the more visually obvious random noise, for example speckling from the crystallisation of certain stains, dust particles, and the edges of tears. Noise as defined here is not to be confused with the image's general background intensity caused by the staining technique employed; this is handled in a separate ‘background subtraction’ operation.

Alternative methods for noise removal, such as median and averaging masks and blurring filters, assume that noise spikes are high frequency, so a low-pass filter is used. This can distort the spot: even a clean gel processed with a median filter can end up with distorted spots and hence altered results. In contrast, INCA does not affect a spot that is noise free; the spot will not be altered in any way.
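The distortion that a median mask introduces on clean data can be illustrated with a small sketch. This is not INCA or Progenesis PG240 code: the synthetic Gaussian spot, the 3 x 3 mask, and the function names `gaussian_spot` and `median_filter` are assumptions chosen purely for illustration.

```python
import math
import statistics


def gaussian_spot(size=15, sigma=1.0, peak=1000.0):
    """Build a noise-free, symmetric synthetic spot as a 2D list (assumed shape)."""
    c = size // 2
    return [
        [peak * math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]


def median_filter(image, radius=1):
    """Apply a square median mask of side 2*radius+1, clamping at the image edges."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            row.append(statistics.median(window))
        out.append(row)
    return out


spot = gaussian_spot()           # contains no noise at all
filtered = median_filter(spot)   # generic 3 x 3 median mask

c = len(spot) // 2
peak_before = spot[c][c]
peak_after = filtered[c][c]
volume_before = sum(map(sum, spot))
volume_after = sum(map(sum, filtered))

print(f"peak:   {peak_before:.1f} -> {peak_after:.1f}")
print(f"volume: {volume_before:.1f} -> {volume_after:.1f}")
```

Although the input contains no noise, the filtered peak is lower than the original, because the median of each window replaces the centre pixel regardless of whether that pixel was noisy.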
Spot detection occurs after the image has been INCA processed, so it uses the corrected image and yields better-quality spot detection. Quantitation of spot material, however, is performed on both the original raw image file and the corrected image file, so the software presents INCA-corrected data alongside the uncorrected data. Importantly, it is left to the user to decide whether to use the corrected data.

Because the noise component is quantified in the INCA process, an additional level of data (the noise) is associated with each detected spot. Various fields reporting the noise are available within the software; they help to identify (and, if required, remove) noisy spots within the data set. For example, INCA volume / noise expresses the INCA-corrected volume as a ratio to the noise associated with that spot. A good-quality spot should have a large INCA volume / noise value. This measurement is particularly useful when filtering spots on images in order to remove spots consisting primarily of noise.

This ability to separate the true signal level from the noise component of a spot offers a powerful advantage when investigating expression changes. Studying expression changes between gels containing noisy data could lead to inaccurate expression changes being identified. Unless removed, any noise spikes present within protein spots contribute to the overall signal intensity, and hence the volume, of those spots. This is particularly problematic with low-level material. Because INCA removes such noise from the spot material, expression differences based on INCA measurements will reflect the true spot data more accurately.

Nonlinear Dynamics Group
[email protected] | www.nonlinear.com
Nonlinear Dynamics Ltd | Cuthbert House | All Saints | Newcastle upon Tyne | NE1 2ET | UK | tel: +44 (0)191 230 2121 | fax: +44 (0)191 230 2131
Nonlinear USA Inc. | 4819 Emperor Blvd | Suite 400 | Durham | NC 27703 | toll free: 1-866 GELS USA | tel: 919 313 4556 | fax: 919 313 4505

Within the software, the result of INCA can be visualised in a number of ways. Noise can be viewed directly in the 3D window, as shown in Figure 1.

Figure 1. 3D view showing INCA-identified noise in red.
A. The spot in view (highlighted in green) has only a small proportion of noise present compared to its overall signal. Spot volume = 23488668, INCA volume = 23411269, INCA volume/noise = 62.795.
B. This spot has a considerable proportion of its volume attributed to non-specific noise. Spot volume = 122528, INCA volume = 72706, INCA volume/noise = 1.384.
C. The same spot as that shown in B, but after noise has been removed.

Alternatively, the INCA-corrected data fields and the noise-associated fields can be displayed in the data tables within the Progenesis PG240 software.

Data Quality Measurements

While INCA allows the user to correct the spot data for noise levels, the Progenesis PG240 software provides additional tools to assess the quality of the detected spots directly. This is done through the Statistics Fields available in the Measurement and Comparison Tables. Data quality is measured by applying a bootstrap method of re-sampling to the spot data. The bootstrap statistics fields available are bootstrap Volume, bootstrap Error and bootstrap Coefficient of Variation (CV).

Figure 2. Bootstrap fields used for data quality measurements. Access these via the field selection tabs of the data tables.

Nonlinear Technical Note – Data Quality

Bootstrapping

Bootstrapping (or re-sampling) is used to calculate confidence limits for a given measurement. If we assume that a given set of measurements is subject to some degree of noise, then bootstrapping allows you to quantify the errors and add confidence to the measure. The procedure involves taking a subset of the measured values and deriving some property from these.
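The general re-sampling idea can be sketched in a few lines. This is an illustrative sketch only, not the Progenesis PG240 implementation: the function name `resample_summary`, the subset size, the number of rounds and the two example data sets are all assumptions, and the derived property is simply the mean of each subset. Subsets are drawn without replacement to follow the description above; the classical statistical bootstrap instead draws samples with replacement.

```python
import random
import statistics


def resample_summary(values, subset_size, rounds=1000, seed=0):
    """Repeatedly derive a property (here: the mean) from random subsets.

    Returns the mean of the derived values, their standard deviation
    (the 'error'), and the coefficient of variation as a percentage.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    estimates = []
    for _ in range(rounds):
        subset = rng.sample(values, subset_size)  # subset without replacement
        estimates.append(statistics.mean(subset))
    centre = statistics.mean(estimates)
    error = statistics.stdev(estimates)           # one standard deviation
    cv = 100 * error / centre                     # standard CV definition, in percent
    return centre, error, cv


# Hypothetical measurements: both sets average 100, but one is far noisier.
clean = [100.0 + d for d in (-1, 0, 1, -2, 2, 0, 1, -1, 0, 0)]
noisy = [100.0 + d for d in (-60, 5, 80, -40, 30, -75, 90, -20, 10, -20)]

clean_mean, clean_error, clean_cv = resample_summary(clean, subset_size=5)
noisy_mean, noisy_error, noisy_cv = resample_summary(noisy, subset_size=5)

print(f"clean: mean={clean_mean:.2f} error={clean_error:.2f} CV={clean_cv:.2f}%")
print(f"noisy: mean={noisy_mean:.2f} error={noisy_error:.2f} CV={noisy_cv:.2f}%")
```

Both data sets give a similar central estimate, but the noisier one produces a much wider spread of derived values and therefore a larger error and CV.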
For example, a study may assume there is a linear relationship between two variables. In this scenario you would randomly select a subset of values and fit a line. You then repeat this process and build a distribution of fitted lines. The result is a most likely fitted line, together with other lines that you can be confident bracket the actual value (usually those at 3 standard deviations of the fitted line distribution).

The bootstrapping method used in Progenesis PG240 adapts this procedure by choosing a subset of pixels from a spot and fitting a surface through them, from which a spot volume is calculated. Multiple surface fits are generated by sequential rounds of selecting random pixel points across the spot surface. All the generated surface fits are then combined to yield a mean surface fit with an associated volume (the bootstrap volume for that spot) and a bootstrap error, corresponding to one standard deviation from the mean surface fit. The bootstrap CV is then calculated as:

bootstrap CV = 100 x bootstrap error / bootstrap volume

Figure 3. Generalised curve representing the distribution of surface fits calculated for a spot (x-axis: bootstrap volume; y-axis: frequency; annotations: mean bootstrap volume; 3 standard deviations = 3 x bootstrap error).

A relatively noise-free spot will have a bootstrap volume very similar to its INCA-corrected volume (and its uncorrected volume), and a very small bootstrap error, because the surface fits all cluster tightly around the mean bootstrap volume. Importantly, its bootstrap CV should also be small, as the error is very small in proportion to the mean bootstrap volume. This is shown in Figure 4.

Figure 4. Plot showing a relatively noise-free spot, where the surface fits are tightly clustered around the mean bootstrap volume and the bootstrap error is quite small (x-axis: bootstrap volume; y-axis: frequency; annotation: 3 standard deviations = 3 x bootstrap error).
The spot shown in the 3D view has volume = 5963678, INCA volume = 5921973, bootstrap volume = 5950442, bootstrap CV = 0.225.

A noisy spot, however, will most likely generate a mean bootstrap volume significantly different from its INCA volume, because any surface fit that includes the noisy pixels yields a volume that differs significantly from the INCA volume. The multiple surface fits that are calculated, many of which may include noisy pixels, result in a mean bootstrap volume with a larger associated error, reflecting the wider range of fits generated through the inclusion of noisier data. The bootstrap CV will therefore also be larger, as the bootstrap error is greater in proportion to the bootstrap volume. This is illustrated in Figure 5.

Figure 5. Handling of a spot associated with a reasonable level of noise (plot axes: bootstrap volume against frequency; annotation: 3 standard deviations = 3 x bootstrap error).
A. A wider range of surface fits is generated when data from a noisy spot is analysed by bootstrapping, reflected by a larger bootstrap error.
B. View of a noisy spot in 3D. Spot volume = 404807, INCA volume = 275621, bootstrap volume = 142261, bootstrap CV = 147.
C. The same spot as that shown in B, but after noise has been removed.

Using Bootstrapping Values to Explore Data Quality

Bootstrap CV is a useful measure for identifying particularly noisy spots. It is powerful because the CV expresses the bootstrap error as a proportion of the bootstrap volume. This is a more reliable measure of data quality than bootstrap error alone, as the significance of a large error depends on how large the measured parameter is. There are a number of direct methods available to assess data quality using bootstrapping results in Progenesis PG240.

1. Tables
Display the bootstrap CV field in the Measurements or Comparison table; the table can then be sorted on these values simply by clicking on the bootstrap CV column header. Spots with the largest CVs, which therefore have a large bootstrap error in comparison to their bootstrap volume, will be at the top of the table (see Figure 6).

Figure 6. The spot selected in the table has a large bootstrap CV. This is confirmed by inspecting the 3D view, which shows that the spot has a large amount of associated noise.

Once the data has been sorted in this way, spots with large bootstrap CVs can be quickly removed from the analysis, if required, by filtering the table from within the Spot Filtering mode.

2. Histograms
If bootstrap volume is selected for viewing in the Histogram window and standard deviation is selected as the Error type, the error shown by default is the bootstrap error. If preferred, 3 standard deviations can be selected instead. If the Error type ‘Coefficient of Variation’ is selected, the value shown is the bootstrap CV. The error bars within the histogram provide a visual means of determining whether spots differ from one another, or whether the absolute difference in the measurement parameter is negated by the level of noise associated with that spot measurement.

Figure 7. Histogram view showing bootstrap volume and bootstrap error. The size of the error bars suggests that this spot is not particularly noisy.

If you have questions about a product or need general information, please contact:
biostep GmbH
Meinersdorfer 47a
09387 Jahnsdorf
Germany
phone: +49 3721 3905-0
fax: +49 3721 3905-28
email: [email protected]
web: www.biostep.de