Pearson's chi-squared goodness-of-fit test is used to assess whether the RefSeq gene and custom genomic features analysis results differ from what would be expected by chance. The proximity of 100,000 randomly generated integration sites to the set of RefSeq genes and custom genomic features is used to derive the expected values for these tests (a link to this file can be found under ‘Test data’ on the VISA home page). To find the expected number of integrations for each column, the expected ratio (the fraction of the random sites in that category that fall in that column) is determined and then multiplied by the total observed number for that category. Yates's correction is applied when appropriate. The degrees of freedom and chi-square statistic are used to look up the p-value range for each chi-squared test. More specific steps for these analyses are demonstrated by the hypothetical VISA results below.

RefSeq genes results

                                                         Number of observed  Number of expected
Unique RIS                                                             1083              100000

Within or outside of genes                               Number of observed  Number of expected
Within genes                                                            307               44632
Outside of genes                                                        776               55368

Within genes                                             Number of observed  Number of expected
In first eighth of gene                                                  47                5673
In second eighth of gene                                                 37                5485
In third eighth of gene                                                  30                5592
In fourth eighth of gene                                                 48                5579
In fifth eighth of gene                                                  44                5726
In sixth eighth of gene                                                  32                5572
In seventh eighth of gene                                                35                5399
In eighth eighth of gene                                                 34                5606

Outside of genes                                         Number of observed  Number of expected
Within 5 kb upstream of the start of the closest gene                    45                2731
Within 5 kb downstream of the end of the closest gene                    21                2748
More than 5 kb away from the closest gene                               710               49889

Promoter distance                                        Number of observed  Number of expected
Upstream > 50kb                                                         238               23879
Upstream 10-50kb                                                        106               13600
Upstream 5-10kb                                                          24                3048
Upstream 2.5-5kb                                                         24                1804
Upstream 1-2.5kb                                                         12                1107
Upstream 1kb                                                             13                 788
Downstream 1kb                                                            7                 807
Downstream 1-2.5kb                                                        8                1273
Downstream 2.5-5kb                                                       19                2018
Downstream 5-10kb                                                        30                3459
Downstream 10-50kb                                                      127               17036
Downstream > 50kb                                                       475               31181

Chi-squared tests

Within or outside of genes
*Note: All unique integration sites are counted when calculating the expected values in this test.
                       Expected ratio   Observed number   Expected number   (O - E)² / E
Within gene                     0.446               307           483.365         64.350
Outside of gene                 0.554               776           599.635         51.872
Total                                             1083*                          116.222

Chi-square statistic = 116.222
Degrees of freedom = 1
p-value < 0.0001

Within genes
*Note: Only integration sites within genes are counted when calculating the expected values in this test.

                       Expected ratio   Observed number   Expected number   (O - E)² / E
1st eighth                      0.127                47            39.022          1.631
2nd eighth                      0.123                37            37.728          0.014
3rd eighth                      0.125                30            38.464          1.863
4th eighth                      0.125                48            38.375          2.414
5th eighth                      0.128                44            39.386           0.54
6th eighth                      0.125                32            38.327          1.044
7th eighth                      0.121                35            37.137          0.123
8th eighth                      0.126                34            38.561          0.539
Total                                              307*                            8.168

Chi-square statistic = 8.168
Degrees of freedom = 7
p-value > 0.05

Outside of genes
*Note: Only integration sites outside of genes are counted when calculating the expected values in this test.

                       Expected ratio   Observed number   Expected number   (O - E)² / E
< 5 kb upstream                 0.049                45            38.276          1.181
< 5 kb downstream               0.050                21            38.514          7.964
> 5 kb from gene                0.901               710            699.21          0.167
Total                                              776*                            9.312

Chi-square statistic = 9.312
Degrees of freedom = 2
p-value < 0.01

Promoter distance
*Note: All unique integration sites are counted when calculating the expected values in this test.
                       Expected ratio   Observed number   Expected number   (O - E)² / E
Upstream > 50kb                 0.239               238           258.837          1.642
Upstream 10-50kb                0.136               106           147.288         11.574
Upstream 5-10kb                 0.030                24            32.490          2.459
Upstream 2.5-5kb                0.018                24            19.494          1.019
Upstream 1-2.5kb                0.011                12            11.913          0.000
Upstream 1kb                    0.008                13             8.664          2.337
Downstream 1kb                  0.008                 7             8.664          0.346
Downstream 1-2.5kb              0.013                 8            14.079          2.429
Downstream 2.5-5kb              0.020                19            21.660          0.373
Downstream 5-10kb               0.035                30            37.905          1.486
Downstream 10-50kb              0.170               127            184.11         17.920
Downstream > 50kb               0.312               475           337.896         55.832
Total                                             1083*                           97.418

Chi-square statistic = 97.418
Degrees of freedom = 11
p-value < 0.0001

Note: The expected numbers listed in this table correspond to the rounded expected ratios multiplied by 1083, while the (O - E)² / E values appear to be computed from the unrounded ratios (for example, 17036/100000 × 1083 = 184.500 for Downstream 10-50kb, which gives 17.920).

Custom genomic features

The chi-squared tests for the custom genomic features analysis results are performed using the same strategy as demonstrated above. All unique integration sites are counted when calculating the expected values for the chi-squared tests for the custom genomic features categories.
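The worked examples above can be reproduced with a short Python sketch. This is an illustration under stated assumptions, not VISA's own code: the function names `chi2_sf` and `goodness_of_fit` are hypothetical, and instead of looking up a p-value range in a printed table the sketch computes the upper-tail probability directly from the standard closed form for integer degrees of freedom. The `yates` flag applies the usual continuity correction (subtracting 0.5 from |O - E|); the source does not state when VISA considers the correction appropriate, so it is off by default here, which matches the hypothetical results shown.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-squared variable, integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        return math.exp(-half) * sum(
            half ** i / math.factorial(i) for i in range(df // 2)
        )
    # Odd df: erfc(sqrt(x/2)) plus a finite sum over half-integer powers of x/2
    tail = math.erfc(math.sqrt(half))
    tail += math.exp(-half) * sum(
        half ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (df - 1) // 2 + 1)
    )
    return tail

def goodness_of_fit(observed, random_counts, yates=False):
    """Return (chi-square statistic, degrees of freedom, p-value).

    observed      -- observed integration counts per column
    random_counts -- counts of the randomly generated sites per column
    yates         -- assumption: apply the 0.5 continuity correction if True
    """
    total_obs = sum(observed)
    total_rand = sum(random_counts)
    stat = 0.0
    for obs, rand in zip(observed, random_counts):
        # Expected number = expected ratio * total observed for the category
        expected = total_obs * rand / total_rand
        diff = abs(obs - expected)
        if yates:
            diff = max(diff - 0.5, 0.0)  # clamp so the correction never overshoots
        stat += diff ** 2 / expected
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)

# Observed and random-site counts taken from the hypothetical results above
tests = {
    "Within or outside of genes": ([307, 776], [44632, 55368]),
    "Within genes": (
        [47, 37, 30, 48, 44, 32, 35, 34],
        [5673, 5485, 5592, 5579, 5726, 5572, 5399, 5606],
    ),
    "Outside of genes": ([45, 21, 710], [2731, 2748, 49889]),
    "Promoter distance": (
        [238, 106, 24, 24, 12, 13, 7, 8, 19, 30, 127, 475],
        [23879, 13600, 3048, 1804, 1107, 788, 807, 1273, 2018, 3459, 17036, 31181],
    ),
}

results = {name: goodness_of_fit(obs, rand) for name, (obs, rand) in tests.items()}
for name, (stat, df, p) in results.items():
    print(f"{name}: chi-square = {stat:.3f}, df = {df}, p = {p:.3g}")
```

Each test passes only the counts for its own category, so the expected ratios are normalised within that category, as described in the notes under each table (for example, only the 44632 random sites within genes are used for the "Within genes" test).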