Duplicated regions, which are important for NAHR-mediated CNV formation, are represented on common oligonucleotide platforms. We observed that such regions require special attention because of possible cross-hybridization problems. Probes that align to many locations in the human genome can be difficult to interpret, since a detectable variation in probe log2 intensity ratio may reflect copy number variation at one or several indistinguishable loci. Additionally, if a significant proportion of a probe's raw signal comes from cross-hybridization to off-target sequences, the probe's response to copy number changes at its intended target is diminished. As a quality control step prior to analyzing genomic data generated using the Genome-Wide Human SNP Array 6.0 platform, we flagged CN and SNP probes that align to multiple regions of the genome or that align to four or more locations with a single base pair mismatch (NCBI36/hg18). SNP genotype reproducibility was also tested.
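The flagging rule described above can be illustrated with a minimal sketch. It assumes that each probe has already been aligned to the genome and summarized as two counts; the `ProbeAlignment` record, the probe IDs, and the convention that the perfect-hit count includes the intended target are hypothetical choices made for illustration, not the pipeline used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeAlignment:
    """Alignment summary for one probe (hypothetical record layout)."""
    probe_id: str
    perfect_hits: int       # exact-match locations, assumed to include the target
    one_mismatch_hits: int  # locations matched with exactly one mismatched base


def is_flagged(probe, mismatch_threshold=4):
    """Return True if the probe should be flagged for possible cross-hybridization.

    A probe is flagged when it aligns perfectly to more than one region,
    or when it aligns to `mismatch_threshold` or more locations with a
    single base pair mismatch.
    """
    multi_mapping = probe.perfect_hits > 1
    many_near_matches = probe.one_mismatch_hits >= mismatch_threshold
    return multi_mapping or many_near_matches


def flag_probes(alignments):
    """Return the IDs of all flagged probes, preserving input order."""
    return [p.probe_id for p in alignments if is_flagged(p)]


# Hypothetical probes, for illustration only.
example = [
    ProbeAlignment("CN_A", perfect_hits=1, one_mismatch_hits=0),   # unique
    ProbeAlignment("SNP_B", perfect_hits=3, one_mismatch_hits=0),  # multi-mapping
    ProbeAlignment("CN_C", perfect_hits=1, one_mismatch_hits=4),   # many near matches
    ProbeAlignment("CN_D", perfect_hits=1, one_mismatch_hits=3),   # below threshold
]
flagged = flag_probes(example)
```

Keeping the two criteria as separate named conditions makes it easy to report which rule triggered a flag, or to change the mismatch threshold without touching the multi-mapping test.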
Evaluation of Marker Sensitivity. Affymetrix SNP 6.0 marker sensitivity, measured as the regression coefficient obtained from linear regression of marker intensity ratio against copy number calls, for a subset of CNVs reported in McCarroll et al. (1) across 270 HapMap samples. 202 CNVs are included, for a total of 5,955 probes/regressions. The x-axis bins represent the total number of off-target perfect and single base pair mismatch hits after alignment. Marker sensitivity decreases as the number of off-target hits increases: markers with 50 or more off-target hits exhibit, on average, 15% of the response sensitivity of probes with no off-target hits. The inset shows the linear dependence of intensity ratio on underlying copy number for 13 markers overlapping Variant_38811 (chr1:12,768,450-12,805,683). The median intensity ratio is plotted across all copy number classes on a marker basis for a set of HapMap individuals. The color gradient represents the number of off-target perfect match or 1 bp mismatch hits that were detected by aligning each marker sequence to the whole genome.
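The sensitivity measure in the caption, the slope of a linear regression of a marker's intensity ratio on copy number calls, can be sketched as follows. This is an ordinary least-squares illustration under stated assumptions: the function names and the toy values are hypothetical and are not data from the study.

```python
def regression_slope(copy_numbers, intensity_ratios):
    """Ordinary least-squares slope of intensity ratio against copy number.

    Each position in the two sequences corresponds to one sample: its copy
    number call and the marker's intensity ratio in that sample.
    """
    n = len(copy_numbers)
    if n != len(intensity_ratios):
        raise ValueError("inputs must have the same length")
    if n < 2:
        raise ValueError("at least two samples are required")

    mean_x = sum(copy_numbers) / n
    mean_y = sum(intensity_ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in copy_numbers)
    if sxx == 0:
        raise ValueError("copy number calls must vary across samples")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(copy_numbers, intensity_ratios))
    return sxy / sxx


def relative_sensitivity(slope, reference_slope):
    """Express a marker's slope as a fraction of a reference marker's slope."""
    return slope / reference_slope


# Toy values, for illustration only (not measurements from the study).
copy_numbers = [1, 2, 3]
clean_marker = [0.5, 1.0, 1.5]      # ratio tracks copy number closely
cross_hyb_marker = [0.9, 1.0, 1.1]  # response dampened by off-target signal

clean_slope = regression_slope(copy_numbers, clean_marker)
dampened_slope = regression_slope(copy_numbers, cross_hyb_marker)
fraction = relative_sensitivity(dampened_slope, clean_slope)
```

In this toy case the dampened marker has a much shallower slope than the clean one, which is the pattern the caption describes for markers with many off-target hits. Averaging such slopes within bins of off-target hit count would give a per-bin summary of the kind plotted on the x-axis.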
Oldridge DA et al. Optimizing Copy Number Variation Analysis using Genome-Wide Short Sequence Oligonucleotide Arrays. Nucleic Acids Res. 2010 Jun;38(10):3275-86.