Interactive Data
UW
Duplication Browser
This browser provides an integrated view of the duplication data from two independent in
silico strategies: the whole genome assembly comparison (WGAC), based
on the August 2001 UCSC assembly, and the whole genome shotgun sequence
detection (WSSD) of duplications mapped onto that assembly. These are
represented as additional tracks displayed by the UCSC
Genome Browser. For WSSD, the depth of coverage and average percent identity
are shown; for WGAC, the length of alignment and percent
sequence identity are shown.
Data Downloads
(See also April 2002 Updated WSSD)
The
Whole Genome Shotgun Sequence Detection (WSSD) Database
This consists of 8,595
regions from 2,972 clones representing 130.4 Mb of segmental duplications. Regions were extracted where a significant
increase in WGS read depth was observed.
This data has been filtered for
duplications and recently transposed common repeats such as L1P and HERV
elements. Due to the complex nature and interrelationships of the
duplications, we did not attempt to create consensus sequences.
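As a rough illustration of extracting regions with a significant increase in WGS read depth, the sketch below flags runs of windows whose depth exceeds a cutoff. The cutoff (baseline mean plus three standard deviations), the function name, and the example values are assumptions made for illustration only; this page does not state the thresholds or window sizes actually used for WSSD.

```python
from statistics import mean, stdev

def elevated_depth_regions(depths, baseline, n_sd=3.0):
    """Return (start, end) window-index ranges (end exclusive) where depth exceeds the cutoff.

    The cutoff is mean(baseline) + n_sd * stdev(baseline); this rule is an
    assumption for illustration, not the published WSSD criterion.
    """
    cutoff = mean(baseline) + n_sd * stdev(baseline)
    regions = []
    start = None
    for i, depth in enumerate(depths):
        if depth > cutoff:
            if start is None:
                start = i  # open a new elevated run
        elif start is not None:
            regions.append((start, i))  # close the current run
            start = None
    if start is not None:
        regions.append((start, len(depths)))  # run extends to the last window
    return regions

# Hypothetical per-window read depths: a baseline from presumed unique
# sequence, and one clone to be screened.
baseline_depths = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10]
clone_depths = [10, 11, 25, 30, 28, 9, 10, 22, 24]
flagged = elevated_depth_regions(clone_depths, baseline_depths)
```

Here `flagged` holds the index ranges of windows that might merit inspection as putative duplications under these assumed settings.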
The
Whole Genome Assembly Comparison (WGAC)
for UCSC August 2001
An all-by-all comparison of duplications (>90% sequence identity, >500 bp in length) present in the assembly using a previously described method (Bailey et al., 2001).
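The thresholds above can be applied to a table of pairwise alignments with a simple filter. The sketch below reads the 90% figure as percent sequence identity, and the `Alignment` record and its field names are hypothetical; they do not reflect the actual column headers of the WGAC download files.

```python
from dataclasses import dataclass

@dataclass
class Alignment:
    # Hypothetical fields; the real WGAC files may use different headers.
    chrom_a: str
    chrom_b: str
    length_bp: int
    percent_identity: float

def passes_wgac_thresholds(aln, min_identity=90.0, min_length=500):
    """True if the alignment is >90% identical and >500 bp long."""
    return aln.percent_identity > min_identity and aln.length_bp > min_length

# Made-up example records, not real data.
alignments = [
    Alignment("chr1", "chr1", 12000, 98.7),
    Alignment("chr1", "chr7", 450, 99.1),   # too short
    Alignment("chr2", "chr9", 3000, 88.0),  # identity too low
    Alignment("chr5", "chrX", 800, 92.3),
]
kept = [a for a in alignments if passes_wgac_thresholds(a)]
```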
WSSD/WGAC Header Descriptions
Analysis
Initial
Read Depth Across Celera Multiple
Alignments of Public Clones
Access to graphical representations of all 39,298 public clones screened for
segmental duplications. (Includes April 2002 Update)
Second
Pass with Consensus Across Putatively Duplicated Clones
Access to graphical representations of clones with putative duplications for
which consensus sequences were generated. This includes average % sequence identity calculated over the
consensus. (Includes April 2002 Update)
Chromosomal
Views of Duplications with Gaps Emphasized (August 2001)
Views of chromosomes emphasizing gaps (green). The WSSD duplication regions (top track) are
black. The assembly (WGAC) duplications (red for interchromosomal, blue for
intrachromosomal) are split into >98% similar (top) and <98% similar
(bottom).
WSSD
Duplications not detected by WGAC (August 2001)
These are potentially under-represented regions of the genome requiring reassembly or further sequencing.
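One way to derive such a list from the two downloads is an interval comparison: keep each WSSD region that overlaps no WGAC duplication on the same chromosome. The sketch below assumes half-open `(chrom, start, end)` tuples and an any-overlap criterion; both are assumptions for illustration, since this page does not describe how the comparison was actually performed.

```python
def wssd_only(wssd, wgac):
    """Return WSSD intervals that overlap no WGAC interval on the same chromosome.

    Intervals are (chrom, start, end) with end exclusive. Any overlap of at
    least one base counts as "detected" (an assumed criterion).
    """
    result = []
    for chrom, start, end in wssd:
        overlaps = any(
            c == chrom and s < end and start < e
            for c, s, e in wgac
        )
        if not overlaps:
            result.append((chrom, start, end))
    return result

# Made-up coordinates, not real data.
wssd_regions = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
wgac_regions = [("chr1", 150, 300), ("chr2", 300, 400)]
undetected = wssd_only(wssd_regions, wgac_regions)
```

The nested scan is quadratic in the number of intervals; for genome-scale inputs, sorting by chromosome and start position or using an interval tree would be a more practical choice.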