This page contains updated data on recent segmental duplications in GRCh37 (the human reference sequence produced by the Genome Reference Consortium in February 2009). It focuses on genomic duplications >1000 bp in length with >90% identity.
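The two thresholds above can be read as a simple filter over alignment records. The following is a minimal sketch, not the pipeline used to produce this data: the `Alignment` record, its field names, and the coordinate convention are assumptions made for illustration.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one side of a pairwise alignment."""
    chrom: str
    start: int       # 0-based start (assumed convention)
    end: int         # exclusive end (assumed convention)
    identity: float  # fraction of identical aligned bases, 0.0 to 1.0


MIN_LENGTH_BP = 1000  # ">1000 bp" threshold from the text
MIN_IDENTITY = 0.90   # ">90% identity" threshold from the text


def is_segmental_duplication(aln: Alignment) -> bool:
    """Return True when the alignment exceeds both thresholds."""
    length_bp = aln.end - aln.start
    return length_bp > MIN_LENGTH_BP and aln.identity > MIN_IDENTITY


# Toy records, invented for the example.
alignments = [
    Alignment("chr1", 10_000, 12_500, 0.95),  # passes both thresholds
    Alignment("chr1", 50_000, 50_800, 0.99),  # too short (800 bp)
    Alignment("chr2", 5_000, 9_000, 0.88),    # identity too low
]
kept = [a for a in alignments if is_segmental_duplication(a)]
print(len(kept))  # 1
```

Both comparisons are strict, matching the ">" signs in the text, so an alignment of exactly 1000 bp is excluded.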

Additional Data Access

UCSC Genome Browser

The UCSC Genome Browser allows browsing of the genome.

Statistics of Duplications

Based on whole-genome assembly comparison (WGAC). Joins analysis: after the generation of the final global alignments, larger gaps (up to 20 kb insertion side; minimum size of gap > 20 bp) between two duplications were merged. See ref. 4.
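The joining step can be illustrated with a simplified one-dimensional sketch. It is an assumption-laden illustration, not the method of ref. 4: it looks at segments on a single sequence only (the actual analysis works on pairwise alignments), it uses only the 20 kb upper bound from the text, and it assumes gaps of 20 bp or less are already absorbed by the alignments themselves. The function name `join_segments` is hypothetical.

```python
MAX_GAP_BP = 20_000  # upper bound on a joinable gap, taken from the text


def join_segments(segments, max_gap=MAX_GAP_BP):
    """Merge (start, end) segments whose separating gap is <= max_gap.

    Segments are half-open intervals on one sequence; input order is free.
    """
    joined = []
    for start, end in sorted(segments):
        if joined and start - joined[-1][1] <= max_gap:
            # Close enough to the previous segment: extend it.
            prev_start, prev_end = joined[-1]
            joined[-1] = (prev_start, max(prev_end, end))
        else:
            joined.append((start, end))
    return joined


# Toy coordinates, invented for the example.
segments = [(1_000, 5_000), (5_500, 9_000), (40_000, 45_000)]
print(join_segments(segments))  # [(1000, 9000), (40000, 45000)]
```

The first two segments are 500 bp apart and are joined; the third lies 31 kb further on and stays separate.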

Histograms of non-redundant duplication

Histograms show the distribution of non-redundant segmental duplications and the segmental duplication ratio on each chromosome.
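"Non-redundant" is taken here to mean that a base covered by several overlapping duplications is counted once. Under that assumption, the per-chromosome ratio could be computed roughly as follows; the function names, the input layout, and the toy chromosome sizes are all hypothetical.

```python
from collections import defaultdict


def nonredundant_bases(intervals):
    """Bases covered by half-open (start, end) intervals, overlaps counted once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Disjoint from the running block: close it and start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def segdup_ratio(dups, chrom_sizes):
    """Fraction of each chromosome covered by duplications."""
    by_chrom = defaultdict(list)
    for chrom, start, end in dups:
        by_chrom[chrom].append((start, end))
    return {c: nonredundant_bases(by_chrom[c]) / size
            for c, size in chrom_sizes.items()}


# Toy data, invented for the example (not real GRCh37 values).
dups = [("chrA", 0, 100), ("chrA", 50, 150), ("chrB", 10, 20)]
chrom_sizes = {"chrA": 1000, "chrB": 100}
ratios = segdup_ratio(dups, chrom_sizes)
print(ratios)
```

The two overlapping chrA intervals cover 150 distinct bases rather than 200, which is the difference between redundant and non-redundant totals.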

Histogram of length cuts

Distribution of the number of duplication pairs across length categories.
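Counting pairs per length category amounts to binning each pair length against a list of cut points. A minimal sketch follows; the cut values are hypothetical and are not the ones used for the histogram on this page.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical cut points in bp; each category is [lower, upper).
LENGTH_CUTS_BP = [1_000, 5_000, 10_000, 20_000]


def length_category(length_bp, cuts=LENGTH_CUTS_BP):
    """Return a label for the category that contains length_bp."""
    i = bisect_right(cuts, length_bp)
    if i == 0:
        return f"<{cuts[0]}"
    if i == len(cuts):
        return f">={cuts[-1]}"
    return f"{cuts[i - 1]}-{cuts[i]}"


# Toy pair lengths, invented for the example.
pair_lengths = [1_200, 4_800, 7_500, 25_000, 12_000, 1_500]
counts = Counter(length_category(n) for n in pair_lengths)
for label, n in counts.items():
    print(label, n)
```

The same binning works for percent identity by swapping in identity cut points.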

Histogram of percent identity cuts

Distribution of the number of duplication pairs across percent identity categories.

Chromosomal views (simple) of duplications

Chromosomal views of segmental duplications on each chromosome. Gaps are depicted as line discontinuities on the chromosomal sequence. The WSSD duplication regions (top track), detected from whole-genome shotgun sequences (excess depth of coverage), are shown in black. The whole-genome assembly comparison (WGAC) duplications are shown in red (interchromosomal) and blue (intrachromosomal).

Chromosomal views (scale) of duplications

Chromosomal views of segmental duplications on each chromosome, with % identities. Gaps are depicted as line discontinuities on the chromosomal sequence. The WSSD duplication regions (top track), detected from whole-genome shotgun sequences (excess depth of coverage, with % identity of alignment), are shown in black. The whole-genome assembly comparison (WGAC) duplications are shown in red (interchromosomal) and blue (intrachromosomal).

Chromosomal views showing Inter & Intrachromosomal duplication (blowups)

Interchromosomal and intrachromosomal duplications from the perspective of individual chromosomes. The assembly (WGAC) duplications (red for interchromosomal, blue for intrachromosomal) are shown at various percent identity and length thresholds.

Identity vs Length

Length vs Identity scatter plot. 

Kimura vs Length

Kimura vs Length scatter plot. 
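The page does not say which Kimura measure is plotted. Assuming it is the Kimura two-parameter distance, K = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitions and transversions among compared sites, a minimal sketch for one aligned pair looks like this. The function name and the handling of gaps are illustrative choices.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def kimura_2p(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned sequences.

    Columns containing a gap or a non-ACGT symbol are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the pair is too divergent
    # for the correction to be defined.
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# One transition in ten sites: corrected distance slightly above 0.1.
k = kimura_2p("AAAAAAAAAA", "GAAAAAAAAA")
print(round(k, 4))  # 0.1116
```

The corrected distance is always at least as large as the raw mismatch fraction, because it accounts for multiple substitutions at the same site.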

Segmental duplication coordinates

Excel format (9 MB)    Tab delimited (9.6 MB)

 

all data

contains WGAC data