Genome Assembly
The Arabidopsis thaliana genome was sequenced in 2000 by the Arabidopsis Genome Initiative (AGI) (Nature 14 Dec. 2000). The genome has five chromosomes and a total size of approximately 135 megabases. The current TIGR golden path length is 119,146,348 bp. The table below shows the golden path length and the approximate total length of each chromosome.

Chromosome      Golden path length    Approximate chromosome length
Chromosome 1    30,427,671 bp         34,964,571 bp
Chromosome 2    19,698,289 bp         22,037,565 bp
Chromosome 3    23,459,830 bp         25,499,034 bp
Chromosome 4    18,585,056 bp         20,862,711 bp
Chromosome 5    26,975,502 bp         31,270,811 bp
Total           119,146,348 bp        134,634,692 bp
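
The totals in the table can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the names CHROMOSOME_LENGTHS and summarize are illustrative and not part of any TAIR tool. It sums each column and reports, per chromosome, how much of the approximate length lies outside the golden path. Reading that difference as unsequenced or unassembled sequence is an assumption made here for illustration.

```python
# Lengths in bp copied from the table above: (golden path, approximate total).
CHROMOSOME_LENGTHS = {
    "Chromosome 1": (30_427_671, 34_964_571),
    "Chromosome 2": (19_698_289, 22_037_565),
    "Chromosome 3": (23_459_830, 25_499_034),
    "Chromosome 4": (18_585_056, 20_862_711),
    "Chromosome 5": (26_975_502, 31_270_811),
}


def summarize(lengths):
    """Return the per-chromosome difference and ratio between the two columns."""
    summary = {}
    for name, (golden, approximate) in lengths.items():
        summary[name] = {
            # Assumption: this difference is sequence outside the golden path.
            "difference_bp": approximate - golden,
            "golden_fraction": golden / approximate,
        }
    return summary


golden_total = sum(g for g, _ in CHROMOSOME_LENGTHS.values())
approximate_total = sum(a for _, a in CHROMOSOME_LENGTHS.values())

if __name__ == "__main__":
    for name, row in summarize(CHROMOSOME_LENGTHS).items():
        print(f"{name}: {row['difference_bp']:,} bp outside the golden path, "
              f"{row['golden_fraction']:.1%} of the approximate length")
    print(f"Total golden path: {golden_total:,} bp")
    print(f"Total approximate length: {approximate_total:,} bp")
```

Both column sums agree with the Total row of the table.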

Chromosome sequence data and AGI tiling paths are available from the TAIR FTP site.

TAIR8_Assembly_updates.xls and TAIR9_Assembly_updates.xls list all assembly updates made for the TAIR8 and TAIR9 genome releases.

For a summary of remaining gaps and unsequenced clones, go to Current Genome Status. For a description of the groups that participated in the sequencing effort, go to AGI groups.

Current Genome Status
Known Gaps
Centromeres and other gaps between clones are shown in red.
Clones containing gaps are shown in purple.
* indicates that the sequence is not yet deposited in GenBank.
Chromosome 1:
T18N24-F8L2-F2C1-F12G6-T23P23-T28N5-F11K13
T24F19-CEN1-F13P3
F9A12-F25O15-F9D18-T5F23
F27F5-T2P3-F2G19 F12A4-F1504-F14D7 T32E22-F103-T32E20 F16N3-T2E6-T6B12 F10A5-T4012-T23E18
Chromosome 2:
NOR2-F23H14-F10A8 T12J2-CEN2-T6C20-T14C8
T4E5-F10C8-T18E17
Chromosome 3:
TEL3N-T4P13 K3G3-MJL12-MTE24 MUO10-T13B17-MWE13 F8N14-T803-F1M23 T15D2-CEN3-T25F15-F23H6-T28G19-5SrDNA-F1C23-T18B3-T26P13-T14A11-
T4P3-F21A14-5SrDNA-F4M19 F7M19-T6L19- -F7K15
Chromosome 4:
NOR4-T15P10 F21I2-5SrDNA-F14G16
T2N12-CEN4-F13J5
T13J8-F26K10-F20O9-T5F17-F16A16
F19B15-F17A13-T16L4
F6I18-F6E21-F8F16
F4D11-T16I18-F26P21
Chromosome 5:
F21E1-T19N18*-T32M21
F23C8-T26N4-5SrDNA-F23B23
F28N5-CEN5-T8H11
T32B3-5SrDNA-T25B21-T3J11
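
The paths listed above follow a simple notation: names joined by hyphens, several paths per line separated by spaces, and a trailing * for sequence not yet deposited in GenBank. A minimal Python sketch for splitting that notation into records might look like the following. The function names and the rule for telling clones from landmarks (CEN, NOR, TEL and 5SrDNA entries) are assumptions made for illustration, not a TAIR convention.

```python
import re

# Assumption: entries such as CEN1, NOR2, TEL3N and 5SrDNA are landmarks,
# and every other name is a clone.
LANDMARK_PATTERN = re.compile(r"^(CEN\d|NOR\d|TEL\d[NS]|5SrDNA)$")


def parse_path(path):
    """Split one hyphen-separated path into (name, kind, undeposited) tuples."""
    entries = []
    for token in path.split("-"):
        token = token.strip()
        if not token:
            continue  # skip empty pieces left by a trailing or doubled hyphen
        undeposited = token.endswith("*")
        name = token.rstrip("*")
        kind = "landmark" if LANDMARK_PATTERN.match(name) else "clone"
        entries.append((name, kind, undeposited))
    return entries


def parse_line(line):
    """Parse a line that may hold several space-separated paths."""
    return [parse_path(path) for path in line.split()]


if __name__ == "__main__":
    for path in parse_line("F21E1-T19N18*-T32M21 F28N5-CEN5-T8H11"):
        print(path)
```

Empty pieces are skipped rather than guessed at, so an incomplete path in the listing yields only the names that are actually present.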
GFF file of all known gaps in the Arabidopsis genome assembly April 2008
Clones Missing or Incomplete in GenBank September 2003
Clones in GenBank HTG section (sequencing in progress) or missing from GenBank. Includes chromosome, status, accession number, group and comments.
Table of Gaps and Incomplete Clones September 2003
Includes comments from TAIR, TIGR and AGI groups on status and priority for sequencing.
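
As an illustration of how the GFF file of known gaps listed above could be used, the sketch below totals gap lengths per sequence. It assumes only the general GFF layout (tab-separated columns, with the sequence name in column 1 and 1-based inclusive start and end coordinates in columns 4 and 5). The example records are made up for the demonstration and are not taken from the TAIR file, and gap_lengths_by_seqid is a hypothetical helper name.

```python
from collections import defaultdict


def gap_lengths_by_seqid(lines):
    """Sum feature lengths per sequence from GFF-formatted lines."""
    totals = defaultdict(int)
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # not a feature line
        seqid, start, end = fields[0], int(fields[3]), int(fields[4])
        totals[seqid] += end - start + 1  # GFF coordinates are inclusive
    return dict(totals)


# Made-up records for demonstration only; not real gap coordinates.
EXAMPLE_LINES = [
    "##gff-version 3",
    "Chr1\texample\tgap\t101\t200\t.\t.\t.\tID=gap1",
    "Chr1\texample\tgap\t1001\t1050\t.\t.\t.\tID=gap2",
    "Chr2\texample\tgap\t1\t10\t.\t.\t.\tID=gap3",
]

if __name__ == "__main__":
    for seqid, total in gap_lengths_by_seqid(EXAMPLE_LINES).items():
        print(f"{seqid}: {total} bp in gaps")
```

To run it on a downloaded file, pass the open file object in place of EXAMPLE_LINES; the function only needs an iterable of lines.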

AGI Groups
Cold Spring Harbor Sequencing Consortium (CSHSC)
Members: CSHL, ABI, WashU
Contacts: Dick McCombie, Rob Martienssen (CSHL); Rick Wilson (WashU)
Regions sequenced: 13.1 Mb, including the top of chromosome 4 and 3 Mb around the centromere of chromosome 5.
European Scientists Sequencing Arabidopsis (ESSA)
Members: John Innes Centre, MIPS, network of 18 labs
Contacts: Mike Bevan (JIC); Klaus Mayer (MIPS)
Regions sequenced: Chromosomes 4 (14.5 Mb) and 5 (6 Mb)
Genoscope-EU Consortium
Members: EMBL, Genoscope, Lion, U. van Amsterdam, Valle
Contacts: Marcel Salanoubat, Francis Quetier
Regions sequenced: Chromosome 3 bottom arm (9.2 Mb)
Kazusa DNA Research Institute
Members: Kazusa
Contacts: Satoshi Tabata, Kiyotaka Okada
Regions sequenced: Chromosomes 3 (9.8 Mb) and 5 (17.8 Mb)
SPP Consortium
Members: PGEC, Stanford, UPenn (ATGC)
Contacts: Sakis Theologis (PGEC); Ron Davis (Stanford); Joe Ecker (ATGC)
Regions sequenced: Chromosome 1 (20.2 Mb)
The J. Craig Venter Institute (JCVI), formerly TIGR
Members: JCVI
Contacts: Christopher Town
Regions sequenced: Chromosome 2 (19.6 Mb), parts of 1 and 3

Last modified on April 5, 2010