Most eukaryotic genes are multi-exonic, with their structure interrupted by non-coding introns. Introns account for a large proportion of many eukaryotic genomes. For example, introns were found to make up approximately 24% of the human genome, compared with 1.1% for exons. A human gene has on average about 9 exons and 8 introns, and the reported number of intronless genes varies from one thousand to more than three thousand.
Several databases have been built to describe the exon-intron structures of eukaryotic genes. Although some databases, such as Xpro, contain a division of intronless coding sequences, only one database, SEGE, is entirely devoted to intronless genes. However, all these databases appear to consider intronless coding sequences (CDS), i.e. genes whose coding sequence is contained in a single exon, rather than intronless genes, which are genes having exactly one exon. In fact, genes with an intronless CDS may have one, two or more additional non-coding exons. This means that a variable number of the entries in the Xpro and SEGE databases may be multi-exonic.
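The distinction between an intronless gene and a gene with an intronless CDS can be made concrete with a small sketch. It is illustrative only and is not the procedure used by any of the databases mentioned: the function names, the category labels and the coordinate convention (1-based, inclusive `(start, end)` pairs on the transcribed strand) are assumptions introduced here.

```python
from typing import List, Tuple

# Assumed convention: 1-based, inclusive (start, end) coordinates.
Interval = Tuple[int, int]


def exons_overlapping_cds(exons: List[Interval], cds: Interval) -> List[Interval]:
    """Return the exons that contain at least one coding base."""
    cds_start, cds_end = cds
    return [(s, e) for (s, e) in exons if s <= cds_end and e >= cds_start]


def classify_gene(exons: List[Interval], cds: Interval) -> str:
    """Classify a gene by its exon count and by how many exons carry the CDS.

    The three labels are hypothetical names chosen for this example.
    """
    if len(exons) == 1:
        # Exactly one exon: intronless gene in the strict sense.
        return "intronless gene"
    if len(exons_overlapping_cds(exons, cds)) == 1:
        # Several exons, but the whole CDS lies in one of them:
        # the remaining exons are non-coding (UTR) exons.
        return "intronless CDS, multi-exonic gene"
    return "interrupted CDS"


if __name__ == "__main__":
    print(classify_gene([(1, 1000)], (100, 900)))
    print(classify_gene([(1, 100), (500, 1500)], (600, 1400)))
    print(classify_gene([(1, 300), (500, 900)], (200, 700)))
```

Under this sketch, the second example is a gene that a CDS-based database would list as intronless even though it has two exons, which is the case the paragraph above describes.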
We are pleased to make available to the scientific community a highly curated database of intronless genes in eukaryotes. This database, which we call IGD (Intronless Genes Database), is a collection of gene sequences that are annotated as having a single-exon structure in the GenBank feature table. Data are grouped into several divisions to allow easy retrieval and analysis. The first release of this database contains only human sequences. Sequences from other mammalian genomes are being processed.
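As a rough illustration of what selecting single-exon entries from a GenBank feature table could look like, the sketch below counts `exon` feature keys in the text of a flat-file feature table. It is not the IGD curation pipeline, which is not described here. The sample record, the gene name `EXAMPLE1` and the helper names are invented for the example, and the sketch assumes that every exon is annotated with an explicit `exon` feature, which is not true of all GenBank records.

```python
def count_exon_features(feature_table: str) -> int:
    """Count 'exon' feature keys in a GenBank flat-file feature table.

    Assumes the standard layout: feature keys start at column 6 (after
    five spaces), while qualifier lines are indented further.
    """
    count = 0
    for line in feature_table.splitlines():
        is_key_line = (
            line.startswith("     ") and len(line) > 5 and not line[5].isspace()
        )
        if is_key_line and line.split()[0] == "exon":
            count += 1
    return count


def is_single_exon(feature_table: str) -> bool:
    """True when exactly one exon feature is annotated (hypothetical helper)."""
    return count_exon_features(feature_table) == 1


# Invented sample record, for illustration only.
SAMPLE_TABLE = "\n".join([
    "FEATURES             Location/Qualifiers",
    "     source          1..1200",
    '                     /organism="Homo sapiens"',
    "     gene            1..1200",
    '                     /gene="EXAMPLE1"',
    "     exon            1..1200",
    "                     /number=1",
    "     CDS             101..1000",
    '                     /gene="EXAMPLE1"',
])

if __name__ == "__main__":
    print(is_single_exon(SAMPLE_TABLE))
```

A record whose exons are described only through a `join(...)` location on an `mRNA` feature would need additional handling, so a filter like this would be a first pass before manual curation rather than a complete criterion.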