BACKGROUND: The presence of non-coding introns is a characteristic feature of most eukaryotic genes. While intron size, the number of introns per gene, and the number of intron-containing genes can vary greatly between sequenced eukaryotic genomes, the structure of a gene with respect to intron presence and position is typically conserved among closely related species. Unexpectedly, the ABCB1 (ATP-Binding Cassette Subfamily B Member 1) gene, which encodes a P-glycoprotein and underlies dwarfing traits in maize (br2), sorghum (dw3) and pearl millet (d2), displayed considerable variation in intron composition.
RESULTS: An analysis of the ABCB1 gene structure in 80 angiosperms revealed that the number of introns ranged from one to nine. All introns in ABCB1 underwent either a one-time loss (single loss in one lineage/species) or multiple independent losses (parallel loss in two or more lineages/species) with the majority of losses occurring within the grass family. In contrast, the structure of the closest homolog to ABCB1, ABCB19, remained constant in the majority of angiosperms analyzed. Using known phylogenetic relationships within the grasses, we determined the ancestral branch-points where the losses occurred. Intron 7, the longest intron, was lost in only a single species, Mimulus guttatus, following duplication of ABCB1. Semiquantitative PCR showed that the M. guttatus ABCB1 gene copy without intron 7 had significantly lower transcript levels than the gene copy with intron 7. We further demonstrated that intron 7 carried two motifs that were highly conserved across the monocot-dicot divide.
CONCLUSIONS: The ABCB1 gene structure is highly dynamic, while the structure of ABCB19 remained largely conserved through evolution. The precise removal of introns, the preferential removal of smaller introns, and the presence of at least 2 bp of microhomology flanking most introns indicate that intron loss may have occurred predominantly through non-homologous end-joining (NHEJ) repair of double-strand breaks. The lack of microhomology in the exon upstream of lost phase I introns was likely due to release of the selective constraint imposed by the splicing machinery on the penultimate base (3rd base in codon) of the terminal codon. In addition to intron size, the presence of regulatory motifs is likely to make introns recalcitrant to loss.
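As an illustration of what "microhomology flanking an intron" can mean in practice, the following minimal Python sketch counts the bases shared between the two intron boundaries, that is, how far a precise intron deletion could be shifted left or right and still yield the same product. The function name, the counting rule and the example sequences are assumptions made for illustration only; this is not the procedure used in the study.

```python
def junction_microhomology(upstream_exon: str, intron: str, downstream_exon: str) -> int:
    """Count microhomology (bp) shared by the 5' and 3' boundaries of an intron.

    Hypothetical helper: the rule below is one common way to score
    microhomology at a deletion junction, not the authors' method.
    """
    up, intr, down = (s.upper() for s in (upstream_exon, intron, downstream_exon))

    # Bases at the end of the upstream exon that match the end of the intron.
    left = 0
    for a, b in zip(reversed(up), reversed(intr)):
        if a != b:
            break
        left += 1

    # Bases at the start of the intron that match the start of the downstream exon.
    right = 0
    for a, b in zip(intr, down):
        if a != b:
            break
        right += 1

    # The shared repeat cannot be longer than the deleted segment itself.
    return min(left + right, len(intr))


# Made-up sequences, not taken from ABCB1.
mh = junction_microhomology("ATGGCAG", "GTAAGTTTTCAG", "GTTCCA")
print(mh)
```

Under this counting rule, a value of 2 or more would correspond to the "at least 2 bp of microhomology" criterion mentioned above.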