The Fabaceae (also called Leguminosae or Papilionaceae), commonly known as the legume, pea, or bean family, is a large, economically important, monophyletic family of flowering plants. It includes trees, shrubs, and perennial or annual herbaceous plants, which are easily recognized by their fruit (legume) and their compound, stipulate leaves. As the third-largest land plant family, the legumes are widely distributed and comprise 650 genera and over 18,860 species, accounting for about 7% of flowering plant species. Along with cereals, some fruits, and tropical roots, a number of legumes have been staple human foods, and their use is closely related to human evolution. In addition, legumes are an important part of world agriculture because they fix atmospheric nitrogen through intimate symbioses with microorganisms. Mainly because of their economic importance, a number of food and agricultural legumes have had their whole genomes sequenced, including Glycine max (soybean) (Schmutz et al. 2010), Cicer arietinum (chickpea) (Jain et al. 2013), Medicago truncatula (barrel medic, a close relative of alfalfa) (Young et al. 2011; Tang et al. 2014), Lotus japonicus (lotus) (Sato et al. 2008), Vigna radiata (mung bean) and Vigna angularis (adzuki bean) (Kang et al. 2014), Cajanus cajan (pigeon pea) (Varshney et al. 2012), Phaseolus vulgaris (common bean) (Schmutz et al. 2014), and Arachis hypogaea (peanut). These legume genomes are relatively small, ranging from ~400 Mb (Medicago) to 1150 Mb (soybean), and are packaged into 6 to 20 chromosomes.
By integrating the available genomic sequences and their annotations, we performed a comparative genomic analysis of legume genome structures. We detected gene colinearity between genomes and within each genome, produced a list of colinearity-supported homologs (orthologs and paralogs), and related duplicated genes to specific ancient polyploidization events. We also characterized and modeled gene losses and retentions after polyploidization and speciation, and characterized the distribution, expansion, and copy-number variation of economically or agriculturally important gene families and regulatory pathways. These extensive efforts made it possible to construct this legume genomics analysis platform, which integrates the above results for all legume genomes sequenced to date. More datasets of various kinds (transcriptomes, epigenomes, etc.) will be integrated into the platform in future efforts, and more genomes will be added as they become publicly available. We believe that this legume genomics analysis platform, including the datasets produced and the tools developed, will be a valuable resource for the legume research community.
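
To illustrate what colinearity detection involves, the following is a minimal Python sketch. It is not the pipeline used to build this platform. It assumes that homologous gene pairs ("anchors") between two chromosomes are already known and are given as gene-order indices. The function name longest_colinear_chain and the max_gap threshold are hypothetical choices made for this example. The sketch only handles blocks in the same orientation, whereas dedicated colinearity tools also score inverted blocks and assess statistical significance.

```python
from typing import List, Tuple

# An anchor is a homologous gene pair: (gene order index on chromosome A,
# gene order index on chromosome B). This representation is an assumption.
Anchor = Tuple[int, int]


def longest_colinear_chain(anchors: List[Anchor], max_gap: int = 25) -> List[Anchor]:
    """Return the longest chain of anchors whose positions increase on both
    chromosomes, with consecutive anchors at most `max_gap` genes apart."""
    pts = sorted(set(anchors))
    if not pts:
        return []

    best = [1] * len(pts)    # best[i]: length of the longest chain ending at pts[i]
    prev = [-1] * len(pts)   # prev[i]: index of the preceding anchor in that chain

    # Simple O(n^2) dynamic programming over anchors sorted by position.
    for i, (a_i, b_i) in enumerate(pts):
        for j in range(i):
            a_j, b_j = pts[j]
            close_on_a = 0 < a_i - a_j <= max_gap
            close_on_b = 0 < b_i - b_j <= max_gap
            if close_on_a and close_on_b and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j

    # Trace back from the end of the longest chain.
    i = max(range(len(pts)), key=best.__getitem__)
    chain = []
    while i != -1:
        chain.append(pts[i])
        i = prev[i]
    return chain[::-1]


if __name__ == "__main__":
    example = [(1, 1), (2, 2), (3, 50), (4, 4), (6, 5), (40, 6)]
    print(longest_colinear_chain(example, max_gap=5))
```

In this toy input, the anchors (3, 50) and (40, 6) are excluded because each lies too far from its neighbors on one of the two chromosomes. Gene pairs retained in such chains are the kind of evidence meant by "colinearity-supported" homologs above.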