As one of the most economically significant members of the Oleaceae family, Jasminum sambac is renowned for its distinctive sweet, heady fragrance. Using Illumina reads, Nanopore long reads, and Hi-C sequencing, we assembled and annotated the J. sambac genome. The high-quality assembly comprises 507 Mb of sequence (contig N50 = 17.5 Mb) organized into 13 pseudomolecules. A total of 21,143 protein-coding genes and 303 Mb of repeat sequences were predicted. An ancient whole-genome triplication event at the base of Oleaceae (~66 Mya, Late Cretaceous) was identified, and this event may have contributed to the diversification of the Oleaceae ancestor and its divergence from the Lamiales. Stress-related (e.g., WRKY) and flowering-related (e.g., MADS-box) genes were located in the triplicated regions, suggesting that the polyploidy event might have contributed adaptive potential. Genes related to terpenoid biosynthesis, for example, FTA and TPS, were extensively duplicated in the J. sambac genome, perhaps explaining the strong fragrance of the flowers. Copy number changes in distinct phylogenetic clades of the MADS-box family were also observed in the J. sambac genome: for example, the AGL6 and Mα clades were lost and the SOC clade was expanded, features that might underlie the long flowering period of J. sambac. The structural genes implicated in anthocyanin biosynthesis were depleted, which may explain the absence of vivid colors in jasmine flowers. Collectively, the J. sambac genome assembly provides new insights into the genome evolution of the Oleaceae family and into the mechanisms underlying its floral properties.

Genome data can be downloaded from https://github.com/maypoleflyn/Jasminum-sambac-genome