Plant Genomics

Coffea canephora: the coffee genome has been sequenced

An international study coordinated by researchers from the CEA (Genoscope), CIRAD, the CNRS, the IRD and the University of Buffalo (USA), involving many laboratories, has established for the first time a reference genome sequence for coffee. The results are published in the 5 September 2014 issue of the journal Science.

Published on 14 September 2018

The discovery is doubly interesting: firstly for fundamental reasons, as it improves our understanding of how the genome is structured, functions and evolves, and secondly for applied reasons, since it opens up new prospects for breeding or improving coffee varieties.

A bean worth its weight in gold

With more than 2.25 billion cups drunk every day worldwide, coffee is the leading source of revenue for many tropical countries. According to estimates from the International Coffee Organization, more than 8.7 million tonnes of coffee were produced in 2013, while export revenue totalled US$ 15.4 billion for the 2009-2010 season and the sector employed almost 26 million people in 52 countries in 2010.

Of the 126 species listed worldwide (Africa, Asia), only two are cultivated: Coffea arabica and Coffea canephora (robusta). Despite its economic importance, the coffee genome had never been sequenced until now.

Sequencing coffee DNA

In this study published in the journal Science, the researchers looked at robusta coffee, since its genome is of medium size (710 million DNA base pairs) and it is a diploid (1), unlike Coffea arabica, which is a tetraploid (2). The coffee plant studied, produced by the IRD in Ivory Coast in the 1980s, also has the advantage of being homozygous (two identical sets of 11 chromosomes), which makes it easier to analyse than natural heterozygotes.

Using several types of sequencing technology, the Genoscope (CEA) coordinated the production of the coffee DNA sequence, which was assembled into large fragments suitable for use in various types of analyses. The teams from the IRD and CIRAD then anchored those sequence fragments to a high-density genetic map, so as to reconstruct pseudo-chromosomes. A catalogue of genes and repeated sequences was then compiled and validated, enabling comparison with other plants.

A crucial stage in deciphering the coffee genome

The researchers have thus established a reference genome sequence for coffee (including for the species Coffea arabica), and more generally for Rubiaceae, one of the largest families of flowering plants (with almost 12,500 species).

The international consortium's comparative analysis of genomes also revealed that the structure of the coffee genome is the most conserved among asterids (the group of flowering plants to which potatoes and tomatoes belong), and very similar to that of the ancestral species from which all true dicots (3) (or eudicots) derived over the course of their evolution. Lastly, the genome study boosted our knowledge of the secondary metabolism (4) of plants and its diversification. A comparative study with the cocoa genome showed in particular that caffeine biosynthesis is due to enzymes specific to each species, which appeared at different times during their evolution.

In the longer term, the identification of the coffee genome sequence opens up new prospects in terms of varietal improvement, knowledge of the precise functions of genes (notably those specific to coffee), and the possibility of transferring the results to other species and of developing tools for diagnosing plant functioning.

It will facilitate applied projects, such as the breeding or creation of coffee varieties with better processing and/or quality characteristics and more resistant to environmental constraints, pests and diseases, for instance coffee leaf rust. This disease still has a substantial impact on coffee growing and on the economy in small-scale producing countries in Central America, such as Guatemala, Honduras and Costa Rica. Lastly, it should help steer producers towards ecologically intensive farming methods.

All of these results are available to the scientific community in an open-access online database developed by the IRD and CIRAD.

(1) A cell with two sets of chromosomes, the number of chromosomes per set being specific to the species in question. Canephora coffee plants are diploids, with 2x11 chromosomes.
(2) A cell with four sets of chromosomes. Arabica coffee plants are tetraploids, with 4x11 chromosomes.
(3) Plants whose embryo comprises two cotyledons (lobes).
(4) Secondary metabolites are molecules that do not participate directly in plant growth, unlike primary metabolites, but that play a determining role in the adaptation of plants to their environment.

For further information

French public laboratories involved in the study: 

  • Plant resistance to parasites (RPB - CIRAD/IRD/University of Montpellier 2)
  • Genoscope, Institute of biology François Jacob (CEA)
  • Metabolic genomics (CNRS/CEA/University of Evry)
  • Crop diversity, adaptation and development (DIADE - CIRAD/IRD/University of Montpellier 2)
  • Genetic improvement and adaptation of Mediterranean and tropical plants (AGAP - CIRAD/INRA/Montpellier Sup Agro)
  • Genomics-bioinformatics (INRA).

Study conducted with the support of:

  • the Agence nationale de la recherche (ANR): projects on analysing BAC clone end sequences as a preliminary stage in coffee genome sequencing (ANR-08-GENM-022-001) and Coffea canephora genome sequencing (ANR-09-GENM-014-002)
  • Nestlé R&D (Tours)
  • Bioversity International (Montpellier).

Reference:

Denoeud F, Carretero-Paulet L, Dereeper A, Droc G, Guyot R, Pietrella M, Zheng CF, Alberti A, Anthony F, Aprea G, Aury JM, Bento P, Bernard M, Bocs S, Campa C, Cenci A, Combes MC, Crouzillat D, Da Silva C, Daddiego L, De Bellis F, Dussert S, Garsmeur O, Gayraud T, Guignon V, Jahn K, Jamilloux V, Joet T, Labadie K, Lan TY, Leclercq J, Lepelley M, Leroy T, Li LT, Librado P, Lopez L, Munoz A, Noel B, Pallavicini A, Perrotta G, Poncet V, Pot D, Priyono, Rigoreau M, Rouard M, Rozas J, Tranchant-Dubreuil C, VanBuren R, Zhang Q, Andrade AC, Argout X, Bertrand B, de Kochko A, Graziosi G, Henry RJ, Jayarama, Ming R, Nagai C, Rounsley S, Sankoff D, Giuliano G, Albert VA, Wincker P, Lashermes P. The coffee genome provides insight into the convergent evolution of caffeine biosynthesis. Science, 2014, 345(6201):1181-1184.