CHEMICAL ENGINEERING TRANSACTIONS
VOL. 64, 2018
A publication of
The Italian Association
of Chemical Engineering
Online at www.aidic.it/cet
Guest Editors: Enrico Bardone, Antonio Marzocchella, Tajalli Keshavarz
Copyright © 2018, AIDIC Servizi S.r.l.
ISBN 978-88-95608-56-3; ISSN 2283-9216
Functional Metagenomic Analysis of the Coffee (Coffea
arabica) Fermentation
Katerine Vera Pacheco^a, Wilfredo Valdivieso Quintero^a, Andrea Juliana Mantilla-Paredes^a, William Jaimes^b,
Jorge Torrado^c, German Zafra^a,d,*
^a Universidad de Santander UDES, Grupo de Investigación en Ciencias Básicas y Aplicadas para la Sostenibilidad – CIBAS.
Campus Lagos del Cacique, Calle 70 No. 55-210, Bucaramanga, Santander, 680003, Colombia.
^b Penagos Hermanos y Compañía SAS. Bucaramanga, Santander, Colombia.
^c Telmo J. Díaz y CIA S.A., Hacienda el Roble. Los Santos, Santander, Colombia.
^d Current address: Universidad Industrial de Santander, Escuela de Microbiología. Bucaramanga, Santander, 680002,
Colombia.
gzafra@udes.edu.co
This study examined the influence of temperature and time on the functional diversity of the microbial
populations involved in coffee (Coffea arabica) fermentation, using a shotgun metagenomic approach.
Fermentations of depulped coffee grains were carried out under controlled and non-controlled temperature
conditions for 24 h. Paired-end whole genome sequencing of mucilage samples was performed on an Illumina
HiSeq platform (2x150). Global and specific gene abundances were analyzed using the KEGG orthology (KO).
Results showed a predominance of genes involved in carbohydrate and amino acid metabolism during
fermentations. The abundance of genes involved in glycolysis/gluconeogenesis, lactate fermentation and mixed
acid fermentation was higher during the fermentation conducted under non-controlled temperature conditions;
however, the fermentation carried out at 11 °C induced a significant increase in the abundance of genes
involved in the synthesis of amino acids, lipids and organic acids, as well as protein secretion systems.
We concluded that different temperatures and conditions in fermentations produce appreciable changes in
the functional potential of both amino acid and carbohydrate metabolism, especially in the abundance of
N-acetyl-lysine deacetylase, pyruvate dehydrogenase and 6-phosphofructokinase genes, which in turn could
greatly affect the taste and quality of coffee. This information, together with the results from coffee cupping,
provided valuable insights into the role that microorganisms involved in coffee fermentation play in obtaining
better taste attributes, and helped to identify key genes and potential metabolic pathways associated with
these special attributes.
1. Introduction
Coffee fermentation is a process carried out by microorganisms, whose main purpose is to remove the
mucilage remaining after the coffee pulping process. In Colombia, coffee fermentation is carried out mostly
using waterless systems (dry fermentation) in open tanks of various materials. In addition to eliminating the
mucilage, the fermentation has a pronounced effect on the final quality of the coffee. This influence may
confer special characteristics and added value to coffee, enhancing its flavor and aroma. On the other hand,
severe losses in bean quality may take place if the coffee is over-fermented (Bade-Wegner et al.
1997). Among the different variables influencing this process, the fermentation time, temperature and
variations within the microbial populations involved are key in modifying the chemical composition of the beans,
which ultimately affects the taste and quality of the final product. Although there are reports on the microbial
composition of mucilage and the microbial populations involved in the fermentation process (Masoud et al.
2004), most of these studies have used conventional techniques for microbial isolation and counting, and
PCR-based techniques to identify the isolated microorganisms. Conventional microbiological studies applied to
this type of fermentation are difficult to carry out, mainly because the absolute numbers of some taxa are either
very high or very low, causing issues with microbial identification and counts. In addition, the presence of
DOI: 10.3303/CET1864060
Please cite this article as: Vera Pacheco K., Valdivieso Quintero W., Mantilla-Paredes A.J., Jaimes W., Torrado J., Zafra G., 2018, Functional
metagenomic analysis of the coffee (coffea arabica) fermentation, Chemical Engineering Transactions, 64, 355-360
teleomorphic or anamorphic stages makes it difficult to carry out reliable comparative studies. These types of
approaches do not allow the analysis of most of the non-cultivable populations involved in the fermentation
process. Although there is knowledge about the biochemical conversion of the substrates present in the
mucilage and beans, a direct association between the type of microorganisms present and the final
organoleptic attributes of the finished coffee has not been established (Lee et al. 2015). Thus, since only a
small proportion of the microorganisms involved in the fermentation is cultivable, more informative techniques
are needed to describe these important aspects of coffee fermentation in greater detail. Metagenomics, and
especially the use of next-generation sequencing (NGS) methods, allows the direct analysis of microorganisms
without isolating them, providing a way to overcome the problems mentioned above. This study therefore
aimed to provide a better understanding of the microbiome associated with the fermentation of coffee, its
functional potential and its influence on coffee quality using a metagenomic approach. The study focused on
the influence of temperature and time on the functional diversity of the microbial populations involved in
coffee fermentation. This information, together with the results from coffee cupping, would provide valuable
insights into the role that microorganisms involved in coffee fermentation play in obtaining better taste
attributes, and help to identify key genes and potential metabolic pathways associated with these special
attributes.
2. Materials and methods
2.1 Coffee fermentation, processing and cupping
The coffee population analyzed in this study corresponded to Coffea arabica var. Caturra. The total sample
consisted of approximately 10 tons of fresh coffee cherries collected from a coffee farm located in Los Santos,
Santander, Colombia (06.8628260°, -073.0450050°; 18 °C annual average temperature), in March 2016.
Coffee cherries were subjected to a mechanical pulping process in a UDT4 depulper (Penagos Hermanos y
CIA, Colombia) and subsequently, two different fermentation processes (no water added) were carried out
for 24 hours: 1) a fermentation under temperature-controlled conditions at 11 °C in a prototype fermenter
(2 m height, 1.1 m diameter) provided by Penagos Hermanos y CIA (Colombia), and 2) a fermentation under
non-controlled temperature conditions, carried out in a conventional open tiled fermentation tank under the
prevailing environmental conditions on the farm (22 to 26 °C). Composite mucilage samples (150 g) were
taken in sterile flasks from five different zones and depths of both the open tank and the fermenter at
hours 0, 14, 18 and 24 of fermentation. These mucilage samples were kept at -20 °C until DNA extraction. In
addition, 300 g of fermented bean samples from each time point were obtained for coffee production and
cupping. These coffee beans underwent the remaining production steps (washing, drying, hulling,
sorting, roasting and milling) to obtain a finished coffee product. Subsequently, each brewed coffee was
subjected to a cupping process to evaluate flavor notes and overall quality.
2.3 Nucleic acid extraction and sequencing
Metagenomic DNA was extracted from mucilage samples using the PowerSoil® DNA Isolation Kit (MO BIO
Laboratories Inc.) according to the manufacturer's instructions. Quantity and quality of DNA were analyzed
using a NanoDrop 2000 system (Thermo Scientific). Paired-end whole genome sequencing was performed
using the Nextera XT DNA sample preparation kit and the Illumina HiSeq 2500 platform (2x150 paired-end,
rapid run mode). This process yields between 5 and 20 million raw reads per sample.
2.4 Functional predictive analysis and mapping of metabolic pathways
Raw sequences were checked for sequencing tags and adapters using the FastX-Toolkit. Sequences were
then subjected to quality control using the FastQC tool. Bowtie (Langmead et al. 2009) was used to remove
host (coffee) contaminant sequences based on the Arabidopsis thaliana (TAIR9) genome.
Gene calling, annotation and similarity-based searches of unassembled metagenomic data were carried out
using the MG-RAST v4.02 pipeline (Meyer et al. 2008). ORFs and their corresponding protein sequences
were identified and annotated using the KEGG orthology (KO), with a maximum e-value of 1e−5, a minimum
identity cutoff of 80 % and a minimum alignment length of 15 bp. Assigned sequences were
retrieved from MG-RAST and analyzed with STAMP v2.1.3 (Parks and Beiko 2010) to calculate the statistical
differences between individual samples or groups. To map annotated genes onto potential metabolic
pathways, metagenomic reads were annotated with KO in MG-RAST and visualized using the KEGG mapper
tool (http://www.genome.jp/kegg/tool/map_pathway.html). Statistical analyses of metagenomic profiles were
carried out with STAMP v2.1.3, using ANOVA, Fisher's, Tukey-Kramer and Welch's tests.
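The annotation cutoffs described above can be illustrated with a minimal Python sketch. MG-RAST applies these thresholds within its own pipeline; the `Hit` record, its field names and the example hits below are hypothetical and serve only to show how the three cutoffs (e-value, identity and alignment length) combine.

```python
from dataclasses import dataclass

# Cutoffs stated in the text.
MAX_EVALUE = 1e-5
MIN_IDENTITY_PCT = 80.0
MIN_ALIGN_LEN = 15


@dataclass(frozen=True)
class Hit:
    """Hypothetical similarity hit of one read against a KO-annotated protein."""
    read_id: str
    ko_id: str          # placeholder identifier, not a real KO entry
    evalue: float
    identity_pct: float
    align_len: int


def passes_cutoffs(hit: Hit) -> bool:
    """Return True only if the hit satisfies all three cutoffs."""
    return (
        hit.evalue <= MAX_EVALUE
        and hit.identity_pct >= MIN_IDENTITY_PCT
        and hit.align_len >= MIN_ALIGN_LEN
    )


def filter_hits(hits: list[Hit]) -> list[Hit]:
    """Keep only the hits that pass every cutoff."""
    return [h for h in hits if passes_cutoffs(h)]


# Hypothetical hits, one per failure mode.
example_hits = [
    Hit("read_1", "KO_A", 1e-12, 95.0, 48),  # passes all cutoffs
    Hit("read_2", "KO_B", 1e-3, 99.0, 50),   # fails: e-value too high
    Hit("read_3", "KO_A", 1e-8, 72.5, 40),   # fails: identity below 80 %
    Hit("read_4", "KO_C", 1e-6, 88.0, 12),   # fails: alignment too short
]

kept = filter_hits(example_hits)
print([h.read_id for h in kept])
```

Only hits that meet all three conditions at once contribute to the gene abundance counts, so a very significant e-value does not rescue a hit with low identity or a short alignment.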
3. Results and discussion
After paired-end whole genome sequencing of mucilage samples we obtained the following eight datasets
(TN: non-controlled temperature; T11: 11 °C; H: hour of fermentation), listed with their number of sequences,
average sequence length, total size and MG-RAST ID:
TNH0: 9,835,879 sequences, 167 bp, 3.6 Gbp, mgm4716136.3
TNH14: 7,141,248 sequences, 166 bp, 2.4 Gbp, mgm4716134.3
TNH18: 11,004,665 sequences, 165 bp, 3.9 Gbp, mgm4716133.3
TNH24: 8,187,076 sequences, 172 bp, 3.1 Gbp, mgm4716137.3
T11H0: 9,392,753 sequences, 170 bp, 3.5 Gbp, mgm4716131.3
T11H14: 12,211,152 sequences, 170 bp, 3.5 Gbp, mgm4716135.3
T11H18: 10,886,246 sequences, 167 bp, 3.9 Gbp, mgm4716132.3
T11H24: 12,211,152 sequences, 167 bp, 4.4 Gbp, mgm4716138.3
All these datasets are publicly available on the MG-RAST server at the static link
http://metagenomics.anl.gov/linkin.cgi?project=mgp19659.
3.1 Abundance of genes related to amino acid metabolism
Figure 1 shows the global gene abundance based on KEGG orthology. Metabolism pathways predominated,
especially carbohydrate, amino acid and energy metabolism, in both the non-controlled temperature
fermentation and the 11 °C fermentation.
Figure 1. Estimation of gene abundance in coffee mucilage metagenomes using KEGG orthology (KO). A) KO
level 1. B) KO level 2 based on the abundance of genes involved in metabolism.
Except for an increase in the abundance of genes related to genetic information processing, no major
differences were observed between fermentations. Regarding amino acid metabolism, the highest proportion of
genes detected during both fermentations corresponded to those involved in alanine, aspartate and
glutamate metabolism (ko00250), as well as glycine, serine and threonine metabolism (ko00260), which
increased during fermentation (Figure 2). This result agrees with a previous study showing the same
amino acids as predominant in the composition of Coffea canephora (robusta) and Coffea arabica var. Caturra
mucilage (Puerta 2011). A higher abundance of genes related to amino acid metabolism was found in
metagenomes from the non-controlled temperature fermentation compared to those from the 11 °C fermentation.
Even though the most abundant genes corresponded to the metabolism of the aforementioned amino acids,
the most notable differences between the two fermentations were related to lysine and phenylalanine
metabolism genes (Figure 2). Although the abundance of genes involved in the degradation of lysine was
similar at the start of both fermentations (4.96 % in the non-controlled and 4.79 % in the 11 °C fermentation), it
significantly decreased at hours 14 (1 %, p=0.01), 18 (0.7 %, p=0.01) and 24 (0.3 %, p=0.008) under
non-controlled temperature conditions; in contrast, during the 11 °C fermentation it remained relatively constant
at hours 14 (4.4 %) and 18 (4.5 %) and decreased only by the end of the fermentation (1.6 %). Similarly,
a significant decrease in the abundance of genes involved in the metabolism of phenylalanine was observed
during the non-controlled temperature fermentation, changing from 13.7 % at hour 0 to 8.2 % and 5.4 % at
hours 14 and 18, respectively. This group of genes was not detected at hour 24, strongly suggesting that
microbial metabolism shifted towards other groups of genes, such as those of tyrosine metabolism, by the end
of both fermentations (Figure 2).
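Differences in gene proportions between two metagenomes, such as those reported above for lysine degradation genes, can be assessed with Fisher's exact test, one of the tests listed in Section 2.4. The following Python sketch is an independent illustration and not the STAMP implementation; the read counts are hypothetical and were chosen only to resemble a drop from roughly 5 % to 1 %.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, col1, n = a + b, a + c, a + b + c + d

    def pmf(k: int) -> float:
        # Probability of k counts in the top-left cell given fixed margins.
        return comb(col1, k) * comb(n - col1, row1 - k) / comb(n, row1)

    p_obs = pmf(a)
    k_min = max(0, row1 + col1 - n)
    k_max = min(row1, col1)
    total = sum(
        p for k in range(k_min, k_max + 1)
        if (p := pmf(k)) <= p_obs * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, total)


# Hypothetical counts: reads assigned to a pathway vs. all other reads.
hour0 = (48, 952)    # 48 of 1000 annotated reads (4.8 %)
hour14 = (10, 990)   # 10 of 1000 annotated reads (1.0 %)

p_value = fisher_exact_two_sided(*hour0, *hour14)
print(f"p = {p_value:.2e}")
```

Because the test works on counts rather than percentages, the same difference in proportions gives a smaller p-value in a deeply sequenced sample than in a shallow one.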
Figure 2. Estimation of the abundance of genes related to amino acid metabolism in coffee mucilage
metagenomes using KEGG orthology (KO). Numbers denote the pathway entry in the KEGG database.
Figure 3. Estimation of the abundance of genes related to carbohydrate metabolism in coffee mucilage
metagenomes using KEGG orthology (KO) at level 4 (function level).
3.2 Abundance of genes related to carbohydrate metabolism
Regarding carbohydrate metabolism, the most abundant groups of genes were related to the
glycolysis/gluconeogenesis, pentose phosphate and pyruvate (Entner-Doudoroff) metabolism pathways. This
result was expected since these are the three central pathways of the intermediate metabolism of
carbohydrates in microorganisms (Varela 2002), which process carbohydrates and carboxylic acids while
also providing metabolic precursors for other metabolic pathways. The abundance of genes involved in several
other metabolic pathways significantly increased (e.g. galactose metabolism) or decreased (e.g. inositol
phosphate metabolism) during both fermentations. At the function level, the gene with the highest
abundance was that encoding 6-phospho-beta-glucosidase [EC 3.2.1.86] (Figure 3), an enzyme that acts on
β-1,4-glycosidic bonds of oligosaccharides and produces glucose monomers as the final product (Casablanca
et al. 2011). This enzyme also has transferase and transglucosidase activities, being able to generate products
larger than the initial oligosaccharides. For this reason, glucosidases, along with glucotransferases,
constitute the major catalytic machinery for the synthesis and breakdown of glucosidic bonds in
microorganisms (Casablanca et al. 2011). The abundance of the 6-phospho-beta-glucosidase gene significantly
increased over time in both fermentations, but more noticeably during the fermentation conducted under
non-controlled temperature, increasing from 11 % at hour 0 to 36 %, 45 % and 43 % at hours 14, 18 and 24,
respectively. During the fermentation carried out at 11 °C, this abundance did not increase significantly until
hour 24 (43 %).
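As a worked example of the change described above, the fold increase of the gene annotated as EC 3.2.1.86 in the non-controlled temperature fermentation can be computed directly from the reported percentages. The Python sketch below uses only the values given in the text; the variable and function names are illustrative.

```python
# Relative abundance (%) reported for the non-controlled temperature fermentation.
abundance_by_hour = {0: 11.0, 14: 36.0, 18: 45.0, 24: 43.0}


def fold_changes(series: dict[int, float], baseline_hour: int = 0) -> dict[int, float]:
    """Fold change of each later time point relative to the baseline hour."""
    baseline = series[baseline_hour]
    return {
        hour: round(value / baseline, 2)
        for hour, value in series.items()
        if hour != baseline_hour
    }


result = fold_changes(abundance_by_hour)
print(result)  # {14: 3.27, 18: 4.09, 24: 3.91}
```

With these values, the abundance roughly triples by hour 14 and peaks at about four times the initial level at hour 18.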
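Table 1. Flavor notes found in brewed coffee depending on the temperature and time of coffee fermentation
Fermentation temperature                          Fermentation time   Predominant flavor note
Non-controlled temperature conditions (22-26 °C)  Hour 0 (TNH0)       Intense caramel
Non-controlled temperature conditions (22-26 °C)  Hour 14 (TNH14)     Caramel
Non-controlled temperature conditions (22-26 °C)  Hour 18 (TNH18)     Coriander grain
Non-controlled temperature conditions (22-26 °C)  Hour 24 (TNH24)     Sweet chocolate
Controlled temperature conditions (11 °C)         Hour 0 (T11H0)      Bitter chocolate
Controlled temperature conditions (11 °C)         Hour 14 (T11H14)    Sweet chocolate
Controlled temperature conditions (11 °C)         Hour 18 (T11H18)    Bitter chocolate
Controlled temperature conditions (11 °C)         Hour 24 (T11H24)    Caramel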
3.3 Abundance of genes during fermentation and relationship to coffee cupping
The complex communities involved in coffee fermentation are responsible for the production of different
pectolytic, saccharolytic and lipolytic enzymes which degrade mucilage and generate metabolites such as
acids, alcohols, esters and ketones. These enzymes modify the composition of the coffee grains, as well as
their odor, color and pH (Lee et al. 2015). In addition, fermentation temperature, time and water content have
a strong effect on the final product. As observed in Table 1, different flavor notes were generated in the final
product depending on the type and time of coffee fermentation. In general, the brewed coffees produced from
the non-controlled temperature fermentation presented a predominant caramel flavor, whereas chocolate
flavors were generated from the controlled temperature fermentation. These flavors are part of the normal
range of primary flavors obtained in the Caturra coffee variety (Stumptown 2017). By analyzing gene abundance
variations, their corresponding enzyme activities and the predominant flavor notes in each coffee, we found
similarities suggesting a relationship with the results from coffee cupping. For example, the N-acetyl-lysine
deacetylase [EC:3.5.1.-] gene was detected only at hour 18 of the non-controlled temperature fermentation,
coinciding with the appearance of the coriander grain flavor. This suggests that the enzyme could be
associated with the appearance of this particular flavor. This enzyme is involved in the removal of acetyl
groups from lysine residues and is mainly produced by yeasts (Sengupta and Seto 2004). On the other hand,
while the proportion of genes involved in the degradation of lysine decreased during the 11 °C fermentation,
coffee flavor notes also changed from chocolate to caramel notes by the end of the fermentation.
Even though the genes encoding pyruvate dehydrogenase [EC 1.2.4.1] and 6-phosphofructokinase [EC
2.7.1.11] were not the most abundant during fermentations, we observed interesting differences when
comparing their abundances between the two fermentations. These genes were detected during the first hours
of the non-controlled temperature fermentation (hours 0 and 14), but were undetectable at hours 18 and 24. In
contrast, both genes were detected throughout the entire course of the 11 °C controlled temperature
fermentation. This could be related to the occurrence of certain flavor notes, particularly chocolate flavor
notes.
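The co-occurrence between gene detection and flavor notes discussed in this section can be tabulated with a short Python sketch. The flavor notes are those of Table 1 and the detection patterns are those stated in the text; the data structures and function name are illustrative, and co-occurrence in eight samples does not by itself establish a causal link.

```python
# Predominant flavor note per sample (Table 1).
FLAVOR_NOTE = {
    "TNH0": "Intense caramel",
    "TNH14": "Caramel",
    "TNH18": "Coriander grain",
    "TNH24": "Sweet chocolate",
    "T11H0": "Bitter chocolate",
    "T11H14": "Sweet chocolate",
    "T11H18": "Bitter chocolate",
    "T11H24": "Caramel",
}

ALL_11C = {"T11H0", "T11H14", "T11H18", "T11H24"}

# Samples in which each gene was detected, as described in the text.
DETECTED_IN = {
    "N-acetyl-lysine deacetylase": {"TNH18"},
    "pyruvate dehydrogenase": {"TNH0", "TNH14"} | ALL_11C,
    "6-phosphofructokinase": {"TNH0", "TNH14"} | ALL_11C,
}


def notes_by_detection(gene: str) -> tuple[list[str], list[str]]:
    """Split flavor notes into (gene detected, gene not detected)."""
    detected, absent = [], []
    for sample, note in FLAVOR_NOTE.items():
        (detected if sample in DETECTED_IN[gene] else absent).append(note)
    return detected, absent


for gene in DETECTED_IN:
    detected, absent = notes_by_detection(gene)
    print(f"{gene}: detected -> {detected}; not detected -> {absent}")
```

Laid out this way, the samples in which pyruvate dehydrogenase and 6-phosphofructokinase were detected include both caramel and chocolate notes, which is why the association with chocolate flavors is stated only as a possibility.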
4. Conclusions
In conclusion, different temperatures and conditions in coffee fermentations produced appreciable changes in
the functional profiles, especially those associated with amino acid and carbohydrate metabolism, which in turn
could greatly affect the taste and quality of coffee. Functional diversity was higher in the fermentation carried
out at higher temperatures, under non-controlled conditions. Several genes, including those encoding
N-acetyl-lysine deacetylase, pyruvate dehydrogenase and 6-phosphofructokinase, appear to be involved in the
occurrence of certain specific flavors in coffee, depending on the temperature and time of fermentation. This
information, together with the results from coffee cupping, provided valuable insights into the role that
microorganisms involved in coffee fermentation play in obtaining better taste attributes, and helped to identify
key genes and potential metabolic pathways associated with these special attributes.
Acknowledgments
This work was supported by Universidad de Santander project PICF0115207733347EJ, Penagos Hermanos y
Compañía and El Roble coffee farm.
References
Bade-Wegner H., Bendig I., Holscher W., Wollmann R., 1997, Volatile compounds associated with the over-
fermented flavour defect, In: 17th International Scientific Colloquium on Coffee - Chemistry, Nairobi, 1997.
Association for Science and Information on Coffee - ASIC, p 21
Casablanca E., Ríos N., Terrazas Siles E., Álvarez M., 2011, β-glucoside production by thermophilic bacteria
indigenous the bolivian altiplano culture, Rev Colomb Biotecnol, 13, 66-72.
Langmead B., Trapnell C., Pop M., Salzberg S.L., 2009, Ultrafast and memory-efficient alignment of short
DNA sequences to the human genome, Genome biology, 10, R25. DOI:10.1186/gb-2009-10-3-r25
Lee L.W., Cheong M.W., Curran P., Yu B., Liu S.Q., 2015, Coffee fermentation and flavor--An intricate and
delicate relationship, Food Chem, 185, 182-191. DOI:10.1016/j.foodchem.2015.03.124
Masoud W., Cesar L.B., Jespersen L., Jakobsen M., 2004, Yeast involved in fermentation of Coffea arabica in
East Africa determined by genotyping and by direct denaturating gradient gel electrophoresis, Yeast, 21,
549-556. DOI:10.1002/yea.1124
Meyer F. et al., 2008, The metagenomics RAST server - a public resource for the automatic phylogenetic and
functional analysis of metagenomes, BMC bioinformatics, 9, 386. DOI:10.1186/1471-2105-9-386
Parks D.H., Beiko R.G., 2010, Identifying biologically relevant differences between metagenomic communities,
Bioinformatics, 26, 715-721. DOI:10.1093/bioinformatics/btq041
Puerta G.I., 2011, Chemical composition of coffee mucilage according to fermentation and refrigeration time,
Cenicafé, 62, 23-40.
Sengupta N., Seto E., 2004, Regulation of histone deacetylase activities, Journal of cellular biochemistry, 93,
57-67. DOI:10.1002/jcb.20179
Stumptown, 2017, Coffee varieties, Stumptown Coffee Roasters, accessed 2017.
Varela G., 2002, Bacterial physiology and metabolism, accessed 2017.