Metabolite Profiles of Sukabumi Arabica Green Coffee Beans Evaluated by ¹H NMR-Based Metabolomics

Although Sukabumi arabica coffee is one of the famous coffees in Indonesia, its chemical information is still limited in the literature. In this work, the metabolite profiles of arabica green coffee beans obtained from three plantations in Sukabumi, namely Ciayunan, Pondok Halimun, and Selabintana, were evaluated by ¹H NMR-based metabolomics. In total, 19 metabolites were identified, including the major and minor metabolites of the coffee. The score plot of the OPLS-DA (orthogonal partial least squares discriminant analysis) model classified the metabolite profiles of the coffees according to their origins. The loading plot analysis showed that the signals belonging to fatty acids, sucrose, trigonelline, chlorogenic acid, lactic acid, and quinic acid contributed to the classification. S-plot analysis revealed that Selabintana coffee was characterized by higher concentrations of trigonelline and sucrose, whereas the Ciayunan sample had higher levels of fatty acids. Meanwhile, the metabolite profile of Pondok Halimun coffee was intermediate between those of the Ciayunan and Selabintana samples. This work provides valuable scientific information for coffee development, especially in West Java and more generally in Indonesia.


INTRODUCTION
Coffee, known as one of the most popular beverages worldwide, is consumed extensively and has become an integral part of daily life (Arboleda et al., 2018). Coffee has developed not only as a daily routine drink but also as an expression of lifestyle (Yeretzian, 2017). Approximately 30-40% of the global population consumes coffee daily (Coffee Rank, 2022). Among the various coffee species, Coffea arabica (arabica coffee) dominates the market, accounting for around 60% of coffee consumption, while the remaining 40% consists of Coffea canephora (robusta coffee) (International Coffee Organization, 2021). Despite their high demand, both varieties have drawbacks.
Recently, single-origin coffee beans have gained popularity among coffee consumers. Single-origin coffee refers to coffee beans grown in a specific geographic location, producing a distinct flavor profile unique to that region.
Hameed et al. (2018) reported that geographical factors, including altitude, soil slope, soil composition, rainfall, temperature, climate, and sun exposure, affect the flavor of coffee beans. As one of the largest coffee producers and exporters globally, Indonesia has many coffee plantations. Sukabumi has been one of the centers of coffee plantations in West Java since the Dutch colonial era. Arabica coffee is the predominant type cultivated in Sukabumi (Dinas Komunikasi, 2023). However, the metabolite profile of Sukabumi arabica coffee remains mostly unexplored. Information on the metabolite profiles could serve as a valuable reference for planning the development of Sukabumi arabica coffees, and as scientific evidence of the coffee's excellence.
¹H NMR-based metabolomics is widely applied to identify metabolite profiles in a biological system. According to Burgess et al. (2014), ¹H NMR-based metabolomics offers several advantages, including non-destructive analysis, easy sample preparation, absolute quantification, chemical structure information, and independence from analyte polarity. Several researchers have successfully employed ¹H NMR-based metabolomics in green coffee bean studies. For instance, Kwon et al. (2015) applied ¹H NMR-based metabolomics to investigate the quality of green coffee, while Happyana et al. (2021) used the same method to discriminate the metabolite profiles of the green beans of Indonesian arabica coffee varieties. Febrina et al. (2021) identified the characteristic metabolites of green beans of Luwak coffee using ¹H NMR-based metabolomics. Other works employed ¹H NMR-based metabolomics to discriminate coffee samples based on their geographical origins (Consonni et al., 2012; da Silva, 2014; Varana et al., 2015; Markos et al., 2023; V. Gottstein et al., 2024).
In this work, the chemical profiles of the green arabica coffee beans obtained from three coffee plantations in Sukabumi, namely Ciayunan, Pondok Halimun, and Selabintana, were evaluated with ¹H NMR-based metabolomics. The OPLS-DA (orthogonal projections to latent structures discriminant analysis) technique was employed to evaluate the similarities and differences in metabolite profiles among the coffee samples. The discriminant compounds specific to each coffee were revealed by analyzing S-plots obtained from two-class OPLS-DA models. To the best of our knowledge, this is the first report on the application of ¹H NMR-based metabolomics to Sukabumi arabica coffees.

EXPERIMENTAL SECTION
Materials
Green beans of Sukabumi arabica coffee were collected from three coffee plantations in Sukabumi, namely Ciayunan, Pondok Halimun, and Selabintana. Further details of the samples can be found in Table 1. D2O was employed as the extraction solvent, NaH2PO4 and Na2HPO4 were used to prepare the buffer solution, and TSP (3-(trimethylsilyl)-2,2,3,3-tetradeuteropropionic acid sodium salt) served as the standard compound for chemical shift calibration. All the chemicals mentioned were purchased from Merck (Darmstadt, Germany).

Extraction
The green beans of Sukabumi arabica coffee were ground with a 600N coffee grinder (Yang Chia Machine Work, Taiwan). The finely ground coffee was then placed into a 4 mL plastic tube and mixed with 2 mL of a D2O solution containing 1.0 mM TSP and phosphate buffer (pH 6.0). To ensure homogeneity, the sample was vortexed using the Digital Vortex Genie 2 (Scientific Industries Inc., New York, USA) for 1 minute and subsequently sonicated for 20 minutes using the Ultrasonic Cleaner LUC-250H Mujigae (Sungdong Ultrasonic Co., Seoul, South Korea). The sample was then incubated for 30 minutes at 90 °C in a Memmert WNB 14 water bath (Memmert, Buchenbach, Germany) and allowed to stand at room temperature for 10 minutes. The sample was centrifuged for 6 minutes at 12,000 rpm using the Microcentrifuge MC-12 Benchmark (Benchmark Scientific Inc., New Jersey, United States). Finally, 500 μL of the obtained supernatant was transferred into a 5 mm NMR tube for NMR measurement.

Measurement of ¹H NMR Spectra
The ¹H NMR spectra of the coffee samples were measured using a Varian Unity INOVA-500 spectrometer (Agilent Technologies, California, United States) operating at a frequency of 500 MHz. A presaturation pulse sequence was employed for the measurements, with an acquisition time of 2.72 seconds, a relaxation delay of 2 seconds, a spectral width of 8012 Hz, and 64K data points. ACD/Labs 12.0 software (Advanced Chemistry Development, Inc., Toronto, Canada) was used to process the raw data generated from the ¹H NMR measurements. The chemical shifts were calibrated to the signal of TSP.
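As a quick check on the acquisition parameters, the reported spectral width can be converted from Hz to ppm, and the point spacing before zero filling can be estimated from the acquisition time. The sketch below assumes the nominal 500 MHz frequency stated above; the exact spectrometer frequency is not given, and the function names are illustrative only.

```python
# Acquisition parameters as reported in the text.
SPECTROMETER_MHZ = 500.0      # nominal 1H frequency (assumed exact here)
SPECTRAL_WIDTH_HZ = 8012.0
ACQUISITION_TIME_S = 2.72


def spectral_width_ppm(width_hz: float, frequency_mhz: float) -> float:
    """Convert a spectral width from Hz to ppm (1 ppm equals frequency_mhz Hz)."""
    return width_hz / frequency_mhz


def digital_resolution_hz(acquisition_time_s: float) -> float:
    """Frequency spacing between acquired points before any zero filling."""
    return 1.0 / acquisition_time_s


if __name__ == "__main__":
    width = spectral_width_ppm(SPECTRAL_WIDTH_HZ, SPECTROMETER_MHZ)
    resolution = digital_resolution_hz(ACQUISITION_TIME_S)
    print(f"spectral width: {width:.2f} ppm")
    print(f"digital resolution: {resolution:.3f} Hz/point")
```

Under these assumptions the window spans about 16 ppm, which comfortably covers the 0.50-10.00 ppm range used later for bucketing.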

Data Processing of 1 H NMR Spectra
All ¹H NMR spectra were processed by bucketing using ACD/Labs 12.0 software. Bucketing was performed in the chemical shift range of 0.50-10.00 ppm, and each bucket was integrated over a width of 0.04 ppm. Some signals were excluded to facilitate the analysis. For instance, the residual water signal at δ 4.71-5.23 ppm and the signals at δ 3.13-3.41 ppm of caffeine, which forms a complex with chlorogenic acid, were removed from the analysis. The chemical shifts of these signals varied between spectra, which complicated the analysis. The collected data were subsequently imported into Microsoft Excel for normalization using the sum method to mitigate potential bias effects.
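The bucketing and sum normalization described above can be sketched as follows. This is a minimal illustration and not the ACD/Labs or Excel procedure itself: it assumes a spectrum given as parallel lists of chemical shifts and intensities on a uniform grid, approximates integration by summing point intensities, and drops individual points that fall inside the excluded regions. All function and constant names are hypothetical.

```python
import math
from typing import List, Sequence

BUCKET_WIDTH = 0.04                  # ppm, as stated in the text
PPM_MIN, PPM_MAX = 0.50, 10.00       # bucketed range, as stated in the text
EXCLUDED_REGIONS = [(3.13, 3.41), (4.71, 5.23)]  # caffeine and residual water


def bucket_spectrum(ppm: Sequence[float], intensity: Sequence[float]) -> List[float]:
    """Sum intensities into fixed-width buckets, skipping excluded regions."""
    n_buckets = math.ceil((PPM_MAX - PPM_MIN) / BUCKET_WIDTH)
    sums = [0.0] * n_buckets
    for shift, value in zip(ppm, intensity):
        if not (PPM_MIN <= shift < PPM_MAX):
            continue  # outside the bucketed range
        if any(low <= shift <= high for low, high in EXCLUDED_REGIONS):
            continue  # excluded signal region
        index = min(int((shift - PPM_MIN) / BUCKET_WIDTH), n_buckets - 1)
        sums[index] += value
    return sums


def normalize_to_sum(buckets: Sequence[float]) -> List[float]:
    """Divide every bucket by the total so that the buckets sum to 1."""
    total = sum(buckets)
    if total == 0:
        raise ValueError("cannot normalize a spectrum with zero total intensity")
    return [value / total for value in buckets]
```

Sum normalization expresses each bucket as a fraction of the total retained intensity, so spectra from samples with slightly different overall concentrations become comparable.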

Multivariate Statistical Analysis
The processed data were further investigated with multivariate statistical analysis using SIMCA software version 12 (Umetrics, Umeå, Sweden) to reveal the metabolite profiles of the green coffee bean samples. Orthogonal partial least squares discriminant analysis (OPLS-DA) with Pareto scaling was employed as the main model. The total variation explained by the models (R²X and R²Y) and the variation predicted by the models based on cross-validation (Q²) were calculated.
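Pareto scaling mean-centers each variable and divides it by the square root of its standard deviation, which reduces the dominance of intense signals without inflating noise as much as unit-variance scaling does. The sketch below is an illustration under stated assumptions (rows are samples, columns are buckets, sample standard deviation is used); it is not SIMCA's internal implementation, and the function name is hypothetical.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def pareto_scale(matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Mean-center each column and divide it by the square root of its
    sample standard deviation. Constant columns are only centered."""
    scaled_columns = []
    for column in zip(*matrix):
        center = mean(column)
        spread = stdev(column)  # requires at least two samples
        divisor = math.sqrt(spread) if spread > 0 else 1.0
        scaled_columns.append([(value - center) / divisor for value in column])
    # Transpose back so that rows are samples again.
    return [list(row) for row in zip(*scaled_columns)]
```

For example, a column with values 1, 3, and 5 has a mean of 3 and a sample standard deviation of 2, so each centered value is divided by the square root of 2.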

RESULTS AND DISCUSSION
Detected Metabolites
Eighteen ¹H NMR spectra of Sukabumi arabica green coffee beans were successfully measured. All ¹H NMR spectra are shown in Figure 1. The spectra were further studied to identify metabolites in the coffee samples.
Each metabolite was identified by detecting its characteristic signals and comparing them to the reference spectrum obtained from the HMDB database (https://hmdb.ca/). The detected signals were further verified against data from the literature (Happyana et al., 2021; Kwon et al., 2015; Wei et al., 2011). In total, 19 metabolites were detected in the ¹H NMR spectra. Signals belonging to caffeine, sucrose, trigonelline, and chlorogenic acids (3-chlorogenic acid, 4-chlorogenic acid, and 5-chlorogenic acid) were detected clearly in the ¹H NMR spectra, as seen in Figure 2. This showed that these compounds are the major metabolites of green beans of Sukabumi arabica coffee. Other identified acidic compounds were acetic acid, quinic acid, citric acid, formic acid, lactic acid, malic acid, and fatty acids (lipids). Three amino acids, namely alanine, asparagine, and gamma-aminobutyric acid (GABA), were also detected in the ¹H NMR spectra. Other compounds identified in the green coffee bean samples were glucose, choline, and myo-inositol.
The protons belonging to caffeine were detected as singlets at chemical shifts (δ) of 3.19, 3.36, 3.85, and 7.75 ppm. Triplet signals of sucrose were clearly identified at δ 3.50 and 3.78 ppm.
The signals of amino acids, including alanine, asparagine, and GABA, were successfully detected in the ¹H NMR spectra of the coffee samples. The alanine signal was identified at δ 1.51 ppm with doublet multiplicity, while two asparagine signals were found at δ 2.87 (dd) and 2.97 ppm (dd). GABA signals were detected at δ 2.34 and 3.04 ppm with triplet multiplicity. The signals of other compounds, including choline, glucose, and myo-inositol, were also identified in the ¹H NMR spectra. Choline signals were detected at δ 3.22 (s) and 3.50 ppm (m). Glucose showed a doublet signal at δ 4.67 ppm, while myo-inositol signals were identified at δ 3.29 (t), 3.50 (m), and 3.66 ppm (m).

Sukabumi Arabica Coffee Discrimination
Metabolite profile analysis of Sukabumi arabica green coffee beans was carried out by orthogonal partial least squares discriminant analysis (OPLS-DA) modeling. OPLS-DA is often used instead of PLS-DA because it separates the variation that is predictive of group membership from the variation unrelated to it. OPLS-DA makes models more rigid and more easily interpreted (Worley & Powers, 2016). The OPLS-DA model had 2 x-y predictive components and 7 x-orthogonal components, accounting for a total model variation of 94.1% (R²X) and 96.7% (R²Y). Meanwhile, the variation predicted by the model based on cross-validation (Q²) was 59.1%. Because the Q² value exceeded 50%, the OPLS-DA model was considered to have good predictive ability. Figure 3a shows the score plot of the OPLS-DA model, which classified the samples of green arabica coffee beans according to their region of origin. The score plot was generated by combining the data of component 1 (R²X[1] = 13.5%) and component 2 (R²X[2] = 5.5%). Score plots were used to show similarities and differences between samples of green beans of Sukabumi arabica coffee. To identify the compounds contributing to sample classification, the corresponding loading plot was investigated further (Figure 3b).
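The difference between the fitted R²Y and the cross-validated Q² can be illustrated with a short sketch. Both statistics follow the same formula, one minus a residual sum of squares divided by the total sum of squares; R²Y uses fitted values, whereas Q² uses predictions for samples held out during cross-validation. This is a simplified single-response illustration, not SIMCA's exact computation, and the numbers in the example are invented placeholders.

```python
from typing import Sequence


def explained_fraction(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Return 1 - (residual sum of squares / total sum of squares).

    With fitted values this gives R2Y; with cross-validated predictions
    (held-out samples) the same formula gives Q2.
    """
    mean_observed = sum(observed) / len(observed)
    total_ss = sum((y - mean_observed) ** 2 for y in observed)
    residual_ss = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - residual_ss / total_ss


if __name__ == "__main__":
    # Invented example: dummy class variable (1 or 0) for four samples.
    y = [1.0, 1.0, 0.0, 0.0]
    fitted = [0.9, 1.1, 0.1, -0.1]          # hypothetical fitted values
    cross_validated = [0.6, 0.9, 0.5, 0.2]  # hypothetical held-out predictions
    print("R2Y:", explained_fraction(y, fitted))
    print("Q2:", explained_fraction(y, cross_validated))
```

Held-out predictions are normally less accurate than fitted values, which is why Q² is typically lower than R²Y, as in the model reported here.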
The loading plot analysis revealed that 6 compounds contributed the most to the classification of Sukabumi arabica green coffee beans based on their origins. Sucrose buckets at δ 3.49-3.54, 3.56-3.62, 3.68-3.74, 4.05-4.11, and 5.41-5.47 ppm, as well as trigonelline buckets at δ 4.41-4.47 and 8.80-8.86 ppm, contributed to the classification of coffee samples based on regional origin. The position of these buckets was consistent with the position of the Selabintana coffee cluster in the score plot, which indicates that sucrose and trigonelline were the distinguishing compounds for this coffee sample. Meanwhile, buckets belonging to fatty acids (δ 0.91-0.97, 1.27-1.29, and 1.29-1.33 ppm), lactic acid (δ 1.33-1.39 ppm), and quinic acid (δ 4.14-4.19 ppm) were found to be differentiators for the Ciayunan coffee samples. Buckets belonging to chlorogenic acid at δ 2.10-2.14, 2.23-2.29, and 3.87-3.93 ppm were adjacent to the Pondok Halimun coffee cluster, suggesting that chlorogenic acid was a discriminant compound for Pondok Halimun coffee. In addition, fatty acids and sucrose contributed the most to classifying the metabolite profiles by geographical origin, because the buckets of these two compounds lie farthest from the origin (0,0) of the loading plot.
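Ranking buckets by their distance from the origin of the loading plot can be sketched as below. The loading values in the test-style example are invented placeholders, not values from this study, and the function name is hypothetical.

```python
import math
from typing import Dict, List, Tuple


def rank_by_distance(loadings: Dict[str, Tuple[float, float]]) -> List[str]:
    """Order bucket labels by Euclidean distance of their (p1, p2) loading
    from the origin (0, 0), farthest first."""
    return sorted(
        loadings,
        key=lambda label: math.hypot(*loadings[label]),
        reverse=True,
    )


if __name__ == "__main__":
    # Placeholder loadings for three unnamed buckets (not measured data).
    example = {
        "bucket_A": (0.10, 0.00),
        "bucket_B": (-0.30, 0.40),
        "bucket_C": (0.20, 0.20),
    }
    print(rank_by_distance(example))
```

Buckets far from the origin carry large loadings on at least one predictive component, so they influence the separation seen in the score plot more than buckets near the center.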
The coffee plantations of Pondok Halimun and Selabintana are both located on Mount Gede and thus share the same geographical conditions. However, the coffee sample from Pondok Halimun is a single variety (ateng super), whereas the sample obtained from Selabintana is a multi-variety coffee, a mixture of ateng super, lini S, and typica. The coffee plantation of Ciayunan is located in the Gegerbitung region and has different geographical conditions from the others. The Ciayunan sample is a multi-variety coffee, a mixture of ateng super, lini S, and sigararutang. To investigate the effect of geographical conditions versus single-variety and multi-variety composition on the metabolite profiles of Sukabumi arabica green coffee beans, a hierarchical clustering analysis (HCA) plot was analyzed. As depicted in Figure 3c, Pondok Halimun coffee was more closely related to the Selabintana sample than to the Ciayunan sample, indicating that the metabolite profiles of these two coffees were more similar. Therefore, the result suggested that geographical conditions had more influence on the metabolite profiles of Sukabumi arabica coffees than single-variety or multi-variety composition.
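The idea behind HCA can be illustrated with a small agglomerative clustering sketch. The text does not state which distance measure or linkage was used, so Euclidean distance and average linkage are assumptions here, and the function names are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

Cluster = Tuple[str, ...]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles: Dict[str, Sequence[float]]) -> List[Tuple[Cluster, Cluster, float]]:
    """Repeatedly merge the two clusters with the smallest mean pairwise
    distance. Returns the merge history as (cluster, cluster, distance)."""
    clusters: Dict[Cluster, List[Sequence[float]]] = {
        (name,): [vector] for name, vector in profiles.items()
    }
    merges = []

    def mean_distance(pair: Tuple[Cluster, Cluster]) -> float:
        left, right = pair
        distances = [euclidean(a, b) for a in clusters[left] for b in clusters[right]]
        return sum(distances) / len(distances)

    while len(clusters) > 1:
        left, right = min(combinations(list(clusters), 2), key=mean_distance)
        merges.append((left, right, mean_distance((left, right))))
        clusters[left + right] = clusters.pop(left) + clusters.pop(right)
    return merges
```

In the resulting merge history, samples joined early and at small distances have the most similar profiles, which is how the closer relationship between two groups is read from a dendrogram.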
Three two-class OPLS-DA models were created to identify the typical metabolites of each arabica coffee bean sample, and their statistical information can be found in Table 3. As depicted in Figure 4a-c, all score plots of the two-class OPLS-DA models separated the samples according to their cultivation origins. The corresponding S-plots of the two-class OPLS-DA models were analyzed in depth to evaluate the compounds contributing to the separation. Figure 5a displays the S-plot for the differentiation between Pondok Halimun and Selabintana green coffee beans. The analysis revealed that fatty acids and chlorogenic acid were the key distinguishing compounds for Pondok Halimun coffee, while sucrose and trigonelline were the differentiating compounds for Selabintana coffee. Figure 5b shows the S-plot highlighting the compounds responsible for distinguishing between the green beans of Ciayunan and Selabintana arabica coffees. In this separation, quinic acid and fatty acids were identified as the differentiating compounds for Ciayunan arabica coffee, while sucrose and trigonelline played a significant role in distinguishing Selabintana arabica coffee. In Figure 5c, the S-plot reveals the compounds that distinguish the green beans of Ciayunan arabica coffee from those of Pondok Halimun arabica coffee. Fatty acids and lactic acid were identified as the distinguishing compounds for Ciayunan arabica coffee, while sucrose and trigonelline were found to be characteristic compounds for Pondok Halimun arabica coffee in this separation.
By analyzing the S-plots of the two-class OPLS-DA models (Figure 5), the characteristic metabolites that distinguish each sample of Sukabumi arabica green coffee beans could be identified. Fatty acids consistently emerged as the distinctive compounds for Ciayunan arabica coffee in the corresponding S-plots. On the other hand, sucrose and trigonelline consistently stood out as characteristic compounds for Selabintana coffee across all related S-plots, underscoring their importance in differentiating Selabintana coffee. In contrast, the green beans of Pondok Halimun arabica coffee lacked consistently differentiating compounds in the corresponding S-plots. This suggested that the metabolite profile of Pondok Halimun coffee fell between the profiles of the Ciayunan and Selabintana samples, indicating an intermediate metabolic composition.
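An S-plot places each variable at the covariance (horizontal axis) and the correlation (vertical axis) between that variable and the predictive score vector of a two-class model. The sketch below computes both coordinates for a single bucket under the assumption that the predictive scores are already available; it is an illustration rather than the SIMCA implementation, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def s_plot_coordinates(scores: Sequence[float], column: Sequence[float]) -> Tuple[float, float]:
    """Return (covariance, correlation) between the predictive scores and
    one variable, i.e. the two coordinates of that variable in an S-plot."""
    n = len(scores)
    score_mean = sum(scores) / n
    column_mean = sum(column) / n
    covariance = sum(
        (t - score_mean) * (x - column_mean) for t, x in zip(scores, column)
    ) / (n - 1)
    score_sd = math.sqrt(sum((t - score_mean) ** 2 for t in scores) / (n - 1))
    column_sd = math.sqrt(sum((x - column_mean) ** 2 for x in column) / (n - 1))
    if score_sd == 0 or column_sd == 0:
        return covariance, 0.0  # correlation undefined for constant data
    return covariance, covariance / (score_sd * column_sd)
```

Variables with both a large absolute covariance and a correlation close to 1 or -1 sit at the ends of the S shape and are the most reliable candidates for discriminant compounds; the sign indicates in which of the two classes the bucket is higher.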

Figure 2. Metabolite identification in the ¹H NMR spectrum of Sukabumi arabica green coffee beans.
A further sucrose triplet was detected at δ 4.08 ppm. Meanwhile, multiplet signals of sucrose were detected at δ 3.85, 3.88, and 3.92 ppm. Two doublet signals belonging to sucrose were identified at δ 4.25 and 5.44 ppm. Other signals belonging to sucrose were recorded at δ 3.58 (dd) and 3.71 ppm (s). The singlet peaks at δ 4.44 and 9.11 ppm are trigonelline signals. Other trigonelline signals were detected at δ 8.08 and 8.83 ppm with triplet multiplicity. The signals of the 3 isomers of chlorogenic acid, the other main compound in green beans of Sukabumi arabica coffee, are shown in detail in Table 2. Signals of other organic acids were also identified in the ¹H NMR spectra of Sukabumi coffees. The singlet signals of acetic and formic acids were detected at δ 1.98 and 8.49 ppm, respectively. Doublet peaks belonging to citric acid were found at δ 2.65 and 2.76 ppm. Two lactic acid signals were identified at δ 1.31 (br s) and 4.08 ppm (m). Quinic acid was identified by signals at δ 1.89 (m), 1.98 (m), 2.08 (m), 3.99 (m), and 4.18 ppm (m). Malic acid showed signals at δ 2.40 (m), 2.66 (m), and 4.34 ppm (m). The peaks at δ 0.93 (m), 0.98 (br s), and 1.31 ppm (m) were identified as fatty acid (lipid) signals. The signals belonging to the detected compounds are summarized in Table 2.

Figure 3. Score plot (a), loading plot (b), and HCA plot (c) computed from ¹H NMR spectra of Sukabumi arabica green coffee beans. The green coffee beans of Selabintana (red), Pondok Halimun (blue), and Ciayunan (green) were classified into different clusters, as depicted in the score plot (a).

Figure 5. S-plots of the two-class OPLS-DA models of green beans of Sukabumi arabica coffee: (a) Pondok Halimun vs. Selabintana; (b) Ciayunan vs. Selabintana; (c) Ciayunan vs. Pondok Halimun.

CONCLUSIONS
¹H NMR-based metabolomics successfully evaluated the metabolite profiles of green beans of Sukabumi arabica coffee from three coffee plantations, namely Ciayunan, Pondok Halimun, and Selabintana. In total, 19 metabolites were identified in the ¹H NMR spectra, including acetic acid, alanine, asparagine, caffeine, choline, chlorogenic acids, glucose, citric acid, formic acid, GABA, lactic acid, fatty acids (lipids), malic acid, myo-inositol, quinic acid, sucrose, and trigonelline. The score plot of the OPLS-DA model classified the metabolite profiles of the coffee samples according to their origins. Loading plot analysis showed that signals belonging to fatty acids, sucrose, trigonelline, chlorogenic acid, lactic acid, and quinic acid contributed to the classification. Sucrose and trigonelline were found to be the discriminant compounds of Selabintana coffee. Fatty acids were found to be characteristic metabolites of the Ciayunan samples. Pondok Halimun coffee had an intermediate metabolite profile between those of the Ciayunan and Selabintana samples.

Table 1. Information on green bean samples of Sukabumi arabica coffee

Table 2. ¹H NMR signals of the identified compounds

Table 3. The statistical information of two-class OPLS-DA models.