CBD: Coffee Beans Dataset

Bipin Nair B J¹, Abrav Nanda K M¹, Shalwin A S¹, V Raghavendra¹

PMID: 40201542
PMCID: PMC11978365
DOI: 10.1016/j.dib.2025.111434

Data Brief. 2025 Mar 3:59:111434. doi: 10.1016/j.dib.2025.111434. eCollection 2025 Apr.

Affiliation

¹ Department of Computer Science, Amrita School of Computing, Amrita Vishwa Vidyapeetham, Mysuru, Karnataka, India.

Abstract

The development of advanced coffee bean classification techniques depends on the availability of high-quality datasets. Coffee bean quality is influenced by various factors, including bean size, shape, colour, and defects such as fungal damage, full black, full sour, broken beans, and insect damage. Constructing an accurate and reliable ground truth dataset for coffee bean classification is a challenging and labour-intensive process. To address this need, we introduce the Coffee Beans Dataset (CBD), which contains 450 high-resolution images sampled across 9 distinct coffee bean grades (A, AA, AAA, AB, C, PB-I, PB-II, BITS, and BULK), with 50 images per class. These samples were sourced from Wayanad, Kerala, reflecting the region's diverse coffee bean quality. This dataset is specifically designed to support machine learning and deep learning models for coffee bean classification and grading. By providing a comprehensive and diverse dataset, we aim to address key challenges in coffee quality assessment and to improve classification accuracy. When tested on the dataset, an EfficientNet-B0 model achieved an accuracy of 100%, demonstrating the dataset's potential to enhance automated coffee bean grading systems. The CBD serves as a valuable resource for researchers and industry professionals, promoting innovation in coffee quality monitoring and classification algorithms.

Keywords: Brightness; Coffee bean; Contrast; Grayscale.
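
The class layout described in the abstract (9 grades, 50 images each, 450 in total) can be checked and split for model evaluation with a short script. The following is a minimal sketch, not the authors' pipeline: it assumes one folder per grade named exactly as the grades are written in the abstract, assumes common image file extensions, and the helper names `index_dataset` and `stratified_split` are hypothetical.

```python
import random
from pathlib import Path

# Grade names as listed in the abstract; using them as folder names is an assumption.
GRADES = ["A", "AA", "AAA", "AB", "C", "PB-I", "PB-II", "BITS", "BULK"]
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed file types


def index_dataset(root):
    """Map each grade folder under `root` to a sorted list of its image paths."""
    index = {}
    for grade in GRADES:
        folder = Path(root) / grade
        index[grade] = sorted(
            p for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
        )
    return index


def stratified_split(index, test_fraction=0.2, seed=0):
    """Split every grade separately so both subsets keep the class balance."""
    rng = random.Random(seed)
    train, test = [], []
    for grade, paths in index.items():
        shuffled = paths[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test += [(p, grade) for p in shuffled[:n_test]]
        train += [(p, grade) for p in shuffled[n_test:]]
    return train, test
```

With 50 images per grade and the assumed 20% test fraction, each grade contributes 10 test images and 40 training images. Splitting per grade rather than over the pooled 450 images keeps every class equally represented in both subsets, which matters for a dataset this small.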

Figures

**Fig. 1**
(a) Grade A (b) Grade AA (c) Grade AAA (d) Grade AB (e) Grade C (f) Grade PB-I (g) Grade PB-II (h) Grade BITS (i) Grade BULK.

**Fig. 2**
Folder structure of proposed dataset.

**Fig. 3**
Various defects in beans.

**Fig. 4**
Dataset capturing setup.

**Fig. 5**
Ground truth images: (a) original image, (b) brightness-decreased grayscale image, (c) contrast-increased image.
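
The two image variants named in the Fig. 5 caption can be illustrated with simple per-pixel operations. This is a sketch under stated assumptions rather than the authors' processing: the abstract gives no parameters, so the brightness factor, contrast gain, and mid-gray pivot below are illustrative values, and the function names are hypothetical. Grayscale conversion uses the standard BT.601 luma weights. Images are represented as rows of 8-bit `(r, g, b)` tuples to keep the example dependency-free.

```python
def clamp(value):
    """Round to the nearest integer and keep it inside the 8-bit range."""
    return max(0, min(255, int(round(value))))


def to_dark_grayscale(pixels, brightness=0.7):
    """Convert RGB rows to grayscale (BT.601 luma), then scale brightness down.

    `brightness` is an assumed factor; values below 1 darken the image.
    """
    return [
        [clamp(brightness * (0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
        for row in pixels
    ]


def increase_contrast(pixels, gain=1.5, pivot=128):
    """Stretch each channel away from a mid-gray pivot; gain > 1 raises contrast.

    `gain` and `pivot` are assumed values, not taken from the paper.
    """
    return [
        [tuple(clamp(pivot + gain * (c - pivot)) for c in px) for px in row]
        for row in pixels
    ]


# Usage on a tiny 1x2 image.
image = [[(100, 100, 100), (200, 50, 0)]]
dark_gray = to_dark_grayscale(image)
high_contrast = increase_contrast(image)
```

Clamping after each transform matters because the contrast stretch pushes dark channels below 0 and bright channels above 255.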
