Petroleum Science > DOI: https://doi.org/10.1016/j.petsci.2025.04.013
A large-scale, high-quality dataset for lithology identification: Construction and applications (Open Access)
Article information
Authors: Jia-Yu Li, Ji-Zhou Tang, Xian-Zheng Zhao, Bo Fan, Wen-Ya Jiang, Shun-Yao Song, Jian-Bing Li, Kai-Da Chen, Zheng-Guang Zhao
Affiliations:
Submission date:
Cite as: Jia-Yu Li, Ji-Zhou Tang, Xian-Zheng Zhao, Bo Fan, Wen-Ya Jiang, Shun-Yao Song, Jian-Bing Li, Kai-Da Chen, Zheng-Guang Zhao, A large-scale, high-quality dataset for lithology identification: Construction and applications, Petroleum Science, 2025, https://doi.org/10.1016/j.petsci.2025.04.013.
Abstract
Lithology identification is a critical aspect of geoenergy exploration, including geothermal energy development, gas hydrate extraction, and gas storage. In recent years, artificial intelligence techniques based on drill core images have made significant strides in lithology identification, achieving high accuracy. However, the current demand for advanced lithology identification models remains unmet due to the lack of high-quality drill core image datasets. This study constructs and publicly releases the first open-source Drill Core Image Dataset (DCID), addressing the need for large-scale, high-quality datasets in lithology characterization tasks within geological engineering and establishing a standard dataset for model evaluation. DCID consists of 35 lithology categories and a total of 98,000 high-resolution images (512×512 pixels), making it the most comprehensive drill core image dataset in terms of lithology categories, image quantity, and resolution. Based on DCID, this study also provides lithology identification accuracy benchmarks for popular convolutional neural networks (CNNs) such as VGG, ResNet, DenseNet, and MobileNet, as well as for the Vision Transformer (ViT) and MLP-Mixer. Additionally, the sensitivity of model performance to various parameters and image resolution is evaluated. In response to real-world challenges, we propose a Real-World Data Augmentation (RWDA) method, leveraging slightly defective images from DCID to enhance model robustness. The study also explores the impact of real-world lighting conditions on the performance of lithology identification models. Finally, we demonstrate how to rapidly evaluate model performance across multiple dimensions using low-resolution datasets, advancing the application and development of new lithology identification models for geoenergy exploration.
Keywords
Geoenergy exploration; Lithology identification; Lithology dataset; Artificial intelligence; Deep learning; Drill core