From 50,000 Unstructured Files to a Market-Ready MegaSurvey Well Data Package

HDS indexed and classified more than 50,000 unstructured data files, then composited the well log data they contained, to deliver an interpretation-ready well data package for PGS's West Africa MegaSurvey. The package included a machine learning-based lithology analysis applied across the full well dataset.

The Challenge

PGS's West Africa MegaSurvey required a high-quality, well-structured well data package to complement its seismic offering. The raw input was a 126 GB dataset of over 50,000 unstructured files in multiple formats, including MS Office documents, PDFs, scanned raster images, LAS, DLIS, LIS and SEG-Y. Critically, well naming conventions within the data varied substantially, so files could not be linked to wells automatically without extensive analysis.

The Solution

HDS indexed and catalogued the full dataset using GeoSCOPE's data mining and Business Intelligence platform, classifying all files against multiple virtual taxonomies simultaneously. From the metadata analysis, a standardised well naming convention was established for the entire dataset, with a comprehensive alias set for each well to capture every naming variant present in file names, folder names and digital file headers.
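As an illustration of how an alias set can link naming variants back to one standardised well name, here is a minimal Python sketch. It is not GeoSCOPE's implementation: the normalisation rules, the function names (`normalise`, `build_alias_index`, `resolve`) and the well names are all hypothetical, chosen only to show the idea.

```python
import re
from typing import Dict, List, Optional


def normalise(name: str) -> str:
    """Reduce a raw well name to a comparison key (assumed rules, for illustration)."""
    # Upper-case and drop spaces, hyphens, underscores and other punctuation.
    key = re.sub(r"[^A-Z0-9]+", "", name.upper())
    # Strip leading zeros from each run of digits so "001" and "1" compare equal.
    return re.sub(r"\d+", lambda m: str(int(m.group())), key)


def build_alias_index(alias_sets: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every normalised alias (and the standard name itself) to the standard name."""
    index: Dict[str, str] = {}
    for standard, aliases in alias_sets.items():
        for variant in [standard, *aliases]:
            key = normalise(variant)
            # Refuse to let one variant point at two different wells.
            if index.get(key, standard) != standard:
                raise ValueError(f"alias {variant!r} is ambiguous")
            index[key] = standard
    return index


def resolve(raw_name: str, index: Dict[str, str]) -> Optional[str]:
    """Return the standard well name for a raw variant, or None if it is unknown."""
    return index.get(normalise(raw_name))


# Hypothetical wells and naming variants, not taken from the MegaSurvey data.
ALIAS_SETS = {
    "ALPHA-1": ["Alpha 1", "ALPHA_001"],
    "BRAVO-2ST1": ["Bravo-2 ST1", "BRAVO_02_ST1"],
}
INDEX = build_alias_index(ALIAS_SETS)
```

With this index, `resolve("alpha-01", INDEX)` returns the standard name `ALPHA-1`, while a name that matches no alias returns `None` and can be queued for manual review. Real datasets typically need richer rules than this, for example for sidetrack suffixes or operator prefixes, which is why the alias sets were derived from analysis of the metadata rather than from a fixed rule.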

HDS then produced composited, interpretation-ready well log data and deployed a neural network to generate lithology columns across the dataset. The network was trained on digitised lithology columns from mud cuttings and applied to wireline and LWD curve data for the entire survey area. The resulting ML-generated lithology columns were delivered as composite plots within the MegaSurvey data package.

The Outcome

PGS received a fully structured, interpretation-ready well data product — including standardised well naming, comprehensive curve composites and AI-generated lithology characterisation — delivered to their full satisfaction and incorporated into the West Africa MegaSurvey commercial package.
