A G&G consultancy was engaged to perform a field reserves audit for 45 fields in 3 offshore blocks in SW Africa. The data was delivered to the consultancy on a USB data disk holding over 2 TB of data in 300,000 unstructured files, with no catalogue or index. The data contained many MS Office files (XLS, PPT, DOC), over 20,000 PDF reports with over 1,000,000 pages of text, and many other file formats including subsurface application databases.

The unstructured nature of the data and the absence of a catalogue made it impossible for the team performing the audit to work efficiently or be thorough within the proposed project schedule.


  • HDS GeoSCOPE was used to index and data mine the content of the USB disks provided

  • Identified all the duplicate digital files on the delivered disk

  • Performed OCR on all scanned reports to allow text analytics to be performed on the content

  • All text content was extracted from scanned pages

  • Identified the data according to “type”

  • Text analytics was applied to the written reports

  • Metadata headers were extracted from SEGY, DLIS, LAS, UKOOA, P190 formats

  • Once all the metadata was gathered, HDS GeoSCOPE was used to spatially link the data files to Blocks, Fields and Wells, and to classify them against the client taxonomy

  • Over 20,000 PDF reports and studies had no associated metadata, and their file and folder names were purely numeric.

    HDS identified the content of these PDF files by “data mining” using intelligent business rules run as queries on the OCR’d content of the files.

  • By this process we were able to identify and classify the key reports and documents pertaining to the field study, without the need to open each one and examine its content manually.
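
As a rough illustration of two of the steps listed above (finding duplicate files and classifying numerically named reports from their OCR'd text), the following is a minimal Python sketch. It is not HDS GeoSCOPE code and does not represent its API or rule set: the function names, the rule patterns and the document-type labels are all hypothetical, and content hashing is assumed here as one common way to detect exact duplicates.

```python
import hashlib
import re
from collections import defaultdict


def find_duplicates(files):
    """Group file names by the SHA-256 hash of their content.

    `files` maps a file name to its raw bytes (an assumption made to keep
    the sketch self-contained; on disk the bytes would be read in chunks).
    Returns one sorted list of names per group of identical files.
    """
    groups = defaultdict(list)
    for name, content in files.items():
        groups[hashlib.sha256(content).hexdigest()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical "business rules": a label plus a pattern searched for in the
# OCR'd text. Real rules would be tuned to the client taxonomy.
RULES = [
    ("Well Completion Report", re.compile(r"\bwell\s+completion\s+report\b", re.I)),
    ("Reserves Report", re.compile(r"\breserves\s+(estimate|audit|report)\b", re.I)),
    ("Core Analysis", re.compile(r"\bcore\s+analysis\b", re.I)),
]


def classify(ocr_text):
    """Return the label of the first rule that matches, else 'Unclassified'."""
    for label, pattern in RULES:
        if pattern.search(ocr_text):
            return label
    return "Unclassified"


if __name__ == "__main__":
    sample = {"0001.pdf": b"abc", "0002.pdf": b"abc", "0003.pdf": b"xyz"}
    print(find_duplicates(sample))
    print(classify("FINAL WELL COMPLETION REPORT\nWell X-1"))
```

Because the rules run on the extracted text rather than on file or folder names, a report stored under a purely numeric name can still be assigned a document type.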


  • The Audit Team was able to perform its work within the allotted schedule and utilise 10 times more data than would have been possible using traditional manual methods.

  • The work ran over a period of 6 months, of which only 3 months were active.