DIGITAL DATA CLEAN-UP - US OIL MAJOR

 

OVERVIEW

In 2010, a client, a US-based oil major, obtained a USB disk from an asset in the FSU (Former Soviet Union) containing nearly 70,000 digital files, of which over 22,000 had Cyrillic characters in their names. The data sat in long, complex folder paths, and nearly 1,000 of those folders had Cyrillic content in the path. Some 1,300 distinct key Cyrillic words were detected in the file paths.
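
Counts of this kind can be gathered with a simple scan of the path listing. The following is a minimal sketch only, not the tooling used on the project: it assumes paths are available as forward-slash strings, treats any character in the basic Cyrillic Unicode block as Cyrillic, and uses the hypothetical helper names has_cyrillic and summarise_paths.

```python
import re
from pathlib import PurePosixPath

# Assumption: "Cyrillic" means the basic Cyrillic Unicode block (U+0400 to U+04FF).
CYRILLIC_WORD = re.compile(r"[\u0400-\u04FF]+")


def has_cyrillic(text: str) -> bool:
    """Return True if the text contains at least one Cyrillic character."""
    return CYRILLIC_WORD.search(text) is not None


def summarise_paths(paths):
    """Count Cyrillic file names, Cyrillic-named folders and distinct Cyrillic words."""
    files_with_cyrillic_names = 0
    cyrillic_folders = set()
    distinct_words = set()
    for raw in paths:
        path = PurePosixPath(raw)
        # The file name itself (last path component).
        if has_cyrillic(path.name):
            files_with_cyrillic_names += 1
        # Every ancestor folder whose own name contains Cyrillic characters.
        for folder in path.parents:
            if has_cyrillic(folder.name):
                cyrillic_folders.add(str(folder))
        # Runs of Cyrillic letters anywhere in the path, case-folded.
        distinct_words.update(w.lower() for w in CYRILLIC_WORD.findall(raw))
    return {
        "files_with_cyrillic_names": files_with_cyrillic_names,
        "cyrillic_folders": len(cyrillic_folders),
        "distinct_cyrillic_words": len(distinct_words),
    }
```

Here a "word" is simply an unbroken run of Cyrillic letters, so mixed names such as a Cyrillic field name followed by a well number contribute only their Cyrillic part.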

Within this data set, there were 18,000 LAS files, of which about 15,000 were unique (not duplicates).
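
The source does not say how duplicates were identified. One common approach, shown here as a hedged sketch, is to group files by a hash of their contents so that byte-identical copies fall into the same group regardless of file name or folder. The function names file_digest and group_las_by_content are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_las_by_content(root):
    """Map each content digest to the list of LAS files under root that share it."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        # Assumption: LAS files are recognised by extension, case-insensitively.
        if path.is_file() and path.suffix.lower() == ".las":
            groups[file_digest(path)].append(path)
    return dict(groups)
```

The number of keys in the returned mapping is the number of unique files, and any group with more than one path is a set of duplicates. Files that differ only in whitespace or header formatting would not be caught by this byte-level comparison.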

The data was delivered to HDS on a USB hard disk for data mining and content analytics.

APPROACH AND DELIVERABLES

  • Extraction of all files in compressed archives (zip, 7z, rar, etc.).

  • Translation of all Cyrillic characters in all folder and file names into English.

  • Standardisation of all English field, well and reservoir names in folder and file names.

  • Metadata extraction of ALL LAS files and key documents.

  • Translation of all Cyrillic characters, within LAS files, into English.

  • Standardisation of English field and well names.

  • Standardisation of Russian log mnemonics using the client-provided LPS/RUG system.
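
The character-level part of the name and mnemonic steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's method: the transliteration table is one simple Russian-to-Latin scheme chosen for the example, it converts characters rather than translating words, and EXAMPLE_MNEMONIC_MAP is a hypothetical two-entry placeholder that does not reproduce the contents of the client's LPS/RUG system.

```python
# Assumption: a simple Russian-to-Latin letter table; other schemes differ in detail.
TRANSLIT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e", "ё": "yo",
    "ж": "zh", "з": "z", "и": "i", "й": "y", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
    "ф": "f", "х": "kh", "ц": "ts", "ч": "ch", "ш": "sh", "щ": "shch",
    "ъ": "", "ы": "y", "ь": "", "э": "e", "ю": "yu", "я": "ya",
}

# Hypothetical placeholder mapping, for illustration only.
EXAMPLE_MNEMONIC_MAP = {"ГК": "GR", "ПС": "SP"}


def transliterate(text: str) -> str:
    """Replace Russian Cyrillic letters with Latin equivalents; leave other characters alone."""
    out = []
    for ch in text:
        lower = ch.lower()
        if lower in TRANSLIT:
            latin = TRANSLIT[lower]
            # Preserve capitalisation of the original letter.
            out.append(latin.capitalize() if ch != lower else latin)
        else:
            out.append(ch)
    return "".join(out)


def standardise_mnemonic(mnemonic: str, mapping=EXAMPLE_MNEMONIC_MAP) -> str:
    """Look up a log mnemonic in the mapping; fall back to upper-cased transliteration."""
    key = mnemonic.strip().upper()
    if key in mapping:
        return mapping[key]
    return transliterate(key).upper()
```

For example, transliterate("Скважина_1.las") returns "Skvazhina_1.las". In practice a lookup of known field, well and reservoir names would be applied before falling back to letter-by-letter conversion, since transliteration alone does not produce the standard English name.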

VALUE ADDED

  • Data discovery was 100% comprehensive

  • Data index and catalogue of all digital & log curve content done autonomously in a few days

  • Curve and file names translated to western Latin formats, making them easier to use

  • 100% data visibility and usability