Proposed SeisCap Encapsulation Standard
Sept 11, 2006
Most of the effort to encapsulate data is spent collecting the various elements of data. A line entry is created in EDM with all pertinent metadata. The container structure can easily be built by copying the metadata to the EDM clipboard and running a Perl script to create the required folders on the I: drive. The LineID.LineName.info.txt metadata file is also created. Additional notes can be added to this file.

[Figure: EnCana Seismic Folder Structure, showing examples of our archived files]

Here is what the folders contain:
- Basic - To contain observer notes, chaining notes, drillers and shooters logs, additional notes
- Business - A copy of the final signed AFE, Seismic license agreements, Quality control plots, side labels, partnership deals...
- Design - Initial survey design, Parameter sheet and Layout map
- Shots - Field data, as near to the original as possible. Aram field segy shots, SEGA-D, Vanguard CD
- Gathers - SEGY shots with geometry and refraction statics in header
- Stacks - All previously archived stacks; this gives the processor something to shoot for. When we sell a seismic line, here's the data. We also put a copy of our SeisWare workstation-ready file here.
- Survey - Final audited SEGP survey. For 3D, keep separate files for source and receiver lines. Sometimes all we have is a tiff image of some computer printout. Be sure you have elevations captured somewhere. A lack of elevations makes a seismic line very difficult to process.
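The Perl script that builds the container is not reproduced on this page. As a rough illustration only, the folder layout listed above could be created with a few lines of Python. The function name build_container and the LineID_LineName directory name are assumptions for this sketch (the directory name is modelled on the P9281_CH-AB-58 directory used later on this page).

```python
from pathlib import Path

# Folder names taken from the list above.
SEISCAP_FOLDERS = ["Basic", "Business", "Design", "Shots", "Gathers", "Stacks", "Survey"]


def build_container(root, line_id, line_name):
    """Create <root>/<LineID>_<LineName>/ with the seven standard folders.

    Hypothetical helper: the real work is done by a Perl script whose
    source is not shown here.
    """
    container = Path(root) / f"{line_id}_{line_name}"
    for name in SEISCAP_FOLDERS:
        # exist_ok=True makes the call safe to repeat on an existing container.
        (container / name).mkdir(parents=True, exist_ok=True)
    return container
```

For example, build_container("/tmp", "P9281", "CH-AB-58") would create /tmp/P9281_CH-AB-58 with the seven folders inside it.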
Existing Kelman Quartet (KQ) (or Seitel SUOF) archived data should be copied to the appropriate Shots directory. Please note that Perl scripts have been written to automatically copy the data from /data/field and /data/stack.
Retrieve the basic data from EDMS
Fire up SD Loader, specify your Line ID and get your line list. Here we have our line CVB-00017. Right click in the middle of the screen (in the blue area) and choose Copy All to Clipboard. Open Microsoft Excel and paste into your spreadsheet.
File, Save As Book1.csv, specifying CSV (comma-separated values) format and using the default name.
Convert the file to unix, build the folder structure
Convert the file to a unix file using the unix utility dos2unix, changing the name to edm.csv:

    dos2unix -7 Book1.csv > edm.csv

Here is what the edm.csv file looks like on my machine using the unix command more:

    more edm.csv
    Option,Program #,Line ID,Line Name,First Stn,Last Stn,Length,Sheet,Area Name,Shot For,Shot By,Completion,Position
    None,P26404,P9281,CH-AB-58,80,242,32.33,,CH.AB.7609.ABP.01.13,,,01-Sep-76,1

We will use the above data to build the info.txt file, which is a summary of the metadata in EDM (the important stuff).

Build the LineID.LineName.info.txt file and create folders
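To make the mapping concrete, here is a minimal Python sketch of how one edm.csv row could be turned into info.txt text. It is not the source of edm2info2.pl. The function name edm_rows_to_info and the exact layout of the output are assumptions, and the Position column is skipped only because the sample info.txt shown on this page does not list it.

```python
import csv
import io


def edm_rows_to_info(csv_text):
    """Return {info file name: info text} for each data row of an EDM export.

    Hypothetical helper, modelled on the edm.csv and info.txt samples
    on this page rather than on the actual Perl script.
    """
    out = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        line_id, line_name = row["Line ID"], row["Line Name"]
        lines = [f"Working on Line ID = {line_id} Line Name = {line_name}"]
        for key, value in row.items():
            if key == "Position":
                continue  # assumption: not part of the summary
            lines.append(f"{key} = {value}")
        out[f"{line_id}.{line_name}.info.txt"] = "\n".join(lines) + "\n"
    return out
```

Feeding it the two lines of edm.csv shown above yields a single entry named P9281.CH-AB-58.info.txt, with one "Field = value" line per column.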
Build the info.txt file and create all the necessary directories, then edit this file with additional details.

    ~ekeyser/Perl/edm2info2.pl

This command returns with:

    File out = P9281.CH-AB-58.info.txt
    ######################## new line ######################
    Today's date = 06/22/2006
    -fini

Check the Folders
Check that the folders (directories) were created and examine the info.txt file. Note you can add additional comments to this file!

    cd P9281_CH-AB-58
    % ls -al
    total 88
    drwxrwxrwx 2 achayipo expl 4096 Jun 19 15:17 Basic
    drwxrwxr-x 2 ekeyser expl 4096 Jun 22 12:20 Business
    drwxrwxr-x 2 ekeyser expl 4096 Jun 22 12:20 Design
    drwxrwxr-x 2 ekeyser expl 4096 Jun 22 12:20 Gathers
    -rw-rw-r-- 1 ekeyser expl 469 Jun 22 12:20 P9281.CH-AB-58.info.txt
    drwxrwxr-x 2 ekeyser expl 4096 Jun 22 12:20 Shots
    drwxrwxr-x 2 ekeyser expl 4096 Jun 22 12:20 Stacks
    drwxrwxrwx 2 achayipo expl 4096 Jun 19 15:29 Survey

    more P9281.CH-AB-58.info.txt
    Working on Line ID = P9281 Line Name = CH-AB-58
    Today's date = 06/22/2006
    Information exported from EDM Meta Database
    Option = None
    Program # = P26404
    Line ID = P9281
    Line Name = CH-AB-58
    First Stn = 80
    Last Stn = 242
    Length = 32.33
    Sheet =
    Area Name = CH.AB.7609.ABP.01.13
    Shot For =
    Shot By =
    Completion = 01-Sep-76

Note that we have some strange permissions. I don't know why, but here is how to clean up all the files and folders. The first command below finds all the folders and makes their permissions wide open, so anyone can put files here. The second finds all the files and sets them to read-write for the owner and the group and read-only for everyone else.

    find . -type d -exec chmod 777 {} \;
    find . -type f -exec chmod 664 {} \;

Folder Organization
The folders have been created and the permissions have been opened up so data can be placed. Copy the observer notes, drillers' logs, shooters' logs, etc. to the Basic folder. Request that the ORIGINAL data be converted to PDF or multi-page .tif. You will probably find a lot of microfilm out there. Spend the additional effort and locate the paper copies where they exist. The HP 9050 printer does an excellent job of scanning, in colour or in black and white. Have your IT department configure the printer to have a single button to scan; it's easier!

Here are EnCana's standards for naming Basic data. Data are only useful if properly labelled! Be sure to use the LineID and LineName on all documents!
Additional Procedures for Data Management for New Shooting
Examine the top of this document to see what this folder structure should look like:
- After the folders have been created, usually on the I: drive, copy the Folder Organization instructions README.html to that location.
- Zip the folders and your notes together and email them to the acquisition contractor. Right click on the folder, run WinZip, choose Add to Zip file and exit. You can now email this to your user! Tell your user to double click on the file, choose Extract and double click on the README.html file for further instructions.
Data Management for Historical Data
- File names should NOT have any spaces or special characters. Please use underscores if required.
- Scan the original basic data: paper first, then microfilm or microfiche. All scanning is to be in grey scale or in full colour! Files are larger but the quality is vastly superior.
- All original field notes including monitor records. Monitors are especially useful because they frequently contain field file numbers on one of the traces. Old monitors should be kept. Monitors for new data should be used for quality control in the field and then destroyed.
- Scan only two sections per line, the Largest scale and best Quality, Oldest and Newest Versions with a TITLE BLOCK.
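The naming rule in the first bullet above (no spaces or special characters, underscores where required) is easy to apply mechanically. Below is a minimal Python sketch. The function name safe_name is hypothetical, and the set of characters treated as safe (letters, digits, dot, hyphen and underscore) is an assumption, since the page does not define "special characters".

```python
import re


def safe_name(name):
    """Replace each run of disallowed characters with a single underscore.

    Assumption: letters, digits, '.', '-' and '_' are the only
    characters allowed in an archive file name.
    """
    return re.sub(r"[^A-Za-z0-9._-]+", "_", name.strip())
```

With this rule, a scanned file called "P9281.CH-AB-58.Tape Index.tif" would be renamed to "P9281.CH-AB-58.Tape_Index.tif".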
Go Get the Survey
Retrieve all the basic data and survey and place them in the appropriate folders. To retrieve the survey, right mouse click on the option and change None to Export. Press Process, put your file in the appropriate directory and export all the survey (Original & Calculated).

    mv all_survey.seg P9281.CH-AB-58.NAD27_survey.seg

Retrieve the Field Shot and the Stack data
Retrieve all the Field Shot data and the Stack data. The stack data is useful for the processor. Make a list of lines you wish to retrieve. Here is a slick way to generate a list of all lines below your current level in the directory structure: make a list of all files and then pull out the reference number. Edit the file 2d.list to keep just the lines you want.

    find . -name "*.*" > junk
    awk -f ~ekeyser/Ak/firstlast.ak < junk | sort -u > 2d.list

Here is another slick unix feature to create a file called 2d.list containing only the line_id P9281:

    echo P9281 > 2d.list
Use the unix command grep to retrieve all of the online field data. The awk one-liner writes one grep command per line ID into a script called run:

    awk '{printf("grep \"/%s\.\" /data/seismic_index/field.list >> new_field.all \n", $1)}' 2d.list > run
    chmod +x run
    ./run

Run the run script and examine the file called new_field.all:

    more new_field.all
    /data/field17/1/P9281/P9281.CH-AB-58.DY10400.field /CH-AB-58
    /data/field17/1/P9281/P9281.CH-AB-58.DY10400.field.index /CH-AB-58
    /data/field17/1/P9281/P9281.CH-AB-58.DY10400.field.log /CH-AB-58
    /data/field17/1/P9281/P9281.CH-AB-58.DY10400.field.sum /CH-AB-58
    /data/field25/1/P9281/P9281.CH-AB-58.PTR-1.field /CH-AB-58
    /data/field25/1/P9281/P9281.CH-AB-58.PTR-1.field.index /CH-AB-58
    /data/field25/1/P9281/P9281.CH-AB-58.PTR-1.field.log /CH-AB-58
    /data/field25/1/P9281/P9281.CH-AB-58.PTR-1.field.sum /CH-AB-58
    /data/field25/1/P9281/P9281.CH-AB-58.PTR-2.field /CH-AB-58
    /data/field25/1/P9281/P9281.CH-AB-58.PTR-2.field.index /CH-AB-58
    /data/field25/1/P9281/P9281.CH-AB-58.PTR-2.field.log /CH-AB-58
    /data/field25/1/P9281/P9281.CH-AB-58.PTR-2.field.sum /CH-AB-58
    /data/field25/1/P9281/P9281.CH-AB-58.PTR-3.field /CH-AB-58
    ...

Ah, we found the data, now let's pull it!
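The lookup above is a plain substring match: an index entry is kept when it contains "/LineID." for one of the wanted line IDs. For readers more at home in Python than awk and grep, the same filter might be sketched as follows. The function name find_line_files is hypothetical, and the index is passed in as a list of strings rather than read from /data/seismic_index/field.list.

```python
def find_line_files(index_lines, line_ids):
    """Return the index entries that contain '/<line_id>.' for a wanted line ID.

    Mirrors grep "/P9281\\." from the text: the leading slash and the
    trailing dot stop P9281 from also matching a longer ID such as P92810.
    """
    needles = [f"/{line_id}." for line_id in line_ids]
    return [entry for entry in index_lines if any(n in entry for n in needles)]
```

Because the match requires the slash before and the dot after the ID, the directory component /P9281/ on its own does not count as a hit; only the file name does.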
I have written a script called copyfield.pl, and it will even place the data in the right location (folder, errrr directory):

    ~ekeyser/Perl/copyfield.pl

Let's check it and see if it all makes sense.

    cd Shots
    ls
    P9281.CH-AB-58.DY10400.field        P9281.CH-AB-58.PTR-2.field        P9281.CH-AB-58.PTR-4.field
    P9281.CH-AB-58.DY10400.field.index  P9281.CH-AB-58.PTR-2.field.index  P9281.CH-AB-58.PTR-4.field.index
    P9281.CH-AB-58.DY10400.field.log    P9281.CH-AB-58.PTR-2.field.log    P9281.CH-AB-58.PTR-4.field.log
    P9281.CH-AB-58.DY10400.field.sum    P9281.CH-AB-58.PTR-2.field.sum    P9281.CH-AB-58.PTR-4.field.sum
    P9281.CH-AB-58.PTR-1.field          P9281.CH-AB-58.PTR-3.field        P9281.CH-AB-58.PTR-5.field
    P9281.CH-AB-58.PTR-1.field.index    P9281.CH-AB-58.PTR-3.field.index  P9281.CH-AB-58.PTR-5.field.index
    P9281.CH-AB-58.PTR-1.field.log      P9281.CH-AB-58.PTR-3.field.log    P9281.CH-AB-58.PTR-5.field.log
    P9281.CH-AB-58.PTR-1.field.sum      P9281.CH-AB-58.PTR-3.field.sum    P9281.CH-AB-58.PTR-5.field.sum

Ahhh, it sure looks like the Kelman Quartet data. The .field and the .index files are all you really need. The .log file is the human-readable report from when it was generated. The .sum file is used by Kelman's database; we really do not need to keep storing this file.

Now let's go get the Stack data for this same line
    awk '{printf("grep \"/%s\.\" /data/seismic_index/stack.list >> new_stack.all \n", $1)}' 2d.list > run

Here is the file the above awk created:

    more run
    grep "/P9281\." /data/seismic_index/stack.list >> new_stack.all

Execute it with the following commands (you have to turn the execute permission on):

    chmod +x run
    ./run

Check the list:

    more new_stack.all
    /data/stacks07/1/9281/P9281.CH-AB-58.03010104.f08.MIG.0.sgy /CH-AB-58
    /data/stacks07/1/9281/P9281.CH-AB-58.f-mig.051995.ver.segy /CH-AB-58
    /data/stacks07/1/9281/P9281.CH-AB-58.f-str.051995.ver.segy /CH-AB-58
    /data/stacks07/1/9281/P9281.CH-AB-58.f-str.051995.ver.sgy /CH-AB-58
    /data/stacks07/1/9281/P9281.CH-AB-58.f-str.091987.ess.segy /CH-AB-58
    /data/stacks07/1/9281/P9281.CH-AB-58.fstr.198709.sgy /CH-AB-58
    /data/stacks07/1/9281/P9281.CH-AB-58.u-str.091987.ess.segy /CH-AB-58
    /data/stacks07/1/9281/P9281.CH-AB-58.ustr.198709.sgy /CH-AB-58
    /data/stacks11/1/9281/P9281.CH-AB-58.03010104.f08.MIG.0.sgy /CH-AB-58
    /data/stacks11/1/9281/P9281.CH-AB-58.fstr.198709.sgy /CH-AB-58
    /data/stacks11/1/9281/P9281.CH-AB-58.ustr.198709.sgy /CH-AB-58
    /data/stacks12/1/P9281/P9281.CH-AB-58.03010104.f08.MIG.0.sgy /CH-AB-58
    /data/stacks12/1/P9281/P9281.CH-AB-58.f-str.051995.ver.FSTR.0.sgy /CH-AB-58
    /data/stacks12/1/P9281/P9281.CH-AB-58.f-str.051995.ver.STR.0.sgy /CH-AB-58
    /data/stacks12/1/P9281/P9281.CH-AB-58.f-str.091987.ess.STR.0.sgy /CH-AB-58
    /data/stacks14/1/9281/P9281.CH-AB-58.03010104.f08.sgy /CH-AB-58
    ...

Now run the script copystack.pl, which will get the data and place it in the correct directory:

    ~ekeyser/Perl/copystack.pl

Build an Inventory to Check your Results
Here is yet another script that will take a listing of all files and fill out a .csv file flagging the different types of data. This assumes that you have consistently named items with flags like .obs or .sur or .chain etc. First make a list of all files using the unix find command, then run the seiscap_data.pl script:

    find . -name "*.*" | sort > junk
    ~ekeyser/Perl/seiscap_data.pl > inventory_report.csv

Use your favorite spreadsheet program to view the above document.

SeisCap the data (create the archive file)
How to build a SeisCap archive. Collect all the files we wish to archive using the unix find command:
- build a sorted list of files, with path names, edit the file archive.txt and order the files to your preferred format.
- Delete all files you do not want in your archive
- Move your info.txt file up to the top of the file
- Run the seiscap_build.pl to create the csv file, edit the comments. Check the file associations, .txt for files to be opened with a text editor, .tif, .jpg or .png for graphics program. Note: use .bin for a file without a file association.
- From the PC side, run the seiscap.bat file to create the three archive files: the data file, an index and the log file. The log file also exists in the data file and can be erased. This bat file runs the create_seiscap.exe program to create the encapsulated data.
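The file-association step in the list above can be pictured as a lookup on the file extension with .bin as the fallback. The following Python sketch is only an illustration of that rule, not the seiscap_build.pl source. The names KNOWN_ASSOCIATIONS, association and archive_row are hypothetical, and the "path, ASSOCIATION, comment" row layout is an assumption.

```python
from pathlib import PurePath

# Extensions named in the step above; everything else is treated as BIN.
KNOWN_ASSOCIATIONS = {".txt": "TXT", ".tif": "TIF", ".jpg": "JPG", ".png": "PNG"}


def association(path):
    """Return the association for a path, falling back to BIN."""
    ext = PurePath(path).suffix.lower()
    return KNOWN_ASSOCIATIONS.get(ext, "BIN")


def archive_row(path, comment=""):
    """Format one row of the archive csv (assumed layout)."""
    return f"{path}, {association(path)}, {comment}"
```

Under these assumptions a .field or .field.index file ends up as BIN, while the info.txt file is opened with a text editor.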
    find . -name "*.*" | sort -u > archive.txt

Edit the file and place the info.txt file as the first file:

    more archive.txt
    ./P9281.CH-AB-58.info.txt
    ./Basic/P9281.CH-AB-58. Drillers.tif
    ./Basic/P9281.CH-AB-58.Observers.tif
    ./Basic/P9281.CH-AB-58.Tape Index.tif
    ./Field/P9281.CH-AB-58.DY10400.field
    ./Field/P9281.CH-AB-58.DY10400.field.index
    ./Field/P9281.CH-AB-58.DY10400.field.log
    ./Field/P9281.CH-AB-58.DY10400.field.sum
    ./Field/P9281.CH-AB-58.PTR-1.field
    ./Field/P9281.CH-AB-58.PTR-1.field.index
    ./Field/P9281.CH-AB-58.PTR-1.field.log
    ./Field/P9281.CH-AB-58.PTR-1.field.sum
    ./Field/P9281.CH-AB-58.PTR-2.field
    ./Field/P9281.CH-AB-58.PTR-2.field.index
    ./Field/P9281.CH-AB-58.PTR-2.field.log
    ./Field/P9281.CH-AB-58.PTR-2.field.sum
    ./Field/P9281.CH-AB-58.PTR-3.field
    ./Field/P9281.CH-AB-58.PTR-3.field.index
    ./Field/P9281.CH-AB-58.PTR-3.field.log
    ...

You can see that we have directories for the Basic and the Field data. This helps us keep organized. The next step is to build the file associations and the comments field using the following seiscap_build.pl Perl script:
    ~ekeyser/Perl/seiscap_build.pl

This script also builds the seiscap.bat PC file that does the encapsulation. Edit the file and add your own comments:

    vi *archive.csv

Go back to the PC side and execute the seiscap.bat file. To the right is what the execution looks like; note that this version is dated May 2, 2006 and contains the CRC 16-bit checksum. You may have to copy the program create_seiscap.exe (found under Programs on the I: drive) into place and then execute the batch file seiscap.bat; I just have to double click the bat file. EnCana is currently working on compiling a unix version. This will make it easy to automate multiple jobs.

    more seiscap.bat
    d:create_seiscap.exe P9281.CH-AB-58.archive.csv P9281.CH-AB-58.seiscap.field

Note: Here is the command to build an encapsulated data set for stack instead of field data. All you have to do is use the -f2 option; "field" will be replaced with "stack".

    ~ekeyser/Perl/seiscap_build.pl -f2
    A44266-info.txt.archive.csv , TXT, Archive table of contents
    A44266-info.txt , TXT, Text File Description
    A44266.MKD-4(W).info.txt , TXT, Text File Description
    Business\A44266.MKD-4-WEST.f-mig-prena.200605.cnc.png, PNG,
    Business\A44266.MKD-4-WEST.fstr.png , PNG,
    Business\A44266.MKD-4-WEST.label.200605.cnc.png , PNG, Processing Side Label
    Business\A44266.MKD-4-WEST.map.200605.cnc.png , PNG, Map
    Business\A44266.MKD-4-WEST.map.png , PNG, Map
    Business\A44266.MKD-4-WEST.Processing-QC.200605.cnc.doc, DOC, Processing Quality Control
    Business\A44266.MKD-4-WEST.Refraction-Front-End-QC.200605.cnc.doc, DOC,
    Business\A44266.MKD-4-WEST.semblance.200605.cnc.tif, TIF, Velocity Semblance Plots - multi page TIFF
    Business\Seismic_License_Invoice.pdf , PDF,
    Stacks\A44266.MKD-4-WEST.f-mig-prena.200605.cnc.sgy, BIN, SEG-Y file
    Stacks\A44266.MKD-4-WEST.f-mig-prena.200605.cnc.txt, TXT,
    Stacks\A44266.MKD-4-WEST.u-mig-prena.200605.cnc.sgy, BIN, SEG-Y file
    Stacks\A44266.MKD-4-WEST.u-mig-prena.200605.cnc.txt, TXT,
    Stacks\A44266.MKD-4-WEST.u-str-prena.200605.cnc.sgy, BIN, SEG-Y file
    Stacks\A44266.MKD-4-WEST.u-str-prena.200605.cnc.txt, TXT,
    Stacks\A44266.MKD-4-WEST.u-str.200605.cnc.sgy , BIN, SEG-Y file
    Stacks\A44266.MKD-4-WEST.u-str.200605.cnc.txt , TXT,
    d:create_seiscap.exe A44266-info.txt.archive.csv A44266-info.txt.200607.seiscap.stack
    -fini

In the above example for stack data sets, we are storing processing reports and .png images in a folder called Business. This is the location that our data sales will access. The geophysicist is normally going after the data in the Stacks directory.

Last step is to check your archive. Run the program seiscap_extractor.exe (it's a Windows-based program) and see if you can open up your archive. Check the log file; does it make sense? Check the listing of files, the LineID.LineName.archive.csv (e.g. A3695.84848-84.archive.csv). Are all the files here? If it doesn't look right, here are mistakes I commonly make:
- Seiscap.bat will fail if you have the .csv file open in another program, such as Notepad or Excel. This is a Microsoft feature!
- Files in the LineID.LineName.archive.csv file aren't there any more. Regenerate the list.
- The file associations are not correct. If in doubt, make the file association .BIN. This will treat the file as one big file with one checksum. Do this for files like .zip or other encapsulations.
- Do not place commas in the comments field; data after the comma will be ignored.
- Make sure there isn't a log file pointer at the end of LineID.LineName.archive.csv. It's automatically put there after each run.
- Look at the log file, this will usually give you some clues as to what has gone wrong.
- Run the create_seiscap.exe program from a dos window to be able to look at the messages. Every record will have a different line of information
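Two of the mistakes listed above (files that are no longer there, and commas in the comments field) can be caught before running the batch file. Here is a minimal Python sketch of such a pre-flight check. The function name check_archive_rows is hypothetical, the "path, association, comment" row layout is an assumption based on the listing shown on this page, and paths are assumed to be usable as written on the machine running the check.

```python
import os


def check_archive_rows(rows, base_dir="."):
    """Return a list of problem descriptions for rows of an archive csv.

    Each row is assumed to look like 'path, ASSOCIATION, comment'.
    """
    problems = []
    for number, row in enumerate(rows, start=1):
        parts = row.split(",")
        if len(parts) < 3:
            problems.append(f"row {number}: expected path, association, comment")
            continue
        if len(parts) > 3:
            # Anything beyond the third field means the comment had a comma.
            problems.append(f"row {number}: comma in comments field")
        path = parts[0].strip()
        if not os.path.exists(os.path.join(base_dir, path)):
            problems.append(f"row {number}: file not found: {path}")
    return problems
```

An empty list means neither of the two problems was found; it does not replace reading the log file.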
Collect all the SeisCap files together
Here is how I use some simple unix commands to collect all the SeisCap files together, ready to send to the archive or processor. Here I do a find command with today's date, which is automatically put into the archive name. This one-line awk will generate the unix move commands to collect all the different SeisCap archive files into a single location.

    find . -name "*200607*" > junk
    awk '{printf("mv %s .\n",$1)}' junk > run
    chmod +x run
    ./run

I examine the log files for completeness.

    more *.log

When I am done, I erase the log files (another copy of them is in the data file and can be extracted).

    rm *.log

Move the files to an organized Staging Area
Here is a script I like to use. It starts from a list of files, produced by a find command and directed into the file labelled junk. The script stage_dir.pl takes as a parameter the directory of the data you wish to move into the new structure. This program will generate the following directory levels: the ten thousand, the one thousand and finally the Line_id.

First step is to use the unix find command to build a list of SeisCap files to be moved. I'd sort them first to make them easier to check, making sure a data file and an index file exist.
    pwd
    /data/seiscap1/reprocessing/eric/processing_current/06June16
    find . -name "*seiscap.field*" | sort > junk
    bash-2.03$ more junk
    ./A3696.84849-84.200607.seiscap.field
    ./A3696.84849-84.200607.seiscap.field.idx
    ./A3699.84892-84.200607.seiscap.field
    ./A3699.84892-84.200607.seiscap.field.idx
    ...

Change directory to our staging area and move the files with the stage_dir.pl script. You change directory to the storage location and run the script, pointing it to the directory with the file called junk that contains the list of files to be moved.

    cd /data/seiscap1/seiscap
    ls
    000000 040000 120000
    ~ekeyser/Perl/stage_dir.pl -f /data/seiscap1/reprocessing/eric/processing_current/06June16
    mv /data/seiscap1/reprocessing/eric/processing_current/06June16/A3696.84849-84.200607.seiscap.field A3696.84849-84.200607.seiscap.field
    mv /data/seiscap1/reprocessing/eric/processing_current/06June16/A3696.84849-84.200607.seiscap.field.idx A3696.84849-84.200607.seiscap.field.idx
    ...

Last step is to check and rebuild our index:

    find . -name "*.*" > seiscap.list
    more seiscap.list
    ./000000/003000/A3696/A3696.84849-84.200607.seiscap.field
    ./000000/003000/A3696/A3696.84849-84.200607.seiscap.field.idx
    ./000000/003000/A3699/A3699.84892-84.200607.seiscap.field
    ./000000/003000/A3699/A3699.84892-84.200607.seiscap.field.idx
    ./000000/003000/A3786/A3786.84890-83.200607.seiscap.field
    ./000000/003000/A3786/A3786.84890-83.200607.seiscap.field.idx
    ./000000/003000/A3787/A3787.84853-83.200607.seiscap.field
    ./000000/003000/A3787/A3787.84853-83.200607.seiscap.field.idx
    ./000000/003000/A3817/A3817.84948-84.200607.seiscap.field
    ./000000/003000/A3817/A3817.84948-84.200607.seiscap.field.idx
    ...

They moved and are in their organized resting spot.
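The three directory levels can be worked out from the numeric part of the Line ID. The Python sketch below reproduces the paths shown in the listing above. It is not the stage_dir.pl source: the function name stage_dir is hypothetical, and taking the first run of digits in the Line ID as the line number is an assumption.

```python
import re


def stage_dir(line_id):
    """Return '<ten thousand>/<one thousand>/<line_id>' for a Line ID.

    Assumption: the first run of digits in the Line ID is the number
    used to pick the ten-thousand and one-thousand directories.
    """
    match = re.search(r"\d+", line_id)
    if match is None:
        raise ValueError(f"no numeric part in line ID: {line_id!r}")
    number = int(match.group())
    ten_thousands = number // 10000 * 10000  # e.g. 3696 -> 0
    thousands = number // 1000 * 1000        # e.g. 3696 -> 3000
    return f"{ten_thousands:06d}/{thousands:06d}/{line_id}"
```

For the first line in the listing, stage_dir("A3696") returns "000000/003000/A3696".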
Site Owner: Eric Keyser
Last Updated: Sept 11,2006