SeisCap Duo Encapsulation Standard
November 30, 2007
November 30, 2007 - There has been some confusion about the SeisCap Duo being a new format. Kelman's KQ, Seitel's SUOF and the CSEG SeisCap Duo are all the same format with trivial differences. The basic extractor will work for all three flavours of the index file. Processors need only the byte offset and length of each record to extract shot records from the .field file.
Kelman provides the following documentation for anyone wishing to extract KQ data. All three formats use 36 bytes of the index file to describe each record.
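As a minimal sketch of that idea (written in Python for illustration, not taken from the Kelman or C&C code), pulling one record out of the data file is just a seek and a read once the index has supplied the byte offset and length. The function name `read_record` and the demo file are hypothetical.

```python
import os
import tempfile


def read_record(data_path, start, length):
    """Return `length` bytes of the data file beginning at byte offset `start`."""
    with open(data_path, "rb") as f:
        f.seek(start)
        data = f.read(length)
    if len(data) != length:
        raise IOError("short read: wanted %d bytes, got %d" % (length, len(data)))
    return data


# Demo on a made-up data file holding three records of 4, 6 and 2 bytes.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.field")
    with open(demo_path, "wb") as f:
        f.write(b"AAAA" + b"BBBBBB" + b"CC")
    second = read_record(demo_path, 4, 6)

print(second)
```

Nothing else about the data file needs to be understood to get a record back; the offsets and lengths carry all the structure.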
May 24, 2007 - We have removed the create_seiscap.exe from this web site. Please note that it took C&C only a week to write the code to perform the encapsulation.
December 5, 2006 - Here is my PowerPoint Presentation for the Mackenzie Delta SeisCap Project Status. To date we have completed 100 of our 500 lines. The most expensive step is scanning microfilm. We think we can reduce this from our current cost of $0.50 down to 5 to 10 cents per image.
November 20, 2006 - Here is a copy of an article that should be published in the December issue of the CSEG Recorder starting at page 34. It explains basic archive principles and how to convert tape shot records into record oriented encapsulated files suitable for storage on any media. Best of all, it's free!
October 30, 2006 - Found and fixed a bug in the de-encapsulator program. Under some conditions, the program would fail to open up a file. Here is the new version for seiscap_extractor.exe.
October 29, 2006 - Does anyone out there read this stuff? Do you think it has value? I'm looking for a donation of $30k for both the SeisCap creator and the SeisCap extractor. Here's what we will deliver:
- Provide additional details with any error messages
- Extract Kelman's KQ or Seitel's SUOF data.
- Verify KQ and SUOF format. Has it changed over time? EnCana had around 2000 encapsulated files go south and have to be re-archived due to a 1024 byte problem when our backup tape drives were changed. You must routinely verify that your data has not changed. Create a batch version of this program.
- Port both programs to both Unix and Linux.
- Add additional file types, such as .zip
September 5, 2006 - Building a case to Encapsulate and Verify 355 2D seismic lines. Submitting a proposal at $200 per 2D seismic line to create the following quality control products:
- Field shots demultiplexed with geometry
- Line statistics collected: fold, number of shots/traces. You can now properly estimate reprocessing charges
- Deconvolution, pick three velocities and apply reflection statics
- Produce a stack, 100% section and selected shot displays
The data will be verified as complete and ready to process. You no longer have to wait days or weeks to have all your data collected. Here is my PowerPoint Presentation. Who is ready to have all your seismic data, segy files, field tapes and ob's organized on your hard drive?
June 29, 2006 - Built a web document to show how to Encapsulate data at EnCana. This link will provide details about the naming conventions and how we prepare the data to be encapsulated.
June 8, 2006 - Here's how to run the create_seiscap.exe file:
- Create a comma separated list of file names (with paths), file associations and comments.
- Run the batch program. I place the program at the root level of my D: drive and then execute
    d:create_seiscap.exe P172685.BURNT_MALLIK_M3D.archive.csv P172685.BURNT_MALLIK_M3D.seiscap.field
- This will create three files: the data file, the index file and a log file (a human readable report). Note that the log file is a copy of the last file in the data file.
- We are also adding a file association for compressed .zip files. Note that the program design is recursive: we can encapsulate encapsulated files. You can take all those encapsulated files from Kelman or Seitel and do another level of encapsulation!
May 5, 2006 - Met with data management, declared some naming conventions and created a work flow document on the steps to Archive your data. Consistent names are required to facilitate inventory reports. Anyone who is archiving data for EnCana, please use these standards!
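The comma separated list that create_seiscap.exe takes as its first argument could be produced by a few lines of script. The sketch below is only an illustration: the column order (file name, file association, comment) follows the description in the June 8 entry, but the exact quoting rules and the type names the program accepts are not documented here, and the paths, types and comments shown are made up.

```python
import csv
import io


def build_list(entries):
    """Render (path, file_type, comment) tuples as comma separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for path, file_type, comment in entries:
        writer.writerow([path, file_type, comment])
    return buf.getvalue()


# Hypothetical entries; real archives would list every file to encapsulate.
entries = [
    ("info.txt", "TXT", "Summary of the line"),
    ("shots/ffid_0327.sgd", "SEGD", "Field shot 327"),
]
listing = build_list(entries)
print(listing)
```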
May 2, 2006 - Terry has fixed a bug and provided another option. They are:
- The create_seiscap.exe program will now handle fully qualified path names for output. You can now encapsulate from, say, the D: drive and place the archive files on the F: drive. This eliminates an additional archive copy.
- Bob found a bug where the seiscap_extractor.exe program would fail on archives larger than 2gig. It's now been fixed! We can handle multi gigs!
- Pssst: The software is free, it's a standard! You are no longer held hostage by your favorite archive company!
I have also been busy building some scripts to automatically create Containers that can be accessed from both unix and from the Windows side. I even create the info.txt file for each and every seismic line as well as deposit the SEGP file into the Survey_audited directory. It's slick. Click here to see what the containers look like. We create directories for Archive, Basic, Business, Design, Field_shots, Gathers, Stacks, Survey_audited and Survey_original. This 3D is around 25gig and the field shots have been archived into chunks of around 4gig.
April 20, 2006 - Here's a PowerPoint for my recommendation for EnCana's Geophysical Advisory Group (GAG) to approve the SeisCap encapsulation method and to continue to work with Archive companies (Seitel, Kelman), Field Instrument Manufacturers (ARAM, I/O, Sercel) and Seismic Processors (C&C, Geo-X, Sensor). Note the slide that shows the costs of Encapsulation going from $40,000 down to $0 when data are encapsulated directly in the field.
April 19, 2006 - Made numerous changes and SeisCaped about 30gig of field data. Seems to be working! Here's what else has been done.
- Changed the 16bit check sum to CRC-16. This is much more robust; the CRC (cyclic redundancy check) is better at detecting multiple false bits. Check out Lammert Bies' web site for more details. Bill Leakey of Seitel Solutions recommended this implementation.
- Changed the name to SeisCap. Google reports more than a million references to ENCAP and none for SeisCap. We want to be found!
- Updated the create_seiscap.exe program. It will now handle ARAM's PC byte order shot data in Aram-Airies Version 2 format. Click here for the PDF documentation. It also contains the source code for the creator! Now we all can do it. Note that you have to make a file that contains a list of the file names, types and comments before you run the SeisCap_creator. The different file types control what is recorded in the log file.
- Recommend that Stack data be encapsulated as SGY format with a check sum calculated for each trace.
- Recommend that Field segy Shot data be encapsulated in BIN format with a check sum calculated for each file (shot).
- Please note - field data recorded as SEGY format should have a separate file for each shot. Individual shots can have a different number of samples or a different number of traces. Do not try to combine multiple shots together. EnCapsulate them together instead!
March 8, 2006 - Bill Leakey of Seitel Solutions has provided additional reasons why we do not like Vanguard as a format.
March 6, 2006 - The check sum method has been upgraded to a 16bit CRC that is more robust than the 8bit XOR used by Kelman and Seitel. The SeisCap extractor now has the option to verify the check sums. This is especially useful when data are transferred across a network. We have also added a second text file that contains key information relevant to this line. We are going to add a first line that contains an inventory of the archive. Defined archives now exist for prestack and historical.
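To illustrate why a CRC is more robust than an XOR check sum, here is a small Python sketch. It rests on assumptions: this page does not state which CRC-16 polynomial or initial value SeisCap uses, so the common CRC-16/ARC variant (reflected polynomial 0xA001, initial value 0) is used purely as an example, and `xor8` is simply the XOR of all bytes, which may differ in detail from what Kelman and Seitel compute.

```python
def xor8(data):
    """8 bit check sum: XOR of every byte (assumed form of the older method)."""
    check = 0
    for byte in data:
        check ^= byte
    return check


def crc16_arc(data):
    """Bitwise CRC-16/ARC (reflected polynomial 0xA001, initial value 0)."""
    crc = 0x0000
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc


good = b"SEISCAP"
swapped = b"ESISCAP"  # first two bytes exchanged, e.g. a byte order slip

print(xor8(good) == xor8(swapped))            # XOR cannot see reordered bytes
print(crc16_arc(good) == crc16_arc(swapped))  # the CRC values differ
```

Because XOR ignores byte order, any reordering of the same bytes passes the 8 bit check, while a 16 bit CRC catches every error confined to a 16 bit span.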
February 13, 2006 - Cleaned up the description, reserved a byte for future use. We are now working on a check program for data verification.
February 8, 2006 - The SeisCap Viewer has now been updated to handle field data in segy and segd formats. We will be working on examples of sega, segb and segc.
February 2, 2006 - The Proposed SeisCap Encapsulation standard now consists of only two files, a data and an index file. The log of the description of the encapsulation process is now the last file of the data file and contains enough information that can be used to recover the data if the index file is lost.
Seismic Archive Goals
- The data will reside in a format suitable for transfer over networks.
- The data will contain a signature that will ensure that modifications to the data can be detected at any time in the future.
- The data contains the necessary information required to perform data verification, embedded in the format of that data.
- The format has the ability to track errors that occurred in the archiving process.
- The archive format does not need to be extended or modified to support additional data formats as they evolve.
- The archive process logs track the processes and actions that took place during the archive process.
- The Original Data is not altered. The records remain in the original record order and all EOF (End of File) markers are preserved.
- This single format of the archive is independent of the format and media type of the input dataset.
- Information is not added or deleted from the binary data stream. Data order is preserved.
- The data can easily be restored to the original or like media. It is trivial to decode this format. I'll define trivial as less than a day!
What's Wrong with Existing Standards?
In a lot of ways, magnetic tape is our perfect format for encapsulation. It is a sequential, self defining, record oriented storage mechanism that does not depend upon the format definition of the input data. Data are stored based on sequential record order and size of the records. However, tape has two major problems:
- The media is not permanent; the magnetic coating doesn't always stay glued. Tape stiction is a major problem, especially in damp climates. Back in the old days, tape manufacturers recommended the rotation of the physical tape every six months. We stopped doing this during a downsize period.
- Tape is sequential; you have to spin the full tape in order to retrieve the last file. Disk has the advantage that the last record can be retrieved as quickly as the first.
The industry is moving to DVDs for small data sets and to hard drives for large data sets. IDE drives are available in 500gig sizes (February 2006). For a couple of thousand dollars you can build a 2 terabyte RAID 5 server. Everyone appreciates the random access ability of disk.
The SeisCap encapsulation standard is independent of media type. It can change with the future. So why do we need yet another encapsulation format? What is wrong with the existing formats? You can refer to SEG Technical Standards for more information on what has been blessed at the SEG. I was given a copy of an old NDS description of Lacey, Vanguard as well as a slew of other transfer formats. It's worth reading this document. Here's a brief review of existing encapsulation techniques that I have been able to evaluate.
SEG RODE
RODE was initially published in 1996 as a method to encapsulate well logs. The original author is reputed to have said that he would not recommend it for Seismic Data. This method has been adopted by several Oil Companies (Shell, Mobil...), numerous vendors (CGG) etc. We do not support the use of RODE for the following reasons:
- It's very complicated. It's expensive to purchase software to decode RODE. It takes around six months to write a program to decode RODE and it will only work for your current example of RODE. We are providing free software for SeisCap format. Look at the source code and see how trivial it is to write your own software!
- RODE is not a standard. There is too much "wiggle room". Everyone's version of RODE is different. Every time you see a different RODE, you have to get your programmer to fix his code.
- I have even talked with a Geophysicist within SHELL and he does not recommend using RODE.
- The only companies that I can find that recommend RODE are the software vendors that have made the investment to write the code that supports RODE.
- RODE is a format defined around magnetic tape.
VANGUARD - Dual files per shot
Vanguard is a method to transfer files from seismic field tapes to CD or DVD. It takes all the records from tape and splits them up into a header file and a record file. Here's why we don't like it:
- It does not represent a single format. A layout has to be defined for each new format. Formats have been defined so far for sega through segd. To identify the header/data pairs, the file names usually contain a common prefix, 'FILE' or just 'F', and a suffix that separates the header and data portions, for example an extension of '.hb' and '.sb' for segb data. Notice that the selection of letters is critical: the header record has to sort before the data record.
- No check sum information. Double the number of files to look after in an archive.
Processing shops generally like to work with Vanguard. Our encapsulation method provides all the advantages of a proper archive while preserving all the benefits of Vanguard. Note that our examples below have archived Vanguard files. We have even preserved the directory paths. (Paths lacking data have been deleted.) You can even use our viewer to verify the field file numbers.
If you still are not convinced, Bill Leakey of Seitel Solutions has authored the following document to explain why people may not understand the problem we are trying to solve.
LACEY - Eight byte preamble header
Lacey is an encapsulation format for any type of data. It's very simple and consists of a byte stream created by concatenating all blocks of data together, prefixing each with an eight byte preamble header. The first four bytes are a sequence number of the block within the data set and the second four are the byte size of the block directly following the preamble header. Support software is required to extract the blocks correctly by reading the preamble header to determine the size of the next block. Here's why we don't like it:
- Lack of check sum data verification
- Only defined for data blocks less than 2 gig
- No identification for each type of record
- Cannot handle different record types
We recommend that all Lacey data be reformatted into the SeisCap encapsulation format.
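For anyone who needs to unpick Lacey data before re-encapsulating it, the preamble scheme described above can be walked with a short loop. This is a sketch under assumptions: the description here does not give the byte order or signedness of the two preamble fields, so big-endian unsigned 32 bit values are assumed, and the function names are hypothetical.

```python
import io
import struct


def write_lacey_block(stream, seq, block):
    """Append one block with its eight byte preamble (sequence number, size)."""
    stream.write(struct.pack(">II", seq, len(block)))
    stream.write(block)


def iter_lacey_blocks(stream):
    """Yield (sequence number, block bytes) pairs until the stream ends."""
    while True:
        preamble = stream.read(8)
        if not preamble:
            return
        if len(preamble) != 8:
            raise ValueError("truncated preamble")
        seq, size = struct.unpack(">II", preamble)
        block = stream.read(size)
        if len(block) != size:
            raise ValueError("truncated block %d" % seq)
        yield seq, block


# Round trip two made-up blocks through an in-memory stream.
buf = io.BytesIO()
write_lacey_block(buf, 1, b"header")
write_lacey_block(buf, 2, b"trace data")
buf.seek(0)
blocks = list(iter_lacey_blocks(buf))
print(blocks)
```

Note that nothing in the stream says what each block is or whether it arrived intact, which is exactly the weakness listed above.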
Other Encapsulation Techniques
The following techniques have also been examined and rejected:
- Geo-x : Four byte preamble header and postamble trailer. Similar to Lacey but provides the ability to retrieve a file in the reverse direction.
Please let me know if you know of another way to encapsulate seismic data. I'd like to be as complete as I can.
The Proposed SeisCap Encapsulation Format
The SeisCap Encapsulation Format consists of only two files, a data file and an index file. The data file contains enough information for self extraction. The separate index file contains record file lengths and checksum information. Data that has been electronically sent can be verified.
| File | Description |
| --- | --- |
| xxx.yyy.field | The bytes read from the tape with NOTHING added between the data blocks. The first file contains a list of the original file names and file numbers. The second, optional file *info.txt contains a key summary of the data in the archive. The last file is the log file: a human readable record of the archiving (who did it and when) that includes the first 32 bytes of field data plus a list of the record sizes, just in case the index file is lost or corrupt. (This used to be the .log file.) Data to include: field shot data, original field tape labels, field survey, observer's notes, chaining notes, driller's logs, scanned mylar of the oldest processed section that includes a side label. |
| xxx.field.idx | Provides information to find each record/file boundary within the data file. Includes integrity checks (16bit CRC checksum). Note that Kelman uses .index and Seitel uses .indx, so we have to be different as well and use .idx for the index file. This way, a quick glance at the archive and you will know what version you are looking at! |
| xxx.field.log | OPTIONAL separate file. Normally it's the last file in the data stream. It contains the human readable information at the time of encapsulation. It contains a 32 byte hex dump of each record that can be used to identify field file numbers. For data with a file type of SGY, the EBCDIC header is dumped out here. Each record length is also recorded so data can be recovered in case the idx file is lost or missing. |

Note: Archive data sets will also be created for the following additional data types:
- xxx.yyy.stack - Final stacks, processing text files, map, processing side label, velocity semblance plots, front end quality control, processing quality control
- xxx.yyy.prestack - Front end SEGY shot data file, gathers, original field tape labels, field survey, observer's notes, chaining notes, driller's logs etc (some of this is duplicated with field data)
- xxx.yyy.historical - All previous processed versions with included text files
All 4 byte values are stored big-endian (native for the Sun, backwards for Intel)
Index Header
The Index Header occurs once, at the beginning of the file. It contains the version number and the total number of Files and Records.
| Byte offset | Bytes 0 1 2 3 | Format | Description |
| --- | --- | --- | --- |
| 0 | 000 000 000 004 | 32 bit int | Version (currently 4) |
| 4 | | 32 bit int | Number of Files |
| 8 | | 32 bit int | Number of Records |

Note: One segy file consists of a record for the EBCDIC header, a record for the binary (line) header and a record for each trace header and trace data. The Index Record is repeated for each record and EOF of the SeisCap.
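Reading the 12 byte Index Header takes only a few lines. The Python sketch below follows the layout in the table and the big-endian rule stated earlier; treating the three values as unsigned is an assumption, and the function name is hypothetical.

```python
import struct


def parse_index_header(raw):
    """Decode version, number of files and number of records (big-endian)."""
    if len(raw) < 12:
        raise ValueError("index header needs 12 bytes")
    version, n_files, n_records = struct.unpack(">III", raw[:12])
    return version, n_files, n_records


# Made-up header: version 4, 2 files, 7 records.
header = parse_index_header(bytes([0, 0, 0, 4, 0, 0, 0, 2, 0, 0, 0, 7]))
print(header)  # (4, 2, 7)
```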
Index Record
| Byte offset | Bytes 0 1 2 3 | Format | Description |
| --- | --- | --- | --- |
| 0 | 000 000 000 001 | 32 bit int | File Number 1, 2, 3... |
| 4 | | 32 bit int | Field File Number (1-9999) |
| 8 | | 32 bit int | Record Number |
| 12 | | 64 bit int | Start Location |
| 20 | | 64 bit int | Record Length (0=EOF) |
| 28 | | 4 bytes | 8 bit Format, 16 bit CRC Checksum, 8 bit Status |
| 32 | o | 4 bytes | Reserved. Optional 3 ascii character file suffix, e.g. sgd, sgy, tif, zip... |

Here is a list of the Format codes currently defined; contact the author if you would like any others.
Format Codes
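| File Suffix | Format Code (byte) |
| --- | --- |
| csv | v |
| sega | a |
| segb | b |
| segc | c |
| segd | d |
| segy, sgy | y |
| jpg | j |
| tif | t |
| png | g |
| xml | m |
| xls | x |
| ppt | p |
| doc | o |
| lst, txt, prn | l |
| segp, ukooa, survey, sur | s |
| | f |
| zip | z |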
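The 36 byte Index Record and its one byte format code can be decoded with a single unpack call. The Python sketch below makes several assumptions that this page does not confirm: all fields are unsigned, the 64 bit fields are big-endian like the 4 byte ones, the bytes at offset 28 are ordered format, CRC, status as listed in the table, and the last four bytes are returned untouched because their internal layout is not spelled out. The names are hypothetical.

```python
import struct
from collections import namedtuple

# 3 x 32 bit, 2 x 64 bit, 8 bit format, 16 bit CRC, 8 bit status, 4 tail bytes.
INDEX_RECORD = struct.Struct(">IIIQQBHB4s")
assert INDEX_RECORD.size == 36

IndexRecord = namedtuple(
    "IndexRecord",
    "file_number field_file_number record_number start length "
    "format_code crc status tail",
)


def parse_index_record(raw):
    """Decode one 36 byte index record into named fields."""
    if len(raw) != INDEX_RECORD.size:
        raise ValueError("index record must be 36 bytes")
    fields = list(INDEX_RECORD.unpack(raw))
    fields[5] = chr(fields[5])  # format code is a single character, e.g. 'y'
    return IndexRecord(*fields)


# Made-up record: a segy trace of 5,000,000,000 bytes starting at offset 3600.
raw = INDEX_RECORD.pack(1, 327, 3, 3600, 5_000_000_000, ord("y"), 0xBB3D, 1,
                        b"\x00sgy")
rec = parse_index_record(raw)
print(rec)
```

A record with `length == 0` marks an EOF, so an extractor would normally skip the read for it.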
Status Codes
Here is a preliminary list of status codes. Please let me know if you want any more.

| Status Code | Description |
| --- | --- |
| 0 | Unknown Error |
| 1 | Good data - no errors |
| 2 | Short file |
| 3 | ??? |

Important Information for the SeisCap Format
- The first record is a simple ascii flat file that contains a list of original file names and file types (TXT, BIN, SEGY, SEGA, SEGB, SEGC, SEGD) with an optional Notes field. Here is an example for this first file. You can use a Spread Sheet program, Notepad or Wordpad, or the more or type command to view this file. This file contains a lot of useful information. I prefer to use Wordpad because it will properly open up unix text files. This file is to be a comma delimited .csv file.
Or, here's how I do this from unix, I perform six steps:
- Create the info.txt file for the Meta data directly from EDM
- Use the find command to create a list of files
- Re order my sorted list so the important data is at the top
- Run my Perl script to build the list of files with the appropriate list of files
- Manually edit adding additional comments
- Create a DOS .bat file to execute the create_seiscap.exe
- The Data Checksum is now a 16 bit CRC checksum for each record of the data file. Please note that the EBCDIC header, Binary Header, Trace Header plus the data are considered to be records for SEGY format data.
- The Status will normally be a 1 if there are no read errors, otherwise it is a 0
- If an EOF exists on tape, or a series of EOF's, the record length will be zero.
- The log file is the last file in the data file and contains useful, human readable data. The size of each record is located here just in case the index file is corrupt. NOTE: all the data can easily be recovered! Here is an example of the log file for field data and another example for a stack data archive. Note the full EBCDIC header is displayed for each segy file.
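Because nothing is added between the data blocks, the record sizes in the log are enough to rebuild every start location: each record begins where the previous one ended. The Python sketch below shows that running sum. The layout of the log file itself is not described on this page, so the sizes are assumed to have been read out of the log already, and the numbers used are made up.

```python
def rebuild_offsets(record_lengths):
    """Return (start, length) pairs for records stored back to back."""
    entries = []
    start = 0
    for length in record_lengths:
        entries.append((start, length))
        start += length  # an EOF has length 0 and does not move the offset
    return entries


# Made-up segy file: EBCDIC header, binary header, two traces, then an EOF.
lengths = [3200, 400, 4240, 4240, 0]
offsets = rebuild_offsets(lengths)
print(offsets)
```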
Advantages of the SeisCap Data Encapsulation Format
- This format is in the Public Domain and free for anyone to use. A SeisCap creator and a reader have been provided by C&C Systems in Calgary. We even provide the C source code so you can write your own program.
- The maximum data record and file size can now exceed 2gig.
- Simple, 36 byte index record, de-encapsulators are available for free, you can write your own in a couple of hours. Look at our source code to see how simple it is. All you have to do is read the Record Length as a 32 bit data field for data less than 2 gig and as a 64 bit field for larger data sets.
- The format will allow for record sizes greater than 2gig.
- All record types can be encapsulated, it doesn't have to be just seismic. We encapsulate all different data types, graphic images, word processing reports etc.
- It has been pointed out to me that this format will allow multiple index files to be created. You can create your own index, perhaps one that contains XY's.
- This archive format is a modern version of what Kelman, Seitel, Devon, EnCana... have been using for more than 10 years. Your old decoder will still work. There are millions of encapsulated files already in this format. No one knows about it because it has not been published. This web site now publishes the standard. I plan on submitting this "standard" first to the CSEG and then to the SEG Standards Committee for ratification. It should be noted that Seitel are currently using format 1, Kelman uses format 2 and 3. Hence we are now proposing format 4. Here are a few reasons why format 4 is "better":
- Supports files larger than 2gig for Files and Records
- Robust 16bit checksum is used
- A single byte has been reserved for future use
- The 3 letter suffix for each record is stored in the index file (optional)
- The first file contains a list of all the file names. All files can be restored back to their original names.
- The second file, info.txt contains a summary of important information for the archive. A single line summary of the inventory of the data is included.
- Format as defined can encapsulate ANY type of digital data!
- The log file now contains enough information to reconstruct the data if the index file is lost or corrupt. The log file is to be appended to the .stack and .field data stream as the last file. All data are stored in a single file.
- More format codes have been defined.
- All processing centers in Calgary can read data in this archive format.
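The remark above about reading the Record Length as either a 32 bit or a 64 bit field can be illustrated as follows. This is an inference rather than something stated on this page: if the 64 bit value at byte 20 is stored big-endian, its low 32 bits sit at bytes 24 to 27, so a decoder that reads only those four bytes gets the right answer for small records and a wrong one for large records. The Python sketch uses unsigned values and made-up records.

```python
import struct


def make_record(length):
    """Build a made-up 36 byte index record with the given record length."""
    # file no, field file no, record no, start, length, then 8 filler bytes
    return struct.pack(">IIIQQ8s", 1, 327, 3, 3600, length, bytes(8))


def length_as_32bit(rec):
    """Read only the low four bytes of the length field (bytes 24 to 27)."""
    return struct.unpack_from(">I", rec, 24)[0]


def length_as_64bit(rec):
    """Read the full eight byte length field starting at byte 20."""
    return struct.unpack_from(">Q", rec, 20)[0]


small = make_record(4240)
big = make_record(5_000_000_000)

print(length_as_32bit(small), length_as_64bit(small))
print(length_as_32bit(big), length_as_64bit(big))
```

For the large record the 32 bit read silently drops the high bytes, which is why a decoder meant for archives beyond the old limit should always read the full 64 bit field.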
The SeisCap Extractor
Here is an example of a simple archive extractor written by C&C Systems here in Calgary. This Windows XP program is free and you can download it by right clicking on this link and saving the executable file (it's now grown to 2.9meg in size). We are encouraging anyone else who has written a better one to place it into the public domain!
We are working on preparing a data set you can use to see how this all works. We will have a series of SEGA, SEGB, SEGC, SEGD, SEGY files you can use to verify your application.
SeisCap Viewer for field SEGD data
[Screenshot: SeisCap Extractor for Field SEGD]
This archive contains a total of 399 files. The first file is a table of contents and it looks like this. It contains three columns, a file name, a file suffix and a description field. For sega, segb, segc and segd the first 32 bytes will be dumped into a hex viewer. See how easy it is to check your data, we can easily see the field file number eg 0327 and the format of the data, in this case 8058, 32 bit IEEE demultiplexed.
The Processors out there will like this format. You can easily verify the field file number for both the header and data files.
This archive file contains all data required for processing, survey, chaining, observer and drillers notes. In this case they have been scanned. Even the label of the scanned tape has been captured into our archive file. For this example, we have taken a CD of Vanguard data to the AEC standard and archived the data!
SeisCap Viewer for field SEGY data
This archive contains a total of only 11 files. It has the same basic data files: tape labels, chaining, driller's and observer's notes as well as the Field survey. The field shots are now a single segy file. The last file is the log of this archive (at the time of encapsulation). Examine this log and you will view the first 32 bytes of data for each File and Record as well as the length of each record. For text files, the spaces and carriage returns have been removed.
[Screenshot: SeisCap Extractor for Field SEGY]
IMPORTANT - this single file contains all the information to reconstruct your data! The index file is not critical! It's a nice to have (it contains data verification information: check sums for all the records).
SeisCap Viewer for stack SEGY data
This archive contains a total of only 23 files for the final processing report. In addition to the segy stack files it includes:
- Final survey
- The Seismic Line label as a graphic .png file
- The Processing text file
- The Processing Quality Control report (16meg)
- The Survey as provided to the processor
- All shot records with all geometry and statics in the trace headers and not applied to the data. This file can be used for further reprocessing. I might even process the data using this file!
- The seismic segy files. Note that they are in EnCana Workstation segy format and need only to be attached in SeisX or SeisWare!
- Chaining, driller's and observer's notes.
- Log of the archive. Note the record and record length can be used in a pinch to de-encapsulate this file if the index file is lost. Click here for the log file
[Screenshot: SeisCap Extractor for Stack SEGY]
The data can easily be extracted, or the view button will automatically launch your associated raster graphics viewer, word processor or segy viewer. Pssst, we are building a segy viewer that only requires the number of samples and data sample format. It should work for most data sets without having to read traces and shot points.
Here's what the files look like: there is a stack file that contains all the data, an index file that contains pointers to the data with check sum information, and a program to extract the data.
    -rwxr-xr-x 1 ekeyser expl 2101584 Feb 1 09:18 A44263.MKD-2.200601.cnc.idx
    -rwxr-xr-x 1 ekeyser expl 625381298 Feb 1 09:19 A44263.MKD-2.200601.cnc.stack
    -rwxr-xr-x 1 ekeyser expl 775680 Feb 1 09:19 seiscap_extractor.exe
That's it...
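As a quick sanity check on the listing above, the size of the .idx file is consistent with the layout described earlier, assuming the file holds nothing but the 12 byte Index Header followed by 36 byte Index Records. The Python sketch below does the arithmetic; the function name is hypothetical.

```python
HEADER_BYTES = 12   # version, number of files, number of records
RECORD_BYTES = 36   # one index record


def index_record_count(idx_size):
    """Number of 36 byte records implied by the size of an index file."""
    body = idx_size - HEADER_BYTES
    if body < 0 or body % RECORD_BYTES != 0:
        raise ValueError("size does not fit a 12 byte header plus 36 byte records")
    return body // RECORD_BYTES


count = index_record_count(2101584)
print(count)  # 58377
```

The size divides evenly, which suggests this stack archive is described by 58377 index records.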
Site Owner: Eric Keyser