It turns out we have a number of files on our system that were labeled as sgy but are really graphic files. To figure out automatically what these files are, all you have to do is check the first few byte positions of each file against the signatures listed here.
Graphic Image Signature Bytes
Here is the reference I used for the graphic images; I have verified that it works: http://jasperreports.sourceforge.net/api/net/sf/jasperreports/engine/util/JRTypeSniffer.html
- Gif - 71, 73, 70
- Jpeg - 255, 216
- Png - 137, 80, 78, 71, 13, 10, 26, 10 (we probably need to check only the first few byte positions)
- Tif - 73, 73 (little endian) or 77, 77 (big endian)
- SEGZ - 83, 69, 71, 90
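As a rough illustration of the check described above, here is a minimal Python sketch that compares the start of a file against the byte values in this list. The names `SIGNATURES`, `sniff_bytes`, and `sniff_file` are made up for this example and are not taken from the JRTypeSniffer reference. Longer signatures are tested first, and the two-byte ones (Jpeg, Tif) are weak enough that they could occasionally match unrelated files.

```python
# Hypothetical helper: the byte values come from the list above,
# the names and structure are assumptions made for this sketch.
SIGNATURES = [
    ("png", bytes([137, 80, 78, 71, 13, 10, 26, 10])),
    ("segz", bytes([83, 69, 71, 90])),
    ("gif", bytes([71, 73, 70])),
    ("jpeg", bytes([255, 216])),
    ("tif", bytes([73, 73])),  # little endian
    ("tif", bytes([77, 77])),  # big endian
]


def sniff_bytes(header: bytes) -> str:
    """Return a label for the first signature that matches the header."""
    for label, signature in SIGNATURES:
        if header.startswith(signature):
            return label
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file (the longest signature) and classify it."""
    with open(path, "rb") as handle:
        return sniff_bytes(handle.read(8))
```

Calling `sniff_file` on each file labeled sgy and collecting the ones that do not come back as "unknown" should be enough to flag the mislabeled graphics.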
Does anyone know a simple way to detect a PDF file, a Word document, or an Excel spreadsheet?
Site Owner: Eric Keyser