Following is a copy of the first draft of The "GeoTIFF Box" Specification for JPEG 2000 Metadata.
*** DRAFT ***
The "GeoTIFF Box" Specification for JPEG 2000 Metadata
Version 0.0
30 April 2004
*** DRAFT ***
Michael P. Gerlek, editor
mpg(AT)lizardtech(DOT)com
LizardTech, Inc.
1008 Western Ave Suite 200
Seattle, WA 98104 USA
This is a DRAFT document. Comments welcome.
Sections in [brackets] are editorial asides, calling out specific questions or details to be resolved.
This document is copyright LizardTech, Inc, 2004. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct or commercial advantage and this copyright notice appears.
LizardTech, Inc assumes no liability for any special, incidental, indirect or consequences of any kind, or any damages whatsoever resulting from loss of use, data or profits, whether or not advised of the possibility of damage, and on any theory of liability, arising out of or in connection with the use of this specification.
This specification describes a GeoTIFF-based method for adding geospatial metadata to a JPEG 2000 file. While the actual specification is not at all complex or hard to implement, there has been some confusion about what it actually is, what restrictions apply to its use, etc. Enough people have asked about it that we considered it worthwhile to put something on paper and have it reviewed by some independent developers.
Note the intent of this document is only to codify existing practice as of this writing; no modifications or extensions to this specification are planned or expected.
Mapping Science Inc. (MSI) provided the first implementation of this specification in their GeoJP2(tm) encoder product in 2003. At that time, the definition of the specification was available only under certain licensing restrictions from MSI.
LizardTech, Inc. acquired the assets of Mapping Science in 2004. It is LizardTech's position that this specification should be publicly available for anyone to implement. Neither JPEG 2000 nor GeoTIFF is a proprietary standard; the combination should not be either.
Note that "GeoJP2" is a trademark that refers to the original MSI encoder (now owned by LizardTech). Please don't use the term "GeoJP2" to refer to this metadata specification -- we don't want this specification to be encumbered by trademark issues.
Two UUID boxes are defined.
The first, called the GeoTIFF box, contains a degenerate GeoTIFF file as described in section 2.
The second, called the world file box, contains the usual six doubles as in an external world (.wld) file, plus some additional version information. This is described in section 3. Presence of the world file box is optional.
This specification assumes a compliant JP2 file with only one codestream box.
The GeoTIFF box provides a simple mechanism for a JP2 file to have the same level of geospatial metadata as is provided by the widely supported GeoTIFF standard, using the normal GeoTIFF implementations.
The UUID for this box is
static unsigned char geotiff_box[16] = { 0xb1, 0x4b, 0xf8, 0xbd, 0x08, 0x3d, 0x4b, 0x43, 0xa5, 0xae, 0x8c, 0xd7, 0xd5, 0xa6, 0xce, 0x03 };
This box contains a valid GeoTIFF image. The image is "degenerate", in that it represents a very simple image with specific constraints:
The TIFF image is to be encoded in little endian format. [Note that an early and possibly unreleased MSI encoder seems to have used big endian form, but the GeoTIFF data appears corrupt.]
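The byte-order constraint can be checked from the first bytes of the embedded TIFF data: a little-endian TIFF file starts with the bytes 'I', 'I', 0x2a, 0x00. The following is a minimal sketch of such a check; the function name is hypothetical and is not part of this specification.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the buffer starts with a little-endian TIFF header
   ("II" followed by the 16-bit magic number 42), 0 otherwise.
   The function name is illustrative only. */
static int geotiff_payload_is_little_endian(const unsigned char *buf, size_t len)
{
    static const unsigned char le_magic[4] = { 0x49, 0x49, 0x2a, 0x00 };

    if (buf == NULL || len < 8)  /* a TIFF header is 8 bytes long */
        return 0;
    return memcmp(buf, le_magic, sizeof le_magic) == 0;
}
```

A reader might run this check before handing the data to a GeoTIFF library, and treat a big-endian ("MM") payload with suspicion given the note above.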
The intent is that any compliant GeoTIFF reader/writer will be able to read/write this image.
Note that the TIFF image properties -- width, bitdepth, etc -- do NOT reflect the image properties of the JP2 image. These image properties are not to be used in the interpretation of the geospatial metadata.
Other TIFF image properties may be present; if so, they should be similarly ignored.
[ If the TIFF image properties do not meet the constraints above, the geospatial information represented by this box should be considered to be undefined. ]
The GeoTIFF image may contain TIFF metadata tags. These should be ignored; they do not apply to the JP2 image.
The GeoTIFF image may contain any number of GeoTIFF keys, as allowed by the GeoTIFF standard. These keys define the geospatial metadata of this box and of the JP2 image itself.
[ This description is based on my reading of the MSI source code; I will have to flesh this out as I become more confident of it. Alternatively, at some point if I can get the code suitably cleaned up I may just publish the implementation itself... If anyone needs this information now, feel free to contact me. ]
The world file box contains one or more "chunks" of metadata of various types. The most common chunk type encodes the normal six-doubles style of geopositioning information found in the conventional external world files often used with some image types.
[ Other chunk types were used to indicate the operating system the JP2 image was created on, the MSI command line used, and arbitrary user-defined bytes. It is not clear if these other chunk types were ever widely used or not. I will attempt to define these other chunks, but their use should be considered to be OBSOLETE and not used in future implementations. ]
The UUID for this box is
static unsigned char world_box[16] = { 0x96, 0xa9, 0xf1, 0xf1, 0xdc, 0x98, 0x40, 0x2d, 0xa7, 0xae, 0xd6, 0x8e, 0x34, 0x45, 0x18, 0x09 };
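Since the contents of a JP2 UUID box begin with the 16-byte identifier, a reader can tell the two boxes apart by comparing those bytes against the values given in this document. The sketch below assumes the caller has already located a UUID box and passes in its contents; the enum and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

enum geo_box_kind { GEO_BOX_NONE = 0, GEO_BOX_GEOTIFF, GEO_BOX_WORLD };

/* The two identifiers, copied from this document. */
static const unsigned char geotiff_box_id[16] = {
    0xb1, 0x4b, 0xf8, 0xbd, 0x08, 0x3d, 0x4b, 0x43,
    0xa5, 0xae, 0x8c, 0xd7, 0xd5, 0xa6, 0xce, 0x03 };
static const unsigned char world_box_id[16] = {
    0x96, 0xa9, 0xf1, 0xf1, 0xdc, 0x98, 0x40, 0x2d,
    0xa7, 0xae, 0xd6, 0x8e, 0x34, 0x45, 0x18, 0x09 };

/* 'contents' points at the contents of a JP2 UUID box, i.e. the
   16-byte identifier followed by the box data (an assumption about
   how the caller walks the file). */
static enum geo_box_kind classify_uuid_box(const unsigned char *contents, size_t len)
{
    if (contents == NULL || len < 16)
        return GEO_BOX_NONE;
    if (memcmp(contents, geotiff_box_id, 16) == 0)
        return GEO_BOX_GEOTIFF;
    if (memcmp(contents, world_box_id, 16) == 0)
        return GEO_BOX_WORLD;
    return GEO_BOX_NONE;
}
```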
The first bytes in the box, which we will call the "header", give some versioning information and the number of chunks in the box. The "chunks" themselves then follow, laid out as contiguous bytes. The box ends with a small amount of data in what we will call the "footer".
Bytes 0-3: 'M', 'S', 'I', 'G'.
Bytes 4-5: major and minor version numbers (shifted and packed together) - the actual values of these numbers may not be used for anything
Bytes 6-13: feature set flags - current values are {1, 0, 0, 0, 0, 0, 0, 0}
  - first flag controls interpretation of the world file values, see section 3.2.2.1
  - second flag indicates Windows or Linux build of encoder; not used for anything?
  - remaining flags undefined (leave as zero)
Byte 14: number of chunks in the box?
Byte 15: next box?; apparently always 0
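The header layout above could be read along the following lines. This is only a sketch: the struct and function names are hypothetical, the two version bytes are kept raw because their packing is not defined here, and the byte offsets are assumed to be counted from the first byte after the 16-byte UUID.

```c
#include <stddef.h>
#include <string.h>

#define WORLD_BOX_HEADER_SIZE 16

/* Hypothetical in-memory view of the 16-byte header. */
struct world_box_header {
    unsigned char version[2];   /* bytes 4-5, kept raw: packing not specified */
    unsigned char flags[8];     /* bytes 6-13, feature set flags */
    unsigned char chunk_count;  /* byte 14 */
    unsigned char next_box;     /* byte 15, apparently always 0 */
};

/* 'data' is assumed to point just past the 16-byte UUID.
   Returns 0 on success, -1 if the buffer is too short or the
   'M','S','I','G' signature is missing. */
static int parse_world_box_header(const unsigned char *data, size_t len,
                                  struct world_box_header *out)
{
    if (data == NULL || out == NULL || len < WORLD_BOX_HEADER_SIZE)
        return -1;
    if (memcmp(data, "MSIG", 4) != 0)
        return -1;
    memcpy(out->version, data + 4, 2);
    memcpy(out->flags, data + 6, 8);
    out->chunk_count = data[14];
    out->next_box = data[15];
    return 0;
}
```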
The next bytes in the file correspond to the serialization of each chunk. There may be zero or more chunks present; each chunk type may appear at most once.
The chunk format appears to be a simple header of six bytes, followed by the chunk-specific data.
Byte 0: chunk index, used to indicate type of chunk
Byte 1: chunk properties [not used?]
Bytes 2-5: chunk length (including these six bytes) - stored as little-endian unsigned int
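As an illustration of the six-byte chunk header, the sketch below decodes the index, the properties byte and the little-endian length without relying on the byte order of the host. The struct and function names are hypothetical, and the sanity checks on the length are an addition, not something this description requires.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_HEADER_SIZE 6

/* Hypothetical decoded form of the six-byte chunk header. */
struct chunk_header {
    uint8_t  index;       /* byte 0: chunk type */
    uint8_t  properties;  /* byte 1: apparently unused */
    uint32_t length;      /* bytes 2-5: total length, header included */
};

/* Assemble a little-endian unsigned 32-bit value byte by byte. */
static uint32_t read_le_u32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* 'avail' is the number of bytes remaining from 'data' onward.
   Returns 0 on success, -1 if the header is truncated or the stated
   length is smaller than the header or larger than what is available. */
static int parse_chunk_header(const unsigned char *data, size_t avail,
                              struct chunk_header *out)
{
    if (data == NULL || out == NULL || avail < CHUNK_HEADER_SIZE)
        return -1;
    out->index = data[0];
    out->properties = data[1];
    out->length = read_le_u32(data + 2);
    if (out->length < CHUNK_HEADER_SIZE || out->length > avail)
        return -1;
    return 0;
}
```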
Byte 0: chunk index (equal to 0)
Byte 1: chunk properties
Bytes 2-5: chunk length (equal to 2 + 4 + 6*8) - stored as little-endian unsigned int
Bytes 6-13: x scale (resolution)
Bytes 14-21: x rotation
Bytes 22-29: y rotation
Bytes 30-37: y scale (resolution)
Bytes 38-45: x upper-left
Bytes 46-53: y upper-left
The six geo values are stored as little-endian doubles.
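A reader for the world chunk might look like the sketch below. It assumes the host's double is an 8-byte IEEE 754 value with the same byte order as its 64-bit integers, which holds on common platforms but is not guaranteed by C; the struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORLD_CHUNK_SIZE 54  /* 2 + 4 + 6*8 */

/* Hypothetical decoded form of the world chunk, in stored order. */
struct world_chunk {
    double x_scale, x_rotation, y_rotation, y_scale;
    double x_upper_left, y_upper_left;
};

/* Decode a little-endian double: build the 64-bit pattern from the
   most significant byte (p[7]) down, then copy it into a double.
   Assumes an 8-byte IEEE 754 double on the host. */
static double read_le_double(const unsigned char *p)
{
    uint64_t bits = 0;
    double value;
    int i;

    for (i = 7; i >= 0; --i)
        bits = (bits << 8) | p[i];
    memcpy(&value, &bits, sizeof value);
    return value;
}

/* Returns 0 on success, -1 if this is not a well-formed world chunk. */
static int parse_world_chunk(const unsigned char *data, size_t avail,
                             struct world_chunk *out)
{
    uint32_t length;

    if (data == NULL || out == NULL || avail < WORLD_CHUNK_SIZE)
        return -1;
    length = (uint32_t)data[2] | ((uint32_t)data[3] << 8) |
             ((uint32_t)data[4] << 16) | ((uint32_t)data[5] << 24);
    if (data[0] != 0 || length != WORLD_CHUNK_SIZE)
        return -1;
    out->x_scale      = read_le_double(data + 6);
    out->x_rotation   = read_le_double(data + 14);
    out->y_rotation   = read_le_double(data + 22);
    out->y_scale      = read_le_double(data + 30);
    out->x_upper_left = read_le_double(data + 38);
    out->y_upper_left = read_le_double(data + 46);
    return 0;
}
```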
The first feature flag (defined in section 3.2.1) controls the interpretation of these values. According to the comments in the source code, if set to 1 then the following applies:
"This was instituted with version 1.03.11 (May 15, 2003) to signify that we clarified the definition of the georeferencing data and found out that that data represents the upper left corner of the upper left pixel, not the center as we had thought, so the [world chunk values are] not equal to the geotiff data, but is shifted by 0.5*scale to the center of the pixel."
If the world chunk is present, these values should override the corresponding values in the GeoTIFF box.
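One possible reading of the quoted comment is that, when the first feature flag is 1, the world chunk position refers to the center of the upper-left pixel and differs from the corner-based GeoTIFF position by half a pixel. The sketch below shows that arithmetic only. It is an interpretation, not a confirmed rule: it ignores the rotation terms, it assumes the scales are signed as in a conventional world file (y scale negative for north-up images), and the names are hypothetical.

```c
/* Hypothetical pair of map coordinates. */
struct pixel_origin { double x, y; };

/* Convert a pixel-center position to the corresponding upper-left
   corner by removing the half-pixel shift described in the quoted
   comment. Rotation terms are ignored (an assumption). */
static struct pixel_origin center_to_corner(double x_center, double y_center,
                                            double x_scale, double y_scale)
{
    struct pixel_origin corner;

    corner.x = x_center - 0.5 * x_scale;
    corner.y = y_center - 0.5 * y_scale;
    return corner;
}
```

For example, with an x scale of 2.0 and a y scale of -2.0, a center at (101, 199) corresponds under this reading to a corner at (100, 200).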
Byte 0: chunk index (equal to 1)
Byte 1: chunk properties
Bytes 2-5: chunk length - stored as little-endian unsigned int
Bytes 6..n: user-defined data (chunk length minus 6 bytes)
Byte 0: chunk index (equal to 2)
Byte 1: chunk properties
Bytes 2-5: chunk length - stored as little-endian unsigned int
Bytes 6..n: command-line string (chunk length minus 6 bytes)
Byte 0: chunk index (equal to 3)
Byte 1: chunk properties
Bytes 2-5: chunk length - stored as little-endian unsigned int
Bytes 6..n: unknown data (chunk length minus 6 bytes)
The footer, coming after the chunk data, is six bytes long.
Byte 0: set to 0xff
Byte 1: set to 0x00
Bytes 2-5: file offset of next world file box?
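Putting the header, chunks and footer together, a reader could sanity-check the overall layout of a world file box roughly as follows. This is a sketch under several assumptions: the buffer starts just past the 16-byte UUID, byte 14 of the header really is the chunk count (marked as uncertain above), and the last four footer bytes are not interpreted. The function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assemble a little-endian unsigned 32-bit value byte by byte. */
static uint32_t read_le_u32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walks header, chunks and footer. Returns the number of chunks
   stepped over, or -1 if the layout does not match the description. */
static int walk_world_box(const unsigned char *data, size_t len)
{
    size_t pos = 16;  /* chunks start right after the 16-byte header */
    unsigned int count, i;

    if (data == NULL || len < 16 + 6 || memcmp(data, "MSIG", 4) != 0)
        return -1;
    count = data[14];
    for (i = 0; i < count; ++i) {
        uint32_t chunk_len;

        if (len - pos < 6)
            return -1;  /* no room for a chunk header */
        chunk_len = read_le_u32(data + pos + 2);
        if (chunk_len < 6 || chunk_len > len - pos)
            return -1;  /* length is inconsistent with the buffer */
        pos += chunk_len;
    }
    /* The footer is six bytes: 0xff, 0x00, then four bytes that are
       left uninterpreted here. */
    if (len - pos < 6 || data[pos] != 0xff || data[pos + 1] != 0x00)
        return -1;
    return (int)count;
}
```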