Remote sensing image transformation with cosine and wavelet method for SPACeMAP Visualization

Remote sensing imagery at LAPAN has been managed and distributed through many media and devices, such as CCT, DCRSy, tape, CD, DVD and hard drives. The data are available as Quicklook images (JPG or BMP) and as raw data (GeoTiff or the ground station data format). Each remote sensing dataset, such as Landsat, SPOT, radar and high-resolution data, also carries its own metadata. A collection of this size requires a system that can organize, store and distribute a very large volume of data, and managing it requires research on metadata structures and standards. The types and formats of remote sensing data vary, but the most widely used raster format is GeoTiff. Metadata for every raster data format is already standardized under ISO 19139 and ISO 19115. SPACeMAP is a new web catalog system that uses the simplest and latest raster data formats to display mosaic images, data processing results, raw data and so on. The structure of raster data formats such as GeoTiff and ECW is very flexible, although each format must meet certain criteria. Metadata is treated in the same way as the raster format, which should minimize errors.


Introduction
The development of remote sensing imagery catalog systems started with managing large amounts of data manually. Computers, from simple to complex technology, have since been used to manage and display the imagery. As the internet came into use around the world, web-based GIS mapping developed gradually. Today, easy access to data storage, data management and high-capacity networks makes it possible to build modern systems for cataloging remote sensing imagery. Landsat data, for example, can now be downloaded free of charge. Imagery from Landsat 1 through Landsat 8 can be processed with Google Earth Engine for geometric and radiometric operations, and several vegetation indices can be derived by combining its bands. Before recent technology provided tools for managing large data volumes, maintaining data storage in older systems was difficult. Today, imagery is collected in on-board storage inside the satellite, transferred directly or indirectly to the ground station, and kept in advanced storage such as NAS or SAN, where it can be managed easily. Digital imagery is stored in various formats such as bitmap, GIF, JPEG (JPG), ECW, Tiff, GeoTiff and others. The mosaic is processed by the image processing team at the Remote Sensing Data and Technology Center, LAPAN. An Erdas Apollo server is used to operate the catalog, including creating footprints and Quicklooks, running the relational database management system and performing several simple on-the-fly image processing operations. The GeoTiff (BigTiff) file format can hold files larger than 4 GB, but displaying such files on SPACeMAP takes a long processing time. Converting to another format such as ECW compresses the large image into a much smaller file while maintaining its visual quality.

Aim and Goal
This research aims to develop the BDPJN (Bank Data Penginderaan Jauh Nasional, the National Remote Sensing Data Bank) system for maintaining and managing standard data processing results, and to support the SPACeMAP system. It also aims to ensure standards for the data, metadata and storage of processed imagery. A specific goal is to increase the efficiency of image display on SPACeMAP, which is why the target raster format is JP2/ECW. The compression methods of these image file formats are described and tested to assess image quality after compression.

Methodology
The methodology consists of a literature study that explores the basic concepts, definitions and foundations of the research. The image compression methods used by the JPEG and ECW file formats are explained. The definition and functionality of metadata follow international geospatial standards.

Literature Review
The literature review started by searching for remote sensing image file formats and for systems that distribute them effectively. One reference paper is "Interoperability in planetary research for geospatial data analysis" [1]. The planetary science community uses interoperable methods for accessing and working with geospatial data. This community has targeted two file formats, GeoTiff and GeoJPEG 2000, while GeoFITS (Flexible Image Transport System) is not yet widely used, although FITS is a standard format in the Virtual Observatory, is compatible with the Planetary Data System archiving specification and is supported by a large number of software tools. Interoperability of metadata, data portals, web services, cartography and GDAL-based tools is the major point of this paper. [2] explains how to organize earth observation data inside a spatial data infrastructure. The paper presents an automatic metadata extraction method that collects metadata from various satellites based on the ISO 19115 standard. This rule-based method has extracted metadata from many different remote sensing images, such as Landsat, MODIS, S-NPP VIIRS, RapidEye, Sentinel-1 and Sentinel-2. Automatic metadata extraction also replaces manual metadata searching, which usually consumes more time and causes data duplication.
The next papers concern how image compression methods have been implemented, such as the discrete cosine transform [3]. The paper explains how the discrete cosine transform transforms an image and compresses it, as in JPEG [4]. The paper compares the performance of digital image compression with the discrete cosine transform, the wavelet transform and a hybrid of both. Its experiments show that the discrete cosine transform produces a mean square error (MSE) of 28.55 and a peak signal-to-noise ratio (PSNR) of 36.03. For the wavelet method, the MSE is lower at 28.02 and the PSNR is higher at 38.58. These results mean that the wavelet transform performs better than the discrete cosine transform: a smaller MSE means less image damage caused by the transformation, and a higher PSNR means a smaller difference between the compressed image and the original.

Metadata
Metadata is information about data, or a file that contains a description of data. The definition most used in remote sensing is geospatial metadata: metadata that applies to an object with an explicit or implicit geographic extent, in other words an object that can be related to a map, a globe or the earth's surface. Metadata for rasters comes in various types, and its arrangement follows the ISO 19115 and ISO 19139 standards. Attributes and parameters are configured to the standard metadata structure defined by these rules. ISO 19115:2003 can be applied to creating a database catalog (limited in this research to a remote sensing imagery catalog) and can describe an image collection completely. The standard can also be used for digital images and extended to other forms of data such as maps, charts and text documents. ISO 19115 has 409 elements, including 22 core elements for describing data, along with compound elements. The standard consists of 11 main elements, including identification, boundary, data quality, spatial representation, reference system, data information, reference of catalog portal, distribution, additional information and application scheme information.

Compression Method
Compression here is a function, or transformation operator, that changes one set of numbers into a new set of numbers. In this research, the term transformation refers to changing the numerical values of an image. There are many compression methods, but this research is limited to the cosine and wavelet methods.

Cosine Method
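The two-dimensional Discrete Cosine Transform (DCT) of type 2, which maps an image A to a transformed image B, is given by

$$B_{pq} = \alpha_p \alpha_q \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} A_{mn} \cos\frac{\pi (2m+1) p}{2M} \cos\frac{\pi (2n+1) q}{2N}, \qquad 0 \le p \le M-1, \quad 0 \le q \le N-1$$

where

$$\alpha_p = \begin{cases} 1/\sqrt{M}, & p = 0 \\ \sqrt{2/M}, & 1 \le p \le M-1 \end{cases} \qquad \alpha_q = \begin{cases} 1/\sqrt{N}, & q = 0 \\ \sqrt{2/N}, & 1 \le q \le N-1 \end{cases}$$

M and N are the row and column sizes of A, and p and q are the row and column indices of B, which has the same size as A. JPEG uses the discrete cosine transform in its compression step and applies the inverse transform to turn the transformed image back into the output image. The same inverse procedure applies to the Fourier and wavelet transforms.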
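As a minimal sketch of the two-dimensional DCT of type 2 in its orthonormal form, the following pure-Python function evaluates the double sum term by term. The function name `dct2_direct` and the constant 4x4 test block are illustrative assumptions and are not part of the original experiment.

```python
import math

def dct2_direct(a):
    """Orthonormal 2-D DCT-II of a list-of-lists image, evaluated term by term."""
    m_size, n_size = len(a), len(a[0])
    out = [[0.0] * n_size for _ in range(m_size)]
    for p in range(m_size):
        alpha_p = math.sqrt((1.0 if p == 0 else 2.0) / m_size)
        for q in range(n_size):
            alpha_q = math.sqrt((1.0 if q == 0 else 2.0) / n_size)
            total = 0.0
            for m in range(m_size):
                for n in range(n_size):
                    total += (a[m][n]
                              * math.cos(math.pi * (2 * m + 1) * p / (2 * m_size))
                              * math.cos(math.pi * (2 * n + 1) * q / (2 * n_size)))
            out[p][q] = alpha_p * alpha_q * total
    return out

# Hypothetical 4x4 block with a constant grey value of 100.
block = [[100.0] * 4 for _ in range(4)]
coeffs = dct2_direct(block)
```

For a constant block all of the signal energy ends up in the coefficient at p = 0, q = 0, which is why thresholding small coefficients can discard many values while changing a smooth image very little.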

Wavelet Method
A wavelet is a wave-like oscillation with an amplitude that begins at zero, increases and then decreases back to zero. A wavelet transform is a mathematical function that divides a given function or continuous-time signal into different scale components. Wavelet transforms can be discrete or continuous; the continuous wavelet transform is not discussed here because the compression process uses the discrete wavelet transform.

Discrete Wavelet Transformation
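According to R. Gonzalez and R. Woods in their book Digital Image Processing, the discrete wavelet transform in one dimension is defined as

$$W_\varphi(j_0, k) = \frac{1}{\sqrt{M}} \sum_{n} f(n)\, \varphi_{j_0,k}(n) \qquad (1)$$

$$W_\psi(j, k) = \frac{1}{\sqrt{M}} \sum_{n} f(n)\, \psi_{j,k}(n) \quad \text{for } j \ge j_0 \qquad (2)$$

Here the wavelet series expansion coefficients for f(x) in equations (1) and (2) become the forward DWT coefficients for the sequence f(n), where M is the length of the sequence.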
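To make the one-dimensional forward DWT concrete, the sketch below evaluates the approximation coefficients (sum of f(n) times the scaling function, divided by the square root of M) and the detail coefficients (the same with the wavelet function) for a four-sample sequence. The Haar basis, the sample positions n/M and all function names are assumptions chosen for illustration; the paper's experiments use the Daubechies family instead.

```python
import math

def haar_phi(x):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def haar_psi(x):
    """Haar wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def scaled(func, j, k, x):
    """Dilated and translated basis function 2^(j/2) * func(2^j * x - k)."""
    return 2.0 ** (j / 2.0) * func(2.0 ** j * x - k)

def dwt_coefficients(f, j0=0):
    """Forward DWT of a sequence whose length M is a power of two."""
    m = len(f)
    levels = int(math.log2(m))
    xs = [n / m for n in range(m)]  # sample positions in [0, 1)
    norm = 1.0 / math.sqrt(m)
    approx = [norm * sum(fn * scaled(haar_phi, j0, k, x) for fn, x in zip(f, xs))
              for k in range(2 ** j0)]
    detail = {}
    for j in range(j0, levels):
        for k in range(2 ** j):
            detail[(j, k)] = norm * sum(
                fn * scaled(haar_psi, j, k, x) for fn, x in zip(f, xs))
    return approx, detail

# Short illustrative sequence of length M = 4.
approx, detail = dwt_coefficients([1.0, 4.0, -3.0, 0.0])
```

The single approximation coefficient carries the scaled sum of the sequence, while the detail coefficients at scales j = 0 and j = 1 capture its coarse and fine differences.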

DWT in two-dimension
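The DWT of an image f(x, y) of size M × N is defined by the following equations:

$$W_\varphi(j_0, m, n) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, \varphi_{j_0,m,n}(x, y)$$

$$W_\psi^{i}(j, m, n) = \frac{1}{\sqrt{MN}} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, \psi^{i}_{j,m,n}(x, y), \qquad i = \{H, V, D\}$$

where i takes the values H, V and D (horizontal, vertical and diagonal). The inverse discrete wavelet transform is

$$f(x, y) = \frac{1}{\sqrt{MN}} \sum_{m} \sum_{n} W_\varphi(j_0, m, n)\, \varphi_{j_0,m,n}(x, y) + \frac{1}{\sqrt{MN}} \sum_{i=H,V,D} \sum_{j=j_0}^{\infty} \sum_{m} \sum_{n} W_\psi^{i}(j, m, n)\, \psi^{i}_{j,m,n}(x, y)$$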

Daubechies Wavelet
The Daubechies wavelets are a family of orthogonal wavelets that define a DWT and are characterized by their number of vanishing moments N. This number determines the shape of the wavelet when it is plotted. For N equal to two and four the wavelets are denoted db2 and db4; for N = 40 the number of pulses in the wavelet increases, as can be seen in the plot of the db40 wavelet. A higher N yields more pulses.

Wavelet Decomposition
When a wavelet decomposes an image, the process can be described as in Figure 3: the rows of the image are convolved with low-pass and high-pass filters. The filtered image is then downsampled, and the even-indexed rows are kept in the decomposition process.
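The filter-and-downsample scheme can be sketched for a single decomposition level as follows. The Haar filter pair is an assumption made to keep the example short (the experiments in this paper use Daubechies wavelets), and the function names and the 4x4 test image are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)  # Haar low-pass taps (S, S); high-pass taps (S, -S)

def analyze_1d(signal):
    """Filter with the Haar pair and keep every second output sample."""
    low = [S * (signal[i] + signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    high = [S * (signal[i] - signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    return low, high

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def haar_decompose(image):
    """One decomposition level; image needs even numbers of rows and columns."""
    row_low, row_high = [], []
    for row in image:
        lo, hi = analyze_1d(row)
        row_low.append(lo)
        row_high.append(hi)

    def filter_columns(half):
        lows, highs = [], []
        for col in transpose(half):
            lo, hi = analyze_1d(col)
            lows.append(lo)
            highs.append(hi)
        return transpose(lows), transpose(highs)

    approx, horizontal = filter_columns(row_low)
    vertical, diagonal = filter_columns(row_high)
    return approx, horizontal, vertical, diagonal

# Hypothetical image made of four flat 2x2 blocks.
image = [[10.0, 10.0, 20.0, 20.0],
         [10.0, 10.0, 20.0, 20.0],
         [30.0, 30.0, 40.0, 40.0],
         [30.0, 30.0, 40.0, 40.0]]
approx, horizontal, vertical, diagonal = haar_decompose(image)
```

Because each 2x2 block of this test image is flat, the three detail sub-bands come out as zero and the approximation sub-band is a half-resolution copy scaled by two.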

Validation of process
Besides visual verification, another validation is needed to guarantee image quality, namely the Mean Square Error (MSE) and the Peak Signal-to-Noise Ratio (PSNR). The MSE is the mean squared difference between the original image and the compressed image; a high MSE indicates a large deviation from the original, which typically accompanies a high level of compression. The PSNR measures image quality: a higher PSNR means that the compressed image more closely resembles the original image. The MSE is derived from the equation

$$\mathrm{MSE} = \frac{1}{MN} \sum_{x=1}^{M} \sum_{y=1}^{N} \left[ I(x, y) - I'(x, y) \right]^2$$

where I is the original image, I' is the compressed image, x and y are the image row and column, and M × N is the image size. The PSNR follows from the MSE as

$$\mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}^2}{\mathrm{MSE}}$$

where MAX refers to the maximum value of a pixel.
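A minimal sketch of both quality measures is given below, assuming 8-bit imagery so that the maximum pixel value is 255. The function names and the tiny 2x2 test images are illustrative only.

```python
import math

def mse(original, compressed):
    """Mean square error between two equally sized images (lists of rows)."""
    rows, cols = len(original), len(original[0])
    total = sum((original[x][y] - compressed[x][y]) ** 2
                for x in range(rows) for y in range(cols))
    return total / (rows * cols)

def psnr(original, compressed, max_value=255.0):
    """PSNR in dB; infinite when the two images are identical."""
    error = mse(original, compressed)
    if error == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)

# Hypothetical 2x2 images that differ in two pixels.
original = [[52, 55], [61, 59]]
compressed = [[54, 55], [60, 59]]
error = mse(original, compressed)
quality = psnr(original, compressed)
```

Returning infinity for identical images matches the case in which a transform leaves every pixel unchanged, so the MSE is zero and the PSNR has no finite value.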

Map Tiling
The Open Geospatial Consortium (OGC) is a consortium that defines standards for web mapping. The Web Map Service (WMS) accepts queries for map-projected layers and returns the requested data as a simple image in a graphical format such as JPEG or PNG [2]. Visualizing an image through WMS takes more time. Map tiling is an image visualization technique that decreases the loading time of high-quality images. With this technique, the performance of digital map services improves because the tile caches are generated based on geographic vector data [1]. The Web Map Tile Service (WMTS) is an OGC standard that allows a caching mechanism by defining a set of scales and a tile matrix set for each scale. Images that the client has already visited can be cached in the browser to speed up access.
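The idea of one tile matrix per scale can be illustrated by counting the tiles needed at each level of an image pyramid. This is only a sketch: the 256-pixel tile size, the halving of resolution between levels, the level numbering and the 2136 x 2265 raster size are assumptions, not values prescribed by WMTS or configured in SPACeMAP.

```python
import math

def tile_pyramid(width, height, tile_size=256):
    """Return (level, columns, rows) per level, halving until one tile remains.

    Level 0 is the full-resolution image here, an arbitrary choice for this sketch.
    """
    levels = []
    level = 0
    w, h = width, height
    while True:
        cols = math.ceil(w / tile_size)
        rows = math.ceil(h / tile_size)
        levels.append((level, cols, rows))
        if cols == 1 and rows == 1:
            break
        w = math.ceil(w / 2)  # next level has half the resolution
        h = math.ceil(h / 2)
        level += 1
    return levels

pyramid = tile_pyramid(2136, 2265)
total_tiles = sum(cols * rows for _, cols, rows in pyramid)
```

A client that pans or zooms then requests only the few tiles covering its viewport at the current level, and each tile can be cached independently.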

Projection System on SPACeMAP
The default map projections available on SPACeMAP are EPSG 4326 and EPSG 3857, which use different coordinate systems. EPSG is the abbreviation of European Petroleum Survey Group, the group that published a database of coordinate system information. According to http://www.epsg.org/, the IOGP EPSG Geodetic Parameter Dataset is a collection of definitions of coordinate reference systems and coordinate transformations which may be global, regional, national or local in application. The World Geodetic System (WGS) is a standard earth coordinate frame commonly used in mapping, geodesy and navigation. Its latest revision dates from 1984 (WGS 84); earlier versions were WGS 60, WGS 66 and WGS 72.
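The difference between the two default systems can be illustrated with the spherical Mercator formulas that convert geographic coordinates (EPSG 4326, degrees) into Web Mercator coordinates (EPSG 3857, metres). The function names and the approximate Jakarta coordinates are illustrative assumptions; a production system would normally rely on a projection library.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used as the sphere radius

def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert EPSG 4326 degrees to EPSG 3857 metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0))
    return x, y

def web_mercator_to_lonlat(x, y):
    """Convert EPSG 3857 metres back to EPSG 4326 degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2.0)
    return lon, lat

# Approximate location of Jakarta (illustrative values).
x, y = lonlat_to_web_mercator(106.8, -6.2)
lon, lat = web_mercator_to_lonlat(x, y)
```

EPSG 4326 keeps coordinates in degrees, whereas EPSG 3857 stretches latitudes progressively toward the poles, so a raster stored in one system has to be resampled before it is displayed in the other.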

Data Format Types
Ever since the Landsat and SPOT constellations began producing data, the satellite owners have distributed the digital data in the GeoTiff format. SPOT 6 now provides its imagery in JP2 format, with DIMAP metadata in an XML structure.

Tiff (GeoTiff, Big Tiff, Tif)
TIFF (Tagged Image File Format) is well suited to storing an image together with all its data and information, because processing or correction of the image does not lose any data. The extension can be written as .tiff or .tif; the latter stems from the three-letter extension limitation. GeoTiff (Wikipedia) is a public domain metadata standard which allows georeference information to be embedded within a TIFF file. According to [1], "GeoTiff file format allows the flexibility to support tag structures without any error if one application did not support GeoTiff tags". The GeoTiff image file format supports 8-bit greyscale as well as 16-, 32- and 64-bit floating point data.

Big Tiff (Gigapixel limitations)
The gigapixel count is obtained by multiplying the number of rows by the number of columns. For example, an image with 20,000 rows and 20,000 columns contains 400,000,000 pixels, or 0.4 gigapixel (not counting the number of bands). The SDK (Software Development Kit) write license is available for 1, 10, 100 and 1000 gigapixels (ERDAS ECW JP2 SDK user guide).
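The gigapixel arithmetic is simple enough to express as a one-line helper. The function name is hypothetical, and the band count is deliberately left out, as in the example above.

```python
def gigapixels(rows, cols):
    """Pixel count of a single band expressed in gigapixels (10^9 pixels)."""
    return rows * cols / 1e9

example = gigapixels(20000, 20000)  # the 20,000 x 20,000 image from the text
```

Comparing this number with the licensed gigapixel limit gives a quick indication of whether an image of a given size can be written.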

JPG (JPEG, GeoJPEG, JP2)
JPEG stands for Joint Photographic Experts Group, while JPG is the shortened extension used because older Windows systems (MS-DOS 8.3 and FAT-16 file systems) required a three-letter extension. JPEG or JPG is a lossy compression scheme for bitmap files. This means that some data is lost when the file is compressed, and the image quality decreases. JPG supports up to 16 million colors, which makes the format suitable for displaying photographic images. JPG is therefore useful for creating high-quality images of very small size (depending on the compression percentage). Still images are compressed by JPEG with the Discrete Cosine Transform (DCT). JPEG 2000, also called JP2, is a wavelet-based image compression standard (ISO/IEC 15444) and coding system.

ECW and ERS
ER Mapper and Erdas Imagine are software packages for image processing, especially of satellite imagery; most of their development has focused on the satellite domain, even though the software can process any image with or without coordinates. ERS is the raster format of images produced with ERDAS ER Mapper, a geoprocessing tool. ECW (Enhanced Compression Wavelet) is a proprietary wavelet compression image format optimized for aerial and satellite imagery. ECW uses the Discrete Wavelet Transform (DWT) and the inverse DWT (iDWT), which can be performed on very large data with a small amount of RAM and time. After JP2 became an image standard, ER Mapper added tools for reading and writing JP2 to the ECW SDK, forming the ECW JPEG2000 SDK. The ECW file format can achieve compression ratios from 1:2 to 1:100.

Result
In this research, a mosaic of Landsat 8 imagery acquired on July 6th, 2018 over the Jakarta area was produced (Figure 5). The mosaic, which has three band layers (red, green and blue), is stored as LPN_LS8_PMS_20180706_MOS_SB_JK_crop_dki_daratan.tif with a file size of 14 MB.

Cosine Method
The Discrete Cosine Transform algorithm reads the image and applies the two-dimensional discrete cosine transform; image quality decreases as coefficients below a limit are set to zero. Part of the algorithm is: J = dct2(X); J(abs(J) < 50) = 0; This sets the threshold to 50: every DCT type-2 coefficient whose absolute value is below 50 becomes zero. The input is a one-layer GeoTiff image with unsigned 8-bit integer values, cropped to 201x201 pixels at the pixel location of row 1000 and column 1200. The transformed image is visually similar to the original, although the histograms of the transformed and original images look different. The results are visualized in the next picture.
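The thresholding step can be sketched in pure Python as follows. The functions `dct2` and `idct2` written here only mirror the names of the MATLAB calls and use the orthonormal DCT-II matrix. Applying the inverse transform to obtain the displayed image is an assumption, and the 4x4 block is a hypothetical stand-in for the 201x201 crop.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C times v is the 1-D DCT of v."""
    return [[math.sqrt((1.0 if p == 0 else 2.0) / n)
             * math.cos(math.pi * (2 * m + 1) * p / (2 * n))
             for m in range(n)] for p in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2(image):
    """2-D DCT of a square image: C * X * C^T."""
    c = dct_matrix(len(image))
    return matmul(matmul(c, image), transpose(c))

def idct2(coeffs):
    """Inverse 2-D DCT of a square coefficient array: C^T * J * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)

def threshold(coeffs, limit):
    """Zero every coefficient whose absolute value is below the limit."""
    return [[0.0 if abs(v) < limit else v for v in row] for row in coeffs]

# Hypothetical 4x4 block of 8-bit values.
x = [[120, 122, 125, 130],
     [121, 123, 127, 133],
     [119, 124, 129, 135],
     [118, 125, 131, 138]]
j = threshold(dct2(x), 50)
restored = idct2(j)
kept = sum(1 for row in j for v in row if v != 0.0)
```

The fewer coefficients survive the threshold, the more compactly the block can be stored, at the cost of a larger difference between `restored` and `x`.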

Wavelet Method
MATLAB provides several wavelet functions: the discrete wavelet transform (dwt2, the single-level two-dimensional wavelet transform) and wavelet data compression (wavelet decomposition with wavedec2, and wavelet de-noising and compression with wdencmp). Two methods are used because the discrete wavelet transform alone does not change the compressed result. For wavelet data compression, the higher the threshold, the higher the MSE, in other words the more noise in the compressed image.

Discrete Wavelet Transformation
The input is a one-layer GeoTiff image with unsigned 8-bit integer values, cropped to 201x201 pixels at the pixel location of row 1000 and column 1200. The transformed image is similar to the original one: the PSNR is infinite and the mean square error is zero. Figure 10 displays the result of the discrete wavelet transform; an MSE of zero means that the transform did not change the picture, so the original and transformed images are the same. The next experiment was run on a radar satellite image. Figure 11 displays the discrete wavelet transform of 32-bit data from the Sentinel-1 satellite. The input is Sentinel data with 32-bit depth in real (floating point) format, which MATLAB reads as double. Visually the images cannot be told apart, but a PSNR of 220.5266 and an MSE near zero indicate that the dwt2 process slightly changes the digital values of the image. Because the discrete wavelet transform has little effect on the image, the second wavelet method, wavelet data compression, is used for further analysis. The algorithm is explained in the next chapter (5.2.2).

Wavelet Data Compression
The algorithm reads the image and applies a two-dimensional wavelet decomposition; image quality decreases once thresholds are set in the algorithm input. Figure 12 shows the original image and the image transformed by wavelet decomposition; the difference is not clearly visible except when zooming in to a certain scale, so the PSNR and MSE calculations give a better understanding. With the threshold set to 50, the PSNR is 24.5111 and the MSE is 230.12792, using the Daubechies wavelet with 4 vanishing moments. Figure 13 shows the same comparison with the threshold set to 10: the PSNR is 36.7894 and the MSE is 13.618698, again using the Daubechies wavelet with 4 vanishing moments. The discrete wavelet transform does not change any value if the data is in integer form, but if the image is in double format there is a slight change in the image. The Daubechies wavelet function was used for the discrete wavelet transform in this experiment; the PSNR and MSE values do not differ much from those of other wavelet functions such as symlet, morlet and haar.
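The reported PSNR and MSE pairs can be cross-checked against each other with a few lines of arithmetic. The sketch assumes a peak pixel value of 255, which is appropriate for the unsigned 8-bit input; the dictionary layout and function name are illustrative.

```python
import math

def psnr_from_mse(mse_value, max_value=255.0):
    """PSNR in dB implied by a given mean square error."""
    return 10.0 * math.log10(max_value ** 2 / mse_value)

# threshold -> (reported MSE, reported PSNR), values taken from the text
reported = {50: (230.12792, 24.5111), 10: (13.618698, 36.7894)}

differences = {thr: abs(psnr_from_mse(mse_value) - psnr_value)
               for thr, (mse_value, psnr_value) in reported.items()}
```

Both differences stay below 0.001 dB, so the two reported pairs agree with the standard PSNR definition for 8-bit data.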

ErMapper Compression (ECW extension)
The second experiment processes one band only (red, from the RGB mosaic result) with a size of about 4.6 MB (4,737 KB), which is compressed into a single layer of 721 KB. The data type is unsigned 8-bit integer with an output size of 2136 x 2265 pixels; the predicted file size is 4.61 MB, with a pixel width and height of 0.000138888 degrees under the WGS 84 geodetic system. The target compression ratio is 10 (10:1) with the ER Mapper compressor, for which the size would be approximately 471 KB. For a full scene there is no visible difference between the two images, but when the images are zoomed to a certain scale the compressed image shows some distortion.
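The file sizes quoted for this experiment can be reproduced with simple arithmetic. The sketch below assumes one byte per pixel for the single 8-bit band and binary units (1 KB = 1024 bytes); the variable names are illustrative.

```python
def raw_size_bytes(width, height, bands=1, bytes_per_sample=1):
    """Uncompressed size of a raster, ignoring headers and metadata."""
    return width * height * bands * bytes_per_sample

raw = raw_size_bytes(2136, 2265)   # one 8-bit band
raw_mb = raw / 1024 ** 2           # uncompressed size in MB
target_kb = raw / 10 / 1024        # size expected at a 10:1 ratio, in KB
achieved_ratio = 4737 / 721        # reported input size / reported ECW size
```

The uncompressed size matches the predicted 4.61 MB and the 10:1 target lands near the quoted 471 KB, while the reported 721 KB output corresponds to a ratio closer to 6.6:1.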

SPACeMAP display
The Landsat 8 mosaic of all of Indonesia for January 1st, 2019 to June 30th, 2019 is saved as a GeoTiff file of 37,475,590 KB. The corresponding ECW file is 1,907,036 KB. The GeoTiff (BigTiff) file is 188,030 pixels wide and 68,028 pixels high, with unsigned 8-bit integer values in the WGS 84 geographic coordinate system (EPSG 84). The extent runs from longitude 94.99975 to 142.00725 and from latitude -11.00725 to 5.999749, with a pixel resolution of 0.00025 decimal degrees.
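The figures for the national mosaic can be checked against each other as follows. The sketch reads the extent as 94.99975 to 142.00725 degrees longitude and -11.00725 to 5.999749 degrees latitude, which is an interpretation of the listed corner values; the variable names are illustrative.

```python
width_px, height_px = 188030, 68028
resolution_deg = 0.00025
west, east = 94.99975, 142.00725
south, north = -11.00725, 5.999749

lon_span = width_px * resolution_deg    # should equal east - west
lat_span = height_px * resolution_deg   # should equal north - south
size_ratio = 37475590 / 1907036         # GeoTiff size / ECW size, both in KB
```

Both spans agree with the pixel counts at 0.00025 degrees per pixel, and the ECW file is roughly 20 times smaller than the GeoTiff.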

Conclusion
File formats such as JPEG or GeoJPEG2000 and ECW produce images of small size while preserving visual quality. The file size difference between uncompressed and compressed images is large. Compression does degrade the image, but if the compressed image is used only for display the degradation has no practical effect. The crawling or ingesting process in SPACeMAP works better with small input images (such as JP2/ECW) than with large ones such as BigTiff. If the image is used for enhancement or extraction, however, it should be kept at its original size. File formats that use the discrete wavelet and cosine transforms increase efficiency, both for web map display or visualization and, in terms of file size, for data management.