Article

A Multi-Shot Approach for Spatial Resolution Improvement of Multispectral Images from an MSFA Sensor

by Jean Yves Aristide Yao 1,2, Kacoutchy Jean Ayikpa 1,2, Pierre Gouton 1,* and Tiemoman Kone 2

1 Laboratoire Imagerie et Vision Artificielle (ImVia), Université de Bourgogne, 21000 Dijon, France
2 Unité de Recherche et d’Expertise Numérique (UREN), Université Virtuelle de Côte d’Ivoire, 28 BP 536, Abidjan 28, Côte d’Ivoire
* Author to whom correspondence should be addressed.
J. Imaging 2024, 10(6), 140; https://doi.org/10.3390/jimaging10060140
Submission received: 10 May 2024 / Revised: 4 June 2024 / Accepted: 5 June 2024 / Published: 8 June 2024
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)

Abstract:
Multispectral imaging technology has advanced significantly in recent years, allowing single-sensor cameras with multispectral filter arrays to be used in new scene acquisition applications. Our camera, developed as part of the European CAVIAR project, uses an eight-band MSFA to produce mosaic images that can be decomposed into eight sparse images. These sparse images contain only pixels with similar spectral properties and null pixels. A demosaicing process is then applied to obtain fully defined images. However, this process faces several challenges in rendering fine details, abrupt transitions, and textured regions due to the large number of null pixels in the sparse images. Therefore, we propose a sparse image composition method to overcome these challenges by reducing the number of null pixels in the sparse images. To achieve this, we increase the number of snapshots by simultaneously introducing a spatial displacement of the sensor by one to three pixels on the horizontal and/or vertical axes. The set of snapshots acquired provides a multitude of mosaics representing the same scene with a redistribution of pixels. The sparse images from the different mosaics are added together to get new composite sparse images in which the number of null pixels is reduced. A bilinear demosaicing approach is applied to the composite sparse images to obtain fully defined images. Experimental results on images projected onto the response of our MSFA filter show that our composition method significantly improves image spatial resolution and minimizes reconstruction errors while preserving spectral fidelity.

1. Introduction

Multispectral images are useful in a wide range of applications: facial recognition [1], remote sensing [2], medical imaging [3], and precision agriculture [4], among others. Multispectral image acquisition systems are highly diverse, particularly scanning-mode systems, which acquire the multispectral image over multiple frames. These are divided into three categories: tunable filter cameras, tunable illumination cameras, and multi-camera systems. Tunable filters, such as the LCTF (Liquid Crystal Tunable Filter) [5] and the AOTF (Acousto-Optical Tunable Filter) [6], use electronic techniques to capture each spectral band. Although these systems produce fully defined multispectral images, their acquisition times are too long for real-time operation.
On the other hand, instantaneous acquisition systems, known as snapshot systems, capture the multispectral image in a single shot. They include single-sensor or multi-sensor multispectral systems, which are divided into several classes: multispectral filter array (MSFA), interferometers, tunable sensors, and filtered lens arrays [7].
The acquisition system based on a single-sensor one-shot camera coupled with an MSFA provides a compact, low-cost, real-time solution for multispectral image acquisition. The camera can capture all the necessary spectral bands in a single snapshot [8]. To achieve this, an MSFA is positioned in front of the sensor to capture mosaic images where each pixel location contains information from a single spectral band. An interpolation method is applied to the mosaic image to obtain the fully defined multispectral image [9].
The MSFA plays a crucial role in multispectral imaging by filtering the light entering the sensor. For an MSFA of fixed size, increasing the number of bands reduces the number of pixels assigned to each band. A greater number of spectral bands in the MSFA allows for a more precise spectral analysis of the observed scene, but it decreases the spatial resolution of the image. Indeed, with more spectral bands, the distance between spectrally similar pixels increases [10]. The main weakness of single-sensor one-shot cameras is the difficulty of efficiently reconstructing a complete multispectral image from a mosaic image, especially when the mosaic contains non-homogeneous areas, abrupt transitions, and textured regions [11].
Previous works [4,12] have detailed the design process of our single-shot multispectral camera, designed specifically to operate in the visible range. This camera has a 4 × 4 MSFA moxel with eight spectral bands selected by a genetic algorithm. Each spectral band receives two pixels per moxel, and the mosaic arrangement is an assembly of moxels over a monochrome sensor [13]. After a snapshot, the camera provides a mosaic that is decomposed into eight sparse images, each containing pixels with the same spectral properties and null pixels. The sparse images thus have a very high number of null pixels: for our camera, 14 of the 16 pixels in a moxel window are null. This deficit can cause problems during demosaicing, degrading image quality and visual fidelity and causing a loss of spatial resolution.
To address these issues, we propose a method for reducing the number of null pixels in sparse images. Our approach aims to reduce the number of null pixels by combining sparse images from multiple acquisitions. To achieve this, we combine camera displacements along both the vertical and horizontal axes. At each displacement, the camera captures an image of the observed scene, generating a mosaic of the scene with a spatial redistribution of pixels with similar spectral properties. Next, the set of sparse images from each post-displacement acquisition is summed with those obtained without displacement to obtain new composite sparse images. The new sparse images are finally demosaiced.
In this study, we present the following contributions:
  • Setting up a dataset for our experiments by transforming 31-band images from a database into 8 bands to simulate our 8-band MSFA moxel; these images are then mosaicked with our MSFA filter to simulate a snapshot from our camera;
  • Development of a new composition method with a multi-shot approach to reduce the number of null pixels in sparse images while maintaining the same number of spectral bands;
  • Visual and analytical comparisons using validation metrics to evaluate our experiments, demonstrating the improvement in spatial resolution of the final image obtained after demosaicing.
The remainder of this article is organized as follows: Section 2 presents the state of the art in improving the spatial resolution of MSFA images. Section 3 details the materials and methods used in our approach. Section 4 presents the experiments carried out and the results obtained. Section 5 discusses the results. Finally, Section 6 presents our conclusion.

2. Related Works on Improving the Spatial Resolution of MSFA Images

Much research has addressed improving the spatial resolution of multispectral images from an MSFA sensor.
Monno et al. [11,14] proposed a multispectral demosaicing method using a guided filter. This method is used in multispectral imaging to improve color reproduction and computer vision applications. The proposed method uses a guided filter to interpolate spectral components in a multispectral color filter array. The technique addresses the challenge of undersampling in multispectral imaging and shows promising results for practical applications. Its effectiveness is based on the establishment of an MSFA pattern with a dominant green band.
Wang et al. [15] proposed a method to improve the quality of images reconstructed from multispectral filter networks while minimizing the computational cost. It addresses the challenge of estimating missing data in images acquired by these networks using adaptive frequency domain filtering (AFDF). This technique combines the design of a frequency domain filter to eliminate artifacts with spatial averaging filtering to preserve spatial structure. By incorporating adaptive weighting, AFDF improves the quality of reconstructed multispectral images while maintaining high computational efficiency.
Rathi and Goyal [16] proposed a weighted directional interpolation method for estimating missing pixel values. They exploit both spectral and spatial correlations present in the image to intelligently select interpolation schemes based on the properties of binary tree-based MSFA models. By computing directional estimates and using edge amplitude information, the method progressively estimates missing pixel values and updates pixel arrangements according to the band’s point of arrival (PoA) in the binary tree structure.
Zhang et al. [17] proposed a method that integrates a deep convolutional neural network with a channel attention mechanism to improve the demosaicing process. In this method, a mean square error (MSE) loss function is used to improve the accuracy of estimated pixel values in image processing. In addition, a contour loss is introduced to improve the sharpness and richness of textured images using high-frequency subband analysis in the wavelet domain. The method uses the TT-59 database [18] for training and evaluation. Multispectral images are processed to synthesize radiance data to demonstrate the effectiveness of the demosaicing technique.
Mihoubi et al. [19] proposed a demosaicing method called PPID based on the generation of a pseudo-panchromatic image (PPI). To ensure robustness to different lighting conditions, an adjustment of the value scale in the raw image is proposed before estimating the PPI, with the aim of mitigating biases caused by differences in spectral illumination distribution between channels. The remaining steps include calculating the spectral differences [20] between the original raw image and the PPI, using local directional weights for interpolation [21], and, finally, combining the PPI with the differences to estimate each channel of the final image.
Jeong et al. [22] proposed a method to improve image quality by estimating a pseudo-panchromatic image using an iterative linear regression model. It then performs directional demosaicing, a technique that combines the pseudo-panchromatic image with spectral differences to produce a final interpolated image. The process includes steps such as directional interpolation using the BTES method [23] and calculation of weights to improve the accuracy of the final multispectral image.
Rathi and Goyal [9] proposed a method that uses the concept of the pseudo-panchromatic image and spectral correlation between spectral bands to efficiently generate a complete multispectral image. It involves estimating a pseudo-panchromatic image from a mosaic image using convolution filters based on the probability of the appearance of each spectral band [24] and binary masks. This pseudo-panchromatic image is then used to interpolate each spectral band to produce a multispectral image. The process iteratively improves the quality of the multispectral image by updating the pseudo-panchromatic image and estimating the spectral bands multiple times.
Liu et al. [25] proposed a new deep learning framework for multispectral demosaicing using pseudo-panchromatic images. The framework consists of two networks, the Deep PPI Generation Network (DPG-Net) and the Deep Demosaic Network (DDM-Net), which are used to generate and refine the PPI to improve image quality and recover high-frequency information in the demosaicing process. DPG-Net specifically focuses on improving the sharpness of the preliminary PPI to improve image resolution by learning the differences between the actual PPI and Mihoubi’s blurred version [19], which ultimately leads to the production of the final refined PPI. DDM-Net uses bilinear interpolation to estimate missing pixel values in fragmented bands, followed by a neural network architecture that extracts color and texture features to improve image quality. By combining convolutional layers and loss functions, DDM-Net aims to minimize reconstruction errors and produce high-quality demosaiced images.
Zhao et al. [26] proposed a neural network model with two branches of adaptive features (DDMF) and edge infusion (PPIG). The proposed architecture combines weighted bilinear interpolation [21] to generate initial demosaiced images with adaptive adjustments of pixel values in reconstructed multispectral images. It uses a DDMF module to generate convolution kernel weights that adapt to spatial and spectral changes, thus improving the accuracy of the demosaicing process. In addition, the PPIG edge infusion sub-branch integrates edge information to improve demosaicing accuracy in terms of spatial precision and spectral fidelity.
Most of the methods proposed to improve the spatial resolution of a multispectral image are based on complex steps during the demosaicing process. Our paper proposes a new approach based on a multi-shot method that happens before the demosaicing process.

3. Materials and Methods

3.1. The MSFA Moxel

The MSFA moxel is a grid of optical filters placed in front of the sensor of a multispectral camera to filter the incoming light into different spectral bands. Each pixel in the captured image is associated with a specific filter in the MSFA moxel, allowing light intensity to be measured in different parts of the electromagnetic spectrum. The MSFA allows the simultaneous acquisition of multispectral information during image acquisition by distributing the pixels on the image sensor according to their spectral sensitivity. The choice of MSFA size and the number of bands is essential for the acquisition and reconstruction of multispectral images. The MSFAs commonly used in the literature generally have the following two main characteristics:
  • Redundancy [27]: a band can have a probability of appearance greater than $1/n$, where n represents the linear size of the MSFA moxel;
  • Non-redundancy [21]: each band has a probability of appearance of exactly $1/n$.
In the case of bands with redundancy, the following two types of behavior can be observed:
  • Dominant bands: the probability of the appearance of certain bands in the MSFA moxel is higher than others;
  • Nondominant bands: all bands in the MSFA moxel have the same probability of appearance.
These characteristics of the MSFA moxel directly affect the quality and resolution of the multispectral images obtained after the acquisition and reconstruction process. The selection of the appropriate MSFA moxel depends on the specific application requirements, such as the desired spectral resolution, sensitivity to different wavelengths, and camera hardware constraints.
Our camera uses a 4 × 4 filter with equal probability of band appearance to acquire mosaic images, where each band is sampled by two pixels. This moxel was chosen to balance the spatial distribution of pixels in sparse images [28]. This design is based on the color shade approach [12], which optimizes the spectral response of the filters and improves the quality of images acquired during a shot. Figure 1a illustrates the spectral band arrangement of our MSFA moxel. This moxel is used throughout our study to construct mosaic images and in demosaicing multispectral images.
Figure 1b shows the filters’ spectral response in our MSFA moxel. Spectral response refers to how well the sensor detects and measures light in different spectral bands. This spectral response is given in the visible spectral interval [400 nm, 790 nm].

3.2. Dataset

In our simulation, we project 31 image bands from the TokyoTech database (TT-31) [11] into 8 bands corresponding to our MSFA. This projection is performed on the response of the MSFA filters of our camera. The use of this projection is important because it allows us to work with accurate data that reflect the conditions we encounter in the real world when making acquisitions with our camera. This allows us to reduce the dimensionality of the images while preserving the most relevant spectral information. Here are the steps in the projection process:
  • Determination of the desired number of bands for the resulting multispectral image, in our study, eight bands.
  • Definition of Gaussian filter full width at half maximum (FWHM) in nanometers; in our study, this width is 30 nm.
  • Calculating the standard deviation of the Gaussian filter corresponding to the defined FWHM is necessary because the shape of the Gaussian is determined by its standard deviation.
  • Calculation of the central wavelength of each Gaussian filter: we start at a distance of 3 standard deviations from the start wavelength, move at a calculated interval between filters, and end at a distance of 3 standard deviations from the end wavelength. Subsequently, we round the values to the nearest integer and sample at the desired spectral interval.
  • Creation of Gaussian filters using a Gaussian function. Each filter is calculated based on the similarity between the spectral wavelength and the central wavelength of the filter. The greater the similarity, the higher the filter weight. Filters are normalized to ensure that their sum equals 1.
  • Recovery of the original 31-band image data, followed by filtering using the created Gaussian filters.
  • Application of the Gaussian filter weights to the data to perform the 8-band multispectral transformation, selecting the appropriate spectral bands.
The 430, 464, 498, 529, 571, 605, 645, and 680 nm bands used for projection result from optimization work with the genetic algorithm.
In this approach, it is assumed that the illumination does not change.
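As a concrete illustration of these steps, the following minimal Python sketch builds the Gaussian filters and projects a cube. It assumes the TT-31 cube is a NumPy array of shape (H, W, 31) with a known wavelength grid; all function and variable names are ours, not part of the original pipeline:

```python
import numpy as np

def gaussian_filters(centers_nm, wavelengths_nm, fwhm_nm=30.0):
    """One normalized Gaussian filter per target band (rows sum to 1)."""
    # FWHM -> standard deviation: sigma = FWHM / (2 * sqrt(2 * ln 2)) ~ FWHM / 2.3548
    sigma = fwhm_nm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    centers = np.asarray(centers_nm, dtype=float)[:, None]   # shape (8, 1)
    wl = np.asarray(wavelengths_nm, dtype=float)[None, :]    # shape (1, 31)
    filters = np.exp(-0.5 * ((wl - centers) / sigma) ** 2)   # shape (8, 31)
    return filters / filters.sum(axis=1, keepdims=True)

def project_to_8_bands(cube31, wavelengths_nm,
                       centers_nm=(430, 464, 498, 529, 571, 605, 645, 680)):
    """Project an (H, W, 31) cube onto the 8 MSFA band centers -> (H, W, 8)."""
    F = gaussian_filters(centers_nm, wavelengths_nm)
    return np.tensordot(cube31, F, axes=([2], [1]))
```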

3.3. Mosaicking Process to Obtain Sparse Images

A mosaic image captured by our camera produces 8 sparse images after grouping pixels with similar spectral properties. Since we will be working with fully defined images, we use our MSFA moxel to generate mosaics from them. Figure 2 illustrates the mosaicking process with our MSFA moxel and the grouping of pixels with similar spectral properties into sparse images.
Figure 3 shows the spatial distribution of pixels in the sparse images of spectral band B1. The gray areas represent the available pixels, while the white areas represent the null pixels.
Our approach is to reduce the number of null pixels in these sparse images. We expect that reducing the number of null pixels will reduce reconstruction errors during the demosaicing process.
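To make the mosaicking step concrete, here is a minimal simulation sketch operating on the (H, W, 8) array produced by the projection above; the 4 × 4 moxel is transcribed from the $I^{MSFA}$ matrix given in Section 3.4, and the helper names are ours:

```python
import numpy as np

# 4 x 4 moxel transcribed from the I^MSFA matrix in Section 3.4 (band indices 1..8);
# each band appears exactly twice, as in our camera.
MOXEL = np.array([[1, 5, 2, 6],
                  [7, 3, 8, 4],
                  [2, 6, 1, 5],
                  [8, 4, 7, 3]])

def tiled_pattern(shape):
    """Tile the moxel over an H x W grid of band indices."""
    H, W = shape
    return np.tile(MOXEL, (H // 4 + 1, W // 4 + 1))[:H, :W]

def mosaic(bands):
    """Simulate one snapshot: keep, at each pixel, only the moxel's band."""
    H, W, _ = bands.shape
    pattern = tiled_pattern((H, W))
    rows, cols = np.indices((H, W))
    return bands[rows, cols, pattern - 1]
```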

3.4. Conceptualization of the Method

Let us define $B = \{B_1, B_2, \ldots, B_8\}$, the set of original spectral bands containing the fully defined information (pixels) of the eight bands obtained after projection.
Let us define $I^{MSFA}$ as the mosaic obtained after the first snapshot, without sensor displacement. In simulation, it is obtained by applying our MSFA moxel to the $B_i$ bands.
Let us define $I_{k_j D_j}^{MSFA}$ as the mosaic obtained with a camera displacement of $k_j$ pixels, where $k_j \in \{1, \ldots, 3\}$, along the axis $D_j$, which can be either horizontal (H) or vertical (V). In simulation, these mosaics are obtained by shifting the bands $B_i$ by $k_j$ pixels along the $D_j$ axis. This produces bands $B_{ij}$, which are mosaicked with our MSFA moxel.
Figure 4 illustrates the different mosaics obtained with a one-pixel camera displacement on the vertical axis (k1 = 1 and D1 = V) and a one-pixel camera displacement on the horizontal axis (k2 = 1 and D2 = H).
These mosaic matrices have the following shapes for kj displacement:
$$I^{MSFA} = \begin{pmatrix}
B_1^{(0,0)} & B_5^{(0,1)} & B_2^{(0,2)} & B_6^{(0,3)} & B_1^{(0,4)} \\
B_7^{(1,0)} & B_3^{(1,1)} & B_8^{(1,2)} & B_4^{(1,3)} & B_7^{(1,4)} \\
B_2^{(2,0)} & B_6^{(2,1)} & B_1^{(2,2)} & B_5^{(2,3)} & B_2^{(2,4)} \\
B_8^{(3,0)} & B_4^{(3,1)} & B_7^{(3,2)} & B_3^{(3,3)} & B_8^{(3,4)}
\end{pmatrix}$$

$$I_{k_1 V}^{MSFA} = \begin{pmatrix}
B_1^{(k_1,0)} & B_5^{(k_1,1)} & B_2^{(k_1,2)} & B_6^{(k_1,3)} & B_1^{(k_1,4)} \\
B_7^{(k_1+1,0)} & B_3^{(k_1+1,1)} & B_8^{(k_1+1,2)} & B_4^{(k_1+1,3)} & B_7^{(k_1+1,4)} \\
B_2^{(k_1+2,0)} & B_6^{(k_1+2,1)} & B_1^{(k_1+2,2)} & B_5^{(k_1+2,3)} & B_2^{(k_1+2,4)} \\
B_8^{(k_1+3,0)} & B_4^{(k_1+3,1)} & B_7^{(k_1+3,2)} & B_3^{(k_1+3,3)} & B_8^{(k_1+3,4)}
\end{pmatrix}$$

$$I_{k_2 H}^{MSFA} = \begin{pmatrix}
B_1^{(0,k_2)} & B_5^{(0,k_2+1)} & B_2^{(0,k_2+2)} & B_6^{(0,k_2+3)} & B_1^{(0,k_2+4)} \\
B_7^{(1,k_2)} & B_3^{(1,k_2+1)} & B_8^{(1,k_2+2)} & B_4^{(1,k_2+3)} & B_7^{(1,k_2+4)} \\
B_2^{(2,k_2)} & B_6^{(2,k_2+1)} & B_1^{(2,k_2+2)} & B_5^{(2,k_2+3)} & B_2^{(2,k_2+4)} \\
B_8^{(3,k_2)} & B_4^{(3,k_2+1)} & B_7^{(3,k_2+2)} & B_3^{(3,k_2+3)} & B_8^{(3,k_2+4)}
\end{pmatrix}$$
The process of grou** pixels with similar spectral properties involves separating a mosaic image into different spectral bands using a binary mask formulated as follows:
$$m_{B_i}(p) = \begin{cases} 1, & p \in B_i \\ 0, & \text{otherwise} \end{cases} \tag{1}$$
For each mosaic, we obtain a set of sparse images, $\tilde{I}_i$, by applying the following formula to it (the product is element-wise):
$$\tilde{I}_i = I^{MSFA} \cdot m_{B_i} \tag{2}$$
For any camera displacement, we obtain the sparse images $\tilde{I}_{k_j D_j i}$ with $i \in \{1, \ldots, 8\}$, $k_j \in \{1, \ldots, 3\}$, and $D_j \in \{H, V\}$, where i represents the index of a band of the MSFA moxel and $k_j$ represents the displacement scalar along the horizontal (H) or vertical (V) axis.
Figure 5 shows the density of spectrally similar pixels in band B1 of the mosaics $I^{MSFA}$, $I_{k_1 V}^{MSFA}$, and $I_{k_2 H}^{MSFA}$. The gray areas represent the available pixels in the sparse image $\tilde{I}_1$ of the band B1; the yellow areas represent those available in the sparse image $\tilde{I}_{k_1 V 1}$ of the band B11 due to the camera’s displacement of k1 pixels on the vertical axis; and the blue areas represent the pixels available in the sparse image $\tilde{I}_{k_2 H 1}$ of the band B12 due to the camera’s displacement of k2 pixels on the horizontal axis.
The positions of the non-null pixels vary in each sparse image, and these pixels have the same spectral properties. Therefore, the sparse images can be combined (composition method), i.e., added together, to increase the number of non-null pixels and reduce the number of null pixels. The pixels are redistributed according to the camera displacement combinations.
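A minimal sketch of Formulas (1) and (2), reusing tiled_pattern() and mosaic() from the sketch in Section 3.3; the helper names are ours:

```python
import numpy as np

def band_mask(i, shape):
    """Formula (1): m_Bi is 1 where the tiled moxel assigns band i, 0 otherwise."""
    return (tiled_pattern(shape) == i).astype(float)

def sparse_images(mosaic_img):
    """Formula (2): split one mosaic into its 8 sparse images
    (only 2 of every 16 pixels are non-null in each)."""
    return [mosaic_img * band_mask(i, mosaic_img.shape) for i in range(1, 9)]
```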

3.5. Sparse Image Composition

The sparse image composition method is performed in 3 steps, as shown in Figure 6. The first step is to take an initial snapshot of a scene. This snapshot provides a mosaic image $I^{MSFA}$, which is decomposed into sparse images $\tilde{I}_i$ using Formula (2). Then we set the number N of compositions we want to make by specifying the displacement scalars $k_j$ and the axes $D_j$. Finally, we obtain composite sparse images $\tilde{I}_{ci}$, which contain more available pixels. The symbol “?” in the composite sparse images $\tilde{I}_{ci}$ indicates the areas where new pixels can appear depending on the displacement combination. This composition method reduces the distance between two non-null pixels and is limited to three compositions. Beyond three compositions, implementing such a method can be very time-consuming.

3.5.1. Case of the Composition of Two Sparse Images

For two bands, we obtain six possible compositions for the different values of the displacement scalar on the two axes H and V. The following algorithm shows how the composition of two bands is achieved:
  • The camera takes a first snapshot, from which we obtain a mosaic $I^{MSFA}$;
  • The camera moves k pixel(s) along the D axis and takes a second snapshot, from which a second mosaic $I_{kD}^{MSFA}$ is obtained;
  • The separation into sparse images is performed on the mosaics $I^{MSFA}$ and $I_{kD}^{MSFA}$ with Formula (2), resulting in sparse images $\tilde{I}_i$ and $\tilde{I}_{kDi}$;
  • The two sparse images are added, such that $\tilde{I}_{ci} = \tilde{I}_i + \tilde{I}_{kDi}$.
Figure 7 shows a composition of bands from the $I^{MSFA}$ and $I_{1H}^{MSFA}$ mosaics. Overall, the eight sparse images share the same pixel distribution, which varies according to the parameters $k_j$ and $D_j$. Thus, for a given camera displacement, the pixel distribution in the composite sparse image $\tilde{I}_{ci}$ is the same for every band, which is why we comment only on spectral band B1 of each composition.
Figure 8 shows the six possible compositions of band B1 with different values of the displacement scalar $k_j$ on the V and H axes. The composition provides more pixels and a better redistribution that minimizes the distance between non-null pixels in the composite sparse images $\tilde{I}_{ci}$.
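Under the simulation convention of Section 3.4 (a displacement is simulated by shifting the projected bands before mosaicking), the two-image composition can be sketched as follows. The final registration roll reflects our reading of Figures 5 and 7, where the new samples land at shifted positions, and border wrap-around is ignored for brevity:

```python
import numpy as np

def shifted_mosaic(bands, k, axis):
    """Snapshot after a displacement of k pixels along `axis` (0 = V, 1 = H),
    simulated by shifting the projected bands before mosaicking (Section 3.4).
    np.roll wraps around at the borders; we ignore this edge effect here."""
    return mosaic(np.roll(bands, -k, axis=axis))

def compose_two(bands, k, axis):
    """Two-image composition: I~c_i = I~_i + I~_kD_i for the 8 bands."""
    s0 = sparse_images(mosaic(bands))
    s1 = sparse_images(shifted_mosaic(bands, k, axis))
    # Register the shifted snapshot back to the reference frame so its
    # samples land at the shifted positions seen in Figures 5 and 7.
    s1 = [np.roll(s, k, axis=axis) for s in s1]
    return [a + b for a, b in zip(s0, s1)]
```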

3.5.2. Case of More Than Two Sparse Images

For more than two bands, we obtain more than 30 possible compositions for the different values of the displacement scalars $k_j$ on the axes $D_j$. The following algorithm shows how the composition of N bands is achieved, where $2 < N \leq 4$:
  • The camera takes a first snapshot, from which a mosaic $I^{MSFA}$ is obtained.
  • The separation into sparse images is performed on the mosaic $I^{MSFA}$ using Formula (2), resulting in the sparse images $\tilde{I}_i$.
  • The initialization step sets the values of $\tilde{I}_{ci}$ to $\tilde{I}_i$ and j to 1.
  • As long as $j \leq N$:
    • The camera moves along the $D_j$ axis by $k_j$ pixels from its position (0, 0) and takes a snapshot, and a new mosaic $I_{k_j D_j}^{MSFA}$ is obtained.
    • The new mosaic is decomposed using Formula (2), resulting in sparse images $\tilde{I}_{k_j D_j i}$.
    • These sparse images are added to the previous ones, $\tilde{I}_{ci} = \tilde{I}_{ci} + \tilde{I}_{k_j D_j i}$, and the value of j is incremented.
  • In the end, we obtain the composite sparse images $\tilde{I}_{ci}$.
Figure 9 illustrates the spatial distribution of pixels of certain three- and four-band compositions. The blue area represents the H-axis displacement and the yellow area represents the V-axis displacement.
The composition method redistributes pixels to provide more information and reduce the number of pixels to interpolate. It is important to note that with our MSFA moxel it is not possible to achieve a three-band composition with a displacement of two pixels on both the horizontal and vertical axes (k1 = k2 = 2 on the H and V axes). This would cause a problem with overlapping pixels at certain positions of the composite sparse image.
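The same idea extends to N snapshots; the sketch below generalizes compose_two() from Section 3.5.1, with each displacement taken from the camera's initial position, as in the algorithm above:

```python
def compose(bands, moves):
    """N-shot composition (2 < N <= 4): `moves` lists (k_j, axis_j) pairs,
    each displacement taken from the camera's initial position (0, 0).
    For example, moves = [(1, 1), (1, 0)] is the 'abc' configuration
    (no shift, then 1 pixel on H, then 1 pixel on V)."""
    composite = sparse_images(mosaic(bands))              # snapshot 'a'
    for k, axis in moves:                                 # one snapshot per move
        s = sparse_images(shifted_mosaic(bands, k, axis))
        s = [np.roll(x, k, axis=axis) for x in s]         # register to frame 'a'
        composite = [c + x for c, x in zip(composite, s)]
    return composite
```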

3.6. Bilinear Interpolation

To generate a fully defined image, we apply bilinear interpolation to the sparse images to deduce the null pixels, according to the following Algorithm 1:
Algorithm 1: Bilinear interpolation.
Input: sparse_image, method
Output: InterpIMG
BEGIN
 Width = sparse_image.width
 Height = sparse_image.height
 XI = value grid going from 1 to height + 1
 YI = value grid going from 1 to width + 1
 Ind = coordinates of data to interpolate
 Z = values of non-null indices
 InterpIMG = grid_interpolation(Ind,Z,(XI, YI), fill_value = 2.2 × 10−16)
END
We set the fill value to 2.2 × 10−16 to avoid zero values; this prevents NaN values in the interpolated matrix. The grid_interpolation function is given in the following Algorithm 2.
Algorithm 2: grid_interpolation.
Input: points, values, grid, method, fill_value
  // points: The coordinates of the data to interpolate
  // values: The corresponding values at the data points
  // grid: The grid on which to interpolate the data
  // fill_value: the value to use for points outside the input grid
Output: InterpIMG
BEGIN
  For each point (x, y) in grid:
    If (x, y) is outside of the input points:
      Assign fill_value to InterpIMG(x, y)
   Else:
       Find the k (2 ≤ k ≤ 4) nearest data points within a rectangular grid, with 2 along each axis
       Calculate the weights for interpolation based on distance
       Interpolate the value at (x, y) using the input values in points and interpolation weights
      Assign the new value to InterpIMG(x, y)
   End If
  End For
END
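As a runnable counterpart to Algorithms 1 and 2, the sketch below uses SciPy's griddata, whose 'linear' method performs a piecewise-linear interpolation over the nearest data points; we treat it here as a stand-in for the distance-weighted scheme of Algorithm 2, and the function name is ours:

```python
import numpy as np
from scipy.interpolate import griddata

def bilinear_demosaic(sparse_img, fill_value=2.2e-16):
    """Interpolate the null pixels of one (composite) sparse image."""
    rows, cols = np.nonzero(sparse_img)          # coordinates of available pixels
    values = sparse_img[rows, cols]              # their values
    grid_y, grid_x = np.mgrid[0:sparse_img.shape[0], 0:sparse_img.shape[1]]
    return griddata((rows, cols), values, (grid_y, grid_x),
                    method='linear', fill_value=fill_value)
```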

3.7. The General Architecture of Our Method

The architecture in Figure 10 shows the general flow of our method. We start by projecting the images of TT-31 into 8 bands. Then, we create a mosaic from these bands, which represents the first snapshot of the sensor. The mosaic is decomposed into 8 sparse images that go through a composition method that depends on the displacement of the sensor horizontally or vertically. Each displacement provides a mosaic that is decomposed into 8 sparse images, which are added together with the previous 8 sparse images to form the new composite sparse images. The composition process is repeated until the stop condition is reached. The final composite sparse images are demosaiced using a bilinear method to obtain the fully reconstructed images.
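Chaining the helpers sketched in the previous subsections gives a compact simulation of this architecture; the default moves correspond to the 'abc' configuration used later in Section 4, and the function names remain our own:

```python
import numpy as np

def reconstruct(cube31, wavelengths_nm, moves=((1, 1), (1, 0))):
    """End-to-end simulation: project, compose over `moves`, then interpolate."""
    bands = project_to_8_bands(cube31, wavelengths_nm)     # 31 -> 8 bands
    composite = compose(bands, list(moves))                # multi-shot composition
    channels = [bilinear_demosaic(s) for s in composite]   # fill remaining nulls
    return np.stack(channels, axis=-1)                     # (H, W, 8) image
```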

4. Experiments and Results

Our experiments aim to demonstrate how reducing the number of null pixels in sparse images can improve the quality of the spatial resolution obtained after interpolation. To achieve this, we will compare the fully reconstructed images of sparse images and composite sparse images. We will use the TokyoTech datasets TT-31 projected on the response of our MSFA filter, on which we will perform qualitative and quantitative analyses to determine the impact of this approach.

4.1. Metrics

To evaluate our results, we use four quantitative metrics, namely:
  • PSNR (Peak Signal to Noise Ratio) [29]: PSNR is a widely used metric to assess the quality of a reconstructed or compressed image compared to the original image. This metric measures the ratio between the maximum power of the signal (which is called the peak signal) and the power of the noise that degrades the quality of the image representation (also known as the corrupting noise). Higher PSNR values indicate better image quality because they represent a higher ratio of signal power to noise power.
    $$PSNR = 10 \log_{10} \left( \frac{(2^n - 1)^2}{\frac{1}{n} \sum_{i=1}^{n} (I_i - \hat{I}_i)^2} \right)$$
    where n is the number of spectral bands in the MSFA moxel.
  • SAM (Spectral Angle Metric) [30] calculates the angle between two spectral vectors in a high-dimensional space. Each spectral vector represents the spectral reflectance or irradiance of a pixel over several spectral bands.
    $$SAM = \cos^{-1} \left( \frac{\sum_{i=1}^{n} I_i \hat{I}_i}{\sqrt{\sum_{i=1}^{n} I_i^2 \sum_{i=1}^{n} \hat{I}_i^2}} \right)$$
    The smaller the angle between two spectral vectors, the more similar the spectra are considered to be.
  • SSIM (Structural Similarity Index Measure) [31]: SSIM is a method used to measure the similarity between two images. This technique compares the structural information, luminance, and contrast of the two images, taking into account the characteristics of the human visual system. Compared to simpler metrics such as Mean Square Error (MSE) or PSNR, SSIM provides a more comprehensive assessment of image similarity by considering perceptual factors. The SSIM value ranges from 0 to 1, where 1 indicates perfect similarity between images and 0 indicates no similarity. We use the structural_similarity function of the Python skimage.metrics module to compute this metric.
  • RMSE (Root Mean Square Error) [32]: RMSE is a commonly used metric to evaluate the accuracy of predictions by measuring the average size of the errors between the predicted and actual values in a given set of predictions. The metric is expressed in the same unit as the target value. For example, if the target value is to predict a certain value, and we obtain an RMSE of 10, this indicates that the predicted value varies on average by ±10 from the actual value. The formula for calculating the RMSE is as follows:
    $$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (I_i - \hat{I}_i)^2}{n}}$$
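For reference, the four metrics can be sketched as follows for a pair of reference and reconstructed arrays. We assume 8-bit data for the peak term of the PSNR formula and report SAM in degrees; the function names are ours:

```python
import numpy as np
from skimage.metrics import structural_similarity

def psnr(I, I_hat, n_bits=8):
    """PSNR with an assumed 8-bit peak value (2^n_bits - 1)."""
    mse = np.mean((I - I_hat) ** 2)
    return 10 * np.log10((2 ** n_bits - 1) ** 2 / mse)

def sam_deg(I, I_hat):
    """Angle between the two spectral vectors, flattened over all pixels."""
    num = np.sum(I * I_hat)
    den = np.sqrt(np.sum(I ** 2) * np.sum(I_hat ** 2))
    return np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))

def rmse(I, I_hat):
    return np.sqrt(np.mean((I - I_hat) ** 2))

def ssim(I, I_hat):
    return structural_similarity(I, I_hat, data_range=float(I.max() - I.min()))
```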

4.2. Quantitative Evaluation

We have conducted tests on 20 images from the TokyoTech database, and the quantitative results of the PSNR, SAM, SSIM, and RMSE metrics are presented in Table 1, Table 2, Table 3 and Table 4. The number of snapshots is indicated in each table caption. For example, taking p snapshots, where p ∈ {2, …, n}, means taking one snapshot without displacement of the camera and p − 1 other snapshots with displacements of the camera along the specified axes. The “Displacements” columns of the tables specify the different configurations of sparse image compositions, identified by letter combinations, where
  • ‘a’ corresponds to a snapshot without displacement;
  • ‘b’ corresponds to a snapshot after a displacement of 1 pixel on the H-axis;
  • ‘c’ corresponds to a snapshot after a displacement of 1 pixel on the V-axis;
  • ‘d’ corresponds to a snapshot after a displacement of 2 pixels on the H-axis;
  • ‘e’ corresponds to a snapshot after a displacement of 2 pixels on the V-axis;
  • ‘f’ corresponds to a snapshot after a displacement of 3 pixels on the H-axis;
  • ‘g’ corresponds to a snapshot after a displacement of 3 pixels on the V-axis.
The ‘abc’ displacements represent three different snapshots taken by a camera. The first snapshot is taken without any displacement, the second snapshot is taken after a horizontal displacement of one pixel, and the third snapshot is taken after a vertical displacement of one pixel. The values in each cell of the table represent the average of the eight spectral bands. Using this quantitative evaluation method, we can compare the values obtained from composite sparse images with the values obtained from sparse images without any composition. Note that the tables do not cover all possible combinations of camera displacements but only a selection. The results indicate that images reconstructed from composed bands are of higher quality than those reconstructed without band composition. However, the quality of the reconstructed image depends not only on the specific composition used but also on the individual image.

4.3. Qualitative Assessment

Figure 11, Figure 12 and Figure 13 show the reconstructions of fine details in images containing abrupt transitions, non-homogeneous areas (Figure 11 and Figure 12), and textured regions (Figure 13). We will compare the quality of images reconstructed from sparse and composite sparse images by selecting and zooming in on a 60 × 60 pixel area from “Butterfly8”, a 155 × 165 pixel area from “Butterfly”, and a 130 × 104 pixel area from “Party”. The selected areas are indicated by red boxes in the original images. The quality of the reconstructions improves from two snapshots to four. We also note a significant correlation between the spatial distribution of pixels in the sparse image and the reconstruction results.

5. Discussion

The study’s results show a direct correlation between the number of compositions and the spatial resolution of the reconstructed image, especially when reconstructing abrupt transitions, non-homogeneous areas, and textured regions. The more compositions performed, the better the reduction of the distance between the non-null pixels of the sparse images, leading to a better spatial resolution after demosaicing. For each level of composition, there are differences in the qualitative and quantitative results depending on the values of the displacement scalars.
Several observations can be made about compositions involving two bands, where only one camera displacement is required. For certain images in Figure 11 and Figure 12, there is a preference for horizontal shifts, while for others, in Figure 13, there is a preference for vertical shifts. Depending on the type of image, there is a clear improvement in abrupt transitions, non-homogeneous areas, and textured regions. Two-pixel displacements significantly improve local structures such as edges, textures, and patterns compared to the image obtained without band composition. This improvement is manifested in higher SSIM values (Table 3), a lower spectral similarity angle according to SAM (Table 2), and a lower reconstruction error according to RMSE (Table 4). However, reconstruction with less noise is observed with 1-pixel or 3-pixel displacements, as shown by PSNR (Table 1). The study highlights a significant correlation between the spatial distribution of the pixels in the sparse images and the quantitative and qualitative results after reconstruction. Indeed, a displacement of 2 pixels better reduces the distance between two non-null pixels of the sparse images, leading to less overlapping in abrupt transitions and improved visual restitution, as shown by the displacements (ad, ae) in Figure 11, Figure 12 and Figure 13. In conclusion, vertical shifts, especially those of 2 pixels, offer a good compromise between improving local structures and reducing noise in the reconstructed images. The study highlights the importance of considering the spatial distribution of pixels when planning camera shifts for optimal reconstruction.
In compositions with three bands and two camera displacements, there are 14 possible combinations of displacements on the horizontal and vertical axes. According to PSNR, 1-pixel or 3-pixel displacements on both axes result in a less noisy reconstruction. SSIM shows that the structural reconstruction is almost equivalent in most cases. Moving along the same axis results in higher spectral similarity and fewer reconstruction errors, as indicated by SAM and RMSE. Visual results show increased sharpness for displacements on the same axis, but decreased sharpness for displacements of 1 pixel on both axes and 3 pixels on both axes. In conclusion, displacements on the same axis provide an optimal compromise between the structural and spectral quality of the reconstruction. At the same time, other configurations offer specific advantages and disadvantages in terms of noise reduction and visual sharpness.
The visual results obtained are very close to the reference image for four-band compositions with three camera displacements, for which there are 10 possible combinations. This suggests a satisfactory ability to reconstruct images with a high level of visual fidelity, although the metrics are slightly worse than in the case of the three-band composition. However, this type of shift is not directly feasible in a real-time acquisition system due to the increased complexity of the camera displacement, so this type of composition is not necessary in real-time acquisition systems. Nevertheless, displacements of this type of composition along the same axis show excellent visual results. This observation suggests that a limited camera displacement may be sufficient to significantly improve the quality of reconstructed images without requiring the excessive complexity of a bi-axial composition. In conclusion, four-band compositions can produce satisfactory visual results, but their practical implementation in a real-time acquisition system is limited by their displacement complexity. However, simpler strategies, such as moving along the same axis, can provide significant improvements while reducing the difficulty of operational feasibility.
In practice, our method can be implemented, in particular, by using a tri-CCD system to capture and restore scenes with moving objects [33]. This acquisition system has a beam splitter that divides the incoming light along two additional axes. The prism redirects the light to three sensors, each of which captures a mosaic of the same scene with a different observation, providing three mosaics of the same scene with different spatial information distributions. For static objects, a micron-precision camera translation system would be required to capture and restore the fully defined image. Figure 14 illustrates the operation of a tri-CCD system where each sensor is equipped with an MSFA.
The first MSFA filter is mounted on top of sensor 1 to obtain a mosaic with no information shift. The second MSFA filter is mounted on top of sensor 2 to obtain a mosaic with information shifted by 1 pixel on the horizontal axis. Finally, a third MSFA filter is mounted on top of sensor 3 to obtain a mosaic with information shifted by 1 pixel on the vertical axis.

6. Conclusions

This paper presents a first prototype simulation approach to improve the spatial resolution of multispectral images acquired by our MSFA single-shot camera, with particular emphasis on reproducing fine details, abrupt transitions, and textured regions. Our approach proposes a method of camera displacement along horizontal and/or vertical axes to capture multiple snapshots, thus generating different mosaics for the same observed scene. We then proceed to assemble the spectrally similar pixels of these mosaics to increase the number of non-null pixels in the sparse images. The results of our experiments carried out on TT-31 images projected on the response of our MSFA filter show a qualitative and quantitative improvement in the reconstruction based on composite sparse images, with better results validated by the PSNR, SAM, SSIM, and RMSE metrics. The next step will be implementing our multi-shot prototype on our camera by installing a micron-precision camera motion device. This will allow us to perform experiments on real images and propose a new demosaicing method based on composite sparse bands.

Author Contributions

Conceptualization, J.Y.A.Y. and P.G.; methodology, J.Y.A.Y.; experimentation, J.Y.A.Y.; validation, J.Y.A.Y.; related works, J.Y.A.Y.; data projection, K.J.A.; writing—original draft preparation, J.Y.A.Y.; writing—review and editing, J.Y.A.Y., K.J.A. and P.G.; supervision, P.G. and T.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

No new data were created.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Chambino, L.L.; Silva, J.S.; Bernardino, A. Multispectral Facial Recognition: A Review. IEEE Access 2020, 8, 207871–207883. [Google Scholar] [CrossRef]
  2. Zhang, K.; Zhang, F.; Wan, W.; Yu, H.; Sun, J.; Del Ser, J.; Elyan, E.; Hussain, A. Panchromatic and Multispectral Image Fusion for Remote Sensing and Earth Observation: Concepts, Taxonomy, Literature Review, Evaluation Methodologies and Challenges Ahead. Inf. Fusion 2023, 93, 227–242. [Google Scholar] [CrossRef]
  3. Ma, F.; Yuan, M.; Kozak, I. Multispectral Imaging: Review of Current Applications. Surv. Ophthalmol. 2023, 68, 889–904. [Google Scholar] [CrossRef] [PubMed]
  4. Mohammadi, V.; Gouton, P.; Rossé, M.; Katakpe, K.K. Design and Development of Large-Band Dual-MSFA Sensor Camera for Precision Agriculture. Sensors 2023, 24, 64. [Google Scholar] [CrossRef] [PubMed]
  5. Gebhart, S.C.; Thompson, R.C.; Mahadevan-Jansen, A. Liquid-Crystal Tunable Filter Spectral Imaging for Brain Tumor Demarcation. Appl. Opt. 2007, 46, 1896. [Google Scholar] [CrossRef] [PubMed]
  6. Harris, S.E.; Wallace, R.W. Acousto-Optic Tunable Filter. J. Opt. Soc. Am. 1969, 59, 744. [Google Scholar] [CrossRef]
  7. Lapray, P.-J.; Wang, X.; Thomas, J.-B.; Gouton, P. Multispectral Filter Arrays: Recent Advances and Practical Implementation. Sensors 2014, 14, 21626–21659. [Google Scholar] [CrossRef] [PubMed]
  8. Monno, Y.; Kitao, T.; Tanaka, M.; Okutomi, M. Optimal Spectral Sensitivity Functions for a Single-Camera One-Shot Multispectral Imaging System. In Proceedings of the 2012 19th IEEE International Conference on Image Processing, Orlando, FL, USA, 30 September–3 October 2012; IEEE: Orlando, FL, USA, 2012; pp. 2137–2140. [Google Scholar]
  9. Rathi, V.; Goyal, P. Generic Multispectral Demosaicking Using Spectral Correlation between Spectral Bands and Pseudo-Panchromatic Image. Signal Process. Image Commun. 2023, 110, 116893. [Google Scholar] [CrossRef]
  10. Shrestha, R.; Hardeberg, J.Y.; Khan, R. Spatial Arrangement of Color Filter Array for Multispectral Image Acquisition; Widenhorn, R., Nguyen, V., Eds.; SPIE: San Francisco, CA, USA, 2011; p. 787503. [Google Scholar]
  11. Monno, Y.; Kikuchi, S.; Tanaka, M.; Okutomi, M. A Practical One-Shot Multispectral Imaging System Using a Single Image Sensor. IEEE Trans. Image Process. 2015, 24, 3048–3059. [Google Scholar] [CrossRef]
  12. Mohammadi, V.; Sodpingou, S.G.; Katakpe, K.K.; Rossé, M.; Gouton, P. Development of a Multi-Spectral Camera for Computer Vision Applications. In Proceedings of the International Conference on Computer Graphics, Visualization and Computer Vision (WSCG 2022), Pilsen, Czech Republic, 17–20 May 2022. [Google Scholar]
  13. Thomas, J.-B.; Lapray, P.-J.; Gouton, P.; Clerc, C. Spectral Characterization of a Prototype SFA Camera for Joint Visible and NIR Acquisition. Sensors 2016, 16, 993. [Google Scholar] [CrossRef]
  14. Monno, Y.; Tanaka, M.; Okutomi, M. Multispectral Demosaicking Using Guided Filter. In Digital Photography VIII; Battiato, S., Rodricks, B.G., Sampat, N., Imai, F.H., Xiao, F., Eds.; SPIE: Bellingham, WA, USA, 2012. [Google Scholar]
  33. … to Metasurface-Based Imaging. Nanophotonics 2024, 13, 1303–1330. [Google Scholar] [CrossRef]
Figure 1. (a) A 4 × 4 size nondominant band redundant MSFA moxel. (b) Spectral response of the MSFA moxel of the CAVIAR project camera.
Figure 2. Process of mosaicking and grouping pixels with similar spectral properties.
Figure 3. Spatial distribution of pixels in the sparse image of spectral band B1.
Figure 4. Mosaics obtained before and after camera displacement. (a) $I^{MSFA}$ mosaic obtained with the main snapshot. (b) $I_{1V}^{MSFA}$ mosaic obtained with displacement of the camera on the vertical axis by 1 pixel. (c) $I_{1H}^{MSFA}$ mosaic obtained with displacement of the camera on the horizontal axis by 1 pixel.
Figure 5. Spatial distribution of pixels in the sparse image of the spectral band B1. (a) $\tilde{I}_1$ is the sparse image of spectral band B1. (b) $\tilde{I}_{k_1 V 1}$ is the sparse image of spectral band B11 with the camera displacement on the vertical axis of k1 pixels. (c) $\tilde{I}_{k_2 H 1}$ is the sparse image of spectral band B12 with the camera displacement on the horizontal axis of k2 pixels.
Figure 6. Architecture of our composition method.
Figure 7. Composition scheme of the two sparse images of the $I^{MSFA}$ and $I_{1H}^{MSFA}$ mosaics.
Figure 8. Pixel distribution in the composite sparse image $\tilde{I}_{c1}$ for each camera displacement. (a) $\tilde{I}_{ci}$ without displacement. (b–d) $\tilde{I}_{ci}$ with displacements of 1, 2, and 3 pixels on the horizontal axis, respectively. (e–g) $\tilde{I}_{ci}$ with displacements of 1, 2, and 3 pixels on the vertical axis, respectively.
Figure 9. Pixel distribution in the composite sparse image $\tilde{I}_{c1}$ for each camera displacement. (a) $\tilde{I}_{ci}$ without displacement. (b) $\tilde{I}_{ci}$ with two displacements of 1 pixel on both the H- and V-axis. (c) $\tilde{I}_{ci}$ with two displacements of 3 pixels on the H-axis and 2 pixels on the V-axis. (d) $\tilde{I}_{ci}$ with two displacements of 3 pixels each on the H- and V-axis. (e) $\tilde{I}_{ci}$ with two displacements of 1 and 2 pixels on the same V-axis. (f) $\tilde{I}_{ci}$ with two displacements of 1 and 3 pixels on the same H-axis. (g) $\tilde{I}_{ci}$ with three displacements of 1, 2, and 3 pixels on the same V-axis. (h) $\tilde{I}_{ci}$ with three displacements of 1, 2, and 3 pixels on the same H-axis.
Figure 10. General architecture of our projection, composition, and interpolation method.
Figure 11. Qualitative assessment of the Butterfly8 image according to different compositions.
Figure 12. Qualitative assessment of the Butterfly image according to different compositions.
Figure 13. Qualitative assessment of the Party image according to different compositions.
Figure 14. Tri-CCD system diffuses light to three sensors.
Table 1. PSNR comparison between single and multiple snapshots. Snapshots: 1 (column a), 2 (columns ab–ag), 3 (columns abc–acg), 4 (columns abdf and aceg).

| Image | a | ab | ac | ad | ae | af | ag | abc | acf | abg | afg | abd | acg | abdf | aceg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Butterfly | 25.46 | 26.07 | 25.94 | 25.56 | 25.55 | 26.07 | 25.94 | 26.53 | 26.44 | 26.47 | 26.53 | 26.47 | 26.44 | 26.06 | 25.89 |
| Butterfly2 | 24.19 | 24.49 | 24.66 | 24.35 | 24.35 | 24.49 | 24.66 | 24.90 | 24.62 | 24.70 | 24.90 | 24.71 | 24.71 | 24.44 | 24.59 |
| Butterfly3 | 28.31 | 28.56 | 28.91 | 28.39 | 28.39 | 28.56 | 28.90 | 29.10 | 28.95 | 29.03 | 29.10 | 29.01 | 28.99 | 28.51 | 28.83 |
| Butterfly4 | 27.86 | 28.13 | 28.49 | 27.94 | 27.93 | 28.13 | 28.49 | 28.70 | 28.59 | 28.62 | 28.70 | 28.60 | 28.53 | 28.08 | 28.34 |
| Butterfly5 | 28.38 | 29.06 | 28.56 | 28.46 | 28.46 | 29.07 | 28.56 | 29.17 | 29.09 | 29.09 | 29.17 | 29.06 | 29.01 | 29.01 | 28.44 |
| Butterfly6 | 25.84 | 26.36 | 26.20 | 25.97 | 25.96 | 26.36 | 26.20 | 26.66 | 26.54 | 26.54 | 26.65 | 26.54 | 26.53 | 26.32 | 26.14 |
| Butterfly7 | 27.54 | 27.88 | 28.20 | 27.64 | 27.63 | 27.88 | 28.20 | 28.51 | 28.34 | 28.43 | 28.51 | 28.43 | 28.40 | 27.85 | 28.15 |
| Butterfly8 | 27.97 | 28.08 | 28.64 | 28.04 | 28.04 | 28.09 | 28.64 | 28.69 | 28.40 | 28.56 | 28.68 | 28.51 | 28.48 | 28.00 | 28.49 |
| Colorchart | 27.32 | 27.68 | 27.90 | 27.40 | 27.40 | 27.68 | 27.90 | 28.20 | 28.20 | 28.20 | 28.20 | 28.18 | 28.14 | 27.66 | 27.83 |
| CD | 33.87 | 34.12 | 34.20 | 34.01 | 34.02 | 34.12 | 34.20 | 34.39 | 34.31 | 34.29 | 34.38 | 34.31 | 34.32 | 34.13 | 34.19 |
| Cloth2 | 27.48 | 28.12 | 27.73 | 27.69 | 27.68 | 28.11 | 27.73 | 28.37 | 28.13 | 28.15 | 28.35 | 28.19 | 28.24 | 28.11 | 27.75 |
| Cloth3 | 29.86 | 30.15 | 30.11 | 30.09 | 30.08 | 30.15 | 30.12 | 30.27 | 30.02 | 30.05 | 30.27 | 29.87 | 30.05 | 29.89 | 30.05 |
| Cloth6 | 29.19 | 29.86 | 29.43 | 29.35 | 29.34 | 29.86 | 29.43 | 30.01 | 29.82 | 29.84 | 29.99 | 29.77 | 29.80 | 29.71 | 29.35 |
| Flower | 29.17 | 29.53 | 29.61 | 29.28 | 29.28 | 29.53 | 29.61 | 29.89 | 29.69 | 29.80 | 29.89 | 29.76 | 29.77 | 29.46 | 29.54 |
| Flower2 | 28.15 | 28.83 | 28.55 | 28.23 | 28.23 | 28.84 | 28.55 | 29.23 | 29.13 | 29.14 | 29.23 | 29.13 | 29.12 | 28.80 | 28.50 |
| Flower3 | 30.91 | 31.31 | 31.22 | 31.03 | 31.03 | 31.30 | 31.22 | 31.55 | 31.41 | 31.45 | 31.54 | 31.45 | 31.45 | 31.27 | 31.18 |
| Party | 25.97 | 26.42 | 26.17 | 26.10 | 26.09 | 26.41 | 26.16 | 26.47 | 26.38 | 26.39 | 26.45 | 26.24 | 26.32 | 26.22 | 26.07 |
| Tape | 26.35 | 26.59 | 27.17 | 26.56 | 26.56 | 26.59 | 27.17 | 27.34 | 27.15 | 27.17 | 27.33 | 27.13 | 27.24 | 26.49 | 27.24 |
| Tape2 | 25.25 | 25.68 | 25.84 | 25.55 | 25.55 | 25.68 | 25.84 | 26.15 | 25.88 | 25.99 | 26.13 | 26.11 | 26.07 | 25.75 | 25.88 |
| Tshirts2 | 23.42 | 23.80 | 23.72 | 23.59 | 23.59 | 23.81 | 23.73 | 23.99 | 23.71 | 23.74 | 23.98 | 23.53 | 23.71 | 23.47 | 23.64 |
| Average | 27.62 | 28.04 | 28.06 | 27.76 | 27.76 | 28.04 | 28.06 | 28.41 | 28.24 | 28.28 | 28.40 | 28.25 | 28.27 | 27.96 | 28.00 |
Table 2. SAM comparison between single and multiple snapshots. Snapshots: 1 (column a), 2 (columns ab–ag), 3 (columns abc–acg), 4 (columns abdf and aceg).

| Image | a | ab | ac | ad | ae | af | ag | abc | acf | abg | afg | abd | acg | abdf | aceg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Butterfly | 1.98 | 1.55 | 1.61 | 1.50 | 1.50 | 1.55 | 1.61 | 1.42 | 1.48 | 1.46 | 1.42 | 1.14 | 1.01 | 1.23 | 1.10 |
| Butterfly2 | 4.29 | 3.61 | 3.63 | 3.27 | 3.27 | 3.60 | 3.63 | 3.29 | 3.43 | 3.42 | 3.29 | 2.57 | 2.58 | 2.66 | 2.68 |
| Butterfly3 | 4.87 | 4.37 | 4.38 | 4.27 | 4.27 | 4.37 | 4.38 | 4.23 | 4.34 | 4.32 | 4.22 | 3.98 | 3.98 | 4.03 | 4.05 |
| Butterfly4 | 3.76 | 3.36 | 3.41 | 3.33 | 3.33 | 3.36 | 3.41 | 3.26 | 3.37 | 3.36 | 3.25 | 3.12 | 3.08 | 3.15 | 3.14 |
| Butterfly5 | 3.23 | 2.70 | 2.71 | 2.56 | 2.56 | 2.70 | 2.72 | 2.49 | 2.61 | 2.60 | 2.50 | 2.10 | 2.08 | 2.18 | 2.17 |
| Butterfly6 | 2.70 | 2.33 | 2.35 | 2.27 | 2.27 | 2.32 | 2.35 | 2.18 | 2.27 | 2.25 | 2.18 | 1.99 | 1.96 | 2.06 | 2.03 |
| Butterfly7 | 3.96 | 3.36 | 3.39 | 3.31 | 3.30 | 3.36 | 3.39 | 3.30 | 3.41 | 3.39 | 3.29 | 3.15 | 3.08 | 3.19 | 3.11 |
| Butterfly8 | 3.53 | 3.25 | 3.13 | 2.79 | 2.79 | 3.26 | 3.12 | 2.85 | 3.03 | 3.01 | 2.85 | 2.10 | 2.46 | 2.16 | 2.57 |
| Colorchart | 5.16 | 4.67 | 4.30 | 4.16 | 4.17 | 4.67 | 4.30 | 4.18 | 4.26 | 4.25 | 4.18 | 2.96 | 3.65 | 3.04 | 3.76 |
| CD | 3.65 | 3.40 | 3.09 | 3.06 | 3.06 | 3.40 | 3.09 | 2.99 | 3.12 | 3.10 | 2.99 | 2.24 | 2.75 | 2.32 | 2.85 |
| Cloth2 | 3.73 | 3.10 | 3.24 | 2.99 | 2.99 | 3.10 | 3.24 | 2.99 | 3.04 | 3.03 | 2.99 | 2.60 | 2.50 | 2.66 | 2.58 |
| Cloth3 | 4.40 | 3.57 | 3.59 | 3.31 | 3.30 | 3.57 | 3.59 | 3.39 | 3.40 | 3.39 | 3.39 | 2.57 | 2.51 | 2.64 | 2.60 |
| Cloth6 | 4.86 | 4.00 | 3.98 | 3.67 | 3.68 | 4.00 | 3.98 | 3.70 | 3.88 | 3.86 | 3.71 | 2.86 | 2.92 | 2.93 | 3.01 |
| Flower | 3.21 | 2.81 | 2.85 | 2.77 | 2.77 | 2.81 | 2.84 | 2.69 | 2.76 | 2.75 | 2.69 | 2.50 | 2.53 | 2.54 | 2.63 |
| Flower2 | 3.25 | 2.85 | 2.88 | 2.79 | 2.79 | 2.85 | 2.88 | 2.73 | 2.82 | 2.80 | 2.72 | 2.53 | 2.53 | 2.57 | 2.61 |
| Flower3 | 3.74 | 3.27 | 3.37 | 3.26 | 3.26 | 3.27 | 3.37 | 3.17 | 3.26 | 3.25 | 3.16 | 3.01 | 2.99 | 3.05 | 3.10 |
| Party | 4.25 | 3.58 | 3.34 | 3.12 | 3.13 | 3.58 | 3.34 | 3.10 | 3.28 | 3.27 | 3.10 | 1.95 | 2.44 | 2.03 | 2.55 |
| Tape | 2.26 | 1.97 | 1.99 | 1.96 | 1.96 | 1.97 | 1.99 | 1.82 | 1.92 | 1.90 | 1.83 | 1.70 | 1.70 | 1.78 | 1.78 |
| Tape2 | 5.07 | 4.40 | 4.44 | 4.30 | 4.30 | 4.40 | 4.44 | 4.26 | 4.41 | 4.39 | 4.25 | 4.00 | 3.97 | 4.05 | 4.03 |
| Tshirts2 | 5.43 | 4.50 | 4.36 | 3.99 | 3.99 | 4.50 | 4.37 | 4.05 | 4.16 | 4.14 | 4.05 | 2.76 | 3.08 | 2.84 | 3.17 |
| Average | 3.87 | 3.33 | 3.30 | 3.13 | 3.13 | 3.33 | 3.30 | 3.11 | 3.21 | 3.20 | 3.10 | 2.59 | 2.69 | 2.65 | 2.78 |
Table 3. SSIM comparison between single and multiple snapshots. Snapshots: 1 (column a), 2 (columns ab–ag), 3 (columns abc–acg), 4 (columns abdf and aceg).

| Image | a | ab | ac | ad | ae | af | ag | abc | acf | abg | afg | abd | acg | abdf | aceg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Butterfly | 0.692 | 0.717 | 0.719 | 0.725 | 0.725 | 0.717 | 0.719 | 0.729 | 0.710 | 0.711 | 0.729 | 0.722 | 0.722 | 0.723 | 0.723 |
| Butterfly2 | 0.659 | 0.686 | 0.686 | 0.695 | 0.695 | 0.686 | 0.686 | 0.700 | 0.677 | 0.677 | 0.699 | 0.688 | 0.692 | 0.690 | 0.695 |
| Butterfly3 | 0.674 | 0.692 | 0.691 | 0.694 | 0.694 | 0.692 | 0.691 | 0.696 | 0.680 | 0.680 | 0.697 | 0.683 | 0.682 | 0.686 | 0.684 |
| Butterfly4 | 0.677 | 0.697 | 0.697 | 0.701 | 0.701 | 0.697 | 0.698 | 0.704 | 0.685 | 0.686 | 0.705 | 0.690 | 0.688 | 0.693 | 0.690 |
| Butterfly5 | 0.682 | 0.710 | 0.710 | 0.718 | 0.718 | 0.709 | 0.710 | 0.723 | 0.703 | 0.703 | 0.723 | 0.717 | 0.719 | 0.717 | 0.720 |
| Butterfly6 | 0.686 | 0.707 | 0.707 | 0.711 | 0.711 | 0.706 | 0.707 | 0.715 | 0.694 | 0.694 | 0.715 | 0.699 | 0.699 | 0.702 | 0.702 |
| Butterfly7 | 0.638 | 0.660 | 0.658 | 0.664 | 0.663 | 0.660 | 0.658 | 0.667 | 0.647 | 0.647 | 0.668 | 0.653 | 0.653 | 0.655 | 0.656 |
| Butterfly8 | 0.636 | 0.658 | 0.659 | 0.662 | 0.662 | 0.658 | 0.659 | 0.663 | 0.651 | 0.650 | 0.663 | 0.656 | 0.653 | 0.659 | 0.654 |
| Colorchart | 0.628 | 0.656 | 0.649 | 0.658 | 0.659 | 0.655 | 0.649 | 0.660 | 0.646 | 0.646 | 0.660 | 0.651 | 0.655 | 0.653 | 0.658 |
| CD | 0.547 | 0.597 | 0.607 | 0.623 | 0.622 | 0.597 | 0.607 | 0.631 | 0.598 | 0.600 | 0.629 | 0.634 | 0.639 | 0.632 | 0.638 |
| Cloth2 | 0.567 | 0.606 | 0.606 | 0.619 | 0.619 | 0.606 | 0.606 | 0.624 | 0.598 | 0.598 | 0.623 | 0.619 | 0.618 | 0.621 | 0.618 |
| Cloth3 | 0.620 | 0.650 | 0.648 | 0.657 | 0.657 | 0.649 | 0.648 | 0.661 | 0.639 | 0.639 | 0.660 | 0.651 | 0.650 | 0.654 | 0.653 |
| Cloth6 | 0.671 | 0.694 | 0.694 | 0.699 | 0.699 | 0.694 | 0.694 | 0.702 | 0.687 | 0.687 | 0.702 | 0.694 | 0.694 | 0.696 | 0.696 |
| Flower | 0.671 | 0.701 | 0.704 | 0.712 | 0.712 | 0.701 | 0.704 | 0.717 | 0.697 | 0.698 | 0.716 | 0.716 | 0.714 | 0.717 | 0.715 |
| Flower2 | 0.759 | 0.773 | 0.775 | 0.777 | 0.777 | 0.773 | 0.775 | 0.779 | 0.768 | 0.769 | 0.779 | 0.772 | 0.772 | 0.773 | 0.773 |
| Flower3 | 0.732 | 0.752 | 0.753 | 0.757 | 0.757 | 0.752 | 0.752 | 0.760 | 0.746 | 0.746 | 0.760 | 0.753 | 0.753 | 0.755 | 0.754 |
| Party | 0.733 | 0.751 | 0.750 | 0.754 | 0.754 | 0.751 | 0.750 | 0.757 | 0.744 | 0.744 | 0.757 | 0.749 | 0.748 | 0.751 | 0.749 |
| Tape | 0.662 | 0.679 | 0.680 | 0.681 | 0.681 | 0.679 | 0.680 | 0.683 | 0.667 | 0.667 | 0.683 | 0.667 | 0.666 | 0.670 | 0.669 |
| Tape2 | 0.612 | 0.637 | 0.636 | 0.644 | 0.644 | 0.637 | 0.636 | 0.649 | 0.620 | 0.620 | 0.650 | 0.627 | 0.635 | 0.629 | 0.639 |
| Tshirts2 | 0.613 | 0.660 | 0.658 | 0.675 | 0.675 | 0.660 | 0.658 | 0.681 | 0.654 | 0.653 | 0.680 | 0.682 | 0.680 | 0.682 | 0.680 |
| Average | 0.658 | 0.684 | 0.684 | 0.691 | 0.691 | 0.684 | 0.684 | 0.695 | 0.676 | 0.676 | 0.695 | 0.686 | 0.687 | 0.688 | 0.688 |
Table 4. RMSE comparison between single and multiple snapshots. Snapshots: 1 (column a), 2 (columns ab–ag), 3 (columns abc–acg), 4 (columns abdf and aceg).

| Image | a | ab | ac | ad | ae | af | ag | abc | acf | abg | afg | abd | acg | abdf | aceg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Butterfly | 5.54 | 5.13 | 5.16 | 5.25 | 5.25 | 5.13 | 5.17 | 4.91 | 5.03 | 5.01 | 4.92 | 4.89 | 4.88 | 5.03 | 5.06 |
| Butterfly2 | 8.23 | 7.74 | 7.68 | 7.64 | 7.65 | 7.73 | 7.68 | 7.40 | 7.71 | 7.67 | 7.40 | 7.34 | 7.33 | 7.46 | 7.39 |
| Butterfly3 | 3.61 | 3.41 | 3.37 | 3.44 | 3.44 | 3.41 | 3.37 | 3.27 | 3.34 | 3.32 | 3.27 | 3.25 | 3.25 | 3.36 | 3.31 |
| Butterfly4 | 3.61 | 3.39 | 3.34 | 3.42 | 3.42 | 3.39 | 3.34 | 3.24 | 3.31 | 3.29 | 3.23 | 3.23 | 3.23 | 3.34 | 3.29 |
| Butterfly5 | 3.46 | 3.21 | 3.30 | 3.30 | 3.30 | 3.21 | 3.30 | 3.16 | 3.21 | 3.21 | 3.16 | 3.15 | 3.15 | 3.17 | 3.26 |
| Butterfly6 | 5.84 | 5.44 | 5.48 | 5.50 | 5.50 | 5.45 | 5.48 | 5.25 | 5.39 | 5.38 | 5.26 | 5.21 | 5.22 | 5.30 | 5.35 |
| Butterfly7 | 3.71 | 3.48 | 3.45 | 3.54 | 3.54 | 3.48 | 3.45 | 3.31 | 3.40 | 3.38 | 3.31 | 3.31 | 3.30 | 3.44 | 3.38 |
| Butterfly8 | 4.27 | 4.05 | 3.95 | 4.03 | 4.03 | 4.05 | 3.95 | 3.87 | 3.98 | 3.95 | 3.87 | 3.85 | 3.84 | 3.98 | 3.85 |
| Colorchart | 3.26 | 3.04 | 3.04 | 3.11 | 3.11 | 3.04 | 3.04 | 2.95 | 2.97 | 2.97 | 2.95 | 2.94 | 2.93 | 3.04 | 3.01 |
| CD | 2.31 | 2.21 | 2.23 | 2.23 | 2.23 | 2.21 | 2.23 | 2.17 | 2.21 | 2.21 | 2.17 | 2.18 | 2.17 | 2.20 | 2.20 |
| Cloth2 | 6.08 | 5.79 | 5.82 | 5.66 | 5.66 | 5.79 | 5.82 | 5.51 | 5.79 | 5.78 | 5.51 | 5.46 | 5.51 | 5.48 | 5.63 |
| Cloth3 | 5.15 | 4.96 | 4.93 | 4.93 | 4.93 | 4.96 | 4.93 | 4.85 | 5.02 | 5.00 | 4.86 | 5.00 | 4.91 | 5.00 | 4.88 |
| Cloth6 | 5.68 | 5.35 | 5.41 | 5.38 | 5.39 | 5.35 | 5.41 | 5.25 | 5.41 | 5.40 | 5.25 | 5.32 | 5.31 | 5.33 | 5.36 |
| Flower | 4.86 | 4.61 | 4.60 | 4.62 | 4.62 | 4.61 | 4.60 | 4.49 | 4.59 | 4.57 | 4.49 | 4.47 | 4.47 | 4.52 | 4.53 |
| Flower2 | 5.05 | 4.73 | 4.78 | 4.80 | 4.80 | 4.73 | 4.79 | 4.60 | 4.71 | 4.70 | 4.61 | 4.60 | 4.60 | 4.64 | 4.71 |
| Flower3 | 3.81 | 3.60 | 3.62 | 3.61 | 3.61 | 3.60 | 3.62 | 3.52 | 3.60 | 3.59 | 3.52 | 3.50 | 3.50 | 3.52 | 3.55 |
| Party | 5.86 | 5.51 | 5.61 | 5.59 | 5.59 | 5.51 | 5.61 | 5.45 | 5.51 | 5.51 | 5.46 | 5.45 | 5.45 | 5.46 | 5.56 |
| Tape | 5.82 | 5.52 | 5.44 | 5.44 | 5.44 | 5.51 | 5.44 | 5.25 | 5.45 | 5.44 | 5.26 | 5.30 | 5.22 | 5.43 | 5.21 |
| Tape2 | 7.93 | 7.34 | 7.32 | 7.26 | 7.27 | 7.34 | 7.32 | 7.01 | 7.30 | 7.27 | 7.02 | 6.92 | 6.93 | 7.04 | 7.01 |
| Tshirts2 | 8.36 | 7.77 | 7.75 | 7.69 | 7.7 | 7.77 | 7.75 | 7.44 | 7.73 | 7.7 | 7.45 | 7.35 | 7.36 | 7.47 | 7.44 |
| Average | 4.95 | 4.66 | 4.66 | 4.67 | 4.67 | 4.66 | 4.66 | 4.50 | 4.63 | 4.61 | 4.50 | 4.49 | 4.49 | 4.56 | 4.55 |
