Mamba-convolution hybrid network for underwater image enhancement



where H and W represent the height and width of an underwater image.

To demonstrate the superior performance of our MC-UIE, we present qualitative evaluations, quantitative assessments, an ablation study, a complexity comparison, and application tests.

We implement the proposed MC-UIE in the PyTorch 2.1 framework on a machine with an Intel(R) i9-12900K CPU, 64 GB of RAM, and an NVIDIA RTX 4090 GPU. We use the Adam optimizer with a fixed learning rate for network optimization. The model is trained for 50 epochs with a batch size of 4, and all input images are resized to a fixed resolution.
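As a concrete reference, the following is a minimal sketch of this training configuration (Adam optimizer, 50 epochs, batch size 4) in PyTorch. The 1e-4 learning rate, the 256x256 input resolution, and the tiny stand-in network are assumptions for illustration only; they are not the actual MC-UIE implementation.

```python
# Minimal training-configuration sketch (PyTorch, Adam, 50 epochs, batch size 4).
# The learning rate (1e-4) and 256x256 input size are assumed, and the small
# stand-in network below is NOT the MC-UIE architecture; it only makes the loop runnable.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stand-in for the actual MC-UIE network (hypothetical placeholder).
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 3, 3, padding=1)).to(device)

# Synthetic paired data (degraded image, reference image) at the assumed 256x256 size.
raw = torch.rand(8, 3, 256, 256)
ref = torch.rand(8, 3, 256, 256)
loader = DataLoader(TensorDataset(raw, ref), batch_size=4, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # assumed learning rate
criterion = nn.L1Loss()                                    # L1 loss used by the full model

for epoch in range(50):
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
```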

We compare the proposed method with eight leading UIE methods, including two physics-based methods (BRUIE and HLRP), two CNN-based methods (USUIR and PUIE-Net), one Transformer-based method (U-Shape), and three Mamba-based methods (WaterMamba, UWMamba, and PixMamba). For a fair comparison, we employ the source code provided by the authors, retrain each method on our training set, and save the best enhancement results.

We utilize two publicly available real-world UIE datasets (LSUI and UIEB) for training and testing. The LSUI dataset provides a large collection of real underwater images across diverse scenarios, making it suitable for training data-driven enhancement models. The UIEB dataset includes real underwater images along with a curated subset of reference images, facilitating full-reference quality evaluation. For training, we adopt 3794 images from the LSUI dataset and 800 images from the UIEB dataset. For testing, we use the remaining 485 images from the LSUI dataset (Test-485) and the remaining 90 images from the UIEB dataset (Test-90).
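The split sizes above can be illustrated with the short sketch below. The datasets are replaced by dummy tensors of the correct sizes, and a random split is assumed here purely for illustration, since the paper does not state how the training/testing partition is drawn.

```python
# Illustration of the split sizes: LSUI 3794 train / 485 test, UIEB 800 train / 90 test.
import torch
from torch.utils.data import TensorDataset, random_split, ConcatDataset

# Dummy stand-ins sized like LSUI (4279 pairs) and UIEB (890 pairs); in practice
# these would be datasets of (raw, reference) underwater image pairs loaded from disk.
lsui = TensorDataset(torch.zeros(4279, 1), torch.zeros(4279, 1))
uieb = TensorDataset(torch.zeros(890, 1), torch.zeros(890, 1))

g = torch.Generator().manual_seed(0)
lsui_train, test_485 = random_split(lsui, [3794, 485], generator=g)  # Test-485
uieb_train, test_90 = random_split(uieb, [800, 90], generator=g)     # Test-90

train_set = ConcatDataset([lsui_train, uieb_train])  # 4594 training pairs in total
```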

We employ two full-reference and two non-reference metrics to measure the performance of different methods on Test-485 and Test-90. The two widely used full-reference image quality metrics are the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index Measure (SSIM). PSNR quantifies the pixel-level differences between the enhanced image and the ground truth, while SSIM evaluates perceptual similarity by incorporating luminance, contrast, and structural information. In both cases, higher scores indicate better fidelity to the reference image in terms of content and structure. Meanwhile, two non-reference metrics, the Underwater Color Image Quality Evaluation (UCIQE) and the Underwater Image Quality Measure (UIQM), are used to assess enhanced underwater image quality without reference images. UCIQE measures image quality as a weighted combination of chroma variation, luminance contrast, and saturation, while UIQM integrates colorfulness, sharpness, and contrast measures into a single quality score. For both metrics, higher values suggest better perceptual quality in underwater environments.
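For reference, a minimal sketch of how these metrics can be computed is given below, using a recent scikit-image for PSNR/SSIM and a compact approximation of UCIQE with the commonly used coefficients (0.4680, 0.2745, 0.2576). The exact metric implementations used in the paper are not specified, so this should be treated as illustrative.

```python
# Full-reference metrics via scikit-image, plus an approximate UCIQE computation.
import numpy as np
import cv2
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def full_reference_scores(enhanced, reference):
    """PSNR and SSIM between two uint8 RGB images of identical size."""
    psnr = peak_signal_noise_ratio(reference, enhanced, data_range=255)
    ssim = structural_similarity(reference, enhanced, channel_axis=-1, data_range=255)
    return psnr, ssim

def uciqe(image_bgr):
    """Approximate UCIQE: weighted sum of chroma std, luminance contrast, mean saturation.
    Normalisation details vary across public implementations."""
    lab = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    l = lab[..., 0] / 255.0
    a, b = lab[..., 1] - 128.0, lab[..., 2] - 128.0
    chroma = np.sqrt(a ** 2 + b ** 2)
    sigma_c = chroma.std() / 128.0                       # std of chroma (normalised)
    con_l = np.percentile(l, 99) - np.percentile(l, 1)   # luminance contrast
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    mu_s = hsv[..., 1].mean() / 255.0                    # mean saturation
    return 0.4680 * sigma_c + 0.2745 * con_l + 0.2576 * mu_s
```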

We show visual comparisons of different UIE methods on Test-485 and Test-90 in Figs. 2 and 3. Raw underwater images are shown in Figs. 2a and 3a. As shown in Figs. 2b, 3b, 2c, and 3c, BRUIE and HLRP handle various color casts, but the enhanced images of BRUIE tend to lack authentic underwater color, while HLRP introduces noticeable over-brightness. USUIR improves the contrast of the image, but its results exhibit unnatural color balance in Figs. 2d and 3d. PUIE-Net and U-Shape effectively improve the visibility of underwater images, but their results show poor clarity, as evidenced in Figs. 2e, 3e, 2f, and 3f. As shown in Figs. 2g, 3g, 2h, and 3h, WaterMamba and UWMamba significantly restore the details of low-light underwater images, but WaterMamba fails to eliminate color casts, while UWMamba introduces local color casts. PixMamba achieves satisfactory visual results in restoring underwater image visibility and eliminating color casts, but it also produces local color casts, as shown in Figs. 2i and 3i. In contrast, as shown in Figs. 2j and 3j, the proposed MC-UIE effectively handles color casts, restores image details, and improves image visibility. In summary, our MC-UIE can effectively handle both conventionally degraded underwater images and extremely degraded low-light underwater images.

We evaluate the performance of different UIE methods using PSNR, SSIM, UCIQE, and UIQM on Test-485 and Test-90, as shown in Table 1. For PSNR and SSIM, the proposed MC-UIE yields the best scores, which indicates that our results closely resemble the reference images in terms of both content and structure. For UCIQE, MC-UIE achieves the best score, which demonstrates that the proposed approach mitigates non-uniform color bias, reduces blurriness, and enhances contrast. In addition, MC-UIE obtains the best UIQM scores, which indicates strong performance in terms of colorfulness, sharpness, and contrast enhancement.

As shown in Fig. 4, we analyze the importance of the M-C HB and CFMB modules through comprehensive ablation studies on bluish, greenish, yellowish, and low-visibility degradation scenes, using four settings: 1) replacing the M-C HB with a naive Mamba block (Our-settingI), 2) replacing the CFMB with a 1×1 convolution layer (Our-settingII), 3) replacing the L1 loss function with an L2 and structural similarity loss function (Our-settingIII), and 4) the full method (Our). Our-settingI underperforms in mitigating color casts and improving visibility. Our-settingII improves color and increases visibility but fails to fully remove color casts. Our-settingIII restores more image details but fails to efficiently remove color casts. Our full method achieves true-to-life colors while enhancing sharpness and visibility. Furthermore, we show a quantitative comparison in Table 2. As shown, there is a significant improvement in PSNR and SSIM scores from Our-settingI to Our-settingII. Compared with Our-settingII and Our-settingIII, our full method (Our) achieves the highest PSNR and SSIM scores, underscoring the superior performance of our MC-UIE in restoring underwater content and structure.
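The loss variants compared in this ablation can be sketched as follows. The 0.5/0.5 weighting and the use of the pytorch_msssim package are assumptions for illustration, since the paper does not give the exact formulation of Our-settingIII.

```python
# Sketch of the loss variants: the full model's L1 loss vs. the L2 + SSIM
# combination of Our-settingIII (weights and SSIM implementation assumed).
import torch
import torch.nn.functional as F
from pytorch_msssim import ssim  # assumed third-party SSIM implementation

def loss_full(pred, target):
    """L1 loss used by the full MC-UIE model."""
    return F.l1_loss(pred, target)

def loss_setting3(pred, target, w_mse=0.5, w_ssim=0.5):
    """Our-settingIII variant: L2 (MSE) plus a structural-similarity term (weights assumed)."""
    mse = F.mse_loss(pred, target)
    ssim_term = 1.0 - ssim(pred, target, data_range=1.0)  # 1 - SSIM as a loss
    return w_mse * mse + w_ssim * ssim_term
```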

We demonstrate the utility of our MC-UIE on several underwater application tests, including depth estimation, edge detection, keypoint detection, saliency detection, and image segmentation. We employ the non-local prior for underwater depth estimation, utilize the Canny operator for underwater edge detection, employ SIFT for underwater keypoint detection, adopt BASNet for underwater saliency detection, and apply a superpixel-based clustering algorithm for underwater image segmentation. We evaluate the performance of the underwater application tests of different methods using the following metrics: absolute relative error (REL) for depth estimation, average precision (AP) for edge detection, the number of detected keypoints for keypoint detection, mean absolute error (MAE) for saliency detection, and global consistency error (GCE) for image segmentation. A lower REL score reflects more accurate depth predictions, with fewer discrepancies from the reference depth map. Conversely, a higher AP score suggests that the detected edges align more closely with the ground truth edges. Similarly, a lower MAE score indicates that the predicted saliency map deviates less from the ground truth mask. For image segmentation, a lower GCE score means the predicted segmentation results are more consistent with the reference annotations. All results are shown in Fig. 5. Compared to other competitors, our enhanced results achieve more accurate depth maps, indicating the superiority of our MC-UIE in restoring more reliable depth structures. Moreover, our MC-UIE yields more detected edges and keypoints, which suggests that it recovers richer local details and sharper structures. Compared to other methods, the segmentation results of our MC-UIE are more consistent and accurate, and its saliency maps contain more salient objects and better boundaries. These results suggest that MC-UIE can effectively boost underwater image segmentation and saliency detection.
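Two of these application tests, Canny edge detection and SIFT keypoint counting, can be reproduced with standard OpenCV calls, as sketched below. The thresholds are illustrative, and the depth, saliency, and segmentation pipelines (non-local prior, BASNet, superpixel clustering) are separate models and algorithms that are not reproduced here.

```python
# Edge and keypoint tests on an enhanced underwater image using OpenCV.
import cv2

def edge_and_keypoint_test(enhanced_bgr):
    gray = cv2.cvtColor(enhanced_bgr, cv2.COLOR_BGR2GRAY)

    # Canny edge detection (thresholds chosen for illustration).
    edges = cv2.Canny(gray, 100, 200)

    # SIFT keypoint detection; more keypoints generally indicate richer local detail.
    sift = cv2.SIFT_create()
    keypoints = sift.detect(gray, None)

    return edges, len(keypoints)

# Usage: edges, n_kp = edge_and_keypoint_test(cv2.imread("enhanced.png"))
```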

We compare the complexity of all UIE methods in terms of running time (s), trainable parameters (M), and FLOPs (G). The results are shown in Table 3. It can be seen that our MC-UIE has relatively few parameters, low FLOPs, and a short running time. WaterMamba has the fewest parameters, but its FLOPs are relatively high and its running time (0.0244 s) is slower than that of our MC-UIE (0.0126 s). Although USUIR has fewer FLOPs and a shorter running time, its enhancement performance is worse than that of our MC-UIE. Overall, our MC-UIE achieves the best enhancement performance with relatively low complexity and a short running time.
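A possible way to obtain such complexity figures is sketched below: parameters and runtime are measured with plain PyTorch, while FLOP counting assumes a third-party profiler such as thop (which reports MACs, commonly quoted as FLOPs). The tools actually used in the paper are not stated.

```python
# Parameter count, FLOP/MAC estimate, and average forward-pass time for a model.
import time
import torch
from thop import profile  # assumed third-party FLOP/MAC-counting package

def complexity_report(model, input_size=(1, 3, 256, 256), device="cuda"):
    model = model.to(device).eval()
    x = torch.randn(*input_size, device=device)

    params_m = sum(p.numel() for p in model.parameters()) / 1e6  # parameters (M)
    macs, _ = profile(model, inputs=(x,), verbose=False)
    flops_g = macs / 1e9                                         # MACs (G), often reported as FLOPs

    # Average forward-pass time over repeated runs (after a warm-up pass).
    with torch.no_grad():
        model(x)
        if device == "cuda":
            torch.cuda.synchronize()
        start = time.time()
        for _ in range(100):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
    runtime_s = (time.time() - start) / 100

    return params_m, flops_g, runtime_s
```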
