Intelligent Recognition of Deep-Sea Video Data Based on Convolutional Neural Networks

Recognizing Seabed Images Taken From the Mid-Atlantic Ridge Based on Convolution Neural Network

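  • Abstract: Seabed video is the most direct survey method in marine scientific expeditions, particularly for investigating seafloor rock types and the bottom environment. By interpreting the high-quality video and still images acquired by deep-sea high-definition video and photographic equipment, seafloor rock types, biological activity and the bottom environment can be identified directly. In recent years, with the widespread use of deep-sea imaging platforms such as manned submersibles, ROVs and deep-sea optical towed bodies, the volume of seabed video and high-definition imagery has grown rapidly, and the workload of processing and analyzing it is enormous, so intelligent seabed image recognition technology based on computer vision needs to be developed. This paper proposes an automatic seabed image recognition approach based on deep learning and applies it to video data acquired during a hydrothermal sulfide survey on the Mid-Atlantic Ridge. First, from high-quality video acquired at sea, 31 499 frames of seabed images were manually identified and labeled as different substrate types, including pelagic sediment, pillow basalt, brecciated basalt and hydrothermal sulfide, of which hydrothermal sulfide is the main survey target; this image dataset was randomly split into a training set and a validation set for the deep learning model. A deep residual network (Residual Network, ResNet) was then built, trained on the image dataset, and validated for accuracy. The model was used to process one 3.5 km long seabed video survey line, and the results show an automatic recognition accuracy of 98%. For the intelligent recognition of seabed video data, the method combines high efficiency with high accuracy, and it can be used both for post-processing large volumes of video and for on-site analysis during deep-sea video surveys.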

     

    Abstract: Seabed video is the most intuitive survey method in marine scientific research, especially for investigating seafloor rock types and the bottom environment. By interpreting high-quality video and still images obtained with deep-sea high-definition video and photographic equipment, it is possible to directly identify seafloor rock types, biological activity and the seabed environment. In recent years, with the widespread use of deep-sea imaging platforms such as manned submersibles, Remotely Operated Vehicles (ROVs) and deep-sea optical towed vehicles, scientists have acquired a rapidly growing volume of high-definition seabed video and image data. Processing and analyzing these data requires an enormous amount of work, so intelligent seabed image recognition technology based on computer vision needs to be developed. In this study, an automatic seabed image recognition model based on deep learning is proposed and applied to video data obtained from a hydrothermal sulfide survey on the Mid-Atlantic Ridge. First, from high-quality video data obtained during the oceanographic survey, 31 499 frames of seabed images were manually identified and labeled as pelagic sediment, pillow basalt, brecciated basalt, hydrothermal sulfide and other substrate types, with hydrothermal sulfide being the main exploration target. This image dataset was randomly split into a training set and a validation set for the deep learning model. Second, a deep residual network (ResNet) was built, trained on the image dataset and validated for accuracy. Finally, the model was used to analyze a 3.5 km long seabed video survey line, and the results show that the ResNet model reaches a recognition accuracy of 98%. This method combines high efficiency with high accuracy for the intelligent recognition of seabed video data, and it can be used not only for post-processing of massive video data but also for on-site analysis during deep-sea video surveys.
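The random split of the labeled frames and the accuracy measure mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not state the split ratio, so the 20% validation fraction, the fixed seed, the frame file names, the helper names `random_split` and `accuracy`, and the even distribution of labels across frames are all assumptions made for illustration.

```python
import random

# Substrate classes named in the abstract (identifiers are illustrative).
CLASSES = [
    "pelagic_sediment",
    "pillow_basalt",
    "brecciated_basalt",
    "hydrothermal_sulfide",
]


def random_split(samples, val_fraction=0.2, seed=0):
    """Shuffle labeled frames and split them into (training, validation) lists.

    val_fraction and seed are assumed values; the abstract only says the
    dataset was split randomly.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = int(round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def accuracy(predicted, actual):
    """Fraction of frames whose predicted label equals the manual label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical frame records (file name, label); 31 499 matches the number
# of manually labeled frames reported in the abstract.
frames = [(f"frame_{i:05d}.jpg", CLASSES[i % len(CLASSES)]) for i in range(31499)]
train_set, val_set = random_split(frames)
```

A fixed seed keeps the split reproducible between training runs. When class frequencies are strongly unbalanced, a stratified split per substrate type may be preferable to a purely random one.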

     
