Open Access System for Information Sharing

Department of Computer Science & Engineering (컴퓨터공학과) 3. Theses_Ph.D.

Thesis


Learning Image and Video Restoration Using Auxiliary Data

Title: Learning Image and Video Restoration Using Auxiliary Data

Authors: 이준용

Date Issued: 2023

Publisher: 포항공과대학교 (Pohang University of Science and Technology)

Abstract: Previous deep learning-based restoration methods mainly focus on improving network architectures to achieve better restoration quality. However, a network with increased capacity can easily overfit to its training dataset, so its capacity is not fully exploited when handling arbitrary real-world degradation. In this dissertation, we propose to use auxiliary data to provide degradation-specific priors so that the network's capacity is fully engaged in the restoration task. We present image and video restoration frameworks for defocus deblurring and video super-resolution, with novel network architectures and training strategies designed to use the auxiliary data effectively, allowing the networks to achieve state-of-the-art restoration quality. First, we propose a novel deep learning-based network that estimates a defocus map containing the per-pixel blur amount of a defocused input image. To train the network, we present a dataset of synthetic images with defocus maps. During training, we use a real-world blur detection dataset as auxiliary data to reduce the domain gap that arises when a real-world defocused image is fed to a network trained on the synthetic dataset. Our method achieves state-of-the-art defocus map estimation performance, and we show that the defocus maps it predicts can improve defocus deblurring quality. Second, we present an end-to-end defocus deblurring framework that predicts per-pixel deblurring filters for flexible handling of spatially varying defocus blur. Because of this high flexibility, the network can easily overfit to its training dataset. We prevent this during training by adding auxiliary disparity map estimation and reblurring tasks, which lead the network to exploit defocus-specific priors about blur sizes and shapes and enable robust single-image defocus deblurring. Our network effectively removes spatially varying defocus blur and achieves state-of-the-art deblurring quality.
Third, we present a reference-based video super-resolution (RefVSR) approach built on an explicit matching-based RefVSR network designed to super-resolve an ultra-wide low-resolution (LR) video using wide-angle and telephoto videos as auxiliary references. To train the network, we propose a dataset of video triplets captured concurrently by the triple cameras of a smartphone, together with a training strategy that fully uses these video triplets. Our network achieves state-of-the-art real-world 4× VSR performance. Lastly, we present a memory network for implicit reference utilization in the RefVSR task. Given ultra-wide LR features as queries, the memory network returns the corresponding wide-angle reference features, which a VSR network can use to produce high-fidelity results. We also propose a test-time optimization strategy that fine-tunes the memory network to memorize video-specific reference information. We show that reference features queried from the proposed memory network can be used across the entire region of an LR frame and help improve the final SR quality.
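To make the query-based memory read described above more concrete, the following is a minimal sketch of one common way such a lookup can be realized: softmax attention over key-value slots. This is an assumption for illustration only; the abstract does not specify the internals of the memory network, and the function name `memory_read` and the `keys`/`values` layout are hypothetical.

```python
import math


def memory_read(query, keys, values):
    """Return a softmax-weighted mix of `values`.

    Hypothetical illustration (not the thesis's network): `query` stands in
    for an LR feature vector, `keys` for stored addressing vectors, and
    `values` for stored reference features.
    """
    # Scaled dot-product similarity between the query and every key.
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]

    # Numerically stable softmax over the memory slots.
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]

    # Weighted sum of the stored value vectors.
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


# Usage: a query aligned with the first key retrieves (almost exactly)
# the first stored value.
keys = [[10.0, 0.0], [0.0, 10.0]]
values = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
retrieved = memory_read([10.0, 0.0], keys, values)
```

Because the read is a soft mixture, every query receives some reference feature even without an explicit spatial match, which is the property the abstract attributes to the implicit approach.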
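Degradation severely lowers the quality of images and videos and causes irreversible information loss. It can arise from various physical causes in the camera, such as a low aperture value or the limited size of the camera sensor, and it can take different shapes and sizes depending on pixel position within an image or video. Image and video restoration, one of the most challenging problems in computational photography, aims to recover the original sharp content by removing degradation from an image or video. With the recent surge of deep learning-based research, the performance of image and video restoration techniques has improved considerably. However, existing studies concentrate on improving network architectures to raise overall performance on computational photography problems, and purely architectural improvements make a network prone to overfitting to the training data distribution, so that it cannot effectively handle the diverse forms of real-world degradation. Overcoming this problem requires network designs specialized for removing various degradations, as well as training methods that lead the network to be fully utilized for restoration even on images and videos not seen during training. This dissertation presents image and video restoration frameworks and training methods specialized for degradation removal through the use of auxiliary data. Using auxiliary data, we provide the network with priors that reflect the characteristics of the target degradation, guiding it to learn effective operations specialized for removing that degradation. We propose deep learning frameworks with improved architectures, and training methods, that make effective use of auxiliary data, and we introduce the following four deep learning-based image and video restoration frameworks for removing defocus blur and low-resolution degradation.
- Defocus map estimation using real-world defocused image data: A network trained on a synthetic defocus map dataset has difficulty estimating defocus maps for real images because of the gap in image characteristics. For robust estimation of real-world defocus maps, this work presents a network and training method that use a dataset of binary blur maps of real-world defocused images as auxiliary data to reduce the gap between synthetic and real defocused images.
- Defocus blur removal using dual-pixel data and reblurring: This work proposes a dynamic deblurring filter network that reflects the spatially varying nature of defocus blur. However, the high flexibility of the network makes single-image deblurring difficult. To address this, we use dual-pixel disparity map data and a reblurring task in network training, which guides the deblurring filters to encode accurate defocus blur characteristics and enables effective defocus deblurring from a single image.
- Reference-based video super-resolution using multi-camera videos: In an asymmetric multi-camera setting, wide-angle and telephoto videos have 2× and 4× higher resolution, respectively, than the ultra-wide video. Building on this property, this work presents a reference-based video super-resolution network and training method that raise the resolution of the ultra-wide video by using the high-resolution wide-angle and telephoto reference videos both directly and indirectly.
- Multi-camera video super-resolution using a memory network: When a wide-angle reference video is used directly for ultra-wide video super-resolution, the reference video does not contribute to the super-resolution result in regions where the two videos cannot be matched. To solve this, this work proposes a deep learning framework in which a reference memory network of limited size, trained to be able to reconstruct the wide-angle video, is used indirectly by the super-resolution network for the ultra-wide video.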

URI: http://postech.dcollection.net/common/orgView/200000659628
https://oasis.postech.ac.kr/handle/2014.oak/118336

Article Type: Thesis

Files in This Item: There are no files associated with this item.

