In this tutorial you will learn how to use the 'dnn_superres' interface to upscale an image with pre-trained neural networks. It works in both C++ and Python.

Building

When building OpenCV, run the following command to build all the contrib modules:

cmake -D OPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules/

Or build only the dnn_superres module:

cmake -D OPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules/dnn_superres

Alternatively, in cmake-gui, the GUI version of CMake, make sure that the dnn_superres module is checked.
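If the Python bindings were built as well, one quick way to confirm that the module ended up in the build is to look for it from Python. The following is a minimal sketch; the helper name `has_dnn_superres` is illustrative and not part of OpenCV.

```python
import importlib


def has_dnn_superres() -> bool:
    """Return True if an importable cv2 build exposes the dnn_superres module."""
    try:
        cv2 = importlib.import_module("cv2")
    except ImportError:
        # OpenCV's Python bindings are not installed or failed to load.
        return False
    # The contrib module shows up as an attribute of the cv2 package.
    return hasattr(cv2, "dnn_superres")
```

A result of False means that either cv2 could not be imported at all or the build does not include the contrib module.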

Source Code of the sample

Run the sample code as follows:

<path_of_your_opencv_build_directory>/bin/example_dnn_superres_dnn_superres <path_to_image.png> <algo_string> <upscale_int> <model_path.pb>

Example:

/home/opencv/build/bin/example_dnn_superres_dnn_superres /home/image.png edsr 2 /home/EDSR_x2.pb

// This file is part of OpenCV project.
// It is subject to the license terms in the LICENSE file found in the top-level directory
// of this distribution and at http://opencv.org/license.html.
 
#include <cstdlib>
#include <iostream>
#include <string>
 
#include <opencv2/dnn_superres.hpp>
 
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
 
using namespace std;
using namespace cv;
using namespace dnn;
using namespace dnn_superres;
 
int main(int argc, char *argv[])
{
    // Check for valid command line arguments, print usage
    // if insufficient arguments were given.
    if ( argc < 4 ) {
        cout << "usage:   Arg 1: image     | Path to image" << endl;
        cout << "\t Arg 2: algorithm | bilinear, bicubic, edsr, espcn, fsrcnn or lapsrn" << endl;
        cout << "\t Arg 3: scale     | 2, 3 or 4 \n";
        cout << "\t Arg 4: path to model file \n";
        return -1;
    }

    string img_path = string(argv[1]);
    string algorithm = string(argv[2]);
    int scale = atoi(argv[3]);
    string path = "";

    // The model path is only needed for the neural network algorithms
    if( argc > 4)
        path = string(argv[4]);

    // Load the image
    Mat img = cv::imread(img_path);
    Mat original_img(img);
    if ( img.empty() )
    {
        // Report the path that failed to load, not the (empty) Mat
        std::cerr << "Couldn't load image: " << img_path << "\n";
        return -2;
    }

    // Make dnn super resolution instance
    DnnSuperResImpl sr;

    Mat img_new;

    if( algorithm == "bilinear" ){
        // Plain bilinear interpolation, no model needed
        resize(img, img_new, Size(), scale, scale, INTER_LINEAR);
    }
    else if( algorithm == "bicubic" )
    {
        // Plain bicubic interpolation, no model needed
        resize(img, img_new, Size(), scale, scale, INTER_CUBIC);
    }
    else if( algorithm == "edsr" || algorithm == "espcn" || algorithm == "fsrcnn" || algorithm == "lapsrn" )
    {
        sr.readModel(path);
        sr.setModel(algorithm, scale);
        sr.upsample(img, img_new);
    }
    else{
        std::cerr << "Algorithm not recognized. \n";
    }

    // img_new stays empty if the algorithm was not recognized
    if ( img_new.empty() )
    {
        std::cerr << "Upsampling failed. \n";
        return -3;
    }
    cout << "Upsampling succeeded. \n";

    // Display the upscaled image
    cv::namedWindow("Upscaled Image", WINDOW_AUTOSIZE);
    cv::imshow("Upscaled Image", img_new);
    //cv::imwrite("./saved.jpg", img_new);
    cv::waitKey(0);

    return 0;
}

Explanation

Set header and namespaces
#include <opencv2/dnn_superres.hpp>

using namespace std;

using namespace cv;

using namespace dnn;

using namespace dnn_superres;

If you want, you can set the namespaces as in the code above.
Create the Dnn Superres object
DnnSuperResImpl sr;

This only creates the object, registers the custom dnn layers, and gives you access to the class functions.
Read the model
string path = "models/FSRCNN_x2.pb";

sr.readModel(path);

This reads the TensorFlow model from the .pb file. Here, 'path' is the file path of one of the pre-trained TensorFlow models. You can download the models from the 'dnn_superres' module on OpenCV's GitHub.
Set the model
sr.setModel("fsrcnn", 2);

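Depending on the model you want to run, you have to set the algorithm and the upscale factor. They are set explicitly so that the module knows the intended algorithm and scale even if you rename the .pb file. For example, if you chose FSRCNN_x2.pb, the algorithm and scale are 'fsrcnn' and 2, respectively. (The other algorithm options are "edsr", "espcn" and "lapsrn".)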
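When the model files keep their original names, the two arguments can be derived from the file name rather than typed by hand. The sketch below assumes the "<ALGORITHM>_x<scale>.pb" naming pattern seen in the examples of this tutorial (FSRCNN_x2.pb, EDSR_x4.pb); the helper `parse_model_filename` is hypothetical and not part of the dnn_superres API.

```python
import re
from pathlib import Path
from typing import Tuple

# Assumed naming pattern: "<ALGORITHM>_x<scale>.pb", e.g. "FSRCNN_x2.pb".
_MODEL_NAME = re.compile(r"^(?P<algo>[A-Za-z]+)_x(?P<scale>\d+)\.pb$")

# Algorithm strings listed in this tutorial.
_KNOWN_ALGORITHMS = {"edsr", "espcn", "fsrcnn", "lapsrn"}


def parse_model_filename(model_path: str) -> Tuple[str, int]:
    """Derive (algorithm, scale) from a model file name such as 'FSRCNN_x2.pb'."""
    name = Path(model_path).name
    match = _MODEL_NAME.match(name)
    if match is None:
        raise ValueError(f"cannot infer algorithm and scale from {name!r}")
    algorithm = match.group("algo").lower()
    if algorithm not in _KNOWN_ALGORITHMS:
        raise ValueError(f"unknown algorithm {algorithm!r} in {name!r}")
    return algorithm, int(match.group("scale"))
```

The returned pair can be passed directly to setModel. For a renamed file the helper raises ValueError, and the algorithm and scale have to be supplied manually.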
Upscale an image
Mat img = cv::imread(img_path);

Mat img_new;

sr.upsample(img, img_new);

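Now you can upscale any image. Load an image with the standard 'imread' function and create a new Mat for the output image. Then simply upscale it. The upscaled image is stored in 'img_new'.

An example in Python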

import cv2
from cv2 import dnn_superres
 
# Create an SR object - only function that differs from c++ code
sr = dnn_superres.DnnSuperResImpl_create()
 
# Read image
image = cv2.imread('./image.png')
 
# Read the desired model
path = "EDSR_x4.pb"
sr.readModel(path)
 
# Set the desired model and scale to get correct pre- and post-processing
sr.setModel("edsr", 4)
 
# Upscale the image
result = sr.upsample(image)
 
# Save the image
cv2.imwrite("./upscaled.png", result)
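To compare a network's output against plain interpolation, the same steps can be wrapped in a function that also produces a bicubic baseline with cv2.resize. This is a sketch rather than part of the official sample: the function name `upscale_with_baseline` is made up for illustration, and it assumes an OpenCV build that includes the contrib modules.

```python
SUPPORTED_ALGORITHMS = ("edsr", "espcn", "fsrcnn", "lapsrn")


def upscale_with_baseline(image_path: str, model_path: str,
                          algorithm: str, scale: int):
    """Return (super-resolved image, bicubic baseline) for one input image."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unknown algorithm: {algorithm!r}")
    if scale < 1:
        raise ValueError(f"scale must be a positive integer, got {scale}")

    # Imported here so the argument checks above work without OpenCV installed.
    import cv2
    from cv2 import dnn_superres

    # imread returns None instead of raising when the file cannot be read.
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"could not load image: {image_path}")

    # Same sequence as in the example above: create, read, set, upsample.
    sr = dnn_superres.DnnSuperResImpl_create()
    sr.readModel(model_path)
    sr.setModel(algorithm, scale)
    upscaled = sr.upsample(image)

    # Bicubic interpolation at the same scale, for comparison.
    bicubic = cv2.resize(image, None, fx=scale, fy=scale,
                         interpolation=cv2.INTER_CUBIC)
    return upscaled, bicubic
```

For instance, calling upscale_with_baseline("./image.png", "FSRCNN_x2.pb", "fsrcnn", 2) returns both images, which can then be saved with cv2.imwrite.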

Original image:

Image upscaled with FSRCNN:

Image upscaled with bicubic interpolation: