Previous tutorial: Conversion of PyTorch Classification Models and Launch with OpenCV Python


Original author	Anastasia Murzova
Compatibility	OpenCV >= 4.5

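Goals

In this tutorial you will learn how to:

convert a PyTorch classification model into ONNX format
run the converted PyTorch model with the OpenCV C/C++ API
perform model inference

We will explore these points using the ResNet-50 architecture as an example.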

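Introduction

Let's briefly review the key concepts involved in moving a PyTorch model to the OpenCV API. The first step in converting a PyTorch model into cv::dnn::Net is to convert the model into ONNX format. ONNX aims to make neural networks interchangeable between different frameworks. PyTorch has a built-in function for ONNX conversion: torch.onnx.export. The resulting .onnx model is then passed to cv::dnn::readNetFromONNX or cv::dnn::readNet.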

Requirements

To experiment with the code below you need to install a set of libraries. We will use a virtual environment with python3.7+ for this:

virtualenv -p /usr/bin/python3.7 <env_dir_path>

source <env_dir_path>/bin/activate

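To build OpenCV-Python from source, follow the corresponding instructions in the Introduction to OpenCV.

Before you start installing the libraries, you can customize requirements.txt by excluding or including some dependencies (for example, opencv-python). The following line installs the requirements into the previously activated virtual environment: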

pip install -r requirements.txt

Practice

This part covers the following points:

create a classification model conversion pipeline
run inference and process the prediction results

Model Conversion Pipeline

The code in this subsection is located in the samples/dnn/dnn_model_runner module and can be executed with the following line:

python -m dnn_model_runner.dnn_conversion.pytorch.classification.py_to_py_resnet50_onnx

The code below covers the following steps:

instantiate the PyTorch model
convert the PyTorch model into .onnx

from torchvision import models

# initialize PyTorch ResNet-50 model
original_model = models.resnet50(pretrained=True)

# get the path to the PyTorch model converted into ONNX
full_model_path = get_pytorch_onnx_model(original_model)
print("PyTorch ResNet-50 model was successfully converted: ", full_model_path)

The get_pytorch_onnx_model(original_model) function is based on a torch.onnx.export(...) call:

import os

import torch
from torch.autograd import Variable

# directory where the converted model will be saved
onnx_model_path = "models"
# file name of the converted model
onnx_model_name = "resnet50.onnx"

# create the directory for the converted model
os.makedirs(onnx_model_path, exist_ok=True)

# get the full path to the converted model
full_model_path = os.path.join(onnx_model_path, onnx_model_name)
 
# generate model input
generated_input = Variable(
    torch.randn(1, 3, 224, 224)
)
 
# model export into ONNX format
torch.onnx.export(
    original_model,
    generated_input,
    full_model_path,
    verbose=True,
    input_names=["input"],
    output_names=["output"],
    opset_version=11
)

After the above code runs successfully, we get the following output:

PyTorch ResNet-50 model was successfully converted: models/resnet50.onnx

The dnn_model_runner module provided in samples/dnn lets us reproduce the above conversion steps for the following PyTorch classification models:

alexnet
vgg11
vgg13
vgg16
vgg19
resnet18
resnet34
resnet50
resnet101
resnet152
squeezenet1_0
squeezenet1_1
resnext50_32x4d
resnext101_32x8d
wide_resnet50_2
wide_resnet101_2

To obtain the converted model, execute the following line:

python -m dnn_model_runner.dnn_conversion.pytorch.classification.py_to_py_cls --model_name <pytorch_cls_model_name> --evaluate False

For ResNet-50, run the following line:

python -m dnn_model_runner.dnn_conversion.pytorch.classification.py_to_py_cls --model_name resnet50 --evaluate False

The default root directory for saving converted models is defined in the CommonConfig module:

@dataclass
class CommonConfig:
    output_data_root_dir: str = "dnn_model_runner/dnn_conversion"

Thus, the converted ResNet-50 is saved in dnn_model_runner/dnn_conversion/models.
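As a rough illustration of how the output location follows from the configured root directory, the sketch below joins the root with a models subdirectory and a file name. The helper expected_model_path is hypothetical and not part of dnn_model_runner, and the <model_name>.onnx naming is an assumption based on the resnet50.onnx file used in this tutorial.

```python
import posixpath
from dataclasses import dataclass


@dataclass
class CommonConfig:
    # default root directory quoted in the tutorial
    output_data_root_dir: str = "dnn_model_runner/dnn_conversion"


def expected_model_path(model_name, config=None):
    """Hypothetical helper: guess where a converted model would be written.

    Assumes a "models" subdirectory and a "<model_name>.onnx" file name.
    """
    config = config or CommonConfig()
    return posixpath.join(config.output_data_root_dir, "models", model_name + ".onnx")


resnet50_path = expected_model_path("resnet50")
print(resnet50_path)
```

If the sample scripts use a different naming scheme, only the last two path components of the helper would need to change.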

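Inference Pipeline

Now we can use models/resnet50.onnx in the inference pipeline with the OpenCV C/C++ API. The implemented pipeline can be found in samples/dnn/classification.cpp. After the samples are built (the BUILD_EXAMPLES flag must be set to ON), the corresponding example_dnn_classification executable is produced.

To run model inference we will use the squirrel photo below (under the CC0 license), which corresponds to ImageNet class ID 335: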

fox squirrel, eastern fox squirrel, Sciurus niger

Classification model input image

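To decode the label of the obtained prediction, we also need a file with the full list of ImageNet classes (imagenet_classes.txt), which is passed with the --classes key.

In this tutorial we run the inference of the converted PyTorch ResNet-50 model from the build directory (samples/build):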

./dnn/example_dnn_classification --model=../dnn/models/resnet50.onnx --input=../data/squirrel_cls.jpg --width=224 --height=224 --rgb=true --scale="0.003921569" --mean="123.675 116.28 103.53" --std="0.229 0.224 0.225" --crop=true --initial_width=256 --initial_height=256 --classes=../data/dnn/classification_classes_ILSVRC2012.txt

Let's go through the key points of classification.cpp step by step:

read the model with cv::dnn::readNet and initialize the network:

Net net = readNet(model, config, framework);

The value of the model parameter is taken from the --model key. In our case it is resnet50.onnx.

preprocess the input image:

if (rszWidth != 0 && rszHeight != 0)
{
    resize(frame, frame, Size(rszWidth, rszHeight));
}
 
// Create a 4D blob from a frame
blobFromImage(frame, blob, scale, Size(inpWidth, inpHeight), mean, swapRB, crop);
 
// Check std values.
if (std.val[0] != 0.0 && std.val[1] != 0.0 && std.val[2] != 0.0)
{
    // Divide blob by std.
    divide(blob, std, blob);
}

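In this step we use the cv::dnn::blobFromImage function to prepare the model input. We set Size(rszWidth, rszHeight) with --initial_width=256 --initial_height=256 for the initial image resize, as described in the PyTorch ResNet inference pipeline.

Note that cv::dnn::blobFromImage subtracts the mean value first and only then multiplies the pixel values by the scale. Therefore, to reproduce the original image preprocessing order of PyTorch classification models, we use --mean="123.675 116.28 103.53", which equals [0.485, 0.456, 0.406] multiplied by 255.0: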

img /= 255.0
img -= [0.485, 0.456, 0.406]
img /= [0.229, 0.224, 0.225]
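The equivalence of the two orderings can be checked numerically. The sketch below is a plain-Python illustration, not code from classification.cpp: it applies the PyTorch order (scale, subtract mean, divide by std) and the blobFromImage order (subtract mean in 0..255 units, multiply by scale, divide by std) to a single RGB pixel. The pixel value and the function names are made up for the example.

```python
# ImageNet statistics quoted in the tutorial (RGB order, for pixels scaled to [0, 1]).
MEAN_01 = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]
SCALE = 1.0 / 255.0  # approximately 0.003921569, the --scale value

# The mean is subtracted before scaling, so it must be expressed in 0..255 units.
MEAN_255 = [m * 255.0 for m in MEAN_01]


def pytorch_order(pixel):
    """Scale to [0, 1], subtract the mean, then divide by std."""
    return [(p / 255.0 - m) / s for p, m, s in zip(pixel, MEAN_01, STD)]


def opencv_order(pixel):
    """Subtract the 0..255 mean, multiply by the scale, then divide by std."""
    return [((p - m) * SCALE) / s for p, m, s in zip(pixel, MEAN_255, STD)]


pixel = [200.0, 120.0, 30.0]  # arbitrary example pixel, not taken from the tutorial
max_diff = max(abs(a - b) for a, b in zip(pytorch_order(pixel), opencv_order(pixel)))

print("mean in 0..255 units:", [round(m, 3) for m in MEAN_255])
print("orderings agree:", max_diff < 1e-9)
```

Algebraically, (p - 255 * mean) / 255 equals p / 255 - mean, which is why passing the unscaled mean [0.485, 0.456, 0.406] together with --scale="0.003921569" would not give the same result.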

make a forward pass:

net.setInput(blob);

Mat prob = net.forward();

process the prediction:

Point classIdPoint;
double confidence;
minMaxLoc(prob.reshape(1, 1), 0, &confidence, 0, &classIdPoint);
int classId = classIdPoint.x;

Here we choose the most likely object class. The classId result for our case is 335 - fox squirrel, eastern fox squirrel, Sciurus niger:

ResNet50 OpenCV C++ inference output
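The prediction-processing step amounts to finding the index of the highest score and looking up its label. A minimal plain-Python sketch of that idea follows. It mirrors what minMaxLoc is used for above but is not the sample's code, and the scores, labels, and the top_class helper are made up for illustration.

```python
def top_class(probabilities, class_names):
    """Return (class_id, label, score) for the highest-scoring class."""
    class_id = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_id, class_names[class_id], probabilities[class_id]


# Toy data: made-up scores and labels, not real ResNet-50 output.
scores = [0.05, 0.70, 0.25]
labels = ["tabby cat", "fox squirrel", "red fox"]

class_id, label, score = top_class(scores, labels)
print(class_id, label, score)
```

With a real model, the labels would come from the class list file, one class name per line, so that the index returned by the network maps directly to a line in that file.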