Previous tutorial: Conversion of PyTorch Classification Models and Launch with OpenCV Python


Original author	Anastasia Murzova
Compatibility	OpenCV >= 4.5

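Goals

In this tutorial you will learn how to:

convert PyTorch classification models into ONNX format
run the converted PyTorch model with the OpenCV C/C++ API
provide model inference

We will walk through these points using the ResNet-50 architecture as an example.

Introduction

Let's briefly review the key concepts involved in moving a PyTorch model to the OpenCV API. The first step in converting a PyTorch model into cv::dnn::Net is to transfer the model into ONNX format. ONNX aims to make neural networks interchangeable between frameworks. PyTorch has a built-in function for ONNX conversion: torch.onnx.export. The resulting .onnx model is then passed to cv::dnn::readNetFromONNX or cv::dnn::readNet.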

Requirements

To experiment with the code below you need to install a set of libraries. We will use a virtual environment with python3.7+ for this:

virtualenv -p /usr/bin/python3.7 <env_dir_path>

source <env_dir_path>/bin/activate

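To build OpenCV-Python from source, follow the corresponding instructions in Introduction to OpenCV.

Before installing the libraries, you can customize requirements.txt by excluding or including some dependencies (for example, opencv-python). The line below installs the requirements into the previously activated virtual environment: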

pip install -r requirements.txt

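Practice

In this part we cover the following points:

create a classification model conversion pipeline
provide the inference and process the prediction results

Model Conversion Pipeline

The code in this subchapter is located in the samples/dnn/dnn_model_runner module and can be executed with the line: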

python -m dnn_model_runner.dnn_conversion.pytorch.classification.py_to_py_resnet50_onnx

The code below covers the following steps:

instantiate the PyTorch model
convert the PyTorch model into .onnx

from torchvision import models

# initialize PyTorch ResNet-50 model
original_model = models.resnet50(pretrained=True)

# get the path to the PyTorch model converted into ONNX
full_model_path = get_pytorch_onnx_model(original_model)
print("PyTorch ResNet-50 model was successfully converted: ", full_model_path)

The get_pytorch_onnx_model(original_model) function is based on the torch.onnx.export(...) call:

import os

import torch
from torch.autograd import Variable

# directory in which the converted model will be saved
onnx_model_path = "models"
# file name of the converted model
onnx_model_name = "resnet50.onnx"
 
# create directory for further converted model
os.makedirs(onnx_model_path, exist_ok=True)
 
# get full path to the converted model
full_model_path = os.path.join(onnx_model_path, onnx_model_name)
 
# generate model input
generated_input = Variable(
    torch.randn(1, 3, 224, 224)
)
 
# model export into ONNX format
torch.onnx.export(
    original_model,
    generated_input,
    full_model_path,
    verbose=True,
    input_names=["input"],
    output_names=["output"],
    opset_version=11
)

After the above code runs successfully, we get the following output:

PyTorch ResNet-50 model was successfully converted: models/resnet50.onnx

The dnn_model_runner module provided in samples/dnn lets us reproduce the above conversion steps for the following PyTorch classification models:

alexnet
vgg11
vgg13
vgg16
vgg19
resnet18
resnet34
resnet50
resnet101
resnet152
squeezenet1_0
squeezenet1_1
resnext50_32x4d
resnext101_32x8d
wide_resnet50_2
wide_resnet101_2

To obtain the converted model, run the following line:

python -m dnn_model_runner.dnn_conversion.pytorch.classification.py_to_py_cls --model_name <pytorch_cls_model_name> --evaluate False

For ResNet-50, run the line below:

python -m dnn_model_runner.dnn_conversion.pytorch.classification.py_to_py_cls --model_name resnet50 --evaluate False

The default root directory for storing converted models is defined in the CommonConfig module:

@dataclass
class CommonConfig:
    output_data_root_dir: str = "dnn_model_runner/dnn_conversion"

Thus, the converted ResNet-50 will be saved in dnn_model_runner/dnn_conversion/models.
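As a rough illustration of how this location follows from the settings above, the sketch below joins the root directory from CommonConfig with the models directory and the resnet50.onnx file name used earlier. The helper name converted_model_path is hypothetical and is not part of dnn_model_runner; how the sample itself composes the path is an assumption here.

```python
import os
from dataclasses import dataclass


@dataclass
class CommonConfig:
    # default root directory, as shown in the tutorial
    output_data_root_dir: str = "dnn_model_runner/dnn_conversion"


def converted_model_path(model_file, config=None):
    """Hypothetical helper: <root dir>/models/<model file>."""
    config = config or CommonConfig()
    return os.path.join(config.output_data_root_dir, "models", model_file)


print(converted_model_path("resnet50.onnx"))
```

Passing a CommonConfig with a different output_data_root_dir changes only the root part of the path.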

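Inference Pipeline

Now we can use models/resnet50.onnx in an inference pipeline built on the OpenCV C/C++ API. The implemented pipeline can be found in samples/dnn/classification.cpp. After the samples are built (the BUILD_EXAMPLES flag must be set to ON), the corresponding example_dnn_classification executable is produced.

To run model inference we will use the squirrel photo below (under the CC0 license), which corresponds to ImageNet class ID 335: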

fox squirrel, eastern fox squirrel, Sciurus niger

Classification model input image

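To decode the label of the obtained prediction, we also need the imagenet_classes.txt file, which contains the full list of ImageNet classes.

In this tutorial we run the inference process for the converted PyTorch ResNet-50 model from the build directory (samples/build):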

./dnn/example_dnn_classification --model=../dnn/models/resnet50.onnx --input=../data/squirrel_cls.jpg --width=224 --height=224 --rgb=true --scale="0.003921569" --mean="123.675 116.28 103.53" --std="0.229 0.224 0.225" --crop=true --initial_width=256 --initial_height=256 --classes=../data/dnn/classification_classes_ILSVRC2012.txt

Let's go through the key points of classification.cpp step by step:

read the model with cv::dnn::readNet and initialize the network:

Net net = readNet(model, config, framework);

The value of the model parameter is taken from the --model key. In our case it is resnet50.onnx.

preprocess the input image:

if (rszWidth != 0 && rszHeight != 0)
{
    resize(frame, frame, Size(rszWidth, rszHeight));
}
 
// Create a 4D blob from a frame
blobFromImage(frame, blob, scale, Size(inpWidth, inpHeight), mean, swapRB, crop);
 
// Check std values.
if (std.val[0] != 0.0 && std.val[1] != 0.0 && std.val[2] != 0.0)
{
    // Divide blob by std.
    divide(blob, std, blob);
}

In this step we use the cv::dnn::blobFromImage function to prepare the model input. We set Size(rszWidth, rszHeight) through --initial_width=256 --initial_height=256 for the initial image resize, as described in the PyTorch ResNet inference pipeline.
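To make the geometry of this step easier to follow, here is a minimal sketch of the resize-then-center-crop arithmetic. It follows the documented behaviour of the crop option of cv::dnn::blobFromImage: the image is resized so that one side matches the target size and the other is equal or larger, then the center is cropped. The function name crop_geometry is hypothetical, and the rounding is an approximation that may differ from OpenCV in edge cases.

```python
def crop_geometry(img_w, img_h, target_w, target_h):
    """Return the resized size and the center-crop rectangle (x, y, w, h)."""
    # scale so that both sides are at least as large as the target
    factor = max(target_w / img_w, target_h / img_h)
    resized_w = round(img_w * factor)
    resized_h = round(img_h * factor)
    # top-left corner of the centered crop window
    x = int(0.5 * (resized_w - target_w))
    y = int(0.5 * (resized_h - target_h))
    return (resized_w, resized_h), (x, y, target_w, target_h)


# frame already resized to 256x256, network input 224x224
print(crop_geometry(256, 256, 224, 224))
# a non-square frame, for comparison
print(crop_geometry(320, 256, 224, 224))
```

Under these assumptions a square 256x256 frame is scaled directly to 224x224, so the crop window covers the whole resized image, while the 320x256 frame is scaled to 280x224 and loses 28 pixels on each horizontal side.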

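Note that cv::dnn::blobFromImage subtracts the mean first and only then multiplies the pixel values by the scale. To reproduce the preprocessing order of PyTorch classification models, we therefore pass --mean="123.675 116.28 103.53", which is [0.485, 0.456, 0.406] multiplied by 255.0. The original PyTorch order is: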

img /= 255.0
img -= [0.485, 0.456, 0.406]
img /= [0.229, 0.224, 0.225]
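The equivalence of the two orderings can be checked numerically. The sketch below uses plain Python on single pixel values rather than OpenCV itself; the function names pytorch_order, opencv_order and cli_mean are hypothetical and only mirror the arithmetic described above.

```python
# normalisation constants from the tutorial
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]
SCALE = 1.0 / 255.0  # passed on the command line as --scale="0.003921569"


def pytorch_order(pixel, c):
    """Scale to [0, 1], subtract the mean, divide by std."""
    return (pixel / 255.0 - MEAN[c]) / STD[c]


def opencv_order(pixel, c):
    """Subtract the 0-255 mean, multiply by the scale, divide by std."""
    mean_255 = MEAN[c] * 255.0  # value passed via --mean
    return (pixel - mean_255) * SCALE / STD[c]


def cli_mean():
    """Mean values on the 0-255 range, rounded for the command line."""
    return [round(m * 255.0, 3) for m in MEAN]


print(cli_mean())
for c in range(3):
    print(c, pytorch_order(128, c), opencv_order(128, c))
```

Both functions agree up to floating-point rounding because (x - 255m) / 255 = x / 255 - m.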

make a forward pass:

net.setInput(blob);

Mat prob = net.forward();

process the prediction:

Point classIdPoint;
double confidence;
minMaxLoc(prob.reshape(1, 1), 0, &confidence, 0, &classIdPoint);
int classId = classIdPoint.x;

Here we choose the most likely object class. In our case the classId result is 335 - fox squirrel, eastern fox squirrel, Sciurus niger:

ResNet50 OpenCV C++ inference output
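The class selection above amounts to taking the index of the largest score and looking up its label. A minimal pure-Python sketch of that idea follows; top_class is a hypothetical helper, and the scores and names are toy values rather than real model output. In a real run the names would come from the classes file, assuming it lists one label per line.

```python
def top_class(probabilities, class_names):
    """Return (class_id, label, score) for the highest-scoring entry."""
    class_id = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_id, class_names[class_id], probabilities[class_id]


# toy data for illustration only
scores = [0.05, 0.15, 0.7, 0.1]
names = ["class a", "class b", "class c", "class d"]

class_id, label, score = top_class(scores, names)
print(f"{class_id} - {label}: {score:.2f}")  # prints "2 - class c: 0.70"
```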