YOLOv3 Object Detection Model Training Example

Time: 2020-11-15 06:11:02

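YOLOv3 Object Detection

Learning from scratch how to use keras-yolov3 for object detection in images. This post records the preparation and training process in some detail, walks through training a traffic-light detection model as an example, and provides the related code and training set.

Demo Test

YOLO provides both the model and the source code. For a quick test, start with the weights that YOLO has already trained, so first download the weights file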

/media/files/yolov3.weights

Clone the yolo3 repository locally. The commit id used for this test is e6598d1

git clone git@:qqwweee/keras-yolo3.git

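Install the dependencies, adding whatever turns out to be missing. Make sure the versions are compatible with each other; the versions below are for reference only

Keras==2.2.4
numpy==1.16.0
tensorflow==1.12.0
...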

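Run convert.py to convert the darknet YOLO model into an h5 file that Keras can use. The generated file is saved under model_data. Both convert.py and yolov3.cfg are already in the root of the cloned repository, so they do not need to be downloaded separately.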

python convert.py yolov3.cfg yolov3.weights model_data/yolo.h5

Use python yolo_video.py -h to show the help text

usage: yolo_video.py [-h] [--model MODEL] [--anchors ANCHORS]
                     [--classes CLASSES] [--gpu_num GPU_NUM] [--image]
                     [--input [INPUT]] [--output [OUTPUT]]

optional arguments:
  -h, --help         show this help message and exit
  --model MODEL      path to model weight file, default model_data/yolo.h5
  --anchors ANCHORS  path to anchor definitions, default
                     model_data/yolo_anchors.txt
  --classes CLASSES  path to class definitions, default
                     model_data/coco_classes.txt
  --gpu_num GPU_NUM  Number of GPU to use, default 1
  --image            Image detection mode, will ignore all positional
                     arguments
  --input [INPUT]    Video input path
  --output [OUTPUT]  [Optional] Video output path

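This test runs object detection on an image. Note that with --image all positional arguments are ignored, which means the image path has to be typed in by hand for every detection. This can be changed later by writing your own code.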

python yolo_video.py --image

The prompt Input image filename: then appears. My image is at ./img/3.jpg, so I simply entered that path

After a short wait the detection completes

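Model Training

Preparing the Dataset

First prepare the directory structure. You can download the VOC dataset from http://host.robots.ox.ac.uk/pascal/VOC/voc/ and delete all of the files in it, keeping only the directories, or you can create the structure by hand. The scripts below rely on VOCdevkit/VOC/Annotations, VOCdevkit/VOC/ImageSets/Main and VOCdevkit/VOC/JPEGImages.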
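If you prefer to create the folders from a script, the following is a minimal sketch. It assumes only the three folders that the scripts in this post read from and write to; the function name make_voc_skeleton is made up for illustration and is not part of keras-yolo3.

```python
import os


def make_voc_skeleton(root="."):
    """Create the empty VOC-style folders used by the scripts in this post.

    Hypothetical helper: it only creates Annotations, ImageSets/Main and
    JPEGImages under <root>/VOCdevkit/VOC and returns the created paths.
    """
    base = os.path.join(root, "VOCdevkit", "VOC")
    subdirs = ["Annotations", os.path.join("ImageSets", "Main"), "JPEGImages"]
    created = []
    for sub in subdirs:
        path = os.path.join(base, sub)
        # exist_ok=True makes the call safe to repeat on an existing tree
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created
```

Calling make_voc_skeleton("Train") would build the tree under a Train directory, which matches the layout used later in this post.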

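Put all of the images in the JPEGImages directory, then download the labelImg annotation tool from

/tzutalin/labelImg. This tool generates an XML file from the boxes you draw and label on an image. Open each image in labelImg, annotate it, and place the generated XML file in the Annotations folder. The XML file is saved under the same name as the image.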
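Because the later scripts pair each XML file with a .jpg of the same base name, it can help to check for images that were never annotated before going further. The following is a minimal sketch under that assumption; find_unlabeled_images is a hypothetical helper, not part of labelImg or keras-yolo3.

```python
import os


def find_unlabeled_images(voc_dir):
    """Return base names of .jpg files in JPEGImages with no XML in Annotations."""
    image_dir = os.path.join(voc_dir, "JPEGImages")
    xml_dir = os.path.join(voc_dir, "Annotations")
    images = {os.path.splitext(name)[0]
              for name in os.listdir(image_dir)
              if name.lower().endswith(".jpg")}
    labels = {os.path.splitext(name)[0]
              for name in os.listdir(xml_dir)
              if name.lower().endswith(".xml")}
    # Set difference: images for which no annotation file exists
    return sorted(images - labels)
```

For example, find_unlabeled_images("VOCdevkit/VOC") returns an empty list when every image has an annotation.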

Preparing the Training Files

Create a Python file under VOCdevkit/VOC with the code below and run it. It generates four txt files under VOCdevkit/VOC/ImageSets/Main

import os
import random

# Put everything into the training set: yolo3 still splits off its own
# validation data during training, so no split is needed here.
trainval_percent = 0
train_percent = 1
xmlfilepath = 'Annotations'
txtsavepath = 'ImageSets/Main'

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in indices:
    # Strip the '.xml' extension to get the image id
    name = total_xml[i][:-4] + '\n'
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftest.write(name)
        else:
            fval.write(name)
    else:
        # With trainval_percent = 0 every id ends up in train.txt
        ftrain.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

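In the directory above VOCdevkit (in my layout this is the Train directory), create a Python file and run it to generate three txt files. Note that classes in this code must be changed to the classes you want to train. I only need to train the person class, so the list contains only person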

import xml.etree.ElementTree as ET

# The year field is left empty so that 'VOC%s' resolves to plain 'VOC'
sets = [('', 'train'), ('', 'val'), ('', 'test')]
classes = ["person"]


def convert_annotation(year, image_id, list_file):
    """Append ' xmin,ymin,xmax,ymax,class_id' for every usable object."""
    with open('VOCdevkit/VOC%s/Annotations/%s.xml' % (year, image_id), 'rb') as in_file:
        tree = ET.parse(in_file)
    root = tree.getroot()
    for obj in root.iter('object'):
        difficult = obj.find('difficult').text
        cls = obj.find('name').text
        # Skip classes that are not being trained and objects marked difficult
        if cls not in classes or int(difficult) == 1:
            continue
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (int(xmlbox.find('xmin').text), int(xmlbox.find('ymin').text),
             int(xmlbox.find('xmax').text), int(xmlbox.find('ymax').text))
        list_file.write(" " + ",".join([str(a) for a in b]) + ',' + str(cls_id))


for year, image_set in sets:
    with open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt' % (year, image_set)) as f:
        image_ids = f.read().strip().split()
    with open('%s.txt' % image_set, 'w') as list_file:
        for image_id in image_ids:
            # One line per image: the image path followed by its boxes
            list_file.write('VOCdevkit/VOC%s/JPEGImages/%s.jpg' % (year, image_id))
            convert_annotation(year, image_id, list_file)
            list_file.write('\n')
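Each line written by the script above has the form "image_path xmin,ymin,xmax,ymax,class_id ...", with one space-separated group per box. A minimal sketch for reading such a line back, for example to spot-check the generated train.txt, could look like this; parse_annotation_line is a hypothetical helper and assumes the image path contains no spaces.

```python
def parse_annotation_line(line):
    """Split one annotation line into the image path and a list of boxes.

    Each box is a tuple (xmin, ymin, xmax, ymax, class_id).
    Assumes the image path itself contains no whitespace.
    """
    parts = line.strip().split()
    image_path = parts[0]
    boxes = [tuple(int(value) for value in box.split(","))
             for box in parts[1:]]
    return image_path, boxes


path, boxes = parse_annotation_line(
    "VOCdevkit/VOC/JPEGImages/1.jpg 10,20,110,220,0 5,6,7,8,0\n")
# path  -> 'VOCdevkit/VOC/JPEGImages/1.jpg'
# boxes -> [(10, 20, 110, 220, 0), (5, 6, 7, 8, 0)]
```

An image with no usable objects produces a line containing only the path, in which case boxes is an empty list.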

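Next, copy everything under the Train directory into the cloned repository, so that train.txt, val.txt, test.txt and the VOCdevkit directory sit in the repository root

Modifying the Parameters

Now edit model_data/coco_classes.txt and voc_classes.txt. Both files hold the classes to train. Again, I am only training the person class, so each file contains the single line person.

Next, edit yolov3.cfg. If you do not need to load pretrained weights, this file does not have to be changed, because it only configures the generation of yolo_weights.h5. Search the file for yolo: there are three matches, and each is changed the same way. Taking the first match as the example, three lines are changed at each match (the ones marked with comments), so 9 lines are changed in total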

...
[convolutional]
size=1
stride=1
pad=1
# 3*(5+len(classes)); training one class gives 3*(5+1) = 18
filters=18
activation=linear

[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
# one class
classes=1
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
# set to 0 if GPU memory is limited
random=1
...
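The filters value follows directly from the formula in the comment: each of the 3 anchors at a scale predicts 4 box coordinates, 1 objectness score, and one score per class. A minimal sketch of that arithmetic, with yolo_head_filters as a made-up helper name:

```python
def yolo_head_filters(num_classes, anchors_per_scale=3):
    """Number of filters in the convolutional layer just before a [yolo] layer.

    Per anchor: 4 box coordinates + 1 objectness score + num_classes scores.
    """
    return anchors_per_scale * (5 + num_classes)


print(yolo_head_filters(1))   # 18, the single-class case used in this post
print(yolo_head_filters(80))  # 255, the value for the 80 COCO classes
```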

Run python convert.py -w yolov3.cfg yolov3.weights model_data/yolo_weights.h5 to generate model_data/yolo_weights.h5, which is used to load the pretrained weights.

Training the Model

Now training can start. At first I had no data, so I grabbed a few images, labeled them, and tried it out. There were fewer than ten images, I had additionally split the data when building the VOC dataset, and train.py by default holds out 0.1 of the data as the validation set. Fewer than ten images times 0.1, truncated to an integer, is 0, so I kept getting errors. Be careful here: there must be a validation set. That means at least two images, one for training and one for validation, and enough images overall that int(number of images * val_split) is at least 1. Otherwise train.py fails with KeyError: 'val_loss' and train_bottleneck.py fails with IndexError: list index out of range. You also need to create the logs/000/ directory by hand, so that saving the model does not raise an exception because the directory cannot be found. train.py is generally all you need for training. When problems come up, check the GitHub issues and the README, where many problems are already discussed and solved. I made a few small changes to train.py to suit my training goal; the changed places are marked with comments

"""
Retrain the YOLO model for your own dataset.
"""

import numpy as np
import keras.backend as K
from keras.layers import Input, Lambda
from keras.models import Model
from keras.optimizers import Adam
from keras.callbacks import TensorBoard, ModelCheckpoint, ReduceLROnPlateau, EarlyStopping

from yolo3.model import preprocess_true_boxes, yolo_body, tiny_yolo_body, yolo_loss
from yolo3.utils import get_random_data


def _main():
    annotation_path = 'train.txt'
    log_dir = 'logs/000/'
    classes_path = 'model_data/voc_classes.txt'
    anchors_path = 'model_data/yolo_anchors.txt'
    class_names = get_classes(classes_path)
    num_classes = len(class_names)
    anchors = get_anchors(anchors_path)

    input_shape = (416,416) # multiple of 32, hw

    # The create_tiny_model check was removed here.
    # load_pretrained=False skips the pretrained weights; True loads them.
    model = create_model(input_shape, anchors, num_classes,
        load_pretrained=False,
        freeze_body=2, weights_path='model_data/yolo_weights.h5') # make sure you know what you freeze

    logging = TensorBoard(log_dir=log_dir)
    # ModelCheckpoint period changed so that the model is checked every 10 epochs
    checkpoint = ModelCheckpoint(log_dir + 'ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5',
        monitor='val_loss', save_weights_only=True, save_best_only=True, period=10)
    reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3, verbose=1)
    early_stopping = EarlyStopping(monitor='val_loss', min_delta=0, patience=6000, verbose=1)

    # Fraction of the input lines held out for validation
    val_split = 0.3
    with open(annotation_path) as f:
        lines = f.readlines()
    np.random.seed(10101)
    np.random.shuffle(lines)
    np.random.seed(None)
    num_val = int(len(lines)*val_split)
    num_train = len(lines) - num_val

    # Train with frozen layers first, to get a stable loss.
    # Adjust num epochs to your dataset. This step is enough to obtain a not bad model.
    if True:
        model.compile(optimizer=Adam(lr=1e-3), loss={
            # use custom yolo_loss Lambda layer.
            'yolo_loss': lambda y_true, y_pred: y_pred})

        # Adjust batch_size to fit your GPU memory
        batch_size = 10
        print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))
        # epochs is the index of the last epoch to run
        model.fit_generator(data_generator_wrapper(lines[:num_train], batch_size, input_shape, anchors, num_classes),
                steps_per_epoch=max(1, num_train//batch_size),
                validation_data=data_generator_wrapper(lines[num_train:], batch_size, input_shape, anchors, num_classes),
                validation_steps=max(1, num_val//batch_size),
                epochs=50,
                initial_epoch=0,
                callbacks=[logging, checkpoint])
        model.save_weights(log_dir + 'trained_weights_stage_1.h5')

    # Unfreeze and continue training, to fine-tune.
    # Train longer if the result is not good.
    if True:
        for i in range(len(model.layers)):
            model.layers[i].trainable = True
        model.compile(optimizer=Adam(lr=1e-4), loss={'yolo_loss': lambda y_true, y_pred: y_pred}) # recompile to apply the change
        print('Unfreeze all of the layers.')

        # Adjust batch_size to fit your GPU memory
        batch_size = 10 # note that more GPU memory is required after unfreezing the body
        print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))
        # epochs is the index of the last epoch, so it must be larger than
        # initial_epoch; with epochs=50 and initial_epoch=50 nothing would be trained
        model.fit_generator(data_generator_wrapper(lines[:num_train], batch_size, input_shape, anchors, num_classes),
            steps_per_epoch=max(1, num_train//batch_size),
            validation_data=data_generator_wrapper(lines[num_train:], batch_size, input_shape, anchors, num_classes),
            validation_steps=max(1, num_val//batch_size),
            epochs=100,
            initial_epoch=50)
        model.save_weights(log_dir + 'trained_weights_final.h5')

    # Further training if needed.


def get_classes(classes_path):
    '''loads the classes'''
    with open(classes_path) as f:
        class_names = f.readlines()
    class_names = [c.strip() for c in class_names]
    return class_names


def get_anchors(anchors_path):
    '''loads the anchors from a file'''
    with open(anchors_path) as f:
        anchors = f.readline()
    anchors = [float(x) for x in anchors.split(',')]
    return np.array(anchors).reshape(-1, 2)


def create_model(input_shape, anchors, num_classes, load_pretrained=True, freeze_body=2,
            weights_path='model_data/yolo_weights.h5'):
    '''create the training model'''
    K.clear_session() # get a new session
    image_input = Input(shape=(None, None, 3))
    h, w = input_shape
    num_anchors = len(anchors)

    y_true = [Input(shape=(h//{0:32, 1:16, 2:8}[l], w//{0:32, 1:16, 2:8}[l], \
        num_anchors//3, num_classes+5)) for l in range(3)]

    model_body = yolo_body(image_input, num_anchors//3, num_classes)
    print('Create YOLOv3 model with {} anchors and {} classes.'.format(num_anchors, num_classes))

    if load_pretrained:
        model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)
        print('Load weights {}.'.format(weights_path))
        if freeze_body in [1, 2]:
            # Freeze darknet53 body or freeze all but 3 output layers.
            num = (185, len(model_body.layers)-3)[freeze_body-1]
            for i in range(num): model_body.layers[i].trainable = False
            print('Freeze the first {} layers of total {} layers.'.format(num, len(model_body.layers)))

    model_loss = Lambda(yolo_loss, output_shape=(1,), name='yolo_loss',
        arguments={'anchors': anchors, 'num_classes': num_classes, 'ignore_thresh': 0.5})(
        [*model_body.output, *y_true])
    model = Model([model_body.input, *y_true], model_loss)

    return model


def create_tiny_model(input_shape, anchors, num_classes, load_pretrained=True, freeze_body=2,
            weights_path='model_data/tiny_yolo_weights.h5'):
    '''create the training model, for Tiny YOLOv3'''
    K.clear_session() # get a new session
    image_input = Input(shape=(None, None, 3))
    h, w = input_shape
    num_anchors = len(anchors)

    y_true = [Input(shape=(h//{0:32, 1:16}[l], w//{0:32, 1:16}[l], \
        num_anchors//2, num_classes+5)) for l in range(2)]

    model_body = tiny_yolo_body(image_input, num_anchors//2, num_classes)
    print('Create Tiny YOLOv3 model with {} anchors and {} classes.'.format(num_anchors, num_classes))

    if load_pretrained:
        model_body.load_weights(weights_path, by_name=True, skip_mismatch=True)
        print('Load weights {}.'.format(weights_path))
        if freeze_body in [1, 2]:
            # Freeze the darknet body or freeze all but 2 output layers.
            num = (20, len(model_body.layers)-2)[freeze_body-1]
            for i in range(num): model_body.layers[i].trainable = False
            print('Freeze the first {} layers of total {} layers.'.format(num, len(model_body.layers)))

    model_loss = Lambda(yolo_loss, output_shape=(1,), name='yolo_loss',
        arguments={'anchors': anchors, 'num_classes': num_classes, 'ignore_thresh': 0.5})(
        [*model_body.output, *y_true])
    model = Model([model_body.input, *y_true], model_loss)

    return model


def data_generator(annotation_lines, batch_size, input_shape, anchors, num_classes):
    '''data generator for fit_generator'''
    n = len(annotation_lines)
    i = 0
    while True:
        image_data = []
        box_data = []
        for b in range(batch_size):
            if i == 0:
                np.random.shuffle(annotation_lines)
            image, box = get_random_data(annotation_lines[i], input_shape, random=True)
            image_data.append(image)
            box_data.append(box)
            i = (i+1) % n
        image_data = np.array(image_data)
        box_data = np.array(box_data)
        y_true = preprocess_true_boxes(box_data, input_shape, anchors, num_classes)
        yield [image_data, *y_true], np.zeros(batch_size)


def data_generator_wrapper(annotation_lines, batch_size, input_shape, anchors, num_classes):
    n = len(annotation_lines)
    if n == 0 or batch_size <= 0: return None
    return data_generator(annotation_lines, batch_size, input_shape, anchors, num_classes)


if __name__ == '__main__':
    _main()
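The two pitfalls described before the script (an empty validation set and a missing log directory) can be caught before training starts. The following is a minimal sketch that mirrors the num_val = int(len(lines)*val_split) arithmetic from train.py; check_split and ensure_log_dir are hypothetical helpers, not part of keras-yolo3.

```python
import os


def check_split(num_samples, val_split):
    """Return (num_train, num_val) computed the same way as in train.py.

    Raises ValueError when either part would be empty, which is the
    situation that otherwise surfaces later as KeyError: 'val_loss'.
    """
    num_val = int(num_samples * val_split)
    num_train = num_samples - num_val
    if num_val < 1 or num_train < 1:
        raise ValueError(
            "{} samples with val_split={} gives {} train / {} val; "
            "add images or change val_split".format(
                num_samples, val_split, num_train, num_val))
    return num_train, num_val


def ensure_log_dir(log_dir="logs/000/"):
    """Create the checkpoint directory if it does not exist yet."""
    os.makedirs(log_dir, exist_ok=True)
    return log_dir
```

With val_split = 0.3, check_split(10, 0.3) gives 7 training and 3 validation samples, while check_split(9, 0.1) raises because int(0.9) is 0.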

Testing the Model

Once the model has finished training, you can load it and test on images

from PIL import Image

from yolo import YOLO

if __name__ == '__main__':
    config = {
        "model_path": "logs/000/trained_weights_final.h5",  # model to load
        "score": 0.1,  # only predictions scoring above this value are shown
        "iou": 0.5,  # IoU (intersection over union) threshold
    }
    yolo = YOLO(**config)
    image = Image.open("./img/1.jpg")
    r_image = yolo.detect_image(image)
    r_image.save("./img/2.jpg")
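The iou setting above refers to intersection over union, the overlap ratio between two boxes. As far as I understand, it is used as the overlap threshold when duplicate detections of the same object are suppressed, but treat that as an assumption about the repository rather than a documented fact. A minimal sketch of the ratio itself, for boxes given as (xmin, ymin, xmax, ymax); box_iou is a made-up name and is not the function the repository uses:

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Width and height of the overlapping rectangle, clamped at zero
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(box_iou((0, 0, 10, 10), (0, 0, 10, 10)))    # 1.0, identical boxes
print(box_iou((0, 0, 10, 10), (20, 20, 30, 30)))  # 0.0, no overlap
```

Two boxes that overlap by half of their width, such as (0, 0, 10, 10) and (5, 0, 15, 10), have an IoU of one third, which is below the 0.5 threshold in the config.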

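From here on it is a matter of repeatedly tuning the parameters and retraining. Many files in the directory are never used, or are used once and then generally not needed again, so after backing up the code you can trim the directory structure as appropriate.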