Object Detection Dataset Conversion: Converting JSON Files to TXT Format

Date: 2020-02-08 08:02:50

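In object detection tasks, building or finding a suitable dataset is an essential piece of work. We either adapt the model code to the dataset's label format, or change the label format to meet the model's requirements.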

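The method described in this post is to take the .json files generated by the annotation tool labelme and convert them into a single .txt file as required. Redundant fields are removed along the way, so that only the key information is kept and storage space is saved.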

import json
import os

train_path = 'F:\\train - 附件'
text_filepath = os.path.join(train_path, 'train.txt')
json_num = 0

# Create (or overwrite) the output text file
with open(text_filepath, 'w', encoding='utf8') as text:
    # Use os.walk to visit every directory and file under train_path
    for root, dirs, files in os.walk(train_path):
        for file in files:
            if not file.endswith('.json'):
                continue
            # Each line starts with the image file name
            text.write(file.replace('.json', '.jpg') + '\t\t')
            with open(os.path.join(root, file), 'r', encoding='utf8') as fp:
                json_data = json.load(fp)  # read the json file
            print('Writing label info for', file)
            for each in json_data['shapes']:
                # Drop redundant fields; pop() tolerates missing keys
                for key in ('group_id', 'shape_type', 'flags'):
                    each.pop(key, None)
                # Serialize as JSON (not str()) so the reader can parse it with json.loads
                each = json.dumps(each, ensure_ascii=False)
                try:
                    text.write(each + '\t')
                    print(each, 'written successfully')
                except Exception as e:
                    print('Error:', e)
            text.write('\n')
            json_num += 1

print('Total number of images:', json_num)
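To make the output format concrete, here is a minimal sketch of how one annotation becomes one line of train.txt. The sample annotation, the file name 0001.jpg, and the helper shapes_to_line are made up for illustration, and the sketch assumes each shape is serialized with json.dumps so that it can later be read back with json.loads.

```python
import json

# Hypothetical labelme-style annotation, trimmed to the fields the script touches
sample = {
    "imagePath": "0001.jpg",
    "shapes": [
        {
            "label": "cat",
            "points": [[10.0, 20.0], [110.0, 220.0]],
            "group_id": None,
            "shape_type": "rectangle",
            "flags": {},
        }
    ],
}


def shapes_to_line(image_name, shapes):
    """Build one line: image name, two tabs, then each shape as JSON plus a tab."""
    fields = []
    for shape in shapes:
        # Keep everything except the redundant fields
        kept = {k: v for k, v in shape.items()
                if k not in ("group_id", "shape_type", "flags")}
        fields.append(json.dumps(kept, ensure_ascii=False))
    return image_name + "\t\t" + "".join(f + "\t" for f in fields) + "\n"


line = shapes_to_line(sample["imagePath"], sample["shapes"])
print(repr(line))
```

Only the label and the points survive, which is the information a detection model typically needs.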

import json

import numpy as np
from PIL import Image

# Note: data_Utilss is the author's own preprocessing module and is not shown
# in this post; import it from your project before using this reader.


def custom_reader(data_dir, mode):
    """Return a reader that yields samples parsed from the txt file at data_dir."""
    def reader():
        # Maps each label name to a class id; fill this in for your dataset
        label_dict = {}
        with open(data_dir, encoding='utf8') as file_list:
            for line in file_list:
                # Each line is one row of train.txt
                parts = line.split('\t')
                img_path = parts[0]  # image path
                batch_out = []
                if mode == 'train' or mode == 'eval':
                    ############ Start of the part you may need to customize ############
                    # Converted to int64 below, so the file name stem is expected to be numeric
                    img_id = [parts[0].split('/')[-1][:-4]]
                    # Open the image to determine its pixel size
                    img = Image.open(img_path)
                    if img.mode != 'RGB':
                        img = img.convert('RGB')
                    im_width, im_height = img.size
                    gt_cls = []
                    gt_box = []
                    crowd = []  # whether the objects are crowded, usually 0
                    for object_str in parts[1:]:
                        if len(object_str) <= 1:
                            continue
                        obj = json.loads(object_str)
                        gt_cls.append(float(label_dict[obj['label']]))  # class
                        bbox = obj['points']
                        # Coordinates x1, y1, x2, y2
                        box = [float(bbox[0][0]), float(bbox[0][1]),
                               float(bbox[1][0]), float(bbox[1][1])]
                        gt_box.append(box)
                        crowd.append(0)
                    ############ End of the part you may need to customize ############
                    # Preprocess the image
                    img, im_scales = data_Utilss.get_image_blob(img_path, mode)
                    c, h, w = img.shape
                    img_info = [h, w, im_scales]  # im_scales is the image scaling factor
                    outs = (np.array(img),
                            np.array(gt_box, dtype='float32'),
                            np.array(gt_cls, dtype='int32'),
                            np.array(crowd, dtype='int32'),
                            np.array(img_info, dtype='float32'),
                            np.array(img_id, dtype='int64'))
                    batch_out.append(outs)
                yield batch_out
    return reader
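The parsing step inside the reader can be tried on its own, without images or the preprocessing module. The sketch below is illustrative only: parse_line, the sample line, and the contents of label_dict are assumptions, and it expects each shape field to be valid JSON.

```python
import json

# Hypothetical mapping from label name to class id; adjust to your dataset
label_dict = {"cat": 1, "dog": 2}


def parse_line(line, label_dict):
    """Split one train.txt line into image name, boxes and class ids."""
    parts = line.rstrip("\n").split("\t")
    img_name = parts[0]
    boxes, classes = [], []
    for field in parts[1:]:
        if len(field) <= 1:
            continue  # skip empty fields left by consecutive tabs
        shape = json.loads(field)
        # The first two points are taken as the box corners (x1, y1), (x2, y2)
        (x1, y1), (x2, y2) = shape["points"][:2]
        boxes.append([float(x1), float(y1), float(x2), float(y2)])
        classes.append(label_dict[shape["label"]])
    return img_name, boxes, classes


line = '0001.jpg\t\t{"label": "cat", "points": [[10.0, 20.0], [110.0, 220.0]]}\t\n'
parsed = parse_line(line, label_dict)
print(parsed)
```

A label that is missing from label_dict raises a KeyError, so the mapping has to cover every class that appears in the annotations.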

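Note: the resulting txt dataset does not contain image paths, so before using it you still need to prepend the path to each image name in the txt file. The test files are attached below. Thanks for using them!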
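One possible way to add the paths is sketched below. The function add_image_dir, the directory data/train, and the file names are hypothetical; forward slashes are used because the image name is later recovered by splitting the path on '/'.

```python
import os
import tempfile


def add_image_dir(txt_path, out_path, image_dir):
    """Copy txt_path to out_path, prefixing each image name with image_dir."""
    prefix = image_dir.rstrip("/") + "/"
    with open(txt_path, "r", encoding="utf8") as src, \
            open(out_path, "w", encoding="utf8") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            # Everything before the first tab is the image name
            name, sep, rest = line.partition("\t")
            dst.write(prefix + name + sep + rest)


# Demo on a temporary file with one made-up line
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "train.txt")
dst_path = os.path.join(tmp_dir, "train_with_paths.txt")
with open(src_path, "w", encoding="utf8") as f:
    f.write('0001.jpg\t\t{"label": "cat", "points": [[1, 2], [3, 4]]}\t\n')

add_image_dir(src_path, dst_path, "data/train")
with open(dst_path, encoding="utf8") as f:
    result = f.read()
print(repr(result))
```

Writing to a separate output file leaves the original train.txt untouched, so the conversion can be repeated with a different directory.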

Link: /s/17Ekg-vlP2g53ZUD0LywRWg?pwd=cub3

Extraction code: cub3
