
How to Turn Images into a Dataset in TensorFlow

Date: 2021-07-27 08:26:24

Introduction:

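While working through TensorFlow examples, I noticed that many image datasets come already processed and are loaded straight from a library, for example MNIST and CIFAR-10. But how do you read your own dataset when running your own project? On one hand, TensorFlow officially provides a method: convert the images into TFRecord-format data for TensorFlow to read. On the other hand, Python and its third-party image-processing libraries offer many different ways to read images and build a dataset.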

Below I introduce two methods: 1. building a dataset with Python; 2. building a TFRecord-format dataset with TensorFlow.

I. Building a dataset with Python

The code is fairly simple; here are a few brief notes:

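1. If file names are resolved with os.path.abspath(item), the .py file must sit in the same folder as the images, and the source image path must also contain the images, otherwise an error is raised. The cause is that os.path.abspath resolves a bare file name against the current working directory, not against the folder that was listed. Building each path with os.path.join(folder, item) removes this requirement.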

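2. The program is already wrapped in a function, so all you need to do is supply the image path (and, subject to note 1, place the images alongside the .py file). The parameters are the folder path, path, and the label to assign, labels.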

Here is the code:

import os

import matplotlib.image as mpimg
import numpy as np


def make_data(path, labels):
    """Read every image in `path` and give each one the label `labels`."""

    def getAllimages(folder):
        assert os.path.exists(folder)
        assert os.path.isdir(folder)
        # Join each name to `folder`; os.path.abspath(item) alone would
        # resolve it against the current working directory instead.
        return [os.path.join(folder, item)
                for item in sorted(os.listdir(folder))
                if os.path.isfile(os.path.join(folder, item))]

    ImageList = getAllimages(path)
    # Keep the images in a list: they may differ in size, so they cannot
    # always be stacked into a single array.
    Img_data = [mpimg.imread(image_path) for image_path in ImageList]
    # One label per image.
    Img_label = np.full(len(Img_data), labels)
    return Img_data, Img_label


path = r'/home/wcy/图片'
img, label = make_data(path, 0)
print(label)

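Note: the /home/wcy/图片 directory must contain the images to be processed. If paths are built with os.path.abspath(item) rather than os.path.join(folder, item), copies of the images must also be present in the folder the .py file runs from.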
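The directory requirement above comes down to how file names returned by os.listdir are turned into full paths. The following is a minimal sketch of the difference, using only the standard library. The helper names (list_files_abspath, list_files_joined, demo) and the file name cat.png are hypothetical and exist only for illustration.

```python
import os
import tempfile


def list_files_abspath(folder):
    """Resolve names with os.path.abspath: relative to the working directory."""
    return [os.path.abspath(item) for item in sorted(os.listdir(folder))]


def list_files_joined(folder):
    """Resolve names by joining them to the folder that was listed."""
    return [os.path.join(folder, item) for item in sorted(os.listdir(folder))]


def demo():
    """Return (exists_via_abspath, exists_via_join) for one file in a temp folder."""
    start_dir = os.getcwd()
    with tempfile.TemporaryDirectory() as image_dir, \
            tempfile.TemporaryDirectory() as script_dir:
        # Hypothetical image file; only its name and location matter here.
        with open(os.path.join(image_dir, "cat.png"), "wb") as f:
            f.write(b"placeholder")
        os.chdir(script_dir)  # pretend the .py file runs from another folder
        try:
            wrong = list_files_abspath(image_dir)[0]
            right = list_files_joined(image_dir)[0]
            return os.path.exists(wrong), os.path.exists(right)
        finally:
            os.chdir(start_dir)


print(demo())  # (False, True)
```

The abspath version points at a file in the working directory that does not exist, while the joined version points at the real file, which is why joining to the folder lets the script run from anywhere.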

II. Building a TFRecord-format dataset with TensorFlow

The program has two parts: make_image_TFRecord and read_Tfrecord.

1.make_image_TFRecord.py

import os

import tensorflow as tf  # TensorFlow 1.x API (tf.python_io, queue runners)
from PIL import Image

# Location of the original images
orig_picture = os.path.join(os.getcwd(), 'image', 'test')
# Location for generated images
gen_picture = os.path.join(os.getcwd(), 'image')
# Classes to recognize (a list, so that enumerate() gives a stable label index)
classes = ['0', '1']
# Total number of samples
num_samples = 40


# Build the TFRecords data
def create_record():
    writer = tf.python_io.TFRecordWriter("test.tfrecords")
    for index, name in enumerate(classes):
        class_path = os.path.join(orig_picture, name)
        for img_name in os.listdir(class_path):
            img_path = os.path.join(class_path, img_name)
            img = Image.open(img_path)
            # Force 3 channels so the bytes match the [32, 32, 3] reshape below.
            img = img.convert("RGB")
            img = img.resize((32, 32))  # target image size
            # To use grayscale instead: img = img.convert("L")
            # (the reshape in read_and_decode must then change as well)
            img_raw = img.tobytes()  # convert the image to raw bytes
            example = tf.train.Example(features=tf.train.Features(feature={
                "label": tf.train.Feature(
                    int64_list=tf.train.Int64List(value=[index])),
                "img_raw": tf.train.Feature(
                    bytes_list=tf.train.BytesList(value=[img_raw])),
            }))
            writer.write(example.SerializeToString())
    writer.close()


# =====================================================================
def read_and_decode(filename, is_batch):
    # Create a filename queue with no limit on the number of reads
    filename_queue = tf.train.string_input_producer([filename])
    # Create a reader for the file queue
    reader = tf.TFRecordReader()
    # Read one serialized example from the file queue
    _, serialized_example = reader.read(filename_queue)
    # Parse the serialized example into its features
    features = tf.parse_single_example(
        serialized_example,
        features={
            'label': tf.FixedLenFeature([], tf.int64),
            'img_raw': tf.FixedLenFeature([], tf.string),
        })
    label = features['label']
    img = features['img_raw']
    img = tf.decode_raw(img, tf.uint8)
    img = tf.reshape(img, [32, 32, 3])
    # img = tf.cast(img, tf.float32) * (1. / 255) - 0.5
    label = tf.cast(label, tf.int32)
    if is_batch:
        batch_size = 3
        min_after_dequeue = 10
        capacity = min_after_dequeue + 3 * batch_size
        img, label = tf.train.shuffle_batch([img, label],
                                            batch_size=batch_size,
                                            num_threads=3,
                                            capacity=capacity,
                                            min_after_dequeue=min_after_dequeue)
    return img, label
# =====================================================================
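The label written into each record is the position of the class name when iterating over classes, so that order needs to be the same on every run. Below is a minimal sketch, under the assumption that class names are taken from the sub-folder names on disk and sorted. The helpers class_index and labelled_files are hypothetical and are not part of the original scripts.

```python
import os
import tempfile


def class_index(root):
    """Map each class sub-folder name to an integer label, in sorted order."""
    names = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    return {name: index for index, name in enumerate(names)}


def labelled_files(root):
    """Return (path, label) pairs for every file under each class folder."""
    pairs = []
    for name, index in class_index(root).items():
        class_path = os.path.join(root, name)
        for file_name in sorted(os.listdir(class_path)):
            pairs.append((os.path.join(class_path, file_name), index))
    return pairs


# Build a small throwaway folder tree shaped like image/test/<class>/<file>.
with tempfile.TemporaryDirectory() as root:
    for name, files in {"1": ["c.jpg"], "0": ["a.jpg", "b.jpg"]}.items():
        os.mkdir(os.path.join(root, name))
        for file_name in files:
            open(os.path.join(root, name, file_name), "wb").close()
    mapping = class_index(root)
    labels = [label for _, label in labelled_files(root)]

print(mapping)  # {'0': 0, '1': 1}
print(labels)   # [0, 0, 1]
```

Sorting makes the mapping independent of the order in which the file system or a Python set happens to return the names.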

2.read_Tfrecord.py

import tensorflow as tf  # TensorFlow 1.x API
from PIL import Image

from make_image_TFRecord import create_record
from make_image_TFRecord import read_and_decode

num_samples = 40

create_record()
train_image, train_label = read_and_decode('test.tfrecords', is_batch=False)

# Initialize variables
init_op = tf.group(tf.global_variables_initializer(),
                   tf.local_variables_initializer())

# Create a session to run the graph and output the results
with tf.Session() as sess:
    sess.run(init_op)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)
    for i in range(num_samples):
        # Fetch one image and its label inside the session
        example, lab = sess.run([train_image, train_label])
        img = Image.fromarray(example, 'RGB')  # Image is the PIL class imported above
        # To save the image (gen_picture would need to be imported as well):
        # img.save(gen_picture + '/' + str(i) + 'samples' + str(lab) + '.jpg')
        # print(example, lab)
        print(lab)
    coord.request_stop()
    coord.join(threads)
# The with block closes the session, so no explicit sess.close() is needed.
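Both scripts rely on the raw bytes having exactly the layout that the [32, 32, 3] reshape expects: row by row, with the three channel values of each pixel stored next to each other. The sketch below illustrates that layout in plain Python with a synthetic byte string standing in for the output of img.tobytes(). The helper names expected_length and pixel_at are hypothetical.

```python
WIDTH, HEIGHT, CHANNELS = 32, 32, 3  # sizes used in the scripts above


def expected_length(width=WIDTH, height=HEIGHT, channels=CHANNELS):
    """Number of bytes in one raw uint8 image of the given shape."""
    return width * height * channels


def pixel_at(raw, row, col, width=WIDTH, channels=CHANNELS):
    """Return the channel values of one pixel from row-major raw bytes."""
    start = (row * width + col) * channels
    return tuple(raw[start:start + channels])


# Synthetic image: every pixel is (row, col, 255).
raw = bytes(
    value
    for row in range(HEIGHT)
    for col in range(WIDTH)
    for value in (row, col, 255)
)

print(len(raw) == expected_length())  # True
print(pixel_at(raw, 5, 7))            # (5, 7, 255)
```

If the stored image had a different size or channel count (for example a grayscale image with one channel), the byte count would no longer be 32 * 32 * 3 = 3072 and the reshape would fail.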

Running the first program's create_record() generates a .tfrecords file, which the second program then reads directly.
