TensorFlow and OpenCV: Reading an Image, Performing Simple Operations, and Displaying It

Date: 2020-12-30

Tags: opencv, tensorflow


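This article is the first in a series of reading notes on the OpenCV 2 Computer Vision Application Programming Cookbook. In these notes, the code from each chapter is rewritten in Python.

Setting up OpenCV for Python is not covered here.

Note that OpenCV for Python is now bound through NumPy, so some working knowledge of NumPy is required to use it!

An image is a matrix, and in OpenCV for Python an image is simply a NumPy array!

To read an image, first import the OpenCV package:

import cv2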

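Reading and Displaying an Image

Python does not require variable declarations, so there is no need for the C++ cv::Mat xxxxx. This is all it takes:

img = cv2.imread("D:\\cat.jpg")

OpenCV currently supports reading common formats such as bmp, jpg, png, and tiff. See the OpenCV reference documentation for more details.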

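Next, create a window:

cv2.namedWindow("Image")

Then display the image in the window:

cv2.imshow("Image", img)

Finally, add one more line:

cv2.waitKey(0)

Without this last line, the window stops responding when the script is run in IDLE, and when it is run from the command line the window just flashes by and disappears.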

The complete program is:

import cv2

img = cv2.imread("D:\\cat.jpg")
cv2.namedWindow("Image")
cv2.imshow("Image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Releasing the windows at the end is a good habit!

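Creating/Copying Images

The new OpenCV interface has no CreateImage function; that is, there is no cv2.CreateImage. To create an image, use a NumPy function instead (NumPy is a required dependency of the OpenCV-Python bindings):

emptyImage = np.zeros(img.shape, np.uint8)

In the new OpenCV-Python bindings, an image's size and channel information are exposed as attributes of the NumPy array. Printing img.shape gives (500, 375, 3) for the cat.jpg that ships with OpenCV, used here as the example: 500 rows (height), 375 columns (width), and 3 channels. The final 3 indicates a three-channel color image; note that OpenCV stores the channels in BGR order rather than RGB.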
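The shape convention can be checked without any image file. This is a small sketch that uses a zero-filled array as a stand-in for what cv2.imread would return for a 500 x 375 color image; the variable names are illustrative only.

```python
import numpy as np

# Stand-in for the array cv2.imread would return for a 500 x 375 color image.
img = np.zeros((500, 375, 3), np.uint8)

# The shape is ordered (rows, columns, channels), i.e. height comes first.
height, width, channels = img.shape
print(height, width, channels)  # 500 375 3
print(img.dtype)                # uint8

# A single-channel (grayscale) image of the same size is just a 2-D array.
gray = np.zeros(img.shape[:2], np.uint8)
print(gray.shape)               # (500, 375)
```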

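You can also copy an existing image to obtain a new one.

emptyImage2 = img.copy()

If you don't mind the extra work, you can also use cvtColor to derive a new image from the original. Note that with cv2.COLOR_BGR2GRAY the result is a single-channel grayscale version rather than an exact copy.

emptyImage3 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# emptyImage3[...] = 0

The commented-out emptyImage3[...] = 0 would turn it into a blank black image.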
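Because images are NumPy arrays, the difference between copying an image and merely assigning it to another name matters. The sketch below uses a tiny synthetic array instead of a real image to show that copy() yields independent pixel data, and that the [...] = 0 assignment blanks an array in place.

```python
import numpy as np

img = np.full((2, 2, 3), 255, np.uint8)  # small all-white stand-in image

alias = img              # plain assignment: both names refer to the same array
duplicate = img.copy()   # independent copy with its own pixel buffer

img[...] = 0             # blank the original in place

print(alias.max())       # 0, the alias sees the change
print(duplicate.max())   # 255, the copy is unaffected
```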

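Saving Images

Saving an image is simple: just use cv2.imwrite.

cv2.imwrite("D:\\cat2.jpg", img)

The first argument is the path and file name to save to, and the second is the image matrix. imwrite() also takes an optional third argument, as follows:

cv2.imwrite("D:\\cat2.jpg", img, [int(cv2.IMWRITE_JPEG_QUALITY), 5])

The third argument is format-specific. For JPEG, it specifies the image quality as an integer from 0 to 100, with a default of 95. Note that under Python 2, cv2.IMWRITE_JPEG_QUALITY is of type long and must be converted to int; under Python 3 the constant is already an int, so the conversion is harmless.

[Figure: the same image saved at two different JPEG quality levels]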

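For PNG, the third argument is the compression level, cv2.IMWRITE_PNG_COMPRESSION, from 0 to 9. The higher the compression level, the smaller the resulting file, at the cost of longer compression time. The default level is 3:

cv2.imwrite("./cat.png", img, [int(cv2.IMWRITE_PNG_COMPRESSION), 0])
cv2.imwrite("./cat2.png", img, [int(cv2.IMWRITE_PNG_COMPRESSION), 9])

[Figure: sizes of the saved files]

There is one more supported format, which is not commonly used.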

The complete code is:

import cv2
import numpy as np

img = cv2.imread("./cat.jpg")
emptyImage = np.zeros(img.shape, np.uint8)

emptyImage2 = img.copy()

emptyImage3 = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# emptyImage3[...] = 0

cv2.imshow("EmptyImage", emptyImage)
cv2.imshow("Image", img)
cv2.imshow("EmptyImage2", emptyImage2)
cv2.imshow("EmptyImage3", emptyImage3)
cv2.imwrite("./cat2.jpg", img, [int(cv2.IMWRITE_JPEG_QUALITY), 5])
cv2.imwrite("./cat3.jpg", img, [int(cv2.IMWRITE_JPEG_QUALITY), 100])
cv2.imwrite("./cat.png", img, [int(cv2.IMWRITE_PNG_COMPRESSION), 0])
cv2.imwrite("./cat2.png", img, [int(cv2.IMWRITE_PNG_COMPRESSION), 9])
cv2.waitKey(0)
cv2.destroyAllWindows()

Reference:
OpenCV Reference Manual

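Images are a form of information that people readily enjoy. As the saying goes, "seeing once is better than hearing a hundred times," and sometimes a single picture is worth a thousand words. Image processing uses a computer to apply (linear or nonlinear) transformations to digitized images in order to obtain a better result. Photoshop and beauty-camera apps are applications built on image processing techniques. One of the most important application areas of deep learning is computer vision (CV). Historically, both MNIST handwritten digit recognition and ImageNet large-scale image recognition have benefited from deep learning models, which achieved higher accuracy than traditional methods. Starting with AlexNet in 2012, later models such as VGG, GoogLeNet, and ResNet kept breaking the ImageNet recognition accuracy record, eventually surpassing human-level performance. To get good recognition results, besides using a better model, preprocessing the dataset is also very important. The most common methods are rescaling, random cropping, random flipping, and color transformations.

This article describes how to preprocess image data with TensorFlow and how to visualize image data with the tensorboard tool. The material here comes up often when implementing image recognition and object detection with TensorFlow.

First, here is the input image, a cat: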

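Code for reading image data with TensorFlow (this uses the TensorFlow 1.x queue-based input API):

import tensorflow as tf

reader = tf.WholeFileReader()
key, value = reader.read(tf.train.string_input_producer(['cat.jpg']))
image0 = tf.image.decode_jpeg(value)

Readers who have used Caffe may find the image above very familiar (it is located at caffe/examples/images/cat.jpg). The original image size is 360 x 480.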

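Image Scaling

Code:

resized_image = tf.image.resize_images(image0, [256, 256],
                                       method=tf.image.ResizeMethod.AREA)

The method argument has four options:

ResizeMethod.BILINEAR: bilinear interpolation

ResizeMethod.NEAREST_NEIGHBOR: nearest-neighbor interpolation

ResizeMethod.BICUBIC: bicubic interpolation

ResizeMethod.AREA: area interpolation

Try each of them and compare the scaling results.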
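To give a feel for what the simplest of these methods does, here is a rough NumPy sketch of nearest-neighbor resizing. It is an illustration of the idea only: the function resize_nearest is hypothetical, and TensorFlow's own implementation may map output coordinates to source pixels slightly differently.

```python
import numpy as np


def resize_nearest(image, new_height, new_width):
    """Resize an H x W x C array by picking one source pixel per output pixel."""
    height, width = image.shape[:2]
    # Map each output row/column index back to a source row/column index.
    rows = np.arange(new_height) * height // new_height
    cols = np.arange(new_width) * width // new_width
    # Broadcasting rows (as a column vector) against cols selects a full grid.
    return image[rows[:, None], cols]


image = np.arange(4 * 6 * 3, dtype=np.uint8).reshape(4, 6, 3)
resized = resize_nearest(image, 2, 3)
print(resized.shape)  # (2, 3, 3)
```

The other methods differ in how they combine neighboring source pixels instead of copying a single one, which is why they generally produce smoother results.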

Image Cropping

Code:

cropped_image = tf.image.crop_to_bounding_box(image0, 20, 20, 256, 256)
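The four numbers passed to crop_to_bounding_box are, in order, the offset height, offset width, target height, and target width. On a plain array, the same crop is just a slice. The sketch below shows this with NumPy on a zero-filled stand-in for the decoded 360 x 480 image; the variable names are illustrative.

```python
import numpy as np

# Stand-in for the decoded 360 x 480 RGB image.
image = np.zeros((360, 480, 3), dtype=np.uint8)

offset_height, offset_width = 20, 20
target_height, target_width = 256, 256

# Rows first, then columns; the channel axis is left untouched.
cropped = image[offset_height:offset_height + target_height,
                offset_width:offset_width + target_width]
print(cropped.shape)  # (256, 256, 3)
```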

Horizontal Flip

Code:

flipped_image = tf.image.flip_left_right(image0)

You can also flip the image vertically:

flipped_image = tf.image.flip_up_down(image0)
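In array terms, a left-right flip reverses the width axis and an up-down flip reverses the height axis. The following NumPy sketch, on a tiny 2 x 3 single-channel array, is meant only to make the two operations concrete.

```python
import numpy as np

# A 2 x 3 image with one channel; pixel values are 0..5 in reading order.
image = np.arange(6).reshape(2, 3, 1)

left_right = image[:, ::-1]  # reverse the width axis
up_down = image[::-1]        # reverse the height axis

print(image[:, :, 0].tolist())       # [[0, 1, 2], [3, 4, 5]]
print(left_right[:, :, 0].tolist())  # [[2, 1, 0], [5, 4, 3]]
print(up_down[:, :, 0].tolist())     # [[3, 4, 5], [0, 1, 2]]
```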

Image Rotation

Code:

rotated_image = tf.image.rot90(image0, k=1)

Here k is the number of 90-degree rotations to apply. Try rotating the original image by 180 and 270 degrees.
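NumPy has a function with the same name and the same meaning of k, which makes it easy to see what happens to the pixel layout. tf.image.rot90 is documented as rotating counter-clockwise, and np.rot90 does the same by default. The sketch below is a NumPy illustration only, not TensorFlow code.

```python
import numpy as np

# A 2 x 3 image with one channel; pixel values are 0..5 in reading order.
image = np.arange(6).reshape(2, 3, 1)

rot_90 = np.rot90(image, k=1)   # one counter-clockwise quarter turn
rot_180 = np.rot90(image, k=2)  # two quarter turns

print(rot_90.shape)                # (3, 2, 1): height and width swap
print(rot_90[:, :, 0].tolist())    # [[2, 5], [1, 4], [0, 3]]
print(rot_180[:, :, 0].tolist())   # [[5, 4, 3], [2, 1, 0]]
```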

Grayscale Conversion

Code:

grayed_image = tf.image.rgb_to_grayscale(image0)
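Grayscale conversion is typically a weighted sum of the three color channels. The sketch below uses the common luma weights 0.299, 0.587, and 0.114 as an assumption for illustration; the exact coefficients and rounding used inside TensorFlow may differ slightly. Like rgb_to_grayscale, it keeps a trailing channel axis of size 1.

```python
import numpy as np

# One row of four RGB pixels: red, green, blue, white.
rgb = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 255]]],
               dtype=np.uint8)

weights = np.array([0.299, 0.587, 0.114])  # assumed luma weights for R, G, B

# Weighted sum over the channel axis, rounded back to 8-bit values.
gray = np.rint(rgb.astype(np.float64) @ weights).astype(np.uint8)
gray = gray[..., None]  # keep a channel axis of size 1

print(gray.shape)               # (1, 4, 1)
print(gray[0, :, 0].tolist())   # [76, 150, 29, 255]
```

Green contributes the most to the result and blue the least, which roughly matches how bright each primary appears to the eye.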

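As shown above, implementing these image preprocessing steps with TensorFlow is very simple. TensorFlow also provides APIs for handling the bounding boxes used in object detection; interested readers can consult the API documentation (https://www.tensorflow.org/versions/r1.0/api_docs/Python/image/working_with_bounding_boxes).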

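To make the preprocessing results easy to inspect, you can visualize them with the tensorboard tool that ships with TensorFlow.

Usage is fairly simple: write the images to a summary with tf.summary.image. The corresponding code is:

# tf.summary.image expects a 4-D batch, so tf.expand_dims adds a leading batch axis.
img_resize_summary = tf.summary.image('image resized', tf.expand_dims(resized_image, 0))
cropped_image_summary = tf.summary.image('image cropped', tf.expand_dims(cropped_image, 0))
flipped_image_summary = tf.summary.image('image flipped', tf.expand_dims(flipped_image, 0))
rotated_image_summary = tf.summary.image('image rotated', tf.expand_dims(rotated_image, 0))
grayed_image_summary = tf.summary.image('image grayed', tf.expand_dims(grayed_image, 0))
merged = tf.summary.merge_all()

with tf.Session() as sess:
    summary_writer = tf.summary.FileWriter('/tmp/tensorboard', sess.graph)
    # The queue-based file reader blocks until its queue runners are started.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    summary_all = sess.run(merged)
    summary_writer.add_summary(summary_all, 0)
    summary_writer.close()
    coord.request_stop()
    coord.join(threads)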

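Running this program generates the summary under /tmp/tensorboard. Then start the tensorboard service from the command line (for example, with tensorboard --logdir=/tmp/tensorboard).

Open a browser and go to 127.0.0.1:6006 to view the tensorboard page (the Firefox that ships with Ubuntu does not display the images when opening tensorboard; switching to Chrome works).

[Figure: TensorBoard image visualization results]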

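TensorFlow and OpenCV: reading an image, performing simple operations, and displaying it

1. Read an image with OpenCV, initialize it as a tensor with tf.Variable, transpose it in TensorFlow, then display the transposed result with OpenCV

import tensorflow as tf
import cv2

file_path = "/home/lei/Desktop/"
filename = "MarshOrchid.jpg"

# Read the file as a 3-channel color image (flag 1)
image = cv2.imread(file_path + filename, 1)
cv2.namedWindow('image', 0)
cv2.imshow('image', image)

# Create a TensorFlow Variable
x = tf.Variable(image, name='x')

# tf.global_variables_initializer replaces the deprecated tf.initialize_all_variables
model = tf.global_variables_initializer()

with tf.Session() as session:
    # Swap the height and width axes; the channel axis stays last
    x = tf.transpose(x, perm=[1, 0, 2])
    session.run(model)
    result = session.run(x)

cv2.namedWindow('result', 0)
cv2.imshow('result', result)
cv2.waitKey(0)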
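What perm=[1, 0, 2] does to the array can be checked without TensorFlow or an image file. The sketch below is a NumPy illustration on a synthetic array: the height and width axes are swapped, the channel axis is untouched, and a pixel at (row, column) ends up at (column, row).

```python
import numpy as np

# Stand-in for a 360 x 480 color image.
image = np.zeros((360, 480, 3), np.uint8)
image[0, 479] = (1, 2, 3)  # mark the top-right pixel

transposed = np.transpose(image, (1, 0, 2))

print(transposed.shape)              # (480, 360, 3)
print(transposed[479, 0].tolist())   # [1, 2, 3]: now the bottom-left pixel
```

Note that a transpose mirrors the image across its main diagonal, so it is not the same as a 90-degree rotation.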

2. Read an image with OpenCV, load it into TensorFlow through a tf.placeholder symbolic variable, crop it in TensorFlow, then display the cropped result with OpenCV

import tensorflow as tf  
import cv2  
  
# First, load the image again  
cv2.waitKey(0)  
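The listing above is incomplete in the source. As a rough sketch of the approach the heading describes, and not the original author's code, the following feeds an array into a tf.placeholder and crops it with tf.slice. It assumes a TensorFlow installation that provides tf.compat.v1 (TensorFlow 2.x, or a late 1.x release), uses a synthetic array in place of the file read by cv2.imread, and picks the crop offsets arbitrarily.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # use TF 1.x style graphs and sessions

# Synthetic stand-in for the H x W x 3 uint8 array cv2.imread would return.
raw_image_data = np.zeros((64, 48, 3), dtype=np.uint8)

# The placeholder leaves height and width unspecified so any image size fits.
image = tf1.placeholder(tf.uint8, [None, None, 3])

# Keep 20 rows starting at row 10; a size of -1 keeps the rest of that axis.
cropped = tf.slice(image, [10, 0, 0], [20, -1, -1])

with tf1.Session() as session:
    # The actual pixel data is supplied only at run time through feed_dict.
    result = session.run(cropped, feed_dict={image: raw_image_data})

print(result.shape)  # (20, 48, 3)
```

Unlike the tf.Variable version, the placeholder keeps the image data out of the graph itself, so the same graph can be run on different images.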
