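AlexNet (by Alex Krizhevsky, winner of ILSVRC 2012) is well suited to image classification. Its architecture diagram is read left to right and top to bottom, with related layers grouped together. As the input moves deeper into the network, its height and width shrink while its depth grows. Trading spatial resolution for depth in this way keeps the network's computational cost down.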

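The model is trained on the Stanford Dogs dataset from the Stanford computer vision site: http://vision.stanford.edu/aditya86/ImageNetDogs/. Download the data and extract it into an imagenet-dogs directory in the same path as the model code. It contains images of 120 dog breeds. 80% of the images are used for training and 20% for testing. A production model should also hold out raw data for cross-validation. Every image is a JPEG (RGB), and the sizes vary.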

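Converting the images to TFRecord files helps speed up training and simplifies matching images with their labels. Keeping the training and testing images separate also allows the model to be tested repeatedly over time from checkpoint files. The conversion turns the color space into grayscale, resizes every image to a uniform size, and attaches the label to each image. This preprocessing runs only once before training, but it takes a fairly long time.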
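The grayscale step can be illustrated without TensorFlow. The sketch below is only an illustration: it assumes the common ITU-R BT.601 luma weights (0.2989, 0.5870, 0.1140), and the helper names `rgb_to_gray` and `image_to_gray` are made up for this example rather than taken from the original code.

```python
# Assumed luma weights (ITU-R BT.601); the original post does not state them.
LUMA_WEIGHTS = (0.2989, 0.5870, 0.1140)


def rgb_to_gray(pixel):
    """Collapse one (r, g, b) pixel into a single brightness value."""
    r, g, b = pixel
    return LUMA_WEIGHTS[0] * r + LUMA_WEIGHTS[1] * g + LUMA_WEIGHTS[2] * b


def image_to_gray(image):
    """Convert a height x width grid of RGB pixels to a grid of gray values."""
    return [[rgb_to_gray(pixel) for pixel in row] for row in image]


# A tiny 1 x 2 "image": one white pixel and one pure red pixel.
gray = image_to_gray([[(255, 255, 255), (255, 0, 0)]])
```

Three channels become one, which is why grayscale input needs roughly a third of the memory of RGB input.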

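glob.glob lists the files under a given path, which shows how the dataset is laid out. The "*" wildcard enables fuzzy matching. The 8 digits in each file name are the WordNet ID of the corresponding ImageNet category. The ImageNet website can look up image details by WordNet ID (ImageNet Tree View).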
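As a rough illustration of the wildcard and of the WordNet ID in the names, the sketch below applies the same pattern to a few in-memory strings with `fnmatch` instead of touching the disk. The sample paths and the helper `wordnet_id` are hypothetical and only mimic the layout described here.

```python
import fnmatch
import re

# Hypothetical sample paths laid out like the extracted dataset.
sample_paths = [
    "./imagenet-dogs/n02085620-Chihuahua/n02085620_10074.jpg",
    "./imagenet-dogs/n02085620-Chihuahua/n02085620_10131.jpg",
    "./imagenet-dogs/n02086240-Shih-Tzu/n02086240_1011.jpg",
    "./imagenet-dogs/README.txt",
]

# Same wildcard pattern as the glob.glob call, applied to plain strings.
pattern = "./imagenet-dogs/n02*/*.jpg"
matched = fnmatch.filter(sample_paths, pattern)


def wordnet_id(path):
    """Return the 8 digits that follow 'n' in the breed folder name, or None."""
    folder = path.split("/")[2]
    match = re.match(r"n(\d{8})-", folder)
    return match.group(1) if match else None


ids = sorted(set(wordnet_id(path) for path in matched))
```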

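Each file name is broken into a breed and the corresponding file name, where the breed is the folder name. The images are then grouped by breed. While enumerating the images of each breed, 20% of them are moved into the test set. A check then confirms that each breed's test images make up at least 18% of all its images. The directories and images are organized into two dictionaries keyed by breed, which together contain every image of every breed. Holding the classified images in dictionaries simplifies selecting the images of a class and sorting them.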

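During preprocessing, the code walks through all the classified images and opens each file in the lists. It fills TFRecord files with the images in dataset and includes the class with each one. The keys of dataset are the labels and the values are the file lists. record_location holds the TFRecord output path. While enumerating dataset, the current index drives the file sharding: every 100 images, the examples are written to a new TFRecord file, which speeds up writing. Files that TensorFlow cannot recognize as JPEG images are skipped with try/except. Converting to grayscale cuts computation and memory use. tf.cast is needed because the resized image holds floats that have not been scaled into the [0,1) range, so it is cast back to uint8 before being written. The labels are stored as strings here; converting them to integer indices or one-hot encoded rank-1 tensors would be more efficient.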
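The sharding rule is easy to check in isolation. The helper below, `shard_filename`, is a hypothetical name for this sketch; it reproduces the naming in which a new file is opened whenever the running index is a multiple of 100 and the file is named after that index.

```python
def shard_filename(record_location, index, shard_size=100):
    """Return the TFRecord file that the example at `index` is written to."""
    shard_start = (index // shard_size) * shard_size
    return "{0}-{1}.tfrecords".format(record_location, shard_start)


# Examples 0..99 share one file, 100..199 the next, and so on.
first = shard_filename("./output/training-images/training-image", 0)
later = shard_filename("./output/training-images/training-image", 250)
```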

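Each image is opened, converted to grayscale, resized, and added to a TFRecord file. tf.image.resize_images resizes every image to the same size without regard to aspect ratio, so the result is distorted. Cropping or border padding would preserve the aspect ratio.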
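To make the padding alternative concrete, here is a small sketch of the arithmetic only; it does not resize any pixels. The function name `fit_with_padding` is an assumption of this example, and the 250 x 151 target is the size used in the code of this post.

```python
def fit_with_padding(height, width, target_height=250, target_width=151):
    """Scale an image to fit inside the target while keeping its aspect ratio.

    Returns the scaled (height, width) and the (vertical, horizontal) padding
    still needed to reach the target size.
    """
    scale = min(target_height / float(height), target_width / float(width))
    new_height = max(1, int(round(height * scale)))
    new_width = max(1, int(round(width * scale)))
    return (new_height, new_width), (target_height - new_height,
                                     target_width - new_width)


# A 500 x 375 image is limited by its width: it becomes 201 x 151,
# leaving 49 rows to fill with a border.
scaled, padding = fit_with_padding(500, 375)
```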

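Images are read back from the TFRecord files a few at a time, together with their labels. Reshaping the images helps with training and with visualizing the output. All TFRecord files in the training directory are matched to load the training images. Each TFRecord file contains multiple images, and tf.parse_single_example extracts only a single example from a file. Batching makes it possible to train on several images at once rather than one at a time, provided there is enough system memory.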
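The idea behind shuffled batching can be sketched in plain Python. This is only an analogy for tf.train.shuffle_batch, which is queue-based and multi-threaded; the generator `shuffle_batches` and its end-of-input draining behavior are assumptions of this example, not TensorFlow's implementation.

```python
import random


def shuffle_batches(examples, batch_size=3, min_after_dequeue=10, seed=0):
    """Yield shuffled batches while keeping a buffer of pending examples."""
    rng = random.Random(seed)
    buffer, batch = [], []
    for example in examples:
        buffer.append(example)
        # Only dequeue once the buffer is full enough to mix examples well.
        if len(buffer) > min_after_dequeue:
            batch.append(buffer.pop(rng.randrange(len(buffer))))
            if len(batch) == batch_size:
                yield batch
                batch = []
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    for example in buffer:
        batch.append(example)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


batches = list(shuffle_batches(range(20)))
```

A larger min_after_dequeue mixes the examples more thoroughly but holds more of them in memory at once.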

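The images are converted to floating-point grayscale values in [0,1) to match the input that convolution2d expects. tf.contrib.layers.convolution2d creates the first layer of the model. The convolution leaves the first (batch) dimension unchanged; with a stride of 2 the two middle dimensions (height and width) shrink, and the last dimension becomes the number of filters. weights_initializer is set to a random normal, so the first set of filters is filled with normally distributed random numbers. The filters are marked trainable so that, as information flows through the network, the weights are adjusted to improve the model's accuracy.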

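max_pool downsamples the output. With ksize and strides both set to [1,2,2,1], the height and width of the convolution output are halved. The output shrinks, but the number of filters (output channels) and the size of the image batch stay the same; the components that are reduced are the ones tied to the height and width of the image (filter). The second convolution layer has more output channels: its filter count is twice that of the first layer. Stacking several convolution and pooling layers reduces the input's height and width while increasing its depth. Many architectures have more than 5 convolution and pooling layers. They take longer to train and debug, but can match more, and more complex, patterns.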
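The shapes can be followed by hand. The sketch below assumes 'SAME' padding throughout, where each spatial size becomes ceil(size / stride); the helper names are invented for this example, and the layer settings (strides 2 and 1, 32 and 64 filters, batch of 3 images of 250 x 151 x 1) are read off the code in this post.

```python
import math


def same_padding_output(size, stride):
    """Spatial output size under 'SAME' padding: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


def layer_shape(shape, stride, channels=None):
    """Shape after a conv or pool layer; pooling keeps the channel count."""
    batch, height, width, depth = shape
    return (batch,
            same_padding_output(height, stride),
            same_padding_output(width, stride),
            depth if channels is None else channels)


images = (3, 250, 151, 1)
conv1 = layer_shape(images, stride=2, channels=32)  # (3, 125, 76, 32)
pool1 = layer_shape(conv1, stride=2)                # (3, 63, 38, 32)
conv2 = layer_shape(pool1, stride=1, channels=64)   # (3, 63, 38, 64)
pool2 = layer_shape(conv2, stride=2)                # (3, 32, 19, 64)
values_per_image = pool2[1] * pool2[2] * pool2[3]   # 38912
```

The final count, 38912 values per image, is the input size that the first fully connected layer's weight initializer uses.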

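Every point in the image is then fully connected to the output neurons. Because softmax is used later, the fully connected layer needs a rank-2 tensor: the first dimension separates the images, and the second is a rank-1 tensor holding each image's input. In tf.reshape, -1 means "use all the remaining dimensions", which flattens the last pooling layer into one large rank-1 tensor per image.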
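How -1 is resolved can be shown with a few lines of arithmetic. The function `infer_shape` is a hypothetical helper that mirrors the rule, not TensorFlow's code; the numbers assume a batch of 3 images with 38912 pooled values each, the size the fully connected layer in this post expects.

```python
def infer_shape(total_size, shape):
    """Replace a single -1 in `shape` with whatever size makes the total fit."""
    if shape.count(-1) > 1:
        raise ValueError("only one dimension may be -1")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if total_size % known != 0:
        raise ValueError("total size is not divisible by the known dimensions")
    return [total_size // known if dim == -1 else dim for dim in shape]


# 3 images, each with 38912 values after the last pooling layer.
flattened_shape = infer_shape(3 * 38912, [3, -1])
```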

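With the pooling layer flattened, the current state of the network is combined with the fully connected layers that make the prediction. weights_initializer accepts a callable, and here a lambda returns a truncated normal distribution with a specified standard deviation. dropout reduces the importance of individual neurons in the model. The final tf.contrib.layers.fully_connected layer connects everything from the preceding layers to the classes being trained, so that every pixel is associated with the classes. At each step the network has turned the input image into filtered outputs of smaller size, and the filters are matched against the labels. This reduces the computation needed to train and test the network and makes the output more general.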
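Dropout itself is a small operation. The sketch below is the usual "inverted dropout" formulation written in plain Python, as an illustration rather than TensorFlow's implementation; `dropout` here is a stand-in function, and its second argument is treated as the probability of keeping a value, which is how TensorFlow 1.x interprets the second argument of tf.nn.dropout.

```python
import random


def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and rescale the survivors."""
    return [value / keep_prob if rng.random() < keep_prob else 0.0
            for value in values]


rng = random.Random(0)
activations = [1.0] * 1000
dropped = dropout(activations, keep_prob=0.5, rng=rng)
```

Survivors are scaled by 1 / keep_prob so that the expected sum of the layer's output stays the same. Read this way, the value 0.1 passed in the code of this post keeps only about 10% of the activations.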

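The true labels of the training data and the model's predictions are fed to the training optimizer (which optimizes the weights of every layer) to compute the model's loss. Training runs for a number of iterations, each intended to improve the model's accuracy. Most classification functions (such as tf.nn.softmax) require numeric labels, so each label is converted to an integer that is its index in the list of all classes. The list of classes is built with map from the directory listing. tf.map_fn applies a given function to each element of a tensor in the dataflow graph; here it matches each label and returns its index in the class list, producing a rank-1 tensor containing only the index of each label. tf.nn.softmax then makes its predictions in terms of these indices.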
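The label lookup and the loss can be reproduced on toy data. This is a sketch under stated assumptions: the breed names and logits are invented, and the three helpers are plain-Python stand-ins for what tf.map_fn, tf.nn.softmax, and sparse softmax cross-entropy compute.

```python
import math


def label_indices(all_labels, batch_labels):
    """Map each string label to its position in the list of all classes."""
    lookup = dict((label, i) for i, label in enumerate(all_labels))
    return [lookup[label] for label in batch_labels]


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_softmax_cross_entropy(logits_batch, indices):
    """Mean negative log-probability assigned to the true class."""
    losses = [-math.log(softmax(logits)[index])
              for logits, index in zip(logits_batch, indices)]
    return sum(losses) / len(losses)


# Invented example with 3 classes and a batch of 2 images.
classes = ["n02085620-Chihuahua", "n02085782-Japanese_spaniel", "n02086240-Shih-Tzu"]
batch = ["n02086240-Shih-Tzu", "n02085620-Chihuahua"]
indices = label_indices(classes, batch)
loss = sparse_softmax_cross_entropy([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], indices)
```

With all-zero logits every class gets probability 1/3, so the loss equals log(3), the value an untrained 3-class model would start near.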

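When debugging a CNN, watch how the filters (convolution kernels) change on each iteration. In a well-designed CNN, the input weights are randomly initialized when the first convolution layer starts working. The weights are activated by the images, so the output of the activation function (the feature map) is random as well. Visualized, a feature map looks similar to the original image but with static applied; the static comes from all the weights firing at random. After many iterations the weights are adjusted to fit the training feedback and the filters become more uniform. Once the network has converged, the filters resemble distinct small patterns found in the images. tf.image_summary gives a simple view of the trained filters and feature maps. The image summary output of the dataflow graph gives an overall picture of the filters in use and of the feature maps for the input images. With TensorDebugger, the filter changes over the iterations can be viewed as an animated GIF.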
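Before a filter can be viewed as an image, its weights have to be mapped onto a pixel range. The helper below, `to_pixel_range`, is a hypothetical min-max scaling written for this illustration; it is not what tf.image_summary does internally, only one simple way to prepare weights for display.

```python
def to_pixel_range(weights):
    """Linearly rescale a 2-D grid of weights to integers in 0..255."""
    flat = [w for row in weights for w in row]
    low, high = min(flat), max(flat)
    if high == low:
        # A constant filter carries no contrast; show it as black.
        return [[0 for _ in row] for row in weights]
    scale = 255.0 / (high - low)
    return [[int(round((w - low) * scale)) for w in row] for row in weights]


# Invented 2 x 2 filter with weights between -1 and 1.
pixels = to_pixel_range([[-1.0, 0.25], [0.5, 1.0]])
```

Saving such grids after each iteration and comparing them is one way to watch random static give way to recognizable patterns.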

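Text input is stored in a SparseTensor, where most components are 0. A CNN works on dense input, where every value matters and most components are nonzero.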
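The difference between the two representations can be sketched with plain lists. The helpers `to_dense` and `nonzero_fraction` are illustrative names only; a real SparseTensor stores indices, values, and a dense shape in a similar spirit.

```python
def to_dense(indices, values, size):
    """Expand (index, value) pairs into a full list, filling gaps with 0."""
    dense = [0] * size
    for index, value in zip(indices, values):
        dense[index] = value
    return dense


def nonzero_fraction(dense):
    """Share of components that are not zero."""
    return sum(1 for value in dense if value != 0) / float(len(dense))


# Sparse form: only 2 of 6 positions hold a value.
dense = to_dense(indices=[1, 4], values=[3, 7], size=6)
fraction = nonzero_fraction(dense)
```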

import tensorflow as tf
import glob
from itertools import groupby
from collections import defaultdict

sess = tf.InteractiveSession()

# List every JPEG in the breed folders. Sorting keeps each breed's files
# adjacent, which groupby below relies on.
image_filenames = sorted(glob.glob("./imagenet-dogs/n02*/*.jpg"))
image_filenames[0:2]

training_dataset = defaultdict(list)
testing_dataset = defaultdict(list)

# Pair each file name with its breed (the folder name).
image_filename_with_breed = map(lambda filename: (filename.split("/")[2], filename), image_filenames)

for dog_breed, breed_images in groupby(image_filename_with_breed, lambda x: x[0]):
    # Every fifth image (20%) goes to the test set, the rest to training.
    for i, breed_image in enumerate(breed_images):
        if i % 5 == 0:
            testing_dataset[dog_breed].append(breed_image[1])
        else:
            training_dataset[dog_breed].append(breed_image[1])

    # Check that at least 18% of this breed's images are test images.
    breed_training_count = len(training_dataset[dog_breed])
    breed_testing_count = len(testing_dataset[dog_breed])
    breed_training_count_float = float(breed_training_count)
    breed_testing_count_float = float(breed_testing_count)
    assert round(breed_testing_count_float / (breed_training_count_float + breed_testing_count_float), 2) > 0.18, "Not enough testing images."

print("training_dataset testing_dataset END ----------------------------")

def write_records_file(dataset, record_location):
    # The output directories under ./output are expected to exist already.
    writer = None
    current_index = 0
    for breed, images_filenames in dataset.items():
        for image_filename in images_filenames:
            # Start a new TFRecord file every 100 images.
            if current_index % 100 == 0:
                if writer:
                    writer.close()
                record_filename = "{record_location}-{current_index}.tfrecords".format(
                    record_location=record_location,
                    current_index=current_index)
                writer = tf.python_io.TFRecordWriter(record_filename)
                print(record_filename + "----------------------------")
            current_index += 1

            image_file = tf.read_file(image_filename)
            # Skip files that cannot be decoded as JPEG.
            try:
                image = tf.image.decode_jpeg(image_file)
            except Exception:
                print(image_filename)
                continue

            grayscale_image = tf.image.rgb_to_grayscale(image)
            resized_image = tf.image.resize_images(grayscale_image, [250, 151])
            # The resized image is float; cast back to uint8 before serializing.
            image_bytes = sess.run(tf.cast(resized_image, tf.uint8)).tobytes()
            image_label = breed.encode("utf-8")

            example = tf.train.Example(features=tf.train.Features(feature={
                'label': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_label])),
                'image': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes]))
            }))
            writer.write(example.SerializeToString())
    writer.close()

write_records_file(testing_dataset, "./output/testing-images/testing-image")
write_records_file(training_dataset, "./output/training-images/training-image")
print("write_records_file testing_dataset training_dataset END----------------------------")

filename_queue = tf.train.string_input_producer(
    tf.train.match_filenames_once("./output/training-images/*.tfrecords"))
reader = tf.TFRecordReader()
_, serialized = reader.read(filename_queue)

features = tf.parse_single_example(
    serialized,
    features={
        'label': tf.FixedLenFeature([], tf.string),
        'image': tf.FixedLenFeature([], tf.string),
    })

record_image = tf.decode_raw(features['image'], tf.uint8)
image = tf.reshape(record_image, [250, 151, 1])
label = tf.cast(features['label'], tf.string)

min_after_dequeue = 10
batch_size = 3
capacity = min_after_dequeue + 3 * batch_size
image_batch, label_batch = tf.train.shuffle_batch(
    [image, label], batch_size=batch_size, capacity=capacity, min_after_dequeue=min_after_dequeue)
print("load image from TFRecord END----------------------------")

float_image_batch = tf.image.convert_image_dtype(image_batch, tf.float32)

conv2d_layer_one = tf.contrib.layers.convolution2d(
    float_image_batch,
    num_outputs=32,
    kernel_size=(5, 5),
    activation_fn=tf.nn.relu,
    weights_initializer=tf.random_normal,
    stride=(2, 2),
    trainable=True)
pool_layer_one = tf.nn.max_pool(conv2d_layer_one,
    ksize=[1, 2, 2, 1],
    strides=[1, 2, 2, 1],
    padding='SAME')
conv2d_layer_one.get_shape(), pool_layer_one.get_shape()
print("conv2d_layer_one pool_layer_one END----------------------------")

conv2d_layer_two = tf.contrib.layers.convolution2d(
    pool_layer_one,
    num_outputs=64,
    kernel_size=(5, 5),
    activation_fn=tf.nn.relu,
    weights_initializer=tf.random_normal,
    stride=(1, 1),
    trainable=True)
pool_layer_two = tf.nn.max_pool(conv2d_layer_two,
    ksize=[1, 2, 2, 1],
    strides=[1, 2, 2, 1],
    padding='SAME')
conv2d_layer_two.get_shape(), pool_layer_two.get_shape()
print("conv2d_layer_two pool_layer_two END----------------------------")

flattened_layer_two = tf.reshape(pool_layer_two, [batch_size, -1])
flattened_layer_two.get_shape()
print("flattened_layer_two END----------------------------")

hidden_layer_three = tf.contrib.layers.fully_connected(
    flattened_layer_two,
    512,
    weights_initializer=lambda i, dtype: tf.truncated_normal([38912, 512], stddev=0.1),
    activation_fn=tf.nn.relu)
hidden_layer_three = tf.nn.dropout(hidden_layer_three, 0.1)

final_fully_connected = tf.contrib.layers.fully_connected(
    hidden_layer_three,
    120,
    weights_initializer=lambda i, dtype: tf.truncated_normal([512, 120], stddev=0.1))
print("final_fully_connected END----------------------------")

# Build the list of all class names from the directory listing.
labels = list(map(lambda c: c.split("/")[-1], glob.glob("./imagenet-dogs/*")))

# Replace each string label in the batch with its index in labels.
train_labels = tf.map_fn(lambda l: tf.where(tf.equal(labels, l))[0, 0:1][0], label_batch, dtype=tf.int64)

# Keyword arguments are used because the positional order of logits and
# labels differs between TensorFlow releases.
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=final_fully_connected, labels=train_labels))

batch = tf.Variable(0)
learning_rate = tf.train.exponential_decay(
    0.01,
    batch * 3,
    120,
    0.95,
    staircase=True)
optimizer = tf.train.AdamOptimizer(learning_rate, 0.9).minimize(loss, global_step=batch)
train_prediction = tf.nn.softmax(final_fully_connected)
print("train_prediction END----------------------------")

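# coord and threads are not defined anywhere in this listing. They are assumed
# to have been created before the queues are read, for example with
# coord = tf.train.Coordinator() and
# threads = tf.train.start_queue_runners(sess=sess, coord=coord).
filename_queue.close(cancel_pending_enqueues=True)
coord.request_stop()
coord.join(threads)
print("END----------------------------")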

References:

TensorFlow for Machine Intelligence (Chinese edition: 《面向機器智慧的TensorFlow實踐》)
