ACGAN

Let's use another diagram to understand how ACGAN (Auxiliary Classifier GAN) works.

[Figure: ACGAN architecture]

Unlike CGAN, the condition C is not fed directly into D. Instead, D must not only judge whether each sample is real or fake, but also solve a classification task, namely predicting C; this is done by adding an auxiliary classifier to D.

For D, the loss functions are as follows:

L_{adv}(D)=-\mathbb{E}_{x\sim p_{data}}[\log D(x)]-\mathbb{E}_{z\sim p_z,c\sim p_{c}}[\log(1-D(G(z,c)))]

L_{cls}(D)=\mathbb{E}_{x\sim p_{data}}[L_D(c_x|x)]

For G, the loss functions are as follows:

L_{adv}(G)=\mathbb{E}_{z\sim p_z,c\sim p_{c}}[\log(1-D(G(z,c)))]

L_{cls}(G)=\mathbb{E}_{z\sim p_z,c\sim p_{c}}[L_D(c|G(z,c))]
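Each network minimizes the sum of its two terms; written out explicitly (this matches how loss_d and loss_g are assembled in the code below):

L(D)=L_{adv}(D)+L_{cls}(D)

L(G)=L_{adv}(G)+L_{cls}(G)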

As before, we use the Male attribute of CelebA as the condition and implement ACGAN on top of WGAN.

Load the libraries:

# -*- coding: utf-8 -*-

import tensorflow as tf
import numpy as np
import os
import matplotlib.pyplot as plt
%matplotlib inline
from imageio import imread, imsave, mimsave
import cv2
import glob
from tqdm import tqdm

Load the images:

images = glob.glob('celeba/*.jpg')
print(len(images))

Read each image's Male label:

tags = {}
target = 'Male'
with open('list_attr_celeba.txt', 'r') as fr:
    lines = fr.readlines()
    all_tags = lines[0].strip('\n').split()
    for i in range(1, len(lines)):
        line = lines[i].strip('\n').split()
        if int(line[all_tags.index(target) + 1]) == 1:
            tags[line[0]] = [1, 0]  # male
        else:
            tags[line[0]] = [0, 1]  # female
print(len(tags))
print(all_tags)

Define some constants, the network inputs, and a helper function:

batch_size = 100
z_dim = 100
WIDTH = 64
HEIGHT = 64
LABEL = 2       # number of classes (male / female)
LAMBDA = 10     # gradient penalty coefficient
DIS_ITERS = 3   # discriminator updates per generator update; originally 5

OUTPUT_DIR = 'samples'
if not os.path.exists(OUTPUT_DIR):
    os.mkdir(OUTPUT_DIR)

X = tf.placeholder(dtype=tf.float32, shape=[batch_size, HEIGHT, WIDTH, 3], name='X')
Y = tf.placeholder(dtype=tf.float32, shape=[batch_size, LABEL], name='Y')
noise = tf.placeholder(dtype=tf.float32, shape=[batch_size, z_dim], name='noise')
is_training = tf.placeholder(dtype=tf.bool, name='is_training')

def lrelu(x, leak=0.2):
    return tf.maximum(x, leak * x)
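As an aside, TensorFlow 1.4 and later ship an equivalent built-in, so the helper above could also be written as the following one-liner (a sketch, not part of the original):

def lrelu(x, leak=0.2):
    # tf.nn.leaky_relu computes max(x, alpha * x), same as the manual version
    return tf.nn.leaky_relu(x, alpha=leak)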

The discriminator: the condition input is removed, and a classification output is added:

def discriminator(image, reuse=None, is_training=is_training):
    momentum = 0.9
    with tf.variable_scope('discriminator', reuse=reuse):
        h0 = lrelu(tf.layers.conv2d(image, kernel_size=5, filters=64, strides=2, padding='same'))
        h1 = lrelu(tf.layers.conv2d(h0, kernel_size=5, filters=128, strides=2, padding='same'))
        h2 = lrelu(tf.layers.conv2d(h1, kernel_size=5, filters=256, strides=2, padding='same'))
        h3 = lrelu(tf.layers.conv2d(h2, kernel_size=5, filters=512, strides=2, padding='same'))
        h4 = tf.contrib.layers.flatten(h3)
        Y_ = tf.layers.dense(h4, units=LABEL)  # auxiliary classifier logits
        h4 = tf.layers.dense(h4, units=1)      # critic score (WGAN, no sigmoid)
        return h4, Y_

The generator, unchanged from before:

def generator(z, label, is_training=is_training):
    momentum = 0.9
    with tf.variable_scope('generator', reuse=None):
        d = 4
        z = tf.concat([z, label], axis=1)
        h0 = tf.layers.dense(z, units=d * d * 512)
        h0 = tf.reshape(h0, shape=[-1, d, d, 512])
        h0 = tf.nn.relu(tf.contrib.layers.batch_norm(h0, is_training=is_training, decay=momentum))
        h1 = tf.layers.conv2d_transpose(h0, kernel_size=5, filters=256, strides=2, padding='same')
        h1 = tf.nn.relu(tf.contrib.layers.batch_norm(h1, is_training=is_training, decay=momentum))
        h2 = tf.layers.conv2d_transpose(h1, kernel_size=5, filters=128, strides=2, padding='same')
        h2 = tf.nn.relu(tf.contrib.layers.batch_norm(h2, is_training=is_training, decay=momentum))
        h3 = tf.layers.conv2d_transpose(h2, kernel_size=5, filters=64, strides=2, padding='same')
        h3 = tf.nn.relu(tf.contrib.layers.batch_norm(h3, is_training=is_training, decay=momentum))
        h4 = tf.layers.conv2d_transpose(h3, kernel_size=5, filters=3, strides=2, padding='same', activation=tf.nn.tanh, name='g')
        return h4

Define the loss functions, adding the term for the classification task. In theory a classification problem calls for cross-entropy, but MSE also works well enough here; a cross-entropy variant is sketched after the code below.

g = generator(noise, Y)
d_real, y_real = discriminator(X)
d_fake, y_fake = discriminator(g, reuse=True)

# WGAN critic losses
loss_d_real = -tf.reduce_mean(d_real)
loss_d_fake = tf.reduce_mean(d_fake)

# classification losses for the auxiliary classifier
loss_cls_real = tf.losses.mean_squared_error(Y, y_real)
loss_cls_fake = tf.losses.mean_squared_error(Y, y_fake)

loss_d = loss_d_real + loss_d_fake + loss_cls_real
loss_g = -tf.reduce_mean(d_fake) + loss_cls_fake

# gradient penalty (WGAN-GP), computed on the critic output only
alpha = tf.random_uniform(shape=[batch_size, 1, 1, 1], minval=0., maxval=1.)
interpolates = alpha * X + (1 - alpha) * g
d_interp, _ = discriminator(interpolates, reuse=True)
grad = tf.gradients(d_interp, [interpolates])[0]
slope = tf.sqrt(tf.reduce_sum(tf.square(grad), axis=[1, 2, 3]))  # per-sample gradient norm
gp = tf.reduce_mean((slope - 1.) ** 2)
loss_d += LAMBDA * gp

vars_g = [var for var in tf.trainable_variables() if var.name.startswith('generator')]
vars_d = [var for var in tf.trainable_variables() if var.name.startswith('discriminator')]
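A minimal sketch of the cross-entropy variant mentioned above (an alternative, not part of the original code; y_real and y_fake are the raw classifier logits, so the softmax cross-entropy loss can consume them directly, in parallel with tf.losses.mean_squared_error):

# hedged alternative: softmax cross-entropy instead of MSE
loss_cls_real = tf.losses.softmax_cross_entropy(onehot_labels=Y, logits=y_real)
loss_cls_fake = tf.losses.softmax_cross_entropy(onehot_labels=Y, logits=y_fake)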

Define the optimizers (Adam with beta1=0.5, a common choice for GAN training):

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    optimizer_d = tf.train.AdamOptimizer(learning_rate=0.0002, beta1=0.5).minimize(loss_d, var_list=vars_d)
    optimizer_g = tf.train.AdamOptimizer(learning_rate=0.0002, beta1=0.5).minimize(loss_g, var_list=vars_g)

A helper that tiles a batch of images into one montage:

def montage(images):
    if isinstance(images, list):
        images = np.array(images)
    img_h = images.shape[1]
    img_w = images.shape[2]
    n_plots = int(np.ceil(np.sqrt(images.shape[0])))
    if len(images.shape) == 4 and images.shape[3] == 3:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1, 3)) * 0.5
    elif len(images.shape) == 4 and images.shape[3] == 1:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1, 1)) * 0.5
    elif len(images.shape) == 3:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1)) * 0.5
    else:
        raise ValueError('Could not parse image shape of {}'.format(images.shape))
    for i in range(n_plots):
        for j in range(n_plots):
            this_filter = i * n_plots + j
            if this_filter < images.shape[0]:
                this_img = images[this_filter]
                m[1 + i + i * img_h:1 + i + (i + 1) * img_h,
                  1 + j + j * img_w:1 + j + (j + 1) * img_w] = this_img
    return m

Prepare the data:

X_all = []
Y_all = []
for i in tqdm(range(len(images))):
    image = imread(images[i])
    h = image.shape[0]
    w = image.shape[1]
    if h > w:
        image = image[h // 2 - w // 2: h // 2 + w // 2, :, :]
    else:
        image = image[:, w // 2 - h // 2: w // 2 + h // 2, :]
    image = cv2.resize(image, (WIDTH, HEIGHT))
    image = (image / 255. - 0.5) * 2  # scale to [-1, 1] to match the tanh output
    X_all.append(image)

    image_name = images[i][images[i].find('/') + 1:]
    Y_all.append(tags[image_name])

X_all = np.array(X_all)
Y_all = np.array(Y_all)
print(X_all.shape, Y_all.shape)

Inspect a few samples:

for i in range(10):
    plt.imshow((X_all[i, :, :, :] + 1) / 2)
    plt.show()
    print(Y_all[i, :])

Define a function that produces random batches:

def get_random_batch():
    data_index = np.arange(X_all.shape[0])
    np.random.shuffle(data_index)
    data_index = data_index[:batch_size]
    X_batch = X_all[data_index, :, :, :]
    Y_batch = Y_all[data_index, :]
    return X_batch, Y_batch
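Note that get_random_batch draws an independent random subset on every call, so there is no notion of an epoch and some images may repeat before others are seen. A minimal epoch-style alternative (an assumption, not in the original) could look like this:

def epoch_batches(X, Y, batch_size):
    # shuffle once, then walk through the data in order:
    # every sample appears exactly once per epoch
    index = np.arange(X.shape[0])
    np.random.shuffle(index)
    for start in range(0, len(index) - batch_size + 1, batch_size):
        batch = index[start:start + batch_size]
        yield X[batch], Y[batch]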

Train the model, with the adjustments ACGAN requires:

sess = tf.Session()
sess.run(tf.global_variables_initializer())

# fixed evaluation inputs: each noise vector appears twice,
# once with the male condition and once with the female condition
zs = np.random.uniform(-1.0, 1.0, [batch_size // 2, z_dim]).astype(np.float32)
z_samples = []
y_samples = []
for i in range(batch_size // 2):
    z_samples.append(zs[i, :])
    y_samples.append([1, 0])  # male
    z_samples.append(zs[i, :])
    y_samples.append([0, 1])  # female

samples = []
loss = {'d': [], 'g': []}
for i in tqdm(range(60000)):
    # update the discriminator DIS_ITERS times per generator update
    for j in range(DIS_ITERS):
        n = np.random.uniform(-1.0, 1.0, [batch_size, z_dim]).astype(np.float32)
        X_batch, Y_batch = get_random_batch()
        _, d_ls = sess.run([optimizer_d, loss_d], feed_dict={X: X_batch, Y: Y_batch, noise: n, is_training: True})

    _, g_ls = sess.run([optimizer_g, loss_g], feed_dict={X: X_batch, Y: Y_batch, noise: n, is_training: True})

    loss['d'].append(d_ls)
    loss['g'].append(g_ls)

    if i % 500 == 0:
        print(i, d_ls, g_ls)
        gen_imgs = sess.run(g, feed_dict={noise: z_samples, Y: y_samples, is_training: False})
        gen_imgs = (gen_imgs + 1) / 2
        imgs = [img[:, :, :] for img in gen_imgs]
        gen_imgs = montage(imgs)
        plt.axis('off')
        plt.imshow(gen_imgs)
        imsave(os.path.join(OUTPUT_DIR, 'sample_%d.jpg' % i), gen_imgs)
        plt.show()
        samples.append(gen_imgs)

plt.plot(loss['d'], label='Discriminator')
plt.plot(loss['g'], label='Generator')
plt.legend(loc='upper right')
plt.savefig('Loss.png')
plt.show()
mimsave(os.path.join(OUTPUT_DIR, 'samples.gif'), samples, fps=10)

The results are shown below. They are better than CGAN's: collapsed samples are rarer, and the faces are more realistic and sharper.

[Figure: generated ACGAN samples]

Save the model. With TF1's Saver this produces celeba_acgan-60000.meta, .index, and .data files plus a checkpoint file, which the loading script below relies on.

saver = tf.train.Saver()
saver.save(sess, './celeba_acgan', global_step=60000)

Load the model on a standalone machine and run the following two experiments:

fix the noise and gradually change the condition;

fix the condition and gradually change the noise.

# -*- coding: utf-8 -*-

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

batch_size = 100
z_dim = 100
LABEL = 2

# same montage helper as in the training script, repeated so this file runs standalone
def montage(images):
    if isinstance(images, list):
        images = np.array(images)
    img_h = images.shape[1]
    img_w = images.shape[2]
    n_plots = int(np.ceil(np.sqrt(images.shape[0])))
    if len(images.shape) == 4 and images.shape[3] == 3:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1, 3)) * 0.5
    elif len(images.shape) == 4 and images.shape[3] == 1:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1, 1)) * 0.5
    elif len(images.shape) == 3:
        m = np.ones(
            (images.shape[1] * n_plots + n_plots + 1,
             images.shape[2] * n_plots + n_plots + 1)) * 0.5
    else:
        raise ValueError('Could not parse image shape of {}'.format(images.shape))
    for i in range(n_plots):
        for j in range(n_plots):
            this_filter = i * n_plots + j
            if this_filter < images.shape[0]:
                this_img = images[this_filter]
                m[1 + i + i * img_h:1 + i + (i + 1) * img_h,
                  1 + j + j * img_w:1 + j + (j + 1) * img_w] = this_img
    return m

sess = tf.Session()
sess.run(tf.global_variables_initializer())

saver = tf.train.import_meta_graph('./celeba_acgan-60000.meta')
saver.restore(sess, tf.train.latest_checkpoint('./'))

graph = tf.get_default_graph()
g = graph.get_tensor_by_name('generator/g/Tanh:0')  # tanh output of the layer named 'g'
noise = graph.get_tensor_by_name('noise:0')
Y = graph.get_tensor_by_name('Y:0')
is_training = graph.get_tensor_by_name('is_training:0')
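If a tensor name is ever in doubt, the restored graph can be inspected directly; this short snippet (not part of the original) prints the generator's operation names so the right output tensor can be found:

for op in graph.get_operations():
    if op.name.startswith('generator'):
        print(op.name)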

# fixed noise, gradually changing condition
n = np.random.uniform(-1.0, 1.0, [10, z_dim]).astype(np.float32)
ns = []
y_samples = []
for i in range(100):
    ns.append(n[i // 10, :])
    # each row reuses one noise vector while the label sweeps
    # from [0, 1] (female) to [1, 0] (male) in 10 steps
    y_samples.append([i % 10 / 9, 1 - i % 10 / 9])

gen_imgs = sess.run(g, feed_dict={noise: ns, Y: y_samples, is_training: False})
gen_imgs = (gen_imgs + 1) / 2
imgs = [img[:, :, :] for img in gen_imgs]
gen_imgs = montage(imgs)
gen_imgs = np.clip(gen_imgs, 0, 1)
plt.figure(figsize=(8, 8))
plt.axis('off')
plt.imshow(gen_imgs)
plt.show()

# fixed condition, gradually changing noise
n = np.random.uniform(-1.0, 1.0, [5, 2, z_dim]).astype(np.float32)
ns = []
y_samples = []
for i in range(5):
    for k in range(2):
        for j in range(10):
            start = n[i, 0, :]
            end = n[i, 1, :]
            # linear interpolation between two noise vectors in 10 steps
            ns.append(start + j * (end - start) / 9)
            if k == 0:
                y_samples.append([0, 1])  # female
            else:
                y_samples.append([1, 0])  # male

gen_imgs = sess.run(g, feed_dict={noise: ns, Y: y_samples, is_training: False})
gen_imgs = (gen_imgs + 1) / 2
imgs = [img[:, :, :] for img in gen_imgs]
gen_imgs = montage(imgs)
gen_imgs = np.clip(gen_imgs, 0, 1)
plt.figure(figsize=(8, 8))
plt.axis('off')
plt.imshow(gen_imgs)
plt.show()

The transition from female to male:

[Figure: faces morphing from female to male]

Interpolation between pairs of faces:

[Figure: interpolations between pairs of faces]

References

Conditional Generative Adversarial Nets: https://arxiv.org/abs/1411.1784

Conditional Image Synthesis with Auxiliary Classifier GANs: https://arxiv.org/abs/1610.09585
