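1. Program walkthrough
(1) Vanilla autoencoder
In its simplest form, an autoencoder has only three layers, that is, a neural network with a single hidden layer. Its input and its target output are the same, and it learns to reconstruct the input, here by training with the Adam optimizer and a mean squared error loss.
Here the hidden layer dimension (64) is smaller than the input dimension (784), so the encoder is said to be lossy. This constraint forces the network to learn a compressed representation of the data.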
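from keras.layers import Input, Dense
from keras.models import Model

input_size = 784
hidden_size = 64
output_size = 784

x = Input(shape=(input_size,))

# Encoder: compress the 784 inputs into a 64-dimensional code
h = Dense(hidden_size, activation='relu')(x)

# Decoder: reconstruct the 784 inputs from the code
r = Dense(output_size, activation='sigmoid')(h)

# Keras 2 expects the plural keyword arguments inputs/outputs
autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')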
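To make the data flow concrete, the following is a minimal NumPy sketch of what the 784-64-784 network computes in one forward pass, together with the mean squared error that loss='mse' minimises. The weights are random stand-ins rather than trained values, and the names W_enc, W_dec, and autoencode are illustrative assumptions, not part of Keras.

```python
import numpy as np

rng = np.random.default_rng(0)

input_size, hidden_size = 784, 64

# Randomly initialised weights stand in for trained ones (illustration only).
W_enc = rng.normal(0.0, 0.05, size=(input_size, hidden_size))
b_enc = np.zeros(hidden_size)
W_dec = rng.normal(0.0, 0.05, size=(hidden_size, input_size))
b_dec = np.zeros(input_size)


def relu(z):
    return np.maximum(z, 0.0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def autoencode(x):
    """Return (code, reconstruction) for a batch of flattened inputs."""
    h = relu(x @ W_enc + b_enc)     # encoder: 784 -> 64
    r = sigmoid(h @ W_dec + b_dec)  # decoder: 64 -> 784
    return h, r


x_batch = rng.random((5, input_size))  # five fake "images" with values in [0, 1)
code, recon = autoencode(x_batch)
mse = np.mean((x_batch - recon) ** 2)  # reconstruction error against the input itself

print(code.shape, recon.shape, float(mse))
```

Training adjusts the weights so that this error shrinks; the 64-dimensional code is the compressed representation.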
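Dense: the Keras fully connected layer, keras.layers.Dense(units, activation=None)
units: the output dimension of the layer
activation=None: the activation function; when left as None, no activation is applied, so the layer is linear
Activation: the Activation layer applies an activation function to the output of a layer
model.compile(): one of the Model methods; it configures the model for training
optimizer: the optimizer, given either as the name of a predefined optimizer or as an optimizer object (see the Keras optimizers documentation)
loss: the loss function, given either as the name of a predefined loss or as an objective function (see the Keras losses documentation)
adam: adaptive moment estimation, a refinement of the RMSProp optimizer. It uses first-moment and second-moment estimates of the gradient to adapt the learning rate of each parameter. Advantage: in every iteration the effective learning rate stays within a definite range, so the parameters change smoothly.
mse: mean_squared_error, the mean squared error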
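The Adam description above can be sketched for a single scalar parameter. This is a simplified illustration of the commonly published update rule with its usual default decay rates, not the Keras implementation; the function name adam_minimize and the test function are assumptions made for the example.

```python
import math


def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimise a 1-D function given its gradient, using the Adam update rule."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g      # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * g * g  # second-moment (uncentred variance) estimate
        m_hat = m / (1 - beta1 ** t)         # bias correction for the zero initialisation
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)  # per-parameter scaled step
    return x


# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

Because the step is the ratio of the two moment estimates, its size is roughly bounded by lr, which is the "definite range" mentioned above.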
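(2) Multilayer autoencoder
If one hidden layer is not enough, the autoencoder can simply be given more hidden layers.
Here the implementation uses three hidden layers instead of one. Any of the hidden layers could serve as the feature representation, but to keep the network symmetric we use the middle layer.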
from keras.layers import Input, Dense
from keras.models import Model

input_size = 784
hidden_size = 128
code_size = 64

x = Input(shape=(input_size,))

# Encoder: 784 -> 128 -> 64
hidden_1 = Dense(hidden_size, activation='relu')(x)
h = Dense(code_size, activation='relu')(hidden_1)

# Decoder: 64 -> 128 -> 784, mirroring the encoder
hidden_2 = Dense(hidden_size, activation='relu')(h)
r = Dense(input_size, activation='sigmoid')(hidden_2)

# Keras 2 expects the plural keyword arguments inputs/outputs
autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')
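As a rough way to compare the two architectures, the sketch below counts trainable parameters of a stack of Dense layers (weights plus biases). The helper names dense_params and count_params are hypothetical; the layer sizes are the ones used in the two models above.

```python
def dense_params(n_in, n_out):
    """A Dense layer has an n_in x n_out weight matrix plus n_out biases."""
    return n_in * n_out + n_out


def count_params(layer_sizes):
    """Sum the parameters of consecutive Dense layers, e.g. [784, 64, 784]."""
    return sum(dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))


vanilla = count_params([784, 64, 784])
stacked = count_params([784, 128, 64, 128, 784])

print(vanilla)  # 101200
print(stacked)  # 218192
```

Both models end at the same 64-dimensional code, so the extra capacity of the stacked version comes entirely from the additional 128-unit layers.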
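(3) Convolutional autoencoder
Besides fully connected layers, autoencoders can also be built from convolutional layers. The principle is the same, but the input is a 3D tensor (such as an image) rather than a flattened 1D vector. The input image is downsampled to give a lower-dimensional latent representation, which forces the autoencoder to learn from a compressed version of the data.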
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

x = Input(shape=(28, 28, 1))

# Encoder: 28x28 -> 14x14 -> 7x7 -> 4x4, ending with 8 channels
conv1_1 = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
pool1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
conv1_2 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool1)
pool2 = MaxPooling2D((2, 2), padding='same')(conv1_2)
conv1_3 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool2)
h = MaxPooling2D((2, 2), padding='same')(conv1_3)

# Decoder: 4x4 -> 8x8 -> 16x16 -> 28x28
conv2_1 = Conv2D(8, (3, 3), activation='relu', padding='same')(h)
up1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(8, (3, 3), activation='relu', padding='same')(up1)
up2 = UpSampling2D((2, 2))(conv2_2)
# Default 'valid' padding is intentional here: 16x16 shrinks to 14x14,
# so the last upsampling restores the original 28x28 size
conv2_3 = Conv2D(16, (3, 3), activation='relu')(up2)
up3 = UpSampling2D((2, 2))(conv2_3)
r = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up3)

# Keras 2 expects the plural keyword arguments inputs/outputs
autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')
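The spatial sizes in the model above are easy to get wrong, so here is a small pure-Python sketch that traces the height (equal to the width) through each stage. It assumes stride-1 3x3 convolutions, 2x2 pooling with 'same' padding (which rounds up), and 2x upsampling; the helper names conv, pool, and up are made up for this illustration.

```python
import math


def conv(size, kernel=3, padding="same"):
    """Stride-1 convolution: 'same' keeps the size, 'valid' loses kernel - 1 pixels."""
    return size if padding == "same" else size - kernel + 1


def pool(size, factor=2):
    """Max pooling with padding='same' rounds the output size up."""
    return math.ceil(size / factor)


def up(size, factor=2):
    """Upsampling repeats rows and columns, multiplying the size."""
    return size * factor


s = 28
trace = [s]

# Encoder: three conv + pool stages
for _ in range(3):
    s = pool(conv(s))
    trace.append(s)

# Decoder: two 'same' conv + upsample stages, then one 'valid' conv + upsample
s = up(conv(s)); trace.append(s)
s = up(conv(s)); trace.append(s)
s = up(conv(s, padding="valid")); trace.append(s)
s = conv(s); trace.append(s)  # final 1-channel output convolution

print(trace)  # [28, 14, 7, 4, 8, 16, 28, 28]
```

The trace shows why one decoder convolution omits padding='same': with it, the output would be 32x32 instead of 28x28 and would no longer match the input.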
Conv2D: Conv2D(filters, kernel_size, strides=(1, 1), padding='valid')
filters: the number of convolution kernels (that is, the number of output channels).
kernel_size: the height and width of the convolution kernel, given as a single integer or as a list/tuple of two integers. A single integer means the same size along every spatial dimension.
strides: the stride of the convolution, given as a single integer or as a list/tuple of two integers. A single integer means the same stride along every spatial dimension. Any stride other than 1 is incompatible with any dilation_rate other than 1.
padding: the zero-padding strategy, either "valid" or "same". "valid" performs only valid convolutions, so the borders are not padded and the output shrinks. "same" pads the borders so that, with stride 1, the output has the same spatial shape as the input.
MaxPooling2D: max pooling layer for 2D inputs. MaxPooling2D(pool_size=(2, 2), strides=None, padding='valid')
pool_size: tuple of two integers giving the downsampling factors in the two directions (vertical, horizontal); (2, 2) halves the image along both dimensions.
strides: tuple of two integers, or None; the stride values.
padding: string, either "valid" or "same".
UpSampling2D: upsampling layer. UpSampling2D(size=(2, 2))
size: tuple of integers giving the upsampling factors for rows and columns.
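To illustrate what 2x2 max pooling and 2x upsampling do to the values themselves, here is a small NumPy sketch on a single-channel 4x4 array. It assumes even input dimensions and nearest-neighbour repetition for upsampling; the helper names max_pool_2x2 and upsample_2x2 are illustrative and not Keras functions.

```python
import numpy as np


def max_pool_2x2(img):
    """Keep the maximum of every non-overlapping 2x2 block (even sizes assumed)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


def upsample_2x2(img):
    """Repeat every row and every column twice."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


img = np.arange(16).reshape(4, 4)
pooled = max_pool_2x2(img)       # shape (2, 2)
restored = upsample_2x2(pooled)  # shape (4, 4), but detail inside each block is lost

print(pooled)
print(restored)
```

Upsampling restores the size but not the discarded values, which is why the decoder follows each upsampling step with a learned convolution.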
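(4) Regularized autoencoder
Besides imposing a hidden layer smaller than the input, other methods can also constrain the autoencoder's reconstruction, such as regularized autoencoders.
Rather than limiting model capacity with shallow encoders and decoders and a small code size, a regularized autoencoder uses a loss function that encourages the model to learn other properties in addition to copying the input to the output. These properties include sparsity of the representation, small derivatives of the representation, and robustness to noise or missing inputs.
Even when the model capacity is large enough to learn a trivial identity function, a nonlinear, overcomplete regularized autoencoder can still learn something useful about the data distribution.
In practice, two kinds of regularized autoencoder are commonly used: the sparse autoencoder and the denoising autoencoder.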
(5) Sparse autoencoder
Sparse autoencoders are typically used to learn features for another task, such as classification. An autoencoder regularized to be sparse must respond to the distinctive statistical features of the training dataset rather than simply acting as an identity function. Trained this way, the reconstruction task with a sparsity penalty yields a model that has learned useful features.
Another way to constrain the autoencoder's reconstruction is to put a constraint on its loss function. For example, adding a regularization term to the loss leads the autoencoder to learn a sparse representation of the data.
Note that the hidden layer also carries an L1 regularizer, which acts as a penalty term in the loss during optimization. Compared with the vanilla autoencoder, the resulting representation is sparser.
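from keras import regularizers
from keras.layers import Input, Dense
from keras.models import Model

input_size = 784
hidden_size = 64
output_size = 784

x = Input(shape=(input_size,))

# Encoder: the L1 term is applied to the layer's output (its activations)
h = Dense(hidden_size, activation='relu', activity_regularizer=regularizers.l1(10e-5))(x)

# Decoder
r = Dense(output_size, activation='sigmoid')(h)

# Keras 2 expects the plural keyword arguments inputs/outputs
autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')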
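activity_regularizer: a regularization term applied to the layer's output, given as a regularizer object such as regularizers.l1(...)
l1(l=0.01): the L1 regularization term. Regularizers impose a constraint on training; the L1 term is an L1-norm penalty, which pushes the penalized matrix or vector toward sparsity.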
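The effect of the L1 activity penalty can be sketched in NumPy as an extra term added to the reconstruction loss. This is a simplified view that sums the absolute activations and scales them by a coefficient; the exact normalisation (for example, whether the sum is divided by the batch size) may differ between Keras versions, and the helper names here are assumptions.

```python
import numpy as np


def l1_activity_penalty(h, lam=1e-4):
    """L1 penalty on the hidden activations: lam * sum(|h|)."""
    return lam * np.sum(np.abs(h))


def sparse_loss(x, r, h, lam=1e-4):
    """Reconstruction MSE plus the sparsity penalty on the code h."""
    return np.mean((x - r) ** 2) + l1_activity_penalty(h, lam)


# Two hypothetical codes: one spreads activity over all units, one uses a single unit.
h_dense = np.array([[0.5, 0.5, 0.5, 0.5]])
h_sparse = np.array([[1.0, 0.0, 0.0, 0.0]])

penalty_dense = l1_activity_penalty(h_dense)
penalty_sparse = l1_activity_penalty(h_sparse)
print(penalty_dense, penalty_sparse)
```

For equal reconstruction error, the optimizer prefers the code with the smaller penalty, which is how the hidden layer is nudged toward mostly zero activations.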
(6) Denoising autoencoder
Here, instead of adding a penalty term to the loss function, we change the reconstruction error term of the loss in order to learn something useful.
Noise is added to the training data, and the autoencoder is trained to remove it and recover the clean, uncorrupted input. This forces the encoder to extract the most important features and to learn a more robust representation of the input, which is why it generalizes better than an ordinary autoencoder.
This architecture can be trained with gradient descent.
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

x = Input(shape=(28, 28, 1))

# Encoder: 28x28 -> 14x14 -> 7x7
conv1_1 = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
pool1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
conv1_2 = Conv2D(32, (3, 3), activation='relu', padding='same')(pool1)
h = MaxPooling2D((2, 2), padding='same')(conv1_2)

# Decoder: 7x7 -> 14x14 -> 28x28
conv2_1 = Conv2D(32, (3, 3), activation='relu', padding='same')(h)
up1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(32, (3, 3), activation='relu', padding='same')(up1)
up2 = UpSampling2D((2, 2))(conv2_2)
r = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up2)

# Keras 2 expects the plural keyword arguments inputs/outputs
autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')
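The model definition alone does not show where the noise comes from. The sketch below builds (noisy input, clean target) pairs in the same way as the denoising program later in this post: Gaussian noise scaled by a noise factor, then clipping back to the valid pixel range. The function name corrupt, the fixed seed, and the constant test image are assumptions made for the example.

```python
import numpy as np


def corrupt(x, noise_factor=0.5, seed=0):
    """Add zero-mean Gaussian noise and clip the result back into [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = x + noise_factor * rng.normal(loc=0.0, scale=1.0, size=x.shape)
    return np.clip(noisy, 0.0, 1.0)


# Two fake grey "images" in the (batch, height, width, channels) layout used above.
clean = np.full((2, 28, 28, 1), 0.5, dtype="float32")
noisy = corrupt(clean)

print(noisy.shape, float(noisy.min()), float(noisy.max()))
```

During training the noisy array is passed as the input and the clean array as the target, so the network is rewarded for removing the noise rather than copying it.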
2. Program examples
(1) Single-layer autoencoder
from keras.layers import Input, Dense
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt

# Load MNIST, scale pixels to [0, 1], and flatten each 28x28 image to 784 values
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)

# Single-layer autoencoder
encoding_dim = 32
input_img = Input(shape=(784,))
encoded = Dense(encoding_dim, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)
autoencoder = Model(inputs=input_img, outputs=decoded)

# Separate encoder and decoder models that share the trained layers
encoder = Model(inputs=input_img, outputs=encoded)
encoded_input = Input(shape=(encoding_dim,))
decoder_layer = autoencoder.layers[-1]
decoder = Model(inputs=encoded_input, outputs=decoder_layer(encoded_input))

autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256,
                shuffle=True, validation_data=(x_test, x_test))
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)

# Show the originals (top row) and their reconstructions (bottom row)
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
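The programs in this section compile with loss='binary_crossentropy' rather than 'mse'. As a rough illustration of that loss on pixel values in [0, 1], here is a NumPy sketch; the clipping constant and the function name are assumptions for the example and are not taken from the Keras source.

```python
import numpy as np


def binary_crossentropy(target, pred, eps=1e-7):
    """Mean per-pixel binary cross-entropy; pred is clipped to avoid log(0)."""
    pred = np.clip(pred, eps, 1.0 - eps)
    losses = -(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))
    return float(np.mean(losses))


target = np.array([0.0, 1.0, 1.0, 0.0])  # four hypothetical pixel values
good = np.array([0.1, 0.9, 0.8, 0.2])    # a reconstruction close to the target
bad = np.array([0.9, 0.1, 0.2, 0.8])     # a reconstruction far from the target

loss_good = binary_crossentropy(target, good)
loss_bad = binary_crossentropy(target, bad)
print(loss_good, loss_bad)
```

This loss pairs naturally with the sigmoid output layer, since both treat each pixel as a value between 0 and 1.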
(2) Convolutional autoencoder
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
from keras.callbacks import TensorBoard

# Load MNIST, scale pixels to [0, 1], and keep the 28x28x1 image layout
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))
# Noisy copies are only needed by the denoising example in (4); they are unused here
noise_factor = 0.5
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape)
x_test_noisy = x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
print(x_train.shape)
print(x_test.shape)

# Convolutional autoencoder
input_img = Input(shape=(28, 28, 1))
x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
# Default 'valid' padding on purpose: 16x16 -> 14x14, so the output ends at 28x28
x = Conv2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)
autoencoder = Model(inputs=input_img, outputs=decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

# Open a terminal in the same working directory and start TensorBoard with:
#   tensorboard --logdir=autoencoder
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256,
                shuffle=True, validation_data=(x_test, x_test),
                callbacks=[TensorBoard(log_dir='autoencoder')])
decoded_imgs = autoencoder.predict(x_test)

# Show the originals (top row) and their reconstructions (bottom row)
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
(3) Deep autoencoder
from keras.layers import Input, Dense
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt

# Load MNIST, scale pixels to [0, 1], and flatten each 28x28 image to 784 values
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)

# Deep autoencoder: 784 -> 128 -> 64 -> 32 -> 64 -> 128 -> 784
input_img = Input(shape=(784,))
encoded = Dense(128, activation='relu')(input_img)
encoded = Dense(64, activation='relu')(encoded)
decoded_input = Dense(32, activation='relu')(encoded)  # the 32-dimensional code
decoded = Dense(64, activation='relu')(decoded_input)
decoded = Dense(128, activation='relu')(decoded)
# Fixed: the output layer must take the decoder's last hidden layer, not `encoded`;
# otherwise the 32/64/128 decoder layers are bypassed
decoded = Dense(784, activation='sigmoid')(decoded)
autoencoder = Model(inputs=input_img, outputs=decoded)
encoder = Model(inputs=input_img, outputs=decoded_input)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256,
                shuffle=True, validation_data=(x_test, x_test))
encoded_imgs = encoder.predict(x_test)
decoded_imgs = autoencoder.predict(x_test)

# Show the originals (top row) and their reconstructions (bottom row)
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
(4) Denoising autoencoder
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
from keras.callbacks import TensorBoard

# Load MNIST, scale pixels to [0, 1], and keep the 28x28x1 image layout
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))

# Corrupt the images with Gaussian noise, then clip back into [0, 1]
noise_factor = 0.5
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape)
x_test_noisy = x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
print(x_train.shape)
print(x_test.shape)

input_img = Input(shape=(28, 28, 1))

x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)

x = Conv2D(32, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(inputs=input_img, outputs=decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

# Open a terminal in the same working directory and start TensorBoard with:
#   tensorboard --logdir=autoencoder
# Noisy images are the inputs; the clean images are the targets
autoencoder.fit(x_train_noisy, x_train, epochs=10, batch_size=256,
                shuffle=True, validation_data=(x_test_noisy, x_test),
                callbacks=[TensorBoard(log_dir='autoencoder', write_graph=False)])

decoded_imgs = autoencoder.predict(x_test_noisy)

# Rows: clean originals, noisy inputs, denoised reconstructions
n = 10
plt.figure(figsize=(30, 6))
for i in range(n):
    ax = plt.subplot(3, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    ax = plt.subplot(3, n, i + 1 + n)
    plt.imshow(x_test_noisy[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    ax = plt.subplot(3, n, i + 1 + 2*n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()
