TY - GEN
T1 - Face completion with pyramid semantic attention and latent codes
AU - Cao, Shilei
AU - Sakurai, Kouichi
N1 - Publisher Copyright: © 2020 IEEE
PY - 2020/11
Y1 - 2020/11
N2 - Face completion, the task of reproducing the missing region of an incomplete face image, has achieved promising performance owing to the adoption of generative adversarial networks (GANs) and advances in GPUs. However, current network frameworks ignore that face images usually have strong semantic correlation and symmetry, and so fail to recover semantically plausible details, especially when filling in large contiguous holes. To produce visually realistic and semantically correct results, we propose a two-stage adversarial framework. The first stage produces coarse images, which serve as a prior for searching for the most similar latent codes in reference sets; the retrieved codes are combined by composing their intermediate feature maps. The second stage captures texture information with our novel pyramid semantic attention block, which makes full use of semantic information and embeds the learned structure features into the inpainting process. Our attention layer considers not only the known contents but also the reconstructed parts, which improves the realism of the reconstructed regions; we then apply this attention layer within a novel pyramid structure. In addition, we add weights to the loss function around the predicted boundary to encourage the model to generate clearer contour lines and better interpolation properties. Experiments on the CelebA dataset show that the proposed method is effective at filling in large contiguous holes in face images. In particular, the SSIM score of our model is nearly 0.1 higher than that of the context encoder model.
AB - Face completion, the task of reproducing the missing region of an incomplete face image, has achieved promising performance owing to the adoption of generative adversarial networks (GANs) and advances in GPUs. However, current network frameworks ignore that face images usually have strong semantic correlation and symmetry, and so fail to recover semantically plausible details, especially when filling in large contiguous holes. To produce visually realistic and semantically correct results, we propose a two-stage adversarial framework. The first stage produces coarse images, which serve as a prior for searching for the most similar latent codes in reference sets; the retrieved codes are combined by composing their intermediate feature maps. The second stage captures texture information with our novel pyramid semantic attention block, which makes full use of semantic information and embeds the learned structure features into the inpainting process. Our attention layer considers not only the known contents but also the reconstructed parts, which improves the realism of the reconstructed regions; we then apply this attention layer within a novel pyramid structure. In addition, we add weights to the loss function around the predicted boundary to encourage the model to generate clearer contour lines and better interpolation properties. Experiments on the CelebA dataset show that the proposed method is effective at filling in large contiguous holes in face images. In particular, the SSIM score of our model is nearly 0.1 higher than that of the context encoder model.
UR - http://www.scopus.com/inward/record.url?scp=85104659256&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85104659256&partnerID=8YFLogxK
U2 - 10.1109/CANDAR51075.2020.00009
DO - 10.1109/CANDAR51075.2020.00009
M3 - Conference contribution
AN - SCOPUS:85104659256
T3 - Proceedings - 2020 8th International Symposium on Computing and Networking, CANDAR 2020
SP - 1
EP - 8
BT - Proceedings - 2020 8th International Symposium on Computing and Networking, CANDAR 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 8th International Symposium on Computing and Networking, CANDAR 2020
Y2 - 24 November 2020 through 27 November 2020
ER -