We propose a new training method that enables an autoencoder to extract more useful features for retrieval or classification tasks with limited-size datasets. Some tasks in document analysis and recognition (DAR), including signature verification, historical document analysis, and scene text recognition, share a common problem: the dataset available for training is small relative to the intra-class variety of the target appearance. Recently, several approaches, such as variational autoencoders and deep metric learning, have been proposed to obtain feature representations suitable for such tasks. However, these methods sometimes overfit: accuracy on the test data is relatively low, while performance on the training dataset is quite high. Our proposed method obtains feature representations for such tasks in DAR using convolutional autoencoders with metric learning. We evaluate its accuracy on image-based retrieval of ancient Japanese signatures.
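
The abstract does not state the exact objective, so the following is only a minimal sketch of how an autoencoder reconstruction term and a metric-learning term might be combined. It assumes a mean-squared-error reconstruction loss, a triplet margin loss on the latent codes, and a weighted sum of the two; the function names and the `weight` and `margin` parameters are hypothetical and are not taken from the paper.

```python
import math


def reconstruction_loss(x, x_hat):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def euclidean(u, v):
    """Euclidean distance between two latent codes."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(z_anchor, z_pos, z_neg, margin=1.0):
    """Triplet margin loss (assumed metric term, not confirmed by the source).

    Penalizes the case where the same-class code (z_pos) is not at least
    `margin` closer to the anchor than the different-class code (z_neg).
    """
    return max(0.0, euclidean(z_anchor, z_pos) - euclidean(z_anchor, z_neg) + margin)


def joint_loss(x, x_hat, z_anchor, z_pos, z_neg, weight=1.0, margin=1.0):
    """Assumed joint objective: reconstruction term plus weighted metric term."""
    return reconstruction_loss(x, x_hat) + weight * triplet_loss(
        z_anchor, z_pos, z_neg, margin
    )
```

In a sketch like this, the reconstruction term acts as a regularizer on the encoder while the metric term shapes the latent space for retrieval; how the two are actually balanced in the proposed method is not specified here.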