One of the most important tasks in image understanding is to map visual information into symbolic concepts that describe an input scene. However, this mapping presents the following difficulties: (1) how to resolve the ambiguity in visual information; (2) how to reduce the redundancy of visual information; and (3) how to describe scenes efficiently using symbolic concepts. In this paper, we propose the ICE System, a computer vision framework that addresses these three problems. First, we show that a multi-layered model based on the hypercolumn with a selective attention mechanism can solve the first two problems. Then, we describe how the ICE System is structured on the basis of this multi-layered model.