3D reconstruction of indoor scenes that contain non-rigidly moving human body using depth cameras is a task of extraordinary difficulty. Despite intensive efforts from the researchers in the 3D vision community, existing methods are still limited to reconstruct small scale scenes. This is because of the difficulty to track the camera motion when a target person moves in a totally different direction. Due to the narrow field of view (FoV) of consumer-grade red-green-blue-depth (RGB-D) cameras, a target person (generally put at about 2-3 meters from the camera) covers most of the FoV of the camera. Therefore, there are not enough features from the static background to track the motion of the camera. In this paper, we propose a system which reconstructs a moving human body and the background of an indoor scene using multiple depth cameras. Our system is composed of three Kinects that are approximately set in the same line and facing the same direction so that their FoV do not overlap (to avoid interference). Owing to this setup, we capture images of a person moving in a large scale indoor scene. The three Kinect cameras are calibrated with a robust method that uses three large non parallel planes. A moving person is detected by using human skeleton information, and is reconstructed separately from the static background. By separating the human body and the background, static 3D reconstruction can be adopted for the static background area while a method specialized for the human body area can be used to reconstruct the 3D model of the moving person. The experimental result shows the performance of proposed system for human body in a large-scale indoor scene.