This paper describes a vision-based 3D real-virtual interaction system that enables realistic avatar motion control, in which the virtual camera is controlled by the user's body posture. Human motion analysis is implemented by blob tracking, and a physically constrained motion synthesis method generates realistic motion from a limited number of blobs. Our framework utilizes virtual scene context as a priori knowledge. To make the virtual scene more realistic beyond the limitations of real-world sensing, we augment the reality of the virtual scene by simulating various real-world events; concretely, we assume that the virtual environment can provide action information to the avatar. Third-person viewpoint control coupled with body posture is also realized, allowing the user to directly access virtual objects.
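As a rough illustration of the blob-tracking step, the sketch below labels connected foreground regions in a binary silhouette mask and reports each blob's centroid and area. This is a minimal assumption-laden example, not the paper's actual tracker: the function name `find_blobs`, the 4-connectivity choice, and the list-of-lists mask format are all illustrative.

```python
from collections import deque

def find_blobs(mask):
    """Label 4-connected foreground blobs in a binary mask (list of lists
    of 0/1) and return each blob's (centroid_row, centroid_col, area).
    Illustrative sketch only; a real tracker would work on camera frames."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # Breadth-first flood fill collects one connected blob.
                queue = deque([(r, c)])
                seen[r][c] = True
                pixels = []
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < rows and 0 <= nx < cols \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                area = len(pixels)
                cy = sum(p[0] for p in pixels) / area
                cx = sum(p[1] for p in pixels) / area
                blobs.append((cy, cx, area))
    return blobs

# Two separate blobs in a toy silhouette mask.
mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
print(find_blobs(mask))  # -> [(0.5, 0.5, 4), (2.5, 3.0, 2)]
```

Blob centroids like these could then serve as the sparse observations from which a physically constrained synthesis method infers full-body motion.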