This paper presents a ray-casting-based three-dimensional (3D) positioning system that interactively reconstructs scene structures for handheld augmented reality. The proposed system employs visual simultaneous localization and mapping (vSLAM) technology to acquire the camera poses of a smartphone and sparse 3D feature points in an unknown scene. First, while scanning a scene, users specify a region corresponding to a geometric shape, such as a plane, in the captured images. This is performed by manually selecting some of the feature points generated by vSLAM within the region. Next, the system computes the shape parameters from the selected feature points, yielding a dense reconstruction of the scene structure. Subsequently, for 3D positioning, users select the pixel corresponding to a target point in the scene from a single camera view. Finally, the system computes the intersection between the 3D ray back-projected from the selected pixel and the reconstructed scene structure to determine the 3D coordinates of the target point. Owing to the proposed interactive reconstruction, the scene structure can be estimated accurately and stably; therefore, 3D positioning is accurate. Because the geometric shape used for the scene structure in this study is a plane, our system is referred to as PlanAR. In the evaluation, the performance of our system is statistically compared with that of an existing 3D positioning system to demonstrate the accuracy and stability of our system.
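The geometric core of the pipeline described above (fit a plane to user-selected vSLAM feature points, then intersect the ray through a user-selected pixel with that plane) can be illustrated with a minimal sketch. The parameterization (a non-vertical plane z = ax + by + c solved by least squares) and all function names here are illustrative assumptions, not the paper's actual implementation:

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane(points):
    """Least-squares plane z = a*x + b*y + c through selected 3D feature
    points (simplifying assumption: the plane is not vertical).
    Returns (normal, offset) such that the plane satisfies normal . x = offset."""
    sxx = sum(x * x for x, y, z in points)
    sxy = sum(x * y for x, y, z in points)
    sx  = sum(x for x, y, z in points)
    syy = sum(y * y for x, y, z in points)
    sy  = sum(y for x, y, z in points)
    sxz = sum(x * z for x, y, z in points)
    syz = sum(y * z for x, y, z in points)
    sz  = sum(z for x, y, z in points)
    n = len(points)
    # Normal equations of the least-squares fit, solved by Cramer's rule.
    a_mat = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    b_vec = [sxz, syz, sz]
    d = det3(a_mat)
    coeffs = []
    for col in range(3):
        m = [row[:] for row in a_mat]
        for r in range(3):
            m[r][col] = b_vec[r]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    # Rewrite z = a*x + b*y + c as (a, b, -1) . (x, y, z) = -c.
    return (a, b, -1.0), -c

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def ray_plane_intersection(origin, direction, normal, offset):
    """Intersect the ray origin + t*direction (t >= 0) with the plane
    normal . x = offset; returns the 3D point, or None if there is no hit."""
    denom = dot(normal, direction)
    if abs(denom) < 1e-9:
        return None  # ray parallel to the plane
    t = (offset - dot(normal, origin)) / denom
    if t < 0:
        return None  # intersection behind the camera
    return [origin[i] + t * direction[i] for i in range(3)]
```

In the full system, `origin` would be the camera center from vSLAM and `direction` the back-projection of the selected pixel through the camera intrinsics; both are taken as given here.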