The broadband merits of wireless gigabit (WiGig) technology motivate its extensive use in future wireless sensor networks (WSNs) and, more generally, in internet of things (IoT) networks. A WiGig sensor should select the best nearby sensor for relaying its collected information, so as to maximize its achievable throughput while reducing its energy consumption. However, the nearby best sensor selection (NBSS) problem requires intelligent solutions that mitigate the resulting beamforming training (BT) overhead. In this paper, using online learning, the NBSS problem is modeled as a stochastic multi-armed bandit (MAB), where the nearby sensor nodes are the arms and the reward is the throughput received by the player, i.e., the source sensor node. Energy-aware (EA) MAB schemes based on perturbed-history exploration (PHE) and randomized upper confidence bound (RUCB) algorithms are then proposed to solve the problem in realistic scenarios by updating the residual energies of the nearby sensors during the online selection process. Simulation results demonstrate the superiority of the proposed NBSS schemes over benchmark selection methods in terms of average throughput and energy efficiency.
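As a rough illustration of the MAB framing described above, the following Python sketch treats each nearby sensor as an arm and selects a relay with a UCB-style index scaled by the arm's residual energy. The throughput distribution, energy-discharge rate, and index form are illustrative assumptions only, not the proposed EA-PHE or EA-RUCB schemes.

```python
import math
import random

# Minimal sketch of the NBSS problem as a stochastic multi-armed bandit.
# Each nearby sensor is an arm; the reward is the throughput observed by the
# source node. Throughput statistics, the energy-discharge model, and the
# UCB-style index below are assumptions for illustration, not the paper's EA-MAB design.

NUM_SENSORS = 5
HORIZON = 1000

true_mean_throughput = [random.uniform(0.5, 1.0) for _ in range(NUM_SENSORS)]  # Gbps (assumed)
residual_energy = [1.0] * NUM_SENSORS          # normalized battery levels (assumed)
pulls = [0] * NUM_SENSORS
mean_reward = [0.0] * NUM_SENSORS

for t in range(1, HORIZON + 1):
    # Energy-aware UCB index: empirical throughput plus exploration bonus,
    # scaled by the arm's residual energy so depleted sensors are chosen less often.
    def index(i):
        if pulls[i] == 0:
            return float("inf")
        bonus = math.sqrt(2 * math.log(t) / pulls[i])
        return residual_energy[i] * (mean_reward[i] + bonus)

    arm = max(range(NUM_SENSORS), key=index)

    # Observe a noisy throughput sample from the selected relay sensor.
    reward = max(0.0, random.gauss(true_mean_throughput[arm], 0.1))

    # Update empirical statistics and the relay's residual energy.
    pulls[arm] += 1
    mean_reward[arm] += (reward - mean_reward[arm]) / pulls[arm]
    residual_energy[arm] = max(0.0, residual_energy[arm] - 0.001)  # assumed per-use cost

print("selection counts:", pulls)
```

Updating the residual-energy term inside the selection index is what makes the selection energy-aware: an arm with high throughput but a drained battery gradually loses priority to less loaded neighbors.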