Hybrid system falsification is an actively studied topic, as a scalable quality assurance methodology for real-world cyber-physical systems. In falsification, one employs stochastic hill-climbing optimization to quickly find a counterexample input to a black-box system model. Quantitative robust semantics is the technical key that enables use of such optimization. In this paper, we tackle the so-called scale problem regarding Boolean connectives that is widely recognized in the community: quantities of different scales (such as speed [km/h] vs. rpm, or worse, rph) can mask each other’s contribution to robustness. Our solution consists of integration of the multi-armed bandit algorithms in hill climbing-guided falsification frameworks, with a technical novelty of a new reward notion that we call hill-climbing gain. Our experiments show our approach’s robustness under the change of scales, and that it outperforms a state-of-the-art falsification tool.