Auroras are beautiful phenomena that attract many people. However, their physical model remains a subject of dispute, because auroras arise from the interaction of several regions (the solar wind, the magnetosphere, and the ionosphere) and it is difficult to obtain data simultaneously across such wide regions. This paper is devoted to forecasting the onset of auroral substorms, that is, the brightening of auroras followed by poleward expansion. We adopt a data-driven approach instead of relying on physical models of auroras. This approach requires labeled data that indicate when auroras appeared. Constructing such data is challenging, because a wide variety of observations are available from diverse regions, but they are not tied to the onset times of auroras. We identified auroral substorms using all-sky images obtained at Tromsø, Norway. Then, as a first attempt toward this goal, we chose solar wind and geomagnetic field data out of the many available data types and associated them with the onset times of the identified auroral substorms. Using the constructed data, we trained a support vector machine classifier, a typical supervised learning algorithm, which achieved around 78% classification accuracy in 5-fold cross-validation.
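
The evaluation protocol described above (a support vector machine scored by 5-fold cross-validation) can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the data are synthetic stand-ins rather than solar wind or geomagnetic measurements, the classifier is a linear soft-margin SVM trained by stochastic subgradient descent on the hinge loss (the kernel and solver used in this work are not stated here), and all function names and hyperparameters are hypothetical. The accuracy it prints says nothing about the roughly 78% reported above.

```python
import random

def make_synthetic(n, rng):
    """Hypothetical stand-in data: two Gaussian blobs, labels +1 (onset) and -1 (no onset)."""
    samples = []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        features = [rng.gauss(1.5 * label, 1.0) for _ in range(3)]
        samples.append((features, label))
    rng.shuffle(samples)
    return samples

def decision(w, b, x):
    """Signed score w.x + b of a linear classifier."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(samples, lam=0.01, eta=0.01, epochs=20, seed=0):
    """Linear soft-margin SVM via stochastic subgradient descent on the
    L2-regularized hinge loss (an assumed, simplified training scheme)."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0][0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i]
            violated = y * decision(w, b, x) < 1.0   # inside the margin or misclassified
            w = [wj - eta * lam * wj for wj in w]    # shrinkage from the regularizer
            if violated:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b

def predict(w, b, x):
    return 1 if decision(w, b, x) >= 0.0 else -1

def k_fold_accuracy(samples, k=5):
    """Train on k-1 folds, test on the held-out fold, and repeat for each fold."""
    folds = [samples[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        w, b = train_linear_svm(train)
        correct = sum(1 for x, y in held_out if predict(w, b, x) == y)
        scores.append(correct / len(held_out))
    return scores

rng = random.Random(0)
data = make_synthetic(200, rng)
scores = k_fold_accuracy(data, k=5)
mean_accuracy = sum(scores) / len(scores)
print("per-fold accuracy:", [round(s, 3) for s in scores])
print("mean accuracy:", round(mean_accuracy, 3))
```

The reported figure corresponds to `mean_accuracy` here: the average of the five held-out-fold accuracies, so that every labeled sample is used for testing exactly once. In practice a library SVM implementation would normally replace the hand-written trainer.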