Real-time operation and robustness to illumination variation remain two challenging requirements for traffic congestion classification systems. This paper proposes an efficient automated system for traffic congestion classification based on compact image representations and deep residual networks. The proposed system comprises three steps: video dynamics extraction, feature extraction, and classification. In the first step, we propose two approaches for modeling the dynamics of each video as a compact representation: the first aggregates the optical flow in the forward direction, while the second applies a temporal pooling method to generate a dynamic image that summarizes the input video. In the second step, we use a deep residual neural network to extract texture features from the compact representation of each video. In the third step, we build a classification model that discriminates between three levels of traffic congestion (low, medium, and high). We assess the performance of the proposed method on the UCSD and NU1 traffic congestion datasets, both of which contain illumination and shadow variations. The proposed method achieves excellent results compared with state-of-the-art methods and classifies input video at 37 frames per second, making it suitable for real-time applications.
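To make the temporal pooling step concrete, the sketch below shows one common way to compress a video into a single "dynamic image": the closed-form approximate rank pooling of Bilen et al., which weights each frame by a coefficient depending on its temporal position and sums the result. This is a minimal illustrative sketch, not the authors' implementation; the function name `dynamic_image` and the toy video are assumptions for illustration.

```python
import numpy as np

def dynamic_image(frames):
    """Collapse a list of frames into one image summarizing temporal
    evolution, via approximate rank-pooling coefficients.
    Illustrative sketch only, not the paper's actual code."""
    T = len(frames)
    # Closed-form approximate rank-pooling weights: alpha_t = 2t - T - 1
    # (later frames get positive weight, earlier frames negative)
    alphas = np.array([2 * (t + 1) - T - 1 for t in range(T)], dtype=np.float64)
    stack = np.stack(frames).astype(np.float64)      # shape (T, H, W)
    di = np.tensordot(alphas, stack, axes=1)         # weighted sum over time
    # Rescale to [0, 255] so the result can feed a CNN such as a ResNet
    di -= di.min()
    if di.max() > 0:
        di = 255.0 * di / di.max()
    return di.astype(np.uint8)

# Toy grayscale video: a fixed spatial gradient whose brightness grows over time
grid = np.arange(16).reshape(4, 4)
video = [(grid * t).astype(np.uint8) for t in range(5)]
img = dynamic_image(video)
```

Pixels whose intensity changes most over time dominate the pooled image, which is why such a representation captures motion (e.g., vehicle flow) in a form a standard 2D CNN can consume.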