State-of-the-art deep learning (DL) systems are vulnerable to adversarial examples, which hinders their adoption in safety- and security-critical scenarios. Although recent work has made progress in analyzing the robustness of feed-forward neural networks, robustness analysis for stateful DL systems, such as recurrent neural networks (RNNs), remains largely uncharted. In this paper, we propose Marble, a model-based approach for quantitative robustness analysis of real-world RNN-based DL systems. Marble builds a probabilistic model that compactly characterizes the robustness of RNNs through abstraction. Furthermore, we propose an iterative refinement algorithm to derive a precise abstraction, which enables accurate quantification of robustness. We evaluate the effectiveness of Marble on both LSTM and GRU models, each trained separately on three popular natural language datasets. The results demonstrate that (1) our refinement algorithm derives an accurate abstraction more efficiently than a random strategy, and (2) Marble enables quantitative robustness analysis with better efficiency, accuracy, and scalability than state-of-the-art techniques.
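To make the idea of a probabilistic abstraction of an RNN concrete, the following is a minimal illustrative sketch and not Marble's actual algorithm, which is not specified in this abstract. It assumes that hidden-state vectors are abstracted by a uniform grid and that transition probabilities between abstract states are estimated by counting over recorded traces. The names `abstract_state`, `build_transition_model`, and `cell_width` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def abstract_state(hidden, cell_width):
    """Map a concrete hidden-state vector to a grid cell (assumed abstraction)."""
    return tuple(math.floor(h / cell_width) for h in hidden)


def build_transition_model(traces, cell_width):
    """Estimate transition probabilities between abstract states by counting."""
    counts = defaultdict(Counter)
    for trace in traces:
        states = [abstract_state(h, cell_width) for h in trace]
        # Count each consecutive pair of abstract states in the trace.
        for src, dst in zip(states, states[1:]):
            counts[src][dst] += 1
    model = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        model[src] = {dst: n / total for dst, n in dsts.items()}
    return model


# Toy hidden-state traces (hypothetical data, two time series of 2-D states).
traces = [
    [(0.1, 0.2), (0.6, 0.2), (0.7, 0.9)],
    [(0.2, 0.1), (0.6, 0.3), (0.1, 0.1)],
]
model = build_transition_model(traces, cell_width=0.5)
```

In this sketch, a smaller `cell_width` yields a finer abstraction with more states. An iterative refinement scheme could, for instance, shrink cells only where the model is imprecise, but how Marble selects what to refine is an open detail here.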