With the help of massive data and rich computational resource, deep Q-learning has been widely used in operations research and management science and receives great success in numerous applications including, recommender system, games and robotic manipulation. Compared with avid research activities in practice, there lack solid theoretical verifications and interpretability for the success of deep Q-learning, making it be a little bit mystery. The aim of this talk is to discuss the power of depth in deep Q-learning. In the framework of learning theory, we rigorously prove that deep Q-learning outperforms the traditional one by showing its good generalization error bound. Our results show that the main reason of the success of deep Q-learning is due to the excellent performance of deep neural networks (deep nets) in capturing special properties of rewards such as the spatially sparse and piecewise constant rather than due to their large capacities. In particular, we provide answers to questions why and when deep Q-learning performs better than the traditional one and how about the generalization capability of deep Q-learning.