Crowd counting is important to many applications. Conventional vision-based approaches require line of sight and raise privacy concerns, while most radio-based approaches incur high deployment costs. In this paper, we propose using WiFi channel state information (CSI) to infer the crowd count in a device-free manner, with only a single WiFi transmitter-receiver pair. The proposed method uses a deep neural network (DNN) to learn the statistical relationship between the variation of CSI and the number of people, and then estimates the people count from real-time CSI using the trained DNN model. Evaluations demonstrate the effectiveness of the method: for a crowd size of 6, the counting error was within 1 person in 100% of the cases; for a crowd size of 34, it was within 1 person in 97.7% of the cases and within 2 persons in 99.3% of the cases.
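To make the reported accuracy figures concrete, the following is a minimal sketch of how a "counting error within k persons" rate could be computed from ground-truth and estimated counts. The function name `within_k_rate` and the sample values are illustrative assumptions and are not taken from the evaluation itself.

```python
def within_k_rate(true_counts, predicted_counts, k):
    """Return the fraction of samples whose absolute counting error is at most k."""
    if len(true_counts) != len(predicted_counts) or len(true_counts) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Count samples where |estimate - ground truth| <= k persons.
    hits = sum(
        1 for t, p in zip(true_counts, predicted_counts) if abs(p - t) <= k
    )
    return hits / len(true_counts)


# Made-up example values (hypothetical, not data from the paper).
true_counts = [6, 6, 6, 6, 6]
predicted_counts = [6, 5, 7, 6, 8]

rate_1 = within_k_rate(true_counts, predicted_counts, 1)  # error within 1 person
rate_2 = within_k_rate(true_counts, predicted_counts, 2)  # error within 2 persons
print(f"within 1 person: {rate_1:.1%}, within 2 persons: {rate_2:.1%}")
```

Under this reading, a statement such as "within 1 person for 97.7% of the cases" corresponds to `within_k_rate(..., k=1)` returning 0.977 over the test samples for that crowd size.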