Fooyo recently received a request to build a news classification app. It sounded challenging and interesting, and we enjoy solving hard problems. The deadline was tight, but we managed to deliver, and the client gave us good feedback.

The request was to crawl news data from various news sources, remove duplicates, classify the news into categories, and display the results in a mobile app. That sounds straightforward. However, each module involves hard components that take deep knowledge to solve.

In summary, there are four components:

1. A web news crawler
2. A duplicate removal mechanism
3. A news classification mechanism
4. A mobile app which interacts with the server

A web crawler fetches the news list data and the corresponding page links. The content of each linked page is then saved on the server. There are quite a few good open-source crawlers out there, e.g., Scrapy. Some sites ma