Mothership News Crawler
Web Scraping
A news crawler that automatically retrieves article links, headlines and subtitles.
I am using the Singapore news website Mothership to build a news crawler with Requests and BeautifulSoup in Python.
It is my first attempt at a web automation project with these two libraries.
For practice:
- In main.py, I practised using Requests to retrieve the Mothership website and BeautifulSoup to crawl through its web elements.
- In ms_latest_news.py, I practised retrieving the URL, header and subtitle of each news article under the ‘Latest News’ section of the Mothership website.
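
The two practice steps above can be sketched roughly as follows. This is a minimal illustration and not the actual contents of main.py or ms_latest_news.py: the base URL, the CSS selectors (`div.latest-news`, `div.ind-article`, `p.subtitle`) and the sample HTML are all assumptions made for the example, and the real site markup may differ.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://mothership.sg/"  # assumed base URL for the site


def fetch_html(url: str) -> str:
    """Download a page with Requests and return its HTML text."""
    import requests  # imported here so the parser below also works offline

    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def parse_latest_news(html: str, base_url: str = BASE_URL) -> list[dict]:
    """Extract URL, header and subtitle from each article card.

    The selectors used here are hypothetical placeholders,
    not the site's confirmed markup.
    """
    soup = BeautifulSoup(html, "html.parser")
    articles = []
    for card in soup.select("div.latest-news div.ind-article"):
        link = card.find("a", href=True)
        header = card.find("h1")
        subtitle = card.find("p", class_="subtitle")
        if link is None or header is None:
            continue  # skip cards that do not look like articles
        articles.append({
            "url": urljoin(base_url, link["href"]),  # make relative links absolute
            "header": header.get_text(strip=True),
            "subtitle": subtitle.get_text(strip=True) if subtitle else "",
        })
    return articles


# Made-up HTML standing in for the 'Latest News' section.
SAMPLE_HTML = """
<div class="latest-news">
  <div class="ind-article">
    <a href="/2024/01/example-story/"><h1>Example headline</h1></a>
    <p class="subtitle">Example subtitle.</p>
  </div>
  <div class="ind-article"><p>Advertisement</p></div>
</div>
"""

latest = parse_latest_news(SAMPLE_HTML)
for article in latest:
    print(article["url"], "|", article["header"], "|", article["subtitle"])
```

Against the live site, the same parser would be called as `parse_latest_news(fetch_html(BASE_URL))`, once the selectors have been adjusted to match the page's real structure.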
Finally, in Project: Mothership News Crawler, by Bei Le, I visited each article page listed under the ‘Latest News’ section of the Mothership website and retrieved the header and subtitle of every suggested post at the bottom of the page.
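
A rough sketch of that final step is shown here, assuming the suggested posts sit in a container that can be selected by class. The selectors (`div.suggested-posts`, `div.post`), the helper names and the fake pages are hypothetical and only illustrate the loop over article pages; they are not taken from the project itself.

```python
from typing import Callable

from bs4 import BeautifulSoup


def parse_suggested_posts(html: str) -> list[tuple[str, str]]:
    """Return (header, subtitle) pairs for the suggested posts on one page.

    The selectors are hypothetical placeholders for the real markup.
    """
    soup = BeautifulSoup(html, "html.parser")
    posts = []
    for card in soup.select("div.suggested-posts div.post"):
        header = card.find("h2")
        subtitle = card.find("p")
        if header is None:
            continue  # ignore cards without a header
        posts.append((
            header.get_text(strip=True),
            subtitle.get_text(strip=True) if subtitle else "",
        ))
    return posts


def crawl_suggested(
    article_urls: list[str],
    fetch: Callable[[str], str],
) -> dict[str, list[tuple[str, str]]]:
    """Fetch every article page and collect its suggested posts."""
    return {url: parse_suggested_posts(fetch(url)) for url in article_urls}


# Made-up pages so the sketch runs without network access.
FAKE_PAGES = {
    "https://example.test/a": """
        <div class="suggested-posts">
          <div class="post"><h2>First suggestion</h2><p>Its subtitle</p></div>
          <div class="post"><h2>Second suggestion</h2></div>
        </div>""",
    "https://example.test/b": "<p>No suggestions on this page.</p>",
}

suggested = crawl_suggested(list(FAKE_PAGES), FAKE_PAGES.__getitem__)
for url, posts in suggested.items():
    print(url, posts)
```

Passing the fetcher in as a parameter keeps the parsing logic testable offline. For a real crawl, a function that calls `requests.get(url, timeout=10)` and returns `response.text` could be supplied instead of the fake lookup.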
I think it’s really cool that we can use programming to automate tasks like these for us!