I’m not a developer by trade, but I know just enough Python to automate repetitive tasks. Today at work, I saw an opportunity to speed up a task by writing a few lines of Python. Without going into too much detail, the task involved finding every blog post that lacks a table of contents section.

We have a ton of content on our blog, so going through each post manually would’ve taken forever. To speed things up, I spent a few minutes writing the Python script below. To create the list of posts to check, I grabbed all the URLs from our sitemap, formatted them into a Python list, and assigned the list to the blog_urls variable. Finally, I ran the script. A few minutes later, I had a complete list of all the blog posts that don’t have a table of contents section.

import requests
from bs4 import BeautifulSoup
from multiprocessing import Pool

blog_urls = [(LIST OF URLS)]

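def check_toc_status(url):
    # Fetch the page; the timeout keeps one slow post from stalling a worker.
    try:
        r = requests.get(url, timeout=10)
        r.raise_for_status()
    except requests.RequestException as exc:
        print(f"ERROR: {url} ({exc})")
        return url, None
    soup = BeautifulSoup(r.text, features="html.parser")
    # A post has a table of contents if it contains <aside class="toc">.
    has_toc = soup.find("aside", {"class": "toc"}) is not None
    print(f"OK: {url}" if has_toc else f"NO: {url}")
    return url, has_toc

if __name__ == '__main__':
    # Eight worker processes fetch and check the posts in parallel.
    with Pool(8) as pool:
        results = pool.map(check_toc_status, blog_urls)
    # Collect the posts that loaded fine but have no table of contents.
    missing = [url for url, has_toc in results if has_toc is False]
    print(f"{len(missing)} posts without a table of contents")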
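
The sitemap step could probably be scripted as well, rather than copying and formatting the URLs by hand. What follows is only a rough sketch, assuming the blog publishes a standard sitemaps.org-style XML sitemap with one `<url><loc>` entry per post. The helper names `extract_urls` and `fetch_sitemap_urls` and the example address are hypothetical and not part of the script above. A sitemap index that points to other sitemaps would need an extra step that this sketch doesn't cover.

```python
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by standard sitemap files (assumption: the blog's
# sitemap follows the sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_urls(sitemap_xml):
    """Return the address in every <url><loc> entry of a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def fetch_sitemap_urls(sitemap_url, timeout=10):
    """Download a sitemap and return the list of page URLs it contains."""
    with urllib.request.urlopen(sitemap_url, timeout=timeout) as response:
        return extract_urls(response.read())
```

With something like this in place, the list could be built with `blog_urls = fetch_sitemap_urls("https://example.com/sitemap.xml")`, where the address is a placeholder for the real sitemap location.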