Personal project · 2025

Wikipedia Scraper

Async crawler with 100 concurrent workers, O(1) URL deduplication, and a 20-second global deadline.

Python · Asyncio · Aiohttp · BeautifulSoup

Challenge

Crawling a large site such as Wikipedia under a strict time limit means trading speed against resource use: too little concurrency wastes the time budget, while too much strains connections and memory.

Approach

Built a high-concurrency async crawler on Python's asyncio and aiohttp. 100 workers fetch pages concurrently, an O(1) deduplication check keeps each URL from being queued more than once, and a global 20-second deadline bounds the whole run.
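
The sketch below illustrates how these three pieces can fit together. It is a minimal illustration under stated assumptions, not the project's actual code: the names crawl and fetch_links are hypothetical, deduplication is assumed to be a plain Python set (average O(1) membership checks), and the page fetcher is passed in as a parameter so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, Set

# Hypothetical fetcher type: takes a URL, returns the URLs linked from that page.
FetchLinks = Callable[[str], Awaitable[Iterable[str]]]


async def crawl(
    start_url: str,
    fetch_links: FetchLinks,
    num_workers: int = 100,
    deadline_seconds: float = 20.0,
) -> Set[str]:
    """Crawl from start_url until the frontier is empty or the deadline passes."""
    seen: Set[str] = {start_url}  # assumed dedup structure: a hash set
    queue: asyncio.Queue = asyncio.Queue()
    queue.put_nowait(start_url)

    async def worker() -> None:
        while True:
            url = await queue.get()
            try:
                for link in await fetch_links(url):
                    # No await between the check and the add, so two
                    # workers cannot both enqueue the same URL.
                    if link not in seen:
                        seen.add(link)
                        queue.put_nowait(link)
            except Exception:
                # A failed page is skipped; cancellation still propagates.
                pass
            finally:
                queue.task_done()

    workers = [asyncio.create_task(worker()) for _ in range(num_workers)]
    try:
        # queue.join() returns once every queued URL has been processed;
        # wait_for enforces the global deadline on top of that.
        await asyncio.wait_for(queue.join(), timeout=deadline_seconds)
    except asyncio.TimeoutError:
        pass  # deadline hit: keep whatever was discovered so far
    finally:
        for task in workers:
            task.cancel()
        await asyncio.gather(*workers, return_exceptions=True)
    return seen
```

In a real run, fetch_links would presumably download the page with an aiohttp ClientSession and extract anchor hrefs with BeautifulSoup. Keeping it as a parameter also makes the deadline and deduplication logic easy to test against an in-memory link graph.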

What it does