Top #crawl Tools & Software
Explore 10 hand-picked tools and software projects tagged with crawl, ranked by popularity and community signals.
scrapy
GitHub: Scrapy, a fast, high-level web crawling and scraping framework for Python.
Scrapling
GitHub: 🕷️ An adaptive web scraping framework that handles everything from a single request to a full-scale crawl.
lux
GitHub: 👾 Fast and simple video download library and CLI tool written in Go.
crawlee
GitHub: Crawlee is a web scraping and browser automation library for Node.js for building reliable crawlers in JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Supports both headful and headless modes, with proxy rotation.
maxun
GitHub: 🔥 The open-source no-code platform for web scraping, crawling, search, and AI data extraction. Turn websites into structured APIs in minutes. 🔥
crawlee-python
GitHub: Crawlee is a web scraping and browser automation library for Python for building reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.
InfoSpider
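GitHub: INFO-SPIDER is a crawler toolbox 🧰 that brings many data sources together, designed to help users retrieve their own data safely and quickly. The tool's code is open source and its process is transparent. Supported data sources include GitHub, QQ Mail, NetEase Mail, Alibaba Mail, Sina Mail, Hotmail, Outlook, JD.com, Taobao, Alipay, China Mobile, China Unicom, China Telecom, Zhihu, Bilibili, NetEase Cloud Music, QQ friends, QQ groups, WeChat Moments album generation, browser history, 12306, Cnblogs, CSDN blogs, OSChina blogs, and Jianshu.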
autoscraper
GitHub: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
rod
GitHub: A Chrome DevTools Protocol driver for web automation and scraping.
pydoll
GitHub: Pydoll is a library for automating Chromium-based browsers without a WebDriver, offering realistic interactions.