Top #crawl Tools & Software

Explore 17 hand-picked tools and software tagged with crawl — ranked by popularity and community signals.

firecrawl

github

🔥 The API to search, scrape, and interact with the web for AI

AI Tools TypeScript
★ 111,353

scrapy

github

Scrapy, a fast high-level web crawling & scraping framework for Python.

Frameworks Python
★ 61,339

Scrapling

github

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

AI Tools Python
★ 37,267

lux

github

👾 Fast and simple video download library and CLI tool written in Go

Collaboration Go
★ 31,241

colly

github

Elegant Scraper and Crawler Framework for Golang

Frameworks Go
★ 25,246

Scrapegraph-ai

github

Python scraper based on AI

AI Tools Python
★ 23,403

crawlee

github

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

Automation TypeScript
★ 22,808

katana

github

A next-generation crawling and spidering framework.

CLI Tools Go
★ 16,646

maxun

github

🔥 The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in minutes 🔥

Automation TypeScript
★ 15,375

crawlee-python

github

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Parsel, BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

Automation Python
★ 8,783

InfoSpider

github

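INFO-SPIDER is a crawler toolbox 🧰 that brings together many data sources, designed to help users retrieve their own data safely and quickly. The tool's code is open source and its process is transparent. Supported data sources include GitHub, QQ Mail, NetEase Mail, Alibaba Mail, Sina Mail, Hotmail, Outlook, JD.com, Taobao, Alipay, China Mobile, China Unicom, China Telecom, Zhihu, Bilibili, NetEase Cloud Music, QQ friends, QQ groups, WeChat Moments album generation, browser history, 12306, Cnblogs, CSDN blogs, OSChina blogs, and Jianshu.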

Automation Python
★ 8,203

autoscraper

github

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

AI Tools Python
★ 7,141

rod

github

A Chrome DevTools Protocol driver for web automation and scraping.

Automation Go
★ 6,863

pydoll

github

Pydoll is a library for automating chromium-based browsers without a WebDriver, offering realistic interactions.

Testing Python
★ 6,748

TorBot

github

Dark Web OSINT Tool

Security Python
★ 4,053

cariddi

github

Take a list of domains, crawl URLs, and scan for endpoints, secrets, API keys, file extensions, tokens, and more

Security Go
★ 3,359

NGCBot

github

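A WeChat bot based on a ✨hook mechanism. It supports 🌱scheduled security news pushes (FreeBuf, Xianzhi, Anquanke, QiAnXin Attack and Defense Community), 👯KFC copywriting, ⚡vulnerability lookup, ⚡phone number location lookup, ⚡knowledge base queries, 🎉horoscope lookup, ⚡weather lookup, 🌱a slacking-off calendar, ⚡ThreatBook threat intelligence queries, 🐛videos, ⚡images, and a 👯help menu. 📫 It also supports a points system, ⚡automatic group invitations, 🌱automatic mass messaging, 👯AI replies (full support for mainstream Chinese AI models, Coze, FastGPT, and Dify), and ⚡WeChat Channels video parsing. 😄Highly customizable and easy for beginners to pick up!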

Security
★ 3,318