Top #data Tools & Software

Explore 64 hand-picked tools and software tagged with data, ranked by popularity and community signals.

Retrieval-based-Voice-Conversion-WebUI

github

Easily train a good VC model with voice data <= 10 mins!

Developer Tools Python
★ 35,237

spaCy

github

💫 Industrial-strength Natural Language Processing (NLP) in Python

AI Tools Python
★ 33,472

dokploy

github

Open Source Alternative to Vercel, Netlify and Heroku.

Database TypeScript
★ 33,104

posthog

github

🦔 PostHog is an all-in-one developer platform for building successful products. We offer product analytics, web analytics, session replay, error tracking, feature flags, experimentation, surveys, data warehouse, a CDP, and an AI product assistant to help debug your code, ship features faster, and keep all your usage and customer data in one stack.

Analytics Python
★ 32,605

cockroach

github

CockroachDB: the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.

Database Go
★ 32,059

interactive-coding-challenges

github

120+ interactive Python coding interview challenges (algorithms and data structures). Includes Anki flashcards.

Developer Tools Python
★ 31,337

ML-From-Scratch

github

Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

AI Tools Python
★ 31,306

pytorch-lightning

github

Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.

AI Tools Python
★ 31,051

EasyOCR

github

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.

AI Tools Python
★ 29,301

data-science-ipython-notebooks

github

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

AI Tools Python
★ 29,002

d2l-en

github

Interactive deep learning book with multi-framework code, math, and discussions. Adopted at 500 universities from 70 countries including Stanford, MIT, Harvard, and Cambridge.

AI Tools Python
★ 28,623

redash

github

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

Analytics Python
★ 28,414

kestra

github

Event Driven Orchestration & Scheduling Platform for Mission Critical Applications

No-code Java
★ 26,700

gitleaks

github

Find secrets with Gitleaks 🔑

DevOps Go
★ 25,995

prefect

github

Prefect is a workflow orchestration framework for building resilient data pipelines in Python.

Automation Python
★ 22,183

taipy

github

Turns Data and AI algorithms into production-ready web applications in no time.

Automation Python
★ 19,170

maxun

github

🔥 The open-source no-code platform for web scraping, crawling, search and AI data extraction • Turn websites into structured APIs in minutes 🔥

Automation TypeScript
★ 15,375

awesome-mlops

github

A curated list of references for MLOps

AI Tools
★ 13,854

RD-Agent

github

Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are committed to automating these high-value generic R&D processes through R&D-Agent, which lets AI drive data-driven AI. 🔗 https://aka.ms/RD-Agent-Tech-Report

AI Tools Python
★ 12,531

encore

github

Open source framework for building robust type-safe distributed systems with declarative infrastructure

Cloud Go
★ 11,804

test-your-sysadmin-skills

github

A collection of Linux Sysadmin Test Questions and Answers. Test your knowledge and skills in different fields with these Q/A.

Security
★ 11,560

pipedream

github

Connect APIs, remarkably fast. Free for developers.

Automation JavaScript
★ 11,251

tpot

github

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

AI Tools
★ 10,041

miller

github

Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON.

DevOps Go
★ 9,839