web-scraping

Here are 483 public repositories matching this topic...

codingforentrepreneurs / 30-Days-of-Python

Learn Python for the next 30 (or so) Days.

python api flask automation tutorial csv jupyter rest-api selenium pandas python3 web-scraping selenium-webdriver fastapi

Updated Feb 27, 2024
HTML

jaebradley / basketball_reference_web_scraper

NBA Stats API via Basketball Reference

python nba web-scraper web-scraping basketball-reference

Updated May 4, 2026
HTML

programminghistorian / jekyll

Jekyll-based static site for The Programming Historian

Updated Apr 28, 2026
HTML

austinoboyle / scrape-linkedin-selenium

`scrape_linkedin` is a Python package that scrapes personal LinkedIn profiles and company pages, turning the data into structured JSON.

python scraper linkedin scraping selenium web-scraper web-scraping scrape selenium-webdriver

Updated Oct 16, 2022
HTML

davidteather / everything-web-scraping

Learn everything web scraping with David Teather Codes on YouTube

python course everything reverse-engineering python3 web-scraping courses webscraping hacktoberfest youtube-series python-web-scraper project-based-learning web-scraping-tutorial project-based-learning-courses hacktoerfest web-scraping-python project-based-tutorials

Updated Jul 31, 2023
HTML

City-Bureau / city-scrapers

Scrape, standardize and share public meetings from local government websites

python open-data web-scraping scrapy city-scrapers

Updated May 4, 2026
HTML

currentslab / extractnet

A fork of Dragnet that also extracts author, headline, date, and keywords from context, with built-in metadata extraction, all in one package

python machine-learning text-mining news web-scraping webscraping news-articles news-extractor content-extraction news-extraction text-cleaning date-extraction author-extraction

Updated May 19, 2025
HTML

OSINT-TECHNOLOGIES / dpulse

DPULSE - Tool for complex approach to domain OSINT

Updated Apr 6, 2026
HTML

programminghistorian / ph-submissions

The repository and website hosting the peer review process for new Programming Historian lessons

python api open-source mapping multi-lingual web-scraping digital-humanities data-management pedagogy web-archiving network-analysis linked-open-data programming-historian dh open-educational-resources r-studio digital-history distant-reading

Updated May 4, 2026
HTML

khuyentran1401 / top-github-scraper

Scrape top GitHub repositories and users based on keywords

github python github-api scraping web-scraper web-scraping

Updated Jun 27, 2023
HTML

scrapehero / selectorlib

A library to read a YML file with Xpath or CSS Selectors and extract data from HTML pages using them

python scraping web-scraping selectors xpath

Updated Jan 30, 2023
HTML

A high-performance personal fund tracker dashboard focused on real-time net value estimation. It features look-through to underlying top stock holdings, smart reverse calculation of net value, and a robust three-level cache for a seamless experience.

python finance dashboard data-visualization web-scraping echarts realtime-data tailwind-css realtime-navigation fastapi fund-tracker nav-estimation

Updated Mar 21, 2026
HTML

Amey-Thakur / TEXT-SUMMARIZER

Machine Learning Project to Compare and Evaluate Text Summarization Algorithms Using SpaCy, NLTK, Gensim, and Sumy.

Updated Feb 20, 2026
HTML

LexiestLeszek / sova_ollama

Open source implementation of Sova, a RAG-based web search engine using the power of LLMs. Uses Langchain, Ollama, HuggingFace Embeddings, and scraped Google search results.

web-scraping large-language-models llm retrieval-augmented-generation rag-implementation