Clawfeed

A self-hosted web-to-RSS feed generator that extracts updates from webpages and normalizes them into RSS/Atom for monitoring, archiving, and reader integrations.
Stars: 1.3k | Language: HTML | License: MIT
Tags: #rss #atom #web-to-rss #feed-generator #content-monitoring #web-scraping #self-hosted #docker #alternative-to-rsshub #alternative-to-rss-bridge #feedly-like #inoreader-like

What is it?

Clawfeed turns webpages without native feeds into durable RSS/Atom outputs, replacing manual checking with an automated pipeline. It is a feed builder rather than a reader, so you plug the generated feeds into Feedly or Inoreader for reading and syncing. For team workflows, the emphasis is on treating scraping rules, filtering logic, and refresh cadence as engineered artifacts: rules can be versioned, outputs cached, failures degraded with alerts, and upstream volatility absorbed. Delivered as a container (for example with Docker), it serves as a lightweight monitoring and archiving layer suited to intranet and privacy-sensitive environments.

Pain Points vs Innovation

✕ Traditional Pain Points | ✓ Innovative Solutions
Many sources ship no RSS, forcing teams into notifications or manual checks with delays, weak traceability, and poor archiving. | Clawfeed engineers web extraction into feed generation: rule-driven scraping, controlled refresh, and cacheable outputs that turn feeds into an operable capability.
Readers are good at consuming feeds but not producing them, especially when auth, caching, filtering, and stable refresh are required. | It focuses on self-hosting and composability so RSS/Atom outputs work with any reader while supporting isolation, rate limits, and alerting for team needs.

Architecture Deep Dive

Rule-Driven Pipeline from Scraping to Feeds
Clawfeed models each source as rules plus an execution pipeline: inputs are webpages or endpoints and outputs are standard RSS/Atom items. The point is to confine scraping uncertainty to the rule layer so downstream systems only consume stable feed URLs. The pipeline typically includes extraction, normalization, deduplication, and ordering to prevent noisy re-emission across refresh cycles. From an ops standpoint, rule-driven design supports versioning and rollback, making upstream adaptation safer without reshaping the whole system.
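The normalization, deduplication, and ordering stages described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Clawfeed's actual code: the `Item` type, the `normalize`, `item_id`, and `build_feed` names, and the raw-item dictionary shape (`title`, `link`, `published` as an ISO 8601 string with an explicit UTC offset) are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from urllib.parse import urlsplit, urlunsplit


@dataclass(frozen=True)
class Item:
    """Hypothetical normalized feed item (not a Clawfeed type)."""
    title: str
    link: str
    published: datetime


def normalize(raw: dict) -> Item:
    """Collapse whitespace, drop URL fragments, and convert timestamps to UTC."""
    parts = urlsplit(raw["link"].strip())
    link = urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, parts.query, ""))
    published = datetime.fromisoformat(raw["published"]).astimezone(timezone.utc)
    return Item(title=" ".join(raw["title"].split()), link=link, published=published)


def item_id(item: Item) -> str:
    """Stable identifier derived from the normalized link (usable as a feed guid)."""
    return hashlib.sha256(item.link.encode("utf-8")).hexdigest()[:16]


def build_feed(raw_items: list[dict], seen: set[str]) -> list[Item]:
    """Normalize each raw item, skip ids already emitted, and order newest first."""
    fresh = []
    for raw in raw_items:
        item = normalize(raw)
        key = item_id(item)
        if key in seen:
            continue  # already emitted in this or an earlier refresh cycle
        seen.add(key)
        fresh.append(item)
    return sorted(fresh, key=lambda i: i.published, reverse=True)
```

Persisting the `seen` set between refresh cycles is what prevents the same entry from being re-emitted; where that state lives (file, database, cache) is a deployment decision this sketch leaves open.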
Operable Refresh Strategy and Reliability Boundaries
For long-running feeds, the real risk is not a single failure but uncontrolled refresh and silent breakage. Clawfeed makes refresh behavior explicit by treating caching, retries, timeouts, and degraded outputs as one policy, so feeds stay consumable when upstreams are volatile. To keep upstream requests from growing with the number of subscribers, a robust approach is timer-driven refresh with cache reuse for hot sources rather than fetching on every subscriber request. The result is a monitorable, rate-limited, and isolatable feed production layer that scales for team usage.
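A minimal sketch of timer-driven refresh with cache reuse and degraded output follows. It assumes a hypothetical `CachedSource` class and `run_refresh_loop` helper that are not part of Clawfeed; the `fetch` callable stands in for whatever scraping step a source uses and is expected to enforce its own timeout.

```python
import threading
from typing import Callable, Optional


class CachedSource:
    """Holds the last good feed body; subscribers never trigger upstream fetches."""

    def __init__(self, fetch: Callable[[], str], retries: int = 2):
        self.fetch = fetch          # hypothetical scraping function for one source
        self.retries = retries      # extra attempts after the first failure
        self.body: Optional[str] = None
        self.stale = False          # True when the latest refresh failed

    def refresh(self) -> bool:
        """Try upstream a bounded number of times; keep the old body on failure."""
        for _ in range(self.retries + 1):
            try:
                body = self.fetch()
            except Exception:
                continue
            self.body, self.stale = body, False
            return True
        self.stale = True  # degraded: the last good body is still served
        return False

    def get(self) -> Optional[str]:
        """Called per subscriber request; reads the cache only."""
        return self.body


def run_refresh_loop(source: CachedSource, interval_seconds: float,
                     stop: threading.Event) -> None:
    """Refresh once per interval until stop is set."""
    while not stop.is_set():
        source.refresh()
        stop.wait(interval_seconds)
```

Because `get()` never calls `fetch`, upstream load depends on the interval rather than on how many readers poll the feed. The `stale` flag is the natural hook for an alert.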

Deployment Guide

1. Clone the repo and install dependencies (choose npm/pnpm per docs)

bash
git clone https://github.com/kevinho/clawfeed.git && cd clawfeed && npm i

2. Configure runtime and refresh policy (targets, intervals, cache)

bash
# -i.bak works with both GNU and BSD/macOS sed; it leaves a .env.bak backup
cp .env.example .env && sed -i.bak 's/REFRESH_INTERVAL=.*/REFRESH_INTERVAL=300/' .env

3. Start locally and verify generated feed output

bash
npm run dev

4. Containerize for production and add health checks and rate limits

bash
docker build -t clawfeed:latest . && docker run -d --name clawfeed -p 1200:1200 clawfeed:latest

Use Cases

Scenario | Target Audience | Solution | Outcome
Competitive & Product Update Monitoring | Ops and Product Managers | Convert changelogs and announcement pages into RSS with alerts | Catch changes early with auditability
Intranet Collection and Archiving | Enterprise IT and Security Teams | Self-host and standardize external updates into RSS with access control | Less external dependency and better traceability
Update Signal Layer for Data Pipelines | Data Engineers | Use RSS/Atom as a unified change signal into ETL and workflows | Lower scraper maintenance and improved stability
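For the data-pipeline scenario, a downstream job can treat a generated feed as a change signal by tracking which entries it has already processed. The sketch below is an assumption-laden illustration: `new_entries` is a hypothetical helper, it handles plain RSS 2.0 only (Atom uses namespaced `entry` elements and would need different lookups), and it falls back to the link when an item has no `guid`.

```python
import xml.etree.ElementTree as ET


def new_entries(rss_xml: str, seen_guids: set[str]) -> list[dict]:
    """Return RSS 2.0 items not seen before, recording their ids in seen_guids."""
    root = ET.fromstring(rss_xml)
    fresh = []
    for item in root.iter("item"):
        # Prefer the guid; fall back to the link when guid is missing or empty.
        guid = item.findtext("guid") or item.findtext("link")
        if guid is None or guid in seen_guids:
            continue
        seen_guids.add(guid)
        fresh.append({
            "guid": guid,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return fresh
```

An ETL step would call this after each poll and enqueue only the returned entries, so the scraping logic stays behind the feed URL.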

Limitations & Gotchas

  • Upstream markup changes can break rules; add degraded outputs and alerts for critical sources and version rule updates with releases.
  • High-frequency refresh can trigger anti-bot blocks; prefer cache reuse, timer-driven refresh, and rate limits to control request fan-out.
  • Login-dependent sources require cookies or tokens; isolate credentials per source and follow least privilege to avoid leakage.
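The rate-limit advice above can be illustrated with a per-host throttle that spaces out requests to the same upstream. This is a sketch, not a Clawfeed feature: `HostThrottle` and `delay_for` are hypothetical names, and the injectable `clock` parameter exists only to make the behavior easy to test.

```python
import time
from typing import Callable
from urllib.parse import urlsplit


class HostThrottle:
    """Enforces a minimum interval between requests to the same host."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self.next_allowed: dict[str, float] = {}

    def delay_for(self, url: str) -> float:
        """Return seconds to wait before fetching url, and reserve that slot."""
        host = urlsplit(url).netloc.lower()
        now = self.clock()
        start = max(now, self.next_allowed.get(host, now))
        self.next_allowed[host] = start + self.min_interval
        return start - now
```

A fetcher would call `time.sleep(throttle.delay_for(url))` before each request. Sources on different hosts do not delay each other, while repeated hits on one host are spread out.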

Frequently Asked Questions

How does Clawfeed differ from RSSHub?
Clawfeed focuses on turning a small set of critical pages into long-running feed pipelines with controlled refresh, deduplication, and stable outputs, which fits teams that want reliability-first ingestion. RSSHub is closer to a route catalog with a large ecosystem of rules and broad coverage. Choose based on intent: if you want breadth and ready-made routes, RSSHub is usually faster; if you want high reliability and controllable policies for a few sources, Clawfeed is a tighter fit.
What's the trade-off vs RSS-Bridge?
RSS-Bridge is a collection of bridges, great for quickly filling missing feeds with a lightweight setup. Clawfeed is more of an operable scraping-to-feed pipeline where refresh, caching, alerting, and isolation are first-class. If you only need a handful of bridges, RSS-Bridge can be enough; if you run feeds as a long-lived service, Clawfeed's operational controls are likely the better fit.
How do I prevent subscriber growth from exploding upstream requests?
Switch from request-driven fetching to timer-driven refresh: refresh once per interval and serve subscribers from cache. Use longer TTLs for hot sources and apply backoff retries on failures to avoid hammering upstreams. Add rate limits and isolate login-dependent sources into separate configs or instances to protect both credentials and resources.
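The backoff part of this answer can be made concrete with a small delay schedule. The function below is a generic exponential backoff sketch, not something taken from Clawfeed; `backoff_delays` and its parameters are hypothetical, and jitter is added on top of the capped value, so a jittered delay may exceed `cap`.

```python
import random
from typing import Callable


def backoff_delays(base: float, cap: float, attempts: int,
                   jitter: float = 0.0,
                   rng: Callable[[], float] = random.random) -> list[float]:
    """Delays of base * 2**n, capped at cap, plus optional proportional jitter."""
    delays = []
    for n in range(attempts):
        delay = min(cap, base * (2 ** n))
        delay += delay * jitter * rng()  # jitter=0.0 leaves the delay unchanged
        delays.append(delay)
    return delays
```

For example, `backoff_delays(1.0, 10.0, 5)` yields waits of 1, 2, 4, 8, and 10 seconds between retries. A small non-zero `jitter` helps avoid many sources retrying against the same upstream at the same moment.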

Project Metrics

Stars: 1.3k
Language: HTML
License: MIT
Deploy Difficulty: Medium

