When RSS Parsing Ships: The Last Manual Step Disappears
There’s one manual step left in the pipeline: clicking “Sync” in the AutoRAG dashboard (AI Search in the Cloudflare UI). When RSS parsing ships, even that goes away.
The Current State
Right now, the flow is:
/publish-post
│
▼
Git push → Cloudflare Pages deploys → Sitemap updates
│
▼
[MANUAL] Click "Sync" in AI Search dashboard
│
▼
AutoRAG crawls → Indexes new content → Chatbot knows about it
That manual sync step is the last friction point. It takes 10 seconds, but it requires:
- Remembering to do it
- Having the dashboard open
- Waiting to confirm it worked
For true “publish and forget,” this needs to be automatic.
What’s Coming: RSS Auto-Indexing
Cloudflare is adding RSS feed support to AutoRAG. Instead of relying on scheduled sitemap crawls, it will:
- Subscribe to your RSS feed
- Detect new entries automatically
- Index new content as soon as it appears in the feed
- Require no manual trigger
The future flow:
/publish-post
     │
     ▼
Git push → Cloudflare Pages deploys
     │
     ├─→ Sitemap updates
     │
     └─→ RSS feed updates with new entry
              │
              ▼
         AutoRAG detects RSS update
              │
              ▼
         Auto-indexes new content
              │
              ▼
         Chatbot immediately knows about new post
Zero manual steps. Publish and forget.
Why RSS Over Sitemap
Sitemap limitations:
- Requires scheduled polling at a fixed interval
- Or a manual sync trigger
- No “push” mechanism
- Can’t signal updates in real time
RSS advantages:
- Designed for content updates
- Natural “new entry” signaling
- Can be polled frequently or pushed via webhooks (for example, WebSub)
- Standard format, well-supported
The sitemap says “here’s everything.” The RSS feed says “here’s what’s new.”
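To make the “here’s what’s new” idea concrete, here is a minimal sketch of how a feed consumer might pick out unseen entries. It is not a description of AutoRAG’s internals; `findNewEntries`, the `seenIds` list, and the sample items are all hypothetical.

```javascript
// Hypothetical helper: given the IDs already indexed and the items from the
// latest feed fetch, return only the entries that have not been seen before.
function findNewEntries(seenIds, feedItems) {
  const seen = new Set(seenIds);
  return feedItems
    // Use the guid when the feed provides one, otherwise fall back to the link.
    .filter((item) => !seen.has(item.guid ?? item.link))
    // Oldest first, so entries are processed in publication order.
    .sort((a, b) => new Date(a.pubDate) - new Date(b.pubDate));
}

// Made-up sample data for illustration.
const seenIds = ['/posts/first-post/'];
const feedItems = [
  { link: '/posts/second-post/', pubDate: '2025-01-02T00:00:00Z' },
  { link: '/posts/first-post/', pubDate: '2025-01-01T00:00:00Z' },
];

const fresh = findNewEntries(seenIds, feedItems);
console.log(fresh.map((item) => item.link));
```

A sitemap consumer would have to diff the full URL list (or trust `lastmod`) to get the same answer; with a feed, the recent entries are the payload.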
Astro RSS Setup
Astro has built-in RSS support:
// src/pages/rss.xml.js
import rss from '@astrojs/rss';
import { getCollection } from 'astro:content';

export async function GET(context) {
  // Load every entry in the "posts" content collection.
  const posts = await getCollection('posts');
  return rss({
    title: 'Operational Semantics',
    description: 'Building AI infrastructure in public',
    // `site` comes from the `site` option in astro.config.
    site: context.site,
    // One feed item per post.
    items: posts.map((post) => ({
      title: post.data.title,
      pubDate: new Date(post.data.date),
      description: post.data.description,
      link: `/posts/${post.slug}/`,
    })),
  });
}
This generates /rss.xml on every build. New posts automatically appear in the feed.
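One detail worth considering: as far as I know, `getCollection` makes no promise about ordering, and a feed consumer will usually expect newest entries first. The sketch below is one way to handle that, assuming the same frontmatter fields as above. `toFeedItems` and the `draft` flag are hypothetical names, not part of Astro or this site’s schema.

```javascript
// Hypothetical helper for src/pages/rss.xml.js: turns collection entries into
// feed items, newest first. The `draft` field is an assumed frontmatter flag.
function toFeedItems(posts) {
  return posts
    .filter((post) => !post.data.draft) // skip unpublished drafts
    .sort((a, b) => new Date(b.data.date) - new Date(a.data.date)) // newest first
    .map((post) => ({
      title: post.data.title,
      pubDate: new Date(post.data.date),
      description: post.data.description,
      link: `/posts/${post.slug}/`,
    }));
}

// Made-up entries shaped like content collection results.
const samplePosts = [
  { slug: 'older', data: { title: 'Older', date: '2025-01-01', description: 'First' } },
  { slug: 'draft', data: { title: 'Draft', date: '2025-01-03', description: 'WIP', draft: true } },
  { slug: 'newer', data: { title: 'Newer', date: '2025-01-02', description: 'Second' } },
];

const sortedItems = toFeedItems(samplePosts);
console.log(sortedItems.map((item) => item.link));
```

In the endpoint, `items: toFeedItems(posts)` would take the place of the inline `posts.map(...)`.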
The Complete Automated Flow
Once RSS parsing ships:
┌─────────────────────────────────────────────────────────────┐
│ WRITE & PUBLISH │
│ │
│ You: /publish-post "New Post Title" │
│ │
│ PAI: ✓ Post created │
│ ✓ Frontmatter valid │
│ ✓ Build succeeds │
│ ✓ Git pushed │
│ ✓ Substack published │
│ │
│ Done. Go do something else. │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ AUTOMATIC (NO HUMAN INVOLVED) │
│ │
│ 1. Cloudflare Pages detects push │
│ 2. Build runs (~45 seconds) │
│ 3. Site deploys to edge │
│ 4. RSS feed regenerates with new entry │
│ 5. AutoRAG detects RSS update │
│ 6. New post fetched and chunked │
│ 7. Embeddings generated │
│ 8. Vectors stored in Vectorize │
│ 9. Chatbot can answer questions about new post │
│ │
│ Time: ~2-3 minutes │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ VERIFICATION │
│ │
│ Reader: "What's the latest post about?" │
│ │
│ Chatbot: "The most recent post is 'New Post Title' │
│ published today. It covers..." │
│ │
└─────────────────────────────────────────────────────────────┘
From “publish” to “chatbot knows,” the whole path is hands-off.
What This Enables
1. True “Publish and Forget”
Write in the morning. Publish before lunch. By afternoon, the chatbot knows about it. No dashboard visits required.
2. Faster Knowledge Updates
Currently: scheduled sync every few hours, or a manual trigger.
With RSS: updates within minutes of publishing.
3. Multi-Platform Sync
RSS isn’t just for AutoRAG. The same feed can:
- Trigger Substack cross-posting
- Update social media queues
- Notify subscribers via other channels
- Feed into aggregators
4. Audit Trail
The RSS feed is a timestamped record of what was published when. Useful for debugging “why doesn’t the chatbot know about X?”
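As a rough illustration of the audit-trail idea, the sketch below looks up when a given path was published according to the feed. `publishedAt`, the sample item, and the `/posts/new-post/` path are hypothetical; the items are assumed to have already been parsed out of the feed into `{ link, pubDate }` objects.

```javascript
// Hypothetical lookup: when does the feed say this path was published?
// Returns an ISO timestamp, or null if the feed has no entry for the path.
function publishedAt(feedItems, path) {
  const entry = feedItems.find((item) => {
    // The base URL is only used to resolve relative links; absolute links ignore it.
    return new URL(item.link, 'https://example.invalid').pathname === path;
  });
  return entry ? new Date(entry.pubDate).toISOString() : null;
}

// Made-up feed entry for illustration.
const auditItems = [
  {
    link: 'https://operationalsemantics.dev/posts/new-post/',
    pubDate: 'Wed, 01 Jan 2025 09:30:00 GMT',
  },
];

console.log(publishedAt(auditItems, '/posts/new-post/')); // 2025-01-01T09:30:00.000Z
console.log(publishedAt(auditItems, '/posts/missing/')); // null
```

If the lookup returns null, the post never made it into the feed and the problem is on the build side; if it returns a timestamp, the question becomes whether indexing has run since then.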
Preparing for RSS Parsing
Even before the feature ships, you can prepare:
1. Set up the RSS endpoint
Add @astrojs/rss and create src/pages/rss.xml.js. The feed generates on every build.
2. Verify the feed
curl https://operationalsemantics.dev/rss.xml | head -20
Make sure it includes all posts with correct dates and descriptions.
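3. Advertise the feed (optional)
robots.txt supports a Sitemap directive but has no RSS directive, so crawlers ignore an RSS line there. Keep the sitemap entry in robots.txt:
Sitemap: https://operationalsemantics.dev/sitemap-index.xml
And advertise the feed with an autodiscovery link in the page head:
<link rel="alternate" type="application/rss+xml" title="Operational Semantics" href="https://operationalsemantics.dev/rss.xml">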
4. Wait for Cloudflare
When RSS support ships in AI Search, you’ll be ready to enable it immediately.
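The manual check in step 2 can also be scripted. Below is a minimal sketch, assuming the feed uses plain `<item>` elements with `<title>`, `<link>`, `<description>`, and `<pubDate>` children. `checkFeed` is a hypothetical helper, and the regex extraction is only meant as a sanity check on a feed you generate yourself, not as a general XML parser.

```javascript
// Hypothetical checker: reports items with missing fields or unparseable dates.
function checkFeed(xml) {
  const problems = [];
  const items = xml.match(/<item>[\s\S]*?<\/item>/g) ?? [];
  if (items.length === 0) problems.push('feed has no <item> entries');

  items.forEach((item, index) => {
    for (const tag of ['title', 'link', 'description', 'pubDate']) {
      const match = item.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`));
      const value = match ? match[1].trim() : '';
      if (!value) {
        problems.push(`item ${index + 1}: missing <${tag}>`);
      } else if (tag === 'pubDate' && Number.isNaN(Date.parse(value))) {
        problems.push(`item ${index + 1}: unparseable pubDate "${value}"`);
      }
    }
  });

  return { itemCount: items.length, problems };
}

// Made-up feed: the second item has no description and a bad date.
const sampleXml = `<rss version="2.0"><channel>
  <item><title>New Post Title</title><link>https://example.com/posts/new-post/</link>
    <description>A sample entry</description><pubDate>Wed, 01 Jan 2025 00:00:00 GMT</pubDate></item>
  <item><title>Broken</title><link>https://example.com/posts/broken/</link>
    <pubDate>not a date</pubDate></item>
</channel></rss>`;

const report = checkFeed(sampleXml);
console.log(report);
```

Against the live site, the XML could come from `await (await fetch('https://operationalsemantics.dev/rss.xml')).text()` on a Node version with global `fetch`.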
The Vision: Reactive Publishing
The end state is reactive publishing:
| Trigger | Response |
|---|---|
| Git push | Site deploys |
| Site deploys | RSS updates |
| RSS updates | AI indexes |
| AI indexes | Chatbot learns |
No cron jobs. No scheduled tasks. No manual syncs.
Events propagate through the system automatically. You publish; everything else reacts.
Fallback: Scheduled Sync
If RSS parsing doesn’t ship, or you prefer the current approach, scheduled sitemap sync still works:
- AI Search Settings → Indexing
- Set sync schedule (hourly, daily, etc.)
- System automatically re-crawls on schedule
Updates land later than they would with RSS, but it’s still hands-off.
Summary
| State | Manual Steps | Time to Indexed |
|---|---|---|
| Current (sitemap + manual sync) | 1 (click sync) | 5-10 minutes |
| Scheduled sync | 0 | Up to one sync interval (hourly, daily, etc.) |
| RSS auto-index | 0 | 2-3 minutes |
The goal is zero friction between “I wrote something” and “readers can ask about it.”
RSS parsing is the last piece.
Next up: Lessons and Gotchas - Everything that broke along the way.