Generate RSS from Markdown
The original post is here: eklausmeier.goip.de/blog/2021/05-30-generate-rss-from-markdown.
For this blog I wanted an RSS feed. Saaze does not provide this functionality. Saaze is supposed to be "stupidly simple" by design, which I consider a plus. Since 15-Aug-2022, Simplified Saaze can generate an RSS XML feed. Simplified Saaze is, so to speak, the successor of Saaze.
This post shows how you can generate an RSS feed without Simplified Saaze, but using just plain Perl.
Generating an RSS feed is simple. The feed starts with a header of fixed XML. Then each post is printed as a so-called "item" containing:
- link / URL
- publication date
- title
- an excerpt or even the full blog post
Finally come the required closing XML tags. That's it.
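The structure described above can be sketched in a few lines of Perl. This is only an illustration, not the script used for this blog: the function name `minimal_feed`, the channel data, and the example.com URLs are hypothetical placeholders.

```perl
use strict;
use warnings;

# Build a minimal RSS 2.0 document: fixed header, one <item> per post,
# then the closing tags. Each post is a hash reference with the keys
# link, date, title and excerpt (names chosen for this sketch only).
sub minimal_feed {
    my @posts = @_;
    my $xml = qq{<?xml version="1.0" encoding="utf-8"?>\n}
            . qq{<rss version="2.0">\n<channel>\n}
            . qq{\t<title>Example Blog</title>\n}
            . qq{\t<link>https://example.com</link>\n}
            . qq{\t<description>Example Blog</description>\n};
    for my $p (@posts) {
        $xml .= "\t<item>\n"
              . "\t\t<link>$p->{link}</link>\n"
              . "\t\t<pubDate>$p->{date}</pubDate>\n"
              . "\t\t<title>$p->{title}</title>\n"
              . "\t\t<description><![CDATA[$p->{excerpt}]]></description>\n"
              . "\t</item>\n";
    }
    return $xml . "</channel>\n</rss>\n";
}

# Hypothetical post data, for illustration only
print minimal_feed({
    link    => 'https://example.com/blog/first-post/',
    date    => 'Sun, 30 May 2021 20:00:00 +0200',
    title   => 'First post',
    excerpt => 'A few lines taken from the start of the post.',
});
```

The sketch does not escape XML special characters in the title, so it is only safe for titles without `&` or `<`.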
Taking this information directly from a Markdown file with some frontmatter seems to be the easiest approach. For example, the frontmatter for this blog post is:
---
date: "2021-05-30 20:00:00"
title: "Generate RSS from Markdown"
draft: false
categories: ["www"]
tags: ["RSS", "feed", "Markdown"]
author: "Elmar Klausmeier"
prismjs: true
---
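Reading such a frontmatter block takes only a few lines of Perl. The following is a minimal sketch under simple assumptions: every entry sits on one line in the form `key: value`, and lists such as `tags` are kept as raw strings. The function name `read_frontmatter` is made up for this example.

```perl
use strict;
use warnings;

# Collect the key/value lines between the first two "---" separators.
# Surrounding double quotes are removed from values; nothing else is parsed.
sub read_frontmatter {
    my ($text) = @_;
    my %meta;
    my $sep = 0;
    for my $line (split /\n/, $text) {
        if ($line =~ /^---$/) {
            last if ++$sep == 2;    # second separator ends the frontmatter
            next;
        }
        next unless $sep == 1;      # ignore anything before the first "---"
        if ($line =~ /^(\w+):\s+(.*)$/) {
            my ($key, $value) = ($1, $2);
            $value =~ s/^"(.*)"$/$1/;
            $meta{$key} = $value;
        }
    }
    return \%meta;
}

my $post = <<'EOT';
---
date: "2021-05-30 20:00:00"
title: "Generate RSS from Markdown"
draft: false
tags: ["RSS", "feed", "Markdown"]
---
Body text of the post.
EOT

my $meta = read_frontmatter($post);
print "$meta->{title} ($meta->{date})\n";
```

For an RSS feed only `title` and `date` are needed, so a full YAML parser is not required.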
The Perl script mkdwnrss below implements this. As input files it wants those blog posts which should be part of the RSS feed. So usually you will "generate" the list of files. Implementing this in PHP would be equally simple.
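The excerpt always contains the first eight non-empty lines of Markdown, if the post has that many. After that, further lines are appended only while the excerpt is still shorter than 500 characters.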
#!/bin/perl -W
# Create RSS XML file ("feed") based on Markdown files
#
# Input: List of Markdown files (order of files determines order of <item>)
# Output: RSS (description holds an excerpt of the leading Markdown lines)
#
# Example:
# mkdwnrss `find blog/2021 -type f | sort -r`
use strict;
my $dt = localtime();
print <<"EOT";
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
<title>Elmar Klausmeier's Blog</title>
<description>Elmar Klausmeier's Blog</description>
<lastBuildDate>$dt</lastBuildDate>
<link>https://eklausmeier.goip.de</link>
<atom:link href="https://eklausmeier.goip.de/feed.xml" rel="self" type="application/rss+xml" />
<generator>mkdwnrss</generator>
EOT
sub item(@) {
    my $f = $_[0];
    # Three-argument open with a lexical file handle; report the OS error
    open(my $fh, '<', $f) || die("Cannot open $f: $!");
    my $link = $f;
    $link =~ s/\.md$/\//;    # file name of the post becomes its URL path
    print "\t<item>\n"
        . "\t\t<link>https://eklausmeier.goip.de/$link</link>\n"
        . "\t\t<guid>https://eklausmeier.goip.de/$link</guid>\n";
    my ($sep,$linecnt,$excerpt) = (0,0,"");
    while (<$fh>) {
        chomp;
        if (/^\-\-\-$/) { $sep++ ; next; }    # frontmatter separator
        if ($sep == 1) {    # inside the frontmatter
            if (/^title:\s+"(.+)"$/) {
                printf("\t\t<title>%s</title>\n",$1);
            } elsif (/^date:\s+"(.+)"$/) {
                printf("\t\t<pubDate>%s</pubDate>\n",$1);
            }
        } elsif ($sep >= 2) {    # body of the post
            next if (length($_) == 0);    # skip empty lines
            if ($linecnt++ == 0) {
                print "\t\t<description><![CDATA[";
                $excerpt = $_;
            } elsif ($linecnt < 9 || length($excerpt) < 500) {
                $excerpt .= " " . $_;
            } else {
                last;
            }
        }
    }
    print $excerpt . "]]></description>\n" if ($linecnt > 0);
    print "\t</item>\n";
    close($fh) || die("Cannot close $f");
}
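# Loop over the arguments directly: <@ARGV> is a glob and would
# split file names containing spaces or wildcard characters
foreach my $f (@ARGV) {
    item($f);
}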
print "</channel>\n</rss>\n";
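One detail of the listing above: it copies the frontmatter date into `<pubDate>` unchanged, and `localtime()` in scalar context produces a string like `Sun May 30 20:00:00 2021`. RSS 2.0 expects dates in RFC 822 format instead. A minimal sketch of such a conversion follows. The function name `rfc822_date` and its `$zone` parameter are made up for this example; because the frontmatter carries no time zone, the offset has to be supplied by the caller.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my @DAYS   = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Convert "YYYY-MM-DD hh:mm:ss" into an RFC 822 date string.
# $zone is an assumed numeric offset such as "+0200"; default is "+0000".
# Returns undef if the input does not have the expected layout.
sub rfc822_date {
    my ($date, $zone) = @_;
    $zone //= '+0000';
    my ($y, $mo, $d, $h, $mi, $s) =
        $date =~ /^(\d{4})-(\d\d)-(\d\d)\s+(\d\d):(\d\d):(\d\d)$/
        or return;
    # timegm/gmtime are used only to find the weekday of the given date;
    # timegm dies if the date itself is invalid (e.g. month 13)
    my $wday = (gmtime(timegm($s, $mi, $h, $d, $mo - 1, $y)))[6];
    return sprintf("%s, %02d %s %04d %02d:%02d:%02d %s",
        $DAYS[$wday], $d, $MONTHS[$mo - 1], $y, $h, $mi, $s, $zone);
}

print rfc822_date("2021-05-30 20:00:00", "+0200"), "\n";
```

Day and month names are taken from fixed English lists rather than from `strftime`, so the result does not depend on the locale of the machine generating the feed.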
The source code for mkdwnrss is on GitHub.
During development I checked whether my RSS looks similar to the RSS feed in WordPress. I also consulted Alex Le's blog post on RSS feeds: Create An RSS Feed From Scratch.
Added 08-Jul-2021: When I checked the RSS with the W3C Feed Validation Service, the dates and descriptions were marked as non-compliant. This is now corrected. Checking now gives:
Added 17-Jan-2022: Also see Generate RSS from HTML.
Added 15-Aug-2022: As of today, Simplified Saaze can generate RSS directly with the -r flag. The PHP code used for that is described here: RSS XML Feed.