


I Made a DSL for My Blog



I think I have spent most of my time on this blog trying to figure out how to write posts better. I honestly think I have outdone myself this time.

Figure: the blog post formats I have used, in order: Markdown → PHP → HTML → Makup. An arrow points to Makup saying 'you are here'.

I've been constantly changing formats because I have been unsatisfied with everything I've tried. I switched to an HTML-style format so I could have more control over my formatting. Ever since, I've been trying to balance control over formatting against verbosity.

So here's the language:

HTML
<p>This is a sample paragraph.</p>
<$code-block lang="html">
<span>This is a sample code block.</span>
</$code-block>
I've changed the load-bearing symbol from # to $, for maybe obvious reasons.

I titled it makup because I'm terrible at naming things. It's just markup without the "r". Or maybe makeup without the "e". I dunno.

What it Does

Makup (the compiler) parses the makup language, then looks up any named components (e.g. <$code-block>) in a lookup table. If the tag name is registered, it passes the contents and attributes of the component to a function.

rust
use std::collections::HashMap;

fn hello(input: &str, attrs: HashMap<&str, &str>) -> String {
    format!("<span>Hello {input}</span>")
}

The returned string replaces the component. In theory, this language doesn't need to sit on top of HTML at all (it's similar to PHP in that way), but the syntax tries to mimic it.
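
For illustration, here's a minimal sketch of what that lookup step could look like. Only the handler signature comes from the `hello` example; the names `Handler`, `registry` and `render` are placeholders made up for this sketch, not Makup's actual API.

```rust
use std::collections::HashMap;

// Handler signature taken from the `hello` example; everything else is hypothetical.
type Handler = fn(&str, HashMap<&str, &str>) -> String;

fn hello(input: &str, _attrs: HashMap<&str, &str>) -> String {
    format!("<span>Hello {input}</span>")
}

// Builds the lookup table from tag name to handler.
fn registry() -> HashMap<&'static str, Handler> {
    let mut table: HashMap<&'static str, Handler> = HashMap::new();
    table.insert("hello", hello);
    table
}

// Returns the replacement string, or None when the tag name is not registered.
fn render(
    table: &HashMap<&'static str, Handler>,
    name: &str,
    contents: &str,
    attrs: HashMap<&str, &str>,
) -> Option<String> {
    table.get(name).map(|handler| handler(contents, attrs))
}
```

With this, `render(&registry(), "hello", "world", HashMap::new())` gives back `Some` of the replacement markup, and an unregistered name gives `None`, which leaves the caller free to decide what to do with unknown tags.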

How I Made it

The language is specified using Pest, a PEG parser (and generator). Having only worked with LL/LALR parsers before, it was a little strange to work with, but the grammar is quite simple.

Pest
document = { SOI ~ (statement | text)* ~ EOI }
statement = { tag_open ~ text ~ tag_close }
tag_open = { "<$" ~ PUSH(tag_ident) ~ attr_list? ~ ">" }
tag_ident = { (ASCII_ALPHA_LOWER | "-" | "_")+ }
tag_close = { "</$" ~ POP ~ ">" }
attr_list = { " "+ ~ attr ~ (" " ~ attr)* ~ " "* }
attr = { attr_key ~ "=" ~ "\"" ~ quoted_text ~ "\"" }
attr_key = { (ASCII_ALPHA_LOWER | "-" | "_")+ }
quoted_text = { (!("<$" | "</$" | "\"") ~ ANY)+ }
text = { (!("<$" | "</$") ~ ANY)+ }
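
To show what the grammar accepts without pulling in Pest, here's a rough hand-written approximation of the `statement` rule in plain Rust. It is a sketch and not the generated parser: the `Component` type and both function names are made up, it is a little more forgiving about spaces than `attr_list`, and the closing-tag check stands in for the `PUSH`/`POP` pair.

```rust
use std::collections::HashMap;

/// One parsed `<$name key="value">body</$name>` component (hypothetical type).
#[derive(Debug, PartialEq)]
struct Component<'a> {
    name: &'a str,
    attrs: HashMap<&'a str, &'a str>,
    body: &'a str,
}

/// Length of the leading identifier (the `tag_ident` / `attr_key` rules).
fn ident_len(s: &str) -> usize {
    s.bytes()
        .take_while(|b| b.is_ascii_lowercase() || *b == b'-' || *b == b'_')
        .count()
}

/// Tries to parse one component at the start of `input`.
/// Returns the component and the remaining input, or None if nothing matches.
fn parse_component(input: &str) -> Option<(Component<'_>, &str)> {
    // tag_open: "<$" followed by a non-empty identifier.
    let rest = input.strip_prefix("<$")?;
    let n = ident_len(rest);
    if n == 0 {
        return None;
    }
    let (name, mut rest) = rest.split_at(n);

    // attr_list: `key="value"` pairs, each preceded by at least one space.
    let mut attrs = HashMap::new();
    loop {
        let trimmed = rest.trim_start_matches(' ');
        if let Some(after) = trimmed.strip_prefix('>') {
            rest = after;
            break;
        }
        if trimmed.len() == rest.len() {
            return None; // no space before the attribute
        }
        let k = ident_len(trimmed);
        if k == 0 {
            return None;
        }
        let (key, after_key) = trimmed.split_at(k);
        let after_quote = after_key.strip_prefix("=\"")?;
        let end = after_quote.find('"')?;
        let value = &after_quote[..end];
        // quoted_text must be non-empty and free of tag markers.
        if value.is_empty() || value.contains("<$") || value.contains("</$") {
            return None;
        }
        attrs.insert(key, value);
        rest = &after_quote[end + 1..];
    }

    // text: non-empty, and it stops at the first tag marker.
    let end = rest.find("</$")?;
    let body = &rest[..end];
    if body.is_empty() || body.contains("<$") {
        return None;
    }

    // tag_close: must name the same identifier that was opened (the POP).
    let close = format!("</${}>", name);
    let rest = rest[end..].strip_prefix(close.as_str())?;
    Some((Component { name, attrs, body }, rest))
}
```

One thing the sketch makes visible: because `text` stops at the first `<$`, a component nested inside another one is rejected rather than parsed recursively.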

The whole library is less than 100 lines of code (excluding generated code and tests). You can view the "core" of it on Tangled, here. It's a bit ugly right now. I'm not sure if I could make it nicer, but I'm just glad it's robust enough to write blog posts in.

Using the Library

Honestly, the main reason I wanted to work on this is... code block highlighting sucks. What's nice about this approach is that I can finally render code blocks on the server instead of the client, with a robust parser (tree-sitter). To do this, I'm using the autumnus library. I was going to use the inkjet crate, but it got deprecated days after I found it. Welp.

rust
// SAFETY: function is not allowed to error. Annoyingly.
#[allow(clippy::expect_used, clippy::unwrap_used)]
fn code_block(contents: &str, attrs: HashMap<&str, &str>) -> String {
    let lang = attrs
        .get("lang")
        .or_else(|| attrs.get("language"))
        .unwrap_or(&"plain");
    let formatter = autumnus::HtmlInlineBuilder::new()
        .lang(Language::guess(lang, contents))
        .source(contents)
        .theme(Some(
            autumnus::themes::get("catppuccin_mocha").expect("Built in!"),
        ))
        .build()
        .expect("i hope this doesnt crash!");
    let mut output = Vec::new();
    formatter.format(&mut output).unwrap();
    let output = String::from_utf8(output).unwrap();
    maud::html! {
        .code-block {
            .code-language { (lang) }
            (PreEscaped(output))
        }
    }
    .into_string()
}

As you can see, there are still some papercuts with Makup, but it's perfectly usable for this. (Unwrapping here is not catastrophic; it's currently expected that the parsing will panic in certain cases.) I might try and add maud support directly so I don't have to call a conversion method, but trait trickery like that in Rust is maybe somewhat above my skill level.
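
One possible shape for that trait trickery, as a hedged sketch: a small conversion trait implemented for every type a handler is allowed to return, plus a wrapper that turns any such handler into one uniform boxed type. `Markup` below is only a stand-in for `maud::Markup` so the example builds without the crate, and `IntoOutput`, `BoxedHandler` and `boxed` are invented names, not anything Makup or maud provides.

```rust
use std::collections::HashMap;

// Stand-in for `maud::Markup` so the sketch builds without the maud crate.
struct Markup(String);

// Hypothetical conversion trait: anything a handler may return.
trait IntoOutput {
    fn into_output(self) -> String;
}

impl IntoOutput for String {
    fn into_output(self) -> String {
        self
    }
}

impl IntoOutput for Markup {
    fn into_output(self) -> String {
        self.0
    }
}

// The uniform handler type a lookup table could store.
type BoxedHandler = Box<dyn Fn(&str, HashMap<&str, &str>) -> String>;

// Wraps any handler whose return type implements `IntoOutput`.
fn boxed<F, R>(handler: F) -> BoxedHandler
where
    F: Fn(&str, HashMap<&str, &str>) -> R + 'static,
    R: IntoOutput,
{
    Box::new(move |contents: &str, attrs: HashMap<&str, &str>| {
        handler(contents, attrs).into_output()
    })
}

// Two example handlers with different return types.
fn plain(contents: &str, _attrs: HashMap<&str, &str>) -> String {
    format!("<span>{contents}</span>")
}

fn fancy(contents: &str, _attrs: HashMap<&str, &str>) -> Markup {
    Markup(format!("<b>{contents}</b>"))
}
```

Both `boxed(plain)` and `boxed(fancy)` come out as the same `BoxedHandler` type, so the `.into_string()` call would move out of each component function and into one place.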

The other use case I had for this was to render the speech-boxes that I use to add a bit of personality to my writing. In this case, I already had a function that generated them for non-post pages on my website. That means it was as simple as plugging that into my rewriter function:
rust
fn speech_box(contents: &str, attrs: HashMap<&str, &str>) -> String {
    let char = attrs.get("character").unwrap_or(&"deer");
    let emotion = attrs.get("emotion").unwrap_or(&"neutral");
    // ⤵︎ this function!
    speech(
        &match *char {
            "you" => SpeechCharacter::You,
            _ => SpeechCharacter::Deer,
        },
        &match *emotion {
            "worried" => SpeechEmotion::Worried,
            "shocked" => SpeechEmotion::Shocked,
            "happy" => SpeechEmotion::Happy,
            _ => SpeechEmotion::Neutral,
        },
        &html! { (PreEscaped(contents)) },
    )
    .into_string()
}
Image: a drawing of a happy deer, talking to you.

That's all. If anyone would be interested in using this library for anything, please reach out. I would love to work on it further, given the chance.

Footnotes

  1. I've probably talked about this already on here, but the client-side rendering was pretty horrible. When I was using Prism and Shiki, I used a special comment syntax for having unescaped markup in code blocks. When first loading the page, an empty code block would be shown, and then once the highlighter kicked in it would be swapped for a pretty code block. The one thing I currently miss from Shiki is the ability to have two themes and swap between them based on the device color scheme. I believe this is possible with autumnus, but it would have to be a bit hacky. I'll open an issue now on the repo and see what happens.
  2. I'm definitely not done working on it. I want to eventually port over the footnotes system, but that would require making my rewriter a bit more stateful (and adding void tags!). Specifically, I would need some shared state somewhere so footnotes can be ordered correctly and backreferences can be made. The current version of this is in a quick JS script I made.
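
As a rough idea of what that shared footnote state could look like, here is a minimal sketch. The `Footnotes` type, its methods and the `fn-N` / `fnref-N` id scheme are all assumptions made for illustration; none of this exists in Makup yet.

```rust
/// Hypothetical shared state for a footnote component.
#[derive(Default)]
struct Footnotes {
    notes: Vec<String>,
}

impl Footnotes {
    /// Records a footnote body and returns the inline marker that replaces it.
    /// Numbering follows the order in which footnotes are encountered.
    fn add(&mut self, body: &str) -> String {
        self.notes.push(body.to_string());
        let n = self.notes.len();
        format!("<sup id=\"fnref-{n}\"><a href=\"#fn-{n}\">{n}</a></sup>")
    }

    /// Renders the collected footnotes in order, each with a backreference.
    fn render(&self) -> String {
        let mut out = String::from("<ol>");
        for (i, body) in self.notes.iter().enumerate() {
            let n = i + 1;
            out.push_str(&format!(
                "<li id=\"fn-{n}\">{body} <a href=\"#fnref-{n}\">↩</a></li>"
            ));
        }
        out.push_str("</ol>");
        out
    }
}
```

Since `add` takes `&mut self`, a component function shaped like `hello` could not call it directly; the rewriter would need to hand handlers some mutable state, which is the "more stateful" part.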