I’ve seen a few articles about going buildless when programming Web applications.
a couple of weeks ago I saw someone’s idea for Vanilla Prime, an opinionated take on how to make Vanilla JS reasonably optimal and more pleasant to work with,
and yesterday I saw Going Buildless, a blog post which evaluates how far you can go without any builds.
as a big fan of simplicity, I really like that there’s a corner of the Web which values simplicity over 100% convenience.
a lot of focus in modern day Web development seems to be on delivering apps ASAP, even if they’re implemented inefficiently, and then papering over that with complex tooling such as minifiers.
I think there’s a genuine use case for that—if so many people are using it, that means there’s a demand for it. but I honestly don’t like the complexity. I deal with enough of it at work, and I wouldn’t wanna have to understand a complex build toolchain for my little homegrown website.
I mean, if the browser does it for you… then it’s probably smart to conserve that energy, and do something more interesting!
so I built my own static website generator!
but I promise this blog post is not just about that—I’ve written about my generator before, and I don’t see much value in writing yet another blog post in the vein of “here’s an overview of my statically generated blog’s tech stack, okay bye.”
I Ate My Template Engine, and why not quite buildless
static generation is super cool, because all you have to end up writing is a single function that deletes the existing output directory, creates a new one, and fills it in with a bunch of files.
in the case of the treehouse, the sources for the statically generated files are `.tree` files, a bunch of Handlebars templates, and a bunch of assorted static files, including JavaScript.
but honestly… I don’t really like that choice of templating engine, or rather, templating engine implementation! the Rust implementation in particular has been kind of a pain in the butt, because it tries very hard to be multithreading friendly. that of course means some annoying `Send + Sync` bounds… my personal opinion is that it would be neater if there was an initial setup step which freezes some parts of the registry—these parts become immutable after you set them up, and then you can send them out to threads, for example behind an `Arc`. afterwards, you construct the render-time part, which requires mutable access—and therefore every thread that needs to render templates gets its own instance of that.
at some point I wanted to add OpenGraph metadata, so that branches you link to via permalinks such as this one would get nice embeds in Discord and other chat apps, including the branch’s text—which prompted me to drop the `try_files` setup and `proxy_pass` to an axum server instead. but it was fun and simple while it lasted!
not incremental by design
this is the idea I built the treehouse generator with in mind—computers are pretty heckin’ fast. for the 56 unique `.tree`-derived pages the treehouse generates, most of the time is not actually spent parsing and generating text files, but rather… reading metadata for image files, so that I can generate proper `width=""` and `height=""` attributes on `<img>` elements.

I don’t have any profiling infrastructure set up just yet, but I suspect it’s something with the `image` crate doing more work than it needs to in order to obtain size metadata.
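for what it’s worth, for most formats the dimensions sit right at the front of the file—for PNG, in the IHDR chunk within the first 24 bytes. here’s a sketch of reading just that, in JavaScript for illustration (the treehouse itself uses the Rust `image` crate; `pngDimensions` is made up):

```javascript
// Read a PNG's dimensions from the IHDR chunk alone, without decoding pixels.
// Layout: 8-byte signature, 4-byte chunk length, 4-byte "IHDR" tag,
// then width and height as big-endian u32s at byte offsets 16 and 20.
function pngDimensions(bytes) {
    const signature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
    if (!signature.every((b, i) => bytes[i] === b)) {
        throw new Error("not a PNG");
    }
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    return {
        width: view.getUint32(16), // DataView reads big-endian by default
        height: view.getUint32(20),
    };
}
```

so obtaining size metadata really shouldn’t cost much; whether the slowness is in the crate or in my usage of it is what profiling would tell me.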
if I wanted to implement incremental builds, I feel like the dependency tracking would get pretty hellish pretty quickly.
say a `.tree` file uses the `include_static` template directive to include some file into itself. now, in addition to compiling the `.tree` file when it changes, I’d also need to recompile the `.tree` file when that `include_static`’d file changes too—and that sort of dependency tracking is ripe for bugs as the codebase grows more complex!
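to illustrate the bookkeeping involved, here’s a sketch of the reverse dependency map such a scheme would need (a hypothetical `DependencyTracker`; the treehouse deliberately doesn’t do this):

```javascript
// Reverse dependency map: static file path -> set of .tree files that
// include it. To rebuild incrementally, a change to a static file must
// invalidate every .tree file that depends on it.
class DependencyTracker {
    constructor() {
        this.dependents = new Map();
    }

    // Record that `treeFile` includes `staticFile`.
    record(treeFile, staticFile) {
        if (!this.dependents.has(staticFile)) {
            this.dependents.set(staticFile, new Set());
        }
        this.dependents.get(staticFile).add(treeFile);
    }

    // When a file changes, return everything that needs recompiling:
    // the file itself (if it's a .tree file) plus all of its dependents.
    invalidate(changedFile) {
        const dirty = new Set(this.dependents.get(changedFile) ?? []);
        if (changedFile.endsWith(".tree")) dirty.add(changedFile);
        return dirty;
    }
}
```

and note this sketch doesn’t even handle transitive includes—which is exactly where the bugs would live.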
Vanilla JS
so I decided to go with Vanilla JS for the treehouse.
in fact, not just for the treehouse. the app I’m building currently is also completely Vanilla JS, for the exact same reasons.
I’ve written about my feelings towards JavaScript before though, so I won’t repeat myself too much here.
live reloading
in debug mode, I have the treehouse reload itself on change automatically.
I actually learned the Web ecosystem has such cool mechanisms for app development back when I was trying out React.js for a little project of mine. I never ended up building anything, because the project idea just didn’t vibe with me, but it provided me with lots of useful knowledge—especially related to how frickin’ cool React is in terms of developer experience.
here’s the script I use for rkgk, hosted under `static/live-reload.js`:

```javascript
// NOTE: The server never fulfills this request; it stalls forever.
// Once the connection is closed, we try to connect with the server until we establish a successful
// connection. Then we reload the page.
await fetch("/auto-reload/stall").catch(async () => {
    while (true) {
        try {
            let response = await fetch("/auto-reload/back-up");
            if (response.status == 200) {
                window.location.reload();
                break;
            }
        } catch (e) {
            await new Promise((resolve) => setTimeout(resolve, 100));
        }
    }
});
```
on the Rust side, these `/stall` and `/back-up` endpoints are implemented like so:

```rust
use std::time::Duration;

use axum::{routing::get, Router};
use tokio::time::sleep;

pub fn router<S>() -> Router<S> {
    let router = Router::new().route("/back-up", get(back_up));

    #[cfg(debug_assertions)]
    let router = router.route("/stall", get(stall));

    router.with_state(())
}

#[cfg(debug_assertions)]
async fn stall() -> String {
    loop {
        // Sleep for a day, I guess. Just to uphold the connection forever without really using any
        // significant resources.
        sleep(Duration::from_secs(60 * 60 * 24)).await;
    }
}

async fn back_up() -> String {
    "".into()
}
```
in the treehouse, I use `tower-livereload`, but I wouldn’t recommend it. compared to the solution I’ve shown here, it has a couple of disadvantages:

- it cats the JavaScript payload directly onto your server’s `Content-Type: text/html` responses, which produces invalid HTML. as far as I can tell, all major browsers seem to parse it correctly, but it’s a pretty ugly hack nevertheless.
- it’s 500 lines of dependency for something you can do with 12 lines of JavaScript, 3 lines of HTML, and 28 lines of Rust!
to trigger the reloads, I use `cargo-watch`, mostly because it’s really convenient. `cargo watch -x run` and you’re done! your site will now reload when you `:w`.
Web Component shenanigans
as the treehouse uses Vanilla JS, I needed some solution for building reusable components that wasn’t React. luckily for me, I already knew about Web Components—in particular, custom elements.
custom elements are just that—custom HTML elements that you can include in your page. according to the MDN docs, these have two flavours:

- customized built-in elements, which extend any built-in element. these are applied using the `is=""` attribute on a base built-in element, like `<li is="your-element"></li>`. unfortunately customized built-in elements are practically useless, because Safari doesn’t implement them, and doesn’t even plan to do so…
- autonomous custom elements, which extend the generic `HTMLElement` and get their own standalone tag, like `<your-element></your-element>`.

but that doesn’t diminish the usefulness of autonomous custom elements either way!
to implement a custom element, I usually use this pattern:
```javascript
class YourElement extends HTMLElement {
    // You can use the constructor of a custom element to require some
    // parameters from other JS code.
    // Note that adding *required* parameters here makes your element
    // practically unusable from HTML, because there's no way to pass them in!
    constructor() {
        super();
    }

    connectedCallback() {
        // Read attributes and add children here.
    }
}

// Choose a different prefix; `owo` is just an example to get you going.
// You *must* choose a prefix, because custom elements require at least one dash `-` in their names.
customElements.define("owo-your-element", YourElement);
```
and off we go!
I’ve found a couple useful idioms for working with custom elements.
DOM construction: a useful idiom is passing the result of `createElement` to `appendChild`. since `appendChild` returns the child, this lets you append a new element to the component and keep a reference to it in one expression (or really append it to anything—it’s a useful idiom overall). I usually follow it up with adding a CSS class for easy styling, naming the CSS class after the object field name on the JavaScript side.
```javascript
this.textArea = this.appendChild(document.createElement("textarea"));
this.textArea.classList.add("text-area");
```
it’s kind of verbose, but if you don’t like it, you’re free to wrap it up in a helper function—I personally don’t mind, since it’s simple code that I usually follow up with more initialization logic.
patching everything together: I usually use plain DOM `Event`s for event handlers that don’t need to return any data back to the component. I prefix their names with `.` to not confuse them with built-in DOM events such as `mousedown`.

it’s pretty convenient to construct events with `Object.assign`, too.

```javascript
this.dispatchEvent(
    Object.assign(
        new Event(".codeChanged"),
        { newCode },
    ),
);
```
you can wrap the event construction in a function too, if the verbosity bothers you—but again, I personally don’t mind.
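for completeness, here’s what the listening side of that pattern looks like. I’ve reduced it to a plain `EventTarget` (which Node has too) so it runs outside a browser—in the real thing, the target is the custom element itself:

```javascript
// Any EventTarget works here; in the browser it would be your custom element.
const editor = new EventTarget();

// The `.` prefix makes it obvious this is our event, not a built-in DOM one.
editor.addEventListener(".codeChanged", (event) => {
    // Extra fields assigned via Object.assign ride along on the event object.
    console.log("new code:", event.newCode);
});

editor.dispatchEvent(
    Object.assign(new Event(".codeChanged"), { newCode: "let x = 1;" }),
);
```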
cache bust a little, cache bust some more
cache busting is a super cool technique for ensuring the browser does not download assets that haven’t changed. essentially, for each asset you serve to the user, you compute a hash that’s then included in all URLs referencing the asset from your website.
and that’s it! the actual value of the `?cache` parameter is never interpreted by anyone, anyhow. it’s only there so that whenever something does change, we change the URL, and the browser thinks “hey, that’s a different asset! gotta download it.” that way, the browser only ever downloads files that changed since your last visit.
initially I implemented cache busting for most static assets, because that’s pretty easy to do: add a helper to your templating engine that can derive these `?cache`-augmented URLs by computing a hash of the linked file.

in my case I use BLAKE3—as indicated by the `b3-` prefix—but the choice of hash function shouldn’t matter that much; I just chose a fast cryptographic hash for a lower likelihood of collisions, which would of course cause assets not to get redownloaded, if that ever happened.
the far bigger challenge was making this work for JavaScript files.
CSS doesn’t refer to too many assets—there are fonts and a couple of images in that one blog post’s stylesheet, which will probably never change, so we can hardcode those.
treehouse is built on ES modules. as I mentioned before, I don’t bundle or minify anything, because HTTP/2 makes using plain ES modules quite efficient, as long as you import them all from your main HTML file. the problem is that if you’re referring to modules like this…
```javascript
import "treehouse/vendor/codejar.js";
```
how the heck are you going to add that `?cache` parameter in there?

as you can see in the previous example, the treehouse had already been using import maps by that point. for those of you who don’t know, these are handy little bits of JSON that tell the browser’s JavaScript runtime where to source your modules from.
so, “sure,” I say. “I’ll have to include the whole import map verbatim in each `.html` file. no big deal, we don’t cache `.html`s anyways…”

it’s kind of sad, because caching `.html` files would allow me to cache linked branches (such as this one)—I’d love to get rid of the `Loading...` text entirely once you’ve loaded a branch. but while that is feasible, it’d probably benefit snappiness less than I’d like, due to import maps influencing the hash of each `.html` file—and therefore each time I add a `.js` file, all cached HTML files would get busted… oh well.
and with an import map implemented, I go look at my glorious generated sources, and see… that my import map keys change every build!
this is really stupid, but Rust (and other languages) randomises the ordering of hash maps to prevent hash flooding DoS attacks, which means you can’t use them to generate deterministic data—such as a file that shouldn’t change across rebuilds!
so I swapped the `std::collections::HashMap` out for an `indexmap::IndexMap`, sorted it after generation, and everything’s working smoothly.
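in JavaScript terms, the fix amounts to sorting the keys before serializing—plain objects preserve insertion order, so inserting in sorted order is enough. a sketch (`importMapJson` and the module names are made up, not treehouse code):

```javascript
// Build an import map with deterministically ordered keys, so the generated
// HTML stays byte-for-byte identical across rebuilds.
function importMapJson(modules) {
    const imports = {};
    for (const name of [...modules.keys()].sort()) {
        imports[name] = modules.get(name);
    }
    return JSON.stringify({ imports }, null, 2);
}
```

with this, the same set of modules always serializes to the same bytes, regardless of the order they were discovered in.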
Djot down some notes
I’d initially chosen Markdown as my website’s markup language, simply because I was already familiar with it, and because I’d seen the Rust ecosystem had a nice parser for it that seemed pretty customizable.
as time went on though, I discovered another light markup language: Djot, made by John MacFarlane (the person behind CommonMark and pandoc), with lots of lessons learned from his previous attempts.
the one thing that sold me on Djot was how easy it is to create custom HTML elements. for instance, it has syntax for a div:
```
::: class-goes-here
I'm in a div!
:::
```
or a span:
```
[I'm in a span!]{.whatever}
```
which is really cool if you’re doing a lot of bespoke markup in your blog posts.
ultimately, the switch mostly came down to converting `*abc*` into `_abc_`, `**abc**` into `*abc*`, and `~~abc~~` into `{~abc~}`, fencing any inline HTML off with some `=html` raw blocks, and fixing up some links—because Djot forces you to use two pairs of `[]`, like `[here's a link that's defined elsewhere][]`, instead of just one like Markdown.

and in the end, interestingly, the switch made parsing slightly faster, and the HTML generation code slightly cleaner!
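the mechanical part of that conversion fits in a couple of regexes. a very rough sketch (it ignores code blocks, escapes, and nesting—real Markdown needs a real parser—and `markdownToDjot` is made up for illustration):

```javascript
// Convert Markdown emphasis/strikethrough syntax to Djot's:
//   **abc** -> *abc*    (strong)
//   *abc*   -> _abc_    (emphasis)
//   ~~abc~~ -> {~abc~}
function markdownToDjot(text) {
    return text
        // Match ** before * so strong doesn't get mangled into emphasis.
        .replace(/(\*\*|\*)([^*]+)\1/g, (_, stars, inner) =>
            stars === "**" ? `*${inner}*` : `_${inner}_`)
        .replace(/~~([^~]+)~~/g, "{~$1~}");
}
```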
the HTML generation code got cleaner, because the crate I’m using—`jotdown`—does not use a callback for filling in broken links: a pattern that’s best known under the moniker “yeah, don’t do that” in the Rust world.

I found the easiest way of going about writing your own HTML generator is copying the built-in one in your light markup parsing library of choice, and adjusting it to your needs. so yeah. mine’s mostly stolen code.
as an aside, there’s one frustrating thing about the Rust ecosystem: why does everything have to be a trait? I don’t think I’ve declared a trait once in my recent projects—both in the treehouse and rkgk—yet I constantly see examples of unnecessary traits such as this one in the wild.
a `rustc` stability benchmark

this is a weird one, but: sometimes `rustc` will choke up on evaluating obligations for the implementation of `Unpin` for `SemaBranch`, and just… die.

what’s funny is that it dies dropping a `rustc-ice-YYYY-MM-DDTHH_MM_SS-NNNNN.txt` file into your current working directory, and combined with `cargo-watch`, this has the super funny effect of generating tens of those files in the treehouse repo.

I have forgotten to remove these before checking in at least once. I don’t think I ever ended up pushing that commit though, so I can’t show you… but you can see the ICE I have stored in my local Git history here.
I unfortunately don’t have a consistent repro for this, though `rustc` has told me this is a known issue. it’s weird that it only happens with the treehouse. like there’s a ghost inhabiting it…