Porting my blog to Zola
Moving away from GitHub Pages and Jekyll
I've hosted my blog for a few years on GitHub Pages and Jekyll, sporadically updating it with new content. Recently I've grown frustrated with the difficulty of writing posts and of synchronizing drafts between devices without pushing to my GitHub repository, so I began researching server-based self-hosted editing environments. (In the end I abandoned server-based CMSes and stuck to static site generators, but used Syncthing to synchronize my drafts and unpushed Git branches between machines and phones. Additionally, I set up an always-on central hub node, using an old Android phone running Syncthing-Fork in root mode, so my machines wouldn't fall out of sync even if they weren't on at the same time.)
I looked into using Netlify CMS for my posts, but I abandoned it after realizing that it was not designed to be easily self-hosted or run locally.
The first idea I tried out was hosting a private WordPress server on a Pi running in my LAN (to avoid the security nightmare of exposing WordPress plugins to hackers), and exporting static sites to GitHub Pages. But the Gutenberg editor proved to be a buggy trainwreck for editing blog posts, and many of my cursor movement/editing bug reports were ignored. (Interestingly, Gutenberg is placing some effort in editing web design site layouts rather than article content.) The classic HTML-based editor was more workable, but was prone to producing inconsistent styling from pasted HTML. And many WordPress themes were highly commercialized (such as paid themes/features) or built for storefronts rather than blogs, and some noncommercial themes had translation errors from another language or were missing ids for each header. I tried installing the Android WordPress app to edit my site on the go, but the app's editing interface was half-baked (it rendered syntax-highlighted code blocks as black boxes), and was even worse than the Markor markdown editor. I eventually abandoned trying to migrate my blog posts to WordPress and writing new articles in it.
Hugo was an option, and GitLab Pages offered it as an example. However, from my initial (flawed) research, I found that Hugo stored images globally like Jekyll (making it harder to bundle images with an article), rather than next to articles. This was fixed in Hugo 0.32 (release notes) and documented at https://gohugo.io/content-management/image-processing/ and https://gohugo.io/content-management/organization/, but I failed to read those links when doing my research. If I were to rebuild this site again, I would definitely research using Hugo.
I tried setting up a Zola site locally, but I had difficulty installing a theme and configuring my site to use it. I ran through the Zola themes page trying to pick a theme I liked. While most theme demos included syntax-highlighted code blocks without needing JS (since this is a core Zola feature), some themes had their layout break with JS turned off, and some appeared abandoned. Additionally Zola required manual setup to get a demo site up and running, and I didn't know how to do it since each theme expects a different layout of Markdown files and parameters (for history/archives, About pages, and posts).
In the end, I deferred actually porting my site off Jekyll for a few months (joining my expansive graveyard of abandoned projects). But my frustrations with Jekyll returned when I tried editing my latest blog post on an old laptop I revived with Void Linux, and wanted to generate a Markdown page preview. The Jekyll version used by GitHub Pages required Ruby 2, which was absent from both void-packages and rvm, so I had to compile it from scratch... on a laptop Core 2 Duo... At that point I swore to find a new platform for my blog.
Setting up the Zola site
Around that time, I happened to read "Arc and Mutex in Rust" and was struck by its typography. Once I finished publishing my blog post, I downloaded Tale-Zola to my Zola project, set up the sample site (by copying sample posts from the theme into my site), and began changing theme layouts, discovering and fixing invalid HTML generated by the theme, replacing fonts, increasing text contrast, improving spacing, adding visited link colors, cursing out CSS specificity rules...
I struggled to figure out Zola at first. In particular, the relationship between index.md, _index.md, and pages in my site, and templates like index.html, section.html, and page.html in the theme, was unclear. Sections and pages each have a "hidden" default (convention over configuration) template name, which can be overridden by parent sections or the .md file itself.
I couldn't figure out what template .html was used for each section or page .md by reading the sample site alone; it was very confusing to remove individual templates and watch the build break, as Zola tried to reference missing template files whose names never appeared anywhere in the theme and configuration. I was also overwhelmed by the search results for zola page.html and similar queries. In the end, I managed to partly figure out the template names and continue working on my site, before taking a break to read the docs fully (which did explain the template naming/default system).
- I would much rather require the project configuration to explicitly spell out which template is used for the site root, sections, and pages. This way, if I want to see which pages are using index.html, I can grep my project for that filename, and find a configuration line mapping specific content files to that template. (See discussion on convention over configuration.)
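For reference, the "hidden" defaults can be modeled in a few lines. This is a minimal Python sketch based on my reading of the Zola docs, not Zola's actual code: the function and parameter names are made up for illustration, and it assumes a page's own template setting beats a parent section's page_template, which in turn beats the built-in default.

```python
from typing import Optional, Sequence


def content_kind(filename: str) -> str:
    """_index.md defines a section; any other .md file (including index.md) is a page."""
    return "section" if filename == "_index.md" else "page"


def resolve_template(
    kind: str,
    own_template: Optional[str] = None,
    ancestor_page_templates: Sequence[Optional[str]] = (),
    is_root_section: bool = False,
) -> str:
    """Return the template name assumed to be used for a section or page.

    ancestor_page_templates lists the page_template value of each enclosing
    section, nearest section first (None where a section sets nothing).
    """
    # A template named in the .md file's own front matter always wins.
    if own_template:
        return own_template
    if kind == "section":
        # Defaults that never appear in the theme or config.
        return "index.html" if is_root_section else "section.html"
    # Pages fall back to the closest parent section's page_template, if any.
    for page_template in ancestor_page_templates:
        if page_template:
            return page_template
    return "page.html"
```

For example, resolve_template("page") gives page.html even though no file in the project mentions that name, which is why deleting page.html from a theme breaks the build.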
The Zola docs were comprehensive and fairly clear, but the material was dense, and I had to consciously read, take notes, and stop reading whenever I found myself glazing over the pages. In any case, I now better know how to work with Zola, though I probably wouldn't be able to write a site config or theme from scratch.
Porting my content
After I finished setting up the site's configuration and theming, I got to work importing my old blog articles into Zola.
Markdown is not a well-specified markup format, but a family of subtly-incompatible dialects, and every blog engine uses a different Markdown parser and interprets text slightly differently. So if you feed existing pages into a different blog engine, it's practically guaranteed to break old blog posts, and you have to read over your entire site to find and fix the breakage. (Normalizing all pages might make them look identical in different parsers; I haven't tried normalizing my Markdown and haven't found widely-used tools for doing so.)
In my case, I had to unindent some bullet points in a footnote (which Zola interpreted as an indented code block), then move the footnote to the bottom of the page (to match Jekyll doing it automatically, though I don't know if it's a good thing). Zola appears to place footnotes where written, rather than at the bottom of the document like Jekyll. And Zola "supports" multi-line footnotes as a footnote line followed by regular lines (which looks identical to a multi-line footnote, since footnotes are rendered in-place), though you can put a multi-line blockquote in a footnote.
Additionally, in Jekyll, I had created custom _includes/ HTML files holding templates for images with captions, and embedded them in documents using {% include img-125.html src="file.png" alt="caption" %}. I ported this to Zola's shortcode system, which was manageable once I figured out it was called a shortcode. The shortcodes are embedded using {{ img125(src="file.png", alt="caption") }}.
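If you have many posts to convert, the rewrite from include tags to shortcode calls can be scripted. The following is a rough Python sketch rather than the method used for this site: the function name is hypothetical, it only handles double-quoted key="value" arguments, and it derives the shortcode name by simply dropping hyphens from the include's filename.

```python
import re

# Matches Jekyll tags such as: {% include img-125.html src="a.png" alt="b" %}
INCLUDE_RE = re.compile(r'\{%\s*include\s+([\w-]+)\.html\s*(.*?)\s*%\}')
# Matches key="value" pairs; values containing double quotes are not handled.
ARG_RE = re.compile(r'(\w+)="([^"]*)"')


def include_to_shortcode(text: str) -> str:
    """Rewrite Jekyll include tags in text as Zola shortcode calls."""

    def replace(match: re.Match) -> str:
        # Shortcode names cannot contain hyphens, so img-125 becomes img125.
        name = match.group(1).replace("-", "")
        args = ", ".join(
            f'{key}="{value}"' for key, value in ARG_RE.findall(match.group(2))
        )
        return "{{ " + name + "(" + args + ") }}"

    return INCLUDE_RE.sub(replace, text)


if __name__ == "__main__":
    print(include_to_shortcode(
        '{% include img-125.html src="file.png" alt="caption" %}'
    ))
```

The matching shortcode template (here img125.html) still has to be written by hand in the Zola project; the script only rewrites the call sites.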
In the process of writing and calling shortcodes, I had to look back and forth between the docs for Zola and for Tera (Zola's templating engine). I ran into a few issues making shortcodes work:
- I kept mixing up {{ }} (Tera's self-closing expressions) and {% %} (Tera's paired control-flow tags, but Jekyll uses the same syntax for self-closing expressions).
  - The Tera docs distinguish {{ for expressions and {% for statements, but don't explain that expressions expand to values, whereas statements define regions and control flow.
  - The Zola shortcode docs neither explain what {{ and {% mean, nor link to the Tera docs.
- Hyphens are not allowed in shortcode names, unlike Jekyll.
Zola is a lot pickier about date formats than Jekyll, requiring either yyyy-mm-dd or yyyy-mm-ddThh:mm:ss±hh:mm, and rejecting other date formats. It took me a while to figure this one out, before I had read the docs and found the valid date format. Interestingly, bare yyyy-mm-dd dates always record a timezone of +00:00 in the Open Graph and JSON-LD metadata. I wonder if this is a good or bad thing, or if it doesn't matter at all.
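A small pre-flight check can catch bad dates before Zola does. This is a minimal Python sketch with a hypothetical function name; it accepts only the two forms named above, so it may be stricter than Zola's real parser, which could allow other variants.

```python
import re
from datetime import date, datetime

# yyyy-mm-dd
DATE_ONLY = re.compile(r"\d{4}-\d{2}-\d{2}")
# yyyy-mm-ddThh:mm:ss±hh:mm
DATE_TIME = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2}")


def is_accepted_date(value: str) -> bool:
    """Return True if value has one of the two shapes and is a real date."""
    try:
        if DATE_ONLY.fullmatch(value):
            date.fromisoformat(value)  # rejects e.g. 2022-02-30
            return True
        if DATE_TIME.fullmatch(value):
            datetime.fromisoformat(value)  # rejects e.g. hour 25
            return True
    except ValueError:
        return False
    return False
```

Running this over every post's front matter date before the first build turns a string of one-at-a-time build errors into a single list of files to fix.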
I removed comments because nobody used them anyway ;) and I didn't take the time to redo the Utterances integration.
Hosting
Jekyll is the only static site generator with special hard-coded support in GitHub Pages. After switching to Zola, I had to figure out how to make CI generate sites from the Markdown sources, and serve them.
GitLab Pages
GitLab has a dedicated Pages build process; every time you push to your repo, GitLab runs your CI workflow with the input repository, and copies the HTML it generates directly to the hosting service. Unlike GitHub Pages/Actions, this doesn't require setting up an access token, or creating a "pages" branch in your repository and force-pushing to it when your source branch is pushed to. GitLab Pages is a cleaner approach, but arguably less transparent since you can't browse the generated static site as a Git tree and inspect what files are being served.
GitHub Pages
You can run Zola on GitHub Pages (Zola CI docs), but you need to use a GitHub action (zola-deploy-action) which force-pushes a single commit with compiled output to the gh-pages branch every time you push new data to a source branch.
Strangely, the Zola CI docs say to create a personal access token and assign it to the TOKEN environment variable, whereas the action docs say to specify GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} (secrets.GITHUB_TOKEN is created by GitHub on every job, and does not need manual setup). I found secrets.GITHUB_TOKEN to work well, though there's a delay after the first action run creates the gh-pages branch, before you can reconfigure your GitHub repo to publish it.
- Interestingly, the unofficial GitHub Pages action (which pushes but does not build a site) can work with both a personal access token and GITHUB_TOKEN, but the latter has some limitations.
Worse yet, zola-deploy-action is remarkably slow, because every CI run rebuilds a Docker container from scratch, using a Dockerfile with 13 operations each with 1 second of overhead. I vendored a fork of the action with this issue fixed (zola-deploy-action fork), and discussed fixing it upstream (upstream discussion), but the discussion stalled over which Linux distro and version of Zola to build the sites with.
Additionally, the Zola action doesn't create a .nojekyll file (which turns off running Jekyll on Zola's output), whereas the unofficial GitHub Pages action does. Omitting the file may slow down deploying the resulting static site. I edited my fork of zola-deploy-action to create a .nojekyll file, but I didn't notice a speed difference (though it's still good because it avoids the chance of Jekyll mangling the site, like breaking symlinks).
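If you deploy with your own script instead of an action, the marker file is easy to add after the build. Here is a minimal Python sketch, not what my fork of the action does internally: the function name is hypothetical, and it assumes the site was built into a directory such as public (Zola's default output directory).

```python
from pathlib import Path


def add_nojekyll(output_dir: str) -> Path:
    """Create an empty .nojekyll file at the root of the built site."""
    marker = Path(output_dir) / ".nojekyll"
    # Create the output directory if the build has not produced it yet.
    marker.parent.mkdir(parents=True, exist_ok=True)
    # An empty file is enough; GitHub Pages only checks that it exists.
    marker.touch(exist_ok=True)
    return marker


if __name__ == "__main__":
    print(add_nojekyll("public"))
```

The file has to sit at the root of whatever gets pushed to the publishing branch, so run this after the build and before the commit.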
Future of hosting
In any case, following GitHub's decision to charge for Copilot (building a neural network out of the community's code, then selling it back to the community as a commercial product), I'm planning to move away from GitHub, and no longer wish to host my blog on GitHub Pages. Instead of using GitHub Actions to deploy the Zola rewrite to nyanpasu64.github.io, I will be freezing it as-is or redirecting to nyanpasu64.gitlab.io. In the future, I may move my blog from gitlab.io to a more open hosting service, like Codeberg Pages (currently static-only), or sourcehut pages with built-in CI.
2022-08-07 EDIT: Unfortunately, when linking to nyanpasu64.gitlab.io pages in Discord, the embedded preview often (but not always) shows a GitLab login preview rather than the page contents itself. I may consider switching to another host, and create a local script to build and push the generated site. Possible hosts:
- Codeberg code hosting has experienced downtime and slowdown. I do not know if their site hosting is reliable.
- sourcehut CI is paid and sourcehut itself may become paid in the future, and I have no income right now.
- The Neocities CLI is an option, once I set up a workflow around building and pushing a site locally (rather than from CI).
Conclusions
Zola is a good engine. The docs were mostly comprehensive outside of template/shortcode syntax, though I did come across one piece of outdated information (which I contributed a pull request to fix), and the CI scripts and docs were inconsistent. The docs are well-suited for reading front-to-back and skipping irrelevant sections, and less well suited for searching for help performing a specific task (you find a page in the middle of the docs, but the information is incomplete, so you flip between other pages looking for context to figure out the full picture), but they're still a lot better than other programs I've worked with.
The command line is 🚀✨ blazing fast: it builds my entire blog in around 40-50 ms and rebuilds in 20-30 ms on my Ryzen 5 5600X desktop on Linux (slower on Windows lol), and rebuilds in around 1 second on my aging Core 2 Duo laptop running Void Linux on a mechanical drive.
Outside of my learning pains, I'm very satisfied with rewriting my blog in Zola. I hope it will continue to be maintained and grow with my site, and I suuure hope that Arch, Void, and Alpine won't start shipping Zola versions with incompatible changes...