(How) I Gave a Bash SSG “Seamless Updating” Functionality [With Source Code]

Note: The Bash code block snippets in this article are Copyright 2019–2021, Anton Mcclure, and are licensed under the MIT License.

A little over half a year ago I wrote a proprietary static site generator for a short-lived static HTML version of AntonMcClure.com. It quickly grew out of control and became too disorganized. Instead of giving up on the code, I later gave it a slight rewrite, simplified it considerably, and released it under an open-source license as ezSite, where it sat practically abandoned on GitHub until some people on IRC showed interest in the program and started making suggestions and pointing out overlooked errors they were seeing. Realizing that ezSite actually had the potential to be a usable static site generator, I started actively maintaining it. While I don't use it on AntonMcClure.com personally, a technical "fork" of it runs a small personal website of mine.

Even though it worked okay for an admittedly quickly written SSG, I didn't like how it did site updates. These worked by simply generating new pages, deleting the old ones, and then moving the new pages in. Even though the SSG splits some functions into different files, it essentially worked like this:
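Here's a rough sketch of that flow. The directory names and the converter command are my assumptions for illustration, not ezSite's exact code:

```bash
# Old update flow: build everything into a staging directory, wipe the
# live site directory, then move the new pages in.
mkdir -p staging
for src in pages/*.md; do
  markdown "$src" > "staging/$(basename "$src" .md).html"
done
rm -f site/*.html       # the site is briefly empty right here
mv staging/*.html site/
```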

This worked for a basic SSG, but if it was running while someone was connected to your site, the visitor would see errors or an unwanted directory-browsing page, since the pages were simply deleted. This is not something people like to see when they visit a website. It makes your site seem unstable, and visitors may think your site is broken, which it technically would've been for those few milliseconds or seconds.

While working on the personal website, I got an idea: why not load the new pages first, and then remove the "deleted" pages from the web directory? The only problem now was how I was going to do that.

The solution I settled on was to have the program write two lists: one of the Markdown files in the pages/ directory, and one of the HTML pages in the site directory used by the web server software. It saves the list of "source" Markdown pages to one temporary text file, and the list of "published" HTML pages to another temporary file, as seen below:
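Something along these lines (the temporary file names here are my own placeholders, and the names are stripped of paths and extensions so the two lists are directly comparable):

```bash
# List every "source" page by bare name, one per line.
for src in pages/*.md; do
  basename "$src" .md
done > /tmp/source-pages.txt

# List every "published" page the same way.
for pub in site/*.html; do
  basename "$pub" .html
done > /tmp/published-pages.txt
```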

Now that we have the lists of files, we can use them to determine which pages are no longer in our pages/ directory and need to be deleted. To do that, I used the sort command to combine and sort both files, and then piped the result through uniq, a command that reports or filters out repeated lines, with the -u flag to output only the lines that are not repeated. Since the new pages have already been moved into the site directory at this point, any name appearing in only one list must be a published page whose source was removed. For each line that is output, the corresponding .html file is then deleted, as shown in the code below:
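A sketch of that comparison step, again using my placeholder file names:

```bash
# Names unique to one list are stale published pages; delete their HTML.
sort /tmp/source-pages.txt /tmp/published-pages.txt | uniq -u |
while read -r page; do
  rm -f "site/${page}.html"
done
```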

Since the temporary files are no longer needed, it then deletes them:
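```bash
# Clean up the scratch files (same placeholder names as above).
rm -f /tmp/source-pages.txt /tmp/published-pages.txt
```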

The reason I used temporary files, rather than trying to compare ls -1 outputs directly, is ease of debugging: if the lists don't generate properly for some reason, they can be inspected on disk, which should make any issues with the feature easier for users to track down.

As once tweeted by Jeff Bezos: "I have decided to give back to my community." However, instead of (not) giving away Bitcoin, I'm giving away the code snippets for this functionality under the MIT License, a free & open-source software license, for anyone to use in their projects, along with the program itself. Hopefully you find this code useful for your projects, scripts, or other Bash things!

