Where have you been? Somewhere around 1998, someone created a tool which I wish I’d known about: sitecopy.

And it does what I’d been doing for bilikfamily.com a decade ago, just in a different manner.

When I was writing entries for my site, I’d usually create everything on my local computer, preview it, and, when happy with it, “deploy” it (i.e. upload it) to my web server.

This is a common approach with smaller sites that aren’t Amazon, Facebook, etc. It has the upsides of faster page loads, a much lighter load on the server, and far better security against internet exploits. If you don’t touch your website for a week, you need not worry that hackers might find some code exploit and take over your website.

The problem is that you’d usually prefer to send only the incremental differences. Uploading everything is redundant and a waste of time and network bandwidth.

In my days of writing bilikfamily.com, I had a simple process:

  1. “touch” a timestamp file prior to doing any updates. This is a file with nothing in it; what matters is its modification timestamp, which serves as a reference point later.

  2. Now do any writing and local website updates, iterating and previewing locally until you’re ready to upload.

  3. The content management system (CMS) I used back then would only create/update the HTML files it needed to, leaving the rest of my local copy of the site alone. That’s because the CMS was originally designed to run on the server itself. Static site generators rarely work this way nowadays.

  4. Many years ago I’d written a small program that traversed the local copy of the web site, determined which files were “newer” than that timestamp reference file, and uploaded just those to the server hosting bilikfamily.com. Uploads typically took 10-20 seconds when I was ready to commit to the public-facing bilikfamily.com. (A sketch of this idea follows the list.)
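
Here’s a minimal version of that idea sketched in Python. It isn’t the original program: the local path, host, and credentials are placeholders, and it assumes the matching directory tree already exists on the server.

    import ftplib
    from pathlib import Path

    SITE_ROOT = Path("~/site/public").expanduser()  # local copy of the site (placeholder)
    STAMP = SITE_ROOT / ".last-upload"              # the touched reference file

    def changed_files():
        """Yield every file modified more recently than the reference file."""
        since = STAMP.stat().st_mtime
        for path in SITE_ROOT.rglob("*"):
            if path.is_file() and path != STAMP and path.stat().st_mtime > since:
                yield path

    def upload(host, user, password):
        """Upload only the changed files; remote directories must already exist."""
        with ftplib.FTP(host, user, password) as ftp:
            for path in changed_files():
                remote = path.relative_to(SITE_ROOT).as_posix()
                with path.open("rb") as f:
                    ftp.storbinary(f"STOR {remote}", f)
        STAMP.touch()  # reset the reference point for the next round of edits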

Now that I’m trying out hugo for content management, I can’t guarantee that it’ll update only the minimal set of files locally. Incremental builds are not its strength; its strength is raw build speed. And because it rewrites its output files wholesale on each build, you can’t rely on timestamps to determine what is actually new.

This sitecopy program takes the approach I’d had in mind as a backup plan: using md5sums. As a very gross oversimplification, think of an md5sum as the mathematical sum of all of the bytes in a particular file. sitecopy keeps a list of the files it has seen and uploaded, along with their respective md5sums. It traverses the local web copy directory; any new files are md5summed, noted for the future, and uploaded. Any existing file whose md5sum doesn’t match the one sitecopy recorded before is considered modified: it gets uploaded and its new md5sum is noted for the future.
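
In case that bookkeeping is clearer as code, here’s a rough sketch of the idea in Python. To be clear, sitecopy keeps its own state files; the JSON manifest and paths here are purely illustrative.

    import hashlib
    import json
    from pathlib import Path

    SITE_ROOT = Path("~/site/public").expanduser()     # local web copy (placeholder)
    MANIFEST = Path("~/.site-md5s.json").expanduser()  # previously recorded md5sums

    def md5sum(path):
        """The md5 of a file's contents, as the md5sum utility would print it."""
        return hashlib.md5(path.read_bytes()).hexdigest()

    def files_to_upload():
        """Return new or modified files and update the recorded md5sums."""
        seen = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
        pending = []
        for path in SITE_ROOT.rglob("*"):
            if not path.is_file():
                continue
            rel = path.relative_to(SITE_ROOT).as_posix()
            digest = md5sum(path)
            if seen.get(rel) != digest:  # new file, or contents changed
                pending.append(path)
                seen[rel] = digest       # note the new md5sum for next time
        MANIFEST.write_text(json.dumps(seen, indent=2))
        return pending

A real tool would record each new md5sum only after its upload succeeded; this sketch notes them all up front for brevity.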

I’m interested to see how this works out. Since I’ll be uploading over old-school FTP (File Transfer Protocol), this isn’t the kind of deployment hugo was built for. All The Kids These Days use snazzy techniques that my server still doesn’t support. :-/
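
If it pans out, the setup should amount to a site definition in ~/.sitecopyrc, something roughly like this (the server, paths, and credentials are invented; check sitecopy’s man page for the exact directives):

    site bilikfamily
      server ftp.example.com
      protocol ftp
      username myuser
      password mypassword
      local /home/me/site/public/
      remote /public_html/
      state checksum

The “state checksum” line is what asks sitecopy for the md5-based change detection described above, rather than its default of comparing file timestamps and sizes.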