Cinobelix chronicles - Librarian
Remember Cinobelix, the big guy with an insatiable hunger for data? Let’s kick off this year’s stories with one of his adventures. This isn’t just the first post of the new year; it’s also the inaugural entry in a series where I’ll demonstrate practical examples of working with CinodeFS.
Key holder
Cinobelix is holding the keys to the Cinode Maps project. Not in the sense that we humans tend to use keys: he won’t be opening any doors. He keeps the keys that give him the power to perform updates.
So far, Cinobelix has been responsible for map tile updates sourced from OpenStreetMap. The algorithm is still very immature, but that’s about to change with a smarter update mechanism. That improvement won’t be the topic of today’s discussion though. Today we’ll talk about libraries.
Serious task of keeping things up-to-date
Today’s software, including web pages, is created using external libraries. That way we can efficiently write software, reusing code that is already battle-tested in other projects. But depending on such external code comes with a significant cost: dependency updates.
It’s easy to forget about a small library used in a project. Those libraries are (still) made by people, and people make mistakes. Every library may carry a hidden vulnerability within its code. There could be a bug that may manifest months or even years after it was first created. Fixing a vulnerability requires updating the library that a project builds on top of. Such a process may be time-consuming, especially when the number of used libraries is large. It’s always better to automate such things.
As you may have guessed, such a library update task is a perfect job for Cinobelix. The Cinode Maps project, Cinobelix’s main responsibility, uses the Leaflet library. That library is referenced in the main maps HTML file through a dedicated dynamic link. The purpose of that link is to point to the newest version of the Leaflet library. Whenever the content behind that link changes, Cinode Maps will automatically use the updated library version.
The cool thing about such an approach is that the same dynamic link can be used by many other projects. Every time Cinobelix finds out about a new version and updates the link, all those projects start using the newer version straight away.
Where the news comes from
The first step of updating the Leaflet library is to figure out where news about a new version comes from. Since the project is hosted on GitHub, this task is pretty simple. We just need to find out what the latest GitHub release is and update the library if it differs from the version present in the Cinode datastore.
A quick lookup into Go’s library ecosystem reveals that there’s a nice GitHub API library maintained by Google. Of course, we will be using only a fraction of its functionality, but having a good and well-tested library definitely speeds up building our updater code.
Let’s take a look at a small snippet that grabs the latest version info from the Leaflet project:
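A minimal sketch of this step, using only the standard library against GitHub’s REST endpoint for the latest release rather than the go-github client mentioned above. The type and function names (release, asset, parseRelease, latestRelease) are illustrative assumptions, and the JSON in main is a made-up sample, not real release data.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// asset mirrors the two release-asset fields this sketch needs.
type asset struct {
	Name               string `json:"name"`
	BrowserDownloadURL string `json:"browser_download_url"`
}

// release mirrors the subset of GitHub's release object used here.
type release struct {
	TagName string  `json:"tag_name"`
	Assets  []asset `json:"assets"`
}

// parseRelease decodes the JSON body of a "latest release" response.
func parseRelease(body []byte) (release, error) {
	var r release
	if err := json.Unmarshal(body, &r); err != nil {
		return release{}, fmt.Errorf("decoding release: %w", err)
	}
	if r.TagName == "" {
		return release{}, fmt.Errorf("release has no tag name")
	}
	return r, nil
}

// latestRelease asks GitHub for the latest release of owner/repo.
func latestRelease(ctx context.Context, owner, repo string) (release, error) {
	url := fmt.Sprintf("https://api.github.com/repos/%s/%s/releases/latest", owner, repo)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return release{}, err
	}
	req.Header.Set("Accept", "application/vnd.github+json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return release{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return release{}, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return release{}, err
	}
	return parseRelease(body)
}

func main() {
	// Offline demo with a made-up response; a real run would call
	// latestRelease(context.Background(), "Leaflet", "Leaflet") instead.
	sample := []byte(`{"tag_name":"v0.0.0-example","assets":[{"name":"leaflet.zip","browser_download_url":"https://example.invalid/leaflet.zip"}]}`)
	rel, err := parseRelease(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("tag:", rel.TagName, "assets:", len(rel.Assets))
}
```

Splitting parsing from fetching keeps the decoding logic testable without network access.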
Extractor
After taking a closer look at the release assets, we can easily conclude that the library files are stored in a leaflet.zip archive.
Let’s find that file in release artifacts:
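A small sketch of that lookup, assuming the release assets have already been reduced to name/URL pairs. The asset type and findAsset helper are hypothetical; with the go-github client the same loop would compare asset.GetName() instead.

```go
package main

import (
	"errors"
	"fmt"
)

// asset is a simplified stand-in for a release asset (assumption).
type asset struct {
	Name        string
	DownloadURL string
}

// errAssetNotFound is returned when no asset has the requested name.
var errAssetNotFound = errors.New("asset not found in release")

// findAsset returns the first asset whose name matches exactly.
func findAsset(assets []asset, name string) (asset, error) {
	for _, a := range assets {
		if a.Name == name {
			return a, nil
		}
	}
	return asset{}, fmt.Errorf("%q: %w", name, errAssetNotFound)
}

func main() {
	// Made-up asset list for illustration only.
	assets := []asset{
		{Name: "notes.txt", DownloadURL: "https://example.invalid/notes.txt"},
		{Name: "leaflet.zip", DownloadURL: "https://example.invalid/leaflet.zip"},
	}
	a, err := findAsset(assets, "leaflet.zip")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("download from:", a.DownloadURL)
}
```

Returning an error when the asset is missing lets the updater stop early instead of continuing with an empty URL.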
The zip file can be downloaded through the URL that we get from the asset.GetBrowserDownloadURL() method. To process this zip file, we can use Go’s built-in zip package:
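A minimal sketch of the download-and-open step with archive/zip from the standard library. The helper names (downloadZip, openZipFromMemory, buildDemoZip) are assumptions made for illustration, and main works on a tiny archive built in memory so that it runs without network access.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// openZipFromMemory wraps an in-memory buffer in a zip reader.
// bytes.Reader implements io.ReaderAt, which zip.NewReader requires.
func openZipFromMemory(data []byte) (*zip.Reader, error) {
	return zip.NewReader(bytes.NewReader(data), int64(len(data)))
}

// downloadZip reads the whole HTTP response body into memory and opens it.
func downloadZip(url string) (*zip.Reader, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return openZipFromMemory(data)
}

// buildDemoZip creates a small archive with the given name/content pairs.
func buildDemoZip(files [][2]string) ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, f := range files {
		entry, err := w.Create(f[0])
		if err != nil {
			return nil, err
		}
		if _, err := io.WriteString(entry, f[1]); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// Offline demo; a real run would call downloadZip with the asset URL.
	data, err := buildDemoZip([][2]string{{"dist/leaflet.js", "// demo content"}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	zr, err := openZipFromMemory(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range zr.File {
		fmt.Println("entry:", f.Name)
	}
}
```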
You may have noticed that we first read the file into memory and then pass that memory buffer to a zip reader. That is necessary because the zip reader requires an io.ReaderAt instance. Such a reader allows reading at an arbitrary offset, which is not available for HTTP streams. Leaflet release files are not large, so we can afford to keep one such file in memory.
And now the most important part - storing release files in Cinode:
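A sketch of the upload loop under stated assumptions: fileStore is a hypothetical interface standing in for the CinodeFS instance, and its SetEntryFile signature here is a guess that may differ from the real CinodeFS API. memStore is an in-memory fake, and the demo archive contents are made up.

```go
package main

import (
	"archive/zip"
	"bytes"
	"context"
	"fmt"
	"io"
	"strings"
)

// fileStore is a hypothetical stand-in for CinodeFS (signature assumed).
type fileStore interface {
	SetEntryFile(ctx context.Context, path []string, data io.Reader) error
}

// memStore is an in-memory fake used only for this demo.
type memStore struct{ files map[string]string }

func (m *memStore) SetEntryFile(_ context.Context, path []string, data io.Reader) error {
	content, err := io.ReadAll(data)
	if err != nil {
		return err
	}
	m.files[strings.Join(path, "/")] = string(content)
	return nil
}

// uploadZip streams every file entry of the archive into the store.
func uploadZip(ctx context.Context, zr *zip.Reader, fs fileStore) error {
	for _, f := range zr.File {
		if strings.HasSuffix(f.Name, "/") {
			continue // directory entry, nothing to upload
		}
		name := strings.TrimPrefix(f.Name, "dist/")
		rc, err := f.Open() // yields the uncompressed content as a stream
		if err != nil {
			return fmt.Errorf("opening %q: %w", f.Name, err)
		}
		err = fs.SetEntryFile(ctx, strings.Split(name, "/"), rc)
		rc.Close()
		if err != nil {
			return fmt.Errorf("uploading %q: %w", name, err)
		}
	}
	return nil
}

// runDemo builds a made-up archive and uploads it into a memStore.
func runDemo() (map[string]string, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, name := range []string{"dist/", "dist/leaflet.js", "dist/images/marker-icon.png"} {
		entry, err := w.Create(name) // a trailing slash creates a directory
		if err != nil {
			return nil, err
		}
		if !strings.HasSuffix(name, "/") {
			if _, err := io.WriteString(entry, "demo:"+name); err != nil {
				return nil, err
			}
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	zr, err := zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
	if err != nil {
		return nil, err
	}
	store := &memStore{files: map[string]string{}}
	if err := uploadZip(context.Background(), zr, store); err != nil {
		return nil, err
	}
	return store.files, nil
}

func main() {
	files, err := runDemo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("stored entries:", len(files))
}
```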
In the code above we iterate over all entries in the zip file. The first check in the loop looks for entries ending with a / character. If that’s the case, such an entry points to a directory rather than a file, so we can ignore it.
Also worth noting is that every file in Leaflet’s zip archive is stored in the dist folder. In Cinode, we don’t want to use such indirection, thus the dist/ prefix is stripped from the file name.
Finally, once we get to the entry upload, the fs.SetEntryFile method of the CinodeFS instance is used to upload the content directly into the datastore. The SetEntryFile method takes an io.Reader as an argument, so, unlike opening the zip file from an HTTP response stream, each file can be uploaded by streaming its uncompressed data directly into Cinode without any intermediate storage.
Better safe than sorry
Great power brings great responsibility. From now on, Cinobelix takes on the burden of updating the Leaflet library in the Cinode world. But he has to be cautious. What if someone makes a mistake and there are missing files in the zip archive? What if the format of the archive changes and there’s no dist prefix? Unfortunately, Cinobelix can’t solve serious compatibility issues by himself. He lacks experience… and AI logic too. However, he can perform simple checks to prevent obvious cases of invalid zip archives:
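A minimal sketch of such a sanity check. For simplicity it works on a plain list of entry names (an assumption; the updater would take them from the zip reader’s entries), and validateArchive is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// validateArchive checks that every entry lives under dist/ and that
// the required Leaflet files are present.
func validateArchive(names []string) error {
	// Keys are the files that must exist; found ones get deleted.
	required := map[string]struct{}{
		"dist/leaflet.css": {},
		"dist/leaflet.js":  {},
	}
	for _, name := range names {
		if !strings.HasPrefix(name, "dist/") {
			return fmt.Errorf("entry outside of dist folder: %q", name)
		}
		delete(required, name) // no-op for names that are not required
	}
	// Any key left in the map is a missing file; report the first one seen.
	for name := range required {
		return fmt.Errorf("required file missing: %q", name)
	}
	return nil
}

func main() {
	good := []string{"dist/", "dist/leaflet.css", "dist/leaflet.js"}
	bad := []string{"dist/", "dist/leaflet.js"}
	fmt.Println("good archive:", validateArchive(good))
	fmt.Println("bad archive:", validateArchive(bad))
}
```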
That small code snippet checks two things:
- Whether the two most important files, leaflet.css and leaflet.js, exist in the archive
- Whether all files are located in the dist folder
Finding out if all required files are present may look a bit weird. We start by declaring a map with keys indicating required file names. Whenever we encounter a file with such a name in the zip archive, the entry with this name is removed from the map. This means that if the zip archive contains all the required files, that map should be empty once all zip entries are processed.
To check the map’s emptiness, we simply iterate over this map’s entries. If it is empty, the loop will be skipped completely; otherwise, the first iteration (i.e., some random missing file) will cause an error.
Checking the version
As mentioned before, Cinobelix should only update the Leaflet instance if a new version has been published. But how can we figure out which version is currently stored in Cinode? We could try to analyze the Leaflet files, since version information could be embedded somewhere. Instead, I decided to simply store a small file, .cinobelix/version, next to the Leaflet files. It contains the current version number.
This code reads the currently stored version number:
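A sketch of the version lookup under stated assumptions: entryOpener is a hypothetical interface standing in for the CinodeFS instance, errEntryNotFound stands in for cinodefs.ErrEntryNotFound, and memFS is an in-memory fake. The real CinodeFS method names and signatures may differ.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"strings"
)

// errEntryNotFound stands in for cinodefs.ErrEntryNotFound (assumption).
var errEntryNotFound = errors.New("entry not found")

// entryOpener is a hypothetical read-side view of the filesystem.
type entryOpener interface {
	OpenEntryData(ctx context.Context, path []string) (io.ReadCloser, error)
}

// memFS is an in-memory fake keyed by slash-joined paths.
type memFS map[string]string

func (m memFS) OpenEntryData(_ context.Context, path []string) (io.ReadCloser, error) {
	key := strings.Join(path, "/")
	content, ok := m[key]
	if !ok {
		return nil, fmt.Errorf("%s: %w", key, errEntryNotFound)
	}
	return io.NopCloser(strings.NewReader(content)), nil
}

// currentVersion returns the stored version, or "" if none was stored yet.
func currentVersion(ctx context.Context, fs entryOpener) (string, error) {
	rc, err := fs.OpenEntryData(ctx, []string{".cinobelix", "version"})
	if errors.Is(err, errEntryNotFound) {
		return "", nil // no version file: force an update
	}
	if err != nil {
		return "", err
	}
	defer rc.Close()
	data, err := io.ReadAll(rc)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

// versionOf is a demo helper that runs currentVersion against a memFS.
func versionOf(files map[string]string) (string, error) {
	return currentVersion(context.Background(), memFS(files))
}

func main() {
	v, err := versionOf(map[string]string{".cinobelix/version": "v0.0.0-example"})
	fmt.Printf("stored: %q, err: %v\n", v, err)
	v, err = versionOf(map[string]string{})
	fmt.Printf("empty datastore: %q, err: %v\n", v, err)
}
```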
The logic here is pretty simple - we open the file, read its content, and return this content as a string. You may have noticed the check for the cinodefs.ErrEntryNotFound error when opening the file. This happens when there’s no version file - either the existing Leaflet version was not generated by Cinobelix or the datastore is still empty. In such a case we still want to update the library from the latest release - this is achieved by returning an empty string as the version.
Once all the files are updated, the new version data is saved next to other files and we can “commit” the change by flushing the filesystem:
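A sketch of this final step under stated assumptions: flushableStore is a hypothetical interface standing in for the CinodeFS instance (the SetEntryFile and Flush signatures are guesses), and memStore is a simplified fake in which Flush moves pending writes into a committed set.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"strings"
)

// flushableStore is a hypothetical write-side view of the filesystem.
type flushableStore interface {
	SetEntryFile(ctx context.Context, path []string, data io.Reader) error
	Flush(ctx context.Context) error
}

// memStore is an in-memory fake: writes stay pending until Flush.
type memStore struct {
	pending   map[string]string
	committed map[string]string
}

func newMemStore() *memStore {
	return &memStore{pending: map[string]string{}, committed: map[string]string{}}
}

func (m *memStore) SetEntryFile(_ context.Context, path []string, data io.Reader) error {
	content, err := io.ReadAll(data)
	if err != nil {
		return err
	}
	m.pending[strings.Join(path, "/")] = string(content)
	return nil
}

func (m *memStore) Flush(_ context.Context) error {
	for k, v := range m.pending {
		m.committed[k] = v
		delete(m.pending, k)
	}
	return nil
}

// commitVersion stores the version marker and flushes the filesystem.
func commitVersion(ctx context.Context, fs flushableStore, version string) error {
	err := fs.SetEntryFile(ctx, []string{".cinobelix", "version"}, strings.NewReader(version))
	if err != nil {
		return fmt.Errorf("saving version: %w", err)
	}
	if err := fs.Flush(ctx); err != nil {
		return fmt.Errorf("flushing filesystem: %w", err)
	}
	return nil
}

// commitAndRead is a demo helper returning the committed version marker.
func commitAndRead(version string) (string, error) {
	store := newMemStore()
	if err := commitVersion(context.Background(), store, version); err != nil {
		return "", err
	}
	return store.committed[".cinobelix/version"], nil
}

func main() {
	v, err := commitAndRead("v0.0.0-example")
	fmt.Printf("committed version: %q, err: %v\n", v, err)
}
```

Writing the version marker last means a failed upload leaves the old version in place, so the next run retries the update.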
Thorns
The current Leaflet updater is very simple and straightforward code. The whole updater is less than 200 lines long, most of it dealing with Leaflet’s zip archive. The current CinodeFS API, used to build the updater, is still pretty immature. It lacks some crucial functionality and existing methods may need a bit of refactoring. But the Leaflet example proves that even at this stage, CinodeFS exposes a very powerful concept worth exploring.
There’s no rose without thorns, of course. To keep this version of the updater simple, it does not try to deal with backwards compatibility of the Leaflet library at all. There are a few cases where this may have severe consequences:
- any breaking change introduced in a new version of the Leaflet library may cause trouble for projects using that library
- any version on GitHub marked as the latest one will be picked up by Cinobelix, even if for some reason an older one is marked as such
- GitHub is considered the source of truth; if it is hacked, the project is blocked, or the ownership is somehow stolen, Cinobelix will still trust this data
As you can see, there are many drawbacks. Some of them are easy to fix, such as checking upon update whether the new version is higher than the current one. However, validating the API’s backwards compatibility is a much broader topic and is currently beyond the scope of this discussion.
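The "higher version" check mentioned above could be sketched as follows. This is not part of the updater; it assumes release tags have the plain vMAJOR.MINOR.PATCH form and rejects anything else (for example pre-release suffixes). The parseVersion and isNewer names are illustrative.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns a tag like "v1.2.3" into its three numeric parts.
func parseVersion(tag string) ([3]int, error) {
	var v [3]int
	parts := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("unsupported version format: %q", tag)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return v, fmt.Errorf("invalid version component %q in %q", p, tag)
		}
		v[i] = n
	}
	return v, nil
}

// isNewer reports whether candidate is strictly higher than current.
// An empty current version means nothing is stored yet, so any valid
// candidate counts as newer.
func isNewer(candidate, current string) (bool, error) {
	c, err := parseVersion(candidate)
	if err != nil {
		return false, err
	}
	if current == "" {
		return true, nil
	}
	o, err := parseVersion(current)
	if err != nil {
		return false, err
	}
	for i := range c {
		if c[i] != o[i] {
			return c[i] > o[i], nil
		}
	}
	return false, nil // identical versions
}

func main() {
	for _, pair := range [][2]string{{"v1.10.0", "v1.9.4"}, {"v1.9.3", "v1.9.4"}, {"v1.0.0", ""}} {
		newer, err := isNewer(pair[0], pair[1])
		fmt.Printf("candidate=%q current=%q newer=%v err=%v\n", pair[0], pair[1], newer, err)
	}
}
```

Comparing the components numerically matters here: a plain string comparison would rank "v1.9.4" above "v1.10.0".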
Not the end
Let’s pause our journey here. Cinobelix took a big leap today, but that’s not the end. There are many more stories to bring up in this blog. Some of them will be about Cinobelix, some about CinodeFS…
If you’re interested in the full source code for the Leaflet updater, you can find it in this GitHub repo.
As a bonus, I also recorded the whole code creation process, which you can find in the video below (that’s still a YouTube frame, but I hope to host it directly in Cinode one day):
Author BYO
LastMod 2024-01-15