Over my time using Obsidian, I’ve independently authored around 400 notes, and I’ve kept a relatively consistent schema for my tags and frontmatter attributes:
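An illustrative example (the field names and tags here are simplified stand-ins, not my exact schema):

```yaml
---
# Hypothetical frontmatter in the general shape my notes follow;
# the real field names and tag taxonomy are simplified here.
title: "Some Note"
aliases: ["some-note"]
tags: ["blog/drafts"]
publish: false
date: 2023-01-15
---
```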
Getting too deep into what all of these mean is outside the scope of this post. For now, it’s enough to know that for any Obsidian note, these properties must be present in order for my pipelines to do their job.
Until now, I managed my note frontmatter by hand, or with `grep`. I’ve got a bit of experience using these tools to manipulate text files, so it’s been relatively comfortable but extremely manual.
The problem is that over time, humans get sloppy, forget things, and decide to do things differently. In practice, this doesn’t impact my day-to-day use of the vault in Obsidian; I access most of my notes via the Quick Switcher, so filenames and aliases are what I really focus on.
A place where consistency does matter is automation. Tools that work with Markdown, like static site generators, care a lot about frontmatter metadata.
For these tools to work the way I expect and need them to, I need to guarantee that my notes are configured correctly.
This is a project I’ve been meditating on for a long time. The specific problem I had is that most Markdown frontmatter is YAML. My cursory searching had turned up no satisfying results for a “YAML schema engine”, something to formally validate the structure and content of a YAML document.
I was a fool. For years I’d known that YAML was a superset of JSON, and I’d assumed that the superset part meant no tool expecting JSON could ever be guaranteed to work on YAML, which isn’t acceptable for automation.
The detail that matters is that only the syntax is a superset of JSON. The underlying data types (null, bool, number, string, array, and object) still map onto JSON one-to-one. With that revelation, my work could finally begin.
My implementation language of choice is Go, naturally. Speed, type-safety, and cross-compilation all make for a great pipeline.
Validate() is basically all you need in terms of Go code. The full code repo has a bit more complexity because I’m wiring things through Cobra and stuff, but here’s some sample output:
You get a relatively detailed description of why validation failed and a non-zero exit code, exactly what you need to prevent malformed data from entering your pipeline.
You might notice that when I specify a schema, it’s hosted at `schemas.ndumas.com`. Here you can find the repository powering that domain.
It’s pretty simple, just a handful of folders and the following Drone pipeline:
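A representative sketch of such a pipeline (the step name, image, and paths are illustrative guesses, not the actual file):

```yaml
# Illustrative Drone pipeline: copy the schema files into the web root
# served by Caddy. Paths and the volume mount are assumptions.
kind: pipeline
type: docker
name: publish-schemas

steps:
  - name: deploy
    image: alpine:latest
    volumes:
      - name: webroot
        path: /srv/schemas
    commands:
      - cp -r . /srv/schemas

volumes:
  - name: webroot
    host:
      path: /var/www/schemas
```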
and this Caddy configuration block:
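Something along these lines (the root path here is a guess to match the sketch above):

```caddyfile
# Illustrative Caddyfile block serving the schema files as static content.
schemas.ndumas.com {
	root * /var/www/schemas
	file_server browse
}
```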
Feel free to browse around the schema site.
At time of writing, I haven’t folded this into any pipelines. This code is basically a proof of concept for one small part of a larger rewrite of my pipeline.
The one use case that seemed immediately relevant was for users of the Breadcrumbs plugin, which uses YAML metadata extensively to create complex hierarchies and relationships. That makes it a perfect candidate for a schema validation tool.
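As an illustration (Breadcrumbs lets you configure your own hierarchy field names; `up` and `down` are just common choices, not fixed by the plugin), a schema fragment enforcing such fields might look like:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "up":   { "type": "array", "items": { "type": "string" } },
    "down": { "type": "array", "items": { "type": "string" } }
  }
}
```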