Bash

Creating Obsidian tables of contents

When viewing longer Markdown notes in Obsidian, a table of contents (TOC) helps a lot with navigation. There are a handful of community plugins to help with TOC generation, but I have two issues with them:

  1. Using a plugin creates a dependency on code whose developer may lose interest and eventually abandon the project. At least one dynamic TOC plugin has already suffered this fate.
  2. All of the TOC plugins share the same visual problem. When you navigate to a note, Obsidian places the focus at the top of the note, beneath the frontmatter. That’s fine unless the content starts with a TOC block, in which case what appears first isn’t the rendered TOC but the plugin’s markup for it.

For me the solution was to write a script that scans the vault looking for a pair of markers delimiting the TOC block.
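As a rough sketch of that approach - the marker strings below are hypothetical stand-ins, not the actual pair - the scan might look like this:

BEGIN='%% TOC BEGIN %%'   # hypothetical marker
END='%% TOC END %%'       # hypothetical marker

# Report every Markdown note in the vault that carries both markers
grep -rlF --include='*.md' "$BEGIN" "$HOME/Obsidian" | while read -r note ; do
    grep -qF "$END" "$note" && printf 'TOC block found in: %s\n' "$note"
done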

Bash variable scope and pipelines

I alluded to this nuance involving variable scope in my post on automating PDF processing, but I wanted to expand on it a bit.

Consider this little snippet:

i=0
printf "foo:bar:baz:quux" | grep -o '[^:]\+' | while read -r line ; do
   printf "Inner scope: %d - %s\n" "$i" "$line"
   ((i++))
   [ "$i" -eq 3 ] && break
done
printf "====\nOuter scope\ni = %d\n" "$i"

If you run this not interactively in the shell, but as a script, what will i be in the outer scope? And why?
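The answer: in bash, each stage of a pipeline runs in its own subshell, so the loop increments a copy of i and the outer scope still sees 0. One common workaround, sketched here on the assumption that you’re running bash, is to feed the loop with process substitution so it stays in the current shell:

i=0
# Process substitution keeps the while loop in the current shell,
# so the increments survive the loop
while read -r line ; do
   printf "Inner scope: %d - %s\n" "$i" "$line"
   ((i++))
   [ "$i" -eq 3 ] && break
done < <(printf "foo:bar:baz:quux" | grep -o '[^:]\+')
printf "====\nOuter scope\ni = %d\n" "$i"

Run this way, i = 3 survives into the outer scope, because only the grep runs in a subshell.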

Automating the handling of bank and financial statements

In my perpetual effort to get out of work, I’ve developed a suite of automation tools to help file statements that I download from banks, credit card companies, and other institutions. While the setup described here is tuned to my specific needs, the ideas should be adaptable to your particular circumstances. For the purposes of this post, I’m going to assume you already have Hazel; none of what follows will be of much use without it. I’ll also emphasize that this is a macOS-specific post. Bear in mind, too, that companies have the nasty habit of tweaking their statement formats. That fact alone makes any approach like this fragile, so be aware that maintaining these rules is just part of the game. With that out of the way, let’s dive in.

Converting Cyrillic UTF-8 text encoded as Latin-1

This may be obvious to some, but recognizing a character encoding at a glance is not always easy.

For example, pronunciation files downloaded from Forvo have the following appearance:

pronunciation_ru_Ð¾Ñ‚Ð±Ñ‹Ð²Ð°Ð½Ð¸Ðµ.mp3

How can we extract the actual word from this gibberish? After all, the filename should reflect the actual word uttered in the pronunciation file.

Step 1 - Extracting the interesting bits

The gibberish begins after the pronunciation_ru_ and ends before the file extension. Any regex tool can tease that out.
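Here’s a sketch of both steps in the shell, assuming iconv is available; Windows-1252 stands in for strict Latin-1 because the mojibake contains characters like ‚ that exist only in CP1252:

f='pronunciation_ru_Ð¾Ñ‚Ð±Ñ‹Ð²Ð°Ð½Ð¸Ðµ.mp3'

# Isolate the gibberish between the fixed prefix and the extension
g="${f#pronunciation_ru_}"
g="${g%.mp3}"

# Re-encode the mojibake as CP1252, recovering the original UTF-8
# bytes; a UTF-8 terminal then displays them as Cyrillic
printf '%s' "$g" | iconv -f UTF-8 -t CP1252
# → отбывание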

accentchar: a command-line utility to apply Russian stress marks

I’ve written a lot about applying and removing syllabic stress marks in Russian text because I use them constantly when making Anki cards.

This iteration is a command-line tool for applying a stress mark at a particular character index. The advantage of these little shell tools is that they’re composable, integrating into different workflows as the need arises.

#!/usr/local/bin/zsh

# Apply a combining acute accent (U+0301) to the character at a
# given index:
#   -i  zero-based index of the character to stress
#   -w  the word to mark; if omitted, the word is read from stdin

while getopts i:w: flag
do
    case "${flag}" in
        i) index=${OPTARG};;
        w) word=${OPTARG};;
    esac
done

if [ -n "$word" ]; then
    temp=$word
else
    read -r temp
fi

outword=""
for (( i=0; i<${#temp}; i++ )); do
    thischar="${temp:$i:1}"
    if [ "$i" -eq "$index" ]; then
        # Append the combining acute accent to the target character
        thischar=$(echo "$thischar" | perl -C -pe 's/(.)/$1\x{301}/g;')
    fi
    outword="$outword$thischar"
done

echo "$outword"

We can use it in a couple of different ways. For example, we can provide all of the arguments in a declarative way:
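Assuming the script is saved on the PATH as accentchar (the name from the post title), that might look like this:

accentchar -w "мама" -i 1
# → ма́ма

# Or compose it in a pipeline, reading the word from stdin:
echo "мама" | accentchar -i 1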

sterilize-ng: a command-line URL sterilizer

Introducing sterilize-ng [GitHub link] - a URL sterilizer made to work flexibly on the command line.

Background

The surveillance capitalist economy is built on the relentless tracking of users. Imagine going about town running errands, but everywhere you go someone quietly follows you. When you pop into the grocery, they examine your receipt. They look into your bags to see what you bought. Then they hop in the car with you and keep careful records of where you go, how fast you drive, and whom you talk with on the phone. This is surveillance capitalism - an economy built on the “digital exhaust” our actions leave behind online.

Extracting ID3 tags from the command line - two methods

As part of a Hazel rule to process downloaded mp3 files, I worked out a couple of different methods for extracting the ID3 title tag. Not rocket science, but it took a little time to sort out. Both rely on third-party tools that aren’t part of a standard macOS install, both for parsing the text and for extracting the ID3 tags.

Extracting ID3 title with ffprobe

ffprobe is part of the ffmpeg suite of tools, which on macOS can be installed with Homebrew. If you don’t have Homebrew, go install it now; it opens up so many tools for your use. Once it’s installed, ffmpeg is available via brew install ffmpeg.
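Here’s a sketch of the sort of invocation that pulls out just the title tag; the exact flags used in the post may differ:

ffprobe -v quiet -show_entries format_tags=title \
    -of default=noprint_wrappers=1:nokey=1 downloaded.mp3

# -show_entries format_tags=title limits output to the title tag;
# -of default=noprint_wrappers=1:nokey=1 prints only the bare value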

Partitioning a large directory into subdirectories by size

Since I’m not fond of carrying around all my photos on a cell phone where they’re perpetually at risk of loss, I periodically dump the image and video files to a drive on my desktop for later burning to optical disc. Saving these images in archival form is a hedge against the possibility that my existing backup system fails someday.

I’m using Blu-ray optical discs to archive these image and video files; each stores 25 GB of data. So my plan was to split the iPhone image dump into 24 GB partitions.
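A minimal sketch of the idea in bash; the paths are illustrative, and stat -f%z is the BSD form that macOS uses:

LIMIT=$((24 * 1000 * 1000 * 1000))   # 24 GB per disc, in bytes
src="$HOME/Desktop/iphone_dump"
part=1
used=0
mkdir -p "$src/disc_$part"

for f in "$src"/*; do
    [ -f "$f" ] || continue
    size=$(stat -f%z "$f")           # file size in bytes
    if (( used + size > LIMIT )); then
        (( part++ ))                 # start a new partition when full
        used=0
        mkdir -p "$src/disc_$part"
    fi
    mv "$f" "$src/disc_$part/"
    (( used += size ))
done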