This is a script to check if my photo archives conform to my naming rules.
My archives look like this:
- Pictures/
  - 2024
    - 2024-01-01 New Years
      - <many files that end in>.jpg
    - 2024-05-01 May Day
      - …
    - 2024-01-01 New Years
  - 2025
    - …
  - Some Other Directory Name
    - 2024
Right now, the archive is approximately 65,000 files totalling 96 GB, so the script scales up okay for me. It might not work for you.
This script has a bunch of exceptions and slack coded in, because my archives are still a little bit messy, and if it didn’t have these exceptions, there’d be too many errors.
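For comparison, here is a minimal sketch of what a check with less slack might look like. It assumes the rule is "a four-digit year, optionally a month and a day, then a separator and a description". The names STRICT_DIR_PATTERN and is_dated_name are made up for illustration and are not part of the script below.

```shell
# Hypothetical stricter pattern (an assumption, not the script's own):
# YYYY, optional -MM and -DD, one separator, then a description.
STRICT_DIR_PATTERN='^[0-9]{4}(-[0-9]{2}){0,2}[ _-].+$'

# Succeeds (exit status 0) when the given name matches the pattern.
is_dated_name() {
    [[ $1 =~ $STRICT_DIR_PATTERN ]]
}

# Try it on a few directory names from this post.
for name in "2024-01-01 New Years" "2005 camper" "Blog"; do
    if is_dated_name "$name"; then
        echo "ok:  $name"
    else
        echo "bad: $name"
    fi
done
```

Bash's =~ operator uses POSIX extended regular expressions, so the sketch writes digits as [0-9] rather than \d.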
The current error list is:
Scanning directory: /media/johnk/External Backup/John Kawakami Archive/Pictures
Non-conforming items:
File not in directory: /media/johnk/External Backup/John Kawakami Archive/Pictures/2001/*
File name invalid: /media/johnk/External Backup/John Kawakami Archive/Pictures/2005/2005 camper/campersmall.bmp
File name invalid: /media/johnk/External Backup/John Kawakami Archive/Pictures/2016/2016-10 Bernie/2016-06-berniejpg
File not in directory: /media/johnk/External Backup/John Kawakami Archive/Pictures/2018/20180702_174053.jpg
File not in directory: /media/johnk/External Backup/John Kawakami Archive/Pictures/2021/2021-10-31_El_Sereno_Steps.zip
File name invalid: /media/johnk/External Backup/John Kawakami Archive/Pictures/2021/2021-June-17-Action/newsletter.pages
Non-conforming directory: Blog
Non-conforming directory: Boyle Heights Burn to Rebuild
Non-conforming directory: Documents
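The first entry, the one ending in 2001/*, is most likely an empty directory rather than a file named "*". When a glob matches nothing, bash leaves the pattern as literal text, so a for loop over "$dir"/* runs once with that text. A minimal sketch of one way around it, assuming bash's nullglob option; the function name list_entries is hypothetical:

```shell
# Hypothetical helper: print the name of each entry in a directory.
list_entries() {
    local dir="$1"
    local item
    shopt -s nullglob      # an unmatched glob now expands to nothing
    for item in "$dir"/*; do
        echo "${item##*/}"
    done
    shopt -u nullglob      # turn the option back off afterwards
}

tmp="$(mktemp -d)"
mkdir "$tmp/2001"          # an empty year directory
list_entries "$tmp/2001"   # the loop body never runs, so nothing is printed
rm -r "$tmp"
```

Whether an empty year directory should be reported or silently skipped is a separate choice; this only shows where the "*" comes from.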
I’ll tighten it up as I fix up the archives. Eventually, I’ll use these commands to pre-flight directories before they are copied into the archives.
#!/bin/bash
# Directory to scan
DIR_TO_SCAN="$1"
# Naming convention regex patterns
# Bash's =~ uses POSIX extended regex, which has no \d, so digits are [0-9]
# All dirs start with a year or date
DIR_PATTERN='^[0-9]{4}(-[0-9]{2})?(-[0-9]{2})?[ a-zA-Z0-9_.-]+$'
FILE_PATTERN='^[ ()~#.a-zA-Z0-9_-]+\.(JPEG|JPG|jpeg|jpg|gif|heic|PNG|png|webp|webm|mp4|MP4|MOV|mov|xcf|XCF|AVI|avi|odg|ppm|odt|txt|rtf|amr|pdf|THM)$'
SPECIAL_DIR_PATTERN="^()$"
IGNORE_DIR_PATTERN="^(Food|Food Pictures|Free Stuff|Memes|Photos.+|Ebay)$"
# Function to check directory structure and naming convention
check_photo_files() {
    local dir="$1"
    local indent="$2"
    for item in "$dir"/*; do
        if [[ -f "$item" ]]; then
            if [[ ! ${item##*/} =~ $FILE_PATTERN ]]; then
                echo "${indent}File name invalid: $item"
            fi
        fi
    done
}
# Every directory should be named with the date.
check_year_archive() {
    local dir="$1"
    local indent="$2"
    for item in "$dir"/*; do
        if [[ -d "$item" ]]; then
            if [[ ! ${item##*/} =~ $DIR_PATTERN ]]; then
                echo "${indent}Non-conforming directory: ${item##*/}"
            else
                check_photo_files "$item" " $indent"
            fi
        else
            echo "${indent}File not in directory: $item"
        fi
    done
}
# Every dir should be a year
# Unless it's in the special directory pattern
check_photo_archive() {
    local dir="$1"
    local indent="$2"
    for item in "$dir"/*; do
        if [[ -d "$item" ]]; then
            if [[ ${item##*/} =~ $IGNORE_DIR_PATTERN ]]; then
                : # do nothing
            elif [[ ${item##*/} =~ ^[0-9]{4}$ ]]; then
                # exactly four digits; {,4} would also match shorter or empty names
                check_year_archive "$item" " $indent"
            elif [[ ${item##*/} =~ $SPECIAL_DIR_PATTERN ]]; then
                : # check_year_archive "$item" " $indent"
            else
                echo "${indent}Non-conforming directory: ${item##*/}"
            fi
        elif [[ -f "$item" ]]; then
            echo "${indent}File not in directory: $item"
        fi
    done
}
# Start the scan
echo "Scanning directory: $DIR_TO_SCAN"
echo "Non-conforming items:"
check_photo_archive "$DIR_TO_SCAN" ""
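The pre-flight idea mentioned earlier could be sketched roughly like this: check one incoming directory and return a non-zero status if anything is off, so the result can gate a copy. Everything here is an assumption for illustration: the function name preflight_dir, the PRE_ pattern names, and the shortened extension list are not from the script above.

```shell
# Assumed, simplified patterns for illustration only.
PRE_DIR_PATTERN='^[0-9]{4}(-[0-9]{2}){0,2}[ _-].+$'
PRE_FILE_PATTERN='^[ ()~#.a-zA-Z0-9_-]+\.(jpg|JPG|jpeg|JPEG|png|PNG|mp4|MP4|mov|MOV)$'

# Hypothetical pre-flight check: returns 0 only if the directory name and
# every regular file directly inside it match the patterns above.
preflight_dir() {
    local dir="${1%/}"     # drop a trailing slash so ${dir##*/} is the name
    local status=0
    local item
    if [[ ! ${dir##*/} =~ $PRE_DIR_PATTERN ]]; then
        echo "Non-conforming directory: ${dir##*/}"
        status=1
    fi
    for item in "$dir"/*; do
        if [[ -f "$item" && ! ${item##*/} =~ $PRE_FILE_PATTERN ]]; then
            echo "File name invalid: $item"
            status=1
        fi
    done
    return "$status"
}
```

With a source directory in $src and the matching year directory in $dest (both placeholders), something like preflight_dir "$src" && cp -a "$src" "$dest/" would copy only when the check passes.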