SEP 34: Overhaul image pipeline
Most documents on the site do not contain images, but images being images, they are always going to take up more space than text files.
At time of writing (2023-04-11) there are 178 images on the site, most of them in documents related to my ‘journeys’, weighing in at 31.7MB, for an average image size of 178KB. Not an absurd figure but I’m not above trying to crunch it down.
91.3% of the site’s size on disk is images. That being said, I’m not really waging a war on images; I think they’re important. I don’t want to squash my photos into a pixelated mess just to save a bunch of theoretical bandwidth [1]. But if optimisations can be made while maintaining the level of quality I want, then why not?
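For reference, figures like the ones above can be gathered with a few lines of Python. This is a rough sketch only: `image_stats` and its `site_dir` argument are hypothetical names, not part of the existing build script, and it assumes the built site lives in a single directory and that images are identified by file extension alone.

```python
from pathlib import Path

# Extensions treated as images (an assumption; matches the types the site uses)
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp"}


def image_stats(site_dir):
    """Return image count, total image bytes, average image bytes and image share of the site."""
    total_bytes = 0
    image_bytes = 0
    count = 0
    for path in Path(site_dir).rglob("*"):
        if not path.is_file():
            continue
        size = path.stat().st_size
        total_bytes += size
        # Compare suffixes case-insensitively so .JPG and .jpg are both counted
        if path.suffix.lower() in IMAGE_EXTS:
            image_bytes += size
            count += 1
    return {
        "count": count,
        "image_bytes": image_bytes,
        "average_bytes": image_bytes / count if count else 0,
        "image_share": image_bytes / total_bytes if total_bytes else 0,
    }
```

Calling `image_stats("output/")` (with whatever the real output directory is) before and after any pipeline change gives a like-for-like comparison.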
Currently I use the Pillow package to process images; the function is reproduced in Appendix 1. It’s a crude effort but it gets the job done.
Prep for this proposal will require a review of the state of the art in image optimisation. Prospective tools:
- pngnq / advpng for PNGs
- mozjpeg for JPEGs
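As a starting point for that review, here is a minimal sketch of how such tools could be wired in as an optional post-processing step. It is an illustration under assumptions, not a settled design: `build_command` and `optimise` are hypothetical names, it uses `jpegtran` (shipped with mozjpeg, though a `jpegtran` found on the PATH may come from another libjpeg build) and `advpng`, and it sticks to lossless recompression so photo quality is untouched. pngnq is left out because it quantises colours, which is a lossy step that needs a visual check first.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def build_command(src: Path, dst: Path) -> Optional[List[str]]:
    """Return an argv list for a candidate optimiser, or None for unsupported types."""
    ext = src.suffix.lower()
    if ext in {".jpg", ".jpeg"}:
        # Lossless JPEG rewrite; "-copy none" drops metadata such as EXIF
        return ["jpegtran", "-copy", "none", "-optimize", "-progressive",
                "-outfile", str(dst), str(src)]
    if ext == ".png":
        # advpng recompresses in place, so it is pointed at the copy in dst
        return ["advpng", "-z", "-4", str(dst)]
    return None


def optimise(src: Path, dst: Path) -> int:
    """Copy src to dst, optimise the copy if a tool is available, return bytes saved."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(src, dst)
    cmd = build_command(src, dst)
    if cmd is None or shutil.which(cmd[0]) is None:
        return 0  # unsupported type or tool not installed: keep the plain copy
    result = subprocess.run(cmd, capture_output=True)
    if result.returncode != 0 or dst.stat().st_size >= src.stat().st_size:
        shutil.copyfile(src, dst)  # fall back to the untouched original
        return 0
    return src.stat().st_size - dst.stat().st_size
```

Because a missing tool or a failed run falls back to a plain copy, the build would still succeed on a machine without these tools installed.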
Appendix 1: Current Image Processing Function
import os
from pathlib import Path

from PIL import Image

# Note: verbosity, images_dir and output_dir are globals defined elsewhere in the build script.


# Process images
def process_images():
    status = "Used cached images"
    if verbosity > 1:
        print("Processing Images")
    image_assets = []
    image_assets.extend(Path(images_dir).glob('**/*.jpg'))
    image_assets.extend(Path(images_dir).glob('**/*.jpeg'))
    image_assets.extend(Path(images_dir).glob('**/*.JPG'))
    image_assets.extend(Path(images_dir).glob('**/*.png'))
    image_assets.extend(Path(images_dir).glob('**/*.PNG'))
    image_assets.extend(Path(images_dir).glob('**/*.webp'))
    for image in image_assets:
        rel = os.path.relpath(image, images_dir)
        ext = os.path.splitext(rel)[1]
        base = os.path.splitext(rel)[0]
        path_large = os.path.join(output_dir, 'images', rel)
        path_small = os.path.join(output_dir, 'images', base + '.small' + ext)
        # If both sizes already exist in output_dir, skip
        try:
            with open(path_large) as f:
                with open(path_small) as f:
                    pass
        # If either size does not yet exist in output_dir, process and output
        except IOError:
            status = ''
            # Print source file path
            if verbosity > 2:
                print(f" {image} >> ", end='', flush=True)
            # Open image, derive height from new width as a proportion of original width
            im = Image.open(image)
            width_large = 1000
            height_large = int(im.size[1]*float((width_large/float(im.size[0]))))
            large = im.resize((width_large, height_large), Image.Resampling.LANCZOS)
            width_small = 300
            height_small = int(im.size[1]*float((width_small/float(im.size[0]))))
            small = im.resize((width_small, height_small), Image.Resampling.LANCZOS)
            # Create output directory tree
            os.makedirs(os.path.dirname(path_large), exist_ok=True)
            # Output resized images
            large.save(path_large)
            small.save(path_small)
            # Print written file path
            if verbosity > 2:
                print(f"{path_large}")
    if status and verbosity > 1:
        print(f" {status}")
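Two details of the function above look like easy wins for the overhaul. The six glob patterns miss variants such as `.JPEG` or `.WEBP`, and the resize scales every image to the target width, which enlarges anything narrower than 1000px. A possible tidy-up is sketched below. The helper names `find_images` and `scaled_size` are hypothetical, and `scaled_size` rounds to the nearest pixel where the current code truncates, so heights may differ by one pixel.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def find_images(images_dir):
    """Return image paths under images_dir, matching suffixes case-insensitively."""
    return sorted(
        p for p in Path(images_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def scaled_size(width, height, target_width):
    """Scale (width, height) down to target_width, keeping aspect ratio and never upscaling."""
    if width <= target_width:
        return width, height
    return target_width, max(1, round(height * target_width / width))
```

With these in place, the resize calls could become `im.resize(scaled_size(*im.size, 1000), Image.Resampling.LANCZOS)`, and likewise with 300 for the small variant.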
[1] I say theoretical because the number of people who actually read this site is probably a rounding error on a rounding error, but hey, gotta pinch those bytes.