Welcome to Language Agnostic, the blog of Inaimathi! It's now built on top of the Clojure framework http-kit. I'm reasonably confident it's no longer the least performant blog on the internet.

Enjoy the various programming-themed writings on offer. The latest post is available below and the archive link is directly above this text.


Not Dead Yet

Sat May 24, 2025

This is just a status update. There are at least four different things swirling around in my head that deserve their own post at this point, but I don't want to make the mistake of being silent for multiple months again. So, here's a bunch of stuff I've been thinking about in no particular order.

Self-Other Overlap Talk

I gave the talk at this past AI meetup, on the subject of Self-Other Overlap as an alignment approach. It was based on this post and a bunch of papers it links. You can find the notes, which I'll eventually expand into a full post, over here. You can also find some follow-up work, which might warrant its own talk, over here.

The thumbnail on this one is that it looks like a really promising and under-explored approach with lots of low-hanging fruit. Contingent on some of the forthcoming experiments from this team showing promise, this is probably the most optimistic I've been about a potential alignment strategy. Had I not known that a team was already working on pushing it forward, I'd probably jump into it myself.

Video Generation SOTA is Getting Really Good

I've been linked this, this and this in the past couple of days. This section is going to be awful for the transcription robot, because I don't think there's an automated routine that can describe the videos at the other end of those links to my satisfaction. Just go watch them, with the understanding that they're 100% prompt-generated AI videos, then come back. My understanding is that there's no human editing happening on the other end, apart from stitching clips together into a longer video. In other words, "making a movie" is now plausibly the work of a GPU fleet, one or two people feeding it prompts, and then dumping the output into ffmpeg -f concat.
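If that last step sounds too glib, here's roughly what the stitching looks like. This is a minimal sketch rather than anyone's production pipeline, and the directory and file names are hypothetical; it just writes the list file that ffmpeg's concat demuxer expects and shells out:

```python
import subprocess
from pathlib import Path

def stitch_clips(clip_dir: str, output: str = "movie.mp4") -> None:
    """Concatenate every .mp4 in clip_dir (in sorted order) into one file."""
    clip_path = Path(clip_dir)
    clips = sorted(clip_path.glob("*.mp4"))
    # The concat demuxer reads a plain text file listing one clip per line;
    # paths in it are resolved relative to the list file itself.
    list_file = clip_path / "clips.txt"
    list_file.write_text("\n".join(f"file '{c.name}'" for c in clips) + "\n")
    # -c copy skips re-encoding, which works as long as every clip shares
    # codec, resolution and framerate -- usually true for one model's output.
    subprocess.run(
        ["ffmpeg", "-f", "concat", "-safe", "0",
         "-i", str(list_file), "-c", "copy", output],
        check=True,
    )

stitch_clips("generated_clips/")
```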

I am continuously baffled by the fact that this market is still at 38% as of this writing. And yes, I bought in hard.

3D Printing is Surprisingly Fun

To be clear, I thought it would be fun, but I didn't realize how much fun it would be. I got a basic FDM printer a little while ago for one of my kids because he was temporarily obsessed with making his own board games. And I ended up putting together a bunch of projects of my own with it. I might end up doing a fuller writeup, and I might try to turn this into a bigger time-sink for myself, but so far it's just for fun. Given that I'm an Emacs user who wrote his own AI interaction mode, it shouldn't be surprising that I picked up OpenSCAD for modeling purposes. Cheat sheet available over here.

The main complaint I've got so far is that AI assistance is less than maximally helpful. I'll frequently ask it to model something basic and find myself needing to make heavy corrections. Or worse yet, do a google search (shudder). I suspect this is something I could fine-tune my way out of on locally hosted models, and possibly on Claude too. There's a decent amount of training data available in the form of things like the BOSL2 library or the Bezier curve implementation. For my current purposes, I'm holding off on that and just learning the basics, along with using a few available libraries. Probably the most useful one so far has been the OpenSCAD rebuild of Gridfinity, which lets me design chamfered, optionally magnetized boxes almost as parametrically as I like.

To set up your own small-scale FDM manufacturing, apart from a printer and modeling software, you also need a slicer to translate your 3D models into printer-compatible machine code. gcode is to 3D printers what PostScript is to 2D printers. In theory, you could write it yourself, but there's software out there that does it better. I assume there are lots of serious nerds applying compiler principles to this problem already, so I'm not about to wade in. So far I've been using PrusaSlicer and Cura. For some reason, Cura works out of the box in Guix whereas prusa-slicer gives me weird locale errors every time I try to pull it down. I ended up buckling and installing it through apt eventually, but it was mildly annoying. There is zero chance of my installing Flatpak, which I also won't link to here.
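As an aside, you don't strictly need the GUI for that translation step. PrusaSlicer has a command-line mode; here's a rough sketch of scripting it, with the caveat that the flag names and the profile.ini settings bundle are assumptions to verify against prusa-slicer --help on your install:

```python
import subprocess
from pathlib import Path

def slice_model(stl_path: str, profile: str = "profile.ini") -> Path:
    """Slice an STL into gcode via PrusaSlicer's command-line mode.

    The flags (--export-gcode, --load, --output) and profile.ini are
    assumptions; check `prusa-slicer --help` for your version.
    """
    gcode_path = Path(stl_path).with_suffix(".gcode")
    subprocess.run(
        ["prusa-slicer", "--export-gcode",
         "--load", profile,            # printer/filament/print settings
         "--output", str(gcode_path),
         stl_path],
        check=True,
    )
    return gcode_path

print(slice_model("gridfinity_box.stl"))
```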

Usability-wise, prusa-slicer is the better experience. Possibly 3D printing beginners should start with Cura and then move up once they need finer-grained control over gcode output? I'll leave some exploration as an exercise for the reader.

AI Assistance Is ... Mid

The more iron I've been putting behind larger-scale AI-assisted programming, the more I get into situations where it makes obviously stupid architectural decisions and fails to back out of them despite many-shot prompting and minor scaffolding. This seems to happen across all the different models and services. I think my local deepseek-coder is the best at avoiding it, but I don't have the data to back that up. It still seems like a net productivity gain to have AIs do function-level refactoring work for me, but I've been finding myself using them to generate large swathes of code less and less lately. And not just because I've been writing more scad than python.

The main place I saw issues was while setting up local scaffolding to translate my blog implementation back to Python from Clojure. The project itself is a longer story than I'm getting into here, but one of my big takeaways was how bad idea translation really was in all of the frontier models. Despite prompts highlighting that Clojure takes an immutable/functional approach to the world while Python is more object-oriented, and then many-shot prompting to specifically call out that atoms are unnecessary in the target language, the port was not smooth. I ended up needing to do a lot of that work by hand at either the module or function level. Compare that branch above to the current master to see some of the details of what the port actually ended up taking.
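To make the atom complaint concrete: the models kept reproducing atom-shaped state management literally, when the idiomatic Python target is just a plain object mutated in place, with an explicit lock only if you actually need thread safety. A hedged sketch of that target shape, with made-up names rather than the actual blog code:

```python
from threading import Lock

class PostCache:
    """In-memory post state. Names here are hypothetical, not the real repo.

    In Clojure this would live in an atom and be updated with swap!;
    in Python a dict mutated in place is the idiomatic equivalent, with a
    lock only because the server might handle requests on multiple threads.
    """

    def __init__(self):
        self._posts = {}
        self._lock = Lock()

    def put(self, slug, post):
        with self._lock:
            self._posts[slug] = post

    def get(self, slug):
        return self._posts.get(slug)

cache = PostCache()
cache.put("not-dead-yet", {"title": "Not Dead Yet"})
print(cache.get("not-dead-yet")["title"])
```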

I'm not really sure how much to update on this. On the one hand, this absolutely seems like the sort of thing I could hand a precocious intern with minimal guidance and expect them to get me the right answer with no fuss. On the other, this is pushing the ambition level of what I've been asking of code-generating models, and it's possible that additional scaffolding and/or fine-tuning just makes the problem go away. It doesn't make me feel much less confident that programmers are getting automated soon, but it does make me think it'll take longer than I expected. Give it like a year instead of three months. I guess there's an outside chance that this is the sort of thing that needs long-horizon, consistent world models to work properly, and in that case, it might even take more like a year and a half.

Voice Model SOTA Has Moved Up

Ok, last thing, I promise. Voice models are now much faster, and apparently better at cloning, than they were the last time I tried the "get a robot to read my blog out loud" thing. I haven't finished a system implementation yet, which is why you're not seeing that headphone icon next to recent posts, but I'm getting dangerously close to getting it back online. The last time I put a system together, I used tortoise-tts. It took a blog post, and around nine hours, and produced audio of "me" reading the material that has been described as

inaimathi if someone crushed all his hopes and dreams, then sedated him, then woke him up and made him read something

The system I'm working on getting off the ground now instead uses OpenVoice, which looks like it'll make massive improvements in both categories. Firstly, listeners of the test files have described them as "yeah, that sounds like you, but it has kind of a weird cadence". I gauge this to be a massive improvement. Speed is also massively improved: instead of being a multi-hour process, it looks like this can crunch through a blog post in something on the order of 20 minutes on my laptop. Once I get the GPU rig operational, I wouldn't be at all surprised to find post narration getting close enough to instant that I can automate the initial reading/publishing process and keep hooks around for QA purposes.
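The overall pipeline is simple enough to sketch. This is not the real system, and synthesize_chunk below is a placeholder for whatever TTS call you wire in rather than OpenVoice's actual API, but the shape is: split the post into chunks, narrate each one, then stitch the WAVs back together:

```python
import wave

def chunk_post(text: str, max_chars: int = 800) -> list[str]:
    """Split a post into paragraph-sized chunks the voice model can handle."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

def synthesize_chunk(text: str, out_path: str) -> None:
    """Placeholder: call your TTS model here and write a WAV to out_path."""
    raise NotImplementedError("wire in OpenVoice (or whatever) here")

def narrate(post_text: str, output: str = "narration.wav") -> None:
    """Narrate a whole post by chunking, synthesizing, and concatenating."""
    paths = []
    for i, chunk in enumerate(chunk_post(post_text)):
        path = f"chunk_{i:03d}.wav"
        synthesize_chunk(chunk, path)
        paths.append(path)
    # Concatenate per-chunk WAVs; assumes they share sample rate and width.
    with wave.open(output, "wb") as out:
        for i, path in enumerate(paths):
            with wave.open(path, "rb") as w:
                if i == 0:
                    out.setparams(w.getparams())
                out.writeframes(w.readframes(w.getnframes()))
```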

There.

That's a high-level overview of all the stuff I've been working on. Each of those is very likely to get its own write-up in the fullness of time. As always, I'll let you know how it goes.


Creative Commons License

all articles at langnostic are licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License

Reprint, rehost and distribute freely (even for profit), but attribute the work and allow your readers the same freedoms. Here's a license widget you can use.

The menu background image is Jewel Wash, taken from Dan Zen's flickr stream and released under a CC-BY license