You Should Probably Pay Attention to Tokenizers

| 14 minutes | 2863 words |

Last week I was helping a friend of mine get one of his new apps off the ground. I can’t say much about it at the moment, other than that, like most apps nowadays, it has some AI sprinkled over it. Ok, maybe a bit more than just a bit – depends on the way you look at it, I suppose.

Retrieval-augmented generation (RAG) is hiding somewhere in most AI apps. RAG is still all the RAGe – it even has its own Wikipedia page now! I’m not sure if anyone is tracking how fast a term reaches the point where it gets its own Wiki page, but RAG must be somewhere near the top of the charts.

[Read More]

ai llm tokenizers

Some Notes on Adversarial Attacks on LLMs

| 29 minutes | 6094 words |

Intro

Last week I was catching up with one of my best mates after a long while. He is a well-recognised industry expert who also runs a successful cybersecurity consultancy. Though we had a lot of other things to catch up on, inevitably, our conversation led to AI, LLMs and their (cyber)security implications.

I’ve spent the last couple of months working for early-stage startups building LLM (Large Language Model) apps, as well as hacking on various silly side projects which involved interacting with LLMs in one way or another. But only now am I starting to realize how naive some of the apps I have helped to build were from the security [and safety] point of view.

[Read More]

ai llm safety security adversarial attacks research

My Three Favorite Scenes From The Bear

| 8 minutes | 1564 words |

I heard about The Bear TV show from my brother a few times over the past couple of years but I never really felt like watching it. A TV show about a frantic family-run bistro in Chicago? Color me sceptical. Don’t get me wrong, I’ve got nothing against the catering or hospitality industry! My brother is a restaurant manager and I spent a fair amount of my youth cleaning dishes at many dubious establishments over a few summers before I learnt to make money by doing things with computers.

[Read More]

tv the bear life

Using Cuelang With Go for LLM Data Extraction

| 20 minutes | 4142 words |

I have been aware of Cuelang (CUE) pretty much since the early stages of its development. It always seemed to me the language had the potential to solve a lot of problems in the ocean of YAML we found ourselves drowning in across the Cloud Native ecosystem.

CUE excels at validating data against strictly defined schemas and is equally capable of generating code for data models from them. These are wonderful features, though I hadn’t found the perfect application for them in any of the projects I had been working on. That changed recently with my increased involvement in projects utilizing Large Language Models (LLMs).

[Read More]

go golang llm ai cue cuelang

Go or Rust? Just Listen to the Bots

| 20 minutes | 4159 words |

It all started as a joke. I was in a group chat with a few of my friends and we were talking about football (soccer for the American readers). I entered the chat during a mildly heated discussion about the manager of a team one of my friends supports. It had been going on for a while with seemingly no end in sight when it occurred to me that I could just as well clone my friends’ voices and pit them against each other by backing them with LLMs, and I’d probably not see much difference in the conversation.

[Read More]

go golang rust rust-lang llm tts speech-synthesis bot

Builders Are Happier But What Happens When AI Takes Over

| 5 minutes | 924 words |

I have been busy hacking since I got back from my long holidays. I didn’t miss computers while travelling around the world. Not for a second. When you hike up a volcano and immerse yourself in the beautiful views only this planet can reward you with, it’s hard to think of computers, let alone hacking.

But now that I’m back and have re-engaged my hacking mode, I’ve gained a whole new appreciation for what the act of building software gives me. I like building things. Silly things. Any things. It’s fun. I find it engaging and fulfilling for reasons I don’t quite fully understand. To be honest, I’ve never really thought about the reasons. Is it the dopamine I get from solving problems with code? Is it the feeling of accomplishment? Is it just the IKEA effect at play? I don’t know. Frankly, I don’t really care that much about the actual reasons behind it. It’s the effect the activity has on me that matters.

[Read More]

essay ai software engineering

Rust tokio task cancellation patterns

| 12 minutes | 2497 words |

Update 19/04/2024: Read at the end of the post for more info.

I have been trying to pick up Rust again recently. It’s been a bit of a slow burn at the beginning but I think I’m finally starting to feel the compounding effects kicking in. Maybe it’s just my brain playing a trick on me, but I’m feeling much more at ease when writing Rust now than I was a few weeks back.

[Read More]

rust concurrency

Circular Buffer Performance Trick

| 9 minutes | 1869 words |

Update 12/04/2024: Read at the end of the post for more info.

I have been hacking on AI agents recently for both fun and profit as part of the work I’m doing for one of my clients.

They’re mostly text-to-speech (TTS) agents leveraging LLMs for generating text which is then turned into voice by a trained TTS model.

As you [probably] know, maintaining conversation with LLMs over a longer period of time requires maintaining the conversational context and sending it back to the LLM along with your follow-up prompts to prevent the LLMs from “hallucinating” from the get-go.

[Read More]

go golang rust performance

A Small Tool for Exploring Text Embeddings

| 8 minutes | 1492 words |

Last year I wrote about the superpowers text embeddings can give you and how I tried using them to compare the song lyrics of some music artists. Though the results failed to paint the picture I hoped for – this was due to the methodology, or rather lack thereof – it made me appreciate the importance of simple open source software (OSS) tools in the currently booming AI/LLM space.

To get to the point of displaying the embedding projections in the blog post I had to jump through some hoops and combine a lot of different Go modules before I could finally generate the nice interactive plots from the computed data. This wasn’t ideal, and I knew it even back then, but I wrote the blog post on a whim, trying to quickly prove a silly point to a friend of mine. So at the time, I made do with whatever was necessary.

[Read More]

go golang embeddings ai vector database

On The Importance of Getting The Foundations Right

| 8 minutes | 1572 words |

Throughout my career, I’ve learnt, usually the hard way, the importance of getting the foundations of whatever I was working on right. Or at least as right as possible. I learnt how fundamental it is for setting your project — and by proxy, your team — up for success. I’d argue it’s one of the most important things you should pay attention to. Getting the basics right is notoriously hard due to the inevitability of changing requirements, external factors, etc., which is why dedicating a reasonably large amount of time to figuring out the foundations is so valuable.

[Read More]

essay advice engineering career foundations software engineering