Last week I ran a workshop for my colleagues - on AI, obviously.
Started, as one does, from the basics, because I genuinely believe that without understanding how the mechanism works, it's pretty hard to navigate the nuances. Sure, to successfully watch TV you don't need to understand how color reproduction works in an OLED matrix. But AI is a technical and complex tool. And for now it still requires at least some baseline foundational knowledge - if only to save on tokens, which aren't exactly cheap.
Besides, it's useful to come back to the theory every now and then and realign your own understanding. So let's walk through the fundamentals.
What everyone today calls artificial intelligence is based on the concept of LLMs - Large Language Models - which made the key breakthrough in a technology that the brightest minds around the world had been working on for decades already. An LLM is a kind of black box that takes input tokens (cheaper) on one end and produces output tokens (more expensive) on the other, by generating them.
That last word is the key one here. Because unlike, say, a software system with a clear algorithm - which is deterministic, meaning predetermined - an LLM produces a stochastic, or probabilistic, result.
What does that mean? A piece of code produces, 100% of the time, exactly the result that's baked into the algorithm. But inside our "AI" there is no hand-written rule for the answer - it generates, essentially guesses, what the output should be, picking each next token according to learned probabilities. Yes, in modern models the probability of "guessing" correctly is very high, and if you ask an LLM what 2 + 2 is, it will genuinely answer correctly in almost every case.
But if you fill the context with specific information - say, tell it that 2 is actually an encoded 3 - the LLM will change its answer, and you can no longer rely on that result as precise. Meanwhile a plain old calculator will always get it right.
So. AI is playing a guessing game with us - just with a very high probability of guessing correctly.
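To make the calculator-versus-LLM contrast concrete, here's a minimal Python sketch. The token probabilities are invented for illustration and don't come from any real model, and `sample_next_token` is a hypothetical helper, not a library API. Real models work over huge vocabularies and many steps, so treat this as a toy picture of the idea only.

```python
import random


def calculator_add(a: int, b: int) -> int:
    """Deterministic: the same inputs always give the same output."""
    return a + b


# Hypothetical next-token distribution for the prompt "2 + 2 =".
# The numbers are made up for illustration, not taken from a real model.
NEXT_TOKEN_PROBS = {"4": 0.97, "5": 0.02, "22": 0.01}


def sample_next_token(probs: dict[str, float], rng: random.Random) -> str:
    """Stochastic: draw one token according to its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random()
    print(calculator_add(2, 2))  # the calculator never changes its mind
    answers = [sample_next_token(NEXT_TOKEN_PROBS, rng) for _ in range(1000)]
    # Counts per token: "4" dominates, but the other tokens can still show up.
    print({t: answers.count(t) for t in NEXT_TOKEN_PROBS})
```

Run it a few times: the first line never changes, while the counts in the second line shift from run to run.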
After adding the /wrap skill-command to my dev loop, it felt like the loop finally closed and became whole. Now it actually feels like a continuous development flow - no constant detours, edits, endless tweaking, interruptions, or fixing expensive mistakes.
Everything that needs to be there is there: agent self-discipline, recurring patterns covered by skills, memory problem solved, context and token overuse handled, errors accounted for with actions taken to prevent them going forward.
Now I'm gonna describe clearly all the parts of my dev loop so you can put together the same thing for yourself and check it in practice.
So, the parts of the dev loop:
1. Git repo for tracking all changes
2. Project management system (task tracking)
3. Agent instructions
4. Local project skills tailored to context
5. Global handoff-prompt skill for handing off to an agent in a new session
6. Local /wrap skill-command for retrospective of the past session, cleanup, updating and optimizing points 1-4, and final launch of point 5
To build the /wrap skill-command I've already done retros in several projects - talked about that before too. And since it's already a recurring process, today I'm planning to create a global skill for running that initial retro, the result of which should be the /wrap command appearing in any project.
Plug it in and enjoy productive AI-assisted development!
AI isn't taking over the world the way people were predicting a couple years ago. Not yet.
For months now I've been integrating AI into corporate environments, building tools to support my team, trying to construct a central AI business brain - and honestly, I don't see any approaching apocalypse where this AI suddenly replaces us all, fires everyone, and effortlessly does the work of an entire team a thousand times faster and cheaper.
Not even close. AI is a tool in human hands. I can pick up a chisel and knock a few chips off a block of marble. But I couldn't sculpt a Michelangelo statue even with the very same chisel the master held.
So the results you get from using AI on any given task depend less on the AI itself and more on its operator - their skill, their domain knowledge, their understanding of how to actually apply the tool. And there's a lot more going on under the hood, without which even the latest LLMs remain nothing more than a fun toy for generating slop.
But that's not actually what I wanted to write about. What I wanted to talk about is the strange narrative circulating among people - even those deep in the IT world, working with AI every single day. And the mountain of videos promising that AI will earn a billion dollars in one second flat and never make a single mistake.
In reality, all of that is very far from the truth. An enormous number of nuances comes up at literally every step of implementing AI or solving problems with it. The approach has changed, sure - but the amount of ritual rain-dancing that's always been part of IT, that hasn't gone anywhere.
I'm at peace with this, and it didn't surprise me, because from the age of 10 or 12 I could see that pressing the wrong button at the wrong moment could seriously break an entire operating system - and then you'd spend hours, sometimes days, just getting things back to where they were.
Maybe I'll find the inspiration to dig deeper into this topic with proper technical detail someday, but today I just feel a bit philosophical.
197 sessions into developing an information system with AI (yes, that's the real number - though it includes the many sessions the agent itself spawned internally), you start to realize not everything goes smoothly. This actually happens with human dev teams too, which is exactly why regularly running what's called a Retro is considered good practice.
It's a procedure where you look back at the past sprint and collect feedback: what went wrong, what can be improved. Ideally, it's followed by actual changes to the process and actions that drive those changes.
The same thing is worth doing with your AI agents - especially on larger projects, when the initial setup stops holding, instructions get ignored, context gets lost, and all the other delightful quirks of working with modern LLMs kick in.
Below I'll give you the list of what we cover in a retro, so you can just drop this list into your sessions and kick off that infamous improvement loop - which is, after all, the whole point of this exercise.
So, here's what a retro looks like:
1. Go through all logs from all sessions of the current project and analyze them for inconsistencies and contradictions: where mistakes were made, rules and instructions were broken, established conventions weren't followed, important context was forgotten - anything that critically impacts the outcome.
2. Pay special attention to the operator's messages (that's you), because the AI is confident it's doing everything right - but your comments, questions, and interruptions are exactly what exposes the gaps. So it makes sense to focus on those. That said, the agents' own output tokens are worth analyzing too, since they contain a lot of reasoning about decisions made.
3. Put together a list of possible improvements to instructions, memory, and context. When doing this, it's important to draw on the model developer's recommendations (Anthropic's, for example, if you're working with Claude Code) - so researching their documentation will be useful here.
4. Propose writing new skills or refining existing ones for recurring patterns.
5. Optimize all of this - again, following recommendations and best practices. What I mean is: keep your instruction volume reasonable and tidy up the memory, so none of this floods the context with a million tokens right at the start of a session.
6. Get the changes approved by the operator.
7. Implement the agreed-upon changes.
8. Create a skill and slash command (like /wrap) for running this retro at the end of every session.
9. Do repository hygiene, task management system hygiene, and general workspace cleanup.
10. Based on the now-cleaned-up backlog, identify the next task.
11. Write a handoff prompt for the next session.
And that's where the loop closes - because in the next session we'll call /wrap and the agent will run this retro again. Do that three times a day and your back won't hurt.
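Steps 1 and 2 are the easiest to picture in code. Below is a rough Python sketch of pulling out operator messages that look like corrections. Everything about the log format is an assumption: I'm pretending each log line is a JSON object with `role` and `text` keys, which is not the real schema of any particular tool, and the cue list is just a hypothetical starting point. In practice the agent does this analysis itself; the sketch only shows what "focus on the operator's messages" means.

```python
import json
from typing import Iterable

# Hypothetical cue phrases suggesting the operator is correcting the agent.
CORRECTION_CUES = ("no,", "wrong", "why did you", "i told you", "stop", "again")


def find_operator_corrections(jsonl_lines: Iterable[str]) -> list[str]:
    """Return operator messages that look like corrections or interruptions.

    Assumes each line is a JSON object with "role" and "text" keys;
    this schema is an assumption, not a real tool's log format.
    """
    hits = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        if record.get("role") != "operator":
            continue  # only the human's messages are scanned here
        text = record.get("text", "")
        lowered = text.lower()
        if any(cue in lowered for cue in CORRECTION_CUES):
            hits.append(text)
    return hits


if __name__ == "__main__":
    sample_log = [
        '{"role": "agent", "text": "Done, all tests pass."}',
        '{"role": "operator", "text": "No, I told you not to touch the migrations."}',
        '{"role": "operator", "text": "Looks good, continue."}',
    ]
    for message in find_operator_corrections(sample_log):
        print(message)
```

Keyword matching like this is crude and will miss polite corrections, so it's only a first filter before the agent reads the flagged spots in full.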
I'll write an update on the information systems I'm building for myself personally using AI - systems designed to make life simpler. I want to write about this because these are systems that, several months in, are still in development for certain reasons I want to break down.
The personal finance tracking system is one of them. The system itself is already written and working, but what remains is data cleanup and normalization.
The thing is, I've been tracking my personal finances for about fifteen years now - logging every income, expense, and transfer, strictly categorizing every transaction. And naturally, over many years of doing this, I've switched software several times, irreversibly corrupted the database for certain periods, and somewhere along the way simplified my bookkeeping (for example, not logging intermediate token swaps during crypto transfers).
But I set myself a goal: consolidate all the data in clean form into the new system so that every balance on every account zeroes out perfectly. To do that, I'm requesting bank exports, restoring data backups, trying to find everything possible to enrich the data and reconstruct the full history. We're scanning the blockchain for every crypto transaction to trace the path from one wallet to another. And there's a whole pile of other headaches that are, of course, impossible for an AI to automate fully.
We've classified the data inconsistencies and are running through them in batch mode to at least partially automate the process. For example, when analyzing one transaction, you can identify a pattern that repeats across other records - and then a script can normalize all the rest of that same class.
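Here's roughly what that batch approach can look like, as a minimal Python sketch. The record fields (`description`, `category`), the `Rule` class, and the example pattern are all hypothetical - my real data and rules look different. The point is the shape: one pattern found by hand becomes a rule, the rule fixes every record of that class, and whatever matches no rule stays in the manual pile.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    """One inconsistency class: a description pattern and the fix for it."""
    name: str
    pattern: re.Pattern
    category: str


def normalize(records: list[dict], rules: list[Rule]) -> tuple[list[dict], list[dict]]:
    """Apply the first matching rule to each record.

    Returns (fixed, needs_review); unmatched records remain manual work.
    """
    fixed, needs_review = [], []
    for rec in records:
        for rule in rules:
            if rule.pattern.search(rec["description"]):
                # Copy the record so the original data stays untouched.
                fixed.append({**rec, "category": rule.category, "rule": rule.name})
                break
        else:
            needs_review.append(rec)
    return fixed, needs_review


if __name__ == "__main__":
    rules = [Rule("atm", re.compile(r"atm withdrawal", re.IGNORECASE), "Cash")]
    records = [
        {"description": "ATM WITHDRAWAL 0042", "category": "?"},
        {"description": "atm withdrawal #17", "category": "Misc"},
        {"description": "Coffee", "category": "?"},
    ]
    fixed, needs_review = normalize(records, rules)
    print(len(fixed), "fixed,", len(needs_review), "left for manual review")
```

Recording which rule touched each record (the `rule` field here) makes it possible to audit or roll back a whole class if the pattern turns out to be too greedy.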
That painstaking, semi-manual work is exactly what's causing the delay. The system itself is already ready to use - it's the dirty data that's holding everything back. And even AI can't quite help yet...
Haven't talked about useful business tools in a while. Today let's give some attention to Payload CMS - a full-fledged Next.js backend that can serve as either a complete application backend or a content management system.
Obviously, the first thing worth noting is that it's an open source solution, meaning you can self-host it without paying a dime for the software. The use cases, though, vary depending on what you actually need.
I discovered Payload CMS when I was looking for a headless CMS for static sites - I'm really drawn to their minimalism and speed. But for comfortable content editing, you're missing that layer that something like WordPress gives you out of the box: a proper, user-friendly admin panel that lets you edit site content, track updates, change structure, and so on.
When you're working with static setups like Astro, a lot of what full-blown engines like WordPress offer just isn't necessary - and you can get by without a CMS entirely if, say, it's a corporate landing page that gets updated maybe once a year. In that case it's faster and simpler to just crack open the code and fix the text directly than to build out a whole CMS with a backend.
But the moment content editing becomes a real need - for a blog or an online store, for instance - Payload CMS becomes genuinely useful. Sure, it's not WordPress where everything is simple and configured out of the box - you'll have to put in some work to set up the admin panel initially, connect it to your site's collections, configure editing forms, and a bunch of other stuff. But that kind of thing doesn't scare us in the era of AI agents, right?
Beyond content editing, Payload CMS can become a full-on builder for enterprise applications, an e-commerce management system, or a digital asset manager.
For now I've deployed it as a CMS for a static Astro site - once I've had a chance to poke around its other capabilities, I'll share what I find.
An interesting challenge came up while setting up a corporate Hermes instance that's supposed to run through Mattermost.
Quick reminder - Hermes is an AI harness, a shell around AI agents that knows how to manage them properly. It has a built-in self-learning system and a wide set of skills that let you stop worrying about configuring that feedback-improvement-loop yourself and just calmly hand out tasks, knowing that each next one will be executed better and more efficiently.
On top of that, the shell has a long-term memory storage system that fills up with knowledge about you, the project, the team, the business - whatever your case is. The cherry on top - personalization, the ability to give the agent a personality (reminded me of that scene in Interstellar where the main character turned down the humor level on the AI robot TARS).
And the key thing - the ability to talk to AI agents through familiar messengers without messing around in the terminal or IDE the way nerds like me do. The core idea is this: Hermes runs on its own VPS 24/7 and is connected to your Telegram. And you, wherever you are, can write to your agent through Telegram. And it'll do everything available in its environment.
The native Mattermost integration turned out to be insufficient - apparently not many people use this combo yet (guess I just love exotic setups). And the key bug was that every new message to Hermes spawned a new session with it, and it naturally had no idea about the context.
After a couple of iterations we fixed the bug. Once I thoroughly test the fix, I'll publish it to the shared repo so you don't have to fix the same thing yourself. But for now keep in mind that not everything will work out of the box right away - the product is still very young (literally a few months old).
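The actual fix isn't shown here, and the sketch below is not Hermes code and doesn't call the Mattermost API. It only illustrates the general idea behind this class of bug, in Python: instead of creating a session per incoming message, look the session up by a conversation key so follow-up messages land in the same context. `SessionStore`, `handle_message`, and the `channel_id` / `root_id` parameters are names I'm assuming for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Conversation state for one chat thread."""
    history: list[str] = field(default_factory=list)


class SessionStore:
    """Reuses one session per (channel, thread) instead of one per message."""

    def __init__(self) -> None:
        self._sessions: dict[tuple[str, str], Session] = {}

    def get_or_create(self, channel_id: str, root_id: str = "") -> Session:
        # An empty root_id stands for a top-level message; in this sketch
        # all top-level messages in a channel share one session.
        key = (channel_id, root_id)
        if key not in self._sessions:
            self._sessions[key] = Session()
        return self._sessions[key]


def handle_message(store: SessionStore, channel_id: str, text: str,
                   root_id: str = "") -> Session:
    """Append the message to the existing session for its conversation."""
    session = store.get_or_create(channel_id, root_id)
    session.history.append(text)
    return session


if __name__ == "__main__":
    store = SessionStore()
    handle_message(store, "team-channel", "Deploy the staging build")
    session = handle_message(store, "team-channel", "And run the smoke tests")
    print(session.history)  # both messages end up in the same session
```

A real integration would also need to persist sessions across restarts and expire old ones, which this sketch leaves out.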
Ending a session before the AI starts noticeably degrading has already become a habit. I've talked before about the handoff-prompt command and skill I use to wrap up each session - passing the baton to a fresh instance with a clean head (context window).
But in complex projects - like building out a codebase - beyond just transferring context, it's critically important to continuously improve the ecosystem the agents operate in. I mean agent instructions, memory, the repository, the task management system, and the overall shared understanding of context.
That last one is especially important, because after some sessions you realize the agent wasn't doing quite what you expected - especially when it was running autonomously. Spent tokens don't come back, so aligning on the key context points at critical moments matters a lot.
The rest, I think, is pretty straightforward - clean up the repo, sort out tasks, save updates, optimize memory and instructions. That's the infamous feedback & improvement loop everyone talks about but nobody actually explains.
So I built a skill that does the following:
1. Sends the current session log to an independent agent to look for contradictions and moments that clearly expose agent errors - in other words, finds what can be improved
2. Collects key moments from the context and composes a brief summary of how the agent understands them
3. In interactive mode, presents the results of the above and lets you give feedback - do we both understand the context the same way, and do I agree with the proposed instruction updates
4. Applies the agreed changes to memory and instructions
5. Cleans up the repository, task statuses, and anything else that's out of order
6. And finally, using the same handoff-prompt skill, produces a handoff bundle to kick off the next session
The wrap skill is wrapped in a /wrap command I run at the end of each session. And since the order of operations inside it is project-specific, I keep this skill local to the particular project - unlike handoff-prompt, which is global.
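The part of this flow that matters most is the approval gate in steps 3 and 4: nothing touches memory or instructions until the operator says yes. Here's a minimal Python sketch of that gate. The skill itself is a prompt, not Python, so this is only an illustration of the control flow; `Proposal`, `review`, and `wrap_session` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Proposal:
    """One suggested change coming out of the session review."""
    target: str  # e.g. "instructions" or "memory"
    change: str


def review(proposals: list[Proposal],
           ask: Callable[[Proposal], bool]) -> list[Proposal]:
    """Keep only the proposals the operator explicitly approves."""
    return [p for p in proposals if ask(p)]


def wrap_session(proposals: list[Proposal],
                 ask: Callable[[Proposal], bool],
                 apply: Callable[[Proposal], None]) -> list[Proposal]:
    """Run the gate: review first, then apply approved changes only."""
    approved = review(proposals, ask)
    for proposal in approved:
        apply(proposal)
    return approved


if __name__ == "__main__":
    proposals = [
        Proposal("instructions", "Always run tests before committing"),
        Proposal("memory", "Drop notes about the abandoned prototype"),
    ]

    def ask_operator(p: Proposal) -> bool:
        return input(f"Apply to {p.target}: {p.change}? [y/N] ").strip().lower() == "y"

    applied = wrap_session(proposals, ask_operator, lambda p: print("applied:", p.change))
    print(len(applied), "of", len(proposals), "changes applied")
```

Passing `ask` and `apply` in as functions keeps the gate itself trivial to test without a real operator or real files.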
My First IoT Development <written by a human being>
I never thought I'd end up doing IoT (Internet of Things) development someday. I have an ambient RGB lamp controlled through a mobile app, which isn't always convenient - and honestly I'm not a big fan of mobile apps in general. A PC interface on a big screen with a keyboard and mouse just feels more natural to me.
And yesterday it hit me - I can vibe-code my own desktop app to control this lamp! I fired up Claude Code with this idea, and we had a pretty interesting research session figuring out how the lamp actually communicates with the app and the phone. We even got as far as connecting a smartphone to the PC in debug mode to collect Bluetooth transmitter logs - and eventually realized the lamp runs over WiFi and Bluetooth has nothing to do with it at all.
The next challenge was getting the device's identifier key, which the manufacturer hides pretty carefully. But if you register as an IoT developer on their official site, you get API access that lets you pull the device data you need. Which is exactly what we did.
After that everything was pretty straightforward - test Python scripts for connecting and configuring the lamp, trying different variations, picking the right algorithms, designing the interface, testing and debugging, packaging it into a final app.
The result is a working desktop utility that controls the ambient lamp. No smartphone needed anymore.
Oh, and my washing machine and dryer are also connected over WiFi, by the way...
How many agents do you need to burn through all Claude Code limits <written by a human being>
With every new model version, AI gets smarter. In practice - in development, for example - this means longer autonomous sessions that don't require operator intervention. Which means you need to watch what the agent is doing less and less, and it interrupts itself mid-task less often to ask something it could've figured out on its own. And the decisions it makes get closer and closer to what you'd have made yourself.
So at some point I just launch an agent and realize it'll be working autonomously on its task for the next 20-30 minutes. So in the meantime I'll spin up the next agent on a parallel task - and so on, up to a limit defined by two factors.
The first factor is the ability to add the right context at the right time and switch between tasks. I've noticed that with 2-3 agents running simultaneously I manage pretty comfortably and even get other stuff done in between while my input isn't needed. But 4-5 is already my ceiling - past that point the work turns into a sweaty time crunch and an unpleasant cognitive overload.
The second factor, obviously, is tokens. Sure, you can launch 15 agents at once, but they'll devour a 5-hour limit in about 10 minutes of continuous work. The result: those 15 tasks probably won't get done, and you're waiting 5 hours for the next reset. Clearly counterproductive.
But 4 agents running continuously eat through almost exactly the 5-hour limit. One small footnote though - I don't respond to their prompts immediately when they call for input, since I usually check the result, test the feature, or configure something to unblock the agent. Meanwhile 2-3 other agents that aren't waiting on context from me are grinding away nonstop.
And in this mode - 4 agents running in parallel - I manage to squeeze the maximum out of the $100 Max plan. 5 agents, which I experimented with this week, drain the limits faster, roughly 1-1.5 hours before the reset, so for my workflow 4 is the sweet spot, arrived at empirically. What about you?
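For the curious, the numbers above can be put side by side with some back-of-the-envelope Python. This is a rough sketch under loud assumptions: every agent burns tokens at the same rate while it's active, the limit is a fixed budget per 5-hour window, and "about 10 minutes" for 15 agents is taken at face value even though it's only my estimate. `implied_utilization` is a made-up helper for this calculation.

```python
def agent_minutes(agents: int, wall_minutes: float) -> float:
    """Total agent-minutes of wall-clock time before the limit runs out."""
    return agents * wall_minutes


# Baseline from the post: 15 agents working nonstop exhaust the limit
# in roughly 10 minutes (an estimate, not a measurement).
FULL_SPEED_BUDGET = agent_minutes(15, 10)


def implied_utilization(agents: int, wall_minutes: float) -> float:
    """Fraction of wall-clock time the agents spend actually burning tokens,
    assuming each agent burns at the same rate whenever it is active."""
    return FULL_SPEED_BUDGET / agent_minutes(agents, wall_minutes)


if __name__ == "__main__":
    # 4 agents last the whole 5-hour window.
    print(f"4 agents, 300 min: {implied_utilization(4, 300):.1%}")
    # 5 agents run dry 1-1.5 hours early, i.e. after 3.5-4 hours.
    print(f"5 agents, 210 min: {implied_utilization(5, 210):.1%}")
    print(f"5 agents, 240 min: {implied_utilization(5, 240):.1%}")
```

Under these assumptions both setups come out at roughly one-eighth utilization, which fits the footnote above: most of the wall-clock time goes to agents waiting on me or on checks, not to generating tokens. Treat it as an order-of-magnitude sanity check, nothing more.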
Anticode guy
<written by a human being>
Anticode guy
<written by a human being>
Haven't talked about useful business tools in a while. Today let's give some attention to Payload CMS - a full-fledged Next.js backend that can serve as either a complete application backend or a content management system.
Obviously, the first thing worth noting is that it's an open source solution, meaning you won't pay a dime for it. The use cases, though, vary depending on what you actually need.
I discovered Payload CMS when I was looking for a headless CMS for static sites - I'm really drawn to their minimalism and speed. But for comfortable content editing, you're missing that layer that something like WordPress gives you out of the box: a proper, user-friendly admin panel that lets you edit site content, track updates, change structure, and so on.
When you're working with static setups like Astro, a lot of what full-blown engines like WordPress offer just isn't necessary - and you can get by without a CMS entirely if, say, it's a corporate landing page that gets updated maybe once a year. In that case it's faster and simpler to just crack open the code and fix the text directly than to build out a whole CMS with a backend.
But the moment content editing becomes a real need - for a blog or an online store, for instance - Payload CMS becomes genuinely useful. Sure, it's not WordPress where everything is simple and configured out of the box - you'll have to put in some work to set up the admin panel initially, connect it to your site's collections, configure editing forms, and a bunch of other stuff. But that kind of thing doesn't scare us in the era of AI agents, right?
Beyond content editing, Payload CMS can become a full-on builder for enterprise applications, an e-commerce management system, or a digital asset manager.
For now I've deployed it as a CMS for a static Astro site - once I've had a chance to poke around its other capabilities, I'll share it with you.
5 days ago | [YT] | 1
Anticode guy
<written by a human being>
An interesting challenge came up while setting up a corporate Hermes instance that's supposed to run through Mattermost.
Quick reminder - Hermes is an AI harness, a shell around AI agents that knows how to manage them properly. It has a built-in self-learning system and a wide set of skills that let you stop worrying about configuring that feedback-improvement-loop yourself and just calmly hand out tasks, knowing that each next one will be executed better and more efficiently.
On top of that, the shell has a long-term memory storage system that fills up with knowledge about you, the project, the team, the business - whatever your case is. The cherry on top - personalization, the ability to give the agent a personality (reminded me of that scene in Interstellar where the main character turned down the humor level on the AI robot TARS).
And the key thing - the ability to talk to AI agents through familiar messengers without messing around in the terminal or IDE the way nerds like me do. The core idea is this: Hermes runs on its own VPS 24/7 and is connected to your messenger, Telegram for example. Wherever you are, you can write to your agent through Telegram, and it'll do everything available in its environment.
The native Mattermost integration turned out to be insufficient - apparently not many people use this combo yet (guess I just love exotic setups). And the key bug was that every new message to Hermes spawned a new session with it, and it naturally had no idea about the context.
After a couple of iterations we fixed the bug. Once I thoroughly test the fix, I'll publish it to the shared repo so you don't have to fix the same thing yourself. But for now keep in mind that not everything will work out of the box right away, the product is still very young (literally a few months old).
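To illustrate the kind of bug described above, here is a minimal sketch of the general idea: look up an existing session by a conversation key instead of creating a new one per message. This is not Hermes code and not the actual fix. `Session`, `SessionStore`, `handle_message`, and the choice of the channel ID as the key are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical stand-in for an agent session that remembers prior turns."""
    key: str
    history: list[str] = field(default_factory=list)


class SessionStore:
    """Reuses one session per conversation key instead of one per message."""

    def __init__(self) -> None:
        self._sessions: dict[str, Session] = {}

    def get_or_create(self, conversation_key: str) -> Session:
        # The buggy behaviour amounts to always taking the "create" branch.
        if conversation_key not in self._sessions:
            self._sessions[conversation_key] = Session(key=conversation_key)
        return self._sessions[conversation_key]


def handle_message(store: SessionStore, channel_id: str, text: str) -> Session:
    # Assumption: the chat channel ID identifies the conversation.
    session = store.get_or_create(channel_id)
    session.history.append(text)
    return session


store = SessionStore()
first = handle_message(store, "chan-1", "hello")
second = handle_message(store, "chan-1", "follow-up")
```

With this shape, `second` is the same session object as `first`, so the follow-up message sees the earlier context.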
6 days ago | [YT] | 1
Anticode guy
<written by a human being>
Ending a session before the AI starts noticeably degrading has already become a habit. I've talked before about the handoff-prompt command and skill I use to wrap up each session - passing the baton to a fresh instance with a clean head (context window).
But in complex projects - like building out a codebase - beyond just transferring context, it's critically important to continuously improve the ecosystem the agents operate in. I mean agent instructions, memory, the repository, the task management system, and the overall shared understanding of context.
That last one is especially important, because after some sessions you realize the agent wasn't doing quite what you expected - especially when it was running autonomously. Spent tokens don't come back, so aligning on the key context points at critical moments matters a lot.
The rest, I think, is pretty straightforward - clean up the repo, sort out tasks, save updates, optimize memory and instructions. That's the infamous feedback & improvement loop that everyone talks about but nobody actually explains.
So I built a skill that does the following:
1. Sends the current session log to an independent agent to look for contradictions and moments that clearly expose agent errors - in other words, finds what can be improved
2. Collects key moments from the context and composes a brief summary of how the agent understands them
3. In interactive mode, presents the results of the above and lets you give feedback - do we both understand the context the same way, and do I agree with the proposed instruction updates
4. Applies the agreed changes to memory and instructions
5. Cleans up the repository, task statuses, and anything else that's out of order
6. And finally, using the same handoff-prompt skill, produces a handoff bundle to kick off the next session
The skill is exposed as a /wrap command that I run at the end of each session. And since the order of operations inside it is project-specific, I keep this skill local to the particular project - unlike handoff-prompt, which is global.
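The six steps can be sketched as a fixed-order pipeline with a human approval gate in the middle. This is only an illustration of the control flow, not the skill itself (which is a set of agent instructions, not Python). Every name here (`WrapState`, `run_wrap`, the callback parameters) is hypothetical, and steps 4-6 are reduced to placeholders that only record that they ran.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class WrapState:
    """What the wrap-up accumulates while it runs (hypothetical shape)."""
    session_log: str
    findings: list[str] = field(default_factory=list)
    summary: str = ""
    approved_changes: list[str] = field(default_factory=list)
    steps_run: list[str] = field(default_factory=list)


def run_wrap(
    state: WrapState,
    review: Callable[[str], list[str]],
    summarize: Callable[[str], str],
    ask_operator: Callable[[list[str], str], list[str]],
) -> WrapState:
    # 1. An independent reviewer looks for contradictions and agent errors.
    state.findings = review(state.session_log)
    state.steps_run.append("review")
    # 2. The agent states how it understands the key context points.
    state.summary = summarize(state.session_log)
    state.steps_run.append("summarize")
    # 3. The operator sees both and decides which changes to accept.
    state.approved_changes = ask_operator(state.findings, state.summary)
    state.steps_run.append("feedback")
    # 4-6. Placeholders: apply changes, clean up, produce the handoff bundle.
    for step in ("apply_changes", "cleanup", "handoff"):
        state.steps_run.append(step)
    return state


result = run_wrap(
    WrapState(session_log="agent edited the wrong config file"),
    review=lambda log: ["check the target path before editing"],
    summarize=lambda log: "Goal: update config safely",
    ask_operator=lambda findings, summary: findings,  # accept everything
)
```

The point of the ordering is that nothing touches memory or instructions until the operator has confirmed both the findings and the agent's summary.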
1 week ago | [YT] | 0
Anticode guy
My First IoT Development
<written by a human being>
I never thought I'd end up doing IoT (Internet of Things) development someday. I have an ambient RGB lamp controlled through a mobile app, which isn't always convenient - and honestly I'm not a big fan of mobile apps in general. A PC interface on a big screen with a keyboard and mouse just feels more natural to me.
And yesterday it hit me - I can vibe-code my own desktop app to control this lamp! I fired up Claude Code with this idea, and we had a pretty interesting research session figuring out how the lamp actually communicates with the app and the phone. We even got as far as connecting a smartphone to the PC in debug mode to collect Bluetooth transmitter logs - and eventually realized the lamp runs over WiFi and Bluetooth has nothing to do with it at all.
The next challenge was getting the device's identifier key, which the manufacturer hides pretty carefully. But if you register as an IoT developer on their official site, you get API access that lets you pull the device data you need. Which is exactly what we did.
After that everything was pretty straightforward - test Python scripts for connecting and configuring the lamp, trying different variations, picking the right algorithms, designing the interface, testing and debugging, packaging it into a final app.
The result is a working desktop utility that controls the ambient lamp. No smartphone needed anymore.
Oh, and my washing machine and dryer are also connected over WiFi, by the way...
1 week ago | [YT] | 1
Anticode guy
How many agents do you need to burn through all Claude Code limits
<written by a human being>
With every new model version, AI gets smarter. In practice - in development, for example - this means longer autonomous sessions that don't require operator intervention. You need to watch what the agent is doing less and less, and it interrupts itself mid-task less often to ask something it could've figured out on its own. And the decisions it makes get closer and closer to what you'd have made yourself.
So at some point I just launch an agent and realize it'll be working on its task autonomously for the next 20-30 minutes. In the meantime I spin up the next agent on a parallel task - and so on, up to a limit defined by two factors.
The first factor is the ability to add the right context at the right time and switch between tasks. I've noticed that with 2-3 agents running simultaneously I manage pretty comfortably and even get other stuff done in between while my input isn't needed. But 4-5 is already my ceiling - past that point the work turns into a sweaty time crunch and an unpleasant cognitive overload.
The second factor, obviously, is tokens. Sure, you can launch 15 agents at once, but they'll devour a 5-hour limit in about 10 minutes of continuous work. As a result, the 15 tasks probably won't get done, and you're left waiting 5 hours for the next reset. Clearly counterproductive.
But 4 agents running continuously eat through almost exactly the 5-hour limit. One small footnote though - I don't respond to their prompts immediately when they call for input, since I usually check the result, test the feature, or configure something to unblock the agent. Meanwhile 2-3 other agents that aren't waiting on context from me are grinding away nonstop.
And in this mode - 4 agents running in parallel - I manage to squeeze the maximum out of the $100 Max plan. 5 agents, which I experimented with this week, drain the limit faster - it runs out roughly 1-1.5 hours before the reset. So for my workflow 4 is the sweet spot, arrived at empirically. What about you?
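The 4-agent and 5-agent observations fit a simple back-of-the-envelope model. The sketch below assumes consumption is linear in the number of agents and calibrates the budget on the "4 agents use up the whole 5-hour window" observation. The function names and the "agent-minutes" unit are my own simplification, not anything the plan actually reports, and the model shouldn't be extrapolated to extremes like 15 agents.

```python
WINDOW_MINUTES = 5 * 60  # length of one limit window

# Calibration (assumption): 4 agents exhaust the limit in roughly one full window.
BUDGET_AGENT_MINUTES = 4 * WINDOW_MINUTES


def minutes_until_limit(agents: int, budget: float = BUDGET_AGENT_MINUTES) -> float:
    """Minutes of work before the limit is hit, assuming linear consumption."""
    if agents <= 0:
        raise ValueError("agents must be positive")
    return budget / agents


def idle_minutes_before_reset(agents: int) -> float:
    """Minutes spent waiting for the reset once the limit is exhausted."""
    return max(0.0, WINDOW_MINUTES - minutes_until_limit(agents))


for n in (3, 4, 5):
    print(n, minutes_until_limit(n), idle_minutes_before_reset(n))
```

Under these assumptions 5 agents hit the limit after 240 minutes, which leaves about an hour of waiting before the reset, and 3 agents never hit it within the window.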
1 week ago | [YT] | 1