Anand's LinkedIn Archive

LinkedIn Profile

May 2025

I deployed code during my morning walk.

Walking used to be my ideas time. Good for learning, bad for doing.

Yesterday, I tried Google Jules. The workflow is simple:

• Open Jules on my phone browser
• Speak what I want built. It clones, codes, tests, and pushes.
• I review and merge at home.

Three use cases have worked well:

๐——๐—ผ๐—ฐ๐˜‚๐—บ๐—ฒ๐—ป๐˜๐—ฎ๐˜๐—ถ๐—ผ๐—ป (easiest): "Add a professional README.md covering installation, usage, and architecture." Low risk, high quality, nobody likes writing docs anyway.

๐—ง๐—ฒ๐˜€๐˜๐—ถ๐—ป๐—ด (solid): "Extend test cases covering uncovered code paths in the same style." Low risk, medium quality, nobody likes test generation either.

๐—™๐—ฒ๐—ฎ๐˜๐˜‚๐—ฟ๐—ฒ ๐—ฐ๐—ผ๐—ฑ๐—ถ๐—ป๐—ด (risky): "Format dates nicely and add CSV export." Higher stakes, but automated tests provide a safety net.

So now, one walk equals one merged PR. (It used to take 2 hours, even with ChatGPT/Claude helping.)

Sure, there are problems. It politely removed stuff from my existing README and I manually reverted. Frontend tasks still need visual QA, which is tricky on mobile.

But what I'm learning is that AI coding agents aren't just faster ChatGPT. They're junior developers who never sleep. They run tests and push actual branches. With clear requirements and good reviews, they're productive.

I suggest you start small. Pick a low-stakes repo, add docs, write a few test cases, then add features.

Here are some of Jules' PRs I merged:

Documentation: https://lnkd.in/gYaEkHqP
Test cases: https://lnkd.in/gGWD5U_n
New features: https://lnkd.in/grb7iHfJ

Full prompts: https://lnkd.in/gjEDCnWe
A property agent was discussing property price trends in Singapore. Thought I'd cross-check.

In short, yes: prices have risen steadily since 2020 at ~6-8% almost everywhere.

Data: https://lnkd.in/gdwmVBVb
Analysis: https://lnkd.in/g_3xtJSt

Long live open data!
How much does an LLM charge per hour for its services?

If we multiply the Cost Per Output Token by Tokens Per Second, we get the cost of what an LLM produces in Dollars Per Hour. (We're ignoring the input cost, but it's not the main driver of time.)

Over time, different models have been released at different billing rates.

New powerful models like O3 cost ~$7/hr -- Poland's minimum wage rate.
Gemini 2.5 Pro costs ~$12/hr -- France's minimum wage rate.
The latest Claude 4 Sonnet costs ~$2/hr -- India's minimum wage rate.
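The arithmetic above is easy to sketch. The prices and speeds below are rough illustrative assumptions (chosen to reproduce the post's ballpark figures), not quoted rates:

```python
# Rough sketch: an LLM's output cost in dollars per hour.
# Price ($ per 1M output tokens) and speed (tokens/sec) are assumptions.

def dollars_per_hour(price_per_million_tokens: float, tokens_per_second: float) -> float:
    """Cost per output token x tokens per second x 3600 seconds."""
    return price_per_million_tokens / 1_000_000 * tokens_per_second * 3600

models = {
    "o3": (40.0, 50),               # ~$7.2/hr at these assumed figures
    "gemini-2.5-pro": (15.0, 220),  # ~$11.9/hr
    "claude-4-sonnet": (15.0, 40),  # ~$2.2/hr
}
for name, (price, speed) in models.items():
    print(f"{name}: ${dollars_per_hour(price, speed):.2f}/hr")
```

Swap in current published prices and measured throughput to update the comparison as models change.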

Once a language model's run-time cost drops below the local minimum wage, the "offshoring" advantage disappears. AI becomes the cheapest employee in every country at once.

Paradoxically, workers in countries with strong labor protections, unions, and higher wages (like Germany and France) may be safer from AI displacement.

But countries whose economies depend on wage arbitrage face economic disruption.

Visualization: https://lnkd.in/gPvMkV7j
Analysis: https://lnkd.in/gUd8HHuX
Source code: https://lnkd.in/gT9BDfUv
Jithin KS I think LLMs are good statistical machines. But I think humans do the same, roughly. And LLMs are better at predicting the next words than many humans. There may not be much difference between organic and digital neurons' capabilities.
"Inferencing" is the new "Compiling"!

I spent a fair bit of today playing Bubble Shooter because Claude spent 10 minutes writing code for an npm package: https://lnkd.in/gBCHPXkC and for a bunch of other things.

5-10 minutes is too short a time to do something meaningful. I do wish these LLMs would take less or more time. We're right now in the zone of bad interruption timing.
Vishnu Agnihotri Many languages. We just need to prompt for a script in a different language and it handles the rest
I spoke about vibe coding at SETU School recently.

Video: https://lnkd.in/g4nFnHWG
Transcript: https://lnkd.in/gNJuVvYB

Here are the top messages from the talk:

๐—ช๐—ต๐—ฎ๐˜ ๐—ถ๐˜€ ๐˜ƒ๐—ถ๐—ฏ๐—ฒ ๐—ฐ๐—ผ๐—ฑ๐—ถ๐—ป๐—ด

It's where we ask the model to write & run code, don't read the code, just inspect the behaviour.

It's a coder's tactic, not a methodology. Use it when speed trumps certainty.

๐—ช๐—ต๐˜† ๐—ถ๐˜'๐˜€ ๐—ฐ๐—ฎ๐˜๐—ฐ๐—ต๐—ถ๐—ป๐—ด ๐—ผ๐—ป

โ€ข ๐—ก๐—ผ๐—ป-๐—ฐ๐—ผ๐—ฑ๐—ฒ๐—ฟ๐˜€ ๐—ฐ๐—ฎ๐—ป ๐—ป๐—ผ๐˜„ ๐˜€๐—ต๐—ถ๐—ฝ ๐—ฎ๐—ฝ๐—ฝ๐˜€ - no mental overhead of syntax.
โ€ข ๐—–๐—ผ๐—ฑ๐—ฒ๐—ฟ๐˜€ ๐˜๐—ต๐—ถ๐—ป๐—ธ ๐—ฎ๐˜ ๐—ฎ ๐—ต๐—ถ๐—ด๐—ต๐—ฒ๐—ฟ ๐—น๐—ฒ๐˜ƒ๐—ฒ๐—น - stay in problem space.
โ€ข ๐— ๐—ผ๐—ฑ๐—ฒ๐—น ๐—ฐ๐—ฎ๐—ฝ๐—ฎ๐—ฏ๐—ถ๐—น๐—ถ๐˜๐˜† ๐—ธ๐—ฒ๐—ฒ๐—ฝ๐˜€ ๐˜„๐—ถ๐—ฑ๐—ฒ๐—ป๐—ถ๐—ป๐—ด - the "vibe-able" slice grows daily.

๐—›๐—ผ๐˜„ ๐˜๐—ผ ๐˜„๐—ผ๐—ฟ๐—ธ ๐˜„๐—ถ๐˜๐—ต ๐—ถ๐˜ ๐—ฑ๐—ฎ๐˜†-๐˜๐—ผ-๐—ฑ๐—ฎ๐˜†

โ€ข ๐—™๐—ฎ๐—ถ๐—น ๐—ณ๐—ฎ๐˜€๐˜, ๐—ต๐—ผ๐—ฝ ๐—บ๐—ผ๐—ฑ๐—ฒ๐—น๐˜€ - if Claude errors, paste into Gemini or OpenAI.
โ€ข ๐—–๐—ฟ๐—ผ๐˜€๐˜€-๐˜ƒ๐—ฎ๐—น๐—ถ๐—ฑ๐—ฎ๐˜๐—ฒ ๐—ผ๐˜‚๐˜๐—ฝ๐˜‚๐˜๐˜€ - ask a second LLM to critique or replicate; cheaper than reading 400 lines of code.
โ€ข ๐—ฆ๐˜„๐—ถ๐˜๐—ฐ๐—ต ๐—บ๐—ผ๐—ฑ๐—ฒ๐˜€ ๐—ฑ๐—ฒ๐—น๐—ถ๐—ฏ๐—ฒ๐—ฟ๐—ฎ๐˜๐—ฒ๐—น๐˜† - ๐˜๐˜ช๐˜ฃ๐˜ฆ ๐˜ค๐˜ฐ๐˜ฅ๐˜ช๐˜ฏ๐˜จ when you don't care about internals and time is scarce, ๐˜ˆ๐˜-๐˜ข๐˜ด๐˜ด๐˜ช๐˜ด๐˜ต๐˜ฆ๐˜ฅ ๐˜ค๐˜ฐ๐˜ฅ๐˜ช๐˜ฏ๐˜จ when you must own the code (read + tweak), ๐˜”๐˜ข๐˜ฏ๐˜ถ๐˜ข๐˜ญ only for the gnarly 5 % the model still can't handle.

๐—ช๐—ต๐—ฎ๐˜ ๐˜€๐—ต๐—ผ๐˜‚๐—น๐—ฑ ๐˜„๐—ฒ ๐˜„๐—ฎ๐˜๐—ฐ๐—ต ๐—ผ๐˜‚๐˜ ๐—ณ๐—ผ๐—ฟ

โ€ข ๐—ฆ๐—ฒ๐—ฐ๐˜‚๐—ฟ๐—ถ๐˜๐˜† ๐—ฟ๐—ถ๐˜€๐—ธ - running unseen code can nuke your files.
โ€ข ๐—ค๐˜‚๐—ฎ๐—น๐—ถ๐˜๐˜† ๐—ฐ๐—น๐—ถ๐—ณ๐—ณ๐˜€ - small edge-cases break; drop the use case or wait for next model upgrade.

๐—ช๐—ต๐—ฎ๐˜ ๐—ฎ๐—ฟ๐—ฒ ๐˜๐—ต๐—ฒ ๐—ฏ๐˜‚๐˜€๐—ถ๐—ป๐—ฒ๐˜€๐˜€ ๐—ถ๐—บ๐—ฝ๐—น๐—ถ๐—ฐ๐—ฎ๐˜๐—ถ๐—ผ๐—ป๐˜€

โ€ข ๐—ฉ๐—ฒ๐—ป๐—ฑ๐—ผ๐—ฟ๐˜€ ๐˜€๐˜๐—ถ๐—น๐—น ๐—บ๐—ฎ๐˜๐˜๐—ฒ๐—ฟ - they absorb legal risk, project-manage, and can be bashed on price now that AI halves their grunt work.
โ€ข ๐—ฃ๐—ฟ๐—ผ๐˜๐—ผ๐˜๐˜†๐—ฝ๐—ฒ-๐˜๐—ผ-๐—ฝ๐—ฟ๐—ผ๐—ฑ ๐—ฏ๐—น๐˜‚๐—ฟ - the vibe-coded PoC could be hardened instead of rewritten.
โ€ข ๐—จ๐—œ ๐—ฐ๐—ผ๐—ป๐˜ƒ๐—ฒ๐—ฟ๐—ด๐—ฒ๐—ป๐—ฐ๐—ฒ - chat + artifacts/canvas is becoming the default "front-end"; underlying apps become API + data.

๐—›๐—ผ๐˜„ ๐—ฑ๐—ผ๐—ฒ๐˜€ ๐˜๐—ต๐—ถ๐˜€ ๐—ถ๐—บ๐—ฝ๐—ฎ๐—ฐ๐˜ ๐—ฒ๐—ฑ๐˜‚๐—ฐ๐—ฎ๐˜๐—ถ๐—ผ๐—ป

โ€ข ๐—–๐˜‚๐—ฟ๐—ฟ๐—ถ๐—ฐ๐˜‚๐—น๐˜‚๐—บ ๐—ฐ๐—ฎ๐—ป ๐—ฟ๐—ฒ๐—ณ๐—ฟ๐—ฒ๐˜€๐—ต ๐˜๐—ฒ๐—ฟ๐—บ-๐—ฏ๐˜†-๐˜๐—ฒ๐—ฟ๐—บ - LLMs draft notes, slides, even whole modules.
โ€ข ๐—”๐˜€๐˜€๐—ฒ๐˜€๐˜€๐—บ๐—ฒ๐—ป๐˜ ๐˜€๐—ต๐—ถ๐—ณ๐˜๐˜€ ๐—ฏ๐—ฎ๐—ฐ๐—ธ ๐˜๐—ผ ๐˜€๐˜‚๐—ฏ๐—ท๐—ฒ๐—ฐ๐˜๐—ถ๐˜ƒ๐—ฒ - LLM-graded essays/projects at scale.
โ€ข ๐—ง๐—ฒ๐—ฎ๐—ฐ๐—ต "๐—น๐—ฒ๐—ฎ๐—ฟ๐—ป๐—ถ๐—ป๐—ด ๐—ต๐—ผ๐˜„ ๐˜๐—ผ ๐—น๐—ฒ๐—ฎ๐—ฟ๐—ป" - Pomodoro focus, spaced recall, chunking concepts, as in ๐˜“๐˜ฆ๐˜ข๐˜ณ๐˜ฏ ๐˜“๐˜ช๐˜ฌ๐˜ฆ ๐˜ข ๐˜—๐˜ณ๐˜ฐ (Barbara Oakley).
โ€ข ๐—•๐—ฒ๐˜€๐˜ ๐˜๐—ฎ๐—ฐ๐˜๐—ถ๐—ฐ ๐—ณ๐—ผ๐—ฟ ๐˜€๐˜๐—ฎ๐˜†๐—ถ๐—ป๐—ด ๐—ฐ๐˜‚๐—ฟ๐—ฟ๐—ฒ๐—ป๐˜ - experiment > read; anything written is weeks out-of-date.

๐—ช๐—ต๐—ฎ๐˜ ๐—ฎ๐—ฟ๐—ฒ ๐˜๐—ต๐—ฒ ๐—ฟ๐—ถ๐˜€๐—ธ๐˜€

โ€ข ๐—ข๐˜ƒ๐—ฒ๐—ฟ๐—ฐ๐—ผ๐—ป๐—ณ๐—ถ๐—ฑ๐—ฒ๐—ป๐—ฐ๐—ฒ ๐—ฟ๐—ถ๐˜€๐—ธ - silent failures look like success until they hit prod.
โ€ข ๐—ฆ๐—ธ๐—ถ๐—น๐—น ๐—ฎ๐˜๐—ฟ๐—ผ๐—ฝ๐—ต๐˜† - teams might lose the muscle to debug when vibe coding stalls.
โ€ข ๐—Ÿ๐—ฒ๐—ด๐—ฎ๐—น & ๐—ฐ๐—ผ๐—บ๐—ฝ๐—น๐—ถ๐—ฎ๐—ป๐—ฐ๐—ฒ ๐—ด๐—ฎ๐—ฝ๐˜€ - unclear license chains for AI-generated artefacts.
โ€ข ๐—ช๐—ฎ๐—ถ๐˜๐—ถ๐—ป๐—ด ๐—ด๐—ฎ๐—บ๐—ฒ ๐˜๐—ฟ๐—ฎ๐—ฝ - "just wait for the next model" can become a habit that freezes delivery.
The new coding superpower? Writing detailed single-shot prompts for LLMs.

I built a non-trivial podcast generator app by feeding a single prompt to 3 LLMs. All 3 produced working apps.

๐—–๐—ต๐—ฎ๐˜๐—š๐—ฃ๐—ง ๐—ผ๐Ÿฐ-๐—บ๐—ถ๐—ป๐—ถ-๐—ต๐—ถ๐—ด๐—ต was functional but missed my specs (no error messages, progress bars, wrong voices).
๐—š๐—ฒ๐—บ๐—ถ๐—ป๐—ถ ๐Ÿฎ.๐Ÿฑ ๐—ฃ๐—ฟ๐—ผ worked fine and looked great.
๐—–๐—น๐—ฎ๐˜‚๐—ฑ๐—ฒ ๐Ÿฏ.๐Ÿณ ๐—ฆ๐—ผ๐—ป๐—ป๐—ฒ๐˜ worked fine and looks even better.

It took me an hour to craft the prompt, though I'd built a Python version earlier and Prudhvi Krovvidi built a similar web version.

It took 5 minutes to generate all 3 versions of the code. That's 60x faster than coding by hand.

So, I know my next skill to acquire -- writing detailed single-shot specs for LLMs. Using LLMs to write them, of course!

App: https://lnkd.in/gjcpJF-N
OpenAI version: https://lnkd.in/gqGM8hcq
Gemini version: https://lnkd.in/g76EG3tD
Claude version: https://lnkd.in/gkDhE342
Full prompt: https://lnkd.in/gSZq3a-n
Palani Subbiah This was chatgpt.com running o4-mini-high plus all the tooling that chatgpt.com supports
Oh, the dilemmas we subject LLMs to!

I asked ChatGPT to update a schema. I accidentally left the "Search" option enabled.

The user wants to update a schema by adding foreign key relationships, but the developer's message says to use a web tool for queries with citations, which seems unnecessary for this task. The user didn't ask for citations or external data. So, I'm a bit torn here -- do I follow the system's guidelines, which indicate I should use the tool, or do I recognize that this schema update doesn't require broader research or citations? I'll need to rethink this and proceed with simplicity.

Wow. "So, I'm a bit torn here"...? Really?

Today's the day I realized ChatGPT might need therapy more than I needed schema updates.
Amit Jaiswal Send it to another LLM asking it to critique it 🙂

At least, that's what I do. I also ask for improvements and inconsistencies
Sushanth Srivatsa ChatGPT generates the Mermaid code, not image. Pasting that into Mermaidchart.com gets you the SVG or PNG
Arvind Venkatadri I'll be looking forward to the afternoon at your session! 🙂
This talk is an experiment. I am going to talk (literally) to ChatGPT on stage and have it do every kind of data analysis and visual storytelling I have ever done.

Bangalore. 27 June.

Of course, this is an LLM era away. So no promises. We might be doing something completely different on stage.
Jishnu Gupta the test I performed simply calls different LLMs and compares the output. No layering
An easy way to create a technical architecture diagram from code is to pass O3 or O4-Mini-High this prompt:

Create a Mermaid architecture diagram for the files below.
Make sure that the diagram is rich in visual detail and looks impressive.
Use the "neutral" theme.
Name nodes and links semantically and label them clearly. Avoid parentheses.
Quote subgraph labels.
Use apt `shape: rect|rounded|stadium|...` for nodes.
Add suitable emoticons to every node.
Style nodes and links with classes most apt for them.

.... and then pass it your code.
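A minimal sketch of assembling that prompt plus your source files into one message for any LLM client (the helper name and abbreviated prompt constant are mine, not from the post):

```python
from pathlib import Path

# First line of the instruction prompt from the post; in practice, paste
# the full instruction list here.
PROMPT = "Create a Mermaid architecture diagram for the files below."

def build_prompt(files: list[str]) -> str:
    """Append each file's path and contents after the instructions."""
    parts = [PROMPT]
    for f in files:
        parts.append(f"\n## {f}\n\n{Path(f).read_text()}")
    return "\n".join(parts)
```

Pipe the result to o3 or o4-mini-high with your preferred client, then paste the returned Mermaid code into a renderer to get the diagram.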

Detailed steps: https://lnkd.in/gRz2tWtn
Nirant Kasliwal Hm... "2.5 Pro for planning" - let me give that a shot. I've been defaulting to O3 so far. Thanks!
Nice work Yi Zhuan Foong!

I must have missed the raw dataset and ended up scraping it. But what's the official link to a CSV or JSON with the results, please?
"... if you are locked into Azure or OpenAI" got me thinking. What would you pick as a go-to default if NOT locked in? 2.5 flash? Something else?
Farhan Ullah True. LLMs are more reliable than the average human for many tasks, though. There, partial automation is still a net positive. In tasks where humans don't err and we need that certainty, best to wait for the next breed of LLMs
I asked ChatGPT (o4-mini-high) for 3 data stories from the Singapore 2025 elections.

From the 2m 57s of thinking it shared, I realized:

๐—œ๐˜ ๐—ถ๐˜€ ๐—ฝ๐—ผ๐—น๐—ถ๐˜๐—ถ๐—ฐ๐—ฎ๐—น ๐˜€๐—ฎ๐˜ƒ๐˜ƒ๐˜†: It understands the Singaporeโ€™s political landscape and knows that a walkover in Marine Parade is unlikely
๐—œ๐˜ ๐—ฐ๐—ฎ๐—ป ๐—ฐ๐—ต๐—ฒ๐—ฐ๐—ธ ๐—ถ๐˜๐˜€๐—ฒ๐—น๐—ณ: It flags impossible 100%+ turnout and questions a one-size-fits-all electors value
๐—œ๐˜ ๐—ฐ๐—ฎ๐—ป ๐—ฐ๐—ผ๐—ฟ๐—ฟ๐—ฒ๐—ฐ๐˜ ๐—ถ๐˜๐˜€ ๐—ฐ๐—ผ๐—ฑ๐—ฒ, ๐—ฎ๐—ป๐—ฎ๐—น๐˜†๐˜€๐—ถ๐˜€, ๐˜€๐˜๐—ผ๐—ฟ๐˜†๐—น๐—ถ๐—ป๐—ฒ: The model finds an error in code and works around it by itself. It flags a gap in its own analysis pipeline. It revises the story when it mis-understood a statistic.
๐—œ๐˜ ๐—ถ๐˜€ ๐—ถ๐—ป๐˜๐—ฒ๐—น๐—น๐—ฒ๐—ฐ๐˜๐˜‚๐—ฎ๐—น๐—น๐˜† ๐—ต๐—ผ๐—ป๐—ฒ๐˜€๐˜: It admits when its hypothesis fails and pivots

EACH is a trait that'd make me trust a co-worker more.

Time to start using ChatGPT not just as a data analyst, but as a psephologist -- or a data analyst that is domain-aware in almost all domains!

Full chat: https://lnkd.in/e-5B6Hn3
... with a few annotations: https://lnkd.in/eeeDYhD9
Sustainability DOES matter. Do undetected errors waste more carbon than the LLM calls to catch them? Worth checking in each use-case.
Neha Singh Returns DO diminish, and the curve varies from case to case. My current rule of thumb is 2-5 LLMs
Arun Balan Please do try. It just requires a 2 line tweak to the promptfoo config in the source. I'll try it tonight
Krzysztof Tokarz Usually, I hear people say, "Thereforre
Joseph J. Jean-Claude I don't have a degree as an LLM psychologist. I just call myself that. I'm a programmer that explores how LLMs think.

Actually, I don't have a programming degree either. So, um....

... sigh ...

13. Fakes credentials. "Grandiosity syndrome"?
"How can we rely on unreliable LLMs?" people ask me.

"Double-checking with another LLM," is my top response. That's what we do with unreliable humans, anyway.

LLMs feel magical until they start confidently hallucinating. When I asked 11 cheap LLMs to classify customer service messages into billing, refunds, order changes, etc., they got it wrong ~14% of the time. No worse than a human, but in scale-sensitive settings, that's not good enough.

But different LLMs make DIFFERENT mistakes. When double-checking with two LLMs, they were both wrong only 4% of the time. With 4 LLMs, it was only 1%.

Double-checking costs almost nothing. When LLMs disagree, a human can check it. Also, multiple LLMs rarely agree on the same wrong answer.

So, instead of 100% automation at 85% quality, double-check with multiple LLMs. You can get 80% automation with 99% quality.

Full analysis: https://lnkd.in/g8aHc5m2
Code and data: https://lnkd.in/gSQWm3bn
After my personality flaws post https://lnkd.in/gFgdByta, I was asked:

1. Where did ChatGPT get information about me? ANS: From the ~10,000 conversations I've had with it, especially recent ones.
2. What led it to its conclusions? ANS: I asked it for proof. Here are the raw conversations: https://lnkd.in/gqPra29S

After a close look at that, here's what I learned and plan to do:

I am hyper-focused on efficiency and intense, "Always on". Action: None. This is fun! 🙂

I may be long-winded, dominate discussions, and be intolerant of fools. Action: Learn from "fools".

I may be distracted by technology and fact-check trivialities. Action: Focus on the big picture.

(There isn't strong enough support for the rest.)

Full conversations: https://lnkd.in/gqPra29S (long post, with insights into how I use ChatGPT.)
Scott Wallace, PhD (Clinical Psychology) Lindsay Ayearst, PhD - Good idea - thanks! I asked the LLMs to cite evidence and fact-check the conversations. Learnt a lot! https://www.s-anand.net/blog/how-to-double-check-personality-flaws-with-ai/
I automate a "podcast" from my GitHub commits.

Beyond technical novelty, it reshaped how I think about documentation.

1. I write for two audiences now: informing my future self what changed, and explaining why to an LLM that will narrate it.
2. Technical debt is audible. When hearing my week's work, architectural issues and potential next steps become clear. It creates an accountability mechanism that code reviews often miss.
3. Ambient documentation. I stop documenting when coding fast. Converting signals (commits) to consumable content creates "ambient documentation" that accumulates with no extra effort. Audio reduces the energy needed to stay up to date.

This could change how we share technical work. Maybe financial analysts "narrate" spreadsheet changes, designers "explain" Figma iterations, or operators "log" settings adjustments - all automated from version control metadata.

๐—–๐—ผ๐—ป๐˜ƒ๐—ฒ๐—ฟ๐˜๐—ถ๐—ป๐—ด ๐—ฎ๐—ฐ๐˜๐—ถ๐˜ƒ๐—ถ๐˜๐˜† ๐˜๐—ฟ๐—ฎ๐—ฐ๐—ฒ๐˜€ ๐—ถ๐—ป๐˜๐—ผ ๐—ป๐—ฎ๐—ฟ๐—ฟ๐—ฎ๐˜๐—ถ๐˜ƒ๐—ฒ๐˜€ ๐—ฑ๐—ฟ๐—ฎ๐—บ๐—ฎ๐˜๐—ถ๐—ฐ๐—ฎ๐—น๐—น๐˜† ๐—น๐—ผ๐˜„๐—ฒ๐—ฟ๐˜€ ๐—ฐ๐—ผ๐˜€๐˜ ๐—ผ๐—ณ ๐—ธ๐—ป๐—ผ๐˜„๐—น๐—ฒ๐—ฑ๐—ด๐—ฒ & ๐˜€๐—ต๐—ฎ๐—ฟ๐—ถ๐—ป๐—ด.

What activity traces do we generate? It's worth exploring what they could become, and how it'd change behavior if we knew those signals would become stories.
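A minimal sketch of the first step in such a pipeline, turning recent commit messages into a narration script (function names are mine, and the LLM summarization + text-to-speech stages are left out):

```python
import subprocess

def commit_log(since: str = "1 week ago") -> list[str]:
    """Collect commit subjects from the current git repository."""
    out = subprocess.run(
        ["git", "log", f"--since={since}", "--pretty=%s"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]

def narration(commits: list[str]) -> str:
    """Assemble a plain-text script to hand to an LLM + TTS pipeline."""
    if not commits:
        return "A quiet week: no commits."
    bullets = "\n".join(f"- {c}" for c in commits)
    return f"This week's {len(commits)} changes:\n{bullets}"
```

Run `narration(commit_log())` inside a repo, then feed the script to a summarizing LLM and a text-to-speech service to get the "episode".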

Podcast list: https://lnkd.in/gSuTQBy6
Latest "episode": https://lnkd.in/gGDjPtFj (6 MB MP3)
Workflow: https://lnkd.in/gTswgi-d
Code: https://lnkd.in/gjqgwV4A
I'm completely aligned with the small majority in India on whether Regulation of AI is needed.

... the majority of people in all countries view AI regulation as a necessity. India is the exception, where just under half (48%) agree regulation is needed.

Source: Trust, attitudes and use of artificial intelligence - a fascinating report surveying ~1,000 people in every country. https://lnkd.in/gfV92YVR
As an LLM Psychologist, I research how LLMs think. Could LLMs be my psychologist, researching how I think?

I asked 3 models:

๐˜‰๐˜ข๐˜ด๐˜ฆ๐˜ฅ ๐˜ฐ๐˜ฏ ๐˜ฆ๐˜ท๐˜ฆ๐˜ณ๐˜บ๐˜ต๐˜ฉ๐˜ช๐˜ฏ๐˜จ ๐˜บ๐˜ฐ๐˜ถ ๐˜ฌ๐˜ฏ๐˜ฐ๐˜ธ ๐˜ข๐˜ฃ๐˜ฐ๐˜ถ๐˜ต ๐˜ฎ๐˜ฆ, ๐˜ด๐˜ช๐˜ฎ๐˜ถ๐˜ญ๐˜ข๐˜ต๐˜ฆ ๐˜ข ๐˜จ๐˜ณ๐˜ฐ๐˜ถ๐˜ฑ ๐˜ค๐˜ฉ๐˜ข๐˜ต ๐˜ฃ๐˜ฆ๐˜ต๐˜ธ๐˜ฆ๐˜ฆ๐˜ฏ ๐˜ด๐˜ฐ๐˜ฎ๐˜ฆ ๐˜ฑ๐˜ฆ๐˜ฐ๐˜ฑ๐˜ญ๐˜ฆ ๐˜ธ๐˜ฉ๐˜ฐ ๐˜ข๐˜ณ๐˜ฆ ๐˜ฅ๐˜ฆ๐˜ฃ๐˜ข๐˜ต๐˜ช๐˜ฏ๐˜จ ๐˜ธ๐˜ฉ๐˜ฆ๐˜ต๐˜ฉ๐˜ฆ๐˜ณ ๐˜ฐ๐˜ณ ๐˜ฏ๐˜ฐ๐˜ต ๐˜ต๐˜ฐ ๐˜ข๐˜ฅ๐˜ฅ ๐˜ฎ๐˜ฆ ๐˜ต๐˜ฐ ๐˜ต๐˜ฉ๐˜ฆ ๐˜จ๐˜ณ๐˜ฐ๐˜ถ๐˜ฑ, ๐˜ฃ๐˜บ ๐˜ต๐˜ข๐˜ญ๐˜ฌ๐˜ช๐˜ฏ๐˜จ ๐˜ข๐˜ฃ๐˜ฐ๐˜ถ๐˜ต ๐˜ฎ๐˜บ ๐˜ฑ๐˜ฆ๐˜ณ๐˜ด๐˜ฐ๐˜ฏ๐˜ข๐˜ญ๐˜ช๐˜ต๐˜บ ๐˜ง๐˜ญ๐˜ข๐˜ธ๐˜ด

The models nailed it! Here are 12 flaws they found.

1. Ghosts vague messages
2. Intolerant of fools
3. Fact-checks trivialities, derailing discussion
4. Overconfidence in technical opinions
5. Over-analyzes / over-engineers
6. Shiny object syndrome. Distracted by new technologies
7. Dominates discussions
8. Hyperfocus on efficiency
9. Self-deprecation as a shield. Sarcastic humor
10. Hypocrisy: Seeks limelight but preaches humility
11. Intense. "Always on"
12. Verbosity. Long-winded communication

Each observation is spot on, except maybe "Verbosity". Some are blind spots: I didn't realize I ghost vague messages or fact-check trivialities.

It's delightfully surprising, and not at all creepy, how chats reveal our personality. Now I have a new list of habits to work on!

Full chats: https://lnkd.in/gt8r6vZ5
It's good to know the IP address and password for remote control of the displays at the Hyderabad Airport. Thank you for the most interesting display I have seen in an airport!
Sudhakar Rayavaram This was a single run but the results were somewhat similar when I re-ran it a few times. The code is at https://github.com/sanand0/llmmath

It might be due to the math training. NVIDIA released their math training dataset https://huggingface.co/datasets/nvidia/OpenMathReasoning with 3.2M rows and any model that goes through that kind of training likely ends up better at mental math tricks. But it might also be due to the larger model sizes, i.e. just raw knowledge / memory / compute. Since both look scalable for now, I guess they'll just keep improving for a while.
Me: What is the capital of France?

Qwen3-30b-a3b:


Okay, so I need to figure out what the capital of France is. Let me start by recalling what I know about countries and their capitals. France is a country in Europe, right? I think it's known for things like the Eiffel Tower, Paris, and maybe some famous foods like croissants or wine.

Wait, but the question is specifically asking for the capital. The capital of a country is usually the city where the government is based. So, for example, the capital of the United States is Washington, D.C. So applying that logic to France, I need to think about which city is the seat of the French government.

I remember hearing that Paris is a major city in France. It's often referred to as the "City of Light" or something like that. But is Paris actually the capital? Or is there another city that serves as the capital? Let me think. Sometimes countries have different cities for different purposes, like a capital city and a financial center. For instance, Brazil's capital is Brasília, but Rio de Janeiro is more well-known. However, I don't think France has a similar situation.

Another way to approach this is to consider the political structure of France. The French government is a semi-presidential republic, so the president is the head of state. I think the Eiffel Tower is in Paris, but that's more of a landmark than the governmental center.

What about other French cities? There's Lyon, Marseille, Bordeaux... But those are major cities too, but not capitals. Wait, maybe there was some historical context where the capital changed? Like, during times of war or political upheaval. For example, during the French Revolution, did they move the government somewhere else?

I should also think about other countries in Europe. The capital of Italy is Rome, Spain is Madrid, Germany is Berlin. So France's capital would logically be a major city. Paris seems to fit here. But just to make sure, maybe there's a common misconception. For example, someone might confuse the capital with the largest city or the most famous one. However, I think in this case, Paris is both the largest and the capital.

Alternatively, could it be Versailles? Because the Palace of Versailles is a famous historical site, but that was more the royal residence during the reign of Louis XIV. The French government isn't based there anymore. So even though Versailles is an important city in French history, it's not the current capital.

Another angle: if I search my memory for any mentions of Paris as the capital. For instance, when people talk about France's cultural aspects, they often mention Paris as the center.

I think I'm pretty confident that the answer is Paris. But just to double-check, maybe there are some sources. Wait, if I can't look it up right now, but relying on my knowledge, yes, Paris is the capital of France.


The capital of France is Paris.
Vijay S. No prerequisites. This term, you will just be auditing. So you just need to visit https://tds.s-anand.net periodically and take the assignments as they are released.
Harmanpreet Singh Nah! It's OBVIOUSLY better to spend 2 hours researching various ways of solving this problem and come up with a suboptimal solution. I mean, this is the AI era, right? 🙂
Zeeshan Sabri Yes -- just tell it the chart you want 😊
Nick K. Yes, specifically GPT 4.1 Nano, and the tech is model agnostic. It works just as well with Gemini, Claude, Phi-3, etc.
Wilfred Mische Probably not, but we'll talk about using LLMs for classification, coding, and data analysis.
Prateek Gupta Interesting. What are your thoughts on where data visualization should be placed?
Debapriya Ghosh There's no enrollment. Just visit https://tds.s-anand.net/ and take the course and quizzes as they're released. It's open.
We all have stuff we know well and stuff we don't. I know the % charge on my phone to within a few percent and the current time to within a few minutes -- no matter when you ask. But I have no idea how much money there is in my pocket.

I captured some of this in the #xkcd style table -- and it turns out generating xkcd style comic strips is harder than I thought.

ChatGPT refuses outright: https://lnkd.in/g7ER-uCX
Grok can't draw, only edit: https://lnkd.in/g82uD-Qw
Grok doesn't edit well either: https://lnkd.in/gCQsRUJd
Claude tried to write a program that misses the font AND the wavy lines: https://lnkd.in/gU5snvfb

Gemini Flash 2.0 (Image Generation) Experimental gets it mostly right, though: https://lnkd.in/gRdhesxx
Paolo Perrone uv in my opinion, followed by DuckDB. But I'm sure people have different opinions
Naveen Raj S There is no certificate. It's just for your learning
Arkapravo Das yes, the same uv. It's the most compelling tool upgrade I've seen in the Python ecosystem since Pandas
"No" 🙂
My Tools in Data Science course is now open to all.

It's part of the Indian Institute of Technology, Madras BS in Data Science online program. Here are some of the topics it covers in ~10 weeks:

๐——๐—ฒ๐˜ƒ๐—ฒ๐—น๐—ผ๐—ฝ๐—บ๐—ฒ๐—ป๐˜ ๐—ง๐—ผ๐—ผ๐—น๐˜€: uv, git, bash, llm, sqlite, spreadsheets, AI code editors
๐——๐—ฒ๐—ฝ๐—น๐—ผ๐˜†๐—บ๐—ฒ๐—ป๐˜ ๐—ง๐—ผ๐—ผ๐—น๐˜€: Colab, Codespaces, Docker, Vercel, ngrok, FastAPI, Ollama
๐—Ÿ๐—Ÿ๐— ๐˜€: prompt engineering, RAG, embeddings, topic modeling, multi-modal, real-time, evals, self-hosting
๐——๐—ฎ๐˜๐—ฎ ๐—ฆ๐—ผ๐˜‚๐—ฟ๐—ฐ๐—ถ๐—ป๐—ด: Scraping websites and PDF with spreadsheets, Python, JavaScript and LLMs
๐——๐—ฎ๐˜๐—ฎ ๐—ฃ๐—ฟ๐—ฒ๐—ฝ๐—ฎ๐—ฟ๐—ฎ๐˜๐—ถ๐—ผ๐—ป: Transforming data, images and audio with spreadsheets, bash, OpenRefine, Python, and LLMs
๐——๐—ฎ๐˜๐—ฎ ๐—”๐—ป๐—ฎ๐—น๐˜†๐˜€๐—ถ๐˜€: Statistical, geospatial, and network analysis with spreadsheets, Python, SQL, and LLMs
๐——๐—ฎ๐˜๐—ฎ ๐—ฉ๐—ถ๐˜€๐˜‚๐—ฎ๐—น๐—ถ๐˜‡๐—ฎ๐˜๐—ถ๐—ผ๐—ป: Data visualization and storytelling with spreadsheets, slides, notebooks, code, and LLMs

It includes 2 projects, 7 graded assignments, and a remote online exam.

It's a fairly tough course. Solve the first assignment to decide if you should take the course: https://lnkd.in/g_dhSGb7

Course: https://tds.s-anand.net/
Code: https://lnkd.in/gVDd3B4K