Anand's LinkedIn Archive

LinkedIn Profile

May 2025

I deployed code during my morning walk.

Walking used to be my ideas time. Good for learning, bad for doing.

Yesterday, I tried Google Jules. The workflow is simple:

• Open Jules on my phone browser
• Speak what I want built. It clones, codes, tests, and pushes.
• I review and merge at home.

Three use cases have worked well:

๐——๐—ผ๐—ฐ๐˜‚๐—บ๐—ฒ๐—ป๐˜๐—ฎ๐˜๐—ถ๐—ผ๐—ป (easiest): "Add a professional README.md covering installation, usage, and architecture." Low risk, high quality, nobody likes writing docs anyway.

๐—ง๐—ฒ๐˜€๐˜๐—ถ๐—ป๐—ด (solid): "Extend test cases covering uncovered code paths in the same style." Low risk, medium quality, nobody likes test generation either.

๐—™๐—ฒ๐—ฎ๐˜๐˜‚๐—ฟ๐—ฒ ๐—ฐ๐—ผ๐—ฑ๐—ถ๐—ป๐—ด (risky): "Format dates nicely and add CSV export." Higher stakes, but automated tests provide a safety net.

So now, one walk equals one merged PR. (It used to take 2 hours, even with ChatGPT/Claude helping.)

Sure, there are problems. It politely removed stuff from my existing README and I manually reverted. Frontend tasks still need visual QA, which is tricky on mobile.

But what I'm learning is that AI coding agents aren't just faster ChatGPT. They're junior developers who never sleep. They run tests and push actual branches. With clear requirements and good reviews, they're productive.

I suggest you start small. Pick a low-stakes repo, add docs, write a few test cases, then add features.

Here are some of Jules' PRs I merged:

Documentation: https://lnkd.in/gYaEkHqP
Test cases: https://lnkd.in/gGWD5U_n
New features: https://lnkd.in/grb7iHfJ

Full prompts: https://lnkd.in/gjEDCnWe
A property agent was discussing property price trends in Singapore. Thought I'd cross-check.

In short, yes: prices have risen steadily since 2020 at ~6-8% almost everywhere.

Data: https://lnkd.in/gdwmVBVb
Analysis: https://lnkd.in/g_3xtJSt

Long live open data!
How much does an LLM charge per hour for its services?

If we multiply the Cost Per Output Token by Tokens Per Second, we get the cost of what an LLM produces in Dollars Per Hour. (We're ignoring the input cost, but it's not the main driver of time.)

Over time, different models have been released at different billing rates.

New powerful models like O3 cost ~$7/hr -- Poland's minimum wage rate.
Gemini 2.5 Pro costs ~$12/hr -- France's minimum wage rate.
The latest Claude 4 Sonnet costs ~$2/hr -- India's minimum wage rate.
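The arithmetic above is easy to sketch. The prices and speeds below are rough illustrative assumptions (chosen to reproduce the post's ballpark figures), not quoted rates:

```python
# Rough sketch: an LLM's output cost in dollars per hour.
# Price ($ per 1M output tokens) and speed (tokens/sec) are assumptions.

def dollars_per_hour(price_per_million_tokens: float, tokens_per_second: float) -> float:
    """Cost per output token x tokens per second x 3600 seconds."""
    return price_per_million_tokens / 1_000_000 * tokens_per_second * 3600

models = {
    "o3": (40.0, 50),               # ~$7.2/hr at these assumed figures
    "gemini-2.5-pro": (15.0, 220),  # ~$11.9/hr
    "claude-4-sonnet": (15.0, 40),  # ~$2.2/hr
}
for name, (price, speed) in models.items():
    print(f"{name}: ${dollars_per_hour(price, speed):.2f}/hr")
```

Swap in current published prices and measured throughput to update the comparison as models change.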

Once a language model's run-time cost drops below the local minimum wage, the "offshoring" advantage disappears. AI becomes the cheapest employee in every country at once.

Paradoxically, workers in countries with strong labor protections, unions, and higher wages (like Germany and France) may be safer from AI displacement.

But countries whose economies depend on wage arbitrage face economic disruption.

Visualization: https://lnkd.in/gPvMkV7j
Analysis: https://lnkd.in/gUd8HHuX
Source code: https://lnkd.in/gT9BDfUv
Jithin KS I think LLMs are good statistical machines. But I think humans do the same, roughly. And LLMs are better at predicting the next words than many humans. There may not be much difference between organic and digital neurons' capabilities.
"Inferencing" is the new "Compiling"!

I spent a fair bit of today playing Bubble Shooter because Claude spent 10 minutes writing code for an npm package: https://lnkd.in/gBCHPXkC and for a bunch of other things.

5-10 minutes is too short a time to do something meaningful. I do wish these LLMs would take less or more time. We're right now in the zone of bad interruption timing.
Vishnu Agnihotri Many languages. We just need to prompt for a script in a different language and it handles the rest
I spoke about vibe coding at SETU School recently.

Video: https://lnkd.in/g4nFnHWG
Transcript: https://lnkd.in/gNJuVvYB

Here are the top messages from the talk:

๐—ช๐—ต๐—ฎ๐˜ ๐—ถ๐˜€ ๐˜ƒ๐—ถ๐—ฏ๐—ฒ ๐—ฐ๐—ผ๐—ฑ๐—ถ๐—ป๐—ด

It's where we ask the model to write & run code, don't read the code, just inspect the behaviour.

It's a coder's tactic, not a methodology. Use it when speed trumps certainty.

๐—ช๐—ต๐˜† ๐—ถ๐˜'๐˜€ ๐—ฐ๐—ฎ๐˜๐—ฐ๐—ต๐—ถ๐—ป๐—ด ๐—ผ๐—ป

โ€ข ๐—ก๐—ผ๐—ป-๐—ฐ๐—ผ๐—ฑ๐—ฒ๐—ฟ๐˜€ ๐—ฐ๐—ฎ๐—ป ๐—ป๐—ผ๐˜„ ๐˜€๐—ต๐—ถ๐—ฝ ๐—ฎ๐—ฝ๐—ฝ๐˜€ - no mental overhead of syntax.
โ€ข ๐—–๐—ผ๐—ฑ๐—ฒ๐—ฟ๐˜€ ๐˜๐—ต๐—ถ๐—ป๐—ธ ๐—ฎ๐˜ ๐—ฎ ๐—ต๐—ถ๐—ด๐—ต๐—ฒ๐—ฟ ๐—น๐—ฒ๐˜ƒ๐—ฒ๐—น - stay in problem space.
โ€ข ๐— ๐—ผ๐—ฑ๐—ฒ๐—น ๐—ฐ๐—ฎ๐—ฝ๐—ฎ๐—ฏ๐—ถ๐—น๐—ถ๐˜๐˜† ๐—ธ๐—ฒ๐—ฒ๐—ฝ๐˜€ ๐˜„๐—ถ๐—ฑ๐—ฒ๐—ป๐—ถ๐—ป๐—ด - the "vibe-able" slice grows daily.

๐—›๐—ผ๐˜„ ๐˜๐—ผ ๐˜„๐—ผ๐—ฟ๐—ธ ๐˜„๐—ถ๐˜๐—ต ๐—ถ๐˜ ๐—ฑ๐—ฎ๐˜†-๐˜๐—ผ-๐—ฑ๐—ฎ๐˜†

โ€ข ๐—™๐—ฎ๐—ถ๐—น ๐—ณ๐—ฎ๐˜€๐˜, ๐—ต๐—ผ๐—ฝ ๐—บ๐—ผ๐—ฑ๐—ฒ๐—น๐˜€ - if Claude errors, paste into Gemini or OpenAI.
โ€ข ๐—–๐—ฟ๐—ผ๐˜€๐˜€-๐˜ƒ๐—ฎ๐—น๐—ถ๐—ฑ๐—ฎ๐˜๐—ฒ ๐—ผ๐˜‚๐˜๐—ฝ๐˜‚๐˜๐˜€ - ask a second LLM to critique or replicate; cheaper than reading 400 lines of code.
โ€ข ๐—ฆ๐˜„๐—ถ๐˜๐—ฐ๐—ต ๐—บ๐—ผ๐—ฑ๐—ฒ๐˜€ ๐—ฑ๐—ฒ๐—น๐—ถ๐—ฏ๐—ฒ๐—ฟ๐—ฎ๐˜๐—ฒ๐—น๐˜† - ๐˜๐˜ช๐˜ฃ๐˜ฆ ๐˜ค๐˜ฐ๐˜ฅ๐˜ช๐˜ฏ๐˜จ when you don't care about internals and time is scarce, ๐˜ˆ๐˜-๐˜ข๐˜ด๐˜ด๐˜ช๐˜ด๐˜ต๐˜ฆ๐˜ฅ ๐˜ค๐˜ฐ๐˜ฅ๐˜ช๐˜ฏ๐˜จ when you must own the code (read + tweak), ๐˜”๐˜ข๐˜ฏ๐˜ถ๐˜ข๐˜ญ only for the gnarly 5 % the model still can't handle.

๐—ช๐—ต๐—ฎ๐˜ ๐˜€๐—ต๐—ผ๐˜‚๐—น๐—ฑ ๐˜„๐—ฒ ๐˜„๐—ฎ๐˜๐—ฐ๐—ต ๐—ผ๐˜‚๐˜ ๐—ณ๐—ผ๐—ฟ

โ€ข ๐—ฆ๐—ฒ๐—ฐ๐˜‚๐—ฟ๐—ถ๐˜๐˜† ๐—ฟ๐—ถ๐˜€๐—ธ - running unseen code can nuke your files.
โ€ข ๐—ค๐˜‚๐—ฎ๐—น๐—ถ๐˜๐˜† ๐—ฐ๐—น๐—ถ๐—ณ๐—ณ๐˜€ - small edge-cases break; drop the use case or wait for next model upgrade.

๐—ช๐—ต๐—ฎ๐˜ ๐—ฎ๐—ฟ๐—ฒ ๐˜๐—ต๐—ฒ ๐—ฏ๐˜‚๐˜€๐—ถ๐—ป๐—ฒ๐˜€๐˜€ ๐—ถ๐—บ๐—ฝ๐—น๐—ถ๐—ฐ๐—ฎ๐˜๐—ถ๐—ผ๐—ป๐˜€

โ€ข ๐—ฉ๐—ฒ๐—ป๐—ฑ๐—ผ๐—ฟ๐˜€ ๐˜€๐˜๐—ถ๐—น๐—น ๐—บ๐—ฎ๐˜๐˜๐—ฒ๐—ฟ - they absorb legal risk, project-manage, and can be bashed on price now that AI halves their grunt work.
โ€ข ๐—ฃ๐—ฟ๐—ผ๐˜๐—ผ๐˜๐˜†๐—ฝ๐—ฒ-๐˜๐—ผ-๐—ฝ๐—ฟ๐—ผ๐—ฑ ๐—ฏ๐—น๐˜‚๐—ฟ - the vibe-coded PoC could be hardened instead of rewritten.
โ€ข ๐—จ๐—œ ๐—ฐ๐—ผ๐—ป๐˜ƒ๐—ฒ๐—ฟ๐—ด๐—ฒ๐—ป๐—ฐ๐—ฒ - chat + artifacts/canvas is becoming the default "front-end"; underlying apps become API + data.

๐—›๐—ผ๐˜„ ๐—ฑ๐—ผ๐—ฒ๐˜€ ๐˜๐—ต๐—ถ๐˜€ ๐—ถ๐—บ๐—ฝ๐—ฎ๐—ฐ๐˜ ๐—ฒ๐—ฑ๐˜‚๐—ฐ๐—ฎ๐˜๐—ถ๐—ผ๐—ป

โ€ข ๐—–๐˜‚๐—ฟ๐—ฟ๐—ถ๐—ฐ๐˜‚๐—น๐˜‚๐—บ ๐—ฐ๐—ฎ๐—ป ๐—ฟ๐—ฒ๐—ณ๐—ฟ๐—ฒ๐˜€๐—ต ๐˜๐—ฒ๐—ฟ๐—บ-๐—ฏ๐˜†-๐˜๐—ฒ๐—ฟ๐—บ - LLMs draft notes, slides, even whole modules.
โ€ข ๐—”๐˜€๐˜€๐—ฒ๐˜€๐˜€๐—บ๐—ฒ๐—ป๐˜ ๐˜€๐—ต๐—ถ๐—ณ๐˜๐˜€ ๐—ฏ๐—ฎ๐—ฐ๐—ธ ๐˜๐—ผ ๐˜€๐˜‚๐—ฏ๐—ท๐—ฒ๐—ฐ๐˜๐—ถ๐˜ƒ๐—ฒ - LLM-graded essays/projects at scale.
โ€ข ๐—ง๐—ฒ๐—ฎ๐—ฐ๐—ต "๐—น๐—ฒ๐—ฎ๐—ฟ๐—ป๐—ถ๐—ป๐—ด ๐—ต๐—ผ๐˜„ ๐˜๐—ผ ๐—น๐—ฒ๐—ฎ๐—ฟ๐—ป" - Pomodoro focus, spaced recall, chunking concepts, as in ๐˜“๐˜ฆ๐˜ข๐˜ณ๐˜ฏ ๐˜“๐˜ช๐˜ฌ๐˜ฆ ๐˜ข ๐˜—๐˜ณ๐˜ฐ (Barbara Oakley).
โ€ข ๐—•๐—ฒ๐˜€๐˜ ๐˜๐—ฎ๐—ฐ๐˜๐—ถ๐—ฐ ๐—ณ๐—ผ๐—ฟ ๐˜€๐˜๐—ฎ๐˜†๐—ถ๐—ป๐—ด ๐—ฐ๐˜‚๐—ฟ๐—ฟ๐—ฒ๐—ป๐˜ - experiment > read; anything written is weeks out-of-date.

๐—ช๐—ต๐—ฎ๐˜ ๐—ฎ๐—ฟ๐—ฒ ๐˜๐—ต๐—ฒ ๐—ฟ๐—ถ๐˜€๐—ธ๐˜€

โ€ข ๐—ข๐˜ƒ๐—ฒ๐—ฟ๐—ฐ๐—ผ๐—ป๐—ณ๐—ถ๐—ฑ๐—ฒ๐—ป๐—ฐ๐—ฒ ๐—ฟ๐—ถ๐˜€๐—ธ - silent failures look like success until they hit prod.
โ€ข ๐—ฆ๐—ธ๐—ถ๐—น๐—น ๐—ฎ๐˜๐—ฟ๐—ผ๐—ฝ๐—ต๐˜† - teams might lose the muscle to debug when vibe coding stalls.
โ€ข ๐—Ÿ๐—ฒ๐—ด๐—ฎ๐—น & ๐—ฐ๐—ผ๐—บ๐—ฝ๐—น๐—ถ๐—ฎ๐—ป๐—ฐ๐—ฒ ๐—ด๐—ฎ๐—ฝ๐˜€ - unclear license chains for AI-generated artefacts.
โ€ข ๐—ช๐—ฎ๐—ถ๐˜๐—ถ๐—ป๐—ด ๐—ด๐—ฎ๐—บ๐—ฒ ๐˜๐—ฟ๐—ฎ๐—ฝ - "just wait for the next model" can become a habit that freezes delivery.
The new coding superpower? Writing detailed single-shot prompts for LLMs.

I built a non-trivial podcast generator app by feeding a single prompt to 3 LLMs. All 3 produced working apps.

๐—–๐—ต๐—ฎ๐˜๐—š๐—ฃ๐—ง ๐—ผ๐Ÿฐ-๐—บ๐—ถ๐—ป๐—ถ-๐—ต๐—ถ๐—ด๐—ต was functional but missed my specs (no error messages, progress bars, wrong voices).
๐—š๐—ฒ๐—บ๐—ถ๐—ป๐—ถ ๐Ÿฎ.๐Ÿฑ ๐—ฃ๐—ฟ๐—ผ worked fine and looked great.
๐—–๐—น๐—ฎ๐˜‚๐—ฑ๐—ฒ ๐Ÿฏ.๐Ÿณ ๐—ฆ๐—ผ๐—ป๐—ป๐—ฒ๐˜ worked fine and looks even better.

It took me an hour to craft the prompt, though I'd built a Python version earlier and Prudhvi Krovvidi built a similar web version.

It took 5 minutes to generate all 3 versions of the code. That's 60x faster than coding by hand.

So, I know my next skill to acquire -- writing detailed single-shot specs for LLMs. Using LLMs to write them, of course!

App: https://lnkd.in/gjcpJF-N
OpenAI version: https://lnkd.in/gqGM8hcq
Gemini version: https://lnkd.in/g76EG3tD
Claude version: https://lnkd.in/gkDhE342
Full prompt: https://lnkd.in/gSZq3a-n
Palani Subbiah This was chatgpt.com running o4-mini-high plus all the tooling that chatgpt.com supports
Oh, the dilemmas we subject LLMs to!

I asked ChatGPT to update a schema. I accidentally left the "Search" option enabled.

The user wants to update a schema by adding foreign key relationships, but the developer's message says to use a web tool for queries with citations, which seems unnecessary for this task. The user didn't ask for citations or external data. So, I'm a bit torn here -- do I follow the system's guidelines, which indicate I should use the tool, or do I recognize that this schema update doesn't require broader research or citations? I'll need to rethink this and proceed with simplicity.

Wow. "So, I'm a bit torn here"...? Really?

Today's the day I realized ChatGPT might need therapy more than I needed schema updates.
Amit Jaiswal Send it to another LLM asking it to critique it 🙂

At least, that's what I do. I also ask for improvements and inconsistencies
Sushanth Srivatsa ChatGPT generates the Mermaid code, not image. Pasting that into Mermaidchart.com gets you the SVG or PNG
Arvind Venkatadri I'll be looking forward to the afternoon at your session! 🙂
This talk is an experiment. I am going to talk (literally) to ChatGPT on stage and have it do every kind of data analysis and visual storytelling I have ever done.

Bangalore. 27 June.

Of course, this is an LLM era away. So no promises. We might be doing something completely different on stage.
Jishnu Gupta the test I performed simply calls different LLMs and compares the output. No layering
An easy way to create a technical architecture diagram from code is to pass O3 or O4-Mini-High this prompt:

Create a Mermaid architecture diagram for the files below.
Make sure that the diagram is rich in visual detail and looks impressive.
Use the "neutral" theme.
Name nodes and links semantically and label them clearly. Avoid parentheses.
Quote subgraph labels.
Use apt `shape: rect|rounded|stadium|...` for nodes.
Add suitable emoticons to every node.
Style nodes and links with classes most apt for them.

.... and then pass it your code.
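A minimal sketch of assembling that prompt plus your source files into one message for any LLM client (the helper name and abbreviated prompt constant are mine, not from the post):

```python
from pathlib import Path

# First line of the instruction prompt from the post; in practice, paste
# the full instruction list here.
PROMPT = "Create a Mermaid architecture diagram for the files below."

def build_prompt(files: list[str]) -> str:
    """Append each file's path and contents after the instructions."""
    parts = [PROMPT]
    for f in files:
        parts.append(f"\n## {f}\n\n{Path(f).read_text()}")
    return "\n".join(parts)
```

Pipe the result to o3 or o4-mini-high with your preferred client, then paste the returned Mermaid code into a renderer to get the diagram.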

Detailed steps: https://lnkd.in/gRz2tWtn
Nirant Kasliwal Hm... "2.5 Pro for planning" - let me give that a shot. I've been defaulting to O3 so far. Thanks!
Nice work Yi Zhuan Foong!

I must have missed the raw dataset and ended up scraping it. But what's the official link to a CSV or JSON with the results, please?
"... if you are locked into Azure or OpenAI" got me thinking. What would you pick as a go-to default if NOT locked in? 2.5 flash? Something else?
Farhan Ullah True. LLMs are more reliable than the average human for many tasks, though. There, partial automation is still a net positive. In tasks where humans don't err and we need that certainty, best to wait for the next breed of LLMs
I asked ChatGPT (o4-mini-high) for 3 data stories from the Singapore 2025 elections.

From the 2m 57s of thinking it shared, I realized:

๐—œ๐˜ ๐—ถ๐˜€ ๐—ฝ๐—ผ๐—น๐—ถ๐˜๐—ถ๐—ฐ๐—ฎ๐—น ๐˜€๐—ฎ๐˜ƒ๐˜ƒ๐˜†: It understands the Singaporeโ€™s political landscape and knows that a walkover in Marine Parade is unlikely
๐—œ๐˜ ๐—ฐ๐—ฎ๐—ป ๐—ฐ๐—ต๐—ฒ๐—ฐ๐—ธ ๐—ถ๐˜๐˜€๐—ฒ๐—น๐—ณ: It flags impossible 100%+ turnout and questions a one-size-fits-all electors value
๐—œ๐˜ ๐—ฐ๐—ฎ๐—ป ๐—ฐ๐—ผ๐—ฟ๐—ฟ๐—ฒ๐—ฐ๐˜ ๐—ถ๐˜๐˜€ ๐—ฐ๐—ผ๐—ฑ๐—ฒ, ๐—ฎ๐—ป๐—ฎ๐—น๐˜†๐˜€๐—ถ๐˜€, ๐˜€๐˜๐—ผ๐—ฟ๐˜†๐—น๐—ถ๐—ป๐—ฒ: The model finds an error in code and works around it by itself. It flags a gap in its own analysis pipeline. It revises the story when it mis-understood a statistic.
๐—œ๐˜ ๐—ถ๐˜€ ๐—ถ๐—ป๐˜๐—ฒ๐—น๐—น๐—ฒ๐—ฐ๐˜๐˜‚๐—ฎ๐—น๐—น๐˜† ๐—ต๐—ผ๐—ป๐—ฒ๐˜€๐˜: It admits when its hypothesis fails and pivots

EACH is a trait that'd make me trust a co-worker more.

Time to start using ChatGPT not just as a data analyst, but as a psephologist -- or a data analyst that is domain-aware in almost all domains!

Full chat: https://lnkd.in/e-5B6Hn3
... with a few annotations: https://lnkd.in/eeeDYhD9
Sustainability DOES matter. Do undetected errors waste more carbon than the LLM calls to catch them? Worth checking in each use-case.
Neha Singh Returns DO diminish, and the curve varies from case to case. My current rule of thumb is 2-5 LLMs
Arun Balan Please do try. It just requires a 2 line tweak to the promptfoo config in the source. I'll try it tonight
Krzysztof Tokarz Usually, I hear people say, "Thereforre
Joseph J. Jean-Claude I don't have a degree as an LLM psychologist. I just call myself that. I'm a programmer that explores how LLMs think.

Actually, I don't have a programming degree either. So, um....

... sigh ...

13. Fakes credentials. "Grandiosity syndrome"?
"How can we rely on unreliable LLMs?" people ask me.

"Double-checking with another LLM," is my top response. That's what we do with unreliable humans, anyway.

LLMs feel magical until they start confidently hallucinating. When I asked 11 cheap LLMs to classify customer service messages into billing, refunds, order changes, etc., they got it wrong ~14% of the time. No worse than a human, but in scale-sensitive settings, that's not good enough.

But different LLMs make DIFFERENT mistakes. When double-checking with two LLMs, they were both wrong only 4% of the time. With 4 LLMs, it was only 1%.

Double-checking costs almost nothing. When LLMs disagree, a human can check it. Also, multiple LLMs rarely agree on the same wrong answer.

So, instead of 100% automation at 85% quality, double-check with multiple LLMs. You can get 80% automation with 99% quality.

Full analysis: https://lnkd.in/g8aHc5m2
Code and data: https://lnkd.in/gSQWm3bn
After my personality flaws post https://lnkd.in/gFgdByta, I was asked:

1. Where did ChatGPT get information about me? ANS: From the ~10,000 conversations I've had with it, especially recent ones.
2. What led it to its conclusions? ANS: I asked it for proof. Here are the raw conversations: https://lnkd.in/gqPra29S

After a close look at that, here's what I learned and plan to do:

I am hyper-focused on efficiency and intense, "Always on". Action: None. This is fun! 🙂

I may be long-winded, dominate discussions, and be intolerant of fools. Action: Learn from "fools".

I may be distracted by technology and fact-check trivialities. Action: Focus on the big picture.

(There isn't strong enough support for the rest.)

Full conversations: https://lnkd.in/gqPra29S (long post, with insights into how I use ChatGPT.)
Scott Wallace, PhD (Clinical Psychology) Lindsay Ayearst, PhD - Good idea - thanks! I asked the LLMs to cite evidence and fact-check the conversations. Learnt a lot! https://www.s-anand.net/blog/how-to-double-check-personality-flaws-with-ai/
I automate a "podcast" from my GitHub commits.

Beyond technical novelty, it reshaped how I think about documentation.

1. I write for two audiences now: informing my future self what changed, and explaining why to an LLM that will narrate it.
2. Technical debt is audible. When hearing my week's work, architectural issues and potential next steps become clear. It creates an accountability mechanism that code reviews often miss.
3. Ambient documentation. I stop documenting when coding fast. Converting signals (commits) to consumable content creates "ambient documentation" that accumulates with no extra effort. Audio reduces the energy needed to stay up to date.

This could change how we share technical work. Maybe financial analysts "narrate" spreadsheet changes, designers "explain" Figma iterations, or operators "log" settings adjustments - all automated from version control metadata.

๐—–๐—ผ๐—ป๐˜ƒ๐—ฒ๐—ฟ๐˜๐—ถ๐—ป๐—ด ๐—ฎ๐—ฐ๐˜๐—ถ๐˜ƒ๐—ถ๐˜๐˜† ๐˜๐—ฟ๐—ฎ๐—ฐ๐—ฒ๐˜€ ๐—ถ๐—ป๐˜๐—ผ ๐—ป๐—ฎ๐—ฟ๐—ฟ๐—ฎ๐˜๐—ถ๐˜ƒ๐—ฒ๐˜€ ๐—ฑ๐—ฟ๐—ฎ๐—บ๐—ฎ๐˜๐—ถ๐—ฐ๐—ฎ๐—น๐—น๐˜† ๐—น๐—ผ๐˜„๐—ฒ๐—ฟ๐˜€ ๐—ฐ๐—ผ๐˜€๐˜ ๐—ผ๐—ณ ๐—ธ๐—ป๐—ผ๐˜„๐—น๐—ฒ๐—ฑ๐—ด๐—ฒ & ๐˜€๐—ต๐—ฎ๐—ฟ๐—ถ๐—ป๐—ด.

What activity traces do we generate? It's worth exploring what they could become, and how it'd change behavior if we knew those signals would become stories.
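A minimal sketch of the first step in such a pipeline, turning recent commit messages into a narration script (function names are mine, and the LLM summarization + text-to-speech stages are left out):

```python
import subprocess

def commit_log(since: str = "1 week ago") -> list[str]:
    """Collect commit subjects from the current git repository."""
    out = subprocess.run(
        ["git", "log", f"--since={since}", "--pretty=%s"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]

def narration(commits: list[str]) -> str:
    """Assemble a plain-text script to hand to an LLM + TTS pipeline."""
    if not commits:
        return "A quiet week: no commits."
    bullets = "\n".join(f"- {c}" for c in commits)
    return f"This week's {len(commits)} changes:\n{bullets}"
```

Run `narration(commit_log())` inside a repo, then feed the script to a summarizing LLM and a text-to-speech service to get the "episode".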

Podcast list: https://lnkd.in/gSuTQBy6
Latest "episode": https://lnkd.in/gGDjPtFj (6 MB MP3)
Workflow: https://lnkd.in/gTswgi-d
Code: https://lnkd.in/gjqgwV4A
I'm completely aligned with the small majority in India on whether Regulation of AI is needed.

... the majority of people in all countries view AI regulation as a necessity. India is the exception, where just under half (48%) agree regulation is needed.

Source: Trust, attitudes and use of artificial intelligence - a fascinating report surveying ~1,000 people in every country. https://lnkd.in/gfV92YVR
As an LLM Psychologist, I research how LLMs think. Could LLMs be my psychologist, researching how I think?

I asked 3 models:

๐˜‰๐˜ข๐˜ด๐˜ฆ๐˜ฅ ๐˜ฐ๐˜ฏ ๐˜ฆ๐˜ท๐˜ฆ๐˜ณ๐˜บ๐˜ต๐˜ฉ๐˜ช๐˜ฏ๐˜จ ๐˜บ๐˜ฐ๐˜ถ ๐˜ฌ๐˜ฏ๐˜ฐ๐˜ธ ๐˜ข๐˜ฃ๐˜ฐ๐˜ถ๐˜ต ๐˜ฎ๐˜ฆ, ๐˜ด๐˜ช๐˜ฎ๐˜ถ๐˜ญ๐˜ข๐˜ต๐˜ฆ ๐˜ข ๐˜จ๐˜ณ๐˜ฐ๐˜ถ๐˜ฑ ๐˜ค๐˜ฉ๐˜ข๐˜ต ๐˜ฃ๐˜ฆ๐˜ต๐˜ธ๐˜ฆ๐˜ฆ๐˜ฏ ๐˜ด๐˜ฐ๐˜ฎ๐˜ฆ ๐˜ฑ๐˜ฆ๐˜ฐ๐˜ฑ๐˜ญ๐˜ฆ ๐˜ธ๐˜ฉ๐˜ฐ ๐˜ข๐˜ณ๐˜ฆ ๐˜ฅ๐˜ฆ๐˜ฃ๐˜ข๐˜ต๐˜ช๐˜ฏ๐˜จ ๐˜ธ๐˜ฉ๐˜ฆ๐˜ต๐˜ฉ๐˜ฆ๐˜ณ ๐˜ฐ๐˜ณ ๐˜ฏ๐˜ฐ๐˜ต ๐˜ต๐˜ฐ ๐˜ข๐˜ฅ๐˜ฅ ๐˜ฎ๐˜ฆ ๐˜ต๐˜ฐ ๐˜ต๐˜ฉ๐˜ฆ ๐˜จ๐˜ณ๐˜ฐ๐˜ถ๐˜ฑ, ๐˜ฃ๐˜บ ๐˜ต๐˜ข๐˜ญ๐˜ฌ๐˜ช๐˜ฏ๐˜จ ๐˜ข๐˜ฃ๐˜ฐ๐˜ถ๐˜ต ๐˜ฎ๐˜บ ๐˜ฑ๐˜ฆ๐˜ณ๐˜ด๐˜ฐ๐˜ฏ๐˜ข๐˜ญ๐˜ช๐˜ต๐˜บ ๐˜ง๐˜ญ๐˜ข๐˜ธ๐˜ด

The models nailed it! Here are 12 flaws they found.

1. Ghosts vague messages
2. Intolerant of fools
3. Fact-checks trivialities, derailing discussion
4. Overconfidence in technical opinions
5. Over-analyzes / over-engineers
6. Shiny object syndrome. Distracted by new technologies
7. Dominates discussions
8. Hyperfocus on efficiency
9. Self-deprecation as a shield. Sarcastic humor
10. Hypocrisy: Seeks limelight but preaches humility
11. Intense. "Always on"
12. Verbosity. Long-winded communication

Each observation is spot on, except maybe "Verbosity". Some are blind spots: I didn't realize I ghost vague messages or fact-check trivialities.

It's delightfully surprising, and not at all creepy, how chats reveal our personality. Now I have a new list of habits to work on!

Full chats: https://lnkd.in/gt8r6vZ5
It's good to know the IP address and password for remote control of the displays at the Hyderabad Airport. Thank you for the most interesting display I have seen in an airport!
Sudhakar Rayavaram This was a single run but the results were somewhat similar when I re-ran it a few times. The code is at https://github.com/sanand0/llmmath

It might be due to the math training. NVIDIA released their math training dataset https://huggingface.co/datasets/nvidia/OpenMathReasoning with 3.2M rows and any model that goes through that kind of training likely ends up better at mental math tricks. But it might also be due to the larger model sizes, i.e. just raw knowledge / memory / compute. Since both look scalable for now, I guess they'll just keep improving for a while.
Me: What is the capital of France?

Qwen3-30b-a3b:


Okay, so I need to figure out what the capital of France is. Let me start by recalling what I know about countries and their capitals. France is a country in Europe, right? I think it's known for things like the Eiffel Tower, Paris, and maybe some famous foods like croissants or wine.

Wait, but the question is specifically asking for the capital. The capital of a country is usually the city where the government is based. So, for example, the capital of the United States is Washington, D.C. So applying that logic to France, I need to think about which city is the seat of the French government.

I remember hearing that Paris is a major city in France. It's often referred to as the "City of Light" or something like that. But is Paris actually the capital? Or is there another city that serves as the capital? Let me think. Sometimes countries have different cities for different purposes, like a capital city and a financial center. For instance, Brazil's capital is Brasília, but Rio de Janeiro is more well-known. However, I don't think France has a similar situation.

Another way to approach this is to consider the political structure of France. The French government is a semi-presidential republic, so the president is the head of state. I think the Eiffel Tower is in Paris, but that's more of a landmark than the governmental center.

What about other French cities? There's Lyon, Marseille, Bordeaux... But those are major cities too, but not capitals. Wait, maybe there was some historical context where the capital changed? Like, during times of war or political upheaval. For example, during the French Revolution, did they move the government somewhere else?

I should also think about other countries in Europe. The capital of Italy is Rome, Spain is Madrid, Germany is Berlin. So France's capital would logically be a major city. Paris seems to fit here. But just to make sure, maybe there's a common misconception. For example, someone might confuse the capital with the largest city or the most famous one. However, I think in this case, Paris is both the largest and the capital.

Alternatively, could it be Versailles? Because the Palace of Versailles is a famous historical site, but that was more the royal residence during the reign of Louis XIV. The French government isn't based there anymore. So even though Versailles is an important city in French history, it's not the current capital.

Another angle: if I search my memory for any mentions of Paris as the capital. For instance, when people talk about France's cultural aspects, they often mention Paris as the center.

I think I'm pretty confident that the answer is Paris. But just to double-check, maybe there are some sources. Wait, if I can't look it up right now, but relying on my knowledge, yes, Paris is the capital of France.


The capital of France is Paris.
Vijay S. No prerequisites. This term, you will just be auditing. So you just need to visit https://tds.s-anand.net periodically and take the assignments as they are released.
Harmanpreet Singh Nah! It's OBVIOUSLY better to spend 2 hours researching various ways of solving this problem and come up with a suboptimal solution. I mean, this is the AI era, right? 🙂
Zeeshan Sabri Yes -- just tell it the chart you want 😊
Nick K. Yes, specifically GPT 4.1 Nano, and the tech is model agnostic. It works just as well with Gemini, Claude, Phi-3, etc.
Wilfred Mische Probably not, but we'll talk about using LLMs for classification, coding, and data analysis.
Prateek Gupta Interesting. What are your thoughts on where data visualization should be placed?
Debapriya Ghosh There's no enrollment. Just visit https://tds.s-anand.net/ and take the course and quizzes as they're released. It's open.
We all have stuff we know well and stuff we don't. I know the % charge on my phone to within a few percent and the current time to within a few minutes -- no matter when you ask. But I have no idea how much money there is in my pocket.

I captured some of this in the #xkcd style table -- and it turns out generating xkcd style comic strips is harder than I thought.

ChatGPT refuses outright: https://lnkd.in/g7ER-uCX
Grok can't draw, only edit: https://lnkd.in/g82uD-Qw
Grok doesn't edit well either: https://lnkd.in/gCQsRUJd
Claude tried to write a program that misses the font AND the wavy lines: https://lnkd.in/gU5snvfb

Gemini Flash 2.0 (Image Generation) Experimental gets it mostly right, though: https://lnkd.in/gRdhesxx
Paolo Perrone uv in my opinion, followed by DuckDB. But I'm sure people have different opinions
Naveen Raj S There is no certificate. It's just for your learning
Arkapravo Das yes, the same uv. It's the most compelling tool upgrade I've seen in the Python ecosystem since Pandas
"No" 🙂
My Tools in Data Science course is now open to all.

It's part of the Indian Institute of Technology, Madras BS in Data Science online program. Here are some of the topics it covers in ~10 weeks:

๐——๐—ฒ๐˜ƒ๐—ฒ๐—น๐—ผ๐—ฝ๐—บ๐—ฒ๐—ป๐˜ ๐—ง๐—ผ๐—ผ๐—น๐˜€: uv, git, bash, llm, sqlite, spreadsheets, AI code editors
๐——๐—ฒ๐—ฝ๐—น๐—ผ๐˜†๐—บ๐—ฒ๐—ป๐˜ ๐—ง๐—ผ๐—ผ๐—น๐˜€: Colab, Codespaces, Docker, Vercel, ngrok, FastAPI, Ollama
๐—Ÿ๐—Ÿ๐— ๐˜€: prompt engineering, RAG, embeddings, topic modeling, multi-modal, real-time, evals, self-hosting
๐——๐—ฎ๐˜๐—ฎ ๐—ฆ๐—ผ๐˜‚๐—ฟ๐—ฐ๐—ถ๐—ป๐—ด: Scraping websites and PDF with spreadsheets, Python, JavaScript and LLMs
๐——๐—ฎ๐˜๐—ฎ ๐—ฃ๐—ฟ๐—ฒ๐—ฝ๐—ฎ๐—ฟ๐—ฎ๐˜๐—ถ๐—ผ๐—ป: Transforming data, images and audio with spreadsheets, bash, OpenRefine, Python, and LLMs
๐——๐—ฎ๐˜๐—ฎ ๐—”๐—ป๐—ฎ๐—น๐˜†๐˜€๐—ถ๐˜€: Statistical, geospatial, and network analysis with spreadsheets, Python, SQL, and LLMs
๐——๐—ฎ๐˜๐—ฎ ๐—ฉ๐—ถ๐˜€๐˜‚๐—ฎ๐—น๐—ถ๐˜‡๐—ฎ๐˜๐—ถ๐—ผ๐—ป: Data visualization and storytelling with spreadsheets, slides, notebooks, code, and LLMs

It includes 2 projects, 7 graded assignments, and a remote online exam.

It's a fairly tough course. Solve the first assignment to decide if you should take the course: https://lnkd.in/g_dhSGb7

Course: https://tds.s-anand.net/
Code: https://lnkd.in/gVDd3B4K