Vibe Analysis

Fifth Elephant Workshop · 16 Sep 2025, 2:00 pm IST · Bangalore
Anand S · LLM Psychologist · Straive
CC0 - Public Domain

https://sanand0.github.io/talks/

You need some software before we start

For programmers:

We will be sharing prompts on WhatsApp

WhatsApp Invite

Vibe Coding: code like code doesn't exist

... where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.

I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment

Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away.

Andrej Karpathy

Vibe Analysis: analyze, but ignore the code

Here's the vibe analysis mindset:

  • You're pragmatic. You care about insights, not how they're coded.
  • You're sceptical. You cross-question and find errors.
  • You're playful. You try weird "what ifs" just to see what breaks.

Let's analyze some datasets with this mindset.

Pick any dataset. Here are some datasets

Here are some India-specific datasets

Now we'll use LLMs for everything

  1. Explore data
  2. Clean data
  3. Model data
  4. Explain data
  5. Deploy data
  6. Anonymize data
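
As a taste of the first two steps, here is a minimal sketch (assuming a hypothetical movies.csv with year and rating columns; in the workshop, the coding agents wrote and ran this kind of code for us):

```python
# Minimal "explore" and "clean" sketch, assuming a hypothetical movies.csv
# with year, genre, and rating columns.
import pandas as pd

df = pd.read_csv("movies.csv")  # hypothetical dataset

# Explore: shape, types, missingness, quick distributions
print(df.shape)
print(df.dtypes)
print(df.isna().mean().sort_values(ascending=False).head(10))
print(df.describe(include="all").T.head(20))

# Clean: drop duplicates, coerce year to numeric, keep plausible ratings
df = df.drop_duplicates()
df["year"] = pd.to_numeric(df["year"], errors="coerce")
df = df[df["rating"].between(0, 10)]
```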

We'll ad-lib this

  1. I have no script / preparation. We'll do this live.
  2. You pick a dataset. You suggest hypotheses.
  3. You guide the agents, even on my system. I'll just "facilitate".
  4. Publish your findings and prompts. Make it reproducible.
  5. Red team this. Critique sceptically. See what survives.
  6. Synthesize learnings. Let's see what emerges / drops.

I'm keen to experiment

  • Do multiple versions pay off?
  • Explain-then-code vs code-then-explain?
  • What sceptical review approaches are effective?
  • What unit tests help most? Any invariants?
  • What's the minimal stack for analysis?
  • What schema context helps LLMs most? (Ablation)
  • Do data tools via AGENTS.md (dbt, rclone, ...) help?
  • What sub-agent specializations work well?
  • Where are pre-mortems effective?

Let's cook some insights!

Vibe Analysis

Fifth Elephant Workshop · 16 Sep 2025, 2:00 pm IST · Bangalore
Anand S · LLM Psychologist · Straive
CC0 - Public Domain

Please share your feedback here:

Insights from IMDb rating histogram

Lockdown-era genre peaks

Reality-TV reached its highest output during lockdown with 78 feature releases in 2020, a 2.07× surge versus the 2017–2019 baseline of 37.7 films per year and 3.12× the 2022–2024 recovery average of 25 releases.

Talk-Show titles also peaked in 2020, delivering 26 releases—double the pre-lockdown average of 13 per year and 2.23× the immediate post-lockdown average of 11.7 releases.

Both genres cleared the study’s thresholds (≥25 lockdown-year films and ≥20% growth over the pre-lockdown window), marking them as the only categories with statistically meaningful release spikes tied to the COVID-19 lockdown period.
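
A rough sketch of how a check like this could be reproduced, assuming a hypothetical films table with one row per title–genre pair and a startYear column (the thresholds mirror those quoted above):

```python
# Sketch: per-genre release counts around lockdown, assuming a hypothetical
# films.csv with columns title, startYear, genre (one row per title-genre pair).
import pandas as pd

films = pd.read_csv("films.csv")  # hypothetical dataset

def yearly_counts(g, years):
    """Release counts per year for the given genre slice, zero-filled."""
    return g[g["startYear"].isin(years)].groupby("startYear").size().reindex(years, fill_value=0)

rows = []
for genre, g in films.groupby("genre"):
    pre = yearly_counts(g, range(2017, 2020)).mean()   # 2017-2019 baseline
    lock = (g["startYear"] == 2020).sum()               # lockdown year
    post = yearly_counts(g, range(2022, 2025)).mean()   # 2022-2024 recovery
    # Thresholds from the write-up: >=25 releases in 2020 and >=20% growth vs baseline
    if lock >= 25 and pre > 0 and lock / pre >= 1.2:
        rows.append({"genre": genre, "2020": lock,
                     "vs_pre": lock / pre,
                     "vs_post": lock / post if post else float("nan")})

print(pd.DataFrame(rows).sort_values("vs_pre", ascending=False))
```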

Director              Movies  Preferred Letter
Rakesh Roshan              6  K
Haruo Sotozaki             5  D
Keishi Otomo               5  R
Frank Darabont             4  T
Chad Stahelski             4  J
Alan Mak                   4  I
Kunihiko Yuyama            4  P
Michael Chaves             4  T
John R. Cherry III         4  E
Saeed Roustayi             3  L
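
A table like this could be derived roughly as follows, assuming "preferred letter" means the most frequent first letter of each director's titles, and a hypothetical director_titles.csv with director and title columns:

```python
# Sketch: each director's movie count and most frequent first letter of their
# titles, assuming a hypothetical director_titles.csv with director, title columns.
import pandas as pd

movies = pd.read_csv("director_titles.csv")  # hypothetical dataset

movies["letter"] = movies["title"].str.strip().str[0].str.upper()
summary = (
    movies.groupby("director")
    .agg(movies=("title", "size"),
         preferred_letter=("letter", lambda s: s.mode().iloc[0]))
    .sort_values("movies", ascending=False)
)
print(summary.head(10))
```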

Any genres that rate 1950–60s films too high?

  • Film-noir ratings cluster in the classic period: about 13.3% of all noir ratings come from 1950s–1960s releases and another 25% from the 1930s–1940s. The median genre sees only ~4.5% of its ratings tied to 1950s–1960s titles, so noir is roughly 3× as concentrated there. Those mid-century titles are strong performers (e.g., 1950s average 3.997, 1960s 4.053), but later decades (1970s neo-noir at 4.12) keep the overall average high too.
  • War films also skew earlier: 1950s–1960s releases account for 11.3% of war-movie ratings versus that same ~4.5% median elsewhere. Their 1950s/1960s averages (3.963 and 4.004) are among the genre’s best, yet pre- and post-1960 decades remain relatively strong (e.g., 1930s at 3.86, 1990s at 3.85).
  • It isn’t solely a volume imbalance, though; even later entries hold up reasonably well, suggesting a combination of historical canon effects and comparatively consistent quality.
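
A sketch of the underlying calculation, assuming a hypothetical genre_ratings.csv with one row per title–genre pair carrying year and num_votes columns (the share of a genre's votes coming from 1950s–1960s titles approximates the quoted percentages):

```python
# Sketch: share of each genre's ratings that come from 1950s-1960s releases,
# assuming a hypothetical genre_ratings.csv with genre, year, num_votes columns.
import pandas as pd

ratings = pd.read_csv("genre_ratings.csv")  # hypothetical dataset

mid = ratings["year"].between(1950, 1969)
share = (
    ratings.assign(mid_votes=ratings["num_votes"].where(mid, 0))
    .groupby("genre")[["num_votes", "mid_votes"]].sum()
    .assign(mid_share=lambda d: d["mid_votes"] / d["num_votes"])
    .sort_values("mid_share", ascending=False)
)
print(share.head(10))                         # genres most concentrated in 1950s-1960s titles
print("median share:", share["mid_share"].median())
```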

Feedback analysis

Lessons from vibe analysis of feedback:

  • Share setup instructions + video/asciinema to reduce setup friction.
  • Share self-check instructions for a default dataset with prompts.
  • Add explicit checkpoints with self-tests.
  • Use two-lane pacing: Cruise (safe, minimal path) vs Sprint (optional accelerators).

Participant feedback

Participant feedback: Ambarish

Before the workshop, I was curious about the name and content, but trusted the organizer (Hasgeek) & Anand. Anand’s deep knowledge and lucid explanations guided us through using LLMs for data analysis, testing hypotheses, and running parallel tasks with tools like Claude and Codex. I learned how BI and data analysis are evolving in the AI age, and how using multiple models reduces hallucinations and errors. Even as a remote participant, the pace was comfortable. Thanks again, Anand, for a wonderful session!

Participant feedback: Anindo Chakraborty

Yesterday’s workshop cut through the hype around ‘vibe coding’ and showed how it applies to analytics and data science. Anand walked us through the full lifecycle — from raw data to hypotheses, AI-driven testing, and refining results into business-ready insights. Advanced topics included building Agents, sub-agents, and using AGENTS.md files for task automation. The hands-on ‘flow state’ approach made learning incredibly practical. My biggest takeaway: start small, get your hands dirty, and practical experience will follow. Outstanding workshop — thank you, Anand!

Ideas from https://chatgpt.com/c/68c8bea3-c348-832b-a011-cc2723a47279