I vibe-coded data visualizations of my browser history over the last 4 months.
STEP 1: Ran codex --search with GPT-5-codex (high) to create an ideation plan.md
Suggest interesting analyses & visual storytelling ideas based on my browser history that I can publish as journalistic data stories on my blog.
Use my Edge/Ubuntu browser history at ~/.config/microsoft-edge/Default/History - a SQLite database. (Open it as read-only with no lock.)
Search online for more creative ideas.
Evaluate each based on analysis novelty, visual impact, usefulness for the reader, and reliability/robustness of the analysis.
Save as plan.md.
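(An aside for anyone reproducing this: SQLite can honour that "read-only with no lock" request directly. A minimal sqlite3-shell sketch, assuming a reasonably recent shell with URI filenames enabled, run from the profile directory:)

```sql
-- Open the History file read-only; immutable=1 tells SQLite the file will
-- not change, so it takes no locks even while Edge is running.
.open --readonly 'file:History?immutable=1'

-- Chromium/Edge timestamps are microseconds since 1601-01-01 (UTC);
-- subtracting 11644473600 seconds rebases them onto the Unix epoch.
SELECT u.url,
       datetime(v.visit_time / 1000000 - 11644473600, 'unixepoch') AS ts_utc
FROM visits v
JOIN urls u ON u.id = v.url
ORDER BY v.visit_time DESC
LIMIT 10;
```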
STEP 2: Ran the same prompt with `codex --search` and GPT-5 (high) to create an ideation plan-b.md
→ I like GPT-5's ideation over GPT-5-Codex's.
STEP 3: Continued the thread with GPT-5 asking it to merge the plans into plan-spec.md.
Read plan.md. Merge ideas from there and write a revised story idea list in plan-spec.md. Re-evaluate the idea list on the same criteria. Pick the top 3 ideas. For the top 3, create a concise prompt for Claude Code to generate the visual story. Include the SQL query and analysis steps. Do not explain what visual to create or how. Claude Code has a better visual aesthetic and can decide its course. Instead, explain the effect we want the analysis and visual to have on the audience; the mood / feeling to evoke.
STEP 4: Reviewed plan-spec.md. I like the goals.
... but would rather have Codex do only the analysis and Claude Code do only the visualization. So I moved plan-spec.md to /tmp (temporarily) and ...
STEP 5: Re-merge into plan-spec.md and create story-wise specs using codex --search with GPT-5 (high)
Write prompts for interesting journalistic visual data stories from my browser history for my blog.
Based on my Edge/Ubuntu browser history at ~/.config/microsoft-edge/Default/History - a SQLite database (open it as read-only with no lock) we have ideas in plan.md and plan-b.md. Merge ideas and write a revised story idea list in plan-spec.md.
Evaluate each based on analysis novelty, visual impact, usefulness for the reader, and reliability/robustness of the analysis. Pick the top 3 ideas.
For each of these top 3 ideas:
- Create a folder named concisely (2-3 words) based on the story. Use lowercase-hyphenated names.
- In each folder, create a concise prompt for Claude Code to generate the visual story in `${folder}/spec.md`. Do not explain what visual to create or how. Claude Code has a better visual aesthetic and can decide its course. Instead, explain the effect we want the analysis and visual to have on the audience; the mood / feeling to evoke. Explain how Claude Code can re-do the analysis or get any additional context.
- Run the analyses required to generate the story and save them as .csv files under the folder. It is better to provide more information for context than just the summaries. So include any additional fields/rows that may help write a better story. Mention what each file contains and how it was generated in spec.md.
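(Aside: if you're reproducing the "save analyses as .csv" part, the stock sqlite3 shell can dump any query to CSV with dot-commands. A minimal sketch; the file name and query are illustrative, not the exact ones Codex used:)

```sql
-- Dot-commands (not SQL) that route query output into a CSV file.
.headers on
.mode csv
.output chains_summary.csv
SELECT 1 AS placeholder;  -- substitute the real analysis query here
.output stdout
```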
STEP 6: Ran `npx -y @anthropic-ai/claude-code --permission-mode acceptEdits` in `attention-clock/`.
Create a beautiful award winning data journalistic visualization on my browsing history as per spec.md.
It created `attention-clock/`, which was beautiful but not too insightful.
STEP 7: Asked Claude Code to generate a full story by itself.
I copied the history into `history.db` and ran `npx -y @anthropic-ai/claude-code --permission-mode acceptEdits` in `history/` with:
Create a beautiful award winning data journalistic visualization on Anand's browsing history.
Use this copy of Edge's SQLite history.db. Be insightful, funny/witty, gently instructive.
It generated a version which looked OK but had this problem:
See screenshot1.webp. Two charts are empty. Fix these.
That was OK, not too impressive either.
STEP 8: Ran `npx -y @anthropic-ai/claude-code --permission-mode acceptEdits` in `rabbit-holes/` with a revised version of spec.md using a local copy of the history.
Create a beautiful award winning data journalistic visualization on Anand's browsing history. Focus on how a single spark becomes a journey. Make readers feel the momentum of a rabbit hole – curiosity tightening into flow – and the tenderness of losing it. Celebrate deep dives, avoid shaming detours.
Effect/mood to evoke
- Momentum: chains that gather force and coherence.
- Texture: some journeys are tight spirals, others are branching explorations.
- Reflection: name the themes without reducing them to productivity.
Analysis hints:
- Prefer Edge's on-device "Journeys" clusters (`clusters`, `clusters_and_visits`) when populated; otherwise fall back to pure graph walks over `visits.from_visit` (see the sanity-check query after this list).
- Re-run SQL on a read-only copy of `history.db`.
- Foreground time from `context_annotations.total_foreground_duration` (µs). Time bounds via min/max timestamps per journey/chain.

Some existing CSVs for reference:

- chains_summary.csv – one row per root `from_visit` chain with visit_count, distinct_domains, start/end timestamps (UTC), duration_sec, total_foreground_sec; built via a recursive CTE that walks `from_visit` from roots.
- chain_depths.csv – per chain, maximum depth (longest branch) and total_nodes; useful to pick the most "rabbit-hole-ish" journeys.
- If `clusters*` tables are non-empty, you can also compute the cluster-level summaries (clusters_summary.csv, cluster_depths.csv).
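The hints allow for the Journeys tables being empty, so it's worth checking before committing to either route. A quick sanity check, using the table names given above:

```sql
-- If the cluster counts are zero, fall back to walking visits.from_visit.
SELECT (SELECT count(*) FROM clusters)                            AS clusters,
       (SELECT count(*) FROM clusters_and_visits)                 AS cluster_visits,
       (SELECT count(*) FROM visits WHERE from_visit IS NOT NULL) AS linked_visits;
```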
While working, it said something I quite liked:
The visualization will use a particle flow system where each chain is a flowing stream of particles, showing momentum building as curiosity deepens.
It created rabbit-holes/index.html which was beautiful! But I needed more details.
STEP 9: I really liked what it created, so I continued the session, asking it to expand.
I love the design and style of this! But this does not give me any specifics about any of these: the spirals, the branching explorations, etc. like what sites I visited, what the chain was about, or even what the branching structure looked like.
Expand the story to provide specifics. Be insightful and instructive.
STEP 10: This was beautiful, so I took a backup of this version and had it make tweaks:
Fantastic! Hyperlink all pages. Begin with the patterns of exploration. Then, for each pattern, show two interesting examples. The current examples are perfect. Add more as required. Pick the most interesting ones.
Then:
Avoid localhost examples
Then:
Ensure that the three patterns of exploration fit in a single row. Ensure that each category begins with a public site example, i.e. not a site that requires a login or that the audience cannot access.
The result is beautiful!
STEP 11: Same approach. Ran `npx -y @anthropic-ai/claude-code --permission-mode acceptEdits` in `search-funnels/` with a revised version of spec.md using a local copy of the history.
Create a beautiful award winning data journalistic visualization on Anand's browsing history. Reveal whether questions actually find answers. Make the journey from a search to its first destination feel tangible – hopeful when swift, curious when meandering. Focus on empathy: everyone googles in loops; clarity comes from seeing the loops.
Effect/mood to evoke
- Curiosity resolved: quick funnels feel like a small win.
- Discovery vs. dithering: some terms scatter to many sites; others converge.
- Practicality: readers pick up how to craft searches that land.
Analysis hints:
- Re-run SQLite on `history.db`.
- Identify search visits via `keyword_search_terms` and find first click-throughs using `visits.from_visit`.
- Compute `time_to_click_sec` as `(dest.visit_time - search.visit_time)/1e6` and aggregate by `normalized_term` (a minimal SQL sketch follows the CSV list below).

Some existing CSVs for reference:
- search_funnels_terms.csv – per normalized search term: search_events, clicks, click_rate, avg_time_to_click_sec, unique_dest_domains, top_dest_domain (+ clicks/share), total_dest_foreground_sec.
- search_destinations.csv – for each term × destination domain: clicks, avg_time_to_click_sec, total_dest_foreground_sec.
- search_samples.csv – sample term → destination titles with timestamps (anonymize as needed in publication).
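For the curious, a minimal sketch of the funnel join those hints describe: `keyword_search_terms.url_id` points at the search-results URL, and a click-through is a visit whose `from_visit` is the search visit. This is my reading of the hints, not the exact query Codex ran:

```sql
-- Per search visit: latency of the first click-through, in seconds.
SELECT k.normalized_term,
       s.id                                     AS search_visit_id,
       min((d.visit_time - s.visit_time) / 1e6) AS time_to_click_sec
FROM keyword_search_terms k
JOIN visits s ON s.url = k.url_id      -- visits to the search-results page
JOIN visits d ON d.from_visit = s.id   -- pages opened from those results
GROUP BY k.normalized_term, s.id
ORDER BY time_to_click_sec;
```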
Version 1 was nice. So I added:
Nice! In the search journeys, highlight the most interesting individual (search, time-to-click) items and add a sentence explaining these. Allow filtering by these (apart ). Allow sorting by time, clicks, searches, destinations, and restoring the original order. For each journey type, synthesize and share insights based on the patterns you see. Add links to search queries (e.g. google.com/search?q=...). Replace the Rhythm of Search visual with a more insightful, visually appealing one.
Version 2 could be improved a bit:
Great!
Drop the "Where Do Searches Lead" section For the "Interesting" searches, hand-craft the explanation for each. Unique, insightful, leading up to a collective story. Improve the "What Each Journey Type Reveals" section. Rewrite with big, useful, non-obvious/surprising insights.
Modify "The Search Landscape" into a scatterplot that fills the width of the window, where:
- x = Number of searches
- y = Number of destinations
- Size = Number of searches at that (x,y) intersection
- Color = Average time to click (divergent scale)
Rewrite the "Reading the Landscape" chart to explain the chart AND share with big, useful, non-obvious/surprising insights.
This led to a squashed chart. So....
The search landscape chart is squeezed into the left. See screenshot-landscape.webp
Version 3 didn't have much insight on the search landscape. So:
Delete "The Search Landscape" and "Reading the Landscape" sections.
To review for privacy issues, I asked codex with GPT-5 (high) to create leaks.md:
Review files added for commit and identify privacy issues / data leaks. For each file, mention a list of issues (if any) along with severity. Provide enough information for the reader to decide whether to publish or not. Save this in browser-history/leaks.md.
After reviewing, I went to the search-funnels folder and, in a new session of codex with GPT-5 (high), asked it to:
Go through v1.html, v2.html, v3.html and index.html. Find out which terms would be loaded from search_funnels_terms.csv based on the code's functionality. Filter the file so that only those terms remain and delete the rest. Ensure that this does not break any functionality.
That was too aggressive, so:
Go through index.html. Find out which terms would be loaded from search_funnels_terms.csv based on the code's functionality. Ensure that every filter (instant, quick, ...) and every sort order will have at least 20 results. Ensure that all interesting terms (e.g. those mentioned in the story) are included. Delete the rest from search_funnels_terms.csv.
I continued exploring rabbit holes with this prompt.
Create a beautiful award winning data journalistic visualization on Anand's browsing history as a single page web app.
Write about the following three patterns of browsing: Linear Spirals, Hub & Spoke, Wide Survey.
- Skip localhost sites.
- Categorize browsing chains into these buckets (or "Other").
- Find out how many belong in each category.
- Explore interesting chains and write 3 mini-stories about the most interesting ones in each category for a lay audience.
- If the most interesting ones are private sites (requiring login, e.g. banks, email, etc.) pick no more than 1 such story per category.
- Always include links to public pages. Never include links to private pages (i.e. that requires login, e.g. banks, email, etc.)
Linear Spirals: One page → next → next. Deep focus, single thread. Branching ≈ 1.0

```
Root ── A ── B ── C ── D ── E ── F
```

Hub & Spoke: Return to index, open new tab. Configuration work. Branching 2-6×

```
Root
├── A
├── B
│   ├── B1
│   └── B2
├── C
└── D
```

Wide Survey: Many tabs from root. Cataloging, surveying. Branching 10-24×

```
Root
├── A
├── B
├── C
├── D
├── E
└── ...
```

Analysis: Run SQL on `history.db`. Use these queries as hints:

```sql
-- chain_summary
WITH RECURSIVE edges AS (
  SELECT v.id AS visit_id, v.from_visit,
         datetime((v.visit_time/1000000)-11644473600,'unixepoch') AS ts_utc,
         COALESCE(c.total_foreground_duration, v.visit_duration)/1000000.0 AS fg_sec,
         (v.transition & 255) AS base_t,
         u.url, u.title,
         CASE WHEN u.url LIKE 'http%'
              THEN substr(u.url, instr(u.url,'://')+3,
                          CASE instr(substr(u.url, instr(u.url,'://')+3),'/')
                               WHEN 0 THEN length(u.url)
                               ELSE instr(substr(u.url, instr(u.url,'://')+3),'/')-1
                          END)
              ELSE u.url
         END AS host
  FROM visits v
  LEFT JOIN context_annotations c ON c.visit_id=v.id
  JOIN urls u ON u.id=v.url
  WHERE (v.transition & 255) NOT IN (3,4)
),
roots AS (
  SELECT e.visit_id
  FROM edges e
  LEFT JOIN edges p ON e.from_visit=p.visit_id
  WHERE e.from_visit IS NULL OR p.visit_id IS NULL
),
walk(root_id, visit_id, from_visit, ts_utc, fg_sec, host, title, depth) AS (
  SELECT r.visit_id AS root_id, e.visit_id, e.from_visit, e.ts_utc, e.fg_sec, e.host, e.title, 1
  FROM roots r JOIN edges e ON e.visit_id = r.visit_id
  UNION ALL
  SELECT w.root_id, e.visit_id, e.from_visit, e.ts_utc, e.fg_sec, e.host, e.title, w.depth+1
  FROM edges e JOIN walk w ON e.from_visit = w.visit_id
)
SELECT root_id AS chain_id,
       count(*) AS visit_count,
       count(DISTINCT host) AS distinct_domains,
       min(ts_utc) AS start_ts_utc,
       max(ts_utc) AS end_ts_utc,
       cast((julianday(max(ts_utc))-julianday(min(ts_utc)))*86400.0 AS integer) AS duration_sec,
       round(sum(CASE WHEN fg_sec>0 THEN fg_sec ELSE 0 END),2) AS total_foreground_sec
FROM walk
GROUP BY root_id
HAVING visit_count >= 3
ORDER BY visit_count DESC;

-- chain_depths.csv
WITH RECURSIVE edges AS (
  SELECT v.id AS visit_id, v.from_visit
  FROM visits v
  WHERE (v.transition & 255) NOT IN (3,4)
),
roots AS (
  SELECT e.visit_id
  FROM edges e
  LEFT JOIN edges p ON e.from_visit=p.visit_id
  WHERE e.from_visit IS NULL OR p.visit_id IS NULL
),
walk(root_id, visit_id, depth) AS (
  SELECT r.visit_id, r.visit_id, 1 FROM roots r
  UNION ALL
  SELECT w.root_id, e.visit_id, w.depth+1
  FROM edges e JOIN walk w ON e.from_visit = w.visit_id
)
SELECT root_id AS chain_id,
       max(depth) AS max_depth,
       count(*) AS total_nodes
FROM walk
GROUP BY root_id
ORDER BY max_depth DESC
LIMIT 5000;
```
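The prompt leaves the actual bucketing to Claude Code, but a rough version falls out of the chain_depths output: treat (total_nodes − 1) / (max_depth − 1) as an average branching factor and apply the thresholds from the prompt. A sketch, with chain_depths standing in for that query's result (e.g. the CSV imported back as a table):

```sql
-- Branching ≈ 1 is a pure linear chain (total_nodes = max_depth); a star
-- of k children under one root scores k. Thresholds mirror the prompt.
SELECT chain_id,
       round((total_nodes - 1.0) / max(max_depth - 1, 1), 2) AS branching,
       CASE
         WHEN (total_nodes - 1.0) / max(max_depth - 1, 1) <= 1.2  THEN 'Linear Spiral'
         WHEN (total_nodes - 1.0) / max(max_depth - 1, 1) <= 6.0  THEN 'Hub & Spoke'
         WHEN (total_nodes - 1.0) / max(max_depth - 1, 1) <= 24.0 THEN 'Wide Survey'
         ELSE 'Other'
       END AS pattern
FROM chain_depths
ORDER BY branching DESC;
```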
But the stories were not very impressive (and also hard to anonymize). So, here are a few observations.
- `rabbit-holes/` would beat `attention-clock/`.