Index
- Introduction - The Spark in the Machine
- 2022 - The Quiet Before the Storm
- Late 2022 - the inflection point
- 2023 - the year the giants ran
- 2024–2025 - When Models Got Eyes, Voice, Memory & Thought
- When AI Stopped Talking and Started Doing
- Open Source Rebellion - The Underdogs Turn Tables
- The Global Chessboard - Where It Stopped Being Just Tech
Introduction - The Spark in the Machine
It's December 2022.
Feels like something massive just happened... and most people don’t even realize it yet.
A few weeks ago - on November 30, 2022 - OpenAI dropped ChatGPT as a “research preview.”1 I remember scrolling through Twitter that day and seeing people mess around with it like it was some new toy. Jokes, poems, code snippets, essays - this thing could just do stuff. And not in a gimmicky way.
Within days, my entire feed was screenshots of conversations, silly prompts, real coding help, and even people having emotional chats with it. 1 million users in 5 days. 100 million in 2 months. That's insane. But it wasn’t just hype. Something real had changed.
What most people don’t see is that this wasn’t just a new product launch - this was the moment AI became... accessible. Tangible. Not some sci-fi plotline or PhD paper. It’s like the tech stepped out of the lab and said, “Hey, I can help.”
But it didn’t come out of nowhere.
Earlier this year, in March 2022, OpenAI released something called InstructGPT - a version of GPT-3 that was trained to follow human instructions better2. It didn’t go viral, but it was important. It made the model more cooperative, more aligned with what we actually want. Basically: less weird, more useful.
That was the prototype for what we now know as ChatGPT.
The real unlock wasn’t just better models - it was wrapping them in a way people could actually use. That changed everything.
And now here we are. End of 2022. Feels like we’re standing on the edge of something. I don’t know what’s coming exactly, but the pace is wild. Generative models, new image AIs, open-source stuff bubbling underneath. People are comparing this to the iPhone moment. Maybe they’re right.
One thing I do know: I want to document this as it unfolds - before we’re too far in to remember what it felt like when it was just beginning.
Welcome to the AI log.
2022 - The Quiet Before the Storm
Before ChatGPT blew the doors open, the year had already been heating up in weirdly quiet ways. It was subtle - unless you were watching closely.
Back in March, OpenAI quietly dropped InstructGPT2, and not many people noticed. But this thing mattered. It wasn’t just another language model - it could follow instructions. That small shift made the whole thing feel more controllable, more useful. Less “alien autocomplete,” more “do what I ask.” It was kind of the first glimpse at what talking to a model could feel like once it actually listened.
Even earlier, back in January, Google Brain had dropped this thing called Chain-of-Thought Prompting3. It was just a research paper - no product, no fanfare. But it basically said: “Hey, if you ask a model to think step-by-step, it gets way better at reasoning.” And it worked - suddenly, these LLMs weren’t just pattern-matchers anymore. They were solving math problems. Logic puzzles. Multi-step reasoning tasks. All from just prompting. That blew my mind.
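The trick is literally just the prompt. A minimal sketch - the question and exact wording here are my own illustrative examples, but the shape is the paper’s idea:

```python
# Chain-of-thought prompting is purely a prompt-side trick: you add a cue
# that nudges the model to emit intermediate reasoning before the answer.
# The question below is illustrative; any multi-step word problem works.

QUESTION = (
    "A cafe sells coffee for $3 and muffins for $2. "
    "If I buy 4 coffees and 3 muffins, how much do I spend?"
)

def plain_prompt(question: str) -> str:
    """Standard prompt: the model tends to jump straight to an answer."""
    return f"Q: {question}\nA:"

def cot_prompt(question: str) -> str:
    """Chain-of-thought prompt: the cue elicits step-by-step reasoning."""
    return f"Q: {question}\nA: Let's think step by step."

print(plain_prompt(QUESTION))
print(cot_prompt(QUESTION))
```

The original paper actually used few-shot exemplars - worked step-by-step solutions pasted into the prompt; the bare “Let’s think step by step.” cue is the zero-shot variant a follow-up paper popularized. Either way, no retraining, no new model - just words.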
Around the same time, DeepMind came in swinging with the Chinchilla paper in March. It basically flipped the script on what people thought made a good language model. Everyone assumed "bigger model = better model." But DeepMind was like: “Nah. If you just use more data and train efficiently, smaller models can outperform massive ones.” Game-changer.
Then came April 2022 - and boom - OpenAI dropped DALL·E 24.
This was when generative AI really started flirting with the mainstream. You typed in “a cyberpunk cat drinking coffee in space” and out came a literal digital painting. The results were... good. Shockingly good. For a second, everyone on the internet turned into an artist - or felt like one.
This was the first time I saw people outside the tech scene start to engage with AI not as a tool, but as a kind of creative collaborator.
And the momentum went back further. In July 2021, DeepMind had open-sourced AlphaFold 2, solving the long-standing protein folding problem in biology. Like, quietly cracking one of science’s biggest mysteries while the rest of us were just playing with robot painters. It hit me then - this isn’t just about making art or writing poems. This tech is going to change medicine, science, code, everything.
Then came August. A weird little startup called Stability AI released Stable Diffusion. Unlike DALL·E, this one was open-source. You could run it on your laptop (if you had a decent GPU), tweak it, fine-tune it, remix it. And people did. Overnight, artists, hackers, hobbyists - everyone jumped in. Reddit threads exploded. Hugging Face servers melted.
This was when things started to feel different. You could sense it: the energy was shifting from “wow, cool tech demo” to “wait... this might actually be a new platform.”
By the time we hit October–November, it was bubbling under the surface. I could feel the storm coming. Something big was about to shift.
And then... ChatGPT happened.
Late 2022 - the inflection point
there’s this tension in the air right now.
like we’re on the edge of something and nobody’s really looking up yet.
but i swear - it’s happening.
august hit and boom - stable diffusion landed.
open-source. free. just there. like, “hey, here’s an insanely good image model. go wild.”
and people did.
someone on reddit made a waifu generator. someone else used it for product mockups.
i tried it and instantly felt weird. like - wait, i didn’t draw this... but i made this?
what even is creativity now?
midjourney v4 came in right after. artsy. stylized. almost too good.
people were posting these dramatic portraits and cinematic scenes like it was nothing.
instagram flooded with AI aesthetics overnight.
i remember someone saying, “this is going to kill concept artists.”
and i laughed. then paused. then didn’t laugh.
but all of this was still niche.
still in the hands of people who look for tech.
then came november 30th.
and the whole world just - snapped awake.
chatgpt.
i still don’t know how to describe what that felt like.
like someone opened a portal.
you type, it replies.
you ask it to explain calculus, write emails, fix bugs, write poems - and it just... does it.
casual. calm. polite. faster than most humans.
not perfect, but shockingly usable.
and it wasn’t just devs this time.
my cousin used it for job prep.
someone's mom was asking it parenting questions.
teachers were lowkey panicking.
productivity bros were writing linkedin posts calling it the future of everything.
this wasn’t an “AI moment.”
this was everyone’s moment.
a mirror held up to the world, and the world was like: “wait... this is real?”
funny thing is - it’s still just GPT-3.5 underneath.
same old base.
just better instruction tuning. more aligned. more helpful.
and they wrapped it in a chatbox. that’s it.
that was the magic trick.
this isn’t the peak. i know that.
it’s the beginning of the real wave.
like - we’re not in the storm yet.
but the wind just changed direction.
2023 - the year the giants ran
january felt like the calm after a glitch in the matrix.
people were still playing with chatgpt like it was a party trick.
but the labs... man, the labs weren’t playing.
late february: meta releases llama to researchers under a closed license.
it wasn’t supposed to leave the lab.
within a week, someone uploads the weights to 4chan and boom - open-source chaos.
it was like throwing jet fuel into a garage full of tinkerers.
people started running LLaMA on local GPUs, optimizing it, fine-tuning it on their own data.
felt like linux, but this time... it talks.
march. everything goes turbo.
openai drops GPT-4 on March 14, 2023.
bigger. smarter. multimodal - it sees images.
people test it on bar exams, SATs, olympiads, coding interviews. it crushes.
reddit floods with GPT-4 flexes.
but it’s still API-only. no open weights.
openai keeps the cards close.
and right then - google panics.
they roll out bard, their chat rival, in March 2023.
honestly? rough launch. kinda meh.
didn’t matter. they had to show they’re in the race.
same month, microsoft goes hard.
they plug GPT-4 into bing. into office.
“copilot” becomes a buzzword.
suddenly your Word doc is writing itself.
your inbox replies before you do.
and meta - poor meta, they didn’t plan the leak but they lean in.
by July, they officially release LLaMA 2, open source, commercial-friendly.
this wasn’t just a tech drop.
this was meta saying:
“f*** it. we’ll let the world build with us.”
meanwhile, anthropic releases claude 2 on July 11, 2023.
sleek, safer, still behind GPT - but they position it as a “more ethical” AI.
funny how ethics became a product feature.
“helpful, harmless, honest” - their whole thing.
backed by amazon money later that year.
yep. the cloud wars are now also AI wars.
and just when people thought the dust was settling - google drops gemini 1.0 on December 6, 2023.
full multimodal. trained from scratch for chat, code, images, everything.
the name bard fades. gemini takes its place.
cleaner branding. way more serious.
every company is sprinting now.
this ain’t research anymore. it’s war.
cloud vs cloud. api vs open weights. safety vs scale.
nobody wants to be the next blockbuster in the age of netflix.
and the open-source kids? they're catching up fast.
i don’t know who’s winning.
but i know everyone’s building.
and this whole year felt like the world was trying to catch up to what chatgpt showed in december.
from now on, every month is gonna feel like a year.
2024–2025 - When Models Got Eyes, Voice, Memory & Thought
It’s mid-2024.
And honestly… it’s getting hard to keep up.
Just a few months ago - in May 2024, OpenAI launched GPT‑4o, and it flipped everything again.
This wasn’t just another chat model - it could see, hear, speak, and understand visuals all in the same conversation. No more bolted-on Whisper or CLIP modules. GPT-4o was trained as a single, unified transformer across text, images, and audio5.
What blew my mind wasn’t just the capabilities - it was the latency.
Ask it a question out loud and it replies in around 320 milliseconds on average - human conversational speed5.
It could translate across languages live, describe your facial expression, or walk you through a math problem from your notebook - with vision and voice on.
And it’s cheaper than GPT-4 Turbo. Faster too.
Half the price. Same (or better) performance in many benchmarks5.
But this wasn’t even the peak.
In April, Meta released LLaMA 3, pushing open-source closer to parity.
Smaller models, trained efficiently, performing on par with GPT‑3.5 and nipping at GPT‑4’s heels6.
Still no vision or speech, but people were fine-tuning it on laptops and running local inference like it's normal.
Then came September 2024, and OpenAI dropped the preview of something different: o1 - their new “reasoning-focused” model.
Unlike 4o, this wasn’t about speed. It was about thought.
Instead of guessing the next word, it plans. Reflects. Runs different solution paths internally before choosing an answer7.
I used it on a coding problem once - and I swear, it paused.
Then it gave me the cleanest explanation I’ve ever seen. Like a tutor that thinks silently, checks its steps, and only then speaks.
That’s not just fine-tuning. That’s agentic behavior - planning, backtracking, rejecting false starts7.
Even the smaller version, o1-mini, is shockingly capable. Much cheaper, runs fast, and still outperforms GPT‑4 on complex logic problems8.
I’m not just talking to a model anymore.
I’m collaborating with something that reasons - like actually reasons - before answering.
And OpenAI’s not keeping this in the lab.
They’re shipping it into Azure, into GitHub, into enterprise apps - plugging it into real workflows.
Forecasting. Debugging. Planning. Thinking.
Meanwhile, people are still tinkering with local LLaMA builds.
Sora (OpenAI’s text-to-video model) is also floating in the background - it can generate stunning minute-long scenes from a text prompt9.
The wild part? We’re not even in AGI territory.
But somehow, everything feels smarter.
Like something underneath it all clicked.
And the pace? Still accelerating.
When AI Stopped Talking and Started Doing
this one’s been brewing for a while.
in early 2023, people were already stitching GPT into little task runners.
langchain, auto-gpt, babyagi - all over GitHub. half the repos barely worked, but it didn’t matter.
the idea was out: what if the model could not only answer, but act?
fast forward to 2024 - and now it does.
openai quietly starts testing memory in ChatGPT around February 202410.
you tell it something in one chat, and it remembers it later.
personal facts, writing style, what projects you’re working on - all stick.
it even starts proactively helping: suggesting next steps, flagging issues in code, updating your notes.
the groundwork was laid earlier - in november 2023, they rolled out custom GPTs with tools and actions11.
people start building AI assistants that browse the web, run code, call APIs, fetch documents, summarize PDFs, book flights10.
and the wild part? it’s all low code. a few config clicks, and now you have a mini agent that does things.
the whole vibe shifts.
this isn’t just chat anymore. this is interaction + execution.
agents… for real this time
some devs build smart workflows that combine tools + long context + memory + feedback loops.
think:
- read multiple PDFs
- extract tasks
- organize them in Notion
- assign deadlines
- email you a plan
with barely any glue code.
and it’s not just hobbyists.
Open Interpreter12, Cognosys, Superagent13, CrewAI - these agents can run CLI commands, browse docs, write and run scripts.
they’re not just giving suggestions - they’re making moves.
we’re not typing prompts anymore.
we’re delegating.
context explosion
meanwhile, context windows go nuts.
claude 2.1 ships a 200K-token window in November 202314. gemini 1.5 hits 1 million in February 202415.
you can feed it your entire Slack history, GitHub repo, and a 500-page spec - and it remembers.
context became memory. memory became awareness.
these new tools don’t just react. they observe.
it’s weirdly exciting.
and also kind of creepy.
but one thing’s clear:
this isn’t a chatbot anymore. it’s a co-worker.
maybe even an assistant.
maybe even… an agent.
Open Source Rebellion - The Underdogs Turn Tables
It’s early 2025 and the big labs still dominate the headlines - but quietly, something huge is bubbling outside. Open-source models are suddenly not just “cheaper alternatives” - they’re vaulting ahead in reasoning, flexibility, and raw access.
Enter DeepSeek‑R1. This thing landed like a bomb in January 202516.
It was trained with an almost wild approach: pure reinforcement learning on top of a base model, pushed to reason step-by-step without huge amounts of human-labeled data16. The predecessor, DeepSeek‑R1‑Zero, taught itself to pause, backtrack, and even declare an “aha moment” when it found better logic mid-thought17.
Then they layered in a small supervised dataset (“cold-start data”), polished it with RL, distilled the logic into six smaller models (1.5B–70B parameters), and open‑sourced everything under the MIT license16,18.
Here’s the kicker: it uses a Mixture-of-Experts (MoE) architecture. There are 671B parameters on paper, but only ~37B are activated for any given token. That means cheaper chips and leaner cost - reportedly a fraction of the GPU budget Meta’s engineers burn, with training measured in weeks, not months19,20.
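The routing idea is simple enough to sketch. Toy numbers below (8 experts, top-2, random weights) - DeepSeek’s real config has far more experts - but the mechanics are the same:

```python
import numpy as np

# Toy Mixture-of-Experts layer: a router scores every expert for a token,
# but only the top-k experts actually run. That is how a model can carry
# huge total parameters while activating only a small slice per token.
# (8 experts / top-2 here; DeepSeek's production config is much larger.)

rng = np.random.default_rng(0)
DIM, N_EXPERTS, TOP_K = 16, 8, 2

experts = [rng.standard_normal((DIM, DIM)) for _ in range(N_EXPERTS)]
router = rng.standard_normal((DIM, N_EXPERTS))   # router projection

def moe_forward(x: np.ndarray) -> tuple[np.ndarray, list[int]]:
    logits = x @ router                           # score all experts
    top = np.argsort(logits)[-TOP_K:]             # keep only the top-k
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # renormalize
    y = sum(g * (x @ experts[i]) for g, i in zip(gates, top))
    return y, [int(i) for i in top]               # output + which experts ran

token = rng.standard_normal(DIM)
out, active = moe_forward(token)
print(f"activated {len(active)}/{N_EXPERTS} experts: {sorted(active)}")
```

Scale those toy numbers up and you get the headline ratio: all 671B parameters exist, but each token only pays for the ~37B that the router switches on.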
And people noticed.
Everyone was fine‑tuning R1 into smaller Qwen‑based or LLaMA‑based models. It outperformed many private models at math, coding, and reasoning benchmarks - for a fraction of the price18. One blog put it:
“DeepSeek‑R1’s distilled 14B model sets new records on reasoning benchmarks. Smaller models replicate advanced reasoning with astonishing fidelity.”21
Meta’s LLaMA was already open and efficient-but it wasn’t optimized for reasoning out of the box. You had to coax it into chain-of-thought. DeepSeek gave it freely. Everyone could see and debug the reasoning steps. Transparency became a feature.
The reaction? Uncomfortable.
In the US, pundits and policymakers started comparing DeepSeek to TikTok - citing national security concerns, bias in outputs, and opaque training sources22. OpenAI’s Sam Altman admitted DeepSeek was impressive but questioned whether its cost claims were real - some reports said the company had spent over $1 billion on GPUs23.
Still, the community leaned into it. A Reddit thread captured the mood:
“DeepSeek’s rise stems from algorithmic innovation, cost‑efficiency, and bypassing hardware dependencies”24.
Some worried about biases-like pro‑CCP narratives observed in certain outputs. But others said, “this one’s ours-inspectable, modifiable, buildable”24.
Then there’s Meta’s LLaMA line: originally released in early 2023 (7B to 65B models), trained on public datasets, and more efficient than GPT-3 in performance25. It ignited the open‑source AI movement. People built llama.cpp, quantized models, ported them to mobile devices - true democratization23.
And the talent behind LLaMA started drifting. In April 2023, several of its key researchers left Meta to co-found Mistral, now a serious open-source rival26. LLaMA’s early lead eroded as newer models like DeepSeek and Qwen surpassed it in raw reasoning ability.
Why was open source “rebelling” so quiet but powerful?
- Fewer guardrails, more flexibility - fine-tune anything, adapt it to your domain, no costly licenses.
- Transparent chain-of-thought - debug errors, correct logic, or repurpose reasoning traces.
- Low cost & hardware-efficiency - MoE + distillation + group-based RL enabled math-level reasoning without massive GPU farms.
- Rapid iteration by global developers - if someone found a bug, they’d fork and fix it overnight.
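That “group-based RL” bullet refers to GRPO-style training: sample a group of answers per prompt, score them with a cheap verifier, and use each answer’s reward relative to its own group as the advantage - no separate value network. The core computation is tiny (toy rewards, illustrative only):

```python
# Group-relative advantages, the heart of GRPO-style RL: instead of a
# learned value critic, normalize each sampled answer's reward against
# the mean and std of its own group.

def group_advantages(rewards: list[float]) -> list[float]:
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5 or 1.0          # avoid div-by-zero on uniform groups
    return [(r - mean) / std for r in rewards]

# One prompt, a group of 4 sampled answers, scored by a cheap verifier
# (e.g. 1.0 if the final answer checks out, 0.0 otherwise):
rewards = [1.0, 0.0, 0.0, 1.0]
print(group_advantages(rewards))  # → [1.0, -1.0, -1.0, 1.0]
```

These advantages then weight the usual policy-gradient update: correct reasoning traces get reinforced, wrong ones pushed down - which is exactly why math-style tasks with checkable answers were the sweet spot.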
this isn’t just a story of labs vs open.
it’s labs finally getting scared of what open can do.
We’re all watching this one unfold.
The Global Chessboard - Where It Stopped Being Just Tech
it’s mid-2025.
i’m not even sure we’re talking about “AI” anymore.
at this point, it feels like we’re talking about infrastructure, control, and who’s allowed to imagine the future.
🇺🇸 the U.S. - too many hands on the steering wheel
america's strategy around AI is starting to feel confused. whiplash-y.
biden was all-in on “safe and trustworthy AI” last year - signed that executive order (EO 14110) on October 30, 2023, which basically asked every agency to regulate itself and submit risk reports27. good in theory, slow in practice.
then trump comes back - and throws the whole thing out. new EO this january (14179), and suddenly we’re back to “unleash american innovation,” chip investments, cutting red tape. it’s a vibe shift. less safety, more speed28.
some people are hyped. “america first AI.”
others - mostly researchers i follow - are freaking out. saying this will lead to reckless deployment, that we’re just racing blindly now.
honestly? both sides are probably right.
🇨🇳 china - way more organized than people think
china isn’t doing press releases - they’re building.
i mean it - entire industrial zones getting converted into GPU farms, foundries, datacenters, deepfake detection labs.
their Ascend 910C chip (mass production targeted for early 2025) reportedly hits around 90% of H100 performance - and they’re selling it domestically for a third of the price29.
then you realize they don’t care about global huggingface leaderboards. they’re building tools for themselves: censorship-compliant models, domestic copilots, military sims. the government even proposed a Global AI Org headquartered in shanghai on July 26, 202530.
what freaked me out?
the smuggling.
people were literally shipping H100s into china in suitcases. Real black market energy, with over $1 billion worth smuggled between April and June 2025 alone31.
biden tried to clamp down - then trump walked it back in mid-2025 and opened up H20 exports again32.
now it’s like… we’re feeding the thing we said we wanted to starve.
🇪🇺 europe - mean well, move slow
the EU passed the AI Act in March 202433, and it’s the most “complete” regulation out there.
they have tiers: unacceptable risk, high-risk, general-use.
but it reads like something made by lawyers who don’t know how these systems actually work33.
don’t get me wrong, i like that someone is trying to put structure in place.
but the vibe from devs is: “this will kill open-source here.”
like - how do you regulate a GitHub repo? or a model someone fine-tuned in their bedroom?
india + the rest - trying to catch the wave
india’s making moves quietly. i read they’re running military LLMs for drone strategy and battlefield mapping, with plans to integrate AI, ML, and Big Data into most operations by 2026-202734.
not surprising - the talent’s there. chips? not so much.
every country now has to decide: do we build our own stack or rent the american one?
my take?
this doesn’t feel like a tech competition anymore.
this feels like a control-layer war.
who gets to define what’s “aligned”?
whose values shape the base models?
who controls the chips, the APIs, the defaults?
the real fight isn’t about prompts.
it’s about plumbing - chips, energy, bandwidth, narrative.
and no matter who wins…
i’m not sure the rest of us get a say.
Footnotes
1. ChatGPT - Wikipedia, wikipedia.org/wiki/ChatGPT
2. AI Timeline - The Road to AGI, ai-timeline.org
3. Chain-of-Thought Prompting - Google Brain (January 2022), arxiv.org/abs/2201.11903
4. DALL·E 2 - Wikipedia, wikipedia.org/wiki/DALL-E
5. OpenAI GPT-4o System Card, cdn.openai.com/gpt-4o-system-card.pdf
6. Meta LLaMA 3 Technical Overview, ai.meta.com/blog/meta-llama-3/
7. OpenAI: Introducing o1 Reasoning Models, openai.com/index/introducing-openai-o1-preview/
8. o1-mini and the Chain-of-Thought Architecture, community.openai.com/t/new-reasoning-models-openai-o1-preview-and-o1-mini/938081
9. OpenAI’s Sora Video Model Announcement, openai.com/sora
10. OpenAI: Memory and Custom GPTs, openai.com/blog/chatgpt-memory
11. OpenAI: GPTs can now use tools, openai.com/blog/introducing-gpts
12. Open Interpreter - GitHub, github.com/KillianLucas/open-interpreter
13. Superagent - AI Agents Platform, superagent.sh
14. Anthropic Claude 3 - 200K context, www.anthropic.com/index/introducing-claude-3
15. Google Gemini 1.5 - 1M token context, deepmind.google/technologies/gemini/gemini-1-5/
16. DeepSeek‑R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning - training pipeline (R1‑Zero with pure RL + cold‑start SFT; MoE architecture, reflection behaviors), arxiv.org, unfoldai.com, deepseekr1.app
17. Distilled R1 models (1.5B–70B) based on Qwen and LLaMA architectures outperform OpenAI’s o1-mini benchmarks and run on consumer GPUs, semiengineering.com, searchenginejournal.com
18. Yann LeCun: open-source models outperforming proprietary ones, businessinsider.com
19. MoE architecture details: 671B total params, ~37B activated per inference; ~55-day training on 2,000 H800 GPUs, cost ~$6M USD, searchenginejournal.com, windowscentral.com
20. Sam Altman comments on cost and data provenance concerns, windowscentral.com
21. NVIDIA releases OpenReasoning–Nemotron distilled from R1, setting new reasoning records, winbuzzer.com
22. Meta’s LLaMA team defections to Mistral weaken Meta’s open-source pace, digitimes.com
23. Sam Altman called DeepSeek impressive but questioned its cost claims; some reports put its GPU spend over $1 billion, windowscentral.com
24. Reddit thread: “DeepSeek’s rise stems from algorithmic innovation, cost‑efficiency, and bypassing hardware dependencies”, businessinsider.com
25. Meta’s LLaMA line: released in early 2023 (7B to 65B models), trained on public datasets, more efficient than GPT-3 in performance, digitimes.com
26. Mistral: founded in April 2023 by key researchers who left Meta’s LLaMA team, winbuzzer.com
27. Biden’s AI Executive Order (EO 14110), www.whitehouse.gov/briefing-room/statements-releases/2023/10/30/fact-sheet-president-biden-issues-executive-order-on-safe-artificial-intelligence/
28. Trump’s AI Deregulation Push (EO 14179), www.reuters.com/legal/government/trump-outline-ai-priorities-amid-tech-battle-with-china-2025-07-23/
29. Huawei’s Ascend 910C vs NVIDIA H100, www.tomshardware.com/news/huaweis-ascend-910c-ai-chip-nearly-matches-nvidia-h100
30. China Proposes Global AI Governance Org, www.scmp.com/news/china/diplomacy/article/3256931/china-proposes-global-ai-organization-headquartered-shanghai
31. Smuggled H100s Flood China Despite Bans, www.ft.com/content/ea803121-196f-4c61-ab70-93b38043836e
32. Trump Resumes NVIDIA H20 Exports to China, www.vox.com/future-perfect/419791/trump-nvidia-h20-china-ai-chip
33. EU AI Act Summary, en.wikipedia.org/wiki/Artificial_Intelligence_Act
34. India’s Army & BEL Partner for AI Combat Tech, www.defensenews.com/global/asia-pacific/2024/05/15/indias-army-partners-with-bel-on-ai-drones/