<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI on Hongjiang Bao's Blog</title><link>http://baohongjiang.com/en/tags/ai/</link><description>Recent content in AI on Hongjiang Bao's Blog</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Thu, 23 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="http://baohongjiang.com/en/tags/ai/index.xml" rel="self" type="application/rss+xml"/><item><title>Standing in 2026 — The Next Stage of How I Use AI</title><link>http://baohongjiang.com/en/p/standing-in-2026-the-next-stage-of-how-i-use-ai/</link><pubDate>Thu, 23 Apr 2026 00:00:00 +0000</pubDate><guid>http://baohongjiang.com/en/p/standing-in-2026-the-next-stage-of-how-i-use-ai/</guid><description>&lt;h2 id="foreword"&gt;Foreword
&lt;/h2&gt;&lt;p&gt;Around this time last year I wrote &lt;a class="link" href="http://baohongjiang.com/en/p/standing-in-2025-looking-back-at-ai-and-looking-forward/" &gt;Standing in 2025 — Looking Back at AI, and Looking Forward&lt;/a&gt;. My take back then was: &lt;strong&gt;AI is an amplifier — own the framework, hand the details to AI.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A year on, that conclusion isn&amp;rsquo;t &lt;em&gt;wrong&lt;/em&gt;. It&amp;rsquo;s just no longer enough.&lt;/p&gt;
&lt;p&gt;Because &amp;ldquo;have AI handle the details&amp;rdquo; has itself splintered into multiple wildly different layers. The productivity gap between people stuck on different layers is widening fast.&lt;/p&gt;
&lt;p&gt;And I&amp;rsquo;ve personally hit a new bottleneck. This post is me writing down how I see that bottleneck and what I think the next stage is.&lt;/p&gt;
&lt;h2 id="my-five-stages-of-using-ai"&gt;My five stages of using AI
&lt;/h2&gt;&lt;p&gt;A quick recap of how my own usage has evolved:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Form&lt;/th&gt;
&lt;th&gt;Representative&lt;/th&gt;
&lt;th&gt;What I do&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;L1&lt;/td&gt;
&lt;td&gt;Web chat&lt;/td&gt;
&lt;td&gt;ChatGPT web&lt;/td&gt;
&lt;td&gt;Paste question in, paste answer out&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L2&lt;/td&gt;
&lt;td&gt;IDE plugin&lt;/td&gt;
&lt;td&gt;GitHub Copilot&lt;/td&gt;
&lt;td&gt;AI completes alongside me; I&amp;rsquo;m in the driver&amp;rsquo;s seat&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L3&lt;/td&gt;
&lt;td&gt;AI-native IDE&lt;/td&gt;
&lt;td&gt;Cursor&lt;/td&gt;
&lt;td&gt;AI edits multiple files; I review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L4&lt;/td&gt;
&lt;td&gt;Terminal-native agent&lt;/td&gt;
&lt;td&gt;Claude Code&lt;/td&gt;
&lt;td&gt;AI can touch my whole machine; I confirm via dialogue&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;L5&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;td&gt;?&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;The throughline is one thing: &lt;strong&gt;AI&amp;rsquo;s operating boundary keeps expanding, and humans are progressively freed from layer after layer of concrete operation.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Each leap is an order-of-magnitude jump.&lt;/p&gt;
&lt;h2 id="the-bottleneck-top-experts-cant-get-enough-out-of-three-max-20xs-i-cant-even-use-up-one-pro"&gt;The bottleneck: top experts can&amp;rsquo;t get enough out of three Max 20x&amp;rsquo;s; I can&amp;rsquo;t even use up one Pro
&lt;/h2&gt;&lt;p&gt;I just bought Claude&amp;rsquo;s Max 5x. The result: &lt;strong&gt;I can&amp;rsquo;t even fully use it.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Meanwhile, some top developers publicly say they can&amp;rsquo;t get enough capacity out of running three Max 20x accounts at the same time.&lt;/p&gt;
&lt;p&gt;That contrast made me stop and think — same tools, why can they burn through ten times the compute? Experts this strong, even &lt;em&gt;they&lt;/em&gt; don&amp;rsquo;t have enough? Where&amp;rsquo;s the actual gap? &lt;strong&gt;How do I close it?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Thinking it through, the answer is clear:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;I&amp;rsquo;m the bottleneck.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;For every task I&amp;rsquo;m still going back and forth with Claude Code:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&amp;ldquo;Search for related code first before changing anything&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;That approach isn&amp;rsquo;t right, try a different angle&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Run the tests first&amp;rdquo;&lt;/li&gt;
&lt;li&gt;&amp;ldquo;Confirm before committing&amp;rdquo;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I&amp;rsquo;m running 3–5 Agents in parallel and my brain is already taxed. Human context-switching has a cost; at most you can manage 5–7.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s a strange feeling — &lt;strong&gt;I&amp;rsquo;m holding a rocket but I&amp;rsquo;m still shifting gears one at a time.&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id="the-next-stage-from-operator-to-legislator"&gt;The next stage: from &amp;ldquo;operator&amp;rdquo; to &amp;ldquo;legislator&amp;rdquo;
&lt;/h2&gt;&lt;p&gt;How do you break this bottleneck?&lt;/p&gt;
&lt;p&gt;After working through Anthropic&amp;rsquo;s public docs, the workflow notes from Boris Cherny (the creator of Claude Code), and a bunch of heavy-user practices, my conclusion is:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The next stage isn&amp;rsquo;t &amp;ldquo;manage more Agents.&amp;rdquo; It&amp;rsquo;s that you stop personally managing Agents at all.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;That sounds abstract, but unpack it and it&amp;rsquo;s very concrete:&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;L4 (where I am)&lt;/th&gt;
&lt;th&gt;L5 (next stage)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Am I in the loop?&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;What do I do?&lt;/td&gt;
&lt;td&gt;Talk, confirm, correct&lt;/td&gt;
&lt;td&gt;Write rules, define acceptance, arbitrate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;When does AI stop?&lt;/td&gt;
&lt;td&gt;When I say stop&lt;/td&gt;
&lt;td&gt;When the rule says it&amp;rsquo;s done&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;If I leave the keyboard for 8 hours&lt;/td&gt;
&lt;td&gt;The system halts, waiting for me&lt;/td&gt;
&lt;td&gt;The system has been making progress for 8 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;p&gt;&lt;strong&gt;The single test for L5: when you walk away from the keyboard for 8 hours and come back, has the system stopped waiting for you, or has it already finished its run?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In L5, the focus of your work fundamentally changes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Write &lt;strong&gt;specs&lt;/strong&gt;, not prompts&lt;/li&gt;
&lt;li&gt;Define acceptance criteria (tests, lint, human-review checkpoints), don&amp;rsquo;t review every step&lt;/li&gt;
&lt;li&gt;Design &lt;strong&gt;hooks and guardrails&lt;/strong&gt;, not confirmation buttons&lt;/li&gt;
&lt;li&gt;Build a &lt;strong&gt;feedback loop&lt;/strong&gt; (failures auto-feed back to the AI to keep iterating), don&amp;rsquo;t manually retry&lt;/li&gt;
&lt;/ul&gt;
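The four bullets above can be sketched as a single loop. This is a hypothetical sketch, not a real agent API: `run_agent` and `accept` are invented stand-in callables (in practice they might shell out to a coding-agent CLI and run the test suite), and the point is that a rule, not a confirmation dialogue, decides when the work is done.

```python
# Hypothetical sketch of the L5 loop: an acceptance rule (`accept`), not a
# human, decides when the agent is done, and failures feed back automatically.
# `run_agent` and `accept` are illustrative stand-ins, not a real agent API.

def drive(task, run_agent, accept, max_rounds=5):
    """Run an agent on `task` until the acceptance rule passes.

    Returns the round number that passed, or None if the budget is
    exhausted -- which is the signal to escalate to a human.
    """
    feedback = None
    for round_no in range(1, max_rounds + 1):
        run_agent(task, feedback)   # agent does the work, prior failures as context
        ok, report = accept()       # acceptance rule: tests, lint, checks...
        if ok:
            return round_no         # the rule says "done" -- no confirmation step
        feedback = report           # auto-feed the failure back; no manual retry
    return None                     # rule never passed: flag red for human judgment
```

A human only ever sees the `None` cases; everything else finishes, or retries, on its own.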
&lt;p&gt;You open your laptop in the morning, and what you see is no longer &amp;ldquo;Claude is waiting on me to approve something.&amp;rdquo; It&amp;rsquo;s: &lt;strong&gt;&amp;ldquo;Out of last night&amp;rsquo;s 12 PRs, 9 already passed automated acceptance, 3 are flagged red for me to judge.&amp;rdquo;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The only thing you do: &lt;strong&gt;look at those 3 reds, and patch the rule that let them turn red.&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id="im-already-crawling-toward-it-what-l45-actually-feels-like"&gt;I&amp;rsquo;m already crawling toward it: what L4.5 actually feels like
&lt;/h2&gt;&lt;p&gt;L5 sounds far off, but my L4 has actually been transitioning into L4.5.&lt;/p&gt;
&lt;p&gt;The feeling is already different. Claude Code can run remotely, run in the background; I&amp;rsquo;m not staring at every line it writes. Most of my day is spent &lt;strong&gt;looking at the reports it hands me&lt;/strong&gt;, then doing a few things:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Judge whether the direction is right&lt;/li&gt;
&lt;li&gt;Decide: keep going, switch approach, or send it back to redo&lt;/li&gt;
&lt;li&gt;Make calls and decisions at key checkpoints&lt;/li&gt;
&lt;li&gt;Handle whatever it can&amp;rsquo;t crack and gets stuck on&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Honestly, I&amp;rsquo;m now more like a remote-control manager than a programmer.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I didn&amp;rsquo;t write the code, but &lt;strong&gt;I set the direction, I drew the boundaries, I judged the quality&lt;/strong&gt;. It&amp;rsquo;s a strange state — the rhythm has changed completely. I get a noticeably larger amount done in a day, but every individual decision carries far more weight. One bad direction can torch hours of downstream output.&lt;/p&gt;
&lt;p&gt;This still isn&amp;rsquo;t L5. L5 is when even the &amp;ldquo;look at the reports&amp;rdquo; step is partly handed off — the rules auto-filter 80% of the content, and I only look at the remaining 20%. But this transition state, L4.5, is already enough to make me feel it: &lt;strong&gt;managing AI feels nothing like writing code yourself. They are two completely different jobs.&lt;/strong&gt;&lt;/p&gt;
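That 80/20 filtering step can be sketched as rule-based triage: each rule is a predicate over an agent's report, and only reports that fail a rule reach the human. A minimal hypothetical sketch, with invented report fields (`tests_passed`, `diff_lines`, `touches_migrations`) rather than any real schema:

```python
# Hypothetical sketch of "rules auto-filter 80%": reports that pass every
# rule are handled automatically; the rest are queued for human judgment.
# The report fields below are invented for illustration.

def triage(reports, rules):
    """Split reports into (auto_handled, needs_human) using rule predicates."""
    auto, manual = [], []
    for report in reports:
        if all(rule(report) for rule in rules):
            auto.append(report)     # every rule passed: no human needed
        else:
            manual.append(report)   # at least one red flag: human judges
    return auto, manual

# Example rules over a report dict (fields are assumptions, not a real schema):
rules = [
    lambda r: r["tests_passed"],              # acceptance suite is green
    lambda r: r["diff_lines"] < 400,          # small enough to trust unreviewed
    lambda r: not r["touches_migrations"],    # risky areas always go to a human
]
```

Patching "the rule that let them turn red" then means editing this list, not re-reviewing every report.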
&lt;h2 id="the-productivity-gap-is-widening"&gt;The productivity gap is widening
&lt;/h2&gt;&lt;p&gt;This is the thing I most want to call out: &lt;strong&gt;the productivity gap between people stuck on different stages is widening at an order-of-magnitude rate.&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;L1&lt;/strong&gt; people use AI to answer questions and look stuff up. Limited gains, but real.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;L2 / L3&lt;/strong&gt; people fold AI into their code-editing flow. Multiple times more efficient than L1.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;L4&lt;/strong&gt; people get AI to run and complete a whole task autonomously. Multiple times more efficient again.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;L5&lt;/strong&gt; people have a dozen tasks running in parallel and only do arbitration and legislation. Another order of magnitude on top.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The thresholds between these layers aren&amp;rsquo;t linear. L1 → L2 is easy: install a plugin. L4 → L5 is &lt;em&gt;very&lt;/em&gt; hard — it requires you to redesign the entire shape of your work. &lt;strong&gt;You stop being &amp;ldquo;the person using the tool&amp;rdquo; and become &amp;ldquo;the person designing the rules for using the tool.&amp;rdquo;&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id="what-it-asks-of-you-both-knowledge-density-and-total-knowledge"&gt;What it asks of you: both knowledge density &lt;em&gt;and&lt;/em&gt; total knowledge
&lt;/h2&gt;&lt;p&gt;Here&amp;rsquo;s the most counter-intuitive part.&lt;/p&gt;
&lt;p&gt;A lot of people assume the AI era lowers the bar — &amp;ldquo;AI handles details, I just need a sketch.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Wrong. The exact opposite.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The AI era &lt;strong&gt;raises&lt;/strong&gt; the bar — and on &lt;strong&gt;two axes at once&lt;/strong&gt;:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Total knowledge — you need to know more.&lt;/strong&gt; Judging direction, drawing boundaries, arbitrating: every one of those decisions stands on actually understanding the domain. If you only know a tech stack at a surface level, you literally cannot tell that the code AI wrote is bad.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Knowledge density — your judgments per unit time must be higher and sharper.&lt;/strong&gt; In L5, your day looks like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;10 minutes reading a PR summary, decide whether to merge&lt;/li&gt;
&lt;li&gt;5 minutes reading a failure report, decide whether to fix code or fix the rule&lt;/li&gt;
&lt;li&gt;20 minutes writing a spec that defines a feature&amp;rsquo;s acceptance edges&lt;/li&gt;
&lt;li&gt;30 minutes reviewing a new rule, judging whether it&amp;rsquo;ll friendly-fire other tasks&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Every action is a decision; nothing is &amp;ldquo;execution.&amp;rdquo;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In less time you have to make more decisions, more accurately: see at a glance where the AI will derail; while writing the spec, predict where the boundary will explode; the moment you see a new rule, know what it&amp;rsquo;ll friendly-fire.&lt;/p&gt;
&lt;p&gt;These are exactly the abilities AI cannot replace. Because at their core they&amp;rsquo;re &lt;strong&gt;experience + taste + judgment&lt;/strong&gt;, not information processing.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;My read: &lt;strong&gt;second half of this year through the first half of next year&lt;/strong&gt;, the first genuinely native L5 product will land — possibly Anthropic itself, possibly a third party on top of the Claude API. By that point, the gap between people will be wider still.&lt;/p&gt;</description></item><item><title>Standing in 2025 — Looking Back at AI, and Looking Forward</title><link>http://baohongjiang.com/en/p/standing-in-2025-looking-back-at-ai-and-looking-forward/</link><pubDate>Wed, 12 Mar 2025 00:00:00 +0000</pubDate><guid>http://baohongjiang.com/en/p/standing-in-2025-looking-back-at-ai-and-looking-forward/</guid><description>&lt;p&gt;Foreword:
It&amp;rsquo;s now 2025, and AI is white-hot. In this post I want to share my personal take on AI — what I understand, what I expect, and how it&amp;rsquo;s helping my career and life.&lt;/p&gt;
&lt;h2 id="looking-at-the-present-from-the-past-and-predicting-the-future-from-now"&gt;Looking at the present from the past, and predicting the future from now.
&lt;/h2&gt;&lt;h3 id="an-article-from-january-2015-that-predicted-todays-ai"&gt;An article from January 2015 that predicted today&amp;rsquo;s AI
&lt;/h3&gt;&lt;p&gt;On &lt;strong&gt;January 27, 2015&lt;/strong&gt;, an article was published that completely upended my view of AI. Here&amp;rsquo;s the (Chinese-translated) link:
&lt;a class="link" href="https://zhuanlan.zhihu.com/p/19950456" target="_blank" rel="noopener"
&gt;Artificial Intelligence may very well lead to humanity&amp;rsquo;s immortality or extinction — and it&amp;rsquo;s quite possible all of this will happen within our lifetimes&lt;/a&gt; (in Chinese)
If you&amp;rsquo;re interested, please read the whole thing! I strongly recommend it!
In this post I&amp;rsquo;ll just lift two of its conclusions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Human technological progress is exponential!
&lt;img src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image.png"
width="1120"
height="688"
srcset="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image_hu_c739e075b9c35979.png 480w, http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image_hu_2a9dba521881dfda.png 1024w"
loading="lazy"
alt="alt text"
class="gallery-image"
data-flex-grow="162"
data-flex-basis="390px"
&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;AI&amp;rsquo;s growth could leap past human cognition in an instant!
&lt;img src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-1.png"
width="1376"
height="1124"
srcset="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-1_hu_408c1a35966e6ad.png 480w, http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-1_hu_c11b3f0903ed8ea4.png 1024w"
loading="lazy"
alt="alt text"
class="gallery-image"
data-flex-grow="122"
data-flex-basis="293px"
&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;At the time, most people would have called this nonsense. After all, in January 2015 OpenAI didn&amp;rsquo;t even exist. Look at the timeline:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;span class="lnt"&gt;3
&lt;/span&gt;&lt;span class="lnt"&gt;4
&lt;/span&gt;&lt;span class="lnt"&gt;5
&lt;/span&gt;&lt;span class="lnt"&gt;6
&lt;/span&gt;&lt;span class="lnt"&gt;7
&lt;/span&gt;&lt;span class="lnt"&gt;8
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2015 — OpenAI founded, focused on AGI.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2016 — AlphaGo beats Lee Sedol; AI surpasses humans at Go.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2017 — Transformers (Google) ignite the NLP revolution and become the foundation for GPT, BERT, etc.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2019 — GPT-2 (OpenAI) released, showing strong text generation.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2020 — GPT-3 (175B parameters) released, kicking off the AIGC craze.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2022 — ChatGPT (GPT-3.5) released and instantly explodes.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2023 — GPT-4 released, far more capable, multimodal (text + images).
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;2023 onward — AI products everywhere.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;If you read that article, you should be a little stunned right now. Its predictions are almost spot on — incredibly forward-looking!
&lt;strong&gt;Ten years ago, when the internet had only just gone mainstream and most people had only just gotten a smartphone, the author already predicted, accurately, where AI would be a decade later.&lt;/strong&gt;
Once again, I strongly recommend everyone who hasn&amp;rsquo;t read it go read it.&lt;/p&gt;
&lt;p&gt;History has confirmed the article&amp;rsquo;s accuracy. So let&amp;rsquo;s see what that 2015 article predicted for &lt;em&gt;after&lt;/em&gt; 2025.
&lt;img src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-2.png"
width="1190"
height="979"
srcset="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-2_hu_1a87f551a058d522.png 480w, http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-2_hu_99fd9718dc53ef8b.png 1024w"
loading="lazy"
alt="alt text"
class="gallery-image"
data-flex-grow="121"
data-flex-basis="291px"
&gt;
The article splits AI into three stages: Artificial Narrow Intelligence (ANI), Artificial General Intelligence (AGI), and Artificial Super Intelligence (ASI). I think most of us would agree that we&amp;rsquo;ve now reached the AGI stage.&lt;/p&gt;
&lt;p&gt;My personal take:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The moment ASI arrives, its intelligence will, in an instant, dwarf the sum of all human intelligence — growing at unbelievable, absurd exponential rates. Humanity will then either be extinguished or made immortal.&lt;/li&gt;
&lt;li&gt;The most optimistic estimate: ASI in 2030. Conservative: 2050. Pessimistic: 2080, or never.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All of today&amp;rsquo;s AI is essentially modeled on the human brain. Look at this from a compute angle:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt;1
&lt;/span&gt;&lt;span class="lnt"&gt;2
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;As of 2024, the world&amp;#39;s fastest supercomputer (e.g. Frontier) has reached 1.2 EFLOPS (1.2 × 10¹⁸ FLOPS).
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Because the human brain doesn&amp;#39;t run like a computer, direct comparisons are hard, but typical estimates put it at 1 EFLOPS to 100 EFLOPS (10¹⁸ to 10²⁰ FLOPS).
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;As you can see, today&amp;rsquo;s strongest compute is approaching the brain&amp;rsquo;s. I&amp;rsquo;ve always believed that once computer compute exceeds the brain&amp;rsquo;s, machine intelligence will get very close to human intelligence too. And right now something like o3 (whose compute is still nowhere near a brain&amp;rsquo;s) is already far past the vast majority of humans. So what about the future, as compute keeps growing?&lt;/p&gt;
&lt;p&gt;In short: I think AI intelligence will continue to compound exponentially, just as that article predicted, possibly all the way to AGI and beyond.&lt;/p&gt;
&lt;h3 id="an-article-i-wrote-in-december-2023-about-how-i-understand-and-use-ai"&gt;An article I wrote in December 2023 about how I understand and use AI
&lt;/h3&gt;&lt;p&gt;Back in the GPT-3 era I&amp;rsquo;d already started using AI heavily. GPT-4&amp;rsquo;s arrival especially gave my technical chops, vision, and thinking a quantum leap.
Original: &lt;a class="link" href="http://baohongjiang.com/en/p/how-i-use-ai-notes-from-the-field/" &gt;How I Use AI — Notes from the Field&lt;/a&gt;
Now, with even more, even stronger AI products around, my hands-on ability has grown massively, and I&amp;rsquo;m even more convinced that what I wrote then was correct.
That&amp;rsquo;s exactly what the next section is about.&lt;/p&gt;
&lt;h2 id="how-ai-completely-flips-the-way-i-solve-problems-and-think-about-them"&gt;How AI completely flips the way I solve problems and think about them.
&lt;/h2&gt;&lt;p&gt;Let me re-quote my own conclusion:
&lt;strong&gt;Human life is finite. We can&amp;rsquo;t master that much knowledge or that many skills.
AI overturns the traditional learn-then-execute pipeline: master the framework, leave the details to AI, and your efficiency and ceiling go way up.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Some specific points:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;div class="chroma"&gt;
&lt;table class="lntable"&gt;&lt;tr&gt;&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code&gt;&lt;span class="lnt"&gt; 1
&lt;/span&gt;&lt;span class="lnt"&gt; 2
&lt;/span&gt;&lt;span class="lnt"&gt; 3
&lt;/span&gt;&lt;span class="lnt"&gt; 4
&lt;/span&gt;&lt;span class="lnt"&gt; 5
&lt;/span&gt;&lt;span class="lnt"&gt; 6
&lt;/span&gt;&lt;span class="lnt"&gt; 7
&lt;/span&gt;&lt;span class="lnt"&gt; 8
&lt;/span&gt;&lt;span class="lnt"&gt; 9
&lt;/span&gt;&lt;span class="lnt"&gt;10
&lt;/span&gt;&lt;span class="lnt"&gt;11
&lt;/span&gt;&lt;span class="lnt"&gt;12
&lt;/span&gt;&lt;span class="lnt"&gt;13
&lt;/span&gt;&lt;span class="lnt"&gt;14
&lt;/span&gt;&lt;span class="lnt"&gt;15
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;
&lt;td class="lntd"&gt;
&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-fallback" data-lang="fallback"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;### AI completely changes how we learn knowledge and master skills
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;The traditional model demands you internalize a full system; AI fills in the specific details and dramatically speeds up learning.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;You only need to grasp the &amp;#34;trunks&amp;#34; of the knowledge tree; AI fills in the &amp;#34;leaves.&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;### AI accelerates execution — going from learning to shipping is way faster
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Projects that previously required full mastery before you could start can now be moved forward as long as you know it&amp;#39;s broadly feasible — AI handles the rest.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Examples: building a blog, an AI WeChat public account, an AI chat website — all started as a concept, then AI fleshed out the details and got it shipped.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;### AI is an efficient executor, but the ceiling is still set by you
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;AI helps with problems that already have mature solutions, but on frontier exploratory problems it still struggles.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Your own ability and depth of thought decide the final outcome — AI is just an amplifier.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;### Practice is everything; the best way to use AI is &amp;#34;do it with your hands&amp;#34;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;Real growth comes from actually doing it. The tricks come naturally as you go.
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&amp;#34;Use it if you can, just dive in&amp;#34; — far more important than talking about technique.
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;
&lt;/div&gt;
&lt;/div&gt;&lt;p&gt;For me personally, AI now touches &lt;strong&gt;80%+&lt;/strong&gt; of what I do, work and dev included.
DeepSeek R1&amp;rsquo;s reasoning can attack a problem from every angle, and its breadth of knowledge far exceeds mine. If one day I genuinely internalize that kind of structured thinking, I can&amp;rsquo;t even imagine the result.
As I said: practice is everything. Use it deeply and you&amp;rsquo;ll get it; without using it, all the talk is just moonlight on water — useless.&lt;/p&gt;
&lt;h2 id="roundup-of-ai-products-as-of-today"&gt;Roundup of AI products as of today
&lt;/h2&gt;&lt;p&gt;As of writing (March 12, 2025), here&amp;rsquo;s my summary of the popular AI products out there, focusing on the ones I&amp;rsquo;ve personally used: their characteristics, pros, and cons. It covers more than just LLMs. For your reference:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;LLMs&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;GPT family
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;o3, o1&lt;/strong&gt; — currently the strongest &lt;em&gt;and&lt;/em&gt; the most expensive reasoning models. Massive context (o1 = 200K), the strongest reasoning and logic right now. Especially good at code analysis and logic. I only call this big brother in when its little siblings can&amp;rsquo;t solve a problem.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;o1-mini, o3-mini, o3-mini-high&lt;/strong&gt; — mini versions with better speed and price/performance trade-offs. For hard code problems, I usually try them first.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;4o&lt;/strong&gt; — the multimodal one. Well-rounded, fast. Reads images directly, parses audio. Speed and price are excellent. My most-used model right now.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;4.5&lt;/strong&gt; — newest GPT, way better creativity and &amp;ldquo;humanness,&amp;rdquo; but seriously expensive. I save it for creative work and copy.&lt;/li&gt;
&lt;li&gt;Other GPT models — all weaker siblings; the only upside is a slightly cheaper price. Not worth discussing.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;DeepSeek family
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;R1&lt;/strong&gt; — currently the strongest Chinese-language reasoning model and a price/performance monster. You&amp;rsquo;ve all seen how strong it is. Cheap, especially good at very Chinese-flavored reasoning problems. Heads up: its coding ability is weaker than other models — really, don&amp;rsquo;t use it for code. I use it for problems with strong Chinese-language texture. Very grounded.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;V3&lt;/strong&gt; — non-reasoning version. I rarely use it.&lt;/li&gt;
&lt;li&gt;Other parameter sizes — DeepSeek&amp;rsquo;s strength is hitting GPT-4o-class performance with the lowest compute and hardware. A lot of websites sneakily pass off 32B / 70B versions as the full 671B. I&amp;rsquo;m calling it out because &lt;strong&gt;R1 32B 4-bit quantized&lt;/strong&gt; is the best model you can run locally on a 4090 24G consumer GPU. From experience, useful for things involving confidentiality.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Claude family
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Claude 3.7&lt;/strong&gt; — multiple variants; the everyday one is Sonnet. Currently the second-best after GPT. Very pleasant to use; many heavy users say its coding ability beats GPT-4o. Great UX — they pioneered Artifacts, and GitHub Copilot added it to its paid plans, which says a lot about its price/performance for code. I use it to cross-check GPT&amp;rsquo;s code.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Grok family
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Grok 3&lt;/strong&gt; — Musk&amp;rsquo;s AI. Reportedly trained with more compute than any other model right now. In code it doesn&amp;rsquo;t have an obvious edge over the others. Worth noting: its content moderation is extremely loose.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Gemini family
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Gemini 2.0&lt;/strong&gt; — Google&amp;rsquo;s. Thanks to Google&amp;rsquo;s infrastructure muscle, very fast. Natively multimodal. Search ability clearly beats other models. (Of course — Google!) I don&amp;rsquo;t use it much; intelligence-wise it&amp;rsquo;s not clearly ahead.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Other / local models
&lt;ul&gt;
&lt;li&gt;On Hugging Face there&amp;rsquo;s a flood of models, plus QwQ from China and others — each with their own strengths. Most are smaller, specialized for a specific domain. With limited bandwidth, I generally don&amp;rsquo;t bother.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Manus
&lt;ul&gt;
&lt;li&gt;Judging from all the noise, it sits somewhere between hype and scam. Once people can actually use it, we&amp;rsquo;ll see whether it&amp;rsquo;s real gold.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;AI image generation&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Stable Diffusion&lt;/strong&gt; — the famous SD. Open source, deeply customizable. Different base models = different styles. Tons of plugins. Easy to deploy and run on a home PC. Downsides: output is fully tied to the base model, and the learning curve is steep.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Midjourney&lt;/strong&gt; — also famous (MJ). Very strong, only available as a service. Wide range of styles. Downsides: little customization, expensive. The polar opposite of SD.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DALL·E 3&lt;/strong&gt; — way behind the above two. The only upside is integration into the native ChatGPT web. Not really useful.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;TTS (text-to-speech)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;VITS family&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;VITS was originally released by a Chinese developer; lots of forks and downstream work. Currently the strongest open-source one is GPT-SoVITS. With just a few minutes of source audio, it can produce highly similar multilingual speech. A home PC is enough to fine-tune and infer.
Some &amp;ldquo;do it with your hands&amp;rdquo; results from me:
Inferred version of my own voice
&lt;audio controls&gt;
&lt;source src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/my_AI.wav" type="audio/wav"&gt;
&lt;/audio&gt;
WW2 commentary, original voice
&lt;audio controls&gt;
&lt;source src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/ww2Start.wav" type="audio/wav"&gt;
&lt;/audio&gt;
WW2 commentary, AI-synthesized
&lt;audio controls&gt;
&lt;source src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/ww2Start_AI.wav" type="audio/wav"&gt;
&lt;/audio&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Other vendors — Microsoft TTS, Google TTS, Douyin, all closed-source. Custom fine-tuning is expensive; otherwise you&amp;rsquo;re stuck with their pretrained voices. That said, the out-of-the-box quality is already excellent!&lt;/li&gt;
&lt;/ul&gt;
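&lt;p&gt;To make &amp;ldquo;a home PC is enough to fine-tune and infer&amp;rdquo; concrete: GPT-SoVITS ships an api.py that runs a small local HTTP inference server. Below is a minimal client sketch for it; the port and parameter names are assumptions based on a typical deployment, so check the api.py of your own version.&lt;/p&gt;

```python
# Sketch: client for a locally running GPT-SoVITS inference server.
# Port and parameter names are assumptions -- verify against the
# api.py shipped with your GPT-SoVITS checkout.
import json
import urllib.request

def build_request(text, ref_wav, prompt_text, url="http://127.0.0.1:9880"):
    """Build the JSON POST request for one synthesis call."""
    body = {
        "refer_wav_path": ref_wav,    # a few seconds of reference audio
        "prompt_text": prompt_text,   # transcript of that reference clip
        "prompt_language": "zh",
        "text": text,                 # what the cloned voice should say
        "text_language": "zh",
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )

def synthesize(text, ref_wav, prompt_text, out_path="out.wav"):
    """POST the request and save the returned audio bytes."""
    req = build_request(text, ref_wav, prompt_text)
    with urllib.request.urlopen(req) as resp:
        with open(out_path, "wb") as f:
            f.write(resp.read())
```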
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;STT (speech-to-text)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Whisper&lt;/strong&gt; — OpenAI&amp;rsquo;s open-source model, supports 100+ languages. Currently the strongest open-source STT. Runs locally on a home PC.&lt;/li&gt;
&lt;li&gt;Others — Microsoft, Google, iFlytek, etc., all have plenty of APIs.&lt;/li&gt;
&lt;/ul&gt;
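&lt;p&gt;For a sense of how little code local Whisper needs, here is a minimal sketch that turns a recording into SRT-style subtitles. The model name and file path are placeholders, and the timestamp helper is my own:&lt;/p&gt;

```python
# Sketch: local Whisper transcription to SRT-style subtitles.
# "base" and the audio path are placeholders.
def format_timestamp(seconds):
    """Convert float seconds into an SRT timestamp like 01:01:01,500."""
    millis = round(seconds * 1000)
    hours, rest = divmod(millis, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

def transcribe_to_srt(path, model_name="base"):
    """Run Whisper locally and return the result as SRT text."""
    import whisper  # pip install openai-whisper
    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    blocks = []
    for i, seg in enumerate(result["segments"], start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{i}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```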
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Other&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Suno&lt;/strong&gt; — AI music generation, sounds good, but practical use is still weak. Future looks promising.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sora&lt;/strong&gt; — video synthesis is everywhere now, but most output looks weird. Another path is heavily customized SD with image stitching for video. Both are being explored.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Combo / tooling&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;GitHub Copilot&lt;/strong&gt; — strongly recommended. Microsoft&amp;rsquo;s flagship. Deeply integrated into IDEs, especially VS Code and Visual Studio. Free tier exists, paid is $10/month. Wraps GPT, Claude, and Gemini. Once it sees your IDE context, it&amp;rsquo;s incredibly convenient.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cursor&lt;/strong&gt; — an AI-first IDE built on VS Code, neck-and-neck with Copilot. Also very convenient. But because it&amp;rsquo;s an IDE rather than a plugin, you give up some flexibility.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Poe&lt;/strong&gt; — an aggregator over multiple AIs; basically a wrapper frontend that calls each vendor&amp;rsquo;s API. Pros: one-stop access, some free quota. Cons: API calls usually fall short of native vendor sites in features and quality.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id="my-personal-outlook-on-the-future-of-ai"&gt;My personal outlook on the future of AI
&lt;/h2&gt;&lt;p&gt;I think AI has already permanently changed how I think and how I solve problems.
&lt;del&gt;Of course, if ASI shows up and humanity gets wiped out, none of this matters. So our default assumption has to be that AI carries us up the next tech tier.&lt;/del&gt;
In this era you have to keep up, keep leveling up, keep absorbing new knowledge.
&lt;strong&gt;The only constant in the world is change itself.&lt;/strong&gt;
After watching the overheated boom in the Chinese market, I&amp;rsquo;ve concluded: AI on its own doesn&amp;rsquo;t generate huge value the way some earlier technologies did. It only produces magic when combined with another field: internet development, traditional retail, traditional manufacturing, literature &amp;amp; film, education, and so on.&lt;/p&gt;
&lt;p&gt;Either way, we should all think like founders. Letting AI fill in the details and do the grunt work is the right move. Build out your own knowledge framework, stop sweating the details, and grasp the essence of the problem and its main contradiction.
In the future, I&amp;rsquo;ll go even deeper with AI in everything I do. I also hope to lift my whole team&amp;rsquo;s AI fluency and improve the workflow.
Here are a few practical AI deployments I&amp;rsquo;ve distilled from work — some already shipped, some that I want my team to ship later:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Code dev&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Heavy use of GitHub Copilot to dramatically speed up coding and to auto-catch errors. Especially good for structured code. For unfamiliar libraries and APIs, it&amp;rsquo;s totally possible to be in a state of &lt;em&gt;&amp;ldquo;I don&amp;rsquo;t remember it, but somehow I can use it&amp;rdquo;&lt;/em&gt; / &lt;em&gt;&amp;ldquo;I haven&amp;rsquo;t read the docs, but once I get the framework I can write code right away.&amp;rdquo;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;Use git hooks to AI-review every commit. Keeps quality up and obvious mistakes out.&lt;/li&gt;
&lt;li&gt;Standardize project layout. Use the &lt;code&gt;tree&lt;/code&gt; command (Windows) plus AI to keep directory structure clean.&lt;/li&gt;
&lt;li&gt;Reasoning about new requirements. When you face something you&amp;rsquo;ve never built, your first plan often misses corners. Ask AI to analyze: what&amp;rsquo;s the solution path, what should you watch for?&lt;/li&gt;
&lt;/ul&gt;
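&lt;p&gt;The git-hook review idea above can be sketched roughly like this. ask_model is a placeholder for whatever LLM API you wire in, and the OK/REJECT reply convention is my own assumption, not a standard:&lt;/p&gt;

```python
# Sketch of an AI pre-commit review hook. ask_model() is a placeholder
# for a real LLM API call that returns the model's reply as a string.
import subprocess
import sys

def staged_diff():
    """Return the diff of everything currently staged for commit."""
    out = subprocess.run(
        ["git", "diff", "--cached"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout

def parse_verdict(review):
    """The model is instructed to start its reply with OK or REJECT."""
    words = review.strip().split(None, 1)
    return bool(words) and words[0].upper() == "OK"

def main(ask_model):
    diff = staged_diff()
    if not diff:
        return 0  # nothing staged, let the commit through
    review = ask_model("Reply OK or REJECT, then one line per issue:\n" + diff)
    if parse_verdict(review):
        return 0
    print(review, file=sys.stderr)
    return 1  # non-zero exit aborts the commit
```

&lt;p&gt;Saved as .git/hooks/pre-commit (with ask_model filled in and main&amp;rsquo;s return value passed to sys.exit), a rejected review stops the commit before it lands.&lt;/p&gt;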
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Game design / config&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Use AI to name variables. Few people pull off &amp;ldquo;faithful, expressive, elegant&amp;rdquo; naming. Things get confusing fast — AI helps a lot.&lt;/li&gt;
&lt;li&gt;Use AI for localization, even as one-click scripts. You get passable multi-language support at low cost.&lt;/li&gt;
&lt;li&gt;Use AI for Excel formulas and number-crunching. Heavy Excel formulas are insane — just ask AI.&lt;/li&gt;
&lt;li&gt;Use AI to debug error messages. Designers usually aren&amp;rsquo;t from a code background and are stumped by errors. Paste the error, ask AI; most of the time you get something actionable. (This applies to anyone touching a project!)&lt;/li&gt;
&lt;li&gt;Use AI to write designer-side tools. Spot pain points in your own task and quickly write a tool to remove them.&lt;/li&gt;
&lt;li&gt;Idea collection. AI can quickly throw out a lot of ideas; you still have to filter and synthesize.&lt;/li&gt;
&lt;/ul&gt;
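&lt;p&gt;The one-click localization idea can stay tiny if the LLM call is injected. This is a sketch only: the (key, source text) row shape is an assumption about your sheet export, and translate stands in for a real API call:&lt;/p&gt;

```python
# Sketch: batch-localize a key/source-text table. translate() is an
# injected placeholder for a real LLM call, which keeps the loop
# testable offline.
def localize(rows, target_lang, translate):
    """rows: list of (key, source_text) pairs. Returns {key: translated}."""
    out = {}
    for key, text in rows:
        if not text.strip():
            out[key] = text  # keep empty cells as-is
        else:
            out[key] = translate(
                f"Translate to {target_lang}, keep placeholders intact: {text}"
            )
    return out
```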
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Art&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Reference and inspiration. Set up SD and Midjourney pipelines. Anything you can&amp;rsquo;t Google up in the right &amp;ldquo;vibe,&amp;rdquo; ask AI to draw. Quickly confirm direction with the requesting team. Build a curated prompt library by style — you can rapidly produce on-brief mockups, and unimportant cutscenes / scene art can ship as-is.
&lt;img src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/IMG_3637.JPG"
width="1440"
height="1440"
srcset="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/IMG_3637_hu_deb101f2b9da5adb.JPG 480w, http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/IMG_3637_hu_56b09b48e4b0619f.JPG 1024w"
loading="lazy"
alt="alt text"
class="gallery-image"
data-flex-grow="100"
data-flex-basis="240px"
&gt;
&lt;img src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/IMG_3638.JPG"
width="1024"
height="1024"
srcset="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/IMG_3638_hu_76fae60eaae589b0.JPG 480w, http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/IMG_3638_hu_ee4920c99c95a52.JPG 1024w"
loading="lazy"
alt="alt text"
class="gallery-image"
data-flex-grow="100"
data-flex-basis="240px"
&gt;
&lt;img src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/IMG_3639.JPG"
width="1024"
height="1024"
srcset="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/IMG_3639_hu_765f13e76441d771.JPG 480w, http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/IMG_3639_hu_4fd654ffc6f5b52.JPG 1024w"
loading="lazy"
alt="alt text"
class="gallery-image"
data-flex-grow="100"
data-flex-basis="240px"
&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Item icons, character concept art&lt;/strong&gt; — already shipped successfully in the past. Tons of icons, even character concepts. AI does the first pass, art polishes — huge efficiency gain.
&lt;img src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-3.png"
width="1659"
height="939"
srcset="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-3_hu_2a55b21be1c42035.png 480w, http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-3_hu_6c55373286c0191c.png 1024w"
loading="lazy"
alt="alt text"
class="gallery-image"
data-flex-grow="176"
data-flex-basis="424px"
&gt;
&lt;img src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-4.png"
width="348"
height="417"
srcset="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-4_hu_d75f1c9a0f893500.png 480w, http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-4_hu_548a94bff79da85d.png 1024w"
loading="lazy"
alt="alt text"
class="gallery-image"
data-flex-grow="83"
data-flex-basis="200px"
&gt;
&lt;img src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-5.png"
width="411"
height="273"
srcset="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-5_hu_98304cfb504941fd.png 480w, http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-5_hu_ba2edfd91e621e9f.png 1024w"
loading="lazy"
alt="alt text"
class="gallery-image"
data-flex-grow="150"
data-flex-basis="361px"
&gt;
&lt;img src="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-6.png"
width="238"
height="279"
srcset="http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-6_hu_7f1075378eaf5293.png 480w, http://baohongjiang.com/p/%E7%AB%99%E5%9C%A82025%E5%B9%B4%E5%9B%9E%E9%A1%BE%E5%92%8C%E5%B1%95%E6%9C%9Bai/image-6_hu_45f26d89b521b3be.png 1024w"
loading="lazy"
alt="alt text"
class="gallery-image"
data-flex-grow="85"
data-flex-basis="204px"
&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Other&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;Voiceover. If you need it, GPT-SoVITS does customized VO well.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Two reference videos for understanding AI&amp;rsquo;s origins:
&lt;a class="link" href="https://www.youtube.com/watch?v=RUtvBkibIT8" target="_blank" rel="noopener"
&gt;【Computer History】 NLP from &amp;ldquo;Past&amp;rdquo; to &amp;ldquo;Present&amp;rdquo;&lt;/a&gt;
&lt;a class="link" href="https://www.youtube.com/watch?v=7iNJyEbYDdc&amp;amp;t=408s" target="_blank" rel="noopener"
&gt;Why the Feynman Technique is called the ultimate learning method&lt;/a&gt;&lt;/p&gt;</description></item><item><title>How I Use AI — Notes from the Field</title><link>http://baohongjiang.com/en/p/how-i-use-ai-notes-from-the-field/</link><pubDate>Tue, 26 Dec 2023 00:00:00 +0000</pubDate><guid>http://baohongjiang.com/en/p/how-i-use-ai-notes-from-the-field/</guid><description>&lt;h3 id="gpt--the-greatest-invention-in-history"&gt;GPT — the greatest invention in history
&lt;/h3&gt;&lt;p&gt;Personal opinion. At least, as of right now.
My programming and CS chops have leveled up massively, and I really owe that to GPT, especially GPT-4. Yes, it&amp;rsquo;s gone through many iterations and is far from where it started.&lt;/p&gt;
&lt;h3 id="the-mainstream-ais"&gt;The mainstream AIs
&lt;/h3&gt;&lt;p&gt;A quick rundown of the popular conversational LLMs right now, ordered by my recommendation:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;GPT-4&lt;/strong&gt; — strongest and most useful&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Claude 2&lt;/strong&gt; — up to 100k tokens of context&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;New Bing&lt;/strong&gt; — built on GPT-4, free to use, good for search&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GPT-3.5&lt;/strong&gt; — cheap, mostly used for high-volume API calls&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Bard&lt;/strong&gt; — Google&amp;rsquo;s offering, so-so. Upside: free&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;iFlytek Spark (讯飞星火)&lt;/strong&gt; — decent. Upside: convenient if you&amp;rsquo;re in China&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Baidu ERNIE Bot (百度文心一言)&lt;/strong&gt; — no comment&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Other miscellaneous ones&lt;/strong&gt; — with so many better options, why use junk?&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Local LLMs&lt;/strong&gt; — can be deployed locally, can be fine-tuned. Compared with using someone else&amp;rsquo;s product, you have far more creative possibilities.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In actual use, GPT-4 is the most helpful for both work and life, and the best one to use. The downsides are that registration and subscription are annoying, and $20/month is pricey. But I can tell you with full confidence: it is &lt;em&gt;absolutely&lt;/em&gt; worth every cent. Truly insane.
From here on, I&amp;rsquo;ll only talk about GPT-4. Other products aren&amp;rsquo;t worth discussing.&lt;/p&gt;
&lt;h3 id="what-gpt-4-changed-for-me"&gt;What GPT-4 changed for me
&lt;/h3&gt;&lt;p&gt;There are a million articles online about how amazing GPT-4 is and how to use it. I&amp;rsquo;m not going to repeat any of that. I&amp;rsquo;ll just talk about the biggest shifts for me personally.&lt;/p&gt;
&lt;h4 id="a-complete-overhaul-of-how-i-learn-and-how-i-get-things-done"&gt;A complete overhaul of how I learn and how I get things done
&lt;/h4&gt;&lt;p&gt;Traditionally, to write code or learn a new skill, you have to study the whole thing front-to-back, and only after enough projects, enough hitting walls, can you confidently say &amp;ldquo;yeah, I can handle this kind of problem.&amp;rdquo;
The world&amp;rsquo;s knowledge is an ocean — the more you learn, the more you realize you don&amp;rsquo;t know. In one limited human life you can only ever master a tiny sliver. Especially in tech, innovation outpaces your ability to keep up.
Picture all knowledge and skills as a giant tree. Each module is a trunk — say, Python is one trunk; Python&amp;rsquo;s syntax and libraries are the leaves on that trunk.
Normally, to write decent Python software, you have to internalize the trunk and most of the leaves.
Now here&amp;rsquo;s the problem: Python, Go, C#, Java, JS, TS, C++, etc. — that&amp;rsquo;s already a long list. Then there&amp;rsquo;s everything Linux: nginx, ufw, vim, OpenVPN, and on and on. .NET land has another whole stack. That&amp;rsquo;s before you get to algorithms, frameworks, design patterns, Docker, jump servers, and so on.
With this much, even a whole life only gets you mastery of one or two. Everything else, you have to pretend you didn&amp;rsquo;t see.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Human life is finite. We can&amp;rsquo;t master that much.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;GPT-4 totally upends this. It can answer questions and crank out tons of code on demand. (Of course, your professional level still caps the ceiling.)&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;GPT-4 massively speeds up learning, especially looking things up and debugging.&lt;/li&gt;
&lt;li&gt;GPT-4 acts like a junior who completes the work you direct. Your own ability sets the upper bound.&lt;/li&gt;
&lt;li&gt;Plus a huge amount of miscellaneous work.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;GPT-4 is an extension of my ability — it handles the details.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;To stick with the tree metaphor: I learn the trunks and the big-picture frame myself. The leaves and details, GPT-4 fills in.
Walking every trunk &lt;em&gt;and&lt;/em&gt; every leaf is too much for me. But touring all the trunks is easy. When I need a specific leaf, I tap GPT-4 to flesh it out.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Cover as many trunks as possible. When a problem hits, lean on GPT-4 for the leaves.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This completely overturns my old &amp;ldquo;learn it all first, then do it&amp;rdquo; methodology. Now, as long as I have a sense of all the trunks, I can ship something that&amp;rsquo;s theoretically possible but whose details I don&amp;rsquo;t yet know — and ship it fast.&lt;/p&gt;
&lt;p&gt;A few examples:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Building this blog.&lt;/strong&gt; In theory, set up a website, get a server. The details are massive — see &lt;a class="link" href="http://baohongjiang.com/en/p/the-tech-behind-my-blog/" &gt;The Tech Behind My Blog&lt;/a&gt;. That much technical detail, even though I&amp;rsquo;d never done it before, was tractable because I knew the trunks; the leaves I learned from GPT-4.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Building an AI WeChat public account.&lt;/strong&gt; From an idea, starting with one official OpenAI API example, then a secondary WeChat account driven by simulated PC WeChat clicks, step by step it grew. I really was clueless at the start. Eventually it had Midjourney, Stable Diffusion, GPT-3.5/4, New Bing, voice-recognition chat, a membership system, and more. Tons of system and technical details. I knew it was theoretically doable and had a vague plan; GPT-4 filled in the rest. By the way, the account was called &lt;strong&gt;Xiao Hui Hen Zhi Hui (小慧很智慧)&lt;/strong&gt;. At its peak it had 4k+ subscribers. Costs got too high and I had limited bandwidth, so I stopped maintaining it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Building an AI chat website.&lt;/strong&gt; Started by forking an open-source vue3+express site from GitHub. Later I rewrote the backend in Python (FastAPI), with a database, account auth, etc. Eventually added WeChat Pay, SMS phone verification, and more. WeChat Pay especially is a beast — you also need an ICP-filed mainland China server and domain. But I knew it was possible, and I made it work in the end!&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="http://baohongjiang.com/en/p/building-a-dynamic-multi-server-lan-over-openvpn-tunnels/" &gt;Building a giant LAN over OpenVPN tunnels&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;There are many similar cases in everyday work and life. I&amp;rsquo;ll stop listing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;GPT-4 completely flips the old &amp;ldquo;learn-it-all-then-start&amp;rdquo; model. Now, if it&amp;rsquo;s theoretically possible, I can move on it immediately and finish at terrifying speed.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Of course, for highly complex, exploratory work, both GPT-4 and I are stuck. Like research-grade new algorithms, or a distributed system that handles tens of millions of concurrent connections.
But for most &lt;strong&gt;tasks someone has already solved before&lt;/strong&gt;, with GPT-4 in the loop, I can do them fairly well.&lt;/p&gt;
&lt;p&gt;As for specific tips and tricks — there&amp;rsquo;s already a flood of those online, and a few sentences won&amp;rsquo;t cover it. If you actually care, just go bang on it. If you can use it, the tricks come naturally as you go. If you can&amp;rsquo;t, no amount of talk will help.&lt;/p&gt;</description></item></channel></rss>