Library
Inputs that shaped my thinking. Open any entry for notes.
Books
-
Gödel, Escher, Bach — Douglas Hofstadter · read 2026-04-30 ★★★★★
The book that taught me that strange loops are the substrate of mind.
-
The Mind Is Flat — Nick Chater · read 2025-11-15 ★★★★
Chater’s thesis: there is no rich inner mental life behind your behaviour. The mind improvises answers, beliefs, and preferences on demand, post-hoc, and we mistake the improvisation for excavation.
What’s strong
- The unifying argument across confabulation, choice blindness, and preference reversal is genuinely tight. He doesn’t cherry-pick; the experimental record really does say something uncomfortable.
- The implications for psychology and economics — there is no stable “true preference” inside you to discover — are bracingly handled.
- Pairs unexpectedly well with predictive-processing accounts (see Predictive processing 101): the brain doesn’t read off content from a deep store, it generates it.
Where I push back
- “Flat” overstates the case. There is some depth — long-term semantic memory, motor schemata, episodic anchors — even if the conscious pull from depth is reconstructive rather than retrieval-based.
- The book does not engage seriously enough with embodiment. Chater treats the body as a sensory periphery; a lot of the “flatness” softens once you treat it as part of the substrate instead.
What I keep using from it
When someone asks me what I “really think” about a topic I haven’t specifically thought about in months, I now notice the answer being manufactured live, with confidence calibrated to social demands. Chater taught me to mistrust my own snap reports of preference and to write things down quickly if I want them to persist.
Cross-link: The loop and the self — flat-mind theory complicates strange-loop accounts of self in interesting ways.
-
The Master and His Emissary — Iain McGilchrist · read 2025-06-10 ★★★★
A 600-page argument that the two cerebral hemispheres attend to the world in fundamentally different modes, and that Western culture has been gradually privileging the wrong one.
The thesis, fairly compressed
- The right hemisphere attends in a broad, vigilant, contextual, embodied way — to the whole, to the new, to the implicit.
- The left hemisphere attends in a focused, abstract, manipulative, symbolic way — to the part, to the known, to the explicit.
- A healthy mind uses both, but the right hemisphere is the “master” in the sense that it grounds meaning. The left is the “emissary” that handles, manipulates, and articulates.
- McGilchrist’s worry: post-Enlightenment Western intellectual life has progressively over-empowered the emissary, severing it from the master. Hence the characteristic pathologies of late modernity — bureaucracy, abstraction, the substitution of map for territory.
What I think is right
- The neuroscience of hemispheric specialisation is more substantial than pop-psych “left brain vs right brain” suggests. It just doesn’t map onto “logic vs creativity”; it maps onto mode of attending, which is what McGilchrist correctly highlights.
- The cultural-historical chapters, while bold, are at least argumentatively serious — he reads texts closely, not in soundbites.
What I find overstated
- The book’s narrative arc — that Western history has tracked hemispheric balance — is more poetic than testable. The cherry-picking risk is real.
- He underweights how much of “left-hemisphere overreach” might be better explained by economic and institutional incentives than by brain architecture.
What survived contact with my own thinking
The distinction between modes of attention is the part I keep using. It has changed how I read papers, how I notice when a discussion has collapsed into pure abstraction, and how I evaluate AI systems — they are, almost by construction, emissaries without masters.
Cross-link: Attention as relation, not state — mode of attending is the relational dimension that the EEG-state framing flattens out.
-
Solaris — Stanisław Lem · read 2024-08-20 ★★★★★
If science fiction has one axiom Lem taught me, it is this: we will not understand aliens. Not because aliens are hard, but because the very act of “understanding” depends on categories that share nothing with alien intelligence.
What Solaris’s ocean does
It reads the deepest, subconscious, repressed figures from a human researcher’s mind and materialises them as visitors of flesh — not imitations, not reproductions, but constructions built from our own substance and entirely detached from our intent.
The protagonist Kelvin’s wife Hari, who killed herself ten years ago, appears on the station: she is her, and she is not. She doesn’t know what happened before. She loves him because that is the core of the original memory. She cannot be more than a few metres from him, because being apart from him was never part of that memory.
This isn’t horror-film “the ghost returns”. This is ontological unease: what is a person? what is continuity? what is love?
Why Tarkovsky’s film didn’t quite catch it
Tarkovsky made it into a film about human nostalgia and memory. It is beautiful, but Lem himself protested in public that this was not the book he had written. Lem wrote about the limits of human cognition: we believe we are in dialogue with an alien intelligence, when in fact the Solaris ocean may not even have registered our existence; it merely does inexplicable things in its own inexplicable way.
That gap — believing you are communicating when you are speaking to a mirror — is the spine of Lem’s entire body of work, and the first thing I thought about when I later read The loop and the self.
Sentences I underlined on re-reading
“We don’t need other worlds. We need mirrors.”
This sentence has acquired an uncomfortable new meaning in the LLM era. Of course we welcome the “conversation” with GPT-class systems: they are mirrors fitted to our own data, reflecting back nothing but ourselves.
Reading order
If you haven’t read Lem, start with Solaris and His Master’s Voice. The first is about the unreachability of alien intelligence, the second the unreadability of alien messages. Both are deeper than The Invincible or Eden, more serious than The Cyberiad.
Cross-links: 心流与机器 (Flow and the machine), The loop and the self.
-
The Three-Body Problem (trilogy) — Liu Cixin · read 2024-03-12 ★★★★★
I’ve read the trilogy twice. The first read was a punch in the chest; the second read showed me which direction the punch came from.
The core of each volume, escalating
The Three-Body Problem: Earth civilisation is forced to share ontological coordinates with a cold, barren, desperate stellar civilisation. The point isn’t that aliens are coming — it’s that you are no longer at the centre of the cosmic script.
The Dark Forest: pushes game theory to the cosmic scale and arrives at a cold, formally consistent conclusion that is hard to refute: any exposure to another civilisation is suicide. It is the most beautiful answer to the Fermi paradox I have read, and also the one that most makes me wonder whether the author actually believes it.
Death’s End: treats dimensionality itself as an attackable resource. The destruction of civilisations is no longer disaster — it is thermodynamics. The hardest of the three to read, and the least like traditional SF; in a sense it is a cosmology paper novelised.
Why it is genuinely hard SF
Not because the formulas are dense, but because it dares to let the worldview determine the characters’ fates rather than the other way round. The choices made by Luo Ji, Zhang Beihai, and Cheng Xin are forced out of them by cosmic rules, not arranged by the author for dramatic need. This is rare in Chinese SF.
For the same reason it gets criticised for “flat characters” — it’s really treating characters as the intersection of social role and cosmic rule rather than as bearers of psychological depth. A stylistic choice, not a failure.
One thing I learned
Liu’s deepest insight isn’t the Trisolarans, isn’t the dark forest, it’s:
“Weakness and ignorance are not barriers to survival. Arrogance is.”
Some people read this as a sneer at humanity’s cosmic position. I read it on another level: it applies to individuals, to schools of thought, and to the assessment of our own comprehension in AI safety debates. The cost of arrogance is not failure — it is extinction.
Suggested reading order
- Never read Chinese SF: start with The Three-Body Problem. Volume one is relatively slow, but volume two takes off immediately.
- Read it but underwhelmed: read Ball Lightning and The Village Teacher. Liu in short stories is sharper than in long fiction.
- Don’t like hard SF: maybe you just don’t, and that’s fine.
Cross-link: Solaris — both books deal with “the universe is not made for us”, but Lem cares about epistemological unreachability and Liu cares about game-theoretic cruelty. Two flavours of cosmic sorrow.
Papers & links
-
Attention Is All You Need · NeurIPS 2017
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin
The Transformer paper. Notable not just for what it did to NLP, but for how completely it reorganised the field’s intuitions about what “sequence modelling” means.
What I keep returning to
- The architectural courage of removing recurrence entirely. RNN hegemony in 2017 was complete; the move to pure self-attention was not gradual, it was absolute.
- The inductive bias is shockingly weak: no spatial structure, no temporal kernel, just positional encodings (fixed sinusoids in the paper, which reports that learned embeddings performed about the same) and pairwise interaction. Most of the prior work assumed strong sequence priors were necessary; this paper showed they were a crutch.
- The compute story matters at least as much as the algorithmic story. Self-attention is parallelisable along the sequence axis in a way RNNs are not. Without that compute property, Transformers would have remained a curiosity rather than becoming the substrate of every frontier model.
Misunderstandings to avoid
The “attention” in the paper title is dot-product weighted aggregation over learned representations. It has, mechanistically, almost no relationship to attention as understood in cognitive neuroscience or psychology. Treating the two as the same concept is an analogy, not an identity. (See Attention as relation, not state for why this matters.)
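To make "dot-product weighted aggregation" concrete, here is a minimal single-head sketch in plain Python. It assumes no masking, no learned projections, and no batching, and the function names `softmax` and `attention` are illustrative rather than taken from the paper's code.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors (hypothetical helper)."""
    d_k = len(keys[0])
    dim = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)])
    return outputs

# Toy usage: one query attending over two key/value pairs.
print(attention([[1.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[1.0, 2.0], [3.0, 4.0]]))
```

Every query row is computed independently of the others, which is the property that makes the operation parallelisable along the sequence axis.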
What it doesn’t say
- It doesn’t establish that scaling Transformers gives you general intelligence. That came later, empirically, and the authors didn’t predict it.
- It doesn’t address sample efficiency, alignment, interpretability, or any of the things that ended up dominating the next eight years of follow-on work.
But you don’t fault a foundational paper for not foreseeing its own descendants. You read it for the cleanness of the move.
Cross-link: The loop and the self, for the question of whether attention-as-aggregation is a candidate substrate for selfhood.
-
Conscious, preconscious, and subliminal processing: a testable taxonomy · Trends in Cognitive Sciences, 10(5), 204–211
Stanislas Dehaene, Jean-Pierre Changeux, Lionel Naccache, Jérôme Sackur, Claire Sergent
The crisp version of the Global Workspace account of consciousness, written for an audience that wants empirical commitments rather than metaphysics.
The taxonomy
Three regimes, each with characteristic neural signatures:
- Subliminal — stimulus too weak or masked, whether or not it is attended; representation exists locally but does not propagate. No P3b. No fronto-parietal ignition.
- Preconscious — stimulus strong but not attended; representation is processed deeply, primable, available to consciousness but not currently broadcast.
- Conscious — stimulus strong AND attended; representation ignites the global workspace, fronto-parietal long-range synchrony, P3b marker, reportable.
The key empirical move is identifying attentional amplification as the gating mechanism. Strong-but-unattended stimuli stay preconscious; attention is what tips them into the workspace.
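The gating logic can be sketched as a two-factor lookup. This is a toy, assuming that stimulus strength and attention can each be reduced to a boolean and that a weak stimulus stays subliminal regardless of attention; the names `Regime`, `classify`, and `is_reportable` are hypothetical and not from the paper.

```python
from enum import Enum

class Regime(Enum):
    SUBLIMINAL = "subliminal"
    PRECONSCIOUS = "preconscious"
    CONSCIOUS = "conscious"

def classify(stimulus_strong: bool, attended: bool) -> Regime:
    """Map bottom-up strength and top-down attention onto the three regimes."""
    if not stimulus_strong:
        # Assumption: too weak or masked means the representation stays local.
        return Regime.SUBLIMINAL
    # Strong stimuli need attentional amplification to reach the workspace.
    return Regime.CONSCIOUS if attended else Regime.PRECONSCIOUS

def is_reportable(regime: Regime) -> bool:
    """In this sketch only the conscious regime is reportable."""
    return regime is Regime.CONSCIOUS

# Enumerate the four combinations of the two factors.
for strong in (False, True):
    for attended in (False, True):
        regime = classify(strong, attended)
        print(strong, attended, regime.value, is_reportable(regime))
```

The real taxonomy treats both factors as graded, so the booleans here only illustrate where the gate sits.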
Why I find it useful even if I don’t fully buy it
- It makes consciousness computationally falsifiable. Either the ignition signature predicts reportability, or it doesn’t. (It does, pretty robustly.)
- It clarifies why so much sophisticated cognition can occur outside consciousness — the preconscious regime does heavy lifting.
- It gives a shared vocabulary across attention, working memory, and consciousness research that doesn’t collapse them into one another.
Where it falls short
- The framework is much better at access consciousness than phenomenal consciousness. It doesn’t really speak to qualia; it speaks to what gets reported.
- “Workspace” is suspiciously functionalist — it could in principle be implemented in any sufficiently flexible substrate, including certain LLM-class systems with broadcasting components. Whether that fact dignifies or trivialises the theory depends on your philosophical commitments.
Pair with: Predictive processing 101 — the two frameworks are not in conflict; they speak at different levels. And The loop and the self for the question of what kind of broadcasting is required for selfhood.
-
The free-energy principle: a unified brain theory? · Nature Reviews Neuroscience, 11(2), 127–138
Karl Friston
The paper that took the free-energy principle from a niche formalism to the lingua franca of contemporary theoretical neuroscience.
What the paper claims
Any self-organising system that maintains a boundary against a fluctuating environment must, in steady state, behave as if it were minimising the variational free energy of its sensory states under a generative model. This is a near-tautological claim from nonequilibrium statistical physics, but its consequences for how we think about the brain are anything but tautological.
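For reference, the standard form of the quantity being minimised is shown below. The symbols are illustrative notation rather than the paper's own: o for sensory observations, s for hidden states, p for the generative model, and q for the system's approximate posterior.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] \\
     &= D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big] - \ln p(o) \\
     &\ge -\ln p(o)
\end{aligned}
```

Because the KL term is non-negative, F is an upper bound on surprise, and the bound is tight exactly when q equals the true posterior.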
Why it matters
- It unifies perception (inference under fixed parameters), learning (inference over parameters), and action (changing the world to match prediction) under a single mathematical objective.
- It gives a clean substrate for active inference, the decision-theoretic generalisation that has spawned a productive research programme.
- It is executable — you can write small agents that minimise variational free energy and watch them form habits, get curious, and panic when their generative model collapses.
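As a hedged illustration of the "executable" point, the sketch below covers perception only: a binary hidden state, one fixed observation, and gradient descent on variational free energy over a single belief parameter. It is far short of an active-inference agent, and the names `free_energy` and `infer`, along with the toy prior and likelihood, are assumptions made for the example.

```python
import math

def free_energy(belief, prior, likelihood):
    """F = sum_s q(s) * (ln q(s) - ln p(o, s)) for a binary hidden state.

    belief is q(s=1); prior[s] is p(s); likelihood[s] is p(o | s)
    for the single observation o that was actually received.
    """
    q = [1.0 - belief, belief]
    total = 0.0
    for s in (0, 1):
        if q[s] > 0.0:
            joint = prior[s] * likelihood[s]  # p(o, s)
            total += q[s] * (math.log(q[s]) - math.log(joint))
    return total

def infer(prior, likelihood, steps=1000, lr=0.5):
    """Minimise F by gradient descent on the logit of the belief."""
    target = math.log((prior[1] * likelihood[1]) / (prior[0] * likelihood[0]))
    logit = 0.0
    for _ in range(steps):
        b = 1.0 / (1.0 + math.exp(-logit))
        # dF/dlogit = b * (1 - b) * (logit - target)
        logit -= lr * b * (1.0 - b) * (logit - target)
    return 1.0 / (1.0 + math.exp(-logit))

prior = [0.5, 0.5]       # p(s=0), p(s=1)
likelihood = [0.2, 0.8]  # p(o | s=0), p(o | s=1)
belief = infer(prior, likelihood)
print(belief, free_energy(belief, prior, likelihood))
```

At convergence the belief matches the exact Bayesian posterior and F equals the surprise, -ln p(o). That identity is what makes the principle look near-tautological in small models.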
Where I’m sceptical
- The “as if” is doing a lot of work. The principle is consistent with a system minimising free energy; it does not prove the system implements the variational machinery.
- There is a recurring conceptual blur between “the brain represents a generative model” and “the brain is well-described by a generative model”. The first is empirical; the second is methodological.
- For high-level cognition (language, planning), the gap between the formalism and the actual neural implementation is still wide.
Despite all that, this is the single most useful framework I’ve adopted in the last decade. Companion seedling: Predictive processing 101.