nanobot Project Learning Guide

Intended audience: developers who have never worked with the project
Reading time: about 30-45 minutes
Version: v0.1.4.post2

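1. Project Overview

In one sentence

nanobot is an ultra-lightweight personal AI assistant framework (about 4,000 lines of code). It integrates with multiple chat platforms (Telegram, Discord, Feishu, WhatsApp, and others) and provides tool calling, memory management, and scheduled tasks as its core capabilities.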

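Core features

Feature	Description
🤖 AI chat	LLM-based conversation with multi-turn support
🔧 Tool calling	File read/write, shell execution, web search, web page fetching
🧠 Memory system	Two-layer storage: long-term memory (MEMORY.md) plus a history log (HISTORY.md)
📱 Multi-platform integration	Telegram, Discord, Feishu, DingTalk, Slack, WhatsApp, QQ, Matrix, Email
⏰ Scheduled tasks	Cron expression support; the agent can be triggered on a schedule
💓 Heartbeat tasks	Periodically checks HEARTBEAT.md for pending to-do items
🎯 Skill system	Extends agent capabilities through Markdown files
🔌 MCP support	Model Context Protocol; connects to external tool servers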

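Technology stack

Layer	Technology / dependency	Version
Language	Python	≥3.11
CLI framework	Typer	≥0.20.0
LLM routing	LiteLLM	≥1.81.5
Config validation	Pydantic	≥2.12.0
Async communication	asyncio + websockets	-
Logging	Loguru	≥0.7.3
Terminal formatting	Rich	≥14.0.0
WhatsApp bridge	Node.js + Baileys	Node.js ≥20.0.0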

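2. Directory Structure

nanobot/
├── __init__.py              # Version information
├── __main__.py              # Module entry point: python -m nanobot
├── agent/                   # 🧠 Core agent logic
│   ├── loop.py              #    Agent main loop (LLM ↔ tool execution)
│   ├── context.py           #    Context builder (prompt assembly)
│   ├── memory.py            #    Memory system (MEMORY.md + HISTORY.md)
│   ├── skills.py            #    Skill loader
│   ├── subagent.py          #    Sub-agent management (background tasks)
│   └── tools/               #    Built-in tools
│       ├── base.py          #       Tool base class
│       ├── registry.py      #       Tool registry
│       ├── filesystem.py    #       File operation tools
│       ├── shell.py         #       Shell execution tool
│       ├── web.py           #       Web search / fetch tools
│       ├── message.py       #       Message sending tool
│       ├── spawn.py         #       Sub-agent spawn tool
│       ├── cron.py          #       Scheduled task tool
│       └── mcp.py           #       MCP client
├── bus/                     # 🚌 Message bus
│   ├── events.py            #    Message event definitions (Inbound/Outbound)
│   └── queue.py             #    Async message queue
├── channels/                # 📱 Chat platform integrations
│   ├── base.py              #    Channel base class
│   ├── manager.py           #    Channel manager
│   ├── telegram.py          #    Telegram implementation
│   ├── discord.py           #    Discord implementation
│   ├── feishu.py            #    Feishu implementation
│   ├── dingtalk.py          #    DingTalk implementation
│   ├── slack.py             #    Slack implementation
│   ├── whatsapp.py          #    WhatsApp implementation
│   ├── mochat.py            #    MoChat implementation
│   ├── qq.py                #    QQ implementation
│   ├── matrix.py            #    Matrix implementation
│   └── email.py             #    Email implementation
├── config/                  # ⚙️ Configuration management
│   ├── schema.py            #    Pydantic configuration models
│   └── loader.py            #    Configuration loading / saving
├── cron/                    # ⏰ Scheduled tasks
│   ├── service.py           #    Cron service implementation
│   └── types.py             #    Cron data types
├── heartbeat/               # 💓 Heartbeat service
│   └── service.py           #    Periodic task checks
├── providers/               # 🤖 LLM providers
│   ├── base.py              #    Provider base class
│   ├── registry.py          #    Provider registry
│   ├── litellm_provider.py  #    LiteLLM implementation
│   ├── openai_codex_provider.py  #    OpenAI Codex implementation
│   └── custom_provider.py   #    Custom OpenAI-compatible implementation
├── session/                 # 💬 Session management
│   └── manager.py           #    Session storage and retrieval
├── templates/               # 📝 Default template files
│   ├── AGENTS.md            #    Agent behavior instructions
│   ├── SOUL.md              #    Persona / personality definition
│   ├── USER.md              #    User-defined instructions
│   ├── TOOLS.md             #    Tool usage notes
│   └── HEARTBEAT.md         #    Heartbeat task template
├── skills/                  # 🎯 Built-in skills
│   └── README.md            #    Skill system documentation
├── cli/                     # 🖥️ Command-line interface
│   └── commands.py          #    CLI command implementations
└── utils/                   # 🛠️ Utilities
    └── helpers.py           #    General helper functions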

bridge/                      # 🔗 WhatsApp bridge (Node.js)
├── package.json             #    Node.js dependencies
├── tsconfig.json            #    TypeScript configuration
└── src/
    ├── index.ts             #    Bridge service entry point
    ├── server.ts            #    WebSocket server
    ├── whatsapp.ts          #    Baileys client
    └── types.d.ts           #    Type definitions

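3. Architecture

Overall architectural pattern

nanobot uses an event-driven, plugin-based architecture built on four design principles:

Single responsibility: each module does one thing (the Agent only processes messages, a Channel only handles communication)
Decoupled communication: messages are passed asynchronously through the MessageBus, so Channels and the Agent do not depend on each other directly
Configuration-driven: all behavior is configured through ~/.nanobot/config.json, with no code changes required
Incremental extension: Providers, Channels, and Tools can all be extended dynamically through registries

Core modules and their responsibilities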

┌─────────────────────────────────────────────────────────────┐
│                        CLI Layer                             │
│          (commands.py - Typer command definitions)          │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      Gateway Mode                            │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐  │
│  │   Agent     │  │   Cron      │  │     Heartbeat       │  │
│  │   Loop      │  │  Service    │  │     Service         │  │
│  └──────┬──────┘  └─────────────┘  └─────────────────────┘  │
│         │                                                    │
│         ▼                                                    │
│  ┌─────────────────────────────────────────────────────┐    │
│  │              Message Bus (queue.py)                  │    │
│  │   Inbound Queue ◄── Channels    Agent ──► Outbound   │    │
│  └─────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                     Channel Layer                            │
│   Telegram  Discord  Feishu  DingTalk  Slack  WhatsApp  ... │
└─────────────────────────────────────────────────────────────┘

Call relationships and data flow between modules

Message handling flow

User message → Channel → MessageBus.publish_inbound() → AgentLoop.run()
                                                                  ↓
Reply message ← Channel.send() ← MessageBus.consume_outbound() ← LLM + Tools
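
The round trip above can be sketched with two asyncio queues, one per direction. This is a minimal illustration, not the project's actual bus/queue.py: the message fields, the consume_inbound name, and the echo_agent stand-in are assumptions made for the example. Only publish_inbound, publish_outbound, and consume_outbound are names taken from this guide.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class InboundMessage:
    channel: str
    chat_id: str
    content: str


@dataclass
class OutboundMessage:
    channel: str
    chat_id: str
    content: str


class MessageBus:
    """Two queues: channels write inbound, the agent writes outbound."""

    def __init__(self) -> None:
        self.inbound: asyncio.Queue = asyncio.Queue()
        self.outbound: asyncio.Queue = asyncio.Queue()

    async def publish_inbound(self, msg: InboundMessage) -> None:
        await self.inbound.put(msg)

    async def consume_inbound(self) -> InboundMessage:  # hypothetical name
        return await self.inbound.get()

    async def publish_outbound(self, msg: OutboundMessage) -> None:
        await self.outbound.put(msg)

    async def consume_outbound(self) -> OutboundMessage:
        return await self.outbound.get()


async def echo_agent(bus: MessageBus) -> None:
    """Stand-in for the agent: reads one inbound message and echoes it back."""
    msg = await bus.consume_inbound()
    reply = OutboundMessage(msg.channel, msg.chat_id, f"echo: {msg.content}")
    await bus.publish_outbound(reply)


async def round_trip(text: str) -> OutboundMessage:
    """Play the channel role: publish a message, then wait for the reply."""
    bus = MessageBus()
    await bus.publish_inbound(InboundMessage("telegram", "123456", text))
    await echo_agent(bus)
    return await bus.consume_outbound()
```

Because neither side holds a reference to the other, a channel can be added or removed without touching the agent code.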

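Internal agent processing flow

# nanobot/agent/loop.py (simplified to a single LLM round; see 4.3 for the full loop)
async def _process_message(self, msg: InboundMessage):
    # 1. Get or create the session
    session = self.sessions.get_or_create(msg.session_key)

    # 2. Build the context (system prompt + history + current message)
    history = session.get_history(max_messages=self.memory_window)
    messages = self.context.build_messages(history, msg.content)

    # 3. Call the LLM
    response = await self.provider.chat(messages, self.tools.get_definitions())

    # 4. If the LLM requested tool calls, execute them
    if response.has_tool_calls:
        for tool_call in response.tool_calls:
            result = await self.tools.execute(tool_call.name, tool_call.arguments)

    # 5. Save the session history
    self.sessions.save(session)

    # 6. Return the response
    return OutboundMessage(content=response.content)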

4. Core Flows

4.1 Application startup

CLI entry point → Gateway startup

# nanobot/__main__.py
from nanobot.cli.commands import app
if __name__ == "__main__":
    app()  # Enter the Typer CLI

# nanobot/cli/commands.py - gateway command
@app.command()
def gateway(port: int = 18790, verbose: bool = False):
    # 1. Load the configuration
    config = load_config()

    # 2. Initialize the message bus
    bus = MessageBus()

    # 3. Create the LLM provider
    provider = _make_provider(config)

    # 4. Create the agent loop
    agent = AgentLoop(
        bus=bus,
        provider=provider,
        workspace=config.workspace_path,
        model=config.agents.defaults.model,
        # ... other settings
    )

    # 5. Create the channel manager
    channels = ChannelManager(config, bus)

    # 6. Start all services
    # (construction of the cron and heartbeat services is omitted in this excerpt)
    async def run():
        await cron.start()      # Scheduled tasks
        await heartbeat.start() # Heartbeat service
        await asyncio.gather(
            agent.run(),         # Agent main loop
            channels.start_all() # All channels
        )

    asyncio.run(run())

4.2 Message handling

# nanobot/agent/loop.py - core processing logic
async def _process_message(self, msg: InboundMessage) -> OutboundMessage | None:
    # 1. Session management
    key = msg.session_key  # Format: "channel:chat_id"
    session = self.sessions.get_or_create(key)

    # 2. Special command handling
    if msg.content.strip().lower() == "/new":
        # Clear the session and start a new conversation
        session.clear()
        return OutboundMessage(content="New session started.")

    # 3. Memory consolidation check (once the unconsolidated message count reaches the threshold)
    if len(session.messages) - session.last_consolidated >= self.memory_window:
        await self._consolidate_memory(session)

    # 4. Build the LLM context
    history = session.get_history(max_messages=self.memory_window)
    messages = self.context.build_messages(
        history=history,
        current_message=msg.content,
        channel=msg.channel,
        chat_id=msg.chat_id,
    )

    # 5. Run the agent iteration loop
    final_content, tools_used, all_msgs = await self._run_agent_loop(messages)

    # 6. Save the session
    self._save_turn(session, all_msgs)
    self.sessions.save(session)

    return OutboundMessage(channel=msg.channel, chat_id=msg.chat_id, content=final_content)

4.3 Tool calling

# nanobot/agent/loop.py - agent iteration loop
async def _run_agent_loop(self, initial_messages, on_progress=None):
    messages = initial_messages
    iteration = 0
    final_content = None  # Stays None if the iteration limit is reached
    tools_used = []

    while iteration < self.max_iterations:
        iteration += 1

        # Call the LLM
        response = await self.provider.chat(
            messages=messages,
            tools=self.tools.get_definitions(),  # Definitions of all available tools
        )

        if response.has_tool_calls:
            # Record the tool calls in the message history
            messages = self.context.add_assistant_message(
                messages, response.content, response.tool_calls
            )

            # Execute each tool call
            for tool_call in response.tool_calls:
                tools_used.append(tool_call.name)
                result = await self.tools.execute(
                    tool_call.name,
                    tool_call.arguments
                )
                # Append the result to the message history
                messages = self.context.add_tool_result(
                    messages, tool_call.id, tool_call.name, result
                )
        else:
            # The LLM replied directly; no tool call is needed
            final_content = response.content
            break

    return final_content, tools_used, messages

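4.4 Memory consolidation

# nanobot/agent/memory.py - memory consolidation
async def consolidate(self, session, provider, model):
    """
    Two-layer memory system:
    - MEMORY.md: long-term memory (structured facts)
    - HISTORY.md: history log (searchable with grep)
    """

    # 1. Select the messages to consolidate
    # keep_count (the number of most recent messages left untouched) is
    # computed earlier in the method; that part is omitted in this excerpt
    old_messages = session.messages[session.last_consolidated:-keep_count]

    # 2. Ask the LLM to consolidate (through a tool call)
    response = await provider.chat(
        messages=[{
            "role": "system",
            "content": "You are a memory consolidation agent."
        }, {
            "role": "user",
            "content": f"Process this conversation: {old_messages}"
        }],
        tools=[{
            "type": "function",
            "function": {
                "name": "save_memory",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "history_entry": {"type": "string", "description": "Summary appended to HISTORY.md"},
                        "memory_update": {"type": "string", "description": "Updated content of MEMORY.md"},
                    },
                    "required": ["history_entry", "memory_update"],
                }
            }
        }]
    )

    # 3. Save the consolidation result
    if response.has_tool_calls:
        args = response.tool_calls[0].arguments
        self.append_history(args["history_entry"])
        self.write_long_term(args["memory_update"])

        # 4. Update the session's consolidation marker
        session.last_consolidated = len(session.messages) - keep_count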
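
The index arithmetic in steps 1 and 4 is easy to get wrong, so here is a small standalone sketch of it. The function name select_for_consolidation is hypothetical. It uses an explicit end index rather than a negative slice, because in Python `messages[start:-0]` is an empty list, which would silently skip consolidation when keep_count is 0.

```python
def select_for_consolidation(messages: list, last_consolidated: int, keep_count: int):
    """Return (messages to summarize, new last_consolidated index).

    The most recent keep_count messages are left out so the live
    conversation keeps its immediate context.
    """
    end = len(messages) - keep_count
    if end <= last_consolidated:
        return [], last_consolidated  # nothing old enough to consolidate yet
    return messages[last_consolidated:end], end


# Ten messages, the first two already consolidated, keep the last three live.
batch, marker = select_for_consolidation(list(range(10)), 2, 3)
# batch holds messages 2..6 and marker moves to 7
```

Returning the new marker together with the batch keeps the two values consistent; the caller should store the marker only after the summary has been written successfully.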

4.5 Heartbeat tasks

# nanobot/heartbeat/service.py
async def _tick(self):
    """Runs once every 30 minutes."""
    content = self._read_heartbeat_file()  # Read HEARTBEAT.md

    # Phase 1: decide whether any task needs to run
    action, tasks = await self._decide(content)
    # The LLM makes this decision through a virtual tool call:
    # - action: "skip" or "run"
    # - tasks: task description (when action is "run")

    if action == "run":
        # Phase 2: execute the tasks
        response = await self.on_execute(tasks)
        # Notify the user of the result
        await self.on_notify(response)
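
The two-phase structure (decide cheaply, execute only when needed) can be tried out without an LLM. In the sketch below the decision is a plain rule, and the `- [ ]` checkbox format for HEARTBEAT.md is an assumption made for the example; the real service delegates the decision to the model.

```python
import asyncio
from typing import Awaitable, Callable


async def decide(content: str) -> tuple[str, str]:
    """Rule-based stand-in for the LLM decision: run if an unchecked item exists."""
    pending = [
        line.strip()[len("- [ ]"):].strip()
        for line in content.splitlines()
        if line.strip().startswith("- [ ]")
    ]
    return ("run", "; ".join(pending)) if pending else ("skip", "")


async def tick(
    content: str,
    on_execute: Callable[[str], Awaitable[str]],
    on_notify: Callable[[str], Awaitable[None]],
) -> str:
    """One heartbeat: phase 1 decides, phase 2 executes and notifies."""
    action, tasks = await decide(content)
    if action == "run":
        await on_notify(await on_execute(tasks))
    return action


async def demo(content: str) -> list:
    """Run one tick with in-memory callbacks and return what was sent."""
    sent = []

    async def on_execute(tasks: str) -> str:
        return f"done: {tasks}"

    async def on_notify(text: str) -> None:
        sent.append(text)

    await tick(content, on_execute, on_notify)
    return sent
```

Injecting on_execute and on_notify as callbacks mirrors the excerpt above and keeps the heartbeat service independent of the agent and the channels.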

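5. Key Design and Implementation

5.1 Design patterns used

Pattern	Where it is applied	Code location
Abstract base class (ABC)	Interface definitions for Provider, Channel, and Tool	`providers/base.py`, `channels/base.py`, `agent/tools/base.py`
Registry	Dynamic discovery and management of Providers and Tools	`providers/registry.py`, `agent/tools/registry.py`
Strategy	Switching between Providers (LiteLLM vs Codex)	`cli/commands.py:_make_provider()`
Observer	Event publishing and subscription on the MessageBus	`bus/queue.py`
Factory	Channel creation (instantiated dynamically from configuration)	`channels/manager.py:_init_channels()`
Singleton	Global management of the PromptSession	`cli/commands.py:_PROMPT_SESSION`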
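
The first two rows of the table work together: an abstract base class fixes the interface, and a registry looks implementations up by name. The following is a minimal sketch of that combination, not the contents of agent/tools/base.py or registry.py; the class and method names other than execute are assumptions.

```python
import asyncio
from abc import ABC, abstractmethod


class Tool(ABC):
    """Interface every tool must implement."""

    name: str

    @abstractmethod
    async def execute(self, **kwargs) -> str:
        ...


class ToolRegistry:
    """Maps tool names to instances and shields callers from tool failures."""

    def __init__(self) -> None:
        self._tools: dict = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def names(self) -> list:
        return sorted(self._tools)

    async def execute(self, name: str, params: dict) -> str:
        tool = self._tools.get(name)
        if tool is None:
            return f"Error: unknown tool '{name}'"
        try:
            return await tool.execute(**params)
        except Exception as e:
            # Report the failure as text instead of raising
            return f"Error executing {name}: {e}"


class EchoTool(Tool):
    """Trivial example tool used only for this sketch."""

    name = "echo"

    async def execute(self, text: str) -> str:
        return text


registry = ToolRegistry()
registry.register(EchoTool())
```

Adding a new tool then means writing one subclass and one register call; nothing in the calling code needs to change.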

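5.2 Data model / database design

There is no traditional database; the file system serves as storage:

Storage type	File path	Purpose
Configuration	`~/.nanobot/config.json`	User configuration (JSON)
Sessions	`~/.nanobot/sessions/{key}.json`	Conversation history (JSON, one file per session_key)
Long-term memory	`~/.nanobot/workspace/memory/MEMORY.md`	Structured facts (Markdown)
History log	`~/.nanobot/workspace/memory/HISTORY.md`	Timeline log (Markdown)
Scheduled tasks	`~/.nanobot/data/cron/jobs.json`	Cron job definitions (JSON)

Session data structure (session/manager.py):

class Session:
    key: str                    # "telegram:123456"
    messages: list[dict]        # Message history (OpenAI format)
    last_consolidated: int      # Index of the last consolidated message
    created_at: datetime
    updated_at: datetime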
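
A file-per-session store like this can be sketched in a few functions. The example assumes Session is a dataclass, that timestamps are stored as ISO 8601 strings, and that the colon in the key is replaced when forming the file name; all three are assumptions for illustration rather than the behavior of session/manager.py.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from datetime import datetime
from pathlib import Path


@dataclass
class Session:
    key: str
    messages: list = field(default_factory=list)
    last_consolidated: int = 0
    created_at: datetime = field(default_factory=datetime.now)
    updated_at: datetime = field(default_factory=datetime.now)


def session_path(root: Path, key: str) -> Path:
    # Assumption: ":" is replaced because it is not a valid file name character on Windows
    return root / f"{key.replace(':', '_')}.json"


def save_session(root: Path, session: Session) -> Path:
    data = asdict(session)
    data["created_at"] = session.created_at.isoformat()
    data["updated_at"] = session.updated_at.isoformat()
    path = session_path(root, session.key)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


def load_session(root: Path, key: str) -> Session:
    data = json.loads(session_path(root, key).read_text(encoding="utf-8"))
    data["created_at"] = datetime.fromisoformat(data["created_at"])
    data["updated_at"] = datetime.fromisoformat(data["updated_at"])
    return Session(**data)


def round_trip_demo() -> bool:
    """Save a session to a temporary directory and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Session("telegram:123456", [{"role": "user", "content": "你好"}])
        save_session(Path(tmp), original)
        return load_session(Path(tmp), original.key) == original
```

Passing ensure_ascii=False keeps non-ASCII chat text readable in the JSON file, which helps when inspecting sessions by hand.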

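5.3 State management / data flow

Global state:

AgentLoop._running: controls starting and stopping the main loop
AgentLoop._active_tasks: tracks the active tasks of each session (used by the /stop command)
AgentLoop._consolidating: prevents concurrent memory consolidation

Data flow principles:

One-way data flow: Channel → Bus → Agent → Bus → Channel
Immutable messages: InboundMessage/OutboundMessage are dataclasses and are not modified after creation (Python only enforces this when a dataclass is declared with frozen=True)
Async isolation: each message is handled in its own Task, and an asyncio.Lock ensures that messages of the same session are processed serially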
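
The async isolation rule can be demonstrated with one lock per session key. This is a sketch of the idea only; SessionSerializer and its handle method are hypothetical names, and the real AgentLoop may organize its locks differently.

```python
import asyncio
from collections import defaultdict


class SessionSerializer:
    """Runs handlers concurrently across sessions but serially within one."""

    def __init__(self) -> None:
        # One lock per session key, created on first use
        self._locks = defaultdict(asyncio.Lock)

    async def handle(self, session_key: str, msg: str, log: list) -> None:
        async with self._locks[session_key]:
            log.append(f"start {msg}")
            await asyncio.sleep(0)  # yield to the event loop, simulating I/O
            log.append(f"end {msg}")


async def same_session_demo() -> list:
    """Two messages for the same session must not interleave."""
    serializer = SessionSerializer()
    log = []
    await asyncio.gather(
        serializer.handle("telegram:1", "a", log),
        serializer.handle("telegram:1", "b", log),
    )
    return log
```

Without the lock, the second handler would start while the first is suspended at the await, and both would read and write the same session history at once.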

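5.4 Error handling and logging

Error handling layers:

# 1. Tool execution errors (recoverable)
async def execute(self, name, params):
    try:
        # `tool` is the registered tool looked up by name (lookup omitted here)
        return await tool.execute(**params)
    except Exception as e:
        # Returned as text, so it is added to the message history like any tool result
        return f"Error executing {name}: {str(e)}"

# 2. Message processing errors (logged, and the user is notified)
async def _dispatch(self, msg):
    try:
        response = await self._process_message(msg)
    except Exception:
        logger.exception("Error processing message")
        await self.bus.publish_outbound(
            OutboundMessage(content="Sorry, I encountered an error.")
        )

# 3. Global exception capture (prevents a crash)
async def start(self):
    try:
        await channel.start()
    except Exception as e:
        logger.error("Channel failed: {}", e)

Logging strategy:

Uses loguru instead of the standard library logging module
Production: logger.disable("nanobot") turns logging off
Debug mode: --verbose enables the DEBUG level
Log rotation: handled by loguru (rotation is configured through the rotation argument of logger.add)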

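5.5 Security mechanisms

Mechanism	Implementation	Notes
Access control	`BaseChannel.is_allowed()`	Based on the `allow_from` allow-list
Workspace isolation	`tools.restrict_to_workspace`	Restricts file and shell operations to the configured directory
Shell timeout	`ExecToolConfig.timeout`	Defaults to 60 seconds
Sensitive data filtering	API keys in the configuration are not printed	Truncated when the CLI shows status
OAuth support	`OpenAICodexProvider`	Uses OAuth tokens instead of a stored API key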
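6. Setup and Running

6.1 Dependencies and prerequisites

Required:

Python ≥3.11
pip or uv (recommended)

Optional:

Node.js ≥20 (only needed for WhatsApp support)
Docker (for containerized deployment)

6.2 Installation, configuration, and startup

Option 1: install with pip (recommended for users)

# 1. Install
pip install nanobot-ai

# 2. Initialize the configuration
nanobot onboard

# 3. Edit the configuration and add an API key
vim ~/.nanobot/config.json

# 4. Test a conversation from the CLI
nanobot agent -m "Hello!"

# 5. Start the gateway (enables chat platforms)
nanobot gateway

Option 2: install from source (recommended for developers)

# 1. Clone the repository
git clone https://github.com/HKUDS/nanobot.git
cd nanobot

# 2. Install dependencies (editable mode)
pip install -e .

# 3. Continue with the same steps as above
nanobot onboard

Minimal configuration example (`~/.nanobot/config.json`)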

{
  "providers": {
    "openrouter": {
      "apiKey": "sk-or-v1-xxx"
    }
  },
  "agents": {
    "defaults": {
      "model": "anthropic/claude-opus-4-5",
      "provider": "openrouter"
    }
  }
}

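6.3 Key environment variables

Variable	Description	Example
`NANOBOT_AGENTS__DEFAULTS__MODEL`	Default model	`anthropic/claude-3-opus`
`NANOBOT_PROVIDERS__OPENROUTER__API_KEY`	OpenRouter API key	`sk-or-v1-xxx`
`NANOBOT_TOOLS__RESTRICT_TO_WORKSPACE`	Restrict tools to the workspace	`true`

Environment variables take precedence over the configuration file. Use __ (two underscores) as the delimiter between nesting levels.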
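
How a flat variable name maps onto nested configuration can be shown with a small standalone sketch. The functions below are hypothetical and only illustrate the naming rule and the precedence order; they are not the project's loader, and they use snake_case keys and leave all values as strings for simplicity.

```python
def env_overrides(environ: dict, prefix: str = "NANOBOT_") -> dict:
    """Turn PREFIX_A__B__C=value entries into {"a": {"b": {"c": value}}}."""
    result: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable such as PATH
        keys = name[len(prefix):].lower().split("__")
        node = result
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return result


def deep_merge(base: dict, override: dict) -> dict:
    """Return base with override applied on top; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


file_config = {"agents": {"defaults": {"model": "model-from-file", "provider": "openrouter"}}}
environ = {"NANOBOT_AGENTS__DEFAULTS__MODEL": "model-from-env", "PATH": "/usr/bin"}
effective = deep_merge(file_config, env_overrides(environ))
# effective["agents"]["defaults"]["model"] is now "model-from-env";
# the provider value from the file is kept
```

A single underscore stays part of a key name (as in API_KEY), which is why the nesting delimiter has to be the double underscore.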

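7. Recommended Learning Path

Stage 1: Getting started (1-2 hours)

Goal: understand the project as a whole, and be able to run it and make simple changes

Order	File	Learning goal
1	`README.md`	Learn what the project does and how it is used
2	`pyproject.toml`	Understand the dependencies and project metadata
3	`nanobot/cli/commands.py`	Understand the CLI entry point and command structure
4	`nanobot/config/schema.py`	Understand the configuration models (focus on the `Config` class)
5	`nanobot/bus/events.py`	Understand the message data structures
6	`nanobot/bus/queue.py`	Understand the async message bus

Verification tasks:

Change __logo__ in nanobot/__init__.py to a different emoji
Add a custom persona description to nanobot/templates/SOUL.md
Run nanobot agent -m "你好" to test

Stage 2: Intermediate (3-5 hours)

Goal: understand the core mechanisms and be able to add new features

Order	File	Learning goal
1	`nanobot/agent/loop.py`	Understand the agent main loop and the message handling flow
2	`nanobot/agent/context.py`	Understand prompt construction and context management
3	`nanobot/agent/memory.py`	Understand the two-layer memory system
4	`nanobot/agent/tools/registry.py`	Understand tool registration and execution
5	`nanobot/channels/base.py`	Understand the abstract channel interface
6	`nanobot/channels/manager.py`	Understand channel initialization and lifecycle
7	`nanobot/providers/registry.py`	Understand provider registration and matching logic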