V2EX = way to explore
V2EX 是一个关于分享和探索的地方
现在注册
已注册用户请  登录
dancancer
V2EX  ›  OpenAI

Gemini CLI 交互记录分析

  •  
  •   dancancer · 31 天前 · 1539 次点击

    通过遥测工具抓取了一下 Gemini CLI 的对话消息记录,并做了一下简单的分析,感觉有所启发,记录一下。 对话内容是让 Gemini CLI 解析一下自己的代码,下面就来逐条消息进行一下分析。

    初始设置

    第一条用户 prompt 是由 Gemini CLI 发出的,内容如下

    {
      "role": "user",
      "parts": [
        {
          "text": "This is the Gemini CLI. We are setting up the context for our chat.\n  Today's date is 2025 年 8 月 1 日星期五.\n  My operating system is: darwin\n  I'm currently working in the directory: /Users/xxx/mycode/gemini-cli\n  Showing up to 200 items (files + folders). Folders or files indicated with ... contain more items not shown, were ignored, or the display limit (200 items) was reached.\n\n/Users/xxx/mycode/gemini-cli/\n├───.editorconfig\n├───.gitattributes\n├───.gitignore\n├───.npmrc\n├───.nvmrc\n├───.prettierrc.json\n├───CONTRIBUTING.md\n├───Dockerfile\n├───esbuild.config.js\n├───eslint.config.js\n├───GEMINI.md\n├───LICENSE\n├───Makefile\n├───package-lock.json\n├───package.json\n├───pnpm-lock.yaml\n├───README.md\n├───ROADMAP.md\n├───tsconfig.json\n├───.gcp/\n│   ├───Dockerfile.gemini-code-builder\n│   └───release-docker.yaml\n├───.gemini/\n├───.git/...\n├───.github/\n│   ├───CODEOWNERS\n│   ├───pull_request_template.md\n│   ├───actions/\n 
          ...省略
          "
    
        }
      ]
    }
    

    从记录中可以清楚看到第一条消息提供了几个比较重要的信息

    1. 首先是告诉模型目前角色是 Gemini CLI: This is the Gemini CLI
    2. 紧接着是明确告诉模型现在是在做上下文设定: We are setting up the context for our chat.
    3. 再接着是告诉模型当前的时间、操作系统: Today's date is 2025 年 8 月 1 日星期五.\n My operating system is: darwin
    4. 最后是告诉模型当前的工作目录,以及目录树结构(限制了两百个)

    模型接收到这些信息后就有了第一条返回

    {
      "role": "model",
      "parts": [{ "text": "Got it. Thanks for the context!" }]
    }
    

    很礼貌的回复了一下 "Got it. Thanks for the context!"

    真实用户 prompt

    在对话中,由用户输入的 prompt 其实只有一句:

    {
      "role": "user",
      "parts": [
        {
          "text": "帮我分析一下 @packages/core/ 中的代码,详细分析其技术架构、工作原理,并画出流程图"
        }
      ]
    }
    

    模型回复

    在消息记录中模型的回复内容为:

    {
      "role": "model",
      "parts": [
        {
          "text": "好的,我们来深入分析一下 `@packages/core` 的代码。\n\n 这个包是 Gemini CLI 的核心,负责处理与 Google AI 模型的交互、工具执行、状态管理和配置等关键任务。可以把它看作是连接用户输入(来自 `packages/cli`)和模型能力( Google AI )的“大脑”。\n\n 首先,我需要浏览一下 `@packages/core` 的目录结构来更好地理解其内部模块。\n\n"
        }
      ]
    },
    {
      "role": "model",
      "parts": [
        {
          "functionCall": {
            "name": "list_directory",
            "args": {
              "path": "/Users/xxx/mycode/gemini-cli/packages/core/src/"
            }
          }
        }
      ]
    }
    

    这一段其实只包含了模型思考后的正式回复,完整的信息也抓取到了,如下,可以看到其中模型的思考过程,消息历史记录中思考的部分都被省略了

    {
      "type": "response",
      "content": [
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "thought": true,
                    "text": "**Initiating Code Analysis**\n\n\\n\\n\n\nI'm focusing on breaking down the user's request. My first step is identifying the core task, which is to analyze the `@packages/core/` directory. Then, I will address the required outputs, like detailed code summaries and potential vulnerabilities. I'm now considering how best to structure these outputs for clarity and impact.\n\n\n"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "thought": true,
                    "text": "**Refining Output Structures**\n\n\\n\\n\n\nI'm working on the structure of the technical architecture analysis. I'm also structuring the explanation of working principles, ensuring clarity and readability in Chinese. The flowchart generation is the next step; I'm considering tools and approaches to best visualize the code's logic.\n\n\n"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "thought": true,
                    "text": "**Examining Module Interactions**\n\n\\n\\n\n\nI've just finished analyzing `index.ts` to understand the module exports. The next step is examining the interactions between modules, especially how `core`, `mcp`, and `config` are connected. I'll focus on function calls and dependencies to build a clear map of data flow. This will help me understand the technical architecture better, which is what I'm targeting next.\n\n\n"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "thought": true,
                    "text": "**Defining Core Package Modules**\n\n\\n\\n\n\nI've moved on to analyzing the `index.ts` files within key subdirectories, like `core`, `mcp`, and `tools`. I'm identifying what each module exports to understand its role in the larger system. Now I'm beginning to generate initial descriptions for each module, and I'm also thinking about the interactions and data flow.\n\n\n"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "thought": true,
                    "text": "**Examining Core Module Structure**\n\n\\n\\n\n\nI'm initiating the directory listing to obtain the complete file structure. After that, I'll analyze the `index.ts` file to understand the module's public API. I will then examine the key subdirectories. This approach helps me grasp the core functions and the relationships among the modules within the package.\n\n\n"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "thought": true,
                    "text": "**Summarizing Module Roles**\n\n\\n\\n\n\nI'm currently reading the `index.ts` files within the `core`, `mcp`, `tools`, and `config` directories. I'm focusing on the function exports to determine each module's function within the larger context. The key objective is to understand their roles in the system. I am also starting to think about the data flow among them.\n\n\n"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "thought": true,
                    "text": "**Defining Initial Hypotheses**\n\n\\n\\n\n\nI'm consolidating my initial findings. I now understand the directory structure and the basic responsibilities of the modules based on the `index.ts` exports. My next step is to confirm the roles of each module, specifically `core`, `mcp`, and `tools`. Then I will start synthesizing the response in Chinese and begin generating a text-based flowchart.\n\n\n"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "thought": true,
                    "text": "**Defining Execution Steps**\n\n\\n\\n\n\nI'm planning the execution phase, including the order of file reads and the analysis process for each. I will start by listing the directory's content, then proceed with analyzing key files like `index.ts` in core, mcp, and tools. The goal is to extract module exports and understand interdependencies. I'm structuring my actions using the identified tool usage.\n\n\n"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "thought": true,
                    "text": "**Structuring Architectural Overview**\n\n\\n\\n\n\nI'm concentrating on a response structure for the detailed architectural overview. I'm dividing the technical architecture section by module (core, mcp, tools, config, etc.) for clarity. Also, I'm planning the end-to-end flow description. I will then use a numbered list to guide the user through the steps in a typical interaction. After this, I will build the text-based Mermaid flowchart for visualization.\n\n\n"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "thought": true,
                    "text": "**Defining Core Modules**\n\n\\n\\n\n\nI'm focused on using the directory listing and examining `index.ts` files to define key modules. My goal is to understand the responsibilities of `core`, `mcp`, `tools`, and `config`. I am also planning to include a review of `services`, `telemetry`, and `utils`. The next phase is synthesizing my understanding of the data flow between the modules.\n\n\n"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "text": "好的,我们来深入分析一下 `@packages/core` 的代码。\n\n 这个包是 Gemini CLI 的核心,"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "text": "负责处理与 Google AI 模型的交互、工具执行、状态管理和配置等关键任务。可以把它看作是连接用户输入(来自 `packages/cli`)和模型能力( Google AI )的“大脑”。\n\n 首先,我需要浏览一下 `@packages/core` 的目录结构来更好地"
                  }
                ]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [{ "text": "理解其内部模块。\n\n" }]
              }
            }
          ],
          "usageMetadata": { "trafficType": "PROVISIONED_THROUGHPUT" }
        },
        {
          "candidates": [
            {
              "content": {
                "role": "model",
                "parts": [
                  {
                    "functionCall": {
                      "name": "list_directory",
                      "args": {
                        "path": "/Users/xupeng/mycode/gemini-cli/packages/core/src/"
                      }
                    }
                  }
                ]
              },
              "finishReason": "STOP"
            }
          ],
          "usageMetadata": {
            "promptTokenCount": 12968,
            "candidatesTokenCount": 120,
            "totalTokenCount": 15550,
            "trafficType": "PROVISIONED_THROUGHPUT",
            "promptTokensDetails": [{ "modality": "TEXT", "tokenCount": 12968 }],
            "candidatesTokensDetails": [
              { "modality": "TEXT", "tokenCount": 120 }
            ],
            "thoughtsTokenCount": 2462
          }
        }
      ]
      }
    

    工具调用及结果返回

    可以看到在上一段的模型回复中模型发回了一段特殊的回复,parts 中只包含了 functionCall 节点,这其实就是模型调用对应工具的的方式,这个回复中包含了需要调用的工具名称以及参数信息。 Gemini CLI 在收到工具调用信息后会根据相关信息执行工具调用,然后将工具调用的结果再次返回给模型,具体的消息如下:

    {
      "role": "user",
      "parts": [
        {
          "functionResponse": {
            "id": "list_directory-1754016459561-12bef4b2f24e9",
            "name": "list_directory",
            "response": {
              "output": "Directory listing for /Users/xxx/mycode/gemini-cli/packages/core/src/:\n[DIR] __mocks__\n[DIR] code_assist\n[DIR] config\n[DIR] core\n[DIR] mcp\n[DIR] services\n[DIR] telemetry\n[DIR] tools\n[DIR] utils\nindex.test.ts\nindex.ts"
            }
          }
        }
      ]
    }
    

    总结

    通过分析交互历史数据,我们可以看到 Gemini CLI 与用户之间的交互遵循一个清晰的流程:

    1. 上下文设置:CLI 首先向模型提供详细的上下文信息,包括系统环境、目录结构等
    2. 用户请求:用户提出具体的问题或任务请求
    3. 模型处理:模型分析请求并决定下一步行动
    4. 工具调用:模型可能需要调用工具来获取更多信息或执行操作
    5. 结果反馈:工具执行结果返回给模型,模型生成最终响应
    6. 用户响应:最终响应返回给用户
    5 条回复    2025-08-02 10:24:50 +08:00
    goinghugh
        1
    goinghugh  
       31 天前
    4.工具调用:模型可能需要调用工具来获取更多信息或执行操作
    5.结果反馈:工具执行结果返回给模型,模型生成最终响应
    这两步应该会反复多次进行,应该会用到一些模式吧?
    snow0
        2
    snow0  
       31 天前
    Agent 的标准处理流程
    keakon
        3
    keakon  
       31 天前
    没有 list_directory 的 schema 定义部分吗?它怎么选择工具,并准确传入参数的?
    dancancer
        4
    dancancer  
    OP
       30 天前
    @keakon 通过 function calling 来传入的,类似 open ai 的方式
    dancancer
        5
    dancancer  
    OP
       30 天前
    @goinghugh 循环执行控制有一个 checkNextSpeaker 的类来处理,综合了一些判断条件,并且调用了 LLM 判断最后一次模型对话输出内容来兜底
    关于   ·   帮助文档   ·   自助推广系统   ·   博客   ·   API   ·   FAQ   ·   实用小工具   ·   922 人在线   最高记录 6679   ·     Select Language
    创意工作者们的社区
    World is powered by solitude
    VERSION: 3.9.8.5 · 26ms · UTC 20:52 · PVG 04:52 · LAX 13:52 · JFK 16:52
    Developed with CodeLauncher
    ♥ Do have faith in what you're doing.