Conversation
Add a new instruction handler that enables remote binary updates on edge devices. The handler downloads a binary from a provided artifact URL, validates ELF magic bytes, atomically replaces the current executable via same-filesystem rename, and schedules a restart using syscall.Exec. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add UpdateBinary variant to the Instruction and InstructionContent enums with PingPong-based session handling. Expose POST /action/update_binary for single-bot updates and POST /action/update_binary_all for fleet-wide binary updates. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a dashboard page for remotely updating bot binaries. Includes API client with zod-validated request/response schemas, a page loader that fetches online robots, and a form UI with single-bot and fleet-wide update actions. Adds an Update entry to the dashboard sidebar. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough新增跨前端/服务端/机器人端的二进制实时更新功能:包含文档、机器人端指令与下载/验证/原子替换实现、服务端单机与批量 API、前端客户端与 UI,以及 WebSocket 发送完成信号的协同与异步重启语义(无其他兼容性变更)。 Changes
Sequence Diagram(s)sequenceDiagram
actor User
participant Frontend as Frontend (Svelte)
participant API as Service API (Rust)
participant Bot as Bot (Go)
User->>Frontend: 输入 artifact_url 并触发更新
Frontend->>API: POST /action/update_binary(_all) {artifact_url, robot_id?}
API->>Bot: 通过会话发送 Instruction::UpdateBinary {artifact_url}
activate Bot
Bot->>Bot: 解析可执行路径并在同目录创建临时文件
Bot->>Bot: 下载工件(HTTP)并验证 ELF 魔数
Bot->>Bot: chmod +x 并原子重命名替换可执行文件
Bot->>Bot: 等待写完成通知(Done 通道)并异步 syscall.Exec 重启
Bot-->>API: 返回 UpdateBinaryResponse {status, message}
deactivate Bot
API-->>Frontend: 返回单机或聚合结果
Frontend->>User: 展示结果(含每机器人状态)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 诗歌
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Document the end-to-end binary update flow, wire protocol JSON examples, REST API endpoints, safety guarantees (atomic rename, ELF validation, same-filesystem temp), and restart semantics via syscall.Exec. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (4)
packages/service/src/api/action/update_binary.rs (1)
4-8: 可选改进:考虑使用更强的类型约束。
robot_id字段使用String类型是可行的,但如果 API 始终期望 UUID 格式,使用uuid::Uuid类型可以在编译时和反序列化时提供更强的验证。当前实现可以正常工作,这只是一个可选的类型安全增强建议。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/service/src/api/action/update_binary.rs` around lines 4 - 8, The robot_id field in UpdateBinaryRequest currently uses String; to enforce UUID format, change UpdateBinaryRequest::robot_id to uuid::Uuid, add the uuid::Uuid import, ensure serde support for Uuid is enabled (or add #[serde(with = "...")] if needed), and update any code that constructs or reads UpdateBinaryRequest (e.g., API handlers, tests) to pass/parse a Uuid instead of a raw String so deserialization and compile-time typing enforce UUID validity.packages/workstation/src/routes/dashboard/update/+page.ts (1)
7-13: 考虑为数据获取添加错误处理。当前
fetchStatsOnlineRobots调用失败时会直接导致页面加载失败。考虑添加 try-catch 以提供更友好的降级体验(例如返回空数组并在 UI 中显示提示)。♻️ 可选的错误处理方案
export const load: PageLoad = async ({ fetch }) => { - const onlineRobots = await fetchStatsOnlineRobots(fetch); - - return { - onlineRobots, - }; + try { + const onlineRobots = await fetchStatsOnlineRobots(fetch); + return { onlineRobots }; + } catch (err) { + console.error('Failed to fetch online robots:', err); + return { onlineRobots: [] }; + } };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/workstation/src/routes/dashboard/update/`+page.ts around lines 7 - 13, 在 PageLoad 的 load 函数中为 fetchStatsOnlineRobots 添加错误处理:把调用 fetchStatsOnlineRobots(fetch) 包在 try-catch 中(引用函数名 fetchStatsOnlineRobots 和返回字段 onlineRobots),在 catch 分支返回友好的降级值(例如 onlineRobots: []),并在 catch 中记录或上报错误以便调试;确保返回结构与原来一致以让页面接收降级数据。packages/workstation/src/routes/dashboard/update/+page.svelte (1)
82-94: 可选改进:考虑分离加载状态。当前两个按钮共享同一个
loading状态,导致点击任一按钮时两个按钮都会显示 Spinner。虽然按钮已正确禁用,但如果希望更精确地指示哪个操作正在进行,可以考虑使用独立的加载状态。♻️ 分离加载状态的可选方案
- let loading = $state(false); + let loadingSelected = $state(false); + let loadingAll = $state(false); + const loading = $derived(loadingSelected || loadingAll);然后在各自的处理函数中使用对应的状态变量,并在对应按钮中检查具体的加载状态来决定是否显示 Spinner。
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/workstation/src/routes/dashboard/update/`+page.svelte around lines 82 - 94, Current shared loading state causes both buttons to show a Spinner; introduce separate boolean states (e.g., loadingUpdateSelected and loadingUpdateAll) and switch the bindings: in the handler for the selected-bot update (the function bound to the first Button's onclick) set loadingUpdateSelected true/false around the async work, and in handleUpdateAll set loadingUpdateAll true/false; update the Buttons to disable based on their specific loading state (disabled={loadingUpdateSelected || !artifactUrl} and disabled={loadingUpdateAll || !artifactUrl}) and show Spinner conditionally using {`#if` loadingUpdateSelected} and {`#if` loadingUpdateAll} respectively so only the active action shows a spinner.docs/live-update.md (1)
7-32: 为流程图代码块添加语言标识符。静态分析工具提示此代码块缺少语言标识符。对于 ASCII 流程图,建议使用
text或plaintext作为语言标识符以满足 lint 规则。📝 建议修复
-``` +```text Frontend Service Bot🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/live-update.md` around lines 7 - 32, 文档中的 ASCII 流程图代码块缺少语言标识符(在 docs/live-update.md 的三重反引号代码块),导致静态分析报警;请在该代码块的起始 ``` 后添加语言标识符(例如 ```text 或 ```plaintext),以标明这是纯文本/ASCII 流程图并满足 lint 规则,确保只修改包含 “Frontend Service Bot” 流程图的代码块。
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/bot/eventloop/instructions/update_binary.go`:
- Around line 44-45: 不要把完整的预签名 URL 打到日志;在 UpdateBinaryAction 中,当记录
req.ArtifactUrl 时(函数名 UpdateBinaryAction、变量 req.ArtifactUrl、logger.Logger().Info
调用处),把 URL 的敏感部分掩码或只记录 host/path 摘要:解析 req.ArtifactUrl,移除或替换 query
string(或将其全部打码为 "***"),然后把处理后的安全字符串传给 logger,而不是原始 req.ArtifactUrl。
- Around line 72-109: The current flow only checks ELF magic and then replaces
the running binary; enforce transport and content integrity: (1) verify
req.ArtifactUrl uses HTTPS and return an error if not; (2) after io.Copy to
tmpPath (and after the ELF magic check), compute the file's SHA-256 and compare
it to an expected checksum provided by the update request (e.g., req.SHA256) or,
if the request supplies a signature (e.g., req.Signature and a known public
key), verify the signature against the downloaded bytes; (3) only proceed to
chmod/rename/replace the executable when the checksum or signature verification
succeeds; (4) keep existing cleanup paths (cleanup(), os.Remove(tmpPath)) on any
verification failure. Reference symbols to change: req.ArtifactUrl,
req.SHA256/req.Signature, tmpFile/tmpPath, io.Copy, elfMagic, the ELF validation
block, and the final chmod/rename replacement logic.
- Around line 73-78: 在 UpdateBinaryAction 中不要使用 http.Get 无上下文/超时;用 ctx 和显式超时的
http.Client 替换:创建一个带 Timeout 的 http.Client(例如 30s),用
http.NewRequestWithContext(ctx, http.MethodGet, req.ArtifactUrl, nil) 构造请求并通过
client.Do 发起请求,按原逻辑在请求构建或 Do 返回错误时调用 cleanup() 并返回 UpdateBinaryResponse,成功后继续使用
resp 并确保 resp.Body.Close() 被 defer 调用以释放资源。
In `@packages/service/src/api/action.rs`:
- Around line 136-140: CI is failing cargo fmt; run cargo fmt on the repository
and commit the formatting changes so the code around CONNECTIONS.get(...),
conn.value().send_instruction::<serde_json::Value>(Instruction::UpdateBinary {
artifact_url: request.artifact_url.clone(), ... }) and the similar block at the
later location (the 188-193 section) are formatted per rustfmt; ensure you stage
and push the formatted files before merging.
- Around line 187-217: The current loop in update_binary_all uses
CONNECTIONS.iter() and await on conn.value().send_instruction::<AnyDeserialize>
serially and treats any Ok(_) as success, which misrepresents bot-side failures
and causes linear latency; change this to either (A) fire-and-forget by spawning
tasks for each conn.send_instruction and immediately return UpdateBinaryResponse
with "sent" semantic, or (B) concurrently collect results by mapping each conn
to a spawned async task (or join_all) that deserializes into a strong response
type (replace AnyDeserialize with the concrete update response type), aggregate
per-robot statuses/errors, and return an overall UpdateBinaryResponse reflecting
failures. Ensure to reference CONNECTIONS.iter(),
send_instruction::<AnyDeserialize>(), and update_binary::UpdateBinaryResponse
when making the change.
In `@packages/workstation/src/lib/api/action/update_binary.ts`:
- Around line 39-44: 非 2xx 响应不应只抛出 status/statusText,要读取并包含服务端返回的错误正文来保留后端具体原因;在
update_binary.ts 的非 OK 分支(检查 response.ok 的地方)先 await response.text()(或尝试
response.json() 并回退到 text),然后将该文本包含到抛出的 Error 消息中(例如 "Error updating binary:
<status> <statusText>: <serverBody>"),并在文件中其它类似位置(提到的 62-67 区块)做相同处理;保持对正常分支使用
await response.json() 并传入 ActionUpdateBinaryResponse.parse(data) 不变。
---
Nitpick comments:
In `@docs/live-update.md`:
- Around line 7-32: 文档中的 ASCII 流程图代码块缺少语言标识符(在 docs/live-update.md
的三重反引号代码块),导致静态分析报警;请在该代码块的起始 ``` 后添加语言标识符(例如 ```text 或
```plaintext),以标明这是纯文本/ASCII 流程图并满足 lint 规则,确保只修改包含 “Frontend Service Bot”
流程图的代码块。
In `@packages/service/src/api/action/update_binary.rs`:
- Around line 4-8: The robot_id field in UpdateBinaryRequest currently uses
String; to enforce UUID format, change UpdateBinaryRequest::robot_id to
uuid::Uuid, add the uuid::Uuid import, ensure serde support for Uuid is enabled
(or add #[serde(with = "...")] if needed), and update any code that constructs
or reads UpdateBinaryRequest (e.g., API handlers, tests) to pass/parse a Uuid
instead of a raw String so deserialization and compile-time typing enforce UUID
validity.
In `@packages/workstation/src/routes/dashboard/update/`+page.svelte:
- Around line 82-94: Current shared loading state causes both buttons to show a
Spinner; introduce separate boolean states (e.g., loadingUpdateSelected and
loadingUpdateAll) and switch the bindings: in the handler for the selected-bot
update (the function bound to the first Button's onclick) set
loadingUpdateSelected true/false around the async work, and in handleUpdateAll
set loadingUpdateAll true/false; update the Buttons to disable based on their
specific loading state (disabled={loadingUpdateSelected || !artifactUrl} and
disabled={loadingUpdateAll || !artifactUrl}) and show Spinner conditionally
using {`#if` loadingUpdateSelected} and {`#if` loadingUpdateAll} respectively so
only the active action shows a spinner.
In `@packages/workstation/src/routes/dashboard/update/`+page.ts:
- Around line 7-13: 在 PageLoad 的 load 函数中为 fetchStatsOnlineRobots 添加错误处理:把调用
fetchStatsOnlineRobots(fetch) 包在 try-catch 中(引用函数名 fetchStatsOnlineRobots 和返回字段
onlineRobots),在 catch 分支返回友好的降级值(例如 onlineRobots: []),并在 catch
中记录或上报错误以便调试;确保返回结构与原来一致以让页面接收降级数据。
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 47a68a4c-37c7-4610-9144-011c350a89f3
📒 Files selected for processing (11)
docs/live-update.mdpackages/bot/eventloop/instructions/instructions.gopackages/bot/eventloop/instructions/update_binary.gopackages/service/src/api/action.rspackages/service/src/api/action/update_binary.rspackages/service/src/service/instructions.rspackages/service/src/service/instructions/update_binary.rspackages/workstation/src/lib/api/action/update_binary.tspackages/workstation/src/routes/dashboard/SidebarContent.sveltepackages/workstation/src/routes/dashboard/update/+page.sveltepackages/workstation/src/routes/dashboard/update/+page.ts
Strip query string from artifact_url before logging to prevent presigned URL credentials from leaking into log output. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…load Replace bare http.Get() with http.NewRequestWithContext and an http.Client with a 30-second timeout to prevent the handler from blocking indefinitely on slow or unresponsive artifact sources. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace AnyDeserialize with serde_json::Value to parse bot responses and report per-robot status/message. Return a new UpdateBinaryAllResponse type with individual RobotUpdateResult entries and an overall status of "partial_failure" when any robot reports an error. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Read response body text on non-OK responses instead of only showing status/statusText, so the UI can display backend-provided error details. Also update the update_binary_all types to match the new per-robot aggregate response from the service. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
3a96bc2 to
5b03f7b
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/live-update.md`:
- Around line 7-32: The fenced diagram block in docs/live-update.md is missing a
language marker and triggers MD040; update the opening fence (the
triple-backtick that begins the diagram) to include a plain text language tag
such as "text" or "plain" (i.e., change ``` to ```text) so the code fence is
lint-compliant while preserving the diagram content.
- Around line 89-104: 文档中的 update_binary_all 响应示例和文字描述仍然使用旧的 {status, message}
结构,需要更新为与服务端实现一致的 {status, results[]};修改 live-update.md 中示例响应 JSON 为包含 results
数组(每项描述单个机器人更新结果和可能的 per-bot error 信息或状态),并调整说明文字为“服务端返回 status 和 results[],每个
results 项对应一个机器人,单机错误只记录在 results 中/服务器端日志,不会使整个请求失败”(参照 action.rs
中的实现以确保字段名和结构一致)。
In `@packages/bot/eventloop/instructions/update_binary.go`:
- Around line 143-150: The fixed 1s sleep before calling syscall.Exec is unsafe;
replace the sleep-based restart with a synchronization signal from the WebSocket
writer: create a restart/done channel (e.g., restartDone chan struct{}) and have
eventloopSendJson send a completion signal on it after the wsjson.Write path has
finished handling the response; change the goroutine that currently sleeps to
instead block on that channel (with an optional timeout fallback) and only call
syscall.Exec(execPath, os.Args, os.Environ()) after receiving the signal; update
references to eventloopSendJson and the exec goroutine so the writer emits the
done signal and the exec waiter consumes it.
In `@packages/service/src/api/action.rs`:
- Around line 147-156: The code extracts status/message from the JSON (via
info.get(...).and_then(...).unwrap_or(...)) and currently defaults
missing/invalid status to "unknown" and only treats "error" as failure, which
lets malformed responses be reported as success; change the parsing in the
status extraction (the variables created from info.get("status")) to treat
missing/invalid/unknown as a failure sentinel (e.g., default to "error" or set a
boolean parse_failed flag), and update the downstream check (the logic that
currently only treats status == "error") to consider parse failures/unrecognized
values as failure (i.e., treat anything other than explicit "ok" as failure) so
that functions/methods handling status in action.rs consistently mark
malformed/unknown responses as failed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9c1b26a2-8727-4c93-962b-8700fb559c5e
📒 Files selected for processing (6)
docs/live-update.mdpackages/bot/eventloop/instructions/update_binary.gopackages/service/src/api/action.rspackages/service/src/api/action/update_binary.rspackages/workstation/src/lib/api/action/update_binary.tspackages/workstation/src/routes/dashboard/update/+page.svelte
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/workstation/src/routes/dashboard/update/+page.svelte
…ation
Update the response schema from {status, message} to {status, results[]}
to match the current per-robot aggregation endpoint. Clarify batch
behavior description.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…restart Introduce SendEnvelope with a done channel that the eventloop send goroutine closes after wsjson.Write completes. WrapResponseAction now sends responses through this envelope, and UpdateBinaryAction waits on the done signal before calling syscall.Exec, eliminating the race between the fixed sleep and actual WebSocket flush. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Default missing status to "error" instead of "unknown", and default
missing message to "invalid response: missing fields". In the
update_binary_all aggregation, check for the expected success status
("post_update") rather than only matching "error", so malformed or
unrecognised responses correctly trigger partial_failure.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
docs/live-update.md (1)
7-7:⚠️ Potential issue | 🟡 Minor给这个围栏代码块补上语言标记。
这里仍会触发 markdownlint 的 MD040;把 opening fence 改成带
text语言标记的 fence 即可。🛠️ 建议修改
-``` +```text🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/live-update.md` at line 7, The code fence in the documentation's bare triple-backtick block should include a language tag to satisfy markdownlint MD040; update the opening fence from ``` to ```text so the block becomes language-marked (i.e., replace the bare opening fence with one that specifies "text").
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/live-update.md`:
- Around line 98-115: Update the example response's top-level "status" from "ok"
to "partial_failure" to match the aggregation semantics implemented in action.rs
(the API's batch update aggregation) when any item in "results" has status
"error" — i.e., change the sample JSON to show "status": "partial_failure" to
reflect that one robot returned an error.
In `@packages/bot/eventloop/eventloop.go`:
- Around line 60-64: The current code closes the done channel regardless of
backend.SendJson outcome, causing callers (e.g., the restart goroutine in
update_binary.go) to treat failed writes as successful; change the logic in the
function that calls backend.SendJson so that close(done) is executed only when
err == nil (i.e., after a successful backend.SendJson), keep done untouched on
error, and if you still need to trigger a restart or fallback on send failures
implement a separate failure/timeout signal (e.g., a dedicated errChan or
context timeout) instead of reusing done.
In `@packages/bot/eventloop/instructions/update_binary.go`:
- Around line 110-128: 当前只检查 ELF 魔数(elfMagic)会把错误架构的二进制当作有效文件覆盖掉可执行文件;在执行
os.Rename 并返回 post_update 之前,请打开并解析该临时文件的 ELF 头(例如使用 debug/elf 包的 elf.Open 或读取
e_ident / e_type / e_machine 字段),并验证 ELF Class(32/64)和 Machine 与当前运行时匹配(将
elf.File.Class 与期望的 elf.ELFCLASS64/ELFCLASS32、elf.File.Machine 与对应
runtime.GOARCH 映射比对);如果不匹配,删除 tmpPath 并返回 UpdateBinaryResponse
错误(类似现有错误分支),只有在验证通过后再执行 os.Rename、调用 post_update 并允许 syscall.Exec 执行新二进制。
In `@packages/service/src/api/action.rs`:
- Around line 136-144: The HTTP handler is directly awaiting
conn.value().send_instruction(...) (called with Instruction::UpdateBinary) so a
stuck bot blocks the request; wrap each send_instruction call in a timeout
(e.g., tokio::time::timeout) at the API layer and treat a timeout as a per-robot
failure response instead of hanging the whole handler—apply this change for the
send_instruction usage at the CONNECTIONS.get(&request.robot_id) site and
similarly for the batch path (the update_binary_all usage around the other
occurrence), returning a clear timeout/failure result for that robot while
allowing others to proceed.
---
Duplicate comments:
In `@docs/live-update.md`:
- Line 7: The code fence in the documentation's bare triple-backtick block
should include a language tag to satisfy markdownlint MD040; update the opening
fence from ``` to ```text so the block becomes language-marked (i.e., replace
the bare opening fence with one that specifies "text").
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 8be5e88b-c625-458e-befa-185c269baa5e
📒 Files selected for processing (6)
docs/live-update.mdpackages/bot/eventloop/eventloop.gopackages/bot/eventloop/instructions/update_binary.gopackages/bot/eventloop/share/action.gopackages/bot/lib/context.gopackages/service/src/api/action.rs
Align the live-update batch response example with the service aggregation semantics so mixed outcomes report partial_failure at the top level.
Only signal update completion after SendJson succeeds so restart handlers do not treat failed websocket writes as flushed responses.
Wrap update_binary instruction waits with a 60-second API timeout so hung bots stop blocking HTTP handlers and batch updates report timeouts as per-robot failures.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
packages/service/src/api/action.rs (1)
178-198: 超时与指令失败的 HTTP 状态码不一致。当前实现中:
- 指令失败(
Ok(Err(err)))返回 HTTP 400(Line 186-188)- 超时(
Err(_))返回 HTTP 200 并在响应体中标记status: "error"(Line 196)这种不一致可能导致客户端处理逻辑复杂化。建议统一:要么都返回 HTTP 200 + 业务状态码,要么将超时也作为 HTTP 错误返回。
🛠️ 可选方案:统一返回 HTTP 200 + 业务错误
Ok(Err(err)) => { log::error!( "Failed to update binary on robot {}: {:?}", request.robot_id, err ); - Err(GenericResponse::BadRequest(PlainText( - "failed to update binary".to_string(), - ))) + Ok(Json(update_binary::UpdateBinaryResponse { + status: "error".to_string(), + message: format!("instruction failed: {}", err), + })) }packages/bot/eventloop/eventloop.go (1)
55-58: 考虑兼容*lib.SendEnvelope,提高接口健壮性。当前仓库中 SendEnvelope 仅以值类型构造并发送(见
packages/bot/eventloop/share/action.go:48),但在packages/bot/eventloop/eventloop.go:55的断言中只处理值类型。为避免未来上游代码意外传入指针时出现静默失效,建议同时处理指针类型。♻️ 建议修改
payload := msg var done chan struct{} - if env, ok := msg.(lib.SendEnvelope); ok { - payload = env.Payload - done = env.Done - } + switch env := msg.(type) { + case lib.SendEnvelope: + payload = env.Payload + done = env.Done + case *lib.SendEnvelope: + if env != nil { + payload = env.Payload + done = env.Done + } + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/bot/eventloop/eventloop.go` around lines 55 - 58, The type assertion for lib.SendEnvelope in the msg handling only covers the value type and should also accept a pointer to make the interface robust; update the branch around the msg type check (the block that currently does `if env, ok := msg.(lib.SendEnvelope) { payload = env.Payload; done = env.Done }`) to additionally check for `*lib.SendEnvelope` (or use a type switch) and extract Payload and Done from the pointer case as well so both value and pointer SendEnvelope instances are handled identically.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@packages/bot/eventloop/eventloop.go`:
- Around line 55-58: The type assertion for lib.SendEnvelope in the msg handling
only covers the value type and should also accept a pointer to make the
interface robust; update the branch around the msg type check (the block that
currently does `if env, ok := msg.(lib.SendEnvelope) { payload = env.Payload;
done = env.Done }`) to additionally check for `*lib.SendEnvelope` (or use a type
switch) and extract Payload and Done from the pointer case as well so both value
and pointer SendEnvelope instances are handled identically.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ec6b50e6-1074-4abf-86af-5582e25c9a98
📒 Files selected for processing (4)
docs/live-update.mdpackages/bot/eventloop/eventloop.gopackages/bot/lib/context.gopackages/service/src/api/action.rs
There was a problem hiding this comment.
♻️ Duplicate comments (1)
docs/live-update.md (1)
7-7:⚠️ Potential issue | 🟡 Minor请为该围栏代码块补充语言标记(仍会触发 MD040)。
这里的流程图代码块起始围栏缺少语言标记,
markdownlint-cli2会持续告警。建议改为```text(或```plain)。建议修改
-``` +```text Frontend Service Bot🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/live-update.md` at line 7, 当前文档中的流程图围栏代码块缺少语言标记,导致 markdownlint-cli2 的 MD040 报警;请在该起始围栏后添加语言标记(例如 将 ``` 改为 ```text 或 ```plain),以便消除警告并保持渲染不变;查找文档中该流程图的起始 ``` 围栏并替换为 ```text(或 ```plain)。
🧹 Nitpick comments (1)
packages/service/src/api/action.rs (1)
228-281: 考虑并发执行以减少总耗时。当前实现串行等待每个机器人的响应。在最坏情况下(N 台机器人都超时),总耗时为 N × 60 秒。如果机器人数量较多,可以考虑使用
futures::future::join_all并发执行以改善响应时间。♻️ 并发执行示例(可选)
use futures::future::join_all; // 收集所有 futures let futures: Vec<_> = CONNECTIONS .iter() .map(|conn| { let robot_id = conn.key().clone(); let artifact_url = request.artifact_url.clone(); let conn_value = conn.value().clone(); // 需要 Connection 实现 Clone async move { let result = timeout( UPDATE_BINARY_TIMEOUT, conn_value.send_instruction::<serde_json::Value>( Instruction::UpdateBinary { artifact_url }, ), ) .await; (robot_id, result) } }) .collect(); // 并发执行 let outcomes = join_all(futures).await;注意:这需要确认
Connection类型支持克隆或使用Arc包装。如果当前架构不支持,串行执行也是可接受的方案。🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/service/src/api/action.rs` around lines 228 - 281, The loop over CONNECTIONS is serial and should be converted to concurrent execution: build a Vec of futures by mapping CONNECTIONS.iter() to async blocks that capture robot_id, request.artifact_url, and a cloned/Arc-wrapped conn.value (so send_instruction can be called concurrently), each future performing the timeout(UPDATE_BINARY_TIMEOUT, conn_value.send_instruction::<serde_json::Value>(Instruction::UpdateBinary { artifact_url })). Use futures::future::join_all to await all futures, then iterate the outcomes and for each (robot_id, result) run the same match logic that calls parse_update_binary_response, update_binary_instruction_failure_response, or update_binary_timeout_response and pushes update_binary::RobotUpdateResult while setting has_failure appropriately; ensure Connection is Clone or wrap it in Arc before cloning.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@docs/live-update.md`:
- Line 7: 当前文档中的流程图围栏代码块缺少语言标记,导致 markdownlint-cli2 的 MD040 报警;请在该起始围栏后添加语言标记(例如
将 ``` 改为 ```text 或 ```plain),以便消除警告并保持渲染不变;查找文档中该流程图的起始 ``` 围栏并替换为 ```text(或
```plain)。
---
Nitpick comments:
In `@packages/service/src/api/action.rs`:
- Around line 228-281: The loop over CONNECTIONS is serial and should be
converted to concurrent execution: build a Vec of futures by mapping
CONNECTIONS.iter() to async blocks that capture robot_id, request.artifact_url,
and a cloned/Arc-wrapped conn.value (so send_instruction can be called
concurrently), each future performing the timeout(UPDATE_BINARY_TIMEOUT,
conn_value.send_instruction::<serde_json::Value>(Instruction::UpdateBinary {
artifact_url })). Use futures::future::join_all to await all futures, then
iterate the outcomes and for each (robot_id, result) run the same match logic
that calls parse_update_binary_response,
update_binary_instruction_failure_response, or update_binary_timeout_response
and pushes update_binary::RobotUpdateResult while setting has_failure
appropriately; ensure Connection is Clone or wrap it in Arc before cloning.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 425723e4-1ebd-4965-844b-456396344f7b
📒 Files selected for processing (3)
docs/live-update.mdpackages/bot/eventloop/eventloop.gopackages/service/src/api/action.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/bot/eventloop/eventloop.go
Live-Update 支持功能实现
概述
该 PR 为系统增加了 Live-Update(在线更新)能力,包含机器人端的指令处理、事件循环消息封装改动、服务端 API、前端控制面板与使用文档。实现支持从指定 artifact URL 下载新二进制、在可执行文件同目录创建临时文件并验证 ELF 魔数、设置可执行权限后原子替换当前可执行文件(rename)、并调度进程重启(通过 syscall.Exec,响应返回后异步触发)。支持单机与批量更新、每机结果上报与聚合。PR 主要为新增功能与文档,未破坏现有公共接口签名,但引入了出站消息封装与完成信号的语义变动,需注意兼容性。
主要变更
文档
机器人端(Go)
服务端 API(Rust)
前端(TypeScript / Svelte)
错误处理、安全与实现要点
导出与兼容性影响
影响范围与审查关注点