Skilljar 官方 Quiz 题库

抓取自 Anthropic Skilljar 十门官方课程(32 个 Quiz · 187 题)+ 6 卷本地补充练习(44 题)· 共 231 题 · 含推荐答案、推理理由与本地卷出处

📘 关于推荐答案
Skilljar 平台仅暴露「我的选择 + 正/误标记」,不公开正确选项。本册的推荐答案由 Claude 基于 Anthropic 官方文档 + 本书系 17 卷知识点逐题推理而来——绿色高亮即为推荐选项,下方附「理由 + 出处」。Vol.8、Vertex、Bedrock 三门课的同主题 quiz 题面互不重复——同样的考点用不同场景重新出题,可作三面练习。

Vol.6/7/11/12/14/15 为本地补充练习(共 44 题),覆盖子智能体、Agent Skills、AI Fluency 框架、教育者/教学/学生场景——这些 Skilljar 课程无官方 quiz,题目基于对应 lesson 教材编写,用于补全测验覆盖。

📑 Vol.8 Building with the Claude API(7 quiz + 1 assessment · 49 题)

  1. Quiz on Accessing Claude with the API · 6 题
  2. Quiz on Prompt Evaluation · 5 题
  3. Quiz on Prompt Engineering Techniques · 6 题
  4. Quiz on Tool Use with Claude · 6 题
  5. Quiz on Features of Claude · 6 题
  6. Quiz on Model Context Protocol · 6 题
  7. Quiz on Agents and Workflows · 6 题
  8. Final Assessment · 8 题

📑 Vol.4 Claude Code in Action(1 quiz · 6 题)

  1. Quiz on Claude Code · 6 题

📑 Vol.9 + Vol.10 MCP(入门 + 进阶)(2 assessment · 12 题)

  1. Final assessment on MCP(Vol.9 入门)· 6 题
  2. Assessment on MCP concepts(Vol.10 进阶)· 6 题

📑 Vol.5 Introduction to Claude Cowork(1 quiz · 5 题)

  1. Quiz on Claude Cowork · 5 题

📑 Vol.2 / Vol.3 / Vol.13 入门 + 应用 Course Quiz(3 quiz · 15 题)

  1. Course Quiz(Vol.2 AI Capabilities)· 5 题
  2. Course Quiz(Vol.3 Claude Code 101)· 5 题
  3. Course Quiz(Vol.13 AI Fluency for nonprofits)· 5 题

📑 Vertex 课程(9 quiz · 58 题)

  1. Quiz on Accessing Claude with the API · 7 题
  2. Quiz on Prompt Engineering Techniques · 6 题
  3. Quiz on Prompt Evaluation · 4 题
  4. Quiz on Tool Use with Claude · 7 题
  5. Quiz on Retrieval Augmented Generation · 8 题
  6. Quiz on Features of Claude · 6 题
  7. Quiz on Model Context Protocol · 7 题
  8. Quiz on Agents and Workflows · 7 题
  9. Final Assessment Quiz · 6 题

📑 Bedrock 课程(8 quiz · 42 题)

  1. Quiz on Working with the API · 4 题
  2. Quiz on Prompt Engineering · 4 题
  3. Quiz on Prompt Evaluations · 7 题
  4. Quiz on Tool Use · 4 题
  5. Quiz on Retrieval Augmented Generation · 5 题
  6. Quiz on Features of Claude · 4 题
  7. Quiz on Model Context Protocol · 8 题
  8. Final Assessment Quiz · 6 题

📑 Vol.6 / Vol.7 智能体系列 补充练习(2 卷 · 16 题)

  1. Vol.6 · Introduction to subagents 练习 · 8 题
  2. Vol.7 · Introduction to agent skills 练习 · 8 题

📑 Vol.11–Vol.15 AI Fluency 系列 补充练习(4 卷 · 28 题)

  1. Vol.11 · AI Fluency: Framework & Foundations 练习 · 10 题
  2. Vol.12 · AI Fluency for educators 练习 · 6 题
  3. Vol.14 · Teaching AI Fluency 练习 · 6 题
  4. Vol.15 · AI Fluency for students 练习 · 6 题

📦 Vol.8 Building with the Claude API

Skilljar「Building with the Claude API」— 7 个 Quiz + 1 Final Assessment · 主线覆盖 API access / Prompt eval / Prompt engineering / Tool use / Features / MCP / Agents

Quiz on Accessing Claude with the API

6 题 · Skilljar Vol.8 课程
Question 1

You're building a multi-turn chatbot. The user sends a second message. How do you preserve context with the Anthropic API?

  1. Re-send only the new user message; the API remembers the previous turn
  2. Use a session_id query parameter to look up history server-side
  3. Send the full prior message list (user + assistant turns) plus the new user message
  4. Put the prior conversation inside the system prompt
推荐答案C
Anthropic API 是无状态的——每次请求必须把完整的 messages 数组(含历史 user/assistant 轮次)发回,模型才能看到上下文。没有 session_id 之类的服务器端记忆(B 错),把历史塞进 system 会把角色都混成「指令」(D 错)。
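下面是一个极简的本地示意,演示「每次请求携带完整 messages 数组」的组装方式(payload 字段名与 Messages API 一致;模型名与对话内容仅为占位,代码不实际发送请求):

```python
# 无状态 API 下的多轮对话:每次请求都必须携带全部历史轮次
def build_request(history, new_user_message, model="claude-sonnet-4-5", max_tokens=1024):
    """把历史 user/assistant 轮次 + 新用户消息拼成一次完整请求体。"""
    messages = history + [{"role": "user", "content": new_user_message}]
    return {"model": model, "max_tokens": max_tokens, "messages": messages}

# 第一轮
req1 = build_request([], "What is prompt caching?")

# 收到 assistant 回复后,把两条都追加进本地维护的历史
history = req1["messages"] + [
    {"role": "assistant", "content": "Prompt caching reuses a stable prefix..."}
]

# 第二轮:重发全部 3 条消息(user → assistant → user),模型才能看到上下文
req2 = build_request(history, "Can you give an example?")
```

注意历史由调用方自己维护并重发——服务器端没有任何会话记忆。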
Question 2

Your `max_tokens=200` request returns a response with `stop_reason: "max_tokens"`. What happened?

  1. Your input prompt exceeded 200 tokens and was truncated
  2. The model reached the output budget before finishing its answer
  3. Anthropic billing capped your account at 200 tokens for the day
  4. The conversation context window is now full
推荐答案B
max_tokens 是输出长度上限,不是输入限制(A 错)也不是计费限额(C 错)。命中后 stop_reason 返回 max_tokens,说明回答被截断,需要调高上限或拆分任务。
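检查 `stop_reason` 的处理逻辑可以写成一个小函数(响应字段名与 API 返回结构一致;示例数据为虚构):

```python
# 命中 max_tokens 说明回答被截断:典型处理是调高上限重试,或拆分任务
def handle_response(resp: dict) -> str:
    if resp["stop_reason"] == "max_tokens":
        raise ValueError("output truncated: raise max_tokens or split the task")
    return resp["content"][0]["text"]

truncated = {"stop_reason": "max_tokens",
             "content": [{"type": "text", "text": "The summary is"}]}
complete = {"stop_reason": "end_turn",
            "content": [{"type": "text", "text": "done"}]}
```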
Question 3

You need to summarize 50,000 customer reviews per day, each ~500 words. Cost matters; quality bar is moderate. Which model is the right default?

  1. Always Opus, because it has the strongest reasoning
  2. Haiku, because it offers the best cost-latency profile for high-volume, well-bounded tasks
  3. Whichever model has the largest context window
  4. Sonnet, because it's the only model that supports streaming
推荐答案B
高吞吐、任务清晰、质量门槛中等的场景应选 Haiku——成本与延迟都最低。所有 Claude 模型都支持 streaming(D 错),上下文窗口不是这里的瓶颈(C 错),无脑选 Opus 会浪费成本(A 错)。
Question 4

Your team is reviewing a pull request that adds Claude API integration. Where should the API key live?

  1. Hard-coded as a constant in the source file with a comment "rotate yearly"
  2. Committed in `.env` so deployment is reproducible across machines
  3. In an environment variable / secret manager, never committed to the repo
  4. Embedded in the frontend bundle so the backend stays stateless
推荐答案C
API key 是生产凭证,必须通过环境变量或 secret manager 注入。硬编码(A)、提交 `.env`(B,会污染 git 历史)、放进前端 bundle(D,等同公开)都是常见账单泄漏来源。
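一个最小化的读取模式示意——key 只从环境(由 secret manager 注入)获取,缺失时立刻失败,而不是退回到任何硬编码默认值:

```python
import os

def load_api_key(env=os.environ) -> str:
    """从环境变量读取 key;env 参数仅为便于测试注入。"""
    key = env.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY not set — inject it via your secret manager")
    return key
```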
Question 5

You set `stop_sequences=[""]`. The model generates text containing that string. What happens?

  1. The API returns an error because stop sequences must not collide with content
  2. Generation halts at the first match; `stop_reason` becomes `stop_sequence` and the matched string is NOT included in output
  3. The matched string is included verbatim and generation continues
  4. Stop sequences are silently ignored unless temperature is below 0.5
推荐答案B
命中 stop sequence 时:生成立即停止、`stop_reason` 设为 `stop_sequence`、匹配的字符串本身**不出现**在 content 里。这是配合 prefill 锁定 JSON / 结构化输出的关键机制——温度高低不影响是否生效(D 错)。
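stop sequence 的截断语义可以用一个本地函数模拟来帮助理解(真实截断由服务端完成,此处仅还原「命中即停、匹配串不出现在输出里」的行为):

```python
# 模拟服务端的 stop sequence 处理:返回 (截断后文本, stop_reason)
def apply_stop_sequence(generated: str, stop: str):
    idx = generated.find(stop)
    if idx == -1:
        return generated, "end_turn"
    return generated[:idx], "stop_sequence"   # 匹配串本身被丢弃

text, reason = apply_stop_sequence('{"name": "Ada"}</json>extra', "</json>")
# text 只保留 JSON 部分,reason 为 "stop_sequence"
```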
Question 6

Why count tokens before sending a request to Claude?

  1. It's optional — the API will reject anything that doesn't fit and retry shrinks automatically
  2. Only required for streaming; non-streaming requests are unbounded
  3. To stay within the model's input window AND to estimate cost before committing to the call
  4. To choose the right `temperature` value
推荐答案C
Token 计数有两个目的:保证输入 + 输出之和落在模型窗口内;预先估算成本(每千 token 计费)。API 不会自动「retry shrinks」(A 错);token 上限对 streaming / 非 streaming 都生效(B 错);与 temperature 无关(D 错)。
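发送前的预算检查可以先用粗略估算做 sanity check(下面按「约 4 字符 ≈ 1 token」的经验值估算,仅为示意;精确计数应使用 API 提供的 token counting 能力):

```python
# 粗略 token 预估:输入 + 预留输出必须落在模型窗口内
def estimate_tokens(messages: list) -> int:
    chars = sum(len(m["content"]) for m in messages if isinstance(m["content"], str))
    return chars // 4   # 经验近似,非精确计数

def fits_budget(messages, max_tokens, window=200_000):
    return estimate_tokens(messages) + max_tokens <= window
```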

Quiz on Prompt Evaluation

5 题 · Skilljar Vol.8 课程
Question 1

A prompt tweak improves Q1–Q3 of your eval set but silently regresses Q5. What is this telling you?

  1. Q5 is noise; ignore single-question regressions
  2. The model has a bug; switch to a different Claude version
  3. Prompt changes have tradeoffs across distributions; only a full eval set surfaces regressions
  4. You should revert Q1–Q3 improvements until Q5 stabilizes
推荐答案C
Eval 的主要价值就是发现「局部赢、整体输」的回归。盲目忽略 Q5(A)放任问题;归咎模型版本(B)跳过自己 prompt 的责任;为了一题回滚整次改动(D)失去优化收益。正确做法是改进 prompt 让 Q1–Q3 与 Q5 都过。
Question 2

Which grading style fits "the output JSON must validate against this schema"?

  1. Model-based grading: ask Claude to judge schema validity
  2. Code-based grading: parse JSON and run a schema validator
  3. Manual review by the prompt author
  4. Either — they produce equivalent results
推荐答案B
确定性、可机械判定的标准(schema 校验、字段存在、数值范围)必须用代码评分——结果可复现且零成本。让 Claude 自评(A)会引入随机性;人工评审(C)不可扩展;二者并不等价(D 错)。
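代码评分的形态大致如下——解析 JSON 并机械校验字段与取值范围,结果可复现且零成本(校验规则为虚构示例):

```python
import json

# 代码评分:确定性标准(字段存在、枚举取值、数值范围)直接用代码判定
def grade_output(raw: str) -> bool:
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return (
        data.get("sentiment") in {"positive", "negative", "neutral"}
        and isinstance(data.get("confidence"), (int, float))
        and 0.0 <= data["confidence"] <= 1.0
    )
```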
Question 3

When should you refresh your eval dataset?

  1. Never — a stable eval set is the whole point
  2. After every prompt change, to keep it aligned
  3. When production data drifts or when new failure modes emerge from real users
  4. Once a year, regardless of usage
推荐答案C
Eval 集要跟随真实分布——产品变更、用户群扩展、被新发现的 edge case 都是触发器。永不更新(A)会让 eval 与现实脱节;每次改 prompt 就改 eval(B)等于自欺欺人;按时间表更新(D)忽略了驱动信号。
Question 4

You're building your first eval set. Which inclusion matters most?

  1. The 10 prompts your team uses most often
  2. A mix of common cases, edge cases, adversarial inputs, and real production samples
  3. Synthetic inputs generated by Claude itself, for scale
  4. Only inputs that the current prompt already handles correctly
推荐答案B
代表性 + 困难性 + 真实性是 eval 集的三个支柱。团队最爱的 10 条(A)有偏;Claude 自己生成(C)会复制模型偏好;只放当前能过的(D)等同于「考试只考会的题」——回归永远不会被发现。
Question 5

Why log prompts and responses in production?

  1. It's required by Anthropic for billing audits
  2. To rebuild eval datasets with the real distribution of user inputs and to debug regressions
  3. To train a custom Claude variant on your domain
  4. To detect malicious users automatically
推荐答案B
生产日志是 eval 集的最佳来源——它反映真实用户行为分布,也是事后复现回归 bug 的唯一途径。计费审计(A)由 Anthropic 自己处理;个人无法 fine-tune Claude(C 错);安全检测虽是副产品但不是主用途(D)。

Quiz on Prompt Engineering Techniques

6 题 · Skilljar Vol.8 课程
Question 1

"Be brief" vs "Limit to 200 words, exactly 3 bullets, no markdown headers" — which works better and why?

  1. Both are equivalent; Claude infers intent
  2. "Be brief" is better; over-specifying confuses the model
  3. The specific version, because measurable constraints replace ambiguous adjectives
  4. Depends on temperature; higher temperature handles vague instructions
推荐答案C
「Specific」原则:把模糊形容词换成可度量的约束(字数、结构、格式)。LLM 不擅长猜「brief 究竟多简」;明确量化才有稳定输出。Temperature 与歧义无关(D 错);过度具体不会迷惑 Claude(B 错)。
Question 2

Few-shot works for your task but the examples eat too much budget. What's the next move?

  1. Drop few-shot entirely
  2. Switch to a higher temperature to make examples optional
  3. Move stable examples to a cached prefix (prompt caching) so they amortize across requests
  4. Concatenate all examples into a single short paragraph
推荐答案C
Prompt caching 正是为这种场景设计的——稳定的 few-shot 前缀只算一次成本,后续请求按缓存价计费。砍掉 few-shot(A)会损失质量;调温度(B)与示例数无关;压成一段(D)会破坏示例边界。
Question 3

Your prompt feeds Claude a 50-page document followed by 3 lines of instructions. Outputs sometimes drift off-task. What's the fix?

  1. Always place instructions BEFORE the document, never after
  2. Repeat the key instructions AFTER the document, near the end of the prompt
  3. Shrink the document to fit instructions at the top
  4. Increase temperature to escape document anchoring
推荐答案B
长上下文场景里,靠近输出位置的指令权重更高。Anthropic 推荐「文档在前、指令在后再重复」的模式。强行砍文档(C)会丢信息;高温(D)让输出更不稳;instruction 单独放最前(A)容易被长文档稀释。
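「文档在前、指令在后再重复」的组装方式可以示意如下(标签与措辞为示例写法,非固定格式):

```python
# 长文档 prompt 组装:文档置前,关键指令放在文档之后、贴近输出位置
def build_long_doc_prompt(document: str, instructions: str) -> str:
    return (
        f"<document>\n{document}\n</document>\n\n"
        f"{instructions}\n\n"
        "Reminder: follow the instructions above and rely only on <document>."
    )

prompt = build_long_doc_prompt("...50 pages of text...", "Summarize in 3 bullets.")
```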
Question 4

When is asking Claude to "think step by step" the WRONG choice?

  1. Multi-step math problems
  2. Code review tasks
  3. High-volume, latency-sensitive classification with simple inputs
  4. Complex policy reasoning
推荐答案C
Chain-of-thought 增加输出 token,从而增加成本与延迟。简单分类(情感、垃圾邮件)用 CoT 是浪费;它的价值在路径不确定的复杂推理(A/B/D)。「永远用」或「永远不用」都是错觉,要按任务复杂度选择。
Question 5

"Act as a senior security auditor reviewing this code" — what's the actual mechanism behind role prompts?

  1. Claude loads a different model fine-tuned for that role
  2. It enables hidden tools tied to the role
  3. It biases Claude's vocabulary, depth, and priorities toward that domain's writing patterns
  4. It's purely cosmetic and has no measurable effect
推荐答案C
Role prompt 通过设定语境锚定输出风格、术语密度、关注点优先级。它不切换模型(A 错)也不解锁工具(B 错),但绝非无效(D 错)——只是效果体现在分布偏移而非「身份切换」。
Question 6

"Clear and direct" 与 "specific" 在 prompt 工程中的区别是什么?

  1. 它们是同一个原则的不同说法
  2. Clear/direct 强调任务表述无歧义;specific 强调补充目标、受众、约束、格式与成功标准
  3. Clear/direct 仅用于 system prompt;specific 仅用于 user prompt
  4. Clear 是英文专属;specific 是中文专属
推荐答案B
两者互补:先把任务说清楚(不绕弯、用主动句),再把细节填具体(量化约束、格式要求、验收标准)。它们不是同义词(A 错),也不限于某种 message 角色(C 错),与语言无关(D 错)。

Quiz on Tool Use with Claude

6 题 · Skilljar Vol.8 课程
Question 1

Claude returns a `tool_use` block with `id="toolu_xyz"`. Your code runs the tool. What do you send back?

  1. A new user message saying "the tool returned X"
  2. A system message updating Claude's instructions
  3. A `tool_result` content block with `tool_use_id="toolu_xyz"` and the result payload, inside a user message
  4. Nothing — Claude assumes success unless an error is raised
推荐答案C
Tool 协议要求:用户角色消息内携带 `tool_result` 块,并通过 `tool_use_id` 与上一轮的 `tool_use` 块配对。自由文本(A)会让 Claude 把它当成普通用户输入;system message(B)混淆角色;不发回(D)会让 Claude 卡在等待状态。
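配对回传的消息结构示意如下(块结构与 Messages API 的 tool use 协议一致;工具名与数值为虚构):

```python
# Claude 返回的 tool_use 块(假想的 get_weather 工具)
tool_use_block = {"type": "tool_use", "id": "toolu_xyz",
                  "name": "get_weather", "input": {"city": "Tokyo"}}

# 执行工具后,回传的 tool_result 必须放在 user 角色消息里,
# 并通过 tool_use_id 与上一轮的 tool_use 块配对
result_message = {
    "role": "user",
    "content": [{
        "type": "tool_result",
        "tool_use_id": tool_use_block["id"],
        "content": '{"temp_c": 21}',
    }],
}
```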
Question 2

Your tool description reads: "gets data". Claude often picks the wrong tool. Most likely cause?

  1. The model is too small — switch to Opus
  2. The tool description is too vague to disambiguate from siblings
  3. Tool descriptions don't influence selection — it's all from parameter names
  4. You forgot to set `tool_choice="auto"`
推荐答案B
Tool description 是 Claude 决定调用哪个工具的主要信号——必须说明用途、何时调用、何时**不**调用。「gets data」对客户工具、订单工具、库存工具都成立。换大模型(A)解决不了模糊语义;description 不是无效(C 错);`tool_choice` 是开关而非语义来源(D)。
Question 3

Your refund agent has a `process_refund` tool that issues real money. Right gating pattern?

  1. Auto-execute everything Claude calls — that's the point of agents
  2. Hide the tool entirely; never expose write actions to Claude
  3. Have Claude propose the refund; require human confirmation before the side effect runs
  4. Run the tool but log it for retroactive audit
推荐答案C
高风险写操作(资金、删除、对外发送)应该让 Claude 提议、人类确认、系统执行——这是「proposal-confirm-execute」模式。自动执行(A)放任错误;完全隐藏(B)失去 agent 能力;事后审计(D)已经晚了。
Question 4

Your tool returns `{"error": "rate_limited"}`. What should Claude do next, in a well-designed loop?

  1. Treat error as success and continue
  2. Read the error, decide between retry, alternative path, or escalate to the user
  3. Always retry immediately with the same args
  4. Crash the entire conversation
推荐答案B
错误信息是给 Claude 看的——回传后它应判断:可重试就重试、能换路径就换、否则告知用户。忽略错误(A)破坏可靠性;盲目重试(C)会撞同一堵墙;崩对话(D)最差,损失上下文。
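除了把错误回传给 Claude 判断,循环侧也常配一层重试策略。一个示意性的决策函数(错误码与阈值为虚构示例):

```python
# 工具错误的三分支处理:继续 / 有限重试 / 升级给 Claude 或用户
RETRYABLE = {"rate_limited", "timeout"}

def next_action(tool_output: dict, attempts: int, max_attempts: int = 3) -> str:
    error = tool_output.get("error")
    if error is None:
        return "continue"                     # 成功,正常推进
    if error in RETRYABLE and attempts < max_attempts:
        return "retry"                        # 可重试且未超预算
    return "escalate"                         # 换路径或告知用户
```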
Question 5

When are parallel tool calls a good fit?

  1. Always — it speeds up everything
  2. Only with the code execution tool
  3. When operations are independent, e.g. fetching weather AND fetching news for the same response
  4. Never — Claude can't handle parallel results
推荐答案C
独立、无依赖的工具调用并行最有价值——一次往返就拿齐资料。有依赖时(如「先查订单再查商品」)必须串行。Claude 完全支持并行结果(D 错);不限工具种类(B 错);总是并行(A)会破坏依赖顺序。
Question 6

A tool's `input_schema` requires fields `customer_id: string` and `since: ISO date`. Claude calls with `customer_id: 12345` (integer). What's the right fix?

  1. Loosen the schema to accept any type
  2. Switch to a smarter model
  3. Tighten the description: clarify "customer_id is a string like 'cust_001', NOT a numeric ID"
  4. Validate types only after Claude has executed the tool
推荐答案C
类型错误本质是描述不清——Claude 看不出 customer_id 是字符串就当整数。改 schema 是放弃质量(A);换模型(B)治标;事后校验(D)已经发生错误。在 description 里写明类型与示例是源头治理。

Quiz on Features of Claude

6 题 · Skilljar Vol.8 课程
对应本地卷: V8·L10 Claude API 扩展功能
Question 1

Extended thinking — what does it actually do?

  1. Replaces RAG by giving Claude internal knowledge access
  2. Always produces faster responses than non-thinking mode
  3. Allocates a visible reasoning budget before the final answer, helping on complex multi-step problems
  4. Only available for code generation tasks
推荐答案C
Extended thinking 在最终答案前生成 thinking 块作为推理空间,对多约束规划、数学、复杂代码改造尤其有效。它不是 RAG 替代(A 错);输出更多 token 通常更慢(B 错);适用范围远不止代码(D 错)。
Question 2

Prompt caching — what gets cached and how?

  1. The final answer, keyed by the input hash
  2. A stable prefix marked by a `cache_control` breakpoint; subsequent requests reuse it at reduced cost
  3. Caching is automatic for any prompt over 1k tokens
  4. Tool definitions, but not message content
推荐答案B
Prompt caching 缓存的是**输入侧的稳定前缀**(system prompt、tools、长文档、few-shot),通过 `cache_control` 显式标记 breakpoint。它不缓存输出(A 错);不是自动触发(C 错);可缓存的内容包括消息块和工具定义(D 错)。
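带 `cache_control` 标记的请求体大致形如下例(字段名与 prompt caching 的 API 写法一致;模型名与内容为占位):

```python
# 稳定前缀(system prompt / 长文档 / few-shot)用 cache_control 标记 breakpoint
request = {
    "model": "claude-sonnet-4-5",   # 模型名仅为示意
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "<stable system prompt + long policy document + few-shot examples>",
            "cache_control": {"type": "ephemeral"},   # 此前缀会被缓存复用
        },
    ],
    "messages": [{"role": "user", "content": "Summarize today's incidents."}],
}
```

后续请求只要前缀逐字节一致,即可命中缓存,按缓存价计费。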
Question 3

You want Claude's answer to cite the source paragraph from your document. What's required?

  1. Just write "cite your sources" in the prompt
  2. Use higher temperature to force citation behavior
  3. Use the Citations feature: pass the document with `citations: { enabled: true }` so Claude can attach citation blocks to spans of its output
  4. Manually post-process Claude's text and match keywords to the document
推荐答案C
Citations 是结构化特性——开启后,Claude 输出的每段会带 citation 块指向源文档的具体位置。仅在 prompt 写「请引用」(A)只是请求文本格式而无可靠绑定;温度(B)与引用无关;事后正则匹配(D)脆弱且失真。
Question 4

When is the Files API a better fit than just inlining the file content into a message?

  1. Whenever the file is over 100 KB
  2. When the same large file will be referenced across many requests, so uploading once is cheaper than re-sending
  3. When you need to encrypt the file end-to-end
  4. When you want streaming output
推荐答案B
Files API 的核心价值是「上传一次、多次引用」,避免重复传输大文件。它不是按大小自动触发(A 错);不是端到端加密机制(C 错);与 streaming 正交(D 错)。
Question 5

Message Batches — main reason to use it?

  1. Faster responses than streaming
  2. Up to 50% cost reduction for async, non-time-sensitive workloads
  3. Required for any workload above a certain RPS
  4. Replaces streaming for chat applications
推荐答案B
Message Batches 提供约 50% 折扣,代价是结果在 24 小时内异步返回——适合夜间批量摘要、离线分类、回填等场景。它比 streaming 慢(A 错);不是高 RPS 强制项(C 错);与聊天界面(需要实时回流)正相反(D 错)。
Question 6

PDF + image input — which statement is true?

  1. PDFs must be converted to plain text before sending
  2. Images are free and don't count toward input tokens
  3. Both are accepted natively as content blocks and consume input tokens proportional to size and pages
  4. Vision capability requires a separate Anthropic Vision API endpoint
推荐答案C
Claude 模型原生支持 PDF 与图像 content block,按大小/页数折算成输入 token 计费。无需先 OCR(A 错);图像绝非免费(B 错);用同一 messages 接口,没有独立 Vision endpoint(D 错)。

Quiz on Model Context Protocol

6 题 · Skilljar Vol.8 课程
Question 1

API tool use vs MCP — what's the primary difference?

  1. MCP runs faster than API tool use
  2. API tool use is deprecated and being replaced by MCP
  3. MCP standardizes a server protocol so the same tools/resources/prompts can be reused across many clients without rewriting
  4. They are identical; MCP is just marketing
推荐答案C
API tool use 是「应用内」声明工具;MCP 是「服务端」标准化暴露能力,让一个 server 同时被 Claude Code、Claude Desktop、第三方 host 等客户端复用。两者并存(B 错);性能不是主要差异(A 错);远非同义(D 错)。
Question 2

Among MCP primitives — tools, resources, prompts — which is intended for read-only context fetch?

  1. Tools
  2. Resources
  3. Prompts
  4. Sampling
推荐答案B
MCP 三原语职责:tools 做有副作用的动作;resources 提供只读上下文(文件、记录、文档片段);prompts 是可复用工作流模板。Sampling 不是原语(D 错)。
Question 3

Your MCP server exposes internal company HR data. What MUST your design account for?

  1. Trust the protocol to enforce access automatically
  2. Default-allow then audit afterwards
  3. Per-user authentication, scoped resource visibility, audit logs of every call
  4. Disable HTTPS to simplify debugging
推荐答案C
MCP 协议本身不替你做安全——server 必须实现身份认证、按用户/角色限定可见资源、记录可审计日志。「默认允许」(B)是数据泄漏温床;信任协议层(A)是误解;关 HTTPS(D)危险。
Question 4

What role do MCP `prompts` play?

  1. System prompts that the client must use verbatim
  2. A way to override the client's tool definitions
  3. Server-defined, reusable prompt templates the client can surface to users (e.g., "summarize this PR")
  4. Client-side only — they don't cross the protocol
推荐答案C
Prompts 是 server 提供的命名模板,让用户/Claude 一键触发「场景化的预设交互」。它不是强制 system prompt(A 错);不修改 tools(B 错);本质就是穿越协议传递给 client 的(D 错)。
Question 5

When is adding MCP overkill?

  1. When the model is Sonnet
  2. Whenever cost matters
  3. When you have a single, tightly-coupled tool that no other client will ever consume — direct API tool use is simpler
  4. Never — MCP is always the right choice
推荐答案C
MCP 的价值在标准化与复用——单应用、单工具、不会被其他 client 用到时,直接 API tool use 心智负担更低。模型选择(A)和成本(B)不是判定标准;「永远 MCP」(D)是过度工程。
Question 6

An attacker plants a malicious instruction inside an MCP `resource` returned to Claude. Best defense?

  1. HTTPS alone — transport encryption stops injection
  2. Trust Claude to ignore obvious instructions
  3. Server-side validation of resource content + client-side instruction isolation (treat resource bodies as data, not commands)
  4. Disable resources entirely; only allow tools
推荐答案C
Prompt injection via resource 是真实攻击面:必须服务端净化输出,客户端把 resource 当数据而非指令(例如包在明确分隔块里)。HTTPS 只防中间人(A 错);模型对指令式语言敏感,无法靠「相信它」(B 错);禁用资源(D)等于自废武功。

Quiz on Agents and Workflows

6 题 · Skilljar Vol.8 课程
Question 1

When does the workflow pattern beat the agent pattern?

  1. When the task involves any tool calls
  2. When the steps are known, predictable, and you need stable cost / latency / testability
  3. When you're using a smaller model like Haiku
  4. When the user prefers conversational responses
推荐答案B
Workflow 适合路径已知场景——可写测试、成本可预测、易于回滚。Agent 适合路径不确定、需要多轮反馈与验证的场景。任意工具(A)、模型大小(C)、对话风格(D)都不是判定标准。
Question 2

In an agent loop, why is "verify" a distinct step?

  1. It's a billing milestone
  2. To compress conversation history
  3. To confirm the action achieved the goal and surface tool failures or unintended effects before continuing
  4. To switch models mid-loop
推荐答案C
Verify 是「闭环安全网」——工具调用可能失败、环境可能变、输出可能不符合目标。少了 verify 的 agent 会在错误状态上继续累积。它不是计费(A 错)、压缩(B 错)或换模型(D 错)的步骤。
Question 3

What are sensible stop conditions for an agent loop?

  1. Only when the goal is fully achieved — don't bound iterations
  2. Only when an error occurs
  3. Goal met OR max iterations reached OR explicit user halt OR cost / time budget exceeded
  4. Stop after exactly 5 turns regardless
推荐答案C
Agent loop 必须有多个出口:成功、迭代上限、用户中止、预算耗尽。没有上限(A)会无限烧钱;只看错误(B)放任空转;硬编 5 轮(D)武断。
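多出口循环的骨架可以示意如下(`step` 与 `goal_met` 代表虚构的「一轮模型 + 工具交互」与目标判定):

```python
import time

# agent 循环骨架:成功 / 迭代上限 / 时间预算,任一条件都能停
def run_agent(step, goal_met, max_iters=10, budget_seconds=60.0):
    start = time.monotonic()
    for i in range(max_iters):
        if time.monotonic() - start > budget_seconds:
            return "budget_exceeded"
        state = step(i)                # 一轮「模型推理 + 工具执行」
        if goal_met(state):
            return "success"
    return "max_iterations"
```

实际实现中通常还会加上用户显式中止与 token 成本预算两个出口。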
Question 4

A pipeline stage is a fixed sequence of three deterministic API calls. How should you implement it?

  1. Wrap it in an agent loop for "future flexibility"
  2. Use extended thinking
  3. Plain code orchestrating three Claude calls — workflow, no agent loop needed
  4. Hardcode the calls inside the system prompt
推荐答案C
三步固定流程就是 workflow——用代码串联三次 Claude 调用,可单元测试、易于监控、成本可预测。套 agent loop(A)是过度设计,无谓引入不确定性;extended thinking(B)解决推理深度不解决流程;写进 system prompt(D)会让模型自己跑流程,反而失控。
Question 5

For high-stakes verification, why use an INDEPENDENT Claude call rather than asking the same agent to self-check?

  1. Independent calls are always faster
  2. It's required by the API for tool use
  3. Self-check shares context bias; an independent call sees only the artifact and gives a less anchored judgment
  4. Self-checks are forbidden by Anthropic policy
推荐答案C
同一上下文里的「再检查一遍」容易被原推理锚定,难以发现自己的错误。独立 verifier(新会话、只看输入与输出)能给出近似第三方判断。这与速度(A)、API 要求(B)、政策(D)都无关。
Question 6

How do you keep agent costs predictable in production?

  1. Costs are inherently unpredictable; budget generously
  2. Cap latency only — cost will follow
  3. Cap iterations + log per-turn token usage + alert / escalate on cost spikes
  4. Run only with Haiku regardless of task
推荐答案C
成本可治理:迭代上限避免空转、每轮 token 记录便于归因、阈值告警让异常被人发现。「认命」(A)是放弃;只看延迟(B)漏成本维度;强制 Haiku(D)牺牲能力。

Final Assessment

8 题 · Skilljar Vol.8 课程 · 综合判断
Question 1

A customer-support bot must answer questions from internal policy docs and never fabricate. Best architecture?

  1. Pure agent loop with web search
  2. Inline the entire policy into the system prompt every request
  3. RAG over policy docs + Citations + an eval set that tests "I don't know" / escalation paths
  4. Fine-tune a custom Claude on the policy docs
推荐答案C
RAG 解决资料来源、Citations 让答案可追溯、eval 覆盖「未检索到时拒答」的关键路径——三者组合是政策问答的标准答案。Agent loop(A)路径过宽;inline 系统提示(B)token 爆炸且不易更新;fine-tune(D)是反应过度,且 Claude 不开放此路径。
Question 2

Compliance team needs to audit every Claude response against source documents. Best feature combo?

  1. High temperature + manual review
  2. Citations + structured output (JSON for audit fields) + full request/response logging
  3. Just bigger model and trust the answers
  4. Streaming, so reviewers see drafts in real time
推荐答案B
合规审计需要:可追溯(Citations)、机器可读(JSON 结构化)、可重现(完整日志)。高温(A)增加随机性反而难审计;换大模型(C)不解决追溯;streaming(D)只解决体感而非记录。
Question 3

You're generating SQL from natural-language questions. Which temperature setting is sane?

  1. 1.0+, to encourage creative joins
  2. 0.0–0.2, for deterministic, syntactically correct SQL
  3. It doesn't matter for code generation
  4. 0.7, the default that fits everything
推荐答案B
SQL 是确定性语言,要稳定语法和重复可控的查询计划——低温是必然选择。高温(A)增加语法错误和不一致;temperature 对代码任务影响显著(C 错);默认值不是「万能值」(D 错)。
Question 4

Daily report job analyses the same 100k-token policy book against today's incidents. Cost optimization?

  1. Switch all calls to Haiku regardless of quality needs
  2. Truncate the policy book to 10k tokens
  3. Use prompt caching with the policy book as a stable prefix — every call after the first hits the cached copy
  4. Run streaming so partial results lower wall-clock time
推荐答案C
长且稳定的前缀 + 多次调用 = prompt caching 教科书场景,可大幅降低重复输入成本。降级模型(A)牺牲质量;砍内容(B)牺牲覆盖;streaming(D)解决感知延迟而不省成本。
Question 5

An agent has a `process_refund` tool. The biggest production risk?

  1. The tool description being too long
  2. Using JSON instead of XML for the schema
  3. Auto-executing refunds without a human-in-the-loop confirmation gate
  4. Calling it from Sonnet instead of Opus
推荐答案C
高风险写动作(资金、删除、对外发送)的最大风险一律是「未经确认就自动执行」。description 长度(A)影响选择正确率但不致命;schema 格式(B)只要被 Claude 理解皆可;模型选择(D)影响成功率但不是首要风险。
Question 6

How do you detect prompt regressions across versions?

  1. Wait for user complaints
  2. Manual spot-check the new prompt on a few prompts you remember
  3. Maintain a versioned eval set with consistent grading; run it on every prompt version and compare scores
  4. Trust streaming output to surface anomalies
推荐答案C
回归检测的工程标准:稳定 eval 集 + 一致评分 + 版本化对比。等用户投诉(A)已经造成损失;记忆抽查(B)有偏;streaming(D)和回归无关。
Question 7

You need Claude to actually run Python code, not just generate it. Best mechanism?

  1. Add "execute the code" to the system prompt
  2. Spin up a separate agent that copies code from Claude's text output
  3. Use the code execution tool — sandboxed runtime that returns real results to Claude
  4. Switch to a bigger model
推荐答案C
Code execution 是一个工具,给 Claude 一个隔离的运行环境,把执行结果回传供继续推理。System prompt 的「请执行」(A)只是文字祈愿;自建 agent 解析(B)等于重发明;换模型(D)不解决执行问题。
Question 8

Your team exposes the company CRM via an MCP server. What governs which records each user can see?

  1. The MCP protocol enforces row-level security automatically
  2. The system prompt on the client side
  3. Server-side authentication + per-user scope on resources/tools, enforced before responding to any MCP call
  4. Trust by default; let Claude decide what to expose
推荐答案C
权限治理永远在服务端——MCP server 必须验明用户身份、按权限决定可见的 resources/tools 子集。协议本身不做(A 错);客户端 system prompt 是最弱的防线(B 错);trust by default(D)是数据泄漏标准模板。

📦 Vol.4 Claude Code in Action

Skilljar「Claude Code in Action」— 1 个 Quiz · 6 题 · 主线覆盖 setup / context / making changes / custom commands / skills / MCP / GitHub / hooks / Agent SDK

Quiz on Claude Code

6 题 · Skilljar Vol.4 课程
Question 1

What does a `CLAUDE.md` file in a project root actually do?

  1. It's a README replacement that GitHub renders specially
  2. It must be present for `claude` to launch in that directory
  3. It's auto-loaded as persistent project memory — instructions, conventions, and file pointers stay visible to Claude across sessions
  4. It overrides Claude's safety system prompt
推荐答案C
CLAUDE.md 是 Claude Code 的项目级记忆文件——每次启动自动加载,记录团队约定、关键路径、避免的坑。它不是 GitHub 渲染(A 错),也不是启动必需(B 错),更不能覆盖安全 system prompt(D 错)。可以在用户级 `~/.claude/CLAUDE.md` 写跨项目偏好。
Question 2

You want Claude to look at `src/auth/login.ts`. Best way to add it as context?

  1. Paste the entire file content into the prompt
  2. Reference it with `@src/auth/login.ts` — Claude reads on demand and avoids bloating the context window
  3. Copy the entire repo and let Claude scan everything
  4. Disable context limits and dump the whole project
推荐答案B
`@` 让 Claude 按需读取文件,避免一开场就把不相关内容塞进上下文。整文件粘贴(A)、扫整个 repo(C)、关闭限制(D,根本不存在的开关)都会浪费 token、稀释关注点,反而降低答题质量。
Question 3

Your team keeps re-asking Claude to "review the diff for accessibility issues using our checklist." How do you make this a one-shot command?

  1. Paste the checklist into every prompt
  2. Edit the global system prompt
  3. Create `.claude/commands/a11y-review.md` with the workflow + checklist; invoke via `/a11y-review`, supports `$ARGUMENTS` for parameters
  4. Hard-code it in a shell alias
推荐答案C
Custom slash commands 把复用工作流变成 Markdown 文件——可 git 管理、可团队共享、可参数化(`$ARGUMENTS`)。每次粘贴清单(A)失去自动化;改 system prompt(B)影响所有任务;shell 别名(D)只调起 CLI 不携带工作流。
Question 4

Adding a Linear MCP server so Claude Code can read tickets — where does the config go?

  1. It's hard-coded inside the `claude` binary; you can't add servers
  2. `.mcp.json` at the project root declares per-project servers (command, args, env); commit it for the team
  3. Only `~/.claude/settings.json` works — MCP is global-only
  4. Pass `--mcp` flag with the server URL on every launch
推荐答案B
`.mcp.json`(项目根)把 MCP server 声明绑定到具体仓库,团队成员 clone 后即可启用。它支持 stdio / http / sse 多种传输;可通过 `claude mcp add` 命令编辑。Claude Code 不只支持全局(C 错),也不要每次传 flag(D 错)。
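一个示意性的 `.mcp.json`(server 名、命令与 URL 仅为占位,实际配置以所用 MCP server 的官方文档为准):

```json
{
  "mcpServers": {
    "linear": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://mcp.example.com/sse"]
    }
  }
}
```

提交进 repo 后,团队成员 clone 即可获得同一组 server 声明。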
Question 5

A `PreToolUse` hook in `settings.json` — what can it do?

  1. Only log the tool call to stdout — purely cosmetic
  2. Fire after the tool finishes, to record results
  3. Run before the tool call; exit code or JSON output can deny, allow, or modify the call
  4. Replace the entire Claude system prompt for one session
推荐答案C
PreToolUse 是 hook 链中真正能「拦截」的环节——脚本 exit code 非零或返回 deny 会阻止工具执行;返回 modify 可改写参数。它不只是日志(A 错);那是 PostToolUse(B)的事;hooks 不能改 system prompt(D 错)。
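一个示意性的 `settings.json` 片段(matcher 与脚本路径仅为占位)——被 matcher 命中的工具调用会先执行脚本,脚本以非零 exit code 退出即可阻止该次调用:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "./scripts/check-bash-safety.sh" }
        ]
      }
    ]
  }
}
```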
Question 6

Running Claude Code in a GitHub Actions workflow to triage PRs — what's the right shape?

  1. Run the interactive CLI; pipe stdin from the PR body
  2. Skip Claude in CI; only run it locally
  3. Use the Claude Agent SDK or GitHub Action: stateless invocation with the prompt + repo checkout, `ANTHROPIC_API_KEY` from secrets, output posted as PR comment / patch
  4. Hard-code the API key into the workflow YAML
推荐答案C
CI 集成走 SDK / GitHub Action,把 Claude Code 跑成无状态的 agent——拿到 PR 上下文、产生评论或补丁。交互式 CLI(A)无人按 Enter;放弃 CI(B)失去自动化价值;硬编 key(D)会进 git 历史,必须用 secret。

📦 Vol.9 + Vol.10 MCP(入门 + 进阶)

Skilljar「Introduction to MCP」+「MCP: Advanced Topics」— 2 个 Assessment · 12 题 · 主线覆盖 host/client/server / tools / resources / prompts / sampling / roots / transport / production

Final assessment on MCP(Vol.9 入门)

6 题 · Skilljar Vol.9 课程
Question 1

In MCP, who initiates the connection — host, client, or server?

  1. The server connects out to the host on startup
  2. The host (e.g., Claude Code) spawns or connects a client, which initiates the handshake with the server
  3. Server and client connect peer-to-peer with no host involved
  4. The order doesn't matter — MCP is symmetric
推荐答案B
MCP 三角:host(如 Claude Code / Claude Desktop)拥有 LLM 上下文,host 实例化 client,client 与 server 建立连接并执行 capability negotiation。Server 不主动连 host(A 错),不是对等关系(C 错),方向是固定的(D 错)。
Question 2

You're building an MCP server for a knowledge base. Users should be able to (a) search articles, (b) read full article text, (c) trigger a "draft a reply" workflow. Which primitives map to which?

  1. All three should be tools — keep it simple
  2. Search = tool (action with side-effect of result), full text = resource (read-only context), draft-reply workflow = prompt (named template)
  3. All three should be resources — they all read data
  4. All three should be prompts — they're triggered by user intent
推荐答案B
MCP 三原语职责清晰:tools 做有逻辑/参数的动作(搜索)、resources 提供可寻址的只读内容(文章正文)、prompts 是可由用户触发的命名模板(「起草回复」工作流)。把三者全归一类(A/C/D)会失去原语区分带来的客户端 UX 价值。
Question 3

MCP Inspector — primary purpose?

  1. It's a deployment service that hosts your MCP server in production
  2. A local debugging UI that connects to your server, lists its tools/resources/prompts, and lets you call them manually to verify behavior
  3. An IDE plugin that auto-generates MCP server code
  4. A logging aggregator for production MCP traffic
推荐答案B
MCP Inspector 是开发阶段的「客户端模拟器 + 浏览器」——它发送真实的 MCP 协议请求,把 server 当成黑盒来探查。它不部署(A 错)、不生成代码(C 错)、也不是生产观测工具(D 错)。先在 Inspector 里跑通,再接 Claude Code。
Question 4

A tool's `input_schema` lists `{"query": "string", "limit": "integer"}` with no descriptions. What's wrong?

  1. Nothing — types are sufficient for the LLM to call correctly
  2. The schema must use `kind` instead of `type`
  3. Without per-field descriptions, the LLM has no signal about valid ranges, defaults, or semantic meaning — quality degrades
  4. Schemas don't matter; only the tool description does
推荐答案C
每个字段的 `description` 是 LLM 判断「填什么值合理」的关键信号——没有它,model 只能瞎猜默认 `limit`、可接受的 query 形态等。类型本身不够(A 错);JSON Schema 用 `type` 而非 `kind`(B 错);schema 与 tool description 都重要(D 错)。
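补上字段级 description 后的 schema 形如下例(JSON Schema 写法;工具语义为虚构示例):

```python
# 带字段级 description / 范围 / 默认值的 input_schema:
# 给 LLM 提供「填什么值才合理」的语义信号
input_schema = {
    "type": "object",
    "properties": {
        "query": {
            "type": "string",
            "description": "Full-text search keywords, e.g. 'refund policy'",
        },
        "limit": {
            "type": "integer",
            "description": "Max results to return (1-50); defaults to 10 if omitted",
            "minimum": 1,
            "maximum": 50,
            "default": 10,
        },
    },
    "required": ["query"],
}
```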
Question 5

In Claude Code, you want a Linear MCP server visible only to your team in this repo. Which scope and config?

  1. User scope (`~/.claude.json`) — easiest to maintain
  2. Project scope: commit `.mcp.json` in the repo root so every collaborator picks it up
  3. Local scope: `claude mcp add --scope local`, then commit your machine's settings file
  4. Hard-code it in CLAUDE.md so team members read about it
推荐答案B
Project scope = `.mcp.json` 提交进 repo,全队 clone 即可启用。User scope(A)只在你的机器生效;local scope(C)也是个人级;写进 CLAUDE.md(D)只是告知,不是配置。Project scope 是「团队共享 MCP server」的标准答案。
Question 6

You're writing your own MCP client (not using Claude Code). What does the client need to do besides JSON-RPC plumbing?

  1. Nothing — JSON-RPC handles everything
  2. Negotiate capabilities at startup, surface tools/resources/prompts to the LLM, route LLM tool calls to the right server, and pass tool_result back into the conversation
  3. Implement its own LLM internally; servers refuse plain HTTP clients
  4. Encrypt every payload before sending
Recommended answer: B
The client is the translation layer between host and server: lifecycle handling (initialize / capability negotiation), mapping the server's exposed capabilities into a tool list the LLM can use, routing the LLM's tool_use calls to the right server, and injecting the tool_result back into the conversation. JSON-RPC is only the plumbing (A); no embedded LLM is needed (C); encryption belongs to the transport layer (D).

Assessment on MCP concepts (Vol.10 advanced)

6 questions · Skilljar Vol.10 course
Question 1

Sampling in MCP — what does it actually do?

  1. The server sends statistics about LLM token usage to the host
  2. The server requests the host's LLM to generate a completion on its behalf — capability negotiation must enable it, and the host approves before forwarding
  3. It's a load-balancing technique for multiple MCP servers
  4. It samples a subset of resources to avoid overloading the model
Recommended answer: B
Sampling lets the server call back into the host's LLM: the server submits a prompt plus parameters, the host reviews it and proxies it to the model, and the result comes back to the server. This is the standard pattern when a server's own workflow (e.g. summarizing a large document) needs LLM reasoning. It has nothing to do with statistical sampling (A/D) or load balancing (C).
Question 2

Roots — what are they for?

  1. The set of administrators allowed to install MCP servers
  2. Cryptographic root certificates for secure transport
  3. Filesystem boundaries the host advertises to servers — "you may operate within these directories" — letting servers scope their work appropriately
  4. The top-level routes in an HTTP-based MCP server
Recommended answer: C
Roots are how the host tells the server "these are the working directories I authorize" — filesystem-oriented servers (codebase indexing, document reading) use the roots to decide what to scan. This has nothing to do with admin permissions (A), TLS certificates (B), or HTTP routes (D); it is MCP's project-boundary mechanism.
Question 3

Picking transport: STDIO vs Streamable HTTP. Which fits "internal Linear MCP server hosted on a private VM, used by 50 employees with team-shared auth"?

  1. STDIO — keeps things simple
  2. Streamable HTTP — multi-user remote access requires a network transport with auth and session handling
  3. Either works equivalently
  4. Neither — MCP can't serve multiple users
Recommended answer: B
STDIO means the host and server sit on the same machine, communicating over a subprocess pipe: single-user, no network. Streamable HTTP is what supports remote multi-user access, auth, sessions, and load balancing. STDIO cannot serve 50 remote employees (A/C), and MCP fully supports multiple users (D).
Question 4

Your MCP tool runs a 90-second build. Without progress notifications, what happens on the host side?

  1. The host shows "running..." indefinitely with no feedback — users assume it's stuck and may abort
  2. Streaming HTTP automatically emits keep-alive — no action needed
  3. JSON-RPC times out at 30s and fails the call
  4. The tool can't run longer than 30s in MCP at all
Recommended answer: A
Long-running tasks must actively send `progress` notifications (with a token, percentage, and message); otherwise the host UI can only show "running" with no progress, which is a poor experience that pushes users to abort. Keep-alives (B) hold the connection open but carry no task progress; 30s is not a hard protocol limit (C/D).
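A sketch of what one such notification looks like on the wire, assuming the host supplied a `progressToken` in the original `tools/call` request (the token value and timing numbers are made up for illustration):

```python
# MCP progress notification as a JSON-RPC message. The server emits these
# periodically during the 90-second build so the host can render a bar.
progress_note = {
    "jsonrpc": "2.0",
    "method": "notifications/progress",
    "params": {
        "progressToken": "build-42",   # echoes the token from the request
        "progress": 30,                # work completed so far
        "total": 90,                   # expected total (seconds, here)
        "message": "Compiling modules (30/90s)",
    },
}
```

Notifications have no `id` field — they expect no response, which is why the host can simply stream them into its UI.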
Question 5

Streamable HTTP — stateless vs stateful session. When should you go stateful?

  1. Always — stateful is more reliable
  2. Never — stateless scales better
  3. When the server holds per-session resources (in-flight cursor, partial uploads, conversation memory) that can't be reconstructed from a single request
  4. Whenever you have more than 10 concurrent users
Recommended answer: C
Streamable HTTP is stateless by default, which keeps horizontal scaling easy; go stateful only when the server must hold continuity across requests (pagination cursors, upload progress, conversation memory), and accept the deployment complexity that comes with it. "Always stateful" (A) is over-engineering and "never stateful" (B) is dogma; user count (D) is not the deciding dimension.
Question 6

Productionizing an HTTP MCP server. Which is the most important security control?

  1. Choose a unique port number to obscure the server
  2. Authenticate every request, scope tools/resources per identity, audit log every call, and treat all external content as untrusted (injection defense)
  3. Disable HTTPS for performance — internal traffic is safe
  4. Run with default settings; MCP includes built-in security
Recommended answer: B
The production security checklist for an MCP server: authentication + per-identity authorization scoping + audit logging + injection defense — all four are required. Hiding the port (A) is security through obscurity; disabling HTTPS (C) exposes credentials; the MCP protocol ships no built-in security (D) — it is the developer's responsibility.

📦 Vol.5 Introduction to Claude Cowork

Skilljar "Introduction to Claude Cowork" — 1 quiz · 5 questions · covering the task loop / projects / plugins / scheduled tasks / file tasks / permissions

Quiz on Claude Cowork

5 questions · Skilljar Vol.5 course
Question 1

Three tasks land on your desk. Which one is the strongest fit for Cowork (vs. Chat or Claude Code)?

  1. Quickly summarize a single article you pasted into the chat
  2. Refactor a Python module's tests in a git repository
  3. Sort 200 mixed PDFs in `~/Downloads/` into year/month subfolders, generate an Excel index, and produce a one-page summary
  4. Brainstorm taglines with a colleague over a chat thread
Recommended answer: C
Cowork is built for knowledge work that is multi-step, touches local files, and produces real artifacts. Sorting 200 PDFs plus an index and a summary hits the bullseye: batch work, across files, with native Excel/document output. A single-article summary (A) is lighter in Chat; code refactoring (B) is Claude Code territory; brainstorming (D) is a Chat conversation.
Question 2

In the Cowork task loop, your role is primarily…?

  1. Hands-off — kick off the task and let Claude run end-to-end
  2. Continuous co-pilot — type alongside Claude on every step
  3. Plan reviewer + outcome auditor — describe the goal, approve plans, sample-check results, halt early when direction is wrong
  4. Just a log reader — Cowork can't be interrupted once running
Recommended answer: C
Cowork's loop puts the human at two key points: up front, defining goal + inputs + outputs + constraints + acceptance criteria; afterwards, reviewing and spot-checking. File operations are irreversible, so the approval and review steps in "describe — approve — execute — deliver" must stay with a person. It is not self-driving (A) or pair programming (B), and you can interrupt at any time (D).
Question 3

A Cowork Project differs from a one-shot Cowork task in that it…?

  1. Runs in Anthropic's cloud, freeing your laptop
  2. Persists instructions, file context, scheduled tasks, and memory inside a named local workspace — repeatable work without re-explaining context
  3. Is the same concept as a Claude Code project; just a UI rename
  4. Makes the Project's data shared across everyone in your team automatically
Recommended answer: B
A Cowork Project is a persistent local workspace: team background, directory structure, output formats, and scheduled tasks are written down once and loaded automatically for every task. It still runs on your computer (A); it is a different concept from a Claude Code project (C); it is private by default and not auto-shared across people (D).
Question 4

You set up "every Friday 5 pm: generate weekly digest." On Friday at 5 pm your laptop is closed in your bag. What happens?

  1. Cowork runs anyway in Anthropic's cloud — output appears Monday
  2. Nothing — Cowork scheduled tasks need Claude Desktop running and the computer awake; missed runs may catch up next time you open the app, depending on the schedule
  3. Your phone runs it via push notification
  4. The next launch raises an error and disables the schedule
Recommended answer: B
Cowork scheduled tasks are not cloud cron — Claude Desktop must be open and the computer awake. That constraint shapes how you design scheduled work: pick low-risk, catch-up-friendly jobs with failure notifications. Cloud execution (A) and phone push (C) are misconceptions; the schedule is not automatically disabled (D).
Question 5

Inside Cowork, what's the relationship between Plugins, Skills, Connectors, and Sub-agents?

  1. They're four names for the same connector concept
  2. Skills are the umbrella; the others plug into a Skill
  3. A Plugin packages Skills (how to do something), Connectors (where to reach), and Sub-agents (who handles which part) into a specialist workflow for a role
  4. Sub-agents only exist in Claude Code, not in Cowork
Recommended answer: C
A Plugin is a role-shaped work package: Skills define how to do something, Connectors define what it can reach, and Sub-agents split complex work across specialist roles in parallel. Their responsibilities are distinct (A); the Plugin is the packaging layer, not the Skill (B); Cowork supports sub-agents too (D).

📦 Vol.2 / Vol.3 / Vol.13 Intro + Applied Course Quizzes

Skilljar "AI Capabilities and Limitations" + "Claude Code 101" + "AI Fluency for nonprofits" — 3 course quizzes · 15 questions · covering LLM fundamentals / Claude Code onboarding boundaries / AI Fluency in nonprofit settings

Course Quiz Vol.2

5 questions · Skilljar Vol.2 "AI Capabilities and Limitations"
Question 1

When Claude generates text, what is it actually predicting?

  1. The semantic meaning of the next sentence as a whole
  2. The full final answer, then it checks back from the end
  3. The next token, given all prior tokens — repeated until a stop condition
  4. A vector embedding that's later decoded into characters
Recommended answer: C
An LLM is an autoregressive next-token prediction machine: each step decides only the next token, looping until a stop sequence / max_tokens / end-of-turn. It doesn't predict whole-passage semantics (A), doesn't work backwards from the ending (B), and its output stage isn't embedding decoding (D). Understanding this mechanism explains why prompt ordering, prefills, and stop sequences all work.
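The loop described above can be illustrated with a toy greedy generator — a hard-coded bigram table stands in for the model, which is a drastic simplification (real LLMs emit a probability distribution over ~100k tokens), but the control flow is the same:

```python
# Toy autoregressive generation: predict one token, append it, feed the
# sequence back in, repeat until a stop condition fires.
NEXT = {"the": "cat", "cat": "sat", "sat": "down", "down": "<eot>"}

def generate(start, max_tokens=10, stop="<eot>"):
    tokens = [start]
    for _ in range(max_tokens - 1):
        nxt = NEXT.get(tokens[-1], stop)
        if nxt == stop:          # stop condition: end-of-turn token
            break
        tokens.append(nxt)       # autoregression: output becomes input
    return tokens

print(generate("the"))  # ['the', 'cat', 'sat', 'down']
```

Both stop conditions from the answer appear here: the `<eot>` token and the `max_tokens` budget.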
Question 2

Claude confidently states a fact that turns out to be wrong. The most accurate explanation?

  1. The model is broken; report it for retraining
  2. Hallucination is an inherent risk of next-token prediction over imperfect training data — verification, citations, and RAG are the mitigations, not "telling Claude to be careful"
  3. Claude is intentionally deceptive in some scenarios
  4. It happens only in low-temperature settings
Recommended answer: B
Hallucination comes from the LLM's core mechanism (probabilistic continuation) plus imperfect training data — it is structural, not a bug. Mitigation comes from RAG (external sources), Citations (traceability), and evals (regression detection), not from writing "please don't make things up" in the prompt. It isn't intentional (C) and has no strong link to temperature (D).
Question 3

Pre-training, fine-tuning, and RLHF — pick the order and what each contributes.

  1. RLHF → fine-tuning → pre-training, building from style to knowledge
  2. They are alternatives; you pick one based on use case
  3. Pre-training (next-token on huge corpora — broad world knowledge) → fine-tuning (task-specific data — sharper behavior) → RLHF (human preference signals — alignment with helpfulness/safety)
  4. Only pre-training matters; the rest are marketing names
Recommended answer: C
The three-stage pipeline: pre-training learns language and world knowledge (the bulk of the parameters), fine-tuning learns tasks (shaping on smaller datasets), and RLHF learns human preferences (making answers helpful and safe). The reversed order (A) is wrong; they are not alternatives (B); the latter two stages matter enormously for perceived quality (D).
Question 4

Today's date is past Claude's training cutoff. Which question is the riskiest to take Claude's answer at face value?

  1. "Explain the origins of the Roman numeral system"
  2. "Write a Python function to reverse a string"
  3. "Who's currently the CEO of [recent-news company X]?"
  4. "What's a metaphor for resilience?"
Recommended answer: C
Domains where facts change after the cutoff (executives, current events, new version numbers, pricing, regulations) go stale fastest. History (A), deterministic code (B), and rhetorical advice (D) are barely affected by the cutoff. When fresh facts are needed, use RAG / web search / connectors rather than asking directly.
Question 5

A teammate writes "make this email better" as a Claude prompt. Why does it underperform "rewrite this email to be 100 words, polite-firm tone, ending with a clear next-step request"?

  1. Shorter prompts always work worse
  2. It needs a higher temperature setting
  3. "Better" is undefined; Claude has no shared rubric. Specifying length, tone, and required structure replaces ambiguous adjectives with measurable constraints
  4. You must use XML tags or it won't work
Recommended answer: C
"Make it better" is a vague adjective; the LLM does not share your internal rubric. Breaking the goal into length, tone, structure, and a required action applies the clear-and-specific principle. Short doesn't mean worse (A); temperature is irrelevant (B); XML tags are a tool, not a requirement (D).

Course Quiz Vol.3

5 questions · Skilljar Vol.3 "Claude Code 101"
Question 1

Three colleagues describe Claude Code. Which one has the right product mental model?

  1. "It's a chatbot in a browser tab where I paste code snippets"
  2. "It's a terminal-native AI pair-programmer that operates inside my repository — reads files, proposes edits, runs commands"
  3. "It's an autocomplete extension; it suggests one line at a time"
  4. "It's a no-code visual flow builder"
Recommended answer: B
Claude Code runs in the terminal with full filesystem access to the current repo: it reads files, runs commands, edits code, and proposes diffs. It is not a browser chat (A), not line-level completion (C — that's IDE inline suggestions), and not a visual builder (D). This mental model shapes how users write prompts and delegate authority.
Question 2

First time setting up Claude Code in a repo. What's the right first move?

  1. Just run `claude` and start asking questions — it figures everything out
  2. Run `/init` (or write a small `CLAUDE.md`) capturing the repo's stack, conventions, key directories, and "do not touch" zones — Claude loads this every session
  3. Copy the entire codebase into the prompt to "give context"
  4. Disable file write so Claude can only read
Recommended answer: B
CLAUDE.md (or the file `/init` generates) is project memory: team conventions, a directory map, and no-go zones written once and loaded every session. Running bare `claude` (A) makes Claude rediscover everything; pasting the whole repo (C) burns tokens; read-only mode (D) gives up the value. Write CLAUDE.md first, then start the work.
Question 3

Claude proposes a 12-file edit. You glance, think it looks good, and want to accept. Best practice?

  1. Accept all — Claude wouldn't propose something wrong
  2. Reject everything — too risky to read
  3. Review the diff, accept partial / file-by-file, run tests, and commit only what passed — Claude Code is a proposer, not an autopilot
  4. Always run `git add -A` before reading the diff to "checkpoint"
Recommended answer: C
Claude Code's editing model is "propose → you review → you accept → run tests → commit". Blind acceptance (A) accumulates errors; rejecting everything (B) negates the tool's value; pre-staging with `git add -A` (D) pollutes the staging area. Reviewing the diff, running tests, and accepting partially is the heart of the collaboration.
Question 4

When is Claude Code the WRONG tool?

  1. You need to refactor a 30-file module
  2. You want to add tests for a new feature
  3. You need to brainstorm a product roadmap with a stakeholder over coffee — there's no codebase context, just a conversation
  4. You want to debug a flaky CI job
Recommended answer: C
Claude Code's value lies in scenarios with code + a terminal + file operations. Pure conversation with no codebase (C) belongs in Claude Chat. Multi-file refactoring (A), writing tests (B), and CI debugging (D) are all Code's strengths. "Using the IDE as PowerPoint" is a common misuse.
Question 5

Mid-task you realize Claude Code is heading the wrong direction. Best action?

  1. Wait for it to finish; then start over
  2. Force-kill the terminal and discard everything
  3. Press Esc / interrupt, redirect with new context, and use `/clear` only if the off-track context is irreversible — keeping useful history saves restart cost
  4. Open a separate `claude` session in another terminal so they vote
Recommended answer: C
Interrupting and redirecting is the core of collaborating with Claude Code. Waiting for it to finish (A) wastes time and tokens; killing the process (B) loses all context; parallel voting (D) only adds chaos. Use `/clear` only when the context is polluted beyond saving — in most cases, redirecting in place is enough.

Course Quiz Vol.13

5 questions · Skilljar Vol.13 "AI Fluency for nonprofits"
Question 1

In the 4D framework (Delegation / Description / Discernment / Diligence), which D handles "deciding whether AI should do this task at all"?

  1. Description
  2. Delegation
  3. Discernment
  4. Diligence
Recommended answer: B
Delegation is task routing: deciding whether a task is done by a human, by AI, or jointly. Description is writing the prompt well (how to brief), Discernment is judging output quality, and Diligence is owning the outcome. Pick the right tool first; the rest follows.
Question 2

Your nonprofit uses Claude to research grant opportunities. The output lists 5 promising funders. Discernment step says you should…?

  1. Trust the list — Claude has access to current grant databases
  2. Verify each funder still exists, the deadline is current, and the eligibility criteria match — Claude may hallucinate or use outdated data
  3. Ask Claude to verify itself by re-running the prompt
  4. Forward the list to staff without review
Recommended answer: B
The core of Discernment is not treating AI output as fact — especially for external funders, deadlines, and compliance. Claude may hallucinate or rely on stale knowledge; verify each item against the official site or a trusted database. "Let Claude check itself" (C) is asking the author to grade their own exam.
Question 3

Which type of nonprofit data is the LEAST safe to paste into a public Claude conversation?

  1. Your annual report PDF (already public)
  2. Generic fundraising email templates
  3. Beneficiary case files containing names, contact info, health details, or immigration status
  4. Your mission statement
Recommended answer: C
PII and sensitive beneficiary records (health, immigration status, case details) sit in the highest sensitivity tier: a leak causes real harm to vulnerable people. Already-public material (A/D) and generic templates (B) are low risk. Handling this data calls for a compliant enterprise plan, data minimization, and anonymization.
Question 4

"Diligence" in the 4D framework operationally means…?

  1. Working harder than the AI does
  2. Owning the final outcome — humans remain accountable for what was sent / decided / published, and document the AI's role + limits in their workflow
  3. Reading every Claude transcript end-to-end
  4. Running the same prompt three times for consensus
Recommended answer: B
Diligence = humans owning the final outcome and documenting the AI's role and limits in the workflow. It is not physical effort (A), word-by-word reading (C), or majority voting (D). The point being tested: put your AI usage into an auditable workflow instead of offloading responsibility onto the model.
Question 5

Your director asks Claude to analyze 5 years of donation data and recommend a fundraising strategy. When is AI-only analysis NOT enough?

  1. Always — AI can never do data analysis
  2. Never — AI handles it end-to-end if you upload the spreadsheet
  3. When decisions affect real budgets, staff, or beneficiaries — AI surfaces patterns and questions, humans validate the math, judge the strategy, and own the call
  4. Only when the dataset is over 100k rows
Recommended answer: C
AI accelerates data analysis — spotting patterns, generating hypotheses, drafting communications — but strategic decisions touching budgets, staff, or beneficiaries need human sign-off. "AI can do everything" (A) and "AI can do nothing" (B) are both extremes; row count (D) is not the deciding dimension. The point: augmentation, not automation.

📦 Vertex Course

Skilljar "Claude with Google Cloud's Vertex AI" — 9 quizzes · 58 questions · covering API / Prompt / Tool Use / RAG / MCP / Agent + Final Assessment

Quiz on Accessing Claude with the API

7 questions · Skilljar Vertex course
Question 1

You're building a chat app that talks to Claude. Where should you put your API key?

  1. In the user's browser
  2. On your secure server
  3. In your website's JavaScript code
  4. In a public GitHub repository
Recommended answer: B · On your secure server
The API key must stay in a trusted backend environment. Putting it in the browser, JavaScript code, or a public repo means theft and a runaway bill. The frontend talks only to your server, which calls the Anthropic API on its behalf.
Question 2

What is the primary purpose of a system prompt when working with Claude?

  1. To authenticate API requests to the Anthropic service
  2. To provide instructions that customize Claude's tone, style, and approach
  3. To limit the number of tokens Claude can generate in a response
  4. To store the conversation history between multiple requests
Recommended answer: B
The system prompt is the role-and-tone layer: it tells Claude who it is, what voice to use, and which rules to follow. Authentication uses the API key (A), token limits use max_tokens (C), and conversation history lives in the messages array (D).
Question 3

Your users complain that your chat app feels slow because they wait 20 seconds staring at a loading spinner, then a bunch of generated text suddenly appears. What feature should you add to fix this?

  1. Shorter prompts
  2. Response streaming
  3. Multiple chatbots
  4. Faster internet connection
Recommended answer: B · Response streaming
Streaming pushes tokens to the frontend as they are generated, turning "20 seconds of blank then everything at once" into "typing starts immediately". It is the standard fix for perceived latency — no prompt shortening or faster network required.
Question 4

You want Claude to generate only clean JSON code without any explanations or markdown formatting. Which combination of techniques works best?

  1. Request shorter responses
  2. Ask nicely in the prompt
  3. Use high temperature and long prompts
  4. Prefill with "{" and use "```" as a stop sequence
Recommended answer: D
A combination of two techniques: response prefill (prepending { to the assistant message) locks in the JSON opening, and a stop sequence (```) cuts off markdown code-block endings. Far more reliable than asking nicely.
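A sketch of how the two techniques combine in a Messages API request — the payload is built as a plain dict for illustration, and the model name is a placeholder; in a real app these fields would go to the Anthropic SDK:

```python
# Prefill + stop sequence: the assistant turn is pre-seeded with "{",
# so generation continues from raw JSON, and "```" halts any attempt
# to close a markdown code fence.
request = {
    "model": "claude-sonnet-4-5",       # illustrative model name
    "max_tokens": 1024,
    "stop_sequences": ["```"],           # cut off markdown fences
    "messages": [
        {"role": "user", "content": "List 3 fruits as JSON with name and color."},
        {"role": "assistant", "content": "{"},   # prefill: forces JSON from char one
    ],
}
```

Note the last message has `role: assistant` — that is what makes it a prefill rather than a user instruction.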
Question 5

You're building a chatbot to answer factual questions about your company. You want consistent, reliable answers every time. What temperature setting should you use?

  1. 0.1 (low temperature)
  2. It doesn't matter
  3. 0.8 (high temperature)
  4. 1.0 (high temperature)
Recommended answer: A
Lower temperature means more deterministic output: identical inputs converge on identical outputs. Factual Q&A needs predictability, and 0.1 is a typical choice; high temperature suits creative writing or brainstorming.
Question 6

Claude reads your message "I love quantum physics." What happens first?

  1. Claude breaks the text into smaller pieces called tokens
  2. Claude researches quantum physics
  3. Claude writes a response immediately
  4. Claude translates it to another language
Recommended answer: A
All LLM input first passes through a tokenizer that splits it into tokens; the model sees only token IDs. Claude doesn't go off and research the topic (B), and it can't skip tokenization to answer directly (C).
Question 7

You ask Claude "What's the best programming language?" but you want it to specifically argue for Python. What technique helps you control this?

  1. Use a higher temperature setting
  2. Ask the question multiple times
  3. Make the text bigger
  4. Add an assistant message starting with "Python is the best because"
Recommended answer: D
Response prefill again: add an assistant-role "opening" to the messages array, and Claude can only continue from that opening, which locks in stance or format. Much more effective than repeated instructions in the prompt.

Quiz on Prompt Engineering Techniques

6 questions · Skilljar Vertex course
Question 1

You're giving your AI a long document to analyze along with your instructions. What would help the AI understand your prompt better?

  1. Put everything in one big paragraph
  2. Write the document in all capital letters
  3. Use XML tags like <document> and <instructions>
  4. Separate sections with lots of blank lines
Recommended answer: C
Anthropic's official guidance is to use XML tags to draw semantic boundaries in prompts — Claude saw this format extensively in training and can precisely distinguish the roles of <document> and <instructions>. Blank lines (D) and all-caps (B) carry no structure.
Question 2

You want your AI to write a book review. Which approach would be more helpful?

  1. "Write a book review that's 300 words, includes plot summary, mentions two characters, and gives a rating"
  2. "Write a book review that's good and interesting"
  3. "Tell me what you think about books in general"
  4. "Write something about the book I just read"
Recommended answer: A
The "be specific" principle: explicit length, required elements, measurable criteria. "Good and interesting" and "something about" are vague words the AI can only guess at. This is exactly the Description element of the 4D framework (role/task/format/criteria) in action.
Question 3

You want to improve a prompt that isn't working well. What should you do first after writing your initial prompt?

  1. Test it and measure how well it performs
  2. Add more examples immediately
  3. Rewrite it completely from scratch
  4. Make it longer and more detailed
Recommended answer: A
The eval-first principle: without measurement there is no direction for improvement. Blindly adding examples (B), rewriting (C), or lengthening (D) can make the prompt worse without you noticing. Build an evaluation set first, then iterate.
Question 4

Your evaluation report shows your prompt scored poorly on "missing calorie information" across multiple test cases. What does this tell you?

  1. You should ignore this feedback and try something else
  2. You need to specifically tell your prompt to include calorie information
  3. The test cases are too hard
  4. The evaluation system is broken
Recommended answer: B
The evaluation feedback points at a systematic omission — make the missing item explicit in the prompt. Claude won't automatically guess what you want; whatever is missing, write it in. Blaming the eval (C/D) or ignoring it (A) is deflection.
Question 5

You want your AI to detect sarcasm in social media posts, but it keeps missing sarcastic comments. What would help most?

  1. Tell it to "try harder" to find sarcasm
  2. Use bigger fonts in your prompt
  3. Show it examples of sarcastic posts with correct labels
  4. Ask it to guess when posts might be sarcastic
Recommended answer: C
Few-shot examples are the most effective way to steer complex judgments — especially for hard-to-articulate concepts like sarcasm, where examples beat definitions. "Try harder" (A) means nothing to an LLM.
Question 6

Why is the first line of your prompt considered the most important part?

  1. It determines which AI model will be used
  2. It determines how fast the AI will respond
  3. It sets the stage for everything that follows and should be clear and direct
  4. It controls the length of the AI's response
Recommended answer: C
The first line establishes the task frame — LLMs are strongly anchored by the opening sentence, and a vague or off-topic opening defocuses everything after it. Anthropic's "be clear and direct" principle puts particular weight on the first line.

Quiz on Prompt Evaluation

4 questions · Skilljar Vertex course
Question 1

You've learned techniques for writing better prompts, but now you want to measure how well they actually work. What do you need?

  1. More prompt engineering techniques
  2. Prompt evaluation methods
  3. More training data
  4. A faster AI model
Recommended answer: B
The question carries its own hint: "measure how well they work" is evaluation. Prompt engineering (A) is about writing better prompts; evaluation is about measuring how good they are. The two complement each other in a loop.
Question 2

You need test cases for your prompt evaluation but don't want to write them all by hand. What's a good alternative?

  1. Ask users to create test cases
  2. Only test with one example
  3. Skip testing and deploy immediately
  4. Use Claude to generate test cases automatically
Recommended answer: D
Using Claude to generate diverse test cases is a widely accepted efficiency play — you can specify edge cases, personas, and noisy scenarios and produce them in bulk. The other options sacrifice coverage (B), sacrifice quality (C), or shift the burden (A).
Question 3

You write a prompt and test it twice with your own inputs. It looks good, so you deploy it. What's the main risk?

  1. The prompt will work too slowly
  2. The prompt will become too expensive
  3. The AI model will stop working
  4. Users might provide unexpected inputs that break it
Recommended answer: D
Two self-tests cover essentially nothing; real user input is far more varied than you expect — edge cases, adversarial input, and mixed languages can all trigger breakage. This is a classic failure of insufficient Discernment.
Question 4

In a typical evaluation workflow, what happens right after you feed your prompts through Claude?

  1. You change the prompt and start over
  2. You feed the responses through a grader
  3. You create a new dataset
  4. You deploy the prompt to production
Recommended answer: B
The standard eval workflow order: dataset → prompt → response → grader → score. The grader can be code (deterministic checks) or another LLM (model-based grading). Changing the prompt (A) and deploying (D) happen after you have scores.

Quiz on Tool Use with Claude

7 questions · Skilljar Vertex course
Question 1

When Claude wants to use a tool, it sends back a response that's different from usual. What does this response contain?

  1. Error messages only
  2. Both text blocks and tool use blocks
  3. Only tool requests with no text
  4. Only text like normal
Recommended answer: B
A tool-use response's content array typically holds both Claude's explanatory text (a text block) and the tool invocation request (a tool_use block), with stop_reason set to tool_use. Tool-only responses with no text (C) do occur occasionally, but B is the typical case.
Question 2

You want to create a tool that gets the current time. What type of code do you need to write?

  1. A regular Python function
  2. A web page
  3. A database query
  4. A complex AI algorithm
Recommended answer: A
A tool is just an ordinary function: Claude emits a tool_use request, your app runs the function (e.g. datetime.now()), and you return the result as a tool_result. No AI algorithm required.
Question 3

A user asks Claude "What day will it be 30 days from today?" To answer this, Claude needs to use multiple tools. What happens?

  1. Claude asks the user to do the math
  2. Claude uses one tool and guesses the rest
  3. Claude calls tools in sequence — first getting today's date, then adding 30 days
  4. Claude gives up and says it can't help
Recommended answer: C
This is the multi-turn tool-use (agentic loop) pattern: Claude plans the tool-call sequence itself and decides the next step after each tool_result. This is the core capability of an agent.
Question 4

Sarah asks Claude "What's the weather like today?" but Claude says it doesn't have current weather data. What would solve this problem?

  1. Waiting for Claude to update itself
  2. Giving Claude access to tools that fetch current data
  3. Asking Claude to guess the weather
  4. Training Claude on more weather information
Recommended answer: B
Claude's training data has a cutoff date and it has no access to live information. Tool use is the standard fix for real-time data — for example a weather-API tool. Waiting for self-updates (A) or retraining (D) is not realistic.
Question 5

You're building a chat app with Claude. A user asks for today's stock prices, but Claude responds "I don't have access to current stock information." What's the core problem?

  1. The user asked the wrong question
  2. Claude is broken
  3. Claude only knows information from its training data
  4. Claude needs to be restarted
Recommended answer: C
This is the LLM's fundamental limitation — no external access; it can only recall what was in its training data. Understanding this is the prerequisite for deciding when to introduce tools. Q4 asks "what to do"; Q5 asks "why".
Question 6

You want to give Claude the ability to search the web for current information. What do you need to implement?

  1. Just a simple schema — Claude handles the searching
  2. Your own search engine
  3. A complex web scraping system
  4. Permission from Google
Recommended answer: A
This refers to Anthropic's built-in web_search tool — declare it in the tools array and Claude calls it automatically, weaving the results into its answer. No need to build a search engine or obtain special permission.
Question 7

You've written a Python function for Claude to use. What else do you need so Claude knows how to call it?

  1. Permission from Claude
  2. A JSON schema describing the function
  3. A special license
  4. A user manual
Recommended answer: B
A tool definition = name + description + input_schema (JSON Schema). Claude looks only at the schema to decide when to call and what arguments to pass; the function code runs in your app and Claude never sees it.
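The full contract can be sketched in a few lines — the tool name and dispatch table here are illustrative, not from any SDK; the point is that Claude sees only `definition`, while your app owns the function:

```python
from datetime import datetime, timezone

# What Claude sees: name + description + input_schema (JSON Schema).
definition = {
    "name": "get_current_time",
    "description": "Returns the current UTC time in ISO-8601 format.",
    "input_schema": {"type": "object", "properties": {}},
}

# What your app owns: an ordinary Python function Claude never sees.
def get_current_time():
    return datetime.now(timezone.utc).isoformat()

# When Claude replies with a tool_use block naming this tool, your app
# dispatches it and returns the value in a tool_result message.
HANDLERS = {definition["name"]: get_current_time}
result = HANDLERS["get_current_time"]()
```

The dispatch dict is the hinge: it maps the name in the schema to the code that actually runs.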

Quiz on Retrieval Augmented Generation

8 questions · Skilljar Vertex course
Corresponding local volume: V8·L04 RAG / Agent patterns
Question 1

You're setting up a system to handle large documents. Instead of using everything at once, you break documents into smaller pieces and search for relevant ones. What is this approach called?

  1. The chunking approach
  2. File splitting
  3. Text summarization
  4. Document compression
Recommended answer: A
Chunking is the standard RAG term: splitting documents along semantic/paragraph boundaries into small indexable pieces (chunks). File splitting is a file-level operation, summarization changes the content, and compression doesn't partition.
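A minimal fixed-size chunker with overlap, the simplest of the strategies the answer mentions (real pipelines often split on paragraph or semantic boundaries instead; the sizes below are arbitrary):

```python
# Fixed-size chunking: slide a window of `size` characters forward by
# `size - overlap` each step, so adjacent chunks share some context.
def chunk(text, size=200, overlap=50):
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

doc = "x" * 500
pieces = chunk(doc)
# Window starts at 0, 150, 300, 450 → four chunks; the last is a stub.
```

The overlap is what keeps a sentence that straddles a boundary retrievable from at least one chunk.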
Question 2

You try to include a massive 800-page document directly in your Claude prompt. What problems will you likely face?

  1. There are hard limits on text length, reduced effectiveness, and higher costs
  2. The document will be perfectly processed
  3. Claude will work faster than normal
  4. Only the cost will increase slightly
Recommended answer: A
Three problems at once: (1) the context-window limit (200K by default / 1M for Opus); (2) retrieval effectiveness degrades in long contexts (lost in the middle); (3) input-token costs multiply. This is exactly why RAG exists.
Question 3

You have an 800-page financial report and want to ask Claude specific questions about it. What does RAG help you do?

  1. Ask only yes/no questions
  2. Put the entire document into each prompt
  3. Summarize the whole document first
  4. Find and include only the relevant sections for each question
Recommended answer: D
RAG = Retrieval-Augmented Generation: retrieve the chunks relevant to each question and inject them into the prompt, so Claude sees context that is sufficient and necessary. Not stuffing in the full text (B), and not summarizing first (C, which loses detail).
Question 4

You send the text "The cat is happy" to an embedding model. What do you get back?

  1. A summary of the text
  2. A translation in another language
  3. A list of keywords
  4. A long list of numbers
Recommended answer: D
An embedding model maps text to a fixed-length vector (e.g. a 1536-dimensional float array); the numbers capture semantics and enable semantic search via cosine similarity. Embeddings don't explain, translate, or extract keywords.
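To make "a long list of numbers" concrete, here is cosine similarity over toy 3-dimensional vectors — the vectors are invented for illustration (real embeddings have hundreds or thousands of dimensions), but the math is the real thing:

```python
import math

# Cosine similarity: dot product normalized by vector lengths.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

happy_cat = [0.9, 0.1, 0.2]      # hypothetical vector for "The cat is happy"
joyful_kitten = [0.8, 0.2, 0.3]  # semantically close text → nearby vector
tax_form = [0.1, 0.9, 0.8]       # unrelated text → distant vector

# Semantic search is just "rank by cosine similarity to the query vector".
assert cosine(happy_cat, joyful_kitten) > cosine(happy_cat, tax_form)
```

Identical directions score 1.0; unrelated directions score near 0 — that ordering is all a vector store needs.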
Question 5

What problem does contextual retrieval solve in RAG systems?

  1. It makes search queries run faster
  2. It reduces the storage space needed for embeddings
  3. It addresses the issue of chunks losing their connection to broader document context when documents are split
  4. It eliminates the need for vector databases
Recommended answer: C
Anthropic's Contextual Retrieval: prefix each chunk with a Claude-generated summary of the chunk's place and role in the original document, restoring the context severed by splitting and significantly improving retrieval accuracy. It doesn't affect speed (A) or storage (B).
Question 6

You have search results from both semantic search and BM25 search. They use different scoring systems. How do you combine them into one ranked list?

  1. Use Reciprocal Rank Fusion (RRF) based on rank positions
  2. Take the average of both scores
  3. Add the scores together directly
  4. Use only the semantic search results
Recommended answer: A
RRF fuses by rank position (absolute scores from different scorers aren't comparable): each document's final score = Σ 1/(k + rank). This is the industry-standard method for hybrid retrieval.
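The formula fits in a few lines; the document IDs below are made up, and k=60 is the value commonly used in the RRF literature:

```python
# Reciprocal Rank Fusion: each list contributes 1/(k + rank) per document.
# Only rank positions matter — raw scores from the two systems never mix.
def rrf(rankings, k=60):
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["doc_a", "doc_b", "doc_c"]   # semantic-search ranking
bm25 = ["doc_b", "doc_d", "doc_a"]       # BM25 lexical ranking
fused = rrf([semantic, bm25])
# doc_b (ranks 2 and 1) edges out doc_a (ranks 1 and 3).
```

A document ranked well by both systems beats one ranked first by only one — which is exactly the behavior hybrid search wants.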
Question 7

What is the purpose of re-ranking in RAG pipelines?

  1. To compress the vector database for faster searches
  2. To generate better embeddings for text chunks
  3. To use an LLM to intelligently reorder search results after initial retrieval
  4. To split documents into more appropriate chunk sizes
Recommended answer: C
Re-ranking is a post-retrieval stage: coarse retrieval produces N candidates, then a slower but more accurate reranker (typically a cross-encoder or a small LLM) re-sorts them so the truly relevant ones rise to the top.
Question 8

You're searching for a specific incident ID like "INC-2023-Q4-011" in your documents. Semantic search isn't finding it well. What additional search method would help?

  1. Bigger vector database
  2. BM25 lexical search for exact term matching
  3. Longer embeddings
  4. More chunks
Recommended answer: B
Semantic search excels at "similar meaning"; BM25 excels at exact lexical matches (IDs, SKUs, error codes, names). Together they form hybrid search — the two retrieval paths that Q6's RRF fuses.

Quiz on Features of Claude

6 questions · Skilljar Vertex course
Question 1

When Extended Thinking is enabled, what two parts will Claude's response contain?

  1. A summary block and a detail block
  2. A thinking block and a text block
  3. A draft block and a final block
  4. A question block and an answer block
Recommended answer: B
With Extended Thinking enabled, the content array carries a thinking block (the reasoning) first, then a text block (the final answer). This is the standard structure for the 4.x hybrid-reasoning models.
Question 2

You ask Claude "How many marbles are in this image?" but get the wrong count. What's the best way to improve accuracy?

  1. Ask the question in all capital letters
  2. Send a higher quality image
  3. Upload the image multiple times
  4. Provide detailed counting steps and methodology
Recommended answer: D
Best practice for vision counting tasks: have Claude state its methodology explicitly (partition the image, draw a grid, count block by block, then sum) — effectively chain-of-thought for vision. Directly asking "how many" yields lower accuracy.
Question 3

What's the minimum amount of content needed for caching to work?

  1. Any amount of text
  2. 500 tokens
  3. 1024 tokens
  4. 2000 tokens
Recommended answer: C
Per Anthropic's docs, the default minimum cacheable prefix is 1024 tokens (2048 for some smaller models). A cache_control marker below that threshold is simply ignored — the key parameter when judging whether caching is worth it.
Question 4

You want to cache your tool definitions. Where should you place the cache breakpoint?

  1. On the last tool in your list
  2. On the middle tool in your list
  3. On every tool in your list
  4. On the first tool in your list
Recommended answer: A
A cache breakpoint caches everything before it — placing it on the last tool means all tool definitions are cached. Placing it on the first (D) caches only the first tool, the weakest option.
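A sketch of the placement, with hypothetical tool names and the descriptions elided — the one structural point is that only the final tool carries the marker:

```python
# cache_control on the LAST tool caches the whole tool list, because a
# breakpoint covers everything before it in the prompt prefix.
tools = [
    {"name": "search", "description": "...", "input_schema": {"type": "object"}},
    {"name": "get_weather", "description": "...", "input_schema": {"type": "object"}},
    {"name": "calculator", "description": "...", "input_schema": {"type": "object"}},
]
tools[-1]["cache_control"] = {"type": "ephemeral"}  # breakpoint on the final tool
```

Earlier tools need no marker of their own: they are cached implicitly as part of the prefix the breakpoint closes off.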
Question 5

How does prompt caching work?

  1. It makes Claude remember conversations forever
  2. It prevents Claude from making mistakes
  3. It reuses computational work from previous requests
  4. It translates messages into different languages
Recommended answer: C
The essence of prompt caching: Anthropic's servers cache the prefix's KV cache (intermediate attention computations); on a hit, the repeated forward pass is skipped and the cached portion of tokens is billed at 10% of the normal price. It is not conversational memory (A).
Question 6

You're building an app where users ask questions about documents. What's the main benefit of enabling citations?

  1. It reduces the cost of each request
  2. It shows users exactly where information came from
  3. It makes Claude's responses longer
  4. It makes the app run faster
Recommended answer: B
The Citations API makes Claude annotate each claim in its answer with the source passage it came from — a major gain in verifiability and a hedge against hallucination. For document Q&A, this is a compliance and trust essential.

Quiz on Model Context Protocol

7 questions · Skilljar Vertex course
Question 1

User-controlled workflows that are triggered through UI interactions like button clicks or slash commands. This definition describes which MCP primitive?

  1. Resources
  2. Sessions
  3. Tools
  4. Prompts
Recommended answer: D
Who controls each MCP primitive: Tools are model-controlled (Claude decides when to call), Resources are app-controlled (the application loads them as needed), and Prompts are user-controlled (triggered from the UI, e.g. a slash command).
Question 2

What does "transport agnostic" mean in the context of MCP communication?

  1. MCP automatically chooses the fastest available network connection
  2. MCP requires specific hardware to function properly
  3. MCP only works with HTTP connections
  4. MCP clients and servers can communicate using different methods like HTTP, WebSockets, or standard input/output
Recommended answer: D
MCP decouples the protocol layer from the transport layer: the same JSON-RPC messages can travel over stdio (local subprocess), SSE/HTTP (remote), or WebSocket. Developers pick the transport per deployment scenario; the business code doesn't change.
Question 3

What are Resources in the context of MCP?

  1. App-controlled data access for UI purposes or adding context to conversations
  2. Model-controlled functions for performing calculations
  3. User-triggered commands that start predefined workflows
  4. Server configuration settings that control performance
推荐答案A
Resources 由应用决定何时加载(不像 tool 是模型自主调)——典型用途:把当前文件、数据库行、知识库片段作为上下文传给 Claude。B 是 Tool,C 是 Prompt。
Question 4

In MCP architecture, what is the relationship between MCP Clients and MCP Servers?

  1. MCP Clients generate AI responses while MCP Servers handle user input
  2. MCP Clients store data while MCP Servers process requests
  3. MCP Clients connect to MCP Servers that contain tools, prompts, and resources
  4. MCP Clients and MCP Servers are the same component with different names
推荐答案C
标准 client-server 架构——Client(如 Claude Desktop / Cursor / Claude Code)连接多个 Server,Server 暴露 tools/prompts/resources 三类能力。Server 是能力提供方,Client 是消费方。
Question 5

What is Model Context Protocol (MCP)?

  1. A programming language specifically designed for AI applications
  2. A communication layer that provides Claude with context and tools without requiring tedious integration code
  3. A security protocol for encrypting AI model responses
  4. A database management system for storing AI conversations
推荐答案B
MCP 的「USB-C for AI」类比——一套标准协议替代 N 个 AI 应用 × M 个数据源的两两胶水代码。开发者一次实现 MCP Server,所有兼容客户端即可使用。
Question 6

What is the MCP Server Inspector?

  1. A command-line tool for monitoring server performance
  2. A code editor specifically designed for writing MCP servers
  3. A security tool for scanning MCP servers for vulnerabilities
  4. A browser-based interface for testing and debugging MCP servers in real-time
推荐答案D
官方 MCP Inspector 是一个 web UI(npx @modelcontextprotocol/inspector),可手动调用 server 的 tools / 读取 resources / 触发 prompts,用于开发期调试。
Question 7

Which of the following correctly describes Tools in MCP?

  1. App-controlled data that populates UI elements
  2. Static configuration files that define server behavior
  3. User-controlled workflows that can be triggered on demand
  4. Model-controlled functions that Claude decides when to call
推荐答案D
与 Q1/Q3 形成完整对照:Tools 由 model 决定何时用(如 search_web、execute_sql),Resources 由 app 控制(A 描述),Prompts 由 user 触发(C 描述)。

Quiz on Agents and Workflows

7 题 · Skilljar Vertex 课程
Question 1

You want Claude to analyze a product image for 6 different materials at once. Instead of one huge prompt, what should you do?

  1. Write a longer, more detailed prompt
  2. Send 6 separate requests in parallel, then combine results
  3. Ask the user to pick one material first
  4. Use an agent with material tools
推荐答案B
Parallelization 模式——独立子任务并发执行,最后聚合。避免单 prompt 因任务过多而注意力分散,也降低延迟(6 个并发 vs 串行)。
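Parallelization 模式可以用标准库线程池勾勒(`analyze` 为占位函数,真实场景中它封装一次聚焦单一材质的 Claude API 调用):

```python
from concurrent.futures import ThreadPoolExecutor

MATERIALS = ["cotton", "leather", "steel", "glass", "wood", "rubber"]

def analyze(material: str) -> str:
    # 占位:真实场景调用 client.messages.create(...),
    # prompt 只聚焦单一材质,避免注意力分散
    return f"analysis of {material}"

def analyze_all(materials):
    # 独立子任务并发执行,最后统一聚合
    with ThreadPoolExecutor(max_workers=len(materials)) as pool:
        return list(pool.map(analyze, materials))

results = analyze_all(MATERIALS)
```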
Question 2

You're building a system where Claude creates content, then checks if it's good enough, then improves it if needed. What pattern is this?

  1. Parallelization
  2. Evaluator-optimizer
  3. Routing
  4. Chaining
推荐答案B
Evaluator-optimizer:generator 出稿 → evaluator 打分/给反馈 → generator 据反馈改写,循环至达标。区别于 Chaining(线性多步)和 Routing(分类分发)。
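Evaluator-optimizer 的循环骨架大致如下(`generate` 与 `evaluate` 均为占位桩,真实场景各自封装一次 Claude 调用;阈值与轮数为假设参数):

```python
def generate(feedback=None):
    # 占位:真实场景调用 Claude 生成内容,feedback 非空时附在 prompt 中
    return "draft v2" if feedback else "draft v1"

def evaluate(draft):
    # 占位:真实场景由 evaluator prompt 给出分数与改进意见
    if draft == "draft v1":
        return 5, "too vague"
    return 9, ""

def refine(max_rounds=3, threshold=8):
    # 出稿 -> 打分 -> 据反馈改写,循环至达标或用尽轮数
    feedback = None
    for _ in range(max_rounds):
        draft = generate(feedback)
        score, feedback = evaluate(draft)
        if score >= threshold:
            return draft
    return draft

final = refine()
```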
Question 3

You're building an app where users upload photos and always get the same 4-step process to enhance them. Which approach should you use?

  1. An agent with photo tools
  2. A single complex prompt
  3. Multiple agents working together
  4. A workflow with predefined steps
推荐答案D
Workflow vs Agent 关键区分:步骤固定且已知 → workflow(可控、可预测、低成本);步骤需 Claude 自主规划 → agent。这道题「same 4-step」明示流程固定。
Question 4

Your app needs to handle both "cooking recipes" and "workout routines" with completely different styles. What pattern helps?

  1. Combine both styles in one prompt
  2. Use the same prompt for both
  3. Routing — categorize first, then use specialized prompts
  4. Always ask users to specify the category
推荐答案C
Routing 模式——先用轻量分类器(小模型/规则)判断输入类型,再分发到专属 prompt。比通用 prompt 准确率高,比让用户分类(D)体验好。
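Routing 的最小骨架如下(`classify` 用关键词规则占位,真实场景可换成小模型分类;prompt 文案为示意):

```python
def classify(query: str) -> str:
    # 占位分类器:真实场景可用小模型或规则判断类别
    return "recipe" if "recipe" in query.lower() else "workout"

PROMPTS = {
    "recipe": "You are a warm, detailed cooking instructor...",
    "workout": "You are a concise, motivating fitness coach...",
}

def route(query: str) -> str:
    # 先分类,再分发到专属 system prompt
    return PROMPTS[classify(query)]
```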
Question 5

What defines an agent when working with Claude?

  1. A predetermined sequence of steps that Claude must follow exactly
  2. A setup where Claude is given a goal and tools, then figures out how to complete the goal
  3. A system that categorizes user requests into different types
  4. A method for breaking complex tasks into parallel subtasks
推荐答案B
Agent 的本质 = 目标导向 + 自主规划工具调用顺序。A 是 workflow,C 是 routing,D 是 parallelization——这些都是固定结构,不是 agent。
Question 6

What is environment inspection in the context of AI agents?

  1. The process of categorizing user requests into different workflow types
  2. The technique of running multiple specialized tasks simultaneously
  3. A method for breaking large tasks into smaller sequential steps
  4. Claude's ability to observe and understand the results of its actions
推荐答案D
Environment inspection = agent 在每次行动后读取环境反馈(tool_result、文件状态、错误信息)以决定下一步。这是 agentic loop 的核心反馈环节。
Question 7

You're building an agent. In general, should you give it a "Refactor Code" tool or basic tools like "read file" and "write file"?

  1. "Refactor Code" tool — it's more specific
  2. Basic tools — they're more flexible
  3. Both tools together
  4. Neither — agents don't need tools
推荐答案B
Anthropic agent 设计原则:「fewer, more general tools」。基础原子工具(read/write/exec)让 agent 自主组合解决任意问题;高层抽象工具(Refactor Code)覆盖窄、需求一变就要重做。

Final Assessment Quiz Vertex

6 题 · Skilljar Vertex 课程 · 综合判断
Question 1

You're porting a Python script from `anthropic.Anthropic()` (direct API) to `anthropic.AnthropicVertex(...)`. The `model` parameter must change how?

  1. It stays the same — `claude-sonnet-4-5`
  2. Prefix with `vertex.`, e.g. `vertex.claude-sonnet-4-5`
  3. Use the Vertex Model Garden ID, typically `claude-sonnet-4-5@` or the current endpoint alias
  4. The SDK rewrites it automatically; pass anything
推荐答案C
Vertex 用 Model Garden 的 ID,常见形式是 `claude-X@YYYYMMDD` 版本日期或较新的 endpoint alias(要看 Model Garden 当前配置)。直接复用 Anthropic ID 会查无此模型(A 错);`vertex.` 是杜撰前缀(B 错,那是 Bedrock 的 `anthropic.` 前缀的混淆版本);SDK 不会替你猜(D 错)。
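迁移前后的参数差异可以用一个对照草图表示(`@` 后的版本日期、region 与 project ID 均为假设值,须以 Model Garden 实际列表为准):

```python
# Anthropic 直连与 Vertex 的 model 参数对照
direct = {"model": "claude-sonnet-4-5"}
vertex = {
    "model": "claude-sonnet-4-5@20250929",  # Model Garden 版本化 ID(日期为假设)
    "region": "us-east5",                   # 假设的可用区域
    "project_id": "my-gcp-project",         # 假设的 GCP 项目 ID
}
# 真实代码:
# client = anthropic.AnthropicVertex(region=vertex["region"],
#                                    project_id=vertex["project_id"])
# client.messages.create(model=vertex["model"], ...)
```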
Question 2

Most of your users are in the US and you need predictable latency plus US-only data residency. Endpoint choice?

  1. `global` — let Vertex route freely
  2. A US multi-region endpoint (`us`) or a specific US region — keeps requests within US borders
  3. `eu` — closer to global average
  4. Skip Vertex; call Anthropic API directly for lower latency
推荐答案B
数据驻留要求时必须显式选 US(多区域或具体 region)。`global`(A)会跨大陆路由,无法保证驻留;`eu`(C)方向相反;绕过 Vertex(D)放弃了 GCP 治理与审计——这是题面要求的反面。
Question 3

Production auth on Vertex — best pattern when running on GKE?

  1. Hard-code Service Account JSON in container image
  2. Reuse the existing `ANTHROPIC_API_KEY` env var
  3. Workload Identity — Pod assumes a Google Service Account without any JSON key on disk
  4. Have users sign in with their personal Google accounts
推荐答案C
GKE 上的生产应用应使用 Workload Identity——免静态密钥、按 Pod 授权、可审计。镜像里硬编 SA JSON(A)等于把生产密钥放进任何拿到镜像的人手里;Anthropic API key 在 Vertex 上无效(B 错);让最终用户登录(D)让应用拿不到一致的服务身份。
Question 4

Compliance auditor asks "show me every Claude call your team made last month, with caller identity." Where do you look?

  1. Anthropic Console usage page
  2. Cloud Audit Logs in the GCP project — Vertex records every API call with caller identity and metadata
  3. Application logs only — Vertex doesn't expose this
  4. It's not retained; you'd have to log it yourself manually going forward
推荐答案B
Vertex 的每次调用都进入 Cloud Audit Logs(Data Access logs),含主体身份与请求元数据——这正是企业选 Vertex 而非 Anthropic 直连的关键合规理由。Anthropic Console(A)不显示 Vertex 调用;应用日志(C)只看你写的;Audit Logs 默认开启(D 错)。
Question 5

If you write Vertex requests with the native REST body (not via the Anthropic SDK), what's the gotcha?

  1. Native REST is forbidden; you must use the SDK
  2. You must include `anthropic_version: "vertex-2023-10-16"` in the request body — Vertex requires it
  3. The body is identical to Anthropic API requests, no changes
  4. You must encrypt the body with a public key first
推荐答案B
原生 Vertex 请求体必须带 `anthropic_version: "vertex-2023-10-16"`,否则被拒。SDK 自动填这个字段;自己拼请求体时容易漏。Native REST 是允许的(A 错);与 Anthropic API 不完全相同(C 错);不需要客户端加密(D 错)。
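自己拼原生请求体时,大致形状如下(messages 与 max_tokens 为示意;注意 Vertex 原生调用中 model 在 URL 路径而非 body,这也是与直连 API 的差异之一):

```python
import json

# Vertex 原生 REST 请求体草图
body = {
    "anthropic_version": "vertex-2023-10-16",  # Vertex 必填;Anthropic SDK 会自动填上
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello"}],
}
payload = json.dumps(body)  # POST 到 Model Garden 端点;model 在 URL 路径中指定
```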
Question 6

Migrating an Anthropic-API-built app to Vertex. Which step is most often skipped and most likely to break production?

  1. Renaming the GCP project
  2. Rotating the (no-longer-used) Anthropic API key
  3. Re-running your eval set against the Vertex-versioned model and target region — quotas, latencies, and (rarely) behavior nuances differ
  4. Switching all calls to Sonnet
推荐答案C
迁移最容易漏的是「在新平台跑一遍 eval」——配额限制、区域可用性、流式行为、版本钉扎都可能变。项目改名(A)和密钥轮换(B)是周边动作;强制 Sonnet(D)是与迁移无关的选择。生产事故最常源于「不重测 eval 就上线」。

📦 Bedrock 课程

Skilljar「Claude with Amazon Bedrock」— 7 个 Quiz + 1 个 Final Assessment · 42 题 · 与 Vertex 课程同主题但题面完全不重复,可作互补练习

Quiz on Working with the API Bedrock

4 题 · Skilljar Bedrock 课程
Question 1

You're building a chat app and users complain that responses take too long to appear. What feature should you implement?

  1. Send requests faster
  2. Add a loading spinner
  3. Make the messages shorter
  4. Use streaming to show text as it's generated
推荐答案D
Streaming 是体感延迟的标准解——token 边生成边推送,用户看到"打字效果"而非长时间等待。Loading spinner(B)只是装饰,不解决根本问题。
Question 2

You're using Claude to extract data from documents and need the same consistent format every time. What temperature setting should you use?

  1. Temperature close to 0 for consistent, predictable outputs
  2. Temperature 0.5 for balanced responses
  3. Temperature 1.0 for maximum creativity
  4. Temperature doesn't matter for data extraction
推荐答案A
数据抽取需要可复现的精确输出——低温(接近 0)让模型每次都选最高概率 token,保证格式稳定。高温引入随机性,破坏数据 pipeline 一致性。
Question 3

You want to build a customer service bot that only talks about your company's products and stays professional. What's the best approach?

  1. Tell users to only ask product questions
  2. Add "be professional" to every user message
  3. Use a system prompt that makes Claude act like a customer service representative
  4. Set the temperature to maximum creativity
推荐答案C
System prompt 是定义角色与边界的标准位置——"你是 X 公司客服代表,只回答产品相关问题…"——比每条消息重复(B)高效,比依赖用户自律(A)可靠。
Question 4

You're building a chatbot where users ask follow-up questions. A user asks "What's 2+2?" and then asks "Add 5 to that." What do you need to do for the second question to make sense?

  1. Send both the first question and Claude's previous answer along with the second question
  2. Restart the conversation from the beginning
  3. Wait 30 seconds before sending the second question
  4. Send only the second question to Claude
推荐答案A
Claude API 是无状态的——每次请求必须把完整 messages 数组(含之前的 user + assistant 轮次)一起发送。"Add 5 to that"中的"that"只能从历史里推断。
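无状态意味着第二次请求的 messages 数组长这样(真实 SDK 调用见注释):

```python
# 第二轮请求必须携带完整历史,Claude 才能解析 "that" 指什么
messages = [
    {"role": "user", "content": "What's 2+2?"},
    {"role": "assistant", "content": "4"},         # 上一轮 Claude 的回答
    {"role": "user", "content": "Add 5 to that"},  # "that" 只能从历史推断
]
# 真实代码:client.messages.create(model=..., messages=messages, max_tokens=...)
```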

Quiz on Prompt Engineering Bedrock

4 题 · Skilljar Bedrock 课程
对应本地卷: V8·L02 Prompt 工程
Question 1

You want an AI to write movie reviews in a specific style. What's the best way to show the AI exactly what you want?

  1. Use very strict formatting rules
  2. Tell it to copy famous movie critics
  3. Describe the style in great detail
  4. Give it a sample movie review as an example
推荐答案D
Few-shot example——对于"风格""口吻"这种难以言说的属性,举一个真实样本远胜于抽象描述。一个示例胜过千言万语,正是 few-shot prompting 的核心理由。
Question 2

You're improving a prompt that generates workout plans. What should you do after writing your first version?

  1. Test it, see how well it works, then improve it
  2. Use it immediately for all workouts
  3. Write five different versions at once
  4. Ask other people to guess what it does
推荐答案A
Eval-driven prompt iteration——写完先测,看输出问题再针对性修改。盲目并行写五版(C)或直接上线(B)都是无效循环。
Question 3

You're asking an AI to analyze a long customer review mixed in with your instructions. What helps the AI understand which part is the review?

  1. Put the review at the very end
  2. Put the review between XML tags like <review></review>
  3. Write the review in a different font
  4. Make the review all uppercase
推荐答案B
XML 标签是 Anthropic 推荐的内容隔离方式——Claude 训练时大量见过此格式,能精准识别 <review> 是数据,外面是指令。位置(A)和大小写(D)都没有结构化能力。
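一个最小示例(review 文本为虚构):

```python
review = "The battery dies after two hours, very disappointed."

# 用 XML 标签把数据与指令隔离:标签内是待分析内容,标签外是指令
prompt = f"""Analyze the sentiment of the customer review below.

<review>
{review}
</review>

Respond with one word: positive, negative, or neutral."""
```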
Question 4

You want an AI to write a book summary. Which opening instruction works best?

  1. Write a three-paragraph summary of this book
  2. What do you think about summarizing things?
  3. I was wondering if you could maybe help with something about books?
  4. Books are interesting, aren't they?
推荐答案A
"Be clear and direct"原则:祈使句 + 明确数量约束(三段)。试探性提问(B/C)或闲聊(D)会让 Claude 不确定该做什么。

Quiz on Prompt Evaluations Bedrock

7 题 · Skilljar Bedrock 课程
Question 1

You need test data for evaluating your prompt. You want to create realistic examples quickly without writing them all by hand. What's the best approach?

  1. Use the same example over and over
  2. Use Claude to automatically generate test cases
  3. Ask your friends to write them
  4. Copy examples from the internet
推荐答案B
用 LLM 生成测试数据是行业标准做法——可指定边界、噪声、对抗场景批量产出,比人工写更快、覆盖更广。重复样本(A)和网络抄袭(D)都不可控。
Question 2

You've written a prompt for Claude and want to know if it works well. What's the difference between prompt engineering and prompt evaluation?

  1. Prompt engineering is for beginners, prompt evaluation is for experts
  2. Prompt engineering tests the prompt, prompt evaluation writes it
  3. Prompt engineering writes better prompts, prompt evaluation measures how well they work
  4. They're the same thing with different names
推荐答案C
两者是互补环节:engineering 是"创造"(写 prompt 的技巧),evaluation 是"度量"(测 prompt 效果)。形成"写→测→改"循环。
Question 3

In prompt evaluation, what is a "grader" used for?

  1. To save your prompts to a file
  2. To give objective scores measuring output quality
  3. To write better prompts automatically
  4. To make Claude respond faster
推荐答案B
Grader 是评估 pipeline 的打分组件——可以是代码(确定性检查格式/关键词)或 LLM(model-based grading),将主观感觉变成可对比的数值。
Question 4

You just wrote a prompt for your app. You test it once and it works great, so you decide to use it. What's the main risk with this approach?

  1. The prompt will stop working after a few days
  2. Users will provide unexpected inputs that break it
  3. It will be too expensive to run
  4. Other developers won't understand your code
推荐答案B
单测试样本 = 0 覆盖率。真实用户输入的多样性(边界、错别字、对抗、跨语言)远超开发者预期,"自测一次就上线"是最常见的生产事故根因。
Question 5

You're running a prompt evaluation. After creating your dataset and feeding questions through Claude, what's the next step?

  1. Feed the responses through a grader to get scores
  2. Write a completely new prompt
  3. Ask users what they think
  4. Publish your prompt immediately
推荐答案A
标准 eval 顺序:dataset → prompt → response → grader → score。grader 是从"输出"到"分数"的关键环节,没有它无法量化好坏。
Question 6

You're using another AI model to evaluate Claude's responses. To get better scores than just random numbers around 6, what should you ask the grader to provide?

  1. Just "good" or "bad"
  2. Strengths, weaknesses, reasoning, and a score
  3. A rewritten version of the response
  4. Only a number from 1-10
推荐答案B
让 LLM grader 先做 chain-of-thought(说优点缺点、给推理)再给分,分数明显更准。直接要数字(D)会得到中位偏 6 的随机分布——LLM 不"思考"无法精确评判。
Question 7

You want to check if Claude's output contains certain keywords and has the right length. Which type of grader should you use?

  1. Manual grader
  2. Code grader
  3. Model grader
  4. Human grader
推荐答案B
关键词包含、长度等可程序化的判断用 code grader(正则、字符串匹配、长度计数)——确定性、零成本、零延迟。Model grader 留给"语气是否专业""逻辑是否合理"等主观维度。
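一个最小的 code grader 草图(打分权重与长度上限为假设):

```python
def code_grade(output: str, keywords, max_len=500) -> float:
    # 确定性检查:关键词覆盖率与长度上限各占一半权重
    kw_score = sum(k.lower() in output.lower() for k in keywords) / len(keywords)
    len_ok = len(output) <= max_len
    return 0.5 * kw_score + 0.5 * len_ok

score = code_grade("Refunds are processed within 5 days.", ["refund", "days"])
```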

Quiz on Tool Use Bedrock

4 题 · Skilljar Bedrock 课程
Question 1

When using tools, what happens right after Claude asks for specific external data?

  1. Claude analyzes the original question again
  2. The user needs to approve the data request
  3. Your server runs code to fetch the requested information
  4. Claude provides the final answer immediately
推荐答案C
Tool use 协议:Claude 返回 tool_use block → 你的应用执行函数 → 把结果作为 tool_result 回传给 Claude → Claude 据此给最终回答。中间执行环节由你的服务器负责,不是 Claude。
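整个回路可以用桩数据走一遍(`fake_model` 代替真实 API 返回,结构仿照 tool_use / tool_result block;字段形状以官方文档为准):

```python
def fake_model(messages):
    # 占位:真实场景是 client.messages.create(...) 的返回
    last = messages[-1]["content"]
    if isinstance(last, list) and last[0].get("type") == "tool_result":
        # 已拿到工具结果,给出最终回答
        return {"stop_reason": "end_turn",
                "content": [{"type": "text", "text": "It's 18°C in Tokyo today."}]}
    # 第一轮:Claude 请求外部数据
    return {"stop_reason": "tool_use",
            "content": [{"type": "tool_use", "id": "t1",
                         "name": "get_weather", "input": {"city": "Tokyo"}}]}

def get_weather(city):
    return f"18°C in {city}"  # 你的服务器在这一步执行真实查询

messages = [{"role": "user", "content": "What's the weather today?"}]
resp = fake_model(messages)
while resp["stop_reason"] == "tool_use":
    block = resp["content"][0]
    result = get_weather(**block["input"])          # 你的代码执行函数
    messages.append({"role": "assistant", "content": resp["content"]})
    messages.append({"role": "user", "content": [   # 以 tool_result 回传
        {"type": "tool_result", "tool_use_id": block["id"], "content": result}]})
    resp = fake_model(messages)                     # Claude 据此给最终回答
```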
Question 2

You want to force Claude to use a specific tool for data extraction. Which toolChoice setting should you use?

  1. {"toolChoice": {"auto": {}}}
  2. {"toolChoice": {"tool": {"name": "tool-name"}}}
  3. {"toolChoice": {"any": {}}}
  4. {"toolChoice": {"required": true}}
推荐答案B
tool_choice 三种模式:auto(Claude 决定是否用工具)、any(必须用某个工具但 Claude 选)、tool(强制用指定工具)。"force specific tool"对应第三种。
Question 3

You ask Claude "What's the weather today?" but it says it doesn't have current weather information. What would tools help Claude do?

  1. Access live weather data from external sources
  2. Ask you to check the weather yourself
  3. Guess the weather based on the date
  4. Remember previous weather conversations
推荐答案A
Tool 突破训练数据时间截止——通过 weather API 工具,Claude 可调取实时数据。这就是 tool use 解决"知识截止"的本质。
Question 4

You're writing a tool function for Claude. What's the most important thing to include when creating the JSON schema?

  1. The programming language being used
  2. Detailed descriptions of what the tool does and its parameters
  3. Your contact information
  4. The function's source code
推荐答案B
Claude 只通过 description 字段决定何时调用以及怎么传参——清晰、明确的描述是工具被正确使用的关键。源码(D)Claude 看不到,语言(A)无关,联系方式(C)荒唐。

Quiz on Retrieval Augmented Generation Bedrock

5 题 · Skilljar Bedrock 课程
对应本地卷: V8·L04 RAG / Agent 模式
Question 1

What is contextual retrieval?

  1. A technique that adds context to document chunks before storing them to improve search accuracy
  2. A way to reduce the size of document chunks
  3. A system for automatically generating new content from existing documents
  4. A method for searching through documents faster
推荐答案A
Anthropic 提出的 Contextual Retrieval——存储前用 LLM 给每个 chunk 加一段"它在原文中的位置/角色"摘要,恢复被切断的上下文,索引和检索时都更准。
Question 2

What is a vector database in the context of RAG systems?

  1. A specialized database optimized for storing, comparing, and searching through numerical embeddings
  2. A system for backing up document files
  3. A regular database that stores text documents as files
  4. A database that only stores mathematical equations
推荐答案A
Vector DB(如 Pinecone、Weaviate、Milvus、pgvector)针对高维向量的余弦相似度查询做了专门优化——传统关系数据库扫表会慢得不可用。
Question 3

You're searching for a specific incident ID "INC-2023-Q4-011" in your documents. Semantic search isn't finding it well. What search method would work better?

  1. Converting everything to lowercase first
  2. Searching only document titles
  3. BM25 lexical search for exact keyword matching
  4. Using longer text embeddings
推荐答案C
Semantic search(embedding)擅长意思相近,但对 ID、SKU、错误码、专有名词等需要字面精确匹配的场景表现差——BM25 这类传统词频检索更可靠。两者结合即 hybrid search。
Question 4

You send the text "The cat sat on the mat" to an embedding model. What do you get back?

  1. A shorter version of the same text
  2. Keywords extracted from the text
  3. A list of about 1024 numbers representing the meaning
  4. A translation in another language
推荐答案C
Embedding model 输出固定维度(如 1024 维)浮点向量——这串数字就是文本的语义坐标,可用余弦相似度衡量两段文本的语义距离。Embedding 不是摘要也不是关键词。
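语义距离的计算可以用纯 Python 演示(4 维玩具向量代替真实的 1024 维 embedding,数值为虚构):

```python
import math

def cosine(a, b):
    # 两个 embedding 向量的余弦相似度:方向越接近,语义越近
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cat_mat = [0.9, 0.1, 0.3, 0.0]  # "The cat sat on the mat"
dog_rug = [0.8, 0.2, 0.4, 0.1]  # "A dog lay on the rug"(语义相近)
invoice = [0.0, 0.9, 0.1, 0.8]  # "Invoice #42 is overdue"(语义无关)
```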
Question 5

You have an 800-page financial report and want to ask an AI specific questions about it. What does RAG help you do?

  1. Send only the relevant sections to the AI for each question
  2. Make the document shorter by deleting pages
  3. Translate the document into simpler language
  4. Create a summary of the entire document
推荐答案A
RAG = Retrieval-Augmented Generation——按问题动态检索最相关的段落注入 prompt,让 Claude 看到"刚好够用"的上下文。这避开了 context window 上限、降低 token 成本、提升 lost-in-the-middle 抗性。

Quiz on Features of Claude Bedrock

4 题 · Skilljar Bedrock 课程
Question 1

You send Claude the same long document twice in a row. What does prompt caching help with?

  1. It stores your conversation history permanently
  2. It automatically summarizes repeated content
  3. It reduces the document's file size
  4. It saves the computational work from processing the text in the document
推荐答案D
Prompt caching 缓存的是 KV cache(注意力计算的中间结果),第二次请求复用 prefix 的计算,跳过 forward pass,命中部分 token 价格降至原价 10%。不是"对话记忆"(A)。
Question 2

You've optimized your prompt but Claude still isn't accurate enough on a complex task. What should you consider next?

  1. Rewrite the prompt completely from scratch
  2. Break the task into smaller pieces
  3. Use extended thinking to improve accuracy
  4. Switch to a different AI model
推荐答案C
Extended Thinking 让 Claude 在回答前显式做长推理(thinking block),针对数学、复杂规划、多步分析等任务能显著提升准确率。任务拆分(B)也可行但更费工——先试 thinking。
Question 3

You want to cache a short message that's 500 tokens long. What will happen?

  1. It will be automatically expanded to meet requirements
  2. It won't be cached because it's too short
  3. It will be cached normally
  4. It will be cached at half price
推荐答案B
Anthropic 文档:默认模型最低缓存 prefix 1024 tokens(部分小模型 2048)。500 tokens 低于阈值,cache_control 会被静默忽略——cache 不生效。
Question 4

What is an effective technique for increasing Claude's effectiveness with images?

  1. Uploading more images
  2. Using prompt engineering techniques
  3. Using JPEG instead of PNG images
  4. Providing zoomed-in images
推荐答案B
Vision 任务的提升点不在图片格式或数量,而在 prompt——明确说要看什么、给推理步骤、用 chain-of-thought。这是文字-视觉联合任务的通用规律。

Quiz on Model Context Protocol Bedrock

8 题 · Skilljar Bedrock 课程
Question 1

You're building a chat app where users ask Claude about their GitHub data. Without MCP, what's the main problem you'd face?

  1. Claude can't connect to the internet
  2. GitHub doesn't allow API access
  3. You'd have to write and maintain all the GitHub tool functions yourself
  4. Users can't type GitHub questions
推荐答案C
MCP 解决的是 N×M 集成问题——没有 MCP 时每个应用都得自己实现 GitHub 工具的 schema、调用、错误处理。有了 MCP,社区维护一个 GitHub Server,所有兼容客户端即可使用。
Question 2

You're creating an MCP server tool using the Python SDK. What's the easiest way to define a new tool?

  1. Create a separate configuration file
  2. Send HTTP requests to register tools
  3. Write complex JSON schemas manually
  4. Use the @mcp.tool decorator on a function
推荐答案D
Python SDK 的 @mcp.tool 装饰器自动从函数签名(参数、类型注解、docstring)生成 JSON Schema——无需手写 schema。这是 MCP SDK 设计的核心便利。
Question 3

Claude automatically decides to use a calculator tool when you ask "What's 15 × 23?" Who is controlling this tool usage?

  1. The MCP server providing the tool
  2. The application showing the chat
  3. Claude (the AI model) itself
  4. The user who asked the question
推荐答案C
Tools 是 model-controlled 原语——由 Claude(模型)自主决定何时调用。这与 Resources(app-controlled)和 Prompts(user-controlled)形成 MCP 三类原语的完整对照。
Question 4

You just wrote an MCP server and want to test if your tools work correctly. What's the best first step?

  1. Ask other developers to try it
  2. Connect it to Claude immediately
  3. Write unit tests for each function
  4. Use the MCP Inspector in your browser
推荐答案D
官方 MCP Inspector(npx @modelcontextprotocol/inspector)是浏览器调试 UI——可直接调用 server 的 tools / 读取 resources / 触发 prompts,看到原始 JSON-RPC 消息。开发期最快验证手段。
Question 5

You want to let users type "@document_name" to automatically include document content in their message. Should you use a tool or a resource?

  1. Tool - because Claude needs the information
  2. Resource - because documents are files
  3. Resource - because the app fetches data for the UI
  4. Tool - because it involves documents
推荐答案C
区分关键不在"是文件还是数据",而在"谁控制何时取用"。@提及 由 app 解析并主动注入上下文 → 这是 app-controlled = Resource。Tool 是 Claude 自主决定调用,与本场景不符。
Question 6

You're running an MCP client and server on the same computer. How do they most commonly communicate with each other?

  1. Using Bluetooth connection
  2. Through standard input/output
  3. Through email messages
  4. By writing files to disk
推荐答案B
本机进程间通信用 stdio transport——客户端把 server 作为子进程启动,通过标准输入输出收发 JSON-RPC 消息。简单、零配置、零网络开销。远程 server 用 SSE/HTTP transport。
Question 7

Your MCP client needs to find out what capabilities an MCP server offers. What message type should it send?

  1. GetServerInfo
  2. CheckCapabilities
  3. ListToolsRequest
  4. CallToolRequest
推荐答案C
MCP 协议定义了 ListToolsRequest(列出可用工具)、ListResourcesRequest 和 ListPromptsRequest——三类原语各有 list 方法。CallToolRequest 是调用特定工具,不是发现。A/B 不是协议方法名。
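在 wire 层,这些 SDK 请求类对应 JSON-RPC 方法名(ListToolsRequest 对应 tools/list)。下面是一个示意(工具名与 schema 为虚构):

```python
import json

# MCP 能力发现:client 发送 tools/list 请求(JSON-RPC 2.0)
list_tools = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# server 返回的典型响应形状
response = {
    "jsonrpc": "2.0", "id": 1,
    "result": {"tools": [
        {"name": "get_weather",
         "description": "Look up current weather for a city",
         "inputSchema": {"type": "object",
                         "properties": {"city": {"type": "string"}}}},
    ]},
}
wire = json.dumps(list_tools)  # 经 stdio 或 HTTP transport 发送
```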
Question 8

Your MCP server has tools, resources, and prompts. A user clicks a "Format Document" button in your app. Which primitive is being used?

  1. Resources - because it accesses documents
  2. Tools - because it formats something
  3. All three at the same time
  4. Prompts - because the user directly triggered it
推荐答案D
Prompts 是 user-controlled 原语——专门为"用户点按钮 / 输入 slash command 触发预定义工作流"而设计。点击"Format Document"按钮触发预设格式化流程,正是 Prompts 的典型用法。

Final Assessment Quiz Bedrock

6 题 · Skilljar Bedrock 课程 · 综合判断
Question 1

Porting from Anthropic API to Bedrock. The model identifier changes how?

  1. Stays the same — `claude-sonnet-4-5`
  2. Bedrock model ID prefixes the family with `anthropic.`, e.g. `anthropic.claude-sonnet-4-5-v2:0` — and may need an inference profile prefix for cross-region capacity
  3. Use the Vertex format `claude-sonnet-4-5@`
  4. Bedrock auto-resolves any string
推荐答案B
Bedrock model ID 形式:`anthropic.claude-X-vN:0`,跨区域调用时还可能需要 inference profile 前缀(如 `us.anthropic.claude-...`)。直接复用 Anthropic ID(A 错);Vertex 的 `@date` 是另一平台(C 错);Bedrock 不会替你猜(D 错)。
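用 boto3 Converse API 时,请求形状大致如下(model ID 为假设示例,须以 Bedrock 控制台实际列表为准;真实调用见注释):

```python
# Bedrock Converse 请求草图(纯字典示意)
request = {
    # "us." 为跨区域 inference profile 前缀,"anthropic." 为模型家族前缀
    "modelId": "us.anthropic.claude-sonnet-4-5-v2:0",  # 具体 ID 为假设
    "messages": [{"role": "user", "content": [{"text": "Hello"}]}],
    "inferenceConfig": {"maxTokens": 1024},
}
# 真实代码:boto3.client("bedrock-runtime").converse(**request)
```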
Question 2

Why does your first call to a new Claude model on Bedrock fail with `AccessDeniedException`?

  1. Bedrock requires an Anthropic API key as fallback
  2. You need to email AWS support for every model
  3. Bedrock requires you to request and grant model access in the Bedrock console before invocation; access is not auto-granted per account
  4. The model is unavailable in any AWS region
推荐答案C
Bedrock 的 model access 需要在控制台显式申请并被授予后才能调用——这是与 Anthropic 直连最大的运营差异。Anthropic key 不参与(A 错);不需要邮件 support(B 错);模型通常多区域可用,不是地理问题(D 错)。
Question 3

Production app on EC2 calling Bedrock. Best auth pattern?

  1. Hard-code AWS access key + secret in the app config
  2. Reuse the Anthropic API key as `AWS_ACCESS_KEY_ID`
  3. Attach an IAM role to the EC2 instance profile; SDK uses the role's temporary credentials
  4. Use the AWS root account credentials
推荐答案C
EC2 → IAM instance profile 是 AWS 生产标配——临时凭证、自动轮换、审计可追溯。硬编静态 key(A)是最常见的泄漏入口;Anthropic key 与 AWS IAM 完全不同(B 错);root account(D)严禁日常使用。
Question 4

Cross-region inference profile — what does it actually do?

  1. Reduces cost by 50% for the first 1k requests
  2. Routes invocations across multiple AWS regions for capacity & availability without code changes
  3. Replaces VPC endpoints
  4. Required to enable streaming
推荐答案B
Inference profile 把多个区域聚成一个逻辑端点——AWS 自动选可用容量,应用代码无需变化。它不打折(A 错);与 VPC endpoint(C)正交;streaming 不需要它(D 错)。代价:数据会跨 region 流转,需确认合规上允许。
Question 5

Compliance team needs evidence of every Claude invocation. Where on AWS?

  1. Bedrock console "History" tab — only shows the last 24h
  2. CloudTrail records all Bedrock control-plane and (with Data Events enabled) data-plane API calls — including `InvokeModel` and `Converse`
  3. Anthropic Console — Bedrock proxies traffic there
  4. It's not captured; deploy your own logging
推荐答案B
CloudTrail 是 AWS 全平台审计源——Bedrock 的 control plane 默认开启,data plane(如 InvokeModel)需要显式启用 Data Events。这是 Bedrock 优于直连的关键合规价值。Bedrock 控制台不是审计源(A 错);与 Anthropic Console 无关(C 错);自建日志(D)反而是「不知道有 CloudTrail」的应急手段。
Question 6

Migrating an Anthropic-API-built app to Bedrock. Most subtle compatibility risk to test for?

  1. System prompts are removed in Bedrock
  2. `max_tokens` is unsupported in Bedrock
  3. Tool use, content block, and request/response shapes have minor differences — eval set must re-pass on the Bedrock SDK end-to-end
  4. Streaming is unavailable on Bedrock
推荐答案C
大方向兼容(system prompt、max_tokens、streaming 都支持),坑在细节:tool use 协议字段、message content block、错误码、配额限制都可能微妙不同——必须在 Bedrock 上完整跑一遍 eval 才能信任迁移。System prompt 没被移除(A 错);max_tokens 支持(B 错);streaming 支持(D 错)。

📦 Vol.6 Introduction to subagents(补充练习)

Skilljar 无官方 quiz · 以下 8 题为本地补充练习,覆盖子智能体定义、创建、设计与使用 · 基于 Vol.6 四节教材

Vol.6 · Introduction to subagents — 练习

8 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

子智能体(subagent)与主会话最核心的区别是什么?

  1. 子智能体只能运行 Bash 命令,不能编辑文件
  2. 子智能体只能通过 slash command 手动调用
  3. 子智能体拥有独立的上下文、系统提示词和工具权限,与主会话隔离
  4. 子智能体的响应速度始终比主会话慢
推荐答案C
子智能体最关键的架构特征就是独立上下文——高输出搜索、日志分析、测试结果不会污染主会话。子智能体可配置独立的系统提示词和工具权限,这与主会话形成隔离。子智能体可以拥有编辑权限(A 错);除了手动调用,还可通过 description 匹配自动委派(B 错);速度并非定义特征(D 错)。
Question 2

以下哪项任务最适合委派给子智能体,而非在主会话中处理?

  1. 修改单个配置文件中的一个变量名
  2. 对全仓库进行安全扫描,只需要最终的问题清单
  3. 与用户持续讨论架构方案,需要多轮交互
  4. 需要密切参考主会话中刚刚讨论的上下文来做决策
推荐答案B
安全扫描会产生大量工具输出,如果放在主会话中会迅速占满上下文窗口。子智能体可以独立完成扫描,只将问题清单(结论)返回主会话——这正是「高输出、只需结论」的典型场景。小修改在主会话中更高效(A 错);需要持续交互或紧密上下文的任务不适合子智能体(C、D 错)。
Question 3

子智能体配置中,哪两个字段是必填的?

  1. name 和 model
  2. name 和 tools
  3. name 和 description
  4. description 和 model
推荐答案C
子智能体 YAML 配置中只有 name 和 description 是必填字段。name 是唯一标识符;description 直接影响自动委派——Claude Code 根据它来判断何时将任务委派给该子智能体。model、tools 等字段可选,省略时继承主会话设置。
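一个最小的子智能体配置示例(名称与正文为虚构,仅 name 和 description 为必填):

```markdown
---
name: security-scanner
description: Use after code changes to scan the repository for security issues. Read-only; returns a findings list only.
tools: Read, Glob, Grep
---
你是安全扫描子智能体。扫描仓库中的安全问题,只返回问题清单,不要修改任何文件。
```

tools 与 model 为可选字段,省略时继承主会话设置。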
Question 4

子智能体配置文件存放在 `.claude/agents/`(项目级)和 `~/.claude/agents/`(用户级)。以下说法正确的是?

  1. 用户级子智能体优先级始终高于项目级
  2. 项目级子智能体不能被团队成员使用
  3. 项目级子智能体可随仓库提交,团队成员 clone 后即可使用
  4. 两种级别的子智能体功能不同——用户级不支持 tool use
推荐答案C
项目级(.claude/agents/)子智能体位于仓库内,可通过 git 共享给团队。名称冲突时较高优先级会覆盖较低优先级(A 方向反了);项目级正是为了团队共享而设计(B 错);两种级别的功能相同(D 错)。
Question 5

子智能体的 `description` 字段写为 "Helps with coding tasks",这有什么问题?

  1. description 太长,会浪费上下文
  2. 太宽泛——几乎任何任务都可能触发,导致 Claude 误判委派时机
  3. description 不能包含英文,必须用中文
  4. description 中不能出现 "coding" 等通用词
推荐答案B
子智能体最常见的失败模式就是 description 过于宽泛。"Helps with coding tasks" 几乎匹配任何编程相关的任务,导致 Claude 在不合适的时机也尝试委派。好的 description 应该说明:何时使用、任务边界、输出格式。例如 "Use after code changes to review TypeScript files for security, correctness, and maintainability. Read-only."
Question 6

为代码审查(code review)子智能体配置工具权限时,最佳实践是什么?

  1. 赋予全部工具权限,以便子智能体可以自主修复发现的问题
  2. 仅赋予 Read、Glob、Grep 等只读工具——审查不应修改代码
  3. 赋予 Edit 和 Write 权限但禁止 Bash
  4. 不配置 tools 字段,让子智能体继承主会话的完整权限
推荐答案B
审查子智能体的职责是发现问题,不是修复问题。赋予写入权限可能导致未经批准的修改。最佳实践是仅赋予只读工具(Read、Glob、Grep),让审查结果返回主会话后由人工或专门的 fix 子智能体处理。权限最小化是子智能体设计的关键原则。
Question 7

设计子智能体时,单一职责原则(Single Responsibility Principle)的核心含义是?

  1. 每个子智能体只能使用一种工具
  2. 每个子智能体应专注于一类明确的任务,而非充当万能助手
  3. 每个项目中最多只能创建一个子智能体
  4. 子智能体一次只能处理一个文件
推荐答案B
单一职责原则在子智能体设计中意味着:每个子智能体有清晰的任务边界(如代码审查、安全扫描、测试诊断),而非一个通用的"帮我编程"助手。这直接关系到 description 的精确度和自动委派的准确率。子智能体可以使用多种工具(A 错),项目中可以创建多个(C 错),可以处理多个文件(D 错)。
Question 8

关于子智能体的前台(foreground)与后台(background)执行,以下哪种场景最适合后台执行?

  1. 代码修改任务,需要用户的权限确认
  2. 代码审查,结果需要立即讨论
  3. 对多个外部文档进行长时间研究搜索,结果供后续参考
  4. 执行数据库迁移,需要确认迁移结果成功才能继续
推荐答案C
后台执行适合长时间运行、不需要即时交互的独立任务——子智能体在后台搜索研究,主会话可以继续处理其他工作。需要权限确认(A)、即时讨论(B)、或阻塞等待结果(D)的任务应使用前台执行,以确保流程正确。

📦 Vol.7 Introduction to agent skills(补充练习)

Skilljar 无官方 quiz · 以下 8 题为本地补充练习,覆盖 Agent Skills 定义、创建、配置、边界与排错 · 基于 Vol.7 六节教材

Vol.7 · Introduction to agent skills — 练习

8 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

一个 Agent Skill 的入口文件是什么?

  1. README.md
  2. AGENT.md
  3. SKILL.md
  4. index.md
推荐答案C
Skill 是一个目录,其入口文件必须命名为 SKILL.md——包含 YAML frontmatter 和正文内容。Claude Code 通过这个文件名来发现和识别 Skill。其他文件名不会被识别为 Skill 入口。
Question 2

Skill 的 `description` 字段主要作用是什么?

  1. 在 UI 中展示 Skill 的版本信息
  2. 帮助 Claude 判断何时自动调用该 Skill
  3. 定义 Skill 可以使用的工具列表
  4. 指定 Skill 的输出文件格式
推荐答案B
description 是 Claude 判断「何时调用 Skill」的核心依据——它描述触发条件和使用场景,Claude 在每次对话中根据它来匹配当前任务。工具列表由 allowed-tools 控制(C 错);版本和输出格式不是 description 的职责。
Question 3

项目级 Skill 的正确存储路径是什么?

  1. ~/.claude/agents/<skill-name>/SKILL.md
  2. .claude/skills/<skill-name>/SKILL.md
  3. .claude/commands/<skill-name>/SKILL.md
  4. skills/<skill-name>/SKILL.md
推荐答案B
项目级 Skill 存放在仓库的 .claude/skills/ 目录下,可随 git 提交与团队共享。~/.claude/skills/ 是用户级路径(跨项目个人使用),~/.claude/agents/ 是子智能体路径(A 错)。.claude/commands/ 存放自定义命令,不是 Skill(C 错)。
Question 4

Skill 的 `allowed-tools` 字段的正确理解是什么?

  1. 它定义了 Skill 被禁止使用的工具列表
  2. 它预批准(pre-approve)所列工具——Skill 激活时这些工具无需用户额外确认
  3. 它限制 Skill 只能使用这些工具,其他工具被完全禁止
  4. 它继承自父智能体的工具权限,不可单独配置
推荐答案B
allowed-tools 是「预批准列表」,不是「限制列表」——Skill 激活时,列表中的工具被预授权无需逐次确认,但它并不阻止 Claude 使用其他工具。如果真正需要限制工具,应通过权限拒绝规则实现,而非依赖 allowed-tools。
Question 5

一个 Skill 执行时会产生副作用(如发送 Slack 消息、创建 PR),最佳实践是什么?

  1. 在 description 中写明 "This skill has side effects"
  2. 不配置 allowed-tools,让用户逐次确认
  3. 设置 disable-model-invocation: true,只允许用户手动调用
  4. 将 Skill 放在用户级路径避免被团队误用
推荐答案C
有副作用的 Skill(发送消息、创建 PR、修改外部系统)不应被 Claude 自动触发——disable-model-invocation: true 确保只有用户通过 slash command 手动调用时才会执行。仅在 description 中标明(A)或依赖权限确认(B)都不足以防止误触发。
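一个带副作用的 Skill 的 frontmatter 示例(名称与描述为虚构):

```markdown
---
name: release-announcer
description: Post a release summary to the team channel. Has side effects; manual invocation only.
disable-model-invocation: true
---
读取最新的 CHANGELOG 条目,按模板整理后发送到团队频道。
```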
Question 6

某团队需要给 Claude 接入公司内部数据库以实时查询客户信息,应选择哪种机制?

  1. 创建一个包含数据库查询步骤的 Skill
  2. 构建 MCP Server 暴露数据库查询工具
  3. 在 CLAUDE.md 中写入数据库 schema
  4. 创建一个包含数据库凭据的子智能体
推荐答案B
Skill 是文档/流程——Claude 可以阅读但无法执行实时查询。MCP(Model Context Protocol)正是为这种场景设计:提供实时工具接口,允许 Claude 调用外部系统获取数据。Skill 适合固定工作流和知识包,MCP 适合需要实时数据的外部系统集成。
Question 7

以下哪种方式是团队共享 Skill 的主要途径?

  1. 通过邮件发送 SKILL.md 文件给团队成员
  2. 将 Skill 放在项目 .claude/skills/ 目录,随 git 提交
  3. 上传到 Anthropic 官方 Skill 市场
  4. 使用 ~/.claude/skills/ 路径并导出环境变量
推荐答案:B
项目级 .claude/skills/ 可随仓库 git 提交,是团队共享 Skill 的主要方式——成员 clone 后自动获得。用户级(~/.claude/skills/)仅限个人使用(D 错)。更完整的打包分发可使用 Plugin(含 Skill + 子智能体 + 命令 + MCP 等),但目前没有官方 Skill 市场(C 错)。
Question 8

Skill 无法被 Claude 自动发现,排查的第一步应该检查什么?

  1. 文件路径是否正确,入口文件是否命名为 SKILL.md,frontmatter 是否有效
  2. description 是否包含足够多的关键词
  3. Skill 目录中是否包含 examples 文件夹
  4. CLAUDE.md 中是否声明了该 Skill
推荐答案:A
排查顺序:路径 → frontmatter → description → 调用方式。Skill 不出现的首要原因通常是路径不正确或入口文件未命名为 SKILL.md,或 YAML frontmatter 格式有误。description 问题会导致「不触发」而非「不出现」(B 属于后续排查步骤)。
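上述排查的前两步(路径与 frontmatter)可以写成一个极简检查脚本——不依赖 YAML 库,只做粗粒度检查,函数名与检查项为假设示例:

```python
from pathlib import Path

def check_skill(skill_dir: str) -> list[str]:
    """粗查一个 Skill 目录:入口文件名、frontmatter 分隔线、必填字段。"""
    problems = []
    entry = Path(skill_dir) / "SKILL.md"
    if not entry.is_file():
        return [f"缺少入口文件: {entry}"]
    lines = entry.read_text(encoding="utf-8").splitlines()
    stripped = [line.strip() for line in lines[1:]]
    if not lines or lines[0].strip() != "---" or "---" not in stripped:
        problems.append("frontmatter 缺少成对的 --- 分隔线")
    else:
        end = stripped.index("---") + 1
        frontmatter = lines[1:end]
        for key in ("name:", "description:"):
            if not any(line.strip().startswith(key) for line in frontmatter):
                problems.append(f"frontmatter 缺少字段 {key}")
    return problems
```

返回空列表表示路径与 frontmatter 检查通过;之后再排查 description 是否匹配当前任务(触发问题)。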

📦 Vol.11 AI Fluency: Framework & Foundations(补充练习)

Skilljar 仅含 conclusion wrap-up,无官方 quiz · 以下 10 题为本地补充练习,覆盖 4D 框架、生成式 AI 基础、能力与局限 · 基于 Vol.11 十三节教材

Vol.11 · AI Fluency: Framework & Foundations — 练习

10 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

AI Fluency 的四个目标是什么?

  1. 快速、准确、低成本、可扩展
  2. 有效(effective)、高效(efficient)、合乎伦理(ethical)、安全(safe)
  3. 自动化、增强、代理、协作
  4. 委派、描述、辨别、勤勉
推荐答案:B
AI Fluency 的四个目标是 effective(有效)、efficient(高效)、ethical(合乎伦理)、safe(安全)——涵盖能力、效率、价值观和安全四个维度。选项 C 是三种协作模式,选项 D 是 4D 框架的四个能力,不是目标层级。
Question 2

「人类设定目标、边界和行为规则,AI 更独立地工作」描述的是哪种协作模式?

  1. Automation(自动化)
  2. Augmentation(增强)
  3. Agency(代理)
  4. Assistance(辅助)
推荐答案:C
三种协作模式:Automation = AI 按指令执行特定任务;Augmentation = 人与 AI 共同思考、创造、分析;Agency = 人类设定目标和边界,AI 更独立地工作(如配置智能体定期整理研究资料)。Agency 模式下 AI 自主性最高,Diligence(勤勉/责任)也最重要。
Question 3

4D 框架中的四个能力分别是什么?

  1. Define, Design, Develop, Deploy
  2. Discover, Diagnose, Decide, Deliver
  3. Delegation, Description, Discernment, Diligence
  4. Data, Dialogue, Decision, Deployment
推荐答案:C
4D = Delegation(委派:谁来做、做到什么程度)、Description(描述:如何清晰沟通目标与过程)、Discernment(辨别:AI 输出是否可用)、Diligence(勤勉:如何负责、披露、安全使用)。四个 D 不是线性步骤,而是循环——评估不满意时回到描述或委派。
Question 4

Delegation(委派)的三要素不包括以下哪项?

  1. Problem Awareness(问题意识):我真正要达成什么?
  2. Platform Awareness(平台意识):AI 擅长和不擅长什么?
  3. Task Delegation(任务委派):哪些由人做、哪些由 AI 做?
  4. Performance Evaluation(绩效评估):AI 的输出效率如何?
推荐答案:D
Delegation 三要素:Problem Awareness(明确目标与约束)、Platform Awareness(了解工具能力与风险)、Task Delegation(分配人机分工)。常见错误是跳过前两步、直接进入分工。Performance Evaluation 属于 Discernment 而不是 Delegation。
Question 5

Description(描述)的三种类型是什么?

  1. 任务描述、格式描述、风格描述
  2. Product Description(产品描述)、Process Description(过程描述)、Performance Description(表现描述)
  3. 输入描述、处理描述、输出描述
  4. 目标描述、方法描述、结果描述
推荐答案:B
三种 Description:Product(要什么输出——格式、受众、长度、风格)、Process(怎么处理——步骤、方法、参考来源、约束)、Performance(怎么交互——语气、主动性、错误处理)。大多数人只写了 Product Description,加入 Process 和 Performance 可显著减少返工。
Question 6

Discernment(辨别)评估后,以下哪项不是正确的后续行动?

  1. 直接采用(低风险格式化输出)
  2. 要求修改(给出具体反馈)
  3. 返回重新描述或切换工具
  4. 忽略发现的问题,因为 AI 输出通常都会有小错
推荐答案:D
Discernment 的五个正确后续行动:直接采用、要求修改、补充信息重试、切换工具或外部验证、升级到人类专家。关键规则是每次 Discernment 必须有明确的下一步——仅仅说「不够好」而不采取行动等于没做。忽略已知问题的做法违反了 Diligence 原则。
Question 7

Description-Discernment 循环的核心逻辑是什么?

  1. 写好提示词 → AI 生成 → 如果不好就换一个模型
  2. 描述目标 → AI 生成/行动 → 评估产品/过程/表现 → 定位差距 → 调整描述或重新委派
  3. 描述任务 → AI 自动评估 → 人类最终确认
  4. 一次性写好详细提示词,避免反复修改
推荐答案:B
高质量 AI 协作通常需要 2-5 轮循环:描述→生成→评估产品/过程/表现→定位差距→调整描述或回到委派。如果连续两轮没有改进,问题可能在委派层面(任务本身不适合 AI)。这不是「写好一次提示词」就能完成的。
Question 8

Diligence(勤勉)的三种类型是?

  1. 数据勤勉、代码勤勉、内容勤勉
  2. 输入勤勉、处理勤勉、输出勤勉
  3. Creation Diligence(创建勤勉)、Transparency Diligence(透明勤勉)、Deployment Diligence(部署勤勉)
  4. 事前勤勉、事中勤勉、事后勤勉
推荐答案:C
三种 Diligence:Creation(选择什么 AI 系统?是否适合此任务?)、Transparency(谁需要知道 AI 参与了?如何披露?)、Deployment(我能为即将分享的输出负责吗?最终审查和标签)。三者覆盖了从选择工具到发布成果的完整责任链。
Question 9

以下关于大语言模型「幻觉」(hallucination)的说法,哪项最准确?

  1. 大模型不会产生幻觉,只要使用最新版本即可避免
  2. AI 输出可能流畅自信但包含虚构事实,因为模型基于模式生成而非从数据库检索
  3. 幻觉只会出现在中文等非英语语言中
  4. 只要在系统提示词中写「不要编造」,就可以杜绝幻觉
推荐答案:B
语言模型通过预测下一个 token 生成内容,并非从数据库中检索确定答案——因此可能生成流畅、自信但完全虚构的内容。幻觉是所有大语言模型的固有问题(A 错),不限于特定语言(C 错),也无法仅通过一句提示词彻底消除(D 错)。正确做法是要求引述来源、人工核实关键事实。
Question 10

以下哪项属于 Diligence 的常见失败模式?

  1. 使用 AI 生成初稿后人工修改
  2. 将未核实的 AI 输出作为事实直接发布
  3. 在报告中声明使用了 AI 辅助
  4. 根据任务风险等级决定 AI 的参与程度
推荐答案:B
Diligence 四大失败模式:将敏感数据放入不合适的工具、发布未核实的 AI 输出作为事实、在关键工作中隐瞒 AI 参与、让 AI 取代专业责任。「直接发布未核实输出」是最常见也最危险的失败——AI 输出即使流畅也需要独立验证。其他三项都是正确的 Diligence 实践。

📦 Vol.12 AI Fluency for educators(补充练习)

Skilljar 无官方 quiz · 以下 6 题为本地补充练习,覆盖教育者 4D 应用、课程设计、学术诚信与 AI 使用边界 · 基于 Vol.12 四节教材

Vol.12 · AI Fluency for educators — 练习

6 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

教育场景中 AI 应用的四个层次不包括以下哪项?

  1. 课程层(Course layer):教学大纲、活动设计
  2. 材料层(Material layer):阅读材料、示例、练习题
  3. 管理层(Administration layer):排课、考勤、预算
  4. 制度层(Institutional layer):政策草案、伦理规范
推荐答案:C
教育 AI 应用四层:课程层、材料层、学习层、制度层。这四个层次贯穿「AI 可以协助什么」和「什么必须由人判断」的边界。行政管理虽也涉及 AI,但不在这四个教育专业性层次之内——这四层关注的是教学核心而非行政效率。
Question 2

设计课堂 AI 活动的五个步骤中,「分配辨别任务」的目的是什么?

  1. 让学生比较不同 AI 工具的速度
  2. 让学生主动评估 AI 输出的准确性、偏见和适用性,培养批判性思维
  3. 让教师检查 AI 是否按预期工作
  4. 为 AI 输出打分,确定哪些学生可以用 AI
推荐答案:B
五步课堂设计:明确学习目标 → 设计 AI 参与点 → 分配辨别任务(学生主动评估 AI 输出)→ 反思讨论 → 迁移练习。辨别任务的核心目的是培养学生的批判性思维——不是让学生被动接受 AI 输出,而是主动判断其准确性、偏见和适用性。
Question 3

在 AI 时代,学术诚信的评估框架发生了什么转变?

  1. 从「是否注明引用来源」转变为「是否使用了 AI」
  2. 从「这是你自己的作品吗」转变为「使用方式是否促进了真正的学习」
  3. 从「是否抄袭」转变为「AI 使用比例是否低于 30%」
  4. 从「独立完成」转变为「鼓励使用 AI 完成全部作业」
推荐答案:B
旧框架关注「作品是否原创」(抄袭、买论文、作弊),新框架关注「使用方式是否促进真正学习」——AI 生成的文本归属问题、AI 辅助写作的边界、理解证明 vs. 仅会提示词。核心不再仅仅是「检测 AI」,而是「学生是否真正理解了、会判断了、能负责」。
Question 4

以下哪个学科中 AI 的「辨别需求」最高——即最需要人工核验 AI 输出?

  1. 创意写作——AI 可以提供风格建议
  2. 法律——AI 可能生成看似合理但引用错误的法律条文
  3. 编程入门——AI 可以解释语法错误
  4. 体育——AI 可以分析运动数据
推荐答案:B
法律、历史、医学属于「高辨别需求」学科——AI 在这些领域的事实、法规、案例方面的幻觉率较高,可能生成看似合理但引用错误的条文或判例。创意写作和编程属于「高价值」使用场景,但辨别需求低于这类事实密集型学科。体育等需要身体技能的领域 AI 价值有限。
Question 5

教育者视角的 Delegation-Diligence 循环中,第一步应该做什么?

  1. 让 AI 生成教学材料的第一版
  2. 教师先确定哪些任务不可委派给 AI(如学习目标设定、最终评分)
  3. 让学生试用 AI 工具并收集反馈
  4. 查阅其他学校的 AI 使用政策作为模板
推荐答案:B
教育者 Delegation-Diligence 循环的第一步是设定委派边界——学习目标设定、评估判断、学生反馈的最终决定是专业判断,不可委派给 AI。材料生成、题目变体、效率任务可以委派。关键在于「先保护学习目标,再用 AI 提升备课效率」,而非反过来。
Question 6

关于作业设计,材料中建议的核心理念是什么?

  1. 完全禁止 AI 在作业中的使用,以保持学术诚信
  2. 从「检测 AI 使用」转向「让学生展示理解过程、判断力和责任」
  3. 允许学生在任何作业中自由使用 AI,无需声明
  4. 为每份作业设置 AI 使用比例上限(如不超过 30%)
推荐答案:B
作业设计的关键转变:要求学生提交问题定义、源材料、草稿修改和反思——展示他们的理解和判断过程,而不仅是最终产品。让学生解释为什么接受或拒绝 AI 建议。融入课堂讨论、个人观察或本地数据,明确注明哪些 AI 使用允许、哪些需声明、哪些禁止。

📦 Vol.14 Teaching AI Fluency(补充练习)

Skilljar 无官方 quiz · 以下 6 题为本地补充练习,覆盖四种教学法、两大循环、评量策略与作业设计 · 基于 Vol.14 七节教材

Vol.14 · Teaching AI Fluency — 练习

6 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

教授 AI Fluency 时,教学的核心应该是什么?

  1. 教学生使用最新 AI 工具的具体按钮和功能
  2. 教可迁移的判断框架,而非特定工具的操作技能
  3. 教学生写出最优美的提示词模板
  4. 教学生比较不同 AI 模型的性能基准
推荐答案:B
AI 工具快速迭代,教具体按钮和提示词格式会很快过时。AI Fluency 教学的核心是传授可迁移的判断框架(4D)——让学生在任何工具、任何场景下都能做出可解释、可负责的 AI 协作决策。教的是判断力,不是工具操作。
Question 2

四种 AI Fluency 教学法中,哪种最接近真实 AI 协作场景但需要最多的课堂时间?

  1. Linear(线性):按顺序教四个 D
  2. Non-linear(非线性):根据任务需要跳跃式教学
  3. Focused(聚焦):在一个 D 上深度教学
  4. Loop-based(循环式):围绕两大循环设计活动
推荐答案:D
Loop-based(循环式)教学围绕 Delegation-Diligence 和 Description-Discernment 两大循环设计活动——最接近真实 AI 协作流程,但需要充分课时让学生经历完整循环。Linear 适合初学者但可能僵硬;Non-linear 适合有经验的学生;Focused 适合嵌入现有课程。
Question 3

Delegation-Diligence 循环教学活动中,最重要的产出是什么?

  1. 学生对分工判断和责任方式的清晰解释
  2. AI 生成的高质量最终作品
  3. 学生编写的精美提示词
  4. AI 使用的时间效率数据
推荐答案:A
Delegation-Diligence 循环教学的关键产出不是 AI 生成的内容,而是学生能否清晰解释他们的分工判断和能力责任。活动结构:给任务 → 写委派计划 → 使用 AI → 写勤勉笔记 → 反思改进。核心评估的是学生的判断力,而非 AI 输出质量。
Question 4

评量 AI Fluency 时,最稳健的策略是什么?

  1. 仅看最终作品质量
  2. 仅看学生的 AI 聊天记录
  3. 结合结果(最终作品)、过程(版本变化、决策记录)和反思(学生自述),三者并用
  4. 仅看学生对 AI 使用的自我反思
推荐答案:C
三种评量策略各有利弊:Outcome-based 只看最终作品看不到过程;Process-based 看聊天记录但依赖学生愿意记录;Reflection-based 看自述但可能流于空泛。最稳健的方案是三者结合——看作品、看过程、看学生能否解释自己的 AI 决策。
Question 5

AI Fluency 作业设计的三个原则是什么?

  1. 简单、快速、可自动评分
  2. 真实性(Authenticity)、迭代(Iteration)、文档化(Documentation)
  3. 标准化、统一化、集中化
  4. 开放性、创造性、协作性
推荐答案:B
三个设计原则:Authenticity(作业模拟真实场景,不为用 AI 而用 AI)、Iteration(含多轮尝试与修订,提交版本变化)、Documentation(学生记录 AI 使用过程和决策)。如果作业只要求提交最终答案,就难以评量 AI Fluency——过程与产品同等重要。
Question 6

分析 AI 对学科的影响时,以下哪种判断是正确的?

  1. 所有学科的课程内容都应保持不变,只需在考试中禁用 AI
  2. 学科中容易被 AI 替代的操作需要更新教学内容,难以替代的判断力则更加珍贵
  3. 所有学科的评估方式都应改为口试
  4. AI 对 STEM 学科没有影响,只影响人文学科
推荐答案:B
AI 对学科的影响分三类:自动化操作(格式化、草稿生成、语言润色)可能被替代;增强能力(批判性判断、问题定义、来源验证)更加重要;稳定核心(学科价值观、方法论、证据标准、伦理责任)不变。如果一门课只训练 AI 可替代的操作,需要更新;如果训练难以替代的判断力,价值更高。

📦 Vol.15 AI Fluency for students(补充练习)

Skilljar 无官方 quiz · 以下 6 题为本地补充练习,覆盖学生 4D 框架、学习伙伴、职业规划与人在回路中 · 基于 Vol.15 五节教材

Vol.15 · AI Fluency for students — 练习

6 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

学生在学习场景中使用 AI 时,最关键的边界是什么?

  1. 每天使用 AI 的时间不超过 2 小时
  2. 增强(Augmentation)vs 自动化(Automation)——AI 应让你更有能力,而非替你完成该学的内容
  3. 只能使用学校指定的 AI 工具
  4. 使用 AI 时必须全程录屏以便教师检查
推荐答案:B
学生学习场景中最重要的边界是 Augmentation vs Automation:增强让你更有能力(AI 解释概念后你用自己的话重述、AI 指出逻辑漏洞后你自己修改);自动化让 AI 替你完成本该学习的内容(直接复制 AI 答案、用 AI 写读后感替代阅读)。铁律是:不管 AI 参与多少,所有提交内容必须是你能够解释、应用和负责的。
Question 2

「铁律」(Iron Rule)的核心含义是什么?

  1. 永远不要使用 AI 完成任何作业
  2. 无论 AI 参与多少,所有提交内容必须是你能够解释、应用和负责的
  3. AI 只能用于课外学习,不能用于课堂相关任务
  4. 每次使用 AI 后必须向老师提交使用报告
推荐答案:B
铁律(Iron Rule)不禁止使用 AI,而是设立问责标准——你可以用 AI 解释概念、检查逻辑、提供练习,但最终提交的内容必须是你能独立解释、应用和为之负责的。如果做不到这三条,说明 AI 可能正在绕过你的学习过程,而非增强它。
Question 3

学生版 Learning Context Document(学习背景文档)不包含以下哪项?

  1. 你是谁(年级、专业、当前课程)
  2. 学习目标(这次学习要真正掌握什么)
  3. 当前水平(已知什么、卡在哪里)
  4. 过往所有考试成绩和 GPA
推荐答案:D
Learning Context Document 包含五要素:身份(年级/专业/课程)、学习目标、当前水平与困难点、偏好风格(类比/例子/推导/提问)、交互要求(不给答案,引导思考)。目的不是做全面档案,而是为每次学习建立清晰的对话背景——让 AI 像家教而非答题机一样回应。
Question 4

AI 作为学习伙伴时,以下哪种使用方式最有利于学习?

  1. 让 AI 生成完整答案后抄写到作业本上
  2. 让 AI 生成练习题(不含答案),自己先做,之后让 AI 给予反馈
  3. 考试时用 AI 实时查询答案
  4. 用 AI 代写读书笔记,节省时间多看几本书
推荐答案:B
好的 AI 学习使用模式:AI 生成问题但不给答案 → 学生先自己做 → AI 给反馈。这保持了学习的核心——学生自己在思考和实践。直接抄答案(A)、考试中用 AI(C)、替代阅读(D)都属于 Automation,绕过了学习过程。核验方式很简单:关闭 AI 后你还能独立完成吗?
Question 5

关于使用 AI 准备求职材料,以下哪项是正确的?

  1. AI 可以为没有实习经历的学生编造合理的项目经验
  2. AI 可以帮助改善简历的表达清晰度,但真实经历和个人声音必须由你保持
  3. 直接提交 AI 生成的通用求职信是最有效率的方式
  4. AI 可以替学生决定最适合的职业方向
推荐答案:B
AI 在职业规划中的三个角色:职业探索(解释行业、路径、技能要求)→ 你判断适配性;求职材料(改善结构、语调、清晰度)→ 你保持真实经历和个人声音;面试准备(模拟面试官、追问、给反馈)→ 你练习真实回答,不背 AI 脚本。AI 不能编造经历(A 错),不能替你决策(D 错),直接提交通用内容会失去个人辨识度(C 错)。
Question 6

提交 AI 辅助作业前的最终自检(Self-Check)不包含以下哪项?

  1. 我能在没有 AI 的情况下解释这个内容吗?
  2. 我知道哪些部分来自 AI,哪些是我自己的判断吗?
  3. 我核实了事实、来源或计算吗?
  4. 我使用了最新的 AI 模型版本吗?
推荐答案:D
提交前五个自检问题:能否独立解释?是否知道 AI 贡献与自己的判断的分界?是否核实了事实/来源/计算?课程或机构是否允许此类 AI 使用?如果老师问起过程,能否诚实解释?使用哪个 AI 模型版本不是重点——重点是你是否理解、判断、核实并能为所提交内容负责。