Skilljar 官方 Quiz 题库

抓取自 Anthropic Skilljar 十门官方课程(32 个 Quiz · 187 题)+ 6 卷本地补充练习(44 题)· 共 231 题 · 含推荐答案、推理理由与本地卷出处

📘 关于推荐答案
Skilljar 平台仅暴露「我的选择 + 正/误标记」,不公开正确选项。本册的推荐答案由 Claude 基于 Anthropic 官方文档 + 本书系 17 卷知识点逐题推理而来——绿色高亮即为推荐选项,下方附「理由 + 出处」。Vol.8、Vertex、Bedrock 三门课的同主题 quiz 题面互不重复——同样的考点用不同场景重新出题,可作三面练习。

Vol.6/7/11/12/14/15 为本地补充练习(共 44 题),覆盖子智能体、Agent Skills、AI Fluency 框架、教育者/教学/学生场景——这些 Skilljar 课程无官方 quiz,题目基于对应 lesson 教材编写,用于补全测验覆盖。

📑 Vol.8 Building with the Claude API(7 quiz + 1 assessment · 49 题)

  1. Quiz on Accessing Claude with the API · 6 题
  2. Quiz on Prompt Evaluation · 5 题
  3. Quiz on Prompt Engineering Techniques · 6 题
  4. Quiz on Tool Use with Claude · 6 题
  5. Quiz on Features of Claude · 6 题
  6. Quiz on Model Context Protocol · 6 题
  7. Quiz on Agents and Workflows · 6 题
  8. Final Assessment · 8 题

📑 Vol.4 Claude Code in Action(1 quiz · 6 题)

  1. Quiz on Claude Code · 6 题

📑 Vol.9 + Vol.10 MCP(入门 + 进阶)(2 assessment · 12 题)

  1. Final assessment on MCP(Vol.9 入门)· 6 题
  2. Assessment on MCP concepts(Vol.10 进阶)· 6 题

📑 Vol.5 Introduction to Claude Cowork(1 quiz · 5 题)

  1. Quiz on Claude Cowork · 5 题

📑 Vol.2 / Vol.3 / Vol.13 入门 + 应用 Course Quiz(3 quiz · 15 题)

  1. Course Quiz(Vol.2 AI Capabilities)· 5 题
  2. Course Quiz(Vol.3 Claude Code 101)· 5 题
  3. Course Quiz(Vol.13 AI Fluency for nonprofits)· 5 题

📑 Vertex 课程(9 quiz · 58 题)

  1. Quiz on Accessing Claude with the API · 7 题
  2. Quiz on Prompt Engineering Techniques · 6 题
  3. Quiz on Prompt Evaluation · 4 题
  4. Quiz on Tool Use with Claude · 7 题
  5. Quiz on Retrieval Augmented Generation · 8 题
  6. Quiz on Features of Claude · 6 题
  7. Quiz on Model Context Protocol · 7 题
  8. Quiz on Agents and Workflows · 7 题
  9. Final Assessment Quiz · 6 题

📑 Bedrock 课程(8 quiz · 42 题)

  1. Quiz on Working with the API · 4 题
  2. Quiz on Prompt Engineering · 4 题
  3. Quiz on Prompt Evaluations · 7 题
  4. Quiz on Tool Use · 4 题
  5. Quiz on Retrieval Augmented Generation · 5 题
  6. Quiz on Features of Claude · 4 题
  7. Quiz on Model Context Protocol · 8 题
  8. Final Assessment Quiz · 6 题

📑 Vol.6 / Vol.7 智能体系列 补充练习(2 卷 · 16 题)

  1. Vol.6 · Introduction to subagents 练习 · 8 题
  2. Vol.7 · Introduction to agent skills 练习 · 8 题

📑 Vol.11–Vol.15 AI Fluency 系列 补充练习(4 卷 · 28 题)

  1. Vol.11 · AI Fluency: Framework & Foundations 练习 · 10 题
  2. Vol.12 · AI Fluency for educators 练习 · 6 题
  3. Vol.14 · Teaching AI Fluency 练习 · 6 题
  4. Vol.15 · AI Fluency for students 练习 · 6 题

📦 Vol.8 Building with the Claude API

Skilljar「Building with the Claude API」— 7 个 Quiz + 1 Final Assessment · 主线覆盖 API access / Prompt eval / Prompt engineering / Tool use / Features / MCP / Agents

Quiz on Accessing Claude with the API

6 题 · Skilljar Vol.8 课程
Question 1

You're building a multi-turn chatbot. The user sends a second message. How do you preserve context with the Anthropic API?

  1. Re-send only the new user message; the API remembers the previous turn
  2. Use a session_id query parameter to look up history server-side
  3. Send the full prior message list (user + assistant turns) plus the new user message
  4. Put the prior conversation inside the system prompt
推荐答案C
Anthropic API 是无状态的——每次请求必须把完整的 messages 数组(含历史 user/assistant 轮次)发回,模型才能看到上下文。没有 session_id 之类的服务器端记忆(B 错),把历史塞进 system 会把角色都混成「指令」(D 错)。
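下面是一个极简的本地示意,演示「每次请求携带完整 messages 数组」的组装方式(payload 字段名与 Messages API 一致;模型名与对话内容仅为占位,代码不实际发送请求):

```python
# 无状态 API 下的多轮对话:每次请求都必须携带全部历史轮次
def build_request(history, new_user_message, model="claude-sonnet-4-5", max_tokens=1024):
    """把历史 user/assistant 轮次 + 新用户消息拼成一次完整请求体。"""
    messages = history + [{"role": "user", "content": new_user_message}]
    return {"model": model, "max_tokens": max_tokens, "messages": messages}

# 第一轮
req1 = build_request([], "What is prompt caching?")

# 收到 assistant 回复后,把两条都追加进本地维护的历史
history = req1["messages"] + [
    {"role": "assistant", "content": "Prompt caching reuses a stable prefix..."}
]

# 第二轮:重发全部 3 条消息(user → assistant → user),模型才能看到上下文
req2 = build_request(history, "Can you give an example?")
```

注意历史由调用方自己维护并重发——服务器端没有任何会话记忆。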
Question 2

Your `max_tokens=200` request returns a response with `stop_reason: "max_tokens"`. What happened?

  1. Your input prompt exceeded 200 tokens and was truncated
  2. The model reached the output budget before finishing its answer
  3. Anthropic billing capped your account at 200 tokens for the day
  4. The conversation context window is now full
推荐答案B
max_tokens 是输出长度上限,不是输入限制(A 错)也不是计费限额(C 错)。命中后 stop_reason 返回 max_tokens,说明回答被截断,需要调高上限或拆分任务。
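检查 `stop_reason` 的处理逻辑可以写成一个小函数(响应字段名与 API 返回结构一致;示例数据为虚构):

```python
# 命中 max_tokens 说明回答被截断:典型处理是调高上限重试,或拆分任务
def handle_response(resp: dict) -> str:
    if resp["stop_reason"] == "max_tokens":
        raise ValueError("output truncated: raise max_tokens or split the task")
    return resp["content"][0]["text"]

truncated = {"stop_reason": "max_tokens",
             "content": [{"type": "text", "text": "The summary is"}]}
complete = {"stop_reason": "end_turn",
            "content": [{"type": "text", "text": "done"}]}
```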
Question 3

You need to summarize 50,000 customer reviews per day, each ~500 words. Cost matters; quality bar is moderate. Which model is the right default?

  1. Always Opus, because it has the strongest reasoning
  2. Haiku, because it offers the best cost-latency profile for high-volume, well-bounded tasks
  3. Whichever model has the largest context window
  4. Sonnet, because it's the only model that supports streaming
推荐答案B
高吞吐、任务清晰、质量门槛中等的场景应选 Haiku——成本与延迟都最低。所有 Claude 模型都支持 streaming(D 错),上下文窗口不是这里的瓶颈(C 错),无脑选 Opus 会浪费成本(A 错)。
Question 4

Your team is reviewing a pull request that adds Claude API integration. Where should the API key live?

  1. Hard-coded as a constant in the source file with a comment "rotate yearly"
  2. Committed in `.env` so deployment is reproducible across machines
  3. In an environment variable / secret manager, never committed to the repo
  4. Embedded in the frontend bundle so the backend stays stateless
推荐答案C
API key 是生产凭证,必须通过环境变量或 secret manager 注入。硬编码(A)、提交 `.env`(B,会污染 git 历史)、放进前端 bundle(D,等同公开)都是常见账单泄漏来源。
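一个最小化的读取模式示意——key 只从环境(由 secret manager 注入)获取,缺失时立刻失败,而不是退回到任何硬编码默认值:

```python
import os

def load_api_key(env=os.environ) -> str:
    """从环境变量读取 key;env 参数仅为便于测试注入。"""
    key = env.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY not set — inject it via your secret manager")
    return key
```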
Question 5

You set `stop_sequences=[""]`. The model generates text containing that string. What happens?

  1. The API returns an error because stop sequences must not collide with content
  2. Generation halts at the first match; `stop_reason` becomes `stop_sequence` and the matched string is NOT included in output
  3. The matched string is included verbatim and generation continues
  4. Stop sequences are silently ignored unless temperature is below 0.5
推荐答案B
命中 stop sequence 时:生成立即停止、`stop_reason` 设为 `stop_sequence`、匹配的字符串本身**不出现**在 content 里。这是配合 prefill 锁定 JSON / 结构化输出的关键机制——温度高低不影响是否生效(D 错)。
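stop sequence 的截断语义可以用一个本地函数模拟来帮助理解(真实截断由服务端完成,此处仅还原「命中即停、匹配串不出现在输出里」的行为):

```python
# 模拟服务端的 stop sequence 处理:返回 (截断后文本, stop_reason)
def apply_stop_sequence(generated: str, stop: str):
    idx = generated.find(stop)
    if idx == -1:
        return generated, "end_turn"
    return generated[:idx], "stop_sequence"   # 匹配串本身被丢弃

text, reason = apply_stop_sequence('{"name": "Ada"}</json>extra', "</json>")
# text 只保留 JSON 部分,reason 为 "stop_sequence"
```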
Question 6

Why count tokens before sending a request to Claude?

  1. It's optional — the API will reject anything that doesn't fit and retry shrinks automatically
  2. Only required for streaming; non-streaming requests are unbounded
  3. To stay within the model's input window AND to estimate cost before committing to the call
  4. To choose the right `temperature` value
推荐答案C
Token 计数有两个目的:保证输入 + 输出之和落在模型窗口内;预先估算成本(每千 token 计费)。API 不会自动「retry shrinks」(A 错);token 上限对 streaming / 非 streaming 都生效(B 错);与 temperature 无关(D 错)。
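发送前的预算检查可以先用粗略估算做 sanity check(下面按「约 4 字符 ≈ 1 token」的经验值估算,仅为示意;精确计数应使用 API 提供的 token counting 能力):

```python
# 粗略 token 预估:输入 + 预留输出必须落在模型窗口内
def estimate_tokens(messages: list) -> int:
    chars = sum(len(m["content"]) for m in messages if isinstance(m["content"], str))
    return chars // 4   # 经验近似,非精确计数

def fits_budget(messages, max_tokens, window=200_000):
    return estimate_tokens(messages) + max_tokens <= window
```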

Quiz on Prompt Evaluation

5 题 · Skilljar Vol.8 课程
Question 1

A prompt tweak improves Q1–Q3 of your eval set but silently regresses Q5. What is this telling you?

  1. Q5 is noise; ignore single-question regressions
  2. The model has a bug; switch to a different Claude version
  3. Prompt changes have tradeoffs across distributions; only a full eval set surfaces regressions
  4. You should revert Q1–Q3 improvements until Q5 stabilizes
推荐答案C
Eval 的主要价值就是发现「局部赢、整体输」的回归。盲目忽略 Q5(A)放任问题;归咎模型版本(B)跳过自己 prompt 的责任;为了一题回滚整次改动(D)失去优化收益。正确做法是改进 prompt 让 Q1–Q3 与 Q5 都过。
Question 2

Which grading style fits "the output JSON must validate against this schema"?

  1. Model-based grading: ask Claude to judge schema validity
  2. Code-based grading: parse JSON and run a schema validator
  3. Manual review by the prompt author
  4. Either — they produce equivalent results
推荐答案B
确定性、可机械判定的标准(schema 校验、字段存在、数值范围)必须用代码评分——结果可复现且零成本。让 Claude 自评(A)会引入随机性;人工评审(C)不可扩展;二者并不等价(D 错)。
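代码评分的形态大致如下——解析 JSON 并机械校验字段与取值范围,结果可复现且零成本(校验规则为虚构示例):

```python
import json

# 代码评分:确定性标准(字段存在、枚举取值、数值范围)直接用代码判定
def grade_output(raw: str) -> bool:
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return (
        data.get("sentiment") in {"positive", "negative", "neutral"}
        and isinstance(data.get("confidence"), (int, float))
        and 0.0 <= data["confidence"] <= 1.0
    )
```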
Question 3

When should you refresh your eval dataset?

  1. Never — a stable eval set is the whole point
  2. After every prompt change, to keep it aligned
  3. When production data drifts or when new failure modes emerge from real users
  4. Once a year, regardless of usage
推荐答案C
Eval 集要跟随真实分布——产品变更、用户群扩展、被新发现的 edge case 都是触发器。永不更新(A)会让 eval 与现实脱节;每次改 prompt 就改 eval(B)等于自欺欺人;按时间表更新(D)忽略了驱动信号。
Question 4

You're building your first eval set. Which inclusion matters most?

  1. The 10 prompts your team uses most often
  2. A mix of common cases, edge cases, adversarial inputs, and real production samples
  3. Synthetic inputs generated by Claude itself, for scale
  4. Only inputs that the current prompt already handles correctly
推荐答案B
代表性 + 困难性 + 真实性是 eval 集的三个支柱。团队最爱的 10 条(A)有偏;Claude 自己生成(C)会复制模型偏好;只放当前能过的(D)等同于「考试只考会的题」——回归永远不会被发现。
Question 5

Why log prompts and responses in production?

  1. It's required by Anthropic for billing audits
  2. To rebuild eval datasets with the real distribution of user inputs and to debug regressions
  3. To train a custom Claude variant on your domain
  4. To detect malicious users automatically
推荐答案B
生产日志是 eval 集的最佳来源——它反映真实用户行为分布,也是事后复现回归 bug 的唯一途径。计费审计(A)由 Anthropic 自己处理;个人无法 fine-tune Claude(C 错);安全检测虽是副产品但不是主用途(D)。

Quiz on Prompt Engineering Techniques

6 题 · Skilljar Vol.8 课程
Question 1

"Be brief" vs "Limit to 200 words, exactly 3 bullets, no markdown headers" — which works better and why?

  1. Both are equivalent; Claude infers intent
  2. "Be brief" is better; over-specifying confuses the model
  3. The specific version, because measurable constraints replace ambiguous adjectives
  4. Depends on temperature; higher temperature handles vague instructions
推荐答案C
「Specific」原则:把模糊形容词换成可度量的约束(字数、结构、格式)。LLM 不擅长猜「brief 究竟多简」;明确量化才有稳定输出。Temperature 与歧义无关(D 错);过度具体不会迷惑 Claude(B 错)。
Question 2

Few-shot works for your task but the examples eat too much budget. What's the next move?

  1. Drop few-shot entirely
  2. Switch to a higher temperature to make examples optional
  3. Move stable examples to a cached prefix (prompt caching) so they amortize across requests
  4. Concatenate all examples into a single short paragraph
推荐答案C
Prompt caching 正是为这种场景设计的——稳定的 few-shot 前缀只算一次成本,后续请求按缓存价计费。砍掉 few-shot(A)会损失质量;调温度(B)与示例数无关;压成一段(D)会破坏示例边界。
Question 3

Your prompt feeds Claude a 50-page document followed by 3 lines of instructions. Outputs sometimes drift off-task. What's the fix?

  1. Always place instructions BEFORE the document, never after
  2. Repeat the key instructions AFTER the document, near the end of the prompt
  3. Shrink the document to fit instructions at the top
  4. Increase temperature to escape document anchoring
推荐答案B
长上下文场景里,靠近输出位置的指令权重更高。Anthropic 推荐「文档在前、指令在后再重复」的模式。强行砍文档(C)会丢信息;高温(D)让输出更不稳;instruction 单独放最前(A)容易被长文档稀释。
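「文档在前、指令在后再重复」的组装方式可以示意如下(标签与措辞为示例写法,非固定格式):

```python
# 长文档 prompt 组装:文档置前,关键指令放在文档之后、贴近输出位置
def build_long_doc_prompt(document: str, instructions: str) -> str:
    return (
        f"<document>\n{document}\n</document>\n\n"
        f"{instructions}\n\n"
        "Reminder: follow the instructions above and rely only on <document>."
    )

prompt = build_long_doc_prompt("...50 pages of text...", "Summarize in 3 bullets.")
```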
Question 4

When is asking Claude to "think step by step" the WRONG choice?

  1. Multi-step math problems
  2. Code review tasks
  3. High-volume, latency-sensitive classification with simple inputs
  4. Complex policy reasoning
推荐答案C
Chain-of-thought 增加输出 token,从而增加成本与延迟。简单分类(情感、垃圾邮件)用 CoT 是浪费;它的价值在路径不确定的复杂推理(A/B/D)。「永远用」或「永远不用」都是错觉,要按任务复杂度选择。
Question 5

"Act as a senior security auditor reviewing this code" — what's the actual mechanism behind role prompts?

  1. Claude loads a different model fine-tuned for that role
  2. It enables hidden tools tied to the role
  3. It biases Claude's vocabulary, depth, and priorities toward that domain's writing patterns
  4. It's purely cosmetic and has no measurable effect
推荐答案C
Role prompt 通过设定语境锚定输出风格、术语密度、关注点优先级。它不切换模型(A 错)也不解锁工具(B 错),但绝非无效(D 错)——只是效果体现在分布偏移而非「身份切换」。
Question 6

"Clear and direct" 与 "specific" 在 prompt 工程中的区别是什么?

  1. 它们是同一个原则的不同说法
  2. Clear/direct 强调任务表述无歧义;specific 强调补充目标、受众、约束、格式与成功标准
  3. Clear/direct 仅用于 system prompt;specific 仅用于 user prompt
  4. Clear 是英文专属;specific 是中文专属
推荐答案B
两者互补:先把任务说清楚(不绕弯、用主动句),再把细节填具体(量化约束、格式要求、验收标准)。它们不是同义词(A 错),也不限于某种 message 角色(C 错),与语言无关(D 错)。

Quiz on Tool Use with Claude

6 题 · Skilljar Vol.8 课程
Question 1

Claude returns a `tool_use` block with `id="toolu_xyz"`. Your code runs the tool. What do you send back?

  1. A new user message saying "the tool returned X"
  2. A system message updating Claude's instructions
  3. A `tool_result` content block with `tool_use_id="toolu_xyz"` and the result payload, inside a user message
  4. Nothing — Claude assumes success unless an error is raised
推荐答案C
Tool 协议要求:用户角色消息内携带 `tool_result` 块,并通过 `tool_use_id` 与上一轮的 `tool_use` 块配对。自由文本(A)会让 Claude 把它当成普通用户输入;system message(B)混淆角色;不发回(D)会让 Claude 卡在等待状态。
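配对回传的消息结构示意如下(块结构与 Messages API 的 tool use 协议一致;工具名与数值为虚构):

```python
# Claude 返回的 tool_use 块(假想的 get_weather 工具)
tool_use_block = {"type": "tool_use", "id": "toolu_xyz",
                  "name": "get_weather", "input": {"city": "Tokyo"}}

# 执行工具后,回传的 tool_result 必须放在 user 角色消息里,
# 并通过 tool_use_id 与上一轮的 tool_use 块配对
result_message = {
    "role": "user",
    "content": [{
        "type": "tool_result",
        "tool_use_id": tool_use_block["id"],
        "content": '{"temp_c": 21}',
    }],
}
```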
Question 2

Your tool description reads: "gets data". Claude often picks the wrong tool. Most likely cause?

  1. The model is too small — switch to Opus
  2. The tool description is too vague to disambiguate from siblings
  3. Tool descriptions don't influence selection — it's all from parameter names
  4. You forgot to set `tool_choice="auto"`
推荐答案B
Tool description 是 Claude 决定调用哪个工具的主要信号——必须说明用途、何时调用、何时**不**调用。「gets data」对客户工具、订单工具、库存工具都成立。换大模型(A)解决不了模糊语义;description 不是无效(C 错);`tool_choice` 是开关而非语义来源(D)。
Question 3

Your refund agent has a `process_refund` tool that issues real money. Right gating pattern?

  1. Auto-execute everything Claude calls — that's the point of agents
  2. Hide the tool entirely; never expose write actions to Claude
  3. Have Claude propose the refund; require human confirmation before the side effect runs
  4. Run the tool but log it for retroactive audit
推荐答案C
高风险写操作(资金、删除、对外发送)应该让 Claude 提议、人类确认、系统执行——这是「proposal-confirm-execute」模式。自动执行(A)放任错误;完全隐藏(B)失去 agent 能力;事后审计(D)已经晚了。
Question 4

Your tool returns `{"error": "rate_limited"}`. What should Claude do next, in a well-designed loop?

  1. Treat error as success and continue
  2. Read the error, decide between retry, alternative path, or escalate to the user
  3. Always retry immediately with the same args
  4. Crash the entire conversation
推荐答案B
错误信息是给 Claude 看的——回传后它应判断:可重试就重试、能换路径就换、否则告知用户。忽略错误(A)破坏可靠性;盲目重试(C)会撞同一堵墙;崩对话(D)最差,损失上下文。
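除了把错误回传给 Claude 判断,循环侧也常配一层重试策略。一个示意性的决策函数(错误码与阈值为虚构示例):

```python
# 工具错误的三分支处理:继续 / 有限重试 / 升级给 Claude 或用户
RETRYABLE = {"rate_limited", "timeout"}

def next_action(tool_output: dict, attempts: int, max_attempts: int = 3) -> str:
    error = tool_output.get("error")
    if error is None:
        return "continue"                     # 成功,正常推进
    if error in RETRYABLE and attempts < max_attempts:
        return "retry"                        # 可重试且未超预算
    return "escalate"                         # 换路径或告知用户
```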
Question 5

When are parallel tool calls a good fit?

  1. Always — it speeds up everything
  2. Only with the code execution tool
  3. When operations are independent, e.g. fetching weather AND fetching news for the same response
  4. Never — Claude can't handle parallel results
推荐答案C
独立、无依赖的工具调用并行最有价值——一次往返就拿齐资料。有依赖时(如「先查订单再查商品」)必须串行。Claude 完全支持并行结果(D 错);不限工具种类(B 错);总是并行(A)会破坏依赖顺序。
Question 6

A tool's `input_schema` requires fields `customer_id: string` and `since: ISO date`. Claude calls with `customer_id: 12345` (integer). What's the right fix?

  1. Loosen the schema to accept any type
  2. Switch to a smarter model
  3. Tighten the description: clarify "customer_id is a string like 'cust_001', NOT a numeric ID"
  4. Validate types only after Claude has executed the tool
推荐答案C
类型错误本质是描述不清——Claude 看不出 customer_id 是字符串就当整数。改 schema 是放弃质量(A);换模型(B)治标;事后校验(D)已经发生错误。在 description 里写明类型与示例是源头治理。

Quiz on Features of Claude

6 题 · Skilljar Vol.8 课程
对应本地卷: V8·L10 Claude API 扩展功能
Question 1

Extended thinking — what does it actually do?

  1. Replaces RAG by giving Claude internal knowledge access
  2. Always produces faster responses than non-thinking mode
  3. Allocates a visible reasoning budget before the final answer, helping on complex multi-step problems
  4. Only available for code generation tasks
推荐答案C
Extended thinking 在最终答案前生成 thinking 块作为推理空间,对多约束规划、数学、复杂代码改造尤其有效。它不是 RAG 替代(A 错);输出更多 token 通常更慢(B 错);适用范围远不止代码(D 错)。
Question 2

Prompt caching — what gets cached and how?

  1. The final answer, keyed by the input hash
  2. A stable prefix marked by a `cache_control` breakpoint; subsequent requests reuse it at reduced cost
  3. Caching is automatic for any prompt over 1k tokens
  4. Tool definitions, but not message content
推荐答案B
Prompt caching 缓存的是**输入侧的稳定前缀**(system prompt、tools、长文档、few-shot),通过 `cache_control` 显式标记 breakpoint。它不缓存输出(A 错);不是自动触发(C 错);可缓存的内容包括消息块和工具定义(D 错)。
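带 `cache_control` 标记的请求体大致形如下例(字段名与 prompt caching 的 API 写法一致;模型名与内容为占位):

```python
# 稳定前缀(system prompt / 长文档 / few-shot)用 cache_control 标记 breakpoint
request = {
    "model": "claude-sonnet-4-5",   # 模型名仅为示意
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": "<stable system prompt + long policy document + few-shot examples>",
            "cache_control": {"type": "ephemeral"},   # 此前缀会被缓存复用
        },
    ],
    "messages": [{"role": "user", "content": "Summarize today's incidents."}],
}
```

后续请求只要前缀逐字节一致,即可命中缓存,按缓存价计费。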
Question 3

You want Claude's answer to cite the source paragraph from your document. What's required?

  1. Just write "cite your sources" in the prompt
  2. Use higher temperature to force citation behavior
  3. Use the Citations feature: pass the document with `citations: { enabled: true }` so Claude can attach citation blocks to spans of its output
  4. Manually post-process Claude's text and match keywords to the document
推荐答案C
Citations 是结构化特性——开启后,Claude 输出的每段会带 citation 块指向源文档的具体位置。仅在 prompt 写「请引用」(A)只是请求文本格式而无可靠绑定;温度(B)与引用无关;事后正则匹配(D)脆弱且失真。
Question 4

When is the Files API a better fit than just inlining the file content into a message?

  1. Whenever the file is over 100 KB
  2. When the same large file will be referenced across many requests, so uploading once is cheaper than re-sending
  3. When you need to encrypt the file end-to-end
  4. When you want streaming output
推荐答案B
Files API 的核心价值是「上传一次、多次引用」,避免重复传输大文件。它不是按大小自动触发(A 错);不是端到端加密机制(C 错);与 streaming 正交(D 错)。
Question 5

Message Batches — main reason to use it?

  1. Faster responses than streaming
  2. Up to 50% cost reduction for async, non-time-sensitive workloads
  3. Required for any workload above a certain RPS
  4. Replaces streaming for chat applications
推荐答案B
Message Batches 提供约 50% 折扣,代价是结果在 24 小时内异步返回——适合夜间批量摘要、离线分类、回填等场景。它比 streaming 慢(A 错);不是高 RPS 强制项(C 错);与聊天界面(需要实时回流)正相反(D 错)。
Question 6

PDF + image input — which statement is true?

  1. PDFs must be converted to plain text before sending
  2. Images are free and don't count toward input tokens
  3. Both are accepted natively as content blocks and consume input tokens proportional to size and pages
  4. Vision capability requires a separate Anthropic Vision API endpoint
推荐答案C
Claude 模型原生支持 PDF 与图像 content block,按大小/页数折算成输入 token 计费。无需先 OCR(A 错);图像绝非免费(B 错);用同一 messages 接口,没有独立 Vision endpoint(D 错)。

Quiz on Model Context Protocol

6 题 · Skilljar Vol.8 课程
Question 1

API tool use vs MCP — what's the primary difference?

  1. MCP runs faster than API tool use
  2. API tool use is deprecated and being replaced by MCP
  3. MCP standardizes a server protocol so the same tools/resources/prompts can be reused across many clients without rewriting
  4. They are identical; MCP is just marketing
推荐答案C
API tool use 是「应用内」声明工具;MCP 是「服务端」标准化暴露能力,让一个 server 同时被 Claude Code、Claude Desktop、第三方 host 等客户端复用。两者并存(B 错);性能不是主要差异(A 错);远非同义(D 错)。
Question 2

Among MCP primitives — tools, resources, prompts — which is intended for read-only context fetch?

  1. Tools
  2. Resources
  3. Prompts
  4. Sampling
推荐答案B
MCP 三原语职责:tools 做有副作用的动作;resources 提供只读上下文(文件、记录、文档片段);prompts 是可复用工作流模板。Sampling 不是原语(D 错)。
Question 3

Your MCP server exposes internal company HR data. What MUST your design account for?

  1. Trust the protocol to enforce access automatically
  2. Default-allow then audit afterwards
  3. Per-user authentication, scoped resource visibility, audit logs of every call
  4. Disable HTTPS to simplify debugging
推荐答案C
MCP 协议本身不替你做安全——server 必须实现身份认证、按用户/角色限定可见资源、记录可审计日志。「默认允许」(B)是数据泄漏温床;信任协议层(A)是误解;关 HTTPS(D)危险。
Question 4

What role do MCP `prompts` play?

  1. System prompts that the client must use verbatim
  2. A way to override the client's tool definitions
  3. Server-defined, reusable prompt templates the client can surface to users (e.g., "summarize this PR")
  4. Client-side only — they don't cross the protocol
推荐答案C
Prompts 是 server 提供的命名模板,让用户/Claude 一键触发「场景化的预设交互」。它不是强制 system prompt(A 错);不修改 tools(B 错);本质就是穿越协议传递给 client 的(D 错)。
Question 5

When is adding MCP overkill?

  1. When the model is Sonnet
  2. Whenever cost matters
  3. When you have a single, tightly-coupled tool that no other client will ever consume — direct API tool use is simpler
  4. Never — MCP is always the right choice
推荐答案C
MCP 的价值在标准化与复用——单应用、单工具、不会被其他 client 用到时,直接 API tool use 心智负担更低。模型选择(A)和成本(B)不是判定标准;「永远 MCP」(D)是过度工程。
Question 6

An attacker plants a malicious instruction inside an MCP `resource` returned to Claude. Best defense?

  1. HTTPS alone — transport encryption stops injection
  2. Trust Claude to ignore obvious instructions
  3. Server-side validation of resource content + client-side instruction isolation (treat resource bodies as data, not commands)
  4. Disable resources entirely; only allow tools
推荐答案C
Prompt injection via resource 是真实攻击面:必须服务端净化输出,客户端把 resource 当数据而非指令(例如包在明确分隔块里)。HTTPS 只防中间人(A 错);模型对指令式语言敏感,无法靠「相信它」(B 错);禁用资源(D)等于自废武功。

Quiz on Agents and Workflows

6 题 · Skilljar Vol.8 课程
Question 1

When does the workflow pattern beat the agent pattern?

  1. When the task involves any tool calls
  2. When the steps are known, predictable, and you need stable cost / latency / testability
  3. When you're using a smaller model like Haiku
  4. When the user prefers conversational responses
推荐答案B
Workflow 适合路径已知场景——可写测试、成本可预测、易于回滚。Agent 适合路径不确定、需要多轮反馈与验证的场景。任意工具(A)、模型大小(C)、对话风格(D)都不是判定标准。
Question 2

In an agent loop, why is "verify" a distinct step?

  1. It's a billing milestone
  2. To compress conversation history
  3. To confirm the action achieved the goal and surface tool failures or unintended effects before continuing
  4. To switch models mid-loop
推荐答案C
Verify 是「闭环安全网」——工具调用可能失败、环境可能变、输出可能不符合目标。少了 verify 的 agent 会在错误状态上继续累积。它不是计费(A 错)、压缩(B 错)或换模型(D 错)的步骤。
Question 3

What are sensible stop conditions for an agent loop?

  1. Only when the goal is fully achieved — don't bound iterations
  2. Only when an error occurs
  3. Goal met OR max iterations reached OR explicit user halt OR cost / time budget exceeded
  4. Stop after exactly 5 turns regardless
推荐答案C
Agent loop 必须有多个出口:成功、迭代上限、用户中止、预算耗尽。没有上限(A)会无限烧钱;只看错误(B)放任空转;硬编 5 轮(D)武断。
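多出口循环的骨架可以示意如下(`step` 与 `goal_met` 代表虚构的「一轮模型 + 工具交互」与目标判定):

```python
import time

# agent 循环骨架:成功 / 迭代上限 / 时间预算,任一条件都能停
def run_agent(step, goal_met, max_iters=10, budget_seconds=60.0):
    start = time.monotonic()
    for i in range(max_iters):
        if time.monotonic() - start > budget_seconds:
            return "budget_exceeded"
        state = step(i)                # 一轮「模型推理 + 工具执行」
        if goal_met(state):
            return "success"
    return "max_iterations"
```

实际实现中通常还会加上用户显式中止与 token 成本预算两个出口。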
Question 4

A pipeline stage is a fixed sequence of three deterministic API calls. How should you implement it?

  1. Wrap it in an agent loop for "future flexibility"
  2. Use extended thinking
  3. Plain code orchestrating three Claude calls — workflow, no agent loop needed
  4. Hardcode the calls inside the system prompt
推荐答案C
三步固定流程就是 workflow——用代码串联三次 Claude 调用,可单元测试、易于监控、成本可预测。套 agent loop(A)是过度设计,无谓引入不确定性;extended thinking(B)解决推理深度不解决流程;写进 system prompt(D)会让模型自己跑流程,反而失控。
Question 5

For high-stakes verification, why use an INDEPENDENT Claude call rather than asking the same agent to self-check?

  1. Independent calls are always faster
  2. It's required by the API for tool use
  3. Self-check shares context bias; an independent call sees only the artifact and gives a less anchored judgment
  4. Self-checks are forbidden by Anthropic policy
推荐答案C
同一上下文里的「再检查一遍」容易被原推理锚定,难以发现自己的错误。独立 verifier(新会话、只看输入与输出)能给出近似第三方判断。这与速度(A)、API 要求(B)、政策(D)都无关。
Question 6

How do you keep agent costs predictable in production?

  1. Costs are inherently unpredictable; budget generously
  2. Cap latency only — cost will follow
  3. Cap iterations + log per-turn token usage + alert / escalate on cost spikes
  4. Run only with Haiku regardless of task
推荐答案C
成本可治理:迭代上限避免空转、每轮 token 记录便于归因、阈值告警让异常被人发现。「认命」(A)是放弃;只看延迟(B)漏成本维度;强制 Haiku(D)牺牲能力。

Final Assessment

8 题 · Skilljar Vol.8 课程 · 综合判断
Question 1

A customer-support bot must answer questions from internal policy docs and never fabricate. Best architecture?

  1. Pure agent loop with web search
  2. Inline the entire policy into the system prompt every request
  3. RAG over policy docs + Citations + an eval set that tests "I don't know" / escalation paths
  4. Fine-tune a custom Claude on the policy docs
推荐答案C
RAG 解决资料来源、Citations 让答案可追溯、eval 覆盖「未检索到时拒答」的关键路径——三者组合是政策问答的标准答案。Agent loop(A)路径过宽;inline 系统提示(B)token 爆炸且不易更新;fine-tune(D)是反应过度,且 Claude 不开放此路径。
Question 2

Compliance team needs to audit every Claude response against source documents. Best feature combo?

  1. High temperature + manual review
  2. Citations + structured output (JSON for audit fields) + full request/response logging
  3. Just bigger model and trust the answers
  4. Streaming, so reviewers see drafts in real time
推荐答案B
合规审计需要:可追溯(Citations)、机器可读(JSON 结构化)、可重现(完整日志)。高温(A)增加随机性反而难审计;换大模型(C)不解决追溯;streaming(D)只解决体感而非记录。
Question 3

You're generating SQL from natural-language questions. Which temperature setting is sane?

  1. 1.0+, to encourage creative joins
  2. 0.0–0.2, for deterministic, syntactically correct SQL
  3. It doesn't matter for code generation
  4. 0.7, the default that fits everything
推荐答案B
SQL 是确定性语言,要稳定语法和重复可控的查询计划——低温是必然选择。高温(A)增加语法错误和不一致;temperature 对代码任务影响显著(C 错);默认值不是「万能值」(D 错)。
Question 4

Daily report job analyses the same 100k-token policy book against today's incidents. Cost optimization?

  1. Switch all calls to Haiku regardless of quality needs
  2. Truncate the policy book to 10k tokens
  3. Use prompt caching with the policy book as a stable prefix — every call after the first hits the cached copy
  4. Run streaming so partial results lower wall-clock time
推荐答案C
长且稳定的前缀 + 多次调用 = prompt caching 教科书场景,可大幅降低重复输入成本。降级模型(A)牺牲质量;砍内容(B)牺牲覆盖;streaming(D)解决感知延迟而不省成本。
Question 5

An agent has a `process_refund` tool. The biggest production risk?

  1. The tool description being too long
  2. Using JSON instead of XML for the schema
  3. Auto-executing refunds without a human-in-the-loop confirmation gate
  4. Calling it from Sonnet instead of Opus
推荐答案C
高风险写动作(资金、删除、对外发送)的最大风险一律是「未经确认就自动执行」。description 长度(A)影响选择正确率但不致命;schema 格式(B)只要被 Claude 理解皆可;模型选择(D)影响成功率但不是首要风险。
Question 6

How do you detect prompt regressions across versions?

  1. Wait for user complaints
  2. Manual spot-check the new prompt on a few prompts you remember
  3. Maintain a versioned eval set with consistent grading; run it on every prompt version and compare scores
  4. Trust streaming output to surface anomalies
推荐答案C
回归检测的工程标准:稳定 eval 集 + 一致评分 + 版本化对比。等用户投诉(A)已经造成损失;记忆抽查(B)有偏;streaming(D)和回归无关。
Question 7

You need Claude to actually run Python code, not just generate it. Best mechanism?

  1. Add "execute the code" to the system prompt
  2. Spin up a separate agent that copies code from Claude's text output
  3. Use the code execution tool — sandboxed runtime that returns real results to Claude
  4. Switch to a bigger model
推荐答案C
Code execution 是一个工具,给 Claude 一个隔离的运行环境,把执行结果回传供继续推理。System prompt 的「请执行」(A)只是文字祈愿;自建 agent 解析(B)等于重发明;换模型(D)不解决执行问题。
Question 8

Your team exposes the company CRM via an MCP server. What governs which records each user can see?

  1. The MCP protocol enforces row-level security automatically
  2. The system prompt on the client side
  3. Server-side authentication + per-user scope on resources/tools, enforced before responding to any MCP call
  4. Trust by default; let Claude decide what to expose
推荐答案C
权限治理永远在服务端——MCP server 必须验明用户身份、按权限决定可见的 resources/tools 子集。协议本身不做(A 错);客户端 system prompt 是最弱的防线(B 错);trust by default(D)是数据泄漏标准模板。

📦 Vol.4 Claude Code in Action

Skilljar「Claude Code in Action」— 1 个 Quiz · 6 题 · 主线覆盖 setup / context / making changes / custom commands / skills / MCP / GitHub / hooks / Agent SDK

Quiz on Claude Code

6 题 · Skilljar Vol.4 课程
Question 1

What does a `CLAUDE.md` file in a project root actually do?

  1. It's a README replacement that GitHub renders specially
  2. It must be present for `claude` to launch in that directory
  3. It's auto-loaded as persistent project memory — instructions, conventions, and file pointers stay visible to Claude across sessions
  4. It overrides Claude's safety system prompt
推荐答案C
CLAUDE.md 是 Claude Code 的项目级记忆文件——每次启动自动加载,记录团队约定、关键路径、避免的坑。它不是 GitHub 渲染(A 错),也不是启动必需(B 错),更不能覆盖安全 system prompt(D 错)。可以在用户级 `~/.claude/CLAUDE.md` 写跨项目偏好。
Question 2

You want Claude to look at `src/auth/login.ts`. Best way to add it as context?

  1. Paste the entire file content into the prompt
  2. Reference it with `@src/auth/login.ts` — Claude reads on demand and avoids bloating the context window
  3. Copy the entire repo and let Claude scan everything
  4. Disable context limits and dump the whole project
推荐答案B
`@` 让 Claude 按需读取文件,避免一开场就把不相关内容塞进上下文。整文件粘贴(A)、扫整个 repo(C)、关闭限制(D,根本不存在的开关)都会浪费 token、稀释关注点,反而降低答题质量。
Question 3

Your team keeps re-asking Claude to "review the diff for accessibility issues using our checklist." How do you make this a one-shot command?

  1. Paste the checklist into every prompt
  2. Edit the global system prompt
  3. Create `.claude/commands/a11y-review.md` with the workflow + checklist; invoke via `/a11y-review`, supports `$ARGUMENTS` for parameters
  4. Hard-code it in a shell alias
推荐答案C
Custom slash commands 把复用工作流变成 Markdown 文件——可 git 管理、可团队共享、可参数化(`$ARGUMENTS`)。每次粘贴清单(A)失去自动化;改 system prompt(B)影响所有任务;shell 别名(D)只调起 CLI 不携带工作流。
Question 4

Adding a Linear MCP server so Claude Code can read tickets — where does the config go?

  1. It's hard-coded inside the `claude` binary; you can't add servers
  2. `.mcp.json` at the project root declares per-project servers (command, args, env); commit it for the team
  3. Only `~/.claude/settings.json` works — MCP is global-only
  4. Pass `--mcp` flag with the server URL on every launch
推荐答案B
`.mcp.json`(项目根)把 MCP server 声明绑定到具体仓库,团队成员 clone 后即可启用。它支持 stdio / http / sse 多种传输;可通过 `claude mcp add` 命令编辑。Claude Code 不只支持全局(C 错),也不要每次传 flag(D 错)。
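一个示意性的 `.mcp.json`(server 名、命令与 URL 仅为占位,实际配置以所用 MCP server 的官方文档为准):

```json
{
  "mcpServers": {
    "linear": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://mcp.example.com/sse"]
    }
  }
}
```

提交进 repo 后,团队成员 clone 即可获得同一组 server 声明。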
Question 5

A `PreToolUse` hook in `settings.json` — what can it do?

  1. Only log the tool call to stdout — purely cosmetic
  2. Fire after the tool finishes, to record results
  3. Run before the tool call; exit code or JSON output can deny, allow, or modify the call
  4. Replace the entire Claude system prompt for one session
推荐答案C
PreToolUse 是 hook 链中真正能「拦截」的环节——脚本 exit code 非零或返回 deny 会阻止工具执行;返回 modify 可改写参数。它不只是日志(A 错);那是 PostToolUse(B)的事;hooks 不能改 system prompt(D 错)。
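一个示意性的 `settings.json` 片段(matcher 与脚本路径仅为占位)——被 matcher 命中的工具调用会先执行脚本,脚本以非零 exit code 退出即可阻止该次调用:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "./scripts/check-bash-safety.sh" }
        ]
      }
    ]
  }
}
```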
Question 6

Running Claude Code in a GitHub Actions workflow to triage PRs — what's the right shape?

  1. Run the interactive CLI; pipe stdin from the PR body
  2. Skip Claude in CI; only run it locally
  3. Use the Claude Agent SDK or GitHub Action: stateless invocation with the prompt + repo checkout, `ANTHROPIC_API_KEY` from secrets, output posted as PR comment / patch
  4. Hard-code the API key into the workflow YAML
推荐答案C
CI 集成走 SDK / GitHub Action,把 Claude Code 跑成无状态的 agent——拿到 PR 上下文、产生评论或补丁。交互式 CLI(A)无人按 Enter;放弃 CI(B)失去自动化价值;硬编 key(D)会进 git 历史,必须用 secret。

📦 Vol.9 + Vol.10 MCP(入门 + 进阶)

Skilljar「Introduction to MCP」+「MCP: Advanced Topics」— 2 个 Assessment · 12 题 · 主线覆盖 host/client/server / tools / resources / prompts / sampling / roots / transport / production

Final assessment on MCP(Vol.9 入门)

6 题 · Skilljar Vol.9 课程
Question 1

In MCP, who initiates the connection — host, client, or server?

  1. The server connects out to the host on startup
  2. The host (e.g., Claude Code) spawns or connects a client, which initiates the handshake with the server
  3. Server and client connect peer-to-peer with no host involved
  4. The order doesn't matter — MCP is symmetric
推荐答案B
MCP 三角:host(如 Claude Code / Claude Desktop)拥有 LLM 上下文,host 实例化 client,client 与 server 建立连接并执行 capability negotiation。Server 不主动连 host(A 错),不是对等关系(C 错),方向是固定的(D 错)。
Question 2

You're building an MCP server for a knowledge base. Users should be able to (a) search articles, (b) read full article text, (c) trigger a "draft a reply" workflow. Which primitives map to which?

  1. All three should be tools — keep it simple
  2. Search = tool (action with side-effect of result), full text = resource (read-only context), draft-reply workflow = prompt (named template)
  3. All three should be resources — they all read data
  4. All three should be prompts — they're triggered by user intent
推荐答案B
MCP 三原语职责清晰:tools 做有逻辑/参数的动作(搜索)、resources 提供可寻址的只读内容(文章正文)、prompts 是可由用户触发的命名模板(「起草回复」工作流)。把三者全归一类(A/C/D)会失去原语区分带来的客户端 UX 价值。
Question 3

MCP Inspector — primary purpose?

  1. It's a deployment service that hosts your MCP server in production
  2. A local debugging UI that connects to your server, lists its tools/resources/prompts, and lets you call them manually to verify behavior
  3. An IDE plugin that auto-generates MCP server code
  4. A logging aggregator for production MCP traffic
推荐答案B
MCP Inspector 是开发阶段的「客户端模拟器 + 浏览器」——它发送真实的 MCP 协议请求,把 server 当成黑盒来探查。它不部署(A 错)、不生成代码(C 错)、也不是生产观测工具(D 错)。先在 Inspector 里跑通,再接 Claude Code。
Question 4

A tool's `input_schema` lists `{"query": "string", "limit": "integer"}` with no descriptions. What's wrong?

  1. Nothing — types are sufficient for the LLM to call correctly
  2. The schema must use `kind` instead of `type`
  3. Without per-field descriptions, the LLM has no signal about valid ranges, defaults, or semantic meaning — quality degrades
  4. Schemas don't matter; only the tool description does
推荐答案C
每个字段的 `description` 是 LLM 判断「填什么值合理」的关键信号——没有它,model 只能瞎猜默认 `limit`、可接受的 query 形态等。类型本身不够(A 错);JSON Schema 用 `type` 而非 `kind`(B 错);schema 与 tool description 都重要(D 错)。
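补上字段级 description 后的 schema 形如下例(JSON Schema 写法;工具语义为虚构示例):

```python
# 带字段级 description / 范围 / 默认值的 input_schema:
# 给 LLM 提供「填什么值才合理」的语义信号
input_schema = {
    "type": "object",
    "properties": {
        "query": {
            "type": "string",
            "description": "Full-text search keywords, e.g. 'refund policy'",
        },
        "limit": {
            "type": "integer",
            "description": "Max results to return (1-50); defaults to 10 if omitted",
            "minimum": 1,
            "maximum": 50,
            "default": 10,
        },
    },
    "required": ["query"],
}
```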
Question 5

In Claude Code, you want a Linear MCP server visible only to your team in this repo. Which scope and config?

  1. User scope (`~/.claude.json`) — easiest to maintain
  2. Project scope: commit `.mcp.json` in the repo root so every collaborator picks it up
  3. Local scope: `claude mcp add --scope local`, then commit your machine's settings file
  4. Hard-code it in CLAUDE.md so team members read about it
推荐答案B
Project scope = `.mcp.json` 提交进 repo,全队 clone 即可启用。User scope(A)只在你的机器生效;local scope(C)也是个人级;写进 CLAUDE.md(D)只是告知,不是配置。Project scope 是「团队共享 MCP server」的标准答案。
Question 6

You're writing your own MCP client (not using Claude Code). What does the client need to do besides JSON-RPC plumbing?

  1. Nothing — JSON-RPC handles everything
  2. Negotiate capabilities at startup, surface tools/resources/prompts to the LLM, route LLM tool calls to the right server, and pass tool_result back into the conversation
  3. Implement its own LLM internally; servers refuse plain HTTP clients
  4. Encrypt every payload before sending
Recommended answer: B
The client is the translation layer between host and server: lifecycle handling (initialize / capability negotiation), mapping the server's exposed capabilities into a tool list the LLM can use, routing the LLM's tool_use calls to the right server, and injecting the tool_result back into the conversation. JSON-RPC is only the plumbing (A); no embedded LLM is needed (C); encryption belongs to the transport layer (D).

Assessment on MCP concepts (Vol.10 advanced)

6 questions · Skilljar Vol.10 course
Question 1

Sampling in MCP — what does it actually do?

  1. The server sends statistics about LLM token usage to the host
  2. The server requests the host's LLM to generate a completion on its behalf — capability negotiation must enable it, and the host approves before forwarding
  3. It's a load-balancing technique for multiple MCP servers
  4. It samples a subset of resources to avoid overloading the model
Recommended answer: B
Sampling lets the server call back into the host's LLM: the server submits a prompt plus parameters, the host reviews it and proxies it to the model, and the result comes back to the server. This is the standard pattern when a server's own workflow (e.g. summarizing a large document) needs LLM reasoning. It has nothing to do with statistical sampling (A/D) or load balancing (C).
Question 2

Roots — what are they for?

  1. The set of administrators allowed to install MCP servers
  2. Cryptographic root certificates for secure transport
  3. Filesystem boundaries the host advertises to servers — "you may operate within these directories" — letting servers scope their work appropriately
  4. The top-level routes in an HTTP-based MCP server
Recommended answer: C
Roots are how the host tells the server "these are the working directories I authorize" — filesystem-oriented servers (codebase indexing, document reading) use the roots to decide what to scan. This has nothing to do with admin permissions (A), TLS certificates (B), or HTTP routes (D); it is MCP's project-boundary mechanism.
Question 3

Picking transport: STDIO vs Streamable HTTP. Which fits "internal Linear MCP server hosted on a private VM, used by 50 employees with team-shared auth"?

  1. STDIO — keeps things simple
  2. Streamable HTTP — multi-user remote access requires a network transport with auth and session handling
  3. Either works equivalently
  4. Neither — MCP can't serve multiple users
Recommended answer: B
STDIO means the host and server sit on the same machine, communicating over a subprocess pipe: single-user, no network. Streamable HTTP is what supports remote multi-user access, auth, sessions, and load balancing. STDIO cannot serve 50 remote employees (A/C), and MCP fully supports multiple users (D).
Question 4

Your MCP tool runs a 90-second build. Without progress notifications, what happens on the host side?

  1. The host shows "running..." indefinitely with no feedback — users assume it's stuck and may abort
  2. Streaming HTTP automatically emits keep-alive — no action needed
  3. JSON-RPC times out at 30s and fails the call
  4. The tool can't run longer than 30s in MCP at all
Recommended answer: A
Long-running tasks must actively send `progress` notifications (with a token, percentage, and message); otherwise the host UI can only show "running" with no progress, which is a poor experience that pushes users to abort. Keep-alives (B) hold the connection open but carry no task progress; 30s is not a hard protocol limit (C/D).
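A sketch of what one such notification looks like on the wire, assuming the host supplied a `progressToken` in the original `tools/call` request (the token value and timing numbers are made up for illustration):

```python
# MCP progress notification as a JSON-RPC message. The server emits these
# periodically during the 90-second build so the host can render a bar.
progress_note = {
    "jsonrpc": "2.0",
    "method": "notifications/progress",
    "params": {
        "progressToken": "build-42",   # echoes the token from the request
        "progress": 30,                # work completed so far
        "total": 90,                   # expected total (seconds, here)
        "message": "Compiling modules (30/90s)",
    },
}
```

Notifications have no `id` field — they expect no response, which is why the host can simply stream them into its UI.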
Question 5

Streamable HTTP — stateless vs stateful session. When should you go stateful?

  1. Always — stateful is more reliable
  2. Never — stateless scales better
  3. When the server holds per-session resources (in-flight cursor, partial uploads, conversation memory) that can't be reconstructed from a single request
  4. Whenever you have more than 10 concurrent users
Recommended answer: C
Streamable HTTP is stateless by default, which keeps horizontal scaling easy; go stateful only when the server must hold continuity across requests (pagination cursors, upload progress, conversation memory), and accept the deployment complexity that comes with it. "Always stateful" (A) is over-engineering and "never stateful" (B) is dogma; user count (D) is not the deciding dimension.
Question 6

Productionizing an HTTP MCP server. Which is the most important security control?

  1. Choose a unique port number to obscure the server
  2. Authenticate every request, scope tools/resources per identity, audit log every call, and treat all external content as untrusted (injection defense)
  3. Disable HTTPS for performance — internal traffic is safe
  4. Run with default settings; MCP includes built-in security
Recommended answer: B
The production security checklist for an MCP server: authentication + per-identity authorization scoping + audit logging + injection defense — all four are required. Hiding the port (A) is security through obscurity; disabling HTTPS (C) exposes credentials; the MCP protocol ships no built-in security (D) — it is the developer's responsibility.

📦 Vol.5 Introduction to Claude Cowork

Skilljar "Introduction to Claude Cowork" — 1 quiz · 5 questions · covering the task loop / projects / plugins / scheduled tasks / file tasks / permissions

Quiz on Claude Cowork

5 questions · Skilljar Vol.5 course
Question 1

Three tasks land on your desk. Which one is the strongest fit for Cowork (vs. Chat or Claude Code)?

  1. Quickly summarize a single article you pasted into the chat
  2. Refactor a Python module's tests in a git repository
  3. Sort 200 mixed PDFs in `~/Downloads/` into year/month subfolders, generate an Excel index, and produce a one-page summary
  4. Brainstorm taglines with a colleague over a chat thread
Recommended answer: C
Cowork is built for knowledge work that is multi-step, touches local files, and produces real artifacts. Sorting 200 PDFs plus an index and a summary hits the bullseye: batch work, across files, with native Excel/document output. A single-article summary (A) is lighter in Chat; code refactoring (B) is Claude Code territory; brainstorming (D) is a Chat conversation.
Question 2

In the Cowork task loop, your role is primarily…?

  1. Hands-off — kick off the task and let Claude run end-to-end
  2. Continuous co-pilot — type alongside Claude on every step
  3. Plan reviewer + outcome auditor — describe the goal, approve plans, sample-check results, halt early when direction is wrong
  4. Just a log reader — Cowork can't be interrupted once running
Recommended answer: C
Cowork's loop puts the human at two key points: up front, defining goal + inputs + outputs + constraints + acceptance criteria; afterwards, reviewing and spot-checking. File operations are irreversible, so the approval and review steps in "describe — approve — execute — deliver" must stay with a person. It is not self-driving (A) or pair programming (B), and you can interrupt at any time (D).
Question 3

A Cowork Project differs from a one-shot Cowork task in that it…?

  1. Runs in Anthropic's cloud, freeing your laptop
  2. Persists instructions, file context, scheduled tasks, and memory inside a named local workspace — repeatable work without re-explaining context
  3. Is the same concept as a Claude Code project; just a UI rename
  4. Makes the Project's data shared across everyone in your team automatically
Recommended answer: B
A Cowork Project is a persistent local workspace: team background, directory structure, output formats, and scheduled tasks are written down once and loaded automatically for every task. It still runs on your computer (A); it is a different concept from a Claude Code project (C); it is private by default and not auto-shared across people (D).
Question 4

You set up "every Friday 5 pm: generate weekly digest." On Friday at 5 pm your laptop is closed in your bag. What happens?

  1. Cowork runs anyway in Anthropic's cloud — output appears Monday
  2. Nothing — Cowork scheduled tasks need Claude Desktop running and the computer awake; missed runs may catch up next time you open the app, depending on the schedule
  3. Your phone runs it via push notification
  4. The next launch raises an error and disables the schedule
Recommended answer: B
Cowork scheduled tasks are not cloud cron — Claude Desktop must be open and the computer awake. That constraint shapes how you design scheduled work: pick low-risk, catch-up-friendly jobs with failure notifications. Cloud execution (A) and phone push (C) are misconceptions; the schedule is not automatically disabled (D).
Question 5

Inside Cowork, what's the relationship between Plugins, Skills, Connectors, and Sub-agents?

  1. They're four names for the same connector concept
  2. Skills are the umbrella; the others plug into a Skill
  3. A Plugin packages Skills (how to do something), Connectors (where to reach), and Sub-agents (who handles which part) into a specialist workflow for a role
  4. Sub-agents only exist in Claude Code, not in Cowork
Recommended answer: C
A Plugin is a role-shaped work package: Skills define how to do something, Connectors define what it can reach, and Sub-agents split complex work across specialist roles in parallel. Their responsibilities are distinct (A); the Plugin is the packaging layer, not the Skill (B); Cowork supports sub-agents too (D).

📦 Vol.2 / Vol.3 / Vol.13 Intro + Applied Course Quizzes

Skilljar "AI Capabilities and Limitations" + "Claude Code 101" + "AI Fluency for nonprofits" — 3 course quizzes · 15 questions · covering LLM fundamentals / Claude Code onboarding boundaries / AI Fluency in nonprofit settings

Course Quiz Vol.2

5 questions · Skilljar Vol.2 "AI Capabilities and Limitations"
Question 1

When Claude generates text, what is it actually predicting?

  1. The semantic meaning of the next sentence as a whole
  2. The full final answer, then it checks back from the end
  3. The next token, given all prior tokens — repeated until a stop condition
  4. A vector embedding that's later decoded into characters
Recommended answer: C
An LLM is an autoregressive next-token prediction machine: each step decides only the next token, looping until a stop sequence / max_tokens / end-of-turn. It doesn't predict whole-passage semantics (A), doesn't work backwards from the ending (B), and its output stage isn't embedding decoding (D). Understanding this mechanism explains why prompt ordering, prefills, and stop sequences all work.
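The loop described above can be illustrated with a toy greedy generator — a hard-coded bigram table stands in for the model, which is a drastic simplification (real LLMs emit a probability distribution over ~100k tokens), but the control flow is the same:

```python
# Toy autoregressive generation: predict one token, append it, feed the
# sequence back in, repeat until a stop condition fires.
NEXT = {"the": "cat", "cat": "sat", "sat": "down", "down": "<eot>"}

def generate(start, max_tokens=10, stop="<eot>"):
    tokens = [start]
    for _ in range(max_tokens - 1):
        nxt = NEXT.get(tokens[-1], stop)
        if nxt == stop:          # stop condition: end-of-turn token
            break
        tokens.append(nxt)       # autoregression: output becomes input
    return tokens

print(generate("the"))  # ['the', 'cat', 'sat', 'down']
```

Both stop conditions from the answer appear here: the `<eot>` token and the `max_tokens` budget.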
Question 2

Claude confidently states a fact that turns out to be wrong. The most accurate explanation?

  1. The model is broken; report it for retraining
  2. Hallucination is an inherent risk of next-token prediction over imperfect training data — verification, citations, and RAG are the mitigations, not "telling Claude to be careful"
  3. Claude is intentionally deceptive in some scenarios
  4. It happens only in low-temperature settings
Recommended answer: B
Hallucination comes from the LLM's core mechanism (probabilistic continuation) plus imperfect training data — it is structural, not a bug. Mitigation comes from RAG (external sources), Citations (traceability), and evals (regression detection), not from writing "please don't make things up" in the prompt. It isn't intentional (C) and has no strong link to temperature (D).
Question 3

Pre-training, fine-tuning, and RLHF — pick the order and what each contributes.

  1. RLHF → fine-tuning → pre-training, building from style to knowledge
  2. They are alternatives; you pick one based on use case
  3. Pre-training (next-token on huge corpora — broad world knowledge) → fine-tuning (task-specific data — sharper behavior) → RLHF (human preference signals — alignment with helpfulness/safety)
  4. Only pre-training matters; the rest are marketing names
Recommended answer: C
The three-stage pipeline: pre-training learns language and world knowledge (the bulk of the parameters), fine-tuning learns tasks (shaping on smaller datasets), and RLHF learns human preferences (making answers helpful and safe). The reversed order (A) is wrong; they are not alternatives (B); the latter two stages matter enormously for perceived quality (D).
Question 4

Today's date is past Claude's training cutoff. Which question is the riskiest to take Claude's answer at face value?

  1. "Explain the origins of the Roman numeral system"
  2. "Write a Python function to reverse a string"
  3. "Who's currently the CEO of [recent-news company X]?"
  4. "What's a metaphor for resilience?"
Recommended answer: C
Domains where facts change after the cutoff (executives, current events, new version numbers, pricing, regulations) go stale fastest. History (A), deterministic code (B), and rhetorical advice (D) are barely affected by the cutoff. When fresh facts are needed, use RAG / web search / connectors rather than asking directly.
Question 5

A teammate writes "make this email better" as a Claude prompt. Why does it underperform "rewrite this email to be 100 words, polite-firm tone, ending with a clear next-step request"?

  1. Shorter prompts always work worse
  2. It needs a higher temperature setting
  3. "Better" is undefined; Claude has no shared rubric. Specifying length, tone, and required structure replaces ambiguous adjectives with measurable constraints
  4. You must use XML tags or it won't work
Recommended answer: C
"Make it better" is a vague adjective; the LLM does not share your internal rubric. Breaking the goal into length, tone, structure, and a required action applies the clear-and-specific principle. Short doesn't mean worse (A); temperature is irrelevant (B); XML tags are a tool, not a requirement (D).

Course Quiz Vol.3

5 questions · Skilljar Vol.3 "Claude Code 101"
Question 1

Three colleagues describe Claude Code. Which one has the right product mental model?

  1. "It's a chatbot in a browser tab where I paste code snippets"
  2. "It's a terminal-native AI pair-programmer that operates inside my repository — reads files, proposes edits, runs commands"
  3. "It's an autocomplete extension; it suggests one line at a time"
  4. "It's a no-code visual flow builder"
Recommended answer: B
Claude Code runs in the terminal with full filesystem access to the current repo: it reads files, runs commands, edits code, and proposes diffs. It is not a browser chat (A), not line-level completion (C — that's IDE inline suggestions), and not a visual builder (D). This mental model shapes how users write prompts and delegate authority.
Question 2

First time setting up Claude Code in a repo. What's the right first move?

  1. Just run `claude` and start asking questions — it figures everything out
  2. Run `/init` (or write a small `CLAUDE.md`) capturing the repo's stack, conventions, key directories, and "do not touch" zones — Claude loads this every session
  3. Copy the entire codebase into the prompt to "give context"
  4. Disable file write so Claude can only read
Recommended answer: B
CLAUDE.md (or the file `/init` generates) is project memory: team conventions, a directory map, and no-go zones written once and loaded every session. Running bare `claude` (A) makes Claude rediscover everything; pasting the whole repo (C) burns tokens; read-only mode (D) gives up the value. Write CLAUDE.md first, then start the work.
Question 3

Claude proposes a 12-file edit. You glance, think it looks good, and want to accept. Best practice?

  1. Accept all — Claude wouldn't propose something wrong
  2. Reject everything — too risky to read
  3. Review the diff, accept partial / file-by-file, run tests, and commit only what passed — Claude Code is a proposer, not an autopilot
  4. Always run `git add -A` before reading the diff to "checkpoint"
Recommended answer: C
Claude Code's editing model is "propose → you review → you accept → run tests → commit". Blind acceptance (A) accumulates errors; rejecting everything (B) negates the tool's value; pre-staging with `git add -A` (D) pollutes the staging area. Reviewing the diff, running tests, and accepting partially is the heart of the collaboration.
Question 4

When is Claude Code the WRONG tool?

  1. You need to refactor a 30-file module
  2. You want to add tests for a new feature
  3. You need to brainstorm a product roadmap with a stakeholder over coffee — there's no codebase context, just a conversation
  4. You want to debug a flaky CI job
Recommended answer: C
Claude Code's value lies in scenarios with code + a terminal + file operations. Pure conversation with no codebase (C) belongs in Claude Chat. Multi-file refactoring (A), writing tests (B), and CI debugging (D) are all Code's strengths. "Using the IDE as PowerPoint" is a common misuse.
Question 5

Mid-task you realize Claude Code is heading the wrong direction. Best action?

  1. Wait for it to finish; then start over
  2. Force-kill the terminal and discard everything
  3. Press Esc / interrupt, redirect with new context, and use `/clear` only if the off-track context is irreversible — keeping useful history saves restart cost
  4. Open a separate `claude` session in another terminal so they vote
Recommended answer: C
Interrupting and redirecting is the core of collaborating with Claude Code. Waiting for it to finish (A) wastes time and tokens; killing the process (B) loses all context; parallel voting (D) only adds chaos. Use `/clear` only when the context is polluted beyond saving — in most cases, redirecting in place is enough.

Course Quiz Vol.13

5 questions · Skilljar Vol.13 "AI Fluency for nonprofits"
Question 1

In the 4D framework (Delegation / Description / Discernment / Diligence), which D handles "deciding whether AI should do this task at all"?

  1. Description
  2. Delegation
  3. Discernment
  4. Diligence
Recommended answer: B
Delegation is task routing: deciding whether a task is done by a human, by AI, or jointly. Description is writing the prompt well (how to brief), Discernment is judging output quality, and Diligence is owning the outcome. Pick the right tool first; the rest follows.
Question 2

Your nonprofit uses Claude to research grant opportunities. The output lists 5 promising funders. Discernment step says you should…?

  1. Trust the list — Claude has access to current grant databases
  2. Verify each funder still exists, the deadline is current, and the eligibility criteria match — Claude may hallucinate or use outdated data
  3. Ask Claude to verify itself by re-running the prompt
  4. Forward the list to staff without review
Recommended answer: B
The core of Discernment is not treating AI output as fact — especially for external funders, deadlines, and compliance. Claude may hallucinate or rely on stale knowledge; verify each item against the official site or a trusted database. "Let Claude check itself" (C) is asking the author to grade their own exam.
Question 3

Which type of nonprofit data is the LEAST safe to paste into a public Claude conversation?

  1. Your annual report PDF (already public)
  2. Generic fundraising email templates
  3. Beneficiary case files containing names, contact info, health details, or immigration status
  4. Your mission statement
Recommended answer: C
PII and sensitive beneficiary records (health, immigration status, case details) sit in the highest sensitivity tier: a leak causes real harm to vulnerable people. Already-public material (A/D) and generic templates (B) are low risk. Handling this data calls for a compliant enterprise plan, data minimization, and anonymization.
Question 4

"Diligence" in the 4D framework operationally means…?

  1. Working harder than the AI does
  2. Owning the final outcome — humans remain accountable for what was sent / decided / published, and document the AI's role + limits in their workflow
  3. Reading every Claude transcript end-to-end
  4. Running the same prompt three times for consensus
Recommended answer: B
Diligence = humans owning the final outcome and documenting the AI's role and limits in the workflow. It is not physical effort (A), word-by-word reading (C), or majority voting (D). The point being tested: put your AI usage into an auditable workflow instead of offloading responsibility onto the model.
Question 5

Your director asks Claude to analyze 5 years of donation data and recommend a fundraising strategy. When is AI-only analysis NOT enough?

  1. Always — AI can never do data analysis
  2. Never — AI handles it end-to-end if you upload the spreadsheet
  3. When decisions affect real budgets, staff, or beneficiaries — AI surfaces patterns and questions, humans validate the math, judge the strategy, and own the call
  4. Only when the dataset is over 100k rows
Recommended answer: C
AI accelerates data analysis — spotting patterns, generating hypotheses, drafting communications — but strategic decisions touching budgets, staff, or beneficiaries need human sign-off. "AI can do everything" (A) and "AI can do nothing" (B) are both extremes; row count (D) is not the deciding dimension. The point: augmentation, not automation.

📦 Vertex Course

Skilljar "Claude with Google Cloud's Vertex AI" — 9 quizzes · 58 questions · covering API / Prompt / Tool Use / RAG / MCP / Agent + Final Assessment

Quiz on Accessing Claude with the API

7 questions · Skilljar Vertex course
Question 1

You're building a chat app that talks to Claude. Where should you put your API key?

  1. In the user's browser
  2. On your secure server
  3. In your website's JavaScript code
  4. In a public GitHub repository
Recommended answer: B · On your secure server
The API key must stay in a trusted backend environment. Putting it in the browser, JavaScript code, or a public repo means theft and a runaway bill. The frontend talks only to your server, which calls the Anthropic API on its behalf.
Question 2

What is the primary purpose of a system prompt when working with Claude?

  1. To authenticate API requests to the Anthropic service
  2. To provide instructions that customize Claude's tone, style, and approach
  3. To limit the number of tokens Claude can generate in a response
  4. To store the conversation history between multiple requests
Recommended answer: B
The system prompt is the role-and-tone layer: it tells Claude who it is, what voice to use, and which rules to follow. Authentication uses the API key (A), token limits use max_tokens (C), and conversation history lives in the messages array (D).
Question 3

Your users complain that your chat app feels slow because they wait 20 seconds staring at a loading spinner, then a bunch of generated text suddenly appears. What feature should you add to fix this?

  1. Shorter prompts
  2. Response streaming
  3. Multiple chatbots
  4. Faster internet connection
Recommended answer: B · Response streaming
Streaming pushes tokens to the frontend as they are generated, turning "20 seconds of blank then everything at once" into "typing starts immediately". It is the standard fix for perceived latency — no prompt shortening or faster network required.
Question 4

You want Claude to generate only clean JSON code without any explanations or markdown formatting. Which combination of techniques works best?

  1. Request shorter responses
  2. Ask nicely in the prompt
  3. Use high temperature and long prompts
  4. Prefill with "{" and use "```" as a stop sequence
Recommended answer: D
A combination of two techniques: response prefill (prepending { to the assistant message) locks in the JSON opening, and a stop sequence (```) cuts off markdown code-block endings. Far more reliable than asking nicely.
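A sketch of how the two techniques combine in a Messages API request — the payload is built as a plain dict for illustration, and the model name is a placeholder; in a real app these fields would go to the Anthropic SDK:

```python
# Prefill + stop sequence: the assistant turn is pre-seeded with "{",
# so generation continues from raw JSON, and "```" halts any attempt
# to close a markdown code fence.
request = {
    "model": "claude-sonnet-4-5",       # illustrative model name
    "max_tokens": 1024,
    "stop_sequences": ["```"],           # cut off markdown fences
    "messages": [
        {"role": "user", "content": "List 3 fruits as JSON with name and color."},
        {"role": "assistant", "content": "{"},   # prefill: forces JSON from char one
    ],
}
```

Note the last message has `role: assistant` — that is what makes it a prefill rather than a user instruction.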
Question 5

You're building a chatbot to answer factual questions about your company. You want consistent, reliable answers every time. What temperature setting should you use?

  1. 0.1 (low temperature)
  2. It doesn't matter
  3. 0.8 (high temperature)
  4. 1.0 (high temperature)
Recommended answer: A
Lower temperature means more deterministic output: identical inputs converge on identical outputs. Factual Q&A needs predictability, and 0.1 is a typical choice; high temperature suits creative writing or brainstorming.
Question 6

Claude reads your message "I love quantum physics." What happens first?

  1. Claude breaks the text into smaller pieces called tokens
  2. Claude researches quantum physics
  3. Claude writes a response immediately
  4. Claude translates it to another language
Recommended answer: A
All LLM input first passes through a tokenizer that splits it into tokens; the model sees only token IDs. Claude doesn't go off and research the topic (B), and it can't skip tokenization to answer directly (C).
Question 7

You ask Claude "What's the best programming language?" but you want it to specifically argue for Python. What technique helps you control this?

  1. Use a higher temperature setting
  2. Ask the question multiple times
  3. Make the text bigger
  4. Add an assistant message starting with "Python is the best because"
Recommended answer: D
Response prefill again: add an assistant-role "opening" to the messages array, and Claude can only continue from that opening, which locks in stance or format. Much more effective than repeated instructions in the prompt.

Quiz on Prompt Engineering Techniques

6 questions · Skilljar Vertex course
Question 1

You're giving your AI a long document to analyze along with your instructions. What would help the AI understand your prompt better?

  1. Put everything in one big paragraph
  2. Write the document in all capital letters
  3. Use XML tags like <document> and <instructions>
  4. Separate sections with lots of blank lines
Recommended answer: C
Anthropic's official guidance is to use XML tags to draw semantic boundaries in prompts — Claude saw this format extensively in training and can precisely distinguish the roles of <document> and <instructions>. Blank lines (D) and all-caps (B) carry no structure.
Question 2

You want your AI to write a book review. Which approach would be more helpful?

  1. "Write a book review that's 300 words, includes plot summary, mentions two characters, and gives a rating"
  2. "Write a book review that's good and interesting"
  3. "Tell me what you think about books in general"
  4. "Write something about the book I just read"
Recommended answer: A
The "be specific" principle: explicit length, required elements, measurable criteria. "Good and interesting" and "something about" are vague words the AI can only guess at. This is exactly the Description element of the 4D framework (role/task/format/criteria) in action.
Question 3

You want to improve a prompt that isn't working well. What should you do first after writing your initial prompt?

  1. Test it and measure how well it performs
  2. Add more examples immediately
  3. Rewrite it completely from scratch
  4. Make it longer and more detailed
Recommended answer: A
The eval-first principle: without measurement there is no direction for improvement. Blindly adding examples (B), rewriting (C), or lengthening (D) can make the prompt worse without you noticing. Build an evaluation set first, then iterate.
Question 4

Your evaluation report shows your prompt scored poorly on "missing calorie information" across multiple test cases. What does this tell you?

  1. You should ignore this feedback and try something else
  2. You need to specifically tell your prompt to include calorie information
  3. The test cases are too hard
  4. The evaluation system is broken
Recommended answer: B
The evaluation feedback points at a systematic omission — make the missing item explicit in the prompt. Claude won't automatically guess what you want; whatever is missing, write it in. Blaming the eval (C/D) or ignoring it (A) is deflection.
Question 5

You want your AI to detect sarcasm in social media posts, but it keeps missing sarcastic comments. What would help most?

  1. Tell it to "try harder" to find sarcasm
  2. Use bigger fonts in your prompt
  3. Show it examples of sarcastic posts with correct labels
  4. Ask it to guess when posts might be sarcastic
Recommended answer: C
Few-shot examples are the most effective way to steer complex judgments — especially for hard-to-articulate concepts like sarcasm, where examples beat definitions. "Try harder" (A) means nothing to an LLM.
Question 6

Why is the first line of your prompt considered the most important part?

  1. It determines which AI model will be used
  2. It determines how fast the AI will respond
  3. It sets the stage for everything that follows and should be clear and direct
  4. It controls the length of the AI's response
Recommended answer: C
The first line establishes the task frame — LLMs are strongly anchored by the opening sentence, and a vague or off-topic opening defocuses everything after it. Anthropic's "be clear and direct" principle puts particular weight on the first line.

Quiz on Prompt Evaluation

4 questions · Skilljar Vertex course
Question 1

You've learned techniques for writing better prompts, but now you want to measure how well they actually work. What do you need?

  1. More prompt engineering techniques
  2. Prompt evaluation methods
  3. More training data
  4. A faster AI model
Recommended answer: B
The question carries its own hint: "measure how well they work" is evaluation. Prompt engineering (A) is about writing better prompts; evaluation is about measuring how good they are. The two complement each other in a loop.
Question 2

You need test cases for your prompt evaluation but don't want to write them all by hand. What's a good alternative?

  1. Ask users to create test cases
  2. Only test with one example
  3. Skip testing and deploy immediately
  4. Use Claude to generate test cases automatically
Recommended answer: D
Using Claude to generate diverse test cases is a widely accepted efficiency play — you can specify edge cases, personas, and noisy scenarios and produce them in bulk. The other options sacrifice coverage (B), sacrifice quality (C), or shift the burden (A).
Question 3

You write a prompt and test it twice with your own inputs. It looks good, so you deploy it. What's the main risk?

  1. The prompt will work too slowly
  2. The prompt will become too expensive
  3. The AI model will stop working
  4. Users might provide unexpected inputs that break it
Recommended answer: D
Two self-tests cover essentially nothing; real user input is far more varied than you expect — edge cases, adversarial input, and mixed languages can all trigger breakage. This is a classic failure of insufficient Discernment.
Question 4

In a typical evaluation workflow, what happens right after you feed your prompts through Claude?

  1. You change the prompt and start over
  2. You feed the responses through a grader
  3. You create a new dataset
  4. You deploy the prompt to production
Recommended answer: B
The standard eval workflow order: dataset → prompt → response → grader → score. The grader can be code (deterministic checks) or another LLM (model-based grading). Changing the prompt (A) and deploying (D) happen after you have scores.

Quiz on Tool Use with Claude

7 questions · Skilljar Vertex course
Question 1

When Claude wants to use a tool, it sends back a response that's different from usual. What does this response contain?

  1. Error messages only
  2. Both text blocks and tool use blocks
  3. Only tool requests with no text
  4. Only text like normal
Recommended answer: B
A tool-use response's content array typically holds both Claude's explanatory text (a text block) and the tool invocation request (a tool_use block), with stop_reason set to tool_use. Tool-only responses with no text (C) do occur occasionally, but B is the typical case.
Question 2

You want to create a tool that gets the current time. What type of code do you need to write?

  1. A regular Python function
  2. A web page
  3. A database query
  4. A complex AI algorithm
Recommended answer: A
A tool is just an ordinary function: Claude emits a tool_use request, your app runs the function (e.g. datetime.now()), and you return the result as a tool_result. No AI algorithm required.
Question 3

A user asks Claude "What day will it be 30 days from today?" To answer this, Claude needs to use multiple tools. What happens?

  1. Claude asks the user to do the math
  2. Claude uses one tool and guesses the rest
  3. Claude calls tools in sequence — first getting today's date, then adding 30 days
  4. Claude gives up and says it can't help
Recommended answer: C
This is the multi-turn tool-use (agentic loop) pattern: Claude plans the tool-call sequence itself and decides the next step after each tool_result. This is the core capability of an agent.
Question 4

Sarah asks Claude "What's the weather like today?" but Claude says it doesn't have current weather data. What would solve this problem?

  1. Waiting for Claude to update itself
  2. Giving Claude access to tools that fetch current data
  3. Asking Claude to guess the weather
  4. Training Claude on more weather information
Recommended answer: B
Claude's training data has a cutoff date and it has no access to live information. Tool use is the standard fix for real-time data — for example a weather-API tool. Waiting for self-updates (A) or retraining (D) is not realistic.
Question 5

You're building a chat app with Claude. A user asks for today's stock prices, but Claude responds "I don't have access to current stock information." What's the core problem?

  1. The user asked the wrong question
  2. Claude is broken
  3. Claude only knows information from its training data
  4. Claude needs to be restarted
Recommended answer: C
This is the LLM's fundamental limitation — no external access; it can only recall what was in its training data. Understanding this is the prerequisite for deciding when to introduce tools. Q4 asks "what to do"; Q5 asks "why".
Question 6

You want to give Claude the ability to search the web for current information. What do you need to implement?

  1. Just a simple schema — Claude handles the searching
  2. Your own search engine
  3. A complex web scraping system
  4. Permission from Google
Recommended answer: A
This refers to Anthropic's built-in web_search tool — declare it in the tools array and Claude calls it automatically, weaving the results into its answer. No need to build a search engine or obtain special permission.
Question 7

You've written a Python function for Claude to use. What else do you need so Claude knows how to call it?

  1. Permission from Claude
  2. A JSON schema describing the function
  3. A special license
  4. A user manual
Recommended answer: B
A tool definition = name + description + input_schema (JSON Schema). Claude looks only at the schema to decide when to call and what arguments to pass; the function code runs in your app and Claude never sees it.
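The full contract can be sketched in a few lines — the tool name and dispatch table here are illustrative, not from any SDK; the point is that Claude sees only `definition`, while your app owns the function:

```python
from datetime import datetime, timezone

# What Claude sees: name + description + input_schema (JSON Schema).
definition = {
    "name": "get_current_time",
    "description": "Returns the current UTC time in ISO-8601 format.",
    "input_schema": {"type": "object", "properties": {}},
}

# What your app owns: an ordinary Python function Claude never sees.
def get_current_time():
    return datetime.now(timezone.utc).isoformat()

# When Claude replies with a tool_use block naming this tool, your app
# dispatches it and returns the value in a tool_result message.
HANDLERS = {definition["name"]: get_current_time}
result = HANDLERS["get_current_time"]()
```

The dispatch dict is the hinge: it maps the name in the schema to the code that actually runs.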

Quiz on Retrieval Augmented Generation

8 questions · Skilljar Vertex course
Corresponding local volume: V8·L04 RAG / Agent patterns
Question 1

You're setting up a system to handle large documents. Instead of using everything at once, you break documents into smaller pieces and search for relevant ones. What is this approach called?

  1. The chunking approach
  2. File splitting
  3. Text summarization
  4. Document compression
Recommended answer: A
Chunking is the standard RAG term: splitting documents along semantic/paragraph boundaries into small indexable pieces (chunks). File splitting is a file-level operation, summarization changes the content, and compression doesn't partition.
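A minimal fixed-size chunker with overlap, the simplest of the strategies the answer mentions (real pipelines often split on paragraph or semantic boundaries instead; the sizes below are arbitrary):

```python
# Fixed-size chunking: slide a window of `size` characters forward by
# `size - overlap` each step, so adjacent chunks share some context.
def chunk(text, size=200, overlap=50):
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

doc = "x" * 500
pieces = chunk(doc)
# Window starts at 0, 150, 300, 450 → four chunks; the last is a stub.
```

The overlap is what keeps a sentence that straddles a boundary retrievable from at least one chunk.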
Question 2

You try to include a massive 800-page document directly in your Claude prompt. What problems will you likely face?

  1. There are hard limits on text length, reduced effectiveness, and higher costs
  2. The document will be perfectly processed
  3. Claude will work faster than normal
  4. Only the cost will increase slightly
Recommended answer: A
Three problems at once: (1) the context-window limit (200K by default / 1M for Opus); (2) retrieval effectiveness degrades in long contexts (lost in the middle); (3) input-token costs multiply. This is exactly why RAG exists.
Question 3

You have an 800-page financial report and want to ask Claude specific questions about it. What does RAG help you do?

  1. Ask only yes/no questions
  2. Put the entire document into each prompt
  3. Summarize the whole document first
  4. Find and include only the relevant sections for each question
Recommended answer: D
RAG = Retrieval-Augmented Generation: retrieve the chunks relevant to each question and inject them into the prompt, so Claude sees context that is sufficient and necessary. Not stuffing in the full text (B), and not summarizing first (C, which loses detail).
Question 4

You send the text "The cat is happy" to an embedding model. What do you get back?

  1. A summary of the text
  2. A translation in another language
  3. A list of keywords
  4. A long list of numbers
Recommended answer: D
An embedding model maps text to a fixed-length vector (e.g. a 1536-dimensional float array); the numbers capture semantics and enable semantic search via cosine similarity. Embeddings don't explain, translate, or extract keywords.
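To make "a long list of numbers" concrete, here is cosine similarity over toy 3-dimensional vectors — the vectors are invented for illustration (real embeddings have hundreds or thousands of dimensions), but the math is the real thing:

```python
import math

# Cosine similarity: dot product normalized by vector lengths.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

happy_cat = [0.9, 0.1, 0.2]      # hypothetical vector for "The cat is happy"
joyful_kitten = [0.8, 0.2, 0.3]  # semantically close text → nearby vector
tax_form = [0.1, 0.9, 0.8]       # unrelated text → distant vector

# Semantic search is just "rank by cosine similarity to the query vector".
assert cosine(happy_cat, joyful_kitten) > cosine(happy_cat, tax_form)
```

Identical directions score 1.0; unrelated directions score near 0 — that ordering is all a vector store needs.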
Question 5

What problem does contextual retrieval solve in RAG systems?

  1. It makes search queries run faster
  2. It reduces the storage space needed for embeddings
  3. It addresses the issue of chunks losing their connection to broader document context when documents are split
  4. It eliminates the need for vector databases
Recommended answer: C
Anthropic's Contextual Retrieval: prefix each chunk with a Claude-generated summary of the chunk's place and role in the original document, restoring the context severed by splitting and significantly improving retrieval accuracy. It doesn't affect speed (A) or storage (B).
Question 6

You have search results from both semantic search and BM25 search. They use different scoring systems. How do you combine them into one ranked list?

  1. Use Reciprocal Rank Fusion (RRF) based on rank positions
  2. Take the average of both scores
  3. Add the scores together directly
  4. Use only the semantic search results
Recommended answer: A
RRF fuses by rank position (absolute scores from different scorers aren't comparable): each document's final score = Σ 1/(k + rank). This is the industry-standard method for hybrid retrieval.
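The formula fits in a few lines; the document IDs below are made up, and k=60 is the value commonly used in the RRF literature:

```python
# Reciprocal Rank Fusion: each list contributes 1/(k + rank) per document.
# Only rank positions matter — raw scores from the two systems never mix.
def rrf(rankings, k=60):
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["doc_a", "doc_b", "doc_c"]   # semantic-search ranking
bm25 = ["doc_b", "doc_d", "doc_a"]       # BM25 lexical ranking
fused = rrf([semantic, bm25])
# doc_b (ranks 2 and 1) edges out doc_a (ranks 1 and 3).
```

A document ranked well by both systems beats one ranked first by only one — which is exactly the behavior hybrid search wants.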
Question 7

What is the purpose of re-ranking in RAG pipelines?

  1. To compress the vector database for faster searches
  2. To generate better embeddings for text chunks
  3. To use an LLM to intelligently reorder search results after initial retrieval
  4. To split documents into more appropriate chunk sizes
Recommended answer: C
Re-ranking is a post-retrieval stage: coarse retrieval produces N candidates, then a slower but more accurate reranker (typically a cross-encoder or a small LLM) re-sorts them so the truly relevant ones rise to the top.
Question 8

You're searching for a specific incident ID like "INC-2023-Q4-011" in your documents. Semantic search isn't finding it well. What additional search method would help?

  1. Bigger vector database
  2. BM25 lexical search for exact term matching
  3. Longer embeddings
  4. More chunks
Recommended answer: B
Semantic search excels at "similar meaning"; BM25 excels at exact lexical matches (IDs, SKUs, error codes, names). Together they form hybrid search — the two retrieval paths that Q6's RRF fuses.

Quiz on Features of Claude

6 questions · Skilljar Vertex course
Question 1

When Extended Thinking is enabled, what two parts will Claude's response contain?

  1. A summary block and a detail block
  2. A thinking block and a text block
  3. A draft block and a final block
  4. A question block and an answer block
Recommended answer: B
With Extended Thinking enabled, the content array carries a thinking block (the reasoning) first, then a text block (the final answer). This is the standard structure for the 4.x hybrid-reasoning models.
Question 2

You ask Claude "How many marbles are in this image?" but get the wrong count. What's the best way to improve accuracy?

  1. Ask the question in all capital letters
  2. Send a higher quality image
  3. Upload the image multiple times
  4. Provide detailed counting steps and methodology
Recommended answer: D
Best practice for vision counting tasks: have Claude state its methodology explicitly (partition the image, draw a grid, count block by block, then sum) — effectively chain-of-thought for vision. Directly asking "how many" yields lower accuracy.
Question 3

What's the minimum amount of content needed for caching to work?

  1. Any amount of text
  2. 500 tokens
  3. 1024 tokens
  4. 2000 tokens
Recommended answer: C
Per Anthropic's docs, the default minimum cacheable prefix is 1024 tokens (2048 for some smaller models). A cache_control marker below that threshold is simply ignored — the key parameter when judging whether caching is worth it.
Question 4

You want to cache your tool definitions. Where should you place the cache breakpoint?

  1. On the last tool in your list
  2. On the middle tool in your list
  3. On every tool in your list
  4. On the first tool in your list
Recommended answer: A
A cache breakpoint caches everything before it — placing it on the last tool means all tool definitions are cached. Placing it on the first (D) caches only the first tool, the weakest option.
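A sketch of the placement, with hypothetical tool names and the descriptions elided — the one structural point is that only the final tool carries the marker:

```python
# cache_control on the LAST tool caches the whole tool list, because a
# breakpoint covers everything before it in the prompt prefix.
tools = [
    {"name": "search", "description": "...", "input_schema": {"type": "object"}},
    {"name": "get_weather", "description": "...", "input_schema": {"type": "object"}},
    {"name": "calculator", "description": "...", "input_schema": {"type": "object"}},
]
tools[-1]["cache_control"] = {"type": "ephemeral"}  # breakpoint on the final tool
```

Earlier tools need no marker of their own: they are cached implicitly as part of the prefix the breakpoint closes off.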
Question 5

How does prompt caching work?

  1. It makes Claude remember conversations forever
  2. It prevents Claude from making mistakes
  3. It reuses computational work from previous requests
  4. It translates messages into different languages
Recommended answer: C
The essence of prompt caching: Anthropic's servers cache the prefix's KV cache (intermediate attention computations); on a hit, the repeated forward pass is skipped and the cached portion of tokens is billed at 10% of the normal price. It is not conversational memory (A).
Question 6

You're building an app where users ask questions about documents. What's the main benefit of enabling citations?

  1. It reduces the cost of each request
  2. It shows users exactly where information came from
  3. It makes Claude's responses longer
  4. It makes the app run faster
Recommended answer: B
The Citations API makes Claude annotate each claim in its answer with the source passage it came from — a major gain in verifiability and a hedge against hallucination. For document Q&A, this is a compliance and trust essential.

Quiz on Model Context Protocol

7 questions · Skilljar Vertex course
Question 1

User-controlled workflows that are triggered through UI interactions like button clicks or slash commands. This definition describes which MCP primitive?

  1. Resources
  2. Sessions
  3. Tools
  4. Prompts
Recommended answer: D
Who controls each MCP primitive: Tools are model-controlled (Claude decides when to call), Resources are app-controlled (the application loads them as needed), and Prompts are user-controlled (triggered from the UI, e.g. a slash command).
Question 2

What does "transport agnostic" mean in the context of MCP communication?

  1. MCP automatically chooses the fastest available network connection
  2. MCP requires specific hardware to function properly
  3. MCP only works with HTTP connections
  4. MCP clients and servers can communicate using different methods like HTTP, WebSockets, or standard input/output
Recommended answer: D
MCP decouples the protocol layer from the transport layer: the same JSON-RPC messages can travel over stdio (local subprocess), SSE/HTTP (remote), or WebSocket. Developers pick the transport per deployment scenario; the business code doesn't change.
Question 3

What are Resources in the context of MCP?

  1. App-controlled data access for UI purposes or adding context to conversations
  2. Model-controlled functions for performing calculations
  3. User-triggered commands that start predefined workflows
  4. Server configuration settings that control performance
推荐答案A
Resources 由应用决定何时加载(不像 tool 是模型自主调)——典型用途:把当前文件、数据库行、知识库片段作为上下文传给 Claude。B 是 Tool,C 是 Prompt。
Question 4

In MCP architecture, what is the relationship between MCP Clients and MCP Servers?

  1. MCP Clients generate AI responses while MCP Servers handle user input
  2. MCP Clients store data while MCP Servers process requests
  3. MCP Clients connect to MCP Servers that contain tools, prompts, and resources
  4. MCP Clients and MCP Servers are the same component with different names
推荐答案C
标准 client-server 架构——Client(如 Claude Desktop / Cursor / Claude Code)连接多个 Server,Server 暴露 tools/prompts/resources 三类能力。Server 是能力提供方,Client 是消费方。
Question 5

What is Model Context Protocol (MCP)?

  1. A programming language specifically designed for AI applications
  2. A communication layer that provides Claude with context and tools without requiring tedious integration code
  3. A security protocol for encrypting AI model responses
  4. A database management system for storing AI conversations
推荐答案B
MCP 的「USB-C for AI」类比——一套标准协议替代 N 个 AI 应用 × M 个数据源的两两胶水代码。开发者一次实现 MCP Server,所有兼容客户端即可使用。
Question 6

What is the MCP Server Inspector?

  1. A command-line tool for monitoring server performance
  2. A code editor specifically designed for writing MCP servers
  3. A security tool for scanning MCP servers for vulnerabilities
  4. A browser-based interface for testing and debugging MCP servers in real-time
推荐答案D
官方 MCP Inspector 是一个 web UI(npx @modelcontextprotocol/inspector),可手动调用 server 的 tools / 读取 resources / 触发 prompts,用于开发期调试。
Question 7

Which of the following correctly describes Tools in MCP?

  1. App-controlled data that populates UI elements
  2. Static configuration files that define server behavior
  3. User-controlled workflows that can be triggered on demand
  4. Model-controlled functions that Claude decides when to call
推荐答案D
与 Q1/Q3 形成完整对照:Tools 由 model 决定何时用(如 search_web、execute_sql),Resources 由 app 控制(A 描述),Prompts 由 user 触发(C 描述)。

Quiz on Agents and Workflows

7 题 · Skilljar Vertex 课程
Question 1

You want Claude to analyze a product image for 6 different materials at once. Instead of one huge prompt, what should you do?

  1. Write a longer, more detailed prompt
  2. Send 6 separate requests in parallel, then combine results
  3. Ask the user to pick one material first
  4. Use an agent with material tools
推荐答案B
Parallelization 模式——独立子任务并发执行,最后聚合。避免单 prompt 因任务过多而注意力分散,也降低延迟(6 个并发 vs 串行)。
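Parallelization 模式可以用标准库线程池勾勒(`analyze` 为占位函数,真实场景中它封装一次聚焦单一材质的 Claude API 调用):

```python
from concurrent.futures import ThreadPoolExecutor

MATERIALS = ["cotton", "leather", "steel", "glass", "wood", "rubber"]

def analyze(material: str) -> str:
    # 占位:真实场景调用 client.messages.create(...),
    # prompt 只聚焦单一材质,避免注意力分散
    return f"analysis of {material}"

def analyze_all(materials):
    # 独立子任务并发执行,最后统一聚合
    with ThreadPoolExecutor(max_workers=len(materials)) as pool:
        return list(pool.map(analyze, materials))

results = analyze_all(MATERIALS)
```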
Question 2

You're building a system where Claude creates content, then checks if it's good enough, then improves it if needed. What pattern is this?

  1. Parallelization
  2. Evaluator-optimizer
  3. Routing
  4. Chaining
推荐答案B
Evaluator-optimizer:generator 出稿 → evaluator 打分/给反馈 → generator 据反馈改写,循环至达标。区别于 Chaining(线性多步)和 Routing(分类分发)。
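Evaluator-optimizer 的循环骨架大致如下(`generate` 与 `evaluate` 均为占位桩,真实场景各自封装一次 Claude 调用;阈值与轮数为假设参数):

```python
def generate(feedback=None):
    # 占位:真实场景调用 Claude 生成内容,feedback 非空时附在 prompt 中
    return "draft v2" if feedback else "draft v1"

def evaluate(draft):
    # 占位:真实场景由 evaluator prompt 给出分数与改进意见
    if draft == "draft v1":
        return 5, "too vague"
    return 9, ""

def refine(max_rounds=3, threshold=8):
    # 出稿 -> 打分 -> 据反馈改写,循环至达标或用尽轮数
    feedback = None
    for _ in range(max_rounds):
        draft = generate(feedback)
        score, feedback = evaluate(draft)
        if score >= threshold:
            return draft
    return draft

final = refine()
```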
Question 3

You're building an app where users upload photos and always get the same 4-step process to enhance them. Which approach should you use?

  1. An agent with photo tools
  2. A single complex prompt
  3. Multiple agents working together
  4. A workflow with predefined steps
推荐答案D
Workflow vs Agent 关键区分:步骤固定且已知 → workflow(可控、可预测、低成本);步骤需 Claude 自主规划 → agent。这道题「same 4-step」明示流程固定。
Question 4

Your app needs to handle both "cooking recipes" and "workout routines" with completely different styles. What pattern helps?

  1. Combine both styles in one prompt
  2. Use the same prompt for both
  3. Routing — categorize first, then use specialized prompts
  4. Always ask users to specify the category
推荐答案C
Routing 模式——先用轻量分类器(小模型/规则)判断输入类型,再分发到专属 prompt。比通用 prompt 准确率高,比让用户分类(D)体验好。
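Routing 的最小骨架如下(`classify` 用关键词规则占位,真实场景可换成小模型分类;prompt 文案为示意):

```python
def classify(query: str) -> str:
    # 占位分类器:真实场景可用小模型或规则判断类别
    return "recipe" if "recipe" in query.lower() else "workout"

PROMPTS = {
    "recipe": "You are a warm, detailed cooking instructor...",
    "workout": "You are a concise, motivating fitness coach...",
}

def route(query: str) -> str:
    # 先分类,再分发到专属 system prompt
    return PROMPTS[classify(query)]
```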
Question 5

What defines an agent when working with Claude?

  1. A predetermined sequence of steps that Claude must follow exactly
  2. A setup where Claude is given a goal and tools, then figures out how to complete the goal
  3. A system that categorizes user requests into different types
  4. A method for breaking complex tasks into parallel subtasks
推荐答案B
Agent 的本质 = 目标导向 + 自主规划工具调用顺序。A 是 workflow,C 是 routing,D 是 parallelization——这些都是固定结构,不是 agent。
Question 6

What is environment inspection in the context of AI agents?

  1. The process of categorizing user requests into different workflow types
  2. The technique of running multiple specialized tasks simultaneously
  3. A method for breaking large tasks into smaller sequential steps
  4. Claude's ability to observe and understand the results of its actions
推荐答案D
Environment inspection = agent 在每次行动后读取环境反馈(tool_result、文件状态、错误信息)以决定下一步。这是 agentic loop 的核心反馈环节。
Question 7

You're building an agent. In general, should you give it a "Refactor Code" tool or basic tools like "read file" and "write file"?

  1. "Refactor Code" tool — it's more specific
  2. Basic tools — they're more flexible
  3. Both tools together
  4. Neither — agents don't need tools
推荐答案B
Anthropic agent 设计原则:「fewer, more general tools」。基础原子工具(read/write/exec)让 agent 自主组合解决任意问题;高层抽象工具(Refactor Code)覆盖窄、需求一变就要重做。

Final Assessment Quiz Vertex

6 题 · Skilljar Vertex 课程 · 综合判断
Question 1

You're porting a Python script from `anthropic.Anthropic()` (direct API) to `anthropic.AnthropicVertex(...)`. The `model` parameter must change how?

  1. It stays the same — `claude-sonnet-4-5`
  2. Prefix with `vertex.`, e.g. `vertex.claude-sonnet-4-5`
  3. Use the Vertex Model Garden ID, typically `claude-sonnet-4-5@` or the current endpoint alias
  4. The SDK rewrites it automatically; pass anything
推荐答案C
Vertex 用 Model Garden 的 ID,常见形式是 `claude-X@YYYYMMDD` 版本日期或较新的 endpoint alias(要看 Model Garden 当前配置)。直接复用 Anthropic ID 会查无此模型(A 错);`vertex.` 是杜撰前缀(B 错,那是 Bedrock 的 `anthropic.` 前缀的混淆版本);SDK 不会替你猜(D 错)。
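迁移前后的参数差异可以用一个对照草图表示(`@` 后的版本日期、region 与 project ID 均为假设值,须以 Model Garden 实际列表为准):

```python
# Anthropic 直连与 Vertex 的 model 参数对照
direct = {"model": "claude-sonnet-4-5"}
vertex = {
    "model": "claude-sonnet-4-5@20250929",  # Model Garden 版本化 ID(日期为假设)
    "region": "us-east5",                   # 假设的可用区域
    "project_id": "my-gcp-project",         # 假设的 GCP 项目 ID
}
# 真实代码:
# client = anthropic.AnthropicVertex(region=vertex["region"],
#                                    project_id=vertex["project_id"])
# client.messages.create(model=vertex["model"], ...)
```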
Question 2

Most of your users are in the US and you need predictable latency plus US-only data residency. Endpoint choice?

  1. `global` — let Vertex route freely
  2. A US multi-region endpoint (`us`) or a specific US region — keeps requests within US borders
  3. `eu` — closer to global average
  4. Skip Vertex; call Anthropic API directly for lower latency
推荐答案B
数据驻留要求时必须显式选 US(多区域或具体 region)。`global`(A)会跨大陆路由,无法保证驻留;`eu`(C)方向相反;绕过 Vertex(D)放弃了 GCP 治理与审计——这是题面要求的反面。
Question 3

Production auth on Vertex — best pattern when running on GKE?

  1. Hard-code Service Account JSON in container image
  2. Reuse the existing `ANTHROPIC_API_KEY` env var
  3. Workload Identity — Pod assumes a Google Service Account without any JSON key on disk
  4. Have users sign in with their personal Google accounts
推荐答案C
GKE 上的生产应用应使用 Workload Identity——免静态密钥、按 Pod 授权、可审计。镜像里硬编 SA JSON(A)等于把生产密钥放进任何拿到镜像的人手里;Anthropic API key 在 Vertex 上无效(B 错);让最终用户登录(D)让应用拿不到一致的服务身份。
Question 4

Compliance auditor asks "show me every Claude call your team made last month, with caller identity." Where do you look?

  1. Anthropic Console usage page
  2. Cloud Audit Logs in the GCP project — Vertex records every API call with caller identity and metadata
  3. Application logs only — Vertex doesn't expose this
  4. It's not retained; you'd have to log it yourself manually going forward
推荐答案B
Vertex 的每次调用都进入 Cloud Audit Logs(Data Access logs),含主体身份与请求元数据——这正是企业选 Vertex 而非 Anthropic 直连的关键合规理由。Anthropic Console(A)不显示 Vertex 调用;应用日志(C)只看你写的;Audit Logs 默认开启(D 错)。
Question 5

If you write Vertex requests with the native REST body (not via the Anthropic SDK), what's the gotcha?

  1. Native REST is forbidden; you must use the SDK
  2. You must include `anthropic_version: "vertex-2023-10-16"` in the request body — Vertex requires it
  3. The body is identical to Anthropic API requests, no changes
  4. You must encrypt the body with a public key first
推荐答案B
原生 Vertex 请求体必须带 `anthropic_version: "vertex-2023-10-16"`,否则被拒。SDK 自动填这个字段;自己拼请求体时容易漏。Native REST 是允许的(A 错);与 Anthropic API 不完全相同(C 错);不需要客户端加密(D 错)。
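自己拼原生请求体时,大致形状如下(messages 与 max_tokens 为示意;注意 Vertex 原生调用中 model 在 URL 路径而非 body,这也是与直连 API 的差异之一):

```python
import json

# Vertex 原生 REST 请求体草图
body = {
    "anthropic_version": "vertex-2023-10-16",  # Vertex 必填;Anthropic SDK 会自动填上
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello"}],
}
payload = json.dumps(body)  # POST 到 Model Garden 端点;model 在 URL 路径中指定
```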
Question 6

Migrating an Anthropic-API-built app to Vertex. Which step is most often skipped and most likely to break production?

  1. Renaming the GCP project
  2. Rotating the (no-longer-used) Anthropic API key
  3. Re-running your eval set against the Vertex-versioned model and target region — quotas, latencies, and (rarely) behavior nuances differ
  4. Switching all calls to Sonnet
推荐答案C
迁移最容易漏的是「在新平台跑一遍 eval」——配额限制、区域可用性、流式行为、版本钉扎都可能变。项目改名(A)和密钥轮换(B)是周边动作;强制 Sonnet(D)是与迁移无关的选择。生产事故最常源于「不重测 eval 就上线」。

📦 Bedrock 课程

Skilljar「Claude with Amazon Bedrock」— 7 个 Quiz + 1 个 Final Assessment · 42 题 · 与 Vertex 课程同主题但题面完全不重复,可作互补练习

Quiz on Working with the API Bedrock

4 题 · Skilljar Bedrock 课程
Question 1

You're building a chat app and users complain that responses take too long to appear. What feature should you implement?

  1. Send requests faster
  2. Add a loading spinner
  3. Make the messages shorter
  4. Use streaming to show text as it's generated
推荐答案D
Streaming 是体感延迟的标准解——token 边生成边推送,用户看到"打字效果"而非长时间等待。Loading spinner(B)只是装饰,不解决根本问题。
Question 2

You're using Claude to extract data from documents and need the same consistent format every time. What temperature setting should you use?

  1. Temperature close to 0 for consistent, predictable outputs
  2. Temperature 0.5 for balanced responses
  3. Temperature 1.0 for maximum creativity
  4. Temperature doesn't matter for data extraction
推荐答案A
数据抽取需要可复现的精确输出——低温(接近 0)让模型每次都选最高概率 token,保证格式稳定。高温引入随机性,破坏数据 pipeline 一致性。
Question 3

You want to build a customer service bot that only talks about your company's products and stays professional. What's the best approach?

  1. Tell users to only ask product questions
  2. Add "be professional" to every user message
  3. Use a system prompt that makes Claude act like a customer service representative
  4. Set the temperature to maximum creativity
推荐答案C
System prompt 是定义角色与边界的标准位置——"你是 X 公司客服代表,只回答产品相关问题…"——比每条消息重复(B)高效,比依赖用户自律(A)可靠。
Question 4

You're building a chatbot where users ask follow-up questions. A user asks "What's 2+2?" and then asks "Add 5 to that." What do you need to do for the second question to make sense?

  1. Send both the first question and Claude's previous answer along with the second question
  2. Restart the conversation from the beginning
  3. Wait 30 seconds before sending the second question
  4. Send only the second question to Claude
推荐答案A
Claude API 是无状态的——每次请求必须把完整 messages 数组(含之前的 user + assistant 轮次)一起发送。"Add 5 to that"中的"that"只能从历史里推断。
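无状态意味着第二次请求的 messages 数组长这样(真实 SDK 调用见注释):

```python
# 第二轮请求必须携带完整历史,Claude 才能解析 "that" 指什么
messages = [
    {"role": "user", "content": "What's 2+2?"},
    {"role": "assistant", "content": "4"},         # 上一轮 Claude 的回答
    {"role": "user", "content": "Add 5 to that"},  # "that" 只能从历史推断
]
# 真实代码:client.messages.create(model=..., messages=messages, max_tokens=...)
```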

Quiz on Prompt Engineering Bedrock

4 题 · Skilljar Bedrock 课程
对应本地卷: V8·L02 Prompt 工程
Question 1

You want an AI to write movie reviews in a specific style. What's the best way to show the AI exactly what you want?

  1. Use very strict formatting rules
  2. Tell it to copy famous movie critics
  3. Describe the style in great detail
  4. Give it a sample movie review as an example
推荐答案D
Few-shot example——对于"风格""口吻"这种难以言说的属性,举一个真实样本远胜于抽象描述。一个示例胜过千言万语,正是 few-shot prompting 的核心理由。
Question 2

You're improving a prompt that generates workout plans. What should you do after writing your first version?

  1. Test it, see how well it works, then improve it
  2. Use it immediately for all workouts
  3. Write five different versions at once
  4. Ask other people to guess what it does
推荐答案A
Eval-driven prompt iteration——写完先测,看输出问题再针对性修改。盲目并行写五版(C)或直接上线(B)都是无效循环。
Question 3

You're asking an AI to analyze a long customer review mixed in with your instructions. What helps the AI understand which part is the review?

  1. Put the review at the very end
  2. Put the review between XML tags like <review></review>
  3. Write the review in a different font
  4. Make the review all uppercase
推荐答案B
XML 标签是 Anthropic 推荐的内容隔离方式——Claude 训练时大量见过此格式,能精准识别 <review> 是数据,外面是指令。位置(A)和大小写(D)都没有结构化能力。
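一个最小示例(review 文本为虚构):

```python
review = "The battery dies after two hours, very disappointed."

# 用 XML 标签把数据与指令隔离:标签内是待分析内容,标签外是指令
prompt = f"""Analyze the sentiment of the customer review below.

<review>
{review}
</review>

Respond with one word: positive, negative, or neutral."""
```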
Question 4

You want an AI to write a book summary. Which opening instruction works best?

  1. Write a three-paragraph summary of this book
  2. What do you think about summarizing things?
  3. I was wondering if you could maybe help with something about books?
  4. Books are interesting, aren't they?
推荐答案A
"Be clear and direct"原则:祈使句 + 明确数量约束(三段)。试探性提问(B/C)或闲聊(D)会让 Claude 不确定该做什么。

Quiz on Prompt Evaluations Bedrock

7 题 · Skilljar Bedrock 课程
Question 1

You need test data for evaluating your prompt. You want to create realistic examples quickly without writing them all by hand. What's the best approach?

  1. Use the same example over and over
  2. Use Claude to automatically generate test cases
  3. Ask your friends to write them
  4. Copy examples from the internet
推荐答案B
用 LLM 生成测试数据是行业标准做法——可指定边界、噪声、对抗场景批量产出,比人工写更快、覆盖更广。重复样本(A)和网络抄袭(D)都不可控。
Question 2

You've written a prompt for Claude and want to know if it works well. What's the difference between prompt engineering and prompt evaluation?

  1. Prompt engineering is for beginners, prompt evaluation is for experts
  2. Prompt engineering tests the prompt, prompt evaluation writes it
  3. Prompt engineering writes better prompts, prompt evaluation measures how well they work
  4. They're the same thing with different names
推荐答案C
两者是互补环节:engineering 是"创造"(写 prompt 的技巧),evaluation 是"度量"(测 prompt 效果)。形成"写→测→改"循环。
Question 3

In prompt evaluation, what is a "grader" used for?

  1. To save your prompts to a file
  2. To give objective scores measuring output quality
  3. To write better prompts automatically
  4. To make Claude respond faster
推荐答案B
Grader 是评估 pipeline 的打分组件——可以是代码(确定性检查格式/关键词)或 LLM(model-based grading),将主观感觉变成可对比的数值。
Question 4

You just wrote a prompt for your app. You test it once and it works great, so you decide to use it. What's the main risk with this approach?

  1. The prompt will stop working after a few days
  2. Users will provide unexpected inputs that break it
  3. It will be too expensive to run
  4. Other developers won't understand your code
推荐答案B
单测试样本 = 0 覆盖率。真实用户输入的多样性(边界、错别字、对抗、跨语言)远超开发者预期,"自测一次就上线"是最常见的生产事故根因。
Question 5

You're running a prompt evaluation. After creating your dataset and feeding questions through Claude, what's the next step?

  1. Feed the responses through a grader to get scores
  2. Write a completely new prompt
  3. Ask users what they think
  4. Publish your prompt immediately
推荐答案A
标准 eval 顺序:dataset → prompt → response → grader → score。grader 是从"输出"到"分数"的关键环节,没有它无法量化好坏。
Question 6

You're using another AI model to evaluate Claude's responses. To get better scores than just random numbers around 6, what should you ask the grader to provide?

  1. Just "good" or "bad"
  2. Strengths, weaknesses, reasoning, and a score
  3. A rewritten version of the response
  4. Only a number from 1-10
推荐答案B
让 LLM grader 先做 chain-of-thought(说优点缺点、给推理)再给分,分数明显更准。直接要数字(D)会得到中位偏 6 的随机分布——LLM 不"思考"无法精确评判。
Question 7

You want to check if Claude's output contains certain keywords and has the right length. Which type of grader should you use?

  1. Manual grader
  2. Code grader
  3. Model grader
  4. Human grader
推荐答案B
关键词包含、长度等可程序化的判断用 code grader(正则、字符串匹配、长度计数)——确定性、零成本、零延迟。Model grader 留给"语气是否专业""逻辑是否合理"等主观维度。
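一个最小的 code grader 草图(打分权重与长度上限为假设):

```python
def code_grade(output: str, keywords, max_len=500) -> float:
    # 确定性检查:关键词覆盖率与长度上限各占一半权重
    kw_score = sum(k.lower() in output.lower() for k in keywords) / len(keywords)
    len_ok = len(output) <= max_len
    return 0.5 * kw_score + 0.5 * len_ok

score = code_grade("Refunds are processed within 5 days.", ["refund", "days"])
```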

Quiz on Tool Use Bedrock

4 题 · Skilljar Bedrock 课程
Question 1

When using tools, what happens right after Claude asks for specific external data?

  1. Claude analyzes the original question again
  2. The user needs to approve the data request
  3. Your server runs code to fetch the requested information
  4. Claude provides the final answer immediately
推荐答案C
Tool use 协议:Claude 返回 tool_use block → 你的应用执行函数 → 把结果作为 tool_result 回传给 Claude → Claude 据此给最终回答。中间执行环节由你的服务器负责,不是 Claude。
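整个回路可以用桩数据走一遍(`fake_model` 代替真实 API 返回,结构仿照 tool_use / tool_result block;字段形状以官方文档为准):

```python
def fake_model(messages):
    # 占位:真实场景是 client.messages.create(...) 的返回
    last = messages[-1]["content"]
    if isinstance(last, list) and last[0].get("type") == "tool_result":
        # 已拿到工具结果,给出最终回答
        return {"stop_reason": "end_turn",
                "content": [{"type": "text", "text": "It's 18°C in Tokyo today."}]}
    # 第一轮:Claude 请求外部数据
    return {"stop_reason": "tool_use",
            "content": [{"type": "tool_use", "id": "t1",
                         "name": "get_weather", "input": {"city": "Tokyo"}}]}

def get_weather(city):
    return f"18°C in {city}"  # 你的服务器在这一步执行真实查询

messages = [{"role": "user", "content": "What's the weather today?"}]
resp = fake_model(messages)
while resp["stop_reason"] == "tool_use":
    block = resp["content"][0]
    result = get_weather(**block["input"])          # 你的代码执行函数
    messages.append({"role": "assistant", "content": resp["content"]})
    messages.append({"role": "user", "content": [   # 以 tool_result 回传
        {"type": "tool_result", "tool_use_id": block["id"], "content": result}]})
    resp = fake_model(messages)                     # Claude 据此给最终回答
```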
Question 2

You want to force Claude to use a specific tool for data extraction. Which toolChoice setting should you use?

  1. {"toolChoice": {"auto": {}}}
  2. {"toolChoice": {"tool": {"name": "tool-name"}}}
  3. {"toolChoice": {"any": {}}}
  4. {"toolChoice": {"required": true}}
推荐答案B
tool_choice 三种模式:auto(Claude 决定是否用工具)、any(必须用某个工具但 Claude 选)、tool(强制用指定工具)。"force specific tool"对应第三种。
Question 3

You ask Claude "What's the weather today?" but it says it doesn't have current weather information. What would tools help Claude do?

  1. Access live weather data from external sources
  2. Ask you to check the weather yourself
  3. Guess the weather based on the date
  4. Remember previous weather conversations
推荐答案A
Tool 突破训练数据时间截止——通过 weather API 工具,Claude 可调取实时数据。这就是 tool use 解决"知识截止"的本质。
Question 4

You're writing a tool function for Claude. What's the most important thing to include when creating the JSON schema?

  1. The programming language being used
  2. Detailed descriptions of what the tool does and its parameters
  3. Your contact information
  4. The function's source code
推荐答案B
Claude 只通过 description 字段决定何时调用以及怎么传参——清晰、明确的描述是工具被正确使用的关键。源码(D)Claude 看不到,语言(A)无关,联系方式(C)荒唐。

Quiz on Retrieval Augmented Generation Bedrock

5 题 · Skilljar Bedrock 课程
对应本地卷: V8·L04 RAG / Agent 模式
Question 1

What is contextual retrieval?

  1. A technique that adds context to document chunks before storing them to improve search accuracy
  2. A way to reduce the size of document chunks
  3. A system for automatically generating new content from existing documents
  4. A method for searching through documents faster
推荐答案A
Anthropic 提出的 Contextual Retrieval——存储前用 LLM 给每个 chunk 加一段"它在原文中的位置/角色"摘要,恢复被切断的上下文,索引和检索时都更准。
Question 2

What is a vector database in the context of RAG systems?

  1. A specialized database optimized for storing, comparing, and searching through numerical embeddings
  2. A system for backing up document files
  3. A regular database that stores text documents as files
  4. A database that only stores mathematical equations
推荐答案A
Vector DB(如 Pinecone、Weaviate、Milvus、pgvector)针对高维向量的余弦相似度查询做了专门优化——传统关系数据库扫表会慢得不可用。
Question 3

You're searching for a specific incident ID "INC-2023-Q4-011" in your documents. Semantic search isn't finding it well. What search method would work better?

  1. Converting everything to lowercase first
  2. Searching only document titles
  3. BM25 lexical search for exact keyword matching
  4. Using longer text embeddings
推荐答案C
Semantic search(embedding)擅长意思相近,但对 ID、SKU、错误码、专有名词等需要字面精确匹配的场景表现差——BM25 这类传统词频检索更可靠。两者结合即 hybrid search。
Question 4

You send the text "The cat sat on the mat" to an embedding model. What do you get back?

  1. A shorter version of the same text
  2. Keywords extracted from the text
  3. A list of about 1024 numbers representing the meaning
  4. A translation in another language
推荐答案C
Embedding model 输出固定维度(如 1024 维)浮点向量——这串数字就是文本的语义坐标,可用余弦相似度衡量两段文本的语义距离。Embedding 不是摘要也不是关键词。
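语义距离的计算可以用纯 Python 演示(4 维玩具向量代替真实的 1024 维 embedding,数值为虚构):

```python
import math

def cosine(a, b):
    # 两个 embedding 向量的余弦相似度:方向越接近,语义越近
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cat_mat = [0.9, 0.1, 0.3, 0.0]  # "The cat sat on the mat"
dog_rug = [0.8, 0.2, 0.4, 0.1]  # "A dog lay on the rug"(语义相近)
invoice = [0.0, 0.9, 0.1, 0.8]  # "Invoice #42 is overdue"(语义无关)
```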
Question 5

You have an 800-page financial report and want to ask an AI specific questions about it. What does RAG help you do?

  1. Send only the relevant sections to the AI for each question
  2. Make the document shorter by deleting pages
  3. Translate the document into simpler language
  4. Create a summary of the entire document
推荐答案A
RAG = Retrieval-Augmented Generation——按问题动态检索最相关的段落注入 prompt,让 Claude 看到"刚好够用"的上下文。这避开了 context window 上限、降低 token 成本、提升 lost-in-the-middle 抗性。

Quiz on Features of Claude Bedrock

4 题 · Skilljar Bedrock 课程
Question 1

You send Claude the same long document twice in a row. What does prompt caching help with?

  1. It stores your conversation history permanently
  2. It automatically summarizes repeated content
  3. It reduces the document's file size
  4. It saves the computational work from processing the text in the document
推荐答案D
Prompt caching 缓存的是 KV cache(注意力计算的中间结果),第二次请求复用 prefix 的计算,跳过 forward pass,命中部分 token 价格降至原价 10%。不是"对话记忆"(A)。
Question 2

You've optimized your prompt but Claude still isn't accurate enough on a complex task. What should you consider next?

  1. Rewrite the prompt completely from scratch
  2. Break the task into smaller pieces
  3. Use extended thinking to improve accuracy
  4. Switch to a different AI model
推荐答案C
Extended Thinking 让 Claude 在回答前显式做长推理(thinking block),针对数学、复杂规划、多步分析等任务能显著提升准确率。任务拆分(B)也可行但更费工——先试 thinking。
Question 3

You want to cache a short message that's 500 tokens long. What will happen?

  1. It will be automatically expanded to meet requirements
  2. It won't be cached because it's too short
  3. It will be cached normally
  4. It will be cached at half price
推荐答案B
Anthropic 文档:默认模型最低缓存 prefix 1024 tokens(部分小模型 2048)。500 tokens 低于阈值,cache_control 会被静默忽略——cache 不生效。
Question 4

What is an effective technique for increasing Claude's effectiveness with images?

  1. Uploading more images
  2. Using prompt engineering techniques
  3. Using JPEG instead of PNG images
  4. Providing zoomed-in images
推荐答案B
Vision 任务的提升点不在图片格式或数量,而在 prompt——明确说要看什么、给推理步骤、用 chain-of-thought。这是文字-视觉联合任务的通用规律。

Quiz on Model Context Protocol Bedrock

8 题 · Skilljar Bedrock 课程
Question 1

You're building a chat app where users ask Claude about their GitHub data. Without MCP, what's the main problem you'd face?

  1. Claude can't connect to the internet
  2. GitHub doesn't allow API access
  3. You'd have to write and maintain all the GitHub tool functions yourself
  4. Users can't type GitHub questions
推荐答案C
MCP 解决的是 N×M 集成问题——没有 MCP 时每个应用都得自己实现 GitHub 工具的 schema、调用、错误处理。有了 MCP,社区维护一个 GitHub Server,所有兼容客户端即可使用。
Question 2

You're creating an MCP server tool using the Python SDK. What's the easiest way to define a new tool?

  1. Create a separate configuration file
  2. Send HTTP requests to register tools
  3. Write complex JSON schemas manually
  4. Use the @mcp.tool decorator on a function
推荐答案D
Python SDK 的 @mcp.tool 装饰器自动从函数签名(参数、类型注解、docstring)生成 JSON Schema——无需手写 schema。这是 MCP SDK 设计的核心便利。
Question 3

Claude automatically decides to use a calculator tool when you ask "What's 15 × 23?" Who is controlling this tool usage?

  1. The MCP server providing the tool
  2. The application showing the chat
  3. Claude (the AI model) itself
  4. The user who asked the question
推荐答案C
Tools 是 model-controlled 原语——由 Claude(模型)自主决定何时调用。这与 Resources(app-controlled)和 Prompts(user-controlled)形成 MCP 三类原语的完整对照。
Question 4

You just wrote an MCP server and want to test if your tools work correctly. What's the best first step?

  1. Ask other developers to try it
  2. Connect it to Claude immediately
  3. Write unit tests for each function
  4. Use the MCP Inspector in your browser
推荐答案D
官方 MCP Inspector(npx @modelcontextprotocol/inspector)是浏览器调试 UI——可直接调用 server 的 tools / 读取 resources / 触发 prompts,看到原始 JSON-RPC 消息。开发期最快验证手段。
Question 5

You want to let users type "@document_name" to automatically include document content in their message. Should you use a tool or a resource?

  1. Tool - because Claude needs the information
  2. Resource - because documents are files
  3. Resource - because the app fetches data for the UI
  4. Tool - because it involves documents
推荐答案C
区分关键不在"是文件还是数据",而在"谁控制何时取用"。@提及 由 app 解析并主动注入上下文 → 这是 app-controlled = Resource。Tool 是 Claude 自主决定调用,与本场景不符。
Question 6

You're running an MCP client and server on the same computer. How do they most commonly communicate with each other?

  1. Using Bluetooth connection
  2. Through standard input/output
  3. Through email messages
  4. By writing files to disk
推荐答案B
本机进程间通信用 stdio transport——客户端把 server 作为子进程启动,通过标准输入输出收发 JSON-RPC 消息。简单、零配置、零网络开销。远程 server 用 SSE/HTTP transport。
Question 7

Your MCP client needs to find out what capabilities an MCP server offers. What message type should it send?

  1. GetServerInfo
  2. CheckCapabilities
  3. ListToolsRequest
  4. CallToolRequest
推荐答案C
MCP 协议定义了 ListToolsRequest(列出可用工具)、ListResourcesRequest 和 ListPromptsRequest——三类原语各有 list 方法。CallToolRequest 是调用特定工具,不是发现。A/B 不是协议方法名。
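在 wire 层,这些 SDK 请求类对应 JSON-RPC 方法名(ListToolsRequest 对应 tools/list)。下面是一个示意(工具名与 schema 为虚构):

```python
import json

# MCP 能力发现:client 发送 tools/list 请求(JSON-RPC 2.0)
list_tools = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# server 返回的典型响应形状
response = {
    "jsonrpc": "2.0", "id": 1,
    "result": {"tools": [
        {"name": "get_weather",
         "description": "Look up current weather for a city",
         "inputSchema": {"type": "object",
                         "properties": {"city": {"type": "string"}}}},
    ]},
}
wire = json.dumps(list_tools)  # 经 stdio 或 HTTP transport 发送
```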
Question 8

Your MCP server has tools, resources, and prompts. A user clicks a "Format Document" button in your app. Which primitive is being used?

  1. Resources - because it accesses documents
  2. Tools - because it formats something
  3. All three at the same time
  4. Prompts - because the user directly triggered it
推荐答案D
Prompts 是 user-controlled 原语——专门为"用户点按钮 / 输入 slash command 触发预定义工作流"而设计。点击"Format Document"按钮触发预设格式化流程,正是 Prompts 的典型用法。

Final Assessment Quiz Bedrock

6 题 · Skilljar Bedrock 课程 · 综合判断
Question 1

Porting from Anthropic API to Bedrock. The model identifier changes how?

  1. Stays the same — `claude-sonnet-4-5`
  2. Bedrock model ID prefixes the family with `anthropic.`, e.g. `anthropic.claude-sonnet-4-5-v2:0` — and may need an inference profile prefix for cross-region capacity
  3. Use the Vertex format `claude-sonnet-4-5@`
  4. Bedrock auto-resolves any string
推荐答案B
Bedrock model ID 形式:`anthropic.claude-X-vN:0`,跨区域调用时还可能需要 inference profile 前缀(如 `us.anthropic.claude-...`)。直接复用 Anthropic ID(A 错);Vertex 的 `@date` 是另一平台(C 错);Bedrock 不会替你猜(D 错)。
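用 boto3 Converse API 时,请求形状大致如下(model ID 为假设示例,须以 Bedrock 控制台实际列表为准;真实调用见注释):

```python
# Bedrock Converse 请求草图(纯字典示意)
request = {
    # "us." 为跨区域 inference profile 前缀,"anthropic." 为模型家族前缀
    "modelId": "us.anthropic.claude-sonnet-4-5-v2:0",  # 具体 ID 为假设
    "messages": [{"role": "user", "content": [{"text": "Hello"}]}],
    "inferenceConfig": {"maxTokens": 1024},
}
# 真实代码:boto3.client("bedrock-runtime").converse(**request)
```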
Question 2

Why does your first call to a new Claude model on Bedrock fail with `AccessDeniedException`?

  1. Bedrock requires an Anthropic API key as fallback
  2. You need to email AWS support for every model
  3. Bedrock requires you to request and grant model access in the Bedrock console before invocation; access is not auto-granted per account
  4. The model is unavailable in any AWS region
推荐答案C
Bedrock 的 model access 需要在控制台显式申请并被授予后才能调用——这是与 Anthropic 直连最大的运营差异。Anthropic key 不参与(A 错);不需要邮件 support(B 错);模型通常多区域可用,不是地理问题(D 错)。
Question 3

Production app on EC2 calling Bedrock. Best auth pattern?

  1. Hard-code AWS access key + secret in the app config
  2. Reuse the Anthropic API key as `AWS_ACCESS_KEY_ID`
  3. Attach an IAM role to the EC2 instance profile; SDK uses the role's temporary credentials
  4. Use the AWS root account credentials
推荐答案C
EC2 → IAM instance profile 是 AWS 生产标配——临时凭证、自动轮换、审计可追溯。硬编静态 key(A)是最常见的泄漏入口;Anthropic key 与 AWS IAM 完全不同(B 错);root account(D)严禁日常使用。
Question 4

Cross-region inference profile — what does it actually do?

  1. Reduces cost by 50% for the first 1k requests
  2. Routes invocations across multiple AWS regions for capacity & availability without code changes
  3. Replaces VPC endpoints
  4. Required to enable streaming
推荐答案B
Inference profile 把多个区域聚成一个逻辑端点——AWS 自动选可用容量,应用代码无需变化。它不打折(A 错);与 VPC endpoint(C)正交;streaming 不需要它(D 错)。代价:数据会跨 region 流转,需确认合规上允许。
Question 5

Compliance team needs evidence of every Claude invocation. Where on AWS?

  1. Bedrock console "History" tab — only shows the last 24h
  2. CloudTrail records all Bedrock control-plane and (with Data Events enabled) data-plane API calls — including `InvokeModel` and `Converse`
  3. Anthropic Console — Bedrock proxies traffic there
  4. It's not captured; deploy your own logging
推荐答案B
CloudTrail 是 AWS 全平台审计源——Bedrock 的 control plane 默认开启,data plane(如 InvokeModel)需要显式启用 Data Events。这是 Bedrock 优于直连的关键合规价值。Bedrock 控制台不是审计源(A 错);与 Anthropic Console 无关(C 错);自建日志(D)反而是「不知道有 CloudTrail」的应急手段。
Question 6

Migrating an Anthropic-API-built app to Bedrock. Most subtle compatibility risk to test for?

  1. System prompts are removed in Bedrock
  2. `max_tokens` is unsupported in Bedrock
  3. Tool use, content block, and request/response shapes have minor differences — eval set must re-pass on the Bedrock SDK end-to-end
  4. Streaming is unavailable on Bedrock
推荐答案C
大方向兼容(system prompt、max_tokens、streaming 都支持),坑在细节:tool use 协议字段、message content block、错误码、配额限制都可能微妙不同——必须在 Bedrock 上完整跑一遍 eval 才能信任迁移。System prompt 没被移除(A 错);max_tokens 支持(B 错);streaming 支持(D 错)。

📦 Vol.6 Introduction to subagents(补充练习)

Skilljar 无官方 quiz · 以下 8 题为本地补充练习,覆盖子智能体定义、创建、设计与使用 · 基于 Vol.6 四节教材

Vol.6 · Introduction to subagents — 练习

8 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

子智能体(subagent)与主会话最核心的区别是什么?

  1. 子智能体只能运行 Bash 命令,不能编辑文件
  2. 子智能体只能通过 slash command 手动调用
  3. 子智能体拥有独立的上下文、系统提示词和工具权限,与主会话隔离
  4. 子智能体的响应速度始终比主会话慢
推荐答案C
子智能体最关键的架构特征就是独立上下文——高输出搜索、日志分析、测试结果不会污染主会话。子智能体可配置独立的系统提示词和工具权限,这与主会话形成隔离。子智能体可以拥有编辑权限(A 错);除了手动调用,还可通过 description 匹配自动委派(B 错);速度并非定义特征(D 错)。
Question 2

以下哪项任务最适合委派给子智能体,而非在主会话中处理?

  1. 修改单个配置文件中的一个变量名
  2. 对全仓库进行安全扫描,只需要最终的问题清单
  3. 与用户持续讨论架构方案,需要多轮交互
  4. 需要密切参考主会话中刚刚讨论的上下文来做决策
推荐答案B
安全扫描会产生大量工具输出,如果放在主会话中会迅速占满上下文窗口。子智能体可以独立完成扫描,只将问题清单(结论)返回主会话——这正是「高输出、只需结论」的典型场景。小修改在主会话中更高效(A 错);需要持续交互或紧密上下文的任务不适合子智能体(C、D 错)。
Question 3

子智能体配置中,哪两个字段是必填的?

  1. name 和 model
  2. name 和 tools
  3. name 和 description
  4. description 和 model
推荐答案C
子智能体 YAML 配置中只有 name 和 description 是必填字段。name 是唯一标识符;description 直接影响自动委派——Claude Code 根据它来判断何时将任务委派给该子智能体。model、tools 等字段可选,省略时继承主会话设置。
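一个最小的子智能体配置示例(名称与正文为虚构,仅 name 和 description 为必填):

```markdown
---
name: security-scanner
description: Use after code changes to scan the repository for security issues. Read-only; returns a findings list only.
tools: Read, Glob, Grep
---
你是安全扫描子智能体。扫描仓库中的安全问题,只返回问题清单,不要修改任何文件。
```

tools 与 model 为可选字段,省略时继承主会话设置。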
Question 4

子智能体配置文件存放在 `.claude/agents/`(项目级)和 `~/.claude/agents/`(用户级)。以下说法正确的是?

  1. 用户级子智能体优先级始终高于项目级
  2. 项目级子智能体不能被团队成员使用
  3. 项目级子智能体可随仓库提交,团队成员 clone 后即可使用
  4. 两种级别的子智能体功能不同——用户级不支持 tool use
推荐答案C
项目级(.claude/agents/)子智能体位于仓库内,可通过 git 共享给团队。名称冲突时较高优先级会覆盖较低优先级(A 方向反了);项目级正是为了团队共享而设计(B 错);两种级别的功能相同(D 错)。
Question 5

子智能体的 `description` 字段写为 "Helps with coding tasks",这有什么问题?

  1. description 太长,会浪费上下文
  2. 太宽泛——几乎任何任务都可能触发,导致 Claude 误判委派时机
  3. description 不能包含英文,必须用中文
  4. description 中不能出现 "coding" 等通用词
推荐答案B
子智能体最常见的失败模式就是 description 过于宽泛。"Helps with coding tasks" 几乎匹配任何编程相关的任务,导致 Claude 在不合适的时机也尝试委派。好的 description 应该说明:何时使用、任务边界、输出格式。例如 "Use after code changes to review TypeScript files for security, correctness, and maintainability. Read-only."
Question 6

为代码审查(code review)子智能体配置工具权限时,最佳实践是什么?

  1. 赋予全部工具权限,以便子智能体可以自主修复发现的问题
  2. 仅赋予 Read、Glob、Grep 等只读工具——审查不应修改代码
  3. 赋予 Edit 和 Write 权限但禁止 Bash
  4. 不配置 tools 字段,让子智能体继承主会话的完整权限
推荐答案B
审查子智能体的职责是发现问题,不是修复问题。赋予写入权限可能导致未经批准的修改。最佳实践是仅赋予只读工具(Read、Glob、Grep),让审查结果返回主会话后由人工或专门的 fix 子智能体处理。权限最小化是子智能体设计的关键原则。
Question 7

设计子智能体时,单一职责原则(Single Responsibility Principle)的核心含义是?

  1. 每个子智能体只能使用一种工具
  2. 每个子智能体应专注于一类明确的任务,而非充当万能助手
  3. 每个项目中最多只能创建一个子智能体
  4. 子智能体一次只能处理一个文件
推荐答案B
单一职责原则在子智能体设计中意味着:每个子智能体有清晰的任务边界(如代码审查、安全扫描、测试诊断),而非一个通用的"帮我编程"助手。这直接关系到 description 的精确度和自动委派的准确率。子智能体可以使用多种工具(A 错),项目中可以创建多个(C 错),可以处理多个文件(D 错)。
Question 8

关于子智能体的前台(foreground)与后台(background)执行,以下哪种场景最适合后台执行?

  1. 代码修改任务,需要用户的权限确认
  2. 代码审查,结果需要立即讨论
  3. 对多个外部文档进行长时间研究搜索,结果供后续参考
  4. 执行数据库迁移,需要确认迁移结果成功才能继续
推荐答案C
后台执行适合长时间运行、不需要即时交互的独立任务——子智能体在后台搜索研究,主会话可以继续处理其他工作。需要权限确认(A)、即时讨论(B)、或阻塞等待结果(D)的任务应使用前台执行,以确保流程正确。

📦 Vol.7 Introduction to agent skills(补充练习)

Skilljar 无官方 quiz · 以下 8 题为本地补充练习,覆盖 Agent Skills 定义、创建、配置、边界与排错 · 基于 Vol.7 六节教材

Vol.7 · Introduction to agent skills — 练习

8 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

一个 Agent Skill 的入口文件是什么?

  1. README.md
  2. AGENT.md
  3. SKILL.md
  4. index.md
推荐答案C
Skill 是一个目录,其入口文件必须命名为 SKILL.md——包含 YAML frontmatter 和正文内容。Claude Code 通过这个文件名来发现和识别 Skill。其他文件名不会被识别为 Skill 入口。
Question 2

Skill 的 `description` 字段主要作用是什么?

  1. 在 UI 中展示 Skill 的版本信息
  2. 帮助 Claude 判断何时自动调用该 Skill
  3. 定义 Skill 可以使用的工具列表
  4. 指定 Skill 的输出文件格式
推荐答案B
description 是 Claude 判断「何时调用 Skill」的核心依据——它描述触发条件和使用场景,Claude 在每次对话中根据它来匹配当前任务。工具列表由 allowed-tools 控制(C 错);版本和输出格式不是 description 的职责。
Question 3

项目级 Skill 的正确存储路径是什么?

  1. ~/.claude/agents/<skill-name>/SKILL.md
  2. .claude/skills/<skill-name>/SKILL.md
  3. .claude/commands/<skill-name>/SKILL.md
  4. skills/<skill-name>/SKILL.md
推荐答案B
项目级 Skill 存放在仓库的 .claude/skills/ 目录下,可随 git 提交与团队共享。~/.claude/skills/ 是用户级路径(跨项目个人使用),~/.claude/agents/ 是子智能体路径(A 错)。.claude/commands/ 存放自定义命令,不是 Skill(C 错)。
Question 4

Skill 的 `allowed-tools` 字段的正确理解是什么?

  1. 它定义了 Skill 被禁止使用的工具列表
  2. 它预批准(pre-approve)所列工具——Skill 激活时这些工具无需用户额外确认
  3. 它限制 Skill 只能使用这些工具,其他工具被完全禁止
  4. 它继承自父智能体的工具权限,不可单独配置
推荐答案B
allowed-tools 是「预批准列表」,不是「限制列表」——Skill 激活时,列表中的工具被预授权无需逐次确认,但它并不阻止 Claude 使用其他工具。如果真正需要限制工具,应通过权限拒绝规则实现,而非依赖 allowed-tools。
Question 5

一个 Skill 执行时会产生副作用(如发送 Slack 消息、创建 PR),最佳实践是什么?

  1. 在 description 中写明 "This skill has side effects"
  2. 不配置 allowed-tools,让用户逐次确认
  3. 设置 disable-model-invocation: true,只允许用户手动调用
  4. 将 Skill 放在用户级路径避免被团队误用
推荐答案C
有副作用的 Skill(发送消息、创建 PR、修改外部系统)不应被 Claude 自动触发——disable-model-invocation: true 确保只有用户通过 slash command 手动调用时才会执行。仅在 description 中标明(A)或依赖权限确认(B)都不足以防止误触发。
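一个带副作用的 Skill 的 frontmatter 示例(名称与描述为虚构):

```markdown
---
name: release-announcer
description: Post a release summary to the team channel. Has side effects; manual invocation only.
disable-model-invocation: true
---
读取最新的 CHANGELOG 条目,按模板整理后发送到团队频道。
```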
Question 6

某团队需要给 Claude 接入公司内部数据库以实时查询客户信息,应选择哪种机制?

  1. 创建一个包含数据库查询步骤的 Skill
  2. 构建 MCP Server 暴露数据库查询工具
  3. 在 CLAUDE.md 中写入数据库 schema
  4. 创建一个包含数据库凭据的子智能体
推荐答案B
Skill 是文档/流程——Claude 可以阅读但无法执行实时查询。MCP(Model Context Protocol)正是为这种场景设计:提供实时工具接口,允许 Claude 调用外部系统获取数据。Skill 适合固定工作流和知识包,MCP 适合需要实时数据的外部系统集成。
Question 7

以下哪种方式是团队共享 Skill 的主要途径?

  1. 通过邮件发送 SKILL.md 文件给团队成员
  2. 将 Skill 放在项目 .claude/skills/ 目录,随 git 提交
  3. 上传到 Anthropic 官方 Skill 市场
  4. 使用 ~/.claude/skills/ 路径并导出环境变量
推荐答案:B
项目级 .claude/skills/ 可随仓库 git 提交,是团队共享 Skill 的主要方式——成员 clone 后自动获得。用户级(~/.claude/skills/)仅限个人使用(D 错)。更完整的打包分发可使用 Plugin(含 Skill + 子智能体 + 命令 + MCP 等),但目前没有官方 Skill 市场(C 错)。
Question 8

Skill 无法被 Claude 自动发现,排查的第一步应该检查什么?

  1. 文件路径是否正确,入口文件是否命名为 SKILL.md,frontmatter 是否有效
  2. description 是否包含足够多的关键词
  3. Skill 目录中是否包含 examples 文件夹
  4. CLAUDE.md 中是否声明了该 Skill
推荐答案:A
排查顺序:路径 → frontmatter → description → 调用方式。Skill 不出现的首要原因通常是路径不正确或入口文件未命名为 SKILL.md,或 YAML frontmatter 格式有误。description 问题会导致「不触发」而非「不出现」(B 属于后续排查步骤)。
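上述排查的前两步(路径与 frontmatter)可以写成一个极简检查脚本——不依赖 YAML 库,只做粗粒度检查,函数名与检查项为假设示例:

```python
from pathlib import Path

def check_skill(skill_dir: str) -> list[str]:
    """粗查一个 Skill 目录:入口文件名、frontmatter 分隔线、必填字段。"""
    problems = []
    entry = Path(skill_dir) / "SKILL.md"
    if not entry.is_file():
        return [f"缺少入口文件: {entry}"]
    lines = entry.read_text(encoding="utf-8").splitlines()
    stripped = [line.strip() for line in lines[1:]]
    if not lines or lines[0].strip() != "---" or "---" not in stripped:
        problems.append("frontmatter 缺少成对的 --- 分隔线")
    else:
        end = stripped.index("---") + 1
        frontmatter = lines[1:end]
        for key in ("name:", "description:"):
            if not any(line.strip().startswith(key) for line in frontmatter):
                problems.append(f"frontmatter 缺少字段 {key}")
    return problems
```

返回空列表表示路径与 frontmatter 检查通过;之后再排查 description 是否匹配当前任务(触发问题)。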

📦 Vol.11 AI Fluency: Framework & Foundations(补充练习)

Skilljar 仅含 conclusion wrap-up,无官方 quiz · 以下 10 题为本地补充练习,覆盖 4D 框架、生成式 AI 基础、能力与局限 · 基于 Vol.11 十三节教材

Vol.11 · AI Fluency: Framework & Foundations — 练习

10 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

AI Fluency 的四个目标是什么?

  1. 快速、准确、低成本、可扩展
  2. 有效(effective)、高效(efficient)、合乎伦理(ethical)、安全(safe)
  3. 自动化、增强、代理、协作
  4. 委派、描述、辨别、勤勉
推荐答案:B
AI Fluency 的四个目标是 effective(有效)、efficient(高效)、ethical(合乎伦理)、safe(安全)——涵盖能力、效率、价值观和安全四个维度。选项 C 是三种协作模式,选项 D 是 4D 框架的四个能力,不是目标层级。
Question 2

「人类设定目标、边界和行为规则,AI 更独立地工作」描述的是哪种协作模式?

  1. Automation(自动化)
  2. Augmentation(增强)
  3. Agency(代理)
  4. Assistance(辅助)
推荐答案:C
三种协作模式:Automation = AI 按指令执行特定任务;Augmentation = 人与 AI 共同思考、创造、分析;Agency = 人类设定目标和边界,AI 更独立地工作(如配置智能体定期整理研究资料)。Agency 模式下 AI 自主性最高,Diligence(勤勉/责任)也最重要。
Question 3

4D 框架中的四个能力分别是什么?

  1. Define, Design, Develop, Deploy
  2. Discover, Diagnose, Decide, Deliver
  3. Delegation, Description, Discernment, Diligence
  4. Data, Dialogue, Decision, Deployment
推荐答案:C
4D = Delegation(委派:谁来做、做到什么程度)、Description(描述:如何清晰沟通目标与过程)、Discernment(辨别:AI 输出是否可用)、Diligence(勤勉:如何负责、披露、安全使用)。四个 D 不是线性步骤,而是循环——评估不满意时回到描述或委派。
Question 4

Delegation(委派)的三要素不包括以下哪项?

  1. Problem Awareness(问题意识):我真正要达成什么?
  2. Platform Awareness(平台意识):AI 擅长和不擅长什么?
  3. Task Delegation(任务委派):哪些由人做、哪些由 AI 做?
  4. Performance Evaluation(绩效评估):AI 的输出效率如何?
推荐答案:D
Delegation 三要素:Problem Awareness(明确目标与约束)、Platform Awareness(了解工具能力与风险)、Task Delegation(分配人机分工)。常见错误是跳过前两步、直接进入分工。Performance Evaluation 属于 Discernment 而不是 Delegation。
Question 5

Description(描述)的三种类型是什么?

  1. 任务描述、格式描述、风格描述
  2. Product Description(产品描述)、Process Description(过程描述)、Performance Description(表现描述)
  3. 输入描述、处理描述、输出描述
  4. 目标描述、方法描述、结果描述
推荐答案:B
三种 Description:Product(要什么输出——格式、受众、长度、风格)、Process(怎么处理——步骤、方法、参考来源、约束)、Performance(怎么交互——语气、主动性、错误处理)。大多数人只写了 Product Description,加入 Process 和 Performance 可显著减少返工。
Question 6

Discernment(辨别)评估后,以下哪项不是正确的后续行动?

  1. 直接采用(低风险格式化输出)
  2. 要求修改(给出具体反馈)
  3. 返回重新描述或切换工具
  4. 忽略发现的问题,因为 AI 输出通常都会有小错
推荐答案:D
Discernment 的五个正确后续行动:直接采用、要求修改、补充信息重试、切换工具或外部验证、升级到人类专家。关键规则是每次 Discernment 必须有明确的下一步——仅仅说「不够好」而不采取行动等于没做。忽略已知问题的做法违反了 Diligence 原则。
Question 7

Description-Discernment 循环的核心逻辑是什么?

  1. 写好提示词 → AI 生成 → 如果不好就换一个模型
  2. 描述目标 → AI 生成/行动 → 评估产品/过程/表现 → 定位差距 → 调整描述或重新委派
  3. 描述任务 → AI 自动评估 → 人类最终确认
  4. 一次性写好详细提示词,避免反复修改
推荐答案:B
高质量 AI 协作通常需要 2-5 轮循环:描述→生成→评估产品/过程/表现→定位差距→调整描述或回到委派。如果连续两轮没有改进,问题可能在委派层面(任务本身不适合 AI)。这不是「写好一次提示词」就能完成的。
Question 8

Diligence(勤勉)的三种类型是?

  1. 数据勤勉、代码勤勉、内容勤勉
  2. 输入勤勉、处理勤勉、输出勤勉
  3. Creation Diligence(创建勤勉)、Transparency Diligence(透明勤勉)、Deployment Diligence(部署勤勉)
  4. 事前勤勉、事中勤勉、事后勤勉
推荐答案:C
三种 Diligence:Creation(选择什么 AI 系统?是否适合此任务?)、Transparency(谁需要知道 AI 参与了?如何披露?)、Deployment(我能为即将分享的输出负责吗?最终审查和标签)。三者覆盖了从选择工具到发布成果的完整责任链。
Question 9

以下关于大语言模型「幻觉」(hallucination)的说法,哪项最准确?

  1. 大模型不会产生幻觉,只要使用最新版本即可避免
  2. AI 输出可能流畅自信但包含虚构事实,因为模型基于模式生成而非从数据库检索
  3. 幻觉只会出现在中文等非英语语言中
  4. 只要在系统提示词中写「不要编造」,就可以杜绝幻觉
推荐答案:B
语言模型通过预测下一个 token 生成内容,并非从数据库中检索确定答案——因此可能生成流畅、自信但完全虚构的内容。幻觉是所有大语言模型的固有问题(A 错),不限于特定语言(C 错),也无法仅通过一句提示词彻底消除(D 错)。正确做法是要求引述来源、人工核实关键事实。
Question 10

以下哪项属于 Diligence 的常见失败模式?

  1. 使用 AI 生成初稿后人工修改
  2. 将未核实的 AI 输出作为事实直接发布
  3. 在报告中声明使用了 AI 辅助
  4. 根据任务风险等级决定 AI 的参与程度
推荐答案:B
Diligence 四大失败模式:将敏感数据放入不合适的工具、发布未核实的 AI 输出作为事实、在关键工作中隐瞒 AI 参与、让 AI 取代专业责任。「直接发布未核实输出」是最常见也最危险的失败——AI 输出即使流畅也需要独立验证。其他三项都是正确的 Diligence 实践。

📦 Vol.12 AI Fluency for educators(补充练习)

Skilljar 无官方 quiz · 以下 6 题为本地补充练习,覆盖教育者 4D 应用、课程设计、学术诚信与 AI 使用边界 · 基于 Vol.12 四节教材

Vol.12 · AI Fluency for educators — 练习

6 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

教育场景中 AI 应用的四个层次不包括以下哪项?

  1. 课程层(Course layer):教学大纲、活动设计
  2. 材料层(Material layer):阅读材料、示例、练习题
  3. 管理层(Administration layer):排课、考勤、预算
  4. 制度层(Institutional layer):政策草案、伦理规范
推荐答案:C
教育 AI 应用四层:课程层、材料层、学习层、制度层。这四个层次贯穿「AI 可以协助什么」和「什么必须由人判断」的边界。行政管理虽也涉及 AI,但不在这四个教育专业性层次之内——这四层关注的是教学核心而非行政效率。
Question 2

设计课堂 AI 活动的五个步骤中,「分配辨别任务」的目的是什么?

  1. 让学生比较不同 AI 工具的速度
  2. 让学生主动评估 AI 输出的准确性、偏见和适用性,培养批判性思维
  3. 让教师检查 AI 是否按预期工作
  4. 为 AI 输出打分,确定哪些学生可以用 AI
推荐答案:B
五步课堂设计:明确学习目标 → 设计 AI 参与点 → 分配辨别任务(学生主动评估 AI 输出)→ 反思讨论 → 迁移练习。辨别任务的核心目的是培养学生的批判性思维——不是让学生被动接受 AI 输出,而是主动判断其准确性、偏见和适用性。
Question 3

在 AI 时代,学术诚信的评估框架发生了什么转变?

  1. 从「是否注明引用来源」转变为「是否使用了 AI」
  2. 从「这是你自己的作品吗」转变为「使用方式是否促进了真正的学习」
  3. 从「是否抄袭」转变为「AI 使用比例是否低于 30%」
  4. 从「独立完成」转变为「鼓励使用 AI 完成全部作业」
推荐答案:B
旧框架关注「作品是否原创」(抄袭、买论文、作弊),新框架关注「使用方式是否促进真正学习」——AI 生成的文本归属问题、AI 辅助写作的边界、理解证明 vs. 仅会提示词。核心不再仅仅是「检测 AI」,而是「学生是否真正理解了、会判断了、能负责」。
Question 4

以下哪个学科中 AI 的「辨别需求」最高——即最需要人工核验 AI 输出?

  1. 创意写作——AI 可以提供风格建议
  2. 法律——AI 可能生成看似合理但引用错误的法律条文
  3. 编程入门——AI 可以解释语法错误
  4. 体育——AI 可以分析运动数据
推荐答案:B
法律、历史、医学属于「高辨别需求」学科——AI 在这些领域的事实、法规、案例方面的幻觉率较高,可能生成看似合理但引用错误的条文或判例。创意写作和编程属于「高价值」使用场景,但辨别需求低于这类事实密集型学科。体育等需要身体技能的领域 AI 价值有限。
Question 5

教育者视角的 Delegation-Diligence 循环中,第一步应该做什么?

  1. 让 AI 生成教学材料的第一版
  2. 教师先确定哪些任务不可委派给 AI(如学习目标设定、最终评分)
  3. 让学生试用 AI 工具并收集反馈
  4. 查阅其他学校的 AI 使用政策作为模板
推荐答案:B
教育者 Delegation-Diligence 循环的第一步是设定委派边界——学习目标设定、评估判断、学生反馈的最终决定是专业判断,不可委派给 AI。材料生成、题目变体、效率任务可以委派。关键在于「先保护学习目标,再用 AI 提升备课效率」,而非反过来。
Question 6

关于作业设计,材料中建议的核心理念是什么?

  1. 完全禁止 AI 在作业中的使用,以保持学术诚信
  2. 从「检测 AI 使用」转向「让学生展示理解过程、判断力和责任」
  3. 允许学生在任何作业中自由使用 AI,无需声明
  4. 为每份作业设置 AI 使用比例上限(如不超过 30%)
推荐答案:B
作业设计的关键转变:要求学生提交问题定义、源材料、草稿修改和反思——展示他们的理解和判断过程,而不仅是最终产品。让学生解释为什么接受或拒绝 AI 建议。融入课堂讨论、个人观察或本地数据,明确注明哪些 AI 使用允许、哪些需声明、哪些禁止。

📦 Vol.14 Teaching AI Fluency(补充练习)

Skilljar 无官方 quiz · 以下 6 题为本地补充练习,覆盖四种教学法、两大循环、评量策略与作业设计 · 基于 Vol.14 七节教材

Vol.14 · Teaching AI Fluency — 练习

6 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

教授 AI Fluency 时,教学的核心应该是什么?

  1. 教学生使用最新 AI 工具的具体按钮和功能
  2. 教可迁移的判断框架,而非特定工具的操作技能
  3. 教学生写出最优美的提示词模板
  4. 教学生比较不同 AI 模型的性能基准
推荐答案:B
AI 工具快速迭代,教具体按钮和提示词格式会很快过时。AI Fluency 教学的核心是传授可迁移的判断框架(4D)——让学生在任何工具、任何场景下都能做出可解释、可负责的 AI 协作决策。教的是判断力,不是工具操作。
Question 2

四种 AI Fluency 教学法中,哪种最接近真实 AI 协作场景但需要最多的课堂时间?

  1. Linear(线性):按顺序教四个 D
  2. Non-linear(非线性):根据任务需要跳跃式教学
  3. Focused(聚焦):在一个 D 上深度教学
  4. Loop-based(循环式):围绕两大循环设计活动
推荐答案:D
Loop-based(循环式)教学围绕 Delegation-Diligence 和 Description-Discernment 两大循环设计活动——最接近真实 AI 协作流程,但需要充分课时让学生经历完整循环。Linear 适合初学者但可能僵硬;Non-linear 适合有经验的学生;Focused 适合嵌入现有课程。
Question 3

Delegation-Diligence 循环教学活动中,最重要的产出是什么?

  1. 学生对分工判断和责任方式的清晰解释
  2. AI 生成的高质量最终作品
  3. 学生编写的精美提示词
  4. AI 使用的时间效率数据
推荐答案:A
Delegation-Diligence 循环教学的关键产出不是 AI 生成的内容,而是学生能否清晰解释他们的分工判断和能力责任。活动结构:给任务 → 写委派计划 → 使用 AI → 写勤勉笔记 → 反思改进。核心评估的是学生的判断力,而非 AI 输出质量。
Question 4

评量 AI Fluency 时,最稳健的策略是什么?

  1. 仅看最终作品质量
  2. 仅看学生的 AI 聊天记录
  3. 结合结果(最终作品)、过程(版本变化、决策记录)和反思(学生自述),三者并用
  4. 仅看学生对 AI 使用的自我反思
推荐答案:C
三种评量策略各有利弊:Outcome-based 只看最终作品看不到过程;Process-based 看聊天记录但依赖学生愿意记录;Reflection-based 看自述但可能流于空泛。最稳健的方案是三者结合——看作品、看过程、看学生能否解释自己的 AI 决策。
Question 5

AI Fluency 作业设计的三个原则是什么?

  1. 简单、快速、可自动评分
  2. 真实性(Authenticity)、迭代(Iteration)、文档化(Documentation)
  3. 标准化、统一化、集中化
  4. 开放性、创造性、协作性
推荐答案:B
三个设计原则:Authenticity(作业模拟真实场景,不为用 AI 而用 AI)、Iteration(含多轮尝试与修订,提交版本变化)、Documentation(学生记录 AI 使用过程和决策)。如果作业只要求提交最终答案,就难以评量 AI Fluency——过程与产品同等重要。
Question 6

分析 AI 对学科的影响时,以下哪种判断是正确的?

  1. 所有学科的课程内容都应保持不变,只需在考试中禁用 AI
  2. 学科中容易被 AI 替代的操作需要更新教学内容,难以替代的判断力则更加珍贵
  3. 所有学科的评估方式都应改为口试
  4. AI 对 STEM 学科没有影响,只影响人文学科
推荐答案:B
AI 对学科的影响分三类:自动化操作(格式化、草稿生成、语言润色)可能被替代;增强能力(批判性判断、问题定义、来源验证)更加重要;稳定核心(学科价值观、方法论、证据标准、伦理责任)不变。如果一门课只训练 AI 可替代的操作,需要更新;如果训练难以替代的判断力,价值更高。

📦 Vol.15 AI Fluency for students(补充练习)

Skilljar 无官方 quiz · 以下 6 题为本地补充练习,覆盖学生 4D 框架、学习伙伴、职业规划与人在回路中 · 基于 Vol.15 五节教材

Vol.15 · AI Fluency for students — 练习

6 题 · 本地补充练习(Skilljar 无官方 quiz)
Question 1

学生在学习场景中使用 AI 时,最关键的边界是什么?

  1. 每天使用 AI 的时间不超过 2 小时
  2. 增强(Augmentation)vs 自动化(Automation)——AI 应让你更有能力,而非替你完成该学的内容
  3. 只能使用学校指定的 AI 工具
  4. 使用 AI 时必须全程录屏以便教师检查
推荐答案:B
学生学习场景中最重要的边界是 Augmentation vs Automation:增强让你更有能力(AI 解释概念后你用自己的话重述、AI 指出逻辑漏洞后你自己修改);自动化让 AI 替你完成本该学习的内容(直接复制 AI 答案、用 AI 写读后感替代阅读)。铁律是:不管 AI 参与多少,所有提交内容必须是你能够解释、应用和负责的。
Question 2

「铁律」(Iron Rule)的核心含义是什么?

  1. 永远不要使用 AI 完成任何作业
  2. 无论 AI 参与多少,所有提交内容必须是你能够解释、应用和负责的
  3. AI 只能用于课外学习,不能用于课堂相关任务
  4. 每次使用 AI 后必须向老师提交使用报告
推荐答案:B
铁律(Iron Rule)不禁止使用 AI,而是设立问责标准——你可以用 AI 解释概念、检查逻辑、提供练习,但最终提交的内容必须是你能独立解释、应用和为之负责的。如果做不到这三条,说明 AI 可能正在绕过你的学习过程,而非增强它。
Question 3

学生版 Learning Context Document(学习背景文档)不包含以下哪项?

  1. 你是谁(年级、专业、当前课程)
  2. 学习目标(这次学习要真正掌握什么)
  3. 当前水平(已知什么、卡在哪里)
  4. 过往所有考试成绩和 GPA
推荐答案:D
Learning Context Document 包含五要素:身份(年级/专业/课程)、学习目标、当前水平与困难点、偏好风格(类比/例子/推导/提问)、交互要求(不给答案,引导思考)。目的不是做全面档案,而是为每次学习建立清晰的对话背景——让 AI 像家教而非答题机一样回应。
Question 4

AI 作为学习伙伴时,以下哪种使用方式最有利于学习?

  1. 让 AI 生成完整答案后抄写到作业本上
  2. 让 AI 生成练习题(不含答案),自己先做,之后让 AI 给予反馈
  3. 考试时用 AI 实时查询答案
  4. 用 AI 代写读书笔记,节省时间多看几本书
推荐答案:B
好的 AI 学习使用模式:AI 生成问题但不给答案 → 学生先自己做 → AI 给反馈。这保持了学习的核心——学生自己在思考和实践。直接抄答案(A)、考试中用 AI(C)、替代阅读(D)都属于 Automation,绕过了学习过程。核验方式很简单:关闭 AI 后你还能独立完成吗?
Question 5

关于使用 AI 准备求职材料,以下哪项是正确的?

  1. AI 可以为没有实习经历的学生编造合理的项目经验
  2. AI 可以帮助改善简历的表达清晰度,但真实经历和个人声音必须由你保持
  3. 直接提交 AI 生成的通用求职信是最有效率的方式
  4. AI 可以替学生决定最适合的职业方向
推荐答案:B
AI 在职业规划中的三个角色:职业探索(解释行业、路径、技能要求)→ 你判断适配性;求职材料(改善结构、语调、清晰度)→ 你保持真实经历和个人声音;面试准备(模拟面试官、追问、给反馈)→ 你练习真实回答,不背 AI 脚本。AI 不能编造经历(A 错),不能替你决策(D 错),直接提交通用内容会失去个人辨识度(C 错)。
Question 6

提交 AI 辅助作业前的最终自检(Self-Check)不包含以下哪项?

  1. 我能在没有 AI 的情况下解释这个内容吗?
  2. 我知道哪些部分来自 AI,哪些是我自己的判断吗?
  3. 我核实了事实、来源或计算吗?
  4. 我使用了最新的 AI 模型版本吗?
推荐答案:D
提交前五个自检问题:能否独立解释?是否知道 AI 贡献与自己的判断的分界?是否核实了事实/来源/计算?课程或机构是否允许此类 AI 使用?如果老师问起过程,能否诚实解释?使用哪个 AI 模型版本不是重点——重点是你是否理解、判断、核实并能为所提交内容负责。