
Pass@k and Pass^k Tell Different Stories from Mean Success Rate
October 30, 2025 | 6 min read

These metrics capture coverage and reliability.
Under the sea, in the hippocampus's garden...

