google-deepmind/onetwo v0.2.0
Repository: google-deepmind/onetwo
Tag: v0.2.0
Published: 2024-08-19T15:10:02Z
Prerelease: no
Release notes:
- Backends
- VertexAI: Add VertexAI chat support.
- Space healing: Add token/space healing options to builtin functions, including proper support for space healing in `llm.generate_text` and `llm.chat` of `GeminiAPI`.
- Core
- Caching: Enable loading from multiple cache files, while merging the contents. This is useful, for example, when collaborating in a group, where each person can save to a personal cache file, while loading from both their own and ones from teammates.
- Retries: Implement a generic `with_retry` decorator that automatically retries a given function with exponential backoff when an exception occurs, and enable this for the `GeminiAPI` and `OpenAIAPI` backends.
- Standard library
- Chain-of-thought: Define a library of helper functions and data structures for implementing chain-of-thought [[Wei, et al., 2023]](https://arxiv.org/pdf/2201.11903) strategies, including off-the-shelf implementations of several commonly-used approaches, and add a corresponding section to the tutorial colab. Variants illustrated include:
- Chain-of-thought implemented using a prompt template alone (with 2 calls).
- Chain-of-thought implemented using a prompt template (1 call) + answer parser.
- Few-shot chain-of-thought.
- Few-shot exemplars represented as data, so as to be reusable across different styles of prompt template.
- Few-shot chain-of-thought with different exemplars specified for each question (e.g., for dynamic exemplar selection).
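The "prompt template (1 call) + answer parser" and "exemplars represented as data" variants can be pictured with a minimal, library-independent sketch. It does not use the onetwo API: `Exemplar`, `render_prompt`, `parse_reply`, `chain_of_thought`, the `Final answer:` marker, and the `generate` callable are all hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Exemplar:
    """A few-shot exemplar kept as plain data, independent of any template."""
    question: str
    reasoning: str
    answer: str


def render_prompt(question: str, exemplars: Sequence[Exemplar]) -> str:
    """Render the exemplars and the target question into one prompt string."""
    parts = [
        f"Q: {ex.question}\nReasoning: {ex.reasoning}\nFinal answer: {ex.answer}\n"
        for ex in exemplars
    ]
    # The prompt ends right where the model is expected to start reasoning.
    parts.append(f"Q: {question}\nReasoning:")
    return "\n".join(parts)


def parse_reply(reply: str) -> Tuple[str, str]:
    """Split a reply into (reasoning, answer) at the 'Final answer:' marker."""
    match = re.search(r"Final answer:\s*(.+)", reply)
    if match is None:
        # No marker found: treat the whole reply as reasoning, with no answer.
        return reply.strip(), ""
    return reply[: match.start()].strip(), match.group(1).strip()


def chain_of_thought(
    question: str,
    exemplars: Sequence[Exemplar],
    generate: Callable[[str], str],  # hypothetical stand-in for an LLM call
) -> Tuple[str, str]:
    """Single-call chain-of-thought: one generation, then parse the answer."""
    return parse_reply(generate(render_prompt(question, exemplars)))
```

Because the exemplars are passed in on every call, the same function covers the case where different exemplars are chosen for each question; a different prompt style would only need a different `render_prompt`.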
- Self-consistency: Define a generic implementation of self-consistency [[Wang, et al., 2023]](https://arxiv.org/pdf/2203.11171) and add a corresponding section to the tutorial colab. In this implementation, we reformulate self-consistency as a meta-strategy that wraps some underlying strategy that outputs a single answer (typically via some kind of reasoning path or other intermediate steps) and converts it into a strategy that outputs a marginal distribution over possible answers (marginalizing over the intermediate steps). The marginal distribution is estimated via repeated sampling from the underlying strategy. Supported variations include:
- Self-consistency over chain-of-thought (like in the original paper).
- Self-consistency over a multi-step prompting strategy (e.g., ReAct).
- Self-consistency over a multi-arg strategy (e.g., Retrieval QA).
- Self-consistency over diverse parameterizations of the underlying strategy (e.g., with samples taken using different choices of few-shot exemplars).
- Self-consistency over diverse underlying strategies.
- Self-consistency with answer normalization applied during bucketization.
- Self-consistency with weighted voting.
- Evaluation based on the consensus answer alone.
- Evaluation based on the full answer distribution (e.g., accuracy@k).
- Evaluation taking into account a representative reasoning path.
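The reformulation of self-consistency as a meta-strategy can be sketched generically as follows. This is an illustration under stated assumptions, not the onetwo implementation: `self_consistency`, `consensus`, `accuracy_at_k`, and their parameters are hypothetical names, and the underlying strategy is modeled as a zero-argument callable that returns one sampled answer per call (a caller would bind the question and any other arguments beforehand).

```python
from typing import Callable, Dict, Hashable


def self_consistency(
    strategy: Callable[[], str],
    num_samples: int,
    normalize: Callable[[str], Hashable] = lambda answer: answer,
    weight: Callable[[str], float] = lambda answer: 1.0,
) -> Dict[Hashable, float]:
    """Estimate a marginal distribution over answers by repeated sampling."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    totals: Dict[Hashable, float] = {}
    for _ in range(num_samples):
        answer = strategy()
        # Answers that normalize to the same value share one bucket.
        bucket = normalize(answer)
        totals[bucket] = totals.get(bucket, 0.0) + weight(answer)
    total = sum(totals.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {bucket: w / total for bucket, w in totals.items()}


def consensus(distribution: Dict[Hashable, float]) -> Hashable:
    """Return the answer with the highest estimated probability."""
    return max(distribution, key=distribution.get)


def accuracy_at_k(
    distribution: Dict[Hashable, float], target: Hashable, k: int
) -> float:
    """Return 1.0 if the target is among the k most probable answers, else 0.0."""
    top_k = sorted(distribution, key=distribution.get, reverse=True)[:k]
    return 1.0 if target in top_k else 0.0
```

With the default `weight`, this reduces to majority voting; supplying a non-constant `weight` gives weighted voting, and supplying `normalize` merges surface variants of the same answer during bucketization.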
- Evaluation
- Add a new `agent_evaluation` library, which is similar to the existing `evaluation` library, but automatically packages the results of the evaluation run in a standardized `EvaluationSummary` object, with options to include detailed debugging information for each example. This can be used for evaluating arbitrary prompting strategies, but contains particular optimizations for agents.
- Add library for writing an `EvaluationSummary` to disk.
- Visualization
- Update `HTMLRenderer` to support rendering of `EvaluationSummary` objects, to render structured Python objects in an expandable/collapsible form, and to allow specification of custom renderers for other data types.
- Documentation
- Add sections to the tutorial colab on chain-of-thought, self-consistency, and swapping backends.
- Other
- Various other bug fixes and incremental improvements to `VertexAIAPI` backend, `ReActAgent`, caching, composables, and handling of multimodal content chunks.
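For readers unfamiliar with the retry behavior listed under Core, the general technique (retrying a function with exponential backoff when an exception occurs) can be sketched as below. The release notes do not show the signature of `with_retry`, so this sketch deliberately uses a different, hypothetical name, `retry_with_backoff`, and all of its parameters are assumptions rather than the library's API.

```python
import functools
import time
from typing import Callable, Tuple, Type


def retry_with_backoff(
    max_retries: int = 3,
    initial_delay: float = 1.0,
    backoff_factor: float = 2.0,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
):
    """Build a decorator that retries the wrapped function on failure.

    Hypothetical sketch: the delay starts at `initial_delay` seconds and is
    multiplied by `backoff_factor` after every failed attempt.
    """

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            delay = initial_delay
            for attempt in range(max_retries + 1):
                try:
                    return fn(*args, **kwargs)
                except exceptions:
                    if attempt == max_retries:
                        # Out of retries: propagate the last exception.
                        raise
                    sleep(delay)
                    delay *= backoff_factor

        return wrapper

    return decorator
```

The `sleep` parameter is injectable so that the backoff schedule can be checked without actually waiting; production variants of this pattern often add random jitter to the delay, which is omitted here for brevity.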