Summary of "FractalMath: A Multi-Agent Approach to Solving Arithmetic Reasoning Math Problems"
This video presents a detailed discussion on a multi-agent approach to solving arithmetic mathematical problems using language models (LMs) and agent-based systems. The speakers explain the motivation, methodology, experimental results, comparisons with other models like GPT, and future directions.
Main Ideas and Concepts
Context and Motivation
- The focus is on arithmetic problems, which are simpler than more complex mathematical domains like algebra, geometry, or logic.
- Traditional large language models (LLMs) such as GPT-4, while powerful, have limitations in reliably solving arithmetic problems, especially when exactness is required.
- There is a fundamental misunderstanding in the market about the functional purpose of LLMs—they are often used for tasks they are not well-suited for, such as precise mathematical or engineering problem-solving.
- Existing “Agent” or “Autonomous Agent” projects (e.g., AutoGPT, AutoGen) mostly operate by decomposing tasks into subtasks but lack true goal-setting, negotiation, and self-organization capabilities that define a multi-agent system.
Multi-Agent System Definition and Approach
- Agents are defined as independent modules with properties such as goal-setting, negotiation, and self-organization.
- The system uses multiple specialized agents, each responsible for a domain or part of the problem.
- The approach involves the following steps:
  1. The input problem, stated in natural language, is passed to a condition-rewriting module that clarifies and enriches the problem statement.
  2. The enriched statement is sent to a multi-agent system in which agents correspond to specific knowledge areas (initially arithmetic).
  3. The agents negotiate and form a strategy by combining pieces of known strategies.
  4. The strategy is adapted and rewritten into a command language understood by a processor.
  5. The processor executes the commands, which may involve calculators, symbolic algebra systems, or interpreters.
  6. Multiple answer candidates are generated, and an arranger module selects among them.
  7. The final answer is presented to the user.
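The end-to-end flow described above can be sketched in Python. Every function, module name, and command here is a hypothetical stand-in, since the talk does not expose FractalMath's actual interfaces:

```python
# Illustrative sketch of the described pipeline; all names and the toy
# command language are assumptions, not the actual FractalMath API.

def rewrite_condition(problem: str) -> str:
    """Condition-rewriting stage: clarify and enrich the statement."""
    return problem.strip()  # placeholder for the LM-based rewriting step

def form_strategy(problem: str) -> list[str]:
    """Agents negotiate and assemble a strategy from known pieces.
    Here: a fixed toy strategy in a pseudo command language."""
    return ["LOAD 5", "ADD 3", "RETURN"]

def execute(commands: list[str]) -> int:
    """Processor stage: run the command list, calculator-style."""
    acc = 0
    for cmd in commands:
        op, *args = cmd.split()
        if op == "LOAD":
            acc = int(args[0])
        elif op == "ADD":
            acc += int(args[0])
        elif op == "RETURN":
            return acc
    return acc

def solve(problem: str) -> int:
    clarified = rewrite_condition(problem)
    strategy = form_strategy(clarified)
    return execute(strategy)

print(solve("Masha had 5 apples and was given 3 more. How many now?"))  # 8
```

The key point the speakers emphasize is that the final computation happens in the processor, not inside an LLM, which is what makes the arithmetic exact.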
Key Features of the System
- Strategy formation is modular: strategies are assembled from pieces rather than synthesized monolithically.
- The processor can detect if a problem is unsolvable or contradictory.
- The system uses vector models (not LLMs) for understanding problem statements and synthesizing formulas.
- Agents communicate to negotiate and improve the solution iteratively.
- Designed to reduce hallucinations and improve reliability compared to single large language models.
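Assembling a strategy from reusable pieces rather than generating it monolithically might look like this minimal sketch (the piece names and command format are assumptions, not taken from the talk):

```python
# Illustrative only: strategies as reusable command fragments that agents
# concatenate and parameterize, rather than one monolithically generated program.
PIECES = {
    "total_of_two_groups": ["LOAD {a}", "ADD {b}"],
    "remaining_after_removal": ["LOAD {a}", "SUB {b}"],
}

def assemble(piece_names, **values):
    """Build a command list by filling in known strategy pieces."""
    commands = []
    for name in piece_names:
        commands += [step.format(**values) for step in PIECES[name]]
    return commands

print(assemble(["total_of_two_groups"], a=5, b=3))  # ['LOAD 5', 'ADD 3']
```

Because each piece is a known-good fragment, the space of possible strategies is constrained, which is one plausible reason for the stability the speakers report.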
Experimental Results
- Tested on a third-grade arithmetic dataset (simple problems involving addition, subtraction, multiplication with natural numbers).
- Achieved 100% accuracy on 1880 test problems after training on only 5 examples, demonstrating strong generalization.
- Compared to GPT models:
  - GPT-3 and GPT-4 sometimes make arithmetic or reasoning errors, especially with repetitive actions or tasks that require maintaining state.
  - The multi-agent system remains stable and accurate with repetitive steps.
- The system is resistant to changes in object names and some variations in problem wording.
- Challenges include:
  - Abstract or ill-defined objects (e.g., "questions" as countable items).
  - Logical paradoxes or trick problems not seen during training.
  - Problems requiring knowledge of operator precedence (order of operations).
- Compared with chain-of-thought prompting, the multi-agent approach showed more stable and accurate performance on simple arithmetic tasks.
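The operator-precedence challenge is concrete: `2 + 3 * 4` must evaluate to 14, not 20. A processor that delegates to a real expression parser gets precedence right by construction; a sketch using Python's own `ast` module (this evaluator is an illustration, not the system's actual processor):

```python
import ast
import operator

# Map AST operator node types to arithmetic functions.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate(expr: str):
    """Evaluate an arithmetic expression with correct operator precedence,
    inherited from Python's grammar rather than re-implemented."""
    def walk(node):
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval").body)

print(evaluate("2 + 3 * 4"))  # 14
```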
Limitations and Challenges
- Currently focused on arithmetic; not yet extended to algebra, geometry, logic, or physics problems.
- Can be confused by irrelevant or contradictory information added to the problem statement.
- Requires carefully designed agents and strategies; expanding to new domains needs new agent knowledge or adapters.
- Training data is small but carefully chosen to enable generalization.
- Not intended for use in educational cheating or high-stakes exams.
Use Cases and Future Directions
- Potential applications in engineering and CAD, where multi-agent systems can analyze drawings, access external data (e.g., material strength), and perform complex calculations.
- Plans to integrate symbolic algebra systems like Wolfram Alpha as processors.
- Ongoing research to improve multi-agent interaction, train adapters within LMs, and reduce hallucinations.
- Exploring scalability by adding more agents with domain-specific knowledge.
- Considering different LMs (e.g., Mistral, including quantized variants) and comparing their performance.
- Developing methods to synthesize readable text explanations without LM involvement.
Technical Details
- LMs are involved at multiple stages, primarily for:
  - Filtering and rewriting problem conditions to remove irrelevant information.
  - Selecting and substituting variables from a knowledge base.
  - Assisting in strategy formation by combining agent outputs.
- The processor executes commands in a pseudo-language derived from the synthesized strategy.
- Total token usage is roughly three times the input tokens due to multi-stage processing.
- The approach avoids relying solely on LLMs for final problem-solving to reduce errors and hallucinations.
Methodology / Process Overview
1. Problem Input: Receive the problem statement in natural language.
2. Condition Rewriting: Use an LM-based module to clean and enrich the problem statement, removing irrelevant or confusing information.
3. Agent Knowledge Assignment: Assign problem components to specialized agents by domain (initially arithmetic).
4. Strategy Formation: Agents communicate and negotiate to form a solution strategy. The strategy is assembled from pieces of known strategies rather than synthesized from scratch; multiple candidate strategies can be combined for robustness.
5. Strategy Adaptation: The combined strategy is adapted and rewritten into a command language.
6. Command Execution: Commands are executed by a processor module, which can be a calculator, a symbolic algebra system, a Python interpreter, or an external tool such as Wolfram Alpha.
7. Answer Generation and Selection: Multiple answer candidates are generated, and an arranger module selects the best one.
8. Output: Present the final answer to the user.
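The final answer-selection step could be as simple as a majority vote over candidates. The talk does not specify the arranger's actual criteria, so this is an illustrative sketch:

```python
from collections import Counter

def arrange(candidates):
    """Arranger sketch: pick the most frequent candidate answer.
    Ties are broken by first occurrence; real criteria are unspecified."""
    if not candidates:
        raise ValueError("no candidate answers")
    return Counter(candidates).most_common(1)[0][0]

print(arrange([8, 8, 7, 8]))  # 8
```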
Speakers / Sources Featured
- Evgeny – Moderator and CEO of company R, former Yandex webmaster.
- Viktor Nasko – Director of Research and Development at Fractal Tech.
- Zakhar Pamash – CEO of Development at Fractal Tech.
Summary
The video presents a multi-agent system for arithmetic problem-solving that outperforms traditional LLM approaches in accuracy and reliability on simple arithmetic datasets. It achieves this by decomposing problems into subtasks handled by specialized agents that negotiate and form solution strategies, which are then executed by processors capable of symbolic computation.
The system is robust to many variations but has limitations with abstract concepts and complex logical problems. Future work aims to expand domain coverage, improve training methods, integrate symbolic processors, and scale the agent network.
This approach highlights the potential of multi-agent architectures to overcome some inherent limitations of large language models in precise mathematical reasoning tasks.