Adventures in Symbolic Algebra with Model Context Protocol
I spent last weekend playing with this new MCP protocol all the kids are talking about. And it's fun, but a bit early and rough around the edges. The source code for this project is up on Github in case anyone wants to try it out.
- Github Source Code (
git clone https://github.com/sdiehl/sympy-mcp.git
)
MCP, if you're not familiar, is Anthropic's answer to the question: "How do we get an LLM to actually DO things instead of just TALK about doing them?" It's a protocol that allows language models to call external tools, much like how your friend who claims to know everything actually calls their more knowledgeable friend behind your back. The core idea of MCP is to standardize the way language models call external tools, instead of writing custom connectors for each AI model and each tool, you implement the protocol once on each side. It's the USB-C of AI tooling, if USB-C were still in its awkward adolescent phase.
Notably the MCP server runs locally on your machine letting the language model invoke arbitrary code and commands by using one of the LLM desktop clients to call out to the local server. This is probably a bad idea in general, and as dangerous and reckless as you might think, so there's a definite security concern here, be warned. But let's not let a potential rootkit get in the way of a fun weekend experiment.
My particular itch stemmed from watching Claude (and its cousins like o4-mini-high and DeepSeek-R1) really struggle with tensor calculus. If you've ever asked an LLM to perform complex symbolic manipulation, you know the drill: confident answers, beautiful LaTeX formatting, and results that would make your math professor weep bitter tears. These models, despite their impressive linguistic capabilities, are absolute disasters when it comes to keeping track of indices in tensor expressions or manipulating complicated algebraic forms. The expressions involved in even moderate general relativity problems are HUGE, with hundreds of terms and complex index gymnastics.
But we already have specialized tools that excel at this! Computer algebra systems like Mathematica, Sympy, xAct, Cadabra were built specifically for this purpose. So the obvious solution presented itself: let's expose these tools to the LLM through MCP and let each system do what it does best. The LLM handles the natural language understanding and planning, while the symbolic algebra system performs the actual mathematical manipulations with perfect precision.
Working with the MCP ecosystem is like visiting a frontier town in the Wild West. The documentation exists in the form of scattered campfire stories, the implementations have a distinctly "I wrote this at a 3 AM hackathon" vibe, and everything is strangely Node-heavy. This Node fixation likely stems from most MCP tools being designed to call REST services for cloud applications. Then there's the peculiar ecosystem of suspiciously self-referential products from companies that just happen to sell AI coding assistants. There's definitely a faint whiff of opportunism in the air.
Debugging an MCP server is a crazy exercise. You're essentially working with a stochastic black box that communicates through a complex web of JSON schemas attached to docstring annotations. When something goes wrong, good luck figuring out if it's your server, the client, the LLM's interpretation, or just the model having a laugh at your expense. The non-deterministic nature of the whole setup means that something can work perfectly five times in a row and then spontaneously fail on the sixth attempt because the random number seed on the inference server is different, the prompt vibes are off, or for reasons completely shrouded in the mists of the digital occult. Welcome to the fun future of software.
However, once you've made the ritual animal sacrifice to appease the non-determinism gods (I find that sacrificing a rubber duck while chanting "Ph'nglui mglw'nafh Roko's Basilisk R'lyeh wgah'nagl fhtagn, O Great Transformer of the Deep, may thy hallucinations be few and thy completions true" works best), you'll find the basic implementation is refreshingly straightforward, similar to FastAPI if FastAPI were designing its endpoints for a language model. Here's a simple example that highlights why this approach matters: Ask any LLM to factor a large integer, and watch it confidently fabricate entirely wrong answers. By design, transformers can't perform the arbitrary computation required for integer factorization. They've merely memorized some factorizations from the internet (and even those, poorly).
from mcp.server.fastmcp import FastMCP
import subprocess
mcp = FastMCP("Demo", instructions="You factor integers.")
def factor_number(number):
result = subprocess.run(['factor', str(number)], capture_output=True, text=True)
return result.stdout.strip()
@mcp.tool()
def factor(a: int) -> int:
"""Factor an integer"""
return factor_number(a)
With this simple tool, we can now expose the actual computation to the LLM using the GNU factor shell command. Instead of making up factors, the model recognizes when to delegate the task to a specialized tool.
When it works, it's pretty magical. The successful integration of these components (properly formatted JSON, accurate tool selection by the language model, and reliable server processing) demonstrates the potential of this architectural approach. While the technology is still in its early stages of development, the possibilities for integration with formal verification systems are particularly compelling. The combination of LLMs with theorem provers like Lean could enable natural language interfaces to formal mathematics, allowing automated reasoning systems to benefit from the language models' ability to understand informal mathematical concepts while maintaining rigorous logical foundations. Similarly, making computer algebra systems accessible through natural language could dramatically lower the barrier to entry for complex mathematical computation. However, significant research challenges remain in ensuring reliable and semantically correct interactions between these different systems.
Let me show you how this looks in practice with a classic example from physics: solving the damped harmonic oscillator with a forcing term. This is the bread and butter problem for undergraduate physics students. Here's how the workflow unfolds with our sympy-mcp
server:
User Solve the damped harmonic oscillator with forcing term: the mass-spring-damper system described by the differential equation where m is mass, c is the damping coefficient, k is the spring constant, and F(t) is an external force.
$$
m\frac{d^2x}{dt^2} + c\frac{dx}{dt} + kx = F(t)
$$
Assistant: (Internal tool call) intro_many
to introduce variables with appropriate assumptions:
{
"variables": [
{"var_name": "t", "pos_assumptions": ["real"], "neg_assumptions": []},
{"var_name": "m", "pos_assumptions": ["real", "positive"], "neg_assumptions": []},
{"var_name": "c", "pos_assumptions": ["real", "positive"], "neg_assumptions": []},
{"var_name": "k", "pos_assumptions": ["real", "positive"], "neg_assumptions": []}
]
}
Assistant: (Internal tool call) introduce_function
to define x(t) as a function:
{
"func_name": "x"
}
Assistant: (Internal tool call) introduce_expression
to create the differential equation:
{
"expr_str": "Eq(m*Derivative(x(t), t, t) + c*Derivative(x(t), t) + k*x(t), F(t))"
}
Assistant: (Internal tool call) introduce_function
to define the forcing function F(t):
{
"func_name": "F"
}
Assistant: (Internal tool call) dsolve_ode
to solve the ODE:
{
"expr_key": "expr_12",
"func_name": "x"
}
And voilĂ ! Instead of hallucinating a solution, the LLM delegates to SymPy which returns the correct solution:
$$
x{\left(t \right)} = C_{1} e^{\frac{t \left(- c + \sqrt{c^{2} - 4 k m}\right)}{2 m}} + C_{2} e^{- \frac{t \left(c + \sqrt{c^{2} - 4 k m}\right)}{2 m}} + \frac{e^{\frac{t \left(- c + \sqrt{c^{2} - 4 k m}\right)}{2 m}} \int F{\left(t \right)} e^{\frac{c t}{2 m}} e^{- \frac{t \sqrt{c^{2} - 4 k m}}{2 m}}, dt}{\sqrt{c^{2} - 4 k m}} - \frac{e^{- \frac{t \left(c + \sqrt{c^{2} - 4 k m}\right)}{2 m}} \int F{\left(t \right)} e^{\frac{c t}{2 m}} e^{\frac{t \sqrt{c^{2} - 4 k m}}{2 m}}, dt}{\sqrt{c^{2} - 4 k m}}
$$
No hallucinated terms, no mysterious constants appearing out of nowhere, just the correct solution. The LLM handles the natural language interaction and orchestration, while the computer algebra system does what it does best ... exact symbolic manipulation.
Anyways all the code is up on Github here so maybe someone else will find it useful. If you have Cursor or Claude installed add the following to ~/.cursor/mcp.json
or ~/Library/Application Support/Claude/claude_desktop_config.json
to install the MCP server.
{
"mcpServers": {
"sympy-mcp": {
"command": "uv",
"args": [
"run",
"--with",
"https://github.com/sdiehl/sympy-mcp/releases/download/0.1/sympy_mcp-0.1.0-py3-none-any.whl",
"python",
"server.py"
]
}
}
}
Or maybe slightly better, run it from a Docker image.
{
"mcpServers": {
"sympy-mcp": {
"command": "docker",
"args": [
"run",
"-i",
"-p",
"8081:8081",
"--rm",
"ghcr.io/sdiehl/sympy-mcp:latest"
]
}
}
}
And maybe read the source code for what you're installing here, because I feel like a lot of people are going to install these MCP servers without knowing what they're doing and thus install a lot of malware and exploits. There's basically no security going on here. And that could be a big problem. Tread carefully.