MCP | Apr 14, 2026 | 5 min read

Adding a Local Image Optimization MCP Server to an AI Agent

AI agents can help with image optimization when they have access to a real local tool. Without a tool, the agent can only suggest commands. With a local MCP server, it can scan a folder, convert selected images, and report results in a structured way while keeping source files under the user's control.

The important part is the boundary: the agent should be able to request image work, but the tool should be narrow, auditable, and honest about its limits.

Adding MCP does not make image optimization automatic. A useful MCP workflow still names the allowed tools, the JSON fields the agent must report, the plan limits, the local data boundary, and the human review step for visual quality.

What MCP Adds

The Model Context Protocol lets clients connect language models to tools and data sources. The official MCP architecture documentation explains the relationship between clients, servers, and tool calls at a high level. For GetWebP specifically, the MCP server reference documents the local stdio server, its three tools, their parameters, and their JSON responses.

For image optimization, an MCP server can expose a few specific tools instead of giving the agent a vague instruction to "optimize images." That makes the workflow easier to review:

  • scan images
  • convert images
  • check status and limits
  • return structured results
  • report errors clearly

This is a better fit than asking the model to infer file sizes or write ad hoc scripts.

The Three Useful Tools

A focused image MCP server should not need dozens of actions. For GetWebP, the documented tool set is:

scan_images
convert_images
get_status

scan_images is the safe first step. It inspects a file or directory and returns metadata such as path, size, format, and has_webp without changing anything.

convert_images performs the conversion. WebP is the default output format, AVIF can be requested when that format is appropriate for the workflow, and manifest_path can write a manifest with fingerprints for successful conversions.

get_status reports plan and limit information so the agent can avoid starting work it cannot finish.

Clear tool names help the model choose the right action and help the human understand what happened.
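The client side can be kept equally narrow with an explicit allowlist. The sketch below is illustrative Python and not part of GetWebP: the three tool names come from the list above, while `ALLOWED_TOOLS`, `WRITES_FILES`, and `check_tool` are hypothetical names introduced for this example, and gating convert_images behind approval is an assumed agent policy.

```python
# The three tool names are the ones the article lists for the GetWebP MCP server.
# ALLOWED_TOOLS, WRITES_FILES, and check_tool are hypothetical helpers.
ALLOWED_TOOLS = frozenset({"scan_images", "convert_images", "get_status"})

# Only convert_images writes output files, so this sketch gates it behind
# explicit user approval (an assumption about agent policy, not a server rule).
WRITES_FILES = frozenset({"convert_images"})


def check_tool(name: str, user_approved: bool = False) -> bool:
    """Return True if the agent may call this tool right now."""
    if name not in ALLOWED_TOOLS:
        return False
    if name in WRITES_FILES and not user_approved:
        return False
    return True
```

With this check in place, a request for a tool that does not exist (for example a watch mode) is rejected before any call is attempted.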

Scan Before Convert

The best agent pattern is scan first, convert second. A scan result can show total files, large candidates, existing WebP siblings, unsupported files, and likely follow-up questions.

For example, the agent can say:

I found 18 convertible images in public/images.
5 already have WebP siblings.
3 screenshots contain small text and should be reviewed after conversion.
The largest file is hero-dashboard.png at 4.2 MB.

That gives the user a decision point before any output files are written.
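A summary like that can be computed from the scan metadata rather than guessed. The following is a minimal sketch, assuming scan_images returns one entry per file with the `path`, `size`, `format`, and `has_webp` fields mentioned earlier; the list-of-dicts shape, the byte unit, and the `summarize_scan` helper are assumptions made for illustration.

```python
from typing import List, TypedDict


class ScanEntry(TypedDict):
    # Field names follow the metadata the article mentions for scan_images;
    # the exact response shape and the byte unit are assumptions.
    path: str
    size: int
    format: str
    has_webp: bool


def summarize_scan(entries: List[ScanEntry]) -> dict:
    """Summarize scan entries without touching any files."""
    candidates = [e for e in entries if e["format"].lower() != "webp"]
    with_sibling = [e for e in candidates if e["has_webp"]]
    largest = max(candidates, key=lambda e: e["size"], default=None)
    return {
        "convertible": len(candidates),
        "already_have_webp": len(with_sibling),
        "need_conversion": len(candidates) - len(with_sibling),
        "largest_path": largest["path"] if largest else None,
        "largest_bytes": largest["size"] if largest else 0,
    }


# Made-up sample data, used only to exercise the helper.
sample: List[ScanEntry] = [
    {"path": "hero-dashboard.png", "size": 4_200_000, "format": "png", "has_webp": False},
    {"path": "logo.png", "size": 120_000, "format": "png", "has_webp": True},
    {"path": "photo.jpg", "size": 900_000, "format": "jpeg", "has_webp": False},
    {"path": "banner.webp", "size": 300_000, "format": "webp", "has_webp": False},
]
summary = summarize_scan(sample)
```

Because the helper only reads metadata, it can run as often as needed before the user approves any conversion.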

Keep Conversion Local and Reviewable

For privacy-sensitive work, local conversion is often valuable. Client screenshots, unreleased product images, and internal documentation should not be uploaded to a third-party conversion API unless that data flow is approved.

In a local-first workflow, image bytes are processed on the user's machine. The GetWebP security whitepaper describes the MCP server as a local stdio process that reuses the CLI conversion core and the CLI credential store. Account or license checks, if present, are control-plane traffic separate from image conversion. This distinction matters because "local conversion" and "no network activity of any kind" are different claims.

Output should also be easy to review. A conversion tool should preserve originals and write new files as siblings or into a chosen output directory, not silently mutate source assets.
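The "preserve originals" rule can be expressed as a small path helper. This is a sketch under stated assumptions, not GetWebP's implementation: `output_path` is a hypothetical function, and the `.webp` default simply mirrors the default output format described earlier.

```python
from pathlib import Path
from typing import Optional


def output_path(source: Path, output_dir: Optional[Path] = None, ext: str = ".webp") -> Path:
    """Pick where a converted file would go without ever targeting the source."""
    # Write next to the source unless an output directory was chosen.
    target_dir = output_dir if output_dir is not None else source.parent
    target = target_dir / (source.stem + ext)
    if target == source:
        # Converting hero.webp to hero.webp in place would mutate the original.
        raise ValueError(f"refusing to overwrite source file: {source}")
    return target


sibling = output_path(Path("public/images/hero.png"))
separate = output_path(Path("public/images/hero.png"), Path("public/images-webp"))
```

Here `sibling` points at `public/images/hero.webp` and `separate` at `public/images-webp/hero.webp`; in both cases `hero.png` is left untouched.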

Know the Limits Before a Batch

Limits should be visible to the agent. According to the GetWebP MCP server reference, the Free tier allows 20 files per convert_images call and 3 calls in a 60-second rolling window; further calls in that window receive rate_limited responses. Pro removes those conversion caps.

That means an agent should call get_status before a larger job and report constraints plainly:

This folder contains 64 candidate images.
The current plan can process 20 files per call.
I can convert a sample first, or you can activate Pro before a full run.

This is better than failing halfway through a task with no explanation.
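The arithmetic behind that message is simple enough to sketch. The constants below are the Free tier numbers quoted above; `plan_batches` and `describe_plan` are hypothetical helpers, and whether splitting one job across several calls suits the Free plan is something to confirm against the MCP server reference rather than assume from this example.

```python
from typing import Dict, List

# Free tier numbers as quoted in the article.
FREE_FILES_PER_CALL = 20
FREE_CALLS_PER_WINDOW = 3
WINDOW_SECONDS = 60


def plan_batches(paths: List[str], files_per_call: int = FREE_FILES_PER_CALL) -> List[List[str]]:
    """Split candidate paths into chunks no larger than the per-call cap."""
    return [paths[i:i + files_per_call] for i in range(0, len(paths), files_per_call)]


def describe_plan(paths: List[str]) -> Dict[str, int]:
    """Report how many calls a job needs and how many fit in the first window."""
    batches = plan_batches(paths)
    immediate = min(len(batches), FREE_CALLS_PER_WINDOW)
    return {
        "files": len(paths),
        "calls_needed": len(batches),
        "calls_in_first_window": immediate,
        # Calls beyond the window allowance would have to wait or be rate limited.
        "calls_deferred": len(batches) - immediate,
    }


# 64 made-up candidate paths, matching the example message above.
candidates = [f"img-{i}.png" for i in range(64)]
plan = describe_plan(candidates)
```

For 64 candidates the plan comes out as four calls, three of which fit in the first 60-second window, which is exactly the kind of constraint the agent should state before starting.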

Use Structured Errors

The MCP tools specification describes tools as callable capabilities with defined inputs and outputs. For image work, structured errors are especially important.

The documented stable MCP error codes include:

  • input_not_found
  • io_error
  • rate_limited

The agent can branch on these. A missing folder should produce a different response from a rate limit or an I/O problem. For per-file conversion outcomes, the agent should inspect the structured results returned by convert_images rather than inventing a separate error vocabulary.
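Branching on these codes can be as plain as a lookup table. In the sketch below the three error codes are the documented ones listed above; the action names and the `next_step` helper are hypothetical and stand in for whatever the agent framework actually does.

```python
from typing import Optional

# Keys are the documented stable error codes; the action names are
# hypothetical labels chosen for this sketch.
ACTIONS = {
    "input_not_found": "ask_user_for_path",
    "io_error": "report_and_stop",
    "rate_limited": "wait_and_retry",
}


def next_step(error_code: Optional[str]) -> str:
    """Map a tool-level error code to the agent's next action."""
    if error_code is None:
        return "continue"
    # Unknown codes are surfaced rather than silently retried.
    return ACTIONS.get(error_code, "report_unknown_error")
```

Keeping the mapping in one table makes it easy to review which failures the agent will retry and which it will hand back to the user.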

Keep Human Approval for Quality

An agent can report bytes saved. It cannot fully approve brand quality, product color, screenshot readability, or legal text. The workflow should flag sensitive outputs for review.

Ask the agent to call out:

  • product images
  • screenshots with small text
  • transparent logos
  • hero images
  • files that grew or saved very little
  • unsupported or failed files

These outputs should be inspected before references are updated or a pull request is merged.
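Some of these flags can be raised mechanically, which leaves the visual judgment to a person. The sketch below assumes each per-file result carries `path`, `status`, `input_bytes`, and `output_bytes`; those field names, the 5% savings threshold, and the filename hints are all assumptions for illustration, not the documented convert_images response.

```python
from typing import List

# Filename fragments that suggest a sensitive image; a rough heuristic only.
NAME_HINTS = ("screenshot", "logo", "hero", "product")


def review_reasons(result: dict, min_saving_ratio: float = 0.05) -> List[str]:
    """Return the reasons a converted file should get human review."""
    if result["status"] != "ok":
        return ["failed"]
    reasons = []
    before = result["input_bytes"]
    after = result["output_bytes"]
    if after >= before:
        reasons.append("grew_or_no_savings")
    elif (before - after) / before < min_saving_ratio:
        reasons.append("minimal_savings")
    name = result["path"].lower()
    reasons.extend(f"name_hint:{hint}" for hint in NAME_HINTS if hint in name)
    return reasons
```

An empty list means nothing was flagged automatically; it does not mean the image looks right, so a quick visual check is still worthwhile for anything customer-facing.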

A Practical Conversation

A good interaction looks like this:

User: Scan public/images and tell me what needs WebP conversion.
Agent: calls scan_images.
Agent: reports candidates and asks before converting.
User: Convert the three largest PNGs only.
Agent: calls convert_images for those files.
Agent: reports output paths, saved bytes, skipped files, and review notes.

The agent coordinates the work, but the user remains in control of scope and approval.

A useful agent report should cite the actual tool response fields, not just prose:

Tool: convert_images
Input: ./public/images
Output: ./public/images-webp
Plan: free
Total: 20 processed, 4 skipped_by_limit
Succeeded: 18
Failed: 2
Manifest: ./public/images-webp/manifest.json
Review required: failed files, screenshots with small text, files with minimal savings
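A report in this style is more trustworthy when it is assembled from the response instead of typed by the model. The following is a minimal sketch: the `plan`, `skipped_by_limit`, `results`, and `status` field names and the `format_report` helper are hypothetical, so the real convert_images response should be checked in the MCP server reference before relying on any of them.

```python
def format_report(tool: str, response: dict) -> str:
    """Render a short report from a (hypothetically shaped) tool response."""
    results = response["results"]
    succeeded = sum(1 for r in results if r["status"] == "ok")
    failed = len(results) - succeeded
    lines = [
        f"Tool: {tool}",
        f"Plan: {response['plan']}",
        f"Total: {len(results)} processed, {response['skipped_by_limit']} skipped_by_limit",
        f"Succeeded: {succeeded}",
        f"Failed: {failed}",
    ]
    return "\n".join(lines)


# Made-up response used only to exercise the formatter.
example_response = {
    "plan": "free",
    "skipped_by_limit": 4,
    "results": [
        {"path": "a.png", "status": "ok"},
        {"path": "b.png", "status": "ok"},
        {"path": "c.png", "status": "failed"},
    ],
}
report = format_report("convert_images", example_response)
```

Deriving the counts from the results list keeps the totals consistent with each other, which is harder to guarantee when the numbers are written as free text.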

What Not to Expose

Keep the MCP server focused. If the server does not support watch mode, authentication, logout, dry-run, or markdown rewriting, do not imply those tools exist. Authentication can be handled through the CLI when needed, and the MCP server can read the resulting status through get_status.

Narrow capability is easier to test and safer for agents to use.

Adding a local image optimization MCP server gives AI agents a concrete way to help with WebP and AVIF workflows. The quality comes from the boundaries: scan first, convert locally, preserve originals, expose limits, return structured errors, and keep human review in the loop for important images.

Jack

GetWebP Editor

Jack writes GetWebP guides about local-first image conversion, WebP workflows, browser compatibility, and practical performance checks for teams that publish images on the web.