
WebP · Oct 19, 2025 · 6 min read

Quality 80 Is Not a Strategy: Build a Corpus Test

Many image optimization workflows begin with a borrowed number: "Use quality 80." That may be a reasonable starting point for some WebP conversions, but it is not a strategy. A single quality value cannot represent every photograph, screenshot, product image, transparent asset, and layout on a real website.

A corpus test gives you better evidence. Instead of arguing over a default, you choose a small set of representative images, convert them at several settings, review the outputs in context, and document which settings are acceptable for each asset type.

A universal "best WebP quality" number is easy to write but hard to defend without source images, output evidence, review criteria, and the page context where the images actually appear. A defensible quality decision needs a corpus, fixed commands, structured output, and named acceptance rules.

Know What You Are Testing

Quality testing only works when the command is explicit. GetWebP's CLI command reference documents three relevant modes:

| Command choice | What it means | Use in a corpus test |
| --- | --- | --- |
| no --quality | WebP auto-quality mode uses an SSIM search per file | Good for production defaults, but not a fixed-number comparison |
| --quality <N> | Use a fixed quality value, clamped to 1–100 | Best for comparing candidate settings |
| --no-auto-quality | Disable auto quality and force fixed quality 80 | Useful only when you intentionally want fixed 80 |

For AVIF, auto-quality does not apply; --format avif uses fixed quality, with default 55 unless you pass --quality. Do not compare a WebP fixed-quality ladder against AVIF default output and call the result a WebP quality decision.
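
As a concrete illustration, here is what the three WebP modes look like on the command line, using the same flags as the reference above (the corpus paths are placeholders):

# Auto-quality: per-file SSIM search; a production default, not a fixed ladder
getwebp ./corpus/photos -o ./results/photos-auto --json > ./reports/photos-auto.ndjson

# Fixed quality: the mode to use when comparing candidate settings
getwebp ./corpus/photos -o ./results/photos-q82 --quality 82 --json > ./reports/photos-q82.ndjson

# Forced fixed 80: only when you intentionally want fixed quality 80
getwebp ./corpus/photos -o ./results/photos-q80 --no-auto-quality --json > ./reports/photos-q80.ndjson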

Build a Real Sample Set

The corpus should reflect the images your site actually publishes. Include:

  • hero photos
  • product or portfolio images
  • portraits
  • screenshots with text
  • transparent cutouts
  • thumbnails
  • images with gradients or dark shadows
  • older source files from real content

Do not test only the easiest images. Clean, bright photos often compress well. The hard images reveal where a quality setting fails.

For many small sites, 20 to 40 images are enough to make a better decision than one default number.

Record why each file is in the corpus:

file,group,risk
hero-dark-gradient.jpg,hero,banding in dark gradient
sku-1934-zoom.jpg,product-detail,fabric texture and color accuracy
checkout-ui.png,screenshots,small text and thin borders
logo-transparent.png,transparent-assets,edge halo on dark background

This prevents the corpus from becoming a random folder of convenient images.
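
A quick existence check keeps the manifest honest. This is a minimal sketch, assuming the manifest above is saved as ./corpus/corpus.csv and each file lives under ./corpus/<group>/ (both names are placeholders):

# Verify every manifest entry exists on disk before testing
tail -n +2 ./corpus/corpus.csv | while IFS=, read -r file group risk; do
  [ -f "./corpus/$group/$file" ] || echo "missing from corpus: $group/$file" >&2
done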

Keep Image Roles Separate

Group the corpus by role. A blog thumbnail, product zoom image, and documentation screenshot should not necessarily use the same quality setting.

Example groups:

photos/
product-detail/
screenshots/
transparent-assets/
thumbnails/

Each group can have its own approval rule. Screenshots may need higher quality or PNG fallback. Thumbnails may tolerate more compression. Product detail images may need conservative settings because users inspect texture.
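
Before converting anything, it is worth confirming that no group is empty or suspiciously small; assuming the folder layout above, a short loop is enough:

# Count files per group so an empty group does not pass silently
for group in ./corpus/*/; do
  printf '%s\t%s\n' "$group" "$(find "$group" -type f | wc -l)"
done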

Test a Small Quality Range

Convert each group with a few candidate settings. Do not test every possible value. A small range is easier to review and compare.

mkdir -p ./reports

getwebp ./corpus/photos \
  -o ./results/photos-q76 \
  --quality 76 \
  --json > ./reports/photos-q76.ndjson

getwebp ./corpus/photos \
  -o ./results/photos-q82 \
  --quality 82 \
  --json > ./reports/photos-q82.ndjson

getwebp ./corpus/photos \
  -o ./results/photos-q88 \
  --quality 88 \
  --json > ./reports/photos-q88.ndjson

The goal is to find the lowest setting that still passes visual review for that group. If all tested settings fail, the issue may be resizing, source quality, or the wrong format for the asset.
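
The three runs above can also be written as one loop, which keeps the candidate list in a single place when the ladder changes:

# Same three runs as above, expressed as a loop over the candidate qualities
for q in 76 82 88; do
  getwebp ./corpus/photos \
    -o "./results/photos-q${q}" \
    --quality "$q" \
    --json > "./reports/photos-q${q}.ndjson"
done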

If the corpus has nested folders, add --recursive. If you are comparing output format as well as quality, make that a separate test:

getwebp ./corpus/photos \
  -o ./results/photos-avif-q55 \
  --format avif \
  --quality 55 \
  --json > ./reports/photos-avif-q55.ndjson

Keep WebP and AVIF decisions separate unless the delivery plan will actually use both formats.

Turn Reports Into a Review Table

The JSON output reference documents the NDJSON events and per-file fields. For quality testing, keep outputPath, originalSize, newSize, savedRatio, quality, qualityMode, and status.

You can turn one run into a TSV summary:

jq -r '
  select(.type == "convert.completed")
  | .data.results[]
  | [.file, .outputPath, .originalSize, .newSize, .savedRatio, .quality, .qualityMode, .status]
  | @tsv
' ./reports/photos-q82.ndjson > ./reports/photos-q82.tsv
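
With one TSV per candidate run, comparing total output bytes across runs is a one-liner; newSize is the fourth column in the TSV built above:

# Sum output bytes per candidate run (newSize is column 4)
for tsv in ./reports/photos-q*.tsv; do
  awk -F'\t' '{ sum += $4 } END { printf "%s\t%d bytes\n", FILENAME, sum }' "$tsv"
done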

And fail the test batch if any candidate run has a file-level error:

for report in ./reports/photos-q*.ndjson; do
  jq -e '
    select(.type == "convert.completed")
    | .data.failedCount == 0
  ' "$report" >/dev/null || {
    echo "Conversion failed in $report" >&2
    exit 1
  }
done

The numeric report is evidence, not the verdict. savedRatio can explain the byte tradeoff, and qualityMode confirms whether the run used fixed or auto mode, but neither field can decide whether a face, product label, or UI screenshot still looks acceptable.
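
One practical use of savedRatio is to surface the files that compressed worst, since those are often the hard images worth extra review time. A small sketch, assuming savedRatio lands in column 5 of the TSV built above:

# Ten worst-compressing files in the q82 run, lowest savedRatio first
sort -t$'\t' -k5,5g ./reports/photos-q82.tsv | head -n 10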

Review in Context

Open outputs where they will appear: product pages, article templates, gallery views, documentation pages, and mobile layouts. File viewers are useful, but they do not show how the image interacts with page background, crop, text, and surrounding UI.
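
When the real templates are not available, a throwaway contact sheet covers part of the gap, especially the background check for transparent assets. This sketch assumes getwebp names each output <original-basename>.webp; everything else here is a hypothetical helper:

# Throwaway review sheet: original next to the q82 output, repeated on a
# light and a dark background (output naming is an assumption)
{
  echo '<meta charset="utf-8"><style>div{padding:16px}.dark{background:#111}</style>'
  for src in ./corpus/photos/*; do
    out="./results/photos-q82/$(basename "${src%.*}").webp"
    echo "<div><img src=\"$src\" width=\"420\"> <img src=\"$out\" width=\"420\"></div>"
    echo "<div class=\"dark\"><img src=\"$src\" width=\"420\"> <img src=\"$out\" width=\"420\"></div>"
  done
} > ./review-photos-q82.html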

Reviewers should check:

  • readability of text
  • faces and skin tones
  • product texture
  • edge halos around transparency
  • shadows and gradients
  • brand colors
  • quality at mobile and desktop sizes

Record failures with examples. A rejected image is useful evidence, not wasted work.

Make the review repeatable:

Group: screenshots
Candidate: q82
Page context: docs article body, 900 px max width, light and dark mode
Pass rule: all button labels and code text readable at normal zoom
Reviewer: documentation owner
Result: rejected, small sidebar labels blurred
Action: test q88 and PNG fallback for this group

Include the Right Reviewer

The reviewer should match the asset risk. A developer can check that files were generated correctly, but a designer may be better at judging brand color, crop, and visual polish. A merchandiser or product owner may be better at judging whether product texture and detail are still convincing.

For documentation screenshots, include the person responsible for the instructions. They can tell whether the user can still read the relevant label or button.

Measure Savings Without Chasing Them

Track file size reduction, but do not let the largest reduction win automatically. The correct result is the smallest file that still meets the visual standard for the use case.

Create a small table:

Group: product-detail
q76: rejected, texture loss
q82: approved, good balance
q88: approved, larger than needed
Decision: q82 for this group
Evidence: reports/product-detail-q82.ndjson, review screenshots, product owner approval

This makes the decision clear to designers, developers, and content teams.

Do not approve by average savings alone. One failed high-value product image can matter more than 30 successful thumbnails. When a single group contains both low-risk and high-risk assets, split the group rather than forcing one setting to cover both.

Document Exceptions

A corpus test should produce a rule set, not a universal number:

photos: WebP q82 approved
product-detail: WebP q88 for zoom images, q82 for listing thumbnails
screenshots: WebP q88 only after text review; keep PNG when labels blur
transparent-assets: manual review on dark and light backgrounds; keep PNG for brand marks if edges fail

Exceptions are not a sign that the test failed. They are the point of the test. A logo, a UI screenshot, and a product zoom photo can all be valid web images while needing different delivery rules.
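
Once the rule set exists, applying the fixed-quality cases can be mechanical. A minimal sketch, collapsing product-detail to its zoom setting for brevity and leaving the manual-review and PNG-fallback cases to a separate step:

# Apply the approved per-group qualities; exceptions are handled outside this loop
while IFS=, read -r group q; do
  getwebp "./corpus/$group" \
    -o "./results/$group-approved" \
    --quality "$q" \
    --json > "./reports/$group-approved.ndjson"
done <<'EOF'
photos,82
product-detail,88
screenshots,88
EOF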

Re-Test When Sources Change

Quality decisions age. A site that starts with editorial photos may later add UI screenshots or ecommerce product images. A setting approved for one corpus should not automatically apply to new content categories.

Re-test when:

  • a new image type appears
  • the design changes image sizes
  • a CMS starts re-encoding uploads
  • an encoder version changes
  • users or stakeholders report visible quality problems

Use Documentation as Background, Not a Substitute

Google's WebP documentation explains WebP compression capabilities, and MDN's image file type guide helps compare image formats. Those references are useful, but they cannot tell you which setting protects your actual catalog.

Quality 80 is a starting hypothesis. A corpus test turns that hypothesis into a documented decision based on real images, real layouts, and real review criteria.


Jack

GetWebP Editor

Jack writes GetWebP guides about local-first image conversion, WebP workflows, browser compatibility, and practical performance checks for teams that publish images on the web.