Appendix E — Pilot Design, Success Criteria, and Post-Demo Follow-Up
This appendix covers what happens after a good demo.
A serious product team should not treat the pilot as a vague free trial. It should treat it as a structured test against real workflow pain.
That is especially true for SUMMA.
The point of the pilot is not to let the buyer casually click around and vaguely “get a feel for it.” The point is to answer a harder question:
Does SUMMA reduce real structural pain in the kinds of files it claims to be built for?
That is the standard.
1. Goal of the pilot
The pilot should test whether SUMMA helps with a real, severe review burden.
The best pilot question is not: Do people think the interface looks interesting?
The best pilot question is: Does the system reduce re-entry loss, issue rediscovery, weak handoff, source drift, and poor pressure visibility in a file painful enough to matter?
That is the real commercial test.
2. What a pilot should not be
A SUMMA pilot should not be:
- a generic sandbox with no real file burden
- a broad “try it and let us know” exercise
- a feature scavenger hunt
- a fake AI theatre performance
- a trial run on a file too small to expose the actual wedge
Those kinds of pilots create noise instead of proof.
3. What a pilot should be
A strong SUMMA pilot should be:
- structured
- narrow enough to evaluate honestly
- tied to a real painful file or file class
- explicit about what is being tested
- explicit about what good outcomes would look like
- explicit about what the product is not promising
That last point matters.
The pilot is not testing whether SUMMA replaces counsel. It is testing whether SUMMA strengthens the environment in which counsel and review teams are already struggling.
4. Best pilot candidates
The best pilot candidate is not the cleanest file.
It is the file where the pain is already operationally real.
Strong pilot candidates usually include one or more of the following:
- disclosure-heavy files
- repeated or supplemental disclosure
- mixed-format evidence
- witness contradiction pressure
- timeline instability
- re-entry pain across long gaps
- weak handoff between team members
- existing notes/folders/search systems no longer preserving posture cleanly
That is where the product’s wedge becomes testable.
5. Pilot setup questions
Before the pilot begins, the seller and buyer should agree on:
- which file or file type is being tested
- who the core users are
- what tools are being used now
- what pain points are already visible
- what success would look like
- what time frame the pilot runs for
- what is in scope and out of scope
Without that, the pilot becomes vague and memory-driven.
6. Good pilot success criteria
The success criteria should be operational.
Good examples include:
- less time lost on re-entry
- less repeated rediscovery of the same issue
- better preservation of source-linked understanding
- cleaner handoff between reviewers
- better visibility into contradiction or pressure zones
- stronger issue concentration
- more confidence that the current posture of the file is being preserved honestly
- better ability to tell what deserves serious attention next
These are much better than shallow success criteria like:
- users thought it was cool
- the demo looked modern
- the interface seemed smart
That is not enough.
7. Weak pilot success criteria
These should be avoided:
- “Did AI save the case?”
- “Did the system tell us the right legal theory?”
- “Did it solve everything?”
- “Did it make the whole file simple?”
- “Did it replace the need for judgment?”
These are bad tests because they are inflated, vague, or dishonest.
8. What the pilot should measure
At minimum, the pilot should try to capture:
Before state
- how the team currently works
- where time is being lost
- where structure is drifting
- what is hardest to preserve
During state
- how users move through the file
- what they return to
- where the system helps
- where they still struggle
- what becomes more survivable
After state
- what pain was reduced
- what remained painful
- what became more trustworthy
- whether the users would want the system on another file
This is enough to make the pilot useful.
9. Post-demo follow-up language
A strong post-demo follow-up should sound like this:
The right next step is not a vague trial. It is a focused pilot against a file painful enough to show whether SUMMA reduces real structural loss: re-entry cost, issue rediscovery, weak continuity, and poor pressure visibility.
Or:
If the demo matched the kind of burden you’re dealing with, the next honest step is to test the system on a live file or file class and see whether the environment becomes more survivable in the ways that matter operationally.
That sounds serious and grounded.
10. Good pilot framing lines
This pilot is not about proving that the system is magical. It is about proving that it reduces real structural pain.
We are not testing whether the product replaces judgment. We are testing whether it helps preserve the environment in which judgment happens.
The question is not whether the file becomes easy. The question is whether the file becomes more survivable.
These are strong framing lines because they protect honesty.
11. Post-pilot decision categories
At the end of the pilot, the outcome should be categorized clearly.
Outcome 1 — Strong success
The product clearly reduced pain in one or more critical areas:
- re-entry
- issue concentration
- source preservation
- handoff
- pressure visibility
Move toward broader usage.
Outcome 2 — Partial success
The product helped meaningfully, but only on certain layers or for certain users. Refine scope and decide whether a narrower next phase makes sense.
Outcome 3 — Weak fit
The file may not have been painful enough, or the workflow may not match the wedge strongly enough. Do not force the sale.
Outcome 4 — Wrong setup, not wrong product
The pilot may have been run on the wrong file, with the wrong expectations, or without clear success criteria. If that happened, diagnose honestly before concluding anything.
12. Pilot mistakes to avoid
Do not run the pilot on an easy file just because it is convenient.
Do not define success too vaguely.
Do not let the buyer assume the product promises automatic legal strategy.
Do not skip the “before state” entirely.
Do not end the pilot with nothing more than general impressions.
Do not confuse excitement with proof.
13. Strong closing lines after a pilot
Success close
What matters here is not that the product looked impressive. What matters is that it reduced real structural pain in the review environment. That is the basis for moving forward.
Partial-success close
The pilot suggests that the fit is real, but only in certain parts of the workflow so far. The next step is to tighten scope and test the highest-value lane more directly.
Weak-fit close
The pilot did not show strong enough relief against the file burden we tested. That may mean the current workflow pain has not crossed the threshold where SUMMA becomes urgent.
Those closes preserve honesty.
14. Final takeaway
A strong SUMMA pilot should answer one question clearly:
Did the product make a painful file more structurally survivable in a way the buyer can actually feel and reuse?
If yes, move forward. If partly, refine. If no, do not fake the outcome.