Appendix E — Pilot Design, Success Criteria, and Post-Demo Follow-Up
This appendix covers what happens after a good demo.
A serious product team should not treat the pilot as a vague free trial. It should treat it as a structured test against real workflow pain.
That is especially true for SUMMA.
The point of the pilot is not to let the buyer casually click around and vaguely “get a feel for it.” The point is to answer a harder question:
Does SUMMA reduce real structural pain in the kinds of files it claims to be built for?
That is the standard.
1. Goal of the pilot
The pilot should test whether SUMMA helps with a real, severe review burden.
The best pilot question is not: Do people think the interface looks interesting?
The best pilot question is: Does the system reduce re-entry loss, issue rediscovery, weak handoff, source drift, and poor pressure visibility in a file painful enough to matter?
That is the real commercial test.
2. What a pilot should not be
A SUMMA pilot should not be:
- a generic sandbox with no real file burden
- a broad “try it and let us know” exercise
- a feature scavenger hunt
- a fake AI theatre performance
- a trial run on a file too small to expose the actual wedge
Those kinds of pilots create noise instead of proof.
3. What a pilot should be
A strong SUMMA pilot should be:
- structured
- narrow enough to evaluate honestly
- tied to a real painful file or file class
- explicit about what is being tested
- explicit about what good outcomes would look like
- explicit about what the product is not promising
That last point matters.
The pilot is not testing whether SUMMA replaces counsel. It is testing whether SUMMA strengthens the environment in which counsel and review teams are already struggling.
4. Best pilot candidates
The best pilot candidate is not the cleanest file.
It is the file where the pain is already operationally real.
Strong pilot candidates usually include one or more of the following:
- disclosure-heavy files
- repeated or supplemental disclosure
- mixed-format evidence
- witness contradiction pressure
- timeline instability
- re-entry pain across long gaps
- weak handoff between team members
- existing notes/folders/search systems no longer preserving posture cleanly
That is where the product’s wedge becomes testable.
5. Pilot setup questions
Before the pilot begins, the seller and buyer should agree on:
- which file or file type is being tested
- who the core users are
- what tools are being used now
- what pain points are already visible
- what success would look like
- what time frame the pilot runs for
- what is in scope and out of scope
Without that, the pilot becomes vague and memory-driven.
6. Good pilot success criteria
The success criteria should be operational.
Good examples include:
- less time lost on re-entry
- less repeated rediscovery of the same issue
- better preservation of source-linked understanding
- cleaner handoff between reviewers
- better visibility into contradiction or pressure zones
- stronger issue concentration
- more confidence that the current posture of the file is being preserved honestly
- better ability to tell what deserves serious attention next
These are much better than shallow success criteria like:
- users thought it was cool
- the demo looked modern
- the interface seemed smart
That is not enough.
7. Weak pilot success criteria
These should be avoided:
- “Did AI save the case?”
- “Did the system tell us the right legal theory?”
- “Did it solve everything?”
- “Did it make the whole file simple?”
- “Did it replace the need for judgment?”
These are bad tests because they are inflated, vague, or dishonest.
8. What the pilot should measure
At minimum, the pilot should try to capture:
Before state
- how the team currently works
- where time is being lost
- where structure is drifting
- what is hardest to preserve
During state
- how users move through the file
- what they return to
- where the system helps
- where they still struggle
- what becomes more survivable
After state
- what pain was reduced
- what remained painful
- what became more trustworthy
- whether the users would want the system on another file
This is enough to make the pilot useful.
9. Post-demo follow-up language
A strong post-demo follow-up should sound like this:
The right next step is not a vague trial. It is a focused pilot against a file painful enough to show whether SUMMA reduces real structural loss: re-entry cost, issue rediscovery, weak continuity, and poor pressure visibility.
Or:
If the demo matched the kind of burden you’re dealing with, the next honest step is to test the system on a live file or file class and see whether the environment becomes more survivable in the ways that matter operationally.
That sounds serious and grounded.
10. Good pilot framing lines
This pilot is not about proving that the system is magical. It is about proving that it reduces real structural pain.
We are not testing whether the product replaces judgment. We are testing whether it helps preserve the environment in which judgment happens.
The question is not whether the file becomes easy. The question is whether the file becomes more survivable.
These are strong framing lines because they protect honesty.
11. Post-pilot decision categories
At the end of the pilot, the outcome should be categorized clearly.
Outcome 1 — Strong success
The product clearly reduced pain in one or more critical areas:
- re-entry
- issue concentration
- source preservation
- handoff
- pressure visibility
Move toward broader usage.
Outcome 2 — Partial success
The product helped meaningfully, but only on certain layers or for certain users. Refine scope and decide whether a narrower next phase makes sense.
Outcome 3 — Weak fit
The file may not have been painful enough, or the workflow may not match the wedge strongly enough. Do not force the sale.
Outcome 4 — Wrong setup, not wrong product
The pilot may have been run on the wrong file, with the wrong expectations, or without clear success criteria. If that happened, diagnose honestly before concluding anything.
12. Pilot mistakes to avoid
Do not run the pilot on an easy file just because it is convenient.
Do not define success too vaguely.
Do not let the buyer assume the product promises automatic legal strategy.
Do not skip the “before state” entirely.
Do not end the pilot with nothing more than general impressions.
Do not confuse excitement with proof.
13. Strong closing lines after a pilot
Success close
What matters here is not that the product looked impressive. What matters is that it reduced real structural pain in the review environment. That is the basis for moving forward.
Partial-success close
The pilot suggests that the fit is real, but only in certain parts of the workflow so far. The next step is to tighten scope and test the highest-value lane more directly.
Weak-fit close
The pilot did not show strong enough relief against the file burden we tested. That may mean the current workflow pain has not crossed the threshold where SUMMA becomes urgent.
Those closes preserve honesty.
14. Final takeaway
A strong SUMMA pilot should answer one question clearly:
Did the product make a painful file more structurally survivable in a way the buyer can actually feel and reuse?
If yes, move forward. If partly, refine. If no, do not fake the outcome.