cloud phone proof-of-concept success criteria 2026

May 07, 2026

cloud phone POC success criteria in 2026 should be written before the POC starts and signed by every stakeholder who will weigh in on the buy decision. teams that skip this step end up arguing about whether the POC “passed” weeks after it ended, with each evaluator remembering different goals. teams that write criteria first run a 14-day evaluation that produces a clear yes or no on the last day.

this guide gives you the criteria template, the threshold values that work in 2026, and the decision matrix that translates measurements into a buy decision. if you have not yet planned the POC itself, the POC framework is the prerequisite.

why criteria must be written first

three failure modes recur when criteria are not pre-defined: the goals drift mid-test, each evaluator scores against a private rubric, and the result gets litigated for weeks after the POC ends instead of decided on the last day.

write criteria on day zero, get sign-off, then run the test.

the five categories

every POC should produce a measured score in each of five categories.

| category | weight | example threshold |
| --- | --- | --- |
| reliability | 25% | <2% flake rate over 1000 test runs |
| integration ease | 20% | full CI integration in <8 hours engineer time |
| scale behavior | 20% | 8-way parallel runs, no contention errors |
| security and compliance | 20% | SOC 2 valid, audit log complete, wipe verified |
| TCO at projected scale | 15% | within 10% of vendor quote, no surprise line items |

adjust weights to fit your context. regulated teams weight security higher. high-throughput teams weight reliability and scale higher.
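keeping the weights in version control next to the scorecard makes re-weighting an explicit, reviewed change rather than a hallway negotiation. a minimal sketch, using the category names and defaults from the table above; the regulated-team adjustment is an illustrative example, not a recommendation:

```python
# default category weights from the table above (must sum to 1.0)
WEIGHTS = {
    "reliability": 0.25,
    "integration_ease": 0.20,
    "scale_behavior": 0.20,
    "security_compliance": 0.20,
    "tco": 0.15,
}

def validate(weights: dict) -> dict:
    """reject a weight set that does not sum to 100%."""
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total}, not 1.0")
    return weights

# illustrative regulated-team adjustment: shift weight from TCO to security
REGULATED = validate({**WEIGHTS, "security_compliance": 0.30, "tco": 0.05})
```

the validation step is the point: a weight set that silently sums to 90% skews every downstream score.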

category 1: reliability

the easiest thing to measure, and the easiest to fudge if you do not run enough trials.

| metric | passing | excellent |
| --- | --- | --- |
| lock success rate | >98% | >99.5% |
| ADB connect success rate | >97% | >99% |
| test flake rate | <5% | <2% |
| device-stuck-locked rate after job kill | <1% | 0% |
| screenshot API success rate | >99% | >99.9% |

measure across at least 1000 trials. fewer than that and you are testing the vendor’s lucky day. distribute the trials across the 14 days; do not run all 1000 on day 7.

category 2: integration ease

the speed at which a competent engineer can wire the platform into your stack predicts how painful day-to-day operations will be.

| metric | passing | excellent |
| --- | --- | --- |
| CI integration (one job) | <8 engineer hours | <2 hours |
| webhook receiver setup | <4 hours | <1 hour |
| RBAC mirroring from existing IdP | <8 hours | <2 hours |
| custom dashboard with vendor API | <16 hours | <4 hours |
| docs quality (5-point rubric) | 3+ | 5 |

if integration takes 40 hours instead of 8, you are paying that cost every time the vendor changes the API or you onboard a new team. it compounds.

category 3: scale behavior

scale tells you whether the platform will survive your eventual growth, not just today’s load.

| metric | passing | excellent |
| --- | --- | --- |
| parallel device locks | 4 simultaneously | 16+ |
| latency at 95th percentile (lock to ADB ready) | <30s | <10s |
| API rate limit headroom | 2x current load | 10x |
| webhook delivery latency | <5s | <1s |
| error rate during burst (50 locks in 60s) | <2% | 0% |

run an explicit burst test. the platform’s behavior under burst is more predictive than steady-state numbers. spec a “scale day” in your POC plan.
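the burst test from the table (50 locks in 60 seconds, measuring error rate and tail latency) can be scripted in a few lines. a sketch: `lock_fn` is a placeholder for whatever device-lock call your vendor's SDK exposes, and is assumed to raise on failure:

```python
import concurrent.futures
import time

def burst_test(lock_fn, n: int = 50, window_s: float = 60.0):
    """fire n lock requests spread evenly across window_s seconds;
    return (error_rate, p95_latency_s). lock_fn() is a placeholder
    for the vendor's device-lock call and should raise on failure."""
    def one_attempt(delay: float):
        time.sleep(delay)                       # stagger requests across the window
        start = time.monotonic()
        try:
            lock_fn()
            return (True, time.monotonic() - start)
        except Exception:
            return (False, None)

    delays = [i * window_s / n for i in range(n)]
    with concurrent.futures.ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(one_attempt, delays))

    oks = sorted(lat for ok, lat in results if ok)
    error_rate = 1 - len(oks) / n
    p95 = oks[min(len(oks) - 1, int(0.95 * len(oks)))] if oks else None
    return error_rate, p95
```

run it once per scale day and keep the raw per-attempt results: the shape of the failures (all at once vs. spread out) tells you whether you hit a rate limit or a capacity wall.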

category 4: security and compliance

binary pass/fail on most items. a single fail blocks the buy.

| item | required |
| --- | --- |
| SOC 2 Type II report current | yes |
| ISO 27001 or equivalent | yes |
| SSO via SAML or OIDC | yes |
| RBAC with custom roles | yes |
| immutable audit log | yes |
| audit log export to SIEM | yes |
| device wipe verified between sessions | yes |
| data residency commitment for your region | yes |
| TLS 1.2+ in transit | yes |
| encryption at rest | yes |

a vendor that fails any of these on day 14 is not yet enterprise-ready. a failure blocks the buy in its current form; it does not necessarily kill the deal, but it does mean a longer evaluation and stronger contractual remedies once the gap is remediated.
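because this category is a hard gate rather than a weighted score, it is worth encoding that way. a minimal sketch, with the checklist items above as keys (the key names are illustrative):

```python
# the ten checklist items above; every one must be True to pass the category
SECURITY_ITEMS = [
    "soc2_type2_current", "iso27001", "sso_saml_oidc", "rbac_custom_roles",
    "immutable_audit_log", "siem_export", "wipe_verified",
    "data_residency", "tls_1_2_plus", "encryption_at_rest",
]

def security_gate(results: dict) -> list:
    """return the list of failed items; an empty list means the
    category passes. a missing item counts as a failure."""
    return [item for item in SECURITY_ITEMS if not results.get(item, False)]
```

returning the failed items, rather than a bare pass/fail, gives you the remediation list to put in front of the vendor.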

category 5: TCO at projected scale

actual usage from days 1-12 lets you build a realistic projection. the criterion is consistency between the projection and the vendor quote.

| metric | passing | excellent |
| --- | --- | --- |
| invoice prediction accuracy | within 15% | within 5% |
| hidden line items found | <2 | 0 |
| projected 3-yr TCO vs vendor quote | within 20% | within 10% |
| price increase cap on renewal | <15% | <8% |
| support tier upgrade required for SLA needs | optional | not needed |

build the projection using the TCO worksheet and the actual usage data from your POC.
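the arithmetic behind the projection can be sketched as follows. the linear usage-based pricing, the compounding growth rate, and the flat annual fee are all assumptions; replace them with the line items from your actual vendor quote:

```python
def project_3yr_tco(poc_daily_device_hours: float,
                    rate_per_device_hour: float,
                    annual_growth: float = 0.3,
                    fixed_annual_fees: float = 0.0) -> float:
    """extrapolate 3-year spend from POC usage (days 1-12).
    assumes usage-based pricing plus a flat annual fee, with
    usage compounding by annual_growth each year."""
    total = 0.0
    yearly_hours = poc_daily_device_hours * 365
    for year in range(3):
        usage = yearly_hours * (1 + annual_growth) ** year
        total += usage * rate_per_device_hour + fixed_annual_fees
    return total

def within(projection: float, quote: float, tolerance: float) -> bool:
    """passing criterion: projection and quote agree within tolerance."""
    return abs(projection - quote) / quote <= tolerance
```

for example, `within(projection, quote, 0.20)` is the passing bar from the table, and `0.10` is the excellent bar.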

the decision matrix

each category produces a score from 1 to 5. the weighted sum yields the buy decision.

| weighted score | decision |
| --- | --- |
| > 4.5 | strong buy |
| 4.0-4.5 | buy with negotiated improvements |
| 3.5-4.0 | extended pilot recommended |
| 3.0-3.5 | go back to RFP, expand vendor list |
| < 3.0 | no |

if any category scores below 3.0, that is a hard constraint. for example: TCO scores 4.5 but security scores 2.0 means do not buy regardless of weighted total.
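the matrix and the hard constraint together can be sketched as one function, reusing the default weights from the five-categories table (band boundaries are resolved as inclusive lower bounds):

```python
WEIGHTS = {"reliability": 0.25, "integration_ease": 0.20,
           "scale_behavior": 0.20, "security_compliance": 0.20, "tco": 0.15}

def decide(scores: dict) -> str:
    """map per-category 1-5 scores to the decision bands above,
    applying the hard floor first: any category below 3.0 is a no."""
    if min(scores.values()) < 3.0:
        return "no: a category below 3.0 is a hard constraint"
    total = sum(WEIGHTS[c] * s for c, s in scores.items())
    if total > 4.5:
        return "strong buy"
    if total >= 4.0:
        return "buy with negotiated improvements"
    if total >= 3.5:
        return "extended pilot recommended"
    if total >= 3.0:
        return "go back to RFP, expand vendor list"
    return "no"
```

the order matters: the hard floor is checked before the weighted sum, so a 2.0 in security short-circuits a 4.5 in TCO exactly as described above.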

stakeholder sign-off

before the POC starts, collect three stakeholder signatures on the criteria document.

after the POC ends, collect the same three signatures on the final scorecard. if any of the three refuses to sign, do not proceed to contract.

what to do with edge cases

three patterns come up often.

vendor passes 4 of 5 categories

if the failure is in TCO, negotiate. if the failure is in integration ease, build the integration anyway and factor the cost. if the failure is in security, do not buy until remediated. if the failure is in reliability or scale, no amount of negotiation fixes it. walk.

two vendors both pass

run the POC framework’s tiebreakers. audit log depth, support response times, exit terms. if still tied, pick the cheaper vendor and lock the better contract terms.

one vendor passes only because of an exceptional sales engineer

red flag. the sales engineer will not be assigned to your account post-sale. ask “what happens if our SE rotates?” if the answer is hand-wavy, downgrade your scores by 10%.

the right vendor does not exist on the shortlist

happens 10% of the time. expand the search, run a fast checklist scan on 5 more vendors, and shortlist the top 2. accept the timeline cost.

frequently asked questions

can I share the success criteria with the vendor before the POC starts?

share the categories and weights, not the specific thresholds. otherwise the vendor will optimize their POC environment to hit your numbers narrowly.

what if my stakeholders disagree on weights?

force the conversation before the POC. a 30-minute alignment meeting saves a 4-week debate at the end.

should success criteria differ between vendors evaluated in parallel?

no. same criteria, same weights, same thresholds. anything else makes the comparison invalid.

how do I handle a vendor that says “we are working on that” for a critical criterion?

ask for a date. if they commit in writing to delivery before contract start, accept conditionally with a contractual exit if the date is missed. if they wave the question away, fail the criterion.

are these criteria valid for emulator-only vendors too?

mostly yes, with minor adjustments. drop “device wipe verified between sessions” since emulators reset by definition. add “concurrency limit per region” since emulators scale differently.

ready to write your criteria first and run the POC second? open a cloudf.one trial so you have a benchmark vendor running while you draft the scorecard.