Why This Is The Standard
The MCP ecosystem moves fast. Security signals don't. We built a single, reproducible framework so enterprise teams can evaluate servers consistently — and server authors know exactly what "secure" means before they ship.
📖 Published Methodology
Every scoring decision is documented and reproducible. No hidden weights, no subjective adjustments — run the same checklist and get the same score.
🏢 Enterprise-Ready
Procurement teams can trace every score back to specific, auditable criteria. Maps directly to SOC 2, ISO 27001, and zero-trust evaluation frameworks.
✅ Accountable Attestation
Self-attestations are spot-checked quarterly. Falsified claims result in public score downgrades — accountability by design, not by trust.
🔮 Built for What's Next
Phase 2 brings automated dependency scanning via Socket.dev. The framework is designed to grow: more dimensions, more automation, same open methodology.
Scoring Methodology (0-100)
Each server is scored across 6 security dimensions. The first 5 dimensions are manually audited (20 pts each, max 100). Dependency Health is a separate 0–20 score automated via Socket.dev in Phase 2.
Transport Security
Max 20 pts
How the MCP server communicates with clients. Stdio is inherently safer than network-exposed transports.
| Level | Score | Risk |
|---|---|---|
| Stdio (Local Process) | 20 | ✓ Low |
| SSE/HTTP with Authentication | 15 | ⚠ Medium |
| SSE/HTTP without Authentication | 5 | ✗ High |
Authentication Method
Max 20 pts
How the server verifies client identity. Strong authentication prevents unauthorized access and confused deputy attacks.
| Level | Score | Risk |
|---|---|---|
| SSO/SAML | 20 | ✓ Low |
| OAuth2 | 16 | ✓ Low |
| API Key | 10 | ⚠ Medium |
| None | 0 | ✗ High |
Token Lifecycle
Max 20 pts
How long authentication tokens remain valid. Short-lived tokens with refresh limit the damage window if a token is compromised.
| Level | Score | Risk |
|---|---|---|
| Short-lived (with Refresh) | 20 | ✓ Low |
| Long-lived (Static) | 8 | ⚠ Medium |
| N/A (No Auth) | 0 | ✗ High |
Input Handling
Max 20 pts
How the server processes user/LLM inputs. Parameterized inputs prevent injection attacks. Shell string concatenation enables command injection.
| Level | Score | Risk |
|---|---|---|
| Parameterized (Safe) | 20 | ✓ Low |
| Mixed | 10 | ⚠ Medium |
| Shell Strings (Risky) | 2 | ✗ High |
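To make the difference between the "Parameterized" and "Shell Strings" levels concrete, here is a minimal Python sketch. It assumes a server that shells out to a child process; the helper name `run_parameterized` is hypothetical and not part of any MCP SDK. The untrusted value is passed as a single argument-vector element, so no shell ever parses it.

```python
import subprocess
import sys


def run_parameterized(user_text: str) -> str:
    """Echo untrusted text via a child process without invoking a shell."""
    result = subprocess.run(
        # Each list element is one argv entry; metacharacters stay literal.
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_text],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


# Risky pattern, shown for contrast only (do not use):
#   subprocess.run(f"echo {user_text}", shell=True)
# With the payload below, a POSIX shell would run two commands.

payload = "hi; echo INJECTED"
echoed = run_parameterized(payload)  # the payload comes back as plain text
```

The same idea applies to SQL (bound parameters) and to any other interpreter that the server feeds with LLM-supplied text.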
Data Residency
Max 20 pts
Where server data flows and is stored. Local-only means data never leaves your machine. Cloud residency means data passes through third-party APIs.
| Level | Score | Risk |
|---|---|---|
| Local Only | 20 | ✓ Low |
| Hybrid | 12 | ⚠ Medium |
| Cloud | 6 | ⚠ Medium |
| Unknown | 0 | ✗ High |
Dependency Health
Separate 0–20 score
Whether the dependency tree is free of known CVEs, malicious packages, and supply chain risks. Automated scanning via Socket.dev is rolling out in Phase 2.
| Level | Score | Risk |
|---|---|---|
| Clean (No CVEs) | 20 | ✓ Low |
| Warnings (Minor CVEs) | 10 | ⚠ Medium |
| Critical CVEs Present | 2 | ✗ High |
| Unscanned | 0 | ✗ High |
Score Tiers
Secure
80+ points
Safe for enterprise deployment. Strong authentication, safe input handling, and clear data boundaries.
Moderate
50–79 points
Acceptable for internal use with risk acceptance. May use long-lived tokens or cloud data flows, but with proper authentication in place.
At Risk
0–49 points
Not recommended for production. Lacks authentication, uses shell strings, or has uncontrolled data exfiltration vectors.
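The arithmetic behind a score can be sketched in a few lines of Python. The point values are taken from the five manually audited tables above, and the tier thresholds (80 and 50) from the tier descriptions. The dictionary keys and function names are hypothetical labels chosen for this example, not identifiers from our tooling.

```python
# Point values copied from the five manually audited dimension tables.
TRANSPORT = {"stdio": 20, "http_auth": 15, "http_no_auth": 5}
AUTH = {"sso_saml": 20, "oauth2": 16, "api_key": 10, "none": 0}
TOKEN = {"short_lived": 20, "long_lived": 8, "no_auth": 0}
INPUT = {"parameterized": 20, "mixed": 10, "shell_strings": 2}
RESIDENCY = {"local": 20, "hybrid": 12, "cloud": 6, "unknown": 0}


def total_score(transport: str, auth: str, token: str,
                input_handling: str, residency: str) -> int:
    """Sum the five audited dimensions (max 100); Dependency Health is separate."""
    return (TRANSPORT[transport] + AUTH[auth] + TOKEN[token]
            + INPUT[input_handling] + RESIDENCY[residency])


def tier(score: int) -> str:
    """Map a 0-100 score to its tier."""
    if score >= 80:
        return "Secure"
    if score >= 50:
        return "Moderate"
    return "At Risk"


# A local stdio server with a static API key: 20 + 10 + 8 + 20 + 20 = 78.
example = total_score("stdio", "api_key", "long_lived", "parameterized", "local")
example_tier = tier(example)  # "Moderate"
```

Because every input is a table lookup, two reviewers who agree on the levels always arrive at the same score.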
Full Audit Checklist
The complete checklist our auditors use when evaluating MCP servers. Each item maps to one of the 6 dimensions. Copy this for your own vendor reviews.
Transport Encryption
- ☐ Does the server use stdio transport (inherently local, no network exposure)?
- ☐ If SSE/HTTP: is TLS 1.2+ enforced with no plaintext HTTP fallback?
- ☐ If SSE/HTTP: is the server certificate from a trusted CA or explicitly pinned?
- ☐ If SSE/HTTP: are CORS origins restricted — no wildcard `*`?
- ☐ If SSE/HTTP: is the listening address bound to localhost for local use?
- ☐ Is the transport configuration documented in the README?
- ☐ Are deprecated or unencrypted transport modes explicitly disabled?
- ☐ Is transport downgrade protection in place?
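For reviewers who want to see what the TLS and bind-address items look like in practice, here is a minimal sketch using Python's standard `ssl` module. It assumes a Python-based server; the certificate paths and the port are hypothetical placeholders, and other runtimes expose equivalent settings under different names.

```python
import ssl


def make_server_tls_context() -> ssl.SSLContext:
    """Build a server-side TLS context that refuses anything below TLS 1.2."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Hypothetical paths; load the real certificate chain before serving:
    # ctx.load_cert_chain("server.crt", "server.key")
    return ctx


# For local-only use, bind to loopback rather than 0.0.0.0 (port is an example).
LOCAL_BIND_ADDRESS = ("127.0.0.1", 8443)

ctx = make_server_tls_context()
```

An auditor can confirm the first item by checking that no plaintext listener is started alongside the TLS one.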
Auth Methods
- ☐ Does the server require authentication before serving any tool call?
- ☐ Is OAuth2 implemented with PKCE (not implicit grant flow)?
- ☐ Are token scopes following the least-privilege principle?
- ☐ Are API keys stored in environment variables — not hardcoded in source?
- ☐ Is there a documented auth setup guide in the README?
- ☐ Do auth error messages avoid leaking implementation details?
- ☐ Is there rate limiting on authentication endpoints?
- ☐ Are authentication failures logged without exposing credentials?
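Two of the items above (keys in environment variables, no credential leakage) can be illustrated with a short Python sketch. The variable name `MCP_API_KEY` and both function names are assumptions made for this example. The comparison uses `hmac.compare_digest`, which is designed to reduce timing side channels.

```python
import hmac
import os


def load_api_key(env_var: str = "MCP_API_KEY") -> str:
    """Read the expected key from the environment instead of source code."""
    key = os.environ.get(env_var)
    if not key:
        # The message names the variable but never echoes a secret value.
        raise RuntimeError(f"{env_var} is not set")
    return key


def is_authorized(presented_key: str, expected_key: str) -> bool:
    """Compare keys in a way that does not short-circuit on the first mismatch."""
    return hmac.compare_digest(presented_key.encode(), expected_key.encode())
```

A caller would reject the request with a generic "unauthorized" response when `is_authorized` returns `False`, and log the failure without the presented key.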
Token Lifecycle
- ☐ Do access tokens expire within 1 hour?
- ☐ Is there a refresh token rotation mechanism?
- ☐ Are revoked tokens immediately invalidated server-side?
- ☐ Are secrets stored securely — not in plaintext configs or logs?
- ☐ Is JWT validation complete (alg, exp, iss, aud claims all checked)?
- ☐ Are tokens bound to a specific client identity?
- ☐ Is there a token revocation endpoint or documented revocation process?
- ☐ Are long-lived credentials (API keys) rotatable without downtime?
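The JWT item is the one most often only partially implemented, so here is a sketch of the four claim checks in Python. It assumes the token's signature has already been verified by a vetted JWT library and that the header and claims are available as dictionaries. The function name and the algorithm allowlist are assumptions for illustration.

```python
import time
from typing import Any, List, Mapping, Optional

# Assumption: the deployment only issues asymmetric-signed tokens.
ALLOWED_ALGS = frozenset({"RS256", "ES256"})


def claim_errors(header: Mapping[str, Any], claims: Mapping[str, Any], *,
                 issuer: str, audience: str,
                 now: Optional[float] = None) -> List[str]:
    """Return a list of failed checks; an empty list means all four passed."""
    now = time.time() if now is None else now
    errors = []
    if header.get("alg") not in ALLOWED_ALGS:  # also rejects "none"
        errors.append("alg not allowed")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        errors.append("exp missing or in the past")
    if claims.get("iss") != issuer:
        errors.append("unexpected iss")
    aud = claims.get("aud")  # may be a string or a list of strings
    if isinstance(aud, str):
        audiences = [aud]
    elif isinstance(aud, (list, tuple)):
        audiences = list(aud)
    else:
        audiences = []
    if audience not in audiences:
        errors.append("unexpected aud")
    return errors
```

A server that skips any one of these checks would not pass the "validation complete" item, even if the signature check itself is correct.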
Input Validation
- ☐ Are all tool inputs parameterized — no string concatenation into commands?
- ☐ Is shell execution avoided, or strictly sandboxed when unavoidable?
- ☐ Are SQL queries using prepared statements with bound parameters?
- ☐ Are file paths validated against directory traversal attacks?
- ☐ Is input size bounded — no unbounded payload acceptance?
- ☐ Are deserialization inputs schema-validated before processing?
- ☐ Is prompt injection detection or sanitization in place?
- ☐ Are URL inputs validated against SSRF (Server-Side Request Forgery)?
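As an illustration of the directory traversal item, the following Python sketch resolves a requested path and rejects anything that escapes a base directory. The function name and the example base directory are hypothetical, and `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def resolve_inside(base_dir: str, requested: str) -> Path:
    """Resolve `requested` under `base_dir`, rejecting paths that escape it."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes the allowed directory")
    return candidate
```

With a base of `/srv/mcp-data`, a request for `reports/q1.txt` resolves normally, while `../../etc/passwd` or an absolute path such as `/etc/passwd` raises `ValueError`.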
Data Flow
- ☐ Is all data processed locally by default?
- ☐ Are outbound API calls documented and kept to the minimum necessary?
- ☐ Can cloud features be disabled for air-gapped or compliance deployments?
- ☐ Is sensitive data (PII, secrets) redacted from logs?
- ☐ Are data retention policies documented?
- ☐ Is there a data processing agreement for SaaS deployments?
- ☐ Is network egress restricted to allowlisted endpoints?
- ☐ Are third-party data recipients disclosed in the README or privacy policy?
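The egress allowlist item can be enforced at the application layer as well as at the network layer. The sketch below shows one possible application-level check in Python; the host `api.example.com` and the function name are placeholders. The same check also helps with the SSRF item in the Input Validation section.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_HOSTS = frozenset({"api.example.com"})


def is_egress_allowed(url: str) -> bool:
    """Allow only HTTPS requests to explicitly allowlisted hostnames."""
    try:
        parts = urlsplit(url)
    except ValueError:  # malformed URL, e.g. unbalanced IPv6 brackets
        return False
    # hostname excludes userinfo and port, so "allowed@evil" tricks fail.
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

An application-level check like this complements, but does not replace, firewall or proxy rules that restrict egress for the whole process.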
Dependency Health (Phase 2)
- ☐ Are all dependencies from verified, reputable publishers?
- ☐ Are direct dependencies free of known CVEs?
- ☐ Are transitive (indirect) dependencies free of known CVEs?
- ☐ Are dependency versions pinned in a committed lockfile?
- ☐ Is there a documented dependency update and review policy?
- ☐ Are deprecated or abandoned packages avoided?
- ☐ Has the dependency tree been audited (`npm audit` / `pip-audit` / `cargo audit`)?
- ☐ Does Socket.dev scan show no malware or suspicious behavior flags?
⚡ Automated via Socket.dev — Phase 2 rollout
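Until automated scanning lands, the pinning item can be spot-checked by hand. The sketch below is a rough illustration only: it assumes a pip-style requirements file in which pinned entries use `==`, and the function name is hypothetical. It does not replace a real lockfile or any of the audit tools named above.

```python
from typing import Iterable, List


def unpinned_requirements(lines: Iterable[str]) -> List[str]:
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


sample = ["requests==2.31.0", "flask>=2.0", "# tooling", "", "numpy"]
flagged = unpinned_requirements(sample)  # ["flask>=2.0", "numpy"]
```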
Self-Attestation for Server Authors
Authored an MCP server? Submit your own security details. We verify 10% of attestations quarterly. Falsified claims result in a public score reduction.
Spot-Check Verification Process
Random Selection
10% of all attested servers are randomly selected each quarter for re-verification.
Manual Review
Auditors clone the repository and verify each claimed security property against the source code.
Score Adjustment
Verified attestations maintain their score. Discrepancies result in score downgrades published on the server detail page.
Public Record
Spot-check results are published in our quarterly transparency report, linked from this page.
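The Random Selection step can be sketched as follows. This is an illustration rather than our production tooling: the function name is hypothetical, rounding the 10% sample up to a whole server is an assumption, and the optional seed is included only so that a published draw could be reproduced.

```python
import random
from typing import List, Optional, Sequence


def select_for_spot_check(server_ids: Sequence[str], percent: int = 10,
                          seed: Optional[int] = None) -> List[str]:
    """Pick `percent` of the attested servers, rounded up, without repeats."""
    if not server_ids:
        return []
    # Integer ceiling division avoids floating-point surprises (e.g. 30 * 0.1).
    k = (len(server_ids) * percent + 99) // 100
    rng = random.Random(seed)
    return rng.sample(list(server_ids), k)


attested = [f"server-{i}" for i in range(30)]
picked = select_for_spot_check(attested, seed=2024)  # 3 of 30 servers
```

Publishing the seed alongside the quarterly report would let anyone re-run the draw and confirm the same servers were selected.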
Explore Audited Servers
Browse MCP servers with transparent security scores. Filter by Local-Only, OAuth2, and more.