LLM orchestrator selection matrix

OculiX MCP is neutral with respect to the LLM that orchestrates it. This document provides the factual matrix to help select an orchestrator compatible with GDPR, EU AI Act, DORA, NIS2, HDS, SecNumCloud — or simply with a sovereignty strategy.

1. Positioning and scope

1.1 OculiX is neutral with respect to the LLM orchestrator

The OculiX MCP server (oculixmcp) exposes 11 visual automation tools (click, find, type, screenshot, etc.) via the Model Context Protocol (MCP), with an Ed25519-signed audit trail and ActionGate access control.

OculiX does not embed an LLM. The orchestrator — the language model that decides which MCP tools to call, in what order, with which arguments — is provided by the customer, in the customer’s environment, under the customer’s responsibility.

Concretely:

flowchart LR
    A["LLM Orchestrator<br/>(customer pick)"] <-->|MCP| B["OculiX MCP<br/>Server"]
    B <-->|OculiX| C["Application<br/>under test"]
    A -.-> D["Customer data<br/>(prompts, screenshots,<br/>UI context, logs)"]
    style D fill:#fff3cd,stroke:#ffc107,color:#856404
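
The MCP link in the diagram carries JSON-RPC 2.0 messages. As a minimal sketch, the snippet below builds a `tools/call` request of the kind an orchestrator would send. The tool name `click` comes from the list above; the helper `build_tool_call` and the argument name `target` are hypothetical and are not taken from the OculiX tool schema.

```python
import json

def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)

# Hypothetical argument name; the real OculiX tool schema may differ.
request = build_tool_call(1, "click", {"target": "login_button.png"})
print(request)
```

Everything under `arguments` is produced by the LLM, which is why prompts and UI context necessarily pass through the orchestrator.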

All sensitive data (screenshots, prompts, application context, UI traces) transits through the LLM orchestrator before or during the call to OculiX tools. So the LLM choice determines:

the jurisdiction where data is processed,
the retention policy,
eligibility for GDPR / EU AI Act / DORA / NIS2 / HDS / SecNumCloud,
operational cost,
the robustness of tool calling and therefore the reliability of scenarios.

OculiX does not opine on the LLM choice. This document provides the factual matrix to help the customer choose with full knowledge.

1.2 What this document is not

Not a “best LLM” ranking — the concept makes no sense out of context
Not legal advice — this document does not bind OculiX or any associated commercial entity under GDPR or AI Act
Not a quality benchmark — those exist elsewhere (lmarena.ai, artificialanalysis.ai, scale.com/leaderboard)
Not a real-time-updated guarantee — vendor terms evolve; re-verify before signature

1.3 Reading assumptions

The reader is assumed to know:

the basics of MCP (Anthropic, November 2024) and tool calling
the basics of GDPR (Art. 28, Art. 44–46, Chapter V)
the basics of the AI Act (Annex III, Art. 6, Art. 26, 2026–2028 calendar)
the provider / processor / deployer distinction

See also: Security & compliance of OculiX itself, which covers the security posture of the OculiX project (MIT, no cloud, no telemetry, Ed25519 audit) and is complementary to this matrix.

2. Evaluation criteria

Seven criteria are used. They answer concrete operational questions a DPO, CISO, or Procurement officer asks when choosing an LLM for an OculiX integration.

2.1 Physical hosting of inference servers

Question: where, geographically, is the compute performed?

This is the simplest question, but it is often poorly handled. An “EU” endpoint can in fact be a proxy to the US (as was the case for several “EU” offerings up to 2025). Verify in the DPA and the sub-processor list where GPU inference actually runs, not only where the access endpoint is located.

2.2 Vendor’s legal jurisdiction

Question: which jurisdiction can compel the vendor to disclose data?

US-incorporated company → subject to the CLOUD Act (2018) and FISA 702, even if datacenters are in Europe
Chinese company → subject to the National Intelligence Law 2017 (Art. 7), which obliges any Chinese entity to cooperate with intelligence services
European company (SAS, SA, GmbH, etc.) → subject only to GDPR and national laws
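
As an illustration only, the three cases above can be encoded as a lookup keyed on the vendor's country of incorporation rather than on datacenter location. The names `DISCLOSURE_LAWS` and `disclosure_exposure` are hypothetical, and the sketch is deliberately simplified: it does not model parent-company control over an EU subsidiary.

```python
# Extraterritorial disclosure regimes named in this section, keyed by the
# vendor's country of incorporation (ISO 3166-1 alpha-2 codes).
DISCLOSURE_LAWS = {
    "US": ["CLOUD Act (2018)", "FISA 702"],
    "CN": ["National Intelligence Law 2017 (Art. 7)"],
}

def disclosure_exposure(country_code: str) -> list[str]:
    """Return the foreign disclosure regimes listed above for a vendor.

    An empty list means none of the regimes in this section applies;
    GDPR and national laws still do.
    """
    return DISCLOSURE_LAWS.get(country_code.upper(), [])

print(disclosure_exposure("us"))
print(disclosure_exposure("FR"))
```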

2.3 Retention policy (ZDR — Zero Data Retention)

Question: how long are inputs/outputs stored after inference?

Three typical cases:

Default retention: 30 days (anti-abuse), with opt-out sometimes available (OpenAI, Mistral, Anthropic)
Contractual ZDR: no post-inference storage (available on Enterprise / API tier)
Indefinite storage: default on free consumer offerings (to be avoided for pro use)

Pitfall: ZDR generally covers only the absence of persistence after the response has been returned. It does not cover data held while inference is in progress, nor what subprocessors handle during processing.

2.4 Training-on-customer-data policy

Question: can customer prompts/outputs be used to train the vendor’s future models?

Major vendors’ enterprise APIs: no by default, as a contractual commitment
Consumer offerings (ChatGPT Free/Pro, Claude.ai Free/Pro): variable, opt-in or opt-out depending on the period; not suitable for professional B2B use
Public DeepSeek, some Chinese offerings: yes by default, to be avoided for sensitive data

2.5 GDPR and AI Act compliance

Question: does the vendor have the contractual commitments needed to allow the customer (deployer) to meet their own GDPR and AI Act obligations?

Minimum checklist:

Signed DPA (GDPR Article 28)
SCC (Standard Contractual Clauses) if transfer outside the EEA
Documented subprocessors
Commitment to non-use for training
Logs accessible for AI Act Art. 12 obligations (traceability)
Model documentation sufficient for AI Act Art. 11 + Annex IV
HIPAA BAA for US healthcare, HDS for French healthcare
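
A minimal sketch of how the checklist above could be tracked during vendor review. The class and field names are hypothetical, and the sector-specific item (HIPAA BAA / HDS) is left out because it only applies to healthcare customers.

```python
from dataclasses import dataclass

@dataclass
class VendorCommitments:
    """One boolean per generic checklist item from this section."""
    signed_dpa: bool                # GDPR Art. 28
    scc_if_non_eea_transfer: bool   # True also when no transfer occurs
    subprocessors_documented: bool
    no_training_commitment: bool
    logs_accessible: bool           # AI Act Art. 12
    model_documentation: bool       # AI Act Art. 11 + Annex IV

def missing_items(vendor: VendorCommitments) -> list[str]:
    """Return the names of checklist items the vendor does not satisfy."""
    return [name for name, ok in vars(vendor).items() if not ok]

# Hypothetical vendor used only to exercise the function.
example = VendorCommitments(True, True, False, True, False, True)
print(missing_items(example))
```

An empty result means the minimum checklist is met; it does not replace reading the effective DPA.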

2.6 Self-hosted / on-premise deployment

Question: can the model be run in your own datacenter, or even fully air-gapped?

Downloadable open weights (Apache 2.0, MIT, Llama Community License): yes, fully
Proprietary models in dedicated VPC: possible with Mistral (Le Chat Enterprise), partially with OpenAI (dedicated Azure), Anthropic (via Bedrock PrivateLink), but this is not true on-prem — the weights stay with the vendor
Pure SaaS API: no self-hosting possible

Self-hosting open weights is the only way to obtain complete sovereign independence. Hardware cost to anticipate: see section 3.6.

2.7 Tool calling and MCP compatibility

Question: can the model call tools reliably, deterministically, in parallel?

Technical criteria:

Native support for function/tool calling (vs manual prompt engineering)
Accuracy on BFCL (Berkeley Function Calling Leaderboard) benchmarks
Native MCP protocol support client-side or via bridge (MCPHost, ollmcp, LiteLLM, llama.cpp)
Tool call streaming
Call parallelization (multiple tools in parallel within one response)
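
To make the last criterion concrete, here is a hedged sketch of what call parallelization means on the client side: when one model response contains several tool calls, they can be dispatched concurrently. `call_tool` is a stand-in that only simulates latency; it is not an OculiX or MCP SDK function, and the argument names are hypothetical.

```python
import asyncio

async def call_tool(name: str, arguments: dict) -> dict:
    """Stand-in for a real MCP tool invocation (hypothetical)."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return {"tool": name, "arguments": arguments, "status": "ok"}

async def run_parallel(tool_calls: list[dict]) -> list[dict]:
    """Run every tool call from one model response concurrently.

    asyncio.gather returns results in the order the calls were given,
    so each result can be matched back to its originating call.
    """
    tasks = [call_tool(c["name"], c["arguments"]) for c in tool_calls]
    return await asyncio.gather(*tasks)

calls = [
    {"name": "screenshot", "arguments": {}},
    {"name": "find", "arguments": {"target": "submit.png"}},
]
results = asyncio.run(run_parallel(calls))
print([r["tool"] for r in results])
```

Whether parallel dispatch is safe depends on the tools involved: two read-only calls can overlap, whereas UI actions such as click and type usually need to stay ordered.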

State of the art as of May 2026:

Tier 1 (production-grade): Claude Sonnet/Opus 4.x, GPT-4.1/5, Gemini 3.1 Pro
Tier 2 (excellent): Mistral Large 3, Qwen 3.5, Llama 4
Tier 3 (functional but to validate): Gemma 4, DeepSeek V3.x, Phi-4

3. Detailed vendor matrices

The data below reflects the state as of May 9, 2026 and may evolve. Always re-verify in the effective DPA at signature time.

3.1 Anthropic Claude

Criterion	Status
Company	Anthropic PBC, Delaware (US)
Jurisdiction	United States (CLOUD Act, FISA 702 applicable)
Models available via API	Claude Opus 4.7, Claude Sonnet 4.6, Claude Haiku 4.5
Direct API hosting	Mostly US; `inference_geo=eu` option in beta, storage US in all cases
Hosting via AWS Bedrock	EU possible (eu-central-1 Frankfurt) — under AWS / Delaware jurisdiction
Hosting via GCP Vertex AI	EU possible (europe-west1, europe-west4) — under Google / Delaware jurisdiction
Hosting via Azure Foundry	EU “Coming 2026” — not yet effective on Foundry
ZDR	Available by separate addendum, on Enterprise/API tier — not by default
DPA / SCC	Included automatically (DPA v.01/01/2026) on Team, Enterprise and commercial API
API training opt-out	Opt-out by default (Anthropic does not train on API data)
Certifications	SOC 2 Type II, ISO 27001:2022, ISO 42001:2023, HIPAA BAA, FedRAMP High (Claude for Government)
Self-hosted	No. Weights not public. No on-premise option.
Tool calling	Excellent. Industry reference. Native streaming and parallel tools.
MCP	Creator of the protocol (Nov. 2024). Reference native support.
Indicative pricing	Sonnet 4.6: ~$3/M input, ~$15/M output. Opus 4.7: ~$15/M input, ~$75/M output

Verdict: excellent tool-calling quality and native MCP support (a concrete advantage for OculiX), but US-incorporated. To achieve EU data residency and remain legally defensible under GDPR, you must route through Bedrock (eu-central-1) or Vertex AI (europe-west). The CLOUD Act remains applicable even in this case, and that residual risk should be documented in the DPIA.

3.2 OpenAI (GPT)

Criterion	Status
Company	OpenAI OpCo LLC (Delaware, US) / OpenAI Ireland Ltd (secondary EU entity)
Jurisdiction	United States (CLOUD Act, FISA 702 applicable — Ireland entity insufficient alone)
Models available	GPT-5, GPT-4.1, o-series (reasoning)
Direct API hosting	EU residency option available for eligible projects, with forced ZDR
Hosting via Azure OpenAI	EU Data Zone available (West Europe, North Europe, etc.), full control via Azure tenant
ZDR	Available: automatic on EU residency projects, on request for US projects (Limited Access program)
DPA / SCC	Included in standard commercial terms
API training opt-out	Opt-out by default on API (no training on API/business data since March 2023)
Certifications	SOC 2 Type II, ISO 27001, HIPAA BAA (via Azure), CSA STAR
Self-hosted	No. Weights not public. Open gpt-oss models under evaluation (limited).
Tool calling	Excellent. Historical function-calling reference since June 2023.
MCP	Native support since 2025 (Responses API, ChatGPT Apps)
Indicative pricing	GPT-5: ~$1.25/M input, ~$10/M output. GPT-4.1: ~$2/M input, ~$8/M output

Verdict: Azure OpenAI in EU Data Zone is the most compliance-defensible path for a customer that wants to stay in the OpenAI ecosystem. The CLOUD Act remains applicable but the operational scope is better controlled (logs, IAM, VNet). For strict sovereign use (government, defense, French HDS healthcare), insufficient as is.

3.3 Mistral AI

Criterion	Status
Company	Mistral AI SAS, Paris (France)
Jurisdiction	France / European Union (GDPR, no CLOUD Act applicable)
Models available	Mistral Large 3 (MoE, 256k context), Pixtral Large, Devstral 2 (code), Mistral Medium 3, Mistral Small/Nemo (open weights)
Direct API hosting (La Plateforme / Mistral AI Studio)	France and EU by default. No routing to the US.
Hosting via AWS / Azure / GCP Marketplace	Available. Effective jurisdiction depends on the chosen cloud operator.
Self-hosted	Yes, officially supported: self-hosted, private cloud, VPC, on-premise via TensorRT-LLM, vLLM, Ollama, llama.cpp
ZDR	Available (toggle parameter on API). Default 30-day rolling retention for anti-abuse monitoring.
DPA / SCC	Included. SCC not needed for EU customers (no transfer outside the EEA).
Training opt-out	Opt-out by default on Team and Enterprise. No training on customer data under enterprise terms.
Certifications	SOC 2 Type II, ISO 27001 (extension in progress). GPAI Code of Practice commitment.
Open-weight models	Several models published under Apache 2.0: Mistral 7B, Mixtral 8x7B/8x22B, Nemo, Small 3, Devstral, Codestral Mamba
Tool calling	Very good (Mistral Large 3 and Medium 3). Documented native function calling.
MCP	Compatible support via official SDK and OpenAI-compatible gateways (LiteLLM, etc.)
Indicative pricing	Mistral Medium 3: ~$0.40/M input. Mistral Large 3: ~$2/M input, ~$6/M output

Verdict: the only frontier-class native EU offering without CLOUD Act. Complete stack: France-hosted SaaS, VPC, on-prem, open-weight models. Adopted by HSBC, SAP, French/German governments for sovereign stacks. Recommended default choice for regulated EU OculiX customers unless specific technical constraints apply. Limitation: tooling ecosystem younger than OpenAI/Anthropic.

3.4 Google Gemini

Criterion	Status
Company	Google LLC, Mountain View (US) / Alphabet Inc. (US)
Jurisdiction	United States (CLOUD Act, FISA 702 applicable)
Models available	Gemini 3.1 Pro, Gemini 3 Flash, Gemini 2.5 Pro, Gemini 2.0 Flash
Hosting via Vertex AI	EU possible: europe-west1 (Belgium), europe-west4 (Netherlands). But Gemini 3.x not yet in EU as of May 9, 2026 — only the 2.x generations are GDPR-compatible in EU.
ZDR	Available on Vertex AI Enterprise (paid option)
DPA / SCC	Included in Google Cloud Terms
Training opt-out	Opt-out by default on Vertex AI (no training on customer prompts)
Certifications	SOC 1/2/3, ISO 27001, 27017, 27018, 27701, 42001, FedRAMP High, HIPAA BAA
Self-hosted	No. No Gemini open-weight models. Gemma 4 variants available open-weight (but far less capable than Gemini 3).
Tool calling	Excellent (Gemini 3.1 Pro). Native function calling, structured output, parallel calls.
MCP	Official support since 2026 (Vertex AI Agent Builder, Gemini Enterprise)
Indicative pricing	Gemini 3.1 Flash: ~$0.30/M input, ~$2.50/M output. Gemini 3.1 Pro: ~$2/M input, ~$10/M output

Verdict: excellent capability/price ratio, but two major traps:

the 3.x models are not (yet) available in EU as of May 9, 2026 → need to use Gemini 2.5 Pro to remain GDPR-compatible, which degrades tool-calling quality
US jurisdiction, CLOUD Act applicable

For a non-regulated customer, OK. For a customer with sovereignty constraints, unsuitable.

3.5 DeepSeek

Criterion	Status
Company	Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd. (China)
Jurisdiction	People’s Republic of China — National Intelligence Law 2017 applicable
Models available	DeepSeek V4, DeepSeek R1 (reasoning), DeepSeek Coder
Official API hosting	Servers in China. No EU residency available.
Hosting via third parties (Atlas Cloud, Baseten, Fireworks, Together)	Variable. Verify the sub-processor.
ZDR	Not guaranteed on official API. Partial documentation.
DPA / SCC	Nearly nonexistent. DeepSeek contested GDPR applicability in 2025.
Training opt-out	No, by default, on official API. Data used for training.
Certifications	No recognized EU certification.
EU regulatory status	Italy (Garante) banned DeepSeek in January 2025. Active investigations in France, Germany, Belgium, Ireland, Netherlands.
Open-weight models	Yes, under MIT license — this is the safe usage angle: self-host the weights and ignore the official API
Tool calling	Decent (V3.x), to validate in production
MCP	Via OpenAI-compatible bridge on self-hosted side
Indicative pricing (official API)	Very low (~$0.07/M input). Not suitable for professional use in the EU.

Verdict: the official DeepSeek API is to be avoided for any EU customer processing personal or industrial-sensitive data. However, the open weights are among the most performant freely available and can be self-hosted without the jurisdictional exposure of the official API. Conclusion: yes to the weights, no to the API.

3.6 Self-hosted open-weight models

Generic case: the customer downloads the weights and runs them on their own infrastructure (datacenter, private cloud, or own on-prem GPUs).

Model	License	Sizes	Tool calling	Ideal self-host
Llama 4 (Meta)	Llama Community License	8B, 70B, 405B	Good	vLLM
Mistral Small 3 / Nemo / Mixtral	Apache 2.0	7B to 8x22B	Good to very good	vLLM, llama.cpp
Qwen 3.5 (Alibaba)	Apache 2.0	0.5B to 72B	Very good	vLLM, llama.cpp
Gemma 4 (Google)	Gemma Terms	2B to 27B	Good (big jump since Gemma 3)	Ollama, vLLM
DeepSeek V3 / R1	MIT	671B (MoE)	Decent	vLLM
Phi-4 (Microsoft)	MIT	14B	Average	Ollama, LM Studio

Self-host advantages:

Jurisdiction = your datacenter’s. No CLOUD Act, no FISA 702, no National Intelligence Law 2017 exposure; GDPR compliance stays under your direct control.
Air-gap possible (no outbound connection)
Very low marginal cost once hardware is amortized: ~$0.001 to $0.04/M tokens in electricity vs ~$2.50 to $15/M in cloud API
Typical hardware ROI: less than 4 months above ~30M tokens/day
Full auditability: frozen weights, traceable version, no silent “model drift” on the vendor side

Drawbacks:

Initial capex: a 2x H100 80GB node costs ~€50-80k (rental ~€3-5k/month)
Internal MLOps competence needed (vLLM tuning, GPU K8s, monitoring)
Open-weight models trail the proprietary frontier by 6 to 12 months
Stack maintenance (vLLM updates, OS security, CUDA drivers)
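
The capex and marginal-cost figures above can be combined into a rough break-even estimate. This is a simplified sketch, not a full TCO model: it ignores staffing, maintenance and hardware depreciation, and it treats euros and dollars at parity for illustration. The function name and parameters are hypothetical.

```python
def break_even_days(capex: float, tokens_per_day_m: float,
                    api_price_per_m: float,
                    self_host_price_per_m: float) -> float:
    """Days until hardware capex is repaid by per-token savings.

    tokens_per_day_m is the daily volume in millions of tokens;
    both prices are per million tokens.
    """
    daily_saving = tokens_per_day_m * (api_price_per_m - self_host_price_per_m)
    if daily_saving <= 0:
        raise ValueError("self-hosting never pays back at these prices")
    return capex / daily_saving

# 50k capex, 30M tokens/day, upper and lower ends of the API price range.
high_api = break_even_days(50_000, 30, 15.0, 0.04)
low_api = break_even_days(50_000, 30, 2.50, 0.04)
print(round(high_api), round(low_api))
```

Under these assumptions, payback takes about 111 days against a $15/M API price and about 678 days against a $2.50/M price, so the sub-4-month figure corresponds to the upper end of the API price range.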

Recommended stacks as of May 9, 2026:

Prototyping / dev: Ollama (simple, OpenAI-compatible, MCP via MCPHost)
Multi-user production: vLLM 0.17+ (PagedAttention, continuous batching, Anthropic API compat since v0.17)
CPU-only air-gap: llama.cpp (native MCP support since March 2026)
Managed EU cluster: Mistral AI Studio “Enterprise-Supported Self-Deployment”, or sovereign operators like Scaleway/OVH with dedicated LLM offerings

3.7 Inference accelerators (Groq, Cerebras, SambaNova)

These vendors are not model publishers. They operate specialized hardware (LPU, WSE, RDU) that serves third-party open-weight models (Llama, Mistral, Qwen, DeepSeek) at very low latency.

Vendor	Hosting	Jurisdiction	Models served	Main interest
Groq	Mostly US, EU coming	US	Llama, Mixtral, Qwen, GPT-OSS	Latency < 100ms, record throughput
Cerebras	US	US	Llama, Qwen, DeepSeek	Massive throughput (3000+ tok/s)
SambaNova	US	US	Llama, DeepSeek	Throughput

Verdict: interesting for latency (OculiX MCP benefits from fast responses since there are many round-trips), but US jurisdiction on all major players as of May 9, 2026. For regulated EU customers, not a sovereign path. For non-regulated customers wanting an ultra-fast orchestrator, excellent latency/price ratio.

4. Summary table

Legend:

OK: criterion fully satisfied in standard configuration
CONF: satisfied through specific configuration (Bedrock EU, Vertex EU, ZDR addendum, etc.)
NO: not available or unsatisfactory

Vendor	Non-US jurisdiction	EU hosting	ZDR	GDPR/DPA	Self-host	Tool calling	Native MCP
Anthropic Claude (direct API)	NO	CONF	CONF	OK	NO	OK	OK
Anthropic via Bedrock EU	NO (AWS)	OK	OK	OK	NO	OK	OK
Anthropic via Vertex EU	NO (Google)	OK	OK	OK	NO	OK	OK
OpenAI direct API EU residency	NO	OK	OK	OK	NO	OK	OK
OpenAI via Azure EU Data Zone	NO (MS)	OK	OK	OK	NO	OK	OK
Mistral La Plateforme	OK	OK	OK	OK	OK	OK	CONF
Mistral self-hosted	OK	OK	OK	OK	OK	OK	CONF
Google Gemini 3.x via Vertex EU	NO	NO (not available in EU as of May 9, 2026)	CONF	OK	NO	OK	OK
Google Gemini 2.5 via Vertex EU	NO	OK	CONF	OK	NO	OK	OK
DeepSeek official API	NO (China)	NO	NO	NO	NO	OK	NO
DeepSeek self-hosted (MIT weights)	OK (per DC)	OK	OK	OK	OK	OK	CONF
Llama 4 self-hosted	OK (per DC)	OK	OK	OK	OK	OK	CONF
Mixtral / Qwen self-hosted	OK (per DC)	OK	OK	OK	OK	OK	CONF
Groq / Cerebras / SambaNova	NO	NO	CONF	CONF	NO	OK	CONF
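
For readers who want to query the table programmatically, the sketch below transcribes a subset of the rows above and filters them by required criteria. The structure and the `shortlist` helper are hypothetical; the statuses are copied from the table and carry the same May 9, 2026 caveat.

```python
COLUMNS = ("non_us_jurisdiction", "eu_hosting", "zdr", "gdpr_dpa",
           "self_host", "tool_calling", "native_mcp")

# Subset of the summary table, transcribed in column order.
MATRIX = {
    "Anthropic via Bedrock EU":        ("NO", "OK", "OK", "OK", "NO", "OK", "OK"),
    "OpenAI via Azure EU Data Zone":   ("NO", "OK", "OK", "OK", "NO", "OK", "OK"),
    "Mistral La Plateforme":           ("OK", "OK", "OK", "OK", "OK", "OK", "CONF"),
    "Google Gemini 2.5 via Vertex EU": ("NO", "OK", "CONF", "OK", "NO", "OK", "OK"),
    "DeepSeek official API":           ("NO", "NO", "NO", "NO", "NO", "OK", "NO"),
    "Llama 4 self-hosted":             ("OK", "OK", "OK", "OK", "OK", "OK", "CONF"),
}

def shortlist(required, accept=("OK", "CONF")):
    """Return vendors whose status is acceptable on every required column."""
    idx = [COLUMNS.index(c) for c in required]
    return [vendor for vendor, row in MATRIX.items()
            if all(row[i] in accept for i in idx)]

# Sovereignty-first query: non-US jurisdiction, EU hosting, self-hostable.
print(shortlist(["non_us_jurisdiction", "eu_hosting", "self_host"]))
# Strict query: native MCP without any extra configuration.
print(shortlist(["native_mcp"], accept=("OK",)))
```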

5. Recommended deployment profiles

Five typical profiles. Profiles A to D run from most to least constrained on sovereignty; profile E covers the full air-gap case, which is stricter still.

5.1 Profile A — Government, defense, healthcare (HDS), critical-infrastructure operators

Constraints:

SecNumCloud, HDS (French healthcare), Diffusion Restreinte (defense)
No CLOUD Act, no FISA 702
Air-gap possible or required
AI Act high-risk (Annex III) likely

Recommendation: open-weight models self-hosted on SecNumCloud infrastructure or air-gapped on-prem.

Stack:

Model: Mistral Large 3 (if Enterprise license negotiated), Mixtral 8x22B, or Llama 4 70B
Inference engine: vLLM 0.17+
Hosting: OVH SecNumCloud, Outscale, Scaleway, or private datacenter
MCP bridge: native llama.cpp or MCPHost

Acceptable fallback: Mistral AI Studio in self-deployment mode supervised by Mistral.

5.2 Profile B — Regulated mid/large enterprise (banking, insurance, energy)

Constraints:

DORA (since January 2025), NIS2, strict GDPR
Active AI ethics committee and DPO
AI Act high-risk for certain uses (HR, scoring, surveillance)
Possible audit by regulators (ACPR, BaFin, etc.)

Recommendation: Mistral Le Chat Enterprise in private VPC or sovereign cloud.

Stack:

Model: Mistral Large 3 via La Plateforme with enhanced DPA and ZDR enabled
Hosting: Mistral cloud (FR/EU) or self-hosted in VPC on OVH/Scaleway
Backup: Claude via Bedrock eu-central-1 for non-sensitive tasks (with DPIA documenting CLOUD Act residual risk)

5.3 Profile C — Generic EU B2B SaaS (non-regulated)

Constraints:

GDPR applicable
No AI Act high-risk
Cost-sensitive
Need good tool-calling quality

Recommendation: OpenAI via Azure EU Data Zone, or Claude via Bedrock eu-central-1.

The residual CLOUD Act risk must be documented in the DPIA. It is acceptable for an EU B2B customer that does not process particularly sensitive data. Mistral La Plateforme remains the best option if you accept testing a slightly younger ecosystem.

5.4 Profile D — POC, internal R&D, sandbox

Constraints:

Non-sensitive data (anonymized, synthetic, or outside GDPR scope)
Minimal cost
Fast iteration

Recommendation: Anthropic API direct (Claude Sonnet 4.6) or Mistral API.

Native tool calling and MCP, top quality. Explicitly document in an internal policy that this scope does not handle personal or confidential data. Otherwise switch back to profiles A/B/C.

5.5 Profile E — Full air-gap, classified environment

Constraints:

No outbound connection allowed
Datacenter under physical customer control
Long-term bit-for-bit inference reproducibility

Recommendation: open-weight models, llama.cpp or vLLM, on dedicated hardware.

Stack:

Model: Llama 4 70B FP16, or Mixtral 8x22B, or Mistral Small 3 for more modest targets
Inference engine: llama.cpp (CPU possible) or vLLM (dedicated GPU, A100/H100)
Audit: systematic packet capture to verify the absence of any outbound traffic
Storage: cryptographically signed weights, SHA-256 verification at each load
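
The SHA-256 check at load time can be sketched with the standard library alone. This covers integrity verification only; validating the signature on the manifest that supplies the expected hash is a separate step not shown here. The function names are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large weight files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_weights(path: Path, expected_hex: str) -> None:
    """Raise RuntimeError if the file does not match the expected digest."""
    actual = sha256_of(path)
    if not hmac.compare_digest(actual, expected_hex.lower()):
        raise RuntimeError(f"checksum mismatch for {path}")
```

Calling `verify_weights` before handing the file to the inference engine makes a silent swap of the weights fail loudly instead of changing model behavior unnoticed.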

No SaaS offering is eligible. This is the only profile where sovereignty can be verified technically, since packet capture can demonstrate that no data leaves the perimeter.

6. Common pitfalls to avoid

7. Specific recommendations for OculiX MCP integration

7.1 Ed25519 audit trail: to keep customer-side

OculiX’s signed audit trail allows proving, after the fact and in a tamper-evident way, which tools were called with which arguments. It is a major asset for AI Act compliance (Art. 12 — logs and traceability).

Never transmit this audit trail to the LLM orchestrator for re-injection: this would invert the chain of trust. The audit trail is meant for DPOs/auditors, not for the model.

7.2 ActionGate: deterministic authorization policy

OculiX’s ActionGate access control is deterministic and independent of the LLM. Even if an LLM hallucinates or falls victim to prompt injection, ActionGate blocks actions not authorized by the policy.

Consequence: the customer can choose a less “safe” LLM (more creative, less guarded) for orchestration without compromising operational security, provided the ActionGate policy is correctly defined.
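
The following sketch illustrates what a deterministic gate means in practice: a pure function of the policy and the requested call, with no model in the loop. The `Policy` structure and `is_allowed` are hypothetical and do not reflect ActionGate's actual policy format, which this document does not describe.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: a tool allowlist plus forbidden typed text."""
    allowed_tools: frozenset
    blocked_text: tuple = ()

def is_allowed(policy: Policy, tool: str, arguments: dict) -> bool:
    """Pure function: the same inputs always give the same decision."""
    if tool not in policy.allowed_tools:
        return False
    if tool == "type":
        text = str(arguments.get("text", "")).lower()
        return not any(fragment in text for fragment in policy.blocked_text)
    return True

policy = Policy(frozenset({"find", "screenshot", "type"}), ("rm -rf",))
print(is_allowed(policy, "click", {}))                         # not allowlisted
print(is_allowed(policy, "type", {"text": "hello"}))           # permitted
print(is_allowed(policy, "type", {"text": "sudo rm -rf /"}))   # blocked text
```

Because the decision never depends on model output beyond the call itself, a hallucinated or injected request is rejected the same way on every run.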

7.3 Latency: direct impact of LLM choice

OculiX MCP typically does 5 to 50 LLM round-trips per test scenario. Per-call latency accumulates:

Provider	Median TTFT latency
Groq (Llama)	100-200 ms
Cerebras (Llama)	150-300 ms
Claude Sonnet (US direct)	400-800 ms
Claude via Bedrock EU	500-1000 ms
GPT-4.1 via Azure EU	500-1200 ms
Mistral La Plateforme	300-700 ms
Self-hosted vLLM (H100, EU DC)	50-150 ms

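For a 20-call scenario, cumulative TTFT ranges from about 1 to 3 seconds on the fastest row (self-hosted vLLM) to about 10 to 24 seconds on the slowest (GPT-4.1 via Azure EU). Over thousands of executions, this difference is structurally significant for CI/CD.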
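
The accumulation can be checked directly from the table. The sketch below multiplies each TTFT range by the number of round-trips; it counts time to first token only and assumes the calls are sequential, so real scenario durations will be longer. The names are hypothetical.

```python
# Median TTFT ranges in milliseconds, copied from the table above.
TTFT_MS = {
    "Groq (Llama)": (100, 200),
    "Cerebras (Llama)": (150, 300),
    "Claude Sonnet (US direct)": (400, 800),
    "Claude via Bedrock EU": (500, 1000),
    "GPT-4.1 via Azure EU": (500, 1200),
    "Mistral La Plateforme": (300, 700),
    "Self-hosted vLLM (H100, EU DC)": (50, 150),
}

def cumulative_seconds(calls: int) -> dict:
    """Total TTFT (low, high) in seconds for a scenario of `calls` round-trips."""
    return {provider: (low * calls / 1000, high * calls / 1000)
            for provider, (low, high) in TTFT_MS.items()}

for provider, (low, high) in cumulative_seconds(20).items():
    print(f"{provider}: {low:.0f}-{high:.0f} s")
```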

7.4 Reproducibility

For visual non-regression use cases, you ideally want an LLM whose outputs are reproducible. No stochastic LLM is strictly reproducible (even temperature=0 doesn’t guarantee determinism on GPU due to float non-determinism).
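
One root cause is that floating-point addition is not associative, so a parallel reduction that sums the same values in a different order can produce a slightly different result. The effect is visible even in plain Python on a CPU:

```python
# Same three numbers, two groupings: the results differ in the last bit.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)

print(left_first == right_first)  # False
print(left_first, right_first)    # 0.6000000000000001 0.6
```

In a model, such last-bit differences can occasionally change which token ranks first, after which the generated text diverges.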

OculiX mitigation: the deterministic layer (Sikuli, OpenCV, OCR) absorbs the majority of LLM non-reproducibility. The LLM decides what to look for; the deterministic layer guarantees how it is found.

This is consistent with the project’s philosophy: deterministic code in the critical loop, LLM as a high-level decision layer. Not the other way around.

7.5 Specifically discouraged models

As of May 9, 2026, do not use for orchestrating OculiX MCP in production:

Models < 7B parameters: tool calling too weak
Non-instruct (base) models: not trained to follow instructions or emit tool calls
Phi-3: unstable tool calling on multi-hop scenarios
“Uncensored” models without alignment: risk of aberrant behavior on ambiguous prompts
LLMs without native function-calling support (manual prompt engineering): too fragile

8. Appendices

8.1 Glossary

CLOUD Act (Clarifying Lawful Overseas Use of Data Act, 2018): US law allowing US authorities to require a US-incorporated company to disclose data, wherever stored worldwide.
FISA 702 (Foreign Intelligence Surveillance Act, Section 702): US provision authorizing the collection of electronic communications of non-US persons located outside the United States, without an individual warrant.
National Intelligence Law 2017 (Art. 7): Chinese law requiring any Chinese organization and citizen to cooperate with intelligence services.
ZDR (Zero Data Retention): contractual commitment not to store inputs and outputs beyond the inference cycle.
DPA (Data Processing Agreement): GDPR Art. 28 processor agreement.
SCC (Standard Contractual Clauses): standard contractual clauses for transfers outside the EEA.
HRAIS (High-Risk AI System): AI system classified high-risk under the AI Act (Annex III).
Deployer (AI Act): natural or legal person using an AI system under their own authority.
MCP (Model Context Protocol): open Anthropic protocol (Nov. 2024) for connecting LLMs and tools.
BFCL (Berkeley Function Calling Leaderboard): reference benchmark for tool-calling quality.

8.2 Sources and useful links

8.3 Disclaimer

This document is provided for informational purposes. It does not constitute legal advice, a compliance audit, or a contractual commitment from OculiX or any associated commercial entity. Responsibility for LLM orchestrator compliance rests with the customer (deployer under the AI Act and processor/controller under GDPR as applicable).

LLM vendor terms evolve rapidly. Always re-verify DPAs, SCCs, certifications, and retention policies at contract signature time.

8.4 Document history

Version	Date	Changes
1.0	May 9, 2026	Initial version
1.0 (publication)	May 19, 2026	Published on oculix.org
