Vector-Space Protocols — Building Shared Context Between AI Agents
When AI agents talk to each other, they usually use JSON. They send structured messages with fixed schemas. This works, but it’s brittle. If one agent changes its message format, everything breaks. If agents work in different domains, they can’t understand each other.
There’s a better way. Instead of sending strings, agents can communicate through vector spaces. They can share meaning through embeddings. They can measure how closely their understanding agrees using cosine similarity. This is what Vector-Space Protocols (VSPs) do.
VSPs let agents maintain consistent meaning across conversations. They reduce semantic drift. They enable cooperation even when agents don’t share exact vocabularies. This article explains how they work and how to build them.
Introduction: The Problem of Semantic Drift
Most multi-agent systems rely on structured communication. Agent A sends a JSON message to Agent B. Agent B parses it, processes it, and responds. This works when both agents agree on the schema. But it breaks when they don’t.
Consider two agents working on different tasks. One agent handles customer support. Another handles inventory management. They need to coordinate. The support agent needs to check if an item is in stock. The inventory agent needs to know about customer requests.
They could use a shared API schema. But what if the inventory agent changes how it represents stock levels? What if the support agent uses different terms? The system breaks.
This is semantic drift. Over time, agents develop different understandings of the same concepts. Their internal representations diverge. Their communication becomes less effective.
Semantic drift happens for several reasons:
Domain differences: Agents trained on different data develop different embeddings. A “product” in e-commerce might mean something different than a “product” in manufacturing.
Temporal changes: Language evolves. Concepts shift. An agent trained last year might not understand new terminology.
Context loss: When agents communicate through strings, they lose context. The word “bank” could mean a financial institution or a river bank. Without context, agents can’t disambiguate.
Embedding misalignment: Different embedding models produce different vector spaces. Even if two agents use the same words, their vectors might not align.
Traditional solutions try to fix this with better schemas. They use ontologies. They create shared vocabularies. They enforce strict contracts. But these approaches are rigid. They don’t adapt. They break when agents need to evolve.
Vector-Space Protocols take a different approach. Instead of fixing the schema, they fix the meaning. They align vector spaces. They preserve context. They enable semantic interoperability.
From API Contracts to Vector-Space Contracts
Traditional agent communication looks like this:
# Traditional JSON-based communication
message = {
    "action": "check_inventory",
    "product_id": "SKU-12345",
    "quantity": 10
}
response = inventory_agent.process(message)
This works, but it’s fragile. If the inventory agent changes its API, the support agent breaks. If they use different field names, they can’t communicate.
Vector-Space Protocols work differently. Instead of sending structured data, agents send embeddings. They communicate through meaning, not syntax.
# Vector-space communication (sketch: embed, process_vector and decode_response are placeholders)
query_embedding = embed("check if product SKU-12345 has 10 units in stock")
response_embedding = inventory_agent.process_vector(query_embedding)
result = decode_response(response_embedding)
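The key difference is that embeddings capture meaning. Two different phrasings that mean the same thing usually produce similar vectors, so agents can understand each other even if they use different words. Embeddings are lossy, though: exact values such as SKU-12345 or a quantity of 10 are not reliably recoverable from a vector, so they are best carried alongside it as structured context.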
Embedding Protocols
An embedding protocol defines how agents encode and decode messages using vectors. It specifies:
Encoding: How to convert a message into a vector. This usually involves using a sentence transformer or language model to create embeddings.
Alignment: How to ensure different agents use compatible vector spaces. This might involve shared embedding models or alignment techniques.
Decoding: How to convert a vector back into a meaningful response. This could be direct similarity search or generation from the vector.
Thresholds: How similar vectors need to be to represent the same meaning. This is usually measured with cosine similarity.
Here’s a simple example:
from typing import Optional

from sentence_transformers import SentenceTransformer
import numpy as np


class VectorSpaceProtocol:
    def __init__(self, model_name='all-MiniLM-L6-v2'):
        self.model = SentenceTransformer(model_name)
        self.similarity_threshold = 0.75

    def encode(self, message: str) -> np.ndarray:
        """Convert a message to a vector."""
        return self.model.encode(message)

    def decode(self, query_vector: np.ndarray,
               candidate_messages: list[str]) -> Optional[str]:
        """Find the most similar message to the query vector."""
        candidate_vectors = self.model.encode(candidate_messages)
        # Cosine similarity between the query and every candidate
        similarities = np.dot(candidate_vectors, query_vector) / (
            np.linalg.norm(candidate_vectors, axis=1) *
            np.linalg.norm(query_vector)
        )
        best_match_idx = np.argmax(similarities)
        if similarities[best_match_idx] >= self.similarity_threshold:
            return candidate_messages[best_match_idx]
        return None  # No candidate is similar enough
This protocol lets agents communicate through vectors. They don’t need to agree on exact message formats. They just need similar meanings to produce similar vectors.
Designing a Vector-Space Protocol (VSP)
Building a VSP requires several components. You need encoding, alignment, versioning, and drift detection. Let’s build a complete example.
Core Components
A VSP needs:
- Shared embedding model: All agents use the same model to ensure compatible vector spaces.
- Context preservation: Messages include context to maintain meaning across conversations.
- Similarity thresholds: Define how similar vectors need to be to represent agreement.
- Versioning: Handle changes in embedding models or protocols over time.
- Drift detection: Monitor when agents’ understanding diverges.
Here’s a complete implementation:
from sentence_transformers import SentenceTransformer
from typing import Any, Dict, List, Optional, Tuple
import numpy as np
from dataclasses import dataclass
from datetime import datetime
import hashlib


@dataclass
class VSPMessage:
    """A message in the Vector-Space Protocol."""
    vector: np.ndarray
    context: Dict[str, Any]
    timestamp: datetime
    version: str
    message_hash: str


class VectorSpaceProtocol:
    def __init__(
        self,
        model_name: str = 'all-MiniLM-L6-v2',
        similarity_threshold: float = 0.75,
        version: str = '1.0'
    ):
        self.model = SentenceTransformer(model_name)
        self.similarity_threshold = similarity_threshold
        self.version = version
        self.message_history: List[VSPMessage] = []

    def encode_message(
        self,
        message: str,
        context: Optional[Dict[str, Any]] = None
    ) -> VSPMessage:
        """Encode a message with context into a VSP message."""
        # Combine message and context for encoding
        context_str = self._context_to_string(context or {})
        full_text = f"{message} {context_str}".strip()
        # Generate a unit-length embedding so dot product equals cosine similarity
        vector = self.model.encode(full_text, normalize_embeddings=True)
        # Create message hash for deduplication
        message_hash = hashlib.sha256(full_text.encode()).hexdigest()[:16]
        return VSPMessage(
            vector=vector,
            context=context or {},
            timestamp=datetime.now(),
            version=self.version,
            message_hash=message_hash
        )

    def _context_to_string(self, context: Dict[str, Any]) -> str:
        """Convert context dictionary to string for encoding."""
        parts = []
        for key, value in context.items():
            parts.append(f"{key}: {value}")
        return " ".join(parts)

    def compute_similarity(
        self,
        msg1: VSPMessage,
        msg2: VSPMessage
    ) -> float:
        """Compute cosine similarity between two VSP messages."""
        # Vectors are normalized at encode time, so the dot product is the cosine
        return float(np.dot(msg1.vector, msg2.vector))

    def find_similar_messages(
        self,
        query: VSPMessage,
        candidates: List[VSPMessage],
        threshold: Optional[float] = None
    ) -> List[Tuple[VSPMessage, float]]:
        """Find messages similar to the query."""
        # Compare against None so an explicit threshold of 0.0 is respected
        if threshold is None:
            threshold = self.similarity_threshold
        similarities = [
            (candidate, self.compute_similarity(query, candidate))
            for candidate in candidates
        ]
        return [
            (msg, sim) for msg, sim in similarities
            if sim >= threshold
        ]

    def align_vector_space(
        self,
        reference_messages: List[VSPMessage],
        target_messages: List[VSPMessage]
    ) -> float:
        """Measure alignment between two vector spaces."""
        if not reference_messages or not target_messages:
            return 0.0
        # Compute average similarity between reference and target
        similarities = []
        for ref_msg in reference_messages[:10]:  # Sample for efficiency
            for target_msg in target_messages[:10]:
                sim = self.compute_similarity(ref_msg, target_msg)
                similarities.append(sim)
        return float(np.mean(similarities)) if similarities else 0.0

    def detect_semantic_drift(
        self,
        recent_messages: List[VSPMessage],
        historical_messages: List[VSPMessage]
    ) -> float:
        """Detect if semantic drift has occurred."""
        if not recent_messages or not historical_messages:
            return 0.0
        # Compare recent messages to historical baseline
        alignment = self.align_vector_space(recent_messages, historical_messages)
        # Drift is inverse of alignment
        drift = 1.0 - alignment
        return drift
This VSP implementation provides the basics. Agents can encode messages with context. They can find similar messages. They can detect when their understanding drifts.
Versioning and Re-indexing
Over time, you might need to update your embedding model. Or you might change how context is encoded. VSPs need to handle version changes.
class VSPVersionManager:
    def __init__(self):
        self.versions: Dict[str, VectorSpaceProtocol] = {}

    def register_version(self, version: str, protocol: VectorSpaceProtocol):
        """Register a new protocol version."""
        self.versions[version] = protocol

    def migrate_message(
        self,
        message: VSPMessage,
        target_version: str
    ) -> Optional[VSPMessage]:
        """Migrate a message to a new protocol version."""
        if target_version not in self.versions:
            return None
        target_protocol = self.versions[target_version]
        # Re-encode using the new protocol
        # Note: We lose the original text, so this is approximate
        # (only the context is re-encoded). In practice, you'd store the original text
        context_str = target_protocol._context_to_string(message.context)
        new_vector = target_protocol.model.encode(
            context_str, normalize_embeddings=True
        )
        return VSPMessage(
            vector=new_vector,
            context=message.context,
            timestamp=message.timestamp,
            version=target_version,
            message_hash=message.message_hash
        )
Versioning lets you evolve your protocol without breaking existing agents. You can support multiple versions simultaneously. You can migrate messages when needed.
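Vectors produced by different embedding models or protocol versions are generally not comparable, so it may help to refuse cross-version comparisons outright rather than return a misleading score. The following is a minimal sketch under that assumption. `VersionedVector`, `VersionMismatchError`, and `checked_similarity` are hypothetical names used for illustration, not part of the classes above, and plain lists stand in for NumPy arrays.

```python
import math
from dataclasses import dataclass
from typing import List


class VersionMismatchError(ValueError):
    """Hypothetical error: two vectors come from different protocol versions."""


@dataclass
class VersionedVector:
    """Illustrative stand-in for a VSP message: a vector plus its version tag."""
    vector: List[float]
    version: str


def checked_similarity(a: VersionedVector, b: VersionedVector) -> float:
    """Cosine similarity that refuses to compare across versions."""
    if a.version != b.version:
        raise VersionMismatchError(
            f"cannot compare version {a.version!r} with {b.version!r}; "
            "migrate one message first"
        )
    if len(a.vector) != len(b.vector):
        raise ValueError("vectors have different dimensions")
    dot = sum(x * y for x, y in zip(a.vector, b.vector))
    norm_a = math.sqrt(sum(x * x for x in a.vector))
    norm_b = math.sqrt(sum(y * y for y in b.vector))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # Treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


old = VersionedVector([1.0, 0.0], version="1.0")
same = VersionedVector([1.0, 0.0], version="1.0")
new = VersionedVector([1.0, 0.0], version="2.0")

print(checked_similarity(old, same))
try:
    checked_similarity(old, new)
except VersionMismatchError as exc:
    print(f"rejected: {exc}")
```

A caller that hits the error can migrate the older message to the newer version and retry, instead of silently comparing across spaces.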
Entropy Reduction
In long-running systems, message vectors can become noisy. Entropy increases. Meaning becomes less clear. VSPs can reduce entropy by maintaining message clusters.
class VSPEntropyReducer:
    def __init__(self, protocol: VectorSpaceProtocol):
        self.protocol = protocol
        self.clusters: List[List[VSPMessage]] = []

    def add_message(self, message: VSPMessage):
        """Add a message, clustering similar ones."""
        # Find existing cluster
        for cluster in self.clusters:
            if cluster:
                centroid = self._compute_centroid(cluster)
                similarity = self.protocol.compute_similarity(
                    message,
                    VSPMessage(
                        vector=centroid,
                        context={},
                        timestamp=datetime.now(),
                        version=message.version,
                        message_hash=""
                    )
                )
                if similarity >= self.protocol.similarity_threshold:
                    cluster.append(message)
                    return
        # Create new cluster
        self.clusters.append([message])

    def _compute_centroid(self, messages: List[VSPMessage]) -> np.ndarray:
        """Compute the unit-length centroid vector of a cluster."""
        vectors = [msg.vector for msg in messages]
        centroid = np.mean(vectors, axis=0)
        # Re-normalize so the dot product with the centroid is still a cosine
        norm = np.linalg.norm(centroid)
        return centroid / norm if norm > 0 else centroid

    def get_representative_messages(self, top_k: int = 10) -> List[VSPMessage]:
        """Get representative messages from each cluster."""
        representatives = []
        for cluster in self.clusters:
            if cluster:
                centroid = self._compute_centroid(cluster)
                # Find message closest to centroid
                closest = min(
                    cluster,
                    key=lambda msg: np.linalg.norm(msg.vector - centroid)
                )
                representatives.append(closest)
        return representatives[:top_k]
This reduces entropy by clustering similar messages. Instead of comparing against every stored message, you can work with one representative per cluster. This maintains meaning while reducing noise.
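One way to make "entropy" concrete here is the Shannon entropy of how messages are distributed over clusters: it is zero when every message falls into one cluster and grows as messages spread across many small clusters. This is an assumption about what to measure, not something the classes above compute. `cluster_entropy` is a hypothetical helper that takes a list of cluster sizes.

```python
import math
from typing import Sequence


def cluster_entropy(cluster_sizes: Sequence[int]) -> float:
    """Shannon entropy (in bits) of the message distribution over clusters."""
    total = sum(cluster_sizes)
    if total == 0:
        return 0.0
    entropy = 0.0
    for size in cluster_sizes:
        if size > 0:
            p = size / total
            entropy -= p * math.log2(p)
    return entropy


# All messages in one cluster: no uncertainty about a message's topic.
print(cluster_entropy([8]))
# Messages spread evenly over four clusters: two bits of uncertainty.
print(cluster_entropy([2, 2, 2, 2]))
```

With a clustering structure like the one above, the sizes could be gathered as `[len(c) for c in reducer.clusters]` and tracked over time.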
Implementation Walkthrough
Let’s build a complete example with two agents communicating via VSP.
from typing import Any


class AIAgent:
    def __init__(self, name: str, protocol: VectorSpaceProtocol):
        self.name = name
        self.protocol = protocol
        self.message_history: List[VSPMessage] = []
        self.knowledge_base: Dict[str, Any] = {}

    def send_message(
        self,
        message: str,
        context: Optional[Dict[str, Any]] = None
    ) -> VSPMessage:
        """Send a message encoded in the VSP."""
        vsp_msg = self.protocol.encode_message(message, context)
        self.message_history.append(vsp_msg)
        return vsp_msg

    def receive_message(self, vsp_msg: VSPMessage) -> Optional[VSPMessage]:
        """Process a received VSP message and respond."""
        # Find similar messages in history
        similar = self.protocol.find_similar_messages(
            vsp_msg,
            self.message_history
        )
        # Process based on similarity
        if similar:
            # High similarity: respond based on similar past interactions
            best_match, similarity = max(similar, key=lambda x: x[1])
            response = self._generate_response(vsp_msg, best_match, similarity)
        else:
            # Low similarity: new type of message
            response = self._handle_new_message(vsp_msg)
        if response:
            response_msg = self.protocol.encode_message(response)
            self.message_history.append(response_msg)
            return response_msg
        return None

    def _generate_response(
        self,
        query: VSPMessage,
        similar: VSPMessage,
        similarity: float
    ) -> str:
        """Generate response based on similar past message."""
        # In practice, this would use the agent's logic
        # For now, return a simple acknowledgment
        return f"Understood (similarity: {similarity:.2f})"

    def _handle_new_message(self, vsp_msg: VSPMessage) -> str:
        """Handle a message with no similar history."""
        return "Processing new request"


# Example usage
protocol = VectorSpaceProtocol(
    model_name='all-MiniLM-L6-v2',
    similarity_threshold=0.75
)
agent_a = AIAgent("SupportAgent", protocol)
agent_b = AIAgent("InventoryAgent", protocol)

# Agent A sends a message
query = agent_a.send_message(
    "Check if product SKU-12345 has 10 units available",
    context={"customer_id": "CUST-001", "priority": "high"}
)

# Agent B receives and processes
response = agent_b.receive_message(query)
print(f"Agent B response: {response}")
This example shows two agents communicating through vectors. They don’t need to agree on exact message formats. They just need similar meanings.
Vector-Space Handshake
When agents first meet, they need to establish a shared understanding. This is the handshake phase.
class VSPHandshake:
    def __init__(self, protocol: VectorSpaceProtocol):
        self.protocol = protocol

    def perform_handshake(
        self,
        agent_a: AIAgent,
        agent_b: AIAgent
    ) -> bool:
        """Perform VSP handshake between two agents."""
        # Agent A sends capability message
        capabilities_a = agent_a.send_message(
            "I can handle customer support queries and check order status",
            context={"capabilities": ["support", "orders"]}
        )
        # Agent B receives and responds with its capabilities
        response_b = agent_b.receive_message(capabilities_a)
        capabilities_b = agent_b.send_message(
            "I can manage inventory and check product availability",
            context={"capabilities": ["inventory", "products"]}
        )
        # Agent A receives B's capabilities
        response_a = agent_a.receive_message(capabilities_b)
        # Check alignment
        alignment = self.protocol.align_vector_space(
            [capabilities_a],
            [capabilities_b]
        )
        # Handshake successful if alignment is above threshold
        return alignment >= self.protocol.similarity_threshold * 0.8


# Handshake example
handshake = VSPHandshake(protocol)
success = handshake.perform_handshake(agent_a, agent_b)
print(f"Handshake successful: {success}")
The handshake establishes that agents can understand each other. They exchange capability vectors. They verify alignment. If alignment is good, they can communicate.
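Comparing two different capability statements mostly measures how related the agents' topics are. To check that two agents actually share a vector space, one option is for both to embed the same agreed probe sentences and compare the results pair by pair. The sketch below assumes such a probe set exists. `probe_handshake` and the toy 2-D vectors are hypothetical and only illustrate the check.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def probe_handshake(
    probes_a: Sequence[Vector],
    probes_b: Sequence[Vector],
    min_similarity: float = 0.95,
) -> bool:
    """True if every shared probe maps to nearly the same vector for both agents."""
    if not probes_a or len(probes_a) != len(probes_b):
        return False
    return all(
        cosine(a, b) >= min_similarity for a, b in zip(probes_a, probes_b)
    )


# Row i is each agent's embedding of the same probe sentence i (toy values).
agent_a_probes = [[1.0, 0.0], [0.0, 1.0]]
agent_b_same_space = [[0.99, 0.05], [0.02, 1.0]]
agent_b_rotated = [[0.0, 1.0], [-1.0, 0.0]]  # Same geometry, different axes

print(probe_handshake(agent_a_probes, agent_b_same_space))
print(probe_handshake(agent_a_probes, agent_b_rotated))
```

The rotated case fails even though the internal geometry is identical, which is the situation where an explicit alignment step or a shared model would be needed before the agents exchange vectors.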
Measuring Semantic Consistency
To know if your VSP is working, you need metrics. You need to measure semantic consistency, drift, and agreement.
Vector Drift
Vector drift measures how much agents’ understanding has shifted over time.
def measure_vector_drift(
    protocol: VectorSpaceProtocol,
    baseline_messages: List[VSPMessage],
    current_messages: List[VSPMessage],
    window_size: int = 10
) -> float:
    """Measure how much vectors have drifted from baseline."""
    if not baseline_messages or not current_messages:
        return 0.0
    # Sample recent messages
    recent = current_messages[-window_size:]
    baseline_sample = baseline_messages[:window_size]
    # Compute average similarity
    similarities = []
    for recent_msg in recent:
        for baseline_msg in baseline_sample:
            sim = protocol.compute_similarity(recent_msg, baseline_msg)
            similarities.append(sim)
    avg_similarity = np.mean(similarities) if similarities else 0.0
    drift = 1.0 - avg_similarity
    return float(drift)
High drift means agents are losing shared understanding. Low drift means they’re staying aligned.
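One caveat with averaging all pairwise similarities is that the score is nonzero even when nothing has changed, because unrelated messages within the same window are dissimilar to each other. A possible alternative, offered here as an assumption rather than part of the protocol above, is to compare window centroids. The helpers below are hypothetical and use toy 2-D vectors in place of real embeddings.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def pairwise_drift(baseline: Sequence[Vector], current: Sequence[Vector]) -> float:
    """1 - mean similarity over all baseline/current pairs."""
    sims = [cosine(b, c) for b in baseline for c in current]
    return 1.0 - sum(sims) / len(sims)


def centroid_drift(baseline: Sequence[Vector], current: Sequence[Vector]) -> float:
    """1 - cosine similarity between the two window centroids."""
    return 1.0 - cosine(centroid(baseline), centroid(current))


# Two unrelated topics, unchanged between windows.
baseline = [[1.0, 0.0], [0.0, 1.0]]
unchanged = [[1.0, 0.0], [0.0, 1.0]]
shifted = [[1.0, 0.0], [1.0, 0.0]]  # The second topic has disappeared

print(pairwise_drift(baseline, unchanged))  # Large despite no change
print(centroid_drift(baseline, unchanged))  # Approximately zero
print(centroid_drift(baseline, shifted))    # Clearly above zero
```

Either score is easier to interpret relative to a baseline-versus-baseline value measured on a period known to be stable.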
Semantic Agreement Ratio
This measures how often agents agree on meaning.
def compute_semantic_agreement_ratio(
    protocol: VectorSpaceProtocol,
    agent_a_messages: List[VSPMessage],
    agent_b_messages: List[VSPMessage]
) -> float:
    """Compute how often two agents agree on meaning."""
    if not agent_a_messages or not agent_b_messages:
        return 0.0
    agreements = 0
    total = 0
    for msg_a in agent_a_messages:
        for msg_b in agent_b_messages:
            similarity = protocol.compute_similarity(msg_a, msg_b)
            if similarity >= protocol.similarity_threshold:
                agreements += 1
            total += 1
    return agreements / total if total > 0 else 0.0
High agreement means agents understand each other well. Low agreement means they’re miscommunicating.
Context Decay
Context decay measures how well context is preserved across conversations.
def measure_context_decay(
    protocol: VectorSpaceProtocol,
    messages: List[VSPMessage]
) -> float:
    """Measure how much context is lost over time."""
    if len(messages) < 2:
        return 0.0
    # Compare each message to the previous one
    decays = []
    for i in range(1, len(messages)):
        prev_msg = messages[i-1]
        curr_msg = messages[i]
        # Check if context is preserved (Jaccard overlap of context keys)
        context_overlap = len(
            set(prev_msg.context.keys()) &
            set(curr_msg.context.keys())
        )
        total_context = len(
            set(prev_msg.context.keys()) |
            set(curr_msg.context.keys())
        )
        if total_context > 0:
            decay = 1.0 - (context_overlap / total_context)
            decays.append(decay)
    return float(np.mean(decays)) if decays else 0.0
High decay means context is being lost. Low decay means it’s being preserved.
Visualization
You can visualize agent communication integrity using dimensionality reduction.
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt


def visualize_agent_communication(
    agent_a_messages: List[VSPMessage],
    agent_b_messages: List[VSPMessage],
    protocol: VectorSpaceProtocol
):
    """Visualize agent communication in 2D space."""
    # Combine all vectors
    all_vectors = np.vstack([
        msg.vector for msg in agent_a_messages + agent_b_messages
    ])
    # t-SNE requires perplexity to be smaller than the number of samples
    perplexity = min(30, len(all_vectors) - 1)
    # Reduce to 2D
    tsne = TSNE(n_components=2, random_state=42, perplexity=perplexity)
    vectors_2d = tsne.fit_transform(all_vectors)
    # Plot
    plt.figure(figsize=(10, 8))
    plt.scatter(
        vectors_2d[:len(agent_a_messages), 0],
        vectors_2d[:len(agent_a_messages), 1],
        label='Agent A',
        alpha=0.6
    )
    plt.scatter(
        vectors_2d[len(agent_a_messages):, 0],
        vectors_2d[len(agent_a_messages):, 1],
        label='Agent B',
        alpha=0.6
    )
    plt.legend()
    plt.title('Agent Communication Vector Space')
    plt.xlabel('Dimension 1')
    plt.ylabel('Dimension 2')
    plt.show()
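This shows how agents’ messages cluster in vector space. Overlapping clusters suggest good alignment. Clearly separated clusters suggest drift. t-SNE distorts distances between points, so treat the plot as a qualitative check and rely on the metrics above for decisions.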
Best Practices and Future Outlook
Building VSPs requires careful design. Here are some practices that work.
Architectural Strategies
Shared embedding models: All agents should use the same embedding model. This ensures compatible vector spaces. If you need to change models, use versioning.
Context preservation: Always include context in messages. This helps maintain meaning across conversations. Use structured context dictionaries.
Similarity thresholds: Choose thresholds carefully. Too high, and agents can’t communicate. Too low, and they misunderstand each other. Start with 0.75 and adjust.
Message history: Keep message history for drift detection and clustering. But don’t keep everything forever. Use entropy reduction to manage storage.
Handshake protocols: Always perform handshakes when agents first meet. Verify alignment before starting communication.
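For the similarity threshold in particular, "adjust" can be made concrete by calibrating against a small set of message pairs labelled as same-meaning or different-meaning. The sketch below assumes such labels are available. `pick_threshold` is a hypothetical helper, and the similarity scores are invented for illustration.

```python
from typing import Sequence, Tuple


def pick_threshold(
    scored_pairs: Sequence[Tuple[float, bool]],
    candidates: Sequence[float],
) -> Tuple[float, float]:
    """Return (threshold, accuracy) for the candidate that classifies best.

    Each scored pair is (cosine_similarity, same_meaning_label).
    Ties go to the first candidate listed.
    """
    best_threshold, best_accuracy = candidates[0], -1.0
    for threshold in candidates:
        correct = sum(
            (similarity >= threshold) == same_meaning
            for similarity, same_meaning in scored_pairs
        )
        accuracy = correct / len(scored_pairs)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


# Invented scores; real ones would come from labelled message pairs.
labelled = [
    (0.91, True), (0.82, True), (0.77, True),
    (0.71, False), (0.55, False), (0.40, False),
]
threshold, accuracy = pick_threshold(labelled, [0.60, 0.65, 0.70, 0.75, 0.80])
print(threshold, accuracy)
```

Accuracy is only one possible objective. If a false match is costlier than a missed one, the same loop could maximize precision instead.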
Contextual Governance
As VSPs scale, you need governance. You need to manage protocol versions, monitor drift, and handle conflicts.
Protocol registries: Maintain registries of protocol versions. Let agents discover compatible protocols.
Drift monitoring: Continuously monitor drift. Alert when drift exceeds thresholds. Trigger re-alignment when needed.
Conflict resolution: When agents disagree, use voting or consensus mechanisms. Or escalate to human oversight.
Trust networks: Build trust networks between agents. Agents that communicate well should trust each other.
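Drift monitoring, for example, can start as a rolling average with an alert limit. The sketch below is one possible shape for it. `DriftMonitor`, its `limit`, and its `window` are hypothetical names and values, and the drift scores fed in would come from whichever drift metric the system uses.

```python
from collections import deque
from typing import Deque


class DriftMonitor:
    """Hypothetical monitor: alert when the rolling mean drift crosses a limit."""

    def __init__(self, limit: float = 0.3, window: int = 5):
        self.limit = limit
        self.scores: Deque[float] = deque(maxlen=window)

    def record(self, drift: float) -> bool:
        """Store one drift score; return True if re-alignment should trigger."""
        self.scores.append(drift)
        # Wait for a full window so a single early spike does not alert
        window_full = len(self.scores) == self.scores.maxlen
        mean_drift = sum(self.scores) / len(self.scores)
        return window_full and mean_drift > self.limit


monitor = DriftMonitor(limit=0.3, window=3)
alerts = [monitor.record(score) for score in [0.1, 0.2, 0.2, 0.4, 0.6]]
print(alerts)
```

Only the last reading raises an alert here, because that is the first point at which the three-score average exceeds the limit.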
Agent Trust Formation
VSPs enable trust formation. Agents can learn which peers are reliable. They can adapt their communication based on trust.
class VSPTrustNetwork:
    def __init__(self, protocol: VectorSpaceProtocol):
        self.protocol = protocol
        self.trust_scores: Dict[str, float] = {}
        self.interaction_history: Dict[str, List[Tuple[VSPMessage, VSPMessage]]] = {}

    def update_trust(
        self,
        agent_id: str,
        query: VSPMessage,
        response: VSPMessage,
        success: bool
    ):
        """Update trust score based on interaction outcome."""
        if agent_id not in self.trust_scores:
            self.trust_scores[agent_id] = 0.5  # Neutral
        # Check semantic alignment
        alignment = self.protocol.compute_similarity(query, response)
        # Raise trust only when the interaction succeeded and was aligned;
        # otherwise lower it
        if success and alignment >= self.protocol.similarity_threshold:
            self.trust_scores[agent_id] = min(1.0, self.trust_scores[agent_id] + 0.1)
        else:
            self.trust_scores[agent_id] = max(0.0, self.trust_scores[agent_id] - 0.1)
        # Store interaction
        if agent_id not in self.interaction_history:
            self.interaction_history[agent_id] = []
        self.interaction_history[agent_id].append((query, response))

    def get_trust_score(self, agent_id: str) -> float:
        """Get trust score for an agent."""
        return self.trust_scores.get(agent_id, 0.5)
Trust networks let agents learn which peers to trust. They adapt communication based on reliability. This improves overall system performance.
Future Directions
VSPs are still emerging. Several directions look promising:
Multi-modal VSPs: Extend VSPs to handle images, audio, and other modalities. Use multi-modal embeddings.
Dynamic threshold adjustment: Let thresholds adapt based on context and history. Use machine learning to optimize thresholds.
Federated VSPs: Enable VSPs across organizations. Use secure aggregation to align vector spaces without sharing data.
Semantic routing: Use VSPs for intelligent message routing. Route messages to agents based on semantic similarity, not just keywords.
Emergent protocols: Let agents discover protocols through interaction. Use reinforcement learning to evolve protocols.
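Of these, semantic routing is the simplest to sketch with the pieces already described: each agent publishes a capability vector, and a router sends a message to the agent whose vector is most similar. The code below is a toy illustration. `route`, the profile names, and the 3-D vectors are hypothetical stand-ins for real capability embeddings.

```python
import math
from typing import Dict, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route(
    message_vector: Vector,
    agent_profiles: Dict[str, Vector],
    min_similarity: float = 0.5,
) -> Optional[str]:
    """Return the best-matching agent name, or None if nobody is close enough."""
    best_agent, best_score = None, min_similarity
    for name, profile in agent_profiles.items():
        score = cosine(message_vector, profile)
        if score >= best_score:
            best_agent, best_score = name, score
    return best_agent


# Toy capability vectors standing in for real embeddings.
profiles = {
    "inventory": [0.9, 0.1, 0.0],
    "support": [0.1, 0.9, 0.0],
}
print(route([0.8, 0.2, 0.0], profiles))  # Closest to the inventory profile
print(route([0.0, 0.0, 1.0], profiles))  # No profile is similar enough
```

Returning `None` instead of the nearest agent leaves room for a fallback, such as escalating to a human or a general-purpose agent.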
Conclusion
Vector-Space Protocols offer a new way for AI agents to communicate. Instead of brittle schemas, they use shared meaning. Instead of exact matches, they use similarity. This enables more flexible, adaptive multi-agent systems.
The key insight is that meaning matters more than syntax. Agents don’t need to agree on exact message formats. They just need to share understanding. VSPs make this possible.
Building VSPs requires careful design. You need shared embedding models, context preservation, and drift detection. You need versioning and governance. But the benefits are worth it: more robust communication, better adaptation, and emergent cooperation.
As multi-agent systems scale, VSPs will become essential. They enable semantic interoperability at scale. They let agents evolve without breaking communication. They form the foundation for the next generation of AI agent ecosystems.