ViCLIP - Video-Text Embeddings
Powered by ViCLIP-L-14 on ZeroGPU.
Capabilities:
- Video segment embeddings (768-dim)
- Text query embeddings
- Temporal-aware video understanding
- Semantic video search
API Endpoints for EagleEye:
- POST /call/api_embed_video - Video segment embedding
- POST /call/api_embed_text - Text query embedding
- POST /call/api_embed_frames - Multi-frame embeddings
API Usage for EagleEye Integration
Video Embedding
from gradio_client import Client
client = Client("Cadayn/viclip-zerogpu")
result = client.predict(
    video_url="https://example.com/clip.mp4",
    num_frames=8,
    api_name="/api_embed_video"
)
print(result)
# {"success": True, "embedding": [...], "dim": 768, ...}
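Before storing an embedding, it may help to check the response. The following is a minimal sketch, assuming the result arrives as a Python dict with the `success`, `embedding`, and `dim` keys shown in the comment above; the remaining fields are elided in this README and are not used. The helper name `extract_embedding` is hypothetical and not part of the Space's API.

```python
from typing import Any, Dict, List


def extract_embedding(result: Dict[str, Any], expected_dim: int = 768) -> List[float]:
    """Return the embedding from a response dict, or raise if it looks wrong.

    Assumes the response shape shown in the README comment:
    {"success": True, "embedding": [...], "dim": 768, ...}
    """
    # Treat a missing or falsy "success" flag as a failed request.
    if not result.get("success"):
        raise RuntimeError(f"embedding request failed: {result!r}")

    embedding = result.get("embedding")
    if embedding is None:
        raise KeyError("response has no 'embedding' field")

    # Guard against a model or endpoint mismatch (768-dim is stated above).
    if len(embedding) != expected_dim:
        raise ValueError(
            f"expected {expected_dim} dimensions, got {len(embedding)}"
        )

    return [float(x) for x in embedding]
```

With the client call above, this would be used as `vector = extract_embedding(result)`.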
Text Embedding
result = client.predict(
    text="a soccer player scoring a goal",
    api_name="/api_embed_text"
)
print(result)
# {"success": True, "embedding": [...], "dim": 768, ...}
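For semantic video search, a text query embedding can be compared against stored video segment embeddings. The sketch below ranks segments by cosine similarity using only the standard library. It assumes text and video embeddings have the same length (768 here); whether the Space returns normalized vectors is not stated, so the function normalizes explicitly. The names `cosine_similarity` and `rank_segments` and the segment IDs are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_segments(
    query_embedding: Sequence[float],
    segment_embeddings: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (segment_id, score) pairs sorted from best to worst match."""
    scored = [
        (segment_id, cosine_similarity(query_embedding, embedding))
        for segment_id, embedding in segment_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Here `query_embedding` would come from `/api_embed_text` and each value in `segment_embeddings` from `/api_embed_video`.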
Multi-Frame Embeddings
result = client.predict(
    frames_base64=["frame1_b64", "frame2_b64", ...],
    timestamps=[0.0, 1.0, 2.0, ...],
    api_name="/api_embed_frames"
)
print(result)
# {"success": True, "frame_embeddings": [[...], [...]], "pooled_embedding": [...], ...}
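The `frames_base64` and `timestamps` arguments can be built from raw frame data. This is a rough sketch, assuming the endpoint accepts standard base64 strings of encoded image bytes (for example JPEG) and timestamps in seconds; neither the image format nor the timestamp unit is confirmed by this README. The helper `build_frames_payload` and its `fps` parameter are hypothetical.

```python
import base64
from typing import List, Sequence, Tuple


def build_frames_payload(
    frame_bytes: Sequence[bytes], fps: float = 1.0
) -> Tuple[List[str], List[float]]:
    """Encode frames as base64 strings and assign evenly spaced timestamps.

    Assumption: frames were sampled at a constant rate of `fps` frames
    per second, starting at t = 0.0.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")

    # Standard base64, decoded to str so the payload is JSON-serializable.
    frames_base64 = [base64.b64encode(raw).decode("ascii") for raw in frame_bytes]
    timestamps = [index / fps for index in range(len(frame_bytes))]
    return frames_base64, timestamps
```

The two returned lists can then be passed as `frames_base64=` and `timestamps=` in the call above.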