CLIP · ViT-L/14 · Foundation Model

Text → Image Retrieval

Prompt-aware retrieval: tokenize → text encoder → semantic pre-filter → FAISS cosine similarity → diversity re-rank → XAI.

Execution pipeline
  1. Tokenize prompt
  2. Text encoder (Transformer)
  3. Candidate selection (keyword pre-filter)
  4. Query FAISS IVF-PQ index
  5. Cosine similarity + diversity re-rank
  6. Explainability re-ranker
  7. Metric evaluation & report
Top-K
Enter a prompt and run retrieval. Each run produces a new ranking.
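
Step 5 of the pipeline above (cosine similarity + diversity re-rank) can be illustrated with a small sketch. The page does not say which diversity method is used, so this example assumes maximal marginal relevance (MMR), one common choice. The names `cosine`, `mmr_rerank`, `top_k`, and `lam` are hypothetical, and the plain-Python vectors stand in for the CLIP embeddings that a real run would fetch from the FAISS index.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_rerank(query, candidates, top_k=3, lam=0.7):
    """Order candidate ids by maximal marginal relevance (assumed method).

    query:      embedding of the text prompt
    candidates: dict mapping image id -> image embedding
    top_k:      number of results to return
    lam:        trade-off; 1.0 ranks by relevance only, lower values
                penalize candidates similar to ones already selected
    """
    relevance = {cid: cosine(query, vec) for cid, vec in candidates.items()}
    selected = []
    remaining = set(candidates)

    while remaining and len(selected) < top_k:

        def score(cid):
            # Redundancy = highest similarity to anything already picked.
            redundancy = max(
                (cosine(candidates[cid], candidates[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[cid] - (1 - lam) * redundancy

        # Sorting first makes tie-breaking deterministic (lowest id wins).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)

    return selected
```

With `lam=1.0` the function reduces to a plain cosine ranking. Lowering `lam` pushes near-duplicate images further down the list, which is the usual motivation for a diversity re-rank after the nearest-neighbor query.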