CLIP · ViT-L/14 · Foundation Model
LAION-5B · Open Multimodal Corpus
Large-scale Artificial Intelligence Open Network — 5.85B CLIP-filtered image-text pairs used for foundation model pretraining, Stable Diffusion, and multimodal research.
GPU: 78%
Loss: 0.184
Throughput: 312/s
Image-Text Pairs: 5.85B
Languages: 100+
Total Size: 240 TB
Open License: CC-BY 4.0
[Chart: Resolution distribution. Image size buckets (millions)]
[Chart: Language coverage. Top languages (millions of pairs): EN, ZH, ES, DE, FR, JA, Other]
Foundation Pretraining
Backbone corpus for CLIP- and ALIGN-style contrastive pretraining at scale, including OpenCLIP.
Generative AI
Powers Stable Diffusion and other latent diffusion text-to-image generators.
Multilingual VLMs
LAION-5B includes LAION-2B-multi, enabling multimodal models that cover 100+ languages.
Dataset card
LAION-5B at a glance
- 5.85 billion CLIP-filtered image-text pairs
- Splits: LAION-2B-EN, LAION-2B-MULTI, LAION-1B-NOLANG
- Built from Common Crawl, with NSFW/illegal content tagged
- Filter: image-text cosine similarity ≥ 0.28, computed with OpenAI CLIP ViT-B/32
- Released by LAION e.V. under CC-BY 4.0
- Used by: Stable Diffusion, OpenCLIP, BLIP-2, Kosmos-2
- Hosted as Parquet + WebDataset shards
- Average caption length: 12.4 tokens
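
The similarity filter in the card above can be illustrated with a small sketch. It assumes image and text embeddings have already been computed by a CLIP model and are available as plain lists of floats. The function names (`cosine_similarity`, `keep_pair`) and the toy three-dimensional embeddings are hypothetical and chosen only for illustration; this is not the pipeline LAION actually ran.

```python
import math
from typing import Sequence

# Threshold quoted in the dataset card.
SIMILARITY_THRESHOLD = 0.28


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector counts as "no similarity"
    return dot / (norm_a * norm_b)


def keep_pair(
    image_emb: Sequence[float],
    text_emb: Sequence[float],
    threshold: float = SIMILARITY_THRESHOLD,
) -> bool:
    """Keep an image-text pair only if its similarity reaches the threshold."""
    return cosine_similarity(image_emb, text_emb) >= threshold


# Toy (caption, image embedding, text embedding) triples; values are made up.
pairs = [
    ("a photo of a cat", [1.0, 0.0, 0.0], [0.9, 0.1, 0.0]),
    ("click here to subscribe", [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
]

kept = [caption for caption, img, txt in pairs if keep_pair(img, txt)]
print(kept)
```

In this toy data the first caption points in nearly the same direction as its image embedding and is kept, while the second is orthogonal to it and is dropped. Real CLIP embeddings have hundreds of dimensions, and a production filter would batch the computation on arrays rather than loop over Python lists.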