Extract (query, positive, negative) from ClickHouse click data for BGE-M3 fine-tuning.
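As a rough illustration of triplet extraction, the sketch below builds (query, positive, negative) tuples from impression records. The row shape `(query, doc, clicked)`, the function name `build_triplets`, and the rule "shown but never clicked for the same query = negative" are assumptions for illustration, not the service's actual export logic or ClickHouse schema.

```python
from collections import defaultdict


def build_triplets(rows, max_negatives_per_positive=2):
    """Build (query, positive, negative) triplets from impression rows.

    rows: iterable of (query, doc, clicked) tuples (hypothetical schema).
    """
    clicked = defaultdict(list)
    skipped = defaultdict(list)
    for query, doc, was_clicked in rows:
        bucket = clicked if was_clicked else skipped
        if doc not in bucket[query]:
            bucket[query].append(doc)

    triplets = []
    for query, positives in clicked.items():
        # A doc clicked at least once is never used as a negative for that query.
        negatives = [d for d in skipped.get(query, []) if d not in positives]
        for pos in positives:
            for neg in negatives[:max_negatives_per_positive]:
                triplets.append((query, pos, neg))
    return triplets
```

Queries with no clicks produce no triplets, since there is no positive to pair with.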
2. Fine-Tune BGE-M3
Train on exported triplets. Requires GPU.
3. Activate Model
Hot-swap the embedder to a fine-tuned checkpoint, then re-index so stored vectors match the new model.
Build Eval Set
Create a frozen (query, judgments) set from ClickHouse for offline evaluation.
Run Eval Harness
Replay eval queries against this service. Reports NDCG, Recall, MRR.
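For reference, the three reported metrics can be computed per query roughly as follows. This is a minimal sketch using the standard definitions with linear gain for NDCG; the cutoff `k`, the gain formula, and the function names are assumptions, and the harness itself may differ.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of relevant docs that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


def mrr(ranked, relevant):
    """Reciprocal rank of the first relevant doc, or 0.0 if none is returned."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked, gains, k):
    """NDCG with linear gain; gains maps doc -> graded judgment."""
    dcg = sum(gains.get(doc, 0) / math.log2(i + 2)
              for i, doc in enumerate(ranked[:k]))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```

Per-query values are typically averaged over the whole eval set to get the reported numbers.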
Eval History
Background Tasks
Scheduled Jobs
Cron-managed background jobs. Pause to stop the schedule; resume restores it. "Run now" fires the job immediately.
Image embedder status
Status is read from GET /image-embed/status and does NOT trigger model load.
First test below will lazy-load the model (≈30 s on a cold container).
Text → image-vector
Embed text into the same space as image_vector in ES. This is what gets called
at query time for visual-KNN retrieval. Try Arabic + English queries to verify the
multilingual text tower is alive.
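One way to read the result of an Arabic + English check is to compare the two returned vectors with cosine similarity: equivalent queries in both languages should score clearly higher than unrelated ones. The helper below is a generic sketch; how the vectors are fetched from the service is not shown, and no pass/fail threshold is implied.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)
```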
Bulk reindex images
Scrolls the ES index for products with an image field, fetches each via
the configured CDN base URL, embeds via SigLIP 2, and bulk-updates the
image_vector field. Default mode embeds only products missing a vector
(weekly top-ups); switch to re-embed all only for model upgrades.
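The difference between the two modes can be expressed as an Elasticsearch query body. The sketch below uses the standard `bool` and `exists` query clauses; the field name `image` and the function name `reindex_query` are assumptions, and the service's real query may include further filters.

```python
def reindex_query(reembed_all=False, image_field="image",
                  vector_field="image_vector"):
    """Build an ES query body selecting products to embed.

    Default: products that have an image but no vector yet.
    reembed_all=True: every product that has an image.
    """
    bool_query = {"filter": [{"exists": {"field": image_field}}]}
    if not reembed_all:
        bool_query["must_not"] = [{"exists": {"field": vector_field}}]
    return {"query": {"bool": bool_query}}
```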
Image URL → vector
Embed a single product image by URL. Useful for sanity-checking that the indexing pipeline's
URL fetch + decode + GPU forward pass works end-to-end before kicking off an 87k-product batch.
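A returned vector can be sanity-checked with a few cheap tests before starting a batch. This is a hypothetical helper: `expected_dim` must be supplied by the caller to match the `image_vector` mapping, and the checks shown are generic, not part of the service.

```python
import math


def check_vector(vec, expected_dim):
    """Return a list of problems found in an embedding; empty means it looks sane."""
    problems = []
    if len(vec) != expected_dim:
        problems.append(f"expected {expected_dim} dims, got {len(vec)}")
    if any(not math.isfinite(x) for x in vec):
        problems.append("contains NaN or infinity")
    elif not any(x != 0 for x in vec):
        problems.append("all zeros")
    return problems
```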