DeepSeek: Thinking with Visual Primitives [pdf]

DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence

DeepSeek-V4: a million-token context that agents can use

Qwen/Qwen3.6-27B · Hugging Face

DeepSeek-V4 Technical Report [pdf]

Alibaba open-sources Qwen3.6-35B-A3B, a 35B MoE model with 3B active parameters

Bonsai 1.7B in the browser: a 290MB 1-bit LLM on WebGPU

Distilling 100B+ Models 40x Faster with TRL

Free virtual try-on API for a mobile app (student project)?

LLM Embeddings Explained: A Visual and Intuitive Guide

Show HN: Hacker News archive (47M+ items, 11.6GB) as Parquet, updated every 5m

Nemotron 3 Nano 4B: A Compact Hybrid Model for Efficient Local AI

Hugging Face Storage Buckets: Mutable, non-versioned object storage at $12/TB

Show HN: I logged Gemini's stock predictions for 38 days to study LLM drift

Show HN: 17MB model beats human experts at pronunciation scoring

Continuous batching (2025)

The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+

Show HN: Sweep, Open-weights 1.5B model for next-edit autocomplete

Anyone Can Clone Your Voice Now

Nvidia Nemotron 3-Nano 30B-A3B-BF16

Show HN: Text-to-video model from scratch (2 brothers, 2 years, 2B params)

Waypoint-1: Real-Time Interactive Video Diffusion from Overworld

Uncensored General Intelligence (UGI) Leaderboard

Anatomy of BoltzGen

Show HN: 30k IKEA items in flat text

Meta releases open datasets for training AI Co-Scientists

Show HN: Largest Public Dataset of Electronic Circuit Files

Qwen-Image-Layered: transparency and layer aware open diffusion model

AI Energy Score v2: Refreshed Leaderboard, now with Reasoning

DeepSeek-v3.2: Pushing the frontier of open large language models [pdf]
