Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents

Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models

Double Descent Demystified: size of smallest non-zero singular value of X

CaMeL: Defeating Prompt Injections by Design

Quantum-assured magnetic navigation with higher positioning accuracy than GPS

Do Large Language Models know who did what to whom?

Inferring the Phylogeny of Large Language Models

Nofl: A Precise Immix

BitNet b1.58 2B4T Technical Report

Teuken-7B-Base and Teuken-7B-Instruct: Towards European LLMs (2024)

Our quantum assembly parser got updated to the QASM 3.0 spec

M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

How is Google using AI for internal code migrations?

Three things everyone should know about Vision Transformers

Reasoning Models Can Be Effective Without Thinking

Ultra-precision formation flying demonstration for space-based interferometry

NoProp: Training neural networks without back-propagation or forward-propagation

Eccfrog512ck2: An Enhanced 512-Bit Weierstrass Elliptic Curve

ProtoGS: Efficient and High-Quality Rendering with 3D Gaussian Prototypes

Visualizing a Million Time Series with the Density Line Chart

SDFs from Unoriented Point Clouds Using Neural Variational Heat Distances

Optimizing Diffusion Models with Scale-wise Distillation: A Computational Efficiency Boost

All-in-Memory Stochastic Computing Using ReRAM

Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB

MooseAgent: A LLM Based Multi-Agent Framework for Automating Moose Simulation

DeepSeek: Inference-Time Scaling for Generalist Reward Modeling

PyGraph: Robust Compiler Support for CUDA Graphs in PyTorch

Evaluating Agent-Based Program Repair at Google

NNN: Next-Generation Neural Networks for Marketing Mix Modeling
