Optimizing Diffusion Models with Scale-wise Distillation: A Computational Efficiency Boost

All-in-Memory Stochastic Computing Using ReRAM

Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB

MooseAgent: A LLM Based Multi-Agent Framework for Automating Moose Simulation

DeepSeek: Inference-Time Scaling for Generalist Reward Modeling

PyGraph: Robust Compiler Support for CUDA Graphs in PyTorch

Evaluating Agent-Based Program Repair at Google

NNN: Next-Generation Neural Networks for Marketing Mix Modeling

UCSD: Large Language Models Pass the Turing Test

Banked Memories for Soft SIMT Processors