Testing "Raw" GPU Cache Latency

Tiny-gpu-compiler: An educational MLIR-based compiler targeting open-source GPU hardware

Parakeet.cpp – Parakeet ASR inference in pure C++ with Metal GPU acceleration

Show HN: A physically-based GPU ray tracer written in Julia

mdpt: Markdown TUI slides with GPU rendering (not terminal-dependent) — Rust

Numr: A high-performance numerical computing library with GPU acceleration

The Future for Tyr, a Rust GPU Driver for Arm Mali Hardware

Attyx – tiny and fast GPU-accelerated terminal emulator written in Zig

Show HN: Llama 3.1 70B on a single RTX 3090 via NVMe-to-GPU bypassing the CPU

Blinc: A declarative, reactive UI system with first-class state machines, spring physics animations, and GPU-accelerated rendering

Which GPU should I get for AI training/inference/fine-tuning?

A browser benchmark that actually uses all your CPU/GPU cores

I got 14.84x GPU speedup by studying how octopus arms coordinate

Optimizing GPU Programs from Java Using Babylon and Hat

Rust's Standard Library on the GPU

Deepseek research touts memory breakthrough, decoupling compute power and RAM pools to bypass GPU & HBM constraints — Engram conditional memory module commits static knowledge to system RAM

The Rapier physics engine 2025 review, 2026 goals, and GPU physics experiments

I wish to study compiler design and also wish to have a career in GPU Compiling. Please help me with the path

AWS raises GPU prices 15% on a Saturday, hopes you weren't paying attention

Diving into Qualcomm's Upcoming Adreno X2 GPU with Eric Demers

GPU memory snapshots: sub-second startup (2025)

qeep: Deep Learning framework in Go with Tensors, AutoGrad, and GPU acceleration

GPU Compiler Internship @Intel

Judging a Type by Its Pointer: Optimizing GPU Virtual Functions (2021)

Gigabyte CEO explains Nvidia's potential GPU supply strategy amid crushing memory shortages — gross revenue per gigabyte of GDDR7 memory could decide what products thrive

Show HN: GPU Cuckoo Filter – faster queries than Blocked Bloom, with deletion

Burn 0.20.0 Release: Unified CPU & GPU Programming with CubeCL and Blackwell Optimizations

Quick And Easy GPU Random Numbers In D3D11 (2013)
