SelfHostLLM - GPU Memory Calculator for LLM Inference
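
The calculator's exact method isn't included in this listing; as an illustration of what such a tool estimates, a common back-of-the-envelope rule is weight bytes plus KV-cache bytes plus some overhead. The figures below are assumed example values (roughly a 7B-parameter model in 16-bit precision), not SelfHostLLM's own numbers:

```go
package main

import "fmt"

func main() {
	// Assumed example figures, not taken from SelfHostLLM.
	params := 7e9        // parameter count
	bytesPerParam := 2.0 // fp16/bf16 weights
	layers, hidden := 32.0, 4096.0
	ctxLen, batch := 4096.0, 1.0
	kvBytes := 2.0 // fp16 KV cache

	weights := params * bytesPerParam
	// KV cache: one key and one value vector per layer, per token, per batch element.
	kvCache := 2 * layers * hidden * ctxLen * batch * kvBytes

	const gib = 1 << 30
	fmt.Printf("weights:  %.1f GiB\n", weights/gib)
	fmt.Printf("KV cache: %.1f GiB\n", kvCache/gib)
	fmt.Printf("total (+10%% overhead): %.1f GiB\n", 1.1*(weights+kvCache)/gib)
}
```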

Apple shows how much faster the M5 runs local LLMs compared to the M4

Suppressing an LLM's ability to lie makes it more likely to claim it's conscious

Show HN: Stun LLMs with thousands of invisible Unicode characters
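
The post's actual technique isn't reproduced here; the general idea is interleaving zero-width code points that render as nothing but still inflate the byte and token count of the text. A minimal illustrative sketch:

```go
package main

import (
	"fmt"
	"strings"
)

func main() {
	// Zero-width space, non-joiner, joiner, and BOM: invisible when rendered.
	invisible := []rune{'\u200B', '\u200C', '\u200D', '\uFEFF'}

	var b strings.Builder
	for i, r := range []rune("hello") {
		b.WriteRune(r)
		b.WriteRune(invisible[i%len(invisible)]) // pad after every visible rune
	}
	out := b.String()

	fmt.Println(out)               // still displays as "hello"
	fmt.Println(len(out), "bytes") // but the string is several times longer
}
```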

qq.fish: A tiny, local LLM assistant that (almost) everyone can run, using LMStudio to propose commands
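
qq.fish's own code isn't shown here; as a sketch of the general approach, LMStudio exposes an OpenAI-compatible local API (the port and model name below are assumptions), so a command-proposing assistant can be a single chat-completion request:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Build an OpenAI-style chat request asking for a single shell command.
	body, _ := json.Marshal(map[string]any{
		"model": "local-model", // placeholder; LMStudio serves whatever model is loaded
		"messages": []map[string]string{
			{"role": "system", "content": "Reply with a single shell command, nothing else."},
			{"role": "user", "content": "list the five largest files in this directory"},
		},
	})

	resp, err := http.Post("http://localhost:1234/v1/chat/completions",
		"application/json", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var out struct {
		Choices []struct {
			Message struct {
				Content string `json:"content"`
			} `json:"message"`
		} `json:"choices"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		panic(err)
	}
	if len(out.Choices) > 0 {
		fmt.Println("proposed command:", out.Choices[0].Message.Content)
	}
}
```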

Great, now even malware is using LLMs to rewrite its code, says Google, as it documents new phase of 'AI abuse'

LLM agents demystified

Worries about Open Source in the age of LLMs

Show HN: Gerbil – an open source desktop app for running LLMs locally

I built an LLM inference server in pure Go that loads HuggingFace models directly (10MB binary, no Python)
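
The server's loader isn't shown in this listing; as background, HuggingFace weights are typically distributed as .safetensors files with a simple layout: an 8-byte little-endian header length, a JSON header describing each tensor's dtype, shape, and byte offsets, then the raw tensor data. A minimal sketch of reading that header in plain Go (the file path is a placeholder):

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
	"io"
	"os"
)

func main() {
	f, err := os.Open(os.Args[1]) // e.g. path to a model.safetensors file
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// First 8 bytes: length of the JSON header, little-endian uint64.
	var n uint64
	if err := binary.Read(f, binary.LittleEndian, &n); err != nil {
		panic(err)
	}

	// Next n bytes: JSON header mapping tensor names to dtype/shape/offsets.
	buf := make([]byte, n)
	if _, err := io.ReadFull(f, buf); err != nil {
		panic(err)
	}

	var header map[string]json.RawMessage
	if err := json.Unmarshal(buf, &header); err != nil {
		panic(err)
	}
	for name := range header {
		fmt.Println(name) // tensor names, plus a "__metadata__" entry if present
	}
}
```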