This startup wants to take on Nvidia with a server-on-a-chip to eliminate what it calls an already flawed system: a faster GPU, CPU, LPU, TPU or NIC will not deliver the leap that many firms are aiming for

According to Israeli startup NeuReality, many AI possibilities aren’t fully realized because of the cost and complexity of building and scaling AI systems. Current solutions are not optimized for inference and rely on general-purpose CPUs, which were not designed for AI. Moreover, CPU-centric architectures require multiple hardware components, resulting in underutilized Deep Learning Accelerators (DLAs) …

Groq LPU (Language Processing Unit) performance tested – capable of 500 tokens per second

A new player has entered the field of artificial intelligence in the form of the Groq LPU (Language Processing Unit). Groq can process over 500 tokens per second using the Llama 7B model. The Groq Language Processing Unit (LPU) is powered by a chip that’s been meticulously crafted to perform swift inference …