Intel quietly launched a mysterious new AI CPU series that promises to bring deep learning inference and computing to the edge — but you won’t be able to plug the chips into a motherboard anytime soon

Intel has launched a new AI processor series for the edge, promising industrial-class deep learning inference. The new ‘Amston Lake’ Atom x7000RE chips offer up to double the cores and double the graphics base frequency of the previous x6000RE series, all neatly packed within a 6W–12W BGA package.

The x7000RE series packs more performance into a smaller footprint. Boasting up to eight E-cores, it supports LPDDR5/DDR5/DDR4 memory and up to nine PCIe 3.0 lanes, delivering robust multitasking capabilities.

Chip firm founded by ex-Intel president plans massive 256-core CPU to surf AI inference wave and give Nvidia B100 a run for its money — Ampere Computing AmpereOne-3 likely to support PCIe 6.0 and DDR5 tech

Ampere Computing unveiled its AmpereOne Family of processors last year, boasting up to 192 single-threaded Ampere cores, the highest core count in the industry at the time.

These chips, designed for cloud efficiency and performance, were Ampere’s first products based on its new custom core leveraging internal IP, signalling a shift in the sector, according to CEO Renée James.

AMD teams up with Arm to unveil AI chip family that does preprocessing, inference and postprocessing on a single piece of silicon — but you will have to wait more than 12 months to get actual products

AMD is introducing two new adaptive SoCs – Versal AI Edge Series Gen 2 for AI-driven embedded systems, and Versal Prime Series Gen 2 for classic embedded systems.

Multi-chip solutions typically come with significant overheads, but a single hardware architecture isn’t fully optimized for all three AI phases – preprocessing, AI inference, and postprocessing – and that is the gap these new adaptive SoCs aim to close.

Samsung is going after Nvidia’s billions with new AI chip — Mach-1 accelerator will combine CPU, GPU and memory to tackle inference tasks but not training

Samsung is reportedly planning to launch its own AI accelerator chip, the ‘Mach-1’, in a bid to challenge Nvidia’s dominance in the AI semiconductor market.

The new chip, which will likely target edge applications with low power consumption requirements, will go into production by the end of this year and make its debut in early 2025, according to the Seoul Economic Daily.

Inference: The future of AI in the cloud

Now that it’s 2024, we can’t overlook the profound impact that Artificial Intelligence (AI) is having on operations across businesses and market sectors. Government research has found that one in six UK organizations has embraced at least one AI technology within its workflows, and that number is expected to grow through to 2040.

With increasing AI and Generative AI (GenAI) adoption, the future of how we interact with the web hinges on our ability to harness the power of inference. Inference happens when a trained AI model uses real-time data to predict or complete a task; it is the model’s moment of truth, testing how well it can apply the knowledge gained during training. Whether you work in healthcare, ecommerce or technology, the ability to tap into AI insights and achieve true personalization will be crucial to customer engagement and future business success.
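To make the training/inference distinction concrete, here is a minimal sketch in PyTorch: a model whose parameters were already learned during training is switched to evaluation mode and applied to new data. The tiny architecture and random input are hypothetical placeholders, not a production pipeline.

```python
import torch
import torch.nn as nn

# Hypothetical small classifier; in practice the weights would come from a
# training run, e.g. model.load_state_dict(torch.load("trained_model.pt")).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
model.eval()  # switch off training-only behaviour such as dropout

# Inference: apply the learned parameters to data the model has never seen.
new_data = torch.randn(1, 16)  # stand-in for a single real-time input
with torch.no_grad():  # predicting, not learning: no gradients needed
    prediction = model(new_data).argmax(dim=1)

print(f"Predicted class: {prediction.item()}")
```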

Inference: The key to true personalisation

SteerLM: a simple technique to customize LLMs during inference

Large language models (LLMs) have driven significant strides in artificial intelligence (AI) and natural language generation. Models such as GPT-3, Megatron-Turing, Chinchilla, PaLM 2, Falcon, and Llama 2 have revolutionized the way we interact with technology. However, despite this progress, these models often struggle to provide nuanced responses that align with user preferences, a limitation that has driven the exploration of new techniques to improve and customize LLMs.

Traditionally, the improvement of LLMs has been achieved through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). While these methods have proven effective, they come with their own set of challenges. The complexity of training and the lack of user control over the output are among the most significant limitations.

In response to these challenges, the NVIDIA Research Team has developed a new technique known as SteerLM. This innovative approach simplifies the customization of LLMs and allows for dynamic steering of model outputs based on specified attributes. SteerLM is a part of NVIDIA NeMo and follows a four-step technique: training an attribute prediction model, annotating diverse datasets, performing attribute-conditioned SFT, and relying on the standard language modeling objective.
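As a rough illustration of the attribute-conditioned SFT step, the core idea is to prepend each training example with its annotated attribute values, so that ordinary supervised fine-tuning learns to associate those labels with the style of the response. The template, attribute names, and score scale below are assumptions made for illustration, not NeMo’s actual SteerLM format.

```python
def format_steerlm_example(prompt: str, response: str, attributes: dict) -> str:
    """Prepend attribute annotations so standard SFT conditions on them.

    Hypothetical template; the real SteerLM data format in NeMo differs.
    """
    attr_str = ",".join(f"{name}:{value}" for name, value in attributes.items())
    return f"<attributes>{attr_str}</attributes>\nUser: {prompt}\nAssistant: {response}"

# One record as it might look after the dataset-annotation step.
example = format_steerlm_example(
    prompt="Explain quantum computing to a child.",
    response="Imagine a coin that can be heads and tails at the same time...",
    attributes={"quality": 9, "humor": 6, "toxicity": 0},  # hypothetical attributes
)
print(example)
# Strings like this are fed to standard supervised fine-tuning, so only the
# data pipeline changes; the language modeling objective stays the same.
```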

Customize large language models

One of the most notable features of SteerLM is its ability to adjust attributes at inference time. This feature enables developers to define preferences relevant to the application, thereby allowing for a high degree of customization. Users can specify desired attributes at inference time, making SteerLM adaptable to a wide range of use cases.
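Continuing the hypothetical template from the sketch above, steering at inference time amounts to changing the attribute values in the prompt while the fine-tuned weights stay fixed:

```python
# Same model, steered two different ways purely by the prompt's attributes.
sober_prompt = format_steerlm_example(
    prompt="Describe black holes.",
    response="",  # left empty: the model generates the continuation
    attributes={"quality": 9, "humor": 0, "toxicity": 0},
)
playful_prompt = format_steerlm_example(
    prompt="Describe black holes.",
    response="",
    attributes={"quality": 9, "humor": 8, "toxicity": 0},
)
# Feeding each string to the fine-tuned model (via whatever generate() call
# the serving stack exposes) would yield a sober explanation in the first
# case and a light-hearted one in the second, with no retraining in between.
```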

The potential applications of SteerLM are vast and varied. It can be used in gaming, education, enterprise, and accessibility, among other areas. The ability to customize LLMs to suit specific needs and preferences opens up a world of possibilities for developers and end-users alike.

In comparison to other advanced customization techniques, SteerLM simplifies the training process and makes state-of-the-art customization capabilities more accessible to developers. It uses standard techniques like SFT, requiring minimal changes to infrastructure and code. Moreover, it can achieve reasonable results with limited hyperparameter optimization.

The performance of SteerLM is not just theoretical. In experiments, SteerLM 43B achieved state-of-the-art performance on the Vicuna benchmark, outperforming existing RLHF models like LLaMA 30B RLHF. This achievement is a testament to the effectiveness of SteerLM and its potential to revolutionize the field of LLMs.

The straightforward training process of SteerLM can lead to customized LLMs with accuracy on par with more complex RLHF techniques. This makes high levels of accuracy more accessible and enables easier democratization of customization among developers.

SteerLM represents a significant advancement in the field of LLMs. By simplifying the customization process and allowing for dynamic steering of model outputs, it overcomes many of the limitations of current LLMs. Its potential applications are vast, and its performance is on par with more complex techniques. As such, SteerLM is poised to play a crucial role in the future of LLMs, making them more user-friendly and adaptable to a wide range of applications.

To learn more about SteerLM and how it can be used to customise large language models during inference, jump over to the official NVIDIA developer website.

Source & Image: NVIDIA
