AiThority - Artificial Intelligence | News | Insights | AiThority
Run:ai, the leader in compute orchestration for AI workloads, announced new features of its Atlas Platform, including two-step model deployment — which makes it easier and faster to get machine learning models into production. The company also announced a new integration with NVIDIA Triton Inference Server. These capabilities are particularly focused on supporting organizations in deploying and using AI models for inference workloads on NVIDIA-accelerated computing, so they can provide accurate, real-time responses. The features cement Run:ai Atlas as a single unified platform where AI teams, from data scientists to MLOps engineers, can build, train and manage models in production from one simple interface.
AI models can be challenging to deploy into production; despite the time and effort spent building and training them, most never leave the lab. Configuring a model, connecting it to data and containers, and allocating only the compute it actually needs are major barriers to making AI work in production. Deploying a model usually requires manually editing and loading tedious YAML configuration files. Run:ai's new two-step deployment simplifies the process, enabling organizations to quickly switch between models, optimize for economical use of GPUs, and ensure that models run efficiently in production.
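As a rough illustration of the manual configuration work described above, a Kubernetes-style deployment manifest for a single inference service might look like the following (the service name and container image are hypothetical, and Run:ai's actual configuration format may differ):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sentiment-model            # hypothetical model service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sentiment-model
  template:
    metadata:
      labels:
        app: sentiment-model
    spec:
      containers:
      - name: server
        image: registry.example.com/sentiment:1.4   # hypothetical image
        resources:
          limits:
            nvidia.com/gpu: 1    # dedicates a whole GPU, even to a light model
```

Hand-maintaining files like this for every model, and dedicating a full GPU per container, is exactly the friction the two-step deployment flow aims to remove.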
Run:ai also announced full integration with NVIDIA Triton Inference Server, which allows organizations to deploy multiple models, or multiple instances of the same model, and run them in parallel within a single container. NVIDIA Triton Inference Server is included in the NVIDIA AI Enterprise software suite, which is fully supported and optimized for AI development and deployment. Run:ai's orchestration works on top of NVIDIA Triton and provides auto-scaling, allocation and prioritization on a per-model basis, right-sizing Triton automatically. Using Run:ai Atlas with NVIDIA Triton increases compute resource utilization while simplifying AI infrastructure. The Run:ai Atlas Platform is an NVIDIA AI Accelerated application, indicating it is developed on the NVIDIA AI platform for performance and reliability.
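Triton's parallel-instance behavior is controlled by each model's configuration file in the model repository. A minimal sketch of a `config.pbtxt` that runs two instances of the same model side by side on one GPU (the model name and backend here are hypothetical):

```protobuf
name: "resnet50_demo"            # hypothetical model name
platform: "onnxruntime_onnx"     # hypothetical backend
max_batch_size: 8
instance_group [
  {
    count: 2                     # two parallel instances of this model
    kind: KIND_GPU
  }
]
```

Triton then schedules incoming requests across both instances within the same server container, which is the per-model granularity Run:ai's orchestration scales and prioritizes.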
Running inference workloads in production requires fewer resources than training, which consumes large amounts of GPU compute and memory. Organizations sometimes run inference workloads on CPUs instead of GPUs, but this typically comes at the cost of higher latency. In many AI use cases, the end user requires a real-time response: identifying a stop sign, facial recognition on a phone, or voice dictation, for example. CPU-based inference can be too slow for these applications.
Using GPUs for inference workloads gives lower latency and higher accuracy, but this can be costly and wasteful when GPUs are not fully utilized. Run:ai’s model-centric approach automatically adjusts to diverse workload requirements. With Run:ai, using a full GPU for a single lightweight workload is no longer required, saving considerable cost while maintaining low latency.
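The right-sizing idea can be illustrated with a toy calculation. This is an illustrative sketch only, not Run:ai's actual algorithm; the function names, capacity figures, and the 40 GB GPU assumption are all made up for the example:

```python
import math

def replicas_needed(requests_per_s: float,
                    capacity_per_replica: float,
                    min_replicas: int = 1,
                    max_replicas: int = 8) -> int:
    """Toy per-model autoscaling rule: provision just enough replicas
    to cover the observed request rate, within fixed bounds."""
    needed = math.ceil(requests_per_s / capacity_per_replica)
    return max(min_replicas, min(needed, max_replicas))

def gpu_fraction(model_mem_gb: float, gpu_mem_gb: float = 40.0) -> float:
    """Toy fractional-GPU sizing: share one GPU among lightweight models
    instead of dedicating a whole device to each."""
    return min(1.0, model_mem_gb / gpu_mem_gb)

# A lightweight model serving 90 req/s at 40 req/s per replica,
# with a 4 GB memory footprint on a 40 GB GPU:
print(replicas_needed(90, 40))   # 3 replicas
print(gpu_fraction(4.0))         # 0.1 of the GPU
```

Under assumptions like these, ten such models could share a single GPU rather than occupying ten, which is the cost saving the paragraph describes.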
Other new features of Run:ai Atlas for inference workloads include:
“With new advanced inference capabilities, Run:ai’s Atlas Platform now offers a solution for the entire AI lifecycle — from build to train to inference — all delivered in a single platform,” said Ronen Dar, CTO and co-founder of Run:ai. “Instead of using multiple different MLOps and orchestration tools, data scientists can benefit from one unified, powerful platform to manage all their AI infrastructure needs.”
“The flexibility and portability of NVIDIA Triton Inference Server, available with NVIDIA AI Enterprise support, enables fast, simple scaling and deployment of trained AI models from any framework on any GPU- or CPU-based infrastructure,” said Shankar Chandrasekaran, senior product manager at NVIDIA. “Triton Inference Server’s advanced performance and ease of use together with orchestration from Run:ai’s Atlas Platform make it the ideal foundation for AI model deployment.”
AIT News Desk is a trained group of web journalists and reporters who collect news from all over the technology landscape. The technical space includes advanced technologies related to AI, ML, ITops, Cloud Security, Privacy and Security, Cyberthreat intelligence, Space, Big data and Analytics, Blockchain and Crypto. To connect, please write to AiT Analyst at news@martechseries.com.
Copyright © 2022 AiThority. All Rights Reserved. Privacy Policy