
NVIDIA has introduced NVIDIA Inference Microservices (NIM), a technology designed to simplify the deployment and operation of AI models in enterprise environments. NIM packages pre-trained AI models as containerized microservices that can be deployed across domains such as language, vision, and robotics. The initiative, championed by CEO Jensen Huang, aims to transform generative AI development by enabling teams of expert AI agents that collaborate on assigned missions. Integration with platforms like KServe and tools like Weights & Biases lets AI models be deployed and managed as routinely as any other large enterprise application. NVIDIA's own operations illustrate the approach: an AI Planner agent built on NIM cut re-planning time from hours to seconds. NIM also provides embedding inference microservices and has been adopted by companies such as Dataloop to accelerate AI deployment.
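Because a deployed NIM container serves an OpenAI-compatible HTTP API, calling a model reduces to an ordinary REST request. A minimal sketch, assuming a NIM is already running locally on port 8000; the model name and prompt are placeholders, not confirmed identifiers:

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-style chat-completions payload, the request format NIM serves."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def query_nim(base_url: str, payload: dict) -> dict:
    """POST the payload to a NIM's /v1/chat/completions endpoint (needs a live NIM)."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Placeholder model name for illustration only.
payload = build_chat_request("meta/llama3-8b-instruct", "Summarize the supply-chain plan.")
# result = query_nim("http://localhost:8000", payload)  # uncomment against a running NIM
```

Because the API shape matches the OpenAI convention, existing client tooling can typically point at the NIM endpoint by changing only the base URL.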
Dataloop Integrates NVIDIA NIM to Accelerate Running and Deploying Generative AI -- https://t.co/TaIKUP1kKa #AI #GenAI @DataloopAI @NVIDIAAI
Using generative AI, NVIDIA operations built an AI Planner agent, developed on NVIDIA Inference Microservices (NIM). The agent combines LLM, NeMo Retriever and cuOpt NIM microservices to reduce re-planning time from hours to just seconds. https://t.co/V5ISjCSDxa https://t.co/cSCBe7EwYJ https://t.co/52gHDRaQAj
Speed up AI app deployment from weeks to minutes with NVIDIA NIM's embedding inference microservices. With Weights & Biases integrated, developers can now build and deploy domain-specific GenAI apps with optimized inference. Learn more: https://t.co/0XKafLroAX https://t.co/8b4o6ZzeME
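An embedding inference microservice turns text into vectors; a domain-specific GenAI app then retrieves relevant documents by vector similarity before generation. A minimal sketch of that retrieval step, with toy vectors standing in for real embedding-service output:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_match(query_vec: list[float], doc_vecs: list[list[float]]) -> int:
    """Index of the document embedding most similar to the query embedding."""
    return max(range(len(doc_vecs)), key=lambda i: cosine(query_vec, doc_vecs[i]))

# Toy 2-D vectors; real embeddings from an embedding service have hundreds of dimensions.
docs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
query = [0.9, 0.1]
best = top_match(query, docs)  # → 0: the query points nearly along the first doc's vector
```

In a production pipeline, both `docs` and `query` would come from the embedding endpoint, and a vector database would replace the brute-force scan.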




