Blog Posts

NVIDIA Launches a New “Brain”: Smarter Robots Are Coming!

As technology evolves rapidly, artificial intelligence and robotics continue to converge, producing notable breakthroughs. Recently at SIGGRAPH, NVIDIA unveiled its new Cosmos world model, a development that could transform the robotics industry.

NVIDIA’s Cosmos World Model

Cosmos is a model designed to generate synthetic data that obeys real-world physical laws. It helps robots better understand and adapt to the physical environment. Since its launch, Cosmos has been adopted by leading robotics and autonomous driving companies, including Figure, Agility Robotics, and General Motors.

What’s New in the Upgrade?

Software Enhancements

Cosmos Reason: A 7-billion-parameter vision-language model with reasoning capabilities that assists robots in task planning. It understands physics and applies common sense to multi-step reasoning.

Cosmos Transfer-2 & Lite Version: Speeds up converting virtual scenes into training data with a streamlined distillation process, enabling...

OpenAI's Delay, Manus's Retreat from China, Baichuan's Brain Drain: Three Undercurrents in July

This week, the AI industry saw no "world-changing" headlines, but three significant events quietly unfolded: OpenAI once again postponed the release of its first open-weight model; Manus moved its headquarters from Beijing to Singapore, reducing its China team to just over 40 people; and Xie Jian, co-founder and CTO of Baichuan Intelligence, confirmed his departure, which leaves only two members of the founding team. These three events point to a common undercurrent: resource reallocation, compliance costs, and technical roadmap uncertainties. Below, we break down the key facts for peers, investors, and anyone concerned with the foundational infrastructure.

1. OpenAI Open-Source Model Further Delayed

On July 12, Sam Altman confirmed on X that the release had been postponed again for "additional safety testing," without providing a new timeline. Red-teaming test cases have doubled to 28,000, 41% of which relate to the spread of disinformation. The delay reflects a re...

Google Open Sources Gemma 3n: The Most Capable Sub-10B Multimodal Model That Runs on Just 2GB RAM

Balancing performance and memory efficiency has long been a challenge for AI models running on edge devices. Google's newly open-sourced Gemma 3n tackles this issue head-on. Designed with an efficient multimodal architecture, Gemma 3n delivers state-of-the-art capabilities while requiring minimal memory: just 2GB or 3GB, depending on the variant. It redefines what’s possible for on-device AI.

Key Features of Gemma 3n

Multimodal from the Ground Up

Gemma 3n natively supports images, audio, video, and text as input, with text as output. This flexibility makes it suitable for a wide range of applications, from real-time transcription and translation to interactive visual understanding.

Built for the Edge

Two optimized configurations are available:
E2B: 2GB runtime memory, equivalent to 2B effective parameters
E4B: 3GB runtime memory, equivalent to 4B effective parameters

Although their total parameter counts are 5B and 8B, architecture innovations make their memory requireme...

AI Text Generation API Documentation

1. Overview

This document outlines the usage and specifications of the AI Text Generation API. Powered by advanced artificial intelligence models, this API can generate high-quality text based on user-provided prompts. It is ideal for use cases such as intelligent writing assistants, content creation tools, and chatbots.

2. Basic Information

API Name: AI Text Generation API
Version: 1.0
HTTP Method: POST
Endpoint: https://api.example.com/ai/text/generate

3. Request Parameters

| Parameter   | Type   | Required | Description                                                   |
| prompt      | string | Yes      | Input prompt to guide the text generation                     |
| max_tokens  | int    | No       | Maximum length of generated text; default is 100              |
| temperature | float  | No       | Controls randomness of output; range: 0–1, default is 0.7     |

4. Response Parameters

| Parameter | Type   | Description                                      |
| code      | int    | Response status code (200 = success)             |
| message   | string | Response message; empty on success               |
| data      | object | Contains the generated text                      |
| - text    | string | The generated text content (nested under data)   |

5. Request Example

{ ...
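
As a rough illustration of the request and response parameters documented above, a client call might look like the following Python sketch. Several things here are assumptions, not part of the documentation: the function names (`build_request`, `extract_text`, `generate`), that the parameters are sent as a JSON body with a `Content-Type: application/json` header, and that no authentication is needed (the excerpt does not mention any). The endpoint, parameter names, defaults, and response fields come from the sections above.

```python
import json
import urllib.request

# Endpoint as listed under "Basic Information".
API_URL = "https://api.example.com/ai/text/generate"


def build_request(prompt, max_tokens=100, temperature=0.7):
    """Build the POST request; defaults mirror the documented ones."""
    if not prompt:
        raise ValueError("prompt is required")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be in the range 0-1")
    payload = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the API accepts a JSON body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_text(body):
    """Return data.text from a decoded response; raise if code is not 200."""
    if body.get("code") != 200:
        raise RuntimeError(f"API error {body.get('code')}: {body.get('message')}")
    return body["data"]["text"]


def generate(prompt, **options):
    """Send the request and return the generated text (makes a network call)."""
    request = build_request(prompt, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return extract_text(json.load(response))
```

A call such as `generate("Write a haiku about autumn", max_tokens=50)` would then return the generated string. Since `api.example.com` is a placeholder host, the call only works once the real endpoint (and any credentials the service requires) are substituted.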