
Google Unveils Stitch: A New AI-Powered Design Tool
Google has launched Stitch, an AI tool designed to streamline the creative process for designers. It offers advanced features to enhance productivity and innovation in design workflows.
All AI stories, newest first.

Researchers propose Group Fine-Tuning (GFT), a method that combines imitation and reward learning to improve LLM training. GFT addresses key challenges like single-path dependency and gradient instability.

Researchers demonstrate that individual experts in sparse Mixture-of-Experts (MoE) models have causally meaningful identities. This discovery enables more precise control over model behavior through geometric routing techniques.

Google's Gemini AI can now pull from Google Photos to create personalized images. This feature leverages the Nano Banana 2 model for tailored visual content based on user data.

Researchers introduce Fun-TSG, a function-driven tool for generating multivariate time series with detailed anomaly labels. This addresses key limitations in current benchmark datasets for anomaly detection.

Researchers introduce Credo, a framework for declarative control of LLM pipelines using beliefs and policies. It aims to make agent behavior more transparent and adaptable than current imperative approaches.

Researchers introduce AIBuildAI, an AI agent designed to automate the complex process of building AI models. This innovation aims to reduce the manual effort required in model development, potentially democratizing AI creation.

Researchers introduce Weight Patching, a new method for source-level mechanistic localization in LLMs. This approach promises to identify the exact parameters responsible for specific capabilities, advancing our understanding of how these models work.

Researchers introduce WebXSkill, a framework that combines executable skills with natural language understanding to improve autonomous web agents. This innovation addresses the grounding gap in current LLM-powered agents, enhancing their ability to complete complex browser tasks.

A new arXiv paper quantifies how floating-point precision issues in large language models lead to chaotic behavior. The research highlights the need for better numerical stability in AI systems.

Researchers introduce SciFi, a new agentic AI framework designed for safe, autonomous execution of scientific tasks. The system combines isolated environments and self-assessment mechanisms to enhance reliability in research applications.

Researchers introduce RiskWebWorld, a realistic benchmark for evaluating GUI agents in high-stakes e-commerce risk management. It features 1,513 tasks from production risk-control pipelines, addressing a gap in current benchmarks.

Researchers introduce ReSS, a hybrid framework that merges symbolic and neural models to improve tabular data prediction. The approach aims to enhance both accuracy and human-understandable reasoning in high-stakes domains like healthcare and finance.

OpenAI has introduced GPT-Rosalind, a specialized AI model designed to enhance drug discovery, genomics analysis, and protein reasoning. This model aims to accelerate scientific research workflows in the life sciences sector.

A new study applies conformal prediction to quantify uncertainty in large reasoning models, addressing gaps in traditional methods. The approach yields uncertainty sets with statistical guarantees, which matter for complex reasoning tasks.

Researchers have developed a method to quantify exploration and exploitation in language model agents without accessing their internal policies. The method could help improve agent decision-making in complex tasks.

Researchers find that the primary bottleneck in scaling multimodal large language models (MLLMs) is knowledge density in training data, not task format. Task-specific supervision like Visual Question Answering (VQA) adds little incremental semantic information beyond image captions.

Researchers introduce GeoAgentBench, a dynamic evaluation framework for LLM-based geographic information systems (GIS). It addresses gaps in static testing by assessing real-time, multimodal spatial analysis capabilities.

Fleeks is a new infrastructure platform designed to remove bottlenecks for AI agents, enabling them to execute, verify, and integrate code seamlessly. The platform aims to bridge the gap between code generation and real-world application.

Researchers introduce CONCORD, a framework for privacy-aware AI collaboration. It enables assistants to work together while capturing only the owner's speech, addressing key privacy concerns.