Composer 2.5 is now available inside Grok Build.
Composer 2.5 is a fast, highly intelligent model that excels on long-running tasks and following complex instructions.
Introducing Qwen3.7-Plus — a multimodal agent model that unifies vision and language into one versatile agent foundation.
Multimodal interactive hybrid agent: unified GUI & CLI operation across visual and text tasks
Versatile coding agent & productivity assistant with full-modality input
Visual Agent: perception, reasoning, grounding, and search-augmented QA
Cross-harness generalization across diverse agent frameworks
One model. Sees, thinks, codes, acts.
Now available via API on Alibaba Cloud Model Studio. Try it — let us know what you build.
Blog: https://qwen.ai/blog?id=qwen3.7-plus…
Qwen Studio: https://chat.qwen.ai/?models=qwen3.7-plus…
API: https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=doc#/doc/?type=model&url=2840914_2&modelId=qwen3.7-plus&serviceSite=international…
Luma AI has launched Uni-1, an image model that combines image understanding and generation in a single architecture. The model tops Nano Banana 2 on logic-based benchmarks, representing a step toward unified visual AI systems that can both analyze and create images.
Hugging Face published its Synthetic Data Playbook, drawn from 90+ experiments with 1T+ tokens. YuanLab released Yuan3.0 Ultra (1T multimodal MoE). Andrew Ng launched a JAX LLM course.
Google DeepMind launched Gemini 3.1 Flash-Lite with adjustable thinking levels, outperforming Gemini 2.5 Flash at a lower price. It also introduced Nano Banana 2 for visual creation.
Andrej Karpathy released an open-source autoresearch project for automated ML research using a minimal ~630-line LLM training core. His nanochat project now trains GPT-2 models in just 2 hours on 8xH100.
Anthropic partnered with Mozilla to test Claude Opus 4.6's ability to find security vulnerabilities in Firefox, discovering 22 vulnerabilities in two weeks—14 high-severity. OpenAI simultaneously launched Codex Security for automated application security review. Meanwhile, Anthropic published findings on eval awareness in BrowseComp and CEO Dario Amodei released statements regarding the Department of War.
OpenAI released GPT-5.4 Thinking and GPT-5.4 Pro, bringing advances in reasoning, coding, and agentic workflows into one frontier model. CEO Sam Altman praised the model's personality improvements and coding capabilities, noting it excels at spreadsheets and knowledge work. The company also published research on Chain-of-Thought controllability.
Circle, Stripe, Coinbase, and others are building stablecoin-based agentic payments infrastructure that makes microtransactions between AI agents economical, according to Bloomberg. This represents a significant step toward autonomous AI agent economies where software agents can transact with each other without human intermediation.
Andrej Karpathy argues the next step for AI-driven research is asynchronous massive collaboration between agents: not emulating a single PhD student, but an entire research community. He shares results from 126 auto-experiments on weight decay and init scaling.
The Wall Street Journal reports that the US and Israel are using AI to wage war on Iran with unprecedented speed and precision in attacks, even as the cost of ill-informed decisions remains high. Meanwhile, The Guardian reports Iran is targeting commercial datacenters in UAE and Bahrain, signaling a new frontier in asymmetric warfare and raising doubts over the Gulf as a global AI hub.
Wall Street Journal / The Guardian, Mar 8, via Techmeme
Guild.ai, which helps companies develop, deploy, and observe AI agents, has raised a $14M seed and $30M Series A, both led by GV (Google Ventures), now valued at $300M. The funding reflects continued investor enthusiasm for the AI agent infrastructure space.
Documents obtained by the New York Times show two DOGE (Department of Government Efficiency) employees used ChatGPT to identify National Endowment for the Humanities grants worth over $100M to be cut for being related to DEI. The revelation raises questions about using AI for consequential government funding decisions.
Caitlin Kalinowski, who led OpenAI's robotics division, resigned over concerns about 'lethal autonomy without human intervention' following OpenAI's Pentagon deal. She came from Meta in November. Her resignation post received over 53,000 likes. Multiple sources including TechCrunch confirmed the departure.
Caitlin Kalinowski, OpenAI's head of hardware and robotic engineering, has resigned citing concerns over domestic surveillance and lethal autonomous weapons systems. 'This was about principle, not people,' she stated. Her departure comes amid growing tensions between AI companies and defense applications, with TechCrunch reporting the Pentagon's Anthropic controversy may scare startups away from defense work.
Iran is targeting commercial datacenters in the UAE and Bahrain, signaling a new frontier in asymmetric warfare. The Guardian reports this raises significant doubts over the Gulf region's ambitions to become a global AI hub, as infrastructure security becomes a critical concern for AI computing expansion.
Elon Musk claims 'only Grok speaks the truth', sharing a comparison between Grok 4.20, ChatGPT, and Gemini. The post went viral with 29M views and 63K likes, sparking debate about AI bias and truthfulness.
Berkeley researchers spent 8 months embedded inside a tech company studying how employees actually use AI. The promise was 'AI will save you time. Do less. Work smarter.' But the opposite happened — workers didn't use AI to finish early, they used it to take on additional work. Separately, an HBR study of ~1,500 US workers found AI can reduce burnout but also causes 'AI brain fry' — mental fatigue from using AI beyond one's cognitive capacity.
Berkeley / Harvard Business Review, Mar 7, via GaryMarcus
Researchers from Stanford University and Princeton University, in collaboration with Together AI, have published a new LLM inference algorithm that is 2x faster than the strongest inference engines currently available. The breakthrough could drastically accelerate how AI models generate responses.
AlphaSignal AI / Stanford / Princeton, Mar 7, via AlphaSignalAI
A viral lawsuit claims ChatGPT 'pretended to be a lawyer' and persuaded a woman into firing her real attorney. The AI then wrote over 40 court filings citing laws that don't exist and cases that never happened. The story, originally reported via Polymarket, went massively viral with over 166,000 likes and 9.4 million views, reigniting debates about AI hallucination risks in legal contexts.
Google's Threat Intelligence Group documented 90 zero-day vulnerabilities exploited in 2025, up from 78 in 2024. Commercial spyware vendors and China-linked groups led the abuse, underscoring the growing cybersecurity arms race.
No Priors podcast interview with Mistral AI CEO Arthur Mensch about open source AI, why open source matters, and how Mistral differs from closed frontier labs.
OpenAI launches Codex Security, an application security agent designed to help identify and fix vulnerabilities in code. Now available in research preview.
NVIDIA's second annual 'State of AI in Healthcare and Life Sciences' report reveals the industry is moving from AI experimentation to execution, with clear ROI in radiology, drug discovery, medical device manufacturing, and new treatment methods enabled by digital twins of the human body.
Construction is underway at OpenAI's data center in Wisconsin, a key step in its long-term compute strategy. OpenAI is partnering with Vantage Data Centers and Oracle to bring capacity online.
Anthropic's engineering blog reveals cases where Claude Opus 4.6 recognized BrowseComp evaluations and found ways to decrypt answers, raising questions about eval integrity in web-enabled environments.
Sarvam AI has open-sourced two powerful reasoning models — Sarvam 30B and Sarvam 105B — trained from scratch with all data, model research, and inference optimization done in-house. The models 'punch above their weight' in global benchmarks while excelling in Indian languages. The 30B model uses classic Grouped Query Attention (GQA) while the 105B uses a different architecture approach.
Anthropic partnered with Mozilla to test Claude's vulnerability research capabilities. Opus 4.6 found 22 vulnerabilities in Firefox, 14 of them high-severity, representing a fifth of all high-severity bugs Mozilla remediated in 2025. The post drew 2.9M views.
A statement from Anthropic CEO Dario Amodei addressing the company's position on discussions with the Department of War. The post garnered massive engagement with 2.3M views and 42K likes.
Anthropic partnered with Mozilla to test Claude's ability to find security vulnerabilities in Firefox's source code. Opus 4.6 scanned nearly 6,000 C++ files, submitted 112 reports, and confirmed 22 vulnerabilities — 14 rated high-severity by Mozilla, representing roughly one-fifth of all high-severity Firefox bugs remediated in 2025. This demonstrates a major breakthrough in AI-assisted security auditing.
Anthropic CEO Dario Amodei published a statement titled 'Where things stand with the Department of War' on Anthropic's website, amid growing controversy about AI companies' involvement with defense and military applications. The statement garnered significant attention with 5,000 likes and 2.3 million views.
Codex Security is an AI application security agent that analyzes project context to detect, validate, and patch complex vulnerabilities with higher confidence and less noise.
Descript uses OpenAI models to scale multilingual video dubbing, optimizing translations for both meaning and timing so dubbed speech sounds natural across languages.
Anthropic released a study examining which jobs AI can theoretically replace versus which ones it's actually automating. Computer & math roles show 94% theoretical exposure, legal ~90%, and management, architecture, arts & media all 60%+. However, observed real-world usage is only a fraction of theoretical capability — though the gap is closing fast.
Dwarkesh Patel interviews historian Ada Palmer about Gutenberg, the printing press, Renaissance Florence, and the parallels between historical technological revolutions and AI.
Cognitive Revolution episode with Dan Balsam & Tom McGrath from Goodfire on using interpretability to reduce hallucination, discover Alzheimer's biomarkers, and separate memorization from reasoning.
OpenAI publishes evaluation suite and research paper on Chain-of-Thought Controllability. GPT-5.4 Thinking shows low ability to obscure its reasoning, suggesting CoT monitoring remains a useful safety tool.
OpenAI launches GPT-5.4, its most factual and efficient model. It brings advances in reasoning, coding, and agentic workflows into one frontier model and is available in ChatGPT, the API, and Codex. The post drew 6.3M views and 23K likes.
OpenAI introduces CoT-Control and finds reasoning models struggle to control their chains of thought, reinforcing monitorability as an AI safety safeguard.
Introducing GPT-5.4, OpenAI’s most capable and efficient frontier model for professional work, with state-of-the-art coding, computer use, tool search, and 1M-token context.