Composer 2.5 is now available inside Grok Build.
Composer 2.5 is a fast, highly intelligent model that excels on long-running tasks and following complex instructions.
Introducing Qwen3.7-Plus — a multimodal agent model that unifies vision and language into one versatile agent foundation.
Multimodal interactive hybrid agent: unified GUI & CLI operation across visual and text tasks
Versatile coding agent & productivity assistant with full-modality input
Visual Agent: perception, reasoning, grounding, and search-augmented QA
Cross-harness generalization across diverse agent frameworks
One model. Sees, thinks, codes, acts.
Now available via API on Alibaba Cloud Model Studio. Try it — let us know what you build.
Blog: https://qwen.ai/blog?id=qwen3.7-plus…
Qwen Studio: https://chat.qwen.ai/?models=qwen3.7-plus…
API: https://modelstudio.console.alibabacloud.com/ap-southeast-1?tab=doc#/doc/?type=model&url=2840914_2&modelId=qwen3.7-plus&serviceSite=international…
Very grateful to Jensen for working to expand Nvidia capacity at AWS so much for OpenAI. Jensen said Nvidia is expanding OpenAI capacity at AWS like mad. OpenAI Codex token use is exploding.
I have used GPT-5.4 for the past few weeks. In a sea of endless model drops and benchmark maxxing, this is the first model in a long time that is worth your time to try. Honestly did not expect OpenAI to pull this off.
What is the hardest question I could ask you that you might get right? Everyone is saying GPT-5.4 Pro is the smartest model, AGI-level intelligence, but do you have AGI-level questions to ask?
Want to host Claude meetups in your city? Anthropic will cover the funding, send swag, and give monthly API credits for demos. Access to pre-release features and a private slack with the team included.
Qwen3.5 4B apparently out-scores GPT-4o on some of the classic benchmarks. Given the enormous size difference in terms of parameters this raises suspicion that Qwen may have been training to the test on some of these.
Claude Code wiped a production database with a Terraform command. It took down the DataTalksClub course platform and 2.5 years of submissions: homework, projects, and leaderboards. Automated snapshots were gone too. Full recovery took about 24 hours.
New chapter on Agentic manual testing - about how having agents manually try out code is a useful way to help them spot issues that might not have been caught by their automated tests.
PhotoAI.com is a 40,870 line file called index.php generating $105,000 per month revenue and $80,000 per month profit. Simple architecture over complex setups continues to win.
Step-by-step guide for setting up and using Claude Code. Designed to show the power and reach of AI agents and beginner enough for anyone to jump in. From 60-second setup to advanced integrations with no coding required.
Just launched Codex Security. Probably a no-brainer for most teams to turn on. Features include agentic security review leveraging SOTA models, always-on codebase scanning, detailed reports with code paths on vulnerabilities, and auto-fix for any report.
Automations already run thousands of times per day inside our own codebase! They power self-healing CI, auto-approving PR flows, highly-compute-intensive security review, and a team-wide memory system. One small step toward a self-driving codebase.
AI was asked to code Figma from scratch. It worked; proof and tips are shared, along with a live link and GitHub repo, with live multiplayer included. Something fundamental is changing about how software is built.
Went to the sold-out OpenClaw meetup in NYC. Learnings: not a single person thinks that their setup is 100% secure. One OpenClaw expert said he has reviewed setups from cybersecurity experts and laughed.
5 years later, it feels like it is finally happening. Prompt engineering is dead. AI agents extracting goals and intents from users through proactive pings, questions, interviews, context, or intuited from actions is the name of the 2026 game.
Every single Claude Code conversation turn should end with it offering to do even more for you. It should be so proactive and helpful that you cry with joy at how productive it is making you. Add this to your global CLAUDE.md file.
New course: Build and Train an LLM with JAX, built in partnership with Google. JAX is the open-source library behind Google Gemini, Veo, and other advanced models. This short course teaches you to build and train a 20-million parameter language model.
We believe Cursor discovered a novel solution to Problem Six of the First Proof challenge, a set of math research problems that approximate the work of Stanford, MIT, and Berkeley academics. Cursor's solution yields stronger results than the official, human-written solution.
PSA: Google is turning down Gemini 3 Pro next Monday March 9th. You can upgrade to 3.1 Pro Preview which improves on lots of the things folks gave feedback about on the first Gemini 3 rev.
Cognition shares an early preview of their ongoing SWE-1.6 training run. It significantly improves upon SWE-1.5 while being post-trained on the same pre-trained model, and it runs equally fast at 950 tok/s. On SWE-Bench Pro it exceeds top open-source models.
I believe these are reasonable lines to hold. And I am proud to work for a company willing to hold them. Referencing Anthropic's statement on comments from Secretary of War Pete Hegseth.
It is absolutely horrifying that these people run the United States. God bless Anthropic. May liberty and democracy prevail. The Department of War demands full unrestricted access to AI systems but Anthropic holds its ground on safety principles.
Sam Altman deserves major praise for defending Anthropic. OpenAI CEO broke ranks and defended Anthropic amid reports the Trump administration is threatening to use the Defense Production Act against AI firms. Never back down.
Say hello to Nano Banana 2, the best image generation and editing model! You can access Nano Banana 2 through AI Studio and the Gemini API under the name Gemini 3.1 Flash Image. New resolutions (lower cost) and tools like Image Search included.
Anthropic, the life, liberty, and pursuit of happiness company. AI allows us to raise our ambitions. But as the technology advances, as a country we should never lower our standards. Let freedom ring.
The third era of AI software development. When we started building Cursor a few years ago, most code was written one keystroke at a time. Tab autocomplete changed that and opened the first era of AI-assisted coding. Then agents arrived.
Writing software is becoming effortless. So much has changed and yet there is still so much to go. Cloud agents and demo-based review are a big leap toward this future.
Impressive inference speed from Inception Labs' diffusion LLMs. Mercury 2 is the world's first reasoning diffusion LLM, delivering 5x faster performance than leading speed-optimized LLMs. Diffusion LLMs are a fascinating alternative to conventional autoregressive LLMs.
3 months in, today is fun because we start to collaborate with builders. We have been hard at work. Our goal by the end of the year is to make the entire AI stack adaptable. Data is the foundation all AI progress has been built on. So, it is our natural starting point.
We just launched demos! I think it is a glimpse at a post-code future. Over a third of our PRs are now created autonomously with this feature. Cursor now shows you demos, not diffs. Agents can use the software they build and send you videos of their work.
DeepSeek got called out for scraping 150K Claude messages. So POM is releasing 155K of personal Claude Code messages with Opus 4.5. Also open sourcing tooling to help fetch your data, redact sensitive info and make it discoverable on Hugging Face.
Will AI create new job opportunities? Used Gemini Nano Banana to design a cat-themed birthday cake in yellow, then asked a baker to create it using delicious sponge cake. AI as a creative design tool enabling new business opportunities.
We trained the best coding model in the world under 1T parameters. Excited for people to try it! Composer 1.5 is now available, striking a strong balance between intelligence and speed.