[February 2026] AI & Machine Learning Monthly Newsletter 🤖

Daniel Bourke

In This Month's Update:

Want to become an AI/ML Engineer?

Our AI/ML Career path takes you from complete beginner (at any age!) to getting hired as a Machine Learning and/or AI Engineer 👇

Get The Full Career Path

74th issue! If you missed them, you can read the previous issues of my monthly A.I. & Machine Learning newsletter here.

Hey everyone!

Daniel here, I’m a machine learning engineer who teaches beginner-friendly machine learning courses.

I also write regularly about machine learning on my own blog as well as make videos on the topic on YouTube.

Since there's a lot going on, I've done my best to keep things short and to the point.

Here's what you might have missed in February 2026 as an A.I. & Machine Learning Engineer... let's get you caught up!

My Work

  • Sunny - Kaggle Competition Entry — My brother and I entered the MedGemma Impact Challenge Kaggle Competition. Our entry was called Sunny, an iOS application which uses a fine-tuned version of MedGemma to help privately track skin health over time. All of our code and models are open-source and you can see the overview video on YouTube.
  • Three new tutorials on learnhuggingface.com — There are three new tutorials on learnhuggingface.com: LLM fine-tuning, VLM fine-tuning and multimodal RAG (retrieval augmented generation with text and images). Stay tuned for the videos/courses to launch on ZTM!
  • RTX 4090 and DGX Spark Benchmarking video — I made a video comparing the NVIDIA DGX Spark to the RTX 4090 (the GPU in my deep learning PC) across various everyday AI tasks such as LLM fine-tuning, object detection model training and LLM inference. In summary, the RTX 4090 has much more raw compute power and memory bandwidth, while the DGX Spark has a much higher memory capacity.
  • Upcoming talk on small LLMs — I’m doing a talk at the Queensland AI Meetup (my home state) on the power and potential of Small Language Models (SLMs) on March 12 2026. It’ll be in person, but I’ll be sure to record it and post it on my YouTube channel.
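For a sense of what the capacity-versus-bandwidth tradeoff in the benchmarking video means in practice, here's a rough back-of-envelope sketch of how many model weights fit in each machine's memory. The 24 GB and 128 GB figures are the commonly quoted specs, and the 20% overhead reserve for activations/KV cache is my own assumption, not a benchmark result:

```python
def max_params_fitting(vram_gb: float, bytes_per_param: float, overhead: float = 0.2) -> float:
    """Rough estimate of the largest model (in billions of parameters)
    whose weights fit in a given amount of memory, reserving a fraction
    of that memory for activations and KV cache."""
    usable_bytes = vram_gb * 1e9 * (1 - overhead)
    return usable_bytes / bytes_per_param / 1e9

# RTX 4090: 24 GB VRAM, DGX Spark: 128 GB unified memory (assumed specs)
for name, gb in [("RTX 4090", 24), ("DGX Spark", 128)]:
    fp16 = max_params_fitting(gb, 2.0)  # 2 bytes per parameter at fp16
    q4 = max_params_fitting(gb, 0.5)    # ~0.5 bytes per parameter at 4-bit
    print(f"{name}: ~{fp16:.0f}B params at fp16, ~{q4:.0f}B at 4-bit")
```

The takeaway matches the video's summary: the Spark can hold far larger models, even though the 4090 chews through the ones that fit much faster.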

From The Internet

  • An AI agent coding skeptic tries AI agent coding, in excessive detail by Max Woolf. Woolf, a data scientist at BuzzFeed and long-time agent skeptic, puts Claude Code and Codex through increasingly ambitious tasks, from API scrapers to porting scikit-learn to Rust. His conclusion: agents work best when you have approximate knowledge of many things with enough domain expertise to know what should and should not work. A highly detailed and honest read.

The main lesson I learnt from working on these projects is that agents work best when you have approximate knowledge of many things with enough domain expertise to know what should and should not work. Opus 4.5 is good enough to let me finally do side projects where I know precisely what I want but not necessarily how to implement it. These specific projects aren’t the Next Big Thing™ that justifies the existence of an industry taking billions of dollars in venture capital, but they make my life better and since they are open-sourced, hopefully they make someone else’s life better. However, I still wanted to push agents to do more impactful things in an area that might be more worth it.


Breakdown of what goes into the context window of a coding agent. Also a note on the illusion of control. Even if you give your agent plenty of context, it’s still not guaranteed to output the correct thing. Source: martinfowler.com.

  • Data is your only moat. A reminder that while models are commoditizing rapidly, proprietary high-quality data remains the most durable competitive advantage for AI-driven businesses.
  • Waymo introduces the Waymo World Model built on Genie 3. Waymo built a generative simulation platform on top of Google DeepMind’s Genie 3 that creates hyper-realistic driving scenarios, including rare events like tornadoes, floods and animals on the road, that their fleet has never encountered. The system generates both camera and lidar data, enabling billions of virtual testing miles before real-world deployment.


Examples of Waymo’s World Model powered by Genie 3 and converted into Waymo-style vision and lidar data. Having the Genie 3 World Model allows you to create scenes that would otherwise rarely happen and be hard to gather data for. Top left: elephant on the road, top right: flooding in a local suburb, bottom left: person in a T-rex costume running on the street, bottom right: changing the time of day for the same scenario.


PaperBanana workflow for iteratively improving an academic image.

Hugging Face Roundup


QED-Nano shows that with the right data mixture and training recipe, you can get outstanding results with far fewer parameters. Right: the training recipe involves RL on long chains of thought for creating math proofs. These chains of thought are generated by a larger model and then distilled into the smaller one.

The National Library of Scotland shows how to go from no labels, to pseudo-labels, to a trained model with high performance using open-source models. An excellent example of iterative bootstrapping with VLMs and detection models. In this specific example, the goal was to detect bounding boxes on library index cards, but the workflow could be extended to many others.


Step-by-step iterative way to create a custom model on a custom dataset. If you have the data, you can bootstrap the labels with a large model such as SAM3 and then train a smaller model to reproduce those labels.
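The bootstrapping loop in the example above can be sketched structurally like this. Note the `big_label`, `train` and `predict` callables are placeholders for the real (expensive and cheap) models, not an actual API:

```python
def bootstrap_labels(unlabeled, big_label, train, predict,
                     seed_size=10, threshold=0.9, rounds=3):
    """Iterative pseudo-labeling: seed a labeled set with an expensive
    model, then let a cheaper trained model absorb the pool, keeping
    only predictions above a confidence threshold each round."""
    labeled = [(x, big_label(x)) for x in unlabeled[:seed_size]]
    pool = list(unlabeled[seed_size:])
    for _ in range(rounds):
        model = train(labeled)              # retrain on everything labeled so far
        remaining = []
        for x in pool:
            label, confidence = predict(model, x)
            if confidence >= threshold:
                labeled.append((x, label))  # promote confident pseudo-labels
            else:
                remaining.append(x)         # defer uncertain examples to later rounds
        pool = remaining
    return train(labeled), labeled
```

The key design choice is the confidence threshold: it keeps the small model from training on its own mistakes, at the cost of needing more rounds to absorb the full pool.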

Mooncake and PyTorch

  • Mooncake joins PyTorch ecosystem. Mooncake, a distributed serving framework for large models, is now an official member of the PyTorch ecosystem, making it easier to deploy and serve large models at scale.

Open Source

Open-Source LLMs and VLMs

  • Qwen3.5 series. Alibaba’s Qwen team released the Qwen3.5 family in February, headlined by a 397B parameter MoE model with only 17B active parameters. The flagship model is natively multimodal, supports 201 languages, and features a hybrid Gated Delta Networks plus MoE architecture that delivers 8.6x faster decoding than Qwen3-Max. Follow-up mid-sized releases (Qwen3.5-35B-A3B, Qwen3.5-122B-A10B and Qwen3.5-27B) show incredible improvements over the previous generation (similar performance with up to 10x fewer parameters). See them on Unsloth for efficient fine-tuning.
  • MiniMax M2.5. MiniMax releases their latest model with strong coding and agentic capabilities. The model demonstrates the ability to plan like a software architect, writing spec documents before generating code. MiniMax reports that M2.5-generated code accounts for 80% of newly committed code within their own company.
  • Ovis2.6 drops with an MoE architecture. A 30B total, 3B active parameter VLM using a Mixture of Experts architecture. Another entry in the efficient-VLM space. Strong performance against similar-sized models, though it might fly a bit under the radar given the Qwen3.5 release.
  • Photoroom open-sources their image generation model. Photoroom shares both the model weights and the journey it took to build a production-quality text-to-image model, including the training decisions and tradeoffs involved.
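To see why the "397B total, 17B active" split in the Qwen3.5 bullet matters, here's the quick arithmetic. The 2-FLOPs-per-active-parameter figure is a common rule of thumb for a forward pass, not an exact number:

```python
# Qwen3.5 flagship figures from the release above: 397B total, 17B active.
TOTAL_B, ACTIVE_B = 397, 17

def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of weights actually used on each token in an MoE model."""
    return active_b / total_b

def flops_per_token(active_params_b: float) -> float:
    """Rough forward-pass FLOPs per token (in billions):
    ~2 FLOPs per active parameter."""
    return 2 * active_params_b

frac = active_fraction(TOTAL_B, ACTIVE_B)
speedup = flops_per_token(TOTAL_B) / flops_per_token(ACTIVE_B)
print(f"~{frac:.1%} of weights active per token, "
      f"~{speedup:.0f}x fewer FLOPs than a dense 397B model")
```

This is why MoE models can punch well above their inference cost: you pay dense-17B compute per token while drawing on 397B parameters of capacity (you still need memory for all 397B, which is the catch).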

Speech and Audio

Embeddings and Retrieval

  • Perplexity releases open-source diffusion-based embedding models (pplx-embed). Perplexity enters the embedding space with pplx-embed-v1 and pplx-embed-context-v1, available at 0.6B and 4B parameter scales. The models use a novel approach: they take Qwen3 base models and convert them from causal decoders into bidirectional encoders through diffusion-based pretraining. They lead multiple public benchmarks including MTEB and ConTEB, require no instruction prefixes, and ship with native INT8 and binary quantization for up to 32x storage compression. MIT licensed. See the paper.
  • NVIDIA releases ColEmbed V2 for multimodal RAG systems. An updated embedding model from NVIDIA designed specifically for multimodal retrieval-augmented generation pipelines.
  • mmBERT for multilingual text classification. A high-quality text encoder from Johns Hopkins that supports text classification across multiple languages. Worth evaluating if you need a lightweight and fast multilingual classifier.
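The 32x storage compression quoted for pplx-embed falls straight out of binary quantization: float32 stores 32 bits per dimension, a binary embedding keeps only the sign bit. A minimal self-contained sketch of the idea (my own illustration, not Perplexity's implementation):

```python
import struct

def binarize(embedding: list[float]) -> bytes:
    """Pack the sign bit of each dimension into bytes (1 bit per dim)."""
    bits = [1 if v > 0 else 0 for v in embedding]
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

dim = 1024  # e.g. a 1024-dimensional embedding
emb = [(-1) ** i * 0.1 for i in range(dim)]
float32_bytes = len(struct.pack(f"{dim}f", *emb))  # 4 bytes per dimension
binary_bytes = len(binarize(emb))                  # 1 bit per dimension
print(float32_bytes // binary_bytes)  # → 32
```

Similarity search over binary vectors then becomes Hamming distance (XOR plus popcount), which is also much faster than float dot products, at some cost in retrieval quality.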

Computer Vision

Specialized Models

Papers

Releases

  • Apple adds agentic coding to Xcode 26.3. Apple shipped agentic coding in Xcode 26.3, with direct support for coding agents such as OpenAI Codex and Claude Agent. The practical win is that Xcode can now let those agents build and test projects, search Apple documentation, and handle more complex multi-step coding tasks through MCP-based tooling.
  • Google releases Gemini 3 Deep Think upgrade. A major upgrade to Google’s specialized reasoning mode, developed in partnership with scientists and researchers. The updated Deep Think achieved 48.4% on Humanity’s Last Exam (without tools) and 84.6% on ARC-AGI-2.
  • Google releases Gemini 3.1 Pro. A point-version update to Gemini 3 Pro that more than doubles the ARC-AGI-2 reasoning score to 77.1%, introduces a three-tier thinking system, and adds the ability to generate animated SVGs from text prompts. Available via the Gemini API, Antigravity and Vertex AI.
  • Google releases Nano Banana 2 built on Gemini 3.1 Flash. The next generation of Google’s on-device image generation model, now built on Gemini 3.1 Flash (note: Gemini 3.1 Flash itself is not available yet… but this might mean it’s coming soon?).
  • Gemini adds multimodal tool calling. Gemini models can now invoke tools based on multimodal inputs (text + images), expanding the range of agentic workflows possible with the updated Gemini interactions API.
  • Anthropic releases Claude Sonnet 4.6. Anthropic’s most capable Sonnet model, now the default on claude.ai. The model brings a 1M token context window (beta), improved coding, computer use, and agent planning. Performance that previously required an Opus-class model is now available at Sonnet pricing ($3/$15 per million tokens). Released February 17, just 12 days after Opus 4.6.
  • Google releases CodeWiki, a Gemini-powered documentation creator. A tool that automatically generates and maintains up-to-date documentation for codebases using Gemini.
  • Project Genie. Google DeepMind expands access to Genie, their world model for generating interactive 3D environments from text prompts (see above for how Genie 3 is being used with Waymo to create world models for self-driving cars).
  • Oumi and Lambda partner for end-to-end custom model development. Oumi’s open-source training framework combined with Lambda’s GPU cloud, making custom model development more accessible from data preparation to deployment.
  • Ai2 introduces MolmoSpaces, an open ecosystem for embodied AI. Allen Institute for AI launches an open platform for building and evaluating AI agents that can interact with physical environments.
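As a quick sanity check on the Sonnet 4.6 pricing mentioned above, per-request cost works out like this. A trivial sketch using the quoted $3/$15 per million token rates; always check current pricing before relying on it:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 in_per_m: float = 3.0, out_per_m: float = 15.0) -> float:
    """Cost in USD at the quoted $3 input / $15 output
    per-million-token Sonnet pricing."""
    return input_tokens / 1e6 * in_per_m + output_tokens / 1e6 * out_per_m

# e.g. a 50k-token context producing a 2k-token answer
print(f"${request_cost(50_000, 2_000):.3f}")  # → $0.180
```

Note the 5x output multiplier: for long-generation workloads (agents, code), output tokens dominate the bill even when the prompt is much larger than the response.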

Videos

  • Ashok on building foundational models at Tesla. A behind-the-scenes look at Tesla’s approach to training foundation models for autonomous driving and robotics. Very cool to see how seriously they take safety and how sophisticated their simulations are.
  • Yann LeCun on World Models. Yann LeCun discusses his vision for world models and how they might bridge the gap between current AI capabilities and more general intelligence.
  • Nathan Limbach and Sebastian Raschka on Lex Fridman. A deep conversation covering LLM training, open-source model development and the state of ML research. It’s a long (but excellent) one. I listened to this one over the course of 3-4 days walking to and from training.
  • Elon Musk on Cheeky Pint with Dwarkesh. A wide-ranging interview covering AI, Tesla, xAI and the broader tech landscape.
  • PewDiePie on fine-tuning LLMs. PewDiePie shares his rollercoaster ride of a journey figuring out how to train a coding model. From dataset gathering to hardware malfunctions to finally… (I won’t spoil it :P). All in all a very fun and inspiring story.

See you next month!

What a massive month for the ML world in February!

As always, let me know if there's anything you think should be included in a future post.

Liked something here? Share it with someone.

In the meantime, keep learning, keep creating, keep dancing.

See you next month,

Daniel

www.mrdbourke.com | YouTube

By the way, I'm also an instructor with Zero To Mastery Academy teaching people Machine Learning & AI in the most efficient way possible. You can see a few of our courses below or check out all Zero To Mastery courses.

You might like these courses

More from Zero To Mastery

The No BS Way To Getting A Machine Learning Job
19 min read

Looking to get hired in Machine Learning? Our ML expert tells you how. If you follow his 5 steps, we guarantee you'll land a Machine Learning job. No BS.

6-Step Framework To Tackle Machine Learning Projects (Full Pipeline)
30 min read

Want to apply Machine Learning to your business problems but not sure if it will work or where to start? This 6-step guide makes it easy to get started today.

How to Convince Your Boss to Pay for Your Upskilling
10 min read

Get your company to pay for your tech upskilling. Use this training request email and strategy to make it happen.