M365 Copilot Image Generation Levels Up, and Video Summaries are coming to Copilot Notebooks

Microsoft has kicked off 2026 with two significant enhancements to Microsoft 365 Copilot. Both are a clear signal of how improvements in AI‑driven multimedia creation will support creativity, communication, and knowledge sharing across the workplace.

The first is a major upgrade to Copilot’s image generation capabilities with the rollout of OpenAI’s GPT‑Image‑1.5 model. The second is a new capability that turns Copilot Notebooks into automatically generated video summaries (in addition to the voice / podcast overviews). Both are in preview for organisations enrolled in the Copilot Frontier Preview.

Whilst subtle, these updates make Copilot more useful, more expressive, and more multi-modal in the flow of work. Read on for more detail.

GPT‑Image‑1.5 Creation comes to Microsoft 365 Copilot

Microsoft is good at getting new features into Copilot quickly. Just before Xmas 2025, Copilot was updated to support the latest GPT‑5.2 models, and now Microsoft is replacing OpenAI’s GPT‑4o with GPT‑Image‑1.5 across Copilot’s image generation experiences, including Copilot Chat and the wider “Create” module in Copilot. The change will roll out gradually through January 2026.

For organisations already using Copilot to create internal comms assets, presentation visuals, campaign concepts, or quick mock‑ups, this upgrade is most welcome. The quality gap between “AI‑generated” and “designer‑produced” continues to narrow, and the speed improvements make Copilot even more viable for rapid ideation. Image creation in AI tools has come on massively in just a few months.

One of the key upgrades is the ability to update aspects of an image (via a prompt) or to take over and edit with Microsoft’s Image Designer tools directly from Copilot. You can see the difference in the example below (you’ll need to zoom in, sorry!).

What is GPT-Image-1.5?

This latest model from OpenAI is its answer to Google’s highly regarded Nano Banana image models. According to experts, it is “on par” in terms of fidelity, instruction following, and realism, and it’s included at no additional cost to Microsoft 365 Copilot users.

According to Microsoft, once rolled out, users can expect:

  • Sharper prompt adherence – especially for composition, style, and on‑image text
  • More precise region‑specific editing with fewer unintended changes
  • Higher‑quality visuals with more realistic lighting, textures, and detail
  • Faster generation – up to 4× quicker for many prompts
  • Better consistency when iterating on faces, colours, and lighting

Video Overview in Copilot Notebooks

Microsoft is introducing Video Overviews – the ability for Copilot to automatically generate a short, narrated video summary of a Notebook’s content. This adds to the current audio overview feature and is rolling out to organisations enrolled in the Frontier Preview. Other organisations will get this in due course, so keep an eye on the official Roadmap for this one.

This enhances the existing overview feature, allowing Copilot Notebook users to:

  • Analyse the full Notebook
  • Extract key insights
  • Generate visuals
  • Produce a narrated video summary in first person, interview or podcast style

Think of it as a dynamic, visual executive summary – ideal for sharing updates, explaining concepts, or turning long‑form thinking into something more digestible.

Copilot Notebooks are a powerful space for iterative thinking, brainstorming, and structured problem‑solving in solo or shared mode. But they’ve also been static, and you still had to read them. Audio and video overviews make Notebooks much more digestible and quicker to consume, and they let people take in content in their preferred way.

Whilst Google’s NotebookLM has had this feature for a month or so, this is the first time Copilot is turning your content into multimodal output without requiring any video editing skills (or even prompting) at all. It’s a glimpse of a future where:

  • Documentation becomes auto‑summarised
  • Content is consumable in ways that meet the user’s need
  • Knowledge becomes more accessible

I’m personally really interested to see how well Copilot handles narrative flow, visual selection, and pacing. If Microsoft gets this right, it could become one of the most impactful features in Copilot.

Summary

In short, these subtle but impactful updates point to the same trajectory for Copilot: becoming a fully multi-modal assistant, not just a text‑based one.

  • GPT‑Image‑1.5 – higher‑fidelity visual creation
  • Video Overviews – automated multimedia storytelling
