Ever wanted to point at a tiny corner of a photo and tell an AI exactly what to change? Google is quietly giving Gemini that ability.
What’s rolling out
Over the past few weeks, users have begun seeing a new image markup tool inside Gemini on Android and the web. When you attach a picture to a conversation, a brief prompt may introduce the feature; tapping the image opens a small toolbar with a row of colors for scribbling or circling, a default “Sketch” mode, and a “Text” option that lets you write instructions directly on the image. The marked area tells Gemini where to focus, whether you want a localized edit or a pointed question about a subject in the frame.
The rollout appears to be limited and experimental: some accounts show the feature, others don’t. The tool surfaced in both mobile and desktop Chrome builds during recent testing, which suggests Google is trying to make image edits and image-based queries more granular across platforms.
Why now? Gemini’s imaging chops have been getting a lot of attention — from Nano Banana generation models to the upgrade path toward Gemini 3 — and finer control over visual inputs is a logical next step as the suite becomes more multimodal and tightly integrated into Google’s apps.
How it works in practice
Use the Sketch option to circle or highlight the exact pixels you mean. Use Text to annotate the image with a short instruction — for example, “fix the left person’s jacket” or “remove the lamppost in the background.” When the prompt is processed, Gemini treats the marked region as the primary context for whatever task you send.
Early hands-on demos show the editing side of the tool is promising: localized edits apply where you indicate. For analysis, results can be hit-or-miss (AI image recognition still struggles with some faces and fine-grained identification), but the markup clearly helps Gemini narrow its attention and reduces guesswork compared with sending an unannotated photo.
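If you want a feel for why pointing helps, here’s a minimal sketch using the public google-generativeai Python SDK. This is not how the app’s markup tool is implemented; it simply approximates the idea by drawing a marker on the image yourself (with Pillow) and sending the annotated copy plus a short instruction as one multimodal prompt. The API key, model name, file name, and coordinates below are illustrative placeholders.

```python
# Minimal sketch: approximate Gemini's markup flow via the public API by
# annotating the image yourself and sending it with a text instruction.
# API key, model name, file name, and coordinates are placeholders.
import google.generativeai as genai
from PIL import Image, ImageDraw

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Load the photo and draw a red ellipse around the region of interest,
# similar to circling it with the Sketch tool in the app.
image = Image.open("photo.jpg").convert("RGB")
draw = ImageDraw.Draw(image)
draw.ellipse((120, 80, 260, 220), outline="red", width=6)  # illustrative region

# Send the annotated image plus the instruction as one multimodal prompt.
model = genai.GenerativeModel("gemini-1.5-flash")  # any vision-capable Gemini model
response = model.generate_content(
    [image, "Focus on the circled region: describe what is there and suggest an edit."]
)
print(response.text)
```

The takeaway is the same as in the app: a visible marker gives the model an unambiguous anchor, so the text instruction can stay short.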
How to try it (if you can)
Some users report seeing the feature tied to recent Google app builds; one reported version is 16.49.59. If you don’t see it yet, a simple troubleshooting step is to force-stop the app and reopen it, though many users will simply need to wait for a broader rollout.
Because the feature is in testing, Google hasn’t published official documentation yet. Expect behavior and availability to change as the company collects feedback and irons out accuracy issues.
Why it matters — and what to watch for
This markup tool is small but meaningful. It converts an otherwise fuzzy instruction — “edit this photo” — into a precise command: point with your finger and tell Gemini what you want. That lowers the friction for everyday image edits and for visual troubleshooting: UX designers can circle a buggy button, shoppers can highlight an item in a photo, and students can point at a math diagram and ask for help.
At the same time, bringing more precise multimodal controls into Gemini ties into broader moves across Google’s stack. Gemini features are being woven into Search, Workspace, and Maps, expanding where visual inputs can matter; the company’s recent moves to let Gemini search across Gmail and Drive, for example, show how visual and textual grounding can be combined for richer answers. See our coverage of Gemini’s deeper Workspace reach in Gemini’s Deep Research plugs into Gmail and Drive, and how conversational AI is appearing in mapping tools like Google Maps’ Gemini copilot.
There are also familiar trade-offs. More powerful image tools can raise privacy questions (photos often contain sensitive context), and any feature that identifies people or objects will need robust guardrails. Accuracy remains uneven: while edits are generally convincing, identification and fine-grained interpretation can still miss the mark.
If you’re a creator or product person, this change is worth paying attention to. It’s not flashy like a new generative model, but it changes the user flow: from typing awkward directions to literally pointing at what you mean. That feels small on paper and surprisingly good in practice.
If you don’t see it yet, hang tight — Google appears to be testing the markup in phases. When it arrives for everyone, the difference between “change this” and “change this” while actually pointing at the thing should make Gemini feel noticeably smarter and more collaborative.