Initial thoughts on Google Gemini 2.5

Dave Hudson - 2025-03-28

I was interested to see what Google Gemini 2.5 Pro could do compared with Claude Sonnet 3.7, which has been my go-to model for months now.

The good

  • Gemini can handle a very large context: my 35k lines of Python required less than 35% of the available context window. Google continue to win hands down on this, as Claude can only get to just over 20k lines of code before it hits its limits.
  • I asked it to find opportunities to refactor my code and it found quite a few (I knew about most of them already). Its suggestions all looked architecturally sound.
  • It can clearly generate very large amounts of software (and it seems to be high quality), including breaking things into file-sized chunks to make things easy to work with.
  • Gemini is very fast at generating code.
  • You can try it for free as Google offer a free API tier!
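
As a rough sanity check on the context numbers above, here is a back-of-envelope sketch. The window sizes (1,000,000 tokens for Gemini 2.5 Pro and 200,000 tokens for Claude Sonnet 3.7) are assumptions that aren't stated in this post, and the helper names are purely illustrative.

```python
# Assumed context window sizes in tokens (not taken from the post).
GEMINI_WINDOW_TOKENS = 1_000_000
CLAUDE_WINDOW_TOKENS = 200_000


def tokens_per_line(lines: int, window_percent: int, window_tokens: int) -> float:
    """Average tokens per line implied by filling window_percent of a window."""
    return window_tokens * window_percent / 100 / lines


def max_lines(window_tokens: int, per_line: float) -> int:
    """How many lines of the same density fit in a window of the given size."""
    return round(window_tokens / per_line)


# 35k lines used "less than 35%" of the window, so this is an upper bound.
per_line = tokens_per_line(35_000, 35, GEMINI_WINDOW_TOKENS)

# Ignores prompt and response overhead, so treat it as a rough estimate.
claude_lines = max_lines(CLAUDE_WINDOW_TOKENS, per_line)

print(f"about {per_line:.0f} tokens per line")
print(f"about {claude_lines:,} lines in the smaller window")
```

Under those assumptions the code works out to about 10 tokens per line, and about 20,000 lines in the smaller window, which lines up with the "just over 20k lines" I see in practice.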

The not-so-good

  • Despite quite a lot of discussion, it was very hard to stop Gemini doing other things while it worked on the refactoring task I'd asked it to do.
  • Even after I got it to stop adding pointless comments, it insisted on "helpfully" doing unrelated things. To be fair, it picked up and fixed a couple of bugs, corrected some type hinting issues, and fixed some incorrect comments, but I'd asked it not to touch anything that wasn't directly related to my change. A big part of good engineering process is changing one thing at a time, so I'd be happy if it told me about issues it found, but silently changing code is frustrating.
  • I fairly quickly hit the "requests-per-day" limit.
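
One way to catch the silent, unrelated edits described above is to diff the model's output against the original file and flag anything changed outside the lines you asked it to touch. This is only a minimal sketch using Python's standard difflib module. The function name and the "allowed line range" convention are hypothetical, and it isn't something Gemini or Claude provides.

```python
import difflib


def out_of_scope_changes(
    original: str, revised: str, allowed: range
) -> list[tuple[int, int]]:
    """Return 1-based (start, end) spans of original lines changed outside allowed."""
    a = original.splitlines()
    b = revised.splitlines()
    spans = []
    matcher = difflib.SequenceMatcher(None, a, b)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # Pure insertions have i1 == i2; report them at the original
        # line they are inserted before.
        start, end = i1 + 1, max(i2, i1 + 1)
        if any(line not in allowed for line in range(start, end + 1)):
            spans.append((start, end))
    return spans


original = "a = 1\nb = 2\nc = 3\nd = 4\n"
revised = "a = 1\nb = 20\nc = 3\nd = 40\n"

# Only line 2 was supposed to change, so the edit to line 4 gets flagged.
print(out_of_scope_changes(original, revised, range(2, 3)))
```

A check like this doesn't judge whether the extra changes are good (some of Gemini's were), it just makes them visible so they can be reviewed separately.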

On balance I'm pretty impressed. It's dramatically better than earlier Google models, but I'm not trading in Claude for doing implementation work just yet. Claude is just that bit better at doing what it's asked.

With that said, I can now see myself regularly using Gemini 2.5 to help me plan work for Claude. No matter how much I do this, there is still something enormously impressive about watching an AI review 35,000 lines of code and then make high-quality suggestions about how to improve it within 15 seconds!

Gemini offers thoughts on the design of Humbug.