Testing GPT-3's Mathematics Comprehension

For the previous installment of this series with DALL-E 2, see here. I wanted to try GPT-3 on some similar problems. DALL-E clearly doesn't have a coherent world-model of the mathematical objects I'm asking it about; does GPT-3?

Plenty of people have already tested it on arithmetic, so I won't spend much space on that, but here are a few I tried. (GPT-3's answers in green.)

With small numbers, it can usually get the exact answer. With larger numbers, it tends to get the first few digits right and the rest wrong. With irrational numbers, it gives a few correct digits and then gives up. With rational decimals, it gets common ones correct and runs into the same problems on uncommon ones.
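One rough way to put a number on "gets the first few digits right" is to count how many leading digits of an answer match the true value. This is only a sketch: the function name is my own, it assumes non-negative integers, and the "answer" below is made up for illustration rather than being a real GPT-3 output.

```python
def matching_leading_digits(expected: int, answer: int) -> int:
    """Count how many leading decimal digits of `answer` agree with `expected`.

    Assumes both values are non-negative integers.
    """
    count = 0
    for e, a in zip(str(expected), str(answer)):
        if e != a:
            break  # stop at the first digit that differs
        count += 1
    return count


# The true product is computed by Python itself.
expected = 123456 * 789012
# Hypothetical answer, invented for this example (not an actual model output).
hypothetical_answer = 97408000000
print(matching_leading_digits(expected, hypothetical_answer))
```

Note that this only compares digits position by position from the left, so an answer with the wrong number of digits would still score well if its prefix matches. Comparing lengths as well would be a reasonable extra check.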

I'm more interested in geometric or topological problems, or other forms of abstract reasoning that a calculator can't perform.

Should be a 5; the convention is that opposing sides on a die always add up to 1 more than the highest face. This isn't really a math question though; it might just not know much about dice.
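The dice convention is simple enough to write down as a one-liner. This is a minimal sketch: the function name and the `sides` parameter are my own, and it assumes the die actually follows the opposite-faces convention described above.

```python
def opposite_face(face: int, sides: int = 6) -> int:
    """Return the face opposite `face`, assuming opposite faces
    sum to one more than the highest face (sides + 1)."""
    if not 1 <= face <= sides:
        raise ValueError(f"face must be between 1 and {sides}")
    return sides + 1 - face


print(opposite_face(2))  # 5 on a standard six-sided die
```

On a standard six-sided die the pairs come out as 1 and 6, 2 and 5, 3 and 4, each summing to 7.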

(For all the next questions I left the previous prompt in with a corrected answer, in case this helps it understand it needs to answer correctly.)

Nope, doesn't understand fractions either. How about topology?

I think this might just be because it recognizes that a bagel is extremely similar to a doughnut?

Well that's scary. It... might know what a torus is.

Ok, we're not all going to turn into paperclips quite yet. I'm really curious how it figured out the tire though.

Uh, no, that looks pretty similar.

A ring is still pretty toroidal geometrically, but that's somewhat subjective. I tried this prompt several times and got:

Talking about a donut-shaped thing-that-is-very-different-from-a-donut doesn't quite satisfy what I asked for, but is a pretty clever hack.

Several of the other answers are wildly wrong, but the toilet paper roll and garden hose are correct.

Pretty good.

Huh. Most humans could not answer that. That's seriously impressive. I tried this prompt several times to check if it got lucky, but no, it consistently answered either "The hexagon" or "A hexagon" every time.

GPT-3 is still pretty inconsistent. It does well on some problems and extremely poorly on others, seemingly arbitrarily. But its ability to consistently answer some pretty abstract questions makes me think it might actually have some primitive world models somewhere in there.
