Most lists of AI use cases for finance are either too vague to act on or describe things that don't actually work under production conditions. What follows are the use cases that hold up when you apply real scrutiny, along with an honest note on where the limits are.
First-pass narratives for board decks and monthly reporting
This is the highest-return use case in practice. If you give ChatGPT structured inputs — actual numbers with context about what changed and why — and force it to explain drivers rather than restate figures, the first draft is genuinely useful. You're not publishing what comes out, but you're editing something substantive rather than starting from a blank page. If you have security concerns about sharing financial data, talk with your IT team before pasting anything in.
The failure mode is giving it vague inputs and expecting it to figure out the story. It won't. The quality of the output is almost entirely a function of the quality of what you put in.
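To make "structured inputs" concrete, here is a minimal sketch of the kind of prompt that tends to work: numbers, context for each line, and an explicit instruction to explain drivers. All figures, driver names, and the paragraph structure are invented for illustration — they are not a recommended template, just an example of the level of specificity involved.

```python
# Hypothetical sketch of a "structured input" prompt for a monthly
# reporting draft. Every number and driver below is made up.
prompt = """You are drafting commentary for a monthly finance report.
Explain the drivers behind each change; do not restate the figures.

Revenue: $4.2M (vs $4.5M plan, -6.7%)
  Context: two enterprise renewals slipped to next quarter.
Gross margin: 61% (vs 58% prior month)
  Context: hosting cost renegotiation took effect mid-month.
Opex: $2.1M (flat vs plan)
  Context: open headcount offset by higher contractor spend.

Draft three short paragraphs: revenue, margin, opex."""

print(prompt)
```

Compare that to "summarize our monthly numbers" — the model has nothing to reason about in the second version, which is the failure mode described above.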
Turning vague business questions into a structured analysis plan
This one surprises people. "Why is growth slowing?" is the kind of question a CEO asks that can go in ten different directions. ChatGPT is useful for structuring the analytical approach: cohort analysis, funnel breakdown, pricing trends, retention patterns. It's not doing the analysis. It's helping you organize the problem before you start, which saves time and keeps you from missing obvious angles.
Pressure-testing models and outputs
Describing a model structure or a set of assumptions to ChatGPT and asking what's missing is a legitimate QA step. It catches missing drivers, flags assumptions that aren't grounded in anything, and surfaces edge cases that didn't come up in the original build. It's not a substitute for a second set of experienced eyes, but it's better than self-review alone.
Writing SQL, Python, or Excel logic
If you know what you want to build and need help with the syntax, ChatGPT is fast and reliable for this. The key qualifier is knowing what you want. If you're using it to figure out what the analysis should be — not just how to write it — the output quality drops quickly. Use it as an implementation tool, not a thinking tool.
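As an illustration of the distinction, here is the kind of precisely specified implementation task ChatGPT handles well: "group actuals by cost center, compute the variance to budget, and flag anything off by more than 10%." The data, column names, and threshold below are invented for the example; the point is that the spec, not the code, is where the thinking happened.

```python
# Invented sample data for illustration only.
rows = [
    {"cost_center": "Sales", "actual": 120_000, "budget": 100_000},
    {"cost_center": "R&D",   "actual": 95_000,  "budget": 100_000},
    {"cost_center": "G&A",   "actual": 101_000, "budget": 100_000},
]

def flag_variances(rows, threshold=0.10):
    """Return (cost_center, variance_pct, flagged) for each row."""
    out = []
    for r in rows:
        pct = (r["actual"] - r["budget"]) / r["budget"]
        out.append((r["cost_center"], round(pct, 3), abs(pct) > threshold))
    return out

for cc, pct, flagged in flag_variances(rows):
    print(f"{cc}: {pct:+.1%}{'  <-- review' if flagged else ''}")
```

A request at this level of specificity gets reliable output. A request like "analyze our cost centers" is the thinking-tool usage the paragraph above warns against.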
Where it still breaks
Anything more ambitious than these use cases still runs into real limits. Complex multi-driver financial models, nuanced judgment calls about what a variance actually means for the business, or analysis that requires context only someone inside the company has — these don't work well with current tools.
The other limit worth naming is the review burden. More AI output means more to check, and checking it well requires someone who knows what right looks like. Teams that don't have that capability will produce more output with more errors, which is worse than producing less output carefully.
The use cases above work because the human is still making the judgment calls. That's not a flaw in the current tools. It's just the honest state of where things are.