Practitioner Notes

Fine-Tuning Without the Hype

Fine-tuning is the most over-pitched and under-understood capability in financial AI. A practitioner-level look at when it actually moves the needle, when it does not, and what to ask before signing the work order.

Engineering Notebook | April 14, 2026 | 7 min read

Every vendor pitch deck includes the phrase "fine-tuned to your firm." Most of the time, what that actually means is closer to retrieval over a directory of your own documents than any meaningful change to the model itself. Both are useful; only one is fine-tuning. The conflation matters because the two solve different problems, and treating them as interchangeable is how firms end up paying for capabilities they never receive.

What fine-tuning actually does

Fine-tuning shifts how a model writes. Given a corpus of a firm's prior memos, IC decks, and reviewer notes, the model learns the cadence, the framing, the order in which a firm presents an investment case. It does not learn new facts in any reliable way — that is what retrieval is for. It learns voice. The output of a properly fine-tuned model reads like the firm; an off-the-shelf model with retrieval reads like an essay about the firm. The distinction is invisible from a demo and obvious from a draft IC memo.

Figure: Side-by-side comparison of a draft from an off-the-shelf model and a fine-tuned counterpart. The facts are the same; the register is not.

The honest answer to "do we need fine-tuning?" depends on what the output is for. If diligence output goes into a structured workflow — a memo, a one-pager, a credit grid — fine-tuning earns its keep within weeks. If it stays as ad-hoc Q&A in a chat window, retrieval alone is usually sufficient. Spending the engineering hours on fine-tuning when nobody downstream cares about voice is a category error.

The unsexy truth: most of the value in production AI for finance is in the retrieval, the citations, and the access controls. Fine-tuning is the last 20% — the part that makes the output yours.

Questions to ask before commissioning fine-tuning

Three questions separate firms that get value from fine-tuning from those that pay for an expensive non-event. First, do you have a corpus that is genuinely representative — at least a few hundred memos or decks, with consistent quality? Models trained on inconsistent inputs produce inconsistent outputs. Second, can the corpus be cleaned? Old templates, redacted sections, and merger-era artifacts should be excluded; otherwise the model learns those too. Third, who will evaluate the output? Fine-tuning without a senior reviewer in the loop tends to drift in directions nobody intended.
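The first two questions can be turned into a rough pre-flight check on the corpus. The sketch below is illustrative only: the `Memo` record, the `[REDACTED]` marker, the template cutoff date, and every threshold are hypothetical placeholders that a firm would replace with its own conventions. It does not address the third question, which needs a senior reviewer rather than a script.

```python
from dataclasses import dataclass
from datetime import date

# Every constant here is an assumption made for illustration.
MIN_CORPUS_SIZE = 300                  # stand-in for "a few hundred" documents
TEMPLATE_CUTOFF = date(2019, 1, 1)     # hypothetical date the current template took effect
MAX_REDACTED_SHARE = 0.05              # hypothetical tolerance for redacted tokens
REDACTION_MARKER = "[REDACTED]"        # hypothetical marker used in the firm's archive


@dataclass
class Memo:
    doc_id: str
    written_on: date
    text: str


def redacted_share(text: str) -> float:
    """Fraction of whitespace-separated tokens that are redaction markers."""
    tokens = text.split()
    if not tokens:
        return 1.0  # treat an empty document as unusable
    return sum(t == REDACTION_MARKER for t in tokens) / len(tokens)


def screen_corpus(memos):
    """Split memos into those kept for training and those dropped, with a reason."""
    kept, dropped = [], {}
    for memo in memos:
        if memo.written_on < TEMPLATE_CUTOFF:
            dropped[memo.doc_id] = "old template"
        elif redacted_share(memo.text) > MAX_REDACTED_SHARE:
            dropped[memo.doc_id] = "heavily redacted"
        else:
            kept.append(memo)
    return kept, dropped


def is_large_enough(kept) -> bool:
    """Check whether the cleaned corpus still meets the assumed size floor."""
    return len(kept) >= MIN_CORPUS_SIZE
```

Recording a reason for each dropped document is a deliberate choice: it gives the reviewer a list to audit before any training run, instead of a corpus that has quietly shrunk.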

Done well, fine-tuning is the difference between an AI that merely helps the firm and one that sounds like the firm. Done poorly, it is a line item that buys nothing the team can point to. The path between the two is clearer than the marketing makes it sound, but it requires asking the questions most decks would prefer you did not.
