The gathering-key-details step is a tool ("the librarian") that's called by the top-level model. It uses a different (read: cheaper and faster) model with a long context window (Gemini 2.5 Flash Lite at the moment).
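Conceptually the delegation looks roughly like the sketch below -- a minimal illustration using the google-genai Python SDK, where the function name, prompt, and config values are hypothetical stand-ins, not the project's actual code:

```python
# Sketch only: the librarian is just a tool function the top-level model can call.
# It always delegates to the same pinned sub-model with the same pinned config.
from google import genai
from google.genai import types

client = genai.Client()  # picks up GEMINI_API_KEY from the environment

LIBRARIAN_MODEL = "gemini-2.5-flash-lite"          # cheap, fast, long context
LIBRARIAN_CONFIG = types.GenerateContentConfig(    # pinned sampling config
    temperature=0.0,
    top_p=1.0,
)

def librarian_gather_key_details(pdf_bytes: bytes, question: str) -> str:
    """Read the whole PDF with the cheap long-context model and return key details."""
    response = client.models.generate_content(
        model=LIBRARIAN_MODEL,
        contents=[
            types.Part.from_bytes(data=pdf_bytes, mime_type="application/pdf"),
            f"Gather the key details relevant to: {question}",
        ],
        config=LIBRARIAN_CONFIG,
    )
    return response.text
```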
The reason you see the same output is that the same PDF is being processed by the same model (Gemini 2.5 Flash Lite) with the same config (temperature, top_p, etc.), so the output is the same.
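In other words, running the step twice on the same file hits the same pinned model and config, so you get (effectively) the same result back -- a hypothetical repro continuing the sketch above:

```python
# Hypothetical repro: same bytes in, same pinned model and config,
# so the two results come back (effectively) identical -- no caching involved.
pdf_bytes = open("report.pdf", "rb").read()

first = librarian_gather_key_details(pdf_bytes, "What are the key findings?")
second = librarian_gather_key_details(pdf_bytes, "What are the key findings?")

print(first == second)  # the inputs and config simply never change between runs
```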
You wouldn't want the top-level model doing the librarian's work in the gathering-key-details step -- it chews through a LOT of tokens, and letting arbitrary models handle that would expose users to unpredictable cost explosions or latency spikes given the mass of tokens in a 50+ page PDF.
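To put rough numbers on it (the tokens-per-page figure and per-token prices below are placeholder assumptions for illustration, not real pricing):

```python
# Back-of-envelope cost comparison; page-to-token ratio and prices are
# placeholder assumptions, not actual rates.
PAGES = 50
TOKENS_PER_PAGE = 800                   # assumed average for a dense PDF page
input_tokens = PAGES * TOKENS_PER_PAGE  # ~40k input tokens per librarian call

CHEAP_PRICE_PER_MTOK = 0.10     # hypothetical $/1M input tokens for the sub-model
FRONTIER_PRICE_PER_MTOK = 5.00  # hypothetical $/1M input tokens for a top-level model

print(f"librarian sub-model: ${input_tokens * CHEAP_PRICE_PER_MTOK / 1e6:.4f} per call")
print(f"top-level model:     ${input_tokens * FRONTIER_PRICE_PER_MTOK / 1e6:.4f} per call")
# The gap compounds quickly if the PDF gets re-read on every tool call.
```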
TL;DR -- the PDF output is not cached; it's just the same sub-model with the same config powering the librarian, so the same input produces the same output.