Hacker News

That's what documentation is for. If you don't have that, AI won't figure it out either.


I'm not sure that's true?


Such a project is way too large for AI to process as a whole. So yes it's true.


I have a question: Many people have spoken about their experience of using LLMs to summarise long, complex PDFs. I am so ignorant on this matter. What is so different about reading a long PDF vs reading a large source base? Or can a modern LLM handle, say, 100 pages, but 10,000 pages is way too much? What happens to an LLM that tries to read 10,000 pages and summarise it? Is the summary rubbish?


Get the LLM to read and summarise N pages at a time, and store the outputs. Then, you concatenate those outputs into one "super summary" and use _that_ as context.

There's some fidelity loss, but it works for text because there's often so much redundancy.

However, I'm not sure this technique could work on code.
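A minimal sketch of that chunk-then-merge approach. The `llm` callable and the prompt wording are hypothetical placeholders for whatever model API you actually use:

```python
def chunk(pages, n):
    """Split a list of page strings into groups of at most n pages."""
    return [pages[i:i + n] for i in range(0, len(pages), n)]

def summarize(text, llm):
    # Hypothetical: llm is any callable that takes a prompt string
    # and returns a completion string (OpenAI, Gemini, a local model...).
    return llm("Summarize the following:\n" + text)

def super_summary(pages, llm, n=10):
    """Summarize n pages at a time, then concatenate the partial
    summaries and summarize *that* to get the 'super summary'."""
    partials = [summarize("\n".join(group), llm)
                for group in chunk(pages, n)]
    return summarize("\n".join(partials), llm)
```

Each level of summarization loses detail, which is why this degrades for code: unlike prose, code has little redundancy, so the dropped detail tends to be load-bearing.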


Most models can't handle contexts that large, so the way they often do it is file by file, which loses the overall context.


Lots of models CAN handle large contexts; Gemini 2.5 Pro, their latest model, can take 1 million tokens of context.



