What should I do if Hermes Agent reads too many files and burns tokens?

AI Q&A Admin 83 views

If Hermes Agent reads too many files and token consumption climbs, adjust the task scope first, and then look at file_read_max_chars. Don't let it read the entire repository indiscriminately. Ask it to search for the relevant location first, and then read only the relevant fragments.

Why file reading is expensive

Everything read from a file enters the model context. Logs, build artifacts, minified code, large JSON files, and large Markdown documents can easily add tens of thousands of tokens in a single read. In the official configuration, file_read_max_chars limits the number of characters per read by default. It can be raised for large-context models, and lowering it is recommended for small-context or local models.
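To make the effect of a per-read character cap concrete, here is a minimal Python sketch. It is not Hermes's implementation. The helper names `read_capped` and `estimate_tokens` are hypothetical, and the figure of roughly 4 characters per token is only a rule of thumb for English text, so treat the numbers as rough estimates.

```python
import math


def read_capped(text: str, max_chars: int) -> str:
    """Return at most max_chars characters and note how much was cut.

    Hypothetical helper that mimics what a per-read character cap does.
    """
    if len(text) <= max_chars:
        return text
    omitted = len(text) - max_chars
    return text[:max_chars] + f"\n[truncated {omitted} chars]"


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate; chars_per_token is an assumed heuristic."""
    return math.ceil(len(text) / chars_per_token)


if __name__ == "__main__":
    big_log = "ERROR something failed\n" * 10_000  # about 230,000 characters
    print("uncapped estimate:", estimate_tokens(big_log))
    print("capped estimate:  ", estimate_tokens(read_capped(big_log, 20_000)))
```

Comparing the two estimates for your own large files gives a quick sense of how much a lower cap saves per read.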

Optimization that can be done right away

  • Have Hermes search for keywords first, and then read only the local files that match.
  • Ask it not to read node_modules, dist, build, large logs, or cache directories.
  • In small-model scenarios, reduce file_read_max_chars to a more conservative value.
  • Have large files read in sections, extracting the conclusion after each section, rather than loading everything at once.
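The search-first and read-in-sections ideas above can be sketched in Python as follows. This only illustrates the workflow you would ask the agent to follow, not how Hermes works internally. The names `search_fragments`, `iter_sections`, and `EXCLUDED_DIRS` are hypothetical, and `.cache` is an assumed name for a cache directory.

```python
import os
from typing import Iterator, Tuple

# Directories to skip; ".cache" is an assumed cache directory name.
EXCLUDED_DIRS = {"node_modules", "dist", "build", ".cache"}


def search_fragments(
    root: str, keyword: str, context: int = 2, max_file_bytes: int = 200_000
) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, first_line_number, fragment) for each keyword hit."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never enters them.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                if os.path.getsize(path) > max_file_bytes:
                    continue  # skip big logs and other oversized files
                with open(path, encoding="utf-8") as f:
                    lines = f.read().splitlines()
            except (OSError, UnicodeDecodeError):
                continue  # unreadable or binary file
            for i, line in enumerate(lines):
                if keyword in line:
                    start = max(0, i - context)
                    end = min(len(lines), i + context + 1)
                    yield path, start + 1, "\n".join(lines[start:end])


def iter_sections(text: str, max_chars: int) -> Iterator[str]:
    """Yield chunks of at most max_chars, breaking at line ends where possible."""
    chunk = ""
    for line in text.splitlines(keepends=True):
        while len(line) > max_chars:  # hard-split a single over-long line
            if chunk:
                yield chunk
                chunk = ""
            yield line[:max_chars]
            line = line[max_chars:]
        if len(chunk) + len(line) > max_chars:
            yield chunk
            chunk = ""
        chunk += line
    if chunk:
        yield chunk
```

Only the small fragments around each hit need to go into the context, and a large file can be summarized one section at a time, keeping just the conclusion from each section.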

Know the limits of automatic deduplication

Hermes deduplicates repeated file reads to some extent: if the same region of a file is unchanged, subsequent reads may return a short notice instead of sending the full content again. However, after context compression the model may need to re-read critical files, so you still have to control the read scope.
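As a conceptual illustration of this kind of deduplication, and of why compression weakens it, here is a small Python sketch. It is not Hermes's actual mechanism. The class `ReadCache` and its methods are hypothetical, and `reset` merely stands in for context compression.

```python
import hashlib
from typing import Dict, Tuple


class ReadCache:
    """Hypothetical cache that replaces unchanged repeat reads with a short notice."""

    def __init__(self) -> None:
        self._seen: Dict[Tuple[str, int, int], str] = {}

    def read(self, path: str, start: int, end: int, content: str) -> str:
        """Return content on first or changed reads, a short notice otherwise."""
        key = (path, start, end)
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if self._seen.get(key) == digest:
            return f"[unchanged since last read: {path} lines {start}-{end}]"
        self._seen[key] = digest
        return content

    def reset(self) -> None:
        """Stand-in for context compression: earlier reads are forgotten."""
        self._seen.clear()


if __name__ == "__main__":
    cache = ReadCache()
    print(cache.read("app.py", 1, 40, "def main(): ..."))  # full content
    print(cache.read("app.py", 1, 40, "def main(): ..."))  # short notice
    cache.reset()
    print(cache.read("app.py", 1, 40, "def main(): ..."))  # full content again
```

After the reset, the same region costs its full size again, which is why a narrow read scope still matters even with deduplication in place.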

A good way to prompt is: "List the files you need to read and why, and wait for me to confirm before reading them." This lets you spend tokens on genuinely relevant context, rather than letting the agent use up the budget finding its way around.
