If a Hermes Agent task fails to recognize an image, first check whether the task text explicitly references that image. v0.15.1 improved the Kanban worker so that vision-capable models can receive images referenced in the task body. The key point is that the image must be part of the task context, not merely placed in some directory.
Why do images get missed?
A person who reads "refer to this screenshot" may know which one you mean, but the worker may not. If the task body contains no image path, attachment description, or contextual reference, then once the task is split, the worker that actually runs it may receive only the text and not the image.
How to write the task correctly
- Clearly state the image path or attachment location in the task body.
- Specify which issues to watch for in the image, such as layout misalignment, text errors, or chart changes.
- Confirm that the worker uses a model that supports visual input.
- If the task is split, keep image references in the subtask that needs to view the image.
Don't expect the worker to scan every image in the repository automatically. That wastes context and easily pulls irrelevant material into the task.
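One way to catch a vague task before it is submitted is to check whether the task body contains anything that looks like an image path. The sketch below is only an illustration: `find_image_refs` and the extension list are hypothetical and are not part of Hermes Agent, and the sketch assumes images are referenced by file path rather than by some other attachment mechanism.

```python
import re

# Hypothetical helper, not a Hermes Agent API: matches path-like tokens
# that end in a common image extension.
IMAGE_REF = re.compile(r"[\w./\\-]+\.(?:png|jpe?g|gif|webp|bmp)\b", re.IGNORECASE)


def find_image_refs(task_body: str) -> list[str]:
    """Return image-like file paths mentioned in the task text, in order."""
    return IMAGE_REF.findall(task_body)


vague = "Refer to this screenshot and fix the layout."
explicit = "Check screenshots/login.png for layout misalignment in the header."

print(find_image_refs(vague))     # []
print(find_image_refs(explicit))  # ['screenshots/login.png']
```

An empty result suggests the task text relies on context that only a human reader has, which is the situation described above.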
What else should you check?
If the task body already references an image but it still is not recognized, check whether the image exists in the workspace the worker can access. With backends such as Docker, SSH, and Modal, it is common for a file to exist on the local machine but not on the remote worker. Sync the files first, then look into model capabilities.
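A quick way to rule out the sync problem is to run a small existence check in the environment where the worker actually executes (inside the container or on the remote host). This is a minimal sketch using only the Python standard library: `missing_images` is a hypothetical helper, and the workspace root is an assumption that you would replace with the worker's real working directory.

```python
import tempfile
from pathlib import Path


def missing_images(image_refs: list[str], workspace: Path) -> list[str]:
    """Return the referenced paths that are not files under the workspace."""
    return [ref for ref in image_refs if not (workspace / ref).is_file()]


# Demonstration with a temporary directory standing in for the worker workspace.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    (workspace / "screenshots").mkdir()
    (workspace / "screenshots" / "login.png").write_bytes(b"")

    refs = ["screenshots/login.png", "screenshots/chart.png"]
    print(missing_images(refs, workspace))  # ['screenshots/chart.png']
```

If the check reports missing files on the worker side while the same paths exist locally, the problem is file synchronization rather than the model.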