I have run into this issue multiple times with Nomad. For some reason, when jobs are redeployed or restarted, the old service registration isn’t always removed. This causes my Traefik reverse proxy to send requests to an allocation that no longer exists in Nomad.
Last time this happened, I think I completely nuked the Nomad setup and set everything up from scratch. This time around, I finally figured out the right way to do things.
- Use `nomad service list` to get a list of the services.
- Then use `nomad service info -verbose <service_name>` to get the service registrations for the app having the issue.
- Open the UI, go to the Job > Services, and click on the allocations to identify the dead allocation.
- Note the allocation ID and find the registration with that allocation ID in the service info output. Its ID is the one to delete.
- Finally, remove the ghost entry with `nomad service delete <service_name> <ID>`.
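If you would rather not match the IDs by eye, the last two steps can be sketched in a few lines of Python. This is a rough helper and not something from Nomad itself. It assumes that `nomad service info -json <service_name>` prints a JSON list of registrations and that each entry carries `ID` and `AllocID` fields, so check that against the output of your Nomad version first. The function names are made up for this example. It only prints the delete commands, so you can review them before running anything.

```python
import json
import subprocess


def registrations_for_alloc(registrations, alloc_id):
    """Return the registration IDs that belong to the given allocation.

    `registrations` is a list of dicts; the "ID" and "AllocID" keys are
    assumed field names from the JSON output of `nomad service info`.
    """
    if not alloc_id:
        # An empty prefix would match every registration, so refuse it.
        raise ValueError("alloc_id must not be empty")
    return [
        reg["ID"]
        for reg in registrations
        # Accept a short allocation ID prefix as well as the full ID.
        if reg.get("AllocID", "").startswith(alloc_id)
    ]


def delete_commands(service_name, registration_ids):
    """Build one `nomad service delete` command line per registration ID."""
    return [
        f"nomad service delete {service_name} {reg_id}"
        for reg_id in registration_ids
    ]


def fetch_registrations(service_name):
    """Ask the Nomad CLI for the registrations of a service as parsed JSON."""
    result = subprocess.run(
        ["nomad", "service", "info", "-json", service_name],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def main(service_name, dead_alloc_id):
    """Print the delete commands for the ghost entries without running them."""
    registrations = fetch_registrations(service_name)
    ghost_ids = registrations_for_alloc(registrations, dead_alloc_id)
    for command in delete_commands(service_name, ghost_ids):
        print(command)
```

Calling `main("<service_name>", "<allocation ID>")` with the dead allocation ID from the UI prints the matching `nomad service delete` lines. I kept it print-only on purpose, since deleting the wrong registration would take a healthy allocation out of Traefik.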