To run the full 671B-parameter model (404GB in size), you would need more than 404GB of combined GPU memory and system RAM. And that’s only to get it running at all; you would most likely want all of it to be GPU memory to make it run fast.
With 24GB of GPU memory, the largest model from the R1 series that would fit is the 32b-qwen-distill-q4_K_M (20GB in size), available on ollama (and possibly elsewhere).
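If you want to sanity-check other combinations, the arithmetic can be sketched roughly like this in Python. The two model sizes are the ones quoted above; the 2GB overhead for context and runtime buffers, the 64GB of system RAM, and the `fits` helper itself are just my assumptions for illustration, not measured numbers or anything ollama provides.

```python
# Rough check of whether a model's weights fit in memory.
# Model sizes are the download sizes quoted above. OVERHEAD_GB is an assumed
# allowance for context (KV cache) and runtime buffers, not a measured value.

OVERHEAD_GB = 2.0  # assumption; in practice this grows with context length


def fits(model_size_gb: float, gpu_gb: float, ram_gb: float = 0.0) -> str:
    """Return 'gpu', 'gpu+ram', or 'no' for a hypothetical machine."""
    needed = model_size_gb + OVERHEAD_GB
    if needed <= gpu_gb:
        return "gpu"      # everything in GPU memory: the fast case
    if needed <= gpu_gb + ram_gb:
        return "gpu+ram"  # runs, but the part held in system RAM is slow
    return "no"


models = {
    "671b (full)": 404.0,
    "32b-qwen-distill-q4_K_M": 20.0,
}

# Hypothetical machine: 24GB GPU plus 64GB of system RAM.
for name, size_gb in models.items():
    print(name, "->", fits(size_gb, gpu_gb=24.0, ram_gb=64.0))
```

Treat the result as a first filter only: a model that "fits" on paper can still run out of memory with a long context, so leave yourself some headroom.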
Regarding photos and videos specifically:
I know you said you are starting with self-hosting, so your question focused on that, but I would also like to share my experience with ente, which has been working beautifully for my family, my partner and me. It is truly end-to-end encrypted, with the source code available on GitHub.
Their prices are reasonable. If you feel adventurous, you can also host it yourself. They have advanced search features and face recognition, which all run on device (since they can’t access your data), and it works very well. They have great sharing and collaboration features and don’t lock features behind accounts, so you can gather memories from other people, counted against your quota, just by sharing a link. You can also have a shared family plan.