![](/static/61a827a1/assets/icons/icon-96x96.png)
![](https://lemmy.world/pictrs/image/8286e071-7449-4413-a084-1eb5242e2cf4.png)
0·
21 days agoThe context cache doesn’t take up too much memory compared to the model. The main benefit of having a lot of VRAM is that you can run larger models. I think you’re better off buying a 24 GB Nvidia card from a cost and performance standpoint.
One thing I would do differently is setup LDAP and OIDC so you can use the same authentication credentials for different apps (at least the ones that support them). I use LLDAP and Authelia for this purpose.