My rack is finished for now (because I’m out of money).

Last time I posted I had some jank cables going through the rack, and now we're using patch panels with color-coordinated cables!

But as is tradition, I'm thinking about upgrades, and I'm looking at that 1U filler panel. A mini PC with a 5060 Ti 16GB or maybe a 5070 12GB would be pretty sick for moving my AI slop generation into my tiny rack.

I'm also thinking about the Pi cluster at the top. Currently that's running a Kubernetes cluster that I'm trying to learn on. They're all Pi 4 4GB, so I was going to start replacing them with Pi 5 8/16GB. Would those be better price/performance for mostly coding tasks? Or maybe a Discord bot for shitposting.

Thoughts? MiniPC recs? Wanna bully me for using AI? Please do!

    • nagaram@startrek.websiteOP · 1 month ago

      The Lenovo ThinkCentre M715qs were $400 total after upgrades. I fortunately had three 32GB RAM kits from my work's e-waste bin, but if I had to add those it would probably be $550ish.

      - Rack: $120 from 52Pi
      - 2 extra 10in shelves: $25 each
      - Pi cluster rack: $50 (shit, I thought it was $20. Not worth)
      - Patch panel: $20
      - UPS: $80
      - Switch: $80

      So in total I spent $800 on this setup.

      To fully replicate it from scratch you would need to spend another $160 on Raspberry Pis and probably $20 on cables.

      So $1000, theoretically.

  • Diplomjodler@lemmy.world · 1 month ago

    I’m afraid I’m going to have to deduct one style point for the misalignment of the labels on the mini PCs.

    • nagaram@startrek.websiteOP · 1 month ago

      That’s fair and justified. I have the label maker right now in my hands. I can fix this at any moment and yet I choose not to.

      I'm a man feeding orphans to the orphan-crushing machine. I can stop this at any moment.

    • nagaram@startrek.websiteOP · 1 month ago

      True, but I have an addiction and that’s buying stuff to cope with all the drawbacks of late stage capitalism.

      I am but a consumer who must be given reasons to consume.

  • TropicalDingdong@lemmy.world · 1 month ago

    This is so pretty 😍🤩💦!!

    I've been considering a micro rack to support the journey, but primarily to house old laptop chassis as I convert them into Proxmox resources.

    Any thoughts or comments on your choice of this rack?

    • nagaram@startrek.websiteOP · 1 month ago

      Not a lot of thought went into the rack choice, honestly. I wanted something smaller and more powerful than the several OptiPlexes I had.

      I also decided I didn't want storage to happen here anymore, because I am stupid and only knew how to pass through disks for TrueNAS. So I had four TrueNAS servers on my network and I hated it.

      This was just what I wanted at a price I was good with, at like $120. There's a 3D-printable version but I wasn't interested in that. I do want to 3D print racks, and I want to make my own custom ones for the Pis to save space.

      But this set up is way cheaper if you have a printer and some patience.

  • _//(0)(0)\\_@lemmy.world · 1 month ago

    Looking good! Funny I happen across this post when I'm working on mine as well. As I type this I'm playing with a little 1.5" transparent OLED that will poke out of the rack beside each Pi, scrolling various info (CPU load/temp, IP, LAN traffic, node role, etc.).
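
    A minimal sketch of the stats-gathering half (assuming the psutil package; the actual OLED write is left out since it depends on the exact module and driver):

    ```python
    # Build the per-node stats line to scroll on the OLED. Assumes psutil;
    # the display write itself is module-specific and omitted here.
    import socket
    import time

    import psutil

    def stats_line() -> str:
        cpu = psutil.cpu_percent(interval=1)  # % CPU load over the last second
        temps = psutil.sensors_temperatures().get("cpu_thermal", [])  # Pi sensor name
        temp_c = temps[0].current if temps else 0.0
        ip = socket.gethostbyname(socket.gethostname())
        rx_mib = psutil.net_io_counters().bytes_recv >> 20  # lifetime RX in MiB
        return f"{ip} | CPU {cpu:.0f}% {temp_c:.0f}C | rx {rx_mib} MiB"

    while True:
        print(stats_line())  # swap print() for a write to the OLED framebuffer
        time.sleep(2)
    ```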

  • hendrik@palaver.p3x.de · 1 month ago

    Well, I always advocate for using the stuff you have. I don't think a Discord bot needs four new Raspberry Pi 5s; that's likely to run on a single Pi 3. And as long as they're sitting idle, it doesn't really matter which model number they have… So go ahead and put something on your hardware, and buy new ones once you've maxed out your current setup.

    I'm not educated on Bazzite. Maybe tools like Distrobox or other container solutions can help running AI workloads on the gaming rig. It's likely easier to run a dedicated AI server, but I started learning about quantization and tested some models on my main computer with the help of ollama, KoboldCPP and some random Docker/Podman containers. I'm not saying this is the preferable solution, but it's definitely enough to get started with AI. And you can always connect the computers within your local network, write some server applications and have them hook into ollama's API; it doesn't really matter whether that runs on your gaming PC or a server (as long as the computer in question is turned on…).
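
    A minimal sketch of that last step, hitting ollama's REST API from another machine (the IP and model name are placeholders for whatever you actually run):

    ```python
    # Query an Ollama instance elsewhere on the LAN over its HTTP API.
    # Placeholder host and model; Ollama listens on port 11434 by default.
    import requests

    OLLAMA_URL = "http://192.168.1.50:11434"  # the PC or server running ollama

    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={"model": "llama3.2", "prompt": "Say hello in five words.", "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])
    ```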

    • nagaram@startrek.websiteOP · 1 month ago

      Ollama and all that runs on it; it's just the firewall rules and opening it up to my network that are the issue.

      I cannot get ufw, iptables, or anything like that working on it. So I usually just SSH into the PC and do a CLI-only interaction, which is mostly fine.

      I want to use OpenWebUI so I can feed it notes and books as context, but I need the API, which isn't open on my network.
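
      (A quick way to narrow that down from another machine, with a placeholder IP: if this fails even with the firewall rules in place, it's usually because ollama binds only to 127.0.0.1 until OLLAMA_HOST=0.0.0.0 is set on the host.)

      ```python
      # Sanity check: is the Ollama API reachable from the LAN at all?
      # Placeholder IP; Ollama's default port is 11434.
      import socket

      HOST, PORT = "192.168.1.50", 11434

      try:
          with socket.create_connection((HOST, PORT), timeout=3):
              print("port open - OpenWebUI should be able to reach the API")
      except OSError as exc:
          print(f"unreachable: {exc} (check OLLAMA_HOST and the firewall)")
      ```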

    • nagaram@startrek.websiteOP · 1 month ago

      Oh, and my home office setup uses Tiny-in-One monitors, so I configured these by plugging them into my monitor, which was sick.

      I'm a huge fan of this all-in-one idea that is upgradable.

  • muppeth@scribe.disroot.org · 1 month ago

    Wow! Very cool rack you got there. I too started using mini PCs for local test servers and general home servers, but unlike yours, mine are just dumped behind the screen on my desk (three in total). For LLM stuff atm I use a 16GB Radeon, but that's connected to my desktop. In the future I would love to build a proper rack like yours and perhaps move the GPU to a dedicated mini PC.

    As for the upgrades, like others stated already, I would just go for more PCs rather than RPis.

    • nagaram@startrek.websiteOP · 1 month ago

      The Pis were honestly because I had them.

      I think I'd rather use them for something else, like robotics or a BirdNET-Pi.

      But the Pi rack was like $20 and hilarious.

      The objectively correct answer for more compute is more mini PCs, though. And I'm really thinking about the Mac mini option for AI.

      • muppeth@scribe.disroot.org · 1 month ago

        Is the Mac mini really that good? Running 12-14B models on my Radeon RX 7600 XT is okay-ish, but I do "feel it", and running 7-8B models sometimes just doesn't feel like enough. I wonder where the Mac mini lands here.

        • nagaram@startrek.websiteOP · 1 month ago

          From what I understand it's not as fast as a consumer Nvidia card, but close.

          And you can have much more "VRAM" because they do unified memory. I think by default around two-thirds to three-quarters of total system memory can go to the GPU, so a top-spec Mac mini M4 Pro with 48GB of RAM would have about 32GB dedicated to GPU/NPU tasks for $2000.
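
          Rough arithmetic (the exact GPU share is tunable and reported defaults vary; two-thirds is the fraction that matches the 32GB figure):

          $$\text{usable VRAM} \approx f \times \text{RAM}, \qquad \tfrac{2}{3} \times 48\,\text{GB} = 32\,\text{GB}, \qquad 0.75 \times 48\,\text{GB} = 36\,\text{GB}$$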

          Compare that to JUST a 5090 32GB for $2000 MSRP and it's pretty compelling.

          Another $200 and it's the 64GB model, with two 4090s' worth of VRAM.

          It's certainly better than the AMD AI experience, and it's the best price for getting into AI stuff, or so say nerds with more money and experience than me.

  • lepinkainen@lemmy.world · 1 month ago

    This will get downvoted to oblivion because this is Lemmy:

    Get a Mac mini. Any M-series model with 32GB of memory will run local models at decent speeds, and it will be cheaper than just a 5xxx-series GPU.

    And it’ll fit your cool rack 😀

  • ZeDoTelhado@lemmy.world · 1 month ago

    I have a question about AI usage on this: how do you do it? Every time I see AI usage, some sort of 4090 or 5090 is mentioned, so I'm curious what kind of AI usage you can do here.

    • nagaram@startrek.websiteOP · 1 month ago

      With an RTX 3060 12GB, I have been perfectly happy with the quality and speed of the responses. It's much slower than my 5060 Ti, which I think is the sweet spot for text-based LLM tasks. A larger context window provided by more VRAM, or a web-based AI, is cool and useful, but I haven't found the need for that yet in my use case.

      As you may have guessed, I can't fit a 3060 in this rack. That's in a different server that houses my NAS. I have done AI on my 2018 Epyc server CPU and it's just not usable. Even with 109GB of RAM, not usable. Even clustered, I wouldn't try running anything on these machines; they are for Docker containers and Minecraft servers. Jeff Geerling probably has a video on trying to run an AI on a bunch of Raspberry Pis. I just saw his video using Ryzen AI Strix boards, and that was ass compared to my 3060.

      As for my use case, I am just asking AI to generate simple scripts based on manuals I feed it, or some sort of writing task. I either get it to take my notes on a topic and make an outline that makes sense, which I then fill in, or I feed it finished writing and ask for grammatical or tone fixes. That's fucking it, and it boggles my mind that anyone is doing anything more intensive than that. I am not training anything, and 12GB of VRAM is plenty if I wanna feed it like 10-100 pages of context. Would it be better with a 4090? Probably, but for my uses I haven't noticed a difference in quality between my local LLM and the web-based stuff.

  • Korhaka@sopuli.xyz · 1 month ago

    Ohh nice, I want it. Don’t really know what I would use all of it for, but I want it (but don’t want to pay for it).

    Currently been thinking of getting an N150 mini PC and setting up Proxmox and a few VMs: at the very least Pi-hole, somewhere to dump some backups, and a web server for a few projects.