Selling this beauty as I upgraded to an M3 Max MacBook Pro 16-inch. It's very sparingly used, mint condition. Haven't sold it yet mostly because I was hoping to use it.
Have the box in storage. Comes only with a power cord.
Best for video editing (it has dual media encode/decode engines, which are available only on the Max and Ultra chips), running local LLMs, and everything else Macs can do.
Can show it working on a video call or send a video. Restrict all shitposting to my DMs.
So it beats the new M4 Pro Mac Minis, which, similarly specced, are 90K more expensive.
And this has 10G LAN + much higher system memory bandwidth + tonnes of ports + dual encode/decode engines + better cooling + a power button that's not in its ass.
You’re gonna make me hoard this for another few years
This is a similarly specced M4 Pro Mac Mini that matches the LLM performance of the M1 Max. It's 2.2L.
This is the best VFM editing rig you'll find. About 40% cheaper than a similarly performing upgraded M4 Pro Mac Mini, and it has the extra dual encode/decode engines + 10G LAN + tonnes of ports that the Mac Mini won't have.
If I didn't have to move to a MacBook for portability, I would never sell this.
32 GB is less? 32 GB is the max you'll get on a consumer-grade GPU (RTX 5090); beyond that you're into workstation cards, which cost even more than a 5090.
Bro, on a Mac the memory is unified; that's why people look for higher-capacity machines, since the RAM acts as VRAM. But LLMs aren't feasible to run under even 80GB of VRAM for anything meaningful and production-ready.
The process needs RAM too (for the non-compute stuff); that's why 32GB is less for AI work.
You want to run a production-ready AI server for 1.1L? What kind of stupid expectation is that?
Which consumer GPU are you getting with 32GB of VRAM?
Your replies make no sense at all. I'm comfortably running pretty decent-sized models on this.
I shared my view on the part that mentioned LLMs, so I responded specifically to that. Not sure why you're reading it as some kind of challenge.
You’re conflating running a model with running it well for production workloads.
On 16-24GB VRAM with RAM offloading, you can run quantized 70B models at around 4-8 tokens/sec or comfortably run 30B models at 30-40 tokens/sec. That works fine for prototyping or light inference. I do exactly that on 12GB VRAM for testing.
But the moment you exceed VRAM and start offloading to system RAM, you hit PCIe bandwidth limits. PCIe 4.0 x16 tops out at 64 GB/s, while GDDR6X VRAM runs at 900+ GB/s. That’s a 14x bandwidth gap. For batch inference, fine-tuning, or long context windows where you’re constantly shuttling data, that bottleneck compounds. Token generation can drop 50-80% when you’re pulling layers from system RAM.
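For a rough sense of why that bandwidth gap matters: single-stream decoding has to stream the active weights once per generated token, so tokens/sec is roughly capped at memory bandwidth divided by model size. A back-of-envelope sketch in Python, where the 40GB figure (roughly a 70B model at 4-bit) is my assumed example, not a benchmark:

# Back-of-envelope: decode speed is roughly memory-bandwidth-bound.
# Each generated token streams the active weights once, so
# tokens/sec is capped near bandwidth_GBps / model_size_GB.
# Bandwidth numbers are the ones quoted above; the model size is an
# ASSUMED example (~70B params at ~4-bit quantization, ~40 GB of weights).

MODEL_GB = 40.0  # assumed 70B @ 4-bit quant

def tok_per_sec_ceiling(bandwidth_gbps: float, model_gb: float = MODEL_GB) -> float:
    """Upper-bound estimate: one full weight pass per generated token."""
    return bandwidth_gbps / model_gb

for label, bw in [("GDDR6X VRAM (~900 GB/s)", 900),
                  ("unified memory (400 GB/s)", 400),
                  ("PCIe 4.0 x16 offload (64 GB/s)", 64)]:
    print(f"{label}: ~{tok_per_sec_ceiling(bw):.1f} tok/s ceiling")

Real throughput lands well below these ceilings, but the ratio between those lines is where the 50-80% drop comes from.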
On 32GB unified, the situation is different but still worse for my use case. That 32GB is shared between macOS, applications, CPU, and GPU. The OS alone uses 6-8GB, leaving maybe 24GB effective for GPU compute. You can't offload to separate system RAM because there is no separate pool. And at this machine's 400 GB/s max bandwidth, you're already slower than high-end dedicated VRAM.
So yes, people run models on smaller setups. I need 64GB+ unified specifically because I’m targeting larger models with longer contexts at production speeds, without constantly fighting memory pressure or offload penalties. If 32GB fits your workflow, that’s fine. It doesn’t fit mine.
I didn't know sharing something like that was a crime.
This is a sale thread, not a discussion thread. What part of that is hard to understand?
Ignoring the fact that you're wrong on multiple levels, you're spamming a post for no reason. You want 64GB of RAM, this has 32GB; all you had to do was ignore it and move on.
Your requirements for a production AI server need enterprise cards from Nvidia, which run 3-5 lakhs at minimum. Are you going to spam every sale thread on this forum that doesn't meet that criterion?
This has more usable VRAM than the second-best consumer Nvidia GPU available, if model size is the main issue for you. If you're running just Ollama, you can easily cap OS usage to 4-5GB. That leaves about 25GB for LLMs on the GPU, 1GB more than an RTX 4090's 24GB. That card alone costs 50k more than this entire device.
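If anyone wants to sanity-check what actually fits in that budget, the rough math is parameters × bits per weight ÷ 8, plus headroom for the KV cache and runtime. A quick sketch; the 25GB budget and the 20% overhead factor are assumptions for illustration, not measurements:

# Rough "does this model fit?" check for the GPU-usable memory budget.
# The 25 GB budget (after capping macOS to ~4-5 GB, as described above)
# and the 20% KV-cache/runtime headroom are ASSUMPTIONS for illustration.

BUDGET_GB = 25.0
OVERHEAD = 1.20  # assumed headroom for KV cache + runtime

def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB: params * bits / 8."""
    return params_billions * bits_per_weight / 8

for name, params, bits in [("8B @ 8-bit", 8, 8),
                           ("32B @ 4-bit", 32, 4),
                           ("70B @ 4-bit", 70, 4)]:
    need = weights_gb(params, bits) * OVERHEAD
    verdict = "fits" if need <= BUDGET_GB else "does not fit"
    print(f"{name}: ~{need:.0f} GB needed, {verdict} in {BUDGET_GB:.0f} GB")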
Bro, this is like going to a car dealership and complaining that cars are worse than helicopters and take more time to get you to your destination, all while not wanting to buy either a car or a helicopter, and then wondering why you got beaten up.
Here comes the rescue service for retards.
He needs a graphics card, not a Mac. He mentioned running LLMs in a production environment. That's enterprise GPU territory, so my example holds pretty well. Maybe you should get help for your comprehension issues.
He's got ego issues from being an old member here. Now he's quoting something from another response of mine that I didn't make here, the way the BJP IT Cell does to fool people… let him enjoy the ego he's got.
Wtf
You might want to just go through the post history and check who has been saying a ton of shit against the BJP & IT Cell since before gobhiji was even appointed dicktator in 2014. You'll find about 50 people here who will attest to that.
Brother, you can just click the arrow on the quoted text and it'll take you to the post where you said that, in this same thread.
I literally just quoted you. YOU said that.
@Party_Monger it's your fault tbh. Why are you selling this Mac Studio? You should be selling what @craftogrammer wants. You should be thankful for all the gyaan he gave you for FREE. Come on, you should do better.