Mac mini AI Test: Apple M4 vs. M1 Max

Reading time: 3 minutes

How does the new Mac mini with the Apple M4 chip compare with a MacBook Pro with the M1 Max chip when it comes to AI? Is it powerful enough to run a 20 GB large language model? This article presents some surprising results.

1 Mac mini M4 Test: The Setup
2 Mac mini M4 vs. M1 Max: The Results
3 Apple M4 vs. M1 Max: The Verdict
4 What About the Mac mini M4 Pro and the M4 Max?

Mac mini M4 Test: The Setup

Our AI test scenario uses the following setup:

Device under test: a Mac mini M4 with 24 GB RAM, 10-core CPU, 10-core GPU, 16-core Neural Engine, and 120 GB/s memory bandwidth
The opponent: a MacBook Pro M1 Max with 64 GB RAM, 10-core CPU, 32-core GPU, 16-core Neural Engine, and 400 GB/s memory bandwidth
Software: Ollama with the following LLMs: Qwen2.5-Coder:32b, 14b, and 7b, plus Llama3.2:3b

The task for the LLM: "Schreibe einen Websocket Server in C#" ("Write a WebSocket server in C#"):

ollama run llama3.2:latest --verbose
>>> Schreibe einen Websocket Server in C#
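
For scripted benchmark runs, the same prompt can also be sent to Ollama's local REST API instead of the interactive CLI. The following is a minimal sketch, assuming an Ollama server listening on its default port 11434. The helper names `build_request` and `run_prompt` are made up for illustration and were not part of the original test.

```python
import json
import urllib.request

# Assumption: a local Ollama server on its default address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's generate endpoint."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def run_prompt(model: str, prompt: str) -> dict:
    """Send the prompt and return the parsed JSON reply.

    With "stream": False the reply is a single JSON object that also
    carries timing fields such as eval_count and eval_duration
    (the latter in nanoseconds).
    """
    with urllib.request.urlopen(build_request(model, prompt)) as reply:
        return json.load(reply)
```

Calling `run_prompt("llama3.2:latest", "Schreibe einen Websocket Server in C#")` would then correspond to the CLI session above, provided the model has already been pulled.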

The most important figure is the eval rate at the end of the output. Anything at or above 10 tokens per second is still perceived as acceptable by a user reading along; anything below that feels too slow.
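
To make the eval rate and the 10 tokens per second threshold concrete, here is a small sketch that parses the statistics block printed by `--verbose` and recomputes the rate from the eval count and eval duration. The function names are illustrative, and the parser assumes the "name: value" layout and Go-style durations (`199ms`, `17.552s`, `1m29.908s`) seen in the outputs in this article.

```python
import re

ACCEPTABLE_TOKENS_PER_SECOND = 10.0  # threshold used in this article

# Seconds per unit for Go-style duration strings.
_UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 1e-3, "µs": 1e-6, "us": 1e-6, "ns": 1e-9}


def parse_duration(text: str) -> float:
    """Convert a duration such as '199ms', '17.552s' or '1m29.908s' to seconds."""
    parts = re.findall(r"(\d+(?:\.\d+)?)(ms|µs|us|ns|h|m|s)", text)
    return sum(float(value) * _UNITS[unit] for value, unit in parts)


def parse_stats(output: str) -> dict:
    """Map each 'name: value' line of the statistics block to its raw value."""
    stats = {}
    for line in output.splitlines():
        name, separator, value = line.partition(":")
        if separator:
            stats[name.strip()] = value.strip()
    return stats


def eval_rate(stats: dict) -> float:
    """Generated tokens per second, recomputed from count and duration."""
    tokens = int(stats["eval count"].split()[0])
    return tokens / parse_duration(stats["eval duration"])


def is_acceptable(tokens_per_second: float) -> bool:
    """Apply the article's readability threshold."""
    return tokens_per_second >= ACCEPTABLE_TOKENS_PER_SECOND


# Excerpt of the Mac mini M4 run with the 3b model from this article.
SAMPLE = """\
eval count:           717 token(s)
eval duration:        17.552s
eval rate:            40.85 tokens/s
"""

rate = eval_rate(parse_stats(SAMPLE))
print(f"{rate:.2f} tokens/s, acceptable: {is_acceptable(rate)}")
```

Recomputing the rate from count and duration is a cheap sanity check against the `eval rate` line that Ollama prints itself.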

Mac mini M4 vs. M1 Max: The Results

Llama3.2:3b (2 GB)

Mac mini M4:

total duration:       17.7778915s
load duration:        25.402583ms
prompt eval count:    32 token(s)
prompt eval duration: 199ms
prompt eval rate:     160.80 tokens/s
eval count:           717 token(s)
eval duration:        17.552s
eval rate:            40.85 tokens/s

MacBook Pro M1 Max:

total duration: 13.3542385s
load duration: 29.868542ms
prompt eval count: 32 token(s)
prompt eval duration: 882ms
prompt eval rate: 36.28 tokens/s
eval count: 936 token(s)
eval duration: 12.441s
eval rate: 75.24 tokens/s

Surprisingly, the older M1 Max is almost twice as fast as the M4 at more than 75 tokens per second, which is down to its 32 GPU cores and 400 GB/s memory bandwidth. The M4 can only counter with 10 GPU cores and 120 GB/s. At just under 41 tokens per second, however, its output is still fluid and therefore fine.

Qwen2.5-Coder:7b (4.7 GB)

Mac mini M4:

total duration: 43.016241625s
load duration: 22.598166ms
prompt eval count: 36 token(s)
prompt eval duration: 375ms
prompt eval rate: 96.00 tokens/s
eval count: 875 token(s)
eval duration: 42.456s
eval rate: 20.61 tokens/s

MacBook Pro M1 Max:

total duration: 19.472867541s
load duration: 25.387916ms
prompt eval count: 36 token(s)
prompt eval duration: 1.803s
prompt eval rate: 19.97 tokens/s
eval count: 772 token(s)
eval duration: 17.434s
eval rate: 44.28 tokens/s

With the larger LLM, the gap between our two contenders widens as well: the M1 Max is now more than twice as fast as the M4. At just under 21 tokens per second, the M4 still delivers a good result as an AI machine in this category.

Qwen2.5-Coder:14b (9 GB)

Mac mini M4:

total duration: 1m30.197150625s
load duration: 27.20875ms
prompt eval count: 39 token(s)
prompt eval duration: 259ms
prompt eval rate: 150.58 tokens/s
eval count: 972 token(s)
eval duration: 1m29.908s
eval rate: 10.81 tokens/s

MacBook Pro M1 Max:

total duration: 49.621699958s
load duration: 37.180875ms
prompt eval count: 39 token(s)
prompt eval duration: 3.429s
prompt eval rate: 11.37 tokens/s
eval count: 1155 token(s)
eval duration: 45.99s
eval rate: 25.11 tokens/s

Here, too, the M1 Max is more than twice as fast as the M4. At just over 10 tokens per second, the M4 is at its limit as an AI machine in this category, but still usable.

Qwen2.5-Coder:32b (20 GB)

Mac mini M4:

total duration: 4m47.733996s
load duration: 23.813958ms
prompt eval count: 36 token(s)
prompt eval duration: 24.238s
prompt eval rate: 1.49 tokens/s
eval count: 1093 token(s)
eval duration: 4m23.304s
eval rate: 4.15 tokens/s

MacBook Pro M1 Max:

total duration: 50.514663125s
load duration: 38.55475ms
prompt eval count: 36 token(s)
prompt eval duration: 550ms
prompt eval rate: 65.45 tokens/s
eval count: 623 token(s)
eval duration: 49.923s
eval rate: 12.48 tokens/s

While the M1 Max still delivers an acceptable result here at roughly 12.5 tokens per second, the M4 is clearly too slow at about 4 tokens per second, roughly a third of the M1 Max's rate.
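
The four runs can be condensed into one comparison. The sketch below only reuses the eval rates reported above and computes how many times faster the M1 Max was for each model; the dictionary layout and the function name are illustrative.

```python
# Eval rates in tokens/s from the runs above: (Mac mini M4, MacBook Pro M1 Max).
EVAL_RATES = {
    "Llama3.2:3b": (40.85, 75.24),
    "Qwen2.5-Coder:7b": (20.61, 44.28),
    "Qwen2.5-Coder:14b": (10.81, 25.11),
    "Qwen2.5-Coder:32b": (4.15, 12.48),
}


def speedups(rates: dict) -> dict:
    """Return the M1 Max rate divided by the M4 rate for each model."""
    return {model: m1_max / m4 for model, (m4, m1_max) in rates.items()}


for model, factor in speedups(EVAL_RATES).items():
    print(f"{model:20s} M1 Max is {factor:.2f}x as fast as the M4")
```

The factor grows with model size, from roughly 1.8x for the 3b model to roughly 3.0x for the 32b model.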

Apple M4 vs. M1 Max: The Verdict

If you are looking for a comparatively inexpensive AI machine for LLMs of up to about 14 billion parameters, the Mac mini M4 in its basic configuration is a good choice. If, on the other hand, you get the chance to pick up a used M1 Max or better, you should take it.

As a rule of thumb: the more GPU cores and the higher the memory bandwidth, the faster the large language model runs. This is where the premium for the Pro, Max, and Ultra variants of the M-series chips pays off.
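
One common back-of-the-envelope way to see why memory bandwidth matters: if every generated token requires reading all model weights from memory once, the eval rate cannot exceed bandwidth divided by model size. This simplification is an assumption added here, not a claim from the measurements, and it ignores GPU compute, caching, and other overhead. The sketch applies it to the bandwidths and model sizes stated in this article.

```python
# Memory bandwidth in GB/s and model sizes in GB, as stated in this article.
BANDWIDTH_GB_S = {"Mac mini M4": 120.0, "MacBook Pro M1 Max": 400.0}
MODEL_SIZE_GB = {
    "Llama3.2:3b": 2.0,
    "Qwen2.5-Coder:7b": 4.7,
    "Qwen2.5-Coder:14b": 9.0,
    "Qwen2.5-Coder:32b": 20.0,
}


def bandwidth_bound(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough upper bound on tokens/s, assuming one full weight read per token."""
    if model_size_gb <= 0:
        raise ValueError("model size must be positive")
    return bandwidth_gb_s / model_size_gb


for machine, bandwidth in BANDWIDTH_GB_S.items():
    for model, size in MODEL_SIZE_GB.items():
        bound = bandwidth_bound(bandwidth, size)
        print(f"{machine:20s} {model:20s} at most {bound:6.1f} tokens/s")
```

For the 20 GB model this estimate gives at most 6 tokens per second on the M4 and 20 on the M1 Max, and the measured 4.15 and 12.48 tokens per second both stay below those bounds.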

What About the Mac mini M4 Pro and the M4 Max?

The Mac mini with the M4 Pro chip comes with a 12-core CPU, 16-core GPU, 16-core Neural Engine, and 273 GB/s memory bandwidth.

Despite more than double the memory bandwidth of the standard M4 and 6 additional GPU cores, it reaches only about 80% of the M1 Max's performance.

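The M4 Max in the 2024 MacBook Pro, with 410 GB/s memory bandwidth and 40 GPU cores, by contrast delivers about 25% more performance than the M1 Max. (Apple's specifications list 410 GB/s for the 32-core GPU variant; the 40-core GPU variant is rated at 546 GB/s.)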

Source: https://github.com/ggerganov/llama.cpp/discussions/4167

By Harald | 2024-12-03 (timestamp 2025-03-11T11:27:50+01:00) | Categories: Hardware | Tags: AI