Meta Llama 3.1 8B Instruct
Q4_K - Medium
8.0B params
COMPARE ACCELERATORS
105 accelerators tested
NVIDIA GeForce RTX 4090 (24GB)
NVIDIA GeForce RTX 5090 (31GB)
NVIDIA H100 PCIe (79GB)
NVIDIA GeForce RTX 4080 (16GB)
NVIDIA GeForce RTX 3090 Ti (24GB)
Meta Llama 3.1 8B Instruct - Q4_K - Medium
LEADERBOARD
Accelerator    Prompt (tokens/s)    Generation (tokens/s)    TTFT        LocalScore
GPU / 48GB     5487                 51.3                     252 ms      1038
GPU / 16GB     4222                 77.6                     316 ms      1012
GPU / 16GB     4461                 54.4                     301 ms      931
GPU / 12GB     3526                 56.7                     376 ms      808
GPU / 20GB     2617                 56.5                     518 ms      658
GPU / 20GB     2013                 44.6                     689 ms      507
GPU / 8GB      1460                 57.8                     880 ms      458
GPU / 8GB      1223                 51.3                     1.04 sec    392
GPU / 16GB     1328                 37.9                     1.02 sec    367
GPU / 128GB    534                  48.9                     2.16 sec    230
(not given)    290                  4.3                      8.73 sec    52
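The page does not say how LOCALSCORE is derived from the other three columns. The leaderboard figures above are, however, consistent with a geometric mean of prompt speed, generation speed, and the inverse of TTFT in seconds, scaled by 10. The sketch below works under that assumption; the function name `estimate_localscore` is hypothetical, and the small deviations from the listed scores are presumably because the displayed values are rounded averages.

```python
def estimate_localscore(prompt_tps: float, gen_tps: float, ttft_ms: float) -> float:
    """Estimate a LocalScore-style value from three benchmark metrics.

    Assumption (not stated on the page): the score is 10 times the
    geometric mean of prompt tokens/s, generation tokens/s, and
    1000 / TTFT in milliseconds.
    """
    if min(prompt_tps, gen_tps, ttft_ms) <= 0:
        raise ValueError("all metrics must be positive")
    inverse_ttft = 1000.0 / ttft_ms  # lower latency gives a higher term
    return 10.0 * (prompt_tps * gen_tps * inverse_ttft) ** (1.0 / 3.0)


# Rows copied from the leaderboard: (prompt, generation, TTFT in ms, listed score).
# TTFT values shown in seconds on the page are converted to milliseconds here.
rows = [
    (5487, 51.3, 252, 1038),
    (4222, 77.6, 316, 1012),
    (534, 48.9, 2160, 230),
    (290, 4.3, 8730, 52),
]

for prompt, gen, ttft, listed in rows:
    estimate = estimate_localscore(prompt, gen, ttft)
    print(f"listed={listed:5d}  estimated={estimate:7.1f}")
```

Because the three terms are multiplied, a weak result in any one metric pulls the score down sharply. The last row shows this: its generation speed of 4.3 tokens/s and TTFT of 8.73 sec both weigh heavily on the result.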