Llama 3.2 1B Instruct
Q4_K - Medium
1.5B params
COMPARE ACCELERATORS
179 accelerators tested
NVIDIA H100 PCIe / 79GB
NVIDIA A100-SXM4-80GB / 79GB
NVIDIA GeForce RTX 3090 Ti / 24GB
NVIDIA GeForce RTX 3080 Ti / 12GB
NVIDIA RTX A6000 / 48GB
Llama 3.2 1B Instruct - Q4_K - Medium
LEADERBOARD
TYPE / MEMORY  PROMPT (tokens/s)  GENERATION (tokens/s)  TTFT (ms)  LOCALSCORE
GPU / 48GB     19620              131                    68         3350
GPU / 16GB     15512              226                    93         3334
GPU / 16GB     16736              141                    81         3077
GPU / 12GB     13347              188                    106        2856
GPU / 20GB     10703              237                    142        2616
GPU / 20GB     8737               189                    179        2099
GPU / 8GB      6350               212                    216        1840
GPU / 16GB     6151               173                    244        1634
GPU / 8GB      5553               195                    287        1558
GPU / 8GB      6259               97.3                   223        1397
GPU / 6GB      4979               114                    286        1255
GPU / 128GB    3296               176                    334        1203
GPU / 192GB    3272               170                    339        1179
-              4083               124                    353        1128
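
The leaderboard does not say how the LOCALSCORE column is derived from the other three metrics. The sketch below is a rough sanity check only, assuming the score is 10 times the geometric mean of prompt speed, generation speed, and inverted TTFT (1000 / ms). That formula is an assumption, not something stated on this page, and the function name `local_score_estimate` is hypothetical. The published figures appear to be averaged and rounded, so small deviations are expected.

```python
def local_score_estimate(prompt_tps: float, gen_tps: float, ttft_ms: float) -> float:
    """Estimate a LocalScore-style figure from three benchmark metrics.

    Assumption (not confirmed by the leaderboard itself): the score is
    10 x the geometric mean of prompt tokens/s, generation tokens/s,
    and 1000 / TTFT in ms, so that lower latency raises the score.
    """
    if min(prompt_tps, gen_tps, ttft_ms) <= 0:
        raise ValueError("all metrics must be positive")
    ttft_term = 1000.0 / ttft_ms
    return 10.0 * (prompt_tps * gen_tps * ttft_term) ** (1.0 / 3.0)


# Metrics copied from the first and last leaderboard entries.
first_entry = local_score_estimate(19620, 131, 68)   # listed LOCALSCORE: 3350
last_entry = local_score_estimate(4083, 124, 353)    # listed LOCALSCORE: 1128

print(f"first entry estimate: {first_entry:.0f}")
print(f"last entry estimate:  {last_entry:.0f}")
```

Under this assumption a device cannot make up for poor latency with raw prompt throughput alone: the three terms multiply, so halving TTFT raises the estimate by the same factor as doubling either throughput figure.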