How to install TabbyML on Mac

Cedric Ferry
4 min read · Jan 23, 2024


In the previous article, we discovered Tabby, an open source tool that helps you code faster by leveraging open source Large Language Models such as Code Llama, StarCoder and DeepSeek Coder.

In this article, you will learn how to install, configure and use Tabby on a Mac equipped with Apple Silicon.

Note: Tabby can also run on Intel Macs; see the official documentation for details.

Recommended configuration

Running LLMs requires 2 essential things:

  • GPU capabilities
  • Large amount of RAM

Thankfully Tabby supports multiple LLM sizes, ranging from 1B to 13B parameters.

As a minimum, I would recommend an Apple Silicon M1 with 16GB of RAM. With that configuration, you will be able to run 1B and 7B parameter models with decent latency and accuracy.

For more comfortable usage, anything better than an M1 with 24GB of RAM will let you load larger models.

Under the hood, Tabby uses Apple's Metal API for GPU acceleration.
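If you are not sure what your Mac has, a quick check from the terminal can tell you. The sketch below assumes the standard macOS `sysctl` keys `machdep.cpu.brand_string` and `hw.memsize`; `bytes_to_gib` is just a helper name I picked for the example.

```shell
# Convert a byte count to whole GiB (integer division).
bytes_to_gib() {
  echo $(( $1 / 1073741824 ))
}

# The sysctl keys below are macOS-specific, so only query them on Darwin.
if [ "$(uname -s)" = "Darwin" ]; then
  echo "Chip: $(sysctl -n machdep.cpu.brand_string)"
  echo "RAM:  $(bytes_to_gib "$(sysctl -n hw.memsize)") GiB"
else
  echo "Not macOS: skipping hardware check"
fi
```

On an Apple Silicon Mac, the first line should report the chip name and the second the installed memory, which you can compare against the recommendations above.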

Installing Tabby

You can install Tabby with Homebrew. If you don’t have Homebrew yet, follow the instructions on its website (a single command to run).

With Homebrew in place, run this command and Tabby will install in a few seconds.

$ brew install tabbyml/tabby/tabby
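To confirm that both tools ended up on your PATH, a small check like the one below can help. It is only a sketch: `have_cmd` is a helper name I made up, and it relies on nothing more than the shell's built-in `command -v`.

```shell
# Return success if the given command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report whether Homebrew and Tabby can be found.
for tool in brew tabby; do
  if have_cmd "$tool"; then
    echo "$tool: found at $(command -v "$tool")"
  else
    echo "$tool: not found"
  fi
done
```

If `tabby` shows up as not found right after installing, opening a new terminal window is usually enough for the shell to pick it up.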

Choosing the model

Tabby can be used with a variety of models. Some models perform better in specific languages, so pick the one that best fits your needs.

Available models can be found here.

| Name          | XS   | S  | M    | L   |
|---------------|------|----|------|-----|
| DeepseekCoder | 1.3B | -  | 6.7B | -   |
| StarCoder     | 1B   | 3B | 7B   | -   |
| CodeLlama     | -    | -  | 7B   | 13B |
| WizardCoder   | 1B   | 3B | -    | -   |

First, select the model:

  • Code Llama generally works better for Python
  • StarCoder is light and works on lower configurations
  • DeepSeek Coder got the best results overall (see the leaderboard)

Second, select the size of the model. This depends directly on your RAM:

  • Less than 16GB: a small model, up to 3B parameters
  • 16GB to 24GB: up to 7B parameters
  • More than 24GB: up to 13B parameters
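The sizing rule above can be sketched as a tiny shell function. `max_model_size` is a hypothetical helper, not part of Tabby, and I am assuming that exactly 16GB falls in the 7B tier, in line with the recommended configuration earlier in this article.

```shell
# Hypothetical helper: map RAM (in whole GB) to the largest suggested model size.
max_model_size() {
  ram_gb=$1
  if [ "$ram_gb" -lt 16 ]; then
    echo "3B"
  elif [ "$ram_gb" -le 24 ]; then
    echo "7B"
  else
    echo "13B"
  fi
}

max_model_size 16   # prints 7B
```

Treat the result as an upper bound: a smaller model will still respond faster on the same machine.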

Running Tabby with the model

Running Tabby with a model takes a single command:

# DeepseekCoder
$ tabby serve --device metal --model TabbyML/DeepseekCoder-6.7B

# StarCoder
$ tabby serve --device metal --model TabbyML/StarCoder-1B

# CodeLlama
$ tabby serve --device metal --model TabbyML/CodeLlama-7B

On the first launch, Tabby will download the model:

🎯 Transferring
⠒ 00:16:33 ▕████████▎ ▏ 2.78 GiB/6.67 GiB 2.82 MiB/s ETA 24m.

Installing your IDE plug-in

Tabby is available for a range of IDEs, including VS Code, Neovim and IntelliJ-based IDEs such as PyCharm and Android Studio.

Search for Tabby in your plugin manager.

Note: By default, Tabby runs on localhost port 8080, and the plugin should already be configured this way.
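If the plugin does not seem to connect, you can talk to the server directly with `curl`. The sketch below assumes the default address from the note above and the `/v1/health` and `/v1/completions` endpoints of Tabby's HTTP API as I understand them; the `TABBY_URL` variable and both function names are my own, so adjust them to your setup.

```shell
# Base URL of the local Tabby server (default port from the note above).
TABBY_URL="${TABBY_URL:-http://localhost:8080}"

# Health check: asks the server for its status as JSON.
tabby_health() {
  curl -s "$TABBY_URL/v1/health"
}

# Request a completion for the beginning of a Python function.
tabby_complete() {
  curl -s -X POST "$TABBY_URL/v1/completions" \
    -H 'Content-Type: application/json' \
    -d '{"language": "python", "segments": {"prefix": "def fib(n):\n    "}}'
}
```

With `tabby serve` running in another terminal, call `tabby_health` first, then `tabby_complete`. If the first call returns nothing, the server is probably still downloading or loading the model.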

Your first code completion

Start typing some code, for instance a comment describing what you want to do followed by the name of a function, as below:

Tabby code completion running in Neovim

After a couple of seconds, you should see a code completion that you can accept with the Tab key.


Triggering Tabby

Instead of letting Tabby trigger automatically, you can invoke it manually with a keyboard shortcut. This can be configured in your IDE plugin.

Providing Tabby with your own source code

To get code suggestions that better match your project, you can provide Tabby with a list of the repositories you work on. Tabby will use these repositories as context.

# ~/.tabby/config.toml

[[repositories]]
name = "My project"
git_url = "file:///Users/sonique/Projects/MyAndroid"


Congratulations, you have learned how to install and configure Tabby, the open source coding assistant. I hope this will help you be more productive! If you enjoyed reading, please show your appreciation by clapping 👏🏻.