Sourcegraph Data Center can be configured to scale to very large codebases and large numbers of users. If search or code intelligence latency is higher than desired, tuning these parameters can yield a drastic improvement in performance.
For assistance scaling and tuning Sourcegraph, contact us. We're happy to help!
By default, your cluster has a single pod for each of `sourcegraph-frontend`, `searcher`, and `gitserver`. You can
increase the number of replicas of each of these services to handle higher scale.
We recommend setting the `sourcegraph-frontend`, `searcher`, and `gitserver` replica counts according to the following tables:
| Users | Number of `sourcegraph-frontend` replicas |
|---|---|
| 10-500 | 1 |
| 500-2000 | 2 |
| 2000-4000 | 6 |
| 4000-10000 | 18 |
| 10000+ | 28 |
You can change the replica count of `sourcegraph-frontend` by editing `base/frontend/sourcegraph-frontend.Deployment.yaml`.
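For example, a deployment in the 2,000-4,000-user tier would set the `replicas` field in that file along these lines (an illustrative excerpt, not the full manifest; all other fields stay as shipped):

```yaml
# base/frontend/sourcegraph-frontend.Deployment.yaml (excerpt)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sourcegraph-frontend
spec:
  replicas: 6  # per the table above for 2,000-4,000 users
```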
| Repositories | Number of `searcher` replicas |
|---|---|
| 1-20 | 1 |
| 20-50 | 2 |
| 50-200 | 3-5 |
| 200-1k | 5-10 |
| 1k-5k | 10-15 |
| 5k-25k | 20-40 |
| 25k+ | 40+ (contact us for scaling advice) |
| Monorepo | 1-25 (contact us for scaling advice) |
You can change the replica count of `searcher` by editing `base/searcher/searcher.Deployment.yaml`.
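As with `sourcegraph-frontend`, only the `replicas` field changes; a sketch for a hypothetical installation with ~100 repositories:

```yaml
# base/searcher/searcher.Deployment.yaml (excerpt)
spec:
  replicas: 4  # per the table above for 50-200 repositories
```

Apply the edited manifest with `kubectl apply -f base/searcher/searcher.Deployment.yaml` (assuming `kubectl` is configured to talk to your cluster).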
| Repositories | Number of `gitserver` replicas |
|---|---|
| 1-200 | 1 |
| 200-500 | 2 |
| 500-1000 | 3 |
| 1k-5k | 4-8 |
| 5k-25k | 8-20 |
| 25k+ | 20+ (contact us for scaling advice) |
| Monorepo | 1 (contact us for scaling advice) |
Read `docs/configure.md` to learn how to change the replica count of `gitserver`.
When you're using Sourcegraph with many repositories (100s-10,000s), the most important parameters to tune are:
- `sourcegraph-frontend` CPU/memory resource allocations
- `searcher` replica count
- `indexedSearch` CPU/memory resource allocations
- `gitserver` replica count
- `symbols` replica count and CPU/memory resource allocations
- `gitMaxConcurrentClones`, because `git clone` and `git fetch` operations are IO- and CPU-intensive
- `repoListUpdateInterval` (in minutes), because each interval triggers `git fetch` operations for all repositories
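The last two parameters are set in your site configuration rather than in Kubernetes manifests. A hedged sketch of what that fragment might look like (the values are illustrative, not recommendations; check your version's site configuration reference for exact defaults):

```json
{
  "gitMaxConcurrentClones": 5,
  "repoListUpdateInterval": 10
}
```

Lowering `gitMaxConcurrentClones` smooths IO/CPU load on `gitserver` at the cost of slower initial cloning; raising `repoListUpdateInterval` reduces background `git fetch` pressure at the cost of staler mirrors.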
Consult the tables above for the recommended replica counts. Note: the `gitserver` replica count is specified
differently from the replica counts for other services; read `docs/configure.md` to learn how to change it.
Notes:

- If your change requires `gitserver` pods to be restarted and they are scheduled on another node when they restart, they may go offline for 60-90 seconds (and temporarily show a `Multi-Attach` error). This delay is caused by Kubernetes detaching and reattaching the volume. Mitigation steps depend on your cloud provider; contact us for advice.
- For context on what each service does, see the Sourcegraph Architecture Overview.
When you're using Sourcegraph with a large monorepo (or several large monorepos), the most important parameters to tune are:
- `sourcegraph-frontend` CPU/memory resource allocations
- `searcher` CPU/memory resource allocations (allocate enough memory to hold all non-binary files in your repositories)
- `indexedSearch` CPU/memory resource allocations (for the `zoekt-indexserver` pod, allocate enough memory to hold all non-binary files in your largest repository; for the `zoekt-webserver` pod, allocate enough memory to hold ~2.7x the size of all non-binary files in your repositories)
- `symbols` CPU/memory resource allocations
- `gitserver` CPU/memory resource allocations (allocate enough memory to hold your Git packed bare repositories)
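These CPU/memory allocations are ordinary Kubernetes resource requests and limits on the relevant container. For instance, giving `gitserver` headroom for a hypothetical 40 GB of packed bare repositories might look like this (file path and numbers are illustrative assumptions, not shipped defaults):

```yaml
# gitserver manifest (excerpt; path and values are illustrative)
spec:
  template:
    spec:
      containers:
        - name: gitserver
          resources:
            requests:
              cpu: "4"
              memory: 40G
            limits:
              cpu: "8"
              memory: 48G
```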
Many parts of Sourcegraph's infrastructure benefit from using SSDs for caches. This is especially important for search performance. By default, disk caches use the Kubernetes `hostPath` volume type, so they run at the IO speed of the underlying node's disk. Even if the node's default disk is an SSD, however, it is likely network-mounted rather than local.
See `configure/ssd/README.md` for instructions on configuring SSDs.
For production environments, we recommend the following resource allocations for the entire Kubernetes cluster, based on the number of users in your organization:
| Users | vCPUs | Memory | Attached Storage | Root Storage |
|---|---|---|---|---|
| 10-500 | 10 | 24 GB | 500 GB | 50 GB |
| 500-2,000 | 16 | 48 GB | 500 GB | 50 GB |
| 2,000-4,000 | 32 | 72 GB | 900 GB | 50 GB |
| 4,000-10,000 | 48 | 96 GB | 900 GB | 50 GB |
| 10,000+ | 64 | 200 GB | 900 GB | 50 GB |
See "Assign resource-hungry pods to larger nodes" in `docs/configure.md`.