In our quest to implement LLM-based code autocomplete, we explored various setups and tools. The setup we found to work best is Ollama serving the deepseek-coder model, paired with the Continue editor extension.
The system operates efficiently, although it's worth noting that the Continue extension is still in preview and may exhibit occasional crashes or autocomplete errors. Additionally, substantial configuration is required for both Ollama and Continue, due to their extensive customizability.
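For reference, here is a minimal sketch of what the Continue configuration for a local Ollama instance can look like. The key names follow Continue's config.json format and the model tags follow Ollama's library naming, but both change between versions, so treat this as an illustration rather than a drop-in config:

```json
{
  "models": [
    {
      "title": "deepseek-coder chat",
      "provider": "ollama",
      "model": "deepseek-coder:6.7b-instruct"
    }
  ],
  "tabAutocompleteModel": {
    "title": "deepseek-coder autocomplete",
    "provider": "ollama",
    "model": "deepseek-coder:6.7b-base"
  }
}
```

The referenced models have to be pulled into Ollama beforehand, for example with `ollama pull deepseek-coder:6.7b-base`.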
An alternative tool, Tabby, was also considered because of its support for AMD video cards and its popularity. It ships with its own server and editor extension, supports any model in gguf format, and, at least on paper, offers repository indexing for more context-aware code completion. However, without additional configuration its autocomplete quality was unsatisfactory, even with the same deepseek-coder model. Although its setup is simpler than that of Ollama + Continue, the overall results were weaker. Further research is needed to understand its indexing and retrieval mechanism and to get better results out of it.
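For completeness, launching the Tabby server with Docker looks roughly like this; the flags and the model identifier follow Tabby's quick-start examples and may differ in newer releases, so check the current documentation:

```bash
# Start the Tabby server on port 8080 with GPU acceleration;
# the model name refers to an entry in Tabby's model registry.
docker run -it --gpus all -p 8080:8080 -v $HOME/.tabby:/data \
  tabbyml/tabby serve --model DeepseekCoder-6.7B --device cuda
```

The editor extension is then pointed at the local server endpoint (http://localhost:8080 in this example).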
Tip 1: Put the task description (prompt) at the beginning of the file or function you are working on.
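For example (an illustrative Python snippet; the leading comment acts as the prompt the completion model sees first):

```python
# Task: parse a log line such as "2024-05-01 12:00:03 ERROR disk full"
# into a (timestamp, level, message) tuple.
def parse_log_line(line: str) -> tuple[str, str, str]:
    # With the task description right above, autocomplete tends to
    # suggest a completion very close to this straight away.
    date, time, level, message = line.split(" ", 3)
    return f"{date} {time}", level, message
```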
Tip 2: There are cases where Copilot Chat struggles to generate the necessary code even after following all the prompting recommendations, regenerating the response, and explicitly specifying the context. In such situations, it helps to manually copy the necessary context and paste it as a comment right before the function being written. With that comment in place, Copilot's autocomplete understands the context better and starts suggesting relevant code.
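As an illustration (Python, with a hypothetical data schema pasted in), the context the model otherwise cannot see is copied into a comment directly above the function:

```python
# Context pasted from the project's user schema (hypothetical):
#   {"id": int, "email": str, "roles": [str]}

def has_role(user: dict, role: str) -> bool:
    # With the schema visible in the comment above, autocomplete
    # suggests the correct key lookups instead of guessing field names.
    return role in user.get("roles", [])
```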