Tree-sitter powered code completion
Tree-sitter has taken the world of programming by storm. Together with LSP, it’s probably the technology that has influenced the most programming editors and IDEs in the past several years. And now that Emacs 29+ comes with built-in Tree-sitter support, I’ve been spending a lot of quality time with it, working on clojure-ts-mode and neocaml-mode.
There’s a lot I’d like to share with you about using Tree-sitter effectively, but today I’ll focus on a different topic. When most people hear about Tree-sitter they think of font-locking (syntax highlighting) and indentation powered by the abstract syntax tree (AST) generated by a Tree-sitter grammar. For a while I’ve been thinking that the AST data can also be used for simple, yet reasonably accurate, code completion (within the context of a single code buffer, that is). That’s definitely not nearly as powerful as what you’d normally get from a dedicated tool (e.g. an LSP server), as those usually have project-wide completion capabilities, but it’s pretty sweet given that it’s trivial to implement and doesn’t require any external dependencies.
Below, you’ll find a simple proof of concept for such a completion, in the context of clojure-ts-mode:1
(defvar clojure-ts--completion-query-globals
  (treesit-query-compile 'clojure
                         `((source
                            (list_lit
                             ((sym_lit) @sym
                              (:match ,clojure-ts--variable-definition-symbol-regexp @sym))
                             :anchor [(comment) (meta_lit) (old_meta_lit)] :*
                             :anchor ((sym_lit) @var-candidate)))
                           (source
                            (list_lit
                             ((sym_lit) @sym
                              (:match ,clojure-ts--function-type-regexp @sym))
                             :anchor [(comment) (meta_lit) (old_meta_lit)] :*
                             :anchor ((sym_lit) @fn-candidate)))))
  "Compiled query capturing the names of top-level variable and function definitions.")

(defconst clojure-ts--completion-annotations
  (list 'var-candidate " Global variable"
        'fn-candidate " Function")
  "Property list mapping capture names to completion annotations.")

(defun clojure-ts--completion-annotation-function (candidate)
  "Return the annotation for completion CANDIDATE."
  (thread-last minibuffer-completion-table
               (alist-get candidate)
               (plist-get clojure-ts--completion-annotations)))

(defun clojure-ts-completion-at-point-function ()
  "Return completion data for the symbol at point, based on top-level definitions."
  (when-let* ((bounds (bounds-of-thing-at-point 'symbol))
              (source (treesit-buffer-root-node 'clojure))
              (nodes (treesit-query-capture source clojure-ts--completion-query-globals)))
    (list (car bounds)
          (cdr bounds)
          ;; Drop the `sym' captures (def, defn, ...) and keep (NAME . KIND) pairs.
          (thread-last nodes
                       (seq-filter (lambda (item) (not (equal (car item) 'sym))))
                       (seq-map (lambda (item) (cons (treesit-node-text (cdr item) t) (car item)))))
          :exclusive 'no
          :annotation-function #'clojure-ts--completion-annotation-function)))
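To make the data flow in the completion function a bit more concrete, here is a small sketch that runs without any Tree-sitter grammar installed. It assumes the captures are (CAPTURE-NAME . NODE) pairs, which is the shape treesit-query-capture returns, and it substitutes plain strings for the nodes. All the demo- names and the sample definitions are hypothetical and not part of clojure-ts-mode.

```emacs-lisp
(require 'seq)

;; Stand-in for the result of `treesit-query-capture': an alist of
;; (CAPTURE-NAME . NODE).  Plain strings replace real nodes here.
(defvar demo-captures
  '((sym . "def")  (var-candidate . "max-retries")
    (sym . "defn") (fn-candidate . "fetch-user")))

(defun demo-captures-to-candidates (captures)
  "Turn CAPTURES into an alist of (NAME . KIND), dropping `sym' captures."
  (seq-map (lambda (item) (cons (cdr item) (car item)))
           (seq-remove (lambda (item) (eq (car item) 'sym)) captures)))

(defconst demo-annotations
  '(var-candidate " Global variable" fn-candidate " Function"))

(defun demo-annotation (name candidates)
  "Return the annotation string for NAME, looked up in CANDIDATES."
  ;; The keys are strings, so this sketch compares them with `equal'.
  (plist-get demo-annotations
             (alist-get name candidates nil nil #'equal)))

(demo-captures-to-candidates demo-captures)
;; => (("max-retries" . var-candidate) ("fetch-user" . fn-candidate))
```

The resulting alist doubles as a completion table (its keys are the candidates), and the capture name stored in each cdr is what drives the annotation lookup.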
I hope you’ll agree that the code is both simple and easy to follow (especially if you know a bit about Tree-sitter queries and Emacs’s completion APIs). The meat of the example is clojure-ts--completion-query-globals; the rest is just completion scaffolding.
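The snippet above only defines the completion function, so to try it out you still have to register it. A minimal sketch, assuming you want it enabled in every clojure-ts-mode buffer, could look like this (my/clojure-ts-enable-completion is a hypothetical helper name, not something clojure-ts-mode provides):

```emacs-lisp
(defun my/clojure-ts-enable-completion ()
  "Add the Tree-sitter completion function to this buffer's CAPF list."
  ;; The last argument makes the hook change buffer-local, so other
  ;; modes keep their own `completion-at-point-functions'.
  (add-hook 'completion-at-point-functions
            #'clojure-ts-completion-at-point-function
            nil 'local))

(add-hook 'clojure-ts-mode-hook #'my/clojure-ts-enable-completion)
```

After that, M-x completion-at-point (or any completion UI built on top of completion-at-point-functions) should pick up the candidates.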
And the result looks like this:
Not too shabby for 30 lines of code, right? With a bit more effort this can be made smarter (e.g. to include local bindings as well), and potentially we could even consult all open buffers running clojure-ts-mode to fetch completion data from them as well (although that’s probably overkill).
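The idea of consulting every open clojure-ts-mode buffer could be sketched roughly as follows. This is only an illustration under a few assumptions: the demo- helpers are hypothetical, COLLECT-FN stands for whatever function produces the (NAME . KIND) alist for the current buffer (for instance, the query-and-map part of the completion function above), and no caching is attempted.

```emacs-lisp
(require 'seq)

(defun demo-buffers-in-mode (mode)
  "Return all live buffers whose major mode is MODE or derives from it."
  (seq-filter (lambda (buf)
                (with-current-buffer buf (derived-mode-p mode)))
              (buffer-list)))

(defun demo-collect-candidates (mode collect-fn)
  "Call COLLECT-FN in every MODE buffer and merge the returned alists.
Entries with a duplicate name are dropped; the first occurrence wins."
  (seq-uniq
   (seq-mapcat (lambda (buf)
                 (with-current-buffer buf (funcall collect-fn)))
               (demo-buffers-in-mode mode))
   (lambda (a b) (equal (car a) (car b)))))

;; Hypothetical usage, where `my-buffer-candidates' runs the query in the
;; current buffer and returns an alist of (NAME . KIND):
;; (demo-collect-candidates 'clojure-ts-mode #'my-buffer-candidates)
```

Running the query in every buffer on each completion request may get slow with many large buffers, which is one reason this is probably more than the single-buffer approach needs.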
Still, I think that’s an interesting use of Tree-sitter that some of you might find useful. It seems that Nic Ferrier has been playing with this idea recently as well - check out his recent video on the subject here.
In time Tree-sitter will redefine how we’re building Emacs major modes and what they can do.2 It’s still early days and the sky is the limit. Exciting times ahead!
That’s all I have for you today. Keep hacking!
P.S. I plan to write more on the topic of Tree-sitter and how to use it in Emacs major modes, but in the meantime you might find some of my development notes useful: