Creating a local code search tool
One of the most common activities in programming is searching the code base for specific strings or patterns. However, this kind of search is usually restricted to the specific project we are working on.
If you are like me, you will have a couple of projects stored locally that use the same programming language, framework, libraries, patterns, configuration, and so on. It can therefore be useful to extend the search to all local code, either to reuse or recall something we wrote before or to look things up outside of the IDE.
Command line utilities like ripgrep are great for this kind of local code search. Ripgrep is a line-oriented search tool (it searches for things inside files) with all the necessary features, such as regex search and the ability to restrict the search to specific file types. It provides nicely colored results with context (showing surrounding lines) and has some conveniences like automatically skipping files ignored by .gitignore rules.
Let’s say I need to define Django’s ROOT_URLCONF setting. I have done it in the past, and now I just want to list all the examples from the projects I have worked on before. With ripgrep, I can use the -F flag to ask for exact matches (no regex pattern) and specify the folder where all my projects are located:
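The exact command and its output are not reproduced here. As a minimal sketch, the following builds a small throwaway tree (the shop and blog project names are made up for illustration) and searches it the same way:

```shell
# Hypothetical demo tree standing in for a real projects folder
demo=$(mktemp -d)
mkdir -p "$demo/shop" "$demo/blog"
echo "ROOT_URLCONF = 'shop.urls'" > "$demo/shop/settings.py"
echo "ROOT_URLCONF = 'blog.urls'" > "$demo/blog/settings.py"

# -F treats the pattern as a literal string, not a regex
rg -F 'ROOT_URLCONF' "$demo"
```

On a real machine the last argument would be your own projects folder, for example rg -F 'ROOT_URLCONF' ~/projects.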
Notice that ripgrep is invoked with rg. I was able to find all the examples I needed across the projects checked out locally in my projects folder. The output tells me where the files are and on which lines the variable is defined. Because ripgrep prints the full path of each file in the results, it is even possible to open a file by clicking on its path when using Visual Studio Code’s terminal.
Of course, that is nice and all, but we don’t want to always specify the project directory or remember the various settings that are handy in a tool like ripgrep. To make the search convenient, let’s create a shell function, codesearch, that will house our basic configuration (in the Bash or Z shell startup script):
codesearch () {
    # Shared defaults first, then the caller's arguments, then the folders to search
    rg --max-count=3 --context=5 --stats "$@" ~/projects ~/playground
}
The reason to define such a utility as a shell function rather than just an alias is that a function can place its arguments anywhere in the command, whereas an alias can only have arguments appended at the end. This lets us define the parameters we want along with the list of folders to search and still pass additional arguments to codesearch when invoking it. In this case, "$@" means that all positional arguments passed to the function get inserted into the command right there in the middle, before the folders.
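To see where the arguments end up, here is a minimal sketch using a hypothetical helper, codesearch_preview (not part of the setup above), that prints the final argument list one item per line instead of running ripgrep:

```shell
# Hypothetical helper: same shape as codesearch, but it only prints
# the arguments that rg would receive, one per line
codesearch_preview () {
    printf '%s\n' rg --max-count=3 --context=5 --stats "$@" ~/projects ~/playground
}

# The quoted pattern contains a space on purpose
codesearch_preview -F 'ROOT_URLCONF ='
```

The -F flag and the pattern appear after --stats and before the two folders, and the pattern stays a single argument because "$@" is quoted.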
The configuration above demonstrates some useful arguments for ripgrep:
--max-count limits the number of matching lines reported per file. This keeps a single file with many matches from flooding the search results.
--context sets a context window that prints extra lines before and after each match. This is great for seeing exactly where some code is used, and possibly for copying and pasting something without opening the file itself.
--stats prints useful statistics such as the number of matches found.
Multiple project folders can be listed at the end of the command. In my case, I have one serious project folder and a playground where I can dump more temporary code, and the search is performed across both.
Now, instead of rg, we are going to invoke codesearch:
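The invocation itself is not shown in this text. Assuming the function above is loaded in the current shell, it would look roughly like this (the definition is repeated only so the snippet runs on its own):

```shell
# Repeated from above so the snippet is self-contained
codesearch () {
    rg --max-count=3 --context=5 --stats "$@" ~/projects ~/playground
}

# Literal search for ROOT_URLCONF across both folders
codesearch -F 'ROOT_URLCONF'
```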
Notice that the output is now longer, listing results from both of my folders and showing the surrounding lines as well. It is still possible to pass additional arguments like -F, or even to override the ones defined in the codesearch function, which makes the function versatile.
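Overriding works because, as far as I know, ripgrep uses the last value when the same flag is given more than once, and "$@" places our arguments after the defaults. A small sketch on a made-up sample file, using plain rg so it does not depend on the folders above:

```shell
# Hypothetical sample file with five matching lines
sample=$(mktemp)
for i in 1 2 3 4 5; do echo "match $i"; done > "$sample"

# --max-count is given twice; the later value is the one expected to apply
rg --max-count=3 --max-count=1 -F 'match' "$sample"
```

In the same way, codesearch --max-count=1 -F 'ROOT_URLCONF' should replace the default of 3 matches per file with 1.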
I hope you will find what you need,
Petr