Doit: A make tool implemented in Python
Posted on November 14, 2022 in computer-science
make is a very useful tool to automate the creation of files depending on other files. It takes into account the modification times of files to perform only the absolutely necessary actions. make was invented to assist compilation, and is considered a developer's tool. As such, it is not available on every computer. I always thought that it would be nice to have a pure Python tool emulating the main functionality of make. I was quite happy to discover that such a Python tool actually exists. It is named doit.
Here is a Makefile that builds the table of word frequencies from Alice in Wonderland:
all: alice_frequency_table.csv

alice.txt:
	curl https://www.gutenberg.org/files/11/11-0.txt --output alice.txt

alice_nopunct.txt: alice.txt
	sed 's/[[:punct:]]/ /g' $^ | tr -s "[:blank:]" > $@

alice_tokenized.txt: alice_nopunct.txt
	sed 's/[[:blank:]]/\n/g' $^ > $@

alice_lowercase.txt: alice_tokenized.txt
	gawk '$$0{print tolower($$0)}' $^ > $@

alice_frequency_table.csv: alice_lowercase.txt
	sort $^ | uniq -c | sort -nr > $@
Now, here is the dodo.py file that doit will parse to behave like make with the Makefile above:
DOIT_CONFIG = {'action_string_formatting': 'both'}


def task_gettext():
    """Download the text of _Alice in Wonderland_"""
    return {
        "targets": ["alice.txt"],
        "uptodate": [True],
        "actions": ["curl https://www.gutenberg.org/files/11/11-0.txt \
                     --output {targets}"],
    }


def task_removepunct():
    """remove punctuation"""
    return {
        "targets": ["alice_nopunct.txt"],
        "file_dep": ["alice.txt"],
        "actions": ["sed 's/[[:punct:]]/ /g' {dependencies} | \
                     tr -s '[:blank:]' > {targets}"]
    }


def task_tokenize():
    """replace whitespaces by newlines"""
    return {
        "targets": ["alice_tokenized.txt"],
        "file_dep": ["alice_nopunct.txt"],
        "actions": ["sed 's/[[:blank:]]/\\n/g' {dependencies} > {targets}"]
    }


def task_tolowercase():
    """convert text to lower case"""
    return {
        "targets": ["alice_lowercase.txt"],
        # depends on the tokenized text, as in the Makefile
        "file_dep": ["alice_tokenized.txt"],
        "actions": ["gawk '$0{{print tolower($0)}}' \
                     {dependencies} > {targets}"]
    }


def task_computefreqs():
    """tabulate the token frequencies"""
    return {
        "targets": ["alice_frequency_table.csv"],
        "file_dep": ["alice_lowercase.txt"],
        "actions": ["sort {dependencies} | uniq -c | sort -nr > {targets}"],
    }
Once you have installed doit with pip install doit, you can run it with doit. It will search for dodo.py in the current working directory and run the necessary actions (and only those) to create the missing targets. To decide whether files have been modified, and which actions must be executed to update the targets, it relies on md5 sums saved in an internal database.
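To give an idea of what checking md5 sums against a database means, here is a minimal sketch of the principle. It is an illustration only, not doit's actual implementation: the helper names (md5sum, changed_files) and the JSON file used as a "database" are my own assumptions.

```python
import hashlib
import json
from pathlib import Path


def md5sum(path):
    """Return the hexadecimal MD5 digest of a file's content."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def changed_files(paths, db_path):
    """Return the files whose content differs from the recorded digests.

    The digests are stored in a JSON file (a hypothetical format chosen
    for this sketch), which is updated on every call.
    """
    db_file = Path(db_path)
    recorded = json.loads(db_file.read_text()) if db_file.exists() else {}
    current = {str(p): md5sum(p) for p in paths}
    changed = [p for p, digest in current.items() if recorded.get(p) != digest]
    recorded.update(current)
    db_file.write_text(json.dumps(recorded))
    return changed
```

Unlike a comparison of modification times, this kind of check ignores a file that was merely touched: only a change in content marks it as modified.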
The dodo.py file is, no doubt, more verbose than the original Makefile.
Yet, while all the actions in the example above are commands executed inside a shell, doit's actions can also be Python functions. This gives you the full power of Python when you write the “Makefiles”.
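As an illustration, the last task could be rewritten with a Python function instead of the sort | uniq | sort pipeline. This is only a sketch: the function compute_freqs is my own, and it assumes that doit passes the dependencies and targets lists to a Python action whose parameters carry those names, as described in the doit documentation. The ordering of tokens with equal counts may also differ from the shell version.

```python
from collections import Counter


def compute_freqs(dependencies, targets):
    """Count the tokens (one per line) and write them by decreasing frequency."""
    with open(dependencies[0], encoding="utf-8") as infile:
        # skip empty lines, count every remaining token
        counts = Counter(line.strip() for line in infile if line.strip())
    with open(targets[0], "w", encoding="utf-8") as outfile:
        for token, count in counts.most_common():
            outfile.write(f"{count} {token}\n")


def task_computefreqs():
    """tabulate the token frequencies"""
    return {
        "targets": ["alice_frequency_table.csv"],
        "file_dep": ["alice_lowercase.txt"],
        # a Python callable instead of a shell command string
        "actions": [compute_freqs],
    }
```

With this version, the final step no longer depends on sort and uniq being installed, which is a step towards a pipeline that runs wherever Python does.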
If you want to know more, check out https://pydoit.org/