January was a slow month; I only did three uploads to Debian unstable:
- xdg-desktop-portal-wlr updated to 0.8.1-1
- swayimg updated to 4.7-1
- usbguard updated to 1.1.4+ds-2, which closed #1122733
I was very happy to see the new dfsg-new-queue and that there are now more hands processing the NEW queue. I also finally got one of the packages accepted that I uploaded after the Trixie release: wayback, which I uploaded last August. There has been another release since then, and I’ll try to upload that in the next few days.
There was a bug report for carl
asking for Windows support. carl used the xdg
crate for looking up the XDG directories, but xdg does not support
Windows systems (and it seems this will not
change).
The reporter also provided a PR to replace the dependency with the
directories crate, which is more system
agnostic. I adapted the PR a bit, merged it and released version
0.6.0 of carl.
At my day job I refactored
django-grouper.
django-grouper is a package we use to find duplicate objects in our data. Our
users often work with datasets of thousands of historical persons, places and
institutions, and in projects that run for years and ingest data from multiple
sources, entries inevitably get created several times.
I wrote the initial app in 2024, but was never really happy about the approach
I used back then. It was based on this blog
post
that describes how to group spreadsheet text cells. It uses sklearn's
TfidfVectorizer
with a custom analyzer and the library
sparse_dot_topn for creating the
matrix. All in all the module to calculate the clusters was 80 lines and with
sparse_dot_topn it pulled in a rather niche Python library. I was pretty sure
that this functionality could also be implemented with basic sklearn
functionality and it was: we are now using
DictVectorizer
because in a Django app we are working with objects that can be mapped to dicts
anyway. And for clustering the data, the app now uses the
DBSCAN
algorithm (with the Manhattan distance as metric). The module is now only half
the size and the whole app lost one dependency! I released those changes as
version
0.3.0 of the
app.
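To illustrate the idea, here is a minimal sketch of how DictVectorizer and DBSCAN can be combined to find duplicate candidates. This is not the actual django-grouper code: the function name `group_duplicates`, the field names, the sample records and the `eps` and `min_samples` values are all made up for the example. It assumes every field value is a string, so DictVectorizer one-hot encodes each field/value pair and two records that differ in exactly one field end up at a Manhattan distance of 2.

```python
from collections import defaultdict

from sklearn.cluster import DBSCAN
from sklearn.feature_extraction import DictVectorizer


def group_duplicates(records, eps=2.5, min_samples=2):
    """Return lists of record indices that DBSCAN puts into the same cluster.

    Hypothetical helper for illustration only. `records` is a list of
    dicts with string values, e.g. built from model instances.
    """
    # One-hot encode every "field=value" pair into a dense 0/1 matrix.
    vectorizer = DictVectorizer(sparse=False)
    matrix = vectorizer.fit_transform(records)

    # With eps=2.5, records that differ in at most one field
    # (Manhattan distance 0 or 2) count as neighbours.
    clustering = DBSCAN(eps=eps, min_samples=min_samples, metric="manhattan")
    labels = clustering.fit_predict(matrix)

    groups = defaultdict(list)
    for index, label in enumerate(labels):
        if label != -1:  # -1 marks noise, i.e. records without a duplicate
            groups[label].append(index)
    return list(groups.values())


# Made-up sample data; a Django app could build these dicts from a queryset.
persons = [
    {"surname": "Mozart", "forename": "Wolfgang", "birth_year": "1756"},
    {"surname": "Mozart", "forename": "Wolfgang Amadeus", "birth_year": "1756"},
    {"surname": "Haydn", "forename": "Joseph", "birth_year": "1732"},
    {"surname": "Haydn", "forename": "Joseph", "birth_year": "1732"},
    {"surname": "Beethoven", "forename": "Ludwig", "birth_year": "1770"},
]

duplicate_groups = group_duplicates(persons)
```

In this sketch the two Mozart entries and the two Haydn entries each form a group, while the Beethoven entry is treated as noise. Exact matching on whole field values is of course much cruder than what a real deduplication setup needs; the point is only to show that the two sklearn building blocks are enough to get clusters out of plain dicts.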
At the end of January I went to Brussels with friends to attend FOSDEM. We took the night train, but there were a couple of broken-down trains, so the ride took 26 hours instead of one night. It is a good thing we had a one-day buffer and FOSDEM only started on Saturday. As usual there were too many talks to visit, so I’ll have to watch some of the recordings in the next few weeks.
Some examples of talks I found interesting so far:
- a talk about supporting Python web deployments with Rust in the Rust Developer room
- a talk about duckdb in the Python Developer room
- an introduction to particleos in the Distributions Developer room