Since I'm such a fan of handmade programming, I find myself this fine eve implementing document indexing from scratch(ish) for my #GoActivityPub library storage backends.
-
I'm sure I made plenty of mistakes, but I have to admit I find it surprisingly satisfying to be able to operate on a data type that I can overlay on top of the existing #FedBOX storage engines and get native and *fast* querying for them.
The indexes are quite chunky despite being built on top of roaring bitmaps, because there are so many "indexable" elements in an #ActivityPub object. (Currently I'm indexing the type, content, summary, name, preferredUsername, recipients, actor and object.)
As I explore some more, I hope to streamline some of these issues and make the whole thing more robust.
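To make the idea concrete, here is a minimal sketch of a bitmap-backed inverted index. It is not the actual #GoActivityPub code: it uses a plain, uncompressed `[]uint64` bitset as a stand-in for roaring bitmaps, and the names `bitset`, `index`, `add` and `search`, as well as the `field=value` key format and the example.com actors, are assumptions made up for illustration.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitset is a plain, uncompressed stand-in for a roaring bitmap.
// Bit n is set when the object with numeric id n is in the set.
type bitset []uint64

// add sets the bit for id, growing the slice when needed.
func (b *bitset) add(id uint32) {
	w := int(id / 64)
	for len(*b) <= w {
		*b = append(*b, 0)
	}
	(*b)[w] |= uint64(1) << (id % 64)
}

// and returns the intersection of two bitsets.
func (b bitset) and(o bitset) bitset {
	n := len(b)
	if len(o) < n {
		n = len(o)
	}
	out := make(bitset, n)
	for i := 0; i < n; i++ {
		out[i] = b[i] & o[i]
	}
	return out
}

// ids lists the set bits in ascending order.
func (b bitset) ids() []uint32 {
	var out []uint32
	for w, word := range b {
		for word != 0 {
			out = append(out, uint32(w*64+bits.TrailingZeros64(word)))
			word &= word - 1 // clear the lowest set bit
		}
	}
	return out
}

// index maps a hypothetical "field=value" key to the ids that carry it.
type index map[string]bitset

// add records that object id has the given value for field.
func (ix index) add(id uint32, field, value string) {
	key := field + "=" + value
	b := ix[key]
	b.add(id)
	ix[key] = b
}

// search returns the ids that match every key (a logical AND).
func (ix index) search(keys ...string) []uint32 {
	if len(keys) == 0 {
		return nil
	}
	res := ix[keys[0]]
	for _, k := range keys[1:] {
		res = res.and(ix[k])
	}
	return res.ids()
}

func main() {
	ix := index{}
	ix.add(1, "type", "Note")
	ix.add(1, "actor", "https://example.com/~alice")
	ix.add(2, "type", "Note")
	ix.add(2, "actor", "https://example.com/~bob")
	ix.add(70, "type", "Article")
	ix.add(70, "actor", "https://example.com/~alice")

	fmt.Println(ix.search("type=Note"))                                     // [1 2]
	fmt.Println(ix.search("type=Note", "actor=https://example.com/~alice")) // [1]
}
```

In this sketch every distinct field value gets its own bitset, so the number of bitsets grows with the number of distinct values seen, which may hint at why an index over many fields ends up chunky even with a compressed representation.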
-
By *native* I mean that I can have my own little API for searching:
-
Frantic day today: around 10h of productive work on improving the Index and moving it into the go-ap/filters module.
+1510/-11 lines, of which 987 belong to tests.
Coverage is not entirely sufficient yet, because it's missing the checks for the top-level Index.Add() and Index.Search() methods.
Another thing left to do is persistence to disk.
The **reason** why I wanted to move the work I did yesterday to this module is that, instead of the custom client.SearchByX() functions, I wanted to retrofit the existing functionality already present in the filters module. Ah, also moving the bitmaps themselves to a semblance of generic types...
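As a rough illustration of why composable filters can be nicer than one SearchByX() function per field, here is a small sketch. Nothing in it is the go-ap/filters API: `set`, `index`, `filter`, `has`, `all` and `anyOf` are hypothetical names, the `field=value` keys are made up, and a generic map-based set stands in for the bitmaps.

```go
package main

import "fmt"

// set is a toy stand-in for a bitmap, generic over the id type.
type set[T comparable] map[T]struct{}

// index maps a hypothetical "field=value" key to the ids that carry it.
type index[T comparable] map[string]set[T]

// add records id under key.
func (ix index[T]) add(id T, key string) {
	if ix[key] == nil {
		ix[key] = set[T]{}
	}
	ix[key][id] = struct{}{}
}

// filter is a hypothetical composable check that is resolved
// against an index instead of against each object in turn.
type filter[T comparable] func(ix index[T]) set[T]

// has matches the ids stored under key.
func has[T comparable](key string) filter[T] {
	return func(ix index[T]) set[T] { return ix[key] }
}

// all matches the ids accepted by every filter (intersection).
func all[T comparable](fs ...filter[T]) filter[T] {
	return func(ix index[T]) set[T] {
		out := set[T]{}
		if len(fs) == 0 {
			return out
		}
		// Copy the first result so the index itself is never mutated.
		for id := range fs[0](ix) {
			out[id] = struct{}{}
		}
		for _, f := range fs[1:] {
			s := f(ix)
			for id := range out {
				if _, ok := s[id]; !ok {
					delete(out, id)
				}
			}
		}
		return out
	}
}

// anyOf matches the ids accepted by at least one filter (union).
func anyOf[T comparable](fs ...filter[T]) filter[T] {
	return func(ix index[T]) set[T] {
		out := set[T]{}
		for _, f := range fs {
			for id := range f(ix) {
				out[id] = struct{}{}
			}
		}
		return out
	}
}

func main() {
	ix := index[uint32]{}
	ix.add(1, "type=Note")
	ix.add(1, "actor=alice")
	ix.add(2, "type=Article")
	ix.add(2, "actor=alice")
	ix.add(3, "type=Note")
	ix.add(3, "actor=bob")

	// (Note OR Article) AND actor alice
	q := all(
		anyOf(has[uint32]("type=Note"), has[uint32]("type=Article")),
		has[uint32]("actor=alice"),
	)
	fmt.Println(len(q(ix))) // 2
}
```

The point of the design is that one small set of combinators covers any mix of fields, where a SearchByX()-style API would need a new function for each combination.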
-
The new API would not be terribly different.
-
The full (working) example can be found here: https://pkg.go.dev/github.com/go-ap/filters#example-AggregateFilters