LB: UUIDv7 is time ordered, sortable, and has good key locality. Just use those, and you won't even miss your autoincrement keys
-
@jenniferplusplus omg, it's literally "v4 but with all the stuff you wish v4 had."
-
@joshuaelliott you do give up like 40 bits of entropy. But that still leaves you with more than 60 bits of entropy, and it turns out that's plenty for virtually every scenario.
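For anyone who wants the arithmetic behind that, here is a rough sketch. It assumes the RFC 9562 field sizes (48-bit millisecond timestamp, 4 version bits, 2 variant bits) and that every remaining bit is filled randomly; implementations may spend some of those bits on a counter instead. The helper name `collision_probability` is made up for illustration, and it uses the usual birthday approximation.

```python
import math

TOTAL_BITS = 128
VERSION_BITS = 4        # fixed in both v4 and v7
VARIANT_BITS = 2        # fixed in both v4 and v7
V7_TIMESTAMP_BITS = 48  # Unix time in milliseconds (v7 only)

v4_random_bits = TOTAL_BITS - VERSION_BITS - VARIANT_BITS  # random bits in v4
v7_random_bits = v4_random_bits - V7_TIMESTAMP_BITS        # upper bound for v7


def collision_probability(n: int, bits: int) -> float:
    """Birthday approximation: chance that n random values of `bits` bits collide."""
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))


# Hypothetical load: 1000 IDs minted inside the same millisecond.
p_same_ms = collision_probability(1000, v7_random_bits)
print(v4_random_bits, v7_random_bits, p_same_ms)
```

Under those assumptions v7 gives up 48 bits and keeps up to 74, which is in the same ballpark as the numbers above. The collision risk only applies among IDs minted in the same millisecond.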
-
smallcircles (Humanity Now) replied to Jenniferplusplus
@jenniferplusplus @joshuaelliott
I got confused by all the variations, but there was a recent article giving a good summary, which was then discussed on HN: https://news.ycombinator.com/item?id=41350225
I got to v7 as well from that list.
-
Benjamin Sonntag-King replied to Jenniferplusplus
@jenniferplusplus is it though?
I read (don't remember where) that v7 being ordered makes it (a bit) harder for B-Trees in DBMS tables...
-
Jenniferplusplus replied to smallcircles (Humanity Now)
@smallcircles @joshuaelliott
Basically, v4 is enormous entropy, completely unordered, and collisions statistically cannot happen within the lifetime of the universe. Use when you have an absolutely enormous number of records (trillions+) in the same namespace.
v7 is high entropy and time ordered, and collisions are mathematically impossible between IDs generated in different milliseconds. Use when records close together in time are likely to be queried together, or if you need keys that have a stable meaningful ordering.
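To make the "time ordered" part concrete, here is a minimal sketch of how a v7 value could be assembled by hand. It assumes the RFC 9562 layout (48-bit millisecond timestamp, version nibble, 12 random bits, variant bits, 62 random bits); `uuid7_sketch` is a hypothetical helper, not a library function, and it skips the optional monotonic counter that real generators often add.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUIDv7-shaped value; illustrative only, not a vetted generator."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = (rand >> 62) & 0xFFF                 # 12 bits after the version
    rand_b = rand & ((1 << 62) - 1)               # 62 bits after the variant

    value = (unix_ms & ((1 << 48) - 1)) << 80     # timestamp in the top 48 bits
    value |= 0x7 << 76                            # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                           # RFC variant
    value |= rand_b
    return uuid.UUID(int=value)


earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier, later, earlier < later)
```

Because the timestamp sits in the most significant bits, any two IDs from different milliseconds sort by creation time and cannot be equal. Only IDs from the same millisecond rely on the random bits.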
-
Jenniferplusplus replied to Jenniferplusplus
@smallcircles @joshuaelliott the rest are mostly not very useful
-
smallcircles (Humanity Now) replied to Jenniferplusplus
@jenniferplusplus @joshuaelliott
useful advice, thanks!
-
Jenniferplusplus replied to Benjamin Sonntag-King
@vincib being closer in value means they should cluster into fewer, larger buckets, so you do fewer tree traversals and you can get them in sequential memory pages more often. I'm not sure how being ordered affects things, but my understanding is that the access characteristics due to the narrower distribution make a big difference.
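A toy way to see the locality effect, with loud caveats: this is not a real B-tree. It just sorts the keys, chops them into fixed-size "pages", and counts how many pages the 100 most recently inserted keys land on. The page size, key counts, and the `pages_touched` helper are all assumptions picked for illustration.

```python
import random


def pages_touched(keys, recent, page_size=100):
    """Count distinct pages holding `recent` once `keys` are stored in sorted order."""
    ordered = sorted(keys)
    page_of = {key: index // page_size for index, key in enumerate(ordered)}
    return len({page_of[key] for key in recent})


rng = random.Random(0)
n, batch = 10_000, 100

# v4-like keys: 122 random bits, no relation to insertion order.
random_keys = [rng.getrandbits(122) for _ in range(n)]
# v7-like keys: an increasing millisecond tick on top, 74 random bits below.
ordered_keys = [(ms << 74) | rng.getrandbits(74) for ms in range(n)]

random_pages = pages_touched(random_keys, random_keys[-batch:])
ordered_pages = pages_touched(ordered_keys, ordered_keys[-batch:])
print(random_pages, ordered_pages)
```

In this model the newest time-ordered keys all sit together at the end of the key space, while the random ones are scattered across most of it. A real index adds page splits, caching, and fill factors on top, so treat this as intuition only.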
-
Jesse Cooke replied to Jenniferplusplus
@jenniferplusplus this is where I was landing and then I saw https://github.com/paralleldrive/cuid2 a few days ago. I'd be curious to hear your thoughts.
-
Jenniferplusplus replied to Jesse Cooke
@jc00ke I've never seen it before. It seems like it's solving problems I don't have, and it's not clear to me why anyone would have these problems.
-
Jesse Cooke replied to Jenniferplusplus
@jenniferplusplus good point. The biggest plus for me in switching to a sortable ID is what I've previously read about db indexes, but you're right, even with UUIDv4 I've not had a real problem with index size.
-
Hrefna (DHC) replied to Jenniferplusplus
It really shines when you are looking at *sets* of data and are likely to be drawing an entire set of data at once (e.g., a timeline).
The negative side is when you are dealing with a potential for hotspotting, say because you have several orders of magnitude difference in number of entries between one timestamp and another.
(We deal with this kind of key design constraint all of the time when working in Spanner or Bigtable or equivalent datastores; it's about tradeoffs)
-
Basically: if you want, and will mostly use, random access, it isn't the correct solution, but then a B-tree isn't correct either (a hash index is your best bet). If you think it is likely that you are going to be drawing closely temporally related things at once, then there's some nuance around the _grain_, but in general you want an ordered key (possibly not a millisecond-resolution ordered key, but an ordered key nonetheless).
If you want a coarser grain, that's easy
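One way a coarser grain could look, as a hedged sketch: floor the timestamp to a bucket before putting it in the key, so everything inside the bucket is ordered only by its random bits. The `coarse_uuid7` helper and the one-minute default are hypothetical, and the result is v7-shaped rather than something RFC 9562 itself describes.

```python
import os
import uuid


def coarse_uuid7(unix_ms, grain_ms=60_000):
    """UUIDv7-shaped key whose timestamp is floored to `grain_ms` (illustrative)."""
    bucket_ms = unix_ms - (unix_ms % grain_ms)    # round down to the bucket start
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used

    value = (bucket_ms & ((1 << 48) - 1)) << 80   # coarse timestamp, top 48 bits
    value |= 0x7 << 76                            # version 7
    value |= ((rand >> 62) & 0xFFF) << 64         # 12 random bits
    value |= 0b10 << 62                           # RFC variant
    value |= rand & ((1 << 62) - 1)               # 62 random bits
    return uuid.UUID(int=value)


a = coarse_uuid7(120_500)   # falls in the bucket starting at 120_000
b = coarse_uuid7(179_999)   # same one-minute bucket as a
c = coarse_uuid7(180_000)   # next bucket
print(a, b, c)
```

Keys from the same bucket share a prefix and spread randomly within it, which trades exact creation order for a wider write target per time slice. Whether that helps with hotspotting depends on the datastore.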