Advertisement
Advertisement
⚡ Community Insights
Discussion Sentiment
100% Positive
Analyzed from 964 words in the discussion.
Trending Topics
#more#date#price#period#data#ranges#rate#row#postgres#order
Discussion Sentiment
Analyzed from 964 words in the discussion.
Trending Topics
Discussion (20 Comments)Read Original on HackerNews
Let’s say for 2026 you have it at 7.25% and you entered that into the system ahead of time (say December 2025). Today, June 12 you learn that it should have been 7.35%. It would be incorrect to say that the rate changed today: it was 7.35% since January 1. But you also don’t want to lose the fact that all your invoices have been generated using the wrong rate because if you go to recalculate them you will get a different answer.
In this case what you do is create version 2 of the rate in your database with the same time period but the correct rate. This would allow your other database objects to reference either version 1 or 2 and to even recalculate all the objects that reference version 1 to now reference version 2 such that you can get line item corrections and figure out what to do about them.
It is cumbersome to use but for the specific use case of modeling real world laws that are not available as machine-readable info it is the best option I came up with.
I think the end date should be nullable though, but valid_to is NOT NULL in the starting example... later in the article, when showing the "new way" using date ranges, it inserts a row with an open-ended range, which is more what I'd expect.
It never came to pass when we used Oracle, maybe now with Postgres I will finally have a chance at it.
This question has been answerable in Dolt for years now.
Say you have one time series with CPU-core task switches:
T=1 task=A T=3, task=B, T=5 task=A, ...
... and another of CPU frequency changes ...
T=2 freq_hz=800, T=5 freq_hz=1200, T=6 freq_hz=900
How, in SQL, do you express the question "How many CPU cycles did each task use?"? Try to do it with more complex examples. You'll tear your hair out.
Having worked on this sort of data analysis quite a lot, I'm strongly of the opinion that SQL needs syntax, not just table-valued functions, for expressing questions about timelines.
And really, ranges are an amazing substrate for this. I've had to do this by hand in a ... less featuresome ... SQL-speaking DB and it was clunky and performed fairly unimpressively.
See https://dancol.org/dctv/index.xml
I still think there's a lot of value in first-class syntax for time queries. Splitting ranges like Postgres 19 does is a good first step, but there's also a lot of power in broadcasting over these ranges, combining them in various ways, and storing multiple, independent ranges in a single table.
Ignore the bit about active development: these days, it'd take more sense to add the operators I describe to Postgres and DuckDB than to make a numpy-based engine just to host the analysis. This work predates DuckDB, and it's reassuring that DuckDB (and now Postgres) are thinking along similar lines.
I'm also glad that in the intervening years "data lake"-style analysis has become more prominent. My ideal data processing pipeline consists of sourcing from raw data and pipelining views all the way to human-meaningful outputs. Materialization, if it occurs, is just an optimization.