DE version is available. Content is displayed in original English for accuracy.
Advertisement
Advertisement
⚡ Community Insights
Discussion Sentiment
75% Positive
Analyzed from 4417 words in the discussion.
Trending Topics
#sql#postgres#data#server#database#https#oracle#table#views#postgresql

Discussion (104 Comments)Read Original on HackerNews
These days, I do myself a favor and always avoid Oracle and MySql/MariaDB.
Postgres is amazing, and the two big things I wished it had:
1. lightweight connection; connection bouncers improve the situation, but you still have an unreasonably high memory footprint per concurrent connection.
2. Synchronously updated materialized views (Sql Server calls them indexed views). These are incredible tools in complex data situations. I saw a project struggle with complex technical implementations that would be elegant, trivial and always correct with indexed views.
Sql Server can be costly, but in many cases the benefits it provides are totally worth the cost.
Choosing the data store carefully prevents lots of future trouble.
If I am going to use a "free" provider, SQLite is impossible to beat. They cover a majority of use cases today. SQLite starts to fall apart with backup, replication and tooling. If I am on the hook for things like system availability and disaster recovery, I don't have a problem spending money to cover my ass.
If I am going to pay any amount of money at all, I am going all the way. The developer experience around MSSQL is untouchable. SSMS and VS with sql projects runs circles around contemporary entity framework crap. Sprinkle in 3rd party tools from vendors like RedGate and you can replace multi-million dollar consulting packages.
I wouldn't ever advocate for standing up a new Oracle or DB2 machine, but if one was already in place I'd probably die on the hill of not trying to refactor it away. These databases typically come with multi-volume ghost stories attached. Reinventing all those weird effects on a new engine will typically kill the business if there are no other options available.
This may be the case for MS-centric, application & human developers, but I'm not convinced moving forward. Microsoft's BI story is pretty thin and out of date. Postgres has some solid columnar support/functions (which probably why Snowflake is writing about it) which means you can potentially use it for both you transactional and analytical workflows. As more development shifts to agentic workflows I'd bet Postgres shines when the overall ecosystem is more important than the human tools that were essential for the past 20 years. I loved Redgate's value-add but I don't think agents care about the UI which was the big win. SQL Server will continue to live in the enterprise andf where MS can sell lucrative support contracts or build for their clients, but I'm not seeing any net-new projects where the builders have any choice to not use SQL Server.
Try Postgresql. I was previously SQL Server and the move has been great.
SSMS doesn't offer much over alternatives, both PGAdmin and others. VS is dying, VSCode is the future.
Redgate doesn't offer anything that is essential or not available elsewhere for postgresql
The money you spend on Microsoft licenses could be put towards more ram on the server.
Whether the popularity of the MS ecosystem is good or not is a separate problem. If we are solving for "make the business go well and I get paid more", the strategy is usually obvious. We can still advocate for OSS and not-so-many-eggs in the Microsoft basket while we get paid for using these technologies. Again, SQLite is the preferred engine in my tool belt. I don't want to have to manage a hosted sql machine. But, sometimes the problem absolutely insists upon it.
Also, if you are using C#/.NET, integrating with MSSQL is always a little bit easier than the other providers. SQLite (and some others) have lackluster types for things like time. MSSQL has DateTime2 and DateTimeOffset that map exactly into the CLR types.
Oh yes, I'd love them too (if you're referring to, in Oracle slang, "...update on commit") - and it would be cool to have as well the option for a lazy update ("on demand" by taking into consideration only the records that have been changed since the last refresh, to handle multiple updates in a single pass - not sure how Oracle can achieve that technically...). This would be in my opinion a fantastic added functionality compared to basically all other (OLTP?) opensource DBs.
And: I'm really curious about the "OrioleDB" project... ( https://github.com/orioledb/orioledb/releases ) as a few years ago I was struggling a lot with "vacuum" of a kind-of-temporary table that had quite high amounts of continuous random inserts & deletes (problem solved by accumulating more changes in RAM before flushing them to the table therefore increasing amount of rows changed per "page", but I had to sweat a lot to find a good balance...).
Just to give an example, I ingest otel trace spans individually and in a materialized view calculate the total duration of the whole trace among other things.
The oracle database is also actually pretty nice, according to the people I've talked to who use it.
But mostly they play in the 'nobody ever got fired for suggesting' club. These days AWS is the most egregious landlord of that club, but whatever.
I believe you can do this with the pg-trickle extension.
https://github.com/trickle-labs/pg-trickle
Usually there will be one or two tables that grow at a dramatically faster rate than everything else and I have always found that those belong in a separate data store.
Pg connections are definitely heavy, but usually on resources other than memory in my experience. If you configure reasonable dirty reclamation and recycling, the memory numbers are often overstated due to Linux tools’ deceptive fork accounting and shared buffers. Ofc, if you’re averaging lots of heavy queries per connection it’ll be truly heavy, but many times the numbers overstate the impact.
In my eyes they're similar to triggers, which incur a high performance overhead in OLTP systems and are shunned by developers. In OLAP systems custom ETL code will likely outperform them.
The sweet spot is if you have a read-mostly database and use SNAPSHOT transaction isolation for the readers (which is SQL Server's implementation of MVCC). That way, writers may still block writers, but writers can never block readers, even when indexed views are being maintained.
Another neat trick is to "abuse" indexed views as multi-table CHECKs. The idea is to make a JOIN that would produce duplicated rows (and fail the indexed view's key) if some multi-table condition is not met.
Indexed views are not much worse than indexes. Of course, when they refer to other tables there are underlying data lookups, but in our experience when we moved from triggers to indexed views, large scale data ingestion went way faster.
Where we used it: While revamping a large scale sales program, we stored the warehouse in/out in one table, and several things like current stock were calculated using indexed views.
Bonus: Using Snapshot concurrency control, you can do many things concurrently, and only when they both updates to a certain product in the same store you'll get the second transaction failing (which could be retried on the backend).
The fact that they are completely in-sync with your data is amazing.
What you're describing is amazing, and I wish I had it available to us. We've hand rolled far too many triggers to achieve the same thing, with all the expected problems you'd assume. I'm sure it could be abused/misused, but a batteries-included approach like that would be huge.
Stale MV is a thing you only ever burn your fingers on once. Like how "It's not DNS" is a common meme in networking.
1. "materialize" the view as a full table, then index that. Any reasonable pipeline/ETL tool can provide incremental updates between tables. Obviously, anything materialized requires considerations around storage, replication, backup/restore, I/O, etc.
2. use a regular VIEW and index (precisely) the underlying expressions mentioned in the view, i.e. so when the view is used, then the indexes get used.
Both require rewriting SQL, though I've used VIEWs to make the change transparent.
Windows Server is a real pain to operate and the SQL Server ecosystem expects you to run a lot of add-ons on the server alongside your database. Those don’t translate to managed database services, so you lose a lot of functionality if you jump to RDS or similar.
The first party tools are also aging poorly. SSIS and SSRS are not fun. SSMS is ok for what it is but can’t compete with the ecosystem around PostgreSQL.
Maybe I’m missing something but I can’t wait to ditch it.
https://learn.microsoft.com/en-us/sql/linux/sql-server-linux...
So what's wrong with MySQL or MariaDB?
Oracle treats empty strings as being NULL.
Anyone who's never used Oracle before in their life is probably wondering if I'm making it up. I'm not. In Oracle, inserting '' in a VARCHAR column is exactly the same as inserting NULL. And if there's a NOT NULL column, you're not allowed to store the empty string in there.
Which means that in Oracle, you do not have any way of distinguishing "I don't know the person's middle name" vs. "I know what the person's middle name is: he/she doesn't have one".
There are apparently historical reasons for this, but I don't care. The empty string is NOT the same as NULL, and any software that treats them as the same IS BUGGY!
Sorry. Had to get that off my chest. I know I'm answering a question you didn't ask, but that has been bothering me for nearly 25 years (I first learned about this misfeature of Oracle's in 2002 or 2003), and I just had to vent to somebody who would understand.
Very interesting (and hopefully cathartic). I never got past "it involves doing business with Oracle".
That is ridiculous. Do not they know the difference between "" and NULL?
2. No MERGE statement
3. No partial indices
4. Many ways to lock out instant add table, meaning you can’t add a column without a full table write, which can lock the table for minutes at a time in even moderately sized tables.
5. Dealing with legacy mysql databases often means dealing with utf8mb3, which used to be the default utf8 data type despite not storing all utf8
6. Dealing with all but the most recent mysql databases means dealing with non strict mode which means your NOT NULL column won’t require a value.
Note: the below taken nearly verbatim from https://sql-info.de/mysql/referential-integrity.html#3_5
Now that we've created both tables, let's insert a record into table one: And now let's insert a record with a different primary key into table two: MariaDB will give you an error at this point (ERROR 1264 (22003): Out of range value for column 'id' at row 1), but MySQL (at least back when I tried this about ten years ago, which was the last time I was forced to work with MySQL — and I am so glad I never have to go back!) would return no error message and just say "Query OK, 1 row affected (0.009 sec)".Now let's select the value we inserted into table "two":
And what do we see? The value 127, even though we inserted 128. Which has created a foreign-key relationship to table "one" that we never intended to put in there.There are other reasons why MySQL was inadequate, but I no longer remember them. Probably MariaDB has fixed them by now. But I no longer have to use MySQL/MariaDB for anything, and I never want to go back. I have a VERY strong averse reaction, caused by past pain, when I think of using MariaDB. (I actually spun up a virtual machine to test what I wrote here, because there's no way I was going to install MariaDB on my primary work machine).
My list:
No `explain (analyze,buffers)`. Instant DDL has some warts (e.g. fk, metadata locks). Query planning bugs (actually... query planning in general is disappointing). Exiting the repl doesn't stop queries. Implicit type casting. Replication lag from large DDL (e.g. creating an index). Lack of two phase DDL (creating constraints NOT VALID and then VALIDATE later). Lack of extensions (e.g. pg_vector). No safe access to inspect buffer cache. AWS Aurora seems to only add shiny new things to Postgres. And more.
Again, none of this is quite enough to migrate off of it for an established system, but certainly enough to avoid it on a new project.
When I started I thought there's too much inherent overhead in using Postgres tables for storage and using the Postgres executor, so figured it would be pretty cool to match Timescale in performance. I didn't think it would be possible to get close to dedicated analytical DBs. But as the project progressed and the performance got better and better, I'm now firmly in the camp of doing analytics with Postgres + an extension.
It’s like saying that you’re getting worried Apple doesn’t sell washing machines.
Columnar and all the other fun stuff (JSON, GIS, inverted indexes, embedding vectors) is a natural progression of that thinking. With TimescaleDB, Hydra, Citus, pg_mooncake, etc. becoming very popular the last few years, there is a clear demand for an integrated experience.
(Stonebraker also thought one database shouldn't do everything, as described in his early 2000s "One Size Does Not Fit All" paper, and Stonebraker branched out into HStore/Vertica for columnar. In hindsight, I think that was appropriate for the time, but no longer a significant concern.)
I don't want a columnar database, I have a few tables which would be excellent fits for columnar storage. And the relational model doesn't say anything about the way data is stored.
Andy Pavlo teaches about this and has done real work in this space, and certainly considers it to be non-trivial. His YouTube channel is spicy and worth looking into if you want to dive deep on the topic.
You can either 1) accept the limits of the tool, at least until it becomes a priority for the tool developer to add whatever feature you want, 2) pay someone to develop the features you want, or 3) buck up and do what vast numbers of ops folks have done before you and move to something that does what you need.
No real point here other than an observation about how the installed base’s needs change, across industries.
By the same logic, you could say Microsoft Access should have all the capabilities of Postgres because it's painful for small businesses to move off of it when it's no longer a good fit for their needs.
There are other relational databases that have both kinds of storage engines and some use both on the same table (row based insert with column based migration and secondary column store indexes: https://learn.microsoft.com/en-us/sql/relational-databases/i...
Just like you can have b-tree based table storage vs heap in the case of index organized tables / clustered indexes (which pg doesn't have) you can choose column based instead the logical data model is still the same relational model.
I think the better way is to use Postgresql for new data and routinely archive off older data to data warehouse type database, to keep the Postgres one small.
(Many companies also now use a RDBMS alongside either a KV database or document store in main app)
The first was using SQL Server alongside CouchDB, an older company. CRUD Data into SQL, documents into Couch. (used separately, document Url in Couch stored in SQL and brought in by front end)
The second, startup with funding, was using Azure SQL Server (and complaining of the cost) and using Azure Cosmo (Key Value NoSQL) alongside it. Same relationship, no link between the two, it's the front end that drew data from both.
Just look up the pro and cons of each database, some are insert only, some are distributed, some are faster read then write etc. LLM will give you a quick comparison.
Also: https://db-engines.com/en/ranking
There was a big fanfare about orioledb a while ago, and i think it got bought by people that wanted to push that into mainstream postgres?
Did it die somewhere along the road?
https://supabase.com/blog/orioledb-launch
And they continue to work on it.
https://supabase.com/blog/orioledb-patent-free
confirming that we're still working hard on this. aiming to be production-ready this year. benchmarks and compatibility testing are on track
https://news.ycombinator.com/item?id=48413655
(LLM writing rant below)
---
> That alone tells you something: Users had a real need, and the ecosystem filled the gap.
> This sounds straightforward, but it solves a real operational problem.
> None of these change the world. All of them make day-to-day data workflows better.
> The easy thing to do here is list planner changes and call it done. But the more useful takeaway is this: Postgres keeps getting better at recognizing the shape of common queries and doing less unnecessary work.
> [Proceed to list planner changes]
If Orwell were alive today, he might declare himself illiterate in English and learn Klingon just to avoid having to read these.
SELECT customer_name FROM GRAPH_TABLE (myshop MATCH (c IS customers)-[IS customer_orders]->(o IS orders WHERE o.ordered_when = current_date) COLUMNS (c.name AS customer_name));
That is _awful_ syntax; it is reminiscent of neo4j, which is surely not a tool anyone serious should copy from outright in 2026.
And of course the final thing I am left wondering is if it's fast. Row-level security is such a useful feature and yet only a fool would contemplate building anything serious with Postgres', as the planner goes haywire and does per-row-matching, nuking performance.
It's SQL/PGQ, which was derived from the Cypher language for Neo4J and now is part of the SQL standard.
Er… yes? It’s called Row-level security; how else are you going to validate that a row passes a policy?
> Several of these features were first introduced by DuckDB, while some are inspired by other systems. Many of the features originally introduced by DuckDB (e.g., GROUP BY ALL) have been since adapted by other systems.
https://duckdb.org/docs/lts/sql/dialect/friendly_sql
When Claude writes things like "as someone who has spent a lot of time doing X", I think this is also a kind of failure of alignment. LLMs shouldn't write as if they had personal experience. It's something a person might say in the training data, but I just think LLMs shouldn't claim life experience they don't have, even if that's a statistically likely sequence of tokens.
Something like rocksdb as PG backend would be fantastic. Yugabyte does this but it's not PG.
https://vldb.org/cidrdb/2026/a-multi-tenant-relational-oltp-...
Context: https://www.orioledb.com/docs#:~:text=OrioleDB%20currently%2...