
Agree. I've been hoping for a PostgreSQL extension with built-in sharding (think Netezza or Teradata). I know this is ambiguous, so a low-effort definition is in order: a tightly bound cluster (nodes are aware of each other and share data to fulfill queries) where you specify the distribution for a table but there is no explicit rebalancing command. Admins can add nodes and the user is none the wiser (except for improved performance, of course). Cross-node joins work (reasonably) well. I've been watching Citus for a while but, unless I'm misunderstanding, the sharding is a bit more explicit and sometimes manual.


(Ozgun from Citus / Microsoft)

Hi there, thanks for mentioning Citus. Could you share a bit more about the user experience you're looking for with sharding?

With Citus, you create your Postgres table as-is. If you'd like for the table to be distributed, yes, you'd need to pick a distribution key. You'd do this by calling: SELECT create_distributed_table('postgres-table-name', 'distribution-column');
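
To make that concrete, here is a minimal sketch. The events table and its columns are made up for illustration, and it assumes the Citus extension is already installed on the cluster; only create_distributed_table is taken from the call above.

```sql
-- Hypothetical table: the name and columns are illustrative only.
-- It is created as an ordinary Postgres table first.
CREATE TABLE events (
    tenant_id  bigint NOT NULL,
    event_id   bigint NOT NULL,
    payload    jsonb,
    -- The key includes the distribution column (tenant_id).
    PRIMARY KEY (tenant_id, event_id)
);

-- Then distribute it, using tenant_id as the distribution key.
SELECT create_distributed_table('events', 'tenant_id');
```

After that call, the table is still queried with regular SQL; the distribution column just decides where each row is placed.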

We also thought about picking a distribution key on behalf of the user. This, however, has performance implications, particularly as you add more nodes to the cluster.



