
I've used a SQL database as a document store, using a blob column to store a serialized data structure. It was used to store heterogeneous documents of various types across various installations. I did it for all the reasons that have been given for schemaless data storage in NoSQL solutions. For each new document type, there was no need to adjust any database schemas -- whatever properties you have are just stored.

However, unless I had a really large number of documents, I wouldn't do this again. With a schemaless solution, you're just moving the problem of schema changes into the code, and you have to support forever whatever structure your data had at any point in time. Right now, I have conditional branches in code to support document structures that haven't been relevant in years. In SQL, you can alter the table and write a quick query to update your data and you're done. Whatever mistakes you made in your design are history.
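
To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table `people`, its columns, and the three document shapes are hypothetical, invented only for illustration; they are not from the system described above.

```python
import json
import sqlite3


def read_name_schemaless(blob: str) -> str:
    """Read a name from a serialized document.

    Every shape the document ever had needs its own branch,
    and none of the branches can safely be deleted.
    """
    doc = json.loads(blob)
    if "full_name" in doc:                 # current (hypothetical) shape
        return doc["full_name"]
    if "first" in doc and "last" in doc:   # older shape
        return f"{doc['first']} {doc['last']}"
    return doc["name"]                     # oldest shape


def migrate(conn: sqlite3.Connection) -> None:
    """With a schema: alter the table, update the rows once, done."""
    conn.execute("ALTER TABLE people ADD COLUMN full_name TEXT")
    conn.execute("UPDATE people SET full_name = first || ' ' || last")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, first TEXT, last TEXT)"
)
conn.execute("INSERT INTO people (first, last) VALUES ('Ada', 'Lovelace')")
migrate(conn)

# After the migration, readers only need to know the current column.
rows = conn.execute("SELECT full_name FROM people").fetchall()
```

After `migrate` runs, code that reads `people` only has to know about `full_name`, while `read_name_schemaless` has to keep all three branches for as long as any old document exists.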

Proponents of schemaless storage claim that it's great for development but I disagree. I change my design constantly -- add tables, remove tables, split columns, you name it. I don't hinder myself because my data is organized into typed columns. I alter the data as needed to fit the new structure. As a benefit, I never have to support my previous mistakes.



Would you say that the schemaless storage debate is then similar to static/dynamic typing, with the same kind of tradeoffs?


Sort of, but worse. If your database has no schema, any mistakes you make can accumulate subtle damage to the integrity of your data that you don't have any way of going back and fixing. I saw this happen in a Notes shop. Over time, some of the documents had been updated by many different versions of the code, until they were in such bizarre and unintended states that not even the developers could say what the correct app behavior should be anymore.

If you are not migrating your data to your current schema, it will decay to garbage. If you are, you already have the old and new schema in your mind, so why not write them down and get some help from the tools?

The other way out is to keep the authoritative version of your data in a store with a schema, and maintain a summary of it in your low-latency store that you can regenerate at need.
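
A rough sketch of that arrangement, assuming SQLite as the authoritative store and a plain dict standing in for the low-latency store. The `orders` table, its constraints, and `rebuild_summary` are hypothetical names chosen for the example.

```python
import sqlite3


def rebuild_summary(conn: sqlite3.Connection) -> dict:
    """Regenerate the derived summary from the authoritative tables.

    The summary is disposable: if it is lost or looks wrong,
    throw it away and call this again.
    """
    query = "SELECT customer, SUM(amount_cents) FROM orders GROUP BY customer"
    return {customer: total for customer, total in conn.execute(query)}


conn = sqlite3.connect(":memory:")
# The schema, not the application code, guards the authoritative data.
conn.execute(
    """
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL,
        amount_cents INTEGER NOT NULL CHECK (amount_cents > 0)
    )
    """
)
conn.executemany(
    "INSERT INTO orders (customer, amount_cents) VALUES (?, ?)",
    [("alice", 1200), ("alice", 300), ("bob", 500)],
)

# A bad write is rejected before it can reach the stored data.
rejected = False
try:
    conn.execute(
        "INSERT INTO orders (customer, amount_cents) VALUES ('bob', -5)"
    )
except sqlite3.IntegrityError:
    rejected = True

summary = rebuild_summary(conn)
```

In a real deployment the dict would be whatever fast store you already use; the point is only that it holds nothing that cannot be recomputed from the tables.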


I think it's wrong to make the analogy to static/dynamic typing -- but this comes up a lot. It'd be closer to having a type system without any classes, structures, or prototypes -- just hashes. And once you write code to construct an object instance, you can't change that code.



