I just ran across this old question asking what's so evil about global state. The top-voted, accepted answer asserts that you can't trust any code that works with global variables, because some other code somewhere else might come along and modify their values, and then you don't know what your code will do because the data is different! But when I look at that, I can't help but think that's a really weak explanation, because how is that any different from working with data stored in a database?
When your program is working with data from a database, you don't care if other code in your system is changing it, or even if an entirely different program is changing it, for that matter. You don't care what the data is; that's the entire point. All that matters is that your code deals correctly with the data that it encounters. (Obviously I'm glossing over the often-thorny issue of caching here, but let's ignore that for the moment.)
But if the data you're working with comes from an external source that your code has no control over, such as a database (or user input, or a network socket, or a file, etc.), and there's nothing wrong with that, then how is global data within the code itself--which your program has a much greater degree of control over--somehow a bad thing, when it's obviously far less bad than perfectly normal stuff that no one sees as a problem?
111It's nice to see veteran members challenge the dogmas a little ... – svidgen – 2016-05-24T19:58:25.913
9In an application, you usually provide a means to access the database, and this means is passed to functions which want to access the database. You don't do that with global variables, you simply know they're at hand. That's a key difference right there. – David Packer – 2016-05-24T20:55:53.377
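A minimal Python sketch of the difference described in the comment above, assuming the standard-library `sqlite3` module and an in-memory database. The names `counter`, `bump_global`, `bump_in_db`, `make_db`, and the `counters` table are hypothetical and exist only for illustration.

```python
import sqlite3

# Module-level global: any function can reach it without being handed anything.
counter = 0


def bump_global() -> int:
    """Depends on hidden shared state; nothing in the signature says so."""
    global counter
    counter += 1
    return counter


def bump_in_db(conn: sqlite3.Connection) -> int:
    """Also depends on shared state, but the dependency is passed in explicitly."""
    conn.execute("UPDATE counters SET value = value + 1 WHERE name = 'hits'")
    row = conn.execute("SELECT value FROM counters WHERE name = 'hits'").fetchone()
    return row[0]


def make_db() -> sqlite3.Connection:
    """Create an independent in-memory database holding one counter row (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER NOT NULL)")
    conn.execute("INSERT INTO counters VALUES ('hits', 0)")
    return conn
```

A caller can hand `bump_in_db` a throwaway connection from `make_db()`, so two callers with separate connections do not interfere. `bump_global` offers no such seam: every caller shares the one `counter`.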
42Global state is like having a single database with a single table with a single row with infinitely many columns accessed concurrently by an arbitrary number of applications. – BevynQ – 2016-05-24T23:35:03.257
2@BevynQ that makes no sense at all to me, could you elaborate? – kai – 2016-05-25T06:34:53.480
1The state of the database is part of the spec of most operations. For example, when I add a new customer, the testers will check that the customer record is in the database. These tests will hopefully be automated. Global variables are just there because they make life easier for the programmer. – Ian – 2016-05-25T08:36:07.610
39Databases are also evil. – Stig Hemmer – 2016-05-25T09:21:16.993
1Much of the pain you get from a database is exactly the same as a singleton. For example difficulty in automated testing. Singletons and globals aren't evil. But like so many concepts you need to know the pros/cons of them. Typically the singleton is the right model for the database. – ArTs – 2016-05-25T09:58:58.723
4The trick is to move all the singletoness into a single place where it can be managed and walled off. Arguably that is the entire raison d'être for the database. – ArTs – 2016-05-25T10:02:17.343
3Also, it is possible to make databases immutable as well. – gardenhead – 2016-05-25T18:36:59.073
25It's entertaining to "invert" the argument you make here and go in the other direction. A struct that has a pointer to another struct is logically just a foreign key in one row of one table that keys to another row of another table. How is working with any code, including walking linked lists, any different from manipulating data in a database? Answer: it isn't. Question: why then do we manipulate in-memory data structures and in-database data structures using such different tools? Answer: I really don't know! Seems like an accident of history rather than good design. – Eric Lippert – 2016-05-25T22:18:42.663
I take umbrage with this: "When your program is working with data from a database, you don't care if other code in your system is changing it, or even if an entirely different program is changing it, for that matter." I care a great deal. Application A should never be able to see Application B's data except via application B. – BevynQ – 2016-05-25T22:31:37.777
1@Kai It is possible to design a database badly so that all data is globally scoped. It is also possible to highly restrict who has access to what data when and how. It is also possible to enforce data integrity rules. – BevynQ – 2016-05-25T22:40:06.767
3@EricLippert please make that a question... – trichoplax – 2016-05-26T00:06:23.947
3The MUMPS programming language is worth a mention here. In MUMPS, there really is no functional difference between global variables and databases! – Andrew Coonce – 2016-05-26T00:42:25.537
1@ArTs: databases are not Singletons; they are usually more akin to a Borg. You can create multiple instances of the connections, or a connection pool, but they share the same state. – Lie Ryan – 2016-05-26T01:41:59.710
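For readers unfamiliar with the Singleton/Borg distinction drawn in the comment above, here is a rough Python sketch. The class names are illustrative only, and this is one common way to write each pattern, not the only one.

```python
class Singleton:
    """Exactly one instance ever exists; every call returns that same object."""

    _instance = None

    def __new__(cls):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance


class Borg:
    """Many distinct instances, but all of them share a single attribute dictionary."""

    _shared_state: dict = {}

    def __init__(self):
        # Point this instance's attribute storage at the class-wide dict.
        self.__dict__ = self._shared_state
```

With `Singleton`, `Singleton() is Singleton()` holds. With `Borg`, two instances are different objects, yet setting an attribute on one makes it visible on the other, which is the sense in which several connections can share the same underlying state.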
@LieRyan A database CONNECTION is not a database. I am, however, trying to describe the real-world object rather than the data structures. Also, I called it "a database", rather than "the database". Applications sometimes have multiple databases, but there must be one and only one of each. – ArTs – 2016-05-26T01:47:47.550
It is the quality of the design and the code that touches global states that matters. – rwong – 2016-05-26T21:07:41.250
1I'm voting to close this question as off-topic because the premise is fundamentally flawed as it is an equivalence fallacy. – Jarrod Roberson – 2016-05-26T23:31:55.737
1I don't understand this question. Is your database connection stored as a global variable? If not, then how is it global? It's only accessible to the procedures that you explicitly passed it to... – Mehrdad – 2016-05-27T04:36:31.427
@EricLippert Actually that difference is a practical consideration having to do with the requirements of using data in a database because 1) it has to be persisted outside of the current program instance and 2) its dynamic state has to be shared (eventually, in some way) with other instances and programs with potentially far-flung distribution. Changing a shared datum is hard/kludgy enough when you only have to synchronize with another thread in the same program instance. When you have to synchronize thousands of changes with millions of people across the world, you need a different approach. – RBarryYoung – 2016-05-27T19:00:18.813
3@RBarryYoung: Certainly there are many, many implementation considerations. My musing was more along the lines of why languages which fetch data by dereferencing a pointer and languages which fetch data by querying a table feel so different, when the underlying operation is conceptually the same. It's always struck me as odd. – Eric Lippert – 2016-05-27T19:03:32.620
@EricLippert ... IMHO, it's really the same answer as "Why is Web development so much different (worse) than Windows development? Why can't I just develop Web apps the way I develop Windows apps?" AFAIK, the answer is: "Practical Considerations". – RBarryYoung – 2016-05-27T19:04:45.523
@EricLippert It has always struck me as odd too, and I've spent a lot of time pondering it. The best answer I've been able to come up with is the practical considerations of sharing, updating, protecting/persisting, and synchronizing changes transactionally. You could take the ECC design pattern and extend it to make all data seem like just items and properties in a huge Object Model, but you get hung up on things again and again, like how to leverage the DB optimizers to search for row sets, and how to explicitly control when data is fetched, updated, committed, checked for being stale, etc. – RBarryYoung – 2016-05-27T19:12:02.717
2@JarrodRoberson How does that make it off-topic? That just means the answers should be "Your premise that ... is fundamentally flawed because ... " – Ixrec – 2016-05-28T10:15:03.810
If your database is the source of truth of your data then you're right. However, if you use event sourcing, the source of truth is events, not your global database. – blockhead – 2016-05-30T13:05:39.543
I'm surprised no-one has talked much about testability yet. Global variables are bad because they represent a testing combinatorics problem. Technically speaking each global variable introduced (minimally) doubles the number of tests you must run for unit testing. A database is different because it isolates these "super-global variables" in a metaphor that allows you to reset them all to a given state (drop table;insert...insert...insert...), and relational databases even allow you to constrain these "variables" in ways that are not possible in code (referential integrity for example). – Calphool – 2016-05-30T17:12:44.387
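A small sketch of the two properties mentioned in the comment above (resetting to a known state, and constraints such as referential integrity), assuming Python's standard-library `sqlite3` module. The `customers`/`orders` schema and the helper names `fresh_db` and `add_order` are hypothetical.

```python
import sqlite3


def fresh_db() -> sqlite3.Connection:
    """Build a brand-new in-memory database in a known starting state."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id)
        );
        INSERT INTO customers VALUES (1, 'Ada');
        """
    )
    return conn


def add_order(conn: sqlite3.Connection, customer_id: int) -> None:
    """Insert an order; the database rejects it if the customer does not exist."""
    conn.execute("INSERT INTO orders (customer_id) VALUES (?)", (customer_id,))
```

Each test can call `fresh_db()` to start from the same state, and an order for a nonexistent customer raises `sqlite3.IntegrityError` rather than silently corrupting the data. A plain global variable gives you neither of these for free.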
@StigHemmer Everything is evil. Except - in their mind - Google. – ott-- – 2016-05-30T18:04:39.877
I don't think they're that much comparable. There isn't a widely used rigorous set of properties specifically designed to minimize the negative effect of global variables in the same way as ACID principles in database. They are much more prone to errors and unintended effects than DB operations. – JI Xiang – 2016-05-30T18:34:43.327
1@EricLippert The situation feels even worse on the client side of a web app, wherein you have to work in a totally different mode of thought when you're hitting a local object (usually synchronously) versus something over the wire (usually asynchronously). Why do I have to care where the object is coming from, darnit!!?? I don't wanna! – svidgen – 2016-06-06T21:08:40.857