For those of you who haven’t been keeping score in this whole Vienna-municipal-pool thing, there’s a bit of a kerfuffle over the Town’s survey of what people want to see in this location. And by kerfuffle, I mean repeated denial of FOIA requests to see the actual survey responses.
The dog ate it. That’s more-or-less Town Staff’s polite response. Legally acceptable response #4, as I see it, of the five legally-acceptable responses that a Virginia government entity may give, to a legitimate request under the Virginia Freedom of Information Act (reference). “Sorry, we can’t seem to lay our hands on that information.”
Why should anybody give a crap about some arcane technical issue like this?
Here’s why. As the Town gets serious about this, they need to think about how many customers this indoor pool complex is going to have. That’s obviously going to affect its financial performance, and so the size of the year-after-year taxpayer subsidy to this pool.
And, as I try to keep emphasizing, indoor gym “serious” pools and outdoor summer “fun” pools are different beasts. But the Town counted every survey response that mentioned “pool” as showing support for an indoor gym/pool, as proposed. And, apparently, it’s just not in their wheelhouse to split that out so we can see what fraction of those actually said “indoor pool”.
The extent to which indoor pool and outdoor pool are close substitutes is far from clear. For example, we’ve got 2200 names on the waiting lists for local outdoor summer pools. Yet every one of those people can buy instant access to any of several local indoor (gym) pools right now. Not that there isn’t overlap between the two pool user groups, and a certain fraction of the population that swings both ways. But spinning that the other way, you could plausibly say that it looks like a lot of people want a pool, but maybe only an outdoor pool, and maybe have no interest in an indoor pool.
In light of how questionable the Town’s existing demand, revenue, and operating loss projections look (see prior post), this now takes on new importance. Vienna paid to have the survey of residents’ desired use for this property produced and fielded. It’s nuts not to get as clear an answer as possible to this fairly important question: What fraction of “pool” responses specifically said outdoor, indoor, or (other/unspecified) pool?
Putting that another way, is it prudent to vote on a meals tax increase (which, despite what anybody will try to soft-soap you about, is the Town’s green-lighting of this proposal unless something catastrophic occurs), and so commit to spending $26M of the taxpayers’ money up front, and (my estimate) $37M in payments down-the-road, without having a crystal-clear idea what fraction of Vienna citizens were asking for the type of pool you are planning to supply?
Particularly, when it’s easy enough to do a rough cut of:
- indoor
- outdoor
- didn’t say/too hard to tell/both/etc.
Town Council obviously needs to see this information clearly presented, before proceeding. Given that it would take me about ten minutes to do that, off the raw survey data, failure to do this is just another flunk for the Town of Vienna decision-making process.
We’ve had enough of those already.
The simple, key issue: What fraction of survey respondents said “outdoor pool”?
Peel apart the current FOIA kerfuffle, and the key technical issue is indoor versus outdoor pools. Generally a different fun/exercise ratio. Different vibe.
Town Staff (maybe, it’s really Town Staff’s consultants) apparently combined all pools together (indoor, outdoor, unspecified), into just one “pool” category. Not an unreasonable choice of methods, as long as this big caveat (this is all pools, not indoor pools) is respected.
But then they lean on the resulting high vote total for “pool” to argue, fairly forcefully, that the proposed indoor pool/gym is obviously Vienna’s clear choice.
There’s an entirely separate and highly important issue that, in all likelihood, “free pool” is what people had in mind. And all that implies.
Skip that for now. Focus on indoor/outdoor. Don’t you think Town Council should see that split, before they commit themselves on this path? And, despite what anybody says, once they vote to raise a tax, that decision is made, and the rest is detail.
Given what appears to be a rosier-than-rosy scenario about future financial performance of this indoor pool, hadn’t they better pin down this detail first? Because that future performance counts on having a lot of demand for this pool.
And if, say, we could find even the faintest shred of evidence that half of what you’ve counted so far as your future customers — via this survey — if half those folks actually want an outdoor pool, and maybe have no interest in an indoor pool, maybe you should step back.
We already know a big fraction said “outdoor pool”: A brief lesson in how word clouds work.
So, let’s start with the nuttiest part of this. We already know that a large fraction of the “pool” responses to the Town’s Annex survey said outdoor pool.
It’s just that nobody seems to have noticed. Because everybody thinks they know how word clouds work. But in fact, there’s more to word clouds than you might think. Not a lot more. But more.
Below is the word cloud in question, from one of the Town’s contractors.
Source: Town of Vienna, contractor’s report on The Annex, but it’s been used in a variety of places in the various materials.
Everybody kind-of knows how a word cloud works. That’s one of those trendy graphics where you feed in some sort of text, and the word cloud software gives you a graphic where the size of the word/bubble represents how frequently a word was mentioned. Big bubble, important thing. Small bubble, minor thing. That’s the gist.
In the word cloud above, the input text is the verbatim (free-form, write-in) comments from the Town’s survey. And the word cloud shows how frequently individual words were mentioned. (Obviously, excluding the, and, if, but and similar common but un-helpful words.)
A word cloud is a perfectly acceptable data summarization technique to use in this case. This is a situation where you, the data analyst, have to find a cheap way to summarize something like a thousand free-form comments. And if you’ve ever tried to do this, it’s the point where you, the data analyst, finally realize why surveys give you a defined set of choices, in check boxes. Taking that mass of free-form text, and (old-school) getting anything useful out of it, is a challenge. I know because I’ve done it. As a data analyst, I hate free-form (write-in) fields on surveys.
Sure, you can say whatever you want. Good for you, the survey respondent. Then I have no clue what to do with it, once you’ve had your chance to express yourself.
So, an expedient solution to cranking out … well, anything useful — is just feed your text to a word-cloud generator, and see what pops out. As is the custom in the modern world, you have your pick of websites that will do that for you for free, no questions asked.
(These days, thanks to running into somebody who’s up to speed, the first thing I’d do is feed the text into an AI and ask it questions. AI, tell me how people in this survey felt about indoor versus outdoor pools? And you’d get the AI’s general impression. Not sure you could trust it, but for free, or nearly, who wouldn’t do that?
The last time this technical issue about dealing with free-form comments came up was during the discussion over MAC zoning. At that time, the Town was willing to release the verbatim comments of the survey at issue, so I used Excel’s word search function to allow you, hoi polloi, to count word combinations occurring in those comments, to your heart’s content, in this prior post on the Town’s Visual Preference Survey.)
But, really, how is a word cloud any different from just counting up how many times each word occurs in the underlying text source (survey responses, in this case), filtering out the obvious crap (the, and …), and presenting a tabular list? Other than for the eye-candy impact of the word cloud.
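For the curious, here is roughly what that plain tabular count looks like. This is a minimal sketch in Python, not the contractor’s method: the sample comments and the short stop-word list are made up for illustration, since the real survey responses have not been released.

```python
import re
from collections import Counter

# Hypothetical stand-in comments; the real survey responses are not public.
comments = [
    "An outdoor pool for the kids",
    "Indoor pool and a gym",
    "A park and an outdoor pool",
]

# Small illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "the", "for", "if", "but", "of", "to"}


def word_counts(texts):
    """Count how often each non-stop word appears across all texts."""
    counts = Counter()
    for text in texts:
        # Lowercase, then pull out runs of letters (and apostrophes).
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


counts = word_counts(comments)

# Print the most frequent words as a simple two-column table.
for word, n in counts.most_common(5):
    print(f"{word:10s} {n}")
```

A word cloud generator starts from a table like this one and scales each word by its count.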
Is a word cloud just a set of word/bubbles, where bubble size is proportional to word frequency in the summarized text?
Oh, heck no, a word cloud is more. Not much more, but more.
In particular, the proximity of words in the word cloud mirrors proximity of words in the underlying text.
You can tell, just by looking at a word cloud, which words were frequently mentioned near each other, by how near they are in the word cloud.
Bet ya didn’t know that, did ya?
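Whatever a particular word-cloud tool does with its layout, proximity in the underlying text is easy to count directly, if you have the text. The sketch below tallies which word comes immediately before “pool” in each comment. The comments are hypothetical stand-ins, and looking only at the immediately preceding word is my simplifying assumption, not a description of any word-cloud software.

```python
import re
from collections import Counter

# Hypothetical stand-in comments; the real survey responses are not public.
comments = [
    "We need an outdoor pool",
    "Indoor pool with lap lanes",
    "Outdoor pool please",
    "A pool would be nice",
]


def words_before(texts, target="pool"):
    """Count the words that appear immediately before `target`."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        # Walk through adjacent word pairs: (previous word, current word).
        for prev, cur in zip(words, words[1:]):
            if cur == target:
                counts[prev] += 1
    return counts


before_pool = words_before(comments)
for word, n in before_pool.most_common():
    print(f"{word:10s} {n}")
```

On the real file, the rows for “outdoor” and “indoor” in that table would be the first thing to look at.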
Now have another look at that word cloud, where I’ve given you a little help.
If you now said, “hey, that looks like about half the people said outdoor pool, and half the people said indoor pool”, award yourself an A.
Sure, we can chit-chat all we want about this not being definitive, maybe something-something-something, and blah blah blah. So, sure, that goofy little graphic is not a smoking gun.
On the other hand, at some point I run out of patience trying to be fair about this. To estimate the fraction of “pool” respondents that wanted access to an outdoor pool, you’d have to look at the file containing the text of the individual write-in responses. Which is the core of what citizens are asking to be allowed to do for themselves, as Town Staff appear unable to do it. But for which, so far, citizens’ FOIA requests have been politely stiff-armed.
It’s not rocket science to do such a count, if you’ve been through it once. Pull all the comments that said “pool”, eyeball for common misspellings and synonyms (e.g., four-season pool means indoor pool, open-air pool means outdoor pool, “exercise” pool likely means indoor, and so on). Nothing but common sense. And use of Excel word-search functions. Then count them. State exactly how you counted them. And publish the answer.
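I’d do this in Excel, but the same rough cut can be sketched in a few lines of Python. Treat this as an illustration under stated assumptions: the keyword lists are just the synonyms mentioned above, the sample comments are invented, and a real pass would still need the eyeballing for misspellings.

```python
import re
from collections import Counter

# Assumed keyword lists, taken from the synonyms mentioned in the post.
# A real pass would extend these after eyeballing the comments.
INDOOR_TERMS = ["indoor", "four-season", "exercise"]
OUTDOOR_TERMS = ["outdoor", "open-air"]


def classify(comment):
    """Return 'indoor', 'outdoor', 'unspecified/both', or None if no pool."""
    text = comment.lower()
    # Match "pool" or "pools" as whole words, so "carpool" is not counted.
    if not re.search(r"\bpools?\b", text):
        return None
    indoor = any(term in text for term in INDOOR_TERMS)
    outdoor = any(term in text for term in OUTDOOR_TERMS)
    if indoor and not outdoor:
        return "indoor"
    if outdoor and not indoor:
        return "outdoor"
    return "unspecified/both"


def tally(comments):
    """Count pool comments by category, skipping non-pool comments."""
    labels = (classify(c) for c in comments)
    return Counter(label for label in labels if label is not None)


# Hypothetical stand-in comments; the real survey responses are not public.
sample = [
    "Outdoor pool for summer",
    "An open-air pool would be great",
    "Four-season pool with lap lanes",
    "Indoor or outdoor pool, either one",
    "Just a pool",
    "More parking",
]

result = tally(sample)
total = sum(result.values())
for category, n in result.most_common():
    print(f"{category:18s} {n:3d}  {n / total:.0%}")
```

The point of writing the rules down, in code or in a spreadsheet, is that anybody can see exactly how each comment was counted and redo it with different rules.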
And live with the answer, whatever it may be.
The fraction of “pool” respondents that specifically said “outdoor pool” is not zero. So treating it that way is wrong. We’re just dickering over exactly how wrong it is.
How do I know it ain’t zero? Again, not rocket surgery to figure this one out.
We know there are about 2200 names (with some unknown but presumed significant duplication) on waiting lists for local membership outdoor pools. These are folks who, for sure, want access to an outdoor pool. And they have (and have pledged to) put their money where their mouth is. And are looking at waits up to eleven years to get it.
And so, out of the 1000 respondents to the Town’s survey, are we supposed to believe that NONE of those folks bothered to respond, or if they did, none of them said outdoor pool?
That’s not even remotely plausible. So we know the number ain’t zee-roh. At this point, the Town needs to stop treating it like zee-roh and if nothing else, let some citizens figure out what the actual fraction is, in a completely transparent manner.
But right now, we can’t do that for ya. Or FOIA. Owing to not being able to get our hands on that file.
Conclusion
With the amount of money we’re talking about here, it’s time to stop screwing around.
Plus, to be rude, even though we aren’t supposed to say that Town Staff are deliberately keeping anybody from finding this out, that is what any sane person would suspect, until proven otherwise.
So prove it otherwise, please.
In this post, I’ve explained the importance of the underlying issue, the key analysis citizens wish to perform with the file, the simple quick-and-dirty technique that will form the backbone of the analysis, and why there is no legitimate excuse for not having ready access to this file, which the Town spent quite a bit to produce.
As a former consultant, I’ve known consulting outfits that claim that any data product they produce, for you, using your money, surveying your respondents, is theirs. They do that, and ensure that the data is specifically NOT listed as a deliverable under the contract, so that if the client wants any further work, no matter how trivial, they get to charge another fee, for doing that.
But legally, that’s nonsense. The raw output of a survey is work-for-hire. All claim to copyright of work-for-hire belongs to the hirer — in this case, the Town. By contrast, a recalcitrant contractor might have a slender claim that their particular graphic is protected. That word cloud above, sure, they can claim copyright on it, if the client is pliable enough to let them do that. (For Federal contracts, I recall any such practice being banned.) Even if the contractor and only the contractor has the file, and even if you didn’t specify the file as a specific deliverable in the contract, the worst that can happen is that the contractor now has the right to charge a reasonable fee to cover the cost of copying it and emailing it back to the Town.
Finally, I can assure you this file didn’t accidentally get erased, or anything like that. As somebody who has done and paid for surveys, whoever actually compiled the file of survey responses treated it as the precious object that it is, and made a skazillion backup copies. So, absolutely the file exists.
The main point here is that Town Staff don’t have the right to egg the Town on, into what might be a minor bit of economic suicide, or keep stonewalling on this just because the results might not help “your side” of this issue. That’s not the way good decision-making works. At least, not outside of the TOV.
Step back and understand what we’re doing. We’re asking seven talented amateurs — Town Council — working in essentially un-paid part time jobs, to be responsible in their decision to green-light the spending of tens of millions of the taxpayers’ dollars, and committing the Town to subsidize this facility forever. The last thing they need is having to screw around like this, just to see a simple cut of the data that is obviously relevant to their task of evaluating the cost and benefits of this proposal.