Mapping for the masses: Population Density in Kitchener-Waterloo

One of my sidebar projects this fall has been to get back into mapping socio-economic data. This is something I used to do quite a bit four years ago (these maps have sadly succumbed to linkrot and plugin abandonment). Projecting numeric data onto maps is easier than most people think, and ever since I moved to a new city in 2013, I planned to pick up this skill again to learn a few things about my new town. And as a data librarian, I know where to find and work with census data, so it was easy to kickstart things into gear once more.

Below is a map showing population density in Waterloo Region’s census tracts at the 2011 census. Click through to get the entire map:

The interesting thing about this map isn’t so much its colorful polygons (based on statistics anyone can download here) but the tools I used to build it. When I was creating maps in 2010, the average person who wanted to hack something out was limited largely to using Arc on his or her campus, or using the open source (and still maturing) alternative, QGIS, or working with Google Maps. These days, QGIS is very mature and has a strong developer community, GMaps is still going strong, and users can turn to services such as Mapbox’s TileMill. The options to choose from are stronger, and there is an option that can meet your background, whatever it may be.
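For anyone curious about what sits behind a density map like this one, here is a minimal sketch in Python. It is not the workflow I used (QGIS did that work for me), and the tract IDs, populations, and land areas are invented for illustration; they are not 2011 census figures. It only shows the two basic steps: dividing population by land area, then sorting the results into colour classes (quantile classes here, which is just one of several common choices).

```python
import bisect
import statistics
from dataclasses import dataclass


@dataclass
class Tract:
    tract_id: str          # hypothetical label, not a real census tract ID
    population: int        # made-up count, not a census figure
    land_area_km2: float   # made-up land area in square kilometres


def density(tract: Tract) -> float:
    """Return population density in people per square kilometre."""
    if tract.land_area_km2 <= 0:
        raise ValueError("land area must be greater than zero")
    return tract.population / tract.land_area_km2


def class_breaks(values, n_classes=4):
    """Return the n_classes - 1 quantile cut points for the given values."""
    return statistics.quantiles(values, n=n_classes)


def classify(value, breaks):
    """Return the 0-based class a value falls into, given sorted cut points."""
    return bisect.bisect_right(breaks, value)


# Invented sample data, for illustration only.
SAMPLE_TRACTS = [
    Tract("A", 4200, 1.4),
    Tract("B", 6100, 2.0),
    Tract("C", 3500, 12.5),
    Tract("D", 5200, 0.9),
    Tract("E", 2800, 35.0),
    Tract("F", 4700, 3.1),
]

if __name__ == "__main__":
    densities = {t.tract_id: density(t) for t in SAMPLE_TRACTS}
    breaks = class_breaks(list(densities.values()))
    for tract_id, value in densities.items():
        print(tract_id, round(value, 1), "class", classify(value, breaks))
```

Each class number would then be matched to a colour on the map. A tool like QGIS handles the same two steps through its layer styling options, which is why I lean on it.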

As an example, I’m linking over to Mita Williams’s recent work mapping population change in Windsor, Ontario, as well as making the case for electoral change in her hometown.  Mita is a UX librarian and far more of a coder than I’ll ever be, so her recent work with maps shows a freer hand at hacking out java to make things go, while I use plugins within QGIS to automate some of the coding for me, which frees up my time to spend on analysis.

At the end of the day, our maps are projected with the same code and with data from the same datasets, so our endpoint is the same, but the tools we’ve chosen to use may be better suited to our own particular abilities. That is something I didn’t see in 2010 as much as I see today. And that change is a good thing. Getting these datasets into the hands of the masses, and then making them usable and understandable for everyone, is crucial to the precepts of openness – open access, open government, open data – that we espouse as librarians. One can have completely open access to data, but its value is lessened when it cannot be used or understood by all of society. Yes, open data is a crucial part of today’s citizen-to-citizen and citizen-to-government relationships, but the more tools people have to work with that data, the better.

Information Literacy, Census Geography, and Maps

One of the things I’m constantly doing as a government documents librarian is giving lessons on Statistics Canada geographic areas. Census geographies can be downright confusing to the new user (and sometimes to the seasoned expert!). The names are riddled with acronyms and jargon, and their relationships to other areas and spaces can be complicated. One legally incorporated township may be considered a census subdivision while another may be classified as only a census agglomeration. Another city may be classified as a census subdivision, and also be part of a census metropolitan area of a similar name, e.g., Toronto CSD and Toronto CMA. Or, a city may be classified as a census subdivision and exist not only in a CMA with a similar name, but also a census division (I’m looking at you, City of Waterloo CSD, Waterloo Region CD, and Kitchener-Cambridge-Waterloo CMA). And if you dare introduce census tracts the first time through, your short introduction to the “Russian dolls” nature of census geographies runs the risk of turning your lesson into an information dump about privacy and data validity when all that your first-year economics student wanted to know was why it’s so hard to get comparable income and migration numbers for Kitchener, Ontario, and The Pas in northern Manitoba.
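For readers who think better in code than in acronyms, the “Russian dolls” idea can be sketched as a small data structure. This is only an illustration built from the two examples above, not an official StatCan schema, and the class and field names are my own invention. Where this post doesn’t name the containing area, the field is left empty rather than guessed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CensusSubdivision:
    name: str
    census_division: Optional[str]            # CD that contains this CSD
    census_metropolitan_area: Optional[str]   # CMA that contains this CSD


# Only the relationships named in the post; None means "not stated here".
CSDS = [
    CensusSubdivision("Waterloo", "Waterloo Region", "Kitchener-Cambridge-Waterloo"),
    CensusSubdivision("Toronto", None, "Toronto"),
]


def containing_areas(csd: CensusSubdivision) -> List[str]:
    """List the CSD followed by each larger area recorded for it."""
    labels = [f"{csd.name} CSD"]
    if csd.census_division is not None:
        labels.append(f"{csd.census_division} CD")
    if csd.census_metropolitan_area is not None:
        labels.append(f"{csd.census_metropolitan_area} CMA")
    return labels


if __name__ == "__main__":
    for csd in CSDS:
        print(" -> ".join(containing_areas(csd)))
```

The point is simply that one place carries several labels at once, and which label you need depends on the table you are pulling.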

Don’t ask me how many census tracts this CMA holds.

Confusion abounds. One of the problems we encounter is the tools we use to explain these geographies, which should be easily understood but are often abstract – we may live in towns and cities, but we refer to them as census agglomerations or CMAs. What can you use to show how spaces relate to one another, or how certain concepts can be measured and expressed spatially? The answer is a map, of course. God love ’em, those maps. Maps help us express numbers – quantities, amounts, ratios, proportions – with colours and shapes, and in the regions we live in and travel through each day. Face it, “big data” wouldn’t be as big as it is today if we didn’t have “big maps” to help us make sense of the numbers. However, StatCan’s digitized maps are large, layered PDFs that aren’t always user-friendly. The Standard Geographical Classification (SGC) PDFs are great reference items, but they aren’t very accessible. And this creates a learning gap for so many of our users.

To overcome this gap, I’m constantly pulling out the old SGC print maps, and I’m also cutting and pasting and hacking together magnified screenshots of the PDFs into my slide deck. Typically, if you need census help and you’ve found me in person, then there is a good chance that I’m going to crack open the SGC and unfold a map somewhere in the office (I even keep the southern Ontario CD-CSD map posted to a wall). I started doing this last spring after I moved to Waterloo and had to learn the region’s geography and confirm its census divisions, subdivisions, and CMAs for myself, and I realized this was a simple and effective tool that should be used more often, especially with new StatCan users.

StatCan’s 2006 geographies for southern Ontario, from a summer 2012 research consultation

 

Typically, I bring students to a nearby conference room and unfold the map on a large table.  I find that being able to “walk around” the entire map and point to the places where the lines that signify the different geographies merge, separate, and then merge again, helps students understand some of the logic behind the regions (at least in terms of distance and population). They may not always be able to recall all the differences between a census division, subdivision and metropolitan area after a session, but they at least remember that there are differences, and these differences are important enough to affect their research.

The original SGC PDF gives us a wide view of Ontario

The classroom is a different story, though. When working with only one person or a small group, there is a persuasive element at work that captures everyone’s attention. Carefully unfolding and presenting a map to a small group of people is like opening a box that holds a surprise. (Let’s call this surprise “knowledge” and we’ll call ourselves awesome for charming our audience so handily into learning something). But if we take that same map into the classroom or lecture hall, it risks becoming an awkward, cumbersome prop. It can become a distraction or even a failed means to demonstrate your expertise in such a short time to such a large group of people.

Zooming in reveals the different geographies

Maps that unfold to become wider and taller than you put the room’s attention onto your map-wrangling skills (however good or poor they might be) instead of on the knowledge you have to share, so I avoid them. You’ve never caught me walking to a classroom with a print map, and I doubt many other librarians do that today.

The final zoom focuses directly on the region the classroom is interested in (and it’s often Waterloo Region)

Instead, I give the class what they want and what they expect, and that means I work that map into my PowerPoint deck. Any time I’m introducing StatCan resources and geographies to a class, I insert three images of the same PDF map, each one magnified more than the last. This helps people “zoom in” with their eyes and see the many relationships and regions that are defined in one place alone. The length of time I spend on these slides depends on the classroom’s needs: sometimes, I spend only a few moments on these slides, and other times, I’ll spend five or ten minutes. What matters is that after I’ve finished up and am headed back to the office, I know that the instructor can pass around a slide deck that always refers to all these different areas.

I know I’m not presenting anything new in this post: maps have long been a tremendous tool within government documents librarianship. Perhaps the takeaway lies more in information literacy than it does anywhere else. Is your digital resource, as presented to you, the best way to help the user understand the resource? You may want to turn to the print resource or manipulate the digital resource, as I do with StatCan maps, to improve learning and synthesis. It’s just one more tool (or two, in this case) in our IL toolbox.

On Tories, Politics, and the StatCan Crisis

I’m not going to speak much about the Long-Form StatCan fiasco that the Tories have created this summer because so many other people and news organizations are covering it so well. David Eaves and Datalibre.ca have strong commentary and lists of organizations against it.  The Globe and Mail and The National Post have both kept their attention on the issue, too.   Aside from the fact that great resources already exist on this file, I haven’t offered my thoughts on it yet because so much of the issue lies in rhetoric, ideology, and politics.

Munir Sheikh, speaking truth to power. Click for details.

The Conservative Party of Canada, in its role as government, can, if it so desires, tell Statistics Canada to ditch the long form. And Munir Sheikh, as the former director of StatCan, protested the only way he could, by tendering his resignation. Sheikh, like a proper civil servant, spoke truth to power and should be commended for it. On these points, most people will agree.

If the Conservatives really do believe that the Long Form issue is about compelling citizens to offer information to the government under threat of a prison term (as PMO spokesman Dmitri Soudas keeps saying, as wannabe PM Maxime Bernier keeps suggesting, and as Tony Clement, I suspect, has been ordered to continually argue), then all the government must do to rectify this is change the StatCan Act so that individuals would be rewarded instead of punished for filing the long form. I won’t take credit for this idea, since I’ve heard it several times in the media in the past week: Offer a $20 tax credit upon completion and submission of the long form. Anyone who has filed income taxes will appreciate the idea of a tax credit, and anyone who has filed income taxes also knows that a $20 credit does not equal $20 in tax savings, either. This incentive could be a win-win for all parties.
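To put a rough number on that last point, here is a back-of-the-envelope sketch. It assumes the credit would work like a typical non-refundable federal credit, where the credit amount is multiplied by the lowest federal tax rate of 15%. That is my assumption; none of the proposals I’ve heard spell out the mechanics, and the function name below is made up for the example.

```python
from decimal import Decimal

# Assumption: the credit is non-refundable and applied at the lowest
# federal rate of 15%. The actual proposal does not specify this.
ASSUMED_CREDIT_RATE = Decimal("0.15")


def tax_saved(credit_amount: Decimal, rate: Decimal = ASSUMED_CREDIT_RATE) -> Decimal:
    """Return the reduction in tax owed for a given credit amount."""
    return credit_amount * rate


if __name__ == "__main__":
    print(tax_saved(Decimal("20")))
```

Under that assumption, the $20 credit would trim about $3 from a tax bill: a small cost to the treasury per form, and still a friendlier gesture than the threat of a penalty.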

As for the second-most argued point of contention about the long-form – whether or not the government should collect what might be privileged, personal data, e.g., what time you go to work in the morning, how many bedrooms are in the house – I think the CPC is making political hay. What’s important is not how many bedrooms I, Michael Steeleworthy, possess (2), whether I rent or own (rent), or what time I go to work in the morning (between 8 and 8:30, depending on the time I wake up). What matters is the aggregate data that comes of it. No one is ever going to look at my own data to compromise my privacy – the government does not have enough time on its hands to snoop into such arcane matters and has more important things to do. And frankly, StatCan data is closely guarded. Its data is not freely available to the public, and its original files are kept under lock and key; not even Misters Harper, Soudas, Clement or Bernier could access my census form. Really, if the government is keen on turning itself into a band of libertarian ideologues instead of being the administrators of representative governance when it comes to the issue of data collection, then it should also stop collecting income taxes at CRA, and as Dan Gardner noted in the Ottawa Citizen, it had better bow out of FINTRAC as soon as possible, since if there was ever an Orwellian “spy-on-your-neighbour” organization out there, this is the one.

What’s more, if the CPC is bothered by the collection of information, it may as well shred its own database of party members, which is a storehouse of information that their grassroots base would presumably disagree with (if the current CPC rhetoric about data collection is to be believed) in the first place.  Dear Stephen Harper, I’ve heard that teaching by example is the best way to give a lesson, so let’s start this Data Collection Disruption at home and send the CPC’s own files to the great Shredder in the sky.

Former Ontario Minister Snobelen, famous for wanting to create a "useful crisis" to promote political aims. Click for details.

Snarky comments aside, the long form issue is a political issue, and I don’t see the CPC moving back from it. I may be wrong – I’m not a seasoned political observer; I’m only a fairly bright fellow living on the east coast. But one thing is clear: in the tradition of one-time Ontario PC Minister of Education John Snobelen (cf. Mike Harris and the Common Sense Revolution; Snobelen served alongside Ministers Clement and Flaherty, I might note), the best way to create change in government is to create a crisis. And that’s what’s happened with the Long Form. The CPC has created a crisis. Even if Stephen Harper, through Tony Clement, were to suddenly make peace and reach for consensus, they will have shifted the status quo closer toward their own political ideology.