Ten years ago I took an introductory Geographic Information Systems (GIS) class. It was a bit mystifying (as those technical courses can be to us humanities folks) but I could see the potential. I’m sorry to say that the ins and outs of ArcGIS faded from memory as I ran that marathon of coursework, comps, prospectus, etc., etc. Yes, I actually reverted to paper maps, highlighters and, yes, dear reader, I used thumbtacks. It worked for what I needed then, but I need a bit more now.
Maps answer questions. The questions my maps need to answer are really important to my larger project: How many people lived on the American coast? Where did they live? What was the coast’s population density? And how did it change over the course of the nineteenth century? I tried Google and, as you might surmise, got no good hits. So I cracked open my computer and spent too much time piecing together advice and how-to guides as I stumbled through making my first GIS maps. What follows is how I did it. Why? Honestly, I want to have the directions handy when I’m trying to do this again in a year or two. Secondly, maybe this post can help some uninitiated humanities researcher dip their toes in GIS joy. This post explains how I made the maps above. The next post will analyze them.
I wanted my first map to visualize population density along the American coast at the dawn of the nineteenth century. You can’t make a map without data. Fortunately, Google worked this time. The wonderful folks at the Minnesota Population Center at the University of Minnesota maintain a fantastic database called the National Historical Geographic Information System (NHGIS), which “provides, free of charge, aggregate census data and GIS-compatible boundary files for the United States between 1790 and 2014.” In other words, they had all the files I needed to make my map: census data and a “shapefile” for 1800 (this shapefile is basically an outline map of every county in the US as drawn in 1800. It’s not perfect, as we’ll see in the next post). I advise watching the video tutorial because the NHGIS interface is not intuitive.
So I had the data; now I needed GIS software to work with the files. ArcGIS remains the standard GIS program, but my days as an unaffiliated scholar without access to expensive licenses are still very, very fresh. I went open source. Of all the choices, QGIS seemed adequate for my needs and had the added benefit of a large number of online tutorials and training materials. I’m still just getting into the software, but it looks like QGIS will be more than enough for a humble historian.
Now this is not the place nor am I the person to go into theories of cartography or higher- (even mid-) level GIS issues and problems. What follows are directions for how I used QGIS to make a population density map of the United States using historic census data. (If you want to follow them, you have to know that GIS works by layering different types of geographically referenced information that can be manipulated, analyzed, joined, and separated. Take a few minutes to read the introductory material in the QGIS manual. If you’re curious, I suggest reading this very basic introduction to GIS.) Here we go.
- Open QGIS. Start a new project.
- Create a new shapefile layer. This is the basic foundation of the map. Open the zipped 1800 shapefile you downloaded from NHGIS. You should see a map of all 427 counties from the 1800 U.S. census. Use the mouse to zoom in and out. Grab and drag the map around. Play around with the buttons on the top toolbar.
- Now it’s time to upload the census data. Create a new “delimited text file” layer. This, of course, is the census database. Unfortunately, we’re forced to use counties because aggregate data from the 1800 census is not available for smaller geographic units. Take a look at the county map and you’ll see some of the challenges of analyzing “coastal” population density using counties that, in some cases, go far inland. (As an aside: a future post will address the challenges of defining and quantifying “coast” and “coastal.”) For now we’ll just go with what we’ve got: counties.
- Upload the “…nominal_county.csv” file downloaded from NHGIS.
- Be sure the “file format” is set to “CSV (comma separated values)”
- For “Geometry definition” select “No geometry (attribute only table)”
- Click OK.
- Don’t worry that your map didn’t change. You should see the new layer in your “Layers Panel” on the right. Now we have to put the numbers into the map.
- Right click the shapefile. Click “Properties”
- Click “Joins”
- Click the green “+” symbol to join the two layers together.
- Set “Join layer” to the .csv file
- Set “Join field” to “GISJOIN” (thank you NHGIS for making this easy!)
- Set “Target field” to “GISJOIN”
- Press OK. This should put you back in the shapefile “Properties” dialogue box.
- Click “Fields.” Scroll down. You should see all the county data for every census from 1790 to 2010. Smile – that’s pretty amazing!
- Now it’s time to display some info on our map. In the “Properties” dialogue box of the shapefile, click “Style.”
- Change the dropdown box on the top from “Single Symbol” to “Graduated”
- Change “Column” to the 1800 data (my file reads: nhgis0001_ts_nominal_county_A00AA1800)
- Click the “Classify” button. You should see the box in the middle of the dialogue box populate with symbols, values, and legend entries.
- Press “Apply” then “OK.” You’re looking at a population map. It probably doesn’t look too awe-inspiring at the moment.
- Go back to the “Style” dialogue box (right click shapefile, “Properties,” “Style”).
- Play around. You can’t hurt much (and if you do, it’s easy enough to start over). I suggest starting with the “color ramp,” “precision,” “mode,” and “classes” buttons. Remember: every time you make a change you have to press “apply” to see how it changes your map.
- Population is great, but I want population density. Here’s how we get there:
- Right click shapefile, “Properties,” “Style.”
- Change the “column” box to: nhgis0001_ts_nominal_county_A00AA1800/(($area/1000000)*0.386102)
- Here’s why: we want the number of people who lived in a square mile. This equation takes the population of each county and divides it by the area. The basic unit of measurement for this shapefile is meters, so we need to convert square meters into square miles. Dividing $area by 1,000,000 gives square kilometers, and multiplying that by 0.386102 gives square miles. (Don’t ask me how long it took to figure this out.)
- Click “classify.” The symbols, values, and legend figures in the dialogue box should have changed. Change the “mode” to “Natural Breaks (Jenks),” pick a “color ramp” that makes sense, click “Apply” and “OK.”
- There you have it: county-level population density in 1800. QGIS is amazingly versatile and adaptable, and time spent “playing” with the representations is time well spent. Only by analyzing the data in different ways do you start to see new connections and patterns, which, of course, is one of the beauties of using GIS.
- I decided it might be useful to add the location and display the relative size of the largest American cities in 1800. I’m sure there are easier, more elegant ways to do this. I couldn’t figure out how so I did this:
- Under the “Layer” drop-down menu, click “add new Vector Layer”
- Type: “point”
- Under “New Attributes”
- “Name” – city
- “Type” – text data
- Click “add to attribute list.” You should see it added to the “attribute list” at the bottom of the dialogue box.
- Add second attribute.
- “Name” – population
- “Type” – whole number
- “Width” – 10
- “Precision” – 4
- Click “Add to attributes list”
- Click “OK.” Save layer as: “major cities.”
- Google “10 most populated American cities in 1800.” Keep the page open.
- Right click “major cities” in the “Layers Panel” and select “Toggle Editing.” Now you can add points to this layer.
- Zoom into New York City. You’ll see that what is today essentially Manhattan was the most densely populated county in 1800 America.
- Click the “Add feature button” on the “Digitizing toolbar.”
- Click on 1800 New York City.
- “id” – 1
- “city” – New York
- “population” – 60515
- Do the same for the remaining cities. Note: because Northern Liberties (#6) and Southwark (#7) are now part of Philadelphia (#2), and because of the national scale of analysis I’m interested in, I decided to combine them under Philadelphia.
- Now it’s time to get your graduated symbols and labels working. Right click “major cities.” Select “Properties” and then “Style.”
- Change the dropdown box on the top from “Single Symbol” to “Graduated”
- Change “Column” to population
- Click the “Classify” button. You should see the box in the middle of the dialogue box populate with symbols, values, and legend entries. Categorize, color, and adjust until you get what you like.
- Press “Apply.”
- Click “Labels”
- On the top change “no labels” to “show labels for this layer”
- “label with” select: “city”
- As with the “Style” page, QGIS gives you lots of formatting and placement functionality. Play around – make your map work for you. Remember to press “Apply” to see your changes on the map canvas.
- Print your map. This was probably the technically easiest part. I’ll save you the step-by-step directions, but you start by clicking the “New Print Composer” icon. Follow the directions in the QGIS manual and you can begin printing maps to your heart’s content.
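For anyone who wants to double-check the numbers outside QGIS, here is a minimal Python sketch of the two ideas at the heart of the steps above: matching census rows to counties on the shared “GISJOIN” key, and turning population and area in square meters into people per square mile. It is only an illustration, not what QGIS does internally. The function names, the sample keys, and every population and area figure are invented for the example. The column name “A00AA1800” is my assumption about the CSV header, taken from the field name in the density expression without the layer prefix.

```python
import csv
import io

SQ_M_PER_SQ_KM = 1_000_000   # the shapefile measures area in square meters
SQ_MI_PER_SQ_KM = 0.386102   # same conversion factor as the QGIS expression


def density_per_sq_mile(population, area_sq_m):
    """Mirror the expression: population / (($area / 1000000) * 0.386102)."""
    area_sq_mi = (area_sq_m / SQ_M_PER_SQ_KM) * SQ_MI_PER_SQ_KM
    return population / area_sq_mi


def join_density(areas_sq_m, census_rows, pop_column):
    """Match census rows to county areas on GISJOIN and return densities."""
    pop_by_key = {}
    for row in census_rows:
        value = row[pop_column].strip()
        if value:  # skip counties with no figure in this column
            pop_by_key[row["GISJOIN"]] = int(value)
    # Keep only counties that appear in both the area table and the census table.
    return {
        key: density_per_sq_mile(pop_by_key[key], area)
        for key, area in areas_sq_m.items()
        if key in pop_by_key
    }


# Invented sample data: these keys, populations, and areas are NOT real NHGIS values.
SAMPLE_CSV = """GISJOIN,A00AA1800
X0010,386102
X0020,5000
X0030,
"""
SAMPLE_AREAS_SQ_M = {
    "X0010": 1_000_000_000,  # 1,000 square kilometers
    "X0020": 2_000_000_000,  # 2,000 square kilometers
    "X0030": 500_000_000,    # has no population figure, so it is dropped
}

if __name__ == "__main__":
    rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
    densities = join_density(SAMPLE_AREAS_SQ_M, rows, "A00AA1800")
    for key, density in densities.items():
        print(f"{key}: {density:.1f} people per square mile")
```

If a county’s density looks wildly off on the map, running its population and area through density_per_sq_mile is a quick way to tell whether the problem is the data or the expression.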
Now that wasn’t too bad. In the next post I’m going to discuss the population density maps I created and explain how their “final” form reflects the questions I’m answering and the type of book I’m writing.