A few years ago I complained that no one was using the General Social Survey web interface for blogging, a practice which can probably be traced back to the Inductivist (yes, social scientists use the GSS constantly, but they use it to publish papers, not blog posts). Kevin Drum noted my lament in late 2008 and promised that he'd revisit the GSS in the future. He hasn't. That's fine; there are a million things I mean to do which I never get to. But still, it's kind of depressing to me how much opinion people express without bothering to follow up by using a web interface to a rich data source, one which requires no more than 1997-era browser skills. There's a lot you can do with the GSS interface, but I thought it might be useful to do something very simple so that people can see how easy it really is. Since most of the people I follow on Twitter lean Left, I see a lot of political chatter about whatever concerns that segment of the population. For example, there is a lot of talk about conservative white males and their lack of concern for global warming. Can we explore this with any greater precision with the GSS? Yes we can.
First you need to find the appropriate variables. So go to the search box and enter what you want to find. I typed "warming." When you hit the "Go" button it will return a list of variables which you can then use in your further queries. My own suggestion is to keep the query simple, ideally one word; this isn't Google. You'll usually get a lot of results, but at least that will give you options. Often there are many overlapping variables, and you want to pick the one with the largest sample size or the one which was asked most recently. Here is some of what I got for "warming":
I want the last variable. If I click it, it goes into the "Selected" text box. I hit "Row" to copy it to the appropriate box. If you use the GSS enough, you get to know a few variables off the top of your head and can skip this step. For evolution, for example, I know that "evolved" is a dichotomous response which was surveyed relatively recently.
But you want more than one variable. Going back to my initial curiosity, I want to "cross" the variable under consideration with race and sex. I happen to know that there is a "Sex" variable where males are 1 and females are 2. I also know there are "Race" and "Hispanic" variables, where 1 is white and non-Hispanic respectively. I'll put "Race" in the column box, so it crosses with the row. I'll also limit the sample to non-Hispanics and males, which is why you see an entry in the "Selection Filters" box. There's a lot more fine-tuning you can do at this point, but let's just go with this. Below are the results for the query above. As you can see, it's vintage 1997 as well:
All sorts of details are clear here. You can see the weighted sample size, the exact form of the question, and of course the results for each combination of row and column classes. Finally, let's control for ideology. I happen to know that the POLVIEWS variable has seven response classes, from extremely liberal to extremely conservative. I'll combine the three liberal classes and the three conservative classes using the recombine option. You can see it below in the "Control" box. This means that the query above will now be split into three categories: one for liberals, one for moderates, and one for conservatives. Here's the response for liberals and conservatives:
The sample sizes for non-whites here are very small, but the big difference is across ideology among white males. In other words, we're talking about ideology as the causal factor. White males are more conservative. And conservatives are less concerned about Arctic seals possibly being threatened by global warming. This was very much a toy example. I just wanted to show you how easy the interface really is.
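If you'd rather script the same toy query than click through it, here is a rough sketch in Python with pandas. It mirrors the steps above: filter to non-Hispanic males, collapse POLVIEWS into three groups, then cross the row variable with race inside each control group. Everything about the data is an assumption on my part: the rows are made up, the lowercase column names are just illustrative, and `concern` is a hypothetical stand-in for whichever warming variable you picked from the search results. It also ignores the survey weights that the web interface applies, so treat it as an illustration of the logic and not a replacement for the real thing.

```python
import pandas as pd

# Tiny made-up extract; these are NOT real GSS responses.
# Codings follow the post: sex 1 = male, 2 = female; race 1 = white;
# hispanic 1 = not Hispanic; polviews runs from 1 (extremely liberal)
# to 7 (extremely conservative).
# "concern" is a hypothetical name for the warming question.
rows = [
    # sex, race, hispanic, polviews, concern
    (1, 1, 1, 2, "very"),
    (1, 1, 1, 6, "not very"),
    (1, 1, 1, 7, "not very"),
    (1, 2, 1, 3, "very"),
    (1, 1, 1, 4, "somewhat"),
    (2, 1, 1, 1, "very"),      # female: dropped by the filter
    (1, 1, 2, 5, "somewhat"),  # Hispanic: dropped by the filter
    (1, 1, 1, 1, "very"),
    (1, 2, 1, 6, "somewhat"),
]
df = pd.DataFrame(rows, columns=["sex", "race", "hispanic", "polviews", "concern"])


def recode_polviews(code: int) -> str:
    """Collapse the seven POLVIEWS classes into three groups."""
    if code <= 3:
        return "liberal"
    if code == 4:
        return "moderate"
    return "conservative"


# Selection filter: non-Hispanic males only.
subset = df[(df["sex"] == 1) & (df["hispanic"] == 1)].copy()
subset["ideology"] = subset["polviews"].map(recode_polviews)

# One row-by-column table per control category, as column percentages.
tables = {
    ideology: pd.crosstab(group["concern"], group["race"], normalize="columns") * 100
    for ideology, group in subset.groupby("ideology")
}

for ideology, table in tables.items():
    print(f"--- {ideology} ---")
    print(table.round(1))
```

With a real extract you would swap the made-up rows for your own data and, ideally, weight the counts before computing percentages.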