June 11, 2024

Survey Says (Part 5 of 5)


Survey Says (Part 5 of 5): Text as Data

Dealing with Text Data

It’s almost a given that you’re going to feel compelled to include an item at the end of every survey that asks the respondent for any additional feedback. This isn’t necessarily a bad idea. But as with all things I’ve covered in this series, the most important thing is to make sure you have a plan for what to do with that text. Let’s say you get an average of 25 words per respondent and 200 people took your survey. That’s 5,000 words, which is about twenty double-spaced pages of text. And it’s not just about reading the words.

If you’re collecting that text as data, then you need to give its analysis as much consideration as you give the numerical data. Who will be responsible for that? What’s your plan for analysis? Will you be coding the text like you would in a qualitative research project? Or will you just be trying to write down themes you encounter repeatedly? Are you going to use direct quotations from respondents? Do you have their permission to include their names? Are there people they don’t want you to share their comments with? Who will be keeping up with all of this? If you and your staff aren’t interested in doing any of that, then you need to ask yourself if you should just go ahead and omit those questions.

There are some tempting, convenient options for analyzing text data. I’m going to caution against one and perhaps introduce you to another one that’s (getting) better. First, let’s cover the cautionary piece. Most of you have probably seen a word cloud before. It works by taking the important words from a body of text (like your survey responses) and sizing the words based on their relative importance. The importance part is where people get hung up. By default, most of the free online word cloud generators determine the most important words in your data based simply on frequency. So, if “technology” is the word that is most often repeated, it will be the largest and will usually be featured in the center. It’s possible, though, that the word appeared more than any other word only because a single respondent used it over and over.
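
If you’re comfortable with a little Python, here is a minimal sketch of that frequency problem. The responses are hypothetical ones I made up for illustration, and the `tokenize` helper is just a simple assumption about how to split text into words. The sketch counts each word two ways: how many times it appears overall, and how many respondents used it at least once.

```python
from collections import Counter
import re

# Hypothetical responses, invented for illustration only.
responses = [
    "Technology technology technology! We need more technology.",
    "More planning time would help.",
    "I would like more planning time with my team.",
]

def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

total_counts = Counter()       # how many times each word appears overall
respondent_counts = Counter()  # how many respondents used each word at least once

for response in responses:
    words = tokenize(response)
    total_counts.update(words)
    respondent_counts.update(set(words))  # set() counts each word once per respondent

for word in ("technology", "planning"):
    print(word, total_counts[word], respondent_counts[word])
```

In this made-up data, “technology” has the highest raw count, yet only one respondent wrote it, while “planning” came from two different people. A frequency-based word cloud would show you the first number and hide the second.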

Similarly, though many online platforms have the ability to exclude words like “a”, “and”, and “the”, there are likely many words that get repeated quite often by participants that make no sense out of context and are not excluded from the word clouds. Your word cloud generator may also count words like “responsibility” and “responsibilities” as two separate words when you want them to count as the same concept. And typos often complicate this further (e.g., shchool and school would appear as two different words). You can usually address these issues, but they add time to your work. If you don’t have a reason for using a word cloud other than the fact that you think they are interesting or cool, then it probably just isn’t worth your time.
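
To give you a sense of what that cleanup involves, here is a rough Python sketch. Everything in it is an assumption for illustration: the stop word list, the table of word variants, and the table of typo fixes are all hypothetical and built by hand, which is exactly the part that takes time with real data.

```python
from collections import Counter
import re

# All three lookup tables are hypothetical and hand-built for this example.
STOP_WORDS = {"a", "and", "the", "my", "of", "is", "are", "at", "our"}
VARIANTS = {"responsibilities": "responsibility"}  # word forms to merge
TYPO_FIXES = {"shchool": "school"}                 # known misspellings

def clean_words(text):
    """Lowercase, fix known typos, merge known variants, and drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    cleaned = []
    for word in words:
        word = TYPO_FIXES.get(word, word)
        word = VARIANTS.get(word, word)
        if word not in STOP_WORDS:
            cleaned.append(word)
    return cleaned

# Hypothetical responses, invented for illustration only.
responses = [
    "The responsibilities at my shchool are unclear.",
    "Responsibility is shared, and the school supports it.",
]

counts = Counter()
for response in responses:
    counts.update(clean_words(response))

print(counts.most_common(3))
```

After cleaning, “responsibility” and “school” are each counted twice instead of being split across four different spellings. Notice that the sketch only fixes the typos and variants I already knew about, so you would still need to read through your data to find them.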

I don’t have the space to give you all the warnings about word clouds, but I will say this: Do not use a word cloud as your means of analysis unless you are an expert in the way they work. Word clouds can be useful, but they best serve as a way of visualizing a story from the text that you have uncovered through more rigorous analysis. You should avoid presenting a word cloud to your stakeholders without accompanying explanation.

You might be surprised to hear me say this, but you may find a lot of success in using generative AI (GenAI) resources (e.g., ChatGPT, Bard) to help you analyze your data. I absolutely can’t take the necessary time to explain everything you need to know about using GenAI in your work, but here are some points to think about if you choose to go that route.

  • Do your due diligence to make sure you aren’t sharing personal/private/protected data with a platform without strict guarantees of privacy (don’t count on this with free resources).
  • Generative AI should be used to help streamline and focus your text analysis. It shouldn’t replace your efforts altogether.
  • Try asking the AI to analyze your data using several different prompts. You’ll be amazed how dramatically the answers you get can change just based on what you ask the chatbot to do (and how you ask it).
  • Always review what the GenAI produces for accuracy (and propriety)! The technology is amazing, but it’s far from perfect.

Practitioner’s Corner: Numbers as Text

I apologize for the length of this section, but I wanted to take an opportunity to talk to you about a frustrating part of the data analysis process that not many people understand or know how to deal with quickly. The best way I can explain this issue is with a thought exercise:

When you use a calculator, you press buttons with numbers and an operation on them (say, 7 + 7), followed by the “=” sign. This works as expected. That’s because the buttons are programmed to use a specific type of data when you push them. Now imagine you open up Excel (or another spreadsheet), type “seven plus seven” in a cell, and press enter. It’s almost certainly not going to change the value in the cell to “fourteen”. This may be obvious to many of you. After all, calculators and spreadsheets aren’t meant to do math with text.

What may be less obvious is that a number (e.g., 1 or 89 or 2,000,456) can be stored as either a number or as text. In other words, you see 756 in the cell. It looks like a number to you. But to the spreadsheet it’s the same as typing 756 in a Word document. It’s a number, but you can’t do math with it. You do this all the time if you use social media and type numbers. There’s no math happening. It’s just a character version of the number.
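
The same distinction shows up in programming languages, which can make it easier to see. Here is a tiny Python sketch (Python is my choice for illustration, not something your spreadsheet uses) where the variable names are hypothetical:

```python
as_text = "756"   # the characters 7, 5, and 6, stored as text
as_number = 756   # an actual numeric value

# "Adding" text just sticks the characters together.
print(as_text + as_text)        # 756756

# Adding numbers does real math.
print(as_number + as_number)    # 1512

# Converting the text to a number first makes the math work.
print(int(as_text) + as_number)  # 1512
```

Both values look identical on the screen, but only one of them behaves like a number.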

I bring this up because this creates a very common and extremely frustrating problem people face when trying to analyze survey data. When you use a form to collect responses, you may be asking the user to input a number. However, the survey platform you use may be collecting those numbers as text. When you export the data and try to do something like add or calculate the average of the values, you will often get an error or a result that is clearly wrong (such as a total of zero).

This isn’t the end of the world. It’s just pretty annoying. To avoid this situation, you can first check your survey builder for options on the format of the data you’re collecting. It’s more likely, though, that you will need to Google the process of converting numbers from text (often called “String”) to numeric in Excel (or whatever you are using for your analysis). It’s no fun in either case, but having an awareness of the problem as well as an idea of how to deal with it can really make the experience much less painful.
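
If your analysis tool happens to be Python, the conversion might look something like the sketch below. The column of values is hypothetical, and the `to_number` helper is my own illustration, not a feature of any survey platform. It also assumes that commas are thousands separators, which is not true in every country, so check your own data first.

```python
from statistics import mean

# Hypothetical column from a survey export: every value arrived as text.
raw_values = ["12", " 7", "2,000", "", "n/a", "15.5"]

def to_number(value):
    """Return the value as a float, or None if it is not a usable number."""
    # Assumption: commas are thousands separators (e.g., "2,000" means 2000).
    cleaned = value.strip().replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None  # blanks and entries like "n/a" cannot be converted

numbers = [to_number(v) for v in raw_values]
usable = [n for n in numbers if n is not None]

print(usable)        # [12.0, 7.0, 2000.0, 15.5]
print(mean(usable))  # 508.625
```

The blank entry and the “n/a” are set aside rather than guessed at. You would still want to decide, as part of your plan, how many unusable answers you can tolerate and whether to report them.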
