Sex and Gender Data Collection Guidance
Minimizing harm to subjects is a core consideration when designing human subjects research, and being mindful of how you ask about participants’ sex and gender is an important part of doing so. The following guidelines for collecting sex and gender information from human subjects are based on current best practices and are intended to help researchers collect accurate and representative data while respecting their subjects’ needs.
Please see the Resources/Further Readings section for links to more details, terminological definitions, examples, primary research, and further justifications.
1. Do you really need this information?
Consider whether you really need to ask about sex and/or gender. Sex/gender is an important subject to study, and there are many cases where it will form a key component of the analysis. That said, don’t simply collect this information by default where it is not in fact relevant.
2. Sex and gender are not the same.
Gender is a societal role which an individual may fulfill in various ways, while sex is a biological category related to chromosomes, hormones, and physical characteristics. (See e.g. www.nature.com/news/sex-redefined-1.16943 for a fuller discussion of the biological underpinnings of sex.) The two categories are interrelated, and gender may have biomedical effects while sex may influence social and cultural traits. Nevertheless, they are distinct and should not be conflated.
3. Free response vs forced choice?
When it is necessary to collect gender information, academic fields differ in whether they consider a free-response text box or a pre-determined list to be best practice. This will also vary based on your research question and analytical tools. Check with experts in your field, attend to the population being studied, and be mindful that recommendations are likely to change over time.
Free-response text boxes allow for a more accurate representation of individuals. While they do create a bit more work for the researchers -- for example, answers of ‘f’, ‘F’, ‘female’, ‘woman’ and misspellings thereof may need to be binned together for analysis -- we have found the additional labor to be minimal, even in studies with several hundred participants.
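The binning step described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the `BINS` table, the bin labels, and the function names are hypothetical, and the list of variants would need to be built from the answers you actually receive. Unrecognized answers (including misspellings) are set aside for manual review rather than guessed at.

```python
from collections import Counter

# Hypothetical lookup table from cleaned free-text answers to analysis bins.
# The variants and bin labels are illustrative assumptions only.
BINS = {
    "f": "woman",
    "female": "woman",
    "woman": "woman",
    "m": "man",
    "male": "man",
    "man": "man",
    "nonbinary": "non-binary",
    "non-binary": "non-binary",
    "non binary": "non-binary",
}


def bin_response(text):
    """Return the bin for one free-text answer, or None if it needs manual review."""
    # Lowercase, trim, and collapse repeated whitespace before the lookup.
    cleaned = " ".join(text.strip().lower().split())
    return BINS.get(cleaned)


def bin_all(responses):
    """Count binned answers and collect unrecognized ones for manual review."""
    counts = Counter()
    needs_review = []
    for raw in responses:
        label = bin_response(raw)
        if label is None:
            needs_review.append(raw)
        else:
            counts[label] += 1
    return counts, needs_review


counts, needs_review = bin_all(["f", "F", " Female ", "woman", "femal", "Man"])
```

Here the misspelled answer ends up in `needs_review`, so a researcher decides how to bin it instead of the script.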
Pre-determined lists of options allow participants to decide for themselves how they’d like to be binned (where binning of data is necessary), rather than leaving it up to the researchers. See the following section for further discussion of lists.
Where feasible, some scholars suggest asking participants both how they define their gender, with a free-response box, and how they would like to be categorized in a categorical analysis, with set options. This maximizes accuracy and self-determination, and provides transparency as to how the data will be used.
4. Keep lists inclusive and flexible.
Pre-determined lists of gender options should be as inclusive and flexible as possible. Some things to keep in mind:
- Rather than radio buttons which restrict participants to a single choice, allow them to check multiple boxes.
- Include additional categories other than ‘Man’ and ‘Woman’, such as ‘Non-binary’, ‘Agender’, and ‘Gender fluid’.
- Include a ‘Not listed’ option for those whose gender is not included in your list. Use this instead of ‘Other’, which is alienating.
- Include an opt-out, such as ‘Prefer not to say’.
- Use ‘Man’ and ‘Woman’ rather than ‘Male’ and ‘Female’ when talking about gender.
- ‘Trans’ is not a gender, and ‘Trans man’ and ‘Trans woman’ should not be listed separately from ‘Man’ and ‘Woman’. If you want to know whether your participants are trans, that should be a separate question.
- Sexuality, such as straight, gay, lesbian, bisexual, asexual, etc., is separate from gender, and should be asked as a distinct question if that information is needed.
5. Sex is complex and not binary.
Many of the same principles described above for gender also apply to asking about sex. Unless further specified, sex is often assumed to mean the sex a participant was assigned at birth or the sex listed on their legal IDs. Wherever possible, let participants fill in a text box. If checkboxes are necessary, include options such as ‘Male’, ‘Female’, ‘Intersex’, and ‘Not listed’. Be aware that sex is biologically complex, with several contributing factors. Think about which aspects of sex are relevant and consider asking about them specifically (e.g. hormone status).
6. Keep participants fully informed.
Wherever possible, tell participants why you need this information and what you plan to do with the data. Bear in mind the implications for privacy and identifiability, especially in combination with other demographic data, when a subject has a less-common gender or sex. A disclosure to participants might look something like the following (adapted from the FAccT conference):
“Unless you opt to provide your name and contact information, your answers will be anonymous. Individual-level responses to the survey will be seen only by the researchers. We may report summary statistics of the results publicly. However, we will not report write-in categories that contain fewer than 5 respondents. After the study responses have been analyzed, we will delete individual responses and retain only the summary quantities.”
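The small-category rule in the example disclosure above could be applied when preparing summary statistics, roughly as follows. This is a sketch under stated assumptions: the function name `summarize` is hypothetical, the threshold is applied to every category rather than only to write-in categories, and reporting a single combined count of withheld responses is one possible choice rather than part of the disclosure.

```python
from collections import Counter

MIN_CELL_SIZE = 5  # threshold taken from the example disclosure text


def summarize(responses, min_cell_size=MIN_CELL_SIZE):
    """Count responses per category, withholding categories below the threshold.

    Returns (reportable_counts, number_of_responses_withheld).
    Assumption: the threshold applies to all categories, not only write-ins.
    """
    counts = Counter(responses)
    reportable = {cat: n for cat, n in counts.items() if n >= min_cell_size}
    withheld = sum(n for n in counts.values() if n < min_cell_size)
    return reportable, withheld


# Made-up responses for illustration only.
responses = (
    ["woman"] * 12
    + ["man"] * 9
    + ["non-binary"] * 5
    + ["agender"] * 2
    + ["genderqueer"] * 1
)
reportable, withheld = summarize(responses)
```

With this made-up data, the three categories with at least five respondents are reported and the remaining three responses are withheld. A category with exactly five respondents is still reported, since the disclosure excludes only those with fewer than five.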
A much more detailed set of guidelines, including definitions for a number of terms, can be found in the HCI Gender Guidelines. That page is aimed specifically at human-computer interaction researchers, but its content is broadly relevant.
The following links provide extra information on the topic of sex and gender. Some of these resources may require you to be on the Swarthmore network or VPN to access them.