Data Ab Initio

Research Data Accessibility Report

by Kristin Briney Posted on 2026-07-16

I’m very excited that the RDAP report “Accessibility for Data and Data Repositories: Understanding and Applying the 2024 ADA Title II Rule” is now available. I was a co-chair of the group that wrote this report, alongside Rachel Woodbrook and Clara Llebot Lorente, and we worked with a fabulous group of peers to write this 100+ page document.

The report is useful for librarians and researchers, as well as anyone else interested in making research data more accessible. While the title frames the report with respect to U.S. disability law, the guidance within the report is broadly applicable to other countries.

Researchers interested in data accessibility should skip to section 4, which is about the accessibility of research data. That section contains a lot of guidance on making data files accessible, from spreadsheets to images to video and textual data. This section provides general guidance by data type and then specific guidance for select file formats within that data type, complete with screenshots on how to implement the guidance. The hope is that by making data accessibility guidance available and easy to follow, researchers can start making their files more accessible.

This report has been a labor of love in an area (data accessibility) where little guidance can be found. I hope that many people find this report helpful. It’s also CC BY licensed, so you are free to reuse this information and share it widely!

Posted in accessibility, dataManagement

Plain Language for Research Data Documentation

by Kristin Briney Posted on 2026-07-06

Today, I want to continue to discuss various ways to make research data more accessible. While there are specific accessibility recommendations for specific types of data files, such as recommendations for accessible and reusable spreadsheets, we’re going to focus on something more general today: plain language.

Plain language is a clear way of writing that makes your content easier to understand. Plain language doesn’t necessarily apply to research data (as you shouldn’t alter data), but is a good strategy for documentation, where you have direct control over what you write. The idea is that if you write your documentation more clearly, people will better understand your data and can more easily reuse it. Using plain language is also an accessibility issue, as it helps people with various cognitive disabilities better understand written content.

So what does plain language look like? Plain language recommends simplification wherever possible. Some common recommendations for plain language include:

Write the most important information first;
Keep sentences short and direct;
Keep paragraphs short;
Break content into sections and use headings;
Avoid jargon;
Define acronyms and abbreviations;
Use bullet pointed lists, where possible.

You can adjust your writing to your audience and use research-specific terms, but remember that some researchers in your field may not be native English speakers. So plain language for research documentation should balance using clear and direct words against field-specific terminology.

Plain language also involves writing at a lower reading level, typically no higher than a lower secondary education level (WCAG 2.2, Criterion 3.1.5). Even with the ability to adjust your writing to your audience, assuming a lower reading level will make your writing more accessible to everyone, even those without disabilities.

Let’s look at an example of plain language. Let’s start with the sentence: “based on the state of the local flora and where we are in the growing season, I recommend ceasing the collection of samples.” A plain language version of that sentence is: “do not pick the flowers.” The second sentence is much clearer and more direct and means roughly the same thing.
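One rough way to check the reading level of your documentation is to score it with a readability formula. The sketch below uses the Flesch-Kincaid grade level, which is my choice for illustration and is not prescribed by WCAG. The syllable counter is a crude vowel-group heuristic and the function names are hypothetical, so treat the scores as a relative comparison rather than an exact grade.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per group of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Estimate a U.S. school grade level using the Flesch-Kincaid formula."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


original = (
    "Based on the state of the local flora and where we are in the "
    "growing season, I recommend ceasing the collection of samples."
)
plain = "Do not pick the flowers."

# A higher score means the text is estimated to be harder to read.
print(f"original: {flesch_kincaid_grade(original):.1f}")
print(f"plain:    {flesch_kincaid_grade(plain):.1f}")
```

The original sentence should score noticeably higher than the plain language version. A score like this cannot tell you whether your jargon is appropriate for your audience, so it works best as a quick check alongside the recommendations above.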

Writing in plain language is an art and takes practice. But I encourage you to think about plain language the next time you create research documentation so as to make your documentation (and thus your research data) as accessible as possible.

Posted in accessibility, documentation

Color Accessibility for Research Data

by Kristin Briney Posted on 2026-06-01

Last month, I blogged about the need to include accessibility in the discussions of research data reproducibility and reusability. This month, I want to address one way to do that. Specifically, we’re going to talk about color.

Color appears in research data in a number of places, most obviously in image and video files, though it can also appear in text and spreadsheets. Where we currently see accessibility guidance around color, such as when journals provide guidance for figures, it is frequently guidance to avoid red-green pairings because of color vision deficiency (colorblindness). But actually, accessibility guidance around color goes beyond this.

The first accessibility recommendation for color is to never use color as the only means of conveying information (WCAG 2.2, Criterion 1.4.1). One of my standard examples of this is to avoid highlighting cells in Microsoft Excel to encode information that is not available in another form. Not only can a blind person not access this information (because there is no textual equivalent that can be read by a screen reader) but also, a computer cannot perform calculations upon highlighting. Any information that is only shown as highlighting should be converted to a separate text-based variable on the spreadsheet. For images and video, the scenario is a little different. For example, if you are taking a photograph, you cannot change colors in the image without changing the underlying information in the image. Instead, you should provide a text alternative (called “alt text”) that describes what is in the image. In general, for any information encoded as color, it’s best to also provide that information in another form – typically, but not always, a text equivalent – that can be read by a blind person using a screen reader or a computer.

The second recommendation for accessible color is to choose colors with enough contrast. Obviously, you cannot change colors in a photograph without altering the underlying data (this is another reason why alt text is important), but you can choose colors in a data visualization and for other types of data. Choosing colors with high contrast means that your visualization will be understandable by a person with low vision as well as by someone who prints everything in black and white. The recommended contrast ratio for adjacent color blocks is at least 3:1 (WCAG 2.2, Criterion 1.4.11). WebAIM’s Contrast Checker tool is useful for checking your contrast ratios. There are also tools, such as Coblis, for checking colors against the various forms of color vision deficiency (because red-green isn’t the only type). I recommend that you make it part of your workflow to always check your color schemes for contrast before finalizing them.

The 3:1 contrast ratio is specific to blocks of color. Guidance is a little different for the color of text. The contrast ratio for text is a minimum of 4.5:1 (WCAG 2.2, Criterion 1.4.3), though a ratio of 7:1 is even better (WCAG 2.2, Criterion 1.4.6). When in doubt, black-on-white is best. There are other accessibility recommendations for text that cover issues beyond color, such as typography and font size, and I encourage you to check out guidance from Section 508 and WebAIM on this topic.
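If you build color schemes in code, you can also check contrast ratios yourself instead of pasting colors into a web tool. Here is a minimal sketch of the WCAG contrast ratio calculation, assuming colors are given as six-digit hex strings. The function and constant names are my own for illustration and do not come from any particular library.

```python
# Minimum ratios from the WCAG 2.2 criteria cited above.
MIN_COLOR_BLOCKS = 3.0   # Criterion 1.4.11
MIN_TEXT = 4.5           # Criterion 1.4.3
ENHANCED_TEXT = 7.0      # Criterion 1.4.6


def _linearize(channel_8bit: int) -> float:
    """Convert one 0-255 sRGB channel to its linear-light value."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color such as '#1A2B3C' (WCAG definition)."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(color_a: str, color_b: str) -> float:
    """Contrast ratio between two colors, from 1 (identical) up to 21."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


ratio = contrast_ratio("#000000", "#FFFFFF")
print(f"{ratio:.1f}:1")           # black on white prints 21.0:1
print(ratio >= ENHANCED_TEXT)     # True
```

The order of the two colors does not matter because the lighter color always goes in the numerator. For a chart palette, you could loop over each pair of adjacent colors and flag any pair below `MIN_COLOR_BLOCKS`.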

Overall, there is a lot of leeway in color choice when it comes to accessibility, so long as: 1) color-based information is also encoded in some other way; and 2) you use enough contrast in your color choices. WCAG 2.2 guidelines (which are the go-to web accessibility guidelines I’ve been referencing throughout this post) don’t actually say anything about color choices for color vision deficiency. Accounting for the various forms of this disability is nice to do when you are able to do so, but a lot of the concern in this area is reduced by having enough color contrast.

Hopefully this guidance will be useful to you. You don’t need to remove color from your data and you can still choose fun color schemes. You just need to add a few checks to your workflows around color to make sure your data is maximally accessible.

Posted in accessibility, digitalFiles

Disability and Data Sharing

by Kristin Briney Posted on 2026-05-04

I’ve been blogging a little bit about topics at the intersection of accessibility and data sharing in the last year or so. This has been due to my having Long COVID and reinterpreting how I think about my body and my research. As I learn more about disability, I’ve made more and more connections between disability and data sharing. In today’s blog post, I want to examine this overlap in more detail to convince others that the accessibility of research data is an important area to address.

According to the U.S. CDC, 28.7% of all Americans have one or more disabilities. Disability numbers out of the UK are about the same: 24%. Disability is actually very common. It’s a group that everyone is likely to be a part of at some point, especially as we age.

Due to the role of disability in society, disabled people are under-represented as researchers. Only 22.2% of people with disabilities hold a bachelor’s degree or higher (as compared to 42.6% of people without disabilities). It gets worse the further you go in academia. The 2023 U.S. National Science Foundation’s (NSF) Survey of Doctorate Recipients found that between 10% and 15% of U.S. doctorate recipients were disabled, with numbers varying across fields. All of this leads us to conclude that, while disability may be under-represented among researchers (who are more likely to hold higher degrees), it is still very present.

You may already work with a researcher who is disabled. With the high prevalence of non-apparent disabilities (disabilities that are not obvious by looking at someone), it’s likely that you know a disabled researcher even if you don’t know that they are disabled (waves hello). The point is that disability is common in research even if we aren’t always aware of it or talk about it.

How does this relate to research data? For all we speak about data being reproducible and reusable, I argue that data cannot truly be reproducible and reusable unless it is usable by disabled people. If we speak about data being usable by those outside of our labs and how to format data to maximize this, disability needs to be a part of the conversation. Several people have made the point about the need for accessible research data before me, the most recent of whom are Colón, Goben, and Karcher, who argue for “actually accessible data”. I encourage you to check out their paper, which includes a call to action in this area.

The challenge of making data more accessible to disabled people comes down to the details. There are known strategies for making business files more accessible, which can be translated into the research context, but this is far from covering the complete spectrum of research data. Additionally, some of the recommended accessibility strategies (such as formatting requirements for Microsoft Excel files) are in conflict with current reproducibility recommendations (such as to use CSV files with no formatting). At this point in time, there is only a small amount of guidance specifically about making research data files more accessible.

I don’t have an answer to the challenge of making research data files more accessible, though I am slowly trying to chip away at pieces of the challenge. I hope other people will join me in this exercise. I plan to blog more here in the future about any progress I make in this area.

Posted in accessibility

New U.S. DMSP Templates on the Horizon

by Kristin Briney Posted on 2026-04-22

I’ve been writing about data management plans (DMPs) for over a decade on this blog and, while sometimes it feels like I’ve already discussed this topic plenty, the universe decided to throw a curve ball and make me write about DMPs even more. Though it is more accurate to say that the U.S. government is the one throwing the curve balls at the moment.

The U.S. National Science Foundation (NSF) is implementing a new Data Management and Sharing Plan (DMSP) template on April 27, and the U.S. National Institutes of Health (NIH) will implement their new DMSP template on May 25. Both agencies are shifting away from a 2-page narrative DMP and toward DMSPs with rigid check-box/drop-down answers for a handful of questions. There will be space for a couple of free-text descriptions, but otherwise, the two templates are a dramatic shift from how U.S. agencies have handled data management plans for the previous decade.

On one hand, I love the shift toward more structured DMSPs. There has been a significant amount of work done in the community over the past few years to develop machine-actionable DMPs – the idea being that machine-actionable DMPs can easily connect what’s promised with the outputs of the grant. And it looks like the new, more structured DMSPs will allow the agencies to more easily check compliance. Given the benefits of data management and sharing, I’m not against making compliance easier for everyone.

I have several concerns about the new DMSP templates, however. Due to cuts at the NSF and NIH, the rollout of the new templates has been rushed. The NSF, in particular, only provided screenshots of the new NSF DMSP template one week before the template is required; we have to wait until the day that the templates are required to see the full templates. The lack of information about the new templates has made it particularly difficult for specialists like me to prepare researchers for meeting the new requirements.

The rush has also set best practices back. Most egregious is the guidance, shared alongside the screenshots of the new NSF template, which states: “Note that sharing through institutional resources (e.g., lab webpages) can be denoted as ‘Institutional Repository’”. The data management and sharing community has spent well over a decade trying to stop researchers from putting their data on a lab website, and the 2022 Nelson memo explicitly says that research data should be shared in a data repository. The NSF’s current guidance, as quoted, is problematic and goes against all current recommended practices. This is one example of several where the new templates have not been clear or have been counter to current expected practices.

Another concern about the new DMSPs is that they are so stripped down that they have taken what is already a bureaucratic hurdle and made it into a check box. The point of writing a DMP is to help researchers think about and improve their data management and sharing practices. The new pared-down templates don’t really do that. That said, a DMSP written for a grant application probably won’t ever be as beneficial as writing a living DMP, so I will continue to advocate for researchers to write living DMPs.

It’s too early to tell how the rollout of the new DMSP templates will go for two of the biggest funding agencies of academic research in the United States. It’s going to be disruptive for a lot of people, but it’s not clear yet if the templates will be a change for the better. In the meantime, I guess it helps with job security, knowing that I’m needed to help guide people through this process.

Posted in dataManagementPlans

The Year I Appeared in Nature

by Kristin Briney Posted on 2026-03-24

I’ve been working as a librarian for over a decade. But for some reason, this is the year I’ve been discovered by journalists. I have appeared in Nature three times this academic year. The three articles include:

Wild, S. (2025) Need to update your data? Follow these five tips. Nature 643, 868-869.

Briney regularly helps researchers to wrangle their data. Her favourite tips for data management are to establish a file naming convention, which includes the date (often given as YYYYMMDD or YYYY-MM-DD), and to store files in their correct folders.

Dance, A. (2026) Why every scientist needs a librarian. Nature 650, 1063-1065.

Librarians like to say that an hour in the library is worth a month in the laboratory, quips Kristin Briney, biology and biological engineering librarian at the California Institute of Technology (Caltech) in Pasadena, California. And the Caltech library team points out that a researcher could avoid hours of solo Internet searching by just sending a quick e-mail to a specialist librarian to get the same results.

Wild, S. (2026) Drowning in data sets? Here’s how to cut them down to size. Nature 651, 1121-1122.

“This is a problem that libraries have been dealing with for as long as libraries have existed,” says Kristin Briney, a librarian at the California Institute of Technology (Caltech) in Pasadena. “We cannot physically collect all the books that we want to collect, and in 50 years, the book may not be useful any more.”

Data sets, she says, are the same. “There has to be some curation that determines what is worth keeping and what is worth throwing away.”

While I’m honored to share my voice among the many others that appear in these articles, mostly I’m excited to see Nature covering topics around data and librarianship. I hope that you check the articles out and enjoy the work of these amazing science journalists.

Posted in admin, dataManagement