AFFILIATE RESEARCH
Enhancing the Ethics of User-Sourced Online Data Collection and Sharing
Michelle N. Meyer, John Basl, David Choffnes, Christo Wilson, and David Lazer | July 2023
Abstract: Social media and other internet platforms are making it even harder for researchers to investigate their effects on society. One way forward is user-sourced collection of data to be shared among many researchers, using robust ethics tools to protect the interests of research participants and society.
Twitter’s revocation of special academic access to its application programming interfaces (APIs) is the latest blow to the study of information sharing and consumption on the internet [1]. Platform APIs offer easy access to data, and Twitter is the modal source of online behavioral data, largely because of its generous APIs, for studying everything from misinformation to the filter bubble.
Yet, while this loss is disastrous for the research community and for others who relied on these APIs, such as journalists and various civil society actors, API-based (or platform-sourced) data collections were always quite limited: built for third-party app producers, not for research; often unreliable [2]; providing researchers little access to the central variable of interest (that is, what users actually see [3]); and subject to the whims of the platforms being studied. The recent batch of papers in Science and Nature, which involved a structured collaboration between academics and Meta to study the role of Facebook and Instagram in the 2020 election, offers one powerful model for studying the internet [4]. It is notable, however, that no company (even Meta) has committed to a similar effort in the future. The field needed new, independent paradigms for studying the internet long before the present-day retraction of APIs.