Thursday, October 7, 2021

Facebook hides data showing it harms users. Outside scholars need access.


The social media company has lost its right to secrecy.

By Nathaniel Persily

The disclosures made by whistleblower Frances Haugen about Facebook — first to the Wall Street Journal and then to “60 Minutes” — ought to be the stuff of shareholders’ nightmares: When she left Facebook, she took with her documents showing, for example, that Facebook knew Instagram was making girls’ body-image issues worse, that internal investigators knew a Mexican drug cartel was using the platform to recruit hit men and that the company misled its own oversight board about having a separate content appeals process for a large number of influential users. (Haugen is scheduled to appear before a Congressional panel on Tuesday.)


Facebook, however, may be too big for the revelations to hurt its market position — a sign that it may be long past time for the government to step in and regulate the social media company. But in order for policymakers to effectively regulate Facebook — as well as Google, Twitter, TikTok and other Internet companies — they need to understand what is actually happening on the platforms.


Whether the problem is disinformation, hate speech, teenagers’ depression or content that encourages violent insurrection, governments cannot institute sound policies if they do not know the character and scale of these problems. Unfortunately, only the platforms have access to the relevant data, and as the newest revelations suggest, they have strong incentives not to make their internal research available to the public. Independent research on how people use social media platforms is clearly essential.



After years of frustration — frustration also felt by many Facebook employees trying to do the right thing — I resigned last year as co-chair of an outside effort to try to get the company to share more data with researchers. Facebook’s claims of privacy dangers and fears about another Cambridge Analytica scandal significantly hindered our efforts. (In that scandal, an academic researcher improperly harvested Facebook users’ data and passed it to the political consulting firm Cambridge Analytica, prompting a federal investigation into Facebook’s data-protection practices that ended with a $5 billion fine.)


When Facebook did finally give researchers access to data, the data set turned out to contain significant errors — a problem discovered only after researchers had spent hundreds of hours analyzing it, and in some cases had published findings based on it (about, for example, how disinformation spreads).


So we are now at a standstill: the public does not trust the research and data Facebook releases, and Facebook says existing law (including the Cambridge Analytica settlement) prevents it from sharing useful data with outside researchers. Congress can solve this problem by passing a law granting scholars outside the social media companies access to the information those companies hold, while protecting user privacy. (I have drafted text for a law along these lines, which I call the “Platform Transparency and Accountability Act.”)


Some models exist for analogous research on sensitive government databases, such as those overseen by the Census Bureau, the Internal Revenue Service or the Defense Department, and protocols exist, too, for studying biomedical and other highly personal data. But getting access to Facebook’s and Google’s data represents a challenge that is different in kind and degree. It’s not much of an exaggeration to say that almost all of human experience now takes place on these platforms, which control intimate communications between individuals and possess voluminous information about what users read, forward, “like” and purchase.



Several ingredients seem important to ensuring the success of a new data-access regime for independent researchers. First, a government agency — most likely the Federal Trade Commission, which already investigates issues of online fraud and privacy violations — would have to be vested with sufficient power to police researchers’ behavior, as well as to ensure platforms’ cooperation with projects the agency approves.


Second, the government itself should not have access to the data. The risk of surveillance and mission creep from law enforcement is simply too great. The data must stay within each firm’s control, but the FTC should specify in detail the procedures for accessing it and the requirements for the facilities at firms — “clean rooms” — where outside researchers would analyze it. Those requirements would likely include recording every keystroke a researcher makes while accessing the data and vetting any resulting publications to ensure that no private information leaks out.


Third, the firms should have no power to decide which researchers get access; that is for the FTC to approve. Toward that end, the agency should work with the National Science Foundation to develop the rules, procedures and applications governing which researchers get the nod. Who counts as a researcher? It makes sense to start with scholars at universities, because universities have institutional review boards to prevent ethics violations and can be signatories to the relevant data-access agreements. If it proves possible to legally define who counts as a “legitimate” journalist or think-tank scholar, perhaps access could eventually be expanded beyond professors.


Critics of the Silicon Valley companies often describe them as monopolies, referring to their scale and their power over the markets they (theoretically) compete in. But the most recent Facebook revelations underscore that they are also data monopolies: They have exclusive access to the information needed to understand the most pressing challenges to society.



The current situation — platforms controlling all their data and deciding what information the public deserves to know — is unsustainable. So, too, would it be a mistake for Congress to regulate the Internet based on folk theories or misguided conventional wisdom regarding the harms caused by these new technologies. We need good information.


As divided as Republicans and Democrats may be on exactly how these companies should be regulated, members of both parties should be able to come together to break these firms’ stranglehold on the information necessary for sound technology policy.

