Five years after one of the most heated internet policy debates of the 21st century, a new report claims that Stop Online Piracy Act (SOPA)-style website blocking isn’t such a bad idea after all. We disagree.
The new report from the Information Technology and Innovation Foundation (ITIF) claims that requiring internet service providers to block websites that host questionably legal content can prevent piracy. This analysis, and the public policy outcomes for which it advocates, are deeply flawed for a number of reasons.
Blocking sites ignores important legal implications for free speech and sets dangerous precedents for censorship. It’s also a terribly inefficient way to address online piracy. The United States has already enacted effective laws to address illegal activity online, and a new website blocking law would undermine the success of these laws. Finally, the analysis and evidence used to justify blocking websites are suspect and overstated.
Stepping Back: The Policy Around Piracy
The public opposition that ultimately blocked SOPA was not about piracy itself; it was about fears that SOPA’s resulting website blocking would also unintentionally restrict legal content and online speech. We can all agree on the worthy goal of reducing copyright infringement online, but the potential for unnecessary censorship caused by SOPA would have significantly undermined the technical and public policy foundations of a free and open internet.
To fully understand the potential harm of SOPA, it’s important to understand the foundational goals of copyright law: advancing the arts for the public good.
Existing policies balance the exclusive rights of creators with the flexibility needed to foster further creativity and the public interest. Laws like SOPA could force the blocking of websites that contain legal material, and would undermine the safe harbors that have allowed legal online platforms to grow.
The flexibilities in copyright law that protect the public’s access to legal content, such as fair use, are not always clear-cut. For example, the ITIF report refers to the benefits of blocking per se illegal material (meaning something that is always illegal, by its very existence), such as child pornography. Copyrighted material is different: accessing music, video and other intellectual property content may infringe copyright in one instance and be legal in another, depending on how it was acquired or used. Combating piracy is a legitimate public policy objective, but protecting the public’s access to the scale and diversity of legal online platforms offers society significant economic, social, and cultural gains.
Through the Digital Millennium Copyright Act (DMCA), the U.S. has already established a robust legal structure to facilitate cooperation between industries in combating online infringement. This law lays the groundwork for addressing online copyright infringement – including safe harbor clauses that require internet platforms to expeditiously remove infringing content when identified – and has allowed internet companies and traditional creators alike to grow in the digital age.
The consistent, cooperative effort between creators and platforms is necessary to continually maintain a system of protections in a rapidly changing environment. However, creating an ecosystem of censorship and legal uncertainty through a SOPA-like regime would undermine the efficacy of the DMCA’s notice and takedown system.
The Numbers Don’t Add Up
Blocking websites has self-evident, dangerous legal implications, but there is also little evidence that it would be effective. The research drawn on by the ITIF report is well-intentioned in bringing more empirical analysis to the issue, but its findings are ultimately unreliable.
After recreating the analysis utilized by the ITIF report – using the same data and procedure, but with more robust controls added – we found that blocking 53 piracy websites in the UK resulted in only a 1%-5% reduction in total online piracy, and that visits to unblocked piracy sites increased by 0.1%-3.5%. However, even our results are not definitive, because the original study’s approach (which we follow) is insufficiently rigorous to produce reliable results.
The first statistical red flag in the analysis drawn on by the ITIF report is the implausibly high R-squared figures in their regression summary tables. These figures are 0.979, 0.851, 0.99, and 0.97 for the four regressions reported.
Second, even taking the R-squared values at face value, the report’s summary statistics do not match their claimed results. Specifically, the report claims that Visits to Blocked Piracy Sites fell from 86,735 visits prior to the implementation of a SOPA-like website blocking regime, to 10,474 visits in the post-block period. However, the study’s flaws are highlighted when the pre-implementation Average Visits to Blocked Sites Per User are added together: there are only 62,894 Visits to Blocked Piracy Sites in the pre-period, compared to the 86,735 previously stated.
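The size of this mismatch can be checked with simple arithmetic. The sketch below uses only the three figures quoted above; the variable names are ours, and the second percentage assumes the post-block figure is comparable with the summed baseline, which the study does not confirm.

```python
# Figures quoted in the paragraph above; variable names are illustrative.
claimed_pre_visits = 86_735   # pre-block visits as stated in the study's text
summed_pre_visits = 62_894    # pre-block visits obtained from the summary statistics
post_visits = 10_474          # post-block visits as stated in the study's text


def pct_drop(pre, post):
    """Percentage decline from the pre-block to the post-block figure."""
    return 100 * (pre - post) / pre


# Gap between the two pre-block figures, in visits and as a share of the claim.
gap = claimed_pre_visits - summed_pre_visits
gap_share = 100 * gap / claimed_pre_visits

print(f"Gap between baselines: {gap} visits ({gap_share:.1f}% of the claimed figure)")
print(f"Drop using claimed baseline: {pct_drop(claimed_pre_visits, post_visits):.1f}%")
print(f"Drop using summed baseline:  {pct_drop(summed_pre_visits, post_visits):.1f}%")
```

Roughly a quarter of the claimed pre-block visits cannot be traced to the summary statistics, so the headline decline depends on which baseline is taken as correct.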
Third, the summary regression results report only 20 observations – 10 observation groups over two periods – but the analysis setup should result in 60 total observations. The study claims to look at a 7-month period – with 3 months of pre-block observations, the month of implementation (which is not included in the regression), and 3 months of post-block observations. Something doesn’t add up.
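The observation counts follow directly from the study design described above. A minimal sketch, assuming one observation per segment per month (our reading of the setup, not something the study states explicitly):

```python
segments = 10                    # observation groups reported in the study
pre_months, post_months = 3, 3   # the implementation month is excluded

# One observation per segment per month in the 7-month window,
# minus the excluded implementation month.
monthly_obs = segments * (pre_months + post_months)

# What the regression tables imply: each segment observed once
# before and once after the block.
collapsed_obs = segments * 2

print(monthly_obs, collapsed_obs)
```

The gap between the two counts suggests the monthly data were collapsed into a single pre-period and a single post-period, which discards the month-to-month variation.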
These obvious discrepancies cast doubt on the study’s results, but there are more problems with the fundamental design of the study itself. First, there is little rationale for removing the month of implementation from this type of model – a better approach would have been to use a 1- or 2-month time lag in the model to account for delays in reaction to the website block.
Second, the baseline case used to benchmark the impact of new website-blocking legislation provides little insight. The control case was a group segment that was larger than all other segments combined and which had no Visits to Blocked Piracy Sites. A better approach would have been to use a control group that was similar in size and in the frequency of Visits to Blocked Piracy Sites, but that was not subjected to the November 2014 website blocking.
Third, the researchers fail to accurately adjust for unobservable shocks and influences over time. They use a post-treatment dummy, rather than time fixed effects, which fails to reflect that their study used six periods of observations.
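The difference between the two approaches is easiest to see in the regressors themselves. The sketch below is illustrative only: the month labels are hypothetical (months relative to the excluded implementation month), and it builds the indicator columns rather than fitting any model.

```python
# Hypothetical month labels relative to the (excluded) implementation month.
months = [-3, -2, -1, 1, 2, 3]

# Post-treatment dummy: a single column, 0 before the block and 1 after.
post_dummy = [int(m > 0) for m in months]

# Time fixed effects: one indicator column per month, with the first
# month dropped as the reference category.
time_fixed_effects = [[int(m == k) for k in months[1:]] for m in months]

# The dummy can tell apart only two kinds of period; the fixed effects
# give every month its own pattern, so each month's shock can be absorbed.
print(len(set(post_dummy)))
print(len({tuple(row) for row in time_fixed_effects}))
```

With only the dummy, any shock specific to a single month (a major release, a seasonal dip) is forced into either the pre-period or post-period average.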
Finally, there is an a priori expectation that blocking certain websites would lower the number of Visits to Blocked Piracy Sites, so finding that result tells us little. The more important question is what happens to Total Piracy Visits and Unblocked Piracy Visits: does piracy simply switch from blocked sites to other sites that are still accessible? To answer it, the model should include a series of controls to account for all piracy-related activity, including the use of unblocked sites, VPNs, legal subscriptions, and others.
Fixing The Issues
After discovering the flaws in the analysis, we reran it using the same model and created two adjusted models that address those flaws.
The following three difference-in-difference model specifications were used in the replication:
where LnVisits_jt indicates the natural log of visits made by consumer group j during period t. After_t is a dummy variable for the post-treatment period and TreatIntensity_j is the average number of Visits to Blocked Piracy Sites per user in each segment. The term ControlVector_j represents a series of controls in natural log form (to allow the results to be read as elasticities) for 1) Users in Segment, 2) Legal Subscription Visits, and 3) VPN Visits. The term μ_j is a set of segment fixed effects. Specification 1 is the original report model. Specification 2 adds the series of controls for other types of piracy-related activity while keeping the segment fixed effects. Specification 3 is the same as Specification 2 but removes the segment fixed effects, which we feel are unnecessary given the inclusion of the control characteristics.
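Written out from the definitions above, the three specifications would plausibly take the following form. This is a reconstruction under those definitions, not a copy of the original equations: the coefficient names (β, γ) and the error term ε are our assumptions, and the control vector keeps the index j used in the text.

```latex
% Specification 1: original report model
\ln \mathrm{Visits}_{jt} = \beta_0 + \beta_1 \mathrm{After}_t
  + \beta_2 \left( \mathrm{After}_t \times \mathrm{TreatIntensity}_j \right)
  + \mu_j + \varepsilon_{jt}

% Specification 2: adds the control vector, keeps segment fixed effects
\ln \mathrm{Visits}_{jt} = \beta_0 + \beta_1 \mathrm{After}_t
  + \beta_2 \left( \mathrm{After}_t \times \mathrm{TreatIntensity}_j \right)
  + \gamma^{\top} \mathrm{ControlVector}_j
  + \mu_j + \varepsilon_{jt}

% Specification 3: as Specification 2, without segment fixed effects
\ln \mathrm{Visits}_{jt} = \beta_0 + \beta_1 \mathrm{After}_t
  + \beta_2 \left( \mathrm{After}_t \times \mathrm{TreatIntensity}_j \right)
  + \gamma^{\top} \mathrm{ControlVector}_j
  + \varepsilon_{jt}
```

In each case the coefficient of interest is β₂, which captures how the change in visits after the block varies with a segment's prior exposure to the blocked sites.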
Because of discrepancies between the report’s data and its claimed results, we used two scenarios. Scenario 1 used the figures from the summary statistics, and Scenario 2 used the figures claimed in the text of the results. Since the figures reported in the summary statistics are for the pre-period only, we assumed an average 25% increase in 1) Users in Segment, 2) Legal Subscription Visits, and 3) VPN Visits in each segment for the post-period in Scenario 1.
Finally, we do not agree with the control used in the original study so we ran our additional analysis with both the original study’s control group (Segment 0) and with a new control group labeled as the Alternative Control. The Alternative Control group (Segment 10, which replaces Segment 0) has the average number of each of the metrics from across the other 9 segments and assumes no treatment.
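The construction of the Alternative Control can be sketched as follows. The field names and the three placeholder segments are hypothetical, not the study's data, and setting the treatment intensity to zero is our reading of "assumes no treatment".

```python
from statistics import mean


def alternative_control(treated_segments):
    """Build a synthetic control segment from the treated segments.

    Each metric is the simple average across the treated segments;
    treatment intensity is then set to zero (our interpretation of
    "assumes no treatment").
    """
    metrics = treated_segments[0].keys()
    control = {m: mean(s[m] for s in treated_segments) for m in metrics}
    control["blocked_visits_per_user"] = 0.0
    return control


# Placeholder values for illustration only, not figures from the study.
example_segments = [
    {"users": 100, "legal_visits": 10.0, "vpn_visits": 2.0, "blocked_visits_per_user": 4.0},
    {"users": 200, "legal_visits": 20.0, "vpn_visits": 4.0, "blocked_visits_per_user": 6.0},
    {"users": 300, "legal_visits": 30.0, "vpn_visits": 6.0, "blocked_visits_per_user": 8.0},
]

control = alternative_control(example_segments)
print(control)
```

A control built this way is comparable in scale to the treated segments, which is the property the original Segment 0 lacked.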
Across the three model specifications, in both data scenarios (Scenario 1 and Scenario 2), and using both the original control group and Alternative Control group, the results are statistically significant and consistent.
They show a reduction of just 1%-5% in Total Piracy Visits and a corresponding increase of approximately 0.1%-3.5% in Unblocked Piracy Visits. These are far less impressive results for advocates of website blocking than those found in the ITIF study. These findings are also the result of a more rigorous analysis, with a fuller set of scenarios that include proper controls and a better counterfactual.
The results presented here illustrate two things: the claimed evidence on the effectiveness of website blocking is clearly suspect and, in the realm of internet policy, we must be careful to allow time for the sector and its members to innovate and develop effective solutions.
The Internet Association deplores illegal activities through any medium and strongly supports existing laws like Section 512 of the DMCA to address legitimate infringement claims. However, website blocking is ineffective and is antithetical to a free and open internet. Promoting censorship online will undermine the effectiveness of successful laws and threaten the public interest.
Alluding to the once censored Fitzgerald – censored for ‘promoting’ illegal activity – we must be able to consider opposing thoughts at the same time and evaluate them fully.
Appendix: Regression Summary Tables