United States

Direct Censorship of Internet Search Engines

Congress has shown growing concern over the content distributed over the internet.  This is evidenced by the Children's Online Privacy Protection Act, the Children's Internet Protection Act, the Child Online Protection Act, and the Communications Decency Act (held unconstitutional in part in Reno v. ACLU, 521 U.S. 844 (1997)).  While these statutes restrict the information that may be supplied over the internet and the access that users, especially minors, have to it, the Digital Millennium Copyright Act of 1998 (DMCA) presently appears to have the greatest implications for search engine companies in particular.

Though the argument is untested in U.S. courts, at least one author contends that the DMCA’s safe harbor provisions constitute unconstitutional censorship in violation of the First Amendment.  In his article, Application of the DMCA Safe Harbor Provisions to Search Engines, Craig Walker argues that the DMCA provisions are unclear and that this lack of clarity, coupled with the risk aversion of search engine operators, will lead to overzealous adherence to the safe harbor provisions.  Because they are risk-averse, operators will more readily remove allegedly infringing websites at the behest of copyright holders than contest the complaint themselves.  The prediction, in short, is that private search engine companies will consistently err in favor of copyright holders and to the detriment of the targeted website owners’ First Amendment rights.  This balancing of interests is better accomplished by Congress and through use of the existing copyright protection schemes.

While search engine companies may ultimately be found liable under the DMCA or under trademark or copyright law, this author has found no present U.S. legislation expressly requiring a search engine company to alter its index or limit its search results based on government requirements.

 

Robots, Iraq, and the White House

On October 24, 2003, officials responsible for www.whitehouse.gov seemingly added all directories mentioning “Iraq” to the site’s robots.txt file, thus preventing search engines from indexing those web pages and displaying them as search results.  Though the event created a stir, even the Democratic National Committee web log noted that the web pages themselves still existed, were still available to the public, and could still be found by conducting a search on www.whitehouse.gov itself.  Nonetheless, those concerned with the threat of revisionist history offered the theory that the robots.txt file was changed to prevent search engines from caching those files.  They charge that without a cached copy, it is more difficult to prove that the administration made substantial changes to the original source.  To illustrate the point, they noted that the White House had revised earlier pages on the site that claimed combat in Iraq had ceased, changing the phrase to “major combat.”

In response to these claims, White House spokesman Jimmy Orr asserted that the changes to the robots.txt file were made to avoid duplication.  Because the White House maintains a separate section devoted to issues relating to Iraq (see www.whitehouse.gov/infocus/iraq/index.html), Orr contended that the section reused many files also posted in other sections of the site.  Therefore, to prevent duplication among the more than 33,000 documents on the website, someone on Orr’s website staff of ten altered the robots.txt file.  Though most critics agree that many of the items listed in the altered robots.txt file were indeed duplicates, those claiming to have conducted a more diligent analysis point out that original web folders were hidden from internet search engines as well.

Although in the end this may be the controversy that never was, the debate highlights the potential use and misuse of the robots.txt file.  Although the danger is considerably less than that posed by the blanket filtering employed by some nations, the potential for abuse remains.  Because of the nature of caching, it is possible to search for and review expired web pages and to compare them to their replacements.  This can be done only if the web pages were indexed.  Indexing occurs only if the robots.txt file does not prevent the search engine crawler from indexing the page.
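The mechanism at issue can be illustrated with a short sketch.  It assumes a hypothetical robots.txt file (the directory names below are invented for illustration and are not the actual whitehouse.gov entries) and a hypothetical crawler name, "ExampleBot."  It uses the robots.txt parser in Python's standard library to show how a compliant crawler would decide whether it may fetch, and therefore index or cache, a given page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The paths are illustrative assumptions,
# not the entries that appeared in the real whitehouse.gov file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /news/iraq/
Disallow: /infocus/example/iraq/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Load robots.txt rules from a string instead of fetching them."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a compliant crawler named `agent` may fetch `path`."""
    return parser.can_fetch(agent, path)


parser = build_parser(ROBOTS_TXT)

# A page under a disallowed directory is skipped by compliant crawlers,
# so it is neither indexed nor cached by them.
blocked = may_crawl(parser, "/news/iraq/page.html")

# A page outside the disallowed directories remains crawlable.
allowed = may_crawl(parser, "/news/economy/page.html")

print("Iraq page crawlable:", blocked)
print("Economy page crawlable:", allowed)
```

Note that the file only instructs crawlers that choose to honor it; as the discussion above observes, the pages themselves remain publicly available on the site.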

 

More:

  • Keith Spurgeon, Whitehouse.gov Robots.txt, at http://www.bway.net/~keith/whrobots/index.html (last visited Apr. 9, 2005) (providing the original posting on the topic as well as a detailed description of the perceived changes to the robots.txt).

 

This website was created as an assignment for the Cyberspace Law seminar at the University of North Carolina School of Law.  Information contained in this site should not be considered legal advice. This website was created solely for educational purposes. All copyrighted content, trade names, and trademarks incorporated into this website are property of their respective owners and are reproduced with permission and/or under the Fair Use guidelines for educational purposes.

Last updated: 04/12/05.