Linguists for the responsible use of internet data

Welcome! Please feel free to join in this discussion and to contribute ideas, arguments, and suggested methods.

Mission Statement: Traditionally, linguists have used corpora and databases as empirical sources for research. The internet now houses billions of electronically searchable words representing dozens of languages. All of this vast digital resource has the potential to be used as linguistic data. We as linguists need to develop standards and best practices for the use of this unprecedented resource to ensure that our use of internet data complies fully with the goals and expectations of our profession. This page will serve as a discussion forum by presenting the advantages of internet data, debating its possible disadvantages, and suggesting standards for its use.

Advantages of internet data:

Disadvantages of internet data:

Standards for use of internet data:


If you want to contribute ideas to this discussion, please feel free to use our blog. Go to this site, log in as "uncslav" with the password "uncslav", and click on "Just Google It?".

Useful links:


Linguists who have contributed to this discussion:

Hyug Ahn, Ashley Batten, Biljana Belamaric-Wilsey, Jamie Bishop, Sung-ho Choi, Dagmar Divjak, Sean Flanagan, Laura A. Janda, Anne S. Keown, Patrick Murphy, James Phillips, Jenne Powers



This site last updated: October 28, 2003