Syntax

Syntax covers the material in Pinker's chapter 4, as well as Chapter 6 in Language Files. This review guide should help you to highlight important points for the exam, though it isn't meant as a complete substitute for re-reading your Pinker chapter on syntax, Language Files, or your class notes.

Background

a) Some basics

Words conjure up concepts even though they aren't the concept itself. So, "horse" is not actually a HORSE, and "cat" is not a CAT. Rather, these are examples of the arbitrariness of the sign, i.e. of the conventional pairing of sound and meaning. What we mean by this is that words and the things in the world that they point to are arbitrarily related. There's nothing inherently feline in the combination of sounds that makes up the word "cat." Thus, in Coatzospan Mixtec, a language spoken in Mexico, the word for CAT is "mishtun", which sounds completely different from the English word but means the same thing. Since signs are arbitrary, one of the things we have to do when we learn a language is memorize them. This may bring back fond memories of your Spanish or French or German classes!

Syntax is the study of how we put words together to make sentences. One of the most interesting things about syntax is that it makes infinite use of finite media. We'll come back to this point. For now, consider this:

Man bites dog IS NOT EQUAL TO Dog bites man

We know this because we use a code, or set of rules, to translate between orders of words and combinations of thoughts. This set of rules is called a GENERATIVE GRAMMAR.

Note: this is NOT the same as a pedagogical or PRESCRIPTIVE grammar. The study of syntax is not about telling people how they should talk; rather, it is about understanding the rules that a speech community employs by examining the way that the members of that community actually DO talk. So, syntax is not about telling you that you can't start a sentence with a conjunction, or that you can't split an infinitive, or that you can't strand a preposition at the end of a sentence. In fact, you CAN do all of these things in English, and we do so freely and frequently. Let's keep two important concepts separate then:

Prescriptive Grammar: A set of artificial rules pertaining to some group's notion of "correct" usage. Examples from English are rules that prohibit split infinitives, as in the supposedly incorrect sentence "I wanted Zim to really try hard." Here, the prescriptive grammarian will tell us that the sentence is incorrect because the word "really" appears between the words "to" and "try", thus splitting the infinitive "to try". Prescriptive grammar is notable in that its rules often fly in the face of how many if not all of the speakers in a speech community actually use the language. And prescriptive grammar is neither what this course is about nor what serious linguistic scholarship concerns itself with (he says, while gleefully stranding the preposition with at the end of the phrase).

Generative Grammar: A grammar that reflects the way a speech community uses a language. A generative grammar attempts to encode what it is that a speaker of a language knows, i.e. it characterizes the grammatical knowledge of the speaker. It DOESN'T prescribe what you can and can't say. The job of the generative grammar is two-fold. First, it has to characterize all of the sentences that native speakers judge to be acceptable sentences of their language. Second, it has to rule out sentences that are not possible sentences of the language. In a broader sense, a generative grammar must also accomplish these goals within the context of a set of rules and principles that are applicable across languages.

Big picture: what's a grammar?

b) grammar as a discrete combinatorial system

Let's start by saying that it is profitable to view grammar as a discrete combinatorial system. What does this mean? It means that we take a finite number of discrete elements (words!) and combine them to create larger structures that are different from the original words themselves. The words we choose and the way we combine them make a big difference, obviously. The order of combination lies at the heart of the problem of the following pair of headlines:
c) So how does language work (really simply)? Two parts:

Understanding the nature of the mental dictionary is the object of the study of morphology. Syntax focuses on the grammar of combination.
 
 

d) Questions:

e) Answers:

  • A1: the longest sentence can stretch to infinity (you can always make a sentence longer by adding words)
  • A2: you can make an infinite number of sentences (this actually follows from the fact above)
  • (Pinker notes: on average if you are interrupted at a random point in a sentence, there are 10 words that could be the next word. If you are capable of producing a 20 word sentence (we can actually do much better), the number of possible sentences you could produce is in principle at least 10 to the 20th or a hundred million trillion. At five seconds a sentence, you'd need a hundred trillion years to memorize all the sentences.)

    Anyway, this is what we mean by infinite use of finite media. We are in principle capable of never repeating a sentence. We are not restricted to a fixed set of prefabricated phrases. We produce things we've never heard. We understand completely novel sentences that we've never produced.
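The arithmetic in Pinker's note is easy to check for yourself. Here is a minimal sketch in Python, assuming exactly 10 choices at each of 20 word positions and a 365.25-day year (the variable names are just illustrative). With these inputs the sketch lands in the tens of trillions of years, the same astronomical ballpark as the figure quoted above; either way, memorization is out of the question.

```python
# Assumptions taken from the note above: 10 possible next words at each
# point, a 20-word sentence, and 5 seconds to memorize one sentence.
WORDS_PER_CHOICE = 10
SENTENCE_LENGTH = 20
SECONDS_PER_SENTENCE = 5
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumed length of a year

# One independent choice per word position, so the counts multiply.
possible_sentences = WORDS_PER_CHOICE ** SENTENCE_LENGTH
years_to_memorize = possible_sentences * SECONDS_PER_SENTENCE / SECONDS_PER_YEAR

print(f"{possible_sentences:,} possible sentences")
print(f"roughly {years_to_memorize:.1e} years to memorize them all")
```

Try changing SENTENCE_LENGTH to 30 and watch the numbers get even sillier.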

    Autonomy: another crucial property of grammar

    f) Autonomy

    Grammar is autonomous. What do we mean by autonomous? Well, we mean autonomous from cognition. Give us an example, you say. Okay. Here are two examples. Consider the following sentences.
    What is striking here is that we all share a strong sense as speakers of English that these sentences are ill formed (i.e. not the kind of English sentence that we would ever put together) but that they are also clearly interpretable. That is, we can understand them even though we sense that they are badly formed. So whatever it is that lets us know that they are structurally flawed is autonomous from whatever it is that lets us understand their intended meaning.
    g) This brings us to the concept of UNGRAMMATICALITY.
    The sentences above are examples of ungrammatical sentences. Ungrammaticality is often a problematic concept for beginning linguistics students for the following reason. It is easy to get ungrammaticality confused with prescriptivity. If I say to you that the sentence "This car no radio" is ungrammatical, you might ask yourself how that is any less prescriptive than someone who says that it's wrong to say "Who did you see?" instead of "Whom did you see?" or that it is incorrect to say "Who are you going out with?" rather than the supposedly more proper "With whom are you going out?" The difference is that sentences like "Who are you going out with?" are produced by native English speakers ALL THE TIME. If I tell you that they are incorrect, I'm telling you, a native speaker of English, that you don't know how to speak your own language. That doesn't make any sense at all. By contrast, sentences like "This car no radio" are not systematically produced by native speakers, and, when asked, native speakers judge them to be odd. It is this type of sentence that we reserve the term "ungrammatical" for. In non-technical terms, we have a gut feeling that something is wrong with such sentences, despite their potential interpretability. In more formal terms, we have a strong sense that the rules of sentence formation used by whoever produces "This car no radio" or similarly ill-formed sentences are not the same as the code or rules that we employ in interpreting them.

    h) What else suggests that grammar is autonomous?

    The flip side of ungrammaticality is that sentences can be well formed but completely nonsensical. Chomsky's famous "Colorless green ideas sleep furiously" is a well-known example of this.

    Is this sentence ungrammatical? Answer: NO. What's wrong with it is that it doesn't make sense. But what's interesting is that we have a strong sense that it is perfectly well formed from a syntactic point of view. That is, it is structurally sound. We could, in fact, substitute other words of the same grammatical category and make the sentence perfect, as in "Happy little kittens sleep soundly."

    Put sentences that are ungrammatical but understandable together with sentences that are perfectly grammatical but nonsensical and you have more arguments for the autonomy of syntax. In fact, this autonomy is precisely what allows us to enjoy nonsense verse like Lewis Carroll's. What we are feeling is the tension between the structural integrity of the verse and the nonsensical nature of the meaning.

    i) In short:
    The code we use to put sentences together is independent or autonomous--separate from meaning--and this is basically what we mean when we talk about an autonomous syntax.

    So how does our autonomous, discrete combinatorial syntax work?



    j) The simplest guess: the word chain

    We might view syntax as a finite state grammar (word chain device). Pinker actually argues against this view and for another view. First, what's a word chain device?

    Automated phone number announcements are a finite state or word chain device. You call information, and a recorded voice spits out the number you are looking for. This works because the phone company has recorded each of the 10 digits with seven different intonation patterns, one for each position that a digit might occupy in a seven-digit phone number. The device then selects one digit for first position and then moves on to select a digit in second position and so forth. Mathematically, 10,000,000 numbers can be generated from the seventy recordings. That's a powerful device. Such devices can actually generate an infinite number of sentences from a finite number of words. See p. 92 in Pinker for a simple example and a picture. (Make sure you can understand how such a device works.)
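If it helps to see a word chain device in action, here is a minimal sketch in Python. The phone-number counts come straight from the paragraph above; the little CHAIN table and its state names are hypothetical (they are not Pinker's p. 92 diagram), made up only to show how a loop in the chain lets a finite word list produce sentences of any length.

```python
# Phone-number device: 10 digits, each recorded once per position.
DIGITS = 10
POSITIONS = 7
recordings = DIGITS * POSITIONS        # 70 recordings
phone_numbers = DIGITS ** POSITIONS    # 10,000,000 possible numbers

# Hypothetical word chain: each state lists (word, next_state) choices.
# "happy" leads back to the same state, so it can repeat without limit.
CHAIN = {
    "START": [("the", "ADJ_OR_NOUN"), ("a", "ADJ_OR_NOUN")],
    "ADJ_OR_NOUN": [("happy", "ADJ_OR_NOUN"), ("dog", "VERB"), ("girl", "VERB")],
    "VERB": [("eats", "OBJECT")],
    "OBJECT": [("candy", "END"), ("bones", "END")],
}

def sentences(max_words):
    """List every sentence the chain can produce using at most max_words words."""
    results = []
    frontier = [("START", [])]
    while frontier:
        state, words = frontier.pop()
        if state == "END":
            results.append(" ".join(words))
            continue
        if len(words) >= max_words:
            continue  # stop exploring: this path is already too long
        for word, next_state in CHAIN[state]:
            frontier.append((next_state, words + [word]))
    return sorted(results)

print(recordings, "recordings give", phone_numbers, "phone numbers")
for limit in (4, 5, 6):
    print(limit, "words or fewer:", len(sentences(limit)), "sentences")
```

Every time you raise the word limit, the "happy" loop supplies new sentences, so there is no longest one. Notice, though, that the device only ever looks at the current state; it has no memory of what came earlier in the sentence.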
     
     

    k) Why aren't such devices the best way to model a natural language like English?
    Pinker makes two basic arguments against viewing word chains as the best way to model natural language syntax. Here's a list of his main points. Make sure you understand his arguments:

    l) Q: If word chains aren't the best approach, what's the alternative???

    A: TREES

    Syntax and Trees



    m) Consider the following 2 English sentences:

    How do we turn these sentences into questions in English?
    This seems obvious enough, but it's worth asking why we don't make the question as follows:
    The reason is this: there's more to syntax than left-to-right order, i.e. more to it than just lining one word up after the next. Think back to the discussion above about child language acquisition. Note that the badly formed question would actually seem to involve the simpler rule: move the first "is" to the left of the subject ("a turtle" in this case). But that's not what happens. What this means is that we've uncovered hierarchical structure in the sentence.
    n) Constituency
    In slightly more technical terms, we've uncovered constituent structure. Specifically, the whole string of words [a turtle that is happy] is a UNIT (a Noun Phrase to be precise), which happens to be playing the role of the subject in this sentence. The "is" inside of this subject is thus invisible to the question rule (because it's buried inside the subject) which is looking to move the "is" that is the main verb of the whole sentence. So, we don't just move the first "is". We move the "is" that is the main verb and place it to the left of the subject. (This is, of course, itself oversimplified but will do for our purposes here.) In simple terms, constituents are groups of words that act as a single unit within a sentence. (Note: Each single word in a sentence considered on its own is also considered a constituent.)

    So here's what we've moved around: Is [a turtle that is happy] in the houseboat?
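To see the difference between the two candidate question rules, here is a minimal sketch in Python. It assumes the statement being questioned is "a turtle that is happy is in the houseboat" (reconstructed from the bracketed question above), and it simply hands the subject to the second rule as a ready-made unit; the function names are hypothetical.

```python
def front_first_is(words):
    """Linear rule: move the first "is" in the string to the front."""
    i = words.index("is")
    return ["is"] + words[:i] + words[i + 1:]

def front_main_is(subject, rest):
    """Structure-dependent rule: treat the subject as a single unit and
    front the "is" that comes right after it (the main verb)."""
    if rest[0] != "is":
        raise ValueError('expected the main verb "is" after the subject')
    return ["is"] + subject + rest[1:]

subject = ["a", "turtle", "that", "is", "happy"]   # one NP constituent
rest = ["is", "in", "the", "houseboat"]

wrong = " ".join(front_first_is(subject + rest))
right = " ".join(front_main_is(subject, rest))

print("linear rule:   ", wrong)
print("structure rule:", right)
```

The linear rule never looks at structure, so it grabs the "is" buried inside the subject. The second rule can only be stated if we already know where the subject begins and ends, which is exactly the point about constituency.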

    o) Q: How do we identify constituents?
    A: Like so many detectives, we test for them! Guess what these tests are called? Yep, constituency tests.
    p) Constituency tests
    Here's a sentence: What are some tests that we can run on this sentence to check whether a particular string (group of consecutive words) is a constituent? Well, here's a list: Let's apply some tests to the above to reveal constituents.
    q) Caveat Emptor (Buyer beware):
    Not all tests are passed by all constituents. We've just seen this. But if a string of words is a constituent it will pass at least one major constituency test.

    r) BE SURE TO KNOW AND BE ABLE TO APPLY THESE TESTS
     
     

    s) Practice: How many constituents can you identify in the following:

    Some firemen work in really bad conditions on Wednesdays.

    t) Constituents and syntactic ambiguity

    Constituency is cool. It gives us a take on ambiguity, a situation that arises when a sentence has more than one clear interpretation. Consider the following ambiguous sentence taken from Groucho Marx via Pinker. If I were to ask you to EXPLAIN the ambiguity, you could paraphrase the two readings of the sentence by saying that on one reading, I was wearing my underwear when I shot an elephant, while on the other more humorous reading, the elephant was wearing my underwear when I shot it. What's interesting is that constituent structure can actually give us a clear explanation for why the ambiguity arises. Specifically, in the less humorous reading, the PP (prepositional phrase) [in my underwear] is not part of a larger constituent [an elephant in my underwear] but rather hangs directly off of the verb phrase. In the humorous reading, however, [an elephant in my underwear] is a constituent so that [in my underwear] is modifying [an elephant]. I've given you the explanation, now I want you to draw the trees. You'll need to know this for the exam! (Consult p. 103 of Pinker if you are confused about how the two different trees might look. The example is the same in principle.)


    I shot an elephant in my underwear

    I shot an elephant in my underwear
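Once you've drawn your two trees, one way to check them is to write each one as nested brackets and list the constituents. Here is a minimal sketch in Python, assuming a simple tuple encoding (label first, then children) and illustrative category labels such as Pro for the pronoun; these labels are my assumption and may not match the ones in your textbook.

```python
# A node is (label, child, child, ...); a lexical node is (label, "word").
# Reading 1: the PP hangs directly off the VP (I was in my underwear).
VP_ATTACHED = (
    "S",
    ("NP", ("Pro", "I")),
    ("VP",
     ("V", "shot"),
     ("NP", ("Det", "an"), ("N", "elephant")),
     ("PP", ("P", "in"), ("NP", ("Det", "my"), ("N", "underwear")))),
)

# Reading 2: the PP sits inside the object NP (the elephant was in it).
NP_ATTACHED = (
    "S",
    ("NP", ("Pro", "I")),
    ("VP",
     ("V", "shot"),
     ("NP",
      ("Det", "an"), ("N", "elephant"),
      ("PP", ("P", "in"), ("NP", ("Det", "my"), ("N", "underwear"))))),
)

def words(tree):
    """Return the words under a node, left to right."""
    if isinstance(tree[1], str):
        return [tree[1]]
    return [w for child in tree[1:] for w in words(child)]

def constituents(tree):
    """Return the word string under every node of the tree."""
    found = {" ".join(words(tree))}
    if not isinstance(tree[1], str):
        for child in tree[1:]:
            found |= constituents(child)
    return found

target = "an elephant in my underwear"
print("VP-attached reading:", target in constituents(VP_ATTACHED))
print("NP-attached reading:", target in constituents(NP_ATTACHED))
```

Same words, same order, but only one of the two structures treats [an elephant in my underwear] as a unit. That difference in structure is the whole ambiguity.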


    u) Note that this sentence is different from another ambiguous sentence such as the ever-popular "Mat sat on a bat." The ambiguity in "Mat sat on a bat" is not syntactic but rather lexical. That just means that the ambiguity lies in the fact that "bat" has two completely different meanings, one of which refers to a wooden or aluminum instrument used to strike a baseball, and another which refers to a flying thing that eats insects and is guided by really cool sonar. So, it's not syntactic structure which makes the sentence ambiguous. It's the multiple meanings of "bat".

    Hint: Know the difference between syntactic ambiguity and lexical ambiguity.

    u) So, what are trees made up of?

    Since I'm keeping this a text only document, take a look at the tree on page 187 of your Language Files book. The sentence is: Okay, what the tree consists of is this: Trees thus tell us two things: Regarding constituency, note that we have constituents within constituents. So, there is a constituent Noun Phrase [her cats] that is also a part of the larger constituent Verb Phrase [likes her cats]. When you "read" a tree, identify constituents by looking at all the elements that are exhaustively dominated by a single node in the tree. So, [her cats] is a constituent because this string comprises all of the material that is dominated by the final NP. Likewise, [likes her cats] is a constituent because it contains all the material dominated by VP.

    v) Trees and Phrase Structure Rules

    When we say that syntax is about tree structures, what we're really doing is approaching syntax as a phrase structure grammar. I think trees make it sound less formidable. What trees are really doing is representing the phrase structure rules that make up an important part of syntax. Look again at that tree on p. 187. Note that there is an S (for sentence) node that dominates two sister nodes, an NP and a VP. This is just the way of drawing a basic phrase structure rule that says that a sentence consists of a Noun Phrase followed by a Verb Phrase, a rule which can be drawn as a tree or equally written this way: S -> NP VP. When we look at it this way, we see that phrase structure rules and trees amount to the same thing. They are ways of encoding rules of the order and hierarchy of our sentences. Two more rules are needed to finish characterizing the tree on p. 187. These are: With these three rules we've characterized this tree. (Actually, we also need rules like Det -> my , so that we know that "my" is the type of word that can be plugged in as a determiner, but you get the basic idea.)
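To make the "rules and trees amount to the same thing" point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the example sentence "my mother likes her cats", the NP and VP rules, and the lexical rules are my guesses at a grammar of this kind, so check them against the actual tree in Language Files rather than treating them as the book's rules.

```python
# Each rule is (parent, (children...)). The phrase rules and lexical
# rules below are assumed for illustration only.
RULES = {
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    # Lexical rules in the style of Det -> my
    ("Det", ("my",)), ("Det", ("her",)),
    ("N", ("mother",)), ("N", ("cats",)),
    ("V", ("likes",)),
}

# A tree is (label, child, child, ...); a lexical node is (label, "word").
TREE = ("S",
        ("NP", ("Det", "my"), ("N", "mother")),
        ("VP", ("V", "likes"),
               ("NP", ("Det", "her"), ("N", "cats"))))

def licensed(tree):
    """True if every node in the tree is an instance of some rule."""
    label, children = tree[0], tree[1:]
    if isinstance(children[0], str):           # lexical node, e.g. Det -> my
        return (label, children) in RULES
    child_labels = tuple(child[0] for child in children)
    return (label, child_labels) in RULES and all(licensed(c) for c in children)

# Swapping NP and VP under S breaks the S -> NP VP rule.
SCRAMBLED = ("S", TREE[2], TREE[1])

print(licensed(TREE))
print(licensed(SCRAMBLED))
```

Each branching in the tree is just one rule applied once, so checking a tree against the rules and reading the rules off a tree are two views of the same information.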
    w) How do we get our infinite use of finite media?
    Basically, we get this from something known as recursion in our rules. Recursion refers to a situation where a rule calls itself or a set of rules results in a rule being called again. An example makes this easier to see. Consider the noun phrase [the cat with the hat] as in the sentence [the cat with the hat found a ball in the hall].

    If we write a phrase structure rule to get our Noun Phrase, we'd need a rule like this:

    Fine. But we also need a rule for our prepositional phrase, which would look like this: Put these together and we get: After a moment's thought, we can see infinity. Specifically, we see that the appearance of the NP within the Prepositional Phrase rule and the appearance of the PP within the Noun Phrase rule will let us make infinitely long Noun Phrases (or PP's for that matter). Why? Well, consider this:
    [the cat with the hat from the shop]

    [the cat with the hat from the shop on the corner]

    [the cat with the hat from the shop on the corner by the bank]

    [the cat with the hat from the shop on the corner by the bank near the car]

    [the cat with the hat from the shop on the corner by the bank near the car on the curb]

    [the cat with the hat from the shop on the corner by the bank near the car on the curb by the cop]

    [the cat with the hat from the shop on the corner by the bank near the car on the curb by the cop with the jacket]

    [the cat with the hat from the shop on the corner by the bank near the car on the curb by the cop with the jacket from the shop]

    [the cat with the hat from the shop on the corner by the bank near the car on the curb by the cop with the jacket from the shop near the mall]

    and so on towards infinity...

    Crucially, every time we have an NP, we introduce a PP, and every PP requires an NP, which then lets us add a new PP and that's the trick of getting infinity from our finite means. In this case, we needed only two phrase structure rules. Question for you: explain how Pinker gets recursion in a single rule for sentences having either...or type constructions. (see his p. 101 and thereabouts).
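The ever-growing noun phrases above can be produced by two little functions that call each other, which is recursion in exactly the sense described. This is a minimal sketch in Python; it assumes the two rules have the shape NP -> Det N (PP) and PP -> P NP, with the PP optional, and it always uses "the" as the determiner. The function names are hypothetical.

```python
def noun_phrase(noun, modifiers):
    """NP -> Det N (PP): build an NP, adding a PP if any modifiers remain."""
    phrase = ["the", noun]
    if modifiers:
        phrase += prepositional_phrase(modifiers)
    return phrase

def prepositional_phrase(modifiers):
    """PP -> P NP: a preposition followed by an NP, which may hold another PP."""
    (prep, noun), rest = modifiers[0], modifiers[1:]
    return [prep] + noun_phrase(noun, rest)

MODIFIERS = [("with", "hat"), ("from", "shop"), ("on", "corner"), ("by", "bank")]

# Print the NP with zero, one, two, ... nested prepositional phrases.
for n in range(len(MODIFIERS) + 1):
    print(" ".join(noun_phrase("cat", MODIFIERS[:n])))
```

Neither function mentions a maximum length. Add more pairs to MODIFIERS and the same two rules keep going, which is how two finite rules give us unbounded noun phrases.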

    x) At the bottom of our tree, we have lexical categories, but how do we know what category a word belongs to?

    Again, we act like detectives and let the behavior of the word in question tell us. As with constituents, we can use tests to find the answer. Can we add the plural to the word? If so, it's a NOUN. Does the word take the -s suffix to agree with a third person singular subject, as in "He walks"? If so, you're looking at a VERB. File 6.2 in Language Files discusses a number of tests to help you identify the different lexical categories that words belong to. Review it well!

    y) Here are examples of the categories of various words