.. _mg-intro:

*****
Intro
*****

Overview
========
Researchers who use our data publish on a wide range of topics, and many of these researchers need information on the morphology and/or syntax of parent and child speech.  To get this data, we use the CLAN software suite developed at Carnegie Mellon University.  Your job as a syntax coder is to run this software, which automatically codes morphosyntactic information, and to make corrections where the automatic parser fails.  At the end of this process, you will create a file that will be inserted back into the original transcript, which the gesture coders will use for their work.

What's Covered?
===============
#. How to set up a morphosyntax workspace on your computer.  This includes downloading and installing the `CLAN software `_ and setting up the Subversion repository.
#. A step-by-step guide to cleaning up, analyzing, and converting a transcript from a raw text file to an Excel file that will be inserted into the transcript.
#. A step-by-step guide to preparing a raw transcript for morphosyntactic analysis.
#. Common problems you may encounter while coding syntax, and their solutions.

See Also
========
The `Child Language Data Exchange System`_, typically referred to by the 
acronym *CHILDES*, is a central repository for first language acquisition 
data. In addition to the contributed corpora, the CHILDES project has 
specified a format for transcription called *CHAT* and a suite of tools 
called *CLAN* for analyzing CHAT-formatted transcriptions.

Our workflow for morphosyntactic analysis relies heavily on the CLAN tools.
The following documents from the CHILDES_ web site at Carnegie Mellon are
useful to have on hand for reference.

The `CHAT guide`_ provides a complete specification of the CHAT transcription
format.  See especially Sections 6 (*Words*) and 14 (*Morphosyntactic Coding*).

The `CLAN guide`_ provides a complete description of all of the CLAN tools.

`This paper `_ provides a
nice overview of the process of morphosyntactic analysis with the CLAN tools.
See especially Sections 6 (*Analysis based on automatic morphosyntactic coding*), 11
(*Difficult Decisions*), and 14 (*GRASP*).

Finally, `CHILDES GR Annotation`_ describes the grammatical relations we code.

.. _Child Language Data Exchange System: http://en.wikipedia.org/wiki/CHILDES
.. _CHILDES: http://childes.psy.cmu.edu
.. _CHAT guide: 
    http://childes.psy.cmu.edu/manuals/chat.pdf
.. _CLAN guide: 
    http://childes.psy.cmu.edu/manuals/clan.pdf
.. _CHILDES GR Annotation: 
    http://www.cs.cmu.edu/~sagae/childesparser/childes-annotation.pdf