Data Resources Guide¶

We’ve collected a variety of data on our subjects over the years. This data can be viewed as falling into three different categories:

SES (Socio-economic data)
Tasks
Transcripts

SES¶

“SES” is short for “Socio-Economic Status”. We’ve gathered a variety of socio-economic data on our subjects: subject gender, race, ethnicity, number of siblings; parent income and education; and much more.

All essential subject information is stored in our main FileMaker database (LDP DB), which can be accessed using a FileMaker client.

Basic SES data is also now accessible via the Data section of the LDP portal. Choose “Browse Subjects” from the pull-down menu to get a list of all subjects and their basic SES data.

Contacts

See Kristi or Ben for a copy of FileMaker and access to the main database.

Talk to Ben if you need an account to access the LDP portal site.

Tasks¶

We’ve administered a number of surveys, tests, tasks, and parent questionnaires during the course of our study. These have given us a lot of insight into our subjects’ cognitive development (vocabulary growth, spatial reasoning, syntactic abilities, etc.) over time. You can find a full index of the various assessments we’ve collected on the Tasks page.

Transcripts¶

In addition to the cognitive assessments described above, we’ve recorded samples of our subjects and their primary caregivers interacting at home. Families are visited in their homes every four months, for a total of 12 visits between 14 months and 58 months. At each visit, the investigator videotapes the interactions of the parental caregiver(s) and the target child during their ordinary daily activities for a 90-minute period, interacting minimally with the family.
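The visit schedule above can be checked with a little arithmetic. The following is a minimal sketch, assuming the 14 and 58 month figures are the nominal ages of the target child at the first and last visits; the function and constant names are illustrative and not part of any project tooling.

```python
# Sketch: nominal ages (in months) at each scheduled home visit.
# Assumption: visits are evenly spaced and the first one is at 14 months.
FIRST_VISIT_AGE = 14  # months, from the text
VISIT_INTERVAL = 4    # months between visits, from the text
NUM_VISITS = 12       # total number of visits, from the text


def visit_ages(first=FIRST_VISIT_AGE, interval=VISIT_INTERVAL, count=NUM_VISITS):
    """Return the nominal age in months at each visit."""
    return [first + interval * i for i in range(count)]


ages = visit_ages()
print(ages)  # [14, 18, 22, 26, 30, 34, 38, 42, 46, 50, 54, 58]
```

Twelve visits spaced four months apart, starting at 14 months, end at 58 months, which matches the figures given above.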

The majority of families have a parent, usually the mother, who self-identified as the primary caregiver for the child. Several families (7 in TD, x in PL) are dual caregiver families, and these visits usually include both the mother and the father interacting with the target child. Other children and family members are sometimes present during these visits, but our video recordings focus on the interaction between the target child and the parental caregiver(s).

Transcripts are made from collected video recordings. Parents’ and children’s speech and gesture are systematically coded.

Speech¶

The speech utterances for both child and parent are transcribed verbatim using English words (gotta is transcribed as got to), and incorrect grammar is not corrected (where my puppy?). Rules were developed for delineating utterance boundaries, including: 1) an utterance is never more than one conversational turn; 2) an utterance is never more than one sentence long; 3) an utterance can be a single word, a phrase, or a sentence; and 4) intonational contours (such as raising the voice at the end of a question) indicate the end of an utterance. All child speech and all primary caregiver speech directed to the child is transcribed. In addition, primary caregiver speech to siblings under the age of 13 is transcribed and, if the family is designated as a dual caregiver family, the other parent’s child-directed speech is also transcribed.

Gesture¶

Each gesture made by the child or the parent is marked in the transcript. Gestures are classified into 5 types (McNeill, 1992): deictic (either a point or a hold-up), conventional (e.g., nod, side-to-side shake, shrug), iconic (e.g., flapping the arms as though flying like a bird, or forming a circle with thumb and finger that resembles a penny), metaphoric (e.g., extending a palm outward to represent putting an idea forward), or beat (e.g., a rhythmic movement that punctuates speech). The form of each gesture is described in terms of the shape of the hand, the type of movement, and the place of articulation. In addition, using non-linguistic context, the first four types of gestures are assigned a meaning (see Goldin-Meadow & Mylander, 1984, for a detailed description of how meaning is assigned to gestures). Gesture interpretation in spontaneous conversations is facilitated by the fact that we are familiar with the activities that typically occur during the taping sessions, and by the fact that the parents frequently share their intimate knowledge of the child’s world with us during the taping sessions.
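The gesture coding scheme described above can be pictured as a small data structure. This is only an illustrative sketch: the class names, field names, and example values are assumptions and do not reflect the actual transcript columns or coding labels.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class GestureType(Enum):
    """The five gesture types named in the text (McNeill, 1992)."""
    DEICTIC = "deictic"            # a point or a hold-up
    CONVENTIONAL = "conventional"  # e.g., nod, side-to-side shake, shrug
    ICONIC = "iconic"              # e.g., flapping arms like a bird
    METAPHORIC = "metaphoric"      # e.g., palm outward for "an idea"
    BEAT = "beat"                  # rhythmic movement punctuating speech


@dataclass
class Gesture:
    """One coded gesture; all field names here are hypothetical."""
    gesture_type: GestureType
    hand_shape: str
    movement: str
    place_of_articulation: str
    meaning: Optional[str] = None  # only the first four types get a meaning


def takes_meaning(gesture_type: GestureType) -> bool:
    """Per the text, every type except beats is assigned a meaning."""
    return gesture_type is not GestureType.BEAT


# Hypothetical example values, for illustration only.
point = Gesture(GestureType.DEICTIC, "index finger extended", "point",
                "toward object", meaning="dog")
print(takes_meaning(point.gesture_type))  # True
print(takes_meaning(GestureType.BEAT))    # False
```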

Reliability¶

We employ two different reliability measures. The first concerns the reliability of the transcription (inter-transcriber agreement). For a random 20% of transcripts, a second person transcribes 10% of the utterances. Agreement must be at or above 95%; conflicts are resolved by a third judge. The second measure concerns the inter-coder agreement on the particular speech or gesture categories coded. A second person codes a random selection of 10% of the utterances, and the proportion of utterances on which the two coders agreed is calculated for each category. Agreement needs to exceed 88% for all categories coded.
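The per-category agreement check can be sketched as follows. This is a minimal illustration of simple proportion agreement, assuming each double-coded utterance is available as a (category, first coder's code, second coder's code) triple; the function names, category names, and sample codes are hypothetical.

```python
from collections import defaultdict

CODING_THRESHOLD = 0.88  # agreement must exceed this, per the text


def agreement_by_category(pairs):
    """Return the proportion of matching codes for each category.

    `pairs` is an iterable of (category, code_a, code_b) triples,
    one per double-coded utterance.
    """
    agreed = defaultdict(int)
    total = defaultdict(int)
    for category, code_a, code_b in pairs:
        total[category] += 1
        if code_a == code_b:
            agreed[category] += 1
    return {category: agreed[category] / total[category] for category in total}


def failing_categories(proportions, threshold=CODING_THRESHOLD):
    """List categories whose agreement does not exceed the threshold."""
    return sorted(c for c, p in proportions.items() if not p > threshold)


# Hypothetical double-coded sample: 9 of 10 gesture codes match,
# 3 of 4 speech codes match.
sample = (
    [("gesture_type", "deictic", "deictic")] * 9
    + [("gesture_type", "iconic", "conventional")]
    + [("speech_type", "question", "question")] * 3
    + [("speech_type", "question", "statement")]
)

proportions = agreement_by_category(sample)
print(proportions)                      # {'gesture_type': 0.9, 'speech_type': 0.75}
print(failing_categories(proportions))  # ['speech_type']
```

Note that the comparison is strict (`p > threshold`), following the wording "needs to exceed 88%".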

Accessing Transcripts¶

All submitted transcripts are stored on our shared file server. The original transcript files are all Excel-based and require a spreadsheet capable of reading Excel files for viewing. The file server also contains copies of the transcripts in tab-delimited text format, which can be viewed in any spreadsheet or text editor. (Contact Jason if you need access to the file server.)
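For a quick look at one of the tab-delimited copies outside a spreadsheet, a few lines of Python are enough. The sketch below is for cursory viewing only and reads from an in-memory sample so that it runs on its own; the column names (`speaker`, `utterance`, `gesture`) are assumptions and may not match the headers in the actual files.

```python
import csv
import io

# Hypothetical stand-in for a tab-delimited transcript file.
# The real files may use different column headers.
SAMPLE = (
    "speaker\tutterance\tgesture\n"
    "child\twhere my puppy?\tpoint\n"
    "parent\tit is over there.\t\n"
)


def read_transcript(handle):
    """Read a tab-delimited transcript into a list of dicts, one per row."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_transcript(io.StringIO(SAMPLE))
for row in rows:
    print(row["speaker"], "|", row["utterance"], "|", row["gesture"] or "-")
```

To read a file from disk instead, pass a handle opened with `open(path, newline="", encoding="utf-8")`, where `path` is whatever location you have been given on the file server.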

A note on using Excel for transcription

Visit sessions are transcribed and annotated in Microsoft Excel, which serves as a decent framework for transcription and annotation. A standard Excel template ensures consistency, and the spreadsheet grid maps nicely to the utterance markup context: one row per utterance, with columns for various types of annotation. Excel also provides basic data validation features, customizable views, and the sorting and searching capabilities of a simple database.

Warning

Do not use original transcripts for analysis!

The transcript files on the file server serve as historical records of the original transcript submissions and are suitable for cursory viewing of individual transcripts. Note, however, that after a transcript has been submitted, the transcript utterances may be slightly modified so that they conform as closely as possible to our full set of transcription and spelling conventions. If you need transcript data for a particular analysis or study, please contact Jason for a “normalized” transcript data set.

Transcript Variables¶

Our transcript dataset can be queried for particular analyses, but we’ve collected some standard measures and are making them available for quick and easy access. For a description of each measure see our Transcript Variables page.
