Spanish Learner Oral Corpus

1. Introduction

This page explains how the search interface for the Spanish Learner Oral Corpus works.

You can use the upper menu to browse the following sections:

Information (green box): presents data on the corpus design, methodology, recording format, etc.
Interviews (blue box): gives access to all the recordings and their data (length, learner's L1, etc.), and lets you listen to every full interview.
Search (yellow box): lets you look up specific error types in the data bank by criteria such as word category (part of speech), learner's mother tongue (L1), etc.

2. Consult the interviews

This section presents a table with the name and information (length, learner's L1, etc.) of all the interviews collected for the analysis.

The box at the bottom of the page (bordered in red) displays information about the file name (learner's L1, level, sex, etc.).

Click on the file name to go to the page where you can read and listen to the full interview.

Interviews

On the following page you can see these contents:

Metadata about the interview (blue box).
Transcription of the recording.
Sound player (lower left corner).

Screenshot of the page to listen to the interview

2.1. Metadata about the interview

The blue box displays the following information:

Data about the recording: length, acoustic quality (A is the best possible, B is lower, and so on), etc.
Learner's data:

Personal data (sex, age or educational level).
Proficiency level in Spanish.
Mother tongue.
Languages spoken.
Time studying Spanish (years, months or weeks, and frequency, place or learning context).
Time in Spanish-speaking country (years, months or weeks, and place).

Comments: other relevant details, e.g. whether the learner had time to prepare the task beforehand.

2.2. Transcription of the interview

The transcription follows the conventions explained in the Transcription conventions section (under Information).

A summary with examples can be looked up here.

To the right of the transcription, time codes in green (minutes and seconds) mark each fragment of the interview; use them to go to a specific part of the recording.

To return to the beginning of the interview, click on the red arrow on the left of the page.
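
To illustrate how a time code maps to a position in the recording, here is a minimal sketch that converts minutes and seconds into a number of seconds. The "MM:SS" notation and the function name are assumptions made for the example; this page does not state the exact notation used.

```python
def timecode_to_seconds(timecode: str) -> int:
    """Convert an assumed "MM:SS" time code into seconds from the start."""
    minutes, seconds = timecode.strip().split(":")
    return int(minutes) * 60 + int(seconds)


# Hypothetical time code: 3 minutes and 25 seconds into the recording.
print(timecode_to_seconds("03:25"))  # 205
```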

Screenshot of the transcription of the interview

2.3. Sound player

Use the Dewplayer© sound player (in the lower left corner of the page) to listen to each interview.

The sound player has the following functions:

Play the file (green button).
Pause the sound (yellow button).
Stop the sound (orange button).
Go to a specific part of the recording: click on the progress bar to move forward or backward.

You can use the time codes displayed on the progress bar of the player (in minutes and seconds).

The first time code shows the current position in the recording, and the second shows its total length.

Dewplayer sound player to listen to the interview

3. Error search

In this section you can look up specific errors in the data bank according to search criteria such as learner's mother tongue (L1), type of error, category or part of speech of the word/phrase, etc.

Choose the search criterion or criteria and click on the "Select" button.

On the following page you will find a menu where you can select the value for the error search (e.g. linguistic level: Pronunciation, Grammar, Lexis or Pragmatics-Discourse).

Select the value to search for in the menu and click on the "Search" button.

Error search

After you click the button, the results matching your search are displayed.

For certain types of search, only results found in one file are shown per page. Please click on the "Previous" or "Next" button to see the results found in other files.

It is possible to look up the following information:

Error count.
Error explanation.
Audio of the fragment where the error appears.
Information about the learner.

3.1. Error count

To look up the error count for all the documents where the type of error searched was found, click on the table icon below the "Search" button.

A table showing the error count for all the files (ambiguous and non-ambiguous errors) will be displayed.

Click on the "Hide" button in order to close the table and the error count.

Error count for all the documents

Click on the table icon on the left of the sound player to look up the error count of each document retrieved.

A table with the error count (ambiguous and non-ambiguous) in each corresponding document will be displayed.

Click on the "Hide" button to close the table and the error count.

Error count for one of the documents retrieved

3.2. Display and explanation of the error

Learners' errors are displayed in red below every fragment of transcription where they appear.

Click on the magnifying glass icon to look up the explanation of the error.

The following information about the error is provided:

Part of speech (e.g. article) or category of the word/phrase/clause (e.g. subordinate clause).
Target modification (e.g. wrong order).
Linguistic level (e.g. Grammar).
Error type (e.g. ser/estar).
Etiology (e.g. interlinguistic).
Possible language of interference (if the error is interlinguistic).
Correction (when it is not clear, this value is empty).

Click the "Hide" button to close the explanation of the error.

Explanation and description of the error

3.3. Listen to the fragment where the error appears

Use the sound player to listen to the context where the error appears.

Sound player for the fragment containing the error

3.4. Information about the learner

Click on the icon on the left of each fragment (below the sound player) to look up the information about the learner.

The following data about the learner are displayed:

Personal data (sex, age or educational level).
Proficiency level in Spanish.
Mother tongue.
Languages spoken.
Time studying Spanish (years, months or weeks, and frequency, place or learning context).
Time in Spanish-speaking country (years, months or weeks, and place).

Click on the "Hide" button to close this information.

Information about the learner

4. Search for words in the corpus

The "Search" option allows the user to search for words in the data bank according to the following criteria:

Word:

Lemma (e.g. comer or chico)
Form (e.g. he comido, comes, comen...; or chico, chica, chicos, chicas)
Category (e.g. verb or noun)

Participant:

Mother tongue
Proficiency level (Common European Framework of Reference for languages, hereafter CEFR)

Please choose the criteria and click on the "Select" button.

You can select only two options in the "Word" field (e.g. category and lemma) and one option in the "Participant" field (e.g. mother tongue).
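
The selection rule above can be expressed as a small check. This is only an illustrative sketch, not the interface's actual code: the criterion names and the function `is_valid_selection` are hypothetical, and the sketch assumes that the limits mean "at most two" Word criteria and "at most one" Participant criterion.

```python
# Hypothetical criterion names; the real interface may label them differently.
WORD_CRITERIA = {"lemma", "form", "category"}
PARTICIPANT_CRITERIA = {"mother_tongue", "proficiency_level"}


def is_valid_selection(word_options, participant_options):
    """Check a selection against the assumed rule: up to two "Word"
    criteria and up to one "Participant" criterion."""
    word = set(word_options)
    participant = set(participant_options)
    # Reject any criterion that is not in the known lists.
    known = word <= WORD_CRITERIA and participant <= PARTICIPANT_CRITERIA
    return known and len(word) <= 2 and len(participant) <= 1


print(is_valid_selection(["category", "lemma"], ["mother_tongue"]))  # True
print(is_valid_selection(["category", "lemma", "form"], []))         # False
```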

Search in the corpus

On the next page you will find a menu to perform the search according to the selected criteria.

Select the options and click on the "Search" button.

For certain types of search, only results found in one file are shown per page. Please click on the "Previous" or "Next" button to see the results found in other files.

4.1. Count

To look up the count of lexical units in all the files, click on the table icon below the "Search" button.

A table will be displayed where the count of lexical units in all the files is presented. The following data are shown:

Raw frequency
Normalised frequency (count of lemmas matching the search criteria, divided by the total number of lexical units in all the files and multiplied by a thousand).
Total count of lexical units (in a group of learners with the same mother tongue, or in a proficiency group according to the CEFR).
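
As a worked illustration of the normalised frequency described above, the following minimal sketch computes a per-thousand rate. The function name and the sample counts are hypothetical and are not taken from the corpus.

```python
def normalised_frequency(raw_count: int, total_units: int, per: int = 1000) -> float:
    """Return the number of matches per `per` lexical units."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    # Multiply before dividing to limit floating-point rounding.
    return raw_count * per / total_units


# Hypothetical figures: 12 matches among 4,800 lexical units.
print(normalised_frequency(12, 4800))  # 2.5
```

Normalising in this way makes counts comparable between groups of different sizes, for example between learners grouped by mother tongue or by CEFR level.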

Click on the "Hide" or "Return" button to close the table and the count.

Count of lexical units in the corpus

To look up the count of lexical units in each file, please click on the table icon on the left of the first context of transcription of each file.

A table will be shown with the count according to the search criteria and the data explained above.

Please click on the "Close" button to hide the information about the count.

Count of lexical units in the corpus

4.2. Linguistic features

The search results are shown in blue below each line of transcribed text.

Please click on the magnifying glass icon to see the explanation of the linguistic features of the lexical unit.

The following information is provided:

Category of the word (e.g. article).
Lemma (e.g. el).
Gender (e.g. feminine).
Number (e.g. singular).
With regard to verbs, information about tense, person or aspect of the periphrasis (e.g. progressive) is also provided.
In the case of foreign words or wrong lexical forms, a comment is also provided.

To hide the information about the linguistic features, click on the "Close" button.

Linguistic features

4.3. Information about the learner

Please see section 3.4, Information about the learner.
