This data set contains high-quality, multi-speaker transcribed audio data for Sinhalese. The data set consists of wave files and a TSV file. Each entry in the file si_lk.lines.txt contains a FileID, which in turn contains the UserID, and the transcription of the audio in that file.
The data set has been manually quality checked, but there might still be errors.
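The index file can be read with a few lines of Python. The sketch below is not part of the original release and rests on two assumptions that should be checked against the actual file: that each line holds a FileID and a transcription separated by a tab, and that the FileID has the form <prefix>_<UserID>_<utterance>. The names Utterance and parse_index are hypothetical helpers introduced here for illustration. Lines that do not match the assumed layout are skipped, since the data may still contain errors.

```python
from typing import Iterable, Iterator, NamedTuple


class Utterance(NamedTuple):
    file_id: str  # FileID as it appears in the index file
    user_id: str  # UserID extracted from the FileID (assumed layout)
    text: str     # transcription of the audio


def parse_index(lines: Iterable[str]) -> Iterator[Utterance]:
    """Yield one Utterance per well-formed line.

    ASSUMPTION: each line is "<FileID>\t<transcription>" and the FileID
    is "<prefix>_<UserID>_<utterance>". Neither is confirmed by the
    README; adjust the parsing if the real file differs.
    """
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        file_id, text = fields[0].strip(), fields[1].strip()
        parts = file_id.split("_")
        if len(parts) != 3 or not text:
            continue  # FileID does not match the assumed layout
        yield Utterance(file_id=file_id, user_id=parts[1], text=text)


# Hypothetical sample lines, not taken from the data set.
sample = [
    "sin_0001_0000000001\thypothetical transcription\n",
    "malformed line without a tab\n",
    "\n",
]
utterances = list(parse_index(sample))
```

To read the real file, pass an open file handle instead of the sample list, for example parse_index(open("si_lk.lines.txt", encoding="utf-8")). Grouping the results by user_id then gives the utterances recorded by each speaker.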
This dataset was collected by Google in Sri Lanka.
See LICENSE.txt for license information.
Copyright 2015, 2016 Google, Inc.