The audio dataset consists of general conversations between native English speakers from the US, along with metadata and transcriptions.
Unscripted General Conversations
25 Speech Hours
July 2023
45
Welcome to the English Language General Conversation Speech Dataset, a comprehensive and diverse collection of voice data specifically curated to advance the development of English language speech recognition models, with a particular focus on US accents and dialects.
With high-quality audio recordings, detailed metadata, and accurate transcriptions, it helps researchers and developers improve natural language processing, conversational AI, and generative voice AI algorithms. It also supports the creation of voice assistants and voice bots tailored to the linguistic nuances of English as spoken in the United States.
Speech Data:
This training dataset comprises 30 hours of audio recordings covering a wide range of topics and scenarios, supporting robustness and accuracy in speech technology applications. To achieve this, we collaborated with a diverse network of 40 native English speakers from different states of the United States. This collaborative effort aims for a balanced representation of US accents, dialects, and demographics, reducing biases and promoting inclusivity.
Each audio recording captures a spontaneous, unscripted conversation between two individuals, with durations ranging from 15 to 60 minutes. The speech data is available in WAV format as stereo files with a bit depth of 16 bits and a sample rate of 8 kHz. The recording environment is generally quiet, without background noise or echo.
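The stated audio properties can be checked with Python's standard `wave` module before training. The following is a minimal sketch rather than official tooling: the function names are hypothetical, and it assumes each speaker is recorded on their own channel of the stereo file.

```python
import wave


def describe_wav(path):
    """Return basic format properties of a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "bit_depth": wf.getsampwidth() * 8,
            "sample_rate": rate,
            "duration_s": frames / rate,
        }


def split_stereo(path, left_path, right_path):
    """Write each channel of a 16-bit stereo WAV to its own mono file.

    Assumption: one speaker per channel (not confirmed by the dataset docs).
    """
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 2 or wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo audio")
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())

    # Frames are interleaved as L0 R0 L1 R1 ..., two bytes per sample,
    # so every 4-byte frame holds one left and one right sample.
    left = b"".join(raw[i:i + 2] for i in range(0, len(raw), 4))
    right = b"".join(raw[i + 2:i + 4] for i in range(0, len(raw), 4))

    for out_path, data in ((left_path, left), (right_path, right)):
        with wave.open(out_path, "wb") as out:
            out.setnchannels(1)
            out.setsampwidth(2)
            out.setframerate(rate)
            out.writeframes(data)
```

For a file matching the description, `describe_wav` should report 2 channels, a bit depth of 16, and a sample rate of 8000. Splitting the channels is useful when a model expects mono input per speaker.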
Metadata:
In addition to the audio recordings, our dataset provides comprehensive metadata for each participant, including age, gender, country, state, and dialect. Additional metadata, such as recording device details, topic of recording, bit depth, and sample rate, is also provided.
The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of English language speech recognition models.
Transcription:
This dataset provides a manual verbatim transcription of each audio file to enhance your workflow efficiency. The transcriptions are available in JSON format. Each transcription is segmented by speaker with time codes and includes non-speech labels and tags.
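Because the transcripts are verbatim, inline tags usually need to be removed before the text is used as an ASR training target. The sketch below handles the two tags visible in the sample transcript on this page, `[filler]` and `(())`. The regular expression and the `strip_tags` helper are hypothetical, the full tag inventory is not documented here, and reading `(())` as unintelligible speech is an assumption.

```python
import re

# Tags observed in the sample transcript: bracketed labels such as [filler]
# and the (()) marker. The complete tag set is an assumption.
TAG_PATTERN = re.compile(r"\[[a-z_]+\]|\(\(\)\)")


def strip_tags(text):
    """Remove bracketed tags and (()) markers, then tidy whitespace."""
    cleaned = TAG_PATTERN.sub(" ", text)
    # Drop any space left in front of punctuation after a tag is removed.
    cleaned = re.sub(r"\s+([.,?!])", r"\1", cleaned)
    return re.sub(r"\s+", " ", cleaned).strip()


print(strip_tags("interested in the (()) stuff."))
# prints: interested in the stuff.
```

Whether to delete these markers or map them to dedicated tokens depends on the model. Deleting them is the simpler choice, but it discards information about disfluencies.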
Our goal is to expedite the deployment of English language conversational AI and NLP models by offering ready-to-use transcriptions, ultimately saving valuable time and resources in the development process.
Updates and Customization:
We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our voice dataset is regularly updated with new audio data captured in diverse real-world conditions.
If you require a custom training dataset with specific environmental conditions such as in-car, busy street, restaurant, or any other scenario, we can accommodate your request. We can provide voice data with customized sample rates ranging from 8 kHz to 48 kHz, allowing you to fine-tune your models for different audio recording setups. We can also customize the transcription to follow your specific guidelines and requirements, to further support your ASR development process.
License:
This audio dataset, created by FutureBeeAI, is now available for commercial use.
Conclusion:
Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, exploring generative voice AI, or building cutting-edge voice assistants and bots, our dataset serves as a reliable and valuable resource.
Channel 1 | Channel 2 | Format |
---|---|---|
Male(29) | Female(24) | wav, json |
LABEL | START | END | CHANNEL | TRANSCRIPT |
---|---|---|---|---|
Speech | 0.626 | 1.597 | Speaker 1 | Hello Futurebee. |
Speech | 1.895 | 3.028 | Speaker 2 | Hello Futurebee. |
Speech | 6.586 | 7.107 | Speaker 2 | Okay |
Speech | 6.751 | 12.791 | Speaker 1 | [filler]so (()) you can tell me what kind of vacation things you want to do? |
Speech | 13.336 | 14.646 | Speaker 1 | (()). |
Speech | 13.698 | 14.330 | Speaker 2 | Yeah |
Speech | 15.649 | 20.942 | Speaker 2 | [filler]when Veronica comes in, [filler] I was gonna start with (()) |
Speech | 18.454 | 18.968 | Speaker 1 | [filler] |
Speech | 22.295 | 26.100 | Speaker 2 | Because I am pretty they are gonna get here first. I am pretty sure we are gonna see. |
Speech | 23.945 | 24.439 | Speaker 1 | [filler] |
Speech | 26.663 | 29.669 | Speaker 2 | (()) and then towards the end of the trip |
Speech | 31.001 | 34.554 | Speaker 2 | there will be, there will be doing the all the thing once Veronica and grandma come. |
Noise | 32.310 | 32.490 | - | - |
Noise | 32.902 | 33.279 | - | - |
Speech | 35.261 | 37.088 | Speaker 2 | I don't know. I just, I am more |
Speech | 37.502 | 38.962 | Speaker 2 | interested in the (()) stuff. |
Speech | 38.277 | 38.813 | Speaker 1 | Yeah sure |
Speech | 39.381 | 40.658 | Speaker 2 | I don't know. I am kind of looking at |
Speech | 41.146 | 42.121 | Speaker 2 | whatever I guess. |
Speech | 42.972 | 45.640 | Speaker 2 | [filler]but right now I am looking at |
Speech | 45.405 | 46.658 | Speaker 1 | Okay [filler] |
Speech | 46.158 | 49.161 | Speaker 2 | Buses from Bangkok to Phuket. |
Speech | 50.317 | 54.066 | Speaker 2 | I don't know if I want to go to Phuket or no we are not going to Phuket. We are going to go visit (()). |
Speech | 54.432 | 55.456 | Speaker 2 | Where is Tulip? |
Speech | 56.393 | 62.920 | Speaker 1 | Yeah we are going to, we are going to visit (()). [filler] let me, let me look up where he lives (()). |
Speech | 63.887 | 66.322 | Speaker 1 | You know what, I can, do you want to message him? |
Speech | 68.069 | 69.123 | Speaker 2 | Me message him? |
Speech | 69.387 | 70.697 | Speaker 1 | Do you want me to message him? |
Speech | 70.887 | 72.290 | Speaker 2 | Yeah, yeah go ahead and message him. |
Speech | 72.977 | 73.750 | Speaker 2 | Because I will look very. |
Speech | 73.427 | 74.736 | Speaker 1 | Okay I will find out where he lives. |
Speech | 75.501 | 75.899 | Speaker 2 | Okay. |
Speech | 77.215 | 79.727 | Speaker 1 | I will find out where he lives (()). |
Speech | 78.129 | 80.709 | Speaker 2 | Yeah where we can, we can, we can go with him. |
Speech | 81.191 | 81.977 | Speaker 2 | [filler] |
Speech | 84.197 | 86.146 | Speaker 2 | I think if we go visit to |
Speech | 86.566 | 87.795 | Speaker 2 | I think it will be |
Speech | 90.474 | 97.456 | Speaker 2 | I don't know if there is going to be so many (()) things to do. I am not sure (()) like what (()) or kind of. |
Speech | 98.278 | 99.495 | Speaker 2 | Hoping to do |
Speech | 100.266 | 105.938 | Speaker 2 | like may be they are, because like Phuket is very very (()) but it is also pretty cute. |
Speech | 107.140 | 108.105 | Speaker 2 | And like it has |
Speech | 109.007 | 110.736 | Speaker 2 | like cute buildings and shops and stuff. |
Speech | 111.462 | 116.358 | Speaker 2 | [filler]but if we go with the (()) you know probably cheaper because it is not a tourist area. |
Speech | 114.162 | 114.587 | Speaker 1 | You could. |
Speech | 117.019 | 119.140 | Speaker 2 | Here have been living there for |
Speech | 119.635 | 122.013 | Speaker 2 | a couple of months already. So he can, you know |
Speech | 122.629 | 125.278 | Speaker 2 | show us around. Also, we could get to visit our friend. |
Speech | 126.250 | 127.227 | Speaker 2 | [filler] |
Speech | 128.639 | 129.554 | Speaker 2 | and |
Speech | 129.973 | 132.127 | Speaker 2 | its even on a small |
Speech | 133.943 | 134.729 | Speaker 2 | counts |
Speech | 135.405 | 137.347 | Speaker 2 | if they are on the water |
Speech | 137.782 | 139.294 | Speaker 2 | or like near the islands. |
Speech | 140.566 | 145.985 | Speaker 2 | I am pretty sure they still have like, little tourist things you can like rent [filler] |
Speech | 146.431 | 147.294 | Speaker 2 | tourist (()) |
Speech | 148.031 | 151.048 | Speaker 2 | to go take it to the different islands because thats what we did |
Speech | 151.812 | 154.479 | Speaker 2 | when we were (()) and (()) is really |
Speech | 154.905 | 157.229 | Speaker 2 | small town too. It is not small place. |
Speech | 157.905 | 159.554 | Speaker 2 | But we were still able to rent |
Speech | 160.413 | 161.359 | Speaker 2 | like a tourist |
Speech | 162.163 | 162.859 | Speaker 2 | tour boat |
Speech | 163.387 | 166.473 | Speaker 2 | and go to the islands and (()) and stuff so |
Speech | 167.393 | 168.709 | Speaker 2 | it's, it could be |
Speech | 170.020 | 174.794 | Speaker 2 | even nicer because its not super touristy and we have a person we already know. |
Speech | 175.580 | 176.715 | Speaker 2 | And |
Speech | 179.175 | 180.520 | Speaker 2 | the stuff will probably be cheaper. |
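
The sample rows above follow a fixed pipe-delimited layout (`LABEL | START | END | CHANNEL | TRANSCRIPT |`), so they can be parsed to compute per-speaker talk time and to find overlapping speech. This is a minimal sketch: the `Segment` class and helper functions are hypothetical, and it parses the table as displayed here, not the JSON files, whose schema is not shown on this page.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    label: str
    start: float
    end: float
    channel: str
    transcript: str


def parse_row(line):
    """Parse one 'LABEL | START | END | CHANNEL | TRANSCRIPT |' row."""
    cells = [c.strip() for c in line.strip().rstrip("|").split("|", 4)]
    label, start, end, channel, transcript = cells
    return Segment(label, float(start), float(end), channel, transcript)


def talk_time(segments):
    """Total speech duration per channel, in seconds."""
    totals = {}
    for s in segments:
        if s.label == "Speech":
            totals[s.channel] = totals.get(s.channel, 0.0) + (s.end - s.start)
    return totals


def overlaps(segments):
    """Pairs of speech segments from different channels that overlap in time."""
    speech = sorted(
        (s for s in segments if s.label == "Speech"), key=lambda s: s.start
    )
    found = []
    for i, a in enumerate(speech):
        for b in speech[i + 1:]:
            if b.start >= a.end:
                break  # sorted by start, so later segments cannot overlap a
            if a.channel != b.channel:
                found.append((a, b))
    return found


rows = [
    "Speech | 0.626 | 1.597 | Speaker 1 | Hello Futurebee. |",
    "Speech | 1.895 | 3.028 | Speaker 2 | Hello Futurebee. |",
    "Speech | 6.586 | 7.107 | Speaker 2 | Okay |",
    "Speech | 6.751 | 12.791 | Speaker 1 | [filler]so (()) you can tell me "
    "what kind of vacation things you want to do? |",
    "Noise | 32.310 | 32.490 | - | - |",
]
segments = [parse_row(r) for r in rows]
print(len(overlaps(segments)))  # prints: 1
```

In these five rows, the only cross-speaker overlap is between the "Okay" segment and the question that starts at 6.751 seconds. Overlap detection like this can help decide whether to train on mixed audio or on the separated channels.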
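Attribute | Value |
---|---|
Language | English |
Language code | en-us |
Country | USA |
State | Arizona,...more |
Gender distribution | M:55, F:45 |
Age range | 18-70 |
Environment | Silent, Noisy |
Bit depth | 16 bit |
Format | wav |
Sample rate | 8 kHz |
Channel | Dual separate channel |
Audio file duration | 15-60 minutes |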