This audio dataset contains Healthcare call center conversations between native English speakers from the US, with detailed metadata and accurate transcriptions.
Unscripted Call Center Conversations
30 Speech Hours
July 2023
60 Speakers
Welcome to the English Language Call Center Speech Dataset for the Healthcare domain. It is a specialized and comprehensive collection of voice data designed to enhance the development of call center speech recognition models specifically for the Healthcare industry.
With high-quality call center audio recordings, detailed metadata, and accurate transcriptions, it helps researchers and developers improve natural language processing, conversational AI, and generative voice AI algorithms in the Healthcare domain. It also supports the creation of voice assistants and voice bots tailored to the linguistic nuances of English as spoken in the United States.
Speech Data:
This training dataset comprises 30 hours of call center audio recordings covering various topics and scenarios related to the Healthcare domain, to build robust and accurate customer service speech technology.
To curate realistic call center interactions, we collaborated with a diverse network of 60 expert native English speakers from different states of the United States. This collaboration is intended to provide a balanced representation of US accents, dialects, and demographics, promoting inclusivity and reducing bias in the dataset.
Each audio recording is an unscripted, spontaneous conversation between a call center agent and a customer, with calls lasting 5 to 15 minutes. The dataset includes both inbound and outbound calls, covering scenarios such as inquiries, promotional offers, complaints, technical support, and more. It also contains conversations with both positive and negative outcomes, which makes the data more diverse and realistic.
The speech data is available in WAV format with stereo channels, a bit depth of 16 bits, and a sample rate of 8 kHz. The recording environment is generally quiet, without background noise or echo.
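The audio parameters stated above can be checked on a downloaded file before training. The following is a minimal sketch using Python's standard `wave` module; the function names `describe_wav` and `mismatches` are illustrative and not part of any FutureBeeAI tooling, and the expected values are simply the ones quoted in the description.

```python
import wave

# Expected values taken from the dataset description: stereo, 16-bit, 8 kHz.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 8000}


def describe_wav(path):
    """Read the WAV header and return its basic audio parameters."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def mismatches(info, expected=EXPECTED):
    """Return {parameter: (actual, expected)} for every differing value."""
    return {
        name: (info[name], value)
        for name, value in expected.items()
        if info[name] != value
    }
```

Calling `mismatches(describe_wav("some_call.wav"))` (with a hypothetical file name) returns an empty dictionary when the file matches the stated format.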
Metadata:
In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This includes the participant’s age, gender, country, state, and dialect. Additionally, it includes metadata like domain, topic, call type, outcome, bit depth, and sample rate for each conversation.
The metadata serves as a powerful tool for understanding and characterizing the data, enabling informed decision-making in the development of English language call center speech recognition models for the Healthcare domain.
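As an illustration of how such metadata might be used, the sketch below filters calls and counts them per field value. The record layout, field names, and field values (`call_id`, `call_type`, `"inbound"`, and so on) are assumptions made for this example; the actual metadata files may be structured differently.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CallMetadata:
    # Hypothetical field names based on the attributes listed above.
    call_id: str
    speaker_gender: str
    speaker_state: str
    topic: str
    call_type: str  # assumed values: "inbound" / "outbound"
    outcome: str    # assumed values: "positive" / "negative"


def select_calls(records, **criteria):
    """Keep records whose fields equal every given criterion."""
    return [
        r for r in records
        if all(getattr(r, name) == value for name, value in criteria.items())
    ]


def count_by(records, field_name):
    """Count records per distinct value of one metadata field."""
    return Counter(getattr(r, field_name) for r in records)


# Placeholder records for illustration only; not taken from the dataset.
records = [
    CallMetadata("call_001", "female", "Arizona", "appointment", "outbound", "positive"),
    CallMetadata("call_002", "male", "Arizona", "billing", "inbound", "negative"),
    CallMetadata("call_003", "female", "Arizona", "billing", "inbound", "positive"),
]

inbound_negative = select_calls(records, call_type="inbound", outcome="negative")
print([r.call_id for r in inbound_negative])
print(count_by(records, "topic"))
```

Counting by field in this way is one simple check for whether a training split is balanced across call types, outcomes, or speaker demographics.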
Transcription:
To facilitate your workflow, the dataset includes manual verbatim transcriptions of each call center audio file in JSON format. The transcriptions are segmented by speaker with time codes, include non-speech labels and tags, and cover both the agent and the customer side of each conversation.
These ready-to-use transcriptions accelerate the development of Healthcare call center conversational AI and ASR models for the English language.
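A time-coded, speaker-wise transcription of this kind can be summarized with a few lines of Python. The sketch below assumes a JSON array of segment objects whose keys (`label`, `start`, `end`, `channel`, `transcript`) mirror the columns of the sample transcript on this page; the real JSON schema is not documented here and may differ.

```python
import json
from collections import defaultdict

# Hypothetical JSON layout mirroring the sample columns; the real schema may differ.
SAMPLE_JSON = """
[
  {"label": "Speech", "start": 4.774, "end": 5.799,
   "channel": "Speaker 2", "transcript": "Hello Futurebee."},
  {"label": "Speech", "start": 7.578, "end": 8.477,
   "channel": "Speaker 1", "transcript": "Hello Futurebee."},
  {"label": "Noise", "start": 7.703, "end": 7.961,
   "channel": "-", "transcript": "-"}
]
"""


def speech_seconds_per_speaker(segments):
    """Sum the duration of Speech segments for each speaker channel."""
    totals = defaultdict(float)
    for segment in segments:
        if segment["label"] == "Speech":
            totals[segment["channel"]] += segment["end"] - segment["start"]
    return dict(totals)


segments = json.loads(SAMPLE_JSON)
totals = speech_seconds_per_speaker(segments)
for speaker, seconds in sorted(totals.items()):
    print(f"{speaker}: {seconds:.3f} s of speech")
```

Non-speech segments such as `Noise` are skipped, so the totals reflect talk time only.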
Updates and Customization:
We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our call center voice dataset is regularly updated with new audio data captured in diverse real-world conditions.
If you require a custom training dataset with specific environmental conditions, we can accommodate your request. We can provide voice data with customized sample rates ranging from 8 kHz to 48 kHz, allowing you to fine-tune your models for different audio recording setups. We can also customize the transcription to follow your specific guidelines and requirements, to further support your ASR development process.
License:
This Healthcare call center audio dataset was created by FutureBeeAI and is available for commercial use.
Conclusion:
Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, or building state-of-the-art voice assistants to improve customer experiences in the Healthcare sector, our dataset serves as a trusted resource to meet your goals.
Channel 1 | Channel 2 | Format |
---|---|---|
Male(29) | Female(24) | wav, json |
LABEL | START | END | CHANNEL | TRANSCRIPT |
---|---|---|---|---|
Speech | 4.774 | 5.799 | Speaker 2 | Hello Futurebee. |
Speech | 7.578 | 8.477 | Speaker 1 | Hello Futurebee. |
Noise | 7.703 | 7.961 | - | - |
Noise | 9.519 | 9.878 | - | - |
Speech | 12.669 | 14.336 | Speaker 2 | Hello is this <PII>Mr. Micheal</PII>? |
Speech | 15.579 | 17.335 | Speaker 1 | This is, whom I speaking to? |
Speech | 18.562 | 24.603 | Speaker 2 | Hi this is Kelly. I am with [filler], Dr Brigers office. [filler] I am calling because we |
Noise | 24.329 | 24.518 | - | - |
Speech | 25.082 | 26.722 | Speaker 2 | needed you a little pre test |
Speech | 27.361 | 29.411 | Speaker 2 | screening before your appointment tomorrow. |
Speech | 30.103 | 31.312 | Speaker 2 | You have just ten minutes. |
Speech | 33.868 | 37.679 | Speaker 1 | [filler]yeah I think so. [filler] yeah, yeah I have got. |
Noise | 37.722 | 38.222 | - | - |
Speech | 39.207 | 44.883 | Speaker 2 | Okay perfect. So we just do this phone call to make the check in process easier once you get here because |
Speech | 45.649 | 47.847 | Speaker 2 | ever since our office is reopened |
Speech | 48.283 | 50.347 | Speaker 2 | we had such a backlog of patients |
Speech | 51.024 | 53.265 | Speaker 2 | that when we reopened |
Noise | 52.816 | 54.182 | - | - |
Speech | 54.133 | 60.191 | Speaker 2 | we are trying to take on more patients than usual. So by doing this process it helps us to get through |
Speech | 60.740 | 61.731 | Speaker 2 | the check in process |
Speech | 61.957 | 65.248 | Speaker 2 | more quickly when you actually come to the office. Okay? |
Speech | 68.328 | 75.453 | Speaker 1 | I will tell you what, that makes me real happy because one of the things I hate about going into doctors offices is I get there in time to my appointment. |
Speech | 75.894 | 81.087 | Speaker 1 | And then I get a quick half an hour to call out all the paper work. That's in a, that's in a eliminate this right? |
Speech | 83.287 | 86.912 | Speaker 2 | For the most part, [filler] thankfully you are already [filler] |
Speech | 87.552 | 90.194 | Speaker 2 | repeat patient that we won't have to call out |
Speech | 90.587 | 91.686 | Speaker 2 | any paper work |
Speech | 92.436 | 95.052 | Speaker 2 | like sometimes you have to do with your new patient. |
Speech | 95.953 | 100.677 | Speaker 2 | [filler], but I just want you to know that like I said we have been really |
Speech | 101.403 | 103.453 | Speaker 2 | swamped with new patients and |
Speech | 103.843 | 105.686 | Speaker 2 | a backlog of patients so |
Speech | 106.170 | 108.412 | Speaker 2 | please continue to be patient with us if you are |
Speech | 108.811 | 110.569 | Speaker 2 | Appointment is on perfectly on time. |
Speech | 114.045 | 116.670 | Speaker 1 | Okay. Alright [filler], thanks for giving me a heads up. |
Speech | 117.203 | 120.170 | Speaker 2 | Alright so I just have a few questions to ask you. |
Speech | 121.927 | 126.644 | Speaker 2 | [filler]and they are just some general questions about your health within the last few days. |
Speech | 127.412 | 131.703 | Speaker 2 | So just think back to your last few days and you can give me an answer. Are you ready? |
Speech | 135.287 | 135.961 | Speaker 1 | I am ready. |
Speech | 135.453 | 138.978 | Speaker 2 | Okay my first question is, have you had a new fever? |
Speech | 139.425 | 145.336 | Speaker 2 | of a hundred and four degrees or higher, Of one hundred point four degrees or higher? Yes or no? |
Speech | 148.151 | 156.961 | Speaker 1 | [filler], I have not had any meter to take my own temperature in the last forty eight hours. I have a sense of fever. So [filler], I don't have any measurements to tell you this. But no, I, I don't. |
Noise | 157.032 | 157.663 | - | - |
Speech | 158.788 | 163.216 | Speaker 2 | Okay so that means that within the last couple of days there has been no symptoms that could be |
Noise | 159.274 | 160.066 | - | - |
Speech | 163.757 | 165.532 | Speaker 2 | connected with a fever either right? |
Speech | 167.757 | 168.191 | Speaker 1 | Correct |
Speech | 169.223 | 172.032 | Speaker 2 | Okay that's great. Let me just type that into my computer. |
Noise | 172.626 | 172.723 | - | - |
Speech | 174.830 | 176.247 | Speaker 2 | Okay and my next question. |
Speech | 176.782 | 181.449 | Speaker 2 | Have you had a new cough, you can now attribute to another health condition? |
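
Some segments in the sample above overlap in time, for example a `Noise` label inside a speech segment or two speakers talking at once. The sketch below parses rows written in the same pipe-delimited layout and lists overlapping pairs. The helper names `parse_row` and `find_overlaps` are illustrative, and the four rows are copied from the sample table.

```python
# Four rows copied from the sample table (LABEL | START | END | CHANNEL | TRANSCRIPT).
ROWS = """\
Speech | 7.578 | 8.477 | Speaker 1 | Hello Futurebee. |
Noise | 7.703 | 7.961 | - | - |
Speech | 135.287 | 135.961 | Speaker 1 | I am ready. |
Speech | 135.453 | 138.978 | Speaker 2 | Okay my first question is, have you had a new fever? |
"""


def parse_row(line):
    """Turn one pipe-delimited table row into a segment dictionary."""
    cells = [c.strip() for c in line.strip().rstrip("|").split("|", 4)]
    label, start, end, channel, transcript = cells
    return {
        "label": label,
        "start": float(start),
        "end": float(end),
        "channel": channel,
        "transcript": transcript,
    }


def find_overlaps(segments):
    """Return pairs of segments whose time ranges intersect."""
    ordered = sorted(segments, key=lambda s: s["start"])
    overlaps = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second["start"] >= first["end"]:
                break  # every later segment starts even later
            overlaps.append((first, second))
    return overlaps


segments = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
overlaps = find_overlaps(segments)
for first, second in overlaps:
    print(f'{first["label"]} [{first["start"]}-{first["end"]}] overlaps '
          f'{second["label"]} [{second["start"]}-{second["end"]}]')
```

Whether overlapping regions should be kept, masked, or dropped depends on the model being trained; the sketch only identifies them.
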
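Attribute | Value |
---|---|
Language | English |
Language code | en-us |
Country | USA |
States | Arizona, ... |
Gender distribution | M: 55, F: 45 |
Age range | 18-70 |
Recording environment | Silent, Noisy |
Bit depth | 16 bit |
Format | wav |
Sample rate | 8 kHz |
Channel | Dual separate channel |
Audio file duration | 5-15 minutes |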