English (India) Call Center Speech Dataset for Telecom

The audio dataset includes call center conversations in Telecom, featuring native English speakers from India, with detailed metadata and accurate transcriptions.

Category

Unscripted Call Center Conversations

Total Volume

30 Speech Hours

Last updated

July 2023

Number of participants

60

About this Off-the-shelf Speech Dataset

What’s Included

Welcome to the English Language Call Center Speech Dataset for the Telecom domain. It is a specialized and comprehensive collection of voice data designed to enhance the development of call center speech recognition models specifically for the Telecom industry.

With high-quality call center audio recordings, detailed metadata, and accurate transcriptions, it empowers researchers and developers to enhance natural language processing, conversational AI, and generative voice AI algorithms in the Telecom domain. Moreover, it facilitates the creation of sophisticated voice assistants and voice bots tailored to the unique linguistic nuances found in the English language spoken in India.

Speech Data:

This training dataset comprises 30 hours of call center audio recordings covering a range of topics and scenarios in the Telecom domain. It is intended for building robust and accurate customer service speech technology.

To curate realistic call center interactions, we collaborated with a diverse network of 60 expert native English speakers from different states/provinces of India. This collaborative effort ensures a balanced representation of Indian accents, dialects, and demographics, promoting inclusivity and reducing biases in the dataset.

Each audio recording captures an unscripted, spontaneous conversation between a call center agent and a customer, with call durations ranging from 5 to 15 minutes. The dataset includes both inbound and outbound calls, covering scenarios such as inquiries, promotional offers, complaints, technical support, and more. It also contains conversations with both positive and negative outcomes, which keeps the data diverse and realistic.

The speech data is available in WAV format with stereo channels (the two speakers are recorded on separate channels), a bit depth of 16 bits, and a sample rate of 8 kHz. The recording environment is generally quiet, without background noise or echo.
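As a minimal sketch, the audio specification stated above (WAV, two channels, 16 bits, 8 kHz) can be checked with Python's standard `wave` module before training. The function names and the `EXPECTED` dictionary are illustrative and not part of the dataset. Which channel carries the agent and which the customer is not stated here, so the split below only separates channel 1 from channel 2.

```python
import wave

# Expected header values, taken from the specification in the text above.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 8000}


def describe_wav(path):
    """Read basic header fields from a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def matches_spec(info, expected=EXPECTED):
    """Return True when every expected header field matches."""
    return all(info[key] == value for key, value in expected.items())


def split_channels(path):
    """Return (channel_1, channel_2) as raw 16-bit PCM bytes."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo audio")
        frames = wav.readframes(wav.getnframes())
    # Stereo frames are interleaved: 2 bytes for channel 1, then 2 bytes for channel 2.
    channel_1 = b"".join(frames[i:i + 2] for i in range(0, len(frames), 4))
    channel_2 = b"".join(frames[i + 2:i + 4] for i in range(0, len(frames), 4))
    return channel_1, channel_2
```

Calling `matches_spec(describe_wav(path))` on each file is a cheap way to catch files that were resampled or downmixed somewhere in a processing pipeline.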

Metadata:

In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This includes the participant’s age, gender, country, state, and dialect. Additionally, it includes metadata like domain, topic, call type, outcome, bit depth, and sample rate for each conversation.

The metadata serves as a powerful tool for understanding and characterizing the data, enabling informed decision-making in the development of English language call center speech recognition models for the Telecom domain.

Transcription:

To facilitate your workflow, the dataset includes manual verbatim transcriptions of each call center audio file in JSON format. The transcriptions are segmented by speaker with time codes, cover both the agent and the customer, and include non-speech labels and tags.

These ready-to-use transcriptions accelerate the development of Telecom call center conversational AI and ASR models for the English language.
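The exact JSON schema is not documented on this page, so the following is only a sketch under an assumed layout: a list of segments whose keys (`label`, `start`, `end`, `channel`, `transcript`) mirror the column names of the sample transcription shown further down. Under that assumption, it sums speech time per speaker and skips non-speech segments.

```python
import json
from collections import defaultdict

# Hypothetical JSON layout; the key names are assumptions, not the documented schema.
SAMPLE_JSON = """
[
  {"label": "Speech", "start": 0.350, "end": 1.300, "channel": "Speaker 2",
   "transcript": "Good morning."},
  {"label": "Speech", "start": 1.899, "end": 3.625, "channel": "Speaker 2",
   "transcript": "Thank you for calling telecom."},
  {"label": "Noise", "start": 7.650, "end": 8.099, "channel": null,
   "transcript": null},
  {"label": "Speech", "start": 8.375, "end": 9.948, "channel": "Speaker 1",
   "transcript": "Good morning <PII>Sushmita</PII>."}
]
"""


def talk_time_by_speaker(segments):
    """Sum segment durations in seconds per speaker, ignoring non-speech labels."""
    totals = defaultdict(float)
    for segment in segments:
        if segment["label"] != "Speech":
            continue
        totals[segment["channel"]] += segment["end"] - segment["start"]
    return dict(totals)


segments = json.loads(SAMPLE_JSON)
totals = talk_time_by_speaker(segments)
for speaker, seconds in sorted(totals.items()):
    print(f"{speaker}: {seconds:.3f} s")
```

If the delivered files use different key names or nest segments per channel, only the key lookups in `talk_time_by_speaker` need to change.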

Updates and Customization:

We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our call center voice dataset is regularly updated with new audio data captured in diverse real-world conditions.

If you require a custom training dataset with specific environmental conditions, we can accommodate your request. We can provide voice data at sample rates from 8 kHz to 48 kHz, allowing you to fine-tune your models for different audio recording setups. We can also customize the transcription to follow your specific guidelines and requirements, to further support your ASR development process.

License:

This Telecom call center audio dataset is created by FutureBeeAI and is available for commercial use!

Conclusion:

Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, or building state-of-the-art voice assistants to improve customer experiences in the Telecom sector, our dataset serves as a trusted resource to meet your goals.

Use Cases

Automatic Speech Recognition (ASR)

Conversational AI

Chatbot and voicebot creation

Language Modelling

Text-to-speech (TTS)

Speech Analytics

Dataset Sample(s)

ATTRIBUTES

Channel 1 | Channel 2 | Format
Female(22) | Female(21) | wav, json

TRANSCRIPTION

LABEL | START | END | CHANNEL | TRANSCRIPT
Speech | 0.350 | 1.300 | Speaker 2 | Good morning.
Speech | 1.899 | 3.625 | Speaker 2 | Thank you for calling telecom.
Speech | 4.424 | 5.825 | Speaker 2 | My name is <PII>Sushmita</PII>.
Speech | 6.349 | 8.125 | Speaker 2 | How can I assist you today?
Noise | 7.650 | 8.099 | - | -
Speech | 8.375 | 9.948 | Speaker 1 | Good morning <PII>Sushmita</PII>.
Speech | 11.698 | 14.675 | Speaker 1 | I am having some issue with my phone service.
Speech | 15.948 | 19.250 | Speaker 1 | I get experiencing drop calls and poor reception
Speech | 19.870 | 22.370 | Speaker 1 | in certain area. Can you help me with that?
Speech | 24.954 | 25.704 | Speaker 2 | Of course.
Speech | 26.045 | 28.896 | Speaker 2 | I will be happy to assist you with that.
Speech | 29.137 | 32.686 | Speaker 2 | Let's start by checking the coverage in your area.
Speech | 33.628 | 35.160 | Speaker 2 | Could you please provide me
Speech | 35.444 | 36.987 | Speaker 2 | with your zip code?
Speech | 38.953 | 42.628 | Speaker 1 | Yeah sure. My zip code is <PII>one two three four five</PII>
Speech | 44.070 | 44.771 | Speaker 2 | Thank you.
Speech | 45.130 | 47.204 | Speaker 2 | Let me check the coverage map
Speech | 47.478 | 48.527 | Speaker 2 | for your area.
Speech | 49.024 | 50.024 | Speaker 2 | I see that
Speech | 50.444 | 52.345 | Speaker 2 | there are a few cell towers
Speech | 53.169 | 54.100 | Speaker 2 | near by you.
Speech | 54.637 | 56.862 | Speaker 2 | So, you should have good coverage.
Speech | 57.536 | 58.310 | Speaker 2 | However
Speech | 58.728 | 59.402 | Speaker 2 | sometimes
Speech | 59.728 | 61.703 | Speaker 2 | there can an environmental factors
Speech | 62.103 | 63.878 | Speaker 2 | that affect the signal strength.
Speech | 64.590 | 67.367 | Speaker 2 | Have you noticed is the signal issues
Speech | 67.608 | 69.683 | Speaker 2 | occur indoors or outdoors?
Speech | 72.400 | 77.552 | Speaker 1 | It seems to happen mostly in indoor specially in office building and at home.
Speech | 78.652 | 79.277 | Speaker 2 | I see.
Speech | 79.736 | 83.554 | Speaker 2 | Indoor coverage can be influence by factors like
Speech | 83.947 | 85.289 | Speaker 2 | building materials,
Speech | 85.671 | 86.645 | Speaker 2 | and obstruction.
Speech | 87.587 | 90.138 | Speaker 2 | Let's troubleshoot the issue further.
Speech | 90.558 | 91.009 | Speaker 2 | First
Speech | 91.546 | 92.572 | Speaker 2 | I recommend
Speech | 92.921 | 94.245 | Speaker 2 | try a simple step,
Speech | 94.715 | 96.266 | Speaker 2 | restarting your phone.
Speech | 97.459 | 98.286 | Speaker 2 | This can
Speech | 98.632 | 101.639 | Speaker 2 | often resolve temporary network glitches.
Speech | 102.355 | 106.057 | Speaker 2 | Can you please turn off your phone and turn it back on.
Speech | 108.751 | 111.230 | Speaker 1 | Sure. I will do that right now.
Speech | 112.105 | 114.605 | Speaker 1 | Okay. I have restarted my phone.
Speech | 116.918 | 117.468 | Speaker 2 | Great.
Speech | 117.861 | 118.611 | Speaker 2 | Let see
Speech | 118.882 | 120.281 | Speaker 2 | if the issue persists.
Speech | 120.900 | 121.724 | Speaker 2 | Can make
Speech | 122.117 | 126.311 | Speaker 2 | Can you make a test call to friend or family member?
Speech | 126.555 | 127.504 | Speaker 2 | And let me know
Speech | 127.772 | 129.247 | Speaker 2 | if the call quality
Speech | 129.467 | 130.442 | Speaker 2 | has improved.
Speech | 132.633 | 133.556 | Speaker 1 | All right.
Speech | 133.919 | 135.568 | Speaker 1 | I am making the call now.
Speech | 136.456 | 144.181 | Speaker 1 | The call quality does seem a beat better than before. But I can still hear some static allocational drop in the call.
Speech | 146.901 | 149.919 | Speaker 2 | Thank you for testing that. It's okay.
Speech | 150.443 | 154.062 | Speaker 2 | It's good to know that the call quality has improved.
Speech | 154.501 | 154.949 | Speaker 2 | But
Speech | 155.281 | 158.031 | Speaker 2 | we will work on eliminating those
Speech | 158.354 | 159.580 | Speaker 2 | remaining issues.
Speech | 160.131 | 163.056 | Speaker 2 | To address this static and drop calls
Speech | 163.506 | 164.431 | Speaker 2 | I recommend
Speech | 164.717 | 167.044 | Speaker 2 | checking for any software update
Speech | 167.372 | 169.270 | Speaker 2 | available for your phone.
Speech | 170.258 | 174.286 | Speaker 2 | Keep your phone software up to date can often resolve
Speech | 174.586 | 176.161 | Speaker 2 | compatibility issues
Speech | 176.485 | 177.508 | Speaker 2 | with the network.
Speech | 178.102 | 180.324 | Speaker 2 | Could you please go to your phone setting
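The sample above marks personal information with `<PII>...</PII>` tags inside the transcript text. As a hedged sketch, a regular expression can either redact the tagged spans or strip only the tags, depending on what a downstream model should see. The function names and the `[PII]` placeholder are illustrative choices, and the pattern assumes the tags are never nested.

```python
import re

# Non-greedy match so that several tagged spans on one line are handled separately.
PII_SPAN = re.compile(r"<PII>(.*?)</PII>")


def redact_pii(text, placeholder="[PII]"):
    """Replace each tagged span, including its content, with a placeholder."""
    return PII_SPAN.sub(placeholder, text)


def strip_pii_tags(text):
    """Remove only the tags and keep the enclosed words."""
    return PII_SPAN.sub(r"\1", text)


print(redact_pii("My name is <PII>Sushmita</PII>."))
print(strip_pii_tags("My zip code is <PII>one two three four five</PII>"))
```

Redaction suits text that leaves a controlled environment, while tag stripping keeps the transcript aligned word for word with the audio for ASR training.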

Dataset Demographics

Language

English

Language code

en-IN

Country

India

Accents

Chandigarh,...more

Gender Distribution

M:55, F:45

Age Group

18-70

Audio File Details

Environment

Silent, Noisy

Bit Depth

16 bit

Format

wav

Sample rate

8 kHz

Channel

Dual separate channel

Audio file duration

5-15 minutes

Download Sample Speech Dataset Now!

Explore the audio data, metadata, and transcription to get more clarity and hands-on experience with this dataset.

Start your AI/ML model creation journey with FutureBeeAI!
