General Conversation Speech Datasets

Explore the collection of high-quality, diverse general conversation speech datasets across different languages. Leverage these ready-to-use audio datasets to train and fine-tune your automatic speech recognition (ASR) and conversational AI models. Each dataset includes high-quality speech data, accurate transcriptions, and detailed metadata.

Arabic (Algeria) Speech dataset for Conversational AI
Arabic (Algeria)

Algerian Arabic General Conversation Speech Data

Unscripted conversation audio data in Algerian Arabic.

50 Speech Hours

70 People

Arabic (Egypt) Audio Dataset for Conversational AI
Arabic (Egypt)

Egyptian Arabic General Conversation Speech Data

Unscripted conversation audio data in Egyptian Arabic.

50 Speech Hours

70 People

Arabic (Saudi Arabia) Voice dataset for Conversational AI
Arabic (Saudi Arabia)

Saudi Arabian Arabic General Conversation Speech Data

Unscripted conversation audio data in Saudi Arabian Arabic.

50 Speech Hours

70 People

Bahasa (Indonesia) Speech dataset for Speech recognition
Bahasa (Indonesia)

Bahasa General Conversation Speech Data

Unscripted conversation audio data in Bahasa.

50 Speech Hours

70 People

Bengali (India) Audio Dataset for Speech recognition
Bengali (India)

Indian Bengali General Conversation Speech Data

Unscripted conversation audio data in Indian Bengali.

60 Speech Hours

80 People

Danish (Denmark) Voice dataset for Speech recognition
Danish (Denmark)

Danish General Conversation Speech Data

Unscripted conversation audio data in Danish.

50 Speech Hours

70 People

Dutch (Netherlands) Speech dataset for NLP
Dutch (Netherlands)

Dutch General Conversation Speech Data

Unscripted conversation audio data in Dutch.

50 Speech Hours

70 People

English (Australia) Audio Dataset for NLP
English (Australia)

Australian English General Conversation Speech Data

Unscripted conversation audio data in Australian English.

25 Speech Hours

45 People

English (Canada) Voice dataset for NLP
English (Canada)

Canadian English General Conversation Speech Data

Unscripted conversation audio data in Canadian English.

25 Speech Hours

45 People

English (India) Speech data for AI
English (India)

Indian English General Conversation Speech Data

Unscripted conversation audio data in Indian English.

90 Speech Hours

110 People

English (New Zealand) Speech dataset for Conversational AI
English (New Zealand)

New Zealand English General Conversation Speech Data

Unscripted conversation audio data in New Zealand English.

25 Speech Hours

45 People

English (UK) Audio Dataset for Conversational AI
English (UK)

British English General Conversation Speech Data

Unscripted conversation audio data in British English.

25 Speech Hours

45 People

English (USA) Voice dataset for Conversational AI
English (US)

American English General Conversation Speech Data

Unscripted conversation audio data in American English.

25 Speech Hours

45 People

Finnish (Finland) Speech dataset for Speech recognition
Finnish (Finland)

Finnish General Conversation Speech Data

Unscripted conversation audio data in Finnish.

50 Speech Hours

70 People

French (France) Audio Dataset for Speech recognition
French (France)

French General Conversation Speech Data

Unscripted conversation audio data in French.

50 Speech Hours

70 People

German (Germany) Voice dataset for Speech recognition
German (Germany)

German General Conversation Speech Data

Unscripted conversation audio data in German.

50 Speech Hours

70 People

Gujarati (India) Speech dataset for NLP
Ready to deploy
Gujarati (India)

Gujarati General Conversation Speech Data

Unscripted conversation audio data in Gujarati.

60 Speech Hours

80 People

Hindi (India) Audio Dataset for NLP
Ready to deploy
Hindi (India)

Hindi General Conversation Speech Data

Unscripted conversation audio data in Hindi.

150 Speech Hours

160 People

Italian (Italy) Voice dataset for NLP
Italian (Italy)

Italian General Conversation Speech Data

Unscripted conversation audio data in Italian.

50 Speech Hours

70 People

Japanese (Japan) Speech data for AI
Japanese (Japan)

Japanese General Conversation Speech Data

Unscripted conversation audio data in Japanese.

50 Speech Hours

70 People

Speech recognition dataset in Kannada (India)
Kannada (India)

Kannada General Conversation Speech Data

Unscripted conversation audio data in Kannada.

60 Speech Hours

80 People

Conversational AI dataset in Korean (South Korea)
Korean (South Korea)

Korean General Conversation Speech Data

Unscripted conversation audio data in Korean.

50 Speech Hours

70 People

Text-to-speech dataset in Malayalam (India)
Malayalam (India)

Malayalam General Conversation Speech Data

Unscripted conversation audio data in Malayalam.

60 Speech Hours

80 People

Conversational AI dataset in Mandarin (China)
Mandarin (China)

Mandarin General Conversation Speech Data

Unscripted conversation audio data in Mandarin Chinese.

50 Speech Hours

70 People

Speech-to-text dataset in Marathi (India)
Marathi (India)

Marathi General Conversation Speech Data

Unscripted conversation audio data in Marathi.

60 Speech Hours

80 People

AI speech training dataset in Norwegian (Norway)
Norwegian (Norway)

Norwegian General Conversation Speech Data

Unscripted conversation audio data in Norwegian.

50 Speech Hours

70 People

Machine learning speech dataset in Oriya/Odia (India)
Oriya/Odia (India)

Oriya/Odia General Conversation Speech Data

Unscripted conversation audio data in Odia.

60 Speech Hours

80 People

Machine learning voice dataset in Polish (Poland)
Polish (Poland)

Polish General Conversation Speech Data

Unscripted conversation audio data in Polish.

50 Speech Hours

70 People

Machine learning audio dataset in Portuguese (Portugal)
Portuguese (Portugal)

European Portuguese General Conversation Speech Data

Unscripted conversation audio data in European Portuguese.

50 Speech Hours

70 People

Artificial intelligence speech dataset in Punjabi (India)
Punjabi (India)

Punjabi General Conversation Speech Data

Unscripted conversation audio data in Punjabi.

60 Speech Hours

80 People

Artificial intelligence voice dataset in Russian (Russia)
Russian (Russia)

Russian General Conversation Speech Data

Unscripted conversation audio data in Russian.

50 Speech Hours

70 People

Artificial intelligence audio dataset in Spanish (Argentina)
Spanish (Argentina)

Argentine Spanish General Conversation Speech Data

Unscripted conversation audio data in Argentine Spanish.

50 Speech Hours

70 People

Speech recognition dataset in Spanish (Colombia)
Spanish (Colombia)

Colombian Spanish General Conversation Speech Data

Unscripted conversation audio data in Colombian Spanish.

50 Speech Hours

70 People

Conversational AI dataset in Spanish (Mexico)
Spanish (Mexico)

Mexican Spanish General Conversation Speech Data

Unscripted conversation audio data in Mexican Spanish.

50 Speech Hours

70 People

Text-to-speech dataset in Spanish (Spain)
Spanish (Spain)

Spanish (Spain) General Conversation Speech Data

Unscripted conversation audio data in Spanish (Spain).

50 Speech Hours

70 People

Conversational AI dataset in Swedish (Sweden)
Swedish (Sweden)

Swedish General Conversation Speech Data

Unscripted conversation audio data in Swedish.

50 Speech Hours

70 People

Speech-to-text dataset in Filipino (Philippines)
Filipino (Philippines)

Filipino General Conversation Speech Data

Unscripted conversation audio data in Filipino.

50 Speech Hours

70 People

AI speech training dataset in Tamil (India)
Tamil (India)

Tamil General Conversation Speech Data

Unscripted conversation audio data in Tamil.

60 Speech Hours

80 People

Machine learning speech dataset in Telugu (India)
Telugu (India)

Telugu General Conversation Speech Data

Unscripted conversation audio data in Telugu.

90 Speech Hours

110 People

Machine learning voice dataset in Turkish (Turkey)
Turkish (Turkey)

Turkish General Conversation Speech Data

Unscripted conversation audio data in Turkish.

50 Speech Hours

70 People

Machine learning audio dataset in Ukrainian (Ukraine)
Ukrainian (Ukraine)

Ukrainian General Conversation Speech Data

Unscripted conversation audio data in Ukrainian.

50 Speech Hours

70 People

Machine learning speech dataset in Urdu (Pakistan)
Urdu (Pakistan)

Urdu General Conversation Speech Data

Unscripted conversation audio data in Urdu.

60 Speech Hours

80 People

Machine learning speech dataset in Bulgarian (Bulgaria)
Bulgarian (Bulgaria)

Bulgarian General Conversation Speech Data

Unscripted conversation audio data in Bulgarian.

60 Speech Hours

80 People

Train & Fine-tune Your ASR Models with High-quality Multilingual Datasets!

Collect a custom dataset with a crowd community