Visual Speech Dataset Collection for Emotion Recognition Model

20 November 2024

Client's Challenge & Our Solution

An AI company developing emotion recognition models for various industries approached FutureBeeAI to collect a diverse visual speech dataset. The client needed 1000 HD videos of participants delivering unscripted monologues, in response to given prompts, that expressed a variety of emotions. They also required diversity across participant age, gender, and ethnicity, as well as recording device, recording style, background, and time of day.

FutureBeeAI executed a global, large-scale data collection project, recruiting 250 participants from 10 countries to ensure diverse demographics. We used participants' latest smartphones to capture high-quality videos with varied backgrounds and lighting conditions. The videos were then annotated with emotion labels and carefully reviewed for accuracy.

Outcome & Features:

Collected 1000 HD videos from 250 participants across 10 countries, meeting all the diversity requirements.
Captured data across various devices, recording styles, and times of day to ensure the dataset’s applicability in diverse real-world scenarios.
Completed the project within 8 weeks, providing a robust, real-world multimodal dataset for AI model training.
