Multimodal AI

Image Captioning

Visual Speech Dataset Collection for Emotion Recognition Model

20 November 2024

Client's Challenge & Our Solution

An AI company developing emotion recognition models for various industries approached FutureBeeAI to collect a diverse visual speech dataset. The client needed 1,000 HD videos of participants delivering unscripted monologues on given prompts, each showcasing a variety of emotions. They also required diversity across participant age, gender, ethnicity, recording device, recording style, background, and time of day.

FutureBeeAI executed a global, large-scale data collection project, recruiting 250 participants from 10 countries to ensure diverse demographics. We used participants' latest smartphones to capture high-quality videos with varied backgrounds and lighting conditions. The videos were then annotated with emotion labels and carefully reviewed for accuracy.

Outcome & Features:

Collected 1,000 HD videos from 250 participants across 10 countries, meeting all the diversity requirements

Captured data across various devices, recording styles, and times of day to ensure the dataset’s applicability in diverse real-world scenarios

Completed the project within 8 weeks, providing a robust, real-world multimodal dataset for AI model training
