talentyGo

AI Data Scientist

Tential

📍 Rockville, Maryland, US · 💼 Full-time · 🕐 15 days ago


Description

Data Scientist – Conversational AI Analytics

The Data Scientist develops analytics and derives insights from conversational data generated by AI-powered platforms. This role is responsible for designing clustering pipelines, building facet extraction workflows, and producing actionable intelligence on how users interact with chat systems. The Data Scientist works closely with engineers and stakeholders to surface usage patterns, identify emerging topics, and inform product decisions through data.

Key Responsibilities:

Conversation Analytics & Insight Generation
- Design and implement analytics pipelines that extract meaningful patterns from large-scale chat conversation data
- Develop facet extraction approaches using LLMs to categorize conversations by request type, task performed, and topic discussed
- Build dashboards and reporting artifacts that communicate usage trends, emerging topics, and user behavior to stakeholders
- Identify and quantify shifts in conversation patterns over time to inform product roadmap and content strategy
- Translate analytical findings into actionable recommendations for platform improvement

Clustering & Unsupervised Learning
- Architect and optimize hierarchical clustering pipelines using density-based algorithms (e.g., HDBSCAN) to group conversations by semantic similarity
- Generate and manage text embeddings at scale using embedding models for downstream clustering and similarity tasks
- Design multi-level clustering strategies that produce both granular groupings and higher-order category taxonomies
- Evaluate cluster quality using persistence metrics, silhouette analysis, and domain-informed validation
- Experiment with clustering parameters, distance metrics, and dimensionality reduction techniques to improve grouping coherence

Data Engineering & Pipeline Development
- Build and maintain data pipelines using Python for ingesting, transforming, and analyzing conversation datasets
- Develop automated workflows using cloud-native orchestration and compute services to run analytics at scale on scheduled cadences
- Work with object storage, search engines, and relational databases to store and query analytical outputs
- Implement caching, batching, and incremental processing strategies to handle large embedding and clustering workloads efficiently
- Maintain reproducible analysis environments and version analytical artifacts (models, cluster outputs, embeddings)

LLM-Assisted Analysis
- Design and refine LLM prompts for facet extraction, cluster labeling, and conversation summarization
- Evaluate LLM output quality for analytical tasks and iterate on prompt strategies to improve accuracy
- Leverage model infrastructure for embedding generation and LLM inference
- Explore emerging techniques in LLM-driven data analysis, topic modeling, and automated insight generation

Quality & Testing
- Develop evaluation frameworks to measure clustering quality, facet extraction accuracy, and analytical pipeline correctness
- Build automated regression tests to detect drift in clustering outputs or degradation in categorization quality
- Validate analytical results against known baselines and domain expertise
- Document methodologies, assumptions, and limitations of analytical approaches

Security & Compliance
- Assist with adherence to technology policies and comply with all security controls
- Implement secure coding practices, particularly in handling personally identifiable information (PII) and sensitive data
- Participate in threat modeling and security discussions for API and infrastructure components
- Understand and apply organizational security standards and best practices
