Data Products (3,900)

  • Mirage offers synthetic image data for computer vision models, with perfectly annotated bounding boxes, segmentation masks, keypoints, depth, and normals. The dataset covers 249 countries and is generated with 3D game engines, providing error-free data for various applications.

    Fine-tuning Raw Data Monthly
    RAG Compatibility 3.5/5
  • The Synthpop Dataset offers a curated selection of audio tracks with detailed metadata, tailored for innovative machine learning applications focusing on the Synthpop genre's distinct sound. It includes chords, instrumentation, key, tempo, and timestamp information.

    Inference One-off Purchase API Monthly
    RAG Compatibility 4.5/5
  • The Synthwave Dataset is a curated collection geared towards advanced machine learning applications. It consists of audio tracks enriched with metadata like chords, instruments, key signatures, tempo, and timestamps. This dataset uniquely blends intricate musical data with the nostalgic electronic sounds of the 1980s genre, offering a specialized resource for training models in generative AI music, Music Information Retrieval (MIR), and source separation.

    Inference One-off Purchase Raw Data Monthly
    RAG Compatibility 4.5/5
  • TAUS Language Translation Data provides parallel translation data from colloquial English into various languages for machine learning. It includes 1 million words per language pair with a vocabulary of over 37,000 unique words across 15 countries.

    Training Monthly One-off Purchase API Raw Data Daily
    RAG Compatibility 3.5/5
  • TAUS Language Translation Data provides parallel translation data for Covid-19, Medical, and Healthcare domains in various languages. It includes over 123 million target words and 5.39 million sentences sourced from translation memories and web crawls.

    Training API Monthly
    RAG Compatibility 3.5/5
  • Access parallel translation data specializing in legal contracts and obligations across various language pairs. The dataset covers 5 million words per language and 200,000 sentences per language, focusing on contract terms, obligations, and conditions.

    Training One-off Purchase API Raw Data Static
    RAG Compatibility 3.5/5
  • TAUS Language Translation Data provides parallel translation data for Medical and Pharmaceutical content in various languages, tailored for machine learning applications. The dataset includes 3 million words per language and covers aspects like product features, dosage recommendations, clinical trials, and more.

    Inference One-off Purchase API Daily
    RAG Compatibility 4.2/5
  • The TAUS Language Translation Data provides parallel translations for e-commerce in various language pairs. It offers 1 million words per language pair and 200,000 sentences per language pair with a historical data coverage of 1 year. The dataset covers 11 countries and is suitable for companies of all sizes, including small businesses, medium-sized businesses, and enterprises.

    Inference One-off Purchase API Raw Data Quarterly
    RAG Compatibility 4.2/5
  • Rocks & Gold
    / TECHNOGRAPHIC DATA

    The Technographic Data set from Rocks & Gold comprises over 15 million unique IT job postings collected from various job boards in Europe, the US, and Canada since 2018. The dataset covers 65 countries and includes data on 200,000 hiring companies, with historical information dating back 5 years. Reportedly, 87% of the job postings include tech stack information.

    Inference One-off Purchase API Daily
    RAG Compatibility 4.5/5
  • TL1 provides raw digital data extracted from user browser behavior for ad targeting rules, predictive research, and big data analysis. The data is collected through server-to-server integration using cookie matching or public IDs, and covers 200 countries.

    Raw Data Monthly
    RAG Compatibility 3.0/5