Claude 3 Haiku turns thousands of physical documents into structured data

1:50

Claude 3 Haiku turns thousands of physical documents into structured data

Anthropic 04.03.2024 39 847 просмотров 959 лайков обн. 18.02.2026

Machine-readable: Markdown · JSON API · Site index

Смотреть на YouTube

Поделиться Telegram VK Бот

Транскрипт Скачать .md

Анализ с AI

Описание видео

Introducing Claude 3, our next generation of AI models. The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision. Learn more in our blog post: https://www.anthropic.com/news/claude-3-family Build with Claude: https://www.anthropic.com/api

Оглавление (1 сегментов)

Segment 1 (00:00 - 01:00)

Claude Haiku is one of the fastest and most affordable Vision capable models in the world to demonstrate this we're going to read through thousands of scan documents in a matter of minutes the Library of Congress Federal writers project is a collection of thousands of scanned transcripts from interviews during the Great Depression this is a gold mine of incredible narratives and real life Heroes but it's locked away in hard to access scans of transcripts imagine you're a documentary filmmaker or journalist how can you dig through these thousands of messy documents to find the best source material for your research without reading them all yourself since these documents are scanned images we can't feed them into a Texton llm and these scans are messy enough that they' be a challenge for most dedicated OCR software but luckily Haiku is natively Vision capable and can use surrounding text to transcribe these images and really understand what's going on we can also go beyond simple transcription for each interview and ask Haiku to generate structured Json output with metadata like title date keywords but also use some creativity in judgment to assess how compelling a documentary the story and characters would be we can process each document in parallel for performance and with claud's high availability API do that at massive scale for hundreds or thousands of documents let's take a look at some of that structured output Haiku is able to not just transcribe but pull out creative things like keywords we've transformed this collection of many scans uh into Rich keyword structure data imagine what any organization with a knowledge base of scanned documents like a traditional publisher healthcare provider or Law Firm can do Haiku can parse their extensive archives and bodies of work we'd love for you to try it out and see what you build

Другие видео автора — Anthropic

Ctrl+V

Экстракт Знаний в Telegram

Экстракты и дистилляты из лучших YouTube-каналов — сразу после публикации.

Подписаться

Лучшие методички за неделю — каждый понедельник