Unstructured-IO · Paul-Cornell · Apr 15, 2025 · Apr 16, 2025 · Apr 16, 2025 · Apr 18, 2025
diff --git a/snippets/general-shared-text/platform-partitioning-strategies.mdx b/snippets/general-shared-text/platform-partitioning-strategies.mdx
@@ -7,5 +7,10 @@ strategies other than **Auto** for sets of documents of different types could pr
 including reduction in transformation quality.
 
 - **VLM**: For the highest-quality transformation of these file types: `.bmp`, `.gif`, `.heic`, `.jpeg`, `.jpg`, `.pdf`, `.png`, `.tiff`, and `.webp`.
-- **High Res**: For all other [supported file types](/ui/supported-file-types), and for the generation of bounding box coordinates.
-- **Fast**: For text-only documents.
+- **High Res**: For all other [supported file types](/ui/supported-file-types) except video and audio files, and for the generation of bounding box coordinates.
+- **Fast**: For text-only documents.
+- **Multimedia**: For video and audio files.
+
+<Note>
+    Video and audio file partitioning is available only for [self-hosted](/self-hosted/overview) deployments of Unstructured.
+</Note>
diff --git a/snippets/general-shared-text/supported-file-types-platform.mdx b/snippets/general-shared-text/supported-file-types-platform.mdx
@@ -4,7 +4,10 @@ By file extension:
 
 | File extension |
 | --- |
+| `.3gp` |
+| `.aac` |
 | `.abw` |
+| `.avi` |
 | `.bmp` |
 | `.csv` |
 | `.cwk` |
@@ -19,6 +22,8 @@ By file extension:
 | `.epub` |
 | `.et` |
 | `.eth` |
+| `.flac` |
+| `.flv` |
 | `.fods` |
 | `.gif` |
 | `.heic` |
@@ -27,14 +32,26 @@ By file extension:
 | `.hwp` |
 | `.jpeg` |
 | `.jpg` |
+| `.m4a` |
 | `.md` |
 | `.mcw` |
+| `.mov` |
+| `.mp2` |
+| `.mp3` |
+| `.mp4` |
+| `.mpeg` |
+| `.mpegs` |
+| `.mpg` |
+| `.mpgs` |
 | `.mw` |
 | `.odt` |
+| `.ogg` |
+| `.opus` |
 | `.org` |
 | `.p7s` |
 | `.pages` |
 | `.pbd` |
+| `.pcm` |
 | `.pdf` |
 | `.png` |
 | `.pot` |
@@ -55,9 +72,12 @@ By file extension:
 | `.uof` |
 | `.uos1` |
 | `.uos2` |
+| `.wav` |
 | `.web` |
+| `.webm` |
 | `.webp` |
 | `.wk2` |
+| `.wmv` |
 | `.xls` |
 | `.xlsb` |
 | `.xlsm` |
@@ -71,6 +91,7 @@ By file type:
 | Category | File types |
 | --- | --- |
 | Apple | `.cwk`, `.mcw`, `.pages`
+| Audio | `.aac`, `.flac`, `.m4a`, `.mp2`, `.mp3`, `.mp4`, `.ogg`, `.opus`, `.pcm`, `.wav`, `.webm` |
 | CSV | `.csv` |
 | Data interchange | `.dif` |
 | dBase | `.dbf` |
@@ -90,5 +111,6 @@ By file type:
 | Spreadsheet | `.et`, `.fods`, `.uos1`, `.uos2`, `.wk2`, `.xls`, `.xlsb`, `.xlsm`, `.xlsx`, `.xlw` |
 | StarOffice | `.sxg` |
 | TSV | `.tsv` |
+| Video | `.3gp`, `.avi`, `.flv`, `.mov`, `.mp4`, `.mpeg`, `.mpegs`, `.mpg`, `.mpgs`, `.webm`, `.wmv` |
 | Word processing | `.abw`, `.doc`, `.docm`, `.docx`, `.dot`, `.dotm`, `.hwp`, `.zabw` |
 | XML | `.xml` |
diff --git a/snippets/quickstarts/single-file-ui.mdx b/snippets/quickstarts/single-file-ui.mdx
@@ -6,39 +6,21 @@ You can download that processed data as a `.json` file to your local machine.
 This approach enables rapid, local, run-adjust-repeat prototyping of end-to-end Unstructured ETL+ workflows with a full range of Unstructured features. 
 After you get the results you want, you can then attach remote source and destination connectors to both ends of your existing workflow to begin processing remote files and data at scale in production.
 
-To run this quickstart, you will need a local file with a size of 10 MB or less and one of the following file types:
-
-| File type |
-|---|
-| `.bmp` |
-| `.csv` |
-| `.doc` |
-| `.docx` |
-| `.email` |
-| `.epub` |
-| `.heic` |
-| `.html` |
-| `.jpg` |
-| `.md` |
-| `.odt` |
-| `.org` |
-| `.pdf` |
-| `.pot` |
-| `.potm` |
-| `.ppt` |
-| `.pptm` |
-| `.pptx` |
-| `.rst` |
-| `.rtf` |
-| `.sgl` |
-| `.tiff` |
-| `.txt` |
-| `.tsv` |
-| `.xls` |
-| `.xlsx` |
-| `.xml` |
+To run this quickstart, you will need a local file with a size of 20 MB or less for video and audio files, and 10 MB or less for 
+all other file types. This quickstart supports the following file types:
+
+| | | | | | | | | |
+| --- | --- | --- | --- | --- | --- | --- | --- | --- |
+| `.3gp` | `.aac` | `.avi` | `.bmp` | `.csv` | `.doc` | `.docx` | `.email` | `.epub` |
+| `.flac` | `.flv` | `.heic` | `.html` | `.jpg` | `.m4a` | `.md` | `.mov` | `.mp2` |
+| `.mp3` | `.mp4` | `.mpeg` | `.mpegs` | `.mpg` | `.mpgs` | `.odt` | `.ogg` | `.opus` |
+| `.org` | `.pcm` | `.pdf` | `.pot` | `.potm` |  `.ppt` | `.pptm` | `.pptx` | `.rst` |
+| `.rtf` | `.sgl` | `.tiff` | `.txt` | `.tsv` | `.wav` | `.webm` | `.wmv` | `.xls` |
+| `.xlsx` | `.xml` |
 
 <Note>
+    Video and audio file processing is available only for [self-hosted](/self-hosted/overview) deployments of Unstructured.
+
     For processing remote files at scale in production, Unstructured supports many more files types than these. [See the list of supported file types](/ui/supported-file-types).
 
     Unstructured also supports processing files from remote object stores, and data from remote sources in websites, web apps, databases, and vector stores. For more information, see the [source connector overview](/ui/sources/overview) and the [remote quickstart](/ui/quickstart#remote-quickstart) 
@@ -79,15 +61,23 @@ import GetStartedSimpleUIOnly from '/snippets/general-shared-text/get-started-si
     </Step>
     <Step title="Process a local file">
         1. Drag the file that you want Unstructured to process from your local machine's file browser app and drop it into the **Source** node's **Drop file to test** area. 
-           The file must have a size of 10 MB or less and one of the file types listed at the beginning of this quickstart.
+           The file must have a size of 20 MB or less for video and audio files, and 10 MB or less for all other file types. 
+           The file must be one of the supported file types listed at the beginning of this quickstart.
 
            If you are not able to drag and drop the file, you can click **Drop file to test** and then browse to and select the file instead.
 
            Alternatively, you can use a sample file that Unstructured offers. To do this, click the **Source** node, and then in the **Source** pane, with 
            **Details** selected, on the **Local file** tab, click one of the files under **Or use a provided sample file**. To view the file's contents before you 
            select it, click the eyes button next to the file.
 
-        2. Above the **Source** node, click **Test**.
+        2. If you are using a video or audio file, you must use a multimedia partitioning strategy; otherwise, you might get an error during processing. 
+           To select the multimedia partitioning strategy, click the **Partitioner** node, and then click **Auto** or **Multimedia**.
+
+           <Note>
+               Video and audio file processing is available only for [self-hosted](/self-hosted/overview) deployments of Unstructured.
+           </Note>
+
+        3. Above the **Source** node, click **Test**.
 
            ![Testing a single local file workflow](/img/ui/Workflow-Test-Source.png)
 
@@ -98,12 +88,12 @@ import GetStartedSimpleUIOnly from '/snippets/general-shared-text/get-started-si
 
            ![Viewing single local file output](/img/ui/Workflow-Test-Single-File-Output.png)
 
-        3. In the **Test output** pane, you can:
+        4. In the **Test output** pane, you can:
 
            - Search through the processed, JSON-formatted representation of the file by using the **Search JSON** box.
            - Download the full JSON as a `.json` file to your local machine by clicking **Download full JSON**.
 
-        4. When you are done, click the **Close** button in the **Test output** pane.
+        5. When you are done, click the **Close** button in the **Test output** pane.
 
     </Step> 
     <Step title="Add more nodes to the workflow">

diff --git a/ui/document-elements.mdx b/ui/document-elements.mdx
@@ -42,23 +42,29 @@ of the file and not care about its headers and footers. You can easily filter ou
 Here are some examples of the element types your file might contain:
 
 | Element type        | Description                                                                                                                                          |
-|---------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `Address`           | A text element for capturing physical addresses.                                                                                                     |
-| `CodeSnippet`       | A text element for capturing code snippets.                                                                                                          |
-| `EmailAddress`      | A text element for capturing email addresses.                                                                                                        |
-| `FigureCaption`     | An element for capturing text associated with figure captions.                                                                                       |
-| `Footer`            | An element for capturing document footers.                                                                                                           |
-| `FormKeysValues`    | An element for capturing key-value pairs in a form.                                                                                                  | 
-| `Formula`           | An element containing formulas in a file.                                                                                                            |
-| `Header`            | An element for capturing document headers.                                                                                                           |
-| `Image`             | A text element for capturing image metadata.                                                                                                         |
-| `ListItem`          | `ListItem` is a `NarrativeText` element that is part of a list.                                                                                      |
-| `NarrativeText`     | `NarrativeText` is an element consisting of multiple, well-formulated sentences. This excludes elements such titles, headers, footers, and captions. |
-| `PageBreak`         | An element for capturing page breaks.                                                                                                                |
-| `PageNumber`        | An element for capturing page numbers.                                                                                                               |
-| `Table`             | An element for capturing tables.                                                                                                                     |
-| `Title`             | A text element for capturing titles.                                                                                                                 |
-| `UncategorizedText` | Base element for capturing free text from within files. Applies to extracted text not associated with bounding boxes if the input is a PDF file.     |
+|--------------------- |------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `Address`            | A text element for capturing physical addresses.                                                                                                     |
+| `CodeSnippet`        | A text element for capturing code snippets.                                                                                                          |
+| `EmailAddress`       | A text element for capturing email addresses.                                                                                                        |
+| `FigureCaption`      | An element for capturing text associated with figure captions.                                                                                       |
+| `Footer`             | An element for capturing document footers.                                                                                                           |
+| `FormKeysValues`     | An element for capturing key-value pairs in a form.                                                                                                  | 
+| `Formula`            | An element containing formulas in a file.                                                                                                            |
+| `Header`             | An element for capturing document headers.                                                                                                           |
+| `Image`              | A text element for capturing image metadata.                                                                                                         |
+| `ListItem`           | `ListItem` is a `NarrativeText` element that is part of a list.                                                                                      |
+| `NarrativeText`      | `NarrativeText` is an element consisting of multiple, well-formulated sentences. This excludes elements such as titles, headers, footers, and captions. |
+| `PageBreak`          | An element for capturing page breaks.                                                                                                                |
+| `PageNumber`         | An element for capturing page numbers.                                                                                                               |
+| `SceneDescription`   | An element for capturing scene descriptions, for example a description of a scene in a video.                                                        |
+| `Table`              | An element for capturing tables.                                                                                                                     |
+| `Title`              | A text element for capturing titles.                                                                                                                 |
+| `TranscriptFragment` | An element for capturing transcription of speech, for example a speaker's words in an audio clip or video.                                           |    
+| `UncategorizedText`  | Base element for capturing free text from within files. Applies to extracted text not associated with bounding boxes if the input is a PDF file.     |
+
+<Note>
+    `SceneDescription` and `TranscriptFragment` are specific to video and audio file processing, which is available only for [self-hosted](/self-hosted/overview) deployments of Unstructured.
+</Note>
 
 If you apply chunking, you will also see the `CompositeElement` type. 
 `CompositeElement` is a chunk formed from text (non-`Table`) elements. 
@@ -149,6 +155,27 @@ file.
 Headers and footers in Word files include a `header_footer_type` indicating which page a header or footer applies to.
 Valid values are `"primary"`, `"even_only"`, and `"first_page"`.
 
+#### Video files
+
+<Note>
+    Video file processing is available only for [self-hosted](/self-hosted/overview) deployments of Unstructured.
+</Note>
+
+Elements for video files include a `start_time` and `end_time`, representing the start and end times of the clip 
+from the parent video file to which this element belongs. Also included are `model_version`, the model that was used to 
+generate the element, and `average_log_probability`, the model's average confidence in its output across the document, with values closer to 
+zero indicating higher confidence.
+
+#### Audio files
+
+<Note>
+    Audio file processing is available only for [self-hosted](/self-hosted/overview) deployments of Unstructured.
+</Note>
+
+Elements for audio files include a `start_time`, `end_time`, and `speaker`, representing the start and end times of a clip of audio 
+spoken by a specific speaker within the parent audio file to which this element belongs. 
+If the speaker cannot be determined, `speaker` is set to `0` or `unknown`.
+
 ### Table-specific metadata
 
 For `Table` elements, the raw text of the table will be stored in the `text` attribute for the element, and HTML representation

diff --git a/ui/workflows.mdx b/ui/workflows.mdx
@@ -62,6 +62,7 @@ By default, this workflow partitions, chunks, and generates embeddings as follow
   - If the page or document has no images and likely does not have tables, **Fast** partitioning is used, and the page or document is billed at the **Fast** rate for processing.
   - If the page or document has only a few tables or images with standard layouts and languages, **High Res** partitioning is used, and the page or document is billed at the **High Res** rate for processing.
   - If the page or document has more than a few tables or images, **VLM** partitioning is used, and the page or document is billed at the **VLM** rate for processing.
+  - If the page or document is a video or audio file, **Multimedia** partitioning is used.
 
   [Learn about partitioning strategies](/ui/partitioning).