# Data attributes

Data attributes let you embed BeyondWords configuration directly in your HTML. Use them to set content metadata (title, author, publish date), override voices and languages for specific paragraphs, mark images for video, add pauses, and improve [segment detection](/docs-and-guides/distribution/player/developer-guides/segment-detection).

They complement [content filters](/docs-and-guides/integrations/content-extraction#filters): filters remove whole HTML elements, while data attributes configure how the remaining content is interpreted and synthesized.

If you send content via the [API](/docs-and-guides/integrations/api-overview), you can set many metadata fields (`title`, `author`, `publish_date`, etc.) directly on the request instead of using global data attributes. Global attributes are most useful when BeyondWords fetches a live page ([Magic Embed](/docs-and-guides/integrations/magic-embed), [RSS Feed Importer](/docs-and-guides/integrations/rss-feed-importer) page extraction).

## How it works

BeyondWords reads `data-beyondwords-*` attributes from your HTML at different stages of processing. Each attribute belongs to one of three **scopes** (global, segment, or document), which determines what it affects and where you should place it.

Attributes must be on valid HTML elements. Plaintext content without HTML tags cannot carry data attributes.

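Because only elements can hold attributes, plain text has to be wrapped before it can be configured. A minimal sketch, assuming a `<p>` wrapper and an illustrative voice ID of `300` (neither is prescribed by BeyondWords):

```html theme={null}
<!-- Plain text on its own: there is no element to attach an attribute to. -->
Breaking news from the newsroom.

<!-- The same text wrapped in an element (assumed here: <p>) can carry one. -->
<p data-beyondwords-voice-id="300">Breaking news from the newsroom.</p>
```
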
## Attribute scopes

Think of the three scopes as three layers:

| | [Global](#global-metadata-attributes) | [Segment](#segment-attributes) | [Document](#advanced) |
| --- | --- | --- | --- |
| **What it affects** | The **content item** as a whole: metadata fields and defaults that apply across the article | Individual **segments**: paragraphs, headings, and images as they are split from your HTML | How the **entire HTML document** is processed before segmentation |
| **When it is read** | During metadata extraction (when BeyondWords fetches a live page) | During HTML → segment splitting (`auto_segment`) | Before [content filters](/docs-and-guides/integrations/content-extraction#filters) run on the HTML |
| **How many values** | One per content item; BeyondWords uses the **first** matching element in the document for each global attribute | Many; each element can have its own value, and voice and language **inherit** from ancestor elements | One flag on the root `<html>` element |
| **Typical placement** | A top-level element or a page wrapper | Paragraph, heading, and image elements, plus inline elements | The root `<html>` element only |
| **API alternative** | Yes: set `title`, `author`, `publish_date`, and similar fields on the [API request](/docs-and-guides/integrations/api-overview) instead | No direct API field; configure in HTML, or use the [Editor](/docs-and-guides/tools/editor) for `manual_segment` content | No dashboard equivalent; must be in the HTML |

Global attributes answer: what is this article? They map to content-item fields: title, author, publish date, whether the player should load, default voices for title/body/summary sections, and the single feature image for videos and share pages.

Segment attributes answer: how should this part of the article be synthesized or played back? They attach to specific HTML elements and flow into individual audio/video segments: a different voice for one paragraph, a pause mid-sentence, a marker for click-to-play, or an in-article image for video.

Document attributes answer: how should BeyondWords process this HTML file? The only document-scoped attribute today skips dashboard content filters for that HTML, which is useful when you need the raw markup to pass through unchanged.

### Global vs segment: a concrete example

Both scopes can set voices, but they work at different levels:

```html theme={null}

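<!-- Reconstructed sketch: the wrapper element is an assumption; the voice IDs come from the text -->
<article data-beyondwords-body-voice-id="100">
  <h1>Article title</h1>
  <p>First paragraph uses voice 100 (global body default).</p>
  <p data-beyondwords-voice-id="300">This paragraph uses voice 300 (segment override).</p>
</article>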

```

Similarly, `data-beyondwords-feature-image` (global) picks one hero image for the content item, while `data-beyondwords-image` (segment) marks individual images inside the article for [video](/docs-and-guides/content/video) segments.

## When to use data attributes

| Integration | Global metadata attributes | Segment attributes |
| --- | --- | --- |
| [Magic Embed](/docs-and-guides/integrations/magic-embed) (live page fetch) | Yes, extracted from the page | Yes, from editorial HTML on the page |
| [RSS Feed Importer](/docs-and-guides/integrations/rss-feed-importer) (page fetch enabled) | Yes, extracted from fetched article HTML | Yes |
| [API](/docs-and-guides/integrations/api-overview) / [WordPress](/docs-and-guides/integrations/publishing-platforms/wordpress) / [Ghost](/docs-and-guides/integrations/publishing-platforms/ghost) (`body` HTML, `auto_segment`) | Prefer API/plugin fields for metadata; attributes in HTML still work for segments | Yes, in submitted HTML |
| Dashboard [Editor](/docs-and-guides/tools/editor) (`manual_segment`) | No; edit metadata and segments in the Editor | N/A; set voices and pauses in the Editor instead |

Regenerate or re-publish content after adding or changing attributes in your HTML template.

## Attribute reference

Each attribute belongs to a [scope](#attribute-scopes): global, segment, or document.

| Attribute | Scope | Purpose |
| --- | --- | --- |
| [`data-beyondwords-title`](#title) | [Global](#global-metadata-attributes) | Content title |
| [`data-beyondwords-author`](#author) | [Global](#global-metadata-attributes) | Author name |
| [`data-beyondwords-publish-date`](#publish-date) | [Global](#global-metadata-attributes) | Publish date (ISO 8601) |
| [`data-beyondwords-published`](#published) | [Global](#global-metadata-attributes) | Whether content is publicly available |
| [`data-beyondwords-ads-enabled`](#ads-enabled) | [Global](#global-metadata-attributes) | Whether ads are enabled |
| [`data-beyondwords-title-voice-id`](#title-body-and-summary-voice) | [Global](#global-metadata-attributes) | Voice for title sections |
| [`data-beyondwords-body-voice-id`](#title-body-and-summary-voice) | [Global](#global-metadata-attributes) | Voice for body sections |
| [`data-beyondwords-summary-voice-id`](#title-body-and-summary-voice) | [Global](#global-metadata-attributes) | Voice for summary/script sections |
| [`data-beyondwords-article-language`](#article-language) | [Global](#global-metadata-attributes) | Default language for synthesis |
| [`data-beyondwords-feature-image`](#feature-image) | [Global](#global-metadata-attributes) | Content-level feature image (`true` on an `<img>`) |
| [`data-beyondwords-voice-id`](#voice-override) | [Segment](#segment-attributes) | Voice override for an element and its descendants |
| [`data-beyondwords-language`](#language-override) | [Segment](#segment-attributes) | Language override for an element and its descendants |
| [`data-beyondwords-marker`](#segment-markers) | [Segment](#segment-attributes) | Stable ID for [segment detection](/docs-and-guides/distribution/player/developer-guides/segment-detection) |
| [`data-beyondwords-pause`](#pauses) | [Segment](#segment-attributes) | Pause duration in seconds (max 3) |
| [`data-beyondwords-image`](#image-markers-video) | [Segment](#segment-attributes) | Mark an image for [video](/docs-and-guides/content/video) generation |
| [`data-beyondwords-skip-split-clean-filters`](#skip-content-filters) | [Document](#advanced) | Skip [content filters](/docs-and-guides/integrations/content-extraction#filters) for this HTML |

## Global metadata attributes

Global attributes set content-item-level metadata and defaults. See [Attribute scopes](#attribute-scopes) for how they differ from segment and document attributes.

Add them to any element in your HTML, commonly a top-level element or a page wrapper. BeyondWords uses the first matching element in the document for each attribute.

These are extracted automatically when BeyondWords fetches a live URL ([Magic Embed](/docs-and-guides/integrations/magic-embed), RSS page extraction). When sending HTML via the API, prefer the API's `title`, `author`, and other metadata fields; use global attributes when page fetch is your ingestion path or you need to override extracted values.

### Title

```html theme={null}
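<!-- Reconstructed sketch: the element and value are illustrative -->
<article data-beyondwords-title="My article title">...</article>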
```

### Author

```html theme={null}
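<!-- Reconstructed sketch: the element and value are illustrative -->
<article data-beyondwords-author="Jane Doe">...</article>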
```

### Publish date

ISO 8601 datetime. If the date is in the future, the player will not load until that time. Include a timezone suffix (`Z` or `+01:00`); if omitted, UTC is assumed.

```html theme={null}
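<!-- Reconstructed sketch: the element and date are illustrative -->
<article data-beyondwords-publish-date="2025-01-15T09:00:00Z">...</article>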
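<!-- Illustrative sketch (element and date are assumptions): an explicit +01:00 offset. -->
<!-- The same instant written in UTC would be 2025-01-15T09:00:00Z. -->
<article data-beyondwords-publish-date="2025-01-15T10:00:00+01:00">...</article>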
```

### Published

Boolean (`"true"` or `"false"`). If `false`, the player will not load regardless of publish date. Content is still generated and visible in the dashboard.

```html theme={null}
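<!-- Reconstructed sketch: the element and value are illustrative -->
<article data-beyondwords-published="false">...</article>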
```

### Ads enabled

Boolean (`"true"` or `"false"`).

```html theme={null}
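<!-- Reconstructed sketch: the element and value are illustrative -->
<article data-beyondwords-ads-enabled="true">...</article>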
```

### Title, body, and summary voice

Set default voices by section using voice IDs from Content → Preferences → Voices in your project dashboard. See [voices](/docs-and-guides/voices/overview).

```html theme={null}
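<!-- Reconstructed sketch: the element and voice IDs are illustrative -->
<article data-beyondwords-title-voice-id="200" data-beyondwords-body-voice-id="100" data-beyondwords-summary-voice-id="400">...</article>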
```

If not specified, project default voices are used.

### Article language

Default synthesis language as a locale code (e.g. `en_GB`, `en_US`). If not specified, the project default language is used.

```html theme={null}
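<!-- Reconstructed sketch: the element and value are illustrative -->
<article data-beyondwords-article-language="en_GB">...</article>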
```

### Feature image

Marks the content-level feature image, which is used in videos and on shareable play pages. Set `data-beyondwords-feature-image="true"` on the chosen `<img>`. BeyondWords uses the first matching image's `src` (resolved to an absolute URL when possible).

```html theme={null}
<!-- Reconstructed sketch: the src and alt values are placeholders -->
<img src="https://example.com/hero.jpg" alt="Hero image" data-beyondwords-feature-image="true">
```

This is different from `data-beyondwords-image` (see below), which marks images within the article body for video segments.

## Segment attributes

Segment attributes control per-segment behavior: how individual paragraphs, headings, and images are synthesized and identified in the player. See [Attribute scopes](#attribute-scopes) for how they differ from global and document attributes.

Set them on specific HTML elements. Nested elements inherit the nearest ancestor's value for voice and language.

### Voice override

Override the voice for a section using a voice ID. Child elements inherit unless they set their own override.

```html theme={null}
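<!-- Reconstructed sketch: the element choices are assumptions; the voice IDs come from the text -->
<p data-beyondwords-voice-id="784">This paragraph uses voice 784.</p>
<div data-beyondwords-voice-id="2194">
  <p>This paragraph uses voice 2194.</p>
  <p>So does this one.</p>
</div>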
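
<!-- Illustrative extension (elements assumed, IDs reused for illustration): -->
<!-- a child that sets its own override stops inheriting from its ancestor. -->
<section data-beyondwords-voice-id="2194">
  <p>Inherits voice 2194 from the section.</p>
  <p data-beyondwords-voice-id="784">Sets its own override, so it uses voice 784.</p>
</section>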

```

### Language override

Override the language for a section using a locale code.

```html theme={null}
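<!-- Reconstructed sketch: elements assumed; fr_FR is an assumed code following the en_GB pattern -->
<p data-beyondwords-language="en_GB">This paragraph is synthesized in British English.</p>
<p data-beyondwords-language="fr_FR">Ce paragraphe est synthétisé en français.</p>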
```

### Segment markers

Markers identify segments on your page for player features such as paragraph highlighting and click-to-play. BeyondWords extracts markers from your HTML during processing; you can also add them manually. Use stable, unique values; we recommend UUIDs. See [segment detection](/docs-and-guides/distribution/player/developer-guides/segment-detection) for full guidance.

```html theme={null}
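<!-- Reconstructed sketch: the marker values are placeholder UUIDs -->
<h1 data-beyondwords-marker="3f2c1d9e-8b7a-4c6d-9e0f-1a2b3c4d5e6f">Article title</h1>
<p data-beyondwords-marker="7a1b2c3d-4e5f-4a6b-8c7d-9e0f1a2b3c4d">First paragraph.</p>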
```

### Pauses

Insert a verbal pause at a specific point in a paragraph. The value is a number of seconds (maximum 3) with up to one decimal place. An optional `s` suffix is accepted (`1.0`, `1.2s`).

```html theme={null}
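<!-- Reconstructed sketch: the <span> carrier and the 1.0 value are assumptions -->
<p>The policy is designed to reduce emissions. <span data-beyondwords-pause="1.0"></span> In practice, it may do the opposite.</p>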
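<!-- Illustrative sketch of the suffixed form (the <span> carrier is an assumption): -->
<!-- one decimal place plus the optional "s" suffix, kept under the 3-second maximum. -->
<p>We tested it twice. <span data-beyondwords-pause="2.5s"></span> Both runs agreed.</p>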
```

The pause marker is not tied to a single element type; another element can carry the attribute instead.

### Image markers (video)

Mark images within the article body for [video](/docs-and-guides/content/video) generation. Unlike `data-beyondwords-feature-image`, this applies per image segment in the article, not to the content-level hero image.

```html theme={null}
<!-- Reconstructed sketch: the src and alt values are placeholders -->
<img src="https://example.com/chart.jpg" alt="Chart" data-beyondwords-image="true">
```

Set `data-beyondwords-image="false"` to exclude an image that would otherwise be picked up automatically.

## Advanced

Document-scoped attributes affect processing of the whole HTML file, not metadata or individual segments. See [Attribute scopes](#attribute-scopes).

### Skip content filters

Set on the root `<html>` element to bypass [content filters](/docs-and-guides/integrations/content-extraction#filters) for that HTML document. BeyondWords still removes `script`, `style`, and HTML comments.

```html theme={null}
<!-- Reconstructed sketch: the "true" value is an assumption -->
<html data-beyondwords-skip-split-clean-filters="true">
  ...
</html>
```

Use sparingly, and only when you need the raw HTML to pass through unchanged by dashboard filters (for example, highly controlled CMS output).

You can also target elements with `data-` attributes using a [Data content filter](/docs-and-guides/integrations/content-extraction#data-element_data). For example, `exclude` matches elements with a `data-exclude` attribute.

## Example: Magic Embed page

```html theme={null}

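<!-- Reconstructed sketch: the elements and attribute values are illustrative -->
<article data-beyondwords-title="My article">
  <h1>My article</h1>
  <p>Opening paragraph text.</p>
  <p>Second paragraph with a pause. <span data-beyondwords-pause="1.0"></span> And more text after the pause.</p>
  <div data-exclude>Newsletter sign-up: excluded via content filter, not read aloud.</div>
</article>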

```

## FAQs

### What is the difference between global and segment attributes?

Global attributes apply to the whole content item: one title, one author, one feature image, and default voices per section type. Segment attributes apply to individual parts of the HTML as they are split into audio/video segments: a voice override on one paragraph, a pause mid-sentence, a marker for click-to-play. Global attributes use the first matching element in the document; segment attributes inherit from ancestor elements. See [Attribute scopes](#attribute-scopes).

### Should I use data attributes or API fields?

If you control a backend integration, set `title`, `author`, `publish_date`, and similar fields on the [API request](/docs-and-guides/integrations/api-overview) directly; these are global metadata fields. Use global data attributes when BeyondWords fetches your page (Magic Embed, RSS page extraction) and you need to override what automatic extraction would infer. Segment attributes (voice overrides, pauses, markers) have no API equivalent, so they belong in your HTML.

### What is the difference between `data-beyondwords-feature-image` and `data-beyondwords-image`?

`data-beyondwords-feature-image="true"` marks the single content-level hero image (videos, share pages). `data-beyondwords-image="true"` marks individual in-article images as segments for video generation. A page can have one feature image and multiple image markers.

### Do I need to add segment markers manually?

Usually not. BeyondWords extracts markers from your HTML during processing and the player uses them for highlighting and click-to-play. Add markers manually if [segment detection](/docs-and-guides/distribution/player/developer-guides/segment-detection) is not working on your site; use stable UUIDs and keep them consistent across page updates.

### Can I use data attributes and content filters together?

Yes, they solve different problems. [Content filters](/docs-and-guides/integrations/content-extraction#filters) remove whole HTML elements before extraction. Data attributes configure metadata and per-segment behavior on the elements that remain. Use both together for best results.

### Where do I find voice IDs?

Go to Content → Preferences → Voices in your project dashboard. Each voice has a numeric ID you can copy into `data-beyondwords-voice-id` and its title, body, and summary variants.

## Getting help

If you encounter issues or have questions, [contact support](/docs-and-guides/support/get-support). Include a sample of your HTML and which attributes you have configured.