How it works
BeyondWords reads data-beyondwords-* attributes from your HTML at different stages of processing. Each attribute belongs to one of three scopes (global, segment, or document), which determines what it affects and where you should place it.
Attributes must be on valid HTML elements. Plaintext content without HTML tags cannot carry data attributes.
Attribute scopes
Think of the three scopes as three layers:

| | Global | Segment | Document |
|---|---|---|---|
| What it affects | The content item as a whole—metadata fields and defaults that apply across the article | Individual segments—paragraphs, headings, and images as they are split from your HTML | How the entire HTML document is processed before segmentation |
| When it is read | During metadata extraction (when BeyondWords fetches a live page) | During HTML → segment splitting (auto_segment) | Before content filters run on the HTML |
| How many values | One per content item—BeyondWords uses the first matching element in the document for each global attribute | Many—each element can have its own value; voice and language inherit from ancestor elements | One flag on the root <html> element |
| Typical placement | <body>, <article>, or a page wrapper | <p>, <h1>, <div>, <img>, inline <span> / <time> | <html> only |
| API alternative | Yes—set title, author, publish_date, and similar fields on the API request instead | No direct API field—configure in HTML, or use the Editor for manual_segment content | No dashboard equivalent—must be in the HTML |
Global vs segment: a concrete example
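Both scopes can set voices, but they work at different levels: global voice attributes set the default voices for the title, body, and summary, while a segment voice override changes the voice for one element and its descendants. Images follow the same split: data-beyondwords-feature-image (global) picks one hero image for the content item, while data-beyondwords-image (segment) marks individual images inside the article for video segments.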
When to use data attributes
| Integration | Global metadata attributes | Segment attributes |
|---|---|---|
| Magic Embed (live page fetch) | Yes, extracted from the page | Yes, from editorial HTML on the page |
| RSS Feed Importer (page fetch enabled) | Yes, extracted from fetched article HTML | Yes |
| API / WordPress / Ghost (body HTML, auto_segment) | Prefer API/plugin fields for metadata; attributes in HTML still work for segments | Yes, in submitted HTML |
| Dashboard Editor (manual_segment) | No; edit metadata and segments in the Editor | N/A; set voices and pauses in the Editor instead |
Attribute reference
Each attribute belongs to a scope: global, segment, or document.
Global metadata attributes
Global attributes set content-item-level metadata and defaults. See Attribute scopes for how they differ from segment and document attributes. Add them to any element in your HTML, commonly on <body>, <article>, or a wrapper <div>. BeyondWords uses the first matching element in the document for each attribute.
These are extracted automatically when BeyondWords fetches a live URL (Magic Embed, RSS page extraction). When sending HTML via the API, prefer the API’s title, author, and other metadata fields; use global attributes when page fetch is your ingestion path or you need to override extracted values.
Title
Author
Publish date
ISO 8601 datetime. If the date is in the future, the player will not load until that time. Include a timezone suffix (Z or +01:00); if omitted, UTC is assumed.
Published
Boolean ("true" or "false"). If false, the player will not load regardless of publish date. Content is still generated and visible in the dashboard.
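The publish date and published rules above can be modelled in a few lines. This is only a sketch of the documented behavior, not BeyondWords' implementation: the function names `parse_publish_date` and `player_can_load` are hypothetical, and the sketch assumes the published value is always supplied.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_publish_date(value: str) -> datetime:
    """Parse an ISO 8601 datetime, assuming UTC when no offset is given."""
    # Older Python versions reject a trailing "Z", so normalize it first.
    if value.endswith(("Z", "z")):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # No timezone suffix: the documentation says UTC is assumed.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def player_can_load(published: str, publish_date: Optional[str], now: datetime) -> bool:
    """Hypothetical helper: apply the published flag, then the publish date."""
    # published="false" blocks the player regardless of the publish date.
    if published.strip().lower() != "true":
        return False
    if publish_date is None:
        return True
    # A future publish date blocks the player until that time is reached.
    return parse_publish_date(publish_date) <= now
```

Passing `now` explicitly keeps the check deterministic, which makes it easy to confirm how a given timezone suffix shifts the moment the player becomes available.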
Ads enabled
Boolean ("true" or "false").
Title, body, and summary voice
Set default voices by section using voice IDs from Content → Preferences → Voices in your project dashboard. See voices.
Article language
Default synthesis language as a locale code (e.g. en_GB, en_US). If not specified, the project default language is used.
Feature image
Marks the content-level feature image, which is used in videos and on shareable play pages. Set data-beyondwords-feature-image="true" on the chosen <img>. BeyondWords uses the first matching image’s src (resolved to an absolute URL when possible).
This is not the same as data-beyondwords-image (see below), which marks images within the article body for video segments.
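To check which image a page would expose as its feature image, the first-match rule can be approximated locally. The sketch below uses Python's standard `html.parser` and `urljoin`. The class and function names are hypothetical, and the exact matching and URL resolution BeyondWords performs may differ.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin


class FeatureImageFinder(HTMLParser):
    """Record the src of the first <img> flagged as the feature image."""

    def __init__(self) -> None:
        super().__init__()
        self.found = False
        self.src: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Only the first matching image counts; later matches are ignored.
        if self.found or tag != "img":
            return
        attributes = dict(attrs)
        if attributes.get("data-beyondwords-feature-image") == "true":
            self.found = True
            self.src = attributes.get("src")


def find_feature_image(html: str, page_url: str) -> Optional[str]:
    """Hypothetical helper: return the feature image URL, or None if absent."""
    finder = FeatureImageFinder()
    finder.feed(html)
    finder.close()
    if finder.src is None:
        return None
    # Resolve relative paths against the page URL to get an absolute URL.
    return urljoin(page_url, finder.src)
```

For example, with two flagged images on a page at `https://example.com/news/story`, only the first one's `src` is returned, resolved against the page URL.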
Segment attributes
Segment attributes control per-segment behavior: how individual paragraphs, headings, and images are synthesized and identified in the player. See Attribute scopes for how they differ from global and document attributes. Set them on specific HTML elements. Nested elements inherit the nearest ancestor’s value for voice and language.
Voice override
Override the voice for a section using a voice ID. Child elements inherit unless they set their own override.
Language override
Override the language for a section using a locale code.
Segment markers
Markers identify segments on your page for player features such as paragraph highlighting and click-to-play. BeyondWords extracts markers from your HTML during processing; you can also add them manually. Use stable, unique values; we recommend UUIDs. See segment detection for full guidance.
Pauses
Insert a verbal pause at a specific point in a paragraph. The value is a number of seconds (maximum 3) with up to one decimal place. An optional s suffix is accepted (1.0, 1.2s).
You can also set the pause on a <time> element instead of <span>.
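The pause value format described above is easy to validate before publishing. The following is a minimal sketch, assuming the rules are exactly as stated (at most 3 seconds, at most one decimal place, optional `s` suffix). `parse_pause` is a hypothetical helper, and how BeyondWords treats out-of-range values is not covered here.

```python
import re
from typing import Optional

# Digits, an optional single decimal place, and an optional "s" suffix.
_PAUSE_PATTERN = re.compile(r"([0-9]+(?:\.[0-9])?)s?")
MAX_PAUSE_SECONDS = 3.0


def parse_pause(value: str) -> Optional[float]:
    """Return the pause length in seconds, or None if the value is invalid."""
    match = _PAUSE_PATTERN.fullmatch(value.strip())
    if match is None:
        return None
    seconds = float(match.group(1))
    # Values above the documented maximum are rejected rather than clamped.
    if seconds > MAX_PAUSE_SECONDS:
        return None
    return seconds
```

Rejecting instead of clamping is a design choice for this sketch: it surfaces authoring mistakes such as 1.25 or 5s early rather than silently changing the pause.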
Image markers (video)
Mark images within the article body for video generation. Unlike data-beyondwords-feature-image, this applies per image segment in the article, not the content-level hero image.
Set data-beyondwords-image="false" to exclude an image that would otherwise be picked up automatically.
Advanced
Document-scoped attributes affect processing of the whole HTML file, not metadata or individual segments. See Attribute scopes.
Skip content filters
Set on the root <html> element to bypass content filters for that HTML document. BeyondWords still removes script, style, and HTML comments.
You can also match elements by their data-* attributes using a Data content filter; for example, exclude matches elements with a data-exclude attribute.
Example: Magic Embed page
FAQs
What is the difference between global and segment attributes?
Global attributes apply to the whole content item—one title, one author, one feature image, default voices per section type. Segment attributes apply to individual parts of the HTML as they are split into audio/video segments—a voice override on one paragraph, a pause mid-sentence, a marker for click-to-play. Global attributes use the first matching element in the document; segment attributes inherit from ancestor elements. See Attribute scopes.
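The inheritance rule for segment attributes can be sketched with a simple ancestor stack. This is an illustration only: the attribute name `data-beyondwords-voice-id` is an assumption (this page only shows the `data-beyondwords-*-voice-id` pattern), the names `VoiceResolver` and `resolve_voices` are hypothetical, and the sketch expects well-formed HTML with explicit closing tags.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

# Assumed attribute name for the segment-level voice override.
VOICE_ATTRIBUTE = "data-beyondwords-voice-id"
# Void elements never get a closing tag, so they are kept off the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class VoiceResolver(HTMLParser):
    """Pair each text node with the voice of its nearest ancestor override."""

    def __init__(self, attribute: str = VOICE_ATTRIBUTE) -> None:
        super().__init__()
        self.attribute = attribute
        self.stack: List[Tuple[str, Optional[str]]] = []
        self.resolved: List[Tuple[str, Optional[str]]] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        inherited = self.stack[-1][1] if self.stack else None
        own = dict(attrs).get(self.attribute)
        # An element's own value wins; otherwise it inherits from its parent.
        self.stack.append((tag, own if own is not None else inherited))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        for index in range(len(self.stack) - 1, -1, -1):
            if self.stack[index][0] == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        text = data.strip()
        if text:
            voice = self.stack[-1][1] if self.stack else None
            self.resolved.append((text, voice))


def resolve_voices(html: str) -> List[Tuple[str, Optional[str]]]:
    """Hypothetical helper: list (text, effective voice ID) pairs in order."""
    resolver = VoiceResolver()
    resolver.feed(html)
    resolver.close()
    return resolver.resolved
```

Text with no override anywhere in its ancestry resolves to `None`, which stands in for whatever global or project default voice applies.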
Should I use data attributes or the API for metadata?
If you control a backend integration, set title, author, publish_date, and similar fields on the API request directly; these are global metadata fields. Use global data attributes when BeyondWords fetches your page (Magic Embed, RSS page extraction) and you need to override what automatic extraction would infer. Segment attributes (voice overrides, pauses, markers) have no API equivalent; they belong in your HTML.
What is the difference between feature-image and image attributes?
data-beyondwords-feature-image="true" marks the single content-level hero image (videos, share pages). data-beyondwords-image="true" marks individual in-article images as segments for video generation. A page can have one feature image and multiple image markers.
Do I need to add segment markers manually?
Usually not. BeyondWords extracts markers from your HTML during processing and the player uses them for highlighting and click-to-play. Add markers manually if segment detection is not working on your site—use stable UUIDs and keep them consistent across page updates.
Can I use data attributes with content filters?
Yes—they solve different problems. Content filters remove whole HTML elements before extraction. Data attributes configure metadata and per-segment behavior on the elements that remain. Use both together for best results.
Where do I find voice IDs?
Go to Content → Preferences → Voices in your project dashboard. Each voice has a numeric ID you can copy into data-beyondwords-*-voice-id attributes.