Annotated Text JSON (ATJ) format
The format describes pieces of text (usually generated from a Document Plan).
A piece of text consists of a DOM-like Element tree with an Annotated Text Element at the root.
Note: this doc uses YAML instead of JSON in code examples to improve readability. A JSON example file is here.
Example
id: text-uuid-0001
type: ANNOTATED_TEXT
plan_id: plan-uuid-0002
plan_version: version-uuid-0003
annotations:
  -
    id: annotation-uuid-0004
    type: PROPERTY_LIST
references:
  -
    id: reference-uuid-0005
    type: CALL_TO_ACTION
children:
  -
    id: element-uuid-0006
    type: PARAGRAPH
    implements: blockly-uuid-0007
    children:
      -
        id: element-uuid-0008
        type: SENTENCE
        implements: blockly-uuid-0009
        children:
          -
            id: element-uuid-000a
            type: SYNONYM_SET
            synonyms: [ "Buy", "Grab" ]
            text: "Buy"
            references:
              - reference-uuid-0005
          -
            id: element-uuid-000b
            type: ATTRIBUTE
            implements: blockly-uuid-000c
            attribute_name: color
            text: red
            annotations:
              - annotation-uuid-0004
          -
            id: element-uuid-000d
            type: ATTRIBUTE
            implements: blockly-uuid-000e
            attribute_name: material
            text: cotton
            annotations:
              - annotation-uuid-0004
          -
            id: element-uuid-000f
            type: WORD
            text: t-shirt
            part_of_speech: noun
          -
            id: element-uuid-0010
            type: PUNCTUATION
            text: "!"
            references:
              - reference-uuid-0005
Elements
There are three groups of elements: Container Elements, Text Elements and Annotations & References.
At the root of the tree there should be one Annotated Text Element, which is a type of Container Element.
All Element objects:
- must have an id property unique within the piece of text,
- must have a type property (an UPPER_CASE string),
- may include other type-dependent or arbitrary properties.
- In the case of an unknown property, the program using the Element should ignore the property and not report any errors to the user.
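The common Element rules can be sketched as a small check. This is a minimal illustration, not a reference validator: the function name check_element is hypothetical, ids are assumed to be non-empty strings as in the examples, and UPPER_CASE is assumed to mean capital letters and digits separated by underscores.

```python
import re

# Assumption: UPPER_CASE means capitals and digits separated by underscores.
UPPER_CASE = re.compile(r"^[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*$")


def check_element(element, seen_ids):
    """Return a list of problems for one Element dict.

    seen_ids is a set shared across the whole piece of text, so that
    duplicate ids are detected wherever they occur in the tree.
    """
    problems = []

    element_id = element.get("id")
    if not isinstance(element_id, str) or not element_id:
        problems.append("missing id")
    elif element_id in seen_ids:
        problems.append(f"duplicate id: {element_id}")
    else:
        seen_ids.add(element_id)

    element_type = element.get("type")
    if not isinstance(element_type, str) or not UPPER_CASE.match(element_type):
        problems.append(f"invalid type: {element_type!r}")

    # Every other property is deliberately left unchecked: unknown
    # properties are ignored rather than reported.
    return problems
```

Calling check_element on every Element of a tree with the same seen_ids set covers the uniqueness rule for the whole piece of text.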
Annotations & References
- Describe relationships between words.
- Can only be attached to the ANNOTATED_TEXT Element at the root.
- Should be referenced by the Text Elements participating in the relationship.
Synthetic example:
id: reference-uuid-0001
type: SAME_PERSON
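One way to check that every link resolves is sketched below. It assumes that Elements point at Annotations and References by id through their annotations and references arrays, as the examples in this document do. The function names are hypothetical.

```python
def iter_elements(element):
    """Yield an Element dict and all of its descendants, depth first."""
    yield element
    for child in element.get("children", []):
        yield from iter_elements(child)


def undeclared_links(root):
    """Return (element id, property, target id) for links the root does not declare."""
    missing = []
    for prop in ("annotations", "references"):
        # The root carries full objects; other Elements carry ids only.
        declared = {item["id"] for item in root.get(prop, [])}
        for element in iter_elements(root):
            if element is root:
                continue
            for target in element.get(prop, []):
                if target not in declared:
                    missing.append((element["id"], prop, target))
    return missing
```

An empty result means every id used by a child Element is declared on the root ANNOTATED_TEXT Element.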
Container Elements
- Must have a (non-empty) children array containing Container Elements and/or Text Elements.
Example types: ANNOTATED_TEXT, EMPHASIS, PARAGRAPH, SENTENCE.
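Because Container Elements nest, consumers typically walk the tree recursively. The sketch below yields the Text Elements in document order. It assumes that a Container Element is recognizable by its children property and a Text Element by its text property. The function name is hypothetical.

```python
def iter_text_elements(element):
    """Yield Text Elements (leaves carrying a text property) in document order."""
    children = element.get("children")
    if children is None:
        # Not a container: treat it as a Text Element if it carries text.
        if "text" in element:
            yield element
        return
    for child in children:
        yield from iter_text_elements(child)
```

For example, [e["text"] for e in iter_text_elements(root)] gives the surface strings of a piece of text from left to right; how they are joined (spacing, punctuation) is left to the consumer.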
Annotated Text Element
- Must have type ANNOTATED_TEXT.
- Must reference the Document Plan that created it.
- Should include arrays of Annotations & References used by child Elements.
Minimal Valid Example
id: text-uuid-0001
type: ANNOTATED_TEXT
plan_id: plan-uuid-0002
plan_version: plan-uuid-0003
annotations: []
references: []
children: []
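For reference, the same minimal document can be written as JSON and checked for the root-specific rules. This is only a sketch: check_root is a hypothetical name, and it assumes the Document Plan is referenced through the plan_id and plan_version properties shown in the examples.

```python
import json

# The minimal example above, encoded as JSON.
MINIMAL_JSON = """
{
  "id": "text-uuid-0001",
  "type": "ANNOTATED_TEXT",
  "plan_id": "plan-uuid-0002",
  "plan_version": "plan-uuid-0003",
  "annotations": [],
  "references": [],
  "children": []
}
"""


def check_root(root):
    """Return a list of problems specific to the Annotated Text Element."""
    problems = []
    if root.get("type") != "ANNOTATED_TEXT":
        problems.append("root type must be ANNOTATED_TEXT")
    # Assumption: these two properties are how the Document Plan is referenced.
    for prop in ("plan_id", "plan_version"):
        if not root.get(prop):
            problems.append(f"missing {prop}")
    if not isinstance(root.get("children"), list):
        problems.append("children must be an array")
    # annotations and references are recommended, so only their shape is checked.
    for prop in ("annotations", "references"):
        if prop in root and not isinstance(root[prop], list):
            problems.append(f"{prop} must be an array")
    return problems
```

With these definitions, check_root(json.loads(MINIMAL_JSON)) returns an empty list.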
Text Elements
- Must have a non-empty text string property.
- Must not have children.
- Should list the relevant annotations and references.
Example types: ATTRIBUTE, PUNCTUATION, SYNONYM_SET, WORD.
Example
id: element-uuid-0001
type: WORD
text: dog
annotations:
  - annotation-uuid-0002
  - annotation-uuid-0003
references:
  - reference-uuid-0004
  - reference-uuid-0005
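The Text Element rules can be checked along these lines. This is a sketch under the assumption that annotations and references are lists of id strings, as in the example above. The function name is hypothetical.

```python
def check_text_element(element):
    """Return a list of problems specific to a Text Element."""
    problems = []

    text = element.get("text")
    if not isinstance(text, str) or text == "":
        problems.append("text must be a non-empty string")

    if "children" in element:
        problems.append("text elements must not have children")

    # Both link arrays are optional; when present they should hold ids.
    for prop in ("annotations", "references"):
        ids = element.get(prop, [])
        if not all(isinstance(item, str) for item in ids):
            problems.append(f"{prop} must list ids")

    return problems
```

The WORD example above passes this check with no problems reported.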
Reserved property names
These properties are reserved for future use and should NOT be used for Element types without prior consultation:
- attributes
- class
- classList
- className
- data
- dataset
- href
- innerHTML
- innerText
- name
- outerHTML
- parent
- rel
- src
- title
- value
Note: this rule applies to all forms of the name (e.g. classList, class_list).
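A tool that defines new Element types could guard against reserved names as sketched below. It assumes that "all forms of the name" means spellings that differ only in letter case and separators; that interpretation, and the function names, are not part of the format.

```python
import re

RESERVED = {
    "attributes", "class", "classList", "className", "data", "dataset",
    "href", "innerHTML", "innerText", "name", "outerHTML", "parent",
    "rel", "src", "title", "value",
}


def normalize(name):
    """Drop case and separators so classList, class_list and CLASS-LIST compare equal."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


RESERVED_NORMALIZED = {normalize(name) for name in RESERVED}


def is_reserved(name):
    """Return True if a property name is a form of a reserved name."""
    return normalize(name) in RESERVED_NORMALIZED
```

Under this interpretation is_reserved("class_list") and is_reserved("inner_html") are both true, while property names used elsewhere in this document, such as attribute_name or part_of_speech, are not flagged.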