Move the graphics protocol documentation to a separate file

This commit is contained in:
Kovid Goyal 2017-09-29 06:36:04 +05:30
parent 5e5065715e
commit 9e5dbb45d4
No known key found for this signature in database
GPG Key ID: 06BC317B515ACE7C
4 changed files with 225 additions and 256 deletions

1
.gitignore vendored
View File

@ -4,7 +4,6 @@
.build-cache
tags
build
README.html
linux-package
logo/*.iconset
test-launcher

222
graphics-protocol.asciidoc Normal file
View File

@ -0,0 +1,222 @@
= The terminal graphics protocol
The goal of this specification is to create a flexible and performant protocol
that allows the program running in the terminal, hereafter called the _client_,
to render arbitrary pixel (raster) graphics to the screen of the terminal
emulator. The major design goals are
* Should not require terminal emulators to understand image formats.
* Should allow specifying graphics to be drawn at individual pixel positions.
* The graphics should integrate with the text, in particular it should be possible to draw graphics
below as well as above the text, with alpha blending.
* Should use optimizations when the client is running on the same computer as the terminal emulator.
For some discussion regarding the design choices, see link:../../issues/33[#33].
toc::[]
== Getting the window size
In order to know what size of images to display and how to position them, the client must be able to get the
window size in pixels and the number of cells per row and column. This can be done by using the `TIOCGWINSZ` ioctl.
Some C code to demonstrate its use
```C
struct ttysize ts;
ioctl(0, TIOCGWINSZ, &ts);
printf("number of columns: %i, number of rows: %i, screen width: %i, screen height: %i\n", sz.ws_col, sz.ws_row, sz.ws_xpixel, sz.ws_ypixel);
```
Note that some terminals return `0` for the width and height values. Such terminals should be modified to return the correct values.
Examples of terminals that return correct values: `kitty, xterm`
== The graphics escape code
All graphics escape codes are of the form:
```
<ESC>_G<control data>;<payload><ESC>\
```
This is a so-called _Application Programming Command (APC)_. Most terminal
emulators ignore APC codes, making it safe to use.
The control data is a comma-separated list of `key=value` pairs. The payload
is arbitrary binary data, base64-encoded to prevent interoperation problems
with legacy terminals that get confused by control codes within an APC code.
The meaning of the payload is interpreted based on the control data.
The first step is to transmit the actual image data.
== Transferring pixel data
The first consideration when transferring data between the client and the
terminal emulator is the format in which to do so. Since there is a vast and
growing number of image formats in existence, it does not make sense to have
every terminal emulator implement support for them. Instead, the client should
send simple pixel data to the terminal emulator. The obvious downside to this
is performance, especially when the client is running on a remote machine.
Techniques for remedying this limitation are discussed later. The terminal
emulator must understand pixel data in three formats, 24-bit RGB, 32-bit RGBA and
PNG. This is specified using the `f` key in the control data. `f=32` (which is the
default) indicates 32-bit RGBA data and `f=24` indicates 24-bit RGB data and `f=100`
indicates PNG data. The PNG format is supported for convenience and a compact way
of transmitting paletted images.
=== RGB and RGBA data
In these formats the pixel data is stored directly ans 3 or 4 bytes per pixel, respectively.
When specifying images in this format, the image dimensions **must** be sent in the control data.
For example:
```
<ESC>_Gf=24,s=10,v=20;<payload><ESC>\
```
Here the width and height are specified using the `s` and `v` keys respectively. Since
`f=24` there are three bytes per pixel and therefore the pixel data must be `3 * 10 * 20 = 600`
bytes.
=== PNG data
In this format any PNG image can be transmitted directly. The size of the PNG data
**must** be specified in the control data. For example:
```
<ESC>_Gf=100,S=4897;<payload><ESC>\
```
Here the size (in bytes) of the PNG file is specified using the `S` key and the
PNG format is specified using the `f` key. The pixel data must therefore be
`S=4897` bytes.
=== Compression
The client can send compressed image data to the terminal emulator, by specifying the
`o` key. Currently, only zlib based deflate compression is supported, which is specified using
`o=z`. For example,
```
<ESC>_Gf=24,s=10,v=20,o=z;<payload><ESC>\
```
This is the same as the example from the RGB data section, except that the
payload is now compressed using deflate. The terminal emulator will decompress
it before rendering. You can specify compression for any format. The terminal
emulator will decompress before interpreting the pixel data.
=== The transmission medium
The transmission medium is specified using the `t` key. The `t` key defaults to `d`
and can take the values:
|===
| Value of `t` | Meaning
| d | Direct (the data is transmitted within the escape code itself)
| f | A simple file
| t | A temporary file, the terminal emulator will delete the file after reading the pixel data
| s | A http://man7.org/linux/man-pages/man7/shm_overview.7.html[POSIX shared memory object]. The terminal emulator will delete it after reading the pixel data
|===
==== Local client
First let us consider the local client techniques (files and shared memory). Some examples:
```
<ESC>_Gf=100,S=3567,t=f;<encoded /path/to/file.png><ESC>\
```
Here we tell the terminal emulator to read PNG data from the specified file of
the specified size.
```
<ESC>_Gs=10,v=2,t=s,o=z;<encoded /some-shared-memory-name><ESC>\
```
Here we tell the terminal emulator to read compressed image data from
the specified shared memory object.
The client can also specify a size and offset to tell the terminal emulator
to only read a part of the specified file. The is done using the `S` and `O`
keys respectively. For example:
```
<ESC>_Gs=10,v=2,t=s,S=80,O=10;<encoded /some-shared-memory-name><ESC>\
```
This tells the terminal emulator to read `80` bytes starting from the offset `10`
inside the specified shared memory buffer.
==== Remote client
Remote clients, those that are unable to use the filesystem/shared memory to
transmit data, must send the pixel data directly using escape codes. Since
escape codes are of limited maximum length, the data will need to be chunked up
for transfer. This is done using the `m` key. The pixel data must first be
base64 encoded then chunked up into chunks no larger than `4096` bytes. The client
then sends the graphics escape code as usual, with the addition of an `m` key that
must have the value `1` for all but the last chunk, where it must be `0`. For example,
if the data is split into three chunks, the client would send the following
sequence of escape codes to the terminal emulator:
```
<ESC>_Gs=100,v=30,m=1;<encoded pixel data first chunk><ESC>\
<ESC>_Gm=1;<encoded pixel data second chunk><ESC>\
<ESC>_Gm=0;<encoded pixel data last chunk><ESC>\
```
Note that only the first escape code needs to have the full set of control
codes such as width, height, format etc. Subsequent chunks must have
only the `m` key. The client **must** finish sending all chunks for a single image
before sending any other graphics related escape codes.
=== Detecting available transmission mediums
Since a client has no a-priori knowledge of whether it shares a filesystem/shared emmory
with the terminal emulator, it can send an id with the control data, using the `i` key
(which can be an arbitrary positive integer up to 4294967295, it must not be zero).
If it does so, the terminal emulator will reply after trying to load the image, saying
whether loading was successful or not. For example:
```
<ESC>_Gi=31,s=10,v=2,t=s;<encoded /some-shared-memory-name><ESC>\
```
to which the terminal emulator will reply (after trying to load the data):
```
<ESC>_Gi=31;error message or OK<ESC>\
```
Here the `i` value will be the same as was sent by the client in the original
request. The message data will be a ASCII encoded string containing only
printable characters and spaces. The string will be `OK` if reading the pixel
data succeeded or an error message.
== Control data reference
The table below shows all the control data keys as well as what values they can
take, and the default value they take when missing.
|===
|Key | Value | Default | Description
| `a` | Single character. `(t, T, q, p)` | `t` | The overall action this graphics command is performing.
| `f` | Positive integer. `(24, 32, 100)`. | `32` | The format in which the image data is sent.
| `t` | Single character. `(d, f, t, s)`. | `d` | The transmission medium used.
| `s` | Positive integer. | `0` | The width of the image being sent.
| `v` | Positive integer. | `0` | The height of the image being sent.
| `S` | Positive integer. | `0` | The size of data to read from a file.
| `O` | Positive integer. | `0` | The offset from which to read data from a file.
| `i` | Positive integer. `(0 - 4294967295)` | `0` | The image id
| `o` | Single character. `only z` | - | The type of data compression.
| `m` | zero or one | `0` | Whether there is more chunked data available.
|===

View File

@ -427,7 +427,7 @@ grman_handle_command(GraphicsManager *self, const GraphicsCommand *g, const uint
if (!data_loaded) break;
snprintf(add_response, 10, "OK");
}
snprintf(rbuf, sizeof(rbuf)/sizeof(rbuf[0]) - 1, "\033_Gq=%u;%s\033\\", g->id, add_response);
snprintf(rbuf, sizeof(rbuf)/sizeof(rbuf[0]) - 1, "\033_Gi=%u;%s\033\\", g->id, add_response);
return rbuf;
}
break;

View File

@ -70,260 +70,8 @@ link:http://vt100.net/docs/vt510-rm/DECRPM[DECRPM]
== Graphics rendering
The goal of this specification is to create a flexible and performant protocol
that allows the program running in the terminal, hereafter called the _client_,
to render arbitrary pixel (raster) graphics to the screen of the terminal
emulator. The major design goals are
* Should not require terminal emulators to understand image formats.
* Should allow specifying graphics to be drawn per individual character cell. This allows graphics to mix with text using
the existing cursor based protocols.
* Should use optimizations when the client is running on the same computer as the terminal emulator.
For some discussion regarding the design choices, see link:../../issues/33[#33].
=== Getting the window size
In order to know what size of images to display and how to position them, the client must be able to get the
window size in pixels and the number of cells per row and column. This can be done by using the `TIOCGWINSZ` ioctl.
Some C code to demonstrate its use
```C
struct ttysize ts;
ioctl(0, TIOCGWINSZ, &ts);
printf("number of columns: %i, number of rows: %i, screen width: %i, screen height: %i\n", sz.ws_col, sz.ws_row, sz.ws_xpixel, sz.ws_ypixel);
```
Note that some terminals return `0` for the width and height values. Such terminals should be modified to return the correct values.
Examples of terminals that return correct values: `kitty, xterm`
=== Transferring pixel data
```
<ESC>_G<control data>;<payload><ESC>\
```
Before describing this escape code in detail, lets see some quick examples to get a flavor of it in action.
```
# Draw 10x20 pixels starting at the top-left corner of the current cell.
<ESC>_Gw=10,h=20,s=100;<pixel data><ESC>\
# Ditto, getting the pixel data from /tmp/pixel_data
<ESC>_Gw=10,h=20,t=f,s=100;<encoded /tmp/pixel_data><ESC>\
# Ditto, getting the pixel data from /dev/shm/pixel_data, deleting the file after reading data
<ESC>_Gw=10,h=20,t=t,s=100;<encoded /dev/shm/pixel_data><ESC>\
# Draw 10x20 pixels starting at the top-left corner of the current cell, ignoring the first 4 rows and 3 columns of the pixel data
<ESC>_Gw=10,h=20,x=3,y=4,s=100;<pixel data><ESC>\
```
This control code is an _Application-Programming Command (APC)_, indicated by
the leading `<ESC>_`. No modern terminals that I know of use APC codes, and
well-behaved terminals are supposed to ignore APC codes they do not understand.
The next character `G` indicates this APC code is for graphics data. In the future, we might
have different first letters for different needs.
The control data is a comma-separated list of key-value pairs with the restriction that
keys and values must contain only the characters `0-9a-zA-Z_-+/*`. The payload is arbitrary binary
data interpreted based on the control codes. The binary data must be base-64 encoded so as to minimize
the probability of problems with legacy systems that might interpret control
codes in the binary data incorrectly.
The key to the operation of this escape code is understanding the way the control data works.
The control data's keys are split up into categories for easier reference.
==== Controlling drawing
|===
| Key | Default | Meaning
| w | full width | width -- number of columns of the pixel data to draw
| h | full height | height -- number of rows of the pixel data to draw
| x | zero | x-offset -- the column in the pixel data to start from (0-based)
| y | zero | y-offset -- the row in the pixel data to start from (0-based)
|===
The origin for `(x, y)` is the top left corner of the pixel data, with `x`
increasing from left-to-right and `y` increasing downwards. The terminal
emulator will draw the specified region starting at the top-left corner of the
current cell. If the width is greater than a single cell, the cursor will be
moved one cell to the right and drawing will continue. If the cursor reaches
the end of the line, it moves to the next line and starts drawing the next row
of data. This means that the displayed image will be truncated at the right
edge of the screen. If the cursor needs to move past the bottom of the screen,
the screen is scrolled. After the entire region is drawn, the cursor will be
positioned at the first cell after the image.
Setting the width and/or height to zero means that no drawing is done and the
cursor position remains unchanged.
==== Transmitting data
The first consideration when transferring data between the client and the
terminal emulator is the format in which to do so. Since there is a vast and
growing number of image formats in existence, it does not make sense to have
every terminal emulator implement support for them. Instead, the client should
send simple pixel data to the terminal emulator. The obvious downside to this
is performance, especially when the client is running on a remote machine.
Techniques for remedying this limitation are discussed later. The terminal
emulator must understand pixel data in two formats, 24-bit RGB and 32-bit RGBA.
This is specified using the `f` key in the control data. `f=32` (which is the
default) indicates 32-bit RGBA data and `f=24` indicates 24-bit RGB data.
One additional parameter is needed to describe the pixel data, the _stride_,
that is the number of pixels per row. This is encoded using the `s` key, which
is **required**. For example, `s=100` means there are one hundred pixels per
row in the pixel data.
Now let us turn to considering how the data is actually transmitted.
===== Local client
When the client and the terminal emulator are on the same computer and share a
filesystem or shared memory, transfer can happen efficiently using files or
shared memory objects to pass the data around. The type of transfer is
controlled by the `t` key. When sending data via files/shared memory, `t` can
take three values, described below:
|===
| Value of `t` | Meaning
| f | A simple file
| t | A temporary file, the terminal emulator will delete the file after reading the pixel data
| s | A http://man7.org/linux/man-pages/man7/shm_overview.7.html[POSIX shared memory object]. The terminal emulator will delete it after reading the pixel data
|===
In all these cases, the payload data must be the base-64 encoded absolute file path.
[[query]]An important consideration is how the client can tell if the terminal emulator
and it share a filesystem. This can be done by using the _response mode_, specifying
the `q` key, with some unique id as the value. For example,
```
<ESC>_Gt=t,s=100,q=33;<encoded /tmp/pixel_data><ESC>\
```
When the terminal emulator receives this escape code, it will read and display
the pixel data as normal, and also send an escape code back to the client
indicating whether the reading of the data was successful or not. The returned
escape code will look like:
```
<ESC>_Gq=33;<encoded error message or OK><ESC>\
```
Here the `q` value will be the same as was sent by the client in the original
request. The payload data will be a ASCII encoded string containing only
printable characters and spaces. The string will be `OK` if reading the pixel
data succeeded or an error message. Clients can set the width and height to
zero to avoid actually drawing anything on screen during the test.
===== Remote client
Remote clients, those that are unable to use the filesystem/shared memory to
transmit data, must send the pixel data directly using escape codes. Since
escape codes are of limited maximum length, the data will need to be chunked up
for transfer. This is done using the `m` key. The pixel data must first be
base64 encoded then chunked up into chunks no larger than `4096` bytes. The client
then sends the graphics escape code as usual, with the addition of an `m` key that
must have the value `1` for all but the last chunk, where it must be `0`. For example,
if the data is split into three chunks, the client would send the following
sequence of escape codes to the terminal emulator:
```
<ESC>_Gw=100,h=30,s=100,m=1;<base-64 pixel data first chunk><ESC>\
<ESC>_Gm=1;<base-64 pixel data second chunk><ESC>\
<ESC>_Gm=0;<base-64 pixel data last chunk><ESC>\
```
Note that only the first escape code needs to have the full set of control
codes such as stride, width, height, format etc. Subsequent chunks must have
only the `m` key. The client must finish sending all chunks for a single image
before sending any other graphics related escape codes.
=== Image persistence
Full screen applications may need to render the same image multiple times or
even render different parts of an image, in different locations, for example,
if the image is sprite map. Resending the image data each time this happens is
wasteful. Instead this protocol allows the client to have the terminal emulator
manage a persistent store of images.
Persistence is implemented by simply assigning an id to transmitted pixel data using the
key `i`. So for example,
```
<ESC>_Gt=t,s=100,i=some-id;<encoded /tmp/pixel_data><ESC>\
```
Now, if the client wants to redraw that image in the future, all it has to do is send
a code with the keys `t=i,i=some-id`, and no payload, like this:
```
<ESC>_Gt=i,i=some-id;<ESC>\
```
The client can use the `w, h, x, y` keys to draw different parts of the image
and draw it at different locations by positioning the cursor before sending the
code.
Saved images can be deleted, to free up resources, by using the code:
```
<ESC>_Gt=d,i=some-id;<ESC>\
```
The special value of `i=*` will cause the terminal emulator to delete all
stored images. Well behaved clients should send this code before terminating.
Terminal emulators may limit the maximum amount of saved data to avoid denial-of-service
attacks. Terminal emulators should make the limit fairly generous, at least a
few hundred, full screen, RGBA images worth of data should be allowed.
Client applications can check if an image is still stored by sending the `q`
key, as described <<query,above>>. For example,
```
<ESC>_Gt=i,i=some-id,q=some-id;<ESC>\
```
The terminal emulator will respond with:
```
<ESC>_Gq=some-id;<encoded OK or error message><ESC>\
```
If `OK` is sent the image was successfully loaded from the persistent storage, if not,
then it must be resent.
Note that when using the local filesystem to send data (`t=f`) mode, there is
no need to use this persistence mechanism, as the client can directly refer to
the file repeatedly with no overhead.
=== A summary of the control keys used
|===
|Key | Description
| f | The _format_ of the transmitted pixel data
| h | _height_ -- number of rows of the pixel data to draw
| i | _id_ to save transmitted data in persistent storage
| m | indicates whether there is _more_ data to come during a chunked transfer
| q | _query_ the terminal emulator to see if transmission succeeded
| s | The _stride_ of the transmitted pixel data
| t | The _type_ of transmission medium used
| w | _width_ -- number of columns of the pixel data to draw
| x | _x-offset_ -- the column in the pixel data to start from (0-based)
| y | _y-offset_ -- the row in the pixel data to start from (0-based)
|===
See link:graphics-protocol.asciidoc[Graphics Protocol] for a description
of this protocol to enable drawing of arbitrary easter images in the terminal.
== Keyboard handling