This page describes some common recurring substructures present in the PEML data model.
Describing Files
A Single File
Schema:
A single file is specified as either a string (consisting of a
url(...)
that defines the file's name, location, and
content) or an object containing a series of
nested keys, as described below.
PEML example:
file.name optional: string
The nested name key provides the name of the file.
This key is optional: it can be omitted if the name is implied by
other parameters, or if the tool processing the description supplies
a default name.
file.type optional: string
The nested type
key provides the MIME type for the
file, if needed. This may be used in some cases to ensure the
tool processes the file as intended, but is optional, since its
use is tool-dependent.
file.content required: string or array
The nested content
key allows for the content of
the file to be inlined as part of the PEML description in situations
where that is simpler or more convenient, or where it allows
the PEML description to be used as a single self-contained
resource without requiring a zip archive containing multiple
files. HereDoc-style quoting may be useful for inline file content.
In addition, if the content is being represented
inline in the PEML description and actually consists of structured
data, the structured data can be represented directly in PEML
in the form of a nested hash or an array of records.
file.content_encoding optional: string
The nested content_encoding key describes the content transfer
encoding for binary file contents. For example, when inlining the
content of a binary file, base64 encoding is preferred. Which content
encodings beyond base64 are supported (such as quoted-printable) is
tool-dependent. The non-text encodings used in SMTP and JSON (e.g.,
7bit, 8bit, or binary) seem unsuitable for inlining files in a text
representation.
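As a minimal sketch of the base64 case described above, the following Python shows how a tool might inline binary content and recover it again. The function names and the use of a plain dictionary to stand for a file object are assumptions made for illustration; they are not part of PEML.

```python
import base64


def inline_binary_file(name: str, data: bytes, mime_type: str) -> dict:
    """Build a file object whose binary content is inlined as base64 text."""
    return {
        "name": name,
        "type": mime_type,
        "content": base64.b64encode(data).decode("ascii"),
        "content_encoding": "base64",
    }


def decode_content(file_obj: dict) -> bytes:
    """Recover raw bytes; only base64 and unencoded text are handled here."""
    if file_obj.get("content_encoding") == "base64":
        return base64.b64decode(file_obj["content"])
    # Assumption: content without an encoding is plain UTF-8 text.
    return file_obj["content"].encode("utf-8")


# The 8-byte PNG signature stands in for a real binary file.
png_signature = b"\x89PNG\r\n\x1a\n"
file_obj = inline_binary_file("logo.png", png_signature, "image/png")
```

Here file_obj["content"] is ordinary ASCII text, so it can sit inside a text-based description, and decode_content(file_obj) returns the original bytes.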
A Set of Files
Schema:
Like a single file, a file set can be specified in one of two ways.
First, a file set can be a single string value, consisting of a
url(...)
that defines the location of a directory (or
just a single file, if the file set contains only a single element).
When a URL is specified, the entire subtree is considered to be the
set of files intended. If the URL actually refers to a file archive
(*.zip, *.jar, *.tar, *.tgz), and the file set is intended to be the
contents of the archive instead of that file itself, then use
extract(...) in place of url(...).
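To illustrate the distinction between url(...) and extract(...), here is a hedged Python sketch of how a tool might classify a file set given as a string. The exact grammar (quoting, whitespace handling) is not specified on this page, so the regular expression and the function name below are assumptions.

```python
import re

# Assumption: the locator is written as url(<location>) or extract(<location>)
# with no quoting around the location.
_LOCATOR = re.compile(r"^\s*(url|extract)\((.*)\)\s*$", re.DOTALL)


def parse_file_set_string(value: str) -> tuple[str, str]:
    """Return (kind, location), where kind is 'url' or 'extract'."""
    match = _LOCATOR.match(value)
    if match is None:
        raise ValueError(f"not a url(...) or extract(...) value: {value!r}")
    kind, location = match.groups()
    return kind, location.strip()
```

A tool could then treat a 'url' result as a directory subtree (or single file) and an 'extract' result as an archive whose contents form the file set.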
Alternatively, a file set can be a plain
array, where each item in the array is a single file. In most cases,
the key name files
is used when a file set should be
provided.
When providing an array of files, remember that PEML uses repeated occurrences of the first key provided for the first array item to mark when each new item starts, so whichever key is provided first should consistently be used to start each new item in the array.
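The item-boundary rule can be sketched in Python as follows. This is not the PEML parser; it assumes the key/value pairs inside an array have already been read in order, and only shows how a repeat of the first key starts a new item.

```python
def group_array_items(pairs: list[tuple[str, str]]) -> list[dict[str, str]]:
    """Split an ordered list of (key, value) pairs into array items.

    The first key seen marks the start of every item, so each repeat
    of that key begins a new item.
    """
    if not pairs:
        return []
    first_key = pairs[0][0]
    items: list[dict[str, str]] = []
    for key, value in pairs:
        if key == first_key:
            items.append({})  # a repeat of the first key opens a new item
        items[-1][key] = value
    return items


pairs = [
    ("name", "a.txt"), ("content", "A"),
    ("name", "b.txt"), ("type", "text/plain"), ("content", "B"),
]
items = group_array_items(pairs)
```

Because name comes first in the first item, the second name starts the second item. Had the second item begun with type instead, type would have been folded into the first item, which is why the starting key should be used consistently.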
PEML example:
Repositories
While providing remote locations for files is useful, many authors may use forms of version control to manage the files or resources referenced in an exercise (which is recommended). As a result, it may be useful to refer to repository-based locations when a URL is not sufficient. Here, we focus on supporting git repositories, although expansions to support other useful version control repository structures are welcome.
Schema:
PEML example:
repository.url required: string
The url
key for a repository object provides an
access path to the repository. This could be an absolute URL
referring to a net-accessible repository, or a relative URL
resolved relative to the location of the PEML description.
repository.path optional: string
The path
key for a repository object provides a
relative path within the repository (relative to the repository's
root) to identify a specific subtree or resource within the
repository. This value can be encoded directly in the
url
value, but is provided as an optional separate
key for readability/writability.
repository.branch optional: string
The branch
key for a repository object names the
specific branch being referenced within the repository, if a
branch other than the default is desired. This value can be
encoded directly in the url
value, but is provided
as an optional separate key for readability/writability.
repository.tag optional: string
The tag
key for a repository object names the
specific tag being referenced within the repository, if desired.
This value can be
encoded directly in the url
value, but is provided
as an optional separate key for readability/writability.
repository.commit optional: string
The commit
key for a repository object names the
specific commit being referenced within the repository, if desired.
Commits are specified using the same identification scheme
supported by the underlying version control system.
We strongly recommend using branches or tags where possible,
since hard-coding commit identities in PEML is brittle.
This value can be
encoded directly in the url
value, but is provided
as an optional separate key for readability/writability.
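As one possible reading of these keys for git, the Python sketch below turns a repository object into git command lines. The function names, the default destination directory, and the choice to prefer branch over tag when both are given are assumptions, not behavior defined by PEML.

```python
def git_commands(repo: dict, dest: str = "checkout") -> list[list[str]]:
    """Return git command lines that would fetch the referenced revision."""
    clone = ["git", "clone"]
    # git clone --branch accepts either a branch name or a tag name.
    ref = repo.get("branch") or repo.get("tag")
    if ref:
        clone += ["--branch", ref]
    clone += [repo["url"], dest]
    commands = [clone]
    if repo.get("commit"):
        # A specific commit needs a checkout after cloning.
        commands.append(["git", "-C", dest, "checkout", repo["commit"]])
    return commands


def resource_root(repo: dict, dest: str = "checkout") -> str:
    """Location of the referenced subtree inside the local checkout."""
    path = repo.get("path", "").strip("/")
    return f"{dest}/{path}" if path else dest
```

For example, a repository object with a url and a tag yields a single git clone --branch command, while adding a commit appends a second git checkout command.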
Environments
While many PEML descriptions provide straightforward exercises, sometimes an exercise may require specific environmental contents, setup, or support. Some tools provide a pre-defined environment for building and executing exercise answers, but others provide different means of specifying, tailoring, or extending the resources available during processing. PEML uses the idea of an "environment" to capture the set of resources available or used during processing of an answer, although which aspects of answer processing are configurable is tool-dependent.
PEML allows exercise authors to specify environmental resources used for four separate phases of answer processing in the following order, although which of these phases an individual tool recognizes is tool-dependent:
The start phase represents the environment or resources given to the learner when they start an exercise, in order for them to be able to write an answer (for example, any supporting libraries, header files, data files, etc., that should be part of the learner's setup in preparing to create an answer). Normally, this environment includes all starting resources other than skeleton source files (which are provided separately via the src key).
The build phase represents the environment used to compile or build a runnable version of the exercise answer.
The run phase represents the environment used to run an exercise answer on behalf of the learner, for example, to run learner-written software tests or to perform interactive debugging executions for the learner to inspect behavior.
The test phase represents the environment used to evaluate/assess the exercise answer, for example, to run exercise-provided reference tests to judge the correctness of an answer. Normally, this environment includes all testing resources other than actual test cases (which are provided separately via the suites key).
A Single Environment
All of the environments follow the same basic structure. Some tools may support only one environment, and may pick the "phase" that best represents that tool's view of how exercises are processed. Other tools may support a separate environment for one or more phases. PEML descriptions can specify any environment using a nested object with the following structure.
Schema:
PEML example:
Note: Are there other missing
properties/features of environments that need to be covered here? For
example, a command key might be useful to provide
a customizable entry/execution command used for a given phase, for
tools that support it?
inherits optional: string
The inherits
key for an environment indicates it
inherits all the contents of (or subsumes) another environment
(presumably one that appears earlier in the phases).
Here, "inherits" basically means that all of the files (or, more generally, contents) from the parent environment are used as the starting point for this environment, so that additional file(s) are added "on top of" the contents specified in the parent environment. That allows the new environment to add to or overwrite files or resources to define a new environment by extension (inheritance).
If the new environment specifies a container image,
it is taken to replace any container image defined in the
parent environment (that is, images override/replace each other).
File sets or repositories (which also define sets of files,
accessed in a different way) are extended/overridden in the
new environment by imagining the new environment's files are
copied onto a root with the same contents as the parent environment.
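The inheritance rules above can be sketched in Python. For illustration only, each environment's files are modeled as a mapping from relative path to content (in PEML they are a file set), and resolve_environment is a hypothetical helper name.

```python
def resolve_environment(child: dict, parent: dict) -> dict:
    """Overlay a child environment on the parent it inherits from."""
    # Start from a root with the same contents as the parent...
    merged_files = dict(parent.get("files", {}))
    # ...then copy the child's files on top, adding or overwriting.
    merged_files.update(child.get("files", {}))
    resolved = {"files": merged_files}
    # A child image replaces the parent's image rather than merging with it.
    image = child.get("image", parent.get("image"))
    if image is not None:
        resolved["image"] = image
    return resolved


build_env = {"image": "gcc:13", "files": {"Makefile": "all:", "lib/util.h": "v1"}}
test_env = {"files": {"lib/util.h": "v2", "tests/run.sh": "#!/bin/sh"}}
resolved = resolve_environment(test_env, build_env)
```

In this example the resolved environment keeps Makefile from the parent, takes the child's version of lib/util.h, adds tests/run.sh, and retains the parent's image because the child does not specify one.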
files optional: array
The files
key for an environment is a
file set that identifies the files
that make up the environment. In spirit, these are additions
to the "root" of the environment used for the corresponding
phase, although the exact interpretation is tool-dependent.
While it is easy to associate the "root" with the "current
working directory" for building or running an answer, some
tools may interpret subdirectories within the environment
specially--for example, interpreting a ./lib
nested folder as representing library dependencies, or something
similar.
When files and an image are both specified, the interpretation is
tool-dependent (as is all of the environmental handling). However,
we can imagine that for some tools, the files describe the contents
of a virtual directory tree that might be supplied as a parameter to
the corresponding container image, say through a file system virtual
mount or something similar.
repository optional: object
The repository
key for an environment is a
repository object that identifies
the files
that make up the environment by pointing to a version
control repository (or possibly a path within one). In spirit,
these are treated exactly the same as the files,
but simply accessed in a different way by going through a
version control repository instead. They should also be
interpreted using the same ideas: the repository contents
represent additions
to the "root" of the environment used for the corresponding
phase, although the exact interpretation is tool-dependent.
Note that if both repository and files are specified, we can imagine
them being combined by treating both of them as additions to the same
concept of a virtual directory root. Repository contents should be
copied first, and file sets second, so that file sets override/add to
the repository's contribution. If an image is specified, the
repository value is treated in the same manner as any files would be
with respect to the image.
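The copy order for repository contents and file sets can be sketched as follows. As a simplifying assumption, both sources are modeled as mappings from relative path to content, and build_root is a hypothetical name.

```python
def build_root(repository_files: dict[str, str], file_set: dict[str, str]) -> dict[str, str]:
    """Combine repository contents and a file set into one virtual root."""
    root: dict[str, str] = {}
    root.update(repository_files)  # repository contents are copied first
    root.update(file_set)          # file sets second, so they win on conflicts
    return root


root = build_root(
    {"README.md": "from repo", "data/input.txt": "repo data"},
    {"data/input.txt": "override", "extra.txt": "added"},
)
```

Paths that appear in both sources end up with the file set's version, and paths unique to either source are kept.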
image optional: string
The most complete way to specify an operating environment for building or executing student code is to provide a container image, similar to the approach used by Gradescope. PEML allows for environments to be specified in the form of container images, although whether this feature is supported is tool-dependent.
The image key for an environment identifies
a specific Docker image that encapsulates the environment.
We choose Docker for this because it's easy and ubiquitous,
although if anyone has clear ideas about how we can provide
broader support here, please suggest them!
The Docker container image can be specified by using a Docker image
identifier, optionally including a Docker image repository and/or tag
(basically, anything that can be used to specify a container image to a
docker run command).
registry optional: string
If an image is specified for the environment, this
parameter can be used to specify the Docker image registry from
which the image can be pulled.
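How a tool combines registry and image is not defined here. One plausible interpretation, sketched in Python, is to prefix the image identifier with the registry host, which is the form Docker uses for images outside the default registry. The function names and example values are hypothetical.

```python
def image_reference(env: dict) -> str:
    """Build the image reference a tool might pass to docker."""
    image = env["image"]
    registry = env.get("registry")
    if registry:
        # Assumption: registry is a host (optionally with a port), and the
        # image value does not already include one.
        return f"{registry.rstrip('/')}/{image}"
    return image


def docker_run_command(env: dict) -> list[str]:
    """Command line for running the environment's container once."""
    return ["docker", "run", "--rm", image_reference(env)]


env = {"image": "course/java-grader:2.1", "registry": "registry.example.com"}
reference = image_reference(env)
```

With no registry key, the image value is passed through unchanged, so Docker's default registry resolution applies.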
Note: We need to add keys here to specify authentication information and/or tokens to access non-public repositories and/or registries.
A Set of Environments
In PEML, a set of environments consists of one or more environments,
each associated with one of the start, build, run, or test phases.
Schema:
PEML example: