Specification

Background

Motivation

There is a crisis of productivity and reproducibility in the life sciences today. Projects that should take weeks end up taking months, and the vast majority of published results prove difficult for independent labs to replicate later.

Experimental protocols written in natural language are often ambiguous. For example, the phrases "spin down briefly" and "mix gently" appear in many common protocols and convey much less information than operators need to reproduce each other's work.

Design Goals

Flexible
Autoprotocol allows for a plethora of possible protocols built from a small set of instructions. No biological knowledge is included in the specification. Adding new instructions is straightforward.
Composable
High levels of complexity are enabled by building up from smaller pieces. It should be possible to start from simple, rock-solid modules and compose them into cutting-edge science.
Synthesizable
Autoprotocol is mappable directly to hardware commands for robotic automation. Human interpretation must not be necessary.
Platform Independent
Autoprotocol should be able to be generated and consumed by software written in any language on any platform.
Just Data
Encoded protocols are a linear series of instructions to execute and contain no branching logic or looping constructs evaluatable at runtime.
Learnable
A central design goal of Autoprotocol is that users can extrapolate from the parts they already know to how functionality they haven't yet used might work, and frequently guess correctly.

Conventions In This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in IETF RFC 2119.

Syntax

The code excerpts and examples here follow a few conventions. An unquoted string as a value is a type (for example, the Volume and Time type designations in the "volume" and "duration" fields below). A quoted string is a literal, as in the "op" value below. Square brackets denote an array, as in the "objects" field.

{
  "op": "my_instr",
  "objects": [Container],
  "volume": Volume,
  "duration": Time,
  "count": Int
}

Dimensioned values ("quantities")

Both volume and duration are quantities, which are strings of the format "magnitude:unit". Duration strings might be 50:second, 12:minute, 50:millisecond, and so on. Similarly, volume strings might be 25:microliter or 5:milliliter. Magnitudes may contain decimals, as in 25.2:microliter. Units are always written in the singular.
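The "magnitude:unit" format can be parsed with a single split on the colon. The following is a minimal sketch, not part of the specification; the names Quantity and parse_quantity are hypothetical, and the magnitude is held as a Decimal so that values such as 25.2 are not rounded.

```python
from decimal import Decimal, InvalidOperation
from typing import NamedTuple


class Quantity(NamedTuple):
    """Hypothetical container for a parsed quantity string."""
    magnitude: Decimal
    unit: str


def parse_quantity(text: str) -> Quantity:
    """Split a "magnitude:unit" string into its two parts."""
    magnitude, sep, unit = text.partition(":")
    if not sep or not unit:
        raise ValueError(f"expected 'magnitude:unit', got {text!r}")
    try:
        value = Decimal(magnitude)
    except InvalidOperation:
        raise ValueError(f"invalid magnitude in {text!r}") from None
    return Quantity(value, unit)


print(parse_quantity("25.2:microliter"))
```

This sketch does not check the unit against the Units table or enforce the singular form; a stricter parser could add both.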

Refs and Datarefs

A ref is an alphanumeric string: a string that contains only letters and numbers and no special characters. Refs are simply convenient identifier strings used to refer to a container defined in an access instruction. Similarly, datarefs are alphanumeric strings used to later identify any data generated by the given instruction. Refs and datarefs must be unique within each protocol.

Containers and Wells

Per the Protocol section, containers are referenced using their ref string. Wells are referenced using a slash syntax :ref/:index, like my_plate/A1.
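As an illustration of the :ref/:index syntax, a well reference can be separated into its container ref and well index by splitting on the first slash. This is a hedged sketch; the function name split_aliquot is hypothetical and the sketch does not verify that the ref exists in the protocol.

```python
from typing import Tuple


def split_aliquot(path: str) -> Tuple[str, str]:
    """Split ":ref/:index" into (ref, index), e.g. "my_plate/A1"."""
    ref, sep, index = path.partition("/")
    if not sep or not ref or not index:
        raise ValueError(f"expected 'ref/index', got {path!r}")
    return ref, index


print(split_aliquot("my_plate/A1"))
```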

Serialization

Autoprotocol protocols are serialized using JavaScript Object Notation (JSON). This choice is not intrinsically semantic, but it is mandatory for consistency and compatibility. Alternative serializations such as XML, Protocol Buffers or custom formats shall not be used.
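Because a protocol is plain JSON, any language with a JSON library can produce or consume one. The sketch below uses Python's standard json module to serialize a small protocol and read it back; the protocol content reuses values from the examples in this document and is illustrative only.

```python
import json

# A small protocol held as ordinary Python data structures.
protocol = {
    "refs": {
        "samples": {"id": "ct3b245kx34l", "discard": True},
    },
    "instructions": [
        {
            "op": "spin",
            "object": "samples",
            "duration": "30:second",
            "acceleration": "2000:g",
        }
    ],
}

# Serialize to JSON text, then parse it back to confirm nothing is lost.
text = json.dumps(protocol, indent=2)
round_tripped = json.loads(text)
print(round_tripped == protocol)
```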

Protocols

Structure

A protocol is defined by three segments:

refs
the set of containers that will be used in the protocol
instructions
the list of instructions to be performed
constraints
constraints on how the instructions should be performed

A ref is a short alphanumeric name given to a container to identify it in later instructions. Every container referenced in a protocol must also be given a destiny: either discarded at the end of the protocol, or stored.

A protocol shall not contain any segments not defined here as mandatory.

{
  "refs": {
    "dye": {
      "id": "ct13zjq79whe",
      "store": { "where": "ambient" }
    },
    "water": {
      "id": "ct149x8mea3j",
      "store": { "where": "ambient" }
    },
    "samples": {
      "id": "ct3b245kx34l",
      "discard": true
    },
    ...
  }
}

Once you have references to all the objects you want to work with, you can use them in other instructions by referring to the container itself by its ref or to aliquots within the container with the syntax :ref/:index.

In the protocol snippet below, three instructions perform the following four operations:

  • Distribute 40 μl from well water/0 into each of test/A1, test/A2, and test/A3.
  • Distribute 5 μl from well dye/0 into each of test/A1, test/A2, and test/A3.
  • Centrifuge the plate test for 30 seconds at 2000 g.
  • Take a 600 nm absorbance reading through wells test/A1, test/A2, and test/A3.
{
  "refs": { ... },
  "instructions": [
    { "op": "pipette",
      "groups": [
        { "distribute": {
          "from": "water/0",
          "to": [
            { "well": "test/A1",
              "volume": "40:microliter" },
            { "well": "test/A2",
              "volume": "40:microliter" },
            { "well": "test/A3",
              "volume": "40:microliter" }
          ]
        } }, { "distribute": {
          "from": "dye/0",
          "to": [
            { "well": "test/A1",
              "volume": "5:microliter" },
            { "well": "test/A2",
              "volume": "5:microliter" },
            { "well": "test/A3",
              "volume": "5:microliter" }
          ]
        } }
      ]
    }, {
      "op": "spin",
      "object": "test",
      "duration": "30:second",
      "acceleration": "2000:g"
    }, {
      "op": "absorbance",
      "object": "test",
      "wells": ["A1", "A2", "A3"],
      "wavelength": "600:nanometer"
    }
  ]
}

Aliquot Paths

While a protocol is just data and does not contain logic (e.g., if/then statements), it is common to use a program that does contain logic to dynamically generate a protocol. For example, the layout of wells on a variable number of plates may change depending on the number of samples being operated on, though the series of operations for each sample is the same (it is "scale invariant"). On the surface, this can make it appear complex to compare protocols over time or across different conditions.
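A generator of this kind can be sketched in a few lines. The example below is illustrative only: the helper names distribute_group and build_pipette are hypothetical, and it assumes samples are laid out along row A of a plate referenced as "test", mirroring the example protocol in the Structure section.

```python
from typing import Dict, List


def distribute_group(source: str, plate: str, wells: List[str], volume: str) -> Dict:
    """Build one "distribute" group sending `volume` from `source` to each well."""
    return {
        "distribute": {
            "from": source,
            "to": [{"well": f"{plate}/{well}", "volume": volume} for well in wells],
        }
    }


def build_pipette(sample_count: int, plate: str = "test") -> Dict:
    """Emit a pipette instruction whose size depends on `sample_count`."""
    # Assumption: samples fill row A left to right (A1, A2, ...).
    wells = [f"A{i}" for i in range(1, sample_count + 1)]
    return {
        "op": "pipette",
        "groups": [
            distribute_group("water/0", plate, wells, "40:microliter"),
            distribute_group("dye/0", plate, wells, "5:microliter"),
        ],
    }


instruction = build_pipette(3)
print([t["well"] for t in instruction["groups"][0]["distribute"]["to"]])
```

The logic (the loop over wells) lives in the generating program; the emitted instruction remains plain data.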

The concept of aliquot paths captures the common scientifically relevant structure across generated protocols that differ in their overall content due to scale. Two protocols are homomorphic if for every ref in one protocol there is at least one similar ref in the second protocol with the same path. Protocol homomorphism is directional: if one protocol contains additional refs not seen in the other whose paths are independent from the paths of the isomorphic refs (the refs do not interact and constitute completely separate "subroutines" within the protocol), the protocols may still be said to be homomorphic in the context of the refs with common paths.

Put more simply, if there are two protocols that perform the same set of conceptual operations on a different number of samples, adding additional operations and samples that have nothing to do with the existing samples doesn't break the idea that the protocols are "similar, just scaled" for the original samples.

Aliquot paths are important because they allow us to compare logical blocks of operations irrespective of how they're physically configured.

Definitions

Types

Instruction and ref specification use the following common types.

Primitive Types

Type      Definition
Boolean   true or false
Float     a floating point numeric value
Int       an integer numeric value
String    any sequence of UTF-encoded characters bounded with "

Derived Types

Type                     Example Value        Definition
Aliquot                  "growth_plate/A1"    an Autoprotocol container and a well index delimited with a /, represented as a String
Container                "growth_plate"       an Autoprotocol container referenced in the refs section of the protocol, represented as a String
Quantity (e.g. Volume)   "5:microliter"       a magnitude and a unit delimited with a :, represented as a String

Type Wrappers

Syntax         Example Specification   Definition
Enum(..)       Enum("one", "two")      any one of the enclosed values
Option<Type>   Option<String>          either the enclosed Type or null

Units

Instruction and ref specification use the following units to represent quantities.

Unit                 Examples
Acceleration         meter/second^2
Area                 meter^2
Capacitance          farad, picofarad
ElectricPotential    volt, millivolt
Frequency            hertz, rpm
Length               meter, millimeter, micrometer
Mass                 gram, milligram, microgram
Matter               mole, millimole, micromole
Power                watt, milliwatt
Pressure             pascal
Temperature          celsius, kelvin
Time                 day, hour, minute, second, millisecond
Velocity             meter/second
Volume               liter, milliliter, microliter
VolumeAcceleration   microliter/second^2
VolumeFlow           microliter/second

Fields

Some common fields that are shared across instructions are defined below.

{
  "shake_path": Enum(
    "cw_orbital",
    "ccw_orbital",
    "portrait_linear",
    "landscape_linear",
    "cw_diamond",
    "ccw_diamond",
    "portrait_down_double_orbital",
    "landscape_down_double_orbital",
    "portrait_up_double_orbital",
    "landscape_up_double_orbital"
  )
}

Refs

container_refs

Names: The refs field aliases Containers to descriptive Strings called refs.

Origins: The id field is used to specify the unique identifier of an existing Container. The new field is used to specify that this ref does not yet exist and what type of Container it should be. These two fields are mutually exclusive.

Destinies: The discard field indicates whether the ref should be discarded or not. The store field indicates how a ref should be stored. These two fields are mutually exclusive.

Covers: The cover field indicates the type of cover a container is initially covered with. If no cover is specified, the container is assumed to be uncovered.

{
  "refs": {
    String: {
      "id": Option<String>,
      "new": Option<String>,
      "store": Option<{
        "where": Option<Enum(
          "cold_80",
          "cold_20",
          "cold_4",
          "ambient",
          "warm_30",
          "warm_37"
        )>
      }>,
      "discard": Option<Boolean>,
      "cover": Option<String>
    }
  }
}
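The origin and destiny rules above lend themselves to a simple check. The following is a minimal sketch rather than a reference validator: the function name ref_errors is hypothetical, and it assumes that "discard": false is equivalent to omitting the discard field.

```python
from typing import Dict, List


def ref_errors(name: str, ref: Dict) -> List[str]:
    """Return a list of rule violations for one entry of the refs segment."""
    errors = []
    # Origins: "id" and "new" are mutually exclusive.
    if ref.get("id") is not None and ref.get("new") is not None:
        errors.append(f"{name}: 'id' and 'new' are mutually exclusive")
    # Destinies: every container must be stored or discarded, never both.
    # Assumption: "discard": false is treated the same as an absent field.
    stored = ref.get("store") is not None
    discarded = ref.get("discard") is True
    if stored and discarded:
        errors.append(f"{name}: 'store' and 'discard' are mutually exclusive")
    elif not stored and not discarded:
        errors.append(f"{name}: a destiny ('store' or 'discard') is required")
    return errors


refs = {
    "dye": {"id": "ct13zjq79whe", "store": {"where": "ambient"}},
    "samples": {"id": "ct3b245kx34l", "discard": True},
    "orphan": {"id": "ct149x8mea3j"},
}
for ref_name, ref_body in refs.items():
    print(ref_name, ref_errors(ref_name, ref_body))
```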

Instructions

The instructions field of a Protocol is made up of a list of Instructions. Each Instruction is encoded as an op, which is the instruction name, and optionally a series of additional top-level fields that encode how it should be executed.

Following is the set of instructions currently in the Autoprotocol standard.

acoustic_transfer

Acoustic liquid handling uses acoustics to propel individual droplets from a source container to a destination container. Most acoustic liquid handlers support only a discrete set of droplet sizes, and the volume field of each transfer must be a multiple of the chosen droplet_size. prevalidate_sources is used to ensure that the source wells contain enough volume to successfully complete the transfer. source_volume_limits is used to override vendor-specified defaults for what volumes should pass prevalidation.

{
  "op": "acoustic_transfer",
  "droplet_size": Option<Volume>,
  "prevalidate_sources": Option<Boolean>,
  "groups": [
    {
      "transfer": [
        {
          "from": Aliquot,
          "to": Aliquot,
          "volume": Volume
        }
      ]
    }
  ],
  "source_volume_limits": Option<{
    "min": Option<Volume>,
    "max": Option<Volume>
  }>
}
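The multiple-of-droplet_size rule can be checked before an instruction is submitted. The sketch below is illustrative: the function name is_droplet_multiple is hypothetical, it assumes both quantities are given in the same unit (no unit conversion is attempted), and it uses Decimal to avoid floating point remainder errors.

```python
from decimal import Decimal


def is_droplet_multiple(volume: str, droplet_size: str) -> bool:
    """Check that a transfer volume is a whole number of droplets."""
    volume_magnitude, volume_unit = volume.split(":")
    droplet_magnitude, droplet_unit = droplet_size.split(":")
    # Assumption: both quantities use the same unit; conversion is out of scope.
    if volume_unit != droplet_unit:
        raise ValueError("volume and droplet_size must share a unit in this sketch")
    return Decimal(volume_magnitude) % Decimal(droplet_magnitude) == 0


print(is_droplet_multiple("0.5:microliter", "0.025:microliter"))
print(is_droplet_multiple("0.51:microliter", "0.025:microliter"))
```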

cover

Containers must be covered or sealed for storage, incubation, and centrifugation operations (among others). Many instructions, including liquid handling operations, require that a container be uncovered before use. retrieve_lid indicates that a lid previously saved by an uncover operation with store_lid should be used.

{
  "op": "cover",
  "object": Container,
  "lid": String,
  "retrieve_lid": Option<Boolean>
}

incubate

The incubate instruction stores a sample in an incubator with the appropriate settings for a given duration.

{
  "op": "incubate",
  "object": Container,
  "where": Enum(
    "cold_20",
    "cold_4",
    "ambient",
    "warm_37"
  ),
  "duration": Time,
  "shaking": Boolean,
  "co2_percent": Option<Float>,
  "target_temperature": Option<Temperature>,
  "shaking_params": Option<{
    "frequency": Frequency,
    "path": Option<shake_path>,
    "amplitude": Option<Length>
  }>
}

liquid_handle

The liquid_handle instruction acts as a framework to allow precise control over liquid handling parameters and express a broad range of liquid handling operations.

The liquid_handle operation is based around transporting volumes of liquid in and out of locations. Each operation is a locations sequence; each element names a location and carries a transports sequence specifying the list of volumes. The position of device components may reset between elements of locations. Transports within the same locations use the same consumables (i.e. tips in the case of air_displacement liquid handling).

{
  "op": "liquid_handle",
  "locations": [
    {
      "location": Option<Aliquot>,
      "transports": Option<[
        {
          "volume": Option<Volume>,
          "pump_override_volume": Option<Volume>,
          "flowrate": Option<{
            "target": Option<VolumeFlow>,
            "initial": Option<VolumeFlow>,
            "cutoff": Option<VolumeFlow>,
            "acceleration": Option<
              VolumeAcceleration
            >,
            "deceleration": Option<
              VolumeAcceleration
            >
          }>,
          "delay_time": Option<Time>,
          "mode_params": Option<{
            "liquid_class": Option<Enum(
              "air",
              "default"
            )>,
            "tip_position": Option<{
              "position_x": Option<{
                "position": Option<Float>,
                "move_rate": Option<{
                  "target": Option<Velocity>,
                  "acceleration": Option<
                    Acceleration
                  >
                }>
              }>,
              "position_y": Option<{
                "position": Option<Float>,
                "move_rate": Option<{
                  "target": Option<Velocity>,
                  "acceleration": Option<
                    Acceleration
                  >
                }>
              }>,
              "position_z": Option<{
                "offset": Option<Length>,
                "move_rate": Option<{
                  "target": Option<Velocity>,
                  "acceleration": Option<
                    Acceleration
                  >
                }>,
                "reference": Option<Enum(
                  "well_top",
                  "well_bottom",
                  "liquid_surface",
                  "preceding_position"
                )>,
                "detection": Option<{
                  "method": Option<Enum(
                    "tracked",
                    "pressure",
                    "capacitance"
                  )>,
                  "threshold": Option<Enum(
                    "pressure",
                    "capacitance"
                  )>,
                  "duration": Option<Time>,
                  "fallback": Option<position_z>
                }>
              }>
            }>
          }>
        }
      ]>,
      "temperature": Option<Temperature>
    }
  ],
  "mode": Option<Enum(
    "air_displacement",
    "dispense"
  )>,
  "mode_params": Option<{
    "tip_type": Option<String>
  }>,
  "shape": Option<{
    "rows": Int,
    "columns": Int,
    "format": Option<Enum("SBS96", "SBS384")>
  }>
}

measure_mass

The measure_mass instruction can be used to determine the mass of a sample (container). The execution is vendor specific and may or may not consume a fraction of the sample. The accuracy of results is vendor specific.

{
  "op": "measure_mass",
  "object": Container,
  "dataref": String
}

measure_volume

The measure_volume instruction can be used to determine the volume of a sample. The execution is vendor specific and may or may not consume a fraction of the sample. The accuracy of results is vendor specific.

{
  "op": "measure_volume",
  "object": [Aliquot],
  "dataref": String
}

provision

The provision instruction encodes adding some amount of an external resource to an aliquot or series of aliquots.

{
  "op": "provision",
  "resource_id": String,
  "to": [
    {
      "well": Aliquot,
      "volume": Volume,
      "dispense_velocity": Option<VolumeFlow>,
      "mix_after": Option<{
        "volume": Volume,
        "repetitions": Int,
        "velocity": Option<VolumeFlow>
      }>
    }
  ]
}

seal

Containers must be covered or sealed for storage, incubation, and centrifugation operations (among others). Seal types have useful properties ranging from optical clarity to gas permeability. Seals can be applied by either thermal or adhesive sealers, which result in different seal integrity. Thermal seals can be applied with a range of temperatures and durations that can be optimized for different plate types. Many instructions, including liquid handling operations, require that a container be uncovered before use.

{
  "op": "seal",
  "object": Container,
  "type": String,
  "mode": Option<Enum("thermal", "adhesive")>,
  "mode_params": Option<{
    "temperature": Option<Temperature>,
    "duration": Option<Time>
  }>
}

spectrophotometry

The spectrophotometry instruction encodes one or a series of plate reading steps executed on a single container with the same device. This could be executed once, or at a defined interval, across some total duration. There are 4 valid modes (absorbance, fluorescence, luminescence, and shake), each of which accepts a different set of mode_params.

{
  "op": "spectrophotometry",
  "dataref": String,
  "object": Container,
  "interval": Option<Time>,
  "num_intervals": Option<Int>,
  "temperature": Option<Temperature>,
  "shake_before": Option<{
    "duration": Time,
    "frequency": Option<Frequency>,
    "amplitude": Option<Length>,
    "path": Option<shake_path>
  }>,
  "groups": [
    Option<{
      "mode": "absorbance",
      "mode_params": {
        "wells": [Aliquot],
        "wavelength": [Length],
        "num_flashes": Option<Int>,
        "settle_time": Option<Time>,
        "read_position": Option<Enum(
          "top",
          "bottom"
        )>,
        "position_z": Option<{
          "manual": Option<{
            "displacement": Length,
            "reference": Enum(
              "plate_bottom",
              "plate_top",
              "well_bottom",
              "well_top"
            )
          }>,
          "calculated_from_wells": Option<{
            "wells": [Aliquot],
            "heuristic": Enum(
              "max_mean_read_without_saturation",
              "closest_length_without_saturation"
            )
          }>
        }>
      }
    }>,
    Option<{
      "mode": "fluorescence",
      "mode_params": {
        "excitation": [{
            "shortpass": Option<Length>,
            "longpass": Option<Length>,
            "ideal": Option<Length>
        }],
        "emission": [{
            "shortpass": Option<Length>,
            "longpass": Option<Length>,
            "ideal": Option<Length>
        }],
        "num_flashes": Option<Int>,
        "settle_time": Option<Time>,
        "lag_time": Option<Time>,
        "integration_time": Option<Time>,
        "gain": Option<Float>,
        "read_position": Option<Enum(
          "top",
          "bottom"
        )>,
        "position_z": Option<{
          "manual": Option<{
            "displacement": Length,
            "reference": Enum(
              "plate_bottom",
              "plate_top",
              "well_bottom",
              "well_top"
            )
          }>,
          "calculated_from_wells": Option<{
            "wells": [Aliquot],
            "heuristic": Enum(
              "max_mean_read_without_saturation",
              "closest_length_without_saturation"
            )
          }>
        }>
      }
    }>,
    Option<{
      "mode": "luminescence",
      "mode_params": {
        "wells": [Aliquot],
        "num_flashes": Option<Int>,
        "settle_time": Option<Time>,
        "integration_time": Option<Time>,
        "gain": Option<Float>,
        "read_position": Option<Enum(
          "top",
          "bottom"
        )>,
        "position_z": Option<{
          "manual": Option<{
            "displacement": Length,
            "reference": Enum(
              "plate_bottom",
              "plate_top",
              "well_bottom",
              "well_top"
            )
          }>,
          "calculated_from_wells": Option<{
            "wells": [Aliquot],
            "heuristic": Enum(
              "max_mean_read_without_saturation",
              "closest_length_without_saturation"
            )
          }>
        }>
      }
    }>,
    Option<{
      "mode": "shake",
      "mode_params": {
        "duration": Option<Time>,
        "frequency": Option<Frequency>,
        "amplitude": Option<Length>,
        "path": Option<shake_path>
      }
    }>
  ]
}

spin

The spin instruction is used to represent a series of centrifugation steps. The inwards and outwards flow_direction values encode spinning the contents into or out of a container respectively. The operation is repeated with the appropriate direction for each element in spin_direction.

{
  "op": "spin",
  "object": Container,
  "acceleration": Acceleration,
  "duration": Time,
  "flow_direction": Option<Enum(
    "inwards",
    "outwards"
  )>,
  "spin_direction": [
    Enum("cw", "ccw")
  ]
}

uncover

Containers must be covered or sealed for storage, incubation, and centrifugation operations (among others). Many instructions including liquid handling operations require that a container be uncovered before use. store_lid indicates that the lid should be saved for some subsequent cover instruction with retrieve_lid.

{
  "op": "uncover",
  "object": Container,
  "store_lid": Option<Boolean>
}

unseal

Containers must be covered or sealed for storage, incubation, and centrifugation operations (among others). Many instructions including liquid handling operations require that a container be uncovered before use.

{
  "op": "unseal",
  "object": Container
}

Constraints

time_constraints

Time constraints encode a temporal relationship between two time points, from and to.

Each of the time points must specify exactly one of their optional fields. ref_start and ref_end encode the points at which a Container leaves its origin and enters its destiny respectively. instruction_start and instruction_end encode the points at the beginning and end of an instruction’s execution; the instruction is represented by its 0-indexed position within the instructions list.

Each time constraint may include any combination of the less_than, more_than, and ideal fields. less_than and more_than constraints encode the maximum and minimum amount of time, respectively, that is allowable between the two time points. ideal constraints encode the intended timing between two time points as well as the optimization_cost by which these fields should be weighted.

{
  "time_constraints": Option<[
    {
      "from": {
        "ref_start": Option<Container>,
        "ref_end": Option<Container>,
        "instruction_start": Option<Int>,
        "instruction_end": Option<Int>
      },
      "to": {
        "ref_start": Option<Container>,
        "ref_end": Option<Container>,
        "instruction_start": Option<Int>,
        "instruction_end": Option<Int>
      },
      "less_than": Option<Time>,
      "more_than": Option<Time>,
      "ideal": Option<{
        "value": Time,
        "optimization_cost": Option<Enum(
          "linear",
          "squared",
          "exponential"
        )>
      }>
    }
  ]>
}
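The rule that each time point specifies exactly one of its optional fields can be checked mechanically. The following is a minimal sketch under that rule only; the names POINT_FIELDS, time_point_is_valid, and constraint_errors are hypothetical, and the sketch does not compare less_than against more_than.

```python
from typing import Dict, List

POINT_FIELDS = ("ref_start", "ref_end", "instruction_start", "instruction_end")


def time_point_is_valid(point: Dict) -> bool:
    """A time point must set exactly one of its four optional fields."""
    # Compare against None so that instruction index 0 counts as set.
    return sum(point.get(field) is not None for field in POINT_FIELDS) == 1


def constraint_errors(constraint: Dict) -> List[str]:
    """Report which end points of a time constraint break the rule."""
    return [
        f"'{end}' must specify exactly one time point field"
        for end in ("from", "to")
        if not time_point_is_valid(constraint.get(end, {}))
    ]


# Example: instruction 1 must start within 5 minutes of instruction 0 ending.
constraint = {
    "from": {"instruction_end": 0},
    "to": {"instruction_start": 1},
    "less_than": "5:minute",
}
print(constraint_errors(constraint))
```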