  subunit: A streaming protocol for test results
  Copyright (C) 2005-2013 Robert Collins <robertc@robertcollins.net>

  Licensed under either the Apache License, Version 2.0 or the BSD 3-clause
  license at the users choice. A copy of both licenses are available in the
  project source as Apache-2.0 and BSD. You may not use this file except in
  compliance with one of these two licences.

  Unless required by applicable law or agreed to in writing, software
  distributed under these licenses is distributed on an "AS IS" BASIS, WITHOUT
  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the
  license you chose for the specific language governing permissions and
  limitations under that license.

  See the COPYING file for full details on the licensing of Subunit.

  subunit reuses iso8601 by Michael Twomey, distributed under an MIT style
  licence - see python/iso8601/LICENSE for details.

Subunit
-------

Subunit is a streaming protocol for test results.

There are two major revisions of the protocol. Version 1 was trivially human
readable but had significant defects as far as highly parallel testing was
concerned - it had no room for doing discovery and execution in parallel,
required substantial buffering when multiplexing and was fragile - a corrupt
byte could cause an entire stream to be misparsed. Version 1.1 added
encapsulation of binary streams which mitigated some of the issues but the
core problems remained.

Version 2 shares many of the good characteristics of Version 1 - it can be
embedded into a regular text stream (e.g. from a build system) and it still
models xUnit style test execution. It also fixes many of the issues with
Version 1 - Version 2 can be multiplexed without excessive buffering (in
time or space), and it has a well defined recovery mechanism for dealing with
corrupted streams (e.g. where two processes write to the same stream
concurrently, or where the stream generator suffers a bug).

More details on both protocol versions can be found in 'The protocol' section
of this document.

Subunit comes with command line filters to process a subunit stream and
language bindings for python, C, C++ and shell. Bindings are easy to write
for other languages.

A number of useful things can be done easily with subunit:
 * Test aggregation: Tests run separately can be combined and then
   reported/displayed together. For instance, tests from different languages
   can be shown as a seamless whole, and tests running on multiple machines
   can be aggregated into a single stream through a multiplexer.
 * Test archiving: A test run may be recorded and replayed later.
 * Test isolation: Tests that may crash or otherwise interact badly with each
   other can be run separately and then aggregated, rather than interfering
   with each other or requiring an ad hoc test->runner reporting protocol.
 * Grid testing: subunit can act as the necessary serialisation and
   deserialisation to get test runs on distributed machines to be reported in
   real time.

Subunit supplies the following filters:
 * tap2subunit - convert perl's TestAnythingProtocol to subunit.
 * subunit2csv - convert a subunit stream to csv.
 * subunit2pyunit - convert a subunit stream to pyunit test results.
 * subunit2gtk - show a subunit stream in GTK.
 * subunit2junitxml - convert a subunit stream to JUnit's XML format.
 * subunit-diff - compare two subunit streams.
 * subunit-filter - filter out tests from a subunit stream.
 * subunit-ls - list info about tests present in a subunit stream.
 * subunit-stats - generate a summary of a subunit stream.
 * subunit-tags - add or remove tags from a stream.

Integration with other tools
----------------------------

Subunit's language bindings act as integration with various test runners like
'check', 'cppunit' and Python's 'unittest'. Beyond that a small amount of glue
(typically a few lines) will allow Subunit to be used in more sophisticated
ways.

Python
======

Subunit has excellent Python support: most of the filters and tools are written
in python and there are facilities for using Subunit to increase test isolation
seamlessly within a test suite.

The most common way is to run an existing python test suite and have it output
subunit via the ``subunit.run`` module::

  $ python -m subunit.run mypackage.tests.test_suite

For more information on the Python support Subunit offers, please see
``pydoc subunit``, or the source in ``python/subunit/``.

C
=

Subunit has C bindings to emit the protocol. The 'check' C unit testing project
has included subunit support in their project for some years now. See
'c/README' for more details.

C++
===

The C library is includable and usable directly from C++. A TestListener for
CPPUnit is included in the Subunit distribution. See 'c++/README' for details.

shell
=====

There are two sets of shell tools. There are filters, which accept a subunit
stream on stdin and output processed data (or a transformed stream) on stdout.

Then there are unittest facilities similar to those for C: shell bindings
consisting of simple functions to output protocol elements, and a patch for
adding subunit output to the 'ShUnit' shell test runner. See 'shell/README' for
details.

Filter recipes
--------------

To ignore some failing tests whose root cause is already known::

  subunit-filter --without 'AttributeError.*flavor'


The xUnit test model
--------------------

Subunit implements a slightly modified xUnit test model. The stock standard
model is that there are tests, which have an id(), can be run, and when run
start, emit an outcome (like success or failure) and then finish.

Subunit extends this with the idea of test enumeration (find out about tests
a runner has without running them), tags (allow users to describe tests in
ways the test framework doesn't apply any semantic value to), file attachments
(allow arbitrary data to make analysing a failure easy) and timestamps.

The protocol
------------

Version 2, or v2, is new and still under development, but is intended to
supersede version 1 in the very near future. Subunit's bundled tools accept
only version 2 and only emit version 2, but the new filters subunit-1to2 and
subunit-2to1 can be used to interoperate with older third party libraries.

Version 2
=========

Version 2 is a binary protocol consisting of independent packets that can be
embedded in the output from tools like make - as long as each packet has no
other bytes mixed in with it (which 'make -j N>1' has a tendency of doing).
Version 2 is currently in draft form, and early adopters should be willing
to either discard stored results (if protocol changes are made), or bulk
convert them back to v1 and then to a newer edition of v2.

The protocol synchronises at the start of the stream, after a packet, or
after any 0x0A byte. That is, a subunit v2 packet starts after a newline or
directly after the end of the prior packet.

Subunit is intended to be transported over a reliable streaming protocol such
as TCP. As such it does not concern itself with out of order delivery of
packets. However, because of the possibility of corruption due to either
bugs in the sender, or due to mixed up data from concurrent writes to the same
fd when being embedded, subunit strives to recover reasonably gracefully from
damaged data.

A key design goal for Subunit version 2 is to allow processing and multiplexing
without forcing buffering for semantic correctness, as buffering tends to hide
hung or otherwise misbehaving tests. That said, limited time based buffering
for network efficiency is a good idea - this is ultimately the implementer's
choice. Line buffering is also discouraged for subunit streams, as dropping
into a debugger or other tool may require interactive traffic even if line
buffering would not otherwise be a problem.

In version two there are two conceptual events - a test status event and a file
attachment event. Events may have timestamps, and the path of multiplexers that
an event is routed through is recorded to permit sending actions back to the
source (such as new tests to run or stdin for driving debuggers and other
interactive input). Test status events are used to enumerate tests, to report
tests and test helpers as they run. Tests may have tags, used to allow
tunnelling extra meanings through subunit without requiring parsing of
arbitrary file attachments. Things that are not standalone tests get marked
as such by setting the 'Runnable' flag to false. (For instance, individual
assertions in TAP are not runnable tests, only the top level TAP test script
is runnable).

File attachments are used to provide rich detail about the nature of a failure.
File attachments can also be used to encapsulate stdout and stderr both during
and outside tests.

Most numbers are stored in network byte order (Most Significant Byte first)
and encoded using a variation of http://www.dlugosz.com/ZIP2/VLI.html. The top
two bits of the first byte encode the total number of octets in the number.
This encoding can represent values from 0 to 2**30-1, enough to encode a
nanosecond count. Numbers that are not variable length encoded are still
stored in MSB order.

+--------+--------+---------------+-------------+
| prefix | octets | max (formula) | max (value) |
+--------+--------+---------------+-------------+
| 00     |      1 |        2**6-1 |          63 |
| 01     |      2 |       2**14-1 |       16383 |
| 10     |      3 |       2**22-1 |     4194303 |
| 11     |      4 |       2**30-1 |  1073741823 |
+--------+--------+---------------+-------------+
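
The table above can be exercised with a short Python sketch. The helper names
``encode_number`` and ``decode_number`` are illustrative assumptions and are
not taken from Subunit's own API; the sketch only follows the prefix rules
described in this section.

```python
from typing import Tuple


def encode_number(value: int) -> bytes:
    """Encode a non-negative integer with the two-bit octet-count prefix."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value < 2**6:
        return value.to_bytes(1, "big")                 # prefix 00
    if value < 2**14:
        return (value | 0x4000).to_bytes(2, "big")      # prefix 01
    if value < 2**22:
        return (value | 0x800000).to_bytes(3, "big")    # prefix 10
    if value < 2**30:
        return (value | 0xC0000000).to_bytes(4, "big")  # prefix 11
    raise ValueError("value too large to encode: %d" % value)


def decode_number(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, octets consumed) for the number starting at offset."""
    octets = (data[offset] >> 6) + 1
    chunk = data[offset:offset + octets]
    if len(chunk) < octets:
        raise ValueError("truncated number")
    # Mask off the two prefix bits of the first octet.
    value = int.from_bytes(chunk, "big") & ((1 << (8 * octets - 2)) - 1)
    return value, octets
```

Under these assumptions 63 fits in a single octet while 64 needs two, because
the top two bits of the first octet are reserved for the prefix.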

All variable length elements of the packet are stored with a length prefix
number, allowing consumers that don't need to interpret them to skip over
them.

UTF-8 strings are stored with no terminating NUL and should not have any
embedded NULs (implementations SHOULD validate any such strings that they
process and take some remedial action, such as discarding the packet as
corrupt).
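
A minimal sketch of a length prefixed UTF-8 string follows. It assumes the
length prefix counts the octets of the UTF-8 encoding and uses the variable
length number format described earlier; the function names are hypothetical
and not part of Subunit's API.

```python
from typing import Tuple


def encode_number(value: int) -> bytes:
    """Variable length number with a two-bit octet-count prefix."""
    if value < 0:
        raise ValueError("value must be non-negative")
    for octets in (1, 2, 3, 4):
        if value < 1 << (8 * octets - 2):
            prefix = (octets - 1) << (8 * octets - 2)
            return (value | prefix).to_bytes(octets, "big")
    raise ValueError("value too large to encode")


def encode_utf8(text: str) -> bytes:
    """Encode text as a length prefix followed by its UTF-8 octets."""
    if "\x00" in text:
        raise ValueError("embedded NUL is not allowed")
    raw = text.encode("utf-8")
    return encode_number(len(raw)) + raw


def decode_utf8(data: bytes, offset: int = 0) -> Tuple[str, int]:
    """Return (text, offset just past the string), validating the content."""
    octets = (data[offset] >> 6) + 1
    prefix = data[offset:offset + octets]
    if len(prefix) < octets:
        raise ValueError("truncated length prefix")
    length = int.from_bytes(prefix, "big") & ((1 << (8 * octets - 2)) - 1)
    start = offset + octets
    raw = data[start:start + length]
    if len(raw) != length:
        raise ValueError("truncated string")
    text = raw.decode("utf-8")  # raises UnicodeDecodeError on invalid UTF-8
    if "\x00" in text:
        raise ValueError("embedded NUL is not allowed")
    return text, start + length
```

A caller that treats any ``ValueError`` from ``decode_utf8`` as a corrupt
packet gets the remedial action suggested above, since ``UnicodeDecodeError``
is a subclass of ``ValueError``.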

In short the structure of a packet is:
PACKET := SIGNATURE FLAGS PACKET_LENGTH TIMESTAMP? TESTID? TAGS? MIME?
          FILECONTENT? ROUTING_CODE? CRC32

In more detail...

Packets are identified by a single byte signature - 0xB3, which is never legal
in a UTF-8 stream as the first byte of a character. 0xB3 starts with the first
bit set and the second not, which is the UTF-8 signature for a continuation
byte. 0xB3 was chosen as 0x73 ('s' in ASCII) with the top two bits replaced by
the 1 and 0 for a continuation byte.
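
The derivation of the signature byte can be checked with a few lines of
Python. The function names below are illustrative only.

```python
SIGNATURE = 0xB3


def derive_signature(letter: str = "s") -> int:
    """Replace the top two bits of an ASCII letter with the bits 1 and 0."""
    return (ord(letter) & 0x3F) | 0x80


def is_utf8_continuation(byte: int) -> bool:
    """True when the byte has the 10xxxxxx pattern of a continuation byte."""
    return (byte & 0xC0) == 0x80
```

Here ``derive_signature()`` yields 0xB3, and ``is_utf8_continuation(0xB3)`` is
true, which is why the byte cannot begin a character in valid UTF-8 text.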

If subunit packets are being embedded in a non-UTF-8 text stream, where 0xB3 is
a legal character, consider either recoding the text to UTF-8, or using
subunit's 'file' packets to embed the text stream in subunit, rather than the
other way around.

Following the signature byte comes a 16-bit flags field, which includes a
4-bit version field - if the version is not 0x2 then the packet cannot be
read. It is recommended to signal an error at this point (e.g. by emitting
a synthetic error packet and returning to the top level loop to look for
new packets, or exiting with an error). If recovery is desired, treat the
packet signature as an opaque byte and scan for a new synchronisation point.
NB: Subunit V1 and V2 packets may legitimately include 0xB3 internally,
as they are an 8-bit safe container format, so recovery from this situation
may involve an arbitrary number of false positives until an actual packet
is encountered, and even then it may still be false, failing after passing
the version check due to coincidence.

Flags are stored in network byte order too.
+-------------------------+------------------------+
| High byte               | Low byte               |
| 15 14 13 12 11 10  9  8 | 7  6  5  4  3  2  1  0 |
| VERSION    |feature bits| feature bits  | status |
+------------+------------+---------------+--------+

Valid version values are:
0x2 - version 2

Feature bits:
Bit 11 - mask 0x0800 - Test id present.
Bit 10 - mask 0x0400 - Routing code present.
Bit  9 - mask 0x0200 - Timestamp present.
Bit  8 - mask 0x0100 - Test is 'runnable'.
Bit  7 - mask 0x0080 - Tags are present.
Bit  6 - mask 0x0040 - File content is present.
Bit  5 - mask 0x0020 - File MIME type is present.
Bit  4 - mask 0x0010 - EOF marker.
Bit  3 - mask 0x0008 - Must be zero in version 2.

Test status gets three bits:
Bit 2 | Bit 1 | Bit 0 - mask 0x0007 - A test status enum lookup:
0x0 - undefined / no test
0x1 - Enumeration / existence
0x2 - In progress
0x3 - Success
0x4 - Unexpected Success
0x5 - Skipped
0x6 - Failed
0x7 - Expected failure
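
As an illustration, the flags field can be decoded as sketched below. The
masks come from the lists above; the feature and status labels, and the
``decode_flags`` function itself, are hypothetical names chosen for this
example.

```python
import struct
from typing import FrozenSet, Tuple

# Masks copied from the feature bit list; the labels are illustrative.
FEATURE_MASKS = {
    0x0800: "test_id",
    0x0400: "routing_code",
    0x0200: "timestamp",
    0x0100: "runnable",
    0x0080: "tags",
    0x0040: "file_content",
    0x0020: "mime_type",
    0x0010: "eof",
}

# The index into this tuple is the three-bit status value.
STATUSES = (
    "undefined", "enumeration", "in progress", "success",
    "unexpected success", "skipped", "failed", "expected failure",
)


def decode_flags(raw: bytes) -> Tuple[int, FrozenSet[str], str]:
    """Split the two flag octets into (version, feature labels, status)."""
    if len(raw) != 2:
        raise ValueError("the flags field is exactly two octets")
    (flags,) = struct.unpack(">H", raw)  # network byte order
    version = flags >> 12
    if version != 0x2:
        raise ValueError("unsupported version: %#x" % version)
    if flags & 0x0008:
        raise ValueError("bit 3 must be zero in version 2")
    features = frozenset(
        label for mask, label in FEATURE_MASKS.items() if flags & mask
    )
    return version, features, STATUSES[flags & 0x0007]
```

For the flag octets ``29 01`` this gives version 2, the test id and runnable
features, and the enumeration status.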

After the flags field is a number field giving the length in bytes for the
entire packet including the signature and the checksum. This length must
be less than 4MiB, i.e. at most 4194303 bytes. The encoding can obviously
record a larger number but one of the goals is to avoid requiring large
buffers, or causing large latency in the packet forward/processing pipeline.
Larger file attachments can be communicated in multiple packets, and the
overhead in such a 4MiB packet is approximately 0.2%.

The rest of the packet is a series of optional features as specified by the set
feature bits in the flags field. When absent they are entirely absent.

Forwarding and multiplexing of packets can be done without interpreting the
remainder of the packet until the routing code and checksum (which are both at
the end of the packet). Additionally, routers can often avoid copying or moving
the bulk of the packet, as long as the routing code size increase doesn't force
the length encoding to take up a new byte (which will only happen to packets
less than or equal to 16KiB in length) - large packets are very efficient to
route.

Timestamp when present is a 32 bit unsigned integer for seconds, and a variable
length number for nanoseconds, representing UTC time since Unix Epoch in
seconds and nanoseconds.
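
A sketch of the timestamp layout, assuming the seconds are packed as a
big-endian 32 bit value and the nanoseconds use the variable length number
format described earlier. All function names here are illustrative.

```python
import struct
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_number(value: int) -> bytes:
    """Variable length number with a two-bit octet-count prefix."""
    if value < 0:
        raise ValueError("value must be non-negative")
    for octets in (1, 2, 3, 4):
        if value < 1 << (8 * octets - 2):
            prefix = (octets - 1) << (8 * octets - 2)
            return (value | prefix).to_bytes(octets, "big")
    raise ValueError("value too large to encode")


def encode_timestamp(seconds: int, nanoseconds: int) -> bytes:
    """32 bit unsigned seconds followed by variable length nanoseconds."""
    if not 0 <= nanoseconds < 10**9:
        raise ValueError("nanoseconds must be in the range 0..999999999")
    return struct.pack(">I", seconds) + encode_number(nanoseconds)


def encode_datetime(moment: datetime) -> bytes:
    """Encode a timezone-aware datetime as a UTC timestamp field."""
    delta = moment.astimezone(timezone.utc) - EPOCH
    seconds = delta.days * 86400 + delta.seconds
    return encode_timestamp(seconds, delta.microseconds * 1000)
```

Because the largest nanosecond value (999999999) is below 2**30, it always
fits in at most four octets.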

Test id when present is a UTF-8 string. The test id should uniquely identify
runnable tests such that they can be selected individually. For tests and other
actions which cannot be individually run (such as test
fixtures/layers/subtests) uniqueness is not required (though being human
meaningful is highly recommended).

Tags when present is a length prefixed vector of UTF-8 strings, one per tag.
There are no restrictions on tag content (other than the restrictions on UTF-8
strings in subunit in general). Tags have no ordering.

When a MIME type is present, it defines the MIME type for the file across all
packets of the same file (routing code + testid + name uniquely identifies a
file, reset when EOF is flagged). If a file never has a MIME type set, it
should be treated as application/octet-stream.

File content when present is a UTF-8 string for the name followed by the length
in bytes of the content, and then the content octets.

When present, the routing code is a UTF-8 string. The routing code is used to
determine which test backend a test was running on when doing data analysis,
and to route stdin to the test process if interaction is required.

Multiplexers SHOULD add a routing code if none is present, and SHOULD prefix
any existing routing code with their own routing code ('/' separated). For
example, a multiplexer might label each stream it is multiplexing
with a simple ordinal ('0', '1' etc), and given an incoming packet with route
code '3' from stream '0' would adjust the route code when forwarding the packet
to be '0/3'.
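
The routing rule for multiplexers can be sketched as a small function. The
name ``prefix_route`` is hypothetical; ``None`` stands for a packet that
arrived without a routing code.

```python
from typing import Optional


def prefix_route(stream_label: str, route_code: Optional[str]) -> str:
    """Return the route code a multiplexer would forward for a packet."""
    if route_code is None:
        # No existing route: the multiplexer's label becomes the route code.
        return stream_label
    # Existing route: prefix it with the multiplexer's label.
    return stream_label + "/" + route_code
```

With this sketch, a packet with route code '3' arriving on stream '0' is
forwarded with route code '0/3', matching the example above.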

The packet ends with a CRC-32 checksum of the preceding contents of the
packet, including the signature.
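
The checksum, combined with the synchronisation rule described earlier, allows
a simple recovery loop. The following is a minimal sketch rather than
Subunit's actual parser: it works on a complete in-memory buffer, silently
drops non-packet bytes instead of forwarding them, and assumes the checksum is
the common CRC-32 as computed by ``zlib.crc32``, stored in network byte order.
All names are illustrative.

```python
import zlib
from typing import Iterator, Optional

SIGNATURE = 0xB3
MIN_PACKET_LENGTH = 8        # signature + flags + 1 octet length + CRC-32
MAX_PACKET_LENGTH = 4194303  # upper bound on the packet length field


def read_length(data: bytes, offset: int) -> int:
    """Decode the variable length number that starts at offset."""
    octets = (data[offset] >> 6) + 1
    chunk = data[offset:offset + octets]
    if len(chunk) < octets:
        raise ValueError("truncated length field")
    return int.from_bytes(chunk, "big") & ((1 << (8 * octets - 2)) - 1)


def packet_at(data: bytes, offset: int) -> Optional[bytes]:
    """Return the packet starting at offset, or None if it fails a check."""
    if len(data) - offset < MIN_PACKET_LENGTH or data[offset] != SIGNATURE:
        return None
    flags = int.from_bytes(data[offset + 1:offset + 3], "big")
    if flags >> 12 != 0x2:
        return None
    try:
        length = read_length(data, offset + 3)
    except ValueError:
        return None
    if not MIN_PACKET_LENGTH <= length <= MAX_PACKET_LENGTH:
        return None
    if offset + length > len(data):
        return None
    packet = data[offset:offset + length]
    if zlib.crc32(packet[:-4]) != int.from_bytes(packet[-4:], "big"):
        return None
    return packet


def iter_packets(data: bytes) -> Iterator[bytes]:
    """Yield each packet that validates, resynchronising after damage."""
    offset = 0
    while offset < len(data):
        packet = packet_at(data, offset)
        if packet is not None:
            yield packet
            offset += len(packet)  # the next packet may follow directly
            continue
        newline = data.find(b"\n", offset)
        if newline == -1:
            return
        offset = newline + 1       # next synchronisation point
```

A real consumer would normally forward the skipped bytes as ordinary output
and read incrementally rather than from a single buffer.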

Example packets
~~~~~~~~~~~~~~~

Trivial test "foo" enumeration packet, with test id, runnable set,
status=enumeration. Spaces below are to visually break up signature / flags /
length / testid / crc32:

b3 2901 0c 03666f6f 08555f1b
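
The example packet can be rebuilt field by field. This sketch assumes the
checksum is the common CRC-32 as computed by Python's ``zlib.crc32`` and only
handles packets short enough for single octet length fields; the constant and
function names are illustrative.

```python
import zlib

SIGNATURE = 0xB3
VERSION_2 = 0x2000           # version 0x2 in the top four bits
FLAG_TEST_ID = 0x0800
FLAG_RUNNABLE = 0x0100
STATUS_ENUMERATION = 0x0001


def build_enumeration_packet(test_id: str) -> bytes:
    """Build a minimal enumeration packet for a short test id."""
    flags = VERSION_2 | FLAG_TEST_ID | FLAG_RUNNABLE | STATUS_ENUMERATION
    raw_id = test_id.encode("utf-8")
    # signature (1) + flags (2) + length (1) + id length (1) + id + CRC (4)
    length = 1 + 2 + 1 + 1 + len(raw_id) + 4
    if length > 63:
        raise ValueError("this sketch only handles single octet lengths")
    content = (
        bytes([SIGNATURE])
        + flags.to_bytes(2, "big")
        + bytes([length])
        + bytes([len(raw_id)])
        + raw_id
    )
    # The checksum covers everything before it, including the signature.
    return content + zlib.crc32(content).to_bytes(4, "big")
```

Calling ``build_enumeration_packet("foo").hex()`` should reproduce the bytes
shown above, with a total length of 12 (0x0c) octets.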


Version 1 (and 1.1)
===================

Version 1 (and 1.1) are mostly human readable protocols.

Sample subunit wire contents
----------------------------

The following::

  test: test foo works
  success: test foo works.
  test: tar a file.
  failure: tar a file. [
  ..
   ]..  space is eaten.
  foo.c:34 WARNING foo is not defined.
  ]
  a writeln to stdout

When run through subunit2pyunit::

  .F
  a writeln to stdout

  ========================
  FAILURE: tar a file.
  -------------------
  ..
  ]..  space is eaten.
  foo.c:34 WARNING foo is not defined.


Subunit protocol description
============================

This description is being ported to an EBNF style. Currently it is only partly
in that style, but should be fairly clear all the same. When in doubt, refer to
the source (and ideally help fix up the description!). Generally the protocol
is line orientated and consists of either directives and their parameters, or
when outside a DETAILS region unexpected lines which are not interpreted by
the parser - they should be forwarded unaltered.

test|testing|test:|testing: test LABEL
success|success:|successful|successful: test LABEL
success|success:|successful|successful: test LABEL DETAILS
failure: test LABEL
failure: test LABEL DETAILS
error: test LABEL
error: test LABEL DETAILS
skip[:] test LABEL
skip[:] test LABEL DETAILS
xfail[:] test LABEL
xfail[:] test LABEL DETAILS
uxsuccess[:] test LABEL
uxsuccess[:] test LABEL DETAILS
progress: [+|-]X
progress: push
progress: pop
tags: [-]TAG ...
time: YYYY-MM-DD HH:MM:SSZ

LABEL ::= UTF8*
NAME ::= UTF8*
DETAILS ::= BRACKETED | MULTIPART
BRACKETED ::= '[' CR UTF8-lines ']' CR
MULTIPART ::= '[ multipart' CR PART* ']' CR
PART ::= PART_TYPE CR NAME CR PART_BYTES CR
PART_TYPE ::= Content-Type: type/sub-type(;parameter=value,parameter=value)
PART_BYTES ::= (DIGITS CR LF BYTE{DIGITS})* '0' CR LF

unexpected output on stdout -> stdout.
exit w/0 or last test completing -> error

Tags given outside a test are applied to all following tests.
Tags given after a test: line and before the result line for the same test
apply only to that test, and inherit the current global tags.
A '-' before a tag is used to remove tags - e.g. to prevent a global tag
applying to a single test, or to cancel a global tag.
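
The tag rules can be sketched with a helper that applies one ``tags:`` line to
a set of tags. The name ``apply_tags`` is hypothetical; a caller would keep
one global set and copy it when a test starts.

```python
from typing import FrozenSet


def apply_tags(current: FrozenSet[str], line: str) -> FrozenSet[str]:
    """Apply one 'tags: [-]TAG ...' directive and return the new tag set."""
    prefix = "tags:"
    if not line.startswith(prefix):
        raise ValueError("not a tags directive: %r" % line)
    tags = set(current)
    for token in line[len(prefix):].split():
        if token.startswith("-"):
            tags.discard(token[1:])  # a leading '-' removes the tag
        else:
            tags.add(token)
    return frozenset(tags)
```

For instance, applying ``tags: slow network`` outside a test and then
``tags: -network quick`` inside one leaves that test tagged 'slow' and
'quick', while the global set is unchanged because a new set is returned.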

The progress directive is used to provide progress information about a stream
so that stream consumers can provide completion estimates, progress bars and so
on. Stream generators that know how many tests will be present in the stream
should output "progress: COUNT". Stream filters that add tests should output
"progress: +COUNT", and those that remove tests should output
"progress: -COUNT". An absolute count should reset the progress indicators in
use - it indicates that two separate streams from different generators have
been trivially concatenated together, and there is no knowledge of how many
more complete streams are incoming. Smart concatenation could scan each stream
for their count and sum them, or alternatively translate absolute counts into
relative counts inline. It is recommended that outputters avoid absolute counts
unless necessary. The push and pop directives are used to provide local regions
for progress reporting. This fits with hierarchically operating test
environments - such as those that organise tests into suites - the top-most
runner can report on the number of suites, and each suite surrounds its output
with a (push, pop) pair. Interpreters should interpret a pop as also advancing
the progress of the restored level by one step. Encountering progress
directives between the start and end of a test pair indicates that a previous
test was interrupted and did not cleanly terminate: it should be implicitly
closed with an error (the same as when a stream ends with no closing test
directive for the most recently started test).
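
One possible reading of these rules is sketched below. The class is
hypothetical and is not Subunit's own progress model; in particular, treating
an absolute count as resetting the position to zero is an assumption.

```python
class ProgressTracker:
    """Track (position, total) for nested progress regions."""

    def __init__(self) -> None:
        self._stack = []
        self.position = 0
        self.total = 0

    def directive(self, argument: str) -> None:
        """Handle the argument of a 'progress:' line."""
        if argument == "push":
            # Save the current level and start a fresh local region.
            self._stack.append((self.position, self.total))
            self.position, self.total = 0, 0
        elif argument == "pop":
            # Restore the outer level and advance it by one step.
            self.position, self.total = self._stack.pop()
            self.position += 1
        elif argument[:1] in ("+", "-"):
            self.total += int(argument)  # relative adjustment
        else:
            self.position = 0            # absolute count resets indicators
            self.total = int(argument)

    def test_finished(self) -> None:
        """Advance the current level by one completed test."""
        self.position += 1
```

A runner reporting two suites would emit ``progress: 2`` and wrap each suite
in a push/pop pair, so that finishing a suite advances the outer level by one.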

The time directive acts as a clock event - it sets the time for all future
events. The value should be a valid ISO8601 time.
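
A minimal sketch of parsing a time directive with the Python standard library
follows. It only accepts the ``YYYY-MM-DD HH:MM:SSZ`` form shown in the
grammar above, not every valid ISO8601 time, and the function name is
illustrative.

```python
from datetime import datetime, timezone


def parse_time_directive(line: str) -> datetime:
    """Parse 'time: YYYY-MM-DD HH:MM:SSZ' into an aware UTC datetime."""
    prefix = "time: "
    if not line.startswith(prefix):
        raise ValueError("not a time directive: %r" % line)
    stamp = datetime.strptime(line[len(prefix):].strip(), "%Y-%m-%d %H:%M:%SZ")
    # The trailing 'Z' denotes UTC, so attach that timezone explicitly.
    return stamp.replace(tzinfo=timezone.utc)
```

A consumer would keep the most recently parsed value and apply it to every
following event until the next time directive arrives.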

The skip, xfail and uxsuccess outcomes are not supported by all testing
environments. In Python the testtools (https://launchpad.net/testtools)
library is used to translate these automatically if an older Python version
that does not support them is in use. See the testtools documentation for the
translation policy.

skip is used to indicate a test was discovered but not executed. xfail is used
to indicate a test that errored in some expected fashion (also known as "TODO"
tests in some frameworks). uxsuccess is used to indicate an unexpected success
where a test thought to be failing actually passes. It is complementary to
xfail.

Hacking on subunit
------------------

Releases
========

* Update versions in configure.ac and python/subunit/__init__.py.
* Make PyPI and regular tarball releases. Upload the regular one to LP, the
  PyPI one to PyPI.
* Push a tagged commit.
 467 * Push a tagged commit.
 468