.. SPDX-License-Identifier: GPL-2.0

.. _vb_framework:

Videobuf Framework
==================

Author: Jonathan Corbet <corbet@lwn.net>

Current as of 2.6.33

.. note::

   The videobuf framework has been deprecated in favor of videobuf2.  It
   should not be used in new drivers.

Introduction
------------

The videobuf layer functions as a sort of glue layer between a V4L2 driver
and user space.  It handles the allocation and management of buffers for
the storage of video frames.  There is a set of functions which can be used
to implement many of the standard POSIX I/O system calls, including read(),
poll(), and, happily, mmap().  Another set of functions can be used to
implement the bulk of the V4L2 ioctl() calls related to streaming I/O,
including buffer allocation, queueing and dequeueing, and streaming
control.  Using videobuf imposes a few design decisions on the driver
author, but the payback comes in the form of reduced code in the driver and
a consistent implementation of the V4L2 user-space API.

Buffer types
------------

Not all video devices use the same kind of buffers.  In fact, there are (at
least) three common variations:

 - Buffers which are scattered in both the physical and (kernel) virtual
   address spaces.  (Almost) all user-space buffers are like this, but it
   makes great sense to allocate kernel-space buffers this way as well when
   it is possible.  Unfortunately, it is not always possible; working with
   this kind of buffer normally requires hardware which can do
   scatter/gather DMA operations.

 - Buffers which are physically scattered, but which are virtually
   contiguous; buffers allocated with vmalloc(), in other words.  These
   buffers are just as hard to use for DMA operations, but they can be
   useful in situations where DMA is not available but virtually-contiguous
   buffers are convenient.

 - Buffers which are physically contiguous.  Allocation of this kind of
   buffer can be unreliable on fragmented systems, but simpler DMA
   controllers cannot deal with anything else.

Videobuf can work with all three types of buffers, but the driver author
must pick one at the outset and design the driver around that decision.

[It's worth noting that there's a fourth kind of buffer: "overlay" buffers
which are located within the system's video memory.  The overlay
functionality is considered to be deprecated for most use, but it still
shows up occasionally in system-on-chip drivers where the performance
benefits merit the use of this technique.  Overlay buffers can be handled
as a form of scattered buffer, but there are very few implementations in
the kernel and a description of this technique is currently beyond the
scope of this document.]

Data structures, callbacks, and initialization
----------------------------------------------

Depending on which type of buffer is being used, the driver should
include one of the following files:

.. code-block:: none

    <media/videobuf-dma-sg.h>           /* Physically scattered */
    <media/videobuf-vmalloc.h>          /* vmalloc() buffers    */
    <media/videobuf-dma-contig.h>       /* Physically contiguous */

The driver's data structure describing a V4L2 device should include a
struct videobuf_queue instance for the management of the buffer queue,
along with a list_head for the queue of available buffers.  There will also
need to be an interrupt-safe spinlock which is used to protect (at least)
the queue.

The next step is to write four simple callbacks to help videobuf deal with
the management of buffers:

.. code-block:: none

    struct videobuf_queue_ops {
        int (*buf_setup)(struct videobuf_queue *q,
                         unsigned int *count, unsigned int *size);
        int (*buf_prepare)(struct videobuf_queue *q,
                           struct videobuf_buffer *vb,
                           enum v4l2_field field);
        void (*buf_queue)(struct videobuf_queue *q,
                          struct videobuf_buffer *vb);
        void (*buf_release)(struct videobuf_queue *q,
                            struct videobuf_buffer *vb);
    };

buf_setup() is called early in the I/O process, when streaming is being
initiated; its purpose is to tell videobuf about the I/O stream.  The count
parameter will be a suggested number of buffers to use; the driver should
check it for rationality and adjust it if need be.  As a practical rule, a
minimum of two buffers is needed for proper streaming, and there is
usually a maximum (which cannot exceed 32) which makes sense for each
device.  The size parameter should be set to the expected (maximum) size
for each frame of data.
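
As a rough illustration of those checks, the sketch below models a
buf_setup() callback in plain C so that it can be compiled outside the
kernel.  The struct definitions are simplified stand-ins for the real
ones, and the mydrv_ names, the 16-buffer limit, and the
two-bytes-per-pixel frame size are assumptions made for this example
rather than anything videobuf requires.

.. code-block:: c

    /* Simplified stand-in so that the sketch builds in user space. */
    struct videobuf_queue {
        void *priv_data;            /* driver-private pointer */
    };

    /* Hypothetical per-device state. */
    struct mydrv_dev {
        unsigned int width;
        unsigned int height;
    };

    #define MYDRV_MIN_BUFS   2      /* needed for proper streaming */
    #define MYDRV_MAX_BUFS  16      /* assumed device limit, at most 32 */

    static int mydrv_buf_setup(struct videobuf_queue *q,
                               unsigned int *count, unsigned int *size)
    {
        struct mydrv_dev *dev = q->priv_data;

        /* Largest frame expected, assuming two bytes per pixel. */
        *size = dev->width * dev->height * 2;

        /* Bring the suggested buffer count into a sensible range. */
        if (*count < MYDRV_MIN_BUFS)
            *count = MYDRV_MIN_BUFS;
        if (*count > MYDRV_MAX_BUFS)
            *count = MYDRV_MAX_BUFS;
        return 0;
    }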

Each buffer (in the form of a struct videobuf_buffer pointer) will be
passed to buf_prepare(), which should set the buffer's size, width, height,
and field fields properly.  If the buffer's state field is
VIDEOBUF_NEEDS_INIT, the driver should pass it to:

.. code-block:: none

    int videobuf_iolock(struct videobuf_queue *q, struct videobuf_buffer *vb,
                        struct v4l2_framebuffer *fbuf);

Among other things, this call will usually allocate memory for the buffer.
Finally, the buf_prepare() function should set the buffer's state to
VIDEOBUF_PREPARED.
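
The sequence just described might look roughly like the following
user-space model.  The types are cut-down stand-ins for the kernel
definitions, videobuf_iolock() is stubbed out, and the mydrv_ names and
the two-bytes-per-pixel size are hypothetical.

.. code-block:: c

    #include <stddef.h>

    /* Simplified stand-ins for the kernel definitions. */
    enum v4l2_field { V4L2_FIELD_ANY, V4L2_FIELD_NONE };
    enum videobuf_state { VIDEOBUF_NEEDS_INIT, VIDEOBUF_PREPARED };

    struct v4l2_framebuffer;
    struct videobuf_queue { void *priv_data; };
    struct videobuf_buffer {
        unsigned int width;
        unsigned int height;
        unsigned long size;
        enum v4l2_field field;
        enum videobuf_state state;
    };

    /* Stub: the real call usually allocates memory for the buffer. */
    static int videobuf_iolock(struct videobuf_queue *q,
                               struct videobuf_buffer *vb,
                               struct v4l2_framebuffer *fbuf)
    {
        (void)q; (void)vb; (void)fbuf;
        return 0;
    }

    /* Hypothetical per-device state. */
    struct mydrv_dev { unsigned int width; unsigned int height; };

    static int mydrv_buf_prepare(struct videobuf_queue *q,
                                 struct videobuf_buffer *vb,
                                 enum v4l2_field field)
    {
        struct mydrv_dev *dev = q->priv_data;
        int ret;

        /* Describe the frame this buffer is going to hold. */
        vb->width = dev->width;
        vb->height = dev->height;
        vb->size = (unsigned long)dev->width * dev->height * 2;
        vb->field = field;

        /* Only buffers which have never been set up need videobuf_iolock(). */
        if (vb->state == VIDEOBUF_NEEDS_INIT) {
            ret = videobuf_iolock(q, vb, NULL);
            if (ret < 0)
                return ret;
        }

        vb->state = VIDEOBUF_PREPARED;
        return 0;
    }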

When a buffer is queued for I/O, it is passed to buf_queue(), which should
put it onto the driver's list of available buffers and set its state to
VIDEOBUF_QUEUED.  Note that this function is called with the queue spinlock
held; if it tries to acquire it as well, things will come to a screeching
halt.  Yes, this is the voice of experience.  Note also that videobuf may
wait on the first buffer in the queue; placing other buffers in front of it
could again gum up the works.  So use list_add_tail() to enqueue buffers.
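
A minimal model of such a buf_queue() callback is shown below.  It is
plain C with a tiny re-implementation of the kernel's list helpers so that
it can be built in user space; the struct layouts are stand-ins, and
mydrv_dev with its active list is a hypothetical driver structure.

.. code-block:: c

    /* Minimal stand-ins for <linux/list.h> and the videobuf types. */
    struct list_head { struct list_head *next, *prev; };

    static void INIT_LIST_HEAD(struct list_head *head)
    {
        head->next = head;
        head->prev = head;
    }

    static void list_add_tail(struct list_head *entry, struct list_head *head)
    {
        entry->prev = head->prev;
        entry->next = head;
        head->prev->next = entry;
        head->prev = entry;
    }

    enum videobuf_state {
        VIDEOBUF_NEEDS_INIT, VIDEOBUF_PREPARED, VIDEOBUF_QUEUED
    };

    struct videobuf_queue { void *priv_data; };
    struct videobuf_buffer {
        enum videobuf_state state;
        struct list_head queue;
    };

    /* Hypothetical per-device state: buffers waiting for the hardware. */
    struct mydrv_dev { struct list_head active; };

    static void mydrv_buf_queue(struct videobuf_queue *q,
                                struct videobuf_buffer *vb)
    {
        struct mydrv_dev *dev = q->priv_data;

        /* The queue spinlock is already held here; do not take it again. */
        list_add_tail(&vb->queue, &dev->active);    /* keep FIFO order */
        vb->state = VIDEOBUF_QUEUED;
    }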

Finally, buf_release() is called when a buffer is no longer intended to be
used.  The driver should ensure that there is no I/O active on the buffer,
then pass it to the appropriate free routine(s):

.. code-block:: none

    /* Scatter/gather drivers */
    int videobuf_dma_unmap(struct videobuf_queue *q,
                           struct videobuf_dmabuf *dma);
    int videobuf_dma_free(struct videobuf_dmabuf *dma);

    /* vmalloc drivers */
    void videobuf_vmalloc_free(struct videobuf_buffer *buf);

    /* Contiguous drivers */
    void videobuf_dma_contig_free(struct videobuf_queue *q,
                                  struct videobuf_buffer *buf);

One way to ensure that a buffer is no longer under I/O is to pass it to:

.. code-block:: none

    int videobuf_waiton(struct videobuf_buffer *vb, int non_blocking, int intr);

Here, vb is the buffer, non_blocking indicates whether non-blocking I/O
should be used (it should be zero in the buf_release() case), and intr
controls whether an interruptible wait is used.
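
For a vmalloc() driver, a buf_release() callback built from these pieces
could be sketched as follows.  This is a user-space model: both videobuf
helpers are replaced by counting stubs, the mydrv_ names are hypothetical,
and the choices of a non-interruptible wait and of resetting the state to
VIDEOBUF_NEEDS_INIT afterward are assumptions of this example.

.. code-block:: c

    /* Simplified stand-ins so that the sketch builds outside the kernel. */
    enum videobuf_state { VIDEOBUF_NEEDS_INIT, VIDEOBUF_PREPARED };
    struct videobuf_queue { void *priv_data; };
    struct videobuf_buffer { enum videobuf_state state; };

    /* Hypothetical counters which record the calls made to the stubs. */
    static int mydrv_waits;
    static int mydrv_frees;

    /* Stub: the real function sleeps until I/O on the buffer is over. */
    static int videobuf_waiton(struct videobuf_buffer *vb,
                               int non_blocking, int intr)
    {
        (void)vb; (void)non_blocking; (void)intr;
        mydrv_waits++;
        return 0;
    }

    /* Stub: the real function frees the memory behind the buffer. */
    static void videobuf_vmalloc_free(struct videobuf_buffer *buf)
    {
        (void)buf;
        mydrv_frees++;
    }

    static void mydrv_buf_release(struct videobuf_queue *q,
                                  struct videobuf_buffer *vb)
    {
        (void)q;

        /* Blocking (non_blocking == 0), non-interruptible wait. */
        videobuf_waiton(vb, 0, 0);

        /* No I/O is active any more, so the memory can go. */
        videobuf_vmalloc_free(vb);
        vb->state = VIDEOBUF_NEEDS_INIT;
    }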

File operations
---------------

At this point, much of the work is done; much of the rest is slipping
videobuf calls into the implementation of the other driver callbacks.  The
first step is in the open() function, which must initialize the
videobuf queue.  The function to use depends on the type of buffer used:

.. code-block:: none

    void videobuf_queue_sg_init(struct videobuf_queue *q,
                                struct videobuf_queue_ops *ops,
                                struct device *dev,
                                spinlock_t *irqlock,
                                enum v4l2_buf_type type,
                                enum v4l2_field field,
                                unsigned int msize,
                                void *priv);

    void videobuf_queue_vmalloc_init(struct videobuf_queue *q,
                                struct videobuf_queue_ops *ops,
                                struct device *dev,
                                spinlock_t *irqlock,
                                enum v4l2_buf_type type,
                                enum v4l2_field field,
                                unsigned int msize,
                                void *priv);

    void videobuf_queue_dma_contig_init(struct videobuf_queue *q,
                                       struct videobuf_queue_ops *ops,
                                       struct device *dev,
                                       spinlock_t *irqlock,
                                       enum v4l2_buf_type type,
                                       enum v4l2_field field,
                                       unsigned int msize,
                                       void *priv);

In each case, the parameters are the same: q is the queue structure for the
device, ops is the set of callbacks as described above, dev is the device
structure for this video device, irqlock is an interrupt-safe spinlock to
protect access to the data structures, type is the buffer type used by the
device (cameras will use V4L2_BUF_TYPE_VIDEO_CAPTURE, for example), field
describes which field is being captured (often V4L2_FIELD_NONE for
progressive devices), msize is the size of any containing structure used
around struct videobuf_buffer, and priv is a private data pointer which
shows up in the priv_data field of struct videobuf_queue.  Note that these
are void functions which, evidently, are immune to failure.

V4L2 capture drivers can be written to support either of two APIs: the
read() system call and the rather more complicated streaming mechanism.  As
a general rule, it is necessary to support both to ensure that all
applications have a chance of working with the device.  Videobuf makes it
easy to do that with the same code.  To implement read(), the driver need
only make a call to one of:

.. code-block:: none

    ssize_t videobuf_read_one(struct videobuf_queue *q,
                              char __user *data, size_t count,
                              loff_t *ppos, int nonblocking);

    ssize_t videobuf_read_stream(struct videobuf_queue *q,
                                 char __user *data, size_t count,
                                 loff_t *ppos, int vbihack, int nonblocking);

Either one of these functions will read frame data into data, returning the
amount actually read; the difference is that videobuf_read_one() will only
read a single frame, while videobuf_read_stream() will read multiple frames
if they are needed to satisfy the count requested by the application.  A
typical driver read() implementation will start the capture engine, call
one of the above functions, then stop the engine before returning (though a
smarter implementation might leave the engine running for a little while in
anticipation of another read() call happening in the near future).

The poll() function can usually be implemented with a direct call to:

.. code-block:: none

    unsigned int videobuf_poll_stream(struct file *file,
                                      struct videobuf_queue *q,
                                      poll_table *wait);

Note that the actual wait queue eventually used will be the one associated
with the first available buffer.

When streaming I/O is done to kernel-space buffers, the driver must support
the mmap() system call to enable user space to access the data.  In many
V4L2 drivers, the often-complex mmap() implementation simplifies to a
single call to:

.. code-block:: none

    int videobuf_mmap_mapper(struct videobuf_queue *q,
                             struct vm_area_struct *vma);

Everything else is handled by the videobuf code.

The release() function requires two separate videobuf calls:

.. code-block:: none

    void videobuf_stop(struct videobuf_queue *q);
    int videobuf_mmap_free(struct videobuf_queue *q);

The call to videobuf_stop() terminates any I/O in progress - though it is
still up to the driver to stop the capture engine.  The call to
videobuf_mmap_free() will ensure that all buffers have been unmapped; if
so, they will all be passed to the buf_release() callback.  If buffers
remain mapped, videobuf_mmap_free() returns an error code instead.  The
purpose is clearly to cause the closing of the file descriptor to fail if
buffers are still mapped, but every driver in the 2.6.32 kernel cheerfully
ignores its return value.

ioctl() operations
------------------

The V4L2 API includes a very long list of driver callbacks to respond to
the many ioctl() commands made available to user space.  A number of these
- those associated with streaming I/O - turn almost directly into videobuf
calls.  The relevant helper functions are:

.. code-block:: none

    int videobuf_reqbufs(struct videobuf_queue *q,
                         struct v4l2_requestbuffers *req);
    int videobuf_querybuf(struct videobuf_queue *q, struct v4l2_buffer *b);
    int videobuf_qbuf(struct videobuf_queue *q, struct v4l2_buffer *b);
    int videobuf_dqbuf(struct videobuf_queue *q, struct v4l2_buffer *b,
                       int nonblocking);
    int videobuf_streamon(struct videobuf_queue *q);
    int videobuf_streamoff(struct videobuf_queue *q);

So, for example, a VIDIOC_REQBUFS call turns into a call to the driver's
vidioc_reqbufs() callback which, in turn, usually only needs to locate the
proper struct videobuf_queue pointer and pass it to videobuf_reqbufs().
These support functions can replace a great deal of buffer management
boilerplate in a lot of V4L2 drivers.

The vidioc_streamon() and vidioc_streamoff() functions will be a bit more
complex, of course, since they will also need to deal with starting and
stopping the capture engine.
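
One plausible shape for that pair of callbacks is modelled below in
user-space C.  The videobuf helpers are stubs, the struct layouts are
stand-ins, and the mydrv_ functions (including the engine start/stop
helpers) are hypothetical; the ordering of the engine and videobuf calls
is an assumption of this sketch and may need to differ for real hardware.

.. code-block:: c

    #include <errno.h>

    /* Simplified stand-ins for the kernel definitions. */
    enum v4l2_buf_type {
        V4L2_BUF_TYPE_VIDEO_CAPTURE = 1,
        V4L2_BUF_TYPE_VIDEO_OUTPUT = 2
    };
    struct file { void *private_data; };
    struct videobuf_queue { int streaming; };

    /* Stubs: the real helpers also manage the buffers themselves. */
    static int videobuf_streamon(struct videobuf_queue *q)
    {
        q->streaming = 1;
        return 0;
    }

    static int videobuf_streamoff(struct videobuf_queue *q)
    {
        q->streaming = 0;
        return 0;
    }

    /* Hypothetical per-device state and capture-engine control. */
    struct mydrv_dev {
        struct videobuf_queue vb_queue;
        int engine_running;
    };

    static void mydrv_start_engine(struct mydrv_dev *dev)
    {
        dev->engine_running = 1;
    }

    static void mydrv_stop_engine(struct mydrv_dev *dev)
    {
        dev->engine_running = 0;
    }

    static int mydrv_vidioc_streamon(struct file *file, void *priv,
                                     enum v4l2_buf_type type)
    {
        struct mydrv_dev *dev = file->private_data;
        int ret;

        (void)priv;
        if (type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
            return -EINVAL;

        /* Let videobuf set up streaming, then start the hardware. */
        ret = videobuf_streamon(&dev->vb_queue);
        if (ret)
            return ret;
        mydrv_start_engine(dev);
        return 0;
    }

    static int mydrv_vidioc_streamoff(struct file *file, void *priv,
                                      enum v4l2_buf_type type)
    {
        struct mydrv_dev *dev = file->private_data;

        (void)priv;
        if (type != V4L2_BUF_TYPE_VIDEO_CAPTURE)
            return -EINVAL;

        /* Quiesce the hardware before videobuf tears streaming down. */
        mydrv_stop_engine(dev);
        return videobuf_streamoff(&dev->vb_queue);
    }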

Buffer allocation
-----------------

Thus far, we have talked about buffers, but have not looked at how they are
allocated.  The scatter/gather case is the most complex on this front.  For
allocation, the driver can leave buffer allocation entirely up to the
videobuf layer; in this case, buffers will be allocated as anonymous
user-space pages and will be very scattered indeed.  If the application is
using user-space buffers, no allocation is needed; the videobuf layer will
take care of calling get_user_pages() and filling in the scatterlist array.

If the driver needs to do its own memory allocation, it should be done in
the vidioc_reqbufs() function, *after* calling videobuf_reqbufs().  The
first step is a call to:

.. code-block:: none

    struct videobuf_dmabuf *videobuf_to_dma(struct videobuf_buffer *buf);

The returned videobuf_dmabuf structure (defined in
<media/videobuf-dma-sg.h>) includes a couple of relevant fields:

.. code-block:: none

    struct scatterlist  *sglist;
    int                 sglen;

The driver must allocate an appropriately-sized scatterlist array and
populate it with pointers to the pieces of the allocated buffer; sglen
should be set to the length of the array.

Drivers using the vmalloc() method need not (and cannot) concern themselves
with buffer allocation at all; videobuf will handle those details.  The
same is normally true of contiguous-DMA drivers as well; videobuf will
allocate the buffers (with dma_alloc_coherent()) when it sees fit.  That
means that these drivers may be trying to do high-order allocations at any
time, an operation which is not always guaranteed to work.  Some drivers
play tricks by allocating DMA space at system boot time; videobuf does not
currently play well with those drivers.

As of 2.6.31, contiguous-DMA drivers can work with a user-supplied buffer,
as long as that buffer is physically contiguous.  Normal user-space
allocations will not meet that criterion, but buffers obtained from other
kernel drivers, or those contained within huge pages, will work with these
drivers.

Filling the buffers
-------------------

The final part of a videobuf implementation has no direct callback - it's
the portion of the code which actually puts frame data into the buffers,
usually in response to interrupts from the device.  For all types of
drivers, this process works approximately as follows:

 - Obtain the next available buffer and make sure that somebody is actually
   waiting for it.

 - Get a pointer to the memory and put video data there.

 - Mark the buffer as done and wake up the process waiting for it.

The first step above is done by looking at the driver-managed list_head
structure - the one which is filled in the buf_queue() callback.  Because
starting the engine and enqueueing buffers are done in separate steps, it's
possible for the engine to be running without any buffers available - in
the vmalloc() case especially.  So the driver should be prepared for the
list to be empty.  It is equally possible that nobody is yet interested in
the buffer; the driver should not remove it from the list or fill it until
a process is waiting on it.  That test can be done by examining the
buffer's done field (a wait_queue_head_t structure) with
waitqueue_active().

A buffer's state should be set to VIDEOBUF_ACTIVE before being mapped for
DMA; that ensures that the videobuf layer will not try to do anything with
it while the device is transferring data.

For scatter/gather drivers, the needed memory pointers will be found in the
scatterlist structure described above.  Drivers using the vmalloc() method
can get a memory pointer with:

.. code-block:: none

    void *videobuf_to_vmalloc(struct videobuf_buffer *buf);

For contiguous DMA drivers, the function to use is:

.. code-block:: none

    dma_addr_t videobuf_to_dma_contig(struct videobuf_buffer *buf);

The contiguous DMA API goes out of its way to hide the kernel-space address
of the DMA buffer from drivers.

The final step is to set the size field of the relevant videobuf_buffer
structure to the actual size of the captured image, set state to
VIDEOBUF_DONE, then call wake_up() on the done queue.  At this point, the
buffer is owned by the videobuf layer and the driver should not touch it
again.
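
Put together, the frame-completion path of a vmalloc() driver might be
modelled as below.  This is a user-space sketch: the list and wait-queue
helpers are toy re-implementations, the struct layouts (including the mem
member) are stand-ins, and mydrv_dev and mydrv_frame_done() are
hypothetical names.  A real driver would run this logic with the queue's
irqlock held.

.. code-block:: c

    #include <stddef.h>
    #include <string.h>

    /* Toy stand-ins for the kernel's list and wait-queue helpers. */
    struct list_head { struct list_head *next, *prev; };

    static void INIT_LIST_HEAD(struct list_head *h) { h->next = h->prev = h; }
    static int list_empty(const struct list_head *h) { return h->next == h; }

    static void list_add_tail(struct list_head *n, struct list_head *h)
    {
        n->prev = h->prev;
        n->next = h;
        h->prev->next = n;
        h->prev = n;
    }

    static void list_del(struct list_head *e)
    {
        e->prev->next = e->next;
        e->next->prev = e->prev;
    }

    #define list_entry(ptr, type, member) \
        ((type *)((char *)(ptr) - offsetof(type, member)))

    typedef struct { int waiters; int wakeups; } wait_queue_head_t;

    static int waitqueue_active(wait_queue_head_t *wq) { return wq->waiters > 0; }
    static void wake_up(wait_queue_head_t *wq) { wq->wakeups++; }

    enum videobuf_state {
        VIDEOBUF_NEEDS_INIT, VIDEOBUF_PREPARED, VIDEOBUF_QUEUED,
        VIDEOBUF_ACTIVE, VIDEOBUF_DONE
    };

    struct videobuf_buffer {
        unsigned long size;
        enum videobuf_state state;
        struct list_head queue;
        wait_queue_head_t done;
        void *mem;                  /* stand-in for the vmalloc() area */
    };

    static void *videobuf_to_vmalloc(struct videobuf_buffer *buf)
    {
        return buf->mem;
    }

    /* Hypothetical per-device state: the list filled by buf_queue(). */
    struct mydrv_dev { struct list_head active; };

    /* Returns 1 if the frame was delivered, 0 if it had to be dropped. */
    static int mydrv_frame_done(struct mydrv_dev *dev,
                                const void *frame, unsigned long len)
    {
        struct videobuf_buffer *vb;

        /* Step 1: is there a buffer, and is anybody waiting for it? */
        if (list_empty(&dev->active))
            return 0;
        vb = list_entry(dev->active.next, struct videobuf_buffer, queue);
        if (!waitqueue_active(&vb->done))
            return 0;

        /* Step 2: claim the buffer and put the video data into it. */
        list_del(&vb->queue);
        vb->state = VIDEOBUF_ACTIVE;
        memcpy(videobuf_to_vmalloc(vb), frame, len);

        /* Step 3: hand the buffer back to videobuf and wake the waiter. */
        vb->size = len;
        vb->state = VIDEOBUF_DONE;
        wake_up(&vb->done);
        return 1;
    }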

Developers who are interested in more information can go into the relevant
header files; there are a few low-level functions declared there which have
not been talked about here.  Also worthwhile is the vivi driver
(drivers/media/platform/vivi.c), which is maintained as an example of how V4L2
drivers should be written.  Vivi only uses the vmalloc() API, but it's good
enough to get started with.  Note also that all of these calls are exported
GPL-only, so they will not be available to non-GPL kernel modules.