docs-xml/Samba3-Developers-Guide/architecture.xml

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE chapter PUBLIC "-//Samba-Team//DTD DocBook V4.2-Based Variant V1.0//EN" "http://www.samba.org/samba/DTD/samba-doc">
<chapter id="architecture">
<chapterinfo>
        <author>
                <firstname>Dan</firstname><surname>Shearer</surname>
        </author>
        <pubdate> November 1997</pubdate>
</chapterinfo>

<title>Samba Architecture</title>

<sect1>
<title>Introduction</title>

<para>
This document gives a general overview of how Samba works
internally. The Samba Team has tried to come up with a model that is
the best possible compromise between elegance, portability, security
and the constraints imposed by the very messy SMB and CIFS
protocols.
</para>

<para>
It also tries to answer some frequently asked questions, such as:
</para>

<orderedlist>
<listitem><para>
        Is Samba secure when running on Unix? The xyz platform?
        What about the root privileges issue?
</para></listitem>

<listitem><para>What are the pros and cons of multithreading in various parts of Samba?</para></listitem>

<listitem><para>Why not have a separate process for name resolution, WINS, and browsing?</para></listitem>

</orderedlist>

</sect1>

<sect1>
<title>Multithreading and Samba</title>

<para>
People sometimes tout threads as a uniformly good thing. They are very
nice in their place but are quite inappropriate for smbd. nmbd is
another matter, and multithreading it would be very nice.
</para>

<para>
The short version is that smbd is not multithreaded, and alternative
servers that do take the threaded approach under Unix (such as Syntax,
at the time of writing) suffer tremendous performance penalties and are
less robust. nmbd is not threaded either, but this is because threading
it is not possible while keeping the code consistent and portable across
35 or more platforms. (This drawback also applies to threading smbd.)
</para>

<para>
The longer version is that there are very good reasons for not making
smbd multithreaded. Multithreading would actually make Samba much
slower, less scalable, less portable and much less robust. The fact
that we use a separate process for each connection is one of Samba's
biggest advantages.
</para>
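
<para>
The robustness argument can be illustrated with a minimal sketch in C.
It is not taken from the Samba source: serve_client() and
run_connection() are hypothetical names invented for this example, and
the crash is simulated with raise(). Assuming a POSIX system, a child
that dies from a segmentation fault is simply reported to the parent
through waitpid(), and the parent carries on.
</para>

<programlisting><![CDATA[
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for serving one client. A non-zero "crash"
 * simulates a fatal bug in the per-connection code. */
static void serve_client(int crash)
{
    if (crash) {
        signal(SIGSEGV, SIG_DFL);  /* make sure the default action applies */
        raise(SIGSEGV);            /* terminates this process only */
    }
    _exit(0);                      /* normal end of the session */
}

/* Fork one child per "connection" and report how it ended:
 * the exit status (0 for a clean exit), the number of the fatal
 * signal if the child was killed, or -1 on error. */
int run_connection(int crash)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        serve_client(crash);           /* child: never returns */

    if (waitpid(pid, &status, 0) < 0)  /* parent: reap the child */
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
]]></programlisting>

<para>
In this sketch run_connection(1) returns the value of SIGSEGV while the
calling process keeps running and can accept the next connection. In a
threaded server the same fault would have ended every client session at
once.
</para>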

</sect1>

<sect1>
<title>Threading smbd</title>

<para>
A few problems that would arise from a threaded smbd are:
</para>

<orderedlist>
<listitem><para>
        It is not just a matter of creating threads instead of processes:
        every variable would have to be checked to see whether it must
        become thread-specific (currently such variables are global).
</para></listitem>

<listitem><para>
        If one thread dies (e.g. from a segmentation fault) then all
        threads die. We can immediately throw robustness out the window.
</para></listitem>

<listitem><para>
        Many of the system calls we make are blocking. Non-blocking
        equivalents of many calls are either not available or are awkward (and
        slow) to use. So while we block in one thread, all clients are
        waiting. Imagine that one share is a slow NFS filesystem and the others
        are fast: we would end up slowing all clients to the speed of NFS.
</para></listitem>

<listitem><para>
        You can't run as a different uid in different threads. This means
        we would have to switch uid/gid on <emphasis>every</emphasis> SMB
        packet. It would be horrendously slow.
</para></listitem>

<listitem><para>
        The per-process file descriptor limit would mean that we could only
        support a limited number of clients.
</para></listitem>

<listitem><para>
        We couldn't use the system locking calls, as the locking context of
        fcntl() is a process, not a thread.
</para></listitem>

</orderedlist>
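
<para>
The last point in the list above can be checked with a small experiment. The following is a
minimal sketch, assuming a POSIX system with a writable /tmp. The helper
names try_write_lock() and locks_are_per_process() are hypothetical and
do not appear in Samba. It takes a write lock through one descriptor,
then tries the same lock through a second descriptor in the same
process and again from a forked child.
</para>

<programlisting><![CDATA[
#define _POSIX_C_SOURCE 200809L   /* for mkstemp() under strict -std= modes */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to take a non-blocking write lock on the whole file.
 * Returns 0 if the lock was granted, -1 if it was refused. */
static int try_write_lock(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                 /* 0 means "to end of file" */
    return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}

/* Returns 1 if a lock held through one descriptor does not stop the
 * same process from locking through a second descriptor, while a
 * different process is refused. Returns 0 otherwise, -1 on setup errors. */
int locks_are_per_process(void)
{
    char path[] = "/tmp/fcntl-demo-XXXXXX";
    int fd1, fd2, same_process_ok, other_process_refused = 0, status;
    pid_t pid;

    fd1 = mkstemp(path);
    if (fd1 < 0)
        return -1;
    fd2 = open(path, O_RDWR);
    unlink(path);                 /* file vanishes once both fds close */
    if (fd2 < 0 || try_write_lock(fd1) != 0) {
        if (fd2 >= 0)
            close(fd2);
        close(fd1);
        return -1;
    }

    /* Same process, different descriptor: no conflict is detected. */
    same_process_ok = (try_write_lock(fd2) == 0);

    pid = fork();
    if (pid == 0)                 /* child: a separate locking context */
        _exit(try_write_lock(fd2) == 0 ? 0 : 1);
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        other_process_refused = (WEXITSTATUS(status) == 1);

    close(fd2);
    close(fd1);
    return same_process_ok && other_process_refused;
}
]]></programlisting>

<para>
On a system with POSIX record locks, locks_are_per_process() is expected
to return 1. Two threads serving two clients would sit in the position
of the two descriptors in the parent: the kernel would see no conflict
between them, so the server would have to implement its own locking.
</para>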

</sect1>

<sect1>
<title>Threading nmbd</title>

<para>
Threading nmbd would be ideal, but it is sunk by portability requirements.
</para>

<para>
Andrew tried to write a test threads library for nmbd that used only
ANSI C constructs (using setjmp and longjmp). Unfortunately some OSes
defeat this by restricting longjmp to calling addresses that are
shallower than the current address on the stack (apparently AIX does
this). This makes a truly portable threads library impossible. So to
support all our current platforms we would have to code nmbd both with
and without threads, and as the real aim of threads is to make the
code clearer we would not have gained anything. (It is a myth that
threads make things faster. Threading is like recursion: it can make
things clear, but the same thing can always be done faster by some
other method.)
</para>
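
<para>
For reference, the one use of setjmp and longjmp that ISO C does
guarantee is unwinding to a setjmp call in a function that is still
active. The following minimal sketch shows only that portable
direction. The names deep_work() and unwind_demo() are hypothetical and
this is not the test library mentioned above. A threads library has to
resume a context that was suspended earlier, which goes beyond this
guarantee.
</para>

<programlisting><![CDATA[
#include <setjmp.h>

static jmp_buf resume_point;

/* Recurse a few frames down, then optionally jump straight back to the
 * setjmp() call in unwind_demo(), skipping the normal returns. */
static void deep_work(int depth, int bail)
{
    if (depth <= 0) {
        if (bail)
            longjmp(resume_point, 1);
        return;
    }
    deep_work(depth - 1, bail);
}

/* Returns 1 if control came back through longjmp(), 0 if deep_work()
 * returned normally. */
int unwind_demo(int depth, int bail)
{
    if (setjmp(resume_point) != 0)
        return 1;
    deep_work(depth, bail);
    return 0;
}
]]></programlisting>

<para>
Here unwind_demo(5, 1) returns 1 and unwind_demo(5, 0) returns 0. The
jump always lands in a frame shallower than the one that called
longjmp, which is the only case that every platform has to support.
</para>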

<para>
Chris tried to spec out a general design that would abstract threading
vs separate processes (vs other methods?) and make them accessible
through some general API. This doesn't work well because of the data
sharing requirements of the protocol (packets in the future depending
on packets now, etc.). More precisely, the code would work but would be
very clumsy, and besides, the fork() type model would never work on Unix.
(Is there an OS that it would work on, for nmbd?)
</para>

<para>
A fork() is cheap, but not nearly cheap enough to do on every UDP
packet that arrives. Having a pool of processes is possible but is
nasty to program cleanly due to the enormous amount of shared data (in
complex structures) between the processes. We can't rely on each
platform having a shared memory system.
</para>

</sect1>

<sect1>
<title>nmbd Design</title>

<para>
Originally Andrew used recursion to simulate a multithreaded
environment, which used the stack enormously and made for really
confusing debugging sessions. Luke Leighton rewrote it to use a
queuing system that keeps state information on each packet. The
first version used a single structure shared by all the
pending states. Because this structure was initialised through
function arguments, and more arguments were added as the
functionality developed, it got pretty messy. So it was replaced
with a higher-order function and a pointer to a user-defined memory
block. This suddenly made things much simpler: large numbers of
functions could be made static and modularised. This is the same
principle as used in NT's kernel, and it achieves the same effect as
threads, but in a single process.
</para>
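
<para>
The "higher-order function plus pointer to a user-defined memory block"
idea can be sketched as follows. This is an illustration of the general
pattern only, not the nmbd code: every type and function name here
(pending_request, request_queue, queue_request(), dispatch_reply(),
name_query_state, count_answer()) is hypothetical, and a plain integer
stands in for a reply packet.
</para>

<programlisting><![CDATA[
#include <stdlib.h>

/* Callback invoked when the reply for a pending request arrives. */
typedef void (*response_fn)(int reply, void *userdata);

/* One queued request: the queue knows nothing about the caller's
 * state, it only carries the opaque pointer along. */
struct pending_request {
    int id;
    response_fn on_reply;
    void *userdata;
    struct pending_request *next;
};

struct request_queue {
    struct pending_request *head;
};

/* Remember a request. Returns 0 on success, -1 if out of memory. */
int queue_request(struct request_queue *q, int id,
                  response_fn fn, void *userdata)
{
    struct pending_request *p = malloc(sizeof(*p));

    if (p == NULL)
        return -1;
    p->id = id;
    p->on_reply = fn;
    p->userdata = userdata;
    p->next = q->head;
    q->head = p;
    return 0;
}

/* Hand a reply to the matching pending request and drop it from the
 * queue. Returns 1 if a request matched, 0 if none did. */
int dispatch_reply(struct request_queue *q, int id, int reply)
{
    struct pending_request **pp;

    for (pp = &q->head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->id == id) {
            struct pending_request *p = *pp;

            *pp = p->next;
            p->on_reply(reply, p->userdata);
            free(p);
            return 1;
        }
    }
    return 0;
}

/* Example of caller-owned state and a callback that updates it. */
struct name_query_state {
    int answers;
    int last_reply;
};

void count_answer(int reply, void *userdata)
{
    struct name_query_state *st = userdata;

    st->answers++;
    st->last_reply = reply;
}
]]></programlisting>

<para>
A caller would queue a request with its own state block, for example
queue_request(&amp;q, 7, count_answer, &amp;st), and later call
dispatch_reply(&amp;q, 7, reply) from the single event loop. Each kind of
request keeps its state in its own structure, so the queue code does
not grow new arguments whenever a feature is added.
</para>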

<para>
Then Jeremy rewrote nmbd. The packet data in nmbd isn't what's on the
wire. It's a nice format that is very amenable to processing but still
keeps the idea of a distinct packet. See "struct packet_struct" in
nameserv.h. It has all the detail but none of the on-the-wire
mess. This makes it ideal for use in disk or memory-based databases
for browsing and WINS support.
</para>

</sect1>
</chapter>