cifs.upcall: try to use container ipc/uts/net/pid/mnt/user namespaces
author: Alastair Houghton <alastair@alastairs-place.net>
Tue, 29 Dec 2020 14:02:39 +0000 (14:02 +0000)
committer: Pavel Shilovsky <pshilov@microsoft.com>
Tue, 6 Apr 2021 19:20:18 +0000 (12:20 -0700)
commit: e461afd8cfa6d0781ae0c5c10e89b6ef1ca6da32
tree: 23689ac06c11e56cc0a85283cbcf367a3eb75577
parent: 73008e3292e4d46fde3eab5d5f618886210ec4a1

In certain scenarios (e.g. Kerberos multimount), when a process makes
syscalls, the kernel sometimes has to query information or trigger
some action in userspace. To do so it calls the cifs.upcall binary
with information on the process that triggered the syscall in the
first place.

ls(pid=10) ====> open("foo") ====> kernel

                                   that user doesn't have an SMB
                                   session, let's create one using his
                                   Kerberos credential cache

                                   call cifs.upcall and ask for krb info
                                   for whoever owns pid=10
                                                         |
                  cifs.upcall --pid 10 <=================+

               ...gather info...
                  return binary blob used
                  when establishing SMB session
                        ===================> kernel
                                              open SMB session, handle
                                              open() syscall
ls <===================================   return open() result to ls

On a system using containers, the kernel still calls the host's
cifs.upcall, which uses the host configuration (network, pid, etc.).

This patch changes the behaviour of cifs.upcall so that it uses the
calling process's namespaces (those of ls in the example) when doing
its job.

Note that the kernel still calls the binary on the host, but the
binary places itself in the context of the calling process's
namespaces.
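
The namespace switch described above can be sketched roughly as follows.
This is a minimal illustration, not the code from this patch: the names
ns_path() and join_namespaces_of() are hypothetical, and the join order
(user namespace last) and the choice to open every descriptor before
switching are assumptions of the sketch. It relies only on the
/proc/<pid>/ns/* files and the setns(2) call, and treats every failure
as non-fatal, in the spirit of "try to use".

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Entries under /proc/<pid>/ns, in the order this sketch joins them
 * (assumption: user namespace last). */
static const char *const ns_names[] = {
	"cgroup", "ipc", "uts", "net", "pid", "mnt", "user",
};
#define NS_COUNT (sizeof(ns_names) / sizeof(ns_names[0]))

/* Write "/proc/<pid>/ns/<name>" into buf.
 * Returns the path length, or -1 if it does not fit. */
int ns_path(char *buf, size_t len, pid_t pid, const char *name)
{
	int n;

	assert(buf != NULL && name != NULL);
	n = snprintf(buf, len, "/proc/%ld/ns/%s", (long)pid, name);
	return (n > 0 && (size_t)n < len) ? n : -1;
}

/* Try to join each namespace of `pid`; returns how many were joined.
 * A namespace that cannot be opened or joined is simply skipped. */
int join_namespaces_of(pid_t pid)
{
	int fds[NS_COUNT];
	char path[64];
	size_t i;
	int joined = 0;

	/* Open all descriptors first: once the mount namespace changes,
	 * the /proc we see may no longer be the host's. */
	for (i = 0; i < NS_COUNT; i++) {
		fds[i] = -1;
		if (ns_path(path, sizeof(path), pid, ns_names[i]) > 0)
			fds[i] = open(path, O_RDONLY | O_CLOEXEC);
	}

	for (i = 0; i < NS_COUNT; i++) {
		if (fds[i] < 0)
			continue;
		/* nstype 0: accept whatever namespace type the fd refers to */
		if (setns(fds[i], 0) == 0)
			joined++;
		close(fds[i]);
	}
	return joined;
}
```

With the example above, the upcall would call join_namespaces_of(10)
before gathering the Kerberos information. Joining usually needs
CAP_SYS_ADMIN, and joining a pid namespace only affects children
created afterwards, not the caller itself.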

This code makes use of (but shouldn't require) the following kernel
config options and syscall flags:

approx. year   |
introduced     |  config/flags
---------------+----------------
2008           | CONFIG_NAMESPACES=y
2007           | CONFIG_UTS_NS=y
2020           | CONFIG_TIME_NS=y
2006           | CONFIG_IPC_NS=y
2007           | CONFIG_USER_NS
2008           | CONFIG_PID_NS=y
2007           | CONFIG_NET_NS=y
2007           | CONFIG_CGROUPS
2016           | CLONE_NEWCGROUP setns() flag
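
One way to "make use of but not require" these options is to probe for
them instead of assuming them. The sketch below is an illustration
only, not part of this patch; ns_supported() and have_clone_newcgroup()
are hypothetical helper names. It checks at run time whether the kernel
exposes a given /proc/self/ns entry, and at build time whether the C
library headers define CLONE_NEWCGROUP.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 if the running kernel exposes /proc/self/ns/<name>, else 0. */
int ns_supported(const char *name)
{
	char path[64];
	int n;

	assert(name != NULL);
	n = snprintf(path, sizeof(path), "/proc/self/ns/%s", name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return 0;
	return access(path, F_OK) == 0;
}

/* Returns 1 if the headers used for this build define CLONE_NEWCGROUP. */
int have_clone_newcgroup(void)
{
#ifdef CLONE_NEWCGROUP
	return 1;
#else
	return 0;
#endif
}
```

For example, ns_supported("time") would be expected to return 0 on a
kernel built without CONFIG_TIME_NS, in which case that namespace is
skipped rather than treated as an error.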

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Alastair Houghton <alastair@alastairs-place.net>
cifs.upcall.c