git.samba.org - martins/ctdb.git/log

when a ctdb_takeover_run has failed, we must make sure that
need_takeover_run is set to true, or else we might forget to rerun it
during the next recovery

otherwise, need_takeover_run is only set to true if the node flags for
a remote node and the local node differ.
It is possible that a takeover run fails, leaving the reassignment of
ip addresses incomplete, but that by the time we get back to the test in
monitor_cluster() the node flags of all nodes have converged
and match each other again, so that
monitor_cluster() fails to realize that a takeover run is needed.
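
A minimal sketch of the behaviour described above, not the actual ctdb code: the flag is cleared only when the takeover run succeeds, so a failed run is retried even after the node flags have converged. The names monitor_pass, state and run_takeover are hypothetical and only serve the illustration.

```python
def monitor_pass(state, flags_differ, run_takeover):
    """One hypothetical monitoring pass.

    Returns True if a takeover run was attempted during this pass.
    """
    if flags_differ:
        state["need_takeover_run"] = True
    if not state["need_takeover_run"]:
        return False
    ok = run_takeover()
    # Clear the flag only on success. A failed run leaves it set, so the
    # next pass retries even if the node flags match again by then.
    state["need_takeover_run"] = not ok
    return True
```

In this model a pass with matching flags still reruns the takeover if the previous attempt failed.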


Andrew Tridgell [Thu, 13 Sep 2007 04:36:23 +0000 (14:36 +1000)]

ensure smbd and winbindd do die in 50.samba


Ronnie Sahlberg [Thu, 13 Sep 2007 04:28:18 +0000 (14:28 +1000)]

merge from tridge


Andrew Tridgell [Thu, 13 Sep 2007 04:08:18 +0000 (14:08 +1000)]

prevent recursion in the calling of ctdb_takeover_run


Andrew Tridgell [Thu, 13 Sep 2007 01:57:42 +0000 (11:57 +1000)]

more shell scripting fixes in 10.interface


Andrew Tridgell [Thu, 13 Sep 2007 01:19:49 +0000 (11:19 +1000)]

force recovery if unable to tell a node to release an IP


Andrew Tridgell [Thu, 13 Sep 2007 01:19:30 +0000 (11:19 +1000)]

fixed script errors in 10.interface


Andrew Tridgell [Thu, 13 Sep 2007 00:45:06 +0000 (10:45 +1000)]

we don't need the is_loopback logic in ctdb any more


Andrew Tridgell [Thu, 13 Sep 2007 00:39:05 +0000 (10:39 +1000)]

remove more cruft from the logs


Andrew Tridgell [Thu, 13 Sep 2007 00:24:48 +0000 (10:24 +1000)]

new approach for killing TCP connections on IP release


Andrew Tridgell [Thu, 13 Sep 2007 00:03:18 +0000 (10:03 +1000)]

remove clutter from ctdb log file


Andrew Tridgell [Thu, 13 Sep 2007 00:02:56 +0000 (10:02 +1000)]

fixed return code


Andrew Tridgell [Wed, 12 Sep 2007 03:26:24 +0000 (13:26 +1000)]

handle hung or slow ctdb daemons on shutdown


Andrew Tridgell [Wed, 12 Sep 2007 03:23:36 +0000 (13:23 +1000)]

- set arp_ignore to prevent replying to arp requests for addresses on loopback
- put removed IPs on loopback with scope host
- check for nul strings in ethtool call


Andrew Tridgell [Wed, 12 Sep 2007 03:22:31 +0000 (13:22 +1000)]

- don't allow the registration of clients with IPs we don't hold
- change some debug levels to make tracking of IP release problems easier


Andrew Tridgell [Wed, 12 Sep 2007 03:21:19 +0000 (13:21 +1000)]

changed some debug levels


Ronnie Sahlberg [Tue, 11 Sep 2007 21:28:24 +0000 (07:28 +1000)]

use the public addresses variable instead of hardcoding the path


Ronnie Sahlberg [Tue, 11 Sep 2007 21:26:30 +0000 (07:26 +1000)]

move all ip addresses onto loopback when we startup ctdb


Andrew Tridgell [Tue, 11 Sep 2007 06:38:32 +0000 (16:38 +1000)]

fixed location of arp_filter


Andrew Tridgell [Mon, 10 Sep 2007 10:45:27 +0000 (20:45 +1000)]

get interface right


Ronnie Sahlberg [Mon, 10 Sep 2007 06:34:11 +0000 (16:34 +1000)]

grab the interface name from tok and not from the uninitialized array


Ronnie Sahlberg [Mon, 10 Sep 2007 06:23:06 +0000 (16:23 +1000)]

merged patch from tridge


Andrew Tridgell [Mon, 10 Sep 2007 05:16:17 +0000 (15:16 +1000)]

fixed a pointer cast warning


Andrew Tridgell [Mon, 10 Sep 2007 05:09:28 +0000 (15:09 +1000)]

added back --public-interface to startup script


Andrew Tridgell [Mon, 10 Sep 2007 04:27:29 +0000 (14:27 +1000)]

- use struct sockaddr_in more consistently instead of string addresses
- allow for public_address lines with a defaulting interface


Andrew Tridgell [Mon, 10 Sep 2007 04:26:35 +0000 (14:26 +1000)]

add back in --public-interface as a default


Andrew Tridgell [Mon, 10 Sep 2007 03:21:11 +0000 (13:21 +1000)]

merge from ronnie


Andrew Tridgell [Mon, 10 Sep 2007 01:27:07 +0000 (11:27 +1000)]

add crontab and sysctl output


Ronnie Sahlberg [Sun, 9 Sep 2007 21:45:57 +0000 (07:45 +1000)]

update a comment


Ronnie Sahlberg [Sun, 9 Sep 2007 21:20:44 +0000 (07:20 +1000)]

change the signature of ctdb_sys_have_ip() to also return:
- a bool that specifies whether the ip was held by a loopback adaptor or
  not
- the name of the interface where the ip was held

when we release an ip address from an interface, move the ip address
over to the loopback interface

when we release an ip address, after we have moved it onto loopback,
use 60.nfs to kill off the server side (the local part) of the tcp
connection so that the tcp connections don't survive a
failover/failback

61.nfstickle: since we kill the tcp connections when we release an ip
address, we no longer need to restart the nfs service in 61.nfstickle

update ctdb_takeover to use the new signature for ctdb_sys_have_ip

when we add a tcp connection to kill in ctdb_killtcp_add_connection(),
check if either the source or destination address matches a known public
address


Ronnie Sahlberg [Fri, 7 Sep 2007 22:09:02 +0000 (08:09 +1000)]

set /proc/sys/net/ipv4/conf/all/arp_filter to 1 by default when
10.interface starts up

this setting makes the system only respond to ARP requests on the NIC
that the ip address is tied to, and adds to the
"principle of least surprise" when using multihomed servers


Ronnie Sahlberg [Fri, 7 Sep 2007 06:45:19 +0000 (16:45 +1000)]

ctdb ip must loop over all connected nodes to pull the public ip list
and merge the results into one big list, since with the disassociation
between a node and a public ip address, the /etc/ctdb/public_addresses
files can differ between nodes and no node knows about all public
addresses that a cluster can use
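
The merge step can be sketched roughly as follows. This is an illustration only, not the ctdb tool's code: merge_public_ips and the per-node dictionary are hypothetical, and the real tool pulls each list from a node with a control rather than receiving it as an argument.

```python
import ipaddress


def merge_public_ips(per_node_ips):
    """Merge per-node public address lists into one de-duplicated list.

    per_node_ips maps a node number to the addresses that node reported.
    The result is sorted numerically by address.
    """
    merged = set()
    for ips in per_node_ips.values():
        merged.update(ips)
    return sorted(merged, key=ipaddress.ip_address)
```

For example, merging {0: ["10.0.0.1", "10.0.0.2"], 1: ["10.0.0.2", "10.0.0.10"]} yields each address once, even though no single node listed all three.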


Ronnie Sahlberg [Fri, 7 Sep 2007 05:39:26 +0000 (15:39 +1000)]

remove the ctdb publicip command
this command no longer makes sense when there is no one-to-one mapping
between a node and its default public ip


Ronnie Sahlberg [Fri, 7 Sep 2007 02:20:48 +0000 (12:20 +1000)]

update web nfs with the new NFS_HOSTNAME variable; we need it to be able
to stat notify using the correct hostname


Ronnie Sahlberg [Fri, 7 Sep 2007 02:14:53 +0000 (12:14 +1000)]

add a short delay after stopping nfslock to make it less likely that
"weird" things happen


Ronnie Sahlberg [Thu, 6 Sep 2007 23:21:40 +0000 (09:21 +1000)]

merge from tridge


Ronnie Sahlberg [Thu, 6 Sep 2007 22:52:56 +0000 (08:52 +1000)]

60.nfs:
we must always restart the lockmanager when the cluster has been
reconfigured and ip addresses have changed. This is to make sure we get a
clusterwide grace period for nfs locking.
if we don't do this and only restart locking on the nodes that were
directly affected, a different client can take out a conflicting lock
from a different node before affected clients have had a chance to
reclaim all the locks lost during reconfigure.
grace period on rhel5 kernel has been increased to 90 seconds!

statd-callout:
we must restart lockmanager to ensure a clusterwide grace period for
nfs. this makes locking "more correct" for nfs clients and prevents
other clients/nodes from taking out a conflicting lock while a different
client/node tries to reclaim lost locks.
This makes it "almost consistent" for NFS clients, but there is still
the possibility that a cifs client can take out a conflicting lock
before an nfs client has had a chance to reclaim an existing lock.
This cannot be solved with anything less than making the kernel nfs
lock manager "samba aware" and making samba aware of the internal state
of the kernel lock manager so that they can cooperate.

we cannot just stop/start the lockmanager back to back in rhel5, since
if they are stopped/started too close to each other, then when the new
lockmanager sends out statd notifications on startup, two things
can happen:
1, new lockmanager sends out notification BEFORE it has registered with
   portmapper, leading to
   lockmanager starts
   lockmanager sends notification to the client
   client tries to recover the lock and tries to portmap the lockmanager
   port on the server.
   server is not (yet) registered with portmapper and server responds
   "no such program" to the client's request to discover where lockmanager
   is.
   client then just completely gives up reclaiming the lock and doesn't
   even reattempt the portmapper call after some timeout.
   ==> lock reclaim failed.
2, if they are started back to back, and a client tries to reclaim the
   lock, the lockmanager sometimes sends two responses back to back
   to the client: one with status NLM_GRANTED (== you got the lock
   reclaimed) and one with status NLM_DENIED (== you could not get the lock
   reclaimed)
   This confuses the client and leads to the server thinking that the
   client does have the lock and the client thinking it has not got the
   lock, and orphaned locks result.

We also send out additional notification messages of different formats
to allow more legacy clients to interoperate with locking.


Ronnie Sahlberg [Thu, 6 Sep 2007 01:32:18 +0000 (11:32 +1000)]

we don't need the rpc.statd on shared directory, nor do we need
PUBLIC_IP anymore


Ronnie Sahlberg [Thu, 6 Sep 2007 01:30:49 +0000 (11:30 +1000)]

improve the handling of hosts to notify with statd


Ronnie Sahlberg [Thu, 6 Sep 2007 00:26:44 +0000 (10:26 +1000)]

specify the additional ports for nfs


Ronnie Sahlberg [Thu, 6 Sep 2007 00:18:13 +0000 (10:18 +1000)]

the event scripts for nfs are called 60.nfs and 61.nfstickle


Ronnie Sahlberg [Wed, 5 Sep 2007 22:21:11 +0000 (08:21 +1000)]

document NFS_TICKLE_SHARED_DIRECTORY on our web page


Ronnie Sahlberg [Wed, 5 Sep 2007 05:39:51 +0000 (15:39 +1000)]

we don't use 'sendip' any more, so stop checking for it and exiting from
the 61.nfstickle script if it is missing from the host


Ronnie Sahlberg [Wed, 5 Sep 2007 04:59:29 +0000 (14:59 +1000)]

we should always get data back from getnodemap


Ronnie Sahlberg [Wed, 5 Sep 2007 04:29:44 +0000 (14:29 +1000)]

don't dereference vnn before we have assigned it a pointer value


Andrew Tridgell [Wed, 5 Sep 2007 04:20:34 +0000 (14:20 +1000)]

added a diagnostics tool for ctdb


Ronnie Sahlberg [Tue, 4 Sep 2007 13:15:23 +0000 (23:15 +1000)]

allow different nodes in the cluster to use different public_addresses
files, so that we can partition the cluster into different subsets of
nodes which each serve a different subset of the public addresses


Ronnie Sahlberg [Tue, 4 Sep 2007 08:20:29 +0000 (18:20 +1000)]

get rid of the ctdb_vnn_list structure and just use a single list of
ctdb_vnn


Ronnie Sahlberg [Tue, 4 Sep 2007 04:36:52 +0000 (14:36 +1000)]

we can't have takeover_ctx hanging off ctdb, since it is freed/recreated
every time we release an ip.
this context is used to hold all resources needed when sending out
gratuitous arps and tcp tickles during ip takeover.

we hang it off the vnn structure that manages that particular ip address
instead, so that we can have multiple ones going in parallel

this bug (or the same bug in a different shape) has probably been in ctdb
for a very long time, but is likely to be hard to trigger


Ronnie Sahlberg [Tue, 4 Sep 2007 04:21:35 +0000 (14:21 +1000)]

fix typo in debug output


Ronnie Sahlberg [Tue, 4 Sep 2007 04:19:18 +0000 (14:19 +1000)]

don't just always return 0 from the killtcp control.
return 0 or -1 so that the ctdb tool knows whether the control succeeded
or not


Ronnie Sahlberg [Tue, 4 Sep 2007 00:49:21 +0000 (10:49 +1000)]

change vnn to pnn in the traverse structure


Ronnie Sahlberg [Tue, 4 Sep 2007 00:47:02 +0000 (10:47 +1000)]

change debug output from vnn to pnn


Ronnie Sahlberg [Tue, 4 Sep 2007 00:45:41 +0000 (10:45 +1000)]

change debug output from vnn to pnn

change ctdb_daemon_send_message to take pnn as parameter instead of vnn


Ronnie Sahlberg [Tue, 4 Sep 2007 00:42:20 +0000 (10:42 +1000)]

change ctdb_send_message to take pnn as parameter instead of vnn


Ronnie Sahlberg [Tue, 4 Sep 2007 00:38:48 +0000 (10:38 +1000)]

change ctdb_ctrl_getvnn to ctdb_ctrl_getpnn


Ronnie Sahlberg [Tue, 4 Sep 2007 00:33:10 +0000 (10:33 +1000)]

change ctdb_node_flags_change.vnn to ctdb_node_flags_changed.pnn

change ctdb_ban_info.vnn to ctdb_ban_info.pnn


Ronnie Sahlberg [Tue, 4 Sep 2007 00:21:51 +0000 (10:21 +1000)]

change server_id.vnn to server_id.pnn


Ronnie Sahlberg [Tue, 4 Sep 2007 00:18:44 +0000 (10:18 +1000)]

change ctdb_get_vnn to ctdb_get_pnn


Ronnie Sahlberg [Tue, 4 Sep 2007 00:14:41 +0000 (10:14 +1000)]

change vnn to pnn in the ctdb tool


Ronnie Sahlberg [Tue, 4 Sep 2007 00:09:58 +0000 (10:09 +1000)]

change ctdb_validate_vnn to ctdb_validate_pnn


Ronnie Sahlberg [Tue, 4 Sep 2007 00:06:36 +0000 (10:06 +1000)]

change ctdb->vnn to ctdb->pnn


Ronnie Sahlberg [Mon, 3 Sep 2007 23:50:07 +0000 (09:50 +1000)]

change how we do public addresses and takeover so that we can have
multiple public addresses spread across multiple interfaces on each
node.

this is a massive patch since we have previously made the assumption that
we only have one public address per node.

get rid of the public_interface argument. the public addresses file
now explicitly lists which interface the address belongs to


Ronnie Sahlberg [Sun, 2 Sep 2007 23:29:30 +0000 (09:29 +1000)]

merge from tridge


Andrew Tridgell [Thu, 30 Aug 2007 07:51:05 +0000 (17:51 +1000)]

up the release number


Andrew Tridgell [Thu, 30 Aug 2007 07:16:23 +0000 (17:16 +1000)]

merge from ronnie


Ronnie Sahlberg [Thu, 30 Aug 2007 05:27:45 +0000 (15:27 +1000)]

when we start 60.nfs we must make sure that the shared storage
nfs-state directory actually exists (by creating it)
or else the lock manager will not start


Andrew Tridgell [Mon, 27 Aug 2007 08:04:53 +0000 (18:04 +1000)]

merge from ronnie


Ronnie Sahlberg [Mon, 27 Aug 2007 08:04:17 +0000 (18:04 +1000)]

merge from tridge


Ronnie Sahlberg [Mon, 27 Aug 2007 07:33:46 +0000 (17:33 +1000)]

add an extra debug statement when we send a SIGTERM to a process


Ronnie Sahlberg [Mon, 27 Aug 2007 05:03:52 +0000 (15:03 +1000)]

make the ctdb shutdown command use the async _send() function to send
the shutdown command
and return success to the caller if the _send() was successful


Andrew Tridgell [Mon, 27 Aug 2007 01:49:42 +0000 (11:49 +1000)]

fixed segv when no public interface is set


Ronnie Sahlberg [Mon, 27 Aug 2007 00:31:22 +0000 (10:31 +1000)]

add async versions of the freeze node control and freeze all nodes in
parallel


Ronnie Sahlberg [Sun, 26 Aug 2007 23:40:10 +0000 (09:40 +1000)]

change the monitoring of recmode in the recovery daemon to use a fully
async event-driven api for controls


Ronnie Sahlberg [Sun, 26 Aug 2007 00:57:02 +0000 (10:57 +1000)]

add a control to pull the server id list off a node


Ronnie Sahlberg [Fri, 24 Aug 2007 05:53:41 +0000 (15:53 +1000)]

add an initial implementation of a service_id structure and three
controls to register/unregister/check a server id.

a server id consists of TYPE:VNN:ID, where TYPE is specific to the
application, VNN is the node where the server id was registered and ID
might be a node-unique identifier such as a pid or similar.

Clients can register a server id for themselves at the local ctdb daemon.
When a client disappears, or when the domain socket connection for the
client drops, any and all server ids registered across that domain
socket will also be automatically removed from the store.

clients can register as many server_ids as they want at the same time,
but each TYPE:VNN:ID must be globally unique.

Clients have the option of explicitly unregistering a server id by using
the UNREGISTER control.

Registration and unregistration can only be done by clients to the local
daemon. clients cannot register their server id to a remote node.

clients can check if a server id exists on any ctdb node in the
network by using the check control
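
The rules above can be modelled with a small in-memory registry. This is a hedged sketch of the semantics only, not the daemon's implementation: ServerIdRegistry, its method names and the connection handles are all hypothetical, and the cross-node check is reduced to a lookup in one local store.

```python
class ServerIdRegistry:
    """Hypothetical model of one daemon's server id store."""

    def __init__(self, local_vnn):
        self.local_vnn = local_vnn
        # connection handle -> set of (type, vnn, id) tuples
        self.by_conn = {}

    def register(self, conn, sid):
        _type, vnn, _id = sid
        if vnn != self.local_vnn:
            return False  # ids can only be registered at the local daemon
        if self.check(sid):
            return False  # each TYPE:VNN:ID must be unique
        self.by_conn.setdefault(conn, set()).add(sid)
        return True

    def unregister(self, conn, sid):
        ids = self.by_conn.get(conn, set())
        if sid not in ids:
            return False
        ids.remove(sid)
        return True

    def check(self, sid):
        return any(sid in ids for ids in self.by_conn.values())

    def connection_closed(self, conn):
        # A dropped domain socket removes every id registered over it.
        self.by_conn.pop(conn, None)
```

Because the VNN is part of the id and registration is local only, uniqueness within each node's store is enough to make TYPE:VNN:ID unique across the cluster in this model.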


Ronnie Sahlberg [Fri, 24 Aug 2007 00:54:34 +0000 (10:54 +1000)]

cleanup invoke_control_callback. we don't need to pass some of these
parameters to _recv() since they are already set


Ronnie Sahlberg [Fri, 24 Aug 2007 00:42:06 +0000 (10:42 +1000)]

change the api for managing callbacks to controls so that instead of
passing it as a parameter we set the callback function explicitly from
the caller if the ..._send() function returned a valid state pointer.


Ronnie Sahlberg [Thu, 23 Aug 2007 23:34:04 +0000 (09:34 +1000)]

comment why we do a talloc_steal


Ronnie Sahlberg [Thu, 23 Aug 2007 09:38:54 +0000 (19:38 +1000)]

get rid of the explicit global timeout used in the previous example and
try this time by relying on the timeouts for the individual controls


Ronnie Sahlberg [Thu, 23 Aug 2007 09:27:09 +0000 (19:27 +1000)]

try out a slightly different api for controls where you provide a
callback function which is called upon completion (or timeout) of the
control.

modify scanning of recmaster in the monitoring_cluster code to try the
api out


Ronnie Sahlberg [Thu, 23 Aug 2007 03:48:39 +0000 (13:48 +1000)]

break the check that the recovery mode on all nodes is ok out into its
own function


Ronnie Sahlberg [Thu, 23 Aug 2007 03:00:10 +0000 (13:00 +1000)]

hang the ctdb_req_control structure off the ctdb_client_control_state
struct so that if we timeout a control we can print debug info such as
what opcode failed and to which node

we don't need the *status parameter to ctdb_client_control_state

create async versions of the getrecmaster control

pass a memory context to getrecmaster


Ronnie Sahlberg [Thu, 23 Aug 2007 01:58:09 +0000 (11:58 +1000)]

in ctdb_call_recv() we must check that state is non-NULL since
ctdb_call() may pass a null pointer to _recv() and this would cause a
segfault.
fortunately there appear to be no critical users of this codepath
right now, so the risk was more theoretical: IF clients start using this
call it could segfault.

change ctdb_control() to become fully async so we later can make the
recovery daemon do the expensive controls to nodes in parallel instead
of in sequence


Ronnie Sahlberg [Wed, 22 Aug 2007 23:53:10 +0000 (09:53 +1000)]

create an enum to describe the state of a control in flight instead of
using the enum that is for calls


Ronnie Sahlberg [Wed, 22 Aug 2007 09:28:03 +0000 (19:28 +1000)]

merge from tridge


Andrew Tridgell [Wed, 22 Aug 2007 07:31:29 +0000 (17:31 +1000)]

merge from ronnie


Andrew Tridgell [Wed, 22 Aug 2007 07:18:55 +0000 (17:18 +1000)]

merge from volker


Andrew Tridgell [Wed, 22 Aug 2007 07:16:01 +0000 (17:16 +1000)]

merge from volker


Ronnie Sahlberg [Wed, 22 Aug 2007 02:53:24 +0000 (12:53 +1000)]

when we receive a packet from the network, check explicitly that the
node is not banned if the call is for a database record, i.e. a REQ/REPLY
CALL/DMASTER

if we get such a call while banned, ignore the packet and write an entry
in the logfile
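
A rough sketch of that check, for illustration only: the operation names are written as plain strings expanded from "REQ/REPLY CALL/DMASTER", and accept_packet and the logger name are hypothetical rather than taken from the ctdb source.

```python
import logging

# Packet types that touch database records, as plain strings for this sketch.
DB_OPS = {"REQ_CALL", "REPLY_CALL", "REQ_DMASTER", "REPLY_DMASTER"}


def accept_packet(op, node_banned, log=logging.getLogger("ctdb-sketch")):
    """Return False (and log) for database packets arriving while banned."""
    if node_banned and op in DB_OPS:
        log.warning("ignoring %s packet received while banned", op)
        return False
    return True
```

Other packet types still pass in this sketch, so a banned node keeps handling non-database traffic.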


Ronnie Sahlberg [Wed, 22 Aug 2007 02:38:31 +0000 (12:38 +1000)]

create a define to represent the 'invalid' generation id we used in two
places.

create a new helper function to generate new generation id values that
knows about the invalid id and avoids generating it.

update the ctdb status tool to know about the invalid generation id and
print the string INVALID instead
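
A sketch of what such a helper and the status output could look like. Everything here is an assumption made for illustration: the value 1 for the invalid id is inferred from the earlier commit that resets the generation to 1 on ban, the use of random 32-bit values is a guess, and new_generation and format_generation are hypothetical names.

```python
import random

INVALID_GENERATION = 1  # assumed value, see the note above


def new_generation(rng=random):
    """Pick a new generation id, never returning the invalid one."""
    while True:
        gen = rng.getrandbits(32)
        if gen != INVALID_GENERATION:
            return gen


def format_generation(gen):
    """Render a generation id the way a status tool might print it."""
    return "INVALID" if gen == INVALID_GENERATION else str(gen)
```

Keeping the sentinel in one named constant means the generator and the printer cannot disagree about which value is reserved.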


Ronnie Sahlberg [Wed, 22 Aug 2007 01:34:48 +0000 (11:34 +1000)]

if the node is inactive i.e. banned or disconnected then that node is
not participating in the cluster

if a client tries to attach to a database while the node is inactive,
return an error back to the client and fail the attach


Ronnie Sahlberg [Wed, 22 Aug 2007 00:38:35 +0000 (10:38 +1000)]

when a node becomes banned its databases are no longer part of ctdb
and it should thus no longer serve any database access calls until it
has been reintroduced into the cluster.

when becoming banned, reset the local generation id to 1 to prevent
any further database access calls from other nodes from being processed.


Ronnie Sahlberg [Tue, 21 Aug 2007 23:46:48 +0000 (09:46 +1000)]

if lockwait takes an excessive time to complete, log the time it took to
complete and also the name of the database


Ronnie Sahlberg [Tue, 21 Aug 2007 07:25:15 +0000 (17:25 +1000)]

change the structure used for node flag change messages so that we can
see both the old flags as well as the new flags (so we can tell which
flags changed)

send the CTDB_SRVID_RECONFIGURE messages to connected nodes only, not to
every node, connected or not, in the cluster.

in the handler inside the recovery daemon which is invoked for node flag
change messages, only do a takeover_run() and redistribute the ip
addresses IF it was the disabled or the unhealthy flags that changed.
Also send out the cluster reconfigured message in this case.
If any of the other flags changed we don't need to do the takeover_run()
here since that will be done during recovery.
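
Carrying both the old and the new flags makes this test a simple bit operation. A minimal sketch, assuming made-up flag names and bit values (the real ctdb definitions may differ):

```python
# Hypothetical flag bits, chosen only for this example.
NODE_FLAGS_DISABLED = 0x1
NODE_FLAGS_UNHEALTHY = 0x2
NODE_FLAGS_BANNED = 0x4


def needs_takeover_run(old_flags, new_flags):
    """True if the disabled or unhealthy bit differs between old and new."""
    changed = old_flags ^ new_flags
    return bool(changed & (NODE_FLAGS_DISABLED | NODE_FLAGS_UNHEALTHY))
```

A change in any other bit, such as the banned bit here, returns False and is left to recovery.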

Martin's CTDB development
