The 411 Secure Information Service provides NIS-like functionality for Rocks clusters. It is named after the common "411" code for information in the phone system. We use 411 to securely distribute password files, user and group configuration files and the like.
411 uses public-key cryptography to protect file contents. It operates at the file level, rather than on the RPC-based per-line maps of NIS. 411 does not rely on RPC; instead it distributes the files themselves over HTTP. Its central task is to securely maintain critical login and password files on the worker nodes of a cluster. It does this by implementing a file-based distributed database with weak consistency semantics. The design goals of 411 include scalability, security, low latency when changes occur, and resilience to failures.
Beginning with the Rocks 3.1.0 Matterhorn release, 411 replaces NIS as the default method of distributing /etc/passwd and other login files. We no longer support NIS.
The 411 system intentionally mimics the NIS interface for system administrators. There are elements in 411 that are not present in NIS, namely RSA public and private cryptographic keys. However, we have attempted to make 411 as easy to use as NIS when it serves in that role.
Files listed in /var/411/Files.mk are automatically serviced by 411. This means that any file listed there will be kept up to date by the 411 agents on all compute nodes in your cluster. This is done using the makefile /var/411/Makefile in a similar fashion to NIS. To force the 411 system to flush all changes, execute the following on the frontend node:
# make -C /var/411
Note that this command is run by cron every hour on the frontend to propagate password changes and similar updates to compute nodes. New files can be added to Files.mk as necessary for custom services on the cluster.
We provide a 411 initscript to do some useful tasks. The command "/etc/init.d/411 commit" will run the make command in the /var/411 directory as shown above. Running "/etc/init.d/411 restart" will force all 411 files to be re-encrypted and change alerts resent for each. This can be useful for troubleshooting purposes. Note that this command is not necessary when restarting or adding nodes to your cluster, as they will pull the latest 411 files from the frontend automatically upon startup.
The 411 service requires multicast support on your cluster's private network for optimum performance. If you suspect the 411 service of not working correctly, please verify that multicast messages sent by the frontend are visible on all compute nodes in your cluster. Using the "tcpdump" command can be useful here.
The 411 service defines Master and client nodes like standard NIS. Master nodes encrypt and serve 411 files (called 411 messages once they are encrypted) using their local Apache web server. Client nodes, generally the compute nodes in the cluster, retrieve 411 messages using HTTP, decrypt them, and save the resultant file to their local filesystem.
Client nodes may recognize multiple master servers, and make some attempt to load balance their 411 message retrievals across the set of master servers to reduce strain on the cluster. Client nodes have two components: the poller and the listener, which are described in detail below. Master nodes have only the 411put command, which publishes a new or changed 411 file. It is possible for a node to be both a master and a client at the same time.
Rocks does not currently use multiple 411 masters. The frontend node serves as the single master node in the cluster.
Client nodes listen on the Ganglia multicast channel for "411alert" messages from the master. The master will send 411alerts during a 411put operation, just after it has encrypted a 411 file. The alert message serves as a cue to the client nodes that a file has changed and needs to be retrieved. In this way the 411 system generally achieves a low-latency response to changes.
Using lightweight UDP messages sent over a multicast channel has advantages and disadvantages. The advantages are speed and widespread reception of the alert - the master does not need to know the names of its clients. The disadvantages are request flooding and unreliable message delivery. A master may get overwhelmed with 411 get requests, and some clients may not receive the alert at all.
Figure: The 411 listener architecture. When the frontend changes a login file, the 411 makefile sends out a UDP multicast alert to the cluster nodes notifying of the event. Upon receipt nodes pull the file from the frontend via HTTP. Random backoffs on the node ensure the frontend will not suffocate under a barrage of requests. Every alert is sent multiple times by the master to ensure each node hears it.
To prevent flooding the master server with requests, the listeners keep an estimate of the cluster size, and use that estimate to set a random backoff for their requests. A client does not immediately request the changed file, but waits some amount of time before asking for it. Currently, the backoff is calibrated to allow only 8 concurrent connections to the master server on average.
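The calibration described above can be sketched as follows. This is only an illustration under stated assumptions, not the listener's actual code: the function names, the uniform distribution, and the fetch_seconds parameter (an estimate of how long one retrieval takes) are all hypothetical. The target of 8 concurrent connections comes from the text.

```python
import random


def backoff_window(cluster_size, fetch_seconds, max_concurrent=8):
    """Length in seconds of the window over which requests are spread.

    If each fetch takes about fetch_seconds and requests start uniformly
    at random inside the window, the expected number of fetches in flight
    is cluster_size * fetch_seconds / window. Solving for the window that
    yields max_concurrent gives the expression below.
    """
    if cluster_size <= 0 or fetch_seconds <= 0 or max_concurrent <= 0:
        raise ValueError("all arguments must be positive")
    return cluster_size * fetch_seconds / max_concurrent


def random_backoff(cluster_size, fetch_seconds, max_concurrent=8, rng=random):
    """Pick a delay uniformly at random inside the backoff window."""
    window = backoff_window(cluster_size, fetch_seconds, max_concurrent)
    return rng.uniform(0.0, window)
```

With an estimated 128 nodes and half a second per fetch, the window under these assumptions is 8 seconds, so each client would wait between 0 and 8 seconds before contacting the master.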
The problem of listeners not receiving an alert, perhaps because the message was dropped in the network, is solved by relying on the 411 poller as a backup. Any new or changed files will eventually make it to all correctly functioning client nodes, regardless of the multicast message reliability.
In Rocks 3.2.0 the 411 service has been strengthened. Instead of sending a single multicast alert when a 411 file has changed, the frontend repeats the alert every 10 seconds for about an hour. Clients ignore duplicate alerts, so these repeats incur minimal overhead. The benefit is that, viewed over the hour, we achieve effectively reliable multicast: every client receives an alert with very high probability. Ganglia uses this strategy to ensure reliable UDP multicast as well.
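To see why repetition gives such a high delivery probability, consider this small calculation. It assumes that each alert is lost independently with the same probability, which is a simplification of real network behavior; the function names are hypothetical. The 10-second interval and one-hour period come from the text.

```python
def repeat_count(period_seconds=3600, interval_seconds=10):
    """Number of alerts sent when repeating at a fixed interval."""
    return period_seconds // interval_seconds


def probability_alert_received(loss_rate, repeats):
    """Chance that at least one of `repeats` alerts reaches a client.

    Assumes independent losses: the client misses everything only if
    every single alert is dropped, which happens with loss_rate ** repeats.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    return 1.0 - loss_rate ** repeats
```

Under this model an hour of repeats is about 360 alerts, so even a client that drops half of all packets misses every alert with probability 0.5 to the power 360, which is negligible.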
411 is akin to a distributed database, and is not a centralized lookup service like NIS. Unlike NIS, it has been designed to scale to hundreds of nodes, and many reports have shown this to be the case. It also protects its messages with RSA cryptography. However, all systems have limitations, and 411 can never ensure correct operation over an arbitrarily small time window. As the window (or delta) gets smaller, the chance grows that some node does not receive a 411 file when it should. There are fundamental reasons for this deficiency: any single alert can be lost, and clients do not poll constantly. Over a reasonable time window, however, 411 works correctly. Tests have shown that 124 nodes converge on a new 411 file in approximately 20 seconds.
Each client node polls its master server at a random interval, on average once per day. During these polls, the client retrieves all available 411 messages by getting a directory listing of the master's 411 web directory and performs a 411get on each message it finds. The purpose of the 411 poller is to allow nominally correct operation of the service in case the 411 listener has failed, or the UDP multicast channel is unavailable.
Figure: The 411 poller architecture. Independent of any alerts, each 411 client node will retrieve all available files from the frontend via HTTP.
The poller is implemented as a greceptor event module, and relies on the operation of that daemon. 411 pollers learn their master servers by reading a configuration file on their local disk. This file, written in XML, is generated automatically by the 411 listener.
Since the 411 polling daemon operates as the root user on client nodes, it is essential that the 411 HTTP directory on a master server be writable only by root. If any user is allowed to write messages to the master's 411 directory, security can easily be compromised by a user-level file being written as root by a client node.
The 411 message format is simple and easy to mimic, and a clever user with write access to the 411 web-visible directory (currently /etc/411.d) could put a file anywhere on the client node's filesystem. Ensure this directory is writable only by root. This state is enforced by standard Rocks, but we want to reiterate its importance.
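One way to audit this requirement on a master is sketched below. The function names are hypothetical and the check covers only the owner and the traditional permission bits; it does not look at ACLs or at the permissions of parent directories.

```python
import os
import stat


def mode_is_root_only_writable(owner_uid, mode):
    """True if owned by root (uid 0) and not writable by group or others."""
    group_or_other_write = stat.S_IWGRP | stat.S_IWOTH
    return owner_uid == 0 and not (mode & group_or_other_write)


def directory_is_safe(path="/etc/411.d"):
    """Check the owner and permission bits of the 411 message directory."""
    info = os.stat(path)
    return stat.S_ISDIR(info.st_mode) and mode_is_root_only_writable(
        info.st_uid, info.st_mode
    )
```

A directory owned by root with mode 0755 passes this check, while mode 0775 or any non-root owner fails it.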
Security is achieved by using 1024-bit RSA public and private keys. The masters and all clients in a cluster share the same public and private key. Although this scheme is less secure than using a unique key per host, it reduces the processing requirements of the system dramatically.
When a master server wishes to publish a file using 411, it uses the "411put" command to encrypt the file (or directory) with the cluster public key. This encryption is done only once, no matter how many nodes are present in the cluster. The encrypted file is stored on a plain HTTP server, with no HTTPS requirements. A client node must possess the cluster private key to decrypt and interpret the 411 message.
411 messages contain encrypted headers that specify the location, owner, and permissions of the file on the master server. In this way a client securely knows where and how to place the file on its local filesystem. In a correctly functioning 411 system, the clients look identical to their masters in terms of their 411 files.
We found that widely available HTTPS implementations take an order of magnitude more processing time to serve a file than regular HTTP. Large clusters using 411 would quickly spend all their time maintaining secure HTTPS connections.
411 uses a hybrid encryption scheme for performance reasons, similar to PGP and GnuPG. A 256-bit random number is chosen as a session key and encrypted with the cluster public key. The session key is used to encrypt the file's contents and headers with the fast Blowfish symmetric cipher. A new session key is chosen for every file, and the encrypted message is transformed into ASCII text using Base64 encoding.
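The shape of such a hybrid scheme can be sketched as follows. This is not the 411 message format: the byte layout (a 2-byte length, the wrapped key, then the body) and the function names are assumptions made for illustration, and the two cipher operations are passed in as hypothetical callables rather than implemented here. In 411 they would be RSA with the cluster keys and Blowfish with the session key.

```python
import base64
import secrets

SESSION_KEY_BYTES = 32  # 256 bits, as described in the text


def seal(plaintext, encrypt_key_with_public_key, encrypt_with_session_key):
    """Build an ASCII message from plaintext bytes (headers plus contents).

    Both callables are hypothetical placeholders supplied by the caller:
      encrypt_key_with_public_key(key) -> bytes
      encrypt_with_session_key(key, data) -> bytes
    """
    session_key = secrets.token_bytes(SESSION_KEY_BYTES)  # fresh for every file
    wrapped_key = encrypt_key_with_public_key(session_key)
    body = encrypt_with_session_key(session_key, plaintext)
    # Assumed layout: 2-byte big-endian key length, wrapped key, encrypted body.
    blob = len(wrapped_key).to_bytes(2, "big") + wrapped_key + body
    return base64.b64encode(blob).decode("ascii")


def unseal(message, decrypt_key_with_private_key, decrypt_with_session_key):
    """Reverse seal(): recover the session key, then the plaintext."""
    blob = base64.b64decode(message)
    key_length = int.from_bytes(blob[:2], "big")
    wrapped_key = blob[2:2 + key_length]
    body = blob[2 + key_length:]
    session_key = decrypt_key_with_private_key(wrapped_key)
    return decrypt_with_session_key(session_key, body)
```

The point of the design is that the slow public-key operation touches only the 32-byte session key, while the bulk of the file is handled by the fast symmetric cipher.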
One might observe that a malicious attacker could send bogus 411alert messages to bring down the master's HTTP server. The problem is the alert multicast packet is easy to make, and will cause a large response in the system. The 411 service addresses this problem by cryptographically signing each alert. The master server uses the secret cluster private key to sign the alert message, which the client nodes verify with the public key. In addition, the master includes a timestamp with each alert so old (valid) alerts cannot be stored by the attacker and reused.
Any 411 alert that does not verify or that carries a timestamp that we have already seen is ignored. In this way as long as the cluster private key is kept secure, only legitimate alerts will carry weight.
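The acceptance rule can be sketched as a small filter. The class name and the verify_signature callable are hypothetical; in 411 the verification would be an RSA signature check against the cluster public key, which is not implemented here. The sketch also keeps every timestamp forever, which a long-running listener would need to bound.

```python
class AlertFilter:
    """Accept an alert only if it verifies and its timestamp is new."""

    def __init__(self, verify_signature):
        # Hypothetical callable: verify_signature(text, signature) -> bool
        self._verify = verify_signature
        self._seen_timestamps = set()

    def accept(self, text, signature, timestamp):
        if not self._verify(text, signature):
            return False  # forged or corrupted alert
        if timestamp in self._seen_timestamps:
            return False  # duplicate or replayed alert
        self._seen_timestamps.add(timestamp)
        return True
```

Timestamps from alerts that fail verification are never recorded, so an attacker cannot block a legitimate alert by sending a forged one with the same timestamp first.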
The final observation is that any true 411alert must come from a valid master server. We use this fact to automatically learn about new masters. The text of a 411 alert contains the network address of the master that sent it, which cannot be altered without breaking the cryptographic signature. The listeners use this address to create the 411 configuration file, which is used by the poller.
Upon verifying an alert from a master server it has not seen before, a 411 listener writes a new configuration file for the service. Pollers monitor the modification time of this file, and reload its contents on a change. In this way new master servers are easily and securely added to the system. In the face of unreliable and lost alerts the file must be created by hand, but we do not expect this to be the common case.
In addition, each client node keeps a "quality score" counter for each master server in its configuration file. When a client successfully contacts a 411 master, the client increments the saturating score counter for that master. Similarly, if the client fails to contact a particular 411 master server, the client decrements the appropriate score. When a client needs to perform a 411 operation, it chooses the master server with the highest quality rating. In this way dead or overloaded master servers are used less frequently by clients, promoting good health and load balance in the cluster.
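A minimal sketch of this bookkeeping follows. The class name, the step size of one, and the saturation bounds of -10 and 10 are assumptions; the text does not state the actual limits.

```python
class MasterScores:
    """Saturating quality counters, one per master URL."""

    def __init__(self, urls, low=-10, high=10):
        self.low = low    # assumed lower saturation bound
        self.high = high  # assumed upper saturation bound
        self.scores = {url: 0 for url in urls}

    def record(self, url, success):
        """Raise the score after a successful contact, lower it after a failure."""
        step = 1 if success else -1
        clamped = max(self.low, min(self.high, self.scores[url] + step))
        self.scores[url] = clamped

    def best(self):
        """Return the master with the highest score (first listed wins ties)."""
        return max(self.scores, key=self.scores.get)
```

Because the counters saturate, a master that was unreachable for a long time needs only a bounded number of successes to regain its standing.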
Beginning in Rocks 3.3.0, 411 has the ability to send messages to subsets of the cluster. This facility, called 411 groups, allows us to distribute different files to nodes depending on their type. The group mechanism depends on the client nodes specifying group names in their local 411 configuration file; these are called the client's "registered" groups.
The 411 alerts from the master are not automatically accepted on the basis of the digital signature alone. Now clients ask "am I interested in this message?" before writing the decrypted file locally. Clients ignore all uninteresting messages.
Group names are multi-level, and resemble file paths. By default, every node is a member of the '/' group (corresponding to the traditional top-level 411 group), and the '/Membership' group, where membership is the node membership in the frontend database, such as "Compute" or "NAS".
A sample 411.conf file with several groups looks like this:
<!-- Configuration file for the 411 Information Service -->
<config>
  <master url="http://10.1.1.1/411.d/" score="0"/>
  <group>/blue</group>
  <group>Compute/light</group>
</config>
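For illustration, a file of this shape can be read with Python's standard XML parser. The parse_conf function is a hypothetical helper, not part of 411, and it relies only on the elements and attributes visible in the sample above.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<!-- Configuration file for the 411 Information Service -->
<config>
  <master url="http://10.1.1.1/411.d/" score="0"/>
  <group>/blue</group>
  <group>Compute/light</group>
</config>"""


def parse_conf(xml_text):
    """Return (masters, groups) from the text of a 411.conf-style file.

    masters is a list of (url, score) pairs and groups is a list of
    registered group names, both in file order.
    """
    root = ET.fromstring(xml_text)
    masters = [
        (master.get("url"), int(master.get("score", "0")))
        for master in root.findall("master")
    ]
    groups = [g.text.strip() for g in root.findall("group") if g.text]
    return masters, groups
```

Calling parse_conf(SAMPLE) yields one master with score 0 and the two registered groups "/blue" and "Compute/light".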
Multi-element group names have a simple inheritance model: specific groups imply more general ones. For example, if you are a member of the group "/compute/light", you will automatically be interested in messages in group "/compute/light" and "/compute". You will not be interested in messages from group "/compute/heavy". In this case "/compute/light" is the specific group, and "/compute" is the more general one.
Figure: 411 groups. The client uses registered groups from its local configuration file to filter a stream of offered messages from the master. The messages with the dashed border represent newly changed 411 files on the master, the solid messages at the bottom have been chosen by the client. Note that group "/compute/light" implies "/compute".
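The inheritance rule can be expressed as a prefix test on path elements. The sketch below is an illustration, not the 411 implementation: the function names are hypothetical, names are compared case-sensitively, and a leading slash is treated as optional, which are assumptions. Under this rule the top-level "/" group matches any registered group.

```python
def split_group(name):
    """Split a path-like group name into elements, ignoring extra slashes."""
    return [part for part in name.split("/") if part]


def implied_groups(registered):
    """List every group implied by one registered group, general to specific."""
    parts = split_group(registered)
    names = ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return ["/"] + names


def is_interested(registered_groups, message_group):
    """True if the message's group equals or generalizes a registered group."""
    wanted = split_group(message_group)
    for name in registered_groups:
        have = split_group(name)
        if have[:len(wanted)] == wanted:
            return True
    return False
```

A client registered in "/compute/light" is therefore interested in "/compute/light" and "/compute", but not in "/compute/heavy".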
411put [--411dir=dir] [--urldir=dir] [--see] [--noalert] [--alert=channel] [--411name] [--pub] [--priv] [--comment=char] [--chroot=dir] [--chroot-here] [--group=group] file1 file2 ...
Encrypts and publishes files using the 411 secure information service. Will send a multicast message to client nodes by default, alerting them of a changed file.
The following options are available:
--chroot=dir Turn "dir" into the root directory of the destination file. This allows files to be located in a different place on the master and clients.
Example: 411put --chroot=/var/411/groups/compute /var/411/groups/compute/etc/passwd
Will put "/var/411/groups/compute/etc/passwd" on compute nodes as "/etc/passwd".
--chroot-here A convenience option, equivalent to --chroot=$PWD.
--group=name A 411 group for this file. Clients will ignore 411 messages in groups which they are not a part of. Allows 411 files to be published to a subset of the cluster. Name is path-like: "Compute/green", or "/Compute/green". Spaces are ok: "a space/yellow" is a valid group name as well.
--comment The comment character for this file. Used to place a descriptive header without disrupting normal operations. Often set to "#". Default is none.
--411dir The local directory to place encrypted 411 messages. Defaults to "/etc/411.d/". Be careful about the permissions of this directory.
--urldir The web directory where 411 messages are available. Defaults to "/411.d/".
--see Shows the encrypted file contents on stdout.
--noalert Suppresses alert message.
--alert Specifies the alert channel, which can be multicast or unicast UDP. Defaults to the Ganglia channel (239.2.11.71).
--411name Prints the 411 message name for the file. Provided for convenience.
--pub The location of the cluster public RSA key. Defaults to a 1024 bit key in "/etc/security/cluster-public-key.pem". This file should have permissions 0444 (read by all) and be owned by root.
--priv The location of the cluster private RSA key. Defaults to a 1024 bit key in "/etc/security/cluster-private-key.pem". This file should be owned by root and have permissions 0400 (read only by root).
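The path mapping performed by the --chroot option described above can be sketched as follows. The destination_path function is a hypothetical helper that only models the documented behavior; how 411put handles a file lying outside the chroot directory is not stated, so raising an error here is an assumption.

```python
import posixpath


def destination_path(source_path, chroot=None):
    """Return where a file published from source_path lands on a client."""
    source = posixpath.normpath(source_path)
    if chroot is None:
        return source  # without --chroot the path is the same on both sides
    root = posixpath.normpath(chroot)
    if root == "/":
        return source
    if not source.startswith(root + "/"):
        raise ValueError("file is not inside the chroot directory")
    return source[len(root):]
```

With the example above, the file "/var/411/groups/compute/etc/passwd" published with --chroot=/var/411/groups/compute maps to "/etc/passwd".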
411get [--all] [--master=url] [--conf] [--pub] [--priv] [file]
Retrieves and decrypts 411 messages. Prints the resulting file to standard output. When invoked with no files, 411get lists the available 411 messages.
The following options are available:
--master The URL of a 411 master server to use. Defaults to "http://10.1.1.1/411.d/" or whatever is present in "/etc/411.conf". If given, this master takes precedence over those listed in the configuration file.
--conf The configuration file to use. Defaults to "/etc/411.conf".
--all Retrieves and writes all available 411 messages from the most attractive master. Does not print output to stdout, nor ask for confirmation before overwriting files.
--pub The location of the cluster public RSA key. Defaults to "/etc/security/cluster-public-key.pem".
--priv The location of the cluster private RSA key. Defaults to "/etc/security/cluster-private-key.pem".
To cause a compute node to pull all 411 files from its favorite master server, run the following command as root on the compute node:
# 411get --all
The master servers, along with their quality score, are listed in the "/etc/411.conf" file on compute nodes.
The 411 service is intended to provide a secure and scalable alternative to NIS. In the future we plan to allow multiple levels of master servers so files can be defined from many locations, enabling on-the-fly addition of users and similar changes, with an eye on Grid Services. We hope this new service is useful to you, and look forward to your comments.