Here you will find the personal blogs of Samba developers (for those that keep them). More information about members can also be found on the Samba Team page.
Planet Samba
May 14, 2012
Kai
Playing with POSIX pipes in Python
Recently I was faced with an external program that I wanted to call from my script that only writes its output to a file, not to stdout. Faced with having to call this program a lot of times in parallel, I decided to fake up its output files via POSIX FIFO pipes.
Unfortunately the Python API around FIFOs is pretty close to the POSIX API, so it feels a bit un-pythonish. The following post illustrates my approach to getting around this limitation.
Workload
In order to simulate my workload, I came up with the following simple script called pipetest.py that takes an output file name and then writes some text into that file.
#!/usr/bin/env python
import sys

def main():
    pipename = sys.argv[1]
    with open(pipename, 'w') as p:
        p.write("Ceci n'est pas une pipe!\n")

if __name__ == "__main__":
    main()
The Code
In my test, this "file" will be a FIFO created by my wrapper code. The implementation of the wrapper code is as follows; I will go over the code in detail further down in this post:
#!/usr/bin/env python
import tempfile
import os
from os import path
import shutil
import subprocess

class TemporaryPipe(object):
    def __init__(self, pipename="pipe"):
        self.pipename = pipename
        self.tempdir = None

    def __enter__(self):
        self.tempdir = tempfile.mkdtemp()
        pipe_path = path.join(self.tempdir, self.pipename)
        os.mkfifo(pipe_path)
        return pipe_path

    def __exit__(self, type, value, traceback):
        if self.tempdir is not None:
            shutil.rmtree(self.tempdir)

def call_helper():
    with TemporaryPipe() as p:
        script = "./pipetest.py"
        subprocess.Popen(script + " " + p, shell=True)
        with open(p, 'r') as r:
            text = r.read()
        return text.strip()

def main():
    call_helper()

if __name__ == "__main__":
    main()
Code in Detail
So let's look at the code in more detail. The code I'm using relies on a bunch of libs from the Python standard library, and works with Python 2.6 and up.
- tempfile is used to get a temporary directory for me to create the FIFO in.
- os has the os.mkfifo() call.
- os.path handles the path crunching required.
- shutil is used to remove the temporary directory after use.
- subprocess is used to run the workload script.
TemporaryPipe class
Next comes the nifty part, a context manager object handling the creation and removal of the temporary FIFO pipe. Let's look at the class in detail.
class TemporaryPipe(object):
    def __init__(self, pipename="pipe"):
        self.pipename = pipename
        self.tempdir = None
The class definition and the constructor don't really hide anything interesting, though it's worth noting that self.tempdir is set to None. That will make the clean-up easier further down.
__enter__
    def __enter__(self):
        self.tempdir = tempfile.mkdtemp()
        pipe_path = path.join(self.tempdir, self.pipename)
        os.mkfifo(pipe_path)
        return pipe_path
The __enter__(self) function is the set-up code for the context manager. Here, a temporary directory is created. Afterwards, os.mkfifo() creates the FIFO. Finally, the pipe's path is returned.
__exit__
    def __exit__(self, type, value, traceback):
        if self.tempdir is not None:
            shutil.rmtree(self.tempdir)
The __exit__(self, type, value, traceback) function is always called when the context manager's block is exited. Thus, it's the ideal place to run the clean-up, in our case removing the temporary directory and the pipe contained within it.
shutil.rmtree() takes care of this just fine. If mkdtemp() failed, we don't have to bother, of course. Our clean-up doesn't require any extra knowledge of the things we're cleaning up, so we're free to ignore all those parameters.
The call_helper Function
def call_helper():
    with TemporaryPipe() as p:
        script = "./pipetest.py"
        subprocess.Popen(script + " " + p, shell=True)
        with open(p, 'r') as r:
            text = r.read()
        return text.strip()
Because TemporaryPipe is a context manager, it's usable from a with statement. This means that inside the with TemporaryPipe() as p block, there is a temporary directory containing a FIFO pipe. Because __enter__() returns the pipe's path, that path is assigned to p within the block.
subprocess.Popen() is now used to run the workload script, going via a shell to evaluate the hashbang line. This probably isn't the smartest idea performance-wise, but this is proof-of-concept code after all.
After the workload script has been started, another with statement opens a new block using the pipe's path, opening the FIFO for reading. The text is read out and the trailing newline stripped. Now, the return statement returns the read text, and also causes the pipe's context manager to call the __exit__() function to clean up.
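For comparison, here is a minimal sketch of the same idea that skips the shell. It is not part of the original wrapper: the temporary_pipe() generator, the call_helper_no_shell() name and the inline WRITER stand-in for pipetest.py are all made up for illustration, and it assumes Python 3 on a POSIX system. Passing the arguments as a list avoids shell quoting issues, and child.wait() reaps the writer once the FIFO has been drained.

```python
import contextlib
import os
import shutil
import subprocess
import sys
import tempfile

# Hypothetical stand-in for pipetest.py, run via "python -c" so the
# example does not depend on a script file being present.
WRITER = '''
import sys
with open(sys.argv[1], "w") as p:
    p.write("Ceci n'est pas une pipe!\\n")
'''

@contextlib.contextmanager
def temporary_pipe(pipename="pipe"):
    """Create a FIFO in a fresh temporary directory, remove both on exit."""
    tempdir = tempfile.mkdtemp()
    try:
        pipe_path = os.path.join(tempdir, pipename)
        os.mkfifo(pipe_path)
        yield pipe_path
    finally:
        shutil.rmtree(tempdir)

def call_helper_no_shell():
    with temporary_pipe() as p:
        # Argument list instead of a shell command line.
        child = subprocess.Popen([sys.executable, "-c", WRITER, p])
        # Opening the FIFO for reading blocks until the writer opens it;
        # read() returns once the writer has closed its end.
        with open(p, "r") as r:
            text = r.read()
        child.wait()
        return text.strip()

if __name__ == "__main__":
    print(call_helper_no_shell())
```

One caveat of this sketch: if the child dies before opening the FIFO, the open() for reading blocks forever, so real code would want some form of timeout or error handling around it.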
Conclusions
I'm pretty content with the way the call_helper() function reads. The complexity of setting up and then cleaning up the FIFO is hidden away in the TemporaryPipe class. I spent a bit of time coming up with this, so I thought I'd share this solution with other people. Now I just need to add this to my utility library and write tests for it.
May 06, 2012
Andreas
CM9 (Android 4.0 ICS) and deep sleep
I’ve had the problem that the device didn’t want to switch into deep sleep mode if the radio was on. What is deep sleep? To keep it simple, let’s break it down. Your device has 3 modes: “Screen On and Awake”, “Awake” and “Deep Sleep”. If you are using your device, you’re in the first mode and obviously need a lot of battery. The second, “Awake”, means it is doing some background work: checking for calls, checking emails, syncing contacts. The last one, Deep Sleep, means it goes for some time into a mode where it uses almost no battery. If you don’t do anything and your phone is in your pocket, you want it to be in Deep Sleep mode most of the time.
My HTC Wildfire S didn’t want to go into the “Deep Sleep” mode at all if the radio was turned on. It worked in Airplane mode. I thought this had something to do with RIL, but I was wrong. Actually it was a Bluetooth wakelock. The wakelock “msm_serial_hs_dma” was held all the time. The problem is that the msm7227 platform doesn’t support quick switch-on/off of the Bluetooth module, and you need to deactivate it with an overlay, or else ICS always tries to trigger it.
So adding
<bool name="config_bluetooth_adapter_quick_switch">false</bool>
to overlay/frameworks/base/core/res/res/values/config.xml fixed the problem and the wakelock was gone.
April 23, 2012
Andreas
libhtc_ril.so and segfaults
If you try to get a new Android version, in this case CyanogenMod9, working on your old phone you have to deal with binary blobs. One of these blobs is the library talking to the radio, libhtc_ril.so.
I wanted to document what I learned about libhtc_ril.so. I wanted to get the library version matching my baseband version working with cm9. This resulted in several segfaults. So I started to strace the rild process to find out what was going wrong, which permissions were missing, etc. The library doesn’t check return values, so it segfaults. One of these segfaults was caused by a missing kernel interface called usb_function_switch. The file should be in /sys/devices/platform/msm_hsusb/usb_function_switch. I implemented that function in the kernel, but it still segfaulted and I had no idea what to do next. Today I analyzed the RADIO logs and stumbled upon:
D/RILJ ( 328): [0100]> SCREEN_STATE: false
D/HTC_RIL ( 1360): ril_func_screen_state_notified():called
D/HTC_RIL ( 1360): ril_func_screen_state_notified():Not found 'ether:' in USB_STATE_PATH
As it segfaulted directly after closing /sys/devices/platform/msm_hsusb/usb_function_switch, it smelled like it expected to find something like:
ether:disable
I dived into the code and found out that in my kernel tree it was called rndis, while in the HTC kernel tree it was called ether. Once I fixed that and added the other values of /sys/devices/platform/msm_hsusb/usb_function_switch, it started to work just fine. I hope this post will help other developers with similar problems.
This is the full set of the usb_function_switch:
ether:disable
accessory:disable
usb_mass_storage:enable
adb:enable
cdc_ethernet:disable
diag:disable
modem:disable
serial:disable
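When debugging this kind of mismatch, it can help to check quickly which function names a given kernel exposes. The following is a small sketch only: parse_function_switch() is a made-up helper, and it assumes the file consists of whitespace-separated name:state tokens as in the listing above (the exact separator used by the kernel is not stated in this post).

```python
def parse_function_switch(text):
    """Parse "name:enable" / "name:disable" tokens into a dict of booleans."""
    states = {}
    for token in text.split():
        name, sep, state = token.partition(":")
        if not sep:
            # Skip anything that is not of the form name:state.
            continue
        states[name] = (state == "enable")
    return states

# Values copied from the listing above.
SAMPLE = """ether:disable accessory:disable usb_mass_storage:enable
adb:enable cdc_ethernet:disable diag:disable modem:disable serial:disable"""

if __name__ == "__main__":
    states = parse_function_switch(SAMPLE)
    # The RIL library looks for an "ether" entry, so check that it exists.
    print("ether" in states, sorted(n for n, on in states.items() if on))
```

On a device, the text would come from reading /sys/devices/platform/msm_hsusb/usb_function_switch instead of the SAMPLE string.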
April 12, 2012
Andreas
CM9 on Marvel (HTC Wildfire S)
After Qualcomm released new graphics blobs for ARMv6, I was able to get CyanogenMod 9 working on my HTC Wildfire S pretty well. There are still some problems which need to be fixed. GPS isn’t working, and if you have GSM/3G turned on the battery drains pretty fast. I’m currently trying to get the camera working. There is also a wakelock bug with Bluetooth in the kernel right now.
If you’re a developer working on a msm7x27 device and are interested in working together, join #cyanogenmod-msm7x27 @ freenode.
You can find my work at http://git.cryptomilk.org/
March 31, 2012
Kai
Samba4 DNS sprint, day 5 summary
Another long and only partially successful day is behind me, and my allocated time for this sprint is over. I said "partially successful", because I did not manage to get GSS-TSIG working. This is mostly due to the fact that I don't understand how to hook it up to GENSEC/gss on the Samba side. The API is a bit confusing to the uninitiated. What I did get done was to get to a point where incoming TKEY messages are parsed and checked, and pretty much handled correctly. We currently bail out of there with a BADKEY error, pretending the client's key didn't work. If someone with a reasonable grasp of GENSEC would explain what I need to do there to get the GSSAPI blob from the client authenticated, I would expect GSS-TSIG is very, very close.
Because it's the end of the week let me take a look at the high and low points of this sprint over the week:
- High point: On Tuesday morning, I finally got forwarding sorted out. Ever since Tuesday, all DNS requests on my dev machine were handled by my local samba server.
- Low point: I wasted most of Tuesday trying to debug my HMAC-MD5 signing code. Debugging crypto is hard, because the only debug tool available is "stare at the code and think very hard". This might be the weapon of choice of the kernel community, but certainly not my preferred way of doing things.
- High point: On Wednesday morning, I managed to fix signing of TSIG requests.
- Low point: This got me to work on TSIG some more instead of moving on to GSS-TSIG, and ultimately failed because signing of TSIG replies doesn't work correctly yet. Another day wasted.
- Low point: After reading up on TKEY and GSS-TSIG, I realized that I didn't really understand what I had to do in Samba to get this sorted out. This ended up being a major stumbling block, in fact I'm still stuck there.
- High point: During my attempts to find a useful test for TKEY, I set up a Win7 client for my domain, and after a tiny fix to get PTR records handled in the update code, that machine would correctly register forward and reverse zones (without crypto, but also without complaining), and was perfectly happy using samba's DNS service for its needs.
So to sum up, forwarding turned out to be a neater feature than I initially expected it to be, and allows me to run samba as my main name server for the local network. On the negative side, all that fancy crypto stuff isn't working yet. I do feel that none of these is really far off anymore. Maybe another pair or two of eyes would help there. I've updated the Samba Wiki DNS page to reflect the current status.
March 30, 2012
Kai
Samba DNS sprint, day 4 summary.
I'm still a bit stuck with TKEY/TSIG, unfortunately. While looking at the GSS-TSIG implementation we have in libaddns, I realized that I could simplify my time handling. That ended up fixing my TSIG issues from yesterday. That is, I can now correctly generate the client/request side of a HMAC-MD5 TSIG. The server side still seems broken, at least I can't get dig to accept my reply signature, and if I query bind the server reply differs from what I would calculate for it. Oh well.
I've looked at plain TKEY, but for now it doesn't really seem worth the effort. So I've decided to work on GSS-TSIG directly instead. I don't really know how to deal with the Gensec side of this, though, so it's a bit hard to keep the momentum going for this. I'm beginning to fear that I won't get this implemented this week. Not because any part of it was particularly hard, but because there's tons of little things that all take a couple of minutes. And of course sitting in front of the computer alone lone ranger style isn't the most fun way to develop software.
For tomorrow, I hope to get a bit more done than today. I'll be working on a little gss-tsig test utility based on libaddns that I can use to test my server implementation. That should at least allow me to figure out what's going on at specific steps. I still might need some help on the Gensec side.
March 29, 2012
Kai
Samba DNS sprint, day 3 summary
Some progress on the TSIG front, but I'm stuck with the exact signing method for a packet. For some reason dig and I disagree on what the HMAC-MD5 of a specific query should be. The RFC is a bit vague, and the BIND code of that area seems to be in assembler. (Ok, it's C, but their coding conventions differ so much from ours that I probably have to spend a week getting my brain to adjust to that)
So I'm not continuing on hmac-md5 support, but will instead look at GSS-TSIG directly today. That's the must-have feature, and the whole week would be wasted if I didn't get that in.
TL;DR: HMAC-MD5-TSIG stupid, working on GSS-TSIG now.
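For readers who want to poke at the HMAC-MD5 part in isolation, here is a minimal Python sketch of signing and verifying a blob of bytes. It only shows the MAC primitive: RFC 2845 also specifies which fields go into the digest, and that is not modelled here. The sign() and verify() helpers, the all-zero key and the placeholder message are assumptions for illustration, not Samba code.

```python
import hashlib
import hmac

def sign(key, message):
    """Return the raw 16-byte HMAC-MD5 of message under key."""
    return hmac.new(key, message, hashlib.md5).digest()

def verify(key, message, mac):
    """Constant-time comparison of a received MAC with the expected one."""
    return hmac.compare_digest(sign(key, message), mac)

if __name__ == "__main__":
    key = b"\x00" * 16  # hypothetical debugging key, never use in production
    message = b"placeholder for the bytes that get signed"
    mac = sign(key, message)
    print(len(mac), verify(key, message, mac))
```

If two implementations disagree on the MAC, feeding both the same key and dumping the exact bytes each one hashes is usually the quickest way to find the difference.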
Rusty
1 Week to Go, and Rusty Goes Offline
Just as the Linux kernel merge window closes, I’m going offline. My wedding is exactly a week away, but I’ll be entertaining guests and doing final preparation. I’ll be back from our honeymoon and wading through mail on the 7 May.
Alex’s “A Bald Target” campaign to raise awareness for TimeForKids has been a huge success, even though we’re currently far short of the hair-shaving goal. She’s been on one of the local radio stations, with newspaper coverage expected this weekend; two local TV stations want to cover the actual shave if it happens. The charity is delighted with the amount of publicity they have received; given that they need local people to volunteer to mentor the disadvantaged children, that’s worth at least as much as the money.
Special thanks to a couple of people who donated direct to the charity, to avoid causing baldness! And yes, if we were starting again, having competing “shave” vs “save” campaigns would have been awesome…
Sources of Randomness for Userspace
I’ve been thinking about a new CCAN module for getting a random seed. Clearly, /dev/urandom is your friend here: on Ubuntu and other distributions it’s saved and restored across reboots, but traditionally server systems have lacked sources of entropy, so it’s worth thinking about other sources of randomness. Assume for a moment that we mix them well, so any non-randomness is irrelevant.
There are three obvious classes of randomness: things about the particular machine we’re on, things about the particular boot of the machine we’re on, and things which will vary every time we ask.
The Machine We’re On
Of course, much of this is guessable if someone has physical access to the box or knows something about the vendor or the owner, but it might be worth seeding this into /dev/urandom at install time.
On Linux, we can look in /proc/cpuinfo for some sources of machine info: for the 13 x86 machines my friends on IRC had in easy reach, we get three distinct values for cpu cores, three for siblings, two for cpu family, eight for model, six for cache size, and twelve for cpu MHz. These values are obviously somewhat correlated, but it’s a fair guess that we can get 8 bits here.
Ethernet addresses are unique, so I think it’s fair to say there’s at least another 8 bits of entropy there, though often devices have consecutive numbers if they’re from the same vendor, so this doesn’t just multiply by number of NICs.
The amount of RAM in the machine is worth another two bits, and the other kinds of devices (eg. from trolling /sys/devices) can be expected to give another few bits, even in machines which have fairly standard hardware settings like laptops. Alternately, we could get this information indirectly by looking at /proc/modules.
Installed software gives a maximum three bits, since we can assume a recent version of a mainstream distribution. Package listings can also be fairly standard, but most people install some extra things so we might assume a few more bits here. Ubuntu systems ask for your name to base the system name on, so there might be a few bits there (though my laptop is predictably “rusty-x201″).
So, let’s have a guess at 8 + 7 + 2 + 3 + 3 + 2 + 2, ie. 27 bits from the machine configuration itself.
Information About This Boot
I created an upstart script to reboot (and had to hack grub.conf so it wouldn’t set the timeout to -1 for next boot), and let it loop for a day: just under 2000 times in all. I eyeballed the graphs of each stat I gathered against each other, and there didn’t seem to be any surprising correlations. /proc/uptime gives a fairly uniform range of uptime values within a range of 1 second, at least 6 bits there (every few dozen boots we get an fsck, which gives a different range of values, but the same amount of noise). /proc/loadavg is pretty constant, unfortunately. bogomips on CPU1 was fairly constant, but for the boot CPU it looks like a standard distribution within 1 bogomip, in increments of 0.01: say another 7 bits there.
So for each boot we can extract 13 bits from uptime and /proc/cpuinfo.
Things Which Change Every Time We Run
The pid of our process will change every time we’re run, even when started at boot. My pid was fairly evenly divided on every value between 1220 and 1260, so there’s five bits there. Unfortunately on both 64 and 32-bit Ubuntu, pids are restricted to 32768 by default.
We can get several more bits from simply timing the other randomness operations. Modern machines have so much going on that you can probably count on four or five bits of unpredictability over the time you gather these stats.
So another 9 bits every time our process runs, even if it’s run from a boot script or cron.
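To make the three classes concrete, here is a rough Python sketch that feeds some of the sources mentioned above into a hash. It is only an illustration of "mixing them well" under the assumption that SHA-256 is an acceptable mixer; it is not the proposed CCAN module, the gather_seed() and read_if_present() names are made up, and it is no substitute for /dev/urandom.

```python
import hashlib
import os
import time

def read_if_present(path):
    """Return the file's bytes, or an empty string if it cannot be read."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except (IOError, OSError):
        return b""

def gather_seed():
    """Mix machine, boot and per-run sources into a 64-character hex seed."""
    start = time.time()
    h = hashlib.sha256()
    # Things about the particular machine we're on.
    for path in ("/proc/cpuinfo", "/proc/modules"):
        h.update(read_if_present(path))
    # Things about this particular boot.
    for path in ("/proc/uptime", "/proc/loadavg"):
        h.update(read_if_present(path))
    # Things which vary every time we run: our pid and how long
    # gathering the other sources took.
    h.update(str(os.getpid()).encode("ascii"))
    h.update(repr(time.time() - start).encode("ascii"))
    return h.hexdigest()

if __name__ == "__main__":
    print(gather_seed())
```

Missing files are silently skipped, so the sketch also runs on systems without /proc, just with fewer inputs.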
Conclusion
We can get about 50 bits of randomness without really trying too hard, which is fine for a random server on the internet facing a remote attacker without any inside knowledge, but only about five of these bits (from the process’ own timing) would be unknown to an attacker who has access to the box itself. So /dev/urandom is still very useful.
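A quick tally of the estimates above, as a sketch: the figures are copied from the text, while the grouping into lists and the variable names are made up for illustration.

```python
# Per-class estimates in bits, as quoted in the sections above.
machine_bits = [8, 7, 2, 3, 3, 2, 2]   # machine configuration
boot_bits = [6, 7]                     # uptime and boot-CPU bogomips
per_run_bits = [5, 4]                  # pid and timing of the gathering itself

total_bits = sum(machine_bits) + sum(boot_bits) + sum(per_run_bits)

if __name__ == "__main__":
    print(sum(machine_bits), sum(boot_bits), sum(per_run_bits), total_bits)
    # prints: 27 13 9 49
```

That is where the "about 50 bits" comes from.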
On a related note, Paul McKenney pointed me to a paper (abstract, presentation, paper) indicating that even disabling interrupts and running a few instructions gives an unpredictable value in the TSC, and inserting a usleep can make quite a good random number generator. So if you have access to a high-speed, high-precision timing method, this may itself be sufficient.
March 28, 2012
Kai
Samba4 DNS sprint, day 2 summary
I actually spent my time working out some smaller kinks in the DNS server that I ran into while using it as the only DNS server on my development machine. I also started with restructuring my dns processing code a bit so I can handle TSIGs in a sensible way. I've got dig set up to send TSIGs with an all-0 hmac key, so for tomorrow I should be ready to go.
Oh, and I pushed my dns forwarder work to master, and it passed autobuild. Life is good.
March 27, 2012
Kai
Samba4 DNS sprint, day 2
Ok, so I cheated a bit and kept poking at the DNS forwarder code a bit more yesterday after posting my summary. I didn't quite get anywhere final before I went to bed, but this morning, while waiting for my coffee to run through the machine, I got this thing set up. I now can forward requests the internal server doesn't feel responsible for to another DNS server and get the reply back to the client. :) It's not quite production-ready code, but it sure works well enough to switch my DNS settings on my development machine to use Samba DNS.
That makes today TSIG-day. Time to re-read RFC2845 and see if I can get this implemented in my test client.
March 26, 2012
Kai
Samba4 DNS sprint, day 1 summary
Ok, of course this didn't go as planned. It took longer than expected to figure out how to best test my DNS library, which by itself seems to work ok but also only is a thin wrapper around tdgram, so it doesn't do anything fancy yet.
I played with getting some code into the server, but I think I'm not quite doing the right thing there yet. I've set myself a deadline until tomorrow 11:00, if I haven't got it by then, I'm back to TSIG et al.
All in all, I notice that with all the python programming I've been doing recently, my C-fu has rusted a bit. I hope today will prove to be the WD-40 I needed to get going again. :)
Oh well, enough for today, more Samba DNS work will come tomorrow.
Samba4 DNS sprint, day 1
Samba has its own small DNS server built in, but it's still lacking a couple of very nice-to-have features. This week, I'll be trying to get as many of those in as possible. There are two big parts here. One is getting forwarder support, so we can query other name servers on behalf of our clients. The other big item is getting signed updates to work so Windows clients can sign their dynamic update requests. My battle plan for this week is:
- Have a quick stab at a really simple forwarder library, but fall back to running dnsmasq with forwarding set up if I don't get anywhere until early afternoon today
- Implement shared secret TSIG updates, to get the TSIG logic sorted out
- Implement TKEY exchanges as specified in RFC2930, to set up the TKEY handling infrastructure
- Make GSS-TSIG work as a possible signing method, so Windows is happy finally
- More work on the forwarder library if needed/I have the time
March 21, 2012
Andreas
Synchronize two folders on a Mac and other Unix Systems with csync
I’ll show you how to synchronise files of two different directories in a terminal using a mighty automator. The tool is called csync and is a client-side file synchronizer. Unlike rsync, it syncs in both directions, so that the contents are equal as soon as it has finished.
Here is the simple example of syncing two folders from terminal:
csync /path/to/folder1 /path/to/folder2
If you run it for the first time, this line compares both directories and copies the files missing on either side to the other one, so in the end they are equal. If you delete a file in folder1 and run it again, csync will notice that the file has been deleted in folder1 and will delete it in folder2. If you create a new file in folder2 and run csync, it will copy the new file to folder1. If a file has changed, it will detect that and copy the file to the other folder. If a file has been changed on both sides, the newer file wins.
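The rules above can be written down as a toy decision function. This is only a model of the behaviour described in this post, not csync's actual implementation: the decide() function, its parameters and the returned strings are all made up for illustration, and edge cases such as a file deleted on one side but modified on the other are ignored.

```python
def decide(a_mtime, b_mtime, last_synced):
    """Pick an action for one file.

    a_mtime / b_mtime: modification time in folder A / B, None if missing.
    last_synced: modification time recorded at the previous sync,
                 None if the file was never seen before.
    """
    if a_mtime is None and b_mtime is None:
        return "nothing"
    if a_mtime is None:
        # Known from an earlier run: it was deleted in A. Otherwise it is new in B.
        return "delete from B" if last_synced is not None else "copy B -> A"
    if b_mtime is None:
        return "delete from A" if last_synced is not None else "copy A -> B"
    if a_mtime == b_mtime:
        return "nothing"
    # Changed on one or both sides: the newer file wins.
    return "copy A -> B" if a_mtime > b_mtime else "copy B -> A"

if __name__ == "__main__":
    print(decide(10, None, None))   # new file in A
    print(decide(None, 10, 10))     # deleted in A since the last sync
    print(decide(20, 30, 10))       # changed on both sides
```

The last_synced argument stands in for whatever record of the previous run the tool keeps, which is what lets it tell a deletion apart from a newly created file.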
The options are pretty simple and don’t need further documentation here. The only interesting option is an additional exclude list. The default one can be found in ‘~/.csync/csync_exclude.conf’.
You can always check the manual of csync by typing “man csync” in a terminal.
The current stable version supports the SMB (Windows sharing) protocol and SFTP.
If you want to synchronize a local folder with a folder on another unix machine you can use the following command:
csync /path/to/my/music/collection sftp://my.notebook/home/me/my/music/collection
and it will do the same as stated above, but over an SFTP network connection. SFTP is the file transfer protocol based on SSH, and Unix machines normally have it enabled by default.
We are currently improving csync and adding support for OwnCloud. A graphical Qt based frontend for csync is mirall.
This post is inspired by this one
March 16, 2012
Kai
Running Samba's autobuild.py
Samba has a lot of tests, and we like to run them often. In order to easily do that, we've got a script that checks out a bunch of repositories and runs all tests in them, in parallel and independent of each other. It's living in the source tree at script/autobuild.py. Here are my notes for running autobuild.py on a local machine.
First, set up an in-memory file system. autobuild.py and the tests run by it touch a lot of files, and not running these tests on a spinning disk will speed things up a lot.
# create the memdisk location
mkdir /memdisk
# default size is half your ram, use -o size=SIZE
# to change that if needed
mount -t tmpfs tmpfs /memdisk
# now create an image file, samba's tests don't like plain tmpfs
# Needs to be bigger than 3 gig
dd if=/dev/zero of=/memdisk/build.img bs=1MiB count=4000
losetup /dev/loop0 /memdisk/build.img
# format as ext2, no need to do journalling
# it's gone when the machine fails anyway
mkfs.ext2 /dev/loop0
# mount
mkdir /memdisk/kai
mount /dev/loop0 /memdisk/kai
chown -R kai:kai /memdisk/kai
And now, I can just run
./script/autobuild.py
and get a coffee while all the tests are run.
March 13, 2012
Michael
Samba Team Visits Microsoft For SMB2.2 Interop Event
In the week of February 27 to March 2, 2012, a few Samba developers accepted an invitation by Microsoft and attended an SMB2.2 testing opportunity at Microsoft's Enterprise Engineering Center in Redmond. Jeremy Allison, Steve French, Volker Lendecke, Chris Hertel, Christian Ambach, Matthieu Patou and I found our way to Redmond, with Stefan Metzmacher participating to some extent via IRC and mumble. For me, the event was a big success, and I am happy that I finally made up my mind to go there. This is my personal report.
Background
Microsoft will ship a new version 2.2 of the SMB protocol with Windows 8. Along with this, a whole new scale-out clustering mode is added. The target of these new features is clearly server workload instead of client workload, the two most prominent applications being virtualization (Hyper-V) and SQL. These two applications, which originally were typical workloads run from SAN storage, can now run directly from SMB2.2, and they can go even further when an RDMA adapter is installed, thanks to the new RDMA support in SMB2.2 called SMB Direct. Other interesting features in SMB2.2 are multi-channel sessions and persistent file handles that can survive server failures without data loss.
These new features were first presented at the Storage Developer Conference in September 2011.
There are preview versions of the documents for these new features available from the msdn library, but they are not complete yet and partly still subject to change. Since February 29, the beta version of Windows 8 can be tested. While the client is freely available, the server variant is only available via MSDN subscriptions. Before this date, only a preview from September 2011 was available that did not yet support many of the announced features.
The Test Setup
Microsoft had established two test environments for us with a domain and Windows 8 clients equipped with some test suites. One network contained a Windows 8 cluster server installation, and the other network was intended for us to integrate our own server implementations to run Microsoft’s test suite against.
After we had trouble accessing our test network the first day, it worked nicely from the second day on and gave us the opportunity to run tests against Windows 8 beta. It was especially useful to run tests against a fully installed Windows 8 cluster.
Signing
Since the beta release, Windows 8 sports a new signing algorithm, aes-cmac, that had not been available in earlier Windows 8 previews. In a joint effort with Jeremy and Metze, we were able to fix the last bugs in the code that Metze had written in the last days before the event (by just looking at the docs). So we now have working signing code against Windows 8 (client and server side), which we did not have before.
Persistent Handles – Test Suites
My focus for the event was on durable (SMB2.0) and persistent (SMB2.2) file handles and clustering features. I gained a better understanding of persistent file handles and I was able to extend our testsuites and add precision to them. I also spotted some bugs in the documentation and a bug in Windows 8 durable-handle-vs-oplock behaviour (a regression from Windows 2008R2). I am still working on extending our testsuites with respect to durable and persistent handles.
It was extremely useful to have several of the core engineers from Microsoft available during the test lab for discussions about product behaviour and the documentation. It helped me to improve my understanding of the new clustering concepts and to deepen my knowledge of durable and persistent file handles.
Persistent Handles – Server Hacking
On the server side, Christian set up a Samba-CTDB cluster in the partner environment and we installed the code from the durable handle work-in-progress-branch that Metze and I are working on, and we started hacking some 2.2 features into it. In the end, we were able to talk SMB 2.2 with the client and offered persistent handles – in a somewhat faked up manner, not being able to give the full set of guarantees attached to persistent handles.
The nice visible success was that we were able to do a transparent node failover of the client copying a dvd image to the samba cluster. We ran “ctdb disable” on the connected node and the client switched over to the other node and kept copying. In the end, the dvd image was complete and its md5sum correct!
You can see this as a record-my-desktop video here.
Note: The new copy progress bar is a really nifty feature in Windows 8.
As background, it is important to know that in Windows 8 clustering, it is the client that implements most of the logic for failover scenarios. The client notices the server outage and reconnects to the new node consciously. It then does a replay of the last write that failed. So the additions on the server side for this first success were reasonably small, given that we already had some (still work-in-progress) support for durable handles.
Summary
Summing up, it was a very successful and productive event. I would also like to emphasize that the Microsoft folks have really been very cooperative, and they are also happily accepting comments on documentation and product behaviour.
The outcome for me is that we are on the right track with our work on durable handles, and it is good to see that some of it is already working. There still is a lot of work needed to get the things right, and we have for instance not even seriously touched RDMA yet, but we have a much clearer picture now. The good bit that has been confirmed now, is that the logic of the new Windows clustering is largely in the client. And we were able to demonstrate that we have a chance to take advantage of it in our CTDB clusters without having to throw the established CTDB-clustering away and implement something completely different.
Hello World!
A blog, again…
March 09, 2012
Rusty
Oh, BTW, I Am Engaged!
A few of my friends saw the LWN coverage of http://baldalex.org, and sent me a note of congratulations. This reveals how incredibly slack I am in maintaining connections with my disparate and distributed friends.
So: I met a wonderful lady, we fell in love, and I proposed on April the 8th last year, at Mt Lofty Gardens overlooking the scenery. She was speechless, delighted, and said yes! In related news, one year later, we are set to marry: 5th April 2012, and 3pm at the McLaren Vale Visitor’s centre. All welcome! (Yes, that’s Easter Thursday).
So I’ll be offline for April, except briefly to post pictures if we meet the target!
February 27, 2012
Rusty
A Plea For Help: Charity
I’m getting married in just over five weeks!
My fiancée is raising money for charity; if we raise $50,000 by the big day, she will shave her head at the wedding. Alexandra has had long hair all her life: she’s terrified but determined, so I’m determined to help.
We’re already asking for donations in lieu of wedding presents, but if you’ve ever wanted to buy me a beer for ipchains, iptables, netfilter, module-init-tools, lguest, CCAN, Rusty’s Unreliable Guides, CALU, or any other reason, I’ll take a $100/$20/$5 donation here instead :)
(Compulsory Facebook page here).
February 12, 2012
Simo
Kerberos: delegation and s4u2proxy
One of the most obscure parts of the Kerberos protocol is delegation. And yet it is a very powerful and useful tool to let "agents" work on behalf of users without fully trusting them to do everything a user or an admin can.
So what is delegation? Simply put, it is the ability to give a service a token that can be used on the user's behalf, so that the service can act as if it were the user.
In FreeIPA, for example, the web framework used to mediate administration of the system is such an agent. The framework on its own has absolutely no privileges over the rest of the system. It interacts almost exclusively with the LDAP server and authenticates to the LDAP server using delegated credentials from the user that is sending in the requests.
This works because Kerberos and GSSAPI make it possible to delegate a user's credentials during the Negotiate exchange that happens at the HTTP layer when a user contacts the web server and authenticates to it.
How does it work?
Before we answer this question we have to take a step back and explain what kinds of delegation are possible. Historically, only one very inflexible kind of delegation was really implemented in standard Kerberos implementations like MIT's or Heimdal's: the full delegation (transmission) of the user's krbtgt to the target service.
This kind of delegation is perfect for services like SSH, where users want to have full access to their own credentials after they have jumped to the target host, and they generally remain in full control of them.
The drawback of this method is that by transmitting the full krbtgt we are now giving another host potential access to each and every service our user has access to. And while that is "powerful", it is also overly broad in many other situations. The other, minor issue is that KDCs normally do not have fine-grained authorization attached to this feature, meaning that a user (or, more generally, a program acting on the user's machine) can delegate these credentials to any service in the network, without much control from admins.
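To make the breadth of full delegation concrete, here is a minimal sketch in Python. It is a toy model only: ToyKDC, its methods and the service names are made up for illustration, and no real Kerberos messages or cryptography are involved. It assumes that whoever holds the user's forwarded krbtgt is treated by the KDC exactly like the user.

```python
# Toy model only: ToyKDC and the service names are hypothetical,
# and nothing here talks to a real KDC.

class TicketRefused(Exception):
    """Raised when the toy KDC refuses to issue a service ticket."""


class ToyKDC(object):
    """A stand-in KDC that knows which services each user may access."""

    def __init__(self, user_access):
        # user_access maps a user name to the set of services it may use.
        self.user_access = user_access

    def issue_ticket(self, tgt_owner, target_service):
        """Issue a service ticket to whoever presents tgt_owner's krbtgt."""
        if target_service not in self.user_access.get(tgt_owner, set()):
            raise TicketRefused("{0} may not use {1}".format(
                tgt_owner, target_service))
        return (tgt_owner, target_service)


def reachable_with_forwarded_tgt(kdc, user, candidates):
    """List the services a host holding the user's krbtgt can reach."""
    reachable = []
    for service in candidates:
        try:
            kdc.issue_ticket(user, service)
        except TicketRefused:
            continue
        reachable.append(service)
    return reachable
```

In this model the host that received the krbtgt can reach every service the user can, because the KDC has no way to tell the two apart.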
Enter S4U constrained delegation
Luckily for us, Microsoft introduced a new type of "constrained" delegation, normally referred to as S4U. This is an extension to the age-old Kerberos delegation method and adds two flavors of delegation, each depending on the KDC for authorization; they are called Service-for-User-to-Self (S4U2Self) and Service-for-User-to-Proxy (S4U2Proxy).
Service-for-User-to-Self
S4U2Self allows a service to get a ticket for itself on behalf of a user, or in other terms it allows the service to get a ticket as if the user had requested it from a KDC using their krbtgt and then contacted the service.
This option may seem of little use: why would a service care for a ticket to itself? If it is asking for it, it already knows the identity of the user and can operate on the user's behalf, right? Wrong.
There are at least three aspects that make this function useful. First of all, you get the KDC to give you a ticket and therefore validate that the user identity actually exists and is active. Second, the KDC may attach a MS-PAC (or other authorization data) to the ticket, allowing the service to learn authorization information about the user from an authoritative source. Finally, it may allow the service to take further actions on behalf of a user by using S4U2Proxy constrained delegation on top.
All this is possible only if the KDC allows the specific service to make S4U2Self requests. This is an additional layer of authorization that is very useful to admins, as it allows them to limit which services can use this feature.
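The checks described above can be sketched as a toy model in Python. Everything here is hypothetical (the ToyKDC class, its fields, the group list standing in for authorization data); it only illustrates the order of decisions, not the real protocol or any real KDC API.

```python
# Toy model only: names and data structures are made up for illustration.

class S4U2SelfDenied(Exception):
    """Raised when the toy KDC refuses an S4U2Self request."""


class ToyKDC(object):
    def __init__(self, active_users, s4u2self_services, groups):
        self.active_users = set(active_users)
        # Services the admin has allowed to make S4U2Self requests.
        self.s4u2self_services = set(s4u2self_services)
        # Stand-in for authorization data such as a MS-PAC.
        self.groups = groups

    def s4u2self(self, requesting_service, user):
        """Return a ticket for requesting_service in the name of user."""
        if requesting_service not in self.s4u2self_services:
            raise S4U2SelfDenied("service may not use S4U2Self")
        if user not in self.active_users:
            raise S4U2SelfDenied("unknown or inactive user")
        return {
            "client": user,
            "server": requesting_service,
            "authz_data": sorted(self.groups.get(user, [])),
        }
```

A service that gets a ticket back has learned, from the KDC rather than from its own guesswork, that the user exists and which authorization data applies to them.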
Service-for-User-to-Proxy
S4U2Proxy is the actual method used to perform impersonation against a third service. To use S4U2Proxy, a service A that wants to authenticate to service B on behalf of user X contacts the KDC and presents a ticket for A from user X (this could also be a ticket obtained through S4U2Self) as evidence that user X did in fact contact service A. The KDC can now make authorization decisions about whether to allow service A to get a ticket for service B in the name of user X. Normally admins will allow this operation only for services that are authorized "proxies" to other services.
In FreeIPA we just switched to using S4U2Proxy in order to reduce the attack surface against the web framework. By using S4U2Proxy we do not need the user to delegate a full krbtgt to us. This way the web framework can operate against the LDAP server and no other service in the domain.
These two delegation methods are now available in both MIT's and Heimdal's Kerberos implementations. In MIT's case (which is the implementation we use in FreeIPA) it is really possible to use these features only if you use an LDAP back-end (or in general a custom back-end that implements the necessary kdb functions). The native back-end does not have support for these features, because it lacks meaningful grouping methods and access control facilities to control them.
In coding up the support for FreeIPA we ended up fixing a few bugs in MIT's implementation; the fixes will hopefully be available for general use in 1.11 (we have backported the patches to RHEL and Fedora). We also had to modify the Apache mod_auth_kerb module to properly deal with S4U2Proxy, which requires the requesting service to have a valid krbtgt in order to send the request to the KDC. This is something mod_auth_kerb did not need before (you do not need a krbtgt if you are just validating a ticket).
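The KDC-side decision for S4U2Proxy can be sketched as follows. This is again a toy model under stated assumptions: ToyKDC, the dictionary-based tickets and the service names are invented for illustration, and the "forwardable" flag is modelled as a plain boolean on the evidence ticket.

```python
# Toy model only: tickets are plain dicts, no cryptography is involved.

class DelegationDenied(Exception):
    """Raised when the toy KDC refuses an S4U2Proxy request."""


class ToyKDC(object):
    def __init__(self, allowed_proxies):
        # Maps a proxy service A to the set of services B it may reach
        # on behalf of users (set up by the admin).
        self.allowed_proxies = allowed_proxies

    def s4u2proxy(self, service_a, evidence_ticket, service_b):
        """Return a ticket for service_b in the name of the evidence's user."""
        # The evidence must be a ticket from user X for service A itself.
        if evidence_ticket["server"] != service_a:
            raise DelegationDenied("evidence ticket is not for requester")
        # Assumption: the evidence ticket must be marked forwardable.
        if not evidence_ticket.get("forwardable", False):
            raise DelegationDenied("evidence ticket is not forwardable")
        # The admin decides which proxies may reach which targets.
        if service_b not in self.allowed_proxies.get(service_a, set()):
            raise DelegationDenied("proxy not authorized for target")
        return {"client": evidence_ticket["client"], "server": service_b}
```

With a policy like {"HTTP/web": set(["ldap/ipa"])}, the hypothetical web service can obtain a ticket for the LDAP service in the user's name, while a request for any other target is refused.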
Conclusion
S4U constrained delegation is extremely useful: it reduces the attack surface by allowing admins to effectively constrain services, and gives admins a lot more control over what users can delegate to. Finally, it also makes clients simpler, and this is a key winning feature. In the classic delegation scheme, clients need to decide on their own whether to delegate a krbtgt, which ultimately means either asking the user or always/never doing it. And given that it is quite dangerous to liberally forward your ticket to random services, the default is generally not to delegate the krbtgt, making it very difficult to rely on this feature to build powerless agents. With S4U the user only needs a forwardable TGT, but does not need to actually forward it at all. This is a reasonable compromise: it does not require applications to make choices on the user's behalf, nor does it require users to make any decision. The decision to allow a certain service or not rests with admins and is generally taken once, when the service is put into production, greatly reducing the burden on administrators and the risks involved in the traditional delegation scheme.
Last updated: May 16, 2012 12:00 PM




