Planet Samba

Here you will find the personal blogs of Samba developers (for those that keep them). More information about members can also be found on the Samba Team page.

August 19, 2014

Rusty

POLLOUT doesn’t mean write(2) won’t block: Part II

My previous discovery that poll() indicating an fd was writable didn’t mean write() wouldn’t block led to some interesting discussion on Google+.

It became clear that there is much confusion over read and write; e.g. Linus thought read() was like write(), whereas I thought (prior to my last post) that write() was like read(). Both wrong…

Both Linux and v6 UNIX always returned from read() once data was available (v6 didn’t have sockets, but they had pipes). POSIX even suggests this:

The value returned may be less than nbyte if the number of bytes left in the file is less than nbyte, if the read() request was interrupted by a signal, or if the file is a pipe or FIFO or special file and has fewer than nbyte bytes immediately available for reading.

But write() is different. Presumably so simple UNIX filters didn’t have to check the return and loop (they’d just die with EPIPE anyway), write() tries hard to write all the data before returning. And that leads to a simple rule.  Quoting Linus:

Sure, you can try to play games by knowing socket buffer sizes and look at pending buffers with SIOCOUTQ etc, and say “ok, I can probably do a write of size X without blocking” even on a blocking file descriptor, but it’s hacky, fragile and wrong.

I’m travelling, so I built an Ubuntu-compatible kernel with a printk() into select() and poll() to see who else on my laptop was making this mistake:

cups-browsed: (1262): fd 5 poll() for write without nonblock
cups-browsed: (1262): fd 6 poll() for write without nonblock
Xorg: (1377): fd 1 select() for write without nonblock
Xorg: (1377): fd 3 select() for write without nonblock
Xorg: (1377): fd 11 select() for write without nonblock

This first one is actually OK; fd 5 is an eventfd (which should never block). But the rest seem to be sockets, and thus probably bugs.
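
The fix for code like this is to make the descriptor non-blocking before trusting poll() or select() for writability. A minimal sketch (my illustration, not from any of the programs above):

#include <fcntl.h>

/* Put fd into non-blocking mode so that a POLLOUT indication can never
 * turn into a blocking write(); note F_SETFL (file status flags), not F_SETFD. */
static int set_nonblock(int fd)
{
   int flags = fcntl(fd, F_GETFL);

   if (flags < 0)
      return -1;
   return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

With that in place, a short write() returns what it could write (or EAGAIN) instead of blocking, and the poll() loop simply retries later.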

What’s worse is the Linux select() man page:

       A file descriptor is considered ready if it is possible to
       perform the corresponding I/O operation (e.g., read(2)) without
       blocking.
       ... those in writefds will be watched to see if a write will
       not block...

And poll():

	POLLOUT
		Writing now will not block.

Man page patches have been submitted…

August 19, 2014 01:57 PM

August 18, 2014

Jelmer

Using Propellor for configuration management

For a while, I've been wanting to set up configuration management for my home network. With half a dozen servers, a VPS and a workstation it is not big, but large enough to make it annoying to manually log into each machine for network-wide changes.

Most of the servers I have are low-end ARM machines, each responsible for a couple of tasks. Most of my machines run Debian or something derived from Debian. Oh, and I'm a member of the declarative school of configuration management.

Propellor

Propellor caught my eye earlier this year. Unlike some other configuration management tools, it doesn't come with its own custom language; instead, it is written in Haskell, which I am already familiar with. It's also fairly simple, declarative, and seems to cover most of the handful of things that I need.

Propellor is essentially a Haskell application that you customize for your site. It works very similarly to e.g. xmonad: you write a bit of Haskell configuration code that uses the upstream library code. When you run the application, it builds a binary from your code and the upstream libraries.

Each host on which Propellor is used keeps a clone of the site-local Propellor git repository in /usr/local/propellor. Every time propellor runs (either because of a manual "spin", or from a cronjob it can set up for you), it fetches updates from the main site-local git repository, compiles the Haskell application and runs it.

Setup

Propellor was surprisingly easy to set up. Running propellor creates a clone of the upstream repository under ~/.propellor with a README file and some example configuration. I copied config-simple.hs to config.hs, updated it to reflect one of my hosts and within a few minutes I had a basic working propellor setup.

You can use ./propellor <host> to trigger a run on a remote host.

At the moment I have propellor working for some basic things - having certain Debian packages installed, a specific network configuration, mail setup, basic Kerberos configuration and certain SSH options set. This took surprisingly little time to set up, and it's been great being able to take full advantage of Haskell.

Propellor comes with convenience functions for dealing with some commonly used packages, such as Apt, SSH and Postfix. For a lot of the other packages, you'll have to roll your own for now. I've written some extra code to make Propellor deal with Kerberos keytabs and Dovecot, which I hope to submit upstream.

I don't have a lot of experience with other Free Software configuration management tools such as Puppet and Chef, but for my use case Propellor works very well.

The main disadvantage of propellor for me so far is that it needs to build itself on each machine it runs on. This is fine for my workstation and high-end servers, but it is somewhat more problematic on e.g. my Raspberry Pis. Compilation takes a while, and the Haskell compiler and libraries it needs amount to 500MB worth of disk space on the tiny root partition.

In order to work with Propellor, some Haskell knowledge is required. The Haskell in the configuration file is reasonably easy to understand if you keep it simple, but once the compiler spits out error messages, I suspect you'll have a hard time without any Haskell knowledge.

Propellor relies on having a central repository with the configuration that it can pull from as root. Unlike Joey, I am wary of publishing the configuration of my home network, and I don't have a highly available local git server set up.

August 18, 2014 09:15 PM

August 15, 2014

Chris

RAGBRAI 2014

The long, straight, and typically empty Iowa roads were crowded with bicycles. We took up both lanes, and anyone foolish enough to drive a car or truck into our midst found themselves moving at about 12mph (19-ish kph).


Here I am at the side of the road sporting my Samba Team jersey (with the old-school logo).

  • 490 miles (788km) over 7 days
  • Five metric centuries (≥100km/day)
  • One century (≥100mi/day)
  • More pie, pork chops, and sports drink consumed than I can measure.

By the way, Iowa is not flat.  It has gently rolling hills, which are beautiful when you are riding in a car, and a constant challenge when you are on a bike.  There’s also the wind…

August 15, 2014 04:48 PM

August 02, 2014

Rusty

ccan/io: revisited

There are numerous C async I/O libraries, tevent being the one I’m most familiar with.  Yet tevent has a very wide API, and programs using it inevitably descend into “callback hell”.  So I wrote ccan/io.

The idea is that each I/O callback returns a “struct io_plan” which says what I/O to do next, and what callback to call.  Examples are “io_read(buf, len, next, next_arg)” to read a fixed number of bytes, and “io_read_partial(buf, lenp, next, next_arg)” to perform a single read.  You could also write your own, such as pettycoin’s “io_read_packet()” which read a length then allocated and read in the rest of the packet.
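
To make the shape of that API concrete, here’s a self-contained toy of my own (not the real ccan/io interface): each callback returns a plan describing the next I/O, and a trivial driver executes the plans synchronously, much like the debug mode mentioned below.

#include <string.h>
#include <unistd.h>

struct io_plan;
typedef struct io_plan (*io_cb)(void *arg);

struct io_plan {
   enum { IO_READ, IO_WRITE, IO_CLOSE } op;
   void *buf;
   size_t len;
   io_cb next;          /* produces the next plan once this op completes */
   void *next_arg;
};

static char buf[64];

static struct io_plan plan_close(void *arg)
{
   return (struct io_plan){ .op = IO_CLOSE };
}

static struct io_plan plan_write(void *arg)
{
   /* Write back what we read, then close. */
   return (struct io_plan){ IO_WRITE, buf, strlen(buf), plan_close, NULL };
}

static struct io_plan plan_read(void *arg)
{
   /* Read a chunk, then go to plan_write. */
   return (struct io_plan){ IO_READ, buf, sizeof(buf) - 1, plan_write, NULL };
}

/* Trivial synchronous driver; a real event loop would only act when
 * poll() says the fd is ready. */
static void run(int fd, struct io_plan p)
{
   while (p.op != IO_CLOSE) {
      ssize_t r = (p.op == IO_READ) ? read(fd, p.buf, p.len)
                                    : write(fd, p.buf, p.len);
      if (r <= 0)
         break;
      if (p.op == IO_READ)
         ((char *)p.buf)[r] = '\0';
      p = p.next(p.next_arg);
   }
}

int main(void)
{
   /* On a terminal, fd 0 is read/write, so this echoes one chunk back. */
   run(STDIN_FILENO, plan_read(NULL));
   return 0;
}

The real library obviously deals with an event loop, multiple connections and partial I/O; this only shows how the plan-returning callbacks chain together.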

This should enable a convenient debug mode: you turn each io_read() etc. into synchronous operations and now you have a nice callchain showing what happened to a file descriptor.  In practice, however, debug was painful to use and a frequent source of bugs inside ccan/io, so I never used it for debugging.

And I became less happy when I used it in anger for pettycoin, but at some point you’ve got to stop procrastinating and start producing code, so I left it alone.

Now I’ve revisited it: 820 insertions(+), 1042 deletions(-) later, the code is significantly less hairy and the API a little simpler.  In particular, writing the normal “read-then-write” loops is still very nice, while doing full-duplex I/O is possible but more complex.  Let’s see if I’m still happy once I’ve merged it into pettycoin…

August 02, 2014 06:58 AM

July 29, 2014

Rusty

Pettycoin Alpha01 Tagged

As with all software, it took longer than I expected, but today I tagged the first version of pettycoin.  Now, on to lots more polish and features, but at least there’s something more than the git repo for others to look at!

July 29, 2014 07:53 AM

July 21, 2014

Andreas

What is preloading?

by Jakub Hrozek and Andreas Schneider

The LD_PRELOAD trick!

Preloading is a feature of the dynamic linker (ld.so). It is available on most Unix systems and allows a user-specified shared library to be loaded before all other shared libraries linked to an executable.

Library pre-loading is most commonly used when you need a custom version of a library function to be called. You might want to implement your own malloc(3) and free(3) functions that perform rudimentary leak checking or memory access control, for example, or you might want to extend the I/O calls to dump data when reverse engineering a binary blob. In this case, the library to be preloaded implements the functions you want to override. Only functions of dynamically linked libraries can be overridden; you’re not able to override a function the application implements by itself or links in statically.

The library to preload is defined by the environment variable LD_PRELOAD, such as LD_PRELOAD=libwurst.so. The symbols of the preloaded library are bound first, before other linked shared libraries.
Let’s look at symbol binding in more detail. If your application calls a function, the linker first checks whether it is available in the application itself. If the symbol is not found, the linker checks all preloaded libraries and only then the libraries which have been linked to your application. The shared libraries are searched in the order given during compilation and linking. You can find out the linking order by calling 'ldd /path/to/my/application'. If you’re interested in how the linker searches for the symbols it needs, or if you want to debug whether the symbol of your preloaded library is used correctly, you can enable tracing in the linker.

A simple example would be 'LD_DEBUG=symbols ls'. You can find more details about debugging with the linker in the manpage: 'man ld.so'.

Example:

Your application uses the function open(2).

  • Your application doesn’t implement it.
  • LD_PRELOAD=libcwrap.so provides open(2).
  • The linked libc.so provides open(2).

=> The open(2) symbol from libcwrap.so gets bound!
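
As a concrete illustration, here is a minimal preload library that wraps open(2), logs the call, and forwards to the real implementation via dlsym(RTLD_NEXT). The file and library names are made up for the example.

/* wrap_open.c: build with  gcc -shared -fPIC -o libwrapopen.so wrap_open.c -ldl
 * and run with             LD_PRELOAD=./libwrapopen.so <command>
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>

typedef int (*open_fn)(const char *, int, ...);

int open(const char *path, int flags, ...)
{
   /* RTLD_NEXT finds the next occurrence of "open" in the search order,
    * i.e. the real libc implementation. */
   open_fn real_open = (open_fn)dlsym(RTLD_NEXT, "open");
   mode_t mode = 0;

   if (flags & O_CREAT) {
      va_list ap;
      va_start(ap, flags);
      mode = (mode_t)va_arg(ap, int);
      va_end(ap);
   }

   fprintf(stderr, "open(\"%s\", 0x%x)\n", path, flags);
   return real_open(path, flags, mode);
}

The same pattern works for any dynamically bound function, which is exactly what the cwrap wrappers below do for the socket API.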

The wrappers of the cwrap project, used for creating complex testing environments, use preloading to supply their own variants of several system and library calls, suitable for unit testing of networked software or privilege separation. For example, one wrapper provides its own version of most of the standard socket API and routes the communication over local sockets.


July 21, 2014 10:38 AM

July 17, 2014

Rusty

API Bug of the Week: getsockname().

A “non-blocking” IPv6 connect() call was, in fact, blocking.  Tracking that down made me realize the IPv6 address was mostly random garbage, which was caused by this function:

bool get_fd_addr(int fd, struct protocol_net_address *addr)
{
   union {
      struct sockaddr sa;
      struct sockaddr_in in;
      struct sockaddr_in6 in6;
   } u;
   socklen_t len = sizeof(len);
   if (getsockname(fd, &u.sa, &len) != 0)
      return false;
   ...
}

The bug: “sizeof(len)” should be “sizeof(u)”.  But when presented with a too-short length, getsockname() truncates, and otherwise “succeeds”; you have to check the resulting len value to see what you should have passed.
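
For reference, the corrected call inside get_fd_addr() looks like this, with the length check described above:

   socklen_t len = sizeof(u);

   if (getsockname(fd, &u.sa, &len) != 0)
      return false;
   if (len > sizeof(u))
      return false;   /* address didn't fit and was truncated */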

Obviously an error return would be better here, but the writable len arg is pretty useless: I don’t know of any callers who check the length return and do anything useful with it.  Provide getsocklen() for those who do care, and have getsockname() take a size_t as its third arg.

Oh, and the blocking?  That was because I was calling “fcntl(fd, F_SETFD, …)” instead of “F_SETFL”!

July 17, 2014 03:31 AM

July 09, 2014

Andreas

Samba AD DC in Fedora and RHEL

Several people have asked me about the status of the Active Directory Domain Controller support of Samba in Fedora. As Fedora and RHEL use the MIT Kerberos implementation as their Kerberos infrastructure of choice, the Samba Active Directory Domain Controller is not available with MIT Kerberos at the moment. But we are working on it!

Günther Deschner and I gave a talk at the SambaXP conference about our development efforts in this direction:

The road to MIT KRB5 support

I hope this helps you understand that this is a huge task.


July 09, 2014 09:05 AM

July 02, 2014

David

Samba Server-Side Copy Offload

I recently implemented server-side copy offload support for Samba 4.1, along with Btrfs-specific filesystem enhancements. This video compares server-side copy performance with traditional copy methods.


A few notes on the demonstration:
  • The Windows Server 2012 client and Samba server are connected via an old 100 Mb/s switch, which obviously acts as a network throughput bottleneck in the traditional copy demonstration.
  • The Samba server resembles the 4.1.0 code-base, but includes an extra patch to disable server-side copy requests on the regular share.

Many thanks to:
  • My colleagues at SUSE Linux, for supporting my implementation efforts.
  • The Samba Team, particularly Metze and Jeremy, for reviewing the code.
  • Kdenlive developers, for writing a great video editing suite.

Update (July, 2014): Usage is now fully documented on the Samba Wiki.

    July 02, 2014 06:21 PM

    June 21, 2014

    Rusty

    Alternate Blog for my Pettycoin Work

    I decided to use github for pettycoin, and tested out their blogging integration (summary: it’s not very integrated, but once set up, Jekyll is nice).  I’m keeping a blow-by-blow development blog over there.

    June 21, 2014 12:14 AM

    June 16, 2014

    Rusty

    Rusty Goes on Sabbatical, June to December

    At linux.conf.au I spoke about my pre-alpha implementation of Pettycoin, but progress since then has been slow.  That’s partially due to yak shaving (like rewriting the ccan/io library), partially due to reimplementing parts I didn’t like, and partially due to the birth of my son, but mainly because I have a day job which involves working on POWER8 KVM issues for IBM.  So Alex convinced me to take 6 months off from the day job, and work 4 days a week on pettycoin.

    I’m going to be blogging my progress, so expect several updates a week.  The first few alpha releases will be useless for doing any actual transactions, but by the first beta the major pieces should be in place…

    June 16, 2014 08:50 AM

    June 11, 2014

    David

    Using the Azure File Service on Linux

    The Microsoft Azure File Service is a new SMB shared-storage service offered on the Microsoft Azure public cloud.

    The service allows for the instant provisioning of file shares for private access by cloud provisioned VMs using the SMB 2.1 protocol, and additionally supports public access via a new REST interface.



    Linux VMs deployed on Azure can make use of this service using the Linux Kernel CIFS client. The kernel client must be configured to support and use the SMB 2.1 protocol dialect:
    • CONFIG_CIFS_SMB2 must be enabled in the kernel configuration at build time
      • Use
        # zcat /proc/config.gz | grep CONFIG_CIFS_SMB2
        to check this on a running system.
    • The vers=2.1 mount.cifs parameter must be provided at mount time.
    • Furthermore, the Azure storage account and access key must be provided as username and password.

    # mount.cifs -o vers=2.1,user=smb //smb.file.core.windows.net/share /share/
    Password for smb@//smb.file.core.windows.net/share: ******...
    # df -h /share/
    Filesystem                         Size  Used  Avail  Use%  Mounted on
    //smb.file.core.windows.net/share  5.0T     0   5.0T    0%  /share

    This feature will be supported with the upcoming release of SUSE Linux Enterprise Server 12, and future openSUSE releases.

    Disclaimer: I work in the Labs department at SUSE.

    June 11, 2014 09:13 PM

    June 07, 2014

    Rusty

    Donation to Jupiter Broadcasting

    Chris Fisher’s Jupiter Broadcasting pod/vodcasting started 8 years ago with the Linux Action Show: still their flagship show, and how I discovered them 3 years ago.  Shows like this give access to FOSS to those outside the LWN-reading crowd; community building can be a thankless task, and as a small shop Chris has had ups and downs along the way.  After listening to them for a few years, I feel a weird bond with this bunch of people I’ve never met.

    I regularly listen to Techsnap for security news, Scibyte for science with my daughter, and Unfilter to get an insight into the NSA and what the US looks like from the inside.  I bugged Chris a while back to accept bitcoin donations, and when they did I subscribed to Unfilter for a year at 2 BTC.  To congratulate them on reaching the 100th Unfilter episode, I repeated that donation.

    They’ve started doing new and ambitious things, like Linux HOWTO, so I know they’ll put the funds to good use!

    June 07, 2014 11:45 PM

    June 02, 2014

    Andreas

    New features in socket_wrapper 1.1.0

    Maybe you have already heard of the cwrap project: a set of tools to create a fully isolated network environment for testing client/server components on a single host. socket_wrapper is part of cwrap, and I released version 1.1.0 today. In this release I worked together with Michael Adam, and we implemented some nice new features, like support for IP_PKTINFO for binding on UDP sockets, bindresvport() and more socket options via getsockopt(). This was mostly needed to be able to create a test environment for MIT Kerberos.

    The upcoming features for the next version are support for passing file descriptors between processes using a Unix domain socket and sendmsg()/recvmsg() (SCM_RIGHTS), as sketched below. We would also like to make socket_wrapper thread-safe.
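
    To give an idea of what that feature has to intercept, here is a generic sketch of SCM_RIGHTS descriptor passing in plain POSIX C; this is not socket_wrapper code, just the pattern it will have to emulate.

    /* Send one file descriptor to a peer over an AF_UNIX socket. */
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    static int send_fd(int sock, int fd)
    {
       char dummy = 'F';
       struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
       union {                                  /* aligned cmsg buffer */
          char buf[CMSG_SPACE(sizeof(int))];
          struct cmsghdr align;
       } u;
       struct msghdr msg = { 0 };
       struct cmsghdr *cmsg;

       msg.msg_iov = &iov;
       msg.msg_iovlen = 1;
       msg.msg_control = u.buf;
       msg.msg_controllen = sizeof(u.buf);

       cmsg = CMSG_FIRSTHDR(&msg);
       cmsg->cmsg_level = SOL_SOCKET;
       cmsg->cmsg_type = SCM_RIGHTS;            /* payload: file descriptors */
       cmsg->cmsg_len = CMSG_LEN(sizeof(int));
       memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

       return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
    }

    The receiving side uses recvmsg() and reads the new descriptor back out of the control message.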


    June 02, 2014 02:36 PM

    May 27, 2014

    Rusty

    Effects of packet/data sizes on various networks

    I was thinking about peer-to-peer networking (in the context of Pettycoin, of course) and I wondered if sending ~1420 bytes of data is really any slower than sending 1 byte on real networks.  Similarly, is it worth going to extremes to avoid crossing over into two TCP packets?

    So I wrote a simple Linux TCP ping-pong client and server: the client connects to the server and loops, reading until it gets a '1' byte and then responding with a single-byte ack.  The server sends data ending in a '1' byte, then reads the response byte, printing out how long it took.  First 1 byte of data, then 101 bytes, all the way to 9901 bytes.  It does this 20 times, then closes the socket.
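
    For the curious, here is a minimal sketch of the client side of that loop. It is my reconstruction, not the original test code: the server address, port and error handling are simplified.

    /* Ping-pong client: connect, read until a '1' terminator byte arrives,
     * answer with a one-byte ack, and repeat until the server closes. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
       struct sockaddr_in addr;
       int fd = socket(AF_INET, SOCK_STREAM, 0);

       memset(&addr, 0, sizeof(addr));
       addr.sin_family = AF_INET;
       addr.sin_port = htons(6666);                    /* assumed test port */
       addr.sin_addr.s_addr = inet_addr("127.0.0.1");  /* assumed server */

       if (fd < 0 || connect(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
          perror("connect");
          return 1;
       }

       for (;;) {
          char buf[16384], ack = 'A';
          ssize_t r = read(fd, buf, sizeof(buf));

          if (r <= 0)
             break;                          /* server closed, or error */
          if (buf[r - 1] == '1') {           /* end of this blob */
             if (write(fd, &ack, 1) != 1)
                break;
          }
       }
       close(fd);
       return 0;
    }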

    Here are the results on various networks (or download the source and result files for your own analysis):

    On Our Gigabit Lan

    Interestingly, we do win for tiny packets, but there’s no real penalty once we’re over a packet (until we get to three packets worth):

    [Graph: Over the Gigabit LAN]

    [Graph: Over the Gigabit LAN (closeup)]

    On Our Wireless Lan

    Here we do see a significant decline as we enter the second packet, though extra bytes in the first packet aren’t completely free:

    [Graph: Wireless LAN (all results)]

    [Graph: Wireless LAN (closeup)]

    Via ADSL2 Over The Internet (Same Country)

    Ignoring the occasional congestion from other uses of my home net connection, we see a big jump after the first packet, then another as we go from 3 to 4 packets:

    [Graph: ADSL over internet in same country]

    [Graph: ADSL over internet in same country (closeup)]

    Via ADSL2 Over The Internet (Australia <-> USA)

    Here, packet size is completely lost in the noise; the carrier pigeons don’t even notice the extra weight:

    [Graph: Wifi + ADSL2 from Adelaide to US]

    [Graph: Wifi + ADSL2 from Adelaide to US (closeup)]

    Via 3G Cellular Network (HSPA)

    I initially did this with Wifi tethering, but the results were weird enough that Joel wrote a little Java wrapper so I could run the test natively on the phone.  It didn’t change the resulting pattern much, but I don’t know if this regularity of delay is a 3G or an Android thing.  Here every packet costs, but you don’t win a prize for having a short packet:

    [Graph: 3G network]

    [Graph: 3G network (closeup)]

    Via 2G Network (EDGE)

    This one actually gives you a penalty for short packets!  800 bytes to 2100 bytes is the sweet-spot:

    [Graph: 2G (EDGE) network]

    [Graph: 2G (EDGE) network (closeup)]

    Summary

    So if you’re going to send one byte, what’s the penalty for sending more?  Eyeballing the minimum times from the graphs above:

                                  Wired LAN  Wireless  ADSL   3G    2G
    Penalty for filling packet       30%       15%      5%    0%    0%*
    Penalty for second packet        30%       40%     15%   20%    0%
    Penalty for fourth packet        60%       80%     25%   40%   25%

    * Average for EDGE actually improves by about 35% if you fill packet

    May 27, 2014 10:19 AM

    May 18, 2014

    David

    Elasto Cloud Storage Gateway

    At this year's SambaXP conference, I was lucky enough to be able to present the Elasto cloud storage gateway. Elasto is an open-source cloud storage library and client that I created as part of SUSE Hack Week.



    In addition to Elasto, the project makes use of istgt and existing operating system components to expose a cloud storage object as a local filesystem, on which all data is transparently encrypted before going anywhere near the internet.


    Such a tool can offer a number of benefits over traditional cloud storage clients:
    • Secure: Client-side encryption, via dm-crypt or Bitlocker, reduces reliance on transport and cloud provider security.
    • Cost effective: Client-side compression (e.g. with Btrfs or NTFS) improves cloud storage utilization.
    • Flexible: Standard interface, no vendor lock-in.

    Both the Elasto and istgt components are available for download for openSUSE and Fedora. Given the immaturity of the project, be warned that its use may result in data loss or corruption - play safe, take backups!

    May 18, 2014 01:02 AM

    May 08, 2014

    Rusty

    BTC->BPAY gateway (for Australians)

    I tested out livingroomofsatoshi.com, which lets you pay any BPAY bill (see explanation from reddit).  Since I’d never heard of the developer, I wasn’t going to send anything large through it, but it worked flawlessly.  At least the exposure is limited to the time between sending the BTC and seeing the BPAY receipt.  Exchange rate was fair, and it was a simple process.

    Now I need to convince my wife we should buy some BTC for paying bills…

    May 08, 2014 02:55 AM

    May 05, 2014

    Andreas

    Testing your full software stack with cwrap

    Together with Jakub Hrozek, I wrote an article about cwrap, a set of tools to test your full software stack on a single machine. The article is now open to the public.

    Read the article …


    May 05, 2014 07:23 AM

    April 14, 2014

    Andreas

    Group support for cmocka

    Last Friday I released cmocka 0.4.0. It has several bugfixes and at least two new features. One is support for test groups: you can define a setup and teardown function for a group of unit tests. I think some people have been waiting for this.

    You can find an example here. It is simple and easy to use.
    The other small feature is a new macro: assert_return_code(). It is designed for standard C functions which return 0 for success and less than 0 to indicate an error with errno set. It will produce a nice error message! The rest are bugfixes and improvements to error messages.
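
    To show how the two features fit together, here is a minimal sketch combining a group setup/teardown pair with assert_return_code(). It uses the group-test names from later cmocka releases, so the exact 0.4.0 spelling may differ; treat it as an illustration rather than the 0.4.0 API.

    #include <stdarg.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <setjmp.h>
    #include <cmocka.h>

    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Run once before / after all tests in the group. */
    static int group_setup(void **state)    { (void)state; return 0; }
    static int group_teardown(void **state) { (void)state; return 0; }

    static void test_open_devnull(void **state)
    {
       int fd = open("/dev/null", O_RDONLY);

       (void)state;
       /* Fails with a readable message (including errno) if fd < 0. */
       assert_return_code(fd, errno);
       close(fd);
    }

    int main(void)
    {
       const struct CMUnitTest tests[] = {
          cmocka_unit_test(test_open_devnull),
       };

       return cmocka_run_group_tests(tests, group_setup, group_teardown);
    }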

    Thanks to all contributors and bug reporters!

    If you think cmocka is a great piece of software, please vote it up on stackoverflow, thanks.


    April 14, 2014 04:24 PM

    March 24, 2014

    Rusty

    Legal Questions About Localbitcoins.com and Australia

    As my previous post documented, I’ve experimented with localbitcoins.com.  Following the arrest of two Miami men for trading on localbitcoins, I decided to seek legal advice on the situation in Australia.

    Online research led me to Nick Karagiannis of Kelly and Co, who was already familiar with Bitcoin: I guess it’s a rare opportunity for excitement in financial regulatory circles!  This set me back several thousand dollars (in fiat, unfortunately), but the result was reassuring.

    They’ve released an excellent summary of the situation, derived from their research.  I hope that helps other bitcoin users in Australia, and I’ll post more in future should the legal situation change.

    March 24, 2014 04:01 AM

    Last updated: August 22, 2014 09:00 AM
