Samba

Planet Samba

Here you will find the personal blogs of Samba developers (for those that keep them). More information about members can also be found on the Samba Team page.

January 30, 2018

David

Building Ceph master with C++17 support on openSUSE Leap 42.3

Ceph now requires C++17 support, which is available with modern compilers such as gcc-7. openSUSE Leap 42.3, my current OS of choice, includes gcc-7. However, it's not used by default.

Using gcc-7 for the Ceph build is a simple matter of:

> sudo zypper in gcc7-c++
> CC=gcc-7 CXX=/usr/bin/g++-7 ./do_cmake.sh ...
> cd build && make -j$(nproc)
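You can confirm that CMake picked up the intended compiler by inspecting the cache (a quick sanity check, not part of the original recipe; the path below is what I'd expect on Leap 42.3):

> grep CMAKE_CXX_COMPILER: build/CMakeCache.txt
CMAKE_CXX_COMPILER:FILEPATH=/usr/bin/g++-7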

January 30, 2018 12:46 AM

January 08, 2018

Jelmer

Breezy: Forking Bazaar

A couple of months ago, Martin and I announced a friendly fork of Bazaar, named Breezy.

It's been 5 years since I wrote a Bazaar retrospective and around 6 since I seriously contributed to the Bazaar codebase.

Goals

We don't have any grand ambitions for Breezy; the main goal is to keep Bazaar usable going forward. Your open source projects should still be using Git.

The main changes we have made so far come down to fixing a number of bugs and bundling useful plugins. Bundling plugins makes setting up an environment simpler and eliminates the API compatibility issues that plagued external plugins in the Bazaar world.

Perhaps the biggest effort in Breezy is porting the codebase to Python 3, allowing it to be used once Python 2 goes EOL in 2020.

A fork

Breezy is a fork of Bazaar and not just a new release series.

Bazaar upstream has been dormant for the last couple of years anyway - we don't lose anything by forking.

We're forking because it gives us the independence to make some of the changes we deemed necessary, which are otherwise hard to make for an established project. For example, we're now bundling plugins, taking an axe to a large number of APIs and dropping support for older platforms.

A fork also means independence from Canonical; there is no CLA for Breezy (a hindrance for Bazaar) and we can set up our own infrastructure without having to chase down Canonical staff for web site updates or the installation of new packages on the CI system.

More information

Martin gave a talk about Breezy at PyCon UK this year.

Breezy bugs can be filed on Launchpad. For the moment, we are using the Bazaar mailing list and the #bzr IRC channel for any discussions and status updates around Breezy.

January 08, 2018 04:00 PM

January 02, 2018

David

Rapid Linux Kernel Dev/Test with QEMU, KVM and Dracut


Update 2018-01-02: See my post on Rapido for a more automated approach to the procedure outlined below.

Inspired by Stefan Hajnoczi's excellent blog post, I recently set about constructing an environment for rapid testing of Linux kernel changes, particularly focused on the LIO iSCSI target. Such an environment would help me in a number of ways:

  • Faster dev / test turnaround.
    • A modified kernel can be compiled and booted in a matter of seconds.
  • Improved resource utilisation.
    • No need to boot external test hosts or heavyweight VMs.
  • Simplified and speedier debugging.

My requirements were slightly different to Stefan's, in that:
  • I'd prefer to be lazy and use Dracut for initramfs generation.
  • I need a working network connection between the VM and the hypervisor system.
    • The VM will act as the iSCSI target, the hypervisor as the initiator.

Starting with the Linux kernel, the first step is to build a bzImage:
~/> git clone \
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
hack, hack, hack.
~/linux/> make menuconfig
Set CONFIG_IP_PNP_DHCP=y and CONFIG_E1000=y to enable IP address assignment on boot.
~/linux/> make -j6
~/linux/> make modules
~/linux/> INSTALL_MOD_PATH=./mods make modules_install
~/linux/> sudo ln -s $PWD/mods/lib/modules/$(make kernelrelease) \
/lib/modules/$(make kernelrelease)
This leaves us with a compressed kernel image file at arch/x86/boot/bzImage, and corresponding modules installed under mods/lib/modules/$(make kernelrelease), where $(make kernelrelease) evaluates to 4.1.0-rc7+ in this example. The /lib/modules/4.1.0-rc7+ symlink allows Dracut to locate the modules.
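If you'd rather not hunt through menuconfig for those options, the kernel's scripts/config helper can set them non-interactively; a small convenience sketch (run from the kernel source tree, with an existing .config) would be:

~/linux/> scripts/config --enable CONFIG_IP_PNP --enable CONFIG_IP_PNP_DHCP --enable CONFIG_E1000
~/linux/> make olddefconfig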

The next step is to generate an initial RAM filesystem, or initramfs, which includes a minimal set of user-space utilities, and kernel modules needed for testing:

~/linux/> dracut --kver "$(make kernelrelease)" \
--add-drivers "iscsi_target_mod target_core_mod" \
--add-drivers "target_core_file target_core_iblock" \
--add-drivers "configfs" \
--install "ps grep netstat" \
--no-hostonly --no-hostonly-cmdline \
--modules "bash base shutdown network ifcfg" initramfs
...
*** Creating image file done ***

We now have an initramfs file in the current directory, with the following contents:
  • LIO kernel modules obtained from /lib/modules/4.1.0-rc7+, as directed via the --kver and --add-drivers parameters.
  • User-space shell, boot and network helpers, as directed via the --modules parameter.

We're now ready to use QEMU/KVM to boot our test kernel and initramfs:

~/linux/> qemu-kvm -kernel arch/x86/boot/bzImage \
-initrd initramfs \
-device e1000,netdev=network0 \
-netdev user,id=network0 \
-redir tcp:51550::3260 \
-append "ip=dhcp rd.shell=1 console=ttyS0" \
-nographic

This boots the test environment, with the kernel and initramfs previously generated:

[    3.216596] dracut Warning: dracut: FATAL: No or empty root= argument
[ 3.217998] dracut Warning: dracut: Refusing to continue
...
Dropping to debug shell.

dracut:/#

From the dracut shell, confirm that the QEMU DHCP server assigned the VM an IP address:

dracut:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
...
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
...
inet 10.0.2.15/24 brd 10.0.2.255 scope global eth0

Port 3260 (iSCSI) on this interface is forwarded to/from port 51550 on the hypervisor, as configured via the qemu-kvm -redir parameter.

Now onto LIO iSCSI target setup. First off, load the appropriate kernel modules:

dracut:/# modprobe iscsi_target_mod
dracut:/# cat /proc/modules
iscsi_target_mod 246669 0 - Live 0xffffffffa006a000
target_core_mod 289004 1 iscsi_target_mod, Live 0xffffffffa000b000
configfs 22407 3 iscsi_target_mod,target_core_mod, Live 0xffffffffa0000000

LIO configuration requires a mounted configfs filesystem:

dracut:/# mount -t configfs configfs /sys/kernel/config/
dracut:/# cat /sys/kernel/config/target/version
Target Engine Core ConfigFS Infrastructure v4.1.0 on Linux/x86_64 on 4.1.0-rc1+

An iSCSI target can be provisioned by manipulating the corresponding configfs entries. I used lio_dump output from an existing setup as a reference:

dracut:/# mkdir /sys/kernel/config/target/iscsi
dracut:/# echo -n 0 > /sys/kernel/config/target/iscsi/discovery_auth/enforce_discovery_auth
dracut:/# mkdir -p /sys/kernel/config/target/iscsi/<iscsi_iqn>/tpgt_1/np/10.0.2.15:3260
...
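For illustration only, one plausible continuation (the backstore name, backing file path and size here are made up, and the IQN placeholder is kept from above) would create a file-backed backstore and expose it as LUN 0:

dracut:/# mkdir -p /sys/kernel/config/target/core/fileio_0/lun0file
dracut:/# echo "fd_dev_name=/tmp/lun0_backing,fd_dev_size=104857600" > /sys/kernel/config/target/core/fileio_0/lun0file/control
dracut:/# echo 1 > /sys/kernel/config/target/core/fileio_0/lun0file/enable
dracut:/# mkdir -p /sys/kernel/config/target/iscsi/<iscsi_iqn>/tpgt_1/lun/lun_0
dracut:/# ln -s /sys/kernel/config/target/core/fileio_0/lun0file /sys/kernel/config/target/iscsi/<iscsi_iqn>/tpgt_1/lun/lun_0/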

Finally, we're ready to connect to the LIO target using the local hypervisor port that forwards to the VM's virtual network adapter:

~/linux/> iscsiadm --mode discovery \
--type sendtargets \
--portal 127.0.0.1:51550
10.0.2.15:3260,1 iqn.2015-04.suse.arch:5eca2313-028d-435c-9131-53a5ab256a83

It works!

There are a few things that can be adjusted:
  • Port forwarding to the VM network is a bit fiddly - I'm now using a bridge/TAP configuration instead.
  • When dropping into the emergency boot shell, Dracut executes scripts carried under /lib/dracut/hooks/emergency/. This means that a custom script can be triggered on boot via:
    ~/linux/> dracut -i runme.sh /lib/dracut/hooks/emergency/02-runme.sh ...
  • It should be possible to have Dracut pull the kernel modules in from the temporary directory, but I wasn't able to get this working:
    ~/linux/> INSTALL_MOD_PATH=./mods make modules_install
    ~/linux/> dracut --kver "$(make kernelrelease)" --kmoddir ./mods/lib/...
  • Boot time and initramfs file IO performance can be improved by disabling compression. This is done by specifying the --no-compress Dracut parameter.

Update 20150722:
  • Don't install kernel modules as root, set up a /lib/modules symlink for Dracut instead.
  • Link to bridge/TAP networking post.
  • Describe boot script usage.
Update 20150813:
  • Use $(make kernelrelease) rather than a hard-coded 4.1.0-rc7+ kernel version string - thanks Aurélien!
Update 20150908: Describe initramfs --no-compress optimisation.
Update 20180102: Link to the Rapido project.

January 02, 2018 07:37 PM

December 20, 2017

David

QEMU/KVM Bridged Network with TAP interfaces

In my previous post, Rapid Linux Kernel Dev/Test with QEMU, KVM and Dracut, I described how to build and boot a Linux kernel quickly, making use of port forwarding between the hypervisor and guest VM for virtual network traffic.

This post describes how to plumb the Linux VM directly into a hypervisor network, through the use of a bridge.

Start by creating a bridge on the hypervisor system:

> sudo ip link add br0 type bridge

Clear the IP address on the network interface that you'll be bridging (e.g. eth0).
Note: This will disable network traffic on eth0!
> sudo ip addr flush dev eth0

Add the interface to the bridge:
> sudo ip link set eth0 master br0

Next up, create a TAP interface:
> sudo ip tuntap add dev tap0 mode tap user $(whoami)
The user parameter ensures that the current user will be able to connect to the TAP interface.

Add the TAP interface to the bridge:
> sudo ip link set tap0 master br0

Make sure everything is up:
> sudo ip link set dev br0 up
> sudo ip link set dev tap0 up

The TAP interface is now ready for use. Assuming that a DHCP server is available on the bridged network, the VM can now obtain an IP address during boot via:
> qemu-kvm -kernel arch/x86/boot/bzImage \
-initrd initramfs \
-device e1000,netdev=network0,mac=52:55:00:d1:55:01 \
-netdev tap,id=network0,ifname=tap0,script=no,downscript=no \
-append "ip=dhcp rd.shell=1 console=ttyS0" -nographic

The MAC address is explicitly specified, so care should be taken to ensure its uniqueness.
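Need several VMs on the same bridge? Distinct locally administered addresses (using the 52:54:00 prefix commonly seen with QEMU/KVM) can be generated with something like:

> printf '52:54:00:%02x:%02x:%02x\n' $((RANDOM%256)) $((RANDOM%256)) $((RANDOM%256))
52:54:00:4e:0b:87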

The DHCP server response details are printed alongside network interface configuration. E.g.
[    3.792570] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
[ 3.796085] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 3.812083] Sending DHCP requests ., OK
[ 4.824174] IP-Config: Got DHCP answer from 10.155.0.42, my address is 10.155.0.1
[ 4.825119] IP-Config: Complete:
[ 4.825476] device=eth0, hwaddr=52:55:00:d1:55:01, ipaddr=10.155.0.1, mask=255.255.0.0, gw=10.155.0.254
[ 4.826546] host=rocksolid-sles, domain=suse.de, nis-domain=suse.de
...

Didn't get an IP address? There are a few things to check:
  • Confirm that the kernel is built with boot-time DHCP client (CONFIG_IP_PNP_DHCP=y) and E1000 network driver (CONFIG_E1000=y) support.
  • Check the -device and -netdev arguments specify a valid e1000 TAP interface.
  • Ensure that ip=dhcp is provided as a kernel boot parameter, and that the DHCP server is up and running.
Happy hacking!

Update 20161223:
  • Use 'ip' instead of 'brctl' to manipulate the bridge device - thanks Yagamy Light!
Update 20171220:
  • Use 'ip tuntap' instead of 'tunctl' to create the TAP interface - thanks Johannes!

    December 20, 2017 03:44 PM

    October 18, 2017

    David

    Git send-email PATCH version subject prefix

    I use git send-email to submit patches to developer mailing lists. A reviewer may request a series of changes, in which case I find it easiest to make and test those changes locally, before sending a new round of patches to the mailing list with a new version number:

    git send-email --subject-prefix="PATCH v4" --compose -14

    Assigning a version number to each round of patches allows me to add a change log for the entire patch-set to the introductory mail, e.g.:
    From: David Disseldorp 
    Subject: [PATCH v4 00/14] add compression ioctl support

    This patch series adds support for the FSCTL_GET_COMPRESSION and
    FSCTL_SET_COMPRESSION ioctls, as well as the reporting of the current
    compression state via the FILE_ATTRIBUTE_COMPRESSED flag.

    Hooks are added to the Btrfs VFS module, which translates such requests
    into corresponding FS_IOC_GETFLAGS and FS_IOC_SETFLAGS ioctls.

    Changes since v3 (thanks for the feedback Jeremy):
    - fixed and split copy-chunk dest unlock change into separate commit

    Changes since v2:
    - Check for valid fsp file descriptors
    - Rebase atop filesystem specific selftest changes
    - Change compression fsctl permission checks to match Windows behaviour
    + Add corresponding smbtorture test

    Changes since v1:
    - Only use smb_fname and fsp args with GET_COMPRESSION. The smb_fname
    argument is only needed for the dosmode() code-path, fsp is used
    everywhere else.
    - Add an extra SetInfo(FILE_ATTRIBUTE_COMPRESSED) test.

    GIT: [PATCH v4 01/14] selftest/s3: expose share with FS applicable config
    ...

    Change logs can also be added to individual patches using the --annotate parameter:

    git send-email --annotate --subject-prefix="PATCH v2" -1
     
     Subject: [PATCH v2] common: add CephFS support
    ...
    ---
    Changes since version 1:
    - Remove -ceph parameter for check and rely on $FSTYP instead
    ...
    diff --git a/README.config-sections b/README.config-sections

    Putting the change log between the first "---" and "diff --git a/..." lines ensures that it will be dropped when applying the patch via git-am.

    [update 2016-11-03]
    - Describe the --annotate parameter 

    October 18, 2017 01:42 AM

    September 21, 2017

    Andreas

    Samba 4.7.0 (Samba AD for the Enterprise)

    Enterprise distributions like Red Hat or SUSE are required to ship with MIT Kerberos. The reason is that several institutions or governments have a hard requirement for a special Kerberos implementation. It is the reason why the distributions by these vendors (Fedora, RHEL, openSUSE, SLES) only package Samba FS and not the AD component.

    It was clear that, to get Samba AD into RHEL some day, we would need to port it to MIT Kerberos.

    In 2013 we started to think about this. The first question that arose was: how do we run the tests if we port to MIT Kerberos? We need to be able to start the krb5kdc daemon. This was more or less the birth of the cwrap project! Think of cwrap as being like “The Matrix”, where reality is simulated and everything is a lie. It allows us to create an artificial environment emulating a complete network to test Samba. It took nearly a year till we were able to integrate the first part of cwrap, socket_wrapper, into Samba.
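    To give a feel for how socket_wrapper is used (a simplified sketch; the server binary here is a stand-in, while the variables are the ones the library documents), a process can be redirected onto emulated sockets with nothing more than a preload and a couple of environment variables:

    > LD_PRELOAD=libsocket_wrapper.so \
    SOCKET_WRAPPER_DIR=/tmp/sockets \
    SOCKET_WRAPPER_DEFAULT_IFACE=10 ./my_server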

    Then the work to port Samba AD to MIT Kerberos started. We created a simple abstraction of the Samba KDC routines so we could switch between Heimdal and MIT Kerberos, then created an MIT KDB module and were able to start the krb5kdc process.

    In 2015 we had more than 140 patches for Samba AD ready and pushed most of them upstream in April. We still had 70 testsuites failing. We started to implement missing features and fixed tests to work with MIT Kerberos tools. During that time we often had setbacks, because features we required were missing from MIT Kerberos, so we started to implement the missing features in MIT Kerberos ourselves.

    In September of 2015 we started to implement missing pieces in ‘samba-tool’ to provision a domain with MIT Kerberos involved. Till the end of the year we implemented the backup key protocol using GnuTLS (which also needed to add features for us first).

    From January till July 2016 we implemented more features in MIT Kerberos to get everything working. By August we had most of the stuff working; only the trust support wasn't. From there we discovered bug after bug in how trusts are handled, and fixed them one by one. We had to do major rewrites of code in order to get everything working correctly. The outcome was great: we improved our trust code and got MIT Kerberos working in the end.

    2017-04-30

    That’s the day when I pushed the final patchset to our source code repository!

    It took Günther Deschner, Stefan Metzmacher and me more than 4 years to implement Samba AD with MIT Kerberos. Finally, with the release of Samba 4.7.0, it is available for everyone out there.

    Fedora 27 will be the first version with Samba AD.

    September 21, 2017 04:00 PM

    July 06, 2017

    Rusty

    Broadband Speeds, 2 Years Later

    Two years ago, considering the blocksize debate, I made two attempts to measure average bandwidth growth, first using Akamai serving numbers (which gave an answer of 17% per year), and then using fixed-line broadband data from OFCOM UK, which gave an answer of 30% per annum.

    We have two years more of data since then, so let’s take another look.

    OFCOM (UK) Fixed Broadband Data

    First, the OFCOM data:

    • Average download speed in November 2008 was 3.6Mbit/s
    • Average download speed in November 2014 was 22.8Mbit/s
    • Average download speed in November 2016 was 36.2Mbit/s
    • Average upload speed from November 2008 to April 2009 was 0.43Mbit/s
    • Average upload speed in November 2014 was 2.9Mbit/s
    • Average upload speed in November 2016 was 4.3Mbit/s

    So in the last two years, we've seen a 26% increase in download speed and a 22% increase in upload speed, bringing us down from 36/37% to 33% per annum over the 8 years. The divergence of download and upload improvements is concerning (I previously assumed they were the same, but we have to design for the lesser of the two for a peer-to-peer system).

    The idea that upload speed may be topping out is reflected in the Nov-2016 report, which notes only an 8% upload increase in services advertised as “30Mbit” or above.

    Akamai’s State Of The Internet Reports

    Now let’s look at Akamai’s Q1 2016 report and Q1-2017 report.

    • Annual global average speed increase, Q1 2015 – Q1 2016: 23%
    • Annual global average speed increase, Q1 2016 – Q1 2017: 15%

    This gives an estimate of 19% per annum in the last two years. Reassuringly, the US and UK (both fairly high-bandwidth countries, considered in my previous post to be a good estimate for the future of other countries) have increased by 26% and 19% in the last two years, indicating there’s no immediate ceiling to bandwidth.

    You can play with the numbers for different geographies on the Akamai site.

    Conclusion: 19% Is A Conservative Estimate

    17% growth now seems a little pessimistic: in the last 9 years the US Akamai numbers suggest the US has increased by 19% per annum, the UK by almost 21%.  The gloss seems to be coming off the UK fixed-broadband numbers, but they’re still 22% upload increase for the last two years.  Even Australia and the Philippines have managed almost 21%.

    July 06, 2017 10:01 AM

    July 03, 2017

    David

    Multipath Failover Simulation with QEMU

    While working on a Ceph OSD multipath issue, I came across a helpful post from Dan Horák on how to simulate a multipath device under QEMU.


    qemu-kvm ... -device virtio-scsi-pci,id=scsi \
    -drive if=none,id=hda,file=<path>,cache=none,format=raw,serial=MPIO \
    -device scsi-hd,drive=hda \
    -drive if=none,id=hdb,file=<path>,cache=none,format=raw,serial=MPIO \
    -device scsi-hd,drive=hdb
    • <path> should be replaced with a file or device path (the same for each)
    • serial= specifies the SCSI logical unit serial number
    This attaches two virtual SCSI devices to the VM, both of which are backed by the same file and share the same SCSI logical unit identifier.
    Once booted, the SCSI devices for each corresponding path appear as sda and sdb, which are then detected as multipath enabled and subsequently mapped as dm-0:

             Starting Device-Mapper Multipath Device Controller...
    [ OK ] Started Device-Mapper Multipath Device Controller.
    ...
    [ 1.329668] device-mapper: multipath service-time: version 0.3.0 loaded
    ...
    rapido1:/# multipath -ll
    0QEMU_QEMU_HARDDISK_MPIO dm-0 QEMU,QEMU HARDDISK
    size=2.0G features='1 retain_attached_hw_handler' hwhandler='0' wp=rw
    |-+- policy='service-time 0' prio=1 status=active
    | `- 0:0:0:0 sda 8:0 active ready running
    `-+- policy='service-time 0' prio=1 status=enabled
    `- 0:0:1:0 sdb 8:16 active ready running

    QEMU additionally allows for virtual device hot(un)plug at runtime, which can be done from the QEMU monitor CLI (accessed via ctrl-a c) using the drive_del command. This can be used to trigger a multipath failover event:

    rapido1:/# mkfs.xfs /dev/dm-0
    meta-data=/dev/dm-0 isize=256 agcount=4, agsize=131072 blks
    = sectsz=512 attr=2, projid32bit=1
    = crc=0 finobt=0, sparse=0
    data = bsize=4096 blocks=524288, imaxpct=25
    = sunit=0 swidth=0 blks
    naming =version 2 bsize=4096 ascii-ci=0 ftype=1
    log =internal log bsize=4096 blocks=2560, version=2
    = sectsz=512 sunit=0 blks, lazy-count=1
    realtime =none extsz=4096 blocks=0, rtextents=0
    rapido1:/# mount /dev/dm-0 /mnt/
    [ 96.846919] XFS (dm-0): Mounting V4 Filesystem
    [ 96.851383] XFS (dm-0): Ending clean mount

    rapido1:/# QEMU 2.6.2 monitor - type 'help' for more information
    (qemu) drive_del hda
    (qemu)

    rapido1:/# echo io-to-trigger-path-failure > /mnt/failover-trigger
    [ 190.926579] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
    [ 190.926588] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x2 [current]
    [ 190.926589] sd 0:0:0:0: [sda] tag#0 ASC=0x3a ASCQ=0x0
    [ 190.926590] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00 00 00 02 00 00 01 00
    [ 190.926591] blk_update_request: I/O error, dev sda, sector 2
    [ 190.926597] device-mapper: multipath: Failing path 8:0.

    rapido1:/# multipath -ll
    0QEMU_QEMU_HARDDISK_MPIO dm-0 QEMU,QEMU HARDDISK
    size=2.0G features='1 retain_attached_hw_handler' hwhandler='0' wp=rw
    |-+- policy='service-time 0' prio=0 status=enabled
    | `- 0:0:0:0 sda 8:0 failed faulty running
    `-+- policy='service-time 0' prio=1 status=active
    `- 0:0:1:0 sdb 8:16 active ready running

    The above procedure demonstrates cable-pull simulation while the broken path is used by the mounted dm-0 device. The subsequent I/O failure triggers multipath failover to the remaining good path.

    I've added this functionality to Rapido (pull-request) so that multipath failover can be performed in a couple of minutes directly from kernel source. I encourage you to give it a try for yourself!

    July 03, 2017 12:38 AM

    June 11, 2017

    David

    Rapido: Quick Kernel Testing From Source (Video)

    I presented a short talk at the 2017 openSUSE Conference on Linux kernel testing using Rapido.

    There were many other interesting talks during the conference, all of which can be viewed on the oSC 2017 media site.
    A video of my presentation is available below, and on YouTube. Many thanks to the organisers and sponsors for putting on a great event.

    June 11, 2017 07:29 PM

    May 06, 2017

    Jelmer

    Xandikos, a lightweight Git-backed CalDAV/CardDAV server

    For the last couple of years, I have self-hosted my calendar and address book data. Originally I just kept local calendars and address books in Evolution, but later I moved to a self-hosted CalDAV/CardDAV server and a plethora of clients.

    CalDAV/CardDAV

    CalDAV and CardDAV are standards for accessing, managing, and sharing calendaring and addressbook information based on the iCalendar format that are built atop the WebDAV standard, and WebDAV itself is a set of extensions to HTTP.

    CalDAV and CardDAV essentially store iCalendar (.ics) and vCard (.vcf) files using WebDAV, but they provide some extra guarantees (e.g. files must be well-formed) and some additional methods for querying the data. For example, it is possible to retrieve all events between two dates with a single HTTP query, rather than the client having to check all the calendar files in a directory.
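    As a rough sketch of what such a query looks like on the wire (the server URL and credentials below are placeholders), a CalDAV time-range REPORT can be issued with curl:

    > curl -X REPORT -u user:pass -H 'Depth: 1' -H 'Content-Type: application/xml' \
    --data '<C:calendar-query xmlns:D="DAV:" xmlns:C="urn:ietf:params:xml:ns:caldav">
      <D:prop><D:getetag/><C:calendar-data/></D:prop>
      <C:filter><C:comp-filter name="VCALENDAR"><C:comp-filter name="VEVENT">
        <C:time-range start="20170101T000000Z" end="20170201T000000Z"/>
      </C:comp-filter></C:comp-filter></C:filter>
    </C:calendar-query>' \
    https://dav.example.com/calendars/user/calendar/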

    CalDAV and CardDAV are (unnecessarily) complex, in large part because they are built on top of WebDAV. Being able to use regular HTTP and WebDAV clients is quite neat, but results in extra complexity. In addition, because the standards are so large, clients and servers end up only implementing parts of it.

    However, CalDAV and CardDAV have one big redeeming quality: they are the dominant standards for synchronising calendar events and addressbooks, and are supported by a wide variety of free and non-free applications. They're the status quo, until something better comes along. (and hey, at least there is a standard to begin with)

    Calypso

    I have tried a number of servers over the years. In the end, I settled for calypso.

    Calypso started out as a friendly fork of Radicale, with some additional improvements. I like Calypso because it:

    • is quite simple, understandable, and small (sloccount reports 1700 LOC)
    • stores plain .vcf and .ics files
    • keeps history in git
    • is easy to set up, e.g. no database dependencies
    • is written in Python

    Its use of regular files and of keeping history in git is useful, because it means that whenever something breaks it is much easier to see what is happening. If something were to go wrong (e.g. a client deciding to remove all server-side entries) it's easy to recover by rolling back history using git.

    However, there are some downsides to Calypso as well.

    It doesn't have good test coverage, making it harder to change (especially in a way that doesn't break some clients), though there are some recent efforts to make e.g. external spec compliance tests like caldavtester work with it.

    Calypso's CalDAV/CardDAV/WebDAV implementation is a bit ad-hoc. The only WebDAV REPORTs it implements are calendar-multiget and addressbook-multiget. Support for properties has been added as new clients request them. The logic for replying to DAV requests is mixed with the actual data store implementation.

    Because of this, it can be hard to get going with some clients and sometimes tricky to debug.

    Xandikos

    After attempting to fix a number of issues in Calypso, I kept running into issues with the way its code is structured. This is only fixable by rewriting significant chunks of it, so I opted to write a new server instead.

    The goals of Xandikos are along the same lines as those of Calypso, to be a simple CalDAV/CardDAV server for personal use:

    • easy to set up; at the moment, just running xandikos -d $HOME/dav --defaults is enough to start a new server
    • use of plain .ics/.vcf files for storage
    • history stored in Git

    But additionally:

    • clear separation between protocol implementation and storage
    • be well tested
    • standards complete
    • standards compliant

    Current status

    The CalDAV/CardDAV implementation of Xandikos is mostly complete, but there are still a number of outstanding issues.

    In particular:

    • lack of authentication support; setting up authentication support in uwsgi or a reverse proxy is one way of working around this
    • there is no useful UI for users accessing the DAV resources via a web browser
    • test coverage

    Xandikos has been tested with the following clients:

    Trying it

    To run Xandikos, follow the instructions on the homepage:

    ./bin/xandikos --defaults -d $HOME/dav
    

    A server should now be listening on localhost:8080 that you can access with your favorite client.

    May 06, 2017 01:06 AM

    March 09, 2017

    Rusty

    Quick Stats on zstandard (zstd) Performance

    Was looking at using zstd for backup, and wanted to see the effect of different compression levels. I backed up my (built) bitcoin source, which is a decent representation of my home directory, but only weighs in at 2.3GB. zstd -1 compressed it 71.3%, zstd -22 compressed it 78.6%, and here’s a graph showing runtime (on my laptop) and the resulting size:

    zstandard compression (bitcoin source code, object files and binaries) times and sizes

    For this corpus, sweet spots are 3 (the default), 6 (2.5x slower, 7% smaller), 14 (10x slower, 13% smaller) and 20 (46x slower, 22% smaller). Spreadsheet with results here.
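    If you want to reproduce a sweep like this on your own corpus, a loop along these lines will do (a sketch; note that --ultra is required for levels above 19, and corpus.tar is a placeholder):

    > for lvl in 1 3 6 14 20 22; do
    >   /usr/bin/time -f "level $lvl: %es" zstd -q --ultra -$lvl -o corpus.$lvl.zst corpus.tar
    >   stat -c "level $lvl: %s bytes" corpus.$lvl.zst
    > done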

    March 09, 2017 12:53 AM

    January 06, 2017

    David

    Rapido: A Glorified Wrapper for Dracut and QEMU

    Introduction


    I've blogged a few times about how Dracut and QEMU can be combined to greatly improve Linux kernel dev/test turnaround.
    • My first post covered the basics of building the kernel, running dracut, and booting the resultant image with qemu-kvm.
    • A later post took a closer look at network configuration, and focused on bridging VMs with the hypervisor.
    • Finally, my third post looked at how this technique could be combined with Ceph, to provide a similarly efficient workflow for Ceph development.
      In bringing this series to a conclusion, I'd like to introduce the newly released Rapido project. Rapido combines all of the procedures and techniques described in the articles above into a handful of scripts, which can be used to test specific Linux kernel functionality, standalone or alongside other technologies such as Ceph.

      Usage - Standalone Linux VM


      The following procedure was tested on openSUSE Leap 42.2 and SLES 12SP2, but should work fine on many other Linux distributions.

       

      Step 1: Checkout and Build


      Checkout the Linux kernel and Rapido source repositories:
      ~/> cd ~
      ~/> git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
      ~/> git clone https://github.com/ddiss/rapido.git

      Build the kernel (using a config provided with the Rapido source):
      ~/> cp rapido/kernel/vanilla_config linux/.config
      ~/> cd linux
      ~/linux/> make -j6
      ~/linux/> make modules
      ~/linux/> INSTALL_MOD_PATH=./mods make modules_install
      ~/linux/> sudo ln -s $PWD/mods/lib/modules/$(make kernelrelease) \
      /lib/modules/$(make kernelrelease)

      Step 2: Configuration 


      Install Rapido dependencies: dracut, qemu, brctl (bridge-utils) and tunctl.

      Edit rapido.conf, the master Rapido configuration file:
      ~/linux/> cd ~/rapido
      ~/rapido/> vi rapido.conf
      • set KERNEL_SRC="/home/<user>/linux"
      • set TAP_USER="<user>"
      • set MAC_ADDR1 to a valid MAC address, e.g. "b8:ac:24:45:c5:01"
      • set MAC_ADDR2 to a valid MAC address, e.g. "b8:ac:24:45:c5:02"

      Configure the bridge and tap network devices. This must be done as root:
      ~/rapido/> sudo tools/br_setup.sh
      ~/rapido/> ip addr show br0
      4: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
      ...
      inet 192.168.155.1/24 scope global br0


      Step 3: Image Generation 


      Generate a minimal Linux VM image which includes binaries, libraries and kernel modules for filesystem testing:
      ~/rapido/> ./cut_fstests_local.sh
      ...
       dracut: *** Creating initramfs image file 'initrds/myinitrd' done ***
      ~/rapido/> ls -lah initrds/myinitrd
      -rw-r--r-- 1 ddiss users 30M Dec 13 18:17 initrds/myinitrd

      Step 4 - Boot!

       ~/rapido/> ./vm.sh
      + mount -t btrfs /dev/zram1 /mnt/scratch
      [ 3.542927] BTRFS info (device zram1): disk space caching is enabled
      ...
      btrfs filesystem mounted at /mnt/test and /mnt/scratch
      rapido1:/# 

      In a whopping four seconds, or thereabouts, the VM should have booted to a rapido1:/# bash prompt, leaving you with two zram backed Btrfs filesystems mounted at /mnt/test and /mnt/scratch.

      Everything, including the VM's root filesystem, is in memory, so any changes will not persist across reboot. Use the rapido.conf QEMU_EXTRA_ARGS parameter if you wish to add persistent storage to a VM.
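      As an illustration (the image path below is a placeholder), a persistent virtio disk could be attached by appending something like the following to rapido.conf:
      QEMU_EXTRA_ARGS="-drive file=/path/to/persist.img,if=virtio,format=raw"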

      Although the network isn't used in this case, you should be able to observe that the VM's network adapter can be reached from the hypervisor, and vice-versa.
      rapido1:/# ip a show dev eth0
      ...
      inet 192.168.155.101/24 brd 192.168.155.255 scope global eth0
      ...
      rapido1:/# ping 192.168.155.1
      PING 192.168.155.1 (192.168.155.1) 56(84) bytes of data.
      64 bytes from 192.168.155.1: icmp_seq=1 ttl=64 time=1.97 ms

      Once you're done playing around, you can shutdown:
      rapido1:/# shutdown
      [ 267.304313] sysrq: SysRq : sysrq: Power Off
      rapido1:/# [ 268.168447] ACPI: Preparing to enter system sleep state S5
      [ 268.169493] reboot: Power down
      + exit 0

      Usage - Ceph vstart.sh cluster and CephFS client VM

      This usage guide builds on the previous standalone Linux VM procedure, but this time adds Ceph to the mix. If you're not interested in Ceph (how could you not be!) then feel free to skip to the next section.

       

      Step I - Checkout and Build


      We already have a clone of the Rapido and Linux kernel repositories. All that's needed for CephFS testing is a Ceph build:
      ~/> git clone https://github.com/ceph/ceph
      ~/> cd ceph
      <install Ceph build dependencies>
      ~/ceph/> ./do_cmake.sh -DWITH_MANPAGE=0 -DWITH_OPENLDAP=0 -DWITH_FUSE=0 -DWITH_NSS=0 -DWITH_LTTNG=0
      ~/ceph/> cd build
      ~/ceph/build/> make -j4

       

      Step II - Start a vstart.sh Ceph "cluster"


      Once Ceph has finished compiling, vstart.sh can be run with the following parameters to configure and locally start three OSDs, one monitor process, and one MDS.
      ~/ceph/build/> OSD=3 MON=1 RGW=0 MDS=1 ../src/vstart.sh -i 192.168.155.1 -n
      ...
      ~/ceph/build/> bin/ceph -c ceph.conf status
      ...
      health HEALTH_OK
      monmap e2: 1 mons at {a=192.168.155.1:40160/0}
      election epoch 4, quorum 0 a
      fsmap e5: 1/1/1 up {0=a=up:active}
      mgr no daemons active
      osdmap e10: 3 osds: 3 up, 3 in

       

      Step III - Rapido configuration


      Edit rapido.conf, the master Rapido configuration file:
      ~/ceph/build/> cd ~/rapido
      ~/rapido/> vi rapido.conf
      • set CEPH_SRC="/home/<user>/ceph/src"
      • KERNEL_SRC and network parameters were configured earlier

      Step IV - Image Generation


      The cut_cephfs.sh script generates a VM image with the Ceph configuration and keyring from the vstart.sh cluster, as well as the CephFS kernel module.
      ~/rapido/> ./cut_cephfs.sh
      ...
       dracut: *** Creating initramfs image file 'initrds/myinitrd' done ***

       

      Step V - Boot!


      Booting the newly generated image should bring you to a shell prompt, with the vstart.sh provisioned CephFS filesystem mounted under /mnt/cephfs:
      ~/rapido/> ./vm.sh
      ...
      + mount -t ceph 192.168.155.1:40160:/ /mnt/cephfs -o name=admin,secret=...
      [ 3.492742] libceph: mon0 192.168.155.1:40160 session established
      ...
      rapido1:/# df -h /mnt/cephfs
      Filesystem Size Used Avail Use% Mounted on
      192.168.155.1:40160:/ 1.3T 611G 699G 47% /mnt/cephfs
      CephFS is a clustered filesystem, so testing from multiple clients is also of interest. From another window, boot a second VM:
      ~/rapido/> ./vm.sh

      Further Use Cases


      Rapido ships with a bunch of scripts for testing different kernel components:
      • cut_cephfs.sh (shown above)
        • Image: includes Ceph config, credentials and CephFS kernel module
        • Boot: mounts CephFS filesystem
      • cut_cifs.sh
        • Image: includes CIFS (SMB client) kernel module
        • Boot: mounts share using details and credentials specified in rapido.conf
      • cut_dropbear.sh
        • Image: includes dropbear SSH server
        • Boot: starts an SSH server with SSH_AUTHORIZED_KEY
      • cut_fstests_cephfs.sh
        • Image: includes xfstests and CephFS kernel client
        • Boot: mounts CephFS filesystem and runs FSTESTS_AUTORUN_CMD
      • cut_fstests_local.sh (shown above)
        • Image: includes xfstests and local Btrfs and XFS dependencies
        • Boot: provisions local xfstest zram devices. Runs FSTESTS_AUTORUN_CMD
      • cut_lio_local.sh
        • Image: includes LIO, loopback dev and dm-delay kernel modules
        • Boot: provisions an iSCSI target, with three LUs exposed
      • cut_lio_rbd.sh
        • Image: includes LIO and Ceph RBD kernel modules
        • Boot: provisions an iSCSI target backed by CEPH_RBD_IMAGE, using target_core_rbd
      • cut_qemu_rbd.sh
        • Image: CEPH_RBD_IMAGE is attached to the VM using qemu-block-rbd
        • Boot: runs shell only
      • cut_rbd.sh
        • Image: includes Ceph config, credentials and Ceph RBD kernel module
        • Boot: maps CEPH_RBD_IMAGE using the RBD kernel client
      • cut_tcmu_rbd_loop.sh
        • Image: includes Ceph config, librados, librbd, and pulls in tcmu-runner from TCMU_RUNNER_SRC
        • Boot: starts tcmu-runner and configures a tcmu+rbd backstore exposing CEPH_RBD_IMAGE via the LIO loopback fabric
      • cut_usb_rbd.sh (see https://github.com/ddiss/rbd-usb)
        • Image: usb_f_mass_storage, zram, dm-crypt, and RBD_USB_SRC
        • Boot: starts the conf-fs.sh script from RBD_USB_SRC

       

       

      Conclusion


        • Dracut and QEMU can be combined for super-fast Linux kernel testing and development.
        • Rapido is mostly just a glorified wrapper around these utilities, but does provide some useful tools for automated testing of specific Linux kernel functionality.

        If you run into any problems, or wish to provide any kind of feedback (always appreciated), please feel free to leave a message below, or raise a ticket in the Rapido issue tracker.

        Update 20170106:
        • Add cut_tcmu_rbd_loop.sh details and fix the example CEPH_SRC path.
          

        January 06, 2017 11:29 PM

        December 27, 2016

        David

        Adding Reviewed-by and Acked-by Tags with Git

        This week's "Git Rocks!" moment came while I was investigating how I could automatically add Reviewed-by, Acked-by, Tested-by, etc. tags to a given commit message.

        Git's interpret-trailers command is capable of testing for and manipulating arbitrary Key: Value tags in commit messages.

        For example, appending Reviewed-by: MY NAME <my@email.com> to the top commit message is as simple as running:

        > GIT_EDITOR='git interpret-trailers --trailer \
        "Reviewed-by: $(git config user.name) <$(git config user.email)>" \
        --in-place' git commit --amend 

        Or with the help of a "git rb" alias, via:
        > git config alias.rb "interpret-trailers --trailer \
        \"Reviewed-by: $(git config user.name) <$(git config user.email)>\" \
        --in-place"
        > GIT_EDITOR="git rb" git commit --amend

        The above examples work by replacing the normal git commit editor with a call to git interpret-trailers, which appends the desired tag to the commit message and then exits.

        My specific use case is to add Reviewed-by: tags to specific commits during interactive rebase, e.g.:
        > git rebase --interactive HEAD~3

        This brings up an editor with a list of the top three commits in the current branch. Assuming the aforementioned rb alias has been configured, individual commits will be given a Reviewed-by tag when appended with the following line:

        exec GIT_EDITOR="git rb" git commit --amend

        As an example, the following will see three commits applied, with the commit message for two of them (d9e994e and 5f8c115) appended with my Reviewed-by tag.

        pick d9e994e ctdb: Fix CID 1398179 Argument cannot be negative
        exec GIT_EDITOR="git rb" git commit --amend
        pick 0fb313c ctdb: Fix CID 1398178 Argument cannot be negative
        # ^^^^^^^ don't add a Reviewed-by tag for this one just yet
        pick 5f8c115 ctdb: Fix CID 1398175 Dereference after null check
        exec GIT_EDITOR="git rb" git commit --amend

        Bonus: By default, the vim editor includes git rebase --interactive syntax highlighting and key-bindings - if you press K while hovering over a commit hash (e.g. d9e994e from above), vim will call git show <commit-hash>, making reviewing and tagging even faster!



        Thanks to:
        • Upstream Git developers, especially those who implemented the interpret-trailers functionality.
        • My employer, SUSE.

        December 27, 2016 06:22 PM

        October 25, 2016

        Andreas

        Microsoft Catalog Files and Digital Signatures decoded

        TL;DR: Parse and print .cat files: parsemscat

        Introduction

        Günther Deschner and myself are looking into the new Microsoft Printing Protocol [MS-PAR]. Printing always means you have to deal with drivers. Microsoft package-aware v3 print drivers and v4 print drivers contain Microsoft Catalog files.

        A Catalog file (.cat) is a digitally-signed file; to be more precise, it is a PKCS7 certificate with embedded data. Before I started to look into the problem of understanding them, I searched the web to see whether someone had already decoded them. I found a post by Richard Hughes: Building a better catalog file. Richard described some of the things we had already discovered, and some new details, but it looks like he gave up when it came down to understanding the embedded data and writing an ASN.1 description for it. Over the last two weeks I set about decoding the myth of Catalog files, and created a tool for parsing them and printing their contents in human readable form.

        Details

        The embedded data in the PKCS7 signature of a Microsoft Catalog is a Certificate Trust List (CTL). Nikos Mavrogiannopoulos taught me ASN.1 and helped to create an ASN.1 description for the CTL. With this description I was able to start parsing Catalog files.

        CATALOG {}
        DEFINITIONS IMPLICIT TAGS ::=
        
        BEGIN
        
        -- CATALOG_NAME_VALUE
        CatalogNameValue ::= SEQUENCE {
            name       BMPString, -- UCS2-BE
            flags      INTEGER,
            value      OCTET STRING -- UCS2-LE
        }
        
        ...
        
        END

        mscat.asn

        The PKCS7 part of the .cat-file is the signature for the CTL. Nikos implemented support to get the embedded raw data from the PKCS7 Signature with GnuTLS. It is also possible to verify the signature using GnuTLS now!
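        As a quick aside, a recent GnuTLS certtool can already display the PKCS7 envelope of a catalog (a sketch; the file name is an example):

        > certtool --p7-info --inder --infile driver.cat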
        The CTL includes members and attributes. A member holds information about a file included in the driver package: OS attributes and often a hash of the file's contents, either SHA1 or SHA256. I've written abstracted functions, so it is possible to create a library and a simple command line tool called dumpmscat.

        Here is an example of the output:

        CATALOG MEMBER COUNT=1
        CATALOG MEMBER
          CHECKSUM: E5221540DC4B974F54DB4E390BFF4132399C8037
        
          FILE: sambap1000.inf, FLAGS=0x10010001
          OSATTR: 2:6.0,2:6.1,2:6.4, FLAGS=0x10010001
          MAC: SHA1, DIGEST: E5221540DC4B974F54DB4E39BFF4132399C8037

        In addition, the CTL normally has a list of attributes. These attributes typically carry OS flags, version information and hardware IDs.

        CATALOG ATTRIBUTE COUNT=2
          NAME=OS, FLAGS=0x10010001, VALUE=VistaX86,7X86,10X86
          NAME=HWID1, FLAGS=0x10010001, VALUE=usb\\vid_0ff0&pid_ff00&mi_01

        Currently the project only has a command line tool, dumpmscat, and it can only print the CTL for now. I plan to add options to verify the signature, dump only parts, etc. When this is done I will create a library so it can easily be consumed by other software. If someone is interested and wants to contribute, something like signtool.exe would be nice to have.

        October 25, 2016 03:17 PM

        September 21, 2016

        Andreas

        A new cmocka release version 1.1.0

        It took more than a year but finally Jakub and I released a new version of cmocka today. If you don’t know it yet, cmocka is a unit testing framework for C with support for mock objects!

        We set the version number to 1.1.0 because we have some new features:

        • Support to catch multiple exceptions
        • Support to verify call ordering (for mocking)
        • Support to pass initial data to test cases
        • A will_return_maybe() function for ignoring mock returns
        • Subtests for groups using TAP output
        • Support to write multiple XML output files if you have several groups in a test
        • and improved documentation

        We have some more features we are working on; I hope it will not take as long to release them.

        September 21, 2016 03:15 PM

        June 28, 2016

        David

        Linux USB Gadget Application Testing

        Developing a USB gadget application that runs on Linux?
        Following a recent Ceph USB gateway project, I was looking at ways to test a Linux USB device without the need to fiddle with cables, or deal with slow embedded board boot times.

        Ideally USB gadget testing could be performed by running the USB device code within a virtual machine, and attaching the VM's virtual USB device port to an emulated USB host controller on the hypervisor system.


        I was unfortunately unable to find support for virtual USB device ports in QEMU, so I abandoned the above architecture, and discovered dummy_hcd.ko instead.


        The dummy_hcd Linux kernel module is an excellent utility for USB device testing from within a standalone system or VM.



        dummy_hcd.ko offers the following features:
        • Re-route USB device traffic back to the local system
          • Effectively providing device loopback functionality
        • USB high-speed and super-speed connection simulation
        It can be enabled via the USB_DUMMY_HCD kernel config parameter. Once the module is loaded, no further configuration is required.
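        A minimal loopback session might look as follows (a sketch: g_mass_storage is just one example gadget driver, and the backing file path is arbitrary):

        > dd if=/dev/zero of=/tmp/usb_backing.img bs=1M count=64
        > sudo modprobe dummy_hcd
        > sudo modprobe g_mass_storage file=/tmp/usb_backing.img
        > lsusb
        # the gadget should now show up as a locally attached USB device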

        June 28, 2016 01:53 PM

        June 15, 2016

        Rusty

        Minor update on transaction fees: users still don’t care.

        I ran some quick numbers on the last retargeting period (blocks 415296 through 416346 inclusive) which is roughly a week’s worth.

        Blocks were full: median 998k, mean 818k (some miners blind mining on top of unknown blocks). Yet of the 1,618,170 non-coinbase transactions, 48% were still paying dumb, round fees (like 5000 satoshis). Another 5% were paying dumb round-numbered per-byte fees (like 80 satoshi per byte).

        The mean fee was 24051 satoshi (~16c), the mean fee rate 60 satoshi per byte. But if we look at the amount you needed to pay to get into a block (using the second cheapest tx which got in), the mean was 16.81 satoshis per byte, or about 5c.

        tl;dr: It’s like a tollbridge charging vehicles 7c per ton, but half the drivers are just throwing a quarter as they drive past and hoping it’s enough. It really shows fees aren’t high enough to notice, and transactions don’t get stuck often enough to notice. That’s surprising; at what level will they notice? What wallets or services are they using?

        June 15, 2016 03:00 AM

        May 11, 2016

        David

        Rapid Ceph Kernel Module Testing with vstart.sh

        Introduction

        Ceph's vstart.sh utility is very useful for deploying and testing a mock cluster directly from the Ceph source repository. It can:
        • Generate a cluster configuration file and authentication keys
        • Provision and deploy a number of OSDs
          • Backed by local disk, or memory using the --memstore parameter
        • Deploy an arbitrary number of monitor, MDS or rados-gateway nodes
        All services are deployed as the running user, i.e. root access is not needed.

        Once deployed, the mock cluster can be used with any of the existing Ceph client utilities, or exercised with the unit tests in the Ceph src/test directory.

        When developing or testing Linux kernel changes for CephFS or RBD, it's useful to also be able to use these kernel clients against a vstart.sh deployed Ceph cluster.

        Test Environment Overview - image based on content by Sage Weil

        The instructions below walk through configuration and deployment of all components needed to test Linux kernel RBD and CephFS modules against a mock Ceph cluster. The procedure was performed on openSUSE Leap 42.1, but should also be applicable for other Linux distributions.

        Network Setup

        First off, configure a bridge interface to connect the Ceph cluster with a kernel client VM network:

        > sudo /sbin/brctl addbr br0
        > sudo ip addr add 192.168.155.1/24 dev br0
        > sudo ip link set dev br0 up

        br0 will not be bridged with any physical adapters, just the kernel VM via a TAP interface which is configured with:

        > sudo /sbin/tunctl -u $(whoami) -t tap0
        > sudo /sbin/brctl addif br0 tap0
        > sudo ip link set tap0 up

        For more information on the bridge setup, see:
        http://blog.elastocloud.org/2015/07/qemukvm-bridged-network-with-tap.html

        Ceph Cluster Deployment

        The Ceph cluster can now be deployed, with all nodes accepting traffic on the bridge network:

        > cd $ceph_source_dir
        <build Ceph>
        > cd src
        > OSD=3 MON=1 RGW=0 MDS=1 ./vstart.sh -i 192.168.155.1 -n --memstore

        $ceph_source_dir should be replaced with the actual path. Be sure to specify the same IP address with -i as was assigned to the br0 interface.

        More information about vstart.sh usage can be found at:
         http://docs.ceph.com/docs/hammer/dev/dev_cluster_deployement/

        Kernel VM Deployment

        Build a kernel:
         
        > cd $kernel_source_dir
        > make menuconfig
        $kernel_source_dir should be replaced with the actual path. Ensure CONFIG_BLK_DEV_RBD=m, CONFIG_CEPH_FS=y, CONFIG_CEPH_LIB=y, CONFIG_E1000=y and CONFIG_IP_PNP=y are set in the kernel config. A sample can be found here.
         
        > make
        > INSTALL_MOD_PATH=./mods make modules_install
         
        Create a link to the modules directory ./mods, so that Dracut can find them:
         
        > sudo ln -s $PWD/mods/lib/modules/$(make kernelrelease) \
        /lib/modules/$(make kernelrelease)

        Generate an initramfs with Dracut. This image will be used as the test VM.
         
        > export CEPH_SRC=$ceph_source_dir/src
        > dracut --no-compress --kver "$(cat include/config/kernel.release)" \
        --install "tail blockdev ps rmdir resize dd vim grep find df sha256sum \
        strace mkfs.xfs /lib64/libkeyutils.so.1" \
        --include "$CEPH_SRC/mount.ceph" "/sbin/mount.ceph" \
        --include "$CEPH_SRC/ceph.conf" "/etc/ceph/ceph.conf" \
        --add-drivers "rbd" \
        --no-hostonly --no-hostonly-cmdline \
        --modules "bash base network ifcfg" \
        --force myinitrd

        Boot the kernel and initramfs directly using QEMU/KVM:
         
        > qemu-kvm -smp cpus=2 -m 512 \
        -kernel arch/x86/boot/bzImage -initrd myinitrd \
        -device e1000,netdev=network1,mac=b8:ac:6f:31:45:70 \
        -netdev tap,id=network1,script=no,downscript=no,ifname=tap0 \
        -append "ip=192.168.155.2:::255.255.255.0:myhostname \
        rd.shell=1 console=ttyS0 rd.lvm=0 rd.luks=0" \
        -nographic

        This should bring up a Dracut debug shell in the VM, with a network configuration matching the values parsed in via the ip= kernel parameter.

        dracut:/# ip a
        ...
        2: eth0: ... mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether b8:ac:6f:31:45:70 brd ff:ff:ff:ff:ff:ff
        inet 192.168.155.2/24 brd 192.168.155.255 scope global eth0

        For more information on kernel setup, see:
        http://blog.elastocloud.org/2015/06/rapid-linux-kernel-devtest-with-qemu.html

        RBD Image Provisioning

        An RBD volume can be provisioned using the regular Ceph utilities in the Ceph source directory:

        > cd $ceph_source_dir/src
        > ./rados lspools
        rbd
        ...

        By default, an rbd pool is created by vstart.sh, which can be used for RBD images:
         
        > ./rbd create --image-format 1 --size 1024 1g_vstart_img
        > ./rbd ls -l
        NAME SIZE PARENT FMT PROT LOCK
        1g_vstart_img 1024M 1

        Note: "--image-format 1" is specified to ensure that the kernel supports all features of the provisioned RBD image.

        Kernel RBD Usage

        From the Dracut shell, the newly provisioned 1g_vstart_img image can be mapped locally using the sysfs filesystem:
        dracut:/# modprobe rbd
        [ 9.031056] rbd: loaded
        dracut:/# echo -n "192.168.155.1:6789 name=admin,secret=AQBPiuhd9389dh28djASE32Ceiojc234AF345w== rbd 1g_vstart_img -" > /sys/bus/rbd/add
        [ 347.743272] libceph: mon0 192.168.155.1:6789 session established
        [ 347.744284] libceph: client4121 fsid 234b432f-a895-43d2-23fd-9127a1837b32
        [ 347.749516] rbd: rbd0: added with size 0x40000000

        Note: The monitor address and admin credentials can be retrieved from the ceph.conf and keyring files respectively, located in the Ceph source directory.

        The /dev/rbd0 mapped image can now be used like any other block device:
        dracut:/# mkfs.xfs /dev/rbd0 
        ...
        dracut:/# mkdir -p /mnt/rbdfs
        dracut:/# mount /dev/rbd0 /mnt/rbdfs
        [ 415.841757] XFS (rbd0): Mounting V4 Filesystem
        [ 415.917595] XFS (rbd0): Ending clean mount
        dracut:/# df -h /mnt/rbdfs
        Filesystem Size Used Avail Use% Mounted on
        /dev/rbd0 1014M 33M 982M 4% /mnt/rbdfs


        Kernel CephFS Usage

        vstart.sh already goes to the effort of deploying a filesystem:
        > cd $ceph_source_dir/src
        > ./ceph fs ls
        name: cephfs_a, metadata pool: cephfs_metadata_a, data pools: [cephfs_data_a ]

        All that's left is to mount it from the kernel VM using the mount.ceph binary that was copied into the initramfs:
        dracut:/# mkdir -p /mnt/mycephfs
        dracut:/# mount.ceph 192.168.155.1:6789:/ /mnt/mycephfs \
        -o name=admin,secret=AQBPiuhd9389dh28djASE32Ceiojc234AF345w==
        [ 723.103153] libceph: mon0 192.168.155.1:6789 session established
        [ 723.184978] libceph: client4122 fsid 234b432f-a895-43d2-23fd-9127a1837b32

        dracut:/# df -h /mnt/mycephfs/
        Filesystem Size Used Avail Use% Mounted on
        192.168.155.1:6789:/ 3.0G 4.0M 3.0G 1% /mnt/mycephfs


        Note: The monitor address and admin credentials can be retrieved from the ceph.conf and keyring files respectively, located in the Ceph source directory.

        Cleanup

        Unmount CephFS:
        dracut:/# umount /mnt/mycephfs

        Unmount the RBD image:
        dracut:/# umount /dev/rbd0
        [ 1592.592510] XFS (rbd0): Unmounting Filesystem

        Unmap the RBD image (0 is derived from /dev/rbdX):
        dracut:/# echo -n 0 > /sys/bus/rbd/remove

        Power-off the VM:
        dracut:/# echo 1 > /proc/sys/kernel/sysrq && echo o > /proc/sysrq-trigger
        [ 1766.387417] sysrq: SysRq : Power Off
        dracut:/# [ 1766.811686] ACPI: Preparing to enter system sleep state S5
        [ 1766.812217] reboot: Power down

        Shutdown the Ceph cluster:
        > cd $ceph_source_dir/src
        > ./stop.sh

        Conclusion

        A mock Ceph cluster can be deployed from source in a matter of seconds using the vstart.sh utility.
        Likewise, a kernel can be booted directly from source alongside a throwaway VM and connected to the mock Ceph cluster in a couple of minutes with Dracut and QEMU/KVM.

        This environment is ideal for rapid development and integration testing of Ceph user-space and kernel components, including RBD and CephFS.

        May 11, 2016 02:40 PM

        April 08, 2016

        David

        Efficient Microsoft Azure Uploads and Downloads

        With the release of version 0.7.1, Elasto is now capable of efficient (sparse aware) uploads and downloads to/from Microsoft Azure, using the Blob and File services.


        Example of a Microsoft Azure Page Blob Download


        This is done by determining which regions of a Page Blob, File Service file, or local file are allocated, and only transferring those regions, which improves both network and storage utilisation.
        • For Azure Page Blobs, the Get Page Ranges API request is used to obtain a list of allocated regions.
        • For Azure File Service files, the List Ranges API request is used.
        • For local files, SEEK_DATA and SEEK_HOLE are used to determine which regions of a file are allocated (see the sketch after this list).
        • Amazon S3 Objects and Azure Block Blobs are still downloaded and uploaded in their entirety.
          • Sparse regions are unsupported by these services.
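        The local-file side of this can be demonstrated with xfs_io, which drives the same lseek(2) SEEK_DATA/SEEK_HOLE machinery (a sketch; the file name and offsets are arbitrary, and the output shown is approximate):

        > truncate -s 1G sparse.img
        > dd if=/dev/urandom of=sparse.img bs=1M count=1 seek=512 conv=notrunc
        > xfs_io -c "seek -a -r 0" sparse.img
        Whence  Result
        HOLE    0
        DATA    536870912
        HOLE    537919488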
        Elasto is free software, and can be obtained for openSUSE and many other Linux distributions from the openSUSE Build Service. Be safe, take backups before experimenting with this new feature.

        April 08, 2016 05:46 AM

        Rusty

        Bitcoin Generic Address Format Proposal

        I’ve been implementing segregated witness support for c-lightning; it’s interesting that there’s no address format for the new form of addresses.  There’s a segregated-witness-inside-p2sh which uses the existing p2sh format, but if you want raw segregated witness (which is simply a “0” followed by a 20-byte or 32-byte hash), the only proposal is BIP142 which has been deferred.

        If we’re going to have a new address format, I’d like to make the case for shifting away from bitcoin’s base58 (eg. 1At1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2):

        1. base58 is not trivial to parse.  I used the bignum library to do it, though you can open-code it as bitcoin-core does.
        2. base58 addresses are variable-length.  That makes webforms and software mildly harder, but also eliminates a simple sanity check.
        3. base58 addresses are hard to read over the phone.  Greg Maxwell points out that the upper and lower case mix is particularly annoying.
        4. The 4-byte SHA check does not guarantee to catch the most common form of errors; transposed or single incorrect letters, though it’s pretty good (1 in 4 billion chance of random errors passing).
        5. At around 34 letters, it’s fairly compact (36 for the BIP141 P2WPKH).

        This is my proposal for a generic replacement (thanks to CodeShark for generalizing my previous proposal) which covers all possible future address types (as well as being usable for current ones):

        1. Prefix for type, followed by colon.  Currently “btc:” or “testnet:“.
        2. The full scriptPubkey using base 32 encoding as per http://philzimmermann.com/docs/human-oriented-base-32-encoding.txt.
        3. At least 30 bits for crc64-ecma, up to a multiple of 5 to reach a letter boundary.  This covers the prefix (as ascii), plus the scriptPubKey.
        4. The final letter is the Damm algorithm check digit of the entire previous string, using this 32-way quasigroup. This protects against single-letter errors as well as single transpositions.

        These addresses look like btc:ybndrfg8ejkmcpqxot1uwisza345h769ybndrrfg (41 digits for a P2WPKH) or btc:yybndrfg8ejkmcpqxot1uwisza345h769ybndrfg8ejkmcpqxot1uwisza34 (60 digits for a P2WSH) (note: neither of these has the correct CRC or check letter, I just made them up).  A classic P2PKH would be 45 digits, like btc:ybndrfg8ejkmcpqxot1uwisza345h769wiszybndrrfg, and a P2SH would be 42 digits.

        While manually copying addresses is something which should be avoided, it does happen, and the cost of making them robust against common typographic errors is small.  The CRC is a good idea even for machine-based systems: it will let through less than 1 in a billion mistakes.  Distinguishing which blockchain is a nice catchall for mistakes, too.

        We can, of course, bikeshed this forever, but I wanted to anchor the discussion with something I consider fairly sane.

        April 08, 2016 01:50 AM
