Alexander Leidinger

Just another weblog

Sep
29

Kernel features patchset (from GSoC 2010)

I am playing around with the patchset "my" student produced during this year's GSoC (the code for all projects is available from Google). In short, it lets you query from userland which optional kernel features are available. I had him concentrate on those features which are not so easy to detect from userland, or where the detection could trigger an autoload of a kernel module.

I let the output speak for itself. First, the output before his patchset:

kern.features.compat_freebsd7: 1
kern.features.compat_freebsd6: 1
kern.features.posix_shm: 1

And now with his patchset:

kern.features.compat_freebsd6: 1
kern.features.compat_freebsd7: 1
kern.features.ffs_snapshot: 1
kern.features.geom_label: 1
kern.features.geom_mirror: 1
kern.features.geom_part_bsd: 1
kern.features.geom_part_ebr: 1
kern.features.geom_part_ebr_compat: 1
kern.features.geom_part_mbr: 1
kern.features.geom_vol: 1
kern.features.invariant_support: 1
kern.features.kdtrace_hooks: 1
kern.features.kposix_priority_scheduling: 1
kern.features.ktrace: 1
kern.features.nfsclient: 1
kern.features.nfsserver: 1
kern.features.posix_shm: 1
kern.features.pps_sync: 1
kern.features.quota: 1
kern.features.scbus: 1
kern.features.softupdates: 1
kern.features.stack: 1
kern.features.sysv_msg: 1
kern.features.sysv_sem: 1
kern.features.sysv_shm: 1
kern.features.ufs_acl: 1

With his patches we have a total of 84 kernel features which can be queried (obviously I do not have all optional features enabled in the kernel which produced this output). All of the features also have a description, and it is easy to add more features. As an example, this is all that is necessary to produce the kern.features.stack output:

./kern/subr_stack.c:FEATURE(stack, "Support for capturing kernel stack");

There is also a little userland application (and a library interface) which allows scripts and applications to query several features, with the possibility to pretend that a feature is not there (this was a requirement for ports). Pretending that a feature is there when it is not was ruled out: such run-time detection is only necessary for things which are going to run soon, and pretending a feature is there while it is not would cause big problems. Unfortunately the man page for the application is not yet ready, but I'm sure you can figure out how to use it.
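To make the idea concrete, here is a minimal Python sketch of how a script could consume sysctl output like the one shown above and honour a "pretend this feature is not there" list. This is not the userland application or the library from the patchset (their interface is not described here); the names parse_features, has_feature and pretend_absent are made up for this illustration.

```python
def parse_features(sysctl_output):
    """Turn lines like 'kern.features.stack: 1' into {'stack': True}."""
    prefix = "kern.features."
    features = {}
    for line in sysctl_output.splitlines():
        name, sep, value = line.strip().partition(":")
        if not sep or not name.startswith(prefix):
            continue  # ignore anything that is not a kern.features line
        features[name[len(prefix):]] = value.strip() == "1"
    return features


def has_feature(features, name, pretend_absent=()):
    """Report a feature as present unless we were asked to hide it.

    There is deliberately no way to pretend a missing feature is present.
    """
    if name in pretend_absent:
        return False
    return features.get(name, False)


# Sample text copied from the listing above, not queried from a live system.
SAMPLE = """\
kern.features.compat_freebsd7: 1
kern.features.posix_shm: 1
kern.features.stack: 1
"""

features = parse_features(SAMPLE)
print(has_feature(features, "stack"))                            # True
print(has_feature(features, "stack", pretend_absent={"stack"}))  # False
print(has_feature(features, "ufs_acl"))                          # False
```

On a real system the text would come from running "sysctl kern.features" instead of from the SAMPLE string.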

The names and descriptions of the features follow an easy scheme: what is written down in NOTES is used as the name and the description of the feature (an exception is geom_part_X, where we decided to use a common theme ("GEOM partitioning class for XXX") which is distinct from the corresponding geom_X class). If you have complaints about what is used for a specific feature, do not complain to him: change it in NOTES and the feature will follow.

If you have questions, suggestions, or some other reason to contact him, his FreeBSD address is kibab@. Feel free to encourage him to go ahead with the next steps (finishing the man page, splitting up the patches into sensible pieces and presenting them on the appropriate mailing lists for review). :-)

Sep
28

The FreeBSD-linuxulator explained (for users)

After another mail in which I explained a little bit of the linuxulator behavior, it is time to write an easy text which I can reference in future answers. If someone wants to add parts of this explanation to the FreeBSD handbook, go ahead.

Linux emulation? No, "native" execution (sort of)!

First, the linuxulator is not an emulation. It is "just" a binary interface which is a little bit different from the FreeBSD-"native" one. The binary files in FreeBSD and in Linux are both files which comply with the ELF specification.

When the FreeBSD kernel loads an ELF file, it checks whether it is a FreeBSD ELF file or a Linux ELF file (or some other flavor it knows about). Based upon this, it looks up the appropriate actions for this binary in a table (it can also differentiate between 64-bit and 32-bit, and probably other things too).

The FreeBSD table is always compiled in (for a better big picture: at least on the AMD/Intel 64-bit platform it is also possible to additionally include a 32-bit version of this table, to be able to execute 32-bit programs on 64-bit systems). Other tables, like the Linux one, can be loaded into the kernel additionally (or built statically into the kernel, if desired).

Those tables contain some parameters and pointers which allow the kernel to execute the binary. If a program makes a system call, the kernel looks up the correct function in this table. It does this for FreeBSD binaries and for Linux binaries alike. This means that there is no emulation/simulation (overhead) going on... at least ideally. Some behavior differs a little bit between Linux and FreeBSD, so a little bit of translation/housekeeping has to be done for some Linux system calls before the underlying FreeBSD kernel functions can do the work.

This means that a lot of Linux stuff in FreeBSD is handled at the same speed as if the Linux program were a FreeBSD program.
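A toy model in Python may help to picture the per-ABI tables. It is only a sketch of the idea, not how the kernel is written: the syscall numbers, the flag values and all function names below are invented for the illustration.

```python
# Invented flag values: the point is only that the two ABIs may encode
# the same request differently.
NATIVE_O_CREAT = 0x1
LINUX_O_CREAT = 0x2


def native_getpid():
    """Stand-in for a native kernel function."""
    return 1234


def native_open(path, flags):
    """Stand-in for the native open; it just reports what it was asked."""
    return ("open", path, flags)


def linux_open(path, flags):
    """Linux entry point: translate the flags, then reuse the native code."""
    native_flags = 0
    if flags & LINUX_O_CREAT:
        native_flags |= NATIVE_O_CREAT
    return native_open(path, native_flags)


# One table per binary flavor, indexed by (made-up) syscall number.
TABLES = {
    "freebsd": {1: native_getpid, 2: native_open},
    # getpid needs no translation, so the Linux table points straight
    # at the native function; open goes through a thin wrapper.
    "linux": {7: native_getpid, 9: linux_open},
}


def syscall(abi, number, *args):
    """Dispatch through the table that belongs to the binary's ABI."""
    return TABLES[abi][number](*args)


print(syscall("freebsd", 2, "/tmp/x", NATIVE_O_CREAT))
print(syscall("linux", 9, "/tmp/x", LINUX_O_CREAT))
```

Both calls end up in the same native function with the same arguments; the Linux path only pays for the small flag translation.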

Linux file/directory tricks

When the kernel detects a Linux program, it also plays some tricks with files and directories (this is also a property of the above-mentioned table in the kernel, so theoretically the kernel could play such tricks for FreeBSD programs too).

If a Linux program looks up a file or directory /A, the kernel will first look for /compat/linux/A, and only if it does not find it will it look for /A. This is important! For example, if you have an empty /compat/linux/home, any application which wants to display the contents of /home will show the contents of /compat/linux/home. As it is empty, you see nothing. If this application does not allow you to enter a directory manually via the keyboard, you have lost (OK, you can remove /compat/linux/home or fill it with what you want to have). If you can enter a directory via the keyboard, you could enter /home/yourlogin. The kernel would first look for /compat/linux/home/yourlogin and, as it cannot find it, then look for /home/yourlogin (which we assume is there), so the application would display the contents of your home directory.

This implies sev­eral things:

  • you can hide FreeBSD directory contents from Linux programs while still being able to access the content
  • "badly" programmed Linux applications (more correctly: Linux programs which make assumptions which do not hold in FreeBSD) can prevent you from accessing FreeBSD files, or files which are the same in Linux and FreeBSD (like /etc/group which is not available in /compat/linux in the linux_base ports, so that the FreeBSD one is read)
  • you can have different files for Linux than for FreeBSD
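The lookup order described above can be sketched in a few lines of Python. The set called fs is a pretend filesystem and resolve_for_linux is a name made up for this example; the real lookup happens inside the kernel.

```python
LINUX_PREFIX = "/compat/linux"


def resolve_for_linux(path, exists):
    """Return the path a Linux program really gets for `path`, or None."""
    shadow = LINUX_PREFIX + path
    if exists(shadow):      # first try below /compat/linux
        return shadow
    if exists(path):        # then fall back to the FreeBSD path
        return path
    return None


# Pretend filesystem: an empty /compat/linux/home hides /home itself,
# but not the entries below it.
fs = {"/compat/linux/home", "/home", "/home/yourlogin", "/etc/group"}
exists = fs.__contains__

print(resolve_for_linux("/home", exists))            # /compat/linux/home
print(resolve_for_linux("/home/yourlogin", exists))  # /home/yourlogin
print(resolve_for_linux("/etc/group", exists))       # /etc/group
```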

The Linux userland

The linux_base port in FreeBSD comes from a plain installation of Linux packages. The difference is that some files are deleted, either because we cannot use them in the linuxulator, or because they already exist in the FreeBSD tree at the same place and we want the Linux programs to use the FreeBSD file (/etc/group and /etc/passwd come to mind). The installation also marks binary programs as Linux programs, so that the kernel knows which kernel table to consult for system calls and such (this is not really necessary for all binary programs, but it is harder to script the correct detection logic than to just "brand" all binary programs).

Additionally, some configuration is done to (hopefully) make it do the right thing out of the box. The complete setup of the linux_base ports is done to let Linux programs integrate into FreeBSD. This means that if you start acroread or skype, you do not want to have to configure some things in /compat/linux/etc/ first to have your fonts look the same and your user IDs resolved to names (this does not work if you use LDAP, Kerberos or other directory services for the user/group ID management; in that case you need to configure this yourself). All this should just work, and the application windows should just pop up on your screen so that you can do what you want to do. Some linux_base ports also do not work on all FreeBSD releases. This can be because some kernel feature which the linux_base port depends upon is not available (yet) in FreeBSD. Because of this you should not choose a linux_base port yourself. Just install the program from the Ports Collection and let it install the correct linux_base port automatically (a different FreeBSD release may have a different default linux_base port).

A note of caution: there are instructions out there which explain how to install more recent linux_base ports on FreeBSD releases which do not have them as the default. You do this at your own risk; it may or may not work. It depends upon which programs you use and which versions those programs have (or more technically, which kernel features they depend upon). If it does not work for you, you have just two possibilities: revert and forget about it, or update your FreeBSD version to a more recent one (but it could be the case that even the most recent development version of FreeBSD does not have support for what you need).
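For the curious, the "brand" can be looked at from userland. The sketch below reads the identification bytes at the start of an ELF file and reports the class and the OS/ABI byte. The byte offsets and the three OS/ABI values come from the ELF specification; that this byte is the only thing which matters is an assumption of the sketch, as the kernel's real detection logic is more involved and not reproduced here. The function names are made up.

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4   # offset of the 32-bit/64-bit marker in e_ident
EI_OSABI = 7   # offset of the OS/ABI byte in e_ident

CLASS_NAMES = {1: "32-bit", 2: "64-bit"}
OSABI_NAMES = {0: "SYSV/none", 3: "Linux", 9: "FreeBSD"}


def describe_elf_ident(header):
    """Return (class, osabi) for the first 16 bytes of an ELF file."""
    if len(header) < 16 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    return (
        CLASS_NAMES.get(header[EI_CLASS], "unknown"),
        OSABI_NAMES.get(header[EI_OSABI], "other"),
    )


def fake_ident(elf_class, osabi):
    """Build a 16-byte e_ident for demonstration purposes only."""
    return ELF_MAGIC + bytes([elf_class, 1, 1, osabi]) + bytes(8)


print(describe_elf_ident(fake_ident(2, 3)))  # ('64-bit', 'Linux')
print(describe_elf_ident(fake_ident(1, 9)))  # ('32-bit', 'FreeBSD')
```

For a real file you would pass the result of open(path, "rb").read(16) instead of the fake header.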

Linux libraries and “ELF file OS ABI invalid”-error messages

Due to the file/directory tricks of the kernel explained above, you have to be careful with (additional) Linux libraries. When a Linux program needs some libraries, several directories (specified in /compat/linux/etc/ld.so.conf) are searched. Let us assume that /compat/linux/etc/ld.so.conf specifies to search in /A, /B and /C. This means the FreeBSD kernel first gets a request to open /A/libXYZ. Because of this it first tries /compat/linux/A/libXYZ, and if that does not exist it tries /A/libXYZ. When this fails too, the Linux runtime linker tries the next directory in the config, so that the kernel now looks for /compat/linux/B/libXYZ and, if that does not exist, for /B/libXYZ.

Now assume that libXYZ is in /compat/linux/C/ as a Linux library, and in /B as a FreeBSD library. This means that the kernel will first find the FreeBSD library /B/libXYZ. The Linux binary which needs it cannot do anything with this FreeBSD library (which depends upon the FreeBSD syscall table and FreeBSD symbols from e.g. libc), and the Linux runtime linker will bail out because of this (actually it sees by reading the ELF header that the library is not of the required type). Unfortunately the Linux runtime linker will not continue to search for another library with the same name in another directory (at least this was the case the last time I checked and modified the order in which the Linux runtime linker searches for libraries... this has been a while, so it may be smarter now), and you will see the above error message (if you started the Linux program in a terminal).

The bottom line of all this is: the error message about "ELF file OS ABI invalid" just means that the Linux program was not able to find the correct Linux library and got a FreeBSD library instead. Install the corresponding Linux library and make sure the Linux program finds it instead of the FreeBSD library (do not forget to run "/compat/linux/sbin/ldconfig -r /compat/linux" if you make changes by hand instead of using a port, or your changes may not be taken into account).
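The /A, /B, /C scenario can be replayed with a small Python simulation. It models the behavior as described here (the linker stops at the first file it gets, whatever its type), which may not match what a current Linux runtime linker does. The dictionary fs, the WrongABI exception and the function names are invented for the example.

```python
LINUX_PREFIX = "/compat/linux"


class WrongABI(Exception):
    """Raised when the first library found is not a Linux one."""


def kernel_open(path, fs):
    """Kernel side: try below /compat/linux first, then the plain path."""
    for candidate in (LINUX_PREFIX + path, path):
        if candidate in fs:
            return candidate, fs[candidate]
    return None


def load_library(name, search_dirs, fs):
    """Linker side: walk the configured directories, stop at the first hit."""
    for directory in search_dirs:
        found = kernel_open(directory + "/" + name, fs)
        if found is None:
            continue  # nothing there, try the next configured directory
        path, abi = found
        if abi != "linux":
            # No retry with the remaining directories, as described above.
            raise WrongABI(path + ": ELF file OS ABI invalid")
        return path
    raise FileNotFoundError(name)


# Pretend filesystem: path -> flavor of the library stored there.
fs = {"/B/libXYZ": "freebsd", "/compat/linux/C/libXYZ": "linux"}

try:
    load_library("libXYZ", ["/A", "/B", "/C"], fs)
except WrongABI as err:
    print(err)  # /B/libXYZ: ELF file OS ABI invalid

# Searching /C before /B finds the Linux library.
print(load_library("libXYZ", ["/A", "/C", "/B"], fs))
```

Installing a Linux libXYZ as /compat/linux/B/libXYZ would fix the first call as well, because the kernel checks below /compat/linux before it looks at /B.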

Constraints regarding chroot into /compat/linux

The linux_base ports are designed to give a nice install-and-start experience. The drawback is that there is no full Linux system in /compat/linux, so doing a chroot into /compat/linux will cause trouble (depending on what you want to do). If you want to chroot into a Linux system on your FreeBSD machine, you are better off installing a linux_dist port. A linux_dist port can be installed in parallel to a linux_base port. The two are independent, so you need to redo or copy any configuration changes you want to have in both environments.

Sep
28

All internal services migrated to IPv6

Over the last few days I migrated all my internal services to IPv6.

All my jails have an IPv4 and an IPv6 address now. All Apaches (I have one for my picture gallery, one for webmail, and one for internal management) now listen on the internal IPv6 address too. Squid is updated from 2.x to 3.1 (the most recent version in the Ports Collection) and I added some IPv6 ACLs. The internal Postfix is configured to handle IPv6 too (it delivers everything via an authenticated and encrypted channel to a machine with a static IPv4 address for final delivery). My MySQL does not need an IPv6 address, as it only listens to requests via IPC (the socket is hardlinked between jails). All ssh daemons are configured to listen on IPv6 too. The IMAP and CUPS servers picked up the new IPv6 addresses automatically. I also updated Samba to handle IPv6, but for lack of a Windows machine which prefers IPv6 over IPv4 for CIFS access (at least I think my Windows XP netbook only tries IPv4 connections) I cannot really test this.

Only my Wii is a little bit behind, and I have not checked whether my Sony TV will DTRT (for this I first have to find some time to check whether I have to update the DD-WRT firmware on the little WLAN router which is "extending the cable" from the TV to the internal network, and I have to look up how to configure IPv6 with DD-WRT).

Sep
23

ZFS and NFS / on-disk-cache

On the FreeBSD mailing lists I stumbled over a post which refers to a blog post describing why ZFS seems to be slow (on Solaris).

In short: ZFS guarantees that the NFS client does not experience silent corruption of data (an NFS server crash and the loss of data which, as far as the client knows, is already on disk). A recommendation is to enable the disk cache for disks which are completely used by ZFS, as ZFS (unlike UFS) is aware of disk caches. This increases the performance to what UFS delivers in the NFS case.

There is no in-depth description of what it means that ZFS is aware of disk caches, but I think this refers to the fact that ZFS sends a flush command to the disk at the right moments. Leaving aside the fact that there are disks out there which lie to you about this (they report that the flush command has finished when it has not), this would mean that this is supported in FreeBSD too.

So, everyone who is currently disabling the ZIL to get better NFS performance (and accepting silent data corruption on the client side): move your zpool to dedicated disks (no real FS other than ZFS on them; swap and dump devices are OK), make sure they are honest ones, and enable the disk caches instead of disabling the ZIL.

I also recommend that people who already have ZFS on dedicated (and honest) disks check whether the disk caches are enabled.

Sep
21

IPv6 in my LAN

After enabling IPv6 in my WLAN router, I also enabled IPv6 on my FreeBSD systems. I have to say that the IPv6 chapter in the FreeBSD handbook does not contain as much information about this as I would like.

Configuring the interfaces of my two 9-current systems to also carry a specific IPv6 address (an easy one from the ULA I use) was easy after reading the man page for rc.conf. After a little bit of experimenting it came down to:

ifconfig_rl0_ipv6="inet6 ::2:1 prefixlen 64 accept_rtadv"
ipv6_defaultrouter="<router address>"

Apart from this address (I chose it because the IPv4 address ends in ".2"; this way I can add some easy-to-remember addresses for this machine if needed), I also have two automatically configured addresses. One has the same ULA prefix and a not so easy to remember ending (constructed from the MAC address), and one is from the official prefix which the router constructed out of the official IPv4 address from the ISP (plus the same ending as the other one).
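How such an automatically configured address comes out of the MAC address can be sketched in Python, assuming the usual modified EUI-64 scheme (flip one bit in the first octet and insert ff:fe in the middle). The prefix and the MAC address in the example are placeholders and not the ones from my network; the function names are made up.

```python
import ipaddress


def eui64_interface_id(mac):
    """Build the 64-bit interface ID from a MAC like '00:11:22:33:44:55'."""
    octets = [int(part, 16) for part in mac.split(":")]
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02                              # flip the universal/local bit
    eui = octets[:3] + [0xFF, 0xFE] + octets[3:]   # insert ff:fe in the middle
    return int.from_bytes(bytes(eui), "big")


def address_in_prefix(prefix, interface_id):
    """Combine a /64 prefix with a 64-bit interface ID."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("this sketch only handles /64 prefixes")
    return ipaddress.IPv6Address(int(net.network_address) | interface_id)


PREFIX = "fd00:1234:5678:1::/64"   # placeholder ULA prefix
MAC = "00:11:22:33:44:55"          # placeholder MAC address

# The automatically configured address.
print(address_in_prefix(PREFIX, eui64_interface_id(MAC)))
# The manually chosen "easy" address (::2:1) in the same prefix.
print(address_in_prefix(PREFIX, int(ipaddress.IPv6Address("::2:1"))))
```

The same interface ID shows up under every prefix the router announces, which is why both automatic addresses share the same ending.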

Additionally, all my jails on this machine now have an IPv6 address too (yes, they are like "…:2:100", with the :100 because the IPv4 address ends in ".100"). Still TODO is the conversion of all the services in the jails to also listen on the IPv6 address.

I already changed the config of my internal DNS to have the IPv6 addresses for all systems and to listen on the IPv6 address (when I add an IPv6 network to allow-query/allow-query-cache/allow-recursion, BIND does not want to start). And while I was there, I also enabled the DNSSEC verification (but I get a lot of error messages in the logs: "unable to convert errno to isc_result: 42: Protocol not available"; one search result which talks about exactly this error says it is a "cosmetic error"...).

I noticed that an IPv6 ping between two physical machines takes a little bit more time than an IPv4 ping (no IPsec enabled). It surprised me that the difference is this noticeable (not within the std-dev at all):

--- m87.Leidinger.net ping statistics ---
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 0.168/0.193/0.220/0.017 ms

--- m87.Leidinger.net ping6 statistics ---
10 packets transmitted, 10 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.207/0.325/0.370/0.047 ms
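To put a number on "not within the std-dev", the two summary lines can be compared with a few lines of Python. This is just arithmetic on the figures above, with the values written with "/" as separator; parse_summary is a name made up for the example.

```python
def parse_summary(line):
    """Extract min/avg/max/stddev (in ms) from a ping summary line."""
    values = line.split("=")[1].strip().split()[0]
    low, avg, high, stddev = (float(v) for v in values.split("/"))
    return {"min": low, "avg": avg, "max": high, "stddev": stddev}


ping4 = parse_summary("round-trip min/avg/max/stddev = 0.168/0.193/0.220/0.017 ms")
ping6 = parse_summary("round-trip min/avg/max/std-dev = 0.207/0.325/0.370/0.047 ms")

# Difference of the averages, in ms and in units of each standard deviation.
diff = ping6["avg"] - ping4["avg"]
print(f"difference of the averages: {diff:.3f} ms")
print(f"in IPv4 standard deviations: {diff / ping4['stddev']:.1f}")
print(f"in IPv6 standard deviations: {diff / ping6['stddev']:.1f}")
```

The averages are 0.132 ms apart, which is several times the larger of the two standard deviations. With only ten packets per run this is an indication, not a proper measurement.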

The information I miss in the IPv6 chapter of the FreeBSD handbook is what those other IPv6-related services are and when/how to configure them. I have an idea now of what radvd is, but I am not sure how it interacts with the accept_rtadv setting for ifconfig (and I do not think I need it, as my WLAN router seems to do this already). I know that I can display the IPv6-friendly network neighborhood with ndp(8). I have not looked at enabling IPv6 multicast support in FreeBSD, and I do not know what those other IPv6 options for rc.conf do.
