The FreeBSD-linuxulator explained (for devel­op­ers): basics

The last post about the Linuxulator, where I explained it from a user's point of view, got a good amount of attention. Triggered by a recent explanation of the Linuxulator errno stuff to a fellow FreeBSD developer, I decided to see if more developers are interested in some more info too…

The syscall vector

All the basic setup to handle Linux “system stuff” in FreeBSD is in sys/linux/linux_sysvec.c. The “system stuff” is about translating FreeBSD errnos to Linux errnos, translating FreeBSD signals to Linux signals, handling Linux traps, and setting up the FreeBSD system vector (the kernel structure which contains all the data needed to identify when a Linux program is called and to look up the right kernel functions for e.g. syscalls and ioctls).

There is not just one syscall vector: there is one for a.out binaries (struct sysentvec linux_sysvec) and one for ELF binaries (struct sysentvec elf_linux_sysvec). This is at least the case on i386; for other architectures it may not make sense to have the a.out stuff, as they may never have seen any a.out Linux binary.

The ELF AUX args

When an ELF image is executed, the Linuxulator adds some runtime information (like pagesize, uid, gid, …) so that the userland can easily query this information, which is not static at build time. This is handled in the elf_linux_fixup() function. If you see some error messages about missing ELF notes from e.g. glibc, this is the place to add this information. It would not hurt to check from time to time what Linux provides and to add the missing pieces there. FreeBSD does not have an automated way of doing this, and I am not aware of someone who regularly checks this. There is a little bit more info about ELF notes available in a message to one of the FreeBSD mailing lists; it also has an example of how to read out this data.

Traps

Linux and FreeBSD do not agree on how a trap should be reported (SIGBUS or SIGSEGV). The corresponding decision is made in translate_traps(), and a translation table is available as _bsd_to_linux_trapcode.

Sig­nals

The values for the signal names are not the same in FreeBSD and Linux. The translation tables are called linux_to_bsd_signal and bsd_to_linux_signal. The translation is a feature of the syscall vector (i.e. it happens automatically).
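To make the two directions concrete, here is a minimal userland sketch in Python (not kernel code). The dictionary names mirror the table names above, but the contents are only a handful of hand-picked i386 values, and deriving both directions from one list is an assumption made for brevity. The function names are hypothetical.

```python
# Illustrative subset of signal numbers; not the kernel tables themselves.
# name: (FreeBSD number, Linux i386 number)
SIGNALS = {
    "SIGHUP": (1, 1),
    "SIGBUS": (10, 7),
    "SIGUSR1": (30, 10),
    "SIGUSR2": (31, 12),
    "SIGCHLD": (20, 17),
}

# One table per direction, both derived from the list above.
bsd_to_linux_signal = {bsd: linux for bsd, linux in SIGNALS.values()}
linux_to_bsd_signal = {linux: bsd for bsd, linux in SIGNALS.values()}


def deliver_to_linux_process(bsd_signo):
    """Number a Linux process sees when FreeBSD raises bsd_signo."""
    return bsd_to_linux_signal[bsd_signo]


def signal_from_linux_process(linux_signo):
    """Number FreeBSD uses when a Linux process passes linux_signo."""
    return linux_to_bsd_signal[linux_signo]
```

Note that the number 10 means SIGBUS on FreeBSD but SIGUSR1 on Linux, which is why the direction of the lookup matters.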

Errnos

The values for the errno names are not the same in FreeBSD and Linux. The translation table is called bsd_to_linux_errno. Returning an errno in one of the Linux syscalls will trigger an automatic translation from the FreeBSD errno value to the Linux errno value. This means that FreeBSD errnos have to be returned (e.g. FreeBSD ENOSYS=78) and the Linux program will receive the Linux value (e.g. Linux ENOSYS=38; as the Linux kernel returns negative errnos, the Linux program will get -38).
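The ENOSYS example can be played through in a few lines of Python. This is a userland sketch under assumptions, not the kernel code: the dictionary borrows the name of the real table but contains only a few illustrative entries, and both function names are hypothetical.

```python
# Illustrative subset; the real table covers every FreeBSD errno.
# FreeBSD errno -> Linux errno
bsd_to_linux_errno = {
    0: 0,     # success
    2: 2,     # ENOENT has the same value on both systems
    35: 11,   # EAGAIN
    63: 36,   # ENAMETOOLONG
    78: 38,   # ENOSYS (the example from the text)
}


def linux_syscall_result(bsd_error):
    """Translate the FreeBSD errno returned by a syscall handler into
    the negative Linux errno that the Linux program receives."""
    return -bsd_to_linux_errno[bsd_error]


def linux_unimplemented():
    """Hypothetical handler: returns the positive FreeBSD value."""
    return 78  # FreeBSD ENOSYS, never -ENOSYS and never the Linux value
```

A handler written this way never needs to know the Linux numbers; calling linux_syscall_result(linux_unimplemented()) gives the -38 mentioned above.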

If you see an “-ESOMETHING” somewhere in the Linuxulator code, this is either a bug or some clever/tricky/dangerous use of the sign bit to encode some info (e.g. in the futex code there is a function which returns -ENOSYS, where the sign bit is used as an error indicator and the calling code is responsible for translating negative errnos into positive ones).

Syscalls

The Linux syscalls are defined similarly to the FreeBSD ones. There is a mapping table (sys/linux/syscalls.master) between syscall numbers and the corresponding functions. This table is used to generate the necessary code (“make sysent” in sys//linux/).
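What such a number-to-function table amounts to can be sketched in Python. This is only an illustration under assumptions: the handler names and the dispatch() helper are hypothetical stand-ins, the table is hand-written instead of generated, and only two Linux i386 syscall numbers are filled in.

```python
import os

ENOSYS = 78  # FreeBSD value, as in the Errnos section


def linux_getpid_sketch(args):
    """Stand-in handler: returns (error, return value)."""
    return 0, os.getpid()


def linux_getppid_sketch(args):
    """Stand-in handler: returns (error, return value)."""
    return 0, os.getppid()


# Linux i386 syscall number -> handler; a tiny stand-in for the table
# that is generated from syscalls.master.
linux_sysent = {
    20: linux_getpid_sketch,
    64: linux_getppid_sketch,
}


def dispatch(number, args=()):
    """Look up the handler for a syscall number and call it."""
    handler = linux_sysent.get(number)
    if handler is None:
        return ENOSYS, None  # no entry: report "not implemented"
    return handler(args)
```

The error half of the returned pair is a FreeBSD errno, so the automatic errno translation described above still applies to whatever a handler returns.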

The FreeBSD-linuxulator explained (for users)

After another mail where I explained a little bit of the linuxulator behavior, it is time to try to write an easy text which I can reference in future answers. If someone wants to add parts of this explanation to the FreeBSD handbook, go ahead.

Lin­ux emu­la­tion? No, “native” exe­cu­tion (sort of)!

First, the linuxulator is not an emulation. It is “just” a binary interface which is a little bit different from the FreeBSD “native” one. This means that the binary files in FreeBSD and Linux are both files which comply with the ELF specification.

When the FreeBSD kernel loads an ELF file, it checks whether it is a FreeBSD ELF file or a Linux ELF file (or some other flavor it knows about). Based upon this it looks up the appropriate actions for this binary in a table (it can also differentiate between 64-bit and 32-bit, and probably other things too).

The FreeBSD table is always compiled in. (For a better big picture: at least on an AMD/Intel 64-bit platform it is also possible to additionally include a 32-bit version of this table, to be able to execute 32-bit programs on 64-bit systems.) Other tables, like the Linux one, can be loaded into the kernel additionally (or built statically into the kernel, if desired).

Those tables contain some parameters and pointers which allow the kernel to execute the binary. If a program makes a system call, the kernel looks up the correct function inside this table. It does this for FreeBSD binaries and for Linux binaries alike. This means that there is no emulation/simulation (overhead) going on… at least ideally. Some behavior differs a little bit between Linux and FreeBSD, so a little bit of translation/housekeeping has to happen in some Linux system calls before the underlying FreeBSD kernel functions can be used.

This means that a lot of Linux stuff in FreeBSD is handled at the same speed as if the Linux program were a FreeBSD program.
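As a rough illustration of "look at the ELF file, then pick a table", here is a Python sketch that reads the OS/ABI byte and the 32/64-bit class from the ELF identification bytes. The byte offsets and values come from the ELF specification; everything else (the SYSVECS dictionary, pick_sysvec(), fake_header()) is hypothetical, and the real detection in the kernel is more involved than checking a single header byte.

```python
# Offsets into e_ident and values as defined by the ELF specification.
EI_CLASS, EI_OSABI = 4, 7
ELFCLASS = {1: 32, 2: 64}
ELFOSABI = {0: "none", 3: "linux", 9: "freebsd"}

# Hypothetical stand-ins for the kernel tables, keyed by (brand, bits).
SYSVECS = {
    ("freebsd", 64): "FreeBSD 64-bit table",
    ("freebsd", 32): "FreeBSD 32-bit table",
    ("linux", 32): "Linux 32-bit table",
}


def pick_sysvec(header):
    """Return the table for an ELF header, or None if nothing matches."""
    if header[:4] != b"\x7fELF":
        return None  # not an ELF file at all
    bits = ELFCLASS.get(header[EI_CLASS])
    brand = ELFOSABI.get(header[EI_OSABI])
    return SYSVECS.get((brand, bits))


def fake_header(elf_class, osabi):
    """Build a 16-byte e_ident for the demo (little-endian, version 1)."""
    return b"\x7fELF" + bytes([elf_class, 1, 1, osabi]) + bytes(8)
```

For example, pick_sysvec(fake_header(1, 3)) selects the Linux 32-bit entry, while a header with an OS/ABI byte of 0 matches no table in this sketch.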

Lin­ux file/directory tricks

When the kernel detects a Linux program, it also plays some tricks with files and directories (this is also a property of the above-mentioned table in the kernel, so theoretically the kernel could play tricks for FreeBSD programs too).

If a Linux program looks up a file or directory /A, the kernel will first look for /compat/linux/A, and if it does not find it, it will look for /A. This is important! For example, if you have an empty /compat/linux/home, any application which wants to display the contents of /home will show /compat/linux/home. As it is empty, you see nothing. If this application does not allow you to enter a directory manually via the keyboard, you are out of luck (OK, you can remove /compat/linux/home or fill it with what you want to have). If you can enter a directory via the keyboard, you could enter /home/yourlogin. The kernel would first look for /compat/linux/home/yourlogin, and as it can not find it, it would then look for /home/yourlogin (which we assume is there) and as such display the contents of your home directory.

This implies sev­er­al things:

  • you can hide FreeB­SD direc­to­ry con­tents from Lin­ux pro­grams while still being able to access the content
  • “badly” programmed Linux applications (more correctly: Linux programs which make assumptions which do not hold in FreeBSD) can prevent you from accessing FreeBSD files, or files which are the same in Linux and FreeBSD (like /etc/group which is not available in /compat/linux in the linux_base ports, so that the FreeBSD one is read)
  • you can have dif­fer­ent files for Lin­ux than for FreeBSD
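The lookup order described in this section can be written down as a small Python sketch. It is a model under assumptions, not the kernel implementation: linux_lookup() is a hypothetical name, and the file system is replaced by a set of paths so that the /home example can be tried without a real /compat/linux tree.

```python
LINUX_PREFIX = "/compat/linux"


def linux_lookup(path, exists):
    """Resolve an absolute path in the order the text describes.

    exists is a predicate supplied by the caller, so the sketch does
    not touch the real file system.
    """
    shadow = LINUX_PREFIX + path
    if exists(shadow):
        return shadow  # the /compat/linux copy wins
    if exists(path):
        return path    # fall back to the FreeBSD path
    return None


# Demo file system: an empty /compat/linux/home hides /home itself,
# but not the entries below it.
demo_fs = {"/home", "/home/yourlogin", "/compat/linux/home", "/etc/group"}
exists = demo_fs.__contains__
```

With this demo set, linux_lookup("/home", exists) gives /compat/linux/home (the empty directory), while linux_lookup("/home/yourlogin", exists) falls through to the FreeBSD directory, and /etc/group resolves to the FreeBSD file.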

The Lin­ux userland

The linux_base port in FreeBSD comes from a plain installation of Linux packages. The difference is that some files are deleted, either because we can not use them in the linuxulator, or because they already exist in the FreeBSD tree at the same place and we want the Linux programs to use the FreeBSD file (/etc/group and /etc/passwd come to mind). The installation also marks binary programs as Linux programs, so that the kernel knows which kernel table to consult for system calls and such (this is not really necessary for all binary programs, but it is harder to script the correct detection logic than to just “brand” all binary programs).

Additionally some configuration is done to (hopefully) make it do the right thing out of the box. The complete setup of the linux_base ports is done to let Linux programs integrate into FreeBSD. This means that if you start acroread or skype, you do not want to have to configure some things in /compat/linux/etc/ first to have your fonts look the same and your user IDs resolved to names (this does not work if you use LDAP, Kerberos or other directory services for the user/group ID management; you need to configure this yourself). All this should just work, and the application windows should just pop up on your screen so that you can do what you want to do. Some linux_base ports also do not work on all FreeBSD releases. This can be because some kernel features which a linux_base port depends upon are not available (yet) in FreeBSD. Because of this you should not choose a linux_base port yourself. Just go and install the program from the Ports Collection and let it install the correct linux_base port automatically (a different FreeBSD release may have a different default linux_base port).

A note of caution: there are instructions out there which tell how to install more recent linux_base ports into FreeBSD releases which do not have them as default. You do this at your own risk; it may or may not work. It depends upon which programs you use and at which version those programs are (or more technically, which kernel features they depend upon). If it does not work for you, you have just two possibilities: revert and forget about it, or update your FreeBSD version to a more recent one (but it could be the case that even the most recent development version of FreeBSD does not have support for what you need).

Lin­ux libraries and “ELF file OS ABI invalid”-error messages

Due to the file/directory tricks of the kernel explained above, you have to be careful with (additional) Linux libraries. When a Linux program needs some libraries, several directories (specified in /compat/linux/etc/ld.so.conf) are searched. Let us assume that /compat/linux/etc/ld.so.conf specifies to search in /A, /B and /C. This means the FreeBSD kernel first gets a request to open /A/libXYZ. Because of this it first tries /compat/linux/A/libXYZ, and if that does not exist it tries /A/libXYZ. When this fails too, the Linux runtime linker tries the next directory in the config, so that the kernel now looks for /compat/linux/B/libXYZ and, if that does not exist, for /B/libXYZ.

Now assume that libXYZ is in /compat/linux/C/ as a Linux library, and in /B as a FreeBSD library. This means that the kernel will first find the FreeBSD library /B/libXYZ. The Linux binary which needs it can not do anything with this FreeBSD library (which depends upon the FreeBSD syscall table and FreeBSD symbols from e.g. libc), and the Linux runtime linker will bail out because of this (actually it sees that the lib is not of the required type by reading its ELF header). Unfortunately the Linux runtime linker will not continue to search for another library with the same name in another directory (at least this was the case the last time I checked and modified the order in which the Linux runtime linker searches for libraries… this has been a while, so it may be smarter now), and you will see the above error message (if you started the Linux program in a terminal).
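The /A, /B, /C scenario can be replayed with a short Python model. It is a sketch under assumptions: kernel_open() and load_library() are hypothetical names, the file system is a dictionary, and the linker is reduced to the single behavior described above (it stops at the first file it is handed and checks its type).

```python
LINUX_PREFIX = "/compat/linux"

# Demo file system: path -> kind of library stored there.
demo_fs = {
    "/B/libXYZ": "freebsd",
    "/compat/linux/C/libXYZ": "linux",
}


def kernel_open(path):
    """Prefix lookup: try /compat/linux/<path> first, then <path>."""
    for candidate in (LINUX_PREFIX + path, path):
        if candidate in demo_fs:
            return candidate
    return None


def load_library(name, search_dirs):
    """Mimic the described linker behavior: stop at the first file found."""
    for directory in search_dirs:
        found = kernel_open(directory + "/" + name)
        if found is None:
            continue  # nothing there, try the next configured directory
        if demo_fs[found] != "linux":
            # Wrong type: bail out instead of searching further.
            raise OSError(found + ": ELF file OS ABI invalid")
        return found
    raise FileNotFoundError(name)
```

With the search order /A, /B, /C the call fails on /B/libXYZ although a usable Linux library exists in /compat/linux/C. With /C searched before /B, the same call returns /compat/linux/C/libXYZ.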

The bottom line of all this is: the error message about “ELF file OS ABI invalid” just means that the Linux program was not able to find the correct Linux library and got a FreeBSD library instead. Go, install the corresponding Linux library, and make sure the Linux program can find it instead of the FreeBSD library (do not forget to run “/compat/linux/sbin/ldconfig -r /compat/linux” if you make changes by hand instead of using a port, else your changes may not be taken into account).

Con­straints regard­ing chroot into /compat/linux

The linux_base ports are designed to give a nice install-and-start experience. The drawback of this is that there is not a full Linux system in /compat/linux, so doing a chroot into /compat/linux will cause trouble (depending on what you want to do). If you want to chroot into the Linux system on your FreeBSD machine, you had better install a linux_dist port. A linux_dist port can be installed in parallel to a linux_base port. Both of them are independent, and as such you need to redo/copy configuration changes you want to have in both environments.

Linuxulator in -current ready for testing the 2.6.16 emulation

Today I committed two patches which fix the last two panics we know about in the 2.6.16 emulation. Now we need testers. Here’s the text of the mail I sent to current@ a few moments ago:

Hi,

today I committed the last fixes for the showstopper problems (panics) in the linux 2.6.16 emulation. I intend to switch the default version to 2.6.16 on i386 “soon” (see below), so please help test it.

More recent linux distributions (e.g. FC5) require a 2.6 kernel and don’t work with 2.4.2 anymore. And because FC4 is “abandon-ware” (no security fixes from fedoralegacy anymore), getting 2.6.16 emulation up and running is very important.

If you use a linux program, please add compat.linux.osrelease=2.6.16 to /etc/sysctl.conf (my desktop has been running with 2.6.16 emulation for some days already). After the next boot (or after running “sysctl compat.linux.osrelease=2.6.16”, please make sure no linux program is running already) any linux program will start with a linux kernel version of 2.6.16 instead of 2.4.2. The default linux base port (FC4) will then use different code paths (e.g. within glibc). In case you want to switch back to the 2.4.2 emulation without a reboot, please make sure no linux program is running anymore.

So far we fixed all known/repeatable prob­lems with acrore­ad, realplay­er, skype and lin­ux fire­fox. If you encounter strange behav­ior with any lin­ux pro­gram, please tell us (emulation@freebsd.org) which pro­gram you used, how to repeat the prob­lem, what the prob­lem is, and if it only is vis­i­ble with 2.6.16 or with 2.4.2 too. You should also watch out for mes­sages in the dmesg (unim­ple­ment­ed sys­tem calls or oth­er stuff, this is used to deter­mine the pri­or­i­ty of miss­ing syscalls). Please also have a look at http://wiki.FreeBSD.org/linux-kernel, I intend to doc­u­ment the known prob­lems there. If you find your prob­lem there, please tell us about it if you are will­ing to test fixes.

We are especially interested in reports (good or bad) on SMP systems. Please beat the hell out of the linuxulator!

On amd64 systems we do not have the same functionality as on i386: futexes and TLS are missing. In P4 we already have the futex part covered, but the TLS part is still missing (anyone with a clue about the kernel side of TLS on amd64 is welcome to give a hint or two to jkim@ and rdivacky@). So if you get a message about missing futexes or TLS on amd64: we know about it (testers for the futex stuff are welcome, but first you need to use a program which uses futexes and complains).

As long as we get prob­lem reports with 2.6.16 I will not switch the default to 2.6.16. If we don’t get a report at all, I will switch the default on i386 to 2.6.16 in two weeks. If we get some prob­lem reports, we will push back the switch a lit­tle bit depend­ing on the sever­i­ty of the problem.

Bye,
Alexander.