Alexander Leidinger

Just another weblog

Feb
02

Static DTrace probes for the linuxulator updated

I found a little bit of time to update my three-year-old work on adding static DTrace probes to the linuxulator.

The changes are not in HEAD, but in my linuxulator-dtrace branch. The revision to have a look at is r230910. Included are some DTrace scripts:

  • script to check internal locks
  • script to trace futexes
  • script to generate stats for DTracified linuxulator parts
  • script to check for errors:
    • emulation errors (unsupported stuff, unknown stuff, …)
    • kernel errors (resource shortage, …)
    • programming errors (errors which can happen if someone made a mistake, but should not happen)

The programming-error checks give hints about userland programming errors, or about the reason for an error return value, such as a resource shortage or a wrong combination of parameters. An example error message for this case is “Application %s issued a sysctl which failed the length restrictions.\nThe length passed is %d, the min length supported is 1 and the max length supported is %d.\n”.

The stats-script (tailored specially to the linuxulator, but this can easily be extended to the rest of the kernel) can report about:

  • number of calls to a kernel function per executable binary (not per PID!): this shows where an optimization would be beneficial for a given application
  • graph of CPU time spent in kernel functions per executable binary: together with the number of calls to a function, this helps to determine whether a kernel optimization would be beneficial or is possible for a given application
  • graph of longest running (CPU-time!) kernel function in total
  • timing statistics for the emul_lock
  • graph of longest held (CPU-time!) locks

Unfortunately this cannot be committed to HEAD as-is. The DTrace SDT provider cannot handle probes which are added to the kernel after the SDT provider is already loaded. This means that you either have to compile the linuxulator statically into the kernel, or you have to load the SDT kernel module after the linuxulator module is loaded. If you do not respect this, you get a kernel panic on the first access to one of the providers in the linuxulator (AFAIR this includes listing the probes available in the kernel).

Oct
27

The FreeBSD-linuxulator explained (for developers): basics

The last post about the Linuxulator, where I explained the Linuxulator from a user's point of view, got a good amount of attention. Triggered by a recent explanation of the Linuxulator errno stuff to a fellow FreeBSD developer, I decided to see if more developers are interested in some more info too…

The syscall vector

In sys/linux/linux_sysvec.c is all the basic setup to handle Linux “system stuff” in FreeBSD. The “system stuff” is about translating FreeBSD errnos to Linux errnos, about translating FreeBSD signals to Linux signals, about handling Linux traps, and about setting up the FreeBSD system vector (the kernel structure which contains all the data to identify when a Linux program is called and to be able to look up the right kernel functions for e.g. syscalls and ioctls).

There is not only one syscall vector: there is one for a.out binaries (struct sysentvec linux_sysvec) and one for ELF binaries (struct sysentvec elf_linux_sysvec). This holds at least on i386; for other architectures it may not make sense to have the a.out stuff, as they may never have seen any a.out Linux binary.

The ELF AUX args

When an ELF image is executed, the Linuxulator adds some runtime information (like pagesize, uid, gid, …) so that the userland can easily query information which is not static at build-time. This is handled in the elf_linux_fixup() function. If you see some error messages about missing ELF notes from e.g. glibc, this is the place to add this information to. It would not be bad to have a look from time to time at what Linux is providing and to add the missing pieces there. FreeBSD does not have an automated way of doing this, and I am not aware of someone who regularly checks this. There is a little bit more info about ELF notes available in a message to one of the FreeBSD mailing lists; it also has an example of how to read out this data.
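To make the idea of the auxiliary arguments a bit more concrete, here is a small userland sketch in Python (not code from the Linuxulator). It decodes a synthetic vector of (type, value) pairs. The type numbers AT_PAGESZ, AT_UID and AT_GID are the usual ELF ones; the function name parse_auxv, the sample values and the assumption of 64-bit little-endian entries are made up for illustration.

```python
import struct

# Auxiliary vector entry types (standard ELF numbering).
AT_NULL, AT_PAGESZ, AT_UID, AT_GID = 0, 6, 11, 13


def parse_auxv(blob, word="Q"):
    """Decode (type, value) pairs up to AT_NULL.

    word="Q" assumes 64-bit little-endian entries; use "I" for 32-bit.
    """
    entry = struct.Struct("<" + word * 2)
    result = {}
    for a_type, a_val in entry.iter_unpack(blob):
        if a_type == AT_NULL:
            break
        result[a_type] = a_val
    return result


# Synthetic vector standing in for what the kernel hands to a new process.
sample = struct.pack("<8Q", AT_PAGESZ, 4096, AT_UID, 1001, AT_GID, 1001, AT_NULL, 0)
auxv = parse_auxv(sample)
print(auxv[AT_PAGESZ], auxv[AT_UID], auxv[AT_GID])
```

If the userland asks for an entry which the kernel never put into the vector, the lookup simply fails, which is the kind of situation behind the error messages mentioned above.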

Traps

Linux and FreeBSD do not share the same point of view on how a trap shall be handled (SIGBUS or SIGSEGV). The corresponding decision making is handled in translate_traps(), and a translation table is available as _bsd_to_linux_trapcode.

Signals

The values for the signal names are not the same in FreeBSD and Linux. The translation tables are called linux_to_bsd_signal and bsd_to_linux_signal. The translation is a feature of the syscall vector (= automatic).
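As an illustration of what such a pair of tables looks like, here is a toy Python model with a handful of signals. The numbers are the FreeBSD and Linux (x86) values for these signals; the dictionaries and the helper function are only a sketch and not the kernel's actual arrays.

```python
# FreeBSD signal number -> Linux (x86) signal number for a few signals.
BSD_TO_LINUX_SIGNAL = {
    2: 2,    # SIGINT (same on both)
    9: 9,    # SIGKILL (same on both)
    10: 7,   # SIGBUS
    11: 11,  # SIGSEGV (same on both)
    17: 19,  # SIGSTOP
    19: 18,  # SIGCONT
    20: 17,  # SIGCHLD
    30: 10,  # SIGUSR1
    31: 12,  # SIGUSR2
}

# The reverse direction is derived from the same data.
LINUX_TO_BSD_SIGNAL = {lnx: bsd for bsd, lnx in BSD_TO_LINUX_SIGNAL.items()}


def deliver_to_linux_process(bsd_signal):
    """Return the number a Linux process sees for a FreeBSD signal.

    Raises KeyError for signals missing from this deliberately small table.
    """
    return BSD_TO_LINUX_SIGNAL[bsd_signal]


print(deliver_to_linux_process(30))  # FreeBSD SIGUSR1
print(LINUX_TO_BSD_SIGNAL[10])       # Linux SIGUSR1 back to FreeBSD
```

Note that the same number can mean different signals on the two sides (10 is SIGBUS in FreeBSD but SIGUSR1 in Linux), which is why a translation in both directions is needed.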

Errnos

The values for the errno names are not the same in FreeBSD and Linux. The translation table is called bsd_to_linux_errno. Returning an errno in one of the Linux syscalls will trigger an automatic translation from the FreeBSD errno value to the Linux errno value. This means that FreeBSD errnos have to be returned (e.g. FreeBSD ENOSYS=78) and the Linux program will receive the Linux value (e.g. Linux ENOSYS=38, and as the Linux kernel returns negative errnos, the Linux program will get -38).

If you see somewhere an “-ESOMETHING” in the Linuxulator code, this is either a bug, or some clever/tricky/dangerous use of the sign-bit to encode some info (e.g. in the futex code there is a function which returns -ENOSYS, but the sign-bit is used as an error indicator and the calling code is responsible for translating negative errnos into positive ones).
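The automatic errno translation can be sketched in a few lines of Python. The errno numbers are the FreeBSD and Linux values for these names; the table is only a small excerpt, and the function linux_syscall_return is a made-up name for this illustration, not a kernel function.

```python
# FreeBSD errno value -> Linux errno value (small excerpt).
BSD_TO_LINUX_ERRNO = {
    1: 1,    # EPERM
    2: 2,    # ENOENT
    11: 35,  # EDEADLK
    35: 11,  # EAGAIN
    78: 38,  # ENOSYS
}


def linux_syscall_return(bsd_errno):
    """Model the value a Linux program gets when a Linux syscall
    implementation returns the given (positive) FreeBSD errno."""
    if bsd_errno == 0:
        return 0  # success, nothing to translate
    return -BSD_TO_LINUX_ERRNO[bsd_errno]


print(linux_syscall_return(78))  # FreeBSD ENOSYS
```

A syscall implementation which returned an already negated value would break this model, as the lookup in the table expects a positive FreeBSD errno.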

Syscalls

The Linux syscalls are defined similarly to the FreeBSD ones. There is a mapping table (sys/linux/syscalls.master) between syscall numbers and the corresponding functions. This table is used to generate code (“make sysent” in sys//linux/) which does what is necessary.

Sep
28

The FreeBSD-linuxulator explained (for users)

After another mail where I explained a little bit of the linuxulator behavior, it is time to try to write a simple text which I can reference in future answers. If someone wants to add parts of this explanation to the FreeBSD handbook, go ahead.

Linux emulation? No, “native” execution (sort of)!

First, the linuxulator is not an emulation. It is “just” a binary interface which is a little bit different from the FreeBSD-“native” one. The binary files in FreeBSD and Linux are both files which comply with the ELF specification.

When the FreeBSD kernel loads an ELF file, it checks whether it is a FreeBSD ELF file or a Linux ELF file (or some other flavor it knows about). Based upon this it looks up appropriate actions in a table for this binary (it can also differentiate between 64-bit and 32-bit, and probably other things too).
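One of the places where such a flavor can be recorded is the identification area at the start of every ELF file. The following Python sketch reads the OS/ABI byte and the 32/64-bit class byte from it. The offsets and values come from the ELF specification; the function describe_elf and the fake header are made up for this illustration, and the real kernel looks at more than this single byte when it decides which table to use.

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS, EI_OSABI = 4, 7  # byte offsets inside e_ident
OSABI_NAMES = {0: "SYSV/none", 3: "Linux", 9: "FreeBSD"}
CLASS_NAMES = {1: "32-bit", 2: "64-bit"}


def describe_elf(header):
    """Return (OS/ABI name, class name) for the first 16 bytes of an ELF file."""
    if header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    return (
        OSABI_NAMES.get(header[EI_OSABI], "unknown"),
        CLASS_NAMES.get(header[EI_CLASS], "unknown"),
    )


# Fake e_ident: magic, class=64-bit, little-endian, version 1, OS/ABI=FreeBSD, padding.
fake_freebsd = ELF_MAGIC + bytes([2, 1, 1, 9]) + bytes(8)
print(describe_elf(fake_freebsd))
```

To try it on a real file, pass the result of open(path, "rb").read(16) to describe_elf().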

The FreeBSD-table is always compiled in (for a better big picture: at least on an AMD/Intel 64-bit platform there is also the possibility to additionally include a 32-bit version of this table, to be able to execute 32-bit programs on 64-bit systems), and other ones like the Linux one can be loaded additionally into the kernel (or built statically into the kernel, if desired).

Those tables contain some parameters and pointers which allow the kernel to execute the binary. If a program is making a system call, the kernel will look up the correct function inside this table. It will do this for FreeBSD binaries, and for Linux binaries. This means that there is no emulation/simulation (overhead) going on… at least ideally. Some behavior is a little bit different between Linux and FreeBSD, so that a little bit of translation/house-keeping has to go on in some Linux system calls before the underlying FreeBSD kernel functions are called.
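The lookup can be pictured with a toy model in Python: one table per kind of binary, and one common dispatch function which does the same work for both. The syscall numbers are the FreeBSD and Linux i386 ones (13 is fchdir in FreeBSD but time in Linux, 20 is getpid in both); the handler functions, table names and return strings are hypothetical stand-ins.

```python
def sys_fchdir(*args):
    return "FreeBSD fchdir"


def sys_getpid(*args):
    return "shared getpid"


def linux_time(*args):
    # A Linux-specific wrapper; it would do any needed translation
    # and then call into the common kernel code.
    return "Linux time"


# One table per ABI: syscall number -> handler.
SYSCALL_TABLES = {
    "FreeBSD": {13: sys_fchdir, 20: sys_getpid},
    "Linux": {13: linux_time, 20: sys_getpid},
}


def do_syscall(abi, number, *args):
    """Dispatch through the table selected when the binary was loaded."""
    handler = SYSCALL_TABLES[abi].get(number)
    if handler is None:
        return "ENOSYS"
    return handler(*args)


print(do_syscall("FreeBSD", 13))
print(do_syscall("Linux", 13))
print(do_syscall("Linux", 20))
```

The dispatch itself costs the same for both kinds of binaries; only a handler which has to translate something adds a little work.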

This means that a lot of Linux stuff in FreeBSD is handled at the same speed as if the Linux program were a FreeBSD program.

Linux file/directory tricks

When the kernel detects a Linux program, it is also playing some tricks with files and directories (also a property of the above mentioned table in the kernel, so theoretically the kernel could play tricks for FreeBSD programs too).

If you look up a file or directory /A, the kernel will first look for /compat/linux/A, and if it does not find it, it will look for /A. This is important! For example, if you have an empty /compat/linux/home, any application which wants to display the contents of /home will show /compat/linux/home. As it is empty, you see nothing. If this application does not allow you to enter a directory manually via the keyboard, you are stuck (ok, you can remove /compat/linux/home or fill it with what you want to have). If you can enter a directory via the keyboard, you could enter /home/yourlogin. This would first let the kernel look for /compat/linux/home/yourlogin, and as it cannot find it, it would then look for /home/yourlogin (which we assume is there), and as such would display the contents of your home directory.

This implies several things:

  • you can hide FreeBSD directory contents from Linux programs while still being able to access the content
  • “badly” programmed Linux applications (more correctly: Linux programs which make assumptions which do not hold in FreeBSD) can prevent you from accessing FreeBSD files, or files which are the same in Linux and FreeBSD (like /etc/group which is not available in /compat/linux in the linux_base ports, so that the FreeBSD one is read)
  • you can have different files for Linux than for FreeBSD
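The lookup order described above can be modelled in userland with a few lines of Python. This is only a sketch of the rule “first /compat/linux/A, then /A”; the function linux_lookup and the temporary directory used as a stand-in for the root of the file system are assumptions for the illustration.

```python
import os
import tempfile


def linux_lookup(path, root="/", prefix="compat/linux"):
    """Return the path a Linux program would get for an absolute path:
    the copy below the prefix if it exists, the plain one otherwise."""
    rel = path.lstrip("/")
    overlay = os.path.join(root, prefix, rel)
    if os.path.lexists(overlay):
        return overlay
    return os.path.join(root, rel)


# Rebuild the /home example inside a scratch directory.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "compat/linux/home"))  # empty Linux /home
    os.makedirs(os.path.join(root, "home/yourlogin"))     # real FreeBSD home
    shown = os.path.relpath(linux_lookup("/home", root), root)
    typed = os.path.relpath(linux_lookup("/home/yourlogin", root), root)

print(shown)  # what a file dialog opening /home ends up in
print(typed)  # what you get when typing /home/yourlogin by hand
```

The first lookup ends in the empty directory below compat/linux, the second one falls through to the FreeBSD directory, exactly as in the example with the file dialog.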

The Linux userland

The linux_base port in FreeBSD comes from a plain installation of Linux packages. The difference is that some files are deleted, either because we cannot use them in the linuxulator, or because they exist already in the FreeBSD tree at the same place and we want the Linux programs to use the FreeBSD file (/etc/group and /etc/passwd come to mind). The installation also marks binary programs as Linux programs, so that the kernel knows which kernel-table to consult for system calls and such (this is not really necessary for all binary programs, but it is harder to script the correct detection logic than to just “brand” all binary programs).

Additionally, some configuration is done to (hopefully) make it do the right thing out of the box. The complete setup of the linux_base ports is done to let Linux programs integrate into FreeBSD. This means if you start acroread or skype, you do not want to have to configure some things in /compat/linux/etc/ first to have your fonts look the same and your user IDs resolved to names (this does not work if you use LDAP or Kerberos or other directory services for the user/group ID management; you need to configure this yourself). All this should just work, and the application windows should just pop up on your screen so that you can do what you want to do. Some linux_base ports also do not work on all FreeBSD releases. This can be because some kernel features which a linux_base port depends upon are not available (yet) in FreeBSD. Because of this you should not choose a linux_base port yourself. Just go and install the program from the Ports Collection and let it install the correct linux_base port automatically (a different FreeBSD release may have a different default linux_base port).

A note of caution: there are instructions out there which tell how to install more recent linux_base ports into FreeBSD releases which do not have them as default. You do this at your own risk; it may or may not work. It depends upon which programs you use and at which version those programs are (or more technically, which kernel features they depend upon). If it does not work for you, you have just two possibilities: revert and forget about it, or update your FreeBSD version to a more recent one (but it could be the case that even the most recent development version of FreeBSD does not have support for what you need).

Linux libraries and “ELF file OS ABI invalid”-error messages

Due to the file/directory tricks by the kernel explained above, you have to be careful with (additional) Linux libraries. When a Linux program needs some libraries, several directories (specified in /compat/linux/etc/ld.so.conf) are searched. Let us assume that /compat/linux/etc/ld.so.conf specifies to search in /A, /B and /C. This means the FreeBSD kernel first gets a request to open /A/libXYZ. Because of this it first tries /compat/linux/A/libXYZ, and if it does not exist it tries /A/libXYZ. When this fails too, the Linux runtime linker tries the next directory in the config, so that the kernel now looks for /compat/linux/B/libXYZ and, if it does not exist, for /B/libXYZ.

Now assume that libXYZ is in /compat/linux/C/ as a Linux library, and in /B as a FreeBSD library. This means that the kernel will first find the FreeBSD library /B/libXYZ. The Linux binary which needs it cannot do anything with this FreeBSD library (which depends upon the FreeBSD syscall table and FreeBSD symbols from e.g. libc), and the Linux runtime linker will bail out because of this (actually it sees that the library is not of the required type by reading its ELF header). Unfortunately the Linux runtime linker will not continue to search for another library with the same name in another directory (at least this was the case last time I checked and modified the order in which the Linux runtime linker searches for libraries… this has been a while, so it may be smarter now) and you will see the above error message (if you started the Linux program in a terminal).
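The whole scenario fits into a small Python model. The dictionary stands in for the file system and records for each library whether it is a Linux or a FreeBSD one; the function find_library is a made-up stand-in which combines the directory search of the Linux runtime linker with the /compat/linux lookup of the kernel, and it gives up on the first library of the wrong type, as described above.

```python
def find_library(name, search_dirs, files):
    """Search like the Linux runtime linker on top of the kernel's
    /compat/linux lookup; `files` maps a path to "linux" or "freebsd"."""
    for directory in search_dirs:
        # The kernel tries the /compat/linux copy first, then the plain path.
        for candidate in (f"/compat/linux{directory}/{name}", f"{directory}/{name}"):
            abi = files.get(candidate)
            if abi is None:
                continue  # not there, keep looking
            if abi != "linux":
                # First hit has the wrong type: bail out, no further search.
                raise OSError(f"{candidate}: ELF file OS ABI invalid")
            return candidate
    raise FileNotFoundError(name)


files = {"/B/libXYZ": "freebsd", "/compat/linux/C/libXYZ": "linux"}
try:
    find_library("libXYZ", ["/A", "/B", "/C"], files)
    message = None
except OSError as error:
    message = str(error)
print(message)

# With a Linux copy visible before the FreeBSD one, the search succeeds.
files["/compat/linux/B/libXYZ"] = "linux"
print(find_library("libXYZ", ["/A", "/B", "/C"], files))
```

The second call shows the fix: once a Linux library is found before the FreeBSD one, the FreeBSD library in /B is never looked at.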

The bottom line of all this is: the error message about ELF file OS ABI invalid just means that the Linux program was not able to find the correct Linux library and got a FreeBSD library instead. Go, install the corresponding Linux library, and make sure the Linux program can find it instead of the FreeBSD library (do not forget to run “/compat/linux/sbin/ldconfig -r /compat/linux” if you make changes by hand instead of using a port, else your changes may not be taken into account).

Constraints regarding chroot into /compat/linux

The linux_base ports are designed to give a nice install-and-start experience. The drawback of this is that there is not a full Linux system in /compat/linux, so doing a chroot into /compat/linux will cause trouble (depending on what you want to do). If you want to chroot into a Linux system on your FreeBSD machine, you had better install a linux_dist port. A linux_dist port can be installed in parallel to a linux_base port. Both of them are independent, and as such you need to redo/copy configuration changes you want to have in both environments.
