Strate­gic think­ing, or what I think what we need to do to keep FreeB­SD rel­e­vant

Since I par­tic­i­pate in the FreeB­SD project there are from time to time some voic­es which say FreeB­SD is dead, Lin­ux is the way to go. Most of the time those voic­es are trolls, or peo­ple which do not real­ly know what FreeB­SD has to offer. Some­times those voic­es wear blind­ers, they only see their own lit­tle world (were Lin­ux just works fine) and do not see the big pic­ture (like e.g. com­pe­ti­tion stim­u­lates busi­ness, …) or even dare to look what FreeB­SD has to offer.

Some­times those voic­es raise a valid con­cern, and it is up to the FreeB­SD project to fil­ter out what would be ben­e­fi­cial. Recent­ly there were some mails on the FreeB­SD lists in the sense of “What about going into direc­tion X?”. Some peo­ple just had the opin­ion that we should stay were we are. In my opin­ion this is sim­i­lar­ly bad to blind­ly say­ing FreeB­SD is dead and fol­low­ing the mass­es. It would mean stag­na­tion. We should not hold peo­ple back in explor­ing new / dif­fer­ent direc­tions. Some­one wants to write a ker­nel mod­ule in (a sub­set of) C++ or in Rust… well, go ahead, give it a try, we can put it into the Ports Col­lec­tion and let peo­ple get expe­ri­ence with it.

This dis­cus­sion on the mail­inglists also trig­gered some kind of “where do we see us in the next years” / strate­gic think­ing reflec­tion. What I present here, is my very own opin­ion about things we in the FreeB­SD project should look at, to stay rel­e­vant in the long term. To be able to put that into scope, I need to clar­i­fy what “rel­e­vant” means in this case.

FreeB­SD is cur­rent­ly used by com­pa­nies like Net­flix, NetApp, Cis­co, Juniper, and many oth­ers as a base for prod­ucts or ser­vices. It is also used by end-users as a work-horse (e.g. mailservers, web­servers, …). Stay­ing rel­e­vant means in this con­text, to pro­vide some­thing which the user base is inter­est­ed in to use and which makes it more easy / fast for the user base to deliv­er what­ev­er they want or need to deliv­er than with anoth­er kind of sys­tem. And this in terms of time to mar­ket of a solu­tion (time to deliv­er a ser­vice like a web-/mail-/whatever-server or prod­uct), and in terms of per­for­mance (which not only means speed, but also secu­ri­ty and reli­a­bil­i­ty and …) of the solu­tion.

I have cat­e­go­rized the list of items I think are impor­tant into (new) code/features, docs, pol­ish­ing and project infra­struc­ture. Links in the fol­low­ing usu­al­ly point to documentation/HOWTOs/experiences for/with FreeB­SD, and not to the canon­i­cal entry points of the projects or tech­nolo­gies. In a few cas­es the links point to an expla­na­tion in the wikipedia or to the web­site of the top­ic in ques­tion.

Code/features

The vir­tu­al­iza­tion train (Open­Stack, Open­Neb­u­la, oVirt, Cloud­Stack, Kuber­netes, Dock­er, Pod­man, …) is run­ning on full speed. The marked is as big/important, that solu­tion providers even do joint ven­tures on cross­ing bor­ders between each oth­ers, e.g. VMware is open­ing up to inte­grate their solu­tion with solu­tions from Amazon/Azure/Google. The under­ly­ing infra­struc­ture is get­ting more and more unim­por­tant, as long as the ser­vices which shall be run per­form as need­ed. Ease of use and time to mar­ket are the key-drivers (the last lit­tle piece of per­for­mance is most­ly impor­tant for com­pa­nies which go to the “edge” (both mean­ings intend­ed in a non-exclusive-or way) like Net­flix for their FreeB­SD based CDN). FreeB­SD is not real­ly par­tic­i­pat­ing in this world. Yes, we had jails way before any­one else out there had some­thing smi­lar, and some even do not have that right now. But if you are real­is­tic, FreeB­SD does not play a major role here. You can do nice things with jails and bhyve, but you have to do it “by hand” (ezjail, iocage and such are improve­ments on the ease of use side, but that is not enough as this is still lim­it­ed on a host cen­tric view). The world has moved on to admin­is­ter­ing a dat­a­cen­ter (to avoid the buzz­words “cloud” or “private-cloud”) with a single-click. In my opin­ion we would need to port sev­er­al of the ini­tialy men­tioned cloud/container man­age­ment solu­tions to FreeB­SD and have them able to han­dle their work via jails and/or bhyve. If FreeB­SD is not able to serve as a build­ing block in this big pic­ture, we will fall off the edge in this par­tic­u­lar IT-area in the long run.

With all the ready-made con­tain­ers avail­able in the inter­net, we should improve our lin­ux­o­la­tor. Our kernel-support for this is lim­it­ed to a 2.6.32-ish ABI ver­sion (it is less than 2.6.32, more like 2.6.16, we are miss­ing epoll and ino­ti­fy sup­port, among oth­ers, but this is the low­est ver­sion glibc in the Cen­tOS 7 based linux_base port is able to run on… and glibc checks the ver­sion num­ber). We need to catch-up to a more recent ver­sion if we want to be able to run those ready-made lin­ux con­tain­ers with­out issue (we can put a lin­ux sys­tem into a jail and start that). If some­one would like to work on that, a good start would be to run the Lin­ux Test Project tests via the lin­ux­u­la­tor and start fix­ing bugs. The last time I did that was in 2007 and about 16% of the test cas­es failed back then. It would be also quite nice if we could inte­grate those lin­ux­u­la­tor tests into the FreeB­SD CI. With improve­ments in the lin­ux­u­la­tor and the above men­tioned vir­tu­al­iza­tion sup­port, we should be able to run more/those linux-images … ehrm, sor­ry, docker/kubernetes/…-containers with­in the lin­ux­u­la­tor.

Fin­ish the work regard­ing ker­beros in base. Cy Schu­bert is/was work­ing on this. I do not know the sta­tus of this, but the “fix­ing 800 ports” part in his mail from May 2018 looks like some more help­ing hands would be ben­e­fi­cial. This would bring us to a bet­ter start­ing point for a more seam­less inte­gra­tion (some ports need “the oth­er” ker­beros).

We have one port (as far as I was able to deter­mine… the exact amount does not real­ly mat­ter) in terms of SDN – net/openvswitch – but the doc­u­men­ta­tion of this … leaves room for improve­ment (ker­nel sup­port / netmap sup­port and func­tion­al­i­ty / FreeB­SD spe­cif­ic HOWTO). As part of the vir­tu­al­i­sa­tion of every­thing we (yes: we – as part of the FreeB­SD hand­book, see docs cat­e­go­ry below for more on this) need to pro­vide this info so that FreeB­SD is able to par­tic­i­pate in this area. We should also have a look at port­ing some more SDN soft­ware, e.g Open­Con­trail now tung­sten­fab­ric (there is an old con­trail port­ing project), Open­Stack Neu­tron, Open­Day­light, … so that users have a choice, respec­tive­ly FreeB­SD can be inte­grat­ed into exist­ing het­ero­ge­neous envi­ron­ments. 

Sen­sors (tem­per­a­ture, volt­age, fans, …), a top­ic with his­to­ry. Short: in the Google Sum­mer of Code 2007 code was pro­duced, com­mit­ted, and then removed again due to a dis­pute. My per­son­al under­stand­ing (very sim­pli­fied) is “remove every­thing because some of the data han­dled by this frame­work shall not be han­dled by this frame­work” (instead of e.g. “remove sen­sor X, this data shall not be han­dled in this way”), and “remove every­thing as this does not han­dle sen­sors of type X which are not used in servers but in enter­prise class >99% non-IT-related sen­sors”. Noth­ing bet­ter has shown up since then. If I look at Win­dow, VMware, Solaris and Lin­ux, I can query sen­sors on my mainboard/chassis/disks/whatever (yes, I am mix­ing some apples with oranges here), plot them in mon­i­tor­ing sys­tems, and get alarms. In FreeB­SD we fail on this top­ic (actu­al­ly mul­ti­ple top­ics) which I con­sid­er to be some­thing basic and manda­to­ry. I do not sug­gest that we com­mit the Google Sum­mer of Code 2007 code. I sug­gest to have a look at what makes sense to do here. Take the exist­ing code and com­mit it, or improve on this code out­side the tree and then com­mit it, or write some­thing new. In the end it does not mat­ter (for an user) which way it is han­dled, as long as we have some­thing which users can use in the end. It sure­ly makes sense to have an OS-provided frame­work of reg­is­ter­ing sen­sors in a cen­tral place (it would sure­ly be nice if you could get the temp/fan val­ues of your graph­ics card… ooops… sor­ry… AI/HPC accel­er­a­tor togeth­er with oth­er sim­i­lar hard­ware data in your hard­ware).

To con­tin­ue play­ing well (not only) in the high-availability area, we should also have a look at get­ting an imple­men­ta­tion of MPTCP (Mut­li­path TCP) into the tree. Apple (and oth­ers) is already using it since 2013 with good ben­e­fits (most prob­a­bly not only for Siri users). There exists some code for FreeB­SD, but it is far from usable and it does not look like there is progress since 2016. We say we have the pow­er to serve, but with the cloud­i­fi­ca­tion of the recent years, all users expect that every­thing is always-on and nev­er fails, and being able to pro­vide the serv­er side of this client-server relat­ed tech­nol­o­gy for those peo­ple which have such high demands is nec­es­sary to not fall behind (do not let us rest on our lau­rels).

Secure­Boot needs also some help­ing hands. At some point oper­at­ing sys­tems which do not sup­port it will not be con­sid­ered by com­pa­nies any­more.

Anoth­er item we should have a look at is to pro­vide means to write ker­nel code in dif­fer­ent lan­guages. Not in the base sys­tem, but at least in ports. If some­one wants to write a ker­nel mod­ule in  C++ or Rust, why not? It offers pos­si­bil­i­ties to explore new areas. There are even reports of expe­ri­ences with dif­fer­ent lan­guages. It does not fit your needs? Well, ignore it and con­tin­ue writ­ing ker­nel code in C, but let oth­er peo­ple which want to use a screw­driv­er instead of a ham­mer do what they want, they will either learn that they should have used a ham­mer, or can report about ben­e­fits about the screw­driv­er.

Docs

I think we can improve our end-user docs to the next lev­el. The base sys­tem is already well cov­ered (we can sure­ly find some fea­tures which we could doc­u­ment), but an user does not use FreeB­SD to use FreeB­SD. An user sure­ly has a goal in mind which requires to set­up some kind of ser­vice (mail serv­er, web serv­er, dis­play serv­er (desk­top sys­tem), …). While one could argue that it is the 3rd par­ty project which needs to doc­u­ment how to run their soft­ware on FreeB­SD, I think we need to do our share here too. There are a lot of HOW­TOs for Lin­ux, and then you have to find some tips and tricks to make some­thing work (bet­ter) on FreeB­SD. What I have in mind here is that we should doc­u­ment how to make FreeB­SD par­tic­i­pate in a Win­dows Active Direc­to­ry envi­ron­ment, or in an LDAP envi­ron­ment (as a client), improve the Sam­ba part with FreeB­SD spe­cif­ic parts (like how to make Sam­ba use ZFS snap­shots for Win­dows Shad­ow Copies), con­fig­u­ra­tion man­age­ment tools and so on. I do not talk about pro­vid­ing in-depth docs about the 3rd par­ty soft­ware, but lit­tle HOW­TOs with FreeB­SD spe­cif­ic parts / tips and tricks, and a ref­er­ence to the 3rd par­ty docs. Peo­ple come to us for real-world needs and if we pro­vide them with a head-start of the most com­mon items (e.g. also cov­er­ing nginx or what­ev­er and not only apache httpd) and then guide them to fur­ther docs will improve the val­ue of our hand­book even more for end-users (spe­cial­ly for new­com­ers, but also for expe­ri­enced FreeB­SD users which out of a sud­den now need to do some­thing which they nev­er did before…).

We should also review our docs. The hand­book lists e.g. proc­mail (just an exam­ple…). With proc­mail not being mainained any­more since a long time and known vul­ner­a­bil­i­ties we should replace the info there with info about mail­drop (or any suit­able replace­ment). Care­ful review may also find sim­i­lar items which need some care.

One more item I have in mind in terms of docs for user is the restruc­tur­ing of some parts. Now the world is more think­ing in terms of XaaS (“some­thing as a ser­vice”) we should also have a “cloud” sec­tion (going beyond of what we have in terms of vir­tu­al­iza­tion already) in out hand­book. We can put there items like the exist­ing descrip­tion of vir­tu­al­i­sa­tion items, but also should put there new items like glus­terfs or object stor­age or the hope­ful­ly upcom­ing pos­si­bil­i­ty of how to set­up OpenStack/kubernetes/… on FreeB­SD. This goes into the same direc­tion as the first docs-item in terms of pro­vide more doc­u­men­ta­tion how to achieve goals of our users. 

In my opin­ion we are also lack­ing on the developer-documentation side. Yes, we have man pages which describe the offi­cial API (in most cas­es). Where I see room for improve­ment is the source code doc­u­men­ta­tion. Some­thing like doxy­gen (or what­ev­er the tool of the day is – which one does not real­ly mat­ter, any kind of extractable-from-source doc­u­men­ta­tion is bet­ter than no extractable doc­u­men­ta­tion) is already used in sev­er­al places in our source (search for it via: egrep ‑R ‘\\(brief|file)’ /usr/src/) and we have already some infra­struc­ture to extract and ren­der (HTML / PDF) them. The more acces­si­ble / easy it is to start devel­op­ment in FreeB­SD, the more attrac­tive it will be (addi­tion­al to the exist­ing ben­e­fits) to peo­ple / com­pa­nies to dive in. The best exam­ples about doc­u­ment­ing source code in our code I have found so far is the isci and ocs_fc device code.

Pol­ish­ing

Pol­ish­ing some­thing in the top­ic of stay­ing rel­e­vant? Yes! It is the details which mat­ter. If peo­ple have 2 options with rough­ly the same fea­tures (noth­ing miss­ing what you need, same price), which one do they take, the one which has every­thing con­sis­tent and well inte­grat­ed, or the one with some quirks you can cir­cum­vent with a lit­tle bit of work on their side?

We have some nice fea­tures, but we are not using it to the extend pos­si­ble. One of the items which come to my mind is DTrace. The area which I think needs pol­ish­ing is to add more probes, and to have some kind of probe-convention about com­mon top­ics. For exam­ple I/O relat­ed nam­ing con­ven­tion (maybe area spe­cif­ic, like stor­age I/O and net­work I/O) and cov­er­ing all dri­vers to com­ply. We should also look into mak­ing it more acces­si­ble by pro­vid­ing more easy inter­faces (no mat­ter if text based (thanks to Devin Teske for dwatch, more of this mag­ic please…), web based, or what­ev­er) to make it real­ly easy (= start a com­mand or click around and you get the result for a spe­cif­ic set of probes/conditions/…). Some exam­ples are statemaps, flamegraphs and most promi­nent­ly the Oracle/Sun ZFS Stor­age Ana­lyt­ics to give you an idea what is pos­si­ble with DTrace and and how to make it acces­si­ble to peo­ple with­out knowl­edge about the ker­nel inter­nals and pro­gram­ming.

Some pol­ish­ing in the ports col­lec­tion would be to revis­it the defaults options for ports with options. The tar­get here should be to have con­sis­tent default set­tings (e.g. serv­er soft­ware should not depend upon X11 by default (direct­ly or indi­rect­ly), most peo­ple should not need to build the port with non-default options). One could argue that it is the respon­s­abil­i­ty of the main­tain­er of the port, and to some extend it is, but we do not have guide­lines which help here. So a lit­tle team of peo­ple to review all ports (and mod­i­fy them if nec­es­sary) and come up with guide­lines and exam­ples would be great.

Addi­tion­al­ly we should come up with meta-ports for spe­cif­ic use case, e.g. web­serv­er (dif­fer­ent flavours… apache/nginx/…), data­base (dif­fer­ent flavours, with some use­ful tools like mytop or mysql­tuner or sim­i­lar) and then even ref­er­ence them in the hand­book (this goes along with my sug­ges­tion above to doc­u­ment real-world use cas­es instead of “only” the OS itself).

Recent­ly ‑cur­rent has seen some low lev­el per­for­mance improve­ments (via ifuncs, …). We should con­tin­ue this and even extend it to revise default set­tings / val­ues (non auto-tuned and even auto-tuned ones). I think it would be ben­e­fi­cial in the long run if we tar­get more cur­rent hard­ware (with­out los­ing the abil­i­ty to run on old­er hard­ware) and for those val­ues which can not be auto-tuned pro­vide some way of down-tuning (e.g. in a failsafe-boot set­ting in the loader or in doc­u­ment­ed set­tings for rc.conf or wher­ev­er those defaults can be changed).

Project infra­struc­ture

We have a CI (con­tin­u­ous inte­gra­tion) sys­tem, but it is not very promi­nent­ly placed. Just recent­ly it gained some more atten­tion from the devel­op­er side and we even got the first sta­tus report about it (nice! vis­i­bil­i­ty helps mak­ing it a part of the com­mu­ni­ty effort). There is a FreeBSD-wiki page about the sta­tus and the future ideas, but it was not updat­ed since sev­er­al months. There is also a page which talks in more details about using it for per­for­mance test­ing, which is some­thing peo­ple have talked about since years but nev­er became avail­able (and is not today).

I think we need to improve here. The goals I think which are impor­tant are to get var­i­ous test­ing, san­i­tiz­ing and fuzzing tech­nolo­gies inte­grat­ed into our CI. On the con­fig repos­i­to­ry I have not found any inte­gra­tion of e.g. the cor­re­spond­ing clang tech­nolo­gies (fuzzing, ASAN, UBSAN, MSAN (still exper­i­men­tal, so maybe not to tar­get before oth­er mature tech­nolo­gies)) or any oth­er such tech­nol­o­gy.

We should also make our CI more public/visible (build sta­tus linked some­where on www.freebsd.org, nag peo­ple more about issues found by it, have some docs how to add new tests (maybe from ports), so that more peo­ple can help in extend what we auto­mat­i­cal­ly test (e.g. how could I inte­grate the LTP (Lin­ux Test Project) tests to test our lin­ux­u­la­tor? This requires the down­load of a lin­ux dist port, the LTP itself, and then to run the tests). There are a lot of nice ideas float­ing around, but I have the impres­sion we are lack­ing some help­ing hands to get var­i­ous items inte­grat­ed.

Wrap-Up

Var­i­ous items I talked above are not sexy. Those are typ­i­cal­ly not the things peo­ple do just for fun. Those are typ­i­cal­ly items peo­ple get paid for. It would be nice if some of the com­pa­nies which ben­e­fit from FreeB­SD would be so nice to lend a help­ing hand for one or anoth­er item. Maybe the FreeB­SD Foun­da­tion has some con­tacts they could ask about this?

It could also be that for some of the items I men­tioned here there is more ongo­ing that I know of. This means then that the cor­re­spond­ing work could be made more known on the mail­inglists. When it is more known, maybe some­one wants to pro­vide a help­ing hand.

Send to Kin­dle

Essen Hackathon 2018

Again this time of the year where we had the plea­sure of doing the Essen Hackathon in a nice weath­er con­di­tion (sun­ny, not too hot, no rain). A lot of peo­ple here, about 20. Not only FreeB­SD com­mit­ters showed up, but also con­trib­u­tors (biggest group was 3 peo­ple who work on iocage/libiocage, and some indi­vid­u­als with inter­est in var­i­ous top­ics like e.g. SCTP / net­work pro­to­cols, and oth­er top­ics I unfor­tu­nate­ly for­got).

The top­ics of inter­est this year:

  • work­flows / process­es
  • Wiki
  • jail- / con­tain­er man­age­ment (pkg­base, iocage, dock­er)
  • ZFS
  • graph­ics
  • doc­u­men­ta­tion
  • bug squash­ing
  • CA trust store for the base sys­tem

I was first work­ing with Allan on mov­ing for­ward with a CA trust store for the base sys­tem (tar­get: make fetch work out of the box for TLS con­nec­tions – cur­rent­ly you will get an error that the cer­tifi­cate can not val­i­dat­ed, if you do not have the ca_nss_root port (or any oth­er source of trust) installed and a sym­link in base to the PEM file). We have inves­ti­gat­ed how base-openssl, ports-openssl and libressl are set­up (ports-openssl is the odd one in the list, it looks in LOCALBASE/openssl for his default trust store, while we would have expect­ed it would have a look in LOCALBASE/etc/ssl). As no ports-based ssl lib is look­ing into /etc/ssl, we were safe to do what­ev­er we want in base with­out break­ing the behav­ior of ports which depend upon the ports-based ssl libs. With that the cur­rent design is to import a set of CAs into SVN – one cert file per CA – and a way to update them (for the secu­ri­ty offi­cer and for users), black­list CAs, and have base-system and local CAs merged into the base-config. The expec­ta­tion is that Allan will be able to present at least a pro­to­type at EuroB­D­Con.

I also had a look with the iocage/libiocage devel­op­ers at some issues I have with iocage. The nice thing is, the cur­rent ver­sion of libiocage already solves the issue I see (I just have to change my process­es a lit­tle bit). Some more cleanup is need­ed on their side until they are ready for a port of libiocage. I am look­ing for­ward to this.

Addi­tion­al­ly I got some time to look at the list of PRs with patch­es I want­ed to look at. Out of the 17 PRs I toke note of, I have closed 4 (one because it was over­come by events). One is in progress (com­mit­ted to ‑cur­rent, but I want to MFC that). One addi­tion­al one (from the iocage guys) I for­ward­ed to jamie@ for review. I also noticed that Kristof fixed some bugs too.

On the social side we had dis­cus­sions dur­ing BBQ, pizza/pasta/…, and a restau­rant vis­it. As always Kristof was telling some fun­ny sto­ries (or at least telling sto­ries in a fun­ny way… 😉 ). This off course trig­gered some oth­er fun­ny sto­ries from oth­er peo­ple. All in all my bot­tom line of this years Essen Hackathon is (as for the oth­er 2 I vis­it­ed): fun, sun and progress for FreeB­SD.

By bring­ing cake every time I went there, it seems that I cre­at­ed a tra­di­tion of this. So any­one should already plan to reg­is­ter for the next one – if noth­ing bad hap­pens, I will bring cake again.

Send to Kin­dle

iocage: HOWTO cre­ate a base­jail from src (instead of from an offi­cial release)

Back­ground

So far I have used ezjail to man­age FreeB­SD jails. I use jails since years to have dif­fer­ent parts of a soft­ware stack in some kind of a con­tain­er (in a ZFS dataset for the filesys­tem side of the con­tain­er). On one hand to not let depen­den­cies of one part of the soft­ware stack have influ­ence of oth­er parts of the soft­ware stack. On the oth­er hand to have the pos­si­bil­i­ty to move parts of the soft­ware stack to a dif­fer­ent sys­tem if nec­es­sary. Nor­mal­ly I run -sta­ble or ‑cur­rent or more gen­er­al­ly speak­ing, a self-compiled FreeB­SD on those sys­tems. In ezjail I like the fact that all jails on a sys­tem have one com­mon base­jail under­ly­ing, so that I update one place for the user­land and all jails get the updat­ed code.

Since a while I heard good things about iocage and how it inte­grates ZFS, so I decid­ed to give it a try myself. As iocage does not come with an offi­cial way of cre­at­ing a base­jail (respec­tive­ly a release) from a self-compiled FreeB­SD (at least doc­u­ment­ed in those places I looked, and yes, I am aware that I can cre­ate a FreeB­SD release myself and use it, but I do not like to have to cre­ate a release addi­tion­al­ly to the build­world I use to update the host sys­tem) here now the short HOWTO achieve this.

Invari­ants

In the fol­low­ing I assume the iocage ZFS parts are already cre­at­ed in dataset ${POOLNAME}/iocage which is mount­ed on ${IOCAGE_BASE}/iocage. Addi­tion­al­ly the build­world in /usr/src (or wher­ev­er you have the FreeB­SD source) should be fin­ished.

Pre-requisites

To have the nec­es­sary dataset-infrastructure cre­at­ed for own basejails/releases, at least one offi­cial release needs to be fetched before. So run the com­mand below (if there is no ${IOCAGE_BASE}/iocage/releases direc­to­ry) and fol­low the on-screen instruc­tions.

iocage fetch

HOWTO

Some vari­ables:

POOLNAME=mpool
SRC_REV=r$(cd /usr/src; svnliteversion)
IOCAGE_BASE=""

Cre­at­ing the iocage basejail-datasets for this ${SRC_REV}:

zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/bin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/boot
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/lib
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/libexec
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/rescue
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/sbin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/bin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/include
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/lib
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/lib32
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/libdata
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/libexec
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/sbin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/share
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/src

Install from /usr/src (the exe­cutable “chown” is hardlinked across an iocage base­jail dataset bound­ary, this fails in the nor­mal install­world, so we have to ignore this error and install a copy of the chown bina­ry to the place where the hardlink nor­mal­ly is):

cd /usr/src
make -i installworld DESTDIR=${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root >&! iocage_installworld_base.log
cp -pv ${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root/usr/sbin/chown ${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root/usr/bin/chgrp
make distribution DESTDIR=${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root >>& iocage_installworld_base.log

While we are here, also cre­ate a release and not only a base­jail:

zfs create -o compression=lz4 ${POOLNAME}/iocage/releases/${SRC_REV}-RELEASE
zfs create -o compression=lz4 ${POOLNAME}/iocage/releases/${SRC_REV}-RELEASE/root
make installworld DESTDIR=${IOCAGE_BASE}/iocage/releases/${SRC_REV}-RELEASE/root >&! iocage_installworld_release.log
make distribution DESTDIR=${IOCAGE_BASE}/iocage/releases/${SRC_REV}-RELEASE/root >>& iocage_installworld_release.log

And final­ly make this the default release which iocage uses when cre­at­ing new jails (this is option­al):

iocage set release=${SRC_REV}-RELEASE default

Now the self-build FreeB­SD is avail­able in iocage for new jails.

Send to Kin­dle

HOWTO: “Blind” re­mote in­stall of FreeB­SD via tiny disk im­age (ZFS edi­tion)

In a past post I described how to install a FreeB­SD remote­ly via a tiny UFS based disk image over a lin­ux sys­tem. In this post I describe how to do it with a ZFS based disk image.

Invari­ants

Giv­en a unix based remote sys­tem (in this case a lin­ux sys­tem) from which you know what kind of hard­ware it runs on (e.g. PCI IDs) and what the cor­re­spond­ing FreeB­SD dri­vers are.

HOWTO

In the title of this post I wrote “via a tiny disk im­age”. This is true for a suit­able defin­i­tion of tiny.

What we have in the root­server are two ~900 GB hard­disks. They shall be used in a soft­ware mir­ror. The ma­chine has 8 GB of RAM. I do not ex­pect much ker­nel pan­ics (= crash dumps) there, so we do not real­ly need >8 GB of swap (for­get the rule of hav­ing twice as much swap than RAM, with the cur­rent amount of RAM in a ma­chine you are in “trou­ble” when you need even the same amount of swap than RAM). I de­cided to go with 2 GB of swap.

Pushing/pulling a 900 GB im­age over the net­work to in­stall a sys­tem is not real­ly some­thing I want to do. I am OK to trans­fer 5 GB (that is 0.5% of the en­tire disk) to get this job done, and this is feas­ible.

First let us define some vari­ables in the shell, this way you just need to change the val­ues in one place and the rest is copy&paste (I use the SVN revi­sion of the source which I use to install the sys­tem as the name of the sysutils/beadm com­pat­i­ble boot-dataset in the rootfs, as such I also have the revi­sion num­ber avail­able in a vari­able):

ROOTFS_SIZE=5G
ROOTFS_NAME=root
FILENAME=rootfs
POOLNAME=mpool
VERSION=r$(cd /usr/src; svnliteversion)
SWAPSIZE=2G

Then change your cur­rent dir­ect­ory to a place where you have enough space for the im­age. There we will cre­ate a con­tainer for the im­age, and make it ready for par­ti­tion­ing:

truncate -s ${ROOTFS_SIZE} ${FILENAME}
mdconfig -a -t vnode -f ${FILENAME}
# if you want to fully allocate
# dd if=/dev/zero of=/dev/md0 bs=1m

Cre­ate the par­ti­tion table and the rootfs (in a sysutils/beadm com­pat­i­ble way – as I install FreeBSD-current there – and mount it tem­po­rary to /temppool):

gpart create -s GPT /dev/md0
gpart add -s 512K -t freebsd-boot -l bootcode0 /dev/md0
gpart add -a 4k -t freebsd-swap -s ${SWAPSIZE} -l swap0 /dev/md0
gpart add -a 1m -t freebsd-zfs -l ${POOLNAME}0 /dev/md0
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 /dev/md0
# if not already the case and you want to have 4k physical sector size of the pool
# syscl vfs.zfs.min_auto_ashift=12
zpool create -o cachefile=/boot/zfs/zpool.cache_temp -o altroot=/temppool -O compress=lz4 -O atime=off -O utf8only=on ${POOLNAME} /dev/gpt/${POOLNAME}0
zfs create -o mountpoint=none ${POOLNAME}/ROOT
zfs create -o mountpoint=/ ${POOLNAME}/ROOT/${VERSION}
zfs create -o mountpoint=/tmp -o exec=on -o setuid=off ${POOLNAME}/tmp
zfs create -o mountpoint=/usr -o canmount=off ${POOLNAME}/usr
zfs create -o mountpoint=/home ${POOLNAME}/home
zfs create -o setuid=off ${POOLNAME}/usr/ports
zfs create ${POOLNAME}/usr/src
zfs create -o mountpoint=/var -o canmount=off ${POOLNAME}/var
zfs create -o exec=off -o setuid=off ${POOLNAME}/var/audit
zfs create -o exec=off -o setuid=off ${POOLNAME}/var/crash
zfs create -o exec=off -o setuid=off ${POOLNAME}/var/log
zfs create -o atime=on ${POOLNAME}/var/mail
zfs create -o setuid=off ${POOLNAME}/var/tmp
zfs create ${POOLNAME}/var/ports
zfs create -o exec=off -o setuid=off -o mountpoint=/shared ${POOLNAME}/shared
zfs create -o exec=off -o setuid=off ${POOLNAME}/shared/distfiles
zfs create -o exec=off -o setuid=off ${POOLNAME}/shared/packages
zfs create -o exec=off -o setuid=off -o compression=lz4 ${POOLNAME}/shared/ccache
zfs create ${POOLNAME}/usr/obj
zpool set bootfs=${POOLNAME}/ROOT/${VERSION} ${POOLNAME}

In­stall FreeB­SD (from source):

cd /usr/src
#make buildworld >&! buildworld.log
#make buildkernel -j 8 KERNCONF=GENERIC >&! buildkernel_generic.log
make installworld DESTDIR=/temppool/ >& installworld.log
make distribution DESTDIR=/temppool/ >& distrib.log
make installkernel KERNCONF=GENERIC DESTDIR=/temppool/ >& installkernel.log

Copy the tem­po­rary zpool cache cre­at­ed above in the pool-creation part to the image (I have the impres­sion it is not real­ly need­ed and will work with­out, but I have not tried this):

cp /boot/zfs/zpool.cache_temp /temppool/boot/
cp /boot/zfs/zpool.cache_temp /temppool/boot/zpool.cache

Add the zfs mod­ule to loader.conf:

zfs_load="yes"
opensolaris_load="yes"

Now you need to cre­ate /temppool/etc/rc.conf (set the de­faultrouter, the IP ad­dress via ifconfig_IF (and do not for­get to use the right IF for it), the host­name, set sshd_enable to yes, zfs_enable=“YES”)  /temppool/boot/loader.conf (zfs_load=“yes”, opensolaris_load=“yes”, vfs.root.mountfrom=“zfs:${POOLNAME}/ROOT/r${VERSION}”)
/temppool/etc/hosts, /temppool/etc/resolv.conf and maybe /temppool/etc/sysctl.conf and /temppool/etc/periodic.conf.

Do not allow password-less root logins in single-user mode on the phys­i­cal con­sole, cre­ate a resolv.conf and an user:

cd /temppool/etc
sed -ie 's:console.*off.:&in:' ttys
cat >resolv.conf <<EOT
search YOURDOMAIN
nameserver 8.8.8.8
EOT
pw -V /temppool/etc groupadd YOURGROUP -g 1001
pw -V /temppool/etc useradd YOURUSER -u 1001 -d /home/YOURUSER -g YOURUSER -G wheel -s /bin/tcsh
pw -V /temppool/etc usermod YOURUSER -h 0
pw -V /temppool/etc usermod root -h 0
zfs create mpool/home/YOURUSER
chown YOURUSER:YOURGROUP /temppool/home/YOURUSER

Now you can make some more mod­i­fi­ca­tions to the sys­tem if want­ed, and then export the pool and detach the image:

zpool export ${POOLNAME}

mdconfig -d -u 0

Depend­ing on the upload speed you can achieve, it is ben­e­fi­cial to com­press the image now, e.g. with bzip2. Then trans­fer the image to the disk of the remote sys­tem. In my case I did this via:

ssh –C –o CompressionLevel=9 root@remote_host dd of=/dev/hda bs=1m < /path/to/${FILENAME}

Then reboot/power-cycle the remote sys­tem.

Post-install tasks

Now we have a new FreeB­SD sys­tem which uses only a frac­tion of the the hard­disk and is not resilient against harddisk-failures.

FreeB­SD will detect that the disk is big­ger than the image we used when cre­at­ing the GPT label and warn about it (cor­rupt GPT table). To fix this and to resize the par­ti­tion for the zpool to use the entire disk we first mir­ror the zpool to the sec­ond disk and resize the par­ti­tion of the first disk, and when the zpool is in-sync and then we resize the boot disk (atten­tion, you need to change the “-s” part in the fol­low­ing to match your disk size).

First back­up the label of the first disk, this makes it more easy to cre­ate the label of the sec­ond disk:

/sbin/gpart backup ada0 > ada0.gpart

Edit ada0.gpart (give dif­fer­ent names for the labels, main­ly change the num­ber 0 on the label-name to 1) and then use it to cre­ate the par­ti­tion of the sec­ond disk:

gpart restore -Fl ada1 < ada0.gpart
gpart resize -i 3 -a 4k -s 929g ada1
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
zpool set autoexpand=on mpool

Fix the warn­ing about the GPT label and resize the par­ti­tion:

gpart recover ada0
gpart resize -i 3 -a 4k -s 929g ada0

After­wards it should look sim­i­lar to this:

gpart show -l
=>        40  1953525088  ada0  GPT  (932G)
          40        1024     1  bootcode0  (512K)
        1064     4194304     2  swap0  (2.0G)
     4195368         984        - free -  (492K)
     4196352  1948254208     3  mpool0  (929G)
  1952450560     1074568        - free -  (525M)

=>        40  1953525088  ada1  GPT  (932G)
          40        1024     1  bootcode1  (512K)
        1064     4194304     2  swap1  (2.0G)
     4195368         984        - free -  (492K)
     4196352  1948254208     3  mpool1  (929G)
  1952450560     1074568        - free -  (525M)

Add the sec­ond disk to the zpool:

zpool attach mpool gpt/mpool0 gpt/mpool1

When the mir­ror is in sync (zpool sta­tus mpool), we can extend the size of the pool itself:

zpool offline mpool /dev/gpt/mpool0
zpool online mpool /dev/gpt/mpool0

As a last step we can add now an encrypt­ed swap (depend­ing on the impor­tance of the sys­tem maybe a gmirror-ed one – not explained here), and spec­i­fy where to dump (text-dumps) on.

/boot/loader.conf:

dumpdev="/dev/ada0p2"

/etc/rc.conf:

dumpdev="/dev/gpt/swap0"
crashinfo_enable="YES"
ddb_enable="yes"
encswap_enable="YES"
geli_swap_flags="-a hmac/sha256 -l 256 -s 4096 -d"

/etc/fstab:

# Device        Mountpoint      FStype  Options                 Dump    Pass#
/dev/ada1p2.eli none    swap    sw      0       0

Now the sys­tem is ready for some appli­ca­tions.

Send to Kin­dle

Tran­si­tion to nginx: part 4 – CGI scripts

I still have some CGI scripts on this web­site. They still work, and they are good enough for my needs. When I switched this web­site to nginx (the word­press set­up was a lit­tle bit more com­plex than what I wrote in part 1, part 2 and part 3… the con­fig will be one of my next blog posts) I was a lit­tle bit puz­zled how to do that with nginx. It took me some min­utes to get an idea how to do it and to find the right FreeB­SD port for this.

  • Install www/fcgiwrap
  • Add the fol­low­ing to rc.conf:

fcgiwrap_enable="YES"
fcgiwrap_user="www"

  • Run “ser­vice fcgi­wrap start”
  • Add the fol­low­ing to your nginx con­fig:
location ^~ /cgi-bin/ {
    gzip off; #gzip makes scripts feel slower since they have to complete before getting gzipped
    fastcgi_pass  unix:/var/run/fcgiwrap/fcgiwrap.sock;
    fastcgi_index index.cgi;
    fastcgi_param SCRIPT_FILENAME /path/to/location$fastcgi_script_name;
    fastcgi_param GATEWAY_INTERFACE  CGI/1.1;
}

Send to Kin­dle