Recently we had a strange performance problem at work. A web application had slow response times from time to time, and users complained. We did not see unusual CPU, memory, or swap usage on any of the machines involved. I generated heat maps from performance measurements, and there were no obvious traces of slow behavior. We did not find any reason why the application should be slow for clients, but obviously it was.
Then someone mentioned two recent Apache DoS problems. Number one, the cookie hash issue, did not seem to be the cause: we did not see the huge CPU or memory consumption we would expect from such an attack. The second one, the slow read problem (Apache has no timeout for the maximum duration of a connection, which can be exploited with a small TCP receive window), looked like it could be an issue. The slow read DoS problem can be detected by looking at the server-status page.
What you would see on the server-status page are a lot of worker threads in the ‘W’ (write data) state. This is supposed to be an indication of slow reads. We did see this.
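A quick way to get a number for "a lot of worker threads in the ‘W’ state" is the machine-readable variant of the status page (`server-status?auto`), which contains a `Scoreboard:` line with one character per worker slot. The following is only a minimal sketch: the function name is made up for illustration, and the sample text is invented, not output from our servers.

```python
from collections import Counter


def scoreboard_counts(auto_status: str) -> Counter:
    """Count worker states in the 'Scoreboard:' line of '?auto' output."""
    for line in auto_status.splitlines():
        if line.startswith("Scoreboard:"):
            return Counter(line.split(":", 1)[1].strip())
    raise ValueError("no Scoreboard line found")


# Invented sample: 'W' = sending reply, '_' = waiting, 'K' = keepalive,
# '.' = open slot without a process.
sample = "BusyWorkers: 5\nIdleWorkers: 4\nScoreboard: _WW_K_W_W..\n"

counts = scoreboard_counts(sample)
used_slots = sum(n for state, n in counts.items() if state != ".")
print(f"{counts['W']} of {used_slots} used slots are in state W")
```

A high share of ‘W’ slots alone is not proof of an attack (large downloads look the same), so this is only a first indicator.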
As our site is behind a reverse proxy with some kind of IDS/IPS feature, we took the reverse proxy out of the picture to get a better view of who is doing what (we do not have X-Forwarded-For configured).
At this point we still noticed a lot of connections from the rev-proxy in the ‘W’ state. This was strange, it was not supposed to do this. After restarting the rev-proxy (while the clients went directly to the webservers) we still had those ‘W’ entries in the server-status. This was getting really strange. And to add to this, the duration of the ‘W’ state for the rev-proxy connections showed that this state had been active for several thousand seconds. Ugh. WTF?
Ok, next step: killing the offenders. First I verified in the list of connections in the server-status (extended-status is activated) that all worker threads of a given PID with a rev-proxy connection were in this strange state and that no client request was active. Then I killed this particular PID. I wanted to repeat this until no such strange connections were left. Unfortunately I arrived at PIDs which were listed in the server-status (even after a refresh), but which did not exist in the OS. That is bad. Very bad.
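The comparison "listed in server-status, but not known to the OS" can be scripted. Here is a small sketch for POSIX systems only, assuming the PIDs have already been copied from the status page. The function names are mine, and signal 0 only checks for existence, it does not kill anything.

```python
import os


def pid_exists(pid: int) -> bool:
    """Return True if the OS still knows a process with this PID (POSIX)."""
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stale_pids(listed_pids):
    """PIDs taken from server-status which no longer exist in the OS."""
    return sorted(p for p in set(listed_pids) if not pid_exists(p))


# Our own PID certainly exists, so it must not be reported as stale.
print(stale_pids([os.getpid()]))
```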
So the next step was to move all clients away from one webserver, and then to reboot this webserver completely to be sure the entire system is in a known good state for future monitoring (the big hammer approach).
As we did not know if this strange state was due to some kind of mis-administration of the system or not, we decided to have the rev-proxy again in front of the webserver and to monitor the systems.
We survived about one and a half days. After that all worker threads on all webservers were in this state. DoS. At this point we were sure there was something malicious going on (some days later our management showed us a mail from a company which had offered security consulting 2 months earlier, to make sure we do not get hit by a DDoS during the holiday season… a coincidence?).
Next step: checking for missing security patches (unfortunately it is not us who decide which patches get applied to the systems). What we noticed is that the rev-proxy was missing a patch for a DoS problem, and that for the webservers a new fixpack was scheduled to be released in the near future (as of this writing, it is available).
Since we applied the DoS fix for the rev-proxy, we do not have a problem anymore. This is not really conclusive, as we do not really know if this fixed the problem or if the attacker stopped attacking us.
From reading what the DoS patch fixes, we would assume we should have seen some continuous traffic between the rev-proxy and the webserver, but there was nothing when we observed the strange state.
We are still not allowed to apply patches the way we think we should, but at least we have better monitoring in place to watch out for this particular problem: activate the extended status in Apache/IHS, look for lines with state ‘W’ and a long duration (column ‘SS’), and raise an alert if the duration is higher than the max. possible/expected/desired duration for all possible URLs.
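The alert rule itself is simple once the rows of the extended status table are available in parsed form. The sketch below assumes that parsing has already happened. `WorkerRow` and `stuck_writers` are names invented for this illustration, the sample values are made up, and this is not the monitoring we actually run.

```python
from dataclasses import dataclass


@dataclass
class WorkerRow:
    pid: int
    mode: str      # column 'M' of the extended status table
    seconds: int   # column 'SS': seconds since the most recent request began
    client: str


def stuck_writers(rows, max_seconds):
    """Rows in state 'W' which run longer than the allowed duration."""
    return [r for r in rows if r.mode == "W" and r.seconds > max_seconds]


# Invented sample rows.
rows = [
    WorkerRow(pid=4242, mode="W", seconds=5120, client="10.0.0.5"),
    WorkerRow(pid=4243, mode="W", seconds=3, client="10.0.0.9"),
    WorkerRow(pid=4244, mode="K", seconds=9000, client="10.0.0.5"),
]

for row in stuck_writers(rows, max_seconds=300):
    print(f"ALERT: pid {row.pid} in state W for {row.seconds}s ({row.client})")
```

The threshold has to be chosen per site; 300 seconds here is just a placeholder.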
Tags: dos problem,
dos problems,
memory consumption,
performance measurements,
performance problem,
proxy connection,
reverse proxy,
slow response times,
swap usage,
worker threads —
The recent Phoronix benchmark which compared a release candidate of FreeBSD 9 with Oracle Linux Server 6.1 created a huge discussion on the FreeBSD mailing lists. The reason was that some people think the numbers presented there give a wrong picture of FreeBSD. This is partly because not all benchmark numbers are presented on the most prominent page (as linked above), but only at a different place. This gives the impression that FreeBSD is inferior in this benchmark, while it just puts the focus (for a reason, according to some people) on a different part of the benchmark. To be more specific: blogbench does disk reads and writes in parallel, FreeBSD gives higher priority to writes than to reads, FreeBSD 9 outperforms OLS 6.1 in the writes while OLS 6.1 shines with the reads, and only the reads are presented on the first page. Another complaint is that the article states that the default install was used (in this case UFS as the FS), when it was not (ZFS was used as the FS).
The author of the Phoronix article participated in parts of the discussion and asked for specific improvement suggestions. A FreeBSD committer seems to be already working on getting some issues resolved. What I personally do not like is that the article has not been updated with a remark that some of the things presented do not reflect reality and that a retest is necessary.
As there was much talk in the thread but not much obvious activity from our side to resolve some issues, I started to improve the FreeBSD wiki page about benchmarking, so that we are able to point to it in case someone wants to benchmark FreeBSD. Others already chimed in and improved some things too. It is far from perfect; some more eyes, and more importantly some more fingers which add content, are needed. Please go to the wiki page and try to help out (if you are afraid to write something in the wiki, please at least post your suggestions on a FreeBSD mailing list so that others can improve the wiki page).
What we also need is a wiki page about FreeBSD tuning (a first step would be to take the man-page and convert it into a wiki page, then to improve it, and then to feed the changes back into the man-page while keeping the wiki page, to be able to cross reference parts of it from the benchmarking page).
I already said this in the thread about the Phoronix benchmark: everyone is welcome to improve the situation. Do not talk, write something. No matter if it is an improvement to the benchmarking page, tuning advice, or a tool which inspects the system and suggests some tuning. If you want to help in the wiki, create a FirstnameLastname account and ask a FreeBSD committer for write access.
A while ago (IIRC we have to think in months or even years) there was a framework for automatic FreeBSD benchmarking. Unfortunately the author ran out of time. The framework was able to install a FreeBSD system on a machine, run some specified benchmark (not many benchmarks were integrated), and then install another FreeBSD version to run the same benchmark, or reinstall the same version to run another benchmark. IIRC there was also a DB behind it which collected the results, and maybe there was even some way to compare them. It would be nice if someone could find some time to talk with the author to get the framework and set it up somewhere, so that we have a controlled environment where we can do our own benchmarks in an automatic and repeatable fashion with several FreeBSD versions.
Tags: benchmark numbers,
fingers,
freebsd,
improvement suggestions,
linux server,
oracle,
oracle linux,
release candidate,
retest,
ufs —
I was fighting with the right way to add a recent Verisign certificate to a keystore for the IBM HTTP Server (IHS). I have used the ikeyman utility on Solaris.
The problem indicator was the error message “SSL0208E: SSL Handshake Failed, Certificate validation error” in the SSL log of IHS.
The IBM websites were not really helpful in tracking down the problem (the missing stuff). The Verisign instructions did not lead to a working solution either.
What was done before: the Verisign Intermediate Certificates were imported as “Signer Certificates”, and the certificate for the webserver was imported within “Personal Certificates”. Without the signer certificates the personal certificate would not import, due to a missing intermediate certificate (no valid trust-chain).
What I did to resolve the problem:
- I removed all Verisign certificates.
- I added the Verisign Root Certificate and the Verisign Intermediate Certificate A as a signer certificate (use the “Add” button). I also tried to add the Verisign Intermediate Certificate B, but it complained that some part of it was already there as part of the Intermediate Certificate A. I skipped this part.
- Then I converted the server certificate and key to a PKCS12 file via “openssl pkcs12 -export -in server-cert.arm -out cert-for-ihs.p12 -inkey server-key.arm -name name_for_cert_in_ihs”.
- After that I imported the cert-for-ihs.p12 as a “Personal Certificate”. The import dialog offers 3 items to import. I selected the “name_for_cert_in_ihs” and the one containing “cn=verisign class 3 public primary certification authority - g5” (when I selected the 3rd one too, it complained that a part of it was already imported with a different name).
With this modified keystore in place, I just had to select the certificate via “SSLServerCert name_for_cert_in_ihs” in the IHS config and the problem was fixed.
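If the conversion has to be repeated for several servers, the openssl call from the list above can be assembled in a small script. This is just a sketch: the function name is made up, the file names are the ones used above, and the command is only printed here. Running it for real (for example via `subprocess.run(cmd, check=True)`) needs openssl installed and will prompt for an export password.

```python
import shlex


def pkcs12_export_command(cert, key, out, name):
    """Build the argument list for the 'openssl pkcs12 -export' call."""
    return [
        "openssl", "pkcs12", "-export",
        "-in", cert,      # server certificate
        "-out", out,      # resulting PKCS12 file
        "-inkey", key,    # private key of the server certificate
        "-name", name,    # label, used later with SSLServerCert
    ]


cmd = pkcs12_export_command(
    "server-cert.arm", "server-key.arm",
    "cert-for-ihs.p12", "name_for_cert_in_ihs",
)
print(shlex.join(cmd))
```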
Tags: ibm http server,
intermediate certificate,
intermediate certificates,
keystore,
personal certificate,
personal certificates,
server cert,
validation error,
verisign certificate,
verisign certificates —
I have a little problem finding a clean solution to the following problem.
A machine with two network interfaces and no default route. The first interface gets an IP at boot time, and the corresponding static route is inserted into the routing table during boot without problems. The second interface only gets an IP address when the shared-IP zones on the machine are started; during boot the interface is plumbed, but without any address. The networks on those interfaces are not connected and the machine is not a gateway (this means we have a machine-administration network and a production network). The static routes we want to have for the addresses of the zones are not added to the routing table, because the next hop is not reachable at the time the routing-setup is done. As soon as the zones are up (and the interface gets an IP), a re-run of the routing-setup adds the missing static routes.
Unfortunately I cannot tell Solaris to keep the static route even if the next hop is not reachable ATM (at least I have not found an option to the route command which does this).
One solution to this problem would be to add an address at boot to the interface which does not have an address at boot-time ATM (probably with the deprecated flag set). The problem is that this subnet (/28) does not have enough free addresses anymore, so this is not an option.
Another solution is to use a script which re-runs the routing-setup after the zones are started. This is a pragmatic solution, but not a clean solution.
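To make the pragmatic variant a bit more concrete, here is a sketch of the control flow such a script could have: poll until the interface has its address, then re-run the routing-setup once. Everything Solaris-specific is left out on purpose. `is_ready` and `action` are hypothetical callables which would wrap the real checks and commands on the machine.

```python
import time


def run_when_ready(is_ready, action, attempts=30, delay=2.0, sleep=time.sleep):
    """Call action() once is_ready() returns True; give up after 'attempts' polls."""
    for _ in range(attempts):
        if is_ready():
            action()
            return True
        sleep(delay)
    return False


# Demo with stand-ins: the "interface" becomes ready on the third poll.
polls = iter([False, False, True])
log = []
ok = run_when_ready(
    is_ready=lambda: next(polls),
    action=lambda: log.append("routing-setup re-run"),
    delay=0,
    sleep=lambda seconds: None,  # do not really wait in the demo
)
print(ok, log)
```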
As I understand the in.routed man-page, in.routed is not an option with the default config, because the machine shall not route between the networks, and shall not change the routing based upon RIP messages from other machines. Unfortunately I do not know enough about it to be sure, and I do not get the time to play around with this. I have seen some interesting options regarding this in the man-page, but playing around with this and sniffing the network to see what happens is not an option ATM. Anyone with a config/tutorial for this “do not broadcast anything, do not accept anything from outside” case (if possible)?
Tags: administration network,
boot time,
clean solution,
default config,
default route,
network interfaces,
pragmatic solution,
routing table,
static route,
static routes —
As I wrote earlier, I am trying to find out which formats my Sony BRAVIA 5800 TV is able to play over the network. Sony is not really helpful (they only give names which someone with a DLNA spec could correctly interpret). Now I took the time to move my TV into a different subnet (the same one my NAS is in, not a DMZ like before), and I installed minidlna. After some network sniffing, the use of the Intel UPnP Device Spy and some reading of the minidlna source, I now have a better idea of what my Sony TV expects.
The DLNA-specification seems to mandate a MIME-type and some DLNA-specific identifier which describes the content a player (a DLNA-Renderer) is able to display. In the following I will present the MIME-type, the DLNA-identifier, and probably a Sony-specific identifier.
Regarding pictures, the TV only accepts JPEGs, but in small, medium and large sizes. I did not bother to look up what this means in real values; so far this is not of high interest to me. For audio the TV accepts MP3s and LPCM (raw PCM samples). The raw sniffed data from the TV looks like this:
image/jpeg:DLNA.ORG_PN=JPEG_SM
image/jpeg:DLNA.ORG_PN=JPEG_MED
image/jpeg:DLNA.ORG_PN=JPEG_LRG
audio/mpeg:DLNA.ORG_PN=MP3
audio/L16:DLNA.ORG_PN=LPCM
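Entries in this form are easy to take apart in a script, which helps when comparing what the TV announces with what a media server offers. A minimal sketch, assuming the entries look exactly like the sniffed lines in this post (MIME-type, a colon, then `KEY=value` pairs separated by semicolons); the function name is made up.

```python
def parse_entry(entry: str):
    """Split 'mime:KEY=value;KEY=value' into (mime, {KEY: value})."""
    mime, _, rest = entry.strip().partition(":")
    params = {}
    for part in rest.split(";"):
        if part:
            key, _, value = part.partition("=")
            params[key] = value
    return mime, params


entries = [
    "image/jpeg:DLNA.ORG_PN=JPEG_SM",
    "audio/L16:DLNA.ORG_PN=LPCM",
    "video/mpeg:DLNA.ORG_PN=MPEG_TS_HD_50_L2_ISO;SONY.COM_PN=HD2_50_ISO",
    "video/x-mp2t-mphl-188",  # an entry without any identifier
]

for entry in entries:
    print(parse_entry(entry))
```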
The more interesting part for me is the video part. The TV supports MPEG2 video (the MPEG_ part in the DLNA.ORG_PN) and H.264 (the AVC_ part in the DLNA.ORG_PN). For MPEG2 it supports program streams (PS in DLNA.ORG_PN) and transport streams (TS in DLNA.ORG_PN). For PS it supports PAL and NTSC resolutions (720×576 is PAL; HD resolutions like 720p, 1080i or 1080p are not supported). The packet length of a transport stream can be 188 bytes or 192 bytes. If the width is >= 1288 or the height is >= 720, minidlna adds HD in DLNA.ORG_PN, else it adds SD. The EU in DLNA.ORG_PN is for SD video with a height of 576 or 288 pixels. Depending on the combination of the packet length and whether a timestamp is in use, the DLNA.ORG_PN gets a _ISO or a _T appended.
It also supports H.264. The DLNA.ORG_PN starts with AVC in this case. Only transport streams (TS in DLNA.ORG_PN) are supported. As with MPEG2, the packet length of the TS can be 188 or 192 bytes. Depending on the combination of the packet length and whether a timestamp is in use, the DLNA.ORG_PN gets a _ISO or a _T appended. Depending on the profile used, minidlna adds some more information to the DLNA.ORG_PN: BL for the baseline profile, MP for the main profile, and HP for the high profile. I do not see this in the valid video formats my TV requested over the wire. As with the MPEG2 format, SD or HD is added (in minidlna) depending on the width and height, but here also on the bitrate of the video. For the main profile the width has to be <= 720, the height <= 576 and the bitrate <= 10M (base 10, not base 2) for SD, and the width has to be <= 1920, the height <= 1152 and the bitrate <= 20M (base 10, not base 2) for HD. For the high profile the width has to be <= 1920, the height <= 1152, the bitrate <= 30M (base 10, not base 2) and the audio has to be AC3 to get the HD added in DLNA.ORG_PN. The audio is specified in DLNA.ORG_PN as MPEG1_L3 for MP3, AC3 for AC3, and AAC or AAC_MULT5 for AAC (stereo or 5-channel). As can be seen below, the TV seems to support only AC3 audio for AVC. The TV also has _24_, _50_ and _60_ in DLNA.ORG_PN. I did not find those in the minidlna source (but I have not really searched for them). I could imagine that _24_ stands for 24 pictures per second, and _50_ and _60_ for progressive videos (with 50 and 60 pictures per second, respectively), but this is pure speculation on my side. Here is the raw sniffed data:
video/mpeg:DLNA.ORG_PN=AVC_TS_HD_24_AC3_ISO;SONY.COM_PN=AVC_TS_HD_24_AC3_ISO
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=AVC_TS_HD_24_AC3;SONY.COM_PN=AVC_TS_HD_24_AC3
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=AVC_TS_HD_24_AC3_T;SONY.COM_PN=AVC_TS_HD_24_AC3_T
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=MPEG_PS_PAL
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=MPEG_PS_NTSC
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=MPEG_TS_SD_50_L2_T
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=MPEG_TS_SD_60_L2_T
video/mpeg:DLNA.ORG_PN=MPEG_TS_SD_EU_ISO
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=MPEG_TS_SD_EU
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=MPEG_TS_SD_EU_T
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=MPEG_TS_SD_50_AC3_T
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=MPEG_TS_SD_60_AC3_T
video/mpeg:DLNA.ORG_PN=MPEG_TS_HD_50_L2_ISO;SONY.COM_PN=HD2_50_ISO
video/mpeg:DLNA.ORG_PN=MPEG_TS_HD_60_L2_ISO;SONY.COM_PN=HD2_60_ISO
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=MPEG_TS_HD_50_L2_T;SONY.COM_PN=HD2_50_T
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=MPEG_TS_HD_60_L2_T;SONY.COM_PN=HD2_60_T
video/mpeg:DLNA.ORG_PN=AVC_TS_HD_50_AC3_ISO;SONY.COM_PN=AVC_TS_HD_50_AC3_ISO
video/mpeg:DLNA.ORG_PN=AVC_TS_HD_60_AC3_ISO;SONY.COM_PN=AVC_TS_HD_60_AC3_ISO
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=AVC_TS_HD_50_AC3;SONY.COM_PN=AVC_TS_HD_50_AC3
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=AVC_TS_HD_60_AC3;SONY.COM_PN=AVC_TS_HD_60_AC3
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=AVC_TS_HD_50_AC3_T;SONY.COM_PN=AVC_TS_HD_50_AC3_T
video/vnd.dlna.mpeg-tts:DLNA.ORG_PN=AVC_TS_HD_60_AC3_T;SONY.COM_PN=AVC_TS_HD_60_AC3_T
video/x-mp2t-mphl-188
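To check a video file against the SD/HD rules for H.264 described before the listing, the thresholds can be written down as a small function. Take this as a sketch of my description only: it is not copied from the minidlna source, the function name is invented, and the baseline profile is left out because no limits for it are given above.

```python
def avc_size_class(profile, width, height, bitrate, audio=None):
    """Return 'SD', 'HD' or None, following the thresholds given in the text.

    profile: 'MP' (main) or 'HP' (high); bitrate in bit/s (base 10).
    """
    if profile == "MP":
        if width <= 720 and height <= 576 and bitrate <= 10_000_000:
            return "SD"
        if width <= 1920 and height <= 1152 and bitrate <= 20_000_000:
            return "HD"
    elif profile == "HP":
        # For the high profile, HD is only added together with AC3 audio.
        if (width <= 1920 and height <= 1152
                and bitrate <= 30_000_000 and audio == "AC3"):
            return "HD"
    return None  # no SD/HD marker according to the rules above


print(avc_size_class("MP", 720, 576, 8_000_000))            # SD
print(avc_size_class("HP", 1920, 1080, 25_000_000, "AC3"))  # HD
print(avc_size_class("HP", 1920, 1080, 25_000_000, "AAC"))  # None
```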
So far I did not get the time to experiment with this. I also have the impression that minidlna still has some rough edges (the Sintel video I used to test before with a different media server does not show up in the list with minidlna).
Tags: 1080i or 1080p,
720p or 1080i,
audio mpeg,
packet length,
pcm samples,
program streams,
sony bravia tv,
sony tv,
source reading,
transport streams —