Storage VM: Part 4 – Beep Beep Overheat

The price of silence finally hits me.  Way back in the build section I mentioned how I replaced every fan in both chassis with low-noise fans from Noctua.  Noise and airflow are at odds with each other, and here is where it comes back to bite me.  After installing the expansion chassis in the rack, I heard some random beeping about 10 minutes after powering it on.  I suspected it was heat related.  There are two important factors when trying to keep a server cool: first, how heat escapes the room the server is in, and second, how well heat flows from the hot spots in the chassis to the air.

Outside air is the ultimate destination for all heat removal.  There are numerous transitions possible when transferring heat between mediums, but outside air is where the final removal takes place. (Well, outer space to be more precise, but planetary cooling characteristics aren't really my point here.)

Water cooling systems, which I have built, merely transfer heat through water to a radiator, which then emits that heat to the air.  AIO systems do the same, as they usually use water, though not always.  What these systems do is spread the heat over a larger surface area, so more heat can be shed more quickly.  Heatsinks take the concentrated heat on a chip and spread it over a larger surface area, so that a fan can blow air across it and carry the heat away.

Pretty much all cooling systems we use here on Earth do something that looks like this: Source of Heat -> transfer medium (could be multiple) -> Outside Air.  There are more esoteric ones that use phase changes for cooling, but those aren't really common.  We use air.
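One way to picture that chain is as a series of thermal resistances, where each hop adds some temperature rise per watt of heat flowing through it. The sketch below uses that standard steady-state approximation. The function name and every number in it are made up for illustration and are not measurements from my chassis.

```python
def source_temperature(ambient_c, power_w, resistances_c_per_w):
    """Steady-state temperature of a heat source at the head of a series chain.

    ambient_c: temperature of the air at the end of the chain, in Celsius
    power_w: heat being produced by the source, in watts
    resistances_c_per_w: thermal resistance of each hop, in Celsius per watt
    """
    return ambient_c + power_w * sum(resistances_c_per_w)


# Hypothetical chain: chip -> heatsink, heatsink -> case air, case air -> room.
chain = [0.2, 0.5, 1.0]

# The same 20 W part in a cooler room and in a warmer room.
for ambient in (25.0, 35.0):
    temp = source_temperature(ambient, 20.0, chain)
    print(f"room at {ambient:.0f} C -> source at about {temp:.0f} C")
```

The two levers fall straight out of this: lower the ambient temperature (the room), or lower one of the resistances (better airflow inside the chassis).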

Data centers have massive cooling towers, where cold air is circulated throughout the facility.  Each data center's cooling plan is a bit different, but I have seen towers that run through the facility from top to bottom, floor by floor, with cold air pushed out of one tower and hot air collected in another.  I have also seen a standard AC configuration from an office building that was later converted to a data center.  Heck, a data center at the web hosting firm I worked at was, at one point, in the basement of a mall.

Essentially, what is needed is big AC units that pump cold air into the general area.  This is the first line of defense for keeping the servers cold.  Make the environment colder.  That is probably pretty obvious, but the physical design can be challenging, especially since companies have to pay the power bill on these, so efficient designs are highly desirable.

However, how hot is too hot for a server?  Well, hot, like unbearable to humans hot, is the answer.  Google has kept servers running at 115°F [1], and Dell warranties them for up to 115°F too.
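For anyone who thinks in Celsius, here is a quick conversion sketch. The helper name is my own, and it is just the standard Fahrenheit to Celsius formula applied to the figure above.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# The 115 F figure quoted above, rounded to one decimal place.
print(round(fahrenheit_to_celsius(115.0), 1))
```

That works out to roughly 46°C, which is well past what I would call comfortable for a room.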

From my own perspective, I don’t really have a desire to do house work (or have it remodeled) on this.  As long as I keep it bearable to me, I should be in good shape.  Let’s cover the room characteristics of my rack room. 

The rack room’s air circulation is notably worse than the rest of the house, as it is actually just a storage room.  I keep my Christmas ornaments in there as well. (I am an absolute sucker for Hallmark Keepsake Ornaments, and a proud member of the Keepsake Ornament Club.) There is also quite a bit of dedicated shelf space for old projects, the 3D printer, old parts, and more.  Additionally, this is where the AV cabinet for running all of the amplifiers is located.  As the name suggests, it has my server rack and networking equipment.  From this description alone I’m sure people are figuring out that it can get rather hot in this room.

This hasn’t been an issue up until this point.  I have had my workstation, the old storage server, the streaming machine, and more running in here for years with no issues.  I’m pretty sure that the room temperature isn’t the issue. 

Two counter rotating fans

The second characteristic is about airflow within the chassis itself.  The original fans in the expansion chassis were loud and pushed a ton of air through the system.  They used a counter-rotating system, where two fans spin in opposite directions but push air the same way [2].  I think I’ve seen this kind of design more in power supplies than servers. That may just be my experience rather than an industry standard.

I had a mild curiosity about it at the time, but ultimately didn’t think much of it.  This wasn’t a complete system install, just some drives and a controller board.  I took them out and replaced them with some Noctuas, then called it good.

The chassis had all kinds of free air flow while being tested and put together

When I was building the chassis I had it in a working spot right next to the home’s AC unit, rather than in the rack room.  It never gave me any trouble.  I think this was probably more a result of the system being laid out in the open without being truly put together.  In retrospect, it’s not surprising that this didn’t have issues: there was airflow from the cool room, and little resistance in the case.

The motherboard shelf covers all the lower components

Looking at the case as I installed it in the rack, though, I can see where some of the issues developed. There is a big plate blocking all of the airflow below the top.  The fans do not reach down into the backplanes below, and in fact cannot push air down there.

I was, and still am, pretty sure this isn’t all that big of a deal.  The double beeps of a heat warning went on for days without any discernible issue with the system.  As mentioned before, these servers are designed for operating at pretty hot temperatures.  The beeping did get to me though.  I don’t like random beeping whenever I get close to the room (and neither does my partner), so I set out to do something about it.

After a brief foray into the controller board, during which I completely forgot about the earlier password issue and expected LastPass to remember the correct password, I logged into the Supermicro control panel and found that it wasn’t reporting any sensor warnings.  All clear here.

Thinking about it, I went to check on the chassis itself.  Sure enough, it had a warning light, clear as day.  A brief foray into the manual confirmed that my supposition was indeed correct.  I can verify that the PSUs are present and the fans are working, so it has to be a high temperature warning.

Left: The red warning light, Right: The manual explaining what it could mean

140mm Fan added trying to push air down

I mulled the situation over for a bit.  The issue had to be that the backplanes themselves were not getting a lot of air.  Look at the previous photo: there aren’t a lot of ways for air to get down below.  Luckily the controller board has pinouts for half a dozen fans.  So I tried strapping a 140mm fan on the top to see if I could get some air down below.  That didn’t appear to do anything.  The beeps kept coming.

Honestly, at this point I let it stew for a bit.  I wasn’t prevented from doing the rest of the project, and the warning didn’t really affect what I was doing; it was just annoying.  With no great solution, how was I going to get airflow below that plate?  I could always just remove the plate, but I was kinda hoping not to have to take the whole thing apart.  Also, where exactly would the controller board sit?

Handle shelf for the chassis

It struck me about a week later.  I could go even lower tech.  I remembered that when I was installing the rails on the chassis, I had to remove the handles that it came with.  These handles actually get stored in little shelves that attach to the back of the chassis.  Well, these shelves don’t seem to serve any purpose but to hold the handles so they don’t get lost.  I can just take them out, and then I can see the lower backplanes from the outside.

Room fan blowing into the chassis

Then it was a simple matter of just sticking a room fan blowing in there.  It is not a particularly elegant solution.  But the beeping did stop completely, and the warning light went out.  

I have the beginnings of a working solution here.  The problem has been identified, and I can keep iterating from here.  I remembered when I was researching fan replacements that Noctua makes very small 40mm fans.  Knowing that the chassis power supply uses 40mm fans, I was certain that they would fit in the slots where the handle shelves used to go.

I purchased two of them, ran their wires up to the controller board, and waited to see.  Unfortunately, the beeping returned.  Being stubborn about it, I doubled the number of fans and moved them closer.  Still the beeping returned. 

40mm Fans in the shelf spots

For now I’ve opted to just include the external room fan.  It isn’t loud.  I just prefer a solution that is internal to the system.  I still think this is the right path, but I also think removing the plate may be necessary to get enough airflow down there.



