Hello all, I am experiencing a “freezing” issue. I have 20 VMs: a master and 19 hosts, each running between 20 and 30 containers.

I’m migrating my Discourse install to a new hosted server and running into some serious head-smash-on-desk problems that I can’t figure out. On executing the launcher bootstrap command, I get this:

./launcher bootstrap app
/usr/bin/docker: Error response from daemon: invalid header field value 'oci runtime error: container_linux.go:247: starting container process caused 'process_linux.go:334: running prestart hook 0 caused 'error running hook: exit status 1, stdout:, stderr: time= '2016-11-10T14:20:29-05:00 ' level=fatal msg= 'failed to add interface veth7d2a024 to sandbox: failed to get link by name 'veth7d2a024 ': Link not found ' n ' ' n'.
Your Docker installation is not working correctly
See:

The relevant errors are the “failed to add interface vethxxx to sandbox” and “link not found” errors, I believe. This is on a server running Ubuntu 16.04 LTS, with the app.yml file and templates set up identically to the running instance from which I’m migrating (except for a change in the hostname).

Edited to add: I’m using Docker 1.12.3, via the docker-engine package provided.

I am using iptables (will attach my rules at the bottom), and some googling seems to reveal that Docker sometimes shits itself into a blind fury with iptables (there are many reports of this). So, I’ve already modified Docker with --iptables=false and bounced the server. After having it launch my first container, I see this in journalctl:

Could not generate persistent MAC address for vethXXXX: No such
systemd[1]: Started docker container 1234
systemd[1]: Starting docker container 1234.

Note there are two distinct veth entries. I don’t know if that means anything.
Problem behavior is unaffected. I’ve also followed one suggestion and thrown in a pair of iptables rules to allow unrestricted traffic flow between eth0 and docker0. Problem behavior is unaffected. I’ve tried flushing all iptables rules and bootstrapping again, both with and without --iptables=false set for Docker. Problem behavior is unaffected.

One weird thing which may or may not matter is that the veth interface listed in the error message does not match any interfaces shown when I do an ifconfig. Every time the bootstrap fails I’m left with another orphaned veth interface, but none of them match the ones listed in each error.

Any assistance would be great. I am totally lost as to where to go from here, especially if this turns out to be some kind of stupid Docker bug. Hopefully this’ll just be something simple.

Edit: Guh, this is almost certainly Docker-related.

Welcome to the shady underworld of Docker Mysteries. As you say, this is definitely a Docker bug, but the fact that you’ve tried across multiple versions suggests it isn’t just a Docker bug. That is, there’s something else in the machine’s setup that’s causing problems. Browsing through the bugs that mention the error you’re getting, it looks like there’s a whole raft of possible causes, from appallingly bad endpoint protection software to bugs in certain kernels.

I see two ways forward. If you have l33t sk1llz in kernel hacking, you can watch the netlink messages flying around (I’m pretty sure tcpdump can capture them, from memory), and dig into the source of the running kernel to figure out why the messages aren’t doing what you might otherwise expect. The root problem might be in the kernel, or in Docker, but I’m pretty sure you’ll need to rummage around in the kernel to figure out what’s going on, anyway.
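As a lighter-weight first look than a full tcpdump capture, here is a rough Python sketch that subscribes to the kernel’s rtnetlink link notifications and prints every interface creation and deletion as it happens. Note that this is a swapped-in technique, not the tcpdump approach mentioned above: it only sees the kernel’s broadcast notifications, not the requests Docker sends. The function names are made up for illustration, and the constants are the values from the Linux rtnetlink headers. The idea is to run it while the bootstrap fails and see whether the veth name from the error ever shows up, and under what name it disappears.

```python
import socket
import struct

# Values from <linux/rtnetlink.h> and <linux/if_link.h>.
RTMGRP_LINK = 1      # multicast group for link (interface) events
RTM_NEWLINK = 16
RTM_DELLINK = 17
IFLA_IFNAME = 3      # attribute carrying the interface name

NLMSG_HDR = struct.Struct("=IHHII")   # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")  # family, pad, type, index, flags, change


def align4(n):
    """Netlink messages and attributes are padded to 4-byte boundaries."""
    return (n + 3) & ~3


def parse_link_messages(data):
    """Yield (event, ifindex, ifname) for each link message in one datagram."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(data):
        msg_len, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        if msg_len < NLMSG_HDR.size or offset + msg_len > len(data):
            break  # truncated or malformed message, stop parsing
        end = offset + msg_len
        body = offset + NLMSG_HDR.size
        if msg_type in (RTM_NEWLINK, RTM_DELLINK) and body + IFINFOMSG.size <= end:
            _fam, _typ, ifindex, _fl, _chg = IFINFOMSG.unpack_from(data, body)
            name = None
            attr = body + IFINFOMSG.size
            while attr + 4 <= end:
                rta_len, rta_type = struct.unpack_from("=HH", data, attr)
                if rta_len < 4 or attr + rta_len > end:
                    break
                if rta_type == IFLA_IFNAME:
                    name = data[attr + 4:attr + rta_len].rstrip(b"\0").decode()
                attr += align4(rta_len)
            event = "NEWLINK" if msg_type == RTM_NEWLINK else "DELLINK"
            yield (event, ifindex, name)
        offset += align4(msg_len)


def watch_links():
    """Print link events until interrupted (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_LINK))
    while True:
        for event, ifindex, name in parse_link_messages(sock.recv(65535)):
            print(event, ifindex, name)
```

Calling watch_links() in one terminal and running ./launcher bootstrap app in another should show which veth names the kernel actually creates and removes, which can then be compared against the name in the error message.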
The other option is to do a very careful comparison between the two machines you’re running (every package, every running process, every /proc/sys setting, heck, potentially the checksum of every binary) to figure out what’s different between the working and non-working machine.

Yeah, I tried the docker install script; all it effectively ends up doing is sniffing out your distro version and adding the right repo to your sources list. It’s the same thing as the manual process, just faster. The discourse-setup script isn’t an option for me because it immediately bails if it senses you’ve got something else bound to port 80 (which I do, since there are a half-dozen other sites on this server). Still, things are a lot better now than they used to be: we’re miles ahead of having to screw with passenger and 80 different conflicting versions of Ruby.

Resurrecting this issue: I haven’t been able to make any progress on this, in spite of months of on-and-off effort. I’ve gone through multiple kernel upgrades and am currently on 4.8.0, and nothing, literally nothing I’ve done, has made even the slightest difference in the problem behavior. As far as I can tell, I’m the only person on the whole damn internet who’s having this exact problem. Which really sucks, because it’s on a dedicated server in a colo datacenter far far away from me, so I can’t just start swapping hardware.
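For the machine-to-machine comparison suggested earlier in the thread, the /proc/sys part is easy to automate. The following is only a minimal sketch with made-up function names: it snapshots every readable file under a directory tree (by default /proc/sys) into a JSON file, so that a snapshot taken on the working machine can be copied over and diffed against one from the broken machine. It does not cover packages, processes or binary checksums.

```python
import json
import os


def snapshot_tree(root="/proc/sys"):
    """Map each readable file under root to its stripped text contents."""
    values = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="replace") as fh:
                    values[os.path.relpath(path, root)] = fh.read().strip()
            except OSError:
                continue  # some entries are write-only or need root
    return values


def save_snapshot(values, filename):
    """Write a snapshot to disk so it can be copied to the other machine."""
    with open(filename, "w") as fh:
        json.dump(values, fh, indent=1, sort_keys=True)


def load_snapshot(filename):
    with open(filename) as fh:
        return json.load(fh)


def diff_snapshots(working, broken):
    """Return {key: (working_value, broken_value)} for every differing key."""
    keys = set(working) | set(broken)
    return {
        key: (working.get(key), broken.get(key))
        for key in sorted(keys)
        if working.get(key) != broken.get(key)
    }
```

A typical run would be save_snapshot(snapshot_tree(), "working.json") on the old server, the same with "broken.json" on the new one, then diff_snapshots(load_snapshot("working.json"), load_snapshot("broken.json")) on either. Expect some noise from values that legitimately differ per host (hostname, random boot IDs, counters); the settings under net/ are probably the ones worth reading first for a veth problem.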