My server appears to be completely down. What's going on?
Good day Daniel,

I have verified that your VPS is up and running. The test results below confirm that your VPS is responding to ping with no packet loss:

PING 167.114.3.218 (167.114.3.218) 56(84) bytes of data.
64 bytes from 167.114.3.218: icmp_req=1 ttl=61 time=3.07 ms
64 bytes from 167.114.3.218: icmp_req=2 ttl=61 time=2.72 ms
64 bytes from 167.114.3.218: icmp_req=3 ttl=61 time=3.85 ms
^C
--- 167.114.3.218 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 2.729/3.217/3.850/0.468 ms

:~$ mtr -c30 -r 167.114.3.218
HOST: xxxxxxxxxxxxxxxxxxxxxxxxxxx Loss% Snt Last Avg Best Wrst StDev
1.|-- mtl-1-6k.qc.ca 0.0% 30 1.0 18.4 1.0 137.3 37.6
2.|-- bhs-g1-6k.qc.ca 0.0% 30 3.8 4.8 2.6 11.1 2.1
3.|-- bhs-s5-6k.qc.ca 53.3% 30 2.2 5.5 2.1 37.4 9.2
4.|-- 51.ip-167-114-96.net 0.0% 30 2.2 4.2 2.1 11.3 2.3
5.|-- realm.of.the.chickenkille 0.0% 30 5.1 4.1 2.1 8.0 2.0

If you believe that there is a connection issue on your server/VPS, please send us the results of the following tests so that we can escalate this to our system administrators for further investigation:

- Results of a ping on your server.
- Results of a traceroute (preferably done with MTR) from your local host to your server.
- Results of a traceroute (preferably done with MTR) from your server to your local host.
- Results of your server downloading the following file: http://proof.ovh.ca/files/1Gio.dat (preferably using the wget -O /dev/null http://proof.ovh.ca/files/1Gio.dat command).

Please note that you can upload screenshots to our website at http://demo.ovh.eu/ and paste the URL in this ticket for further assistance.

Thank you for your time and cooperation. I remain at your disposal for further assistance.

Cordially,
Dhavy A.
Customer Advocate
The issue happened yesterday for about 10 minutes, but I am yet again suffering a network outage! The server does not respond at all to ping or traceroute; all network traffic is dropped. I cannot access the server. The server did not reboot yesterday, so it's purely a network problem.
This issue is happening at this very moment.
OK, it does ping/trace, it's just suffering major packet loss.

traceroute to ovh1.starlis.com (167.114.3.218), 30 hops max, 60 byte packets
 1 router (192.168.1.1) 2.464 ms 3.564 ms 3.652 ms
 2 cpe-174-109-192-001.nc.res.rr.com (174.109.192.1) 29.915 ms 29.981 ms 42.136 ms
 3 66.26.45.1 (66.26.45.1) 26.437 ms 27.291 ms 27.381 ms
 4 cpe-024-025-062-022.ec.res.rr.com (24.25.62.22) 23.905 ms 24.488 ms 27.446 ms
 5 be31.drhmncev01r.southeast.rr.com (24.93.64.184) 29.798 ms 29.698 ms 30.009 ms
 6 107.14.19.44 (107.14.19.44) 47.345 ms 107.14.19.42 (107.14.19.42) 45.536 ms 107.14.19.20 (107.14.19.20) 44.644 ms
 7 so-1-1-1.c1.buf00.tbone.rr.com (66.109.1.113) 42.801 ms 24.271 ms ae0.pr1.dca10.tbone.rr.com (107.14.17.200) 24.747 ms
 8 ix-17-0.tcore2.AEQ-Ashburn.as6453.net (216.6.87.149) 30.489 ms 31.686 ms 32.043 ms
 9 if-3-2.tcore2.NJY-Newark.as6453.net (216.6.87.10) 104.240 ms 100.441 ms 40.640 ms
10 * if-2-2.tcore1.NJY-Newark.as6453.net (66.198.70.1) 34.035 ms if-15-3.thar2.NJY-Newark.as6453.net (66.198.111.145) 36.609 ms
11 if-1-3.thar1.NJY-Newark.as6453.net (216.6.57.1) 43.140 ms if-14-3.thar1.NJY-Newark.as6453.net (66.198.70.14) 121.196 ms if-1-3.thar1.NJY-Newark.as6453.net (216.6.57.1) 46.084 ms
12 * * *
13 * bhs-g2-6k.qc.ca (198.27.73.207) 449.422 ms 359.668 ms
14 * * *
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *

>>> mtr --report ovh1.starlis.com
Start: Fri Dec 5 14:53:17 2014
HOST: TYRIAL Loss% Snt Last Avg Best Wrst StDev
 1.|-- router 0.0% 10 1.1 1.1 1.1 1.1 0.0
 2.|-- cpe-174-109-192-001.nc.re 70.0% 10 42.9 43.9 38.3 50.5 6.1
 3.|-- 66.26.45.1 70.0% 10 25.9 23.3 18.1 25.9 4.4
 4.|-- cpe-024-025-062-022.ec.re 70.0% 10 12.2 12.5 11.8 13.6 0.7
 5.|-- be31.drhmncev01r.southeas 70.0% 10 16.5 13.4 11.3 16.5 2.6
 6.|-- 107.14.19.44 70.0% 10 25.5 26.7 25.5 27.6 1.0
 7.|-- ae0.pr1.dca10.tbone.rr.co 70.0% 10 25.5 53.2 25.5 107.8 47.3
 8.|-- ix-17-0.tcore2.AEQ-Ashbur 70.0% 10 29.3 27.9 26.2 29.3 1.2
 9.|-- if-3-2.tcore2.NJY-Newark. 70.0% 10 109.4 85.2 35.3 110.8 43.2
10.|-- if-15-3.thar2.NJY-Newark. 80.0% 10 36.4 34.9 33.4 36.4 2.0
11.|-- if-1-3.thar1.NJY-Newark.a 70.0% 10 39.4 39.7 34.8 44.9 5.0
12.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
13.|-- bhs-g2-6k.qc.ca 70.0% 10 345.7 351.6 340.3 368.7 15.1
14.|-- bhs-s5-6k.qc.ca 80.0% 10 185.1 122.3 59.6 185.1 88.7
15.|-- ??? 100.0 10 0.0 0.0 0.0 0.0 0.0
I also received alerts from your monitoring system:

Our monitoring system has just detected a fault on your VPS vps31308.vps.ovh.ca.
The fault was noticed on 2014-12-05 14:41:00

Logs:
----------------------
PING vps31308.vps.ovh.ca (167.114.3.218) from 198.100.155.250 : 56(84) bytes of data.
From 198.100.155.250: Destination Host Unreachable
From 198.100.155.250: Destination Host Unreachable
From 198.100.155.250: Destination Host Unreachable
--- 167.114.3.218 ping statistics ---
10 packets transmitted, 0 packets received, +6 errors, 100% packet loss
---------------------

On 2014-12-05 14:41:00, we noticed a fault on your VPS.
However, on 2014-12-05 14:56:00 our monitoring system did not detect any faults on your VPS vps31308.vps.ovh.ca

Logs:
----------------------
PING vps31308.vps.ovh.ca (167.114.3.218) from 198.100.155.250 : 56(84) bytes of data.
From 198.100.155.250: Host is alive
From 198.100.155.250: Host is alive
From 198.100.155.250: Host is alive
--- 167.114.3.218 ping statistics ---
10 packets transmitted, 10 packets received, 0 errors, 0% packet loss
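One way to read the two mtr reports quoted in this thread is to ask where loss begins and then persists all the way to the final hop. The following is a minimal Python sketch under that assumption. The function names and the 5% threshold are hypothetical choices for illustration, not part of mtr or of OVH's tooling, and the rule of thumb it encodes (loss at one intermediate hop that vanishes further along is usually a router deprioritising ICMP replies) is a common heuristic rather than a guarantee.

```python
import re

# Matches one hop row of `mtr --report` output, for example:
#   "13.|-- bhs-g2-6k.qc.ca 70.0% 10 345.7 351.6 340.3 368.7 15.1"
# The "%" is optional because some rows in the reports print "100.0" without it.
HOP_ROW = re.compile(r"^\s*(\d+)\.\|--\s+(\S+)\s+(\d+(?:\.\d+)?)%?\s+(\d+)\s")


def parse_mtr_report(text):
    """Return a list of (hop_number, host, loss_percent) tuples."""
    hops = []
    for line in text.splitlines():
        match = HOP_ROW.match(line)
        if match:
            hops.append((int(match.group(1)), match.group(2), float(match.group(3))))
    return hops


def first_hop_of_sustained_loss(hops, threshold=5.0):
    """Return the first hop from which every later hop also shows loss.

    Loss that appears at one hop and disappears at a later hop is ignored.
    Returns None when the final hop is below the threshold.
    """
    start = None
    for hop in hops:
        if hop[2] >= threshold:
            if start is None:
                start = hop
        else:
            # A clean hop further along resets the candidate.
            start = None
    return start
```

Applied to the report in the support reply, this returns None, because the 53.3% at bhs-s5-6k.qc.ca does not carry through to the last hop. Applied to the report taken from TYRIAL, it returns hop 2, the first hop after the local router.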
That's two days in a row, not to mention the random reboot I had last month. I am not pleased with this service. Please correct things so it doesn't happen again. I can't have downtime like this.
If I don't get my site back up in the next 10 minutes, I'll be closing this account at the end of its term...
Something seems to be wrong with the hardware: it is showing a very high load average with no CPU utilization. http://screencloud.net/v/4jpd
This has to be the worst support I've ever seen.
Hello,

The reason you haven't received a reply yet is that we are still waiting for the results of the four tests we requested in our last reply. Again, to be able to look into this for you, we need:

- Results of a ping on your server.
- Results of a traceroute (preferably done with MTR) from your local host to your server.
- Results of a traceroute (preferably done with MTR) from your server to your local host.
- Results of your server downloading the following file: http://proof.ovh.ca/files/1Gio.dat (preferably using the wget -4 -O /dev/null http://proof.ovh.ca/files/1Gio.dat command)

Once we have this, we'll be able to look into resolving this issue for you.

Regards,
Phil C.
Customer Service Supervisor
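For reference, the four requested tests can be laid out as concrete commands. The Python sketch below only builds the command lines and does not run them. The mtr and wget flags are the ones quoted in this thread; the function names, the grouping keys, and `ping -c` (the Linux/macOS count flag) are assumptions added for illustration, and 203.0.113.10 is a placeholder documentation address standing in for the customer's own IP.

```python
import shlex

# URL taken from the support reply; the rest of this module is illustrative.
TEST_FILE_URL = "http://proof.ovh.ca/files/1Gio.dat"


def build_diagnostic_commands(server, local_host, ping_count=10, mtr_cycles=30):
    """Return the four requested tests, grouped by the machine that runs them."""
    return {
        "from_local_host": [
            # Ping the server from the customer's location.
            ["ping", "-c", str(ping_count), server],
            # MTR from the local host to the server.
            ["mtr", "-c", str(mtr_cycles), "-r", server],
        ],
        "from_server": [
            # MTR in the reverse direction, from the server to the local host.
            ["mtr", "-c", str(mtr_cycles), "-r", local_host],
            # Download test, forced to IPv4, discarding the file.
            ["wget", "-4", "-O", "/dev/null", TEST_FILE_URL],
        ],
    }


def as_shell_lines(commands):
    """Render the commands as copy-and-paste shell lines with a header per group."""
    lines = []
    for where, argv_list in commands.items():
        lines.append("# " + where)
        lines.extend(shlex.join(argv) for argv in argv_list)
    return lines


if __name__ == "__main__":
    for line in as_shell_lines(build_diagnostic_commands("167.114.3.218", "203.0.113.10")):
        print(line)
```

Note that the two `from_server` tests require shell access to the VPS, which is only possible while the server is reachable.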
... How can you even work for a company this bad? The previous dedicated host I left for bad support wasn't this bad.

I GAVE you those responses. And it's YOUR job to look into why your services are failing when it's reported. Your own systems marked it as a failure.

Yet again it went down 12 hours ago, based on your own systems' reporting.

This is the most bullshit support I've ever seen. Our contract is done. I will not be renewing.
Hello again,

I've just looked through all the messages you've sent us. I see two MTR results from local hosts to your server. There is no MTR from the server to a local host, there is no ping on your server from your location, and there is no wget test.

As we are an unmanaged service provider, it is not our job to run the tests showing that there is a problem. Once you provide the evidence, we will be able to resolve it for you.

Regards,
Phil C.
Customer Service Supervisor

> ... How can you even work for a company this bad? My previous dedicated host I left for bad support wasn't this bad.
>
> I GAVE you those responses. And its YOUR job to look into why your services are failing when its reported. Your own systems marked it as a failure.
>
> Yet again it went down 12 hours ago based on your own systems reporting.
>
> This is the most bullshit support I've ever seen. Our contract is done. I will not be renewing.
It is your job when your own internal monitoring service shows that the server is unreachable and that the system load is high even when I'm not utilizing it. Hell, I even gave you the information to diagnose the likely cause: it's likely another customer on the system using a lot of resources.

It doesn't matter. You've lost my business with no chance of saving it. I have unmanaged hosting with my dedicated servers, and they have no problem investigating reports of their network being down. That's the key: in unmanaged hosting, YOU are responsible for network and power. This is a network problem, so it is in your area.

I even told you originally that I couldn't access the system! How do you expect me to give you a reverse MTR and wget if I can't? You need to read my responses instead of cutting straight to your copy-and-paste book.
Hello,

Other than the tests themselves, nothing I sent you was copied and pasted. I am more than willing to help you with the issue. I've asked you to run tests that will take a grand total of 5 minutes. We have our procedures. If you do not wish to follow them, there is nothing else I can do to assist you with the problem.

Cordially,
Phil C.
Customer Service Supervisor
Again, how can I run those tests when I cannot access the system because of the very problem I'm reporting? You're asking me to do something impossible.
And note that when I did finally get in, the packet loss made the system unusable, and it took a lot of effort just to run htop and see the system load. I eventually lost my connection to the system because of the extreme packet loss.
Hello,

According to my tests, your VPS is up and running at the moment. I ran a ping test, and it is replying:

$ ping vps31308.vps.ovh.ca
PING vps31308.vps.ovh.ca (167.114.3.218): 56 data bytes
64 bytes from 167.114.3.218: icmp_seq=0 ttl=61 time=4.388 ms
64 bytes from 167.114.3.218: icmp_seq=1 ttl=61 time=3.973 ms
64 bytes from 167.114.3.218: icmp_seq=2 ttl=61 time=3.718 ms
64 bytes from 167.114.3.218: icmp_seq=3 ttl=61 time=4.697 ms

Furthermore, my system indicates that the VPS is online, and that HTTP, HTTPS, DNS, and SMTP are all working fine.

I have noticed, however, that your SSH port appears to be closed, which you probably already detected using nmap. It seems you've configured something within your VPS (probably a firewall) to block TCP port 22. That would obviously block you from making SSH connections, unless you've moved SSH to a different port.

If you have blocked your SSH access, you would need to boot your VPS into Rescue Mode and correct the issue from there. Here is a guide on using Rescue Mode: http://docs.ovh.ca/en/guides-ovh-rescue.html

Regards,
Phil C.
Customer Service Supervisor

> Again, how can I run those tests when I can not access the system due to the complaint at hand?
> You're asking me to do something impossible.
The issue occurred the day before yesterday, yesterday, and then again last night. You would have needed to run your test during the time I was responding to the ticket. Do you not have logs from your own internal monitoring? You should be able to see exactly when my server was offline.

And my SSH is fine when the server is up. It's simply on a different port. I wasn't complaining about that. But that status tool does not work, as it said things were up when the server was clearly down.
Your own monitoring tool also runs a traceroute, and it reported this to my email:

Our monitoring system has just detected a fault on your VPS vps31308.vps.ovh.ca.
The fault was noticed on 2014-12-06 00:22:00

Your VPS may function correctly, though a few false positives may appear:
- following a reboot of the VPS with a fsck which would be launched automatically, delaying the reboot
- on setting up firewall rules that block the monitoring requests (198.100.155.250)

If you don't want your VPS to be monitored by our system you can disable it:
- via your manager https://www.ovh.com/manager/web/login.html
- via the RESTful API offered by OVH, under ownership of slaMonitoring via the URL: https://ca.api.ovh.com/1.0/vps/vps31308.vps.ovh.ca

By disabling the monitoring of your VPS, this ticket will not be processed by our teams and will be closed automatically after 15 days of inactivity in exchanges with the support.

Logs:
----------------------
PING vps31308.vps.ovh.ca (167.114.3.218) from 198.100.155.250 : 56(84) bytes of data.
From 198.100.155.250: Destination Host Unreachable
From 198.100.155.250: Destination Host Unreachable
From 198.100.155.250: Destination Host Unreachable
--- 167.114.3.218 ping statistics ---
10 packets transmitted, 0 packets received, +6 errors, 100% packet loss
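Since the monitoring emails quoted in this thread are the main record of when the VPS was unreachable, it may help to pull the numbers out of them. The Python sketch below is a hypothetical helper, not an OVH tool: it parses the ping summary lines in the two formats that appear in this ticket and computes the gap between two alert timestamps. The function names are made up for illustration, and the gap between a fault alert and the next clean check is only a rough bound on the outage, not its exact length.

```python
import re
from datetime import datetime

# Handles both summary styles that appear in this ticket:
#   "3 packets transmitted, 3 received, 0% packet loss, time 2002ms"
#   "10 packets transmitted, 0 packets received, +6 errors, 100% packet loss"
SUMMARY = re.compile(
    r"(\d+) packets transmitted, (\d+) (?:packets )?received"
    r".*?(\d+(?:\.\d+)?)% packet loss"
)


def parse_ping_summary(line):
    """Return sent/received/loss figures from a ping summary line, or None."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    return {
        "sent": int(match.group(1)),
        "received": int(match.group(2)),
        "loss_percent": float(match.group(3)),
    }


def minutes_between(start, end, fmt="%Y-%m-%d %H:%M:%S"):
    """Minutes elapsed between two timestamps in the alert emails' format."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60
```

For the first alert in this thread, `minutes_between("2014-12-05 14:41:00", "2014-12-05 14:56:00")` gives 15.0, the time between the fault being noticed and the next check that found the host alive.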