OVH

Opened on 12/4/14 3:52 PM

Closed on 12/6/14 2:08 PM

From: customer

My server appears to be completely down.

What's going on?

12/4/14 3:52 PM

From: support

Good day Daniel,

I verified and confirmed that your VPS is up and running. The test results below confirm that your VPS is responding to ping with no packet loss:

PING 167.114.3.218 (167.114.3.218) 56(84) bytes of data.
64 bytes from 167.114.3.218: icmp_req=1 ttl=61 time=3.07 ms
64 bytes from 167.114.3.218: icmp_req=2 ttl=61 time=2.72 ms
64 bytes from 167.114.3.218: icmp_req=3 ttl=61 time=3.85 ms
^C
--- 167.114.3.218 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 2.729/3.217/3.850/0.468 ms

:~$ mtr -c30 -r 167.114.3.218
HOST: xxxxxxxxxxxxxxxxxxxxxxxxxxx Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- mtl-1-6k.qc.ca             0.0%    30    1.0  18.4   1.0 137.3  37.6
  2.|-- bhs-g1-6k.qc.ca            0.0%    30    3.8   4.8   2.6  11.1   2.1
  3.|-- bhs-s5-6k.qc.ca           53.3%    30    2.2   5.5   2.1  37.4   9.2
  4.|-- 51.ip-167-114-96.net       0.0%    30    2.2   4.2   2.1  11.3   2.3
  5.|-- realm.of.the.chickenkille  0.0%    30    5.1   4.1   2.1   8.0   2.0


If you believe that there is a connection issue on your server/VPS, please send us the results of the following tests so that we can escalate this to our system administrators for further investigation:

Results of a ping on your server.
Results of a traceroute (preferably done with MTR) from your local host to your server.
Results of a traceroute (preferably done with MTR) from your server to your local host.
Results of your server downloading the following file: http://proof.ovh.ca/files/1Gio.dat (preferably using the command wget -O /dev/null http://proof.ovh.ca/files/1Gio.dat).
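
The four requested tests can be written out as concrete commands. The following is only a minimal sketch, not OVH's procedure: the function names, the `cycles` default, and the ping count are assumptions made for illustration, and the wget invocation is the one quoted in the ticket (with the `-4` flag used in support's later reply). It groups the commands by the machine they have to run on, which makes it visible that two of the four need shell access to the server itself.

```python
import shlex
from typing import Dict, List

# URL quoted by support in the ticket.
PROOF_URL = "http://proof.ovh.ca/files/1Gio.dat"


def diagnostic_commands(server: str, local_host: str,
                        cycles: int = 30) -> Dict[str, List[List[str]]]:
    """Return the requested tests as argv lists, grouped by where they run.

    `server` and `local_host` are placeholders for the two endpoints;
    `cycles` (hypothetical default) is the number of probes mtr sends.
    """
    return {
        # Run from the customer's own machine.
        "local": [
            ["ping", "-c", "10", server],
            ["mtr", "--report", "-c", str(cycles), server],
        ],
        # Run on the VPS, so these require working access to it.
        "server": [
            ["mtr", "--report", "-c", str(cycles), local_host],
            ["wget", "-4", "-O", "/dev/null", PROOF_URL],
        ],
    }


def as_shell_lines(commands: List[List[str]]) -> List[str]:
    """Render argv lists as copy-pasteable shell lines."""
    return [shlex.join(argv) for argv in commands]
```

For example, `as_shell_lines(diagnostic_commands("167.114.3.218", "203.0.113.10")["server"])` yields the two lines to run on the VPS; the second address is a documentation placeholder, not a host from the ticket.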

Please note that you can upload screenshots to our website at http://demo.ovh.eu/ and paste the URL in this ticket for further assistance.

Thank you for your time and cooperation and I remain at your disposal for further assistance.

Cordially,

Dhavy A.
Customer Advocate.

12/5/14 12:30 PM

From: customer

The issue happened yesterday for about 10 minutes, but I am yet again suffering a network outage!

The server does not respond at all to ping or traceroute; all network traffic is dropped.

I cannot access the server.

The server did not reboot yesterday, so it's purely a network issue.

12/5/14 2:44 PM

From: customer

This issue is happening at this very moment.

12/5/14 2:45 PM

From: customer

OK, it does respond to ping/traceroute; it's just suffering major packet loss.

traceroute to ovh1.starlis.com (167.114.3.218), 30 hops max, 60 byte packets
 1  router (192.168.1.1)  2.464 ms  3.564 ms  3.652 ms
 2  cpe-174-109-192-001.nc.res.rr.com (174.109.192.1)  29.915 ms  29.981 ms  42.136 ms
 3  66.26.45.1 (66.26.45.1)  26.437 ms  27.291 ms  27.381 ms
 4  cpe-024-025-062-022.ec.res.rr.com (24.25.62.22)  23.905 ms  24.488 ms  27.446 ms
 5  be31.drhmncev01r.southeast.rr.com (24.93.64.184)  29.798 ms  29.698 ms  30.009 ms
 6  107.14.19.44 (107.14.19.44)  47.345 ms 107.14.19.42 (107.14.19.42)  45.536 ms 107.14.19.20 (107.14.19.20)  44.644 ms
 7  so-1-1-1.c1.buf00.tbone.rr.com (66.109.1.113)  42.801 ms  24.271 ms ae0.pr1.dca10.tbone.rr.com (107.14.17.200)  24.747 ms
 8  ix-17-0.tcore2.AEQ-Ashburn.as6453.net (216.6.87.149)  30.489 ms  31.686 ms  32.043 ms
 9  if-3-2.tcore2.NJY-Newark.as6453.net (216.6.87.10)  104.240 ms  100.441 ms  40.640 ms
10  * if-2-2.tcore1.NJY-Newark.as6453.net (66.198.70.1)  34.035 ms if-15-3.thar2.NJY-Newark.as6453.net (66.198.111.145)  36.609 ms
11  if-1-3.thar1.NJY-Newark.as6453.net (216.6.57.1)  43.140 ms if-14-3.thar1.NJY-Newark.as6453.net (66.198.70.14)  121.196 ms if-1-3.thar1.NJY-Newark.as6453.net (216.6.57.1)  46.084 ms
12  * * *
13  * bhs-g2-6k.qc.ca (198.27.73.207)  449.422 ms  359.668 ms
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

12/5/14 2:48 PM

From: customer

>>> mtr --report ovh1.starlis.com
Start: Fri Dec  5 14:53:17 2014
HOST: TYRIAL                      Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- router                     0.0%    10    1.1   1.1   1.1   1.1   0.0
  2.|-- cpe-174-109-192-001.nc.re 70.0%    10   42.9  43.9  38.3  50.5   6.1
  3.|-- 66.26.45.1                70.0%    10   25.9  23.3  18.1  25.9   4.4
  4.|-- cpe-024-025-062-022.ec.re 70.0%    10   12.2  12.5  11.8  13.6   0.7
  5.|-- be31.drhmncev01r.southeas 70.0%    10   16.5  13.4  11.3  16.5   2.6
  6.|-- 107.14.19.44              70.0%    10   25.5  26.7  25.5  27.6   1.0
  7.|-- ae0.pr1.dca10.tbone.rr.co 70.0%    10   25.5  53.2  25.5 107.8  47.3
  8.|-- ix-17-0.tcore2.AEQ-Ashbur 70.0%    10   29.3  27.9  26.2  29.3   1.2
  9.|-- if-3-2.tcore2.NJY-Newark. 70.0%    10  109.4  85.2  35.3 110.8  43.2
 10.|-- if-15-3.thar2.NJY-Newark. 80.0%    10   36.4  34.9  33.4  36.4   2.0
 11.|-- if-1-3.thar1.NJY-Newark.a 70.0%    10   39.4  39.7  34.8  44.9   5.0
 12.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
 13.|-- bhs-g2-6k.qc.ca           70.0%    10  345.7 351.6 340.3 368.7  15.1
 14.|-- bhs-s5-6k.qc.ca           80.0%    10  185.1 122.3  59.6 185.1  88.7
 15.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
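
A note on reading a report like the one above: loss shown at a single intermediate hop that disappears at later hops is commonly just that router rate-limiting its ICMP replies, whereas loss that continues through to the final hop is more likely to affect real traffic. The sketch below applies that rule of thumb to `mtr --report` text. The function names and the 5% threshold are assumptions for illustration, and the sample is an excerpt of the report above; it cannot say which network is at fault, only where persistent loss first appears from this vantage point.

```python
import re
from typing import List, NamedTuple, Optional

# Matches hop lines such as "  2.|-- host  70.0%    10 ...".
# The "%" is optional because mtr prints "100.0" without it.
HOP_RE = re.compile(r"^\s*(\d+)\.\|--\s+(\S+)\s+(\d+(?:\.\d+)?)%?\s+(\d+)\s")


class Hop(NamedTuple):
    index: int
    host: str
    loss_pct: float
    sent: int


def parse_mtr_report(text: str) -> List[Hop]:
    """Extract hop number, host, loss percentage and probes sent."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if m:
            hops.append(Hop(int(m.group(1)), m.group(2),
                            float(m.group(3)), int(m.group(4))))
    return hops


def first_persistent_loss(hops: List[Hop],
                          threshold: float = 5.0) -> Optional[Hop]:
    """Return the first hop from which every later hop also shows
    loss >= threshold (a hypothetical cutoff), or None if loss does
    not persist to the final hop."""
    start = None
    for hop in hops:
        if hop.loss_pct >= threshold:
            if start is None:
                start = hop
        else:
            start = None  # loss did not carry forward; reset
    return start


# Excerpt of the customer's report from this ticket.
SAMPLE = """\
HOST: TYRIAL                      Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- router                     0.0%    10    1.1   1.1   1.1   1.1   0.0
  2.|-- cpe-174-109-192-001.nc.re 70.0%    10   42.9  43.9  38.3  50.5   6.1
 13.|-- bhs-g2-6k.qc.ca           70.0%    10  345.7 351.6 340.3 368.7  15.1
 14.|-- bhs-s5-6k.qc.ca           80.0%    10  185.1 122.3  59.6 185.1  88.7
 15.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
"""

start = first_persistent_loss(parse_mtr_report(SAMPLE))
print(start.index if start else None)
```

On this excerpt the loss starts at hop 2 and continues to the last hop. By contrast, in support's first report the 53.3% loss at hop 3 is followed by 0.0% at hops 4 and 5, so the same check returns `None` there.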

12/5/14 2:53 PM

From: customer

I also received alerts from your monitoring system:

Our monitoring system has just detected a fault on your VPS vps31308.vps.ovh.ca.
The fault was noticed on 2014-12-05 14:41:00

Logs:
----------------------
PING vps31308.vps.ovh.ca (167.114.3.218) from 198.100.155.250 : 56(84) bytes of data.
From 198.100.155.250: Destination Host Unreachable
From 198.100.155.250: Destination Host Unreachable
From 198.100.155.250: Destination Host Unreachable

--- 167.114.3.218 ping statistics ---
10 packets transmitted, 0 packets received, +6 errors, 100% packet loss
---------------------



On 2014-12-05 14:41:00, we noticed a fault on your VPS.

However, on 2014-12-05 14:56:00 our monitoring system did not detect any faults on
your VPS vps31308.vps.ovh.ca

Logs:
----------------------
PING vps31308.vps.ovh.ca (167.114.3.218) from 198.100.155.250 : 56(84) bytes of data.
From 198.100.155.250: Host is alive
From 198.100.155.250: Host is alive
From 198.100.155.250: Host is alive

--- 167.114.3.218 ping statistics ---
10 packets transmitted, 10 packets received, 0 errors, 0% packet loss

12/5/14 3:05 PM

From: customer

Two days in a row, not to mention the random reboot I had last month... I am not pleased with this service.

Please correct things so it doesn't happen again. I can't have downtime like this.

12/5/14 3:06 PM

From: customer

If I don't get my site back up in the next 10 minutes, I'll be closing this account at the end of its term...

12/5/14 3:19 PM

From: customer

Something seems to be wrong with the hardware: it shows a very high load average with no CPU utilization.

http://screencloud.net/v/4jpd

12/5/14 3:53 PM

From: customer

This has to be the worst support I've ever seen.

12/5/14 4:28 PM

From: support

Hello,

The reason you haven't received a reply yet is that we are still waiting for the results of the four tests we requested in our last reply. Again, to be able to look into this for you, we need: 

-Results of a ping on your server.
-Results of a traceroute (preferably done with MTR) from your local host to your server.
-Results of a traceroute (preferably done with MTR) from your server to your local host.
-Results of your server downloading the following file: http://proof.ovh.ca/files/1Gio.dat (preferably using the wget -4 -O /dev/null http://proof.ovh.ca/files/1Gio.dat command)

Once we have this, we'll be able to look into resolving this issue for you.

Regards
Phil C
Customer Service Supervisor

12/6/14 12:39 PM

From: customer

... How can you even work for a company this bad? My previous dedicated host I left for bad support wasn't this bad.

I GAVE you those responses. And it's YOUR job to look into why your services are failing when it's reported. Your own systems marked it as a failure.

Yet again it went down 12 hours ago based on your own systems reporting.

This is the most bullshit support I've ever seen. Our contract is done. I will not be renewing.

12/6/14 12:42 PM

From: support

Hello again, 

I've just looked through all the messages you've sent us. I see two MTR results from local hosts to your server. There is no MTR from the server to a local host, there is no ping on your server from your location, and there is no wget test.

As we are an unmanaged service provider, it is not our job to run the tests showing that there is a problem. Once you provide the evidence, we will be able to resolve it for you.

Regards
Phil C
Customer Service Supervisor

12/6/14 1:45 PM

From: customer

It is your job when your own internal monitoring service shows that the server is unreachable, and that the system load is high even when I'm not utilizing it.

Hell, I even gave you the information to diagnose the likely cause: it's likely another customer on the system using a lot of resources.

It doesn't matter. You've lost my business with no chance of saving it.

I have unmanaged hosting with my dedicated servers, and they have no problems investigating reports on their network being down...

That's the key: in unmanaged hosting, YOU are responsible for network and power. This is a network problem, so it is in your area.

I even told you originally I couldn't even access the system! How do you expect me to give you a reverse MTR and wget if I can't?

You need to read my responses instead of cutting straight to your copy and paste book.

12/6/14 1:49 PM

From: support

Hello,

Other than the tests themselves, nothing I sent you was copy-pasted. I am more than willing to help you with the issue. I've asked you to run tests that will take a grand total of 5 minutes to run. We have our procedures. If you do not wish to follow them, there is nothing else I can do to assist you with the problem.

Cordially,
Phil C.
Customer Service Supervisor

12/6/14 1:51 PM

From: customer

Again, how can I run those tests when I cannot access the system due to the complaint at hand?
You're asking me to do something impossible.

12/6/14 1:53 PM

From: customer

And note that when I did finally get in, the packet loss made the system unusable, and it took a lot of effort just to run htop to see the system load. I eventually lost my connection to the system due to the extreme packet loss.

12/6/14 1:59 PM

From: support

Hello,

According to my tests, your VPS is up and running at the moment. I ran a ping test, and it is replying:

$ ping vps31308.vps.ovh.ca
PING vps31308.vps.ovh.ca (167.114.3.218): 56 data bytes
64 bytes from 167.114.3.218: icmp_seq=0 ttl=61 time=4.388 ms
64 bytes from 167.114.3.218: icmp_seq=1 ttl=61 time=3.973 ms
64 bytes from 167.114.3.218: icmp_seq=2 ttl=61 time=3.718 ms
64 bytes from 167.114.3.218: icmp_seq=3 ttl=61 time=4.697 ms

Furthermore, my system indicates that the VPS is online, and that HTTP, HTTPS, DNS, and SMTP are all working fine. I have noticed, however, that your SSH port appears to be closed, which you probably already detected using nmap. At that point, it seems you've configured something within your VPS (probably a firewall) to block TCP port 22. That would obviously block you from making SSH connections, unless you've moved SSH to a different port.

If you have blocked your SSH access, you would need to boot your VPS into Rescue Mode and correct the issue through there. Here is a guide on using Rescue Mode: http://docs.ovh.ca/en/guides-ovh-rescue.html

Regards,
Phil C.
Customer Service Supervisor


12/6/14 2:02 PM

From: customer

The issue occurred the day before yesterday, yesterday, and then again last night... You would have needed to run your tests while I was responding to the ticket...

Do you not have logs from your own internal monitoring? You should be able to see exactly when my server was offline.

And my SSH is fine when the server is up... It's simply on a different port. I wasn't complaining about that. But that status tool does not work, as it said things were up when the server was clearly down.

12/6/14 2:06 PM

From: customer

Also, your own monitoring tool runs a traceroute, and it reported to my email:

Our monitoring system has just detected a fault on your VPS vps31308.vps.ovh.ca.
The fault was noticed on 2014-12-06 00:22:00

Your VPS may function correctly,
though a few false positives may appear:
 - following a reboot of the VPS with a fsck which
   would be launched automatically, delaying the reboot
 - on setting up firewall rules that block the
   monitoring requests (198.100.155.250)

If you don't want your VPS to be monitored by our system
you can disable it:
 - via your manager
   https://www.ovh.com/manager/web/login.html

 - via the RESTful API offered by OVH, under ownership of
   slaMonitoring via the URL:
   https://ca.api.ovh.com/1.0/vps/vps31308.vps.ovh.ca

By disabling the monitoring of your VPS, this ticket
will not be processed by our teams and will be closed
automatically after 15 days of inactivity
in exchanges with the support.

Logs:
----------------------
PING vps31308.vps.ovh.ca (167.114.3.218) from 198.100.155.250 : 56(84) bytes of data.
From 198.100.155.250: Destination Host Unreachable
From 198.100.155.250: Destination Host Unreachable
From 198.100.155.250: Destination Host Unreachable

--- 167.114.3.218 ping statistics ---
10 packets transmitted, 0 packets received, +6 errors, 100% packet loss

12/6/14 2:08 PM