Regarding the latency issues, I wonder to what extent backbone design is contributing to this.
In particular, one issue I have noticed (for years... really a decade) involves connectivity to Rogers' transit providers and peers in the U.S., in Chicago, Ashburn, and NYC. Frankly, as far as I can tell, it is more or less random (or based on some algorithm that is not tuned around latency) whether traffic goes out through NYC, Chicago, or Ashburn. If something goes out to a transit provider in Ashburn, that's probably 5-10ms more than going out in NYC.
Similarly, for our friends in New Brunswick, if your traffic goes to Toronto, then Chicago, then back out east, that's potentially adding latency. Even for someone in Ottawa, I would expect an Ottawa -> Montreal -> NYC fiber path to cut a decent amount of latency compared to Ottawa -> Toronto -> Chicago or Ottawa -> Toronto -> Ashburn.
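Back-of-the-envelope, the geography alone supports this. A rough sketch (the distances are my own approximate great-circle figures, not surveyed fiber routes, which always run longer; the ~200 km/ms figure assumes light travels at about 2/3 c in fiber):

```python
# Rough propagation-delay comparison of two candidate paths out of Ottawa.
# Distances below are approximate great-circle km (my assumptions);
# real fiber paths are longer, so treat these as lower bounds.
KM_PER_MS = 200.0  # one-way speed in fiber, roughly 2/3 the speed of light

def rtt_ms(*legs_km):
    """Round-trip propagation delay (ms) for a path made of the given legs."""
    return 2 * sum(legs_km) / KM_PER_MS

via_montreal = rtt_ms(170, 600)   # Ottawa -> Montreal -> NYC, ~7.7 ms RTT
via_chicago = rtt_ms(450, 800)    # Ottawa -> Toronto -> Chicago, ~12.5 ms RTT
```

Even with generous error bars on the distances, the Montreal path wins by a few milliseconds of pure propagation delay, before any queuing or extra router hops.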
I've seen lots of situations, too, where traffic to, say, a Cogent customer in Toronto goes through Ashburn and then back up on Cogent's network. Rogers has a circuit to Cogent in Toronto (see the traceroute further below), but presumably for ill-advised load-balancing reasons, it isn't consistently used for outbound traffic, despite the latency hit.
Look at this:
1 <1 ms <1 ms <1 ms 192.168.22.1
2 11 ms 9 ms 10 ms 18.104.22.168
3 12 ms 11 ms 14 ms 22.214.171.124
4 22 ms 18 ms 22 ms 126.96.36.199
5 24 ms 33 ms 24 ms 188.8.131.52
6 27 ms 27 ms 28 ms be812.ccr41.iad02.atlas.cogentco.com [184.108.40.206]
7 33 ms 35 ms 28 ms be2171.ccr41.dca01.atlas.cogentco.com [220.127.116.11]
8 29 ms 30 ms 28 ms be2891.ccr21.cle04.atlas.cogentco.com [18.104.22.168]
9 51 ms 31 ms 33 ms be2993.ccr21.yyz02.atlas.cogentco.com [22.214.171.124]
10 37 ms 27 ms 27 ms te0-0-2-0.rcr13.b011027-3.yyz02.atlas.cogentco.com [126.96.36.199]
11 30 ms 27 ms 30 ms university-of-toronto.demarc.cogentco.com [188.8.131.52]
12 27 ms 36 ms 28 ms mcl-gpb.gw.utoronto.ca [184.108.40.206]
13 33 ms 29 ms 37 ms mail.utoronto.ca [220.127.116.11]
Routing that traffic out of somewhere other than Ashburn could cut roughly 15ms off this.
Same thing, look at a traceroute to my VPS:
2 13 ms 15 ms 11 ms 18.104.22.168
3 12 ms 165 ms 28 ms 22.214.171.124
4 14 ms 15 ms 26 ms 126.96.36.199
5 33 ms 33 ms 26 ms 188.8.131.52
6 145 ms 56 ms 49 ms 10gigabitethernet2-2.core1.ash1.he.net [184.108.40.206]
7 38 ms 26 ms 26 ms 100ge12-1.core1.tor1.he.net [220.127.116.11]
8 35 ms 33 ms 31 ms fibernetics-corporation.10gigabitethernet3-1.core1.tor1.he.net [18.104.22.168]
9 * * * Request timed out.
10 33 ms 33 ms 34 ms host-74-205-214-59.static.295.ca [22.214.171.124]
[Last hop removed.]
Look at the reverse traceroute:
3 vl3831.te-0-1-0-3.tor0151-asr9a.ne.fibernetics.ca (126.96.36.199) 5.238 ms 5.510 ms 5.659 ms
4 * * *
5 te0-0-0-1.rcr12.b011027-3.yyz02.atlas.cogentco.com (188.8.131.52) 5.074 ms 5.217 ms 5.240 ms
6 te0-7-0-30.ccr22.yyz02.atlas.cogentco.com (184.108.40.206) 5.086 ms 5.193 ms 5.181 ms
7 220.127.116.11 (18.104.22.168) 4.634 ms 4.648 ms 4.636 ms
8 van58-9-230-5.dynamic.rogerstelecom.net (22.214.171.124) 24.267 ms 22.572 ms 24.287 ms
9 126.96.36.199 (188.8.131.52) 24.278 ms 184.108.40.206 (220.127.116.11) 8.419 ms 18.104.22.168 (22.214.171.124) 24.278 ms
[Note: last hop showing my IP removed.]
See the 20ms jump between hop 7 and 8? That's because the Rogers network is sending the response packets through Ashburn. That has nothing to do with the cable modems, gateways versus modems, Hitron vs Cisco, CMTS, etc.
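That kind of jump is easy to spot programmatically, too. A minimal sketch (my own, not anything Rogers uses; the line format and the 15ms threshold are just assumptions based on the output pasted above) that flags hop-to-hop RTT jumps in traceroute output:

```python
import re
import statistics

def hop_rtts(line):
    """Extract the RTT samples (ms) from one traceroute/tracert line."""
    return [float(m) for m in re.findall(r"(\d+(?:\.\d+)?) ms", line)]

def latency_jumps(lines, threshold_ms=15.0):
    """Return (position, delta_ms) pairs where the median RTT rises by at
    least threshold_ms relative to the previous responding hop."""
    jumps = []
    prev = None
    for pos, line in enumerate(lines, start=1):
        samples = hop_rtts(line)
        if not samples:
            continue  # '* * *' hop: no data, skip it
        med = statistics.median(samples)
        if prev is not None and med - prev >= threshold_ms:
            jumps.append((pos, med - prev))
        prev = med
    return jumps
```

Fed hops 7 and 8 of the reverse traceroute above, this flags a single jump of about 19.6ms, which is the Ashburn detour showing up in the data.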
Frankly, I think that this network's routing policy was designed to balance (large) traffic volumes across interfaces and circuits in different cities, not to optimize latency on traffic to north-eastern North America.
@RogersDave, any thoughts? (oh, and while I have your attention: can you please put some PTR records on your router interfaces' IPs?)
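For anyone unfamiliar with what I'm asking for: a PTR record is just an entry in the in-addr.arpa tree mapping an IP back to a hostname, which is what makes traceroute hops readable. A quick sketch of how the lookup works (the example IP is from the TEST-NET-1 documentation range, not a real Rogers address):

```python
import socket

def reverse_pointer(ip):
    """Build the in-addr.arpa name a resolver queries for an IPv4 PTR record."""
    return ".".join(reversed(ip.split("."))) + ".in-addr.arpa"

def ptr_name(ip):
    """Return the PTR hostname for an IP, or None when no record is set --
    which is what you get today for most Rogers router interfaces."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None
```

With records in place, every hop in the traceroutes above would name itself the way the Cogent and HE hops already do.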
I got the upgrade 24 hours ago, and since then I've had to reset my box and the Chromecast multiple times. It's not working properly.
A quick note regarding the observed latency in that scenario. When a modem is running in bridge mode, the modem includes a packet bridge function and a host (webserver / SNMP server) used for some management functions. The interface responding to pings at 192.168.100.1 is the management host and is not in the normal traffic path. It’s a process that runs on the cable modem, entirely in software and with a fairly low priority. I am not surprised to see high variation in ping times towards that address. I am also not surprised to see large differences between modem vendors as this is entirely implementation specific.
That being said, as I mentioned previously, we are actively working on reducing latency and, more importantly, jitter. This is a medium-term project as it involves reviewing the performance of multiple components along the path, including the modems, CMTSs and traffic peering. It is high on my project list as I understand it is of great importance to a lot of you. The initial target is to reduce jitter towards the gateway IP address on the CMTS (.1 in the WAN subnet), and I've been working towards this goal for the last couple of weeks.
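To be concrete about what "jitter" means here: it's the variation between RTT samples rather than the RTT itself. A rough sketch of how one might summarize it (my own simplification for illustration; the mean-of-deltas measure is a simplified take on RFC 3550-style interarrival jitter, not necessarily how we measure it internally):

```python
import statistics

def jitter_stats(rtts_ms):
    """Summarize RTT samples: median latency plus two common jitter measures
    (population stdev, and mean absolute delta between consecutive samples)."""
    deltas = [abs(b - a) for a, b in zip(rtts_ms, rtts_ms[1:])]
    return {
        "median_ms": statistics.median(rtts_ms),
        "stdev_ms": statistics.pstdev(rtts_ms),
        "mean_delta_ms": statistics.fmean(deltas) if deltas else 0.0,
    }
```

Two links can have the same median ping while one of them has far worse jitter, which is why it gets tracked separately.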
@VivienM, I will review the traceroute you provided but generally when it comes to transit, we have little control. From a cost perspective, it is better for us to send traffic to the closest point (transport cost) so if traffic is taking a longer path, it might be based on what routes are being advertised to us at each peering point.
Regarding the PTR records, I feel the same but this is outside my control. I can ask but I don’t have high hopes.
Got the upgrade push yesterday. All good. Did a factory reset to get IPv6 happening. Success. Had a few hiccups reconnecting IoT devices to the Wi-Fi network, but that resolved with a bit of fiddling. Most importantly, Google Cast is back in action.
No negatives to report on the firmware upgrade after a day.
This firmware update fixed my Chromecast connectivity issues. Thank you.