VROC Technical Notes

VROC: Virtual Racers' Online Connection

Technical Notes

Part 1 | Part 2

Table of Contents

Part 1: The Basics

Problems When Racing
What to Do
Changing Replay Length in GPL

Part 2: The Hairy Stuff

Collision Prediction in Internet Racing
Impossibly Short Lap Times and Other Evils
Setting MTU Size
Setting Windows 95 Memory Size

Collision Prediction in Internet Racing

Doc Wynne, who hosts races on VROC on his high speed connection, has written an excellent and very detailed discussion of the problems of predicting where cars will be after a collision in a high latency situation, such as on the Internet, and the reasons why a crash at the start of an online race wreaks such havoc.

Note that this article refers to GPL 1.0. An enhancement in GPL 1.2 significantly reduces the collision prediction problem by making GPL's cars "softer" when latency is high, thus reducing the risk of overly violent collisions. Nevertheless, Doc's comments are still relevant, although the problem is now much less severe.

Comments by Doc are in green; my responses are in black.

Ok, here's the scenario: I hosted via VROC, and had a full 20 cars, with only one player having disco problems, and I think it was on his end.

[Here Doc mentioned an impossibly short lap time set in qualifying. In the Impossibly Short Lap Times and Other Evils section below, I attempt to explain why this and other problems occur. - Ed.]

Everything was fine up to the start...and even then it was working well. I usually see (if you're underpowered in the CPU department) folks getting dumped when the driver on the host system hits their green button to place their car on the grid. This didn't happen, and we had a full 19 cars at the start, and I was on grid showing 28 fps. At the drop of the green flag, the frame rate actually increased to around 32~33, and by one second into the race I was back at a pegged 36. (Too cool at 800x600 with most everything checked and bias at 1/3 from the left, mirrors to cars and track w/textures on.)

Then the carnage began. One car in the back row ran into the car in the next row up, and pushed them into me...now at this point the cars aren't going terribly fast, but it flipped me end over end several times. Of course, I landed in front of more cars, and the pileup began. A full 30 seconds had passed before I was able to Shift-R my car to upright and try to chase the pack down.

Unknown to me, most of the middle of the pack had crashed in Curva Grande up ahead, and there were cars everywhere as I rounded the corner. I managed to brake and couldn't believe what I was seeing! I was driving in 1st, trying to weave my way through all the cars, but they kept warping into and out of view...you'd see a "hole" to try to use, and as soon as you went for it, a car would pop into view in front of you. I finally got through it all, but not without further mishaps...my first lap time due to all the wrecks was like 2:48! Needless to say, I was lapped by the 5th lap and didn't fare too well.

But I did save the whole race in replay and got to looking at it last night after it was all over. It looked like my suspicions about the collisions were confirmed...the warping was caused by the collisions! It seems to me that the host is doing nothing more than collision detection, and sends a packet out to all players' computers telling them that (in effect, probably greatly simplified here) "car #xx just hit car #yy in zzz location" and then everyone's own GPL does its own interpretation of the physics involved and models it accordingly. However, (and here's the catch) even with a latency of 0.120, the cars actually involved in the crash don't send their actual positional data back to the host and then on to the rest of the pack until almost 1/4 of a second later!

Assuming latency of .120, the crash occurs, and the host sees a collision in the car's positional data and says "there's a crash between #xx and #yy" -- the client systems don't get this message until at least .060 sec. after it's happened and they attempt to model it their way. In the meantime, the two cars actually involved have GPL model the crash from their perspective and as soon as the car's position is fixed in 3-D space by their respective copies of GPL, they send their new positional data back to the host - at least another .060 sec. assuming the good connect. The host receives this new info and passes it along to the other players (another .060 sec. and all this is assuming -incorrectly - that it takes zero time to process the info and resend it on both ends), who already have supposedly "modeled" the crash, but now here it is (via latency) arriving another .180 sec. later..and guess what? The cars aren't where your copy of GPL predicted they would be, so your copy of GPL just moves the wrecking cars to where they are as reported by the host from the data supplied by the wrecking cars.....instant warping! Of course, at some place like Monza where the cars are moving along pretty fast, the margin for error between where the car was predicted to be and where it actually reports itself as being is HUGE!
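To put a rough number on that margin for error, here is a minimal sketch. It assumes the worst case, in which a client keeps extrapolating a car forward at full speed while the real car has actually stopped dead in the crash. The speeds are hypothetical illustration values, not figures from Doc's race, and the function name is invented for this example.

```python
def max_position_error(speed_m_per_s, delay_s):
    """Upper bound on the gap between predicted and reported position.

    Assumes the client extrapolates the car at constant speed for the
    whole delay while the real car has stopped (a hypothetical worst case).
    """
    return speed_m_per_s * delay_s


DELAY = 0.180  # seconds, the three-trip delay Doc derives for 0.120 latency

# Hypothetical speeds: a slow start-line shunt and a fast straight.
for label, speed in [("start line, 20 m/s", 20.0), ("fast straight, 70 m/s", 70.0)]:
    error = max_position_error(speed, DELAY)
    print(f"{label}: car may be repositioned by up to {error:.1f} m")
```

Even at the slower speed, the correction can approach a car length, which is why the repositioning looks like a warp.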

Add to the delay of the above the time it takes each player's copy of GPL to process the incoming data and respond, as well as if the data is produced in time to make the next available packet outbound to the host (core.ini settings):

If default is every 2 ticks or every 1/16 sec. (.0625), and MOST drivers are using the core.ini they d/l somewhere that updates every 3 ticks (.09375 sec.), the average delay in getting the packet out with accurate car position info. would be around half that, so add another 0.046 sec. to the lag.
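The arithmetic behind those figures can be sketched as follows. The tick length of 1/32 sec. is simply what Doc's "2 ticks = 1/16 sec." implies; it is taken from his numbers rather than from any GPL documentation, and the function names are invented for this example.

```python
TICK = 1 / 32  # seconds per tick, as implied by "2 ticks = 1/16 sec." above


def send_interval(ticks):
    """Seconds between outbound packets when sending every `ticks` ticks."""
    return ticks * TICK


def average_wait(ticks):
    """Average time fresh data waits for the next outbound packet.

    Data becomes ready at a random point in the interval, so on average
    it waits half the interval.
    """
    return send_interval(ticks) / 2


print(send_interval(2))  # default setting
print(send_interval(3))  # the commonly downloaded core.ini setting
print(average_wait(3))   # the figure Doc rounds to 0.046 sec.
```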

Timeline:

Host detect coll. = 0.00
Host sends packet to players = 0.046 sec. (avg. core.ini delay)
Travel time (1/2 latency) = 0.060 sec. - total delay now 0.106 sec.
Player's computers figure physics, position cars = 0.000 sec. (not right but instantly will be used for assumption)
Player's computers send packet w/position back = 0.046 sec. (avg.) - total delay now 0.152 sec.
Travel time to host = 0.060 sec. - total delay now 0.212 sec.
Host resends to players = 0.046 sec. - total delay now 0.258 sec.
Travel time to players = 0.060 sec. - total delay now 0.318 sec.

Now, assuming (incorrectly) that figuring the physics of the car takes no time whatsoever, we still have a total delay in updating car positions of 0.318 sec. (average) after the actual collision was detected! Thereafter, the car physics would be updated a bit quicker, since the collision detection delay isn't calculated anymore and would equal the travel times plus packet delays: 0.212 sec. avg. (after all, no one knew there was a collision until their computers were told there was one!)

Best numbers would be if the packet is calculated and ready at the exact instant that the next packet is due to be sent: 0.060 times three trips = 0.180 sec. from initial collision detection, and updates on car position every 0.120 sec. (0.060 times two trips) thereafter.

Worst numbers would be the max. delay in the way the packets are sent out - the new packet with the post-collision car position was ready just as the previous packet was sent, so we wait the full time for the next packet: 0.456 sec. from collision before first update, and updates every 0.304 sec. (two full packet waits plus two trips) thereafter.
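The timeline can be reproduced with a short sketch. It assumes, as Doc does, that physics takes zero time, and that the first post-collision update costs three packet waits plus three one-way trips while later updates cost two of each. The worst-case packet wait of 0.092 sec. is twice Doc's rounded average; the function name is invented for this example.

```python
def collision_delays(one_way, packet_wait):
    """Return (first_update, later_updates) in seconds.

    first_update: host broadcast, client reply, host rebroadcast
                  (three packet waits and three one-way trips).
    later_updates: client to host to other clients (two of each).
    """
    first_update = 3 * (packet_wait + one_way)
    later_updates = 2 * (packet_wait + one_way)
    return round(first_update, 3), round(later_updates, 3)


ONE_WAY = 0.060  # half of the 0.120 latency assumed in the text

average = collision_delays(ONE_WAY, 0.046)  # average packet wait
best = collision_delays(ONE_WAY, 0.0)       # packet leaves immediately
worst = collision_delays(ONE_WAY, 0.092)    # packet just missed the last send

print("average: first update after", average[0], "sec., then every", average[1], "sec.")
print("best:    first update after", best[0], "sec.")
print("worst:   first update after", worst[0], "sec.")
```

Changing ONE_WAY to 0.125 gives a feel for the modem case with a latency of around .250.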

Considering this data can (at best in an online setting) be updated at only around .2 sec. intervals with almost perfect connects, is it any wonder any crash in a pack would result in a multi car pileup even with careful drivers? Think of how slow the updating would be if everyone were connected via modem with a latency of around .250! Accurate car position updates then would take place less than 2 times per second, turning a close pack of cars into a wheeled version of billiards. You would think the whole pack is warping or the host is hosed, but it's merely GPL making assumptions and updating with facts whenever it can get them. Throughout the whole mess of this massive crash, I was still getting 36 fps! On another note, I followed all of the cars post crash for the rest of the race to see if there was one who was warping badly, and couldn't find one...everyone had pretty well warp-free connects.

Another thing to think of is how GPL probably models the physics of the crash as to how hard you were hit...relative velocities and all that. If a car just appears (due to packet delay as shown above) to be inside the envelope of your car, how does GPL look at that? I'm thinking GPL may have a max. limit on relative velocity, but since the car appeared from virtually nowhere, GPL still figures it's moving damned fast, and figures collision physics accordingly. This explains why some of the really low speed collisions between cars result in violent flips and whatnot.

Now, if you take all of the above, knowing that the majority of the drivers in any given race do not have connects sufficient to give 0.111 latency, and put them in a pack, the end result is usually a massive crash with only the slightest of driver error. This starts a "pinball effect" of warping into other players' cars, with more collisions to figure (and the delays I've outlined above) and the next thing you know, it looks to every driver in the race like everyone else is warping badly! Actually, there are no fingers to point at anyone, other than one driver who might have made a small miscalculation in judgment.

In a nutshell, GPL can predict what the other drivers' cars are doing remarkably well given the info passed in the data packets in normal racing, but if a collision occurs, the predictions seem to fail rather badly against the physics as modeled by the cars involved in the crash. This disparity between the predicted position of the car and the actual position reported back is the major cause of these massive pileups all the drivers blame on warping. It isn't actually warping as much as it is missed predictions and the catastrophic results of repositioning the cars once their true post-crash positions are known. Sure looks like warping, though! :)

- Doc Wynne

This is a very good analysis of crash dynamics, and the problems presented by attempting to predict car position and velocity in the wake of a collision in a high latency environment. It makes it quite clear why a crash at the start or T1 wreaks havoc. Now I understand better why it is worse in online racing than real life!

Thanks, Doc!

- Alison

Impossibly Short Lap Times and Other Evils

Concerning the above race, Doc also remarked:

Practice was very smooth, I never saw any cars warping, etc., although the one who was having problems connecting did manage a 5.14 sec. lap at Monza. :) I can see how this happens as well...he leaves the pits, makes enough of a lap to cross the line and start the lap timer, then gets discoed, then rejoins it immediately (and now he's like car #20 in line) and immediately pulls out of the pits, and by being near the end of the line, he has to cross the start/finish line to get on the track...he now trips the timer, and the car # and driver name are correct for GPL, so it must be a good lap...even if it is way out of range.

Actually, most impossibly short lap times are caused by clock smashing, not disconnects. When you host, you see clock smashes as giant warps: a car will suddenly disappear, and then reappear a long distance away. Sometimes the car's sound warbles strangely too. On the client where a clock smash occurs, this is accompanied by a startling flash of the screen and a brief interruption in sound.

If a clock smash occurs near the s/f line, it can cause the server to score a very short lap. For example, if the client car crosses the line, and then its clock gets smashed forward a half second, then it will be relocated behind the s/f line (from the host's perspective). It will then keep driving and cross the s/f line again a moment later.

It's also possible to gain an extra lap, lose a lap, get disqualified for going backwards, or get black flagged for cutting the course, all due to clock smashes which relocate the client's car some large distance.

I was having severe clock smashes due to a defective modem driver, and I've had a number of ultrafast laps during races, when obviously there was no chance that they could have been caused by disconnect/reconnect. Fortunately, I finally found the problem, and have been clock smash free for two days now. :-)

[For a more detailed discussion of the clock synchronization issue, including comments by Papyrus engineer Randy Cassidy, see the Bizarre Stuff! section in my GPL Online FAQ. Note that GPL 1.2 contains a far superior method of clock synchronization, which dramatically reduces the frequency of occurrence of the problems described here. - ed.]

People who are having connection problems are also prone to have clock smashes, and vice versa. Clock smashes are large corrections in the client's internal clock, which are necessary if the client's clock gets out of sync with the host's clock. Unfortunately, if the timing data (constantly being exchanged between client and host) gets delayed, the client may assume that its clock is too far ahead, and smash it back. When data starts coming in a timely fashion again, the client will then smash its clock ahead again.

Several things can delay timing data: a saturated host CPU, which will thus be slow responding to the client's timing request; a saturated client CPU, which will thus be slow picking up the host's response to its timing data request; a spike in latency, which will delay the timing packets; clogged serial ports at either end, which will do the same thing.

So, people with slow computers who haven't cut graphics detail enough to get 36 fps, and people with bad internet connections (high and highly variable latency) are prone to both disconnects (which happen when too much data gets lost or delayed) and clock smashes (which happen when timing data gets lost or delayed).

[Some people can get away with running lower than 36 fps in online racing in GPL, while others can't. I have no idea why this is. The rule of thumb is, if you see more than occasional clock smashes, or you have problems staying connected to GPL races, get your frame rate up to 36 fps. - ed.]

Incidentally, the situation is aggravated when latency is high, because GPL times out on overdue data, and re-sends it, thus requiring more bandwidth. That's why sometimes one person with a bad connection or slow computer can cause a lot of people to disconnect. This will happen if the host's bandwidth is nearly saturated due to a lot of users. The re-transmission required by the unfortunate client saturates the bandwidth, causing delays, and people start dropping like flies.

I suspect Papy could write a patch to make the host compare the lap time against a set "minimum allowed" time for that track (like maybe 1:20 at Monza) and disallow anything under that time...that should keep host's lap records from getting hosed by this bug.

I've suggested that they do that. In the meantime, anyone who's had bogus lap records saved by GPL can simply delete the records.ini file in the appropriate track folder in GPL.
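The check Doc proposes could look something like this minimal sketch. It is not GPL's actual code; the table, the track key, and the function name are hypothetical, and the 1:20 Monza threshold is just Doc's example figure.

```python
# Hypothetical per-track minimum lap times in seconds.
# 80.0 is Doc's suggested "maybe 1:20 at Monza".
MIN_LAP_SECONDS = {"monza": 80.0}


def is_plausible_lap(track, lap_seconds):
    """Return False for laps faster than the track's minimum allowed time.

    Tracks without a configured minimum are accepted unchanged.
    """
    minimum = MIN_LAP_SECONDS.get(track)
    return minimum is None or lap_seconds >= minimum


print(is_plausible_lap("monza", 5.14))  # the bogus 5.14 sec. lap from practice
print(is_plausible_lap("monza", 90.0))  # a hypothetical ordinary lap
```

A host applying a filter like this would skip saving any lap that fails the check, so a clock smash or rejoin near the s/f line could no longer pollute the lap records.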

- Alison

Setting MTU Size

Note: This information applies mostly to people using analog modems. If you have a high speed connection such as cable modem or xDSL, you may not find that Doc's recommendations are ideal for you. With my cable modem, I've left my MTU size, RWIN, and TTL at their default values, and have excellent results.

More information from Doc Wynne.

Most diehard Quake/Quake2 online gamers are already pretty experienced with tweaking their systems and searching around for ISPs that provide the nirvana of online gamers - a low ping!

The following URL has many tips and tweaks on how to set up Windows DUN for optimal online gaming. It is written for Quake online gamers, but I think the same tweaks would help online racers, especially those who are running Windows with DUN's inefficient default settings.

http://www.voodooextreme.com/3Fingers/NetTweak.htm

This advice applies to everyone on the net, be it gamers, web page browsers, warez heads, or chatting. If everyone did these things by default, there would be a bit less "wasted" bandwidth, so there would be more open bandwidth for all.

This site had excellent advice, but with one exception.

The Max MTU size should always be set for the old "standard" of 576. The way they are instructing you to do it will work as long as all the players involved are using the same ISP, but it gets a bit trickier when you're trying to play nationally or internationally.

In this case (and ours), the Max MTU size should be the smallest of the different sizes supported by any and all potential router "hops" along the way. And since those hops are dynamic in nature, 576 is the lowest common denominator, so it works no matter where your packets end up going through. It's easier to send many smaller packets than it is to send fewer larger ones and have to have them chopped into smaller ones out there somewhere and have them reassembled somewhere else.
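The "chopping" can be illustrated with a small sketch of IPv4 fragmentation arithmetic. It assumes a 20-byte IP header without options and uses the rule that every fragment except the last must carry a multiple of 8 data bytes. The 1500-byte packet size is a hypothetical example of a larger MTU, not a value from Doc's notes, and the function name is invented for this example.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options


def fragment_count(packet_size, hop_mtu):
    """Number of IPv4 fragments needed to pass a packet through one hop."""
    if packet_size <= hop_mtu:
        return 1  # fits as-is, no fragmentation
    payload = packet_size - IPV4_HEADER
    # Data per fragment, rounded down to a multiple of 8 bytes.
    per_fragment = (hop_mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


print(fragment_count(1500, 576))  # large packet meeting a 576-byte hop
print(fragment_count(576, 576))   # packet already sized for that hop
```

Each extra fragment carries its own header and must be reassembled at the far end, and losing any one fragment means the whole original packet is lost.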

Here hardwired on the T-1, setting it to 576 (instead of the Win default of God-knows-what), results in almost a 20% speed increase in file transfers and web page loading.

My recommended settings for online gameplay are:

MTU (Maximum Transmission Unit) = 576
RWIN (Receive Window size) = 2144 (You don't need a large buffer with dynamically changing data, you want to get it to GPL a.s.a.p., not just buffer it!)
TTL (Time To Live) = 32 (Most places say set it to 64...this will get you connected to more places when the net is busy, but if it's taking 64 hops, your ping time is shot anyway.)

- Doc Wynne

Delphi also has a page with information on how to set the above parameters, either by use of a freeware utility or by editing the registry.

Setting Windows 95 Memory Size

Doc Wynne again:

Here are different settings you can change to save memory.

First, in Netscape:

Click Edit ---> Preferences ---> Advanced (open it by clicking the +), then Cache. Change the memory cache size to 256k (default is 1024k). This saves 3/4 of a meg. of memory when Netscape is running.

Next, take Notepad and look in your Windows directory and open your System.ini file. If you scroll down in it you'll come to a section that looks like:

[vcache]

It will probably contain nothing and below it will be another section marked by the square brackets.

Add the following lines just below [vcache]:

MaxFileCache=4096

MinFileCache=4096

(Note that this is for machines with 64 meg. RAM! If you have 32 meg., make the sizes either 512 and 512 or 1024 and 1024. Machines with more memory can use correspondingly higher values for vcache. On mine with 192 meg., I keep them set at 16384.)

This tells Windows to stop dynamically allocating free memory for file caching...Windows by default will take up to 33% of all free memory it finds for file caching, and while it does "shrink" the cache when other programs need the memory, if there is data in the cache, it has to write the existing cache data to the hard drive to do this (or delete it if it was just data that was read off the drive). All of this will make Windows pause while it happens, and can cause those annoying little "hiccups" in GPL. The 4096 number sets you up a 4 meg. file cache, which should be plenty. You might notice that BIG applications don't load quite as fast, but they will get more memory without having to fight Windows for it.

Next, go to your Control Panel ---> System, and click the Performance tab, then the File System button, then the CD-ROM tab. Change the Supplemental Cache size slider from 1238 to 214 kilobytes. This saves another meg. (1024k) of memory that is dedicated just to reading ahead with the CD-ROM. (As long as you don't run your software off the CD, you'll probably never notice the difference.)

Just the changes above will save you a bit over 2 megs of RAM and will help keep Windows from doing any disk accesses while you're online racing.

Another one is to set up a permanent swap file of at least 100 meg. so Windows won't resize it on the fly. Believe it or not, Windows will actually try to keep about 10~15% of its total memory free, even if it has to write data for the only software in use on your computer to the swap file! (I guess it thinks you may want to start another program, so it's partially ready.) This is above and beyond what memory it can grab for the file cache, so it makes sense to set both of those memory hogs to a fixed size for much less disk access while you're involved in a hot battle for the lead - or last place!

Cut back on the amount of memory you let GPL allocate for the replay file. 20000 should be plenty for any Pro short race with up to 12 or so cars on a system w/64 meg. of RAM.

[See Changing Replay Length in GPL, in Part 1, for information about how to reduce the memory allocated to replays by GPL. - ed.]

One other smaller thing is to go into Fonts in the Windows Control Panel and remove any font that has duplicates. By that I mean that most systems have Arial, Arial Italic, Arial Bold, and Arial Bold Italic installed. If you are in your word processor or whatnot and click the buttons to italicize or bold the text, you are merely applying those changes to the standard font! You have to specifically choose the italic font or bold font if you actually want to use it! (Try this...choose Arial in Wordpad and type a line, then highlight it, and hit the italics button. It works...now with the same line highlighted, choose Arial Italic and then hit the italic button, and you'll notice the letters lean over even more, because the program is actually italicizing the italic font now.) If you remove all those extra fonts you don't use, Windows itself uses less memory and boots up faster, since it doesn't have to load all those font names into the font cache. This doesn't save a lot of memory unless you have tons of fonts, but every little bit helps. I've seen heavily configured Win 95 systems take up almost 28 meg. just getting to the desktop.

This may help those with systems that are marginal on RAM to be able to play without major warps caused by disk access.

- Doc Wynne

Doc also had this to say about memory allocation in Pentium Classics and Pentium MMX's:

Any Pent. Classic or Pent. MMX board with the FX, VX, or HX Intel chipsets will only cache the first 64 megs. Socket 7 boards with the LX chipset don't have this limitation.

Note that Pentium Classics/MMXs with the older chipsets and more than 64 MB of memory may run GPL terribly slowly if GPL loads into the area of memory that is not being cached.
