Closed Bug 257454 Opened 20 years ago Closed 11 years ago

setTimeout and setInterval are slower than on IE or Opera

Categories

(Core :: DOM: Core & HTML, defect)

Hardware: x86
OS: All
Type: defect
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED WORKSFORME

People

(Reporter: mpk, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf, regression, testcase)

Attachments

(2 files)

I opened this bug as a "spin-off" from bug #212831 at the suggestion of Markus
Hübner. IE and Opera are able to perform setTimeout and setInterval every 10ms,
whereas Mozilla seems to take considerably longer (16.3ms on my machine). I'll
attach a minimized testcase from the aforementioned bug shortly.
This testcase lets you specify the number of runs and an interval/timeout time.

You can then start the test using either setTimeout or setInterval. After
completion, it adds a short summary to a result list.
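
For reference, a minimal sketch of such a testcase (an illustration of the
setTimeout variant, not the actual attachment):

  // Chain `runs` timeouts of `delay` ms and report the total elapsed time;
  // a perfectly accurate timer would report runs * delay / 1000 seconds.
  function runTimeoutTest(runs, delay, done) {
    var start = new Date().getTime();
    var count = 0;
    function tick() {
      if (++count >= runs) {
        done((new Date().getTime() - start) / 1000); // seconds
      } else {
        setTimeout(tick, delay);
      }
    }
    setTimeout(tick, delay);
  }

  // 1000 runs at 10ms should ideally report ~10s; this bug is that Gecko
  // reports ~16.3s instead.
  runTimeoutTest(1000, 10, function (s) {
    console.log(s + "s (1000 runs, 10ms delay, setTimeout)");
  });
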
IE and Opera display some strange behaviour, whereby any requested delay from
11-20ms results in an actual delay of 20ms, 21-30ms results in 30ms, and so on.
Mozilla scales (more or less) linearly.
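
A toy model of that stepping (an assumption about IE and Opera's behaviour,
not taken from either browser's source):

  // Round a requested delay up to the next multiple of 10ms:
  // 11-20 -> 20, 21-30 -> 30, and so on. Mozilla instead scales linearly.
  function quantizedDelay(requested) {
    return Math.ceil(requested / 10) * 10;
  }
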
Blocks: 212831
Marco, what sort of hardware are you using, if I may ask?
It's basically an ASUS A7V266-E motherboard, equipped with an AMD Athlon XP1800+
CPU and 1GB of RAM. I suspect this to be a regression (caused quite some time
ago). I tested some old releases with the following results:

Mozilla 1.0 & 1.1:
   both setTimeout and setInterval perform normally (10ms)
Mozilla 1.2.x, Phoenix 0.1 - 0.4:
   setTimeout is already slow while setInterval is still normal
Mozilla 1.3, Phoenix 0.5 (and later):
   both setTimeout and setInterval perform slowly (16ms)

I'll try to test it on a few other platforms (hardware and OS).
Keywords: regression
The results were reproducible on the same machine under FreeBSD 5.2.1 as well as
under United Linux 1.0 on an older machine with an ASUS TX97-XE motherboard,
equipped with an AMD K6 (200MHz) CPU and 128MB of RAM. Changing OS to All.
OS: Windows 2000 → All
Blocks: 203448
It doesn't seem to be an AMD-specific problem. Confirming for a DELL Inspiron
8100 Notebook (running Win2K). It features an Intel Pentium 3 Mobile CPU (1GHz)
and 512MB of RAM.
Right component, please.

jst, any thoughts?

Anyone: more measurements, maybe some kind of instrumentation or profiling,
would be helpful.  If someone can identify a window during which things
regressed, a year or two ago, that would be even better.

/be
Assignee: general → general
Component: JavaScript Engine → DOM: Level 0
QA Contact: pschwartau → peterv
I plan to both profile this and check for the regression window when I get back,
if I can see the problem then (the "if" is because a 1.8a3 build on this Win98
machine I have right now runs 10000 setTimeout/setInterval calls in 10.4
seconds; I'm not sure what the hardware here is, though).
Sorry about the wrong component.
AFAICT setTimeout regressed between Moz 1.2a (20020910) and Moz 1.2b (20021016).
With the oldest trunk nightly I could find on archive.mozilla.org (20021204)
setInterval had regressed as well.
Oh, I finally figured out what the hardware here is (on the machine where I do
not see the bug):  it's an Athlon 800MHz with 384MB ram, running Win98.  And
today I can reproduce the bug in spades (takes 4-6 times as long to run as it
should).  Which is odd, since 2 days ago with this exact build on this exact
machine there was no problem...
The setInterval behavior changed with the patch for bug 178810, which made the
clock-skew-corrected interval be at least 10ms.  As a result, if there is 6ms of
clock skew we end up with 16ms per interval.

If I reduce DOM_MIN_TIMEOUT_VALUE to 5 (or just remove the code introduced in
GlobalWindowImpl::RunTimeout for the interval case), the interval testcase goes
back to being 10ms on average.
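
A toy model of that interaction (in JS for illustration only; the real code is
C++ in GlobalWindowImpl::RunTimeout):

  // The skew-corrected delay is clamped to DOM_MIN_TIMEOUT_VALUE, so a
  // firing that is persistently ~6ms late can never catch up.
  var DOM_MIN_TIMEOUT_VALUE = 10; // ms

  function nextIntervalDelay(nominal, skewMs) {
    var corrected = nominal - skewMs;                  // compensate for lateness
    return Math.max(corrected, DOM_MIN_TIMEOUT_VALUE); // ...but never below the floor
  }

  // 10ms nominal with 6ms skew: the corrected 4ms is clamped back up to 10ms,
  // the machinery adds its ~6ms again, and the effective period stays 16ms.
  nextIntervalDelay(10, 6); // 10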

As for timeouts, I tried doing a jprof of the testcase (1000 timeouts at 10ms).
 About 4300 of the 126237 total hits are spent doing something "useful".  That's
not enough to account for 6ms per timeout clock skew...

For those 4300 hits, the breakdown is:
   463 under JS_GC
    23 under TimerThread::UpdateFilter
    58 under nsTimeoutImpl::Release
    28 under nsTimerImpl::Cancel
  3341 under nsJSContext::CallEventHandler

This last one has 2769 under JS_CallFunctionValue and the rest under pushing and
popping JS contexts (14 and 20 hits), assigning to nsCOMPtr (114 hits), and
calling the security manager (375 hits).
I can't quite find the issue with setTimeout... Moving the init of "now" to the
top of SetTimeoutOrInterval doesn't affect things much.  There's a lot of
silliness involved in going back and forth between delays and times (in
SetTimeoutOrInterval and then in timer initialization).

As an experiment, I tried keeping mTimeoutAdjustment at 0 in TimerThread.cpp. 
Then the times went up from 16.5 seconds to 20 seconds.  So mTimeoutAdjustment
is helping some, but not enough...  There's sometimes a millisecond lost between
PostTimerEvent() and Fire().  That's not really enough to account for 10ms of
overhead.

Naturally, increasing the fudge factor on smoothslack from 1.5 to, say, 3 helps
some (average of 3ms, in fact).  Brendan, do we need to tweak this algorithm in
general?
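
For reference, a toy sketch of such a filter (the smoothing scheme and names
below are assumptions for illustration, not the actual TimerThread algorithm):

  // Track the smoothed oversleep ("slack") and shorten future waits by a
  // fudge factor times that estimate.
  var smoothedSlack = 0;
  var ALPHA = 0.25;  // smoothing weight (assumed)
  var FUDGE = 1.5;   // the fudge factor discussed above

  function adjustedWait(requestedMs, lastOversleepMs) {
    smoothedSlack = (1 - ALPHA) * smoothedSlack + ALPHA * lastOversleepMs;
    return Math.max(0, requestedMs - FUDGE * smoothedSlack);
  }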

Another note:  When I was looking at timer firing times, it looked like there
were a lot of 20ms delays and some 1ms ones (!).  So something is definitely up
somewhere....
Tweaking the XPCOM timer digital filter to compensate for something it doesn't
directly measure seems wrong to me.  We need to measure the source of "extra"
delay.  Please correct me if I'm wrong, and the XPCOM code can measure all of
the delay seen by DOM timeouts/intervals.

/be
I don't know yet; I've not figured out the exact algorithm of the xpcom code.

The right way to debug this, in my opinion, is to instrument the code all over
to measure exactly when things happen (original setTimeout call, call into
XPCOM, timer thread called, timer thread wakes up, PLEvent posted, PLEvent
fired, DOM code triggers timeout handler).
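
The two endpoints that are visible from script can already be stamped without
any ifdefs; a sketch (hypothetical helper, for illustration):

  // Record the time at the setTimeout() call and again when the handler
  // actually runs, to measure the end-to-end scheduling latency.
  function timedTimeout(fn, delay) {
    var scheduled = new Date().getTime();
    setTimeout(function () {
      var latency = new Date().getTime() - scheduled;
      console.log("requested " + delay + "ms, fired after " + latency + "ms");
      fn();
    }, delay);
  }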

I may be able to get to this sometime, but I need to deal with some pressing
style system issues first.  If someone has time to look into this, that would be
much appreciated (the instrumentation can even land on the trunk in
DEBUG_TIMEOUT ifdefs or something).
Just wondering (now that I've checked in lcook@sybase.com's patch) whether the
fix for bug 252324 helps this bug?

/be
Not that I can tell.  A build with that patch applied shows the same times as
one without.
I messed with this today.  In the end, I tracked this down, on my Linux build,
to the fact that TimerThread does a PR_WaitCondVar which calls
pthread_cond_timedwait with the timeout (see pt_TimedWait()).

Now this last function is only precise to 1/HZ on Linux (which is 10ms on
Linux/Intel, apparently).  I would not be at all surprised if it's got similar
precision on Windows....  If someone has Linux/Alpha, where 1/HZ is 1ms, I'd be
interested to hear of their results on this testcase.

In any case, pthread_cond_timedwait seems to return after somewhere on the order
of 19ms... which, with some adjustment done by the mTimeoutAdjustment code in
TimerThread gives us the results we see.

I bet IE and Opera have the 10ms steps precisely because of the 10ms accuracy
provided by pthreads and the like...

I can make things a bit better by changing pt_TimedWait (in
nsprpub/pr/src/pthreads/ptsynch.c) to floor the tmo.tv_nsec value to the nearest
10 million (so floor to 10ms).  This ensures that we won't end up waiting for
two 10ms ticks if we want a timeout that would end up firing after the nearest
tick but before the next one.  That changes the average timeout from 16ms to
13ms... (this is while keeping the 1.5 mTimeoutAdjustment fudge factor in
TimerThread.cpp).
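
A toy model of that flooring (the real change is in C, in pt_TimedWait; this
just shows the arithmetic):

  // Snap the sub-second part of the deadline down to a tick boundary so a
  // wait ending just past a tick isn't rounded up to a whole extra tick.
  var TICK_NS = 10 * 1000 * 1000; // 10ms in nanoseconds (HZ = 100)

  function floorToTick(tvNsec) {
    return Math.floor(tvNsec / TICK_NS) * TICK_NS;
  }

  floorToTick(19 * 1000 * 1000); // 10000000 ns: wait one tick, not two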

The upshot of all this, though, is that we need to somehow rethink the whole
approach in TimerThread, since the NSPR api only guarantees 10ms precision and
we need much higher precision than that (PRIntervalTime precision, ideally,
which seems to be 1ms on Linux).  Can we use nanosleep() and signals or
something like that to get better time resolution?
Intel, PowerPC, and Sparc all have higher resolution hardware timers (running
off bus clock or processor clock).  Can we use those on platforms where
available rather than trying to string something together?
That would be ideal, yes.
My jprof patch, polished and committed by bryner, uses /dev/rtc on Linux where
it's present (pretty much everywhere these days), so it might be a place to look. 

Polling on a pipe might be what we want here, though, because it doesn't require
signals.
Just to clarify the current timer arch, so people have an idea how it works now:

1)  When there are no timers around, we sleep for 100ms at a time.
2)  When we have timers, we sort them by when they fire and sleep till the
    nominal time the first one should fire (this is what's not really working)
3)  When a timer is added or a timer's delay is changed, we wake up, resort
    timers as needed, and go back to sleep per item #2.

Whatever solution we come up with should preserve property 3 without making us
busy-wait (because we want to deal with timers that suddenly get added and have
short timeouts).  I agree that polling on a pipe would work for property 3, if
we can get it to have property 2 (being able to wake up when it's time for a
timer to fire).
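
A minimal sketch of properties 2 and 3 (in JS purely for illustration; the
real code is C++ in TimerThread.cpp, and this sketch leans on setTimeout
itself for the underlying wakeup):

  // Keep timers sorted by deadline, arm a single wakeup for the earliest
  // one (property 2), and re-arm whenever a timer is added (property 3).
  function TimerQueue() {
    this.timers = [];   // sorted by deadline, earliest first
    this.handle = null; // the single underlying wakeup
  }
  TimerQueue.prototype.add = function (deadline, callback) {
    this.timers.push({ deadline: deadline, callback: callback });
    this.timers.sort(function (a, b) { return a.deadline - b.deadline; });
    this.rearm(); // wake up and resort on every add
  };
  TimerQueue.prototype.rearm = function () {
    if (this.handle !== null) clearTimeout(this.handle);
    this.handle = null;
    if (this.timers.length === 0) return; // nothing to wait for
    var wait = Math.max(0, this.timers[0].deadline - Date.now());
    var self = this;
    this.handle = setTimeout(function () { self.fire(); }, wait);
  };
  TimerQueue.prototype.fire = function () {
    var now = Date.now();
    while (this.timers.length && this.timers[0].deadline <= now) {
      this.timers.shift().callback(); // run everything that has come due
    }
    this.handle = null;
    this.rearm();
  };
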
Re: comment 17 and comment 18

When using high resolution hardware timers, we must make sure that we have a
consistent global time in a multiprocessor system.
Who might be the right person to assign to?
Anyone with a lot of time on their hands and multiple operating systems on hand
to test the changes.  Are you volunteering?  ;)
I'm a volunteer. Tell me what to do.
1) Write some code to no longer use 10ms-resolution timers for nsITimer
2) Test the heck out of it...
Blocks: 107746
Blocks: 197341
Note that testing this bug on a Linux 2.6 system is near-useless because 2.6 has
HZ at 1000 instead of 100.  You'll still get times that are too high, but
they'll be too high by 0.7*1ms instead of 0.7*10ms.
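
A probe of that overhead as seen from script (it measures the end-to-end
excess over the requested delay, not HZ itself):

  // Average how much a chain of 10ms timeouts oversleeps; per the comment
  // above, expect roughly 7ms with HZ=100 and roughly 0.7ms with HZ=1000.
  function measureExcess(runs, delay) {
    var last = new Date().getTime();
    var total = 0, count = 0;
    function tick() {
      var now = new Date().getTime();
      total += (now - last) - delay;
      last = now;
      if (++count < runs) {
        setTimeout(tick, delay);
      } else {
        console.log("average excess: " + (total / count).toFixed(2) + "ms");
      }
    }
    setTimeout(tick, delay);
  }
  measureExcess(200, 10);
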
With the patch I'm developing for bug 326273, I'm able to get very reliable 10ms timer intervals on Windows.
Depends on: nsIThreadManager
I am getting
10.7s (1000 runs, 10ms delay, setInterval)
10.7s (1000 runs, 10ms delay, setTimeout)
On Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060514 Minefield/3.0a1 ID:2006051404

10.7s (1000 runs, 10ms delay, setInterval)
10.7s (1000 runs, 10ms delay, setTimeout)
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a1) Gecko/20060514 Minefield/3.0a1

10.1s (1000 runs, 10ms delay, setInterval)
10.2s (1000 runs, 10ms delay, setTimeout)
On the same Minefield build for the Mac I get the above results. It would appear that bug 326273 has fixed this.
FreeBSD, too, introduced HZ=1000 as the default (in 6.0). This can be reverted
by setting kern.hz=100 in boot/loader.conf and rebooting. But even then, with a
current Firefox trunk build (20060515), I still get slow results:
 16.3s (1000 runs, 10ms delay, setTimeout)
 16.4s (1000 runs, 10ms delay, setInterval)
Just to stop this dying completely...

I've been worrying about erratic animations under Firefox. This happens less under Opera, and not noticeably at all under IE. Even quite simple scrolling animations (see for example the first case on http://cross-browser.com/x/examples/animation_tech.php) give very jerky results.

Investigating this for my own scrolling photo code (using a simple setInterval timer) I see:
- there is a minimum interval of 16ms
- longer intervals are calculated in multiples of 16ms, so for instance 20ms will be mostly 16ms intervals, with a few 32ms intervals thrown in to correct the average, and 50ms will be mostly 48ms (or more accurately 46/47) with a few 64ms (or more accurately 63ms) added as a correction.
- long term averages are correct to a decimal place.

I measured these times using getTime() under Firebug on Windows XP. I'm not sure how anyone gets a timer of 10ms - nothing I do ever gives a time less than 16±1ms (in line with comment 30).
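
A sketch of that measurement (hypothetical helper along the same lines):

  // Log the delta between consecutive setInterval callbacks to expose the
  // 16ms quantization: a requested 20ms shows up as mostly 16ms deltas
  // with occasional 32ms ones.
  function logIntervalDeltas(delay, samples) {
    var last = new Date().getTime();
    var deltas = [];
    var id = setInterval(function () {
      var now = new Date().getTime();
      deltas.push(now - last);
      last = now;
      if (deltas.length >= samples) {
        clearInterval(id);
        console.log(deltas.join(" "));
      }
    }, delay);
  }
  logIntervalDeltas(20, 50);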

I don't mind the interval sometimes being twice what I asked for (although it does seem a bit crude) but it would be good if someone could sort out the jerky animations.
Another year gone by without progress I guess.
I stumbled over this behaviour today myself. Tested this with various browsers on my system (C2D@2.0GHz, XP SP3). Firefox 3.1, Opera 9.6, IE 7 and Safari 3.1.2 all yielded the same result: a minimum of 15.7ms.
Just one exception: Chromium 0.3.* paused only the time specified, be it only 1ms. One drawback: CPU utilization went up to 50%, so I suspect they're polling.

Regarding comment #31, the minimum cycle time was 15.7ms here, but every interval above that (20ms, 33ms, 50ms and so on) was met (ok, plus 0.7ms, but still).
Assignee: general → nobody
QA Contact: peterv → general
This bug was fixed 1 or 2 years ago by some similar bug, right?
Firefox doesn't get exactly 10s like IE 10, but before it was around 16s, I think.

Nightly 27
10.6s (1000 runs, 10ms delay, setTimeout)
10.4s (1000 runs, 10ms delay, setInterval)

Chrome 29
11.0s (1000 runs, 10ms delay, setTimeout)
11.0s (1000 runs, 10ms delay, setInterval)

IE10
10.0s (1000 runs, 10ms delay, setTimeout)
10.0s (1000 runs, 10ms delay, setInterval)

Setting needinfo from you, because I know you know the answer. Sorry!
Flags: needinfo?(bzbarsky)
I actually don't know the answer.  Maybe Avi does?
Flags: needinfo?(bzbarsky) → needinfo?(avihpit)
Interesting. Not quite sure what the question is, or what that other bug which fixed it was, but I'd assume the question is "Are these results expected?".

TL;DR: expected, but could maybe still improve:
- Animation should not be a concern, since it uses the very accurate requestAnimationFrame.
- setTimeout is expected to have overhead that depends on system performance.
- setInterval looks quite accurate to me on average, though somewhat noisy run to run.


First, as for animation concerns: even if JavaScript's setTimeout and setInterval are inaccurate, it shouldn't affect animation, since animation uses requestAnimationFrame (or the equivalent for CSS animation), which sticks either to an imaginary vsync signal very accurately, especially on average (defaults to 60Hz but can be modified via a pref: layout.frame_rate), or to an actual vsync signal if available (currently on Windows with DWM).
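
For illustration, a sketch of that approach (assumes `el` is an absolutely
positioned element):

  // requestAnimationFrame hands the callback a timestamp, so the position
  // is derived from elapsed time rather than from counting timer ticks;
  // timer inaccuracy therefore cannot accumulate into the animation.
  function slide(el, pxPerSecond, durationMs) {
    var start = null;
    function step(timestamp) {
      if (start === null) start = timestamp;
      var elapsed = timestamp - start;
      el.style.left = (pxPerSecond * elapsed / 1000) + "px";
      if (elapsed < durationMs) requestAnimationFrame(step);
    }
    requestAnimationFrame(step);
  }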

Also, setTimeout is expected IMO to have some inaccuracy in accumulation, since there's an overhead to retriggering it, and this overhead is inversely proportional to the system's performance. This is supported by the following measurements, which I took on a slow and a fast system with testcase 1 (slightly modified to use performance.now() and without rounding the result). The last result in each group is 2000 cycles of 5ms timeouts (the result is the duration divided by a fixed 1000):

Slow: AMD E-350, win7:
10.903978062687383s (1000 runs, 10ms delay, setTimeout)
10.846578678743972s (1000 runs, 10ms delay, setTimeout)
10.851496481188864s (1000 runs, 10ms delay, setTimeout)
10.804399139496963s (1000 runs, 10ms delay, setTimeout)

11.648054725228576s (2000 runs, 5ms delay, setTimeout)

Fast: i7-3638qm, win7:
10.106655759324617s (1000 runs, 10ms delay, setTimeout)
10.07944969523017s  (1000 runs, 10ms delay, setTimeout)
10.224640643561681s (1000 runs, 10ms delay, setTimeout)
10.124398007941883s (1000 runs, 10ms delay, setTimeout)
10.059879768362967s (1000 runs, 10ms delay, setTimeout)

10.153214088089298s (2000 runs, 5ms delay, setTimeout)

So, while it's a bit noisy, especially on the faster system, the slow system has an overhead of about 0.8ms/iteration and the fast system's overhead is ~0.1ms/iteration. That sounds maybe a bit high, but the effect itself shouldn't be described as a bug IMO.
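
Spelling out that arithmetic (helper name is hypothetical):

  // Per-iteration overhead = (total time - ideal runs * delay) / runs.
  function overheadPerIteration(totalSeconds, runs, delayMs) {
    return (totalSeconds * 1000 - runs * delayMs) / runs; // ms per iteration
  }

  overheadPerIteration(10.846578678743972, 1000, 10); // ~0.85ms (slow system)
  overheadPerIteration(10.106655759324617, 1000, 10); // ~0.11ms (fast system)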


As for setInterval, here IMO we should expect much more accurate results, and also less dependency on system performance. Here are my measurements on the same systems:

Slow: AMD E-350, win7:
10.094857447888295s (1000 runs, 10ms delay, setInterval)
10.023708106450329s (1000 runs, 10ms delay, setInterval)
10.002322562928951s (1000 runs, 10ms delay, setInterval)
10.00563253589865s  (1000 runs, 10ms delay, setInterval)
10.010157042466453s (1000 runs, 10ms delay, setInterval)

10.028546351490892s (2000 runs, 5ms delay, setInterval)
10.011365160144633s (2000 runs, 5ms delay, setInterval)


Fast: i7-3638qm, win7:
10.010062493425234s (1000 runs, 10ms delay, setInterval)
10.00005516389665s  (1000 runs, 10ms delay, setInterval)
10.150105666657836s (1000 runs, 10ms delay, setInterval)
10.004458012577365s (1000 runs, 10ms delay, setInterval)
9.999234547480738s  (1000 runs, 10ms delay, setInterval)

10.012124938955996s (2000 runs, 5ms delay, setInterval)
10.03322790899733s  (2000 runs, 5ms delay, setInterval)

(all measurements taken with recent Firefox 27.0a1 nightly builds)

On both systems it's a bit noisy, but with either 1000x 10ms or 2000x 5ms it looks to me like around 0.01ms of inaccuracy on average, where frequently it's quite a bit better than this, and sometimes worse.

I'd say this behavior is acceptable, though I wouldn't mind seeing lower setTimeout overheads, especially on slow systems.

bz, would you agree with the conclusion? do you think we should do something to address/improve this?
Flags: needinfo?(avihpit) → needinfo?(bzbarsky)
Sounds like this is worksforme, indeed.
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(bzbarsky)
Resolution: --- → WORKSFORME