| syncing with the monitor retrace |
|
jeroen clarysse
|
As a side note: I've been digging through the NSOpenGL documentation in Xcode, and found this note:

NSOpenGLCPSwapInterval: "Sets or gets the swap interval. The swap interval is represented as one long. If the swap interval is set to 0 (the default), the flushBuffer method executes as soon as possible, without regard to the vertical refresh rate of the monitor. If the swap interval is set to 1, the buffers are swapped only during the vertical retrace of the monitor. Available in OS X v10.0 and later. Declared in NSOpenGL.h."

Again, it isn't entirely clear what happens when you call an OpenGL context swap at time 25. The documentation states "swapped only during retrace"... Does that mean that if you call swap outside that interval, NOTHING IS SWAPPED, or is the swap simply delayed? If it is delayed, is the app blocked, or does some internal mechanism handle this in a separate thread? What happens if stuff is drawn to the backbuffer between the call at 25 and the effective retrace sync at 30? What happens if TWO swap calls are made within one retrace? Questions, questions, questions... :-) |
|||||||||||
|
|
||||||||||||
|
jeroen clarysse
|
as one last question (I should have grouped my questions in one reply.... )
is there a way in SDL to "hook into the vertical blank" ? I.e. : can I attach a function to the OpenGL engine that gets called EVERY time a retrace is completed ? That would be a nice way to do proper stimulus presentation in "frame counts"... |
|||||||||||
|
|
||||||||||||
|
Scott Percival
Guest
|
I could be wrong, but here's my understanding from the OpenGL side of things. WGL and GLX don't have a method to poll for the refresh rate or the vertical retrace status. Instead they have extensions (WGL_EXT_swap_control, GLX_EXT_swap_control) to set the swap interval, exposed in SDL as SDL_GL_SetSwapInterval and SDL_GL_GetSwapInterval.
A swap interval of 0 means buffers are swapped as fast as possible with no regard for vsync, 1 means the buffer swap call will block by sleeping until the vertical retrace finishes, 2 means the same but for every 2nd retrace, and so forth. Therefore, in the 100Hz case a swap call made at 25ms would sleep until it hits 30ms, then release.

Also, there's a further extension (GLX_EXT_swap_control_tear) for "Xbox-style" vsync handling, where any late swap that misses the retrace triggers a buffer swap ASAP, then afterwards reverts back to vsync. But that doesn't really help in this case.

The OpenGL platform APIs don't support callbacks, so you're probably out of luck. If you really need vertical sync, you will most likely have to poll your external devices from a separate thread, or perhaps change your strategy (e.g. set the swap interval to 1, determine the refresh rate, set the swap interval to 0, start polling and trigger buffer swaps by checking a timer).
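To make the 25 ms to 30 ms example concrete, here is a minimal sketch of when a blocking swap would return under the model described above. It is only a model, not driver behaviour: the function name `swap_release_us` is made up for illustration, retraces are assumed to fall on exact multiples of the refresh period, and all times are non-negative microseconds.

```c
#include <stdint.h>

/* Hypothetical model: time (in microseconds) at which a blocking swap returns.
 * call_us      - when the swap was requested
 * last_swap_us - when the previous swap completed
 * period_us    - refresh period (10000 for a 100 Hz monitor)
 * interval     - swap interval (0 = no sync, 1 = every retrace, 2 = every 2nd)
 * Assumption: retraces happen at exact multiples of period_us. */
int64_t swap_release_us(int64_t call_us, int64_t last_swap_us,
                        int64_t period_us, int interval)
{
    if (interval <= 0)
        return call_us;                      /* no sync: swap immediately */

    /* first retrace at or after the call */
    int64_t next_retrace = ((call_us + period_us - 1) / period_us) * period_us;
    /* earliest retrace allowed by the swap interval */
    int64_t earliest = last_swap_us + (int64_t)interval * period_us;

    return next_retrace > earliest ? next_retrace : earliest;
}
```

With a 100 Hz monitor and the previous swap at 20 ms, a call at 25 ms is released at 30 ms for interval 1, at 40 ms for interval 2, and immediately for interval 0.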
|
|||||||||||||
|
|
||||||||||||||
|
jeroen clarysse
|
Damn... that's a painful situation...

How do high-performance games handle this? I mean: you sometimes see fps > 100 in some games... I guess they use a separate thread to poll the keyboard & mouse and do collision calculations? If you look at games in the process manager/activity monitor, they are for sure taxing the CPU, so they are not sitting idle waiting for the vsync...

Your last suggestion might work, but it is a bit dangerous, since any rounding errors will accumulate over time, causing redraws to occur mid-screen.

My biggest fear here is the following: my app is not a game, but an engine for psychology experiments. We need to be able to poll devices all the time to see if a subject has responded to visual stimuli on the screen. Even a false positive (subject presses too early to be a proper reaction time, or even BEFORE the stimulus is shown) needs to be detected. With a sleep inside the OpenGL swap call, I will lose the capacity to detect these responses.

I could go multithreaded, but as people have pointed out to me in another forum topic, the odds of introducing more problems than I'm solving are substantial. Especially timing-wise, threads are a "bag of hurt". |
|||||||||||
|
|
||||||||||||
|
Scott Percival
Guest
|
With the exception of fancy >100Hz monitors, FPS >100 would imply that the game is running with vsync turned off. Also most games only need to pump the event queue (which deals with input) once per draw call, which simplifies things.
Can you describe the external device that you're polling? If it just appears to the PC as an ordinary input device (e.g. keyboard, mouse, joystick), and you need sub-frame input accuracy (which seems like slight overkill IMHO), then you might be able to rig something by looping on SDL_PollEvent, and using a timer to record measurements and swap the buffers as described before.
|
|||||||||||||
|
|
||||||||||||||
|
Gabriele Greco
Guest
|
I don't think a human being is capable of reaction times under 1/60 of a second, so I'm quite sure that any choice you make will not invalidate your simulation. Anyway, I think the only way to be sure to process as many frames as possible and never wait is to disable vsync; if the FPS is high you will barely notice any tearing.
This will not solve your problem anyway, since if you detect something on another thread you'll still have to push the results to the main thread, and wait for the vsync to display anything. -- Ing. Gabriele Greco, DARTS Engineering Tel: +39-0100980150 Fax: +39-0100980184 s-mail: Piazza Della Vittoria 9/3 - 16121 GENOVA (ITALY) |
|||||||||||||||
|
|
||||||||||||||||
|
Scott Smith
Guest
|
Most modern games use multithreading, which can get complicated.
I handle this by letting the main loop render as fast as possible and then adjusting objects according to how long the render took. Vsync can also be used to limit the frame rate, and it works since the logic is tied to how much time has passed, not how many frames have rendered. I use this method and it works well: I get the same logical speed of the game across slow to fast systems, with just a fluctuation in frame rate. Here's a more detailed write-up on the method: http://www.koonsolo.com/news/dewitters-gameloop/ |
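The "tie the logic to elapsed time, not to frame count" idea can be sketched in a few lines. The `Object` type and `advance` function are invented for this illustration; in a real SDL loop the elapsed time would come from a timer such as SDL_GetTicks.

```c
/* Hypothetical moving object: position in units, velocity in units per second. */
typedef struct {
    double x;
    double vx;
} Object;

/* Advance the object by the time the last frame actually took.
 * Because movement is scaled by dt_seconds, one slow frame and two fast
 * frames covering the same wall-clock time move the object equally far. */
void advance(Object *o, double dt_seconds)
{
    o->x += o->vx * dt_seconds;
}
```

One 0.5 s step and two 0.25 s steps both move an object travelling at 100 units/s by 50 units, which is why the logical speed stays the same across slow and fast machines.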
|||||||||||
|
|
||||||||||||
|
jeroen clarysse
|
Ha, you're underestimating psychologists :-) No, in all seriousness: we sometimes display a series of images (an animation, if you want) and need to record input from the subject. The actual response time to the image is not relevant, but a few seconds later a variation of that animation might be shown, and we need to see if the subject is responding slower or faster to this new animation. Sometimes the examined effect is on the order of 5 msec! If each frame of the animation can induce a sleep, it is impossible to measure this.
Well, in our experiments, the visual feedback on the screen rarely needs accurate timing. It's only the subject's response to the first visual stimulus that needs to be recorded accurately. |
|||||||||||||||
|
|
||||||||||||||||
|
Rainer Deyke
Guest
|
Simple, almost predictable way to measure reaction time:
- Draw your image.
- Present()
- Measure the time after Present() returns.
- Measure the time when you receive user input.
- Take the difference between these two times.

It doesn't matter if Present() returns immediately or blocks for 20 ms or blocks for five minutes, so long as it returns at the exact time when the image is sent to the screen. The only problem is that the OS might choose to schedule another task in the exact instant between when Present() returns and when you measure the time, but there's no way to avoid this without a real-time OS.

The same principle applies to animation, so long as you can pump out frames faster than the screen refresh rate. Just measure the time after the first frame.

-- Rainer Deyke
_______________________________________________ SDL mailing list http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org |
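The last two steps of that recipe boil down to converting a pair of timer readings into milliseconds. A minimal sketch, assuming the readings come from a high-resolution counter (in SDL2, SDL_GetPerformanceCounter() and SDL_GetPerformanceFrequency() would be the natural source); the helper name `ticks_to_ms` is made up here.

```c
#include <stdint.h>

/* Convert the difference between two counter readings to milliseconds.
 * Intended use (sketch, not a full program):
 *   draw the stimulus
 *   present / swap buffers               (may block until the retrace)
 *   start_ticks = counter value right after the present call returns
 *   ... wait for input ...
 *   end_ticks   = counter value when the input is received
 * ticks_per_second is the counter frequency. */
double ticks_to_ms(uint64_t start_ticks, uint64_t end_ticks,
                   uint64_t ticks_per_second)
{
    return (double)(end_ticks - start_ticks) * 1000.0
           / (double)ticks_per_second;
}
```

How long the present call blocks does not enter the result, because the start reading is only taken once it has returned.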
|||||||||||||
|
|
||||||||||||||
|
Frederik vom Hofe
|
Actually, GLX at least has glXGetVideoSyncSGI to read out the vsync frame counter. I don't know if there is an equivalent on Windows.
How do you want to read out reactions as short as 5 milliseconds?
- 60 Hz screen = full frame update in about 16.7 milliseconds
- 120 Hz screen = full frame update in about 8.3 milliseconds
- 200 Hz screen = full frame update in 5 milliseconds

You could limit yourself to only using the upper or lower part of the screen and thereby make the time for a "full frame update" proportionally shorter.

Also, flat-screen pixels that change over a wide color range may need "a long time" to change. Gray to gray is often 1-3 ms, but black-white changes need more time. (And manufacturer information on response times is garbage.)

And then there is the so-called "input lag": a fixed delay between sending data to the screen and the moment the screen starts to change the first pixels. Some flat screens lag more than 25 ms. But because it is fixed, you can measure it and just set a variable in your program accordingly. Note: CRTs have input lag, too! The graphics card and driver also cause some "input lag", but it is not that noticeable.

Still, the easiest way is to not use vsync at all and render as fast as possible. Then you only have to compensate for input lag. |
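The arithmetic behind those numbers is small enough to write down. This is only a sketch: the function names are invented, the vertical blanking interval is ignored (so the partial-screen figure is an approximation), and the input lag is assumed to be a constant you measured yourself.

```c
/* Duration of one full refresh in milliseconds. */
double frame_period_ms(double refresh_hz)
{
    return 1000.0 / refresh_hz;
}

/* Approximate time to scan out a fraction of the screen height
 * (fraction = 0.5 for the upper half). Ignores the blanking interval. */
double region_scan_ms(double refresh_hz, double fraction)
{
    return frame_period_ms(refresh_hz) * fraction;
}

/* Estimated moment the stimulus became visible: the time the swap
 * returned plus a fixed, separately measured input lag. */
double corrected_onset_ms(double swap_return_ms, double input_lag_ms)
{
    return swap_return_ms + input_lag_ms;
}
```

For example, a 200 Hz screen gives a 5 ms frame period, and using only half of a 100 Hz screen also brings the update time for that region down to roughly 5 ms.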
|||||||||||
|
|
||||||||||||
|
jeroen clarysse
|
I'm aware of these limitations, and I myself also think that the obsession with display timing in psychology experiments is sometimes exaggerated! But there are a ton of articles on this subject, and (some of) my colleagues can be rather persistent about this issue...

It is not so much about displaying fast, but about accurate timing:
- we want to know as precisely as possible WHEN a stimulus is presented on the screen.
- we also want to control HOW LONG it is visible with as much precision as possible.
- the faster it is presented, the better (aka: higher refresh rate is better).
- we need to be able to inspect devices ALL the time.
- all of this should be feasible in a "framework" where the experimenter has as much freedom as (s)he wants.

Ideally, we would stop using 'time' as a measurement of onset for stimuli, and use 'frames' instead. So instead of saying "present the stimulus at time 100", we would say "present it after 10 frames" (on a 100Hz monitor). The problem is that I cannot reliably count frames: if the OS takes the processor away for just enough time to miss one refresh, everything goes bananas. A "hook" or "interrupt" in the OpenGL core that triggers a custom function at every vertical blank would have been the solution, but hardware probably doesn't support this, unless this glXGetVideoSyncSGI is exactly that? My app needs to run on Mac + Win.

As you see: there is nothing really complicated, but there are a lot of issues involved which CAN lead to complex situations... If this was just a simple display-image-then-wait-for-device, things would be simple. But unfortunately I'm trying to write a FRAMEWORK in which the researchers can create their experiments, and thus I have to foresee execution paths that lead to bizarre results.

Thanks for replying, all of you! |
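Expressing onsets in frames instead of time is a simple conversion, sketched below. The name `frames_for_onset` is hypothetical, times are in microseconds to keep the arithmetic exact, and an onset that does not fall on a refresh boundary is rounded up to the next frame (an assumption; rounding down would be an equally valid policy).

```c
#include <stdint.h>

/* Number of refreshes to wait so that a stimulus appears no earlier
 * than onset_us, given a refresh period of period_us.
 * Onsets between two refreshes are rounded up to the later one. */
int64_t frames_for_onset(int64_t onset_us, int64_t period_us)
{
    return (onset_us + period_us - 1) / period_us;
}
```

On a 100 Hz monitor, "time 100" becomes 10 frames, while an off-grid onset such as 205 ms becomes 21 frames.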
|||||||||||||
|
|
||||||||||||||
|
Frederik vom Hofe
|
In this case I would use vsync and multithreading. (Multithreading is not THAT scary)
You can also give your application a higher priority so it is unlikely that other events stall it. And you can build in detectors for missed vsyncs, or time stalls in the event thread. Then only input lag and frame queues can ruin your measurements. Some imaginary code:
|
|||||||||||||
|
|
||||||||||||||
|
jeroen clarysse
|
@ Frederik vom Hofe :
wow ! This is awesome !!! I still have to read it more thoroughly (just got home after a long hard workday so my brain is a bit fuzzy) but it helps me a lot ! thank you ! thank you ! |
|||||||||||
|
|
||||||||||||
|
jeroen clarysse
|
as a silly side-note : I know how multithreading works, but I wonder HOW fast thread switches happen on a single core CPU... what is on average the time assigned to every thread ? 5 msecs ? 1 msec ? less than a msec ?
|
|||||||||||
|
|
||||||||||||
|
Frederik vom Hofe
|
Thread switching happens hundreds of times per second anyway, in the form of hardware IRQs.
The only thing that may cause issues on a single-core CPU is the fact that a polling thread would use 100% CPU time. Such a thread has a higher chance of getting CPU time taken away by the OS for a longer time frame. A simple SDL_Delay(1) in the event thread could fix that without introducing too much lag. |
|||||||||||
|
|
||||||||||||
|
jeroen clarysse
|
@ Frederik vom Hofe :
Am I right if I summarize your code as follows? Work with 2 threads:
- the main thread, which ONLY does screen updates, and does each of these with VSYNC turned on. This means that on a 100Hz monitor, the thread will sit idle for 9.9 msecs and then blit some stuff on screen. After each swap, a (mutexed) counter is incremented to keep track of the number of frames that have passed. Comparing this value with the previous value can detect OS-generated lag greater than one refresh.
- a secondary worker thread, which polls the clock and checks for device input every millisecond. By looking at the clock, and comparing that time with the previously read clock time, the thread can determine when the OS has caused lag. This thread can also use the mutexed counter to launch actions at specific frame counts.

One question that pops up in my head is: can the worker thread change textures that need to be blitted in the main thread? My two threads would simply share an array of textures+coordinates (I call this the "blit queue"). But sometimes my textures change. Is SDL thread-safe in such a way? And further down the thread-safe train of thought: can the worker thread call SDL_PumpEvents to process key and mouse events as well?

The only thing I have to work out is how to handle "static events". These are events that occur at specific times (for instance at time 205, which is NOT a multiple of the refresh period). If such an event happens, some changes in the blit queue will be made, and the next Swap() call will display these changes. However, my framework can also start internal timers... I have to figure out a way to start these timers when the image is displayed... or I have to delay the static event until the next frame has passed... but neither of these is all that complicated.

Thanks again! You gave me whole new insights. The only thing I'm afraid of is that SDL is not thread-safe... |
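The "detect lag greater than one refresh" check in that summary can be sketched as a pure function on two swap timestamps. This is not the code from the earlier post (which is not reproduced here); `missed_refreshes` is a hypothetical helper, and timestamps are assumed to be in microseconds from any monotonic clock.

```c
#include <stdint.h>

/* Estimate how many refreshes were skipped between two consecutive
 * vsync'ed swaps. The elapsed time is rounded to the nearest whole
 * number of refresh periods so that small timer jitter is tolerated. */
int missed_refreshes(int64_t prev_swap_us, int64_t this_swap_us,
                     int64_t period_us)
{
    int64_t elapsed = this_swap_us - prev_swap_us;
    int64_t periods = (elapsed + period_us / 2) / period_us; /* round */
    return periods > 1 ? (int)(periods - 1) : 0;
}
```

On a 100 Hz display, swaps 10 ms (or 9.8 ms) apart report 0 missed refreshes, while swaps 20.3 ms apart report 1.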
|||||||||||
|
|
||||||||||||
|
jeroen clarysse
|
Hi frederik
I've been thinking about your sample code, but I'm still a bit puzzled. I made a little image to illustrate. The first line is the main thread, which does nothing except swap images synced to the VRS. The 2nd line is the event thread, which polls devices and statically-timed events. Whenever something is triggered in the event thread, it will simply update a (mutex-protected) array of bitmaps_to_be_drawn. The main thread will, right before the synced swap, check if that array was modified (a simple mutexed boolean must_redraw) and if so, redraw the backbuffer before swapping. So far so good.

I also added some vertical lines to indicate a 100Hz refresh rate. The red curves indicate how the main thread "jumps" every time to the next swap.

Now imagine an event occurs in the event thread at time 13, so between the 2nd and 3rd swap. This event will update the bitmaps_to_be_drawn array and raise the must_redraw boolean. My problem is: WHEN WILL THE NEW SCREEN BE VISIBLE? Obviously, I want the update to become visible at the next retrace: at time 20. However, if I understand correctly, this will NOT BE THE CASE: at time 20, the main thread will wake up from the sleep it went into after the swap at time 10! So at time 20, it will update the backbuffer and call Swap(vsync=true), which means that it will go back to sleep and update the monitor only at time 30!

Am I correct? What do you propose as a solution? [img]http://imgur.com/zu1x8nX[/img] |
|||||||||||
|
|
||||||||||||
|
Nathaniel J Fries
|
SDL_Delay has no understanding of frames.
It delays for a given number of milliseconds (on a 100 Hz display, 1 ms is 1/10th of a frame; on a 60 Hz display, about 1/17th of a frame; etc.). |
|||||||||||
|
|
||||||||||||
|
Frederik vom Hofe
|
That's the cost of syncing to the screen. On a 100Hz screen, a single frame can only use 10ms to draw or it will miss the vsync. The default strategy is to draw as fast as possible and then idle until the vsync occurs. This means any visual change you make will take between 10 and 20 ms (on 100Hz) until you see it on screen. Theoretically you could use SDL_Delay after the blocking swap function. But this will just make it very likely that you miss the next vsync. Also it would not help a lot: e.g. by cutting the draw time in half you only have 5ms to draw and still 5-15ms before the change appears on the screen. But this is only a problem if you need direct screen changes after input. Otherwise just use scripts that the render thread knows in advance (like 3 frames) and then can show stuff at exact predefined frames/times.
The only point in using SDL_Delay(1) was to make the event thread idle for the smallest possible amount of time, so it doesn't use up 100% CPU but can still measure input timings exactly. |
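The 10 to 20 ms figure follows from a simple model, sketched here under stated assumptions: the render thread wakes exactly at each retrace, picks up only changes made before that wake-up, and its swap becomes visible at the following retrace. The function name `visible_at_us` is invented for the illustration.

```c
#include <stdint.h>

/* Model of the draw-then-blocking-swap loop:
 * a change made during refresh interval k (retrace k .. retrace k+1)
 * is noticed when the render thread wakes at retrace k+1, and appears
 * on screen when that swap completes at retrace k+2. */
int64_t visible_at_us(int64_t event_us, int64_t period_us)
{
    return (event_us / period_us + 2) * period_us;
}
```

With a 10 ms period, an event at 13 ms becomes visible at 30 ms (17 ms later), matching the "time 13 shows up at time 30" case discussed earlier in the thread.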
|||||||||||||||
|
|
||||||||||||||||
|
Sik
|
1) SDL_Delay(0) will give you the minimum delay (it literally means "switch away from this task and return as soon as possible").
2) Polling the events should be doing that already.
|||||||||||||||||
|
|
||||||||||||||||||
|
jeroen clarysse
|
thanks to all for your replies. This is quite an interesting discussion :-)
I think the only solution would be to have some sort of a call like this: SDL_if_in_vertical_blank_then_swap_else_do_nothing(). This would basically check if the retrace beam is at the top. If so, the backbuffer is swapped onscreen. If not, execution continues and you can do some processing. Of course, if the beam was very close to the bottom, and this processing takes longer than the beam needs to turn around, you will miss a retrace and thus drop a frame!

So an even better call would be SDL_how_far_from_the_vertical_blank_is_the_beam(), which returns pixels or msecs (one can be calculated from the other based on monitor resolution and refresh rate).

DirectX has such a call: IDirectDraw7::GetScanLine(), which is documented [url=http://msdn.microsoft.com/en-us/library/windows/desktop/gg426149(v=vs.85).aspx]here[/url], but if I'm correct, OpenGL doesn't support this.

I think that my only solution is to implement a new feature in my framework: "start an internal clock after the next swap", which can then be used to measure response latencies. This does mean of course that exact timing between visual presentations is difficult. |
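The "pixels or msecs" conversion mentioned above is a one-liner once you have a scanline number (for instance from a GetScanLine-style call). A rough sketch: `ms_until_vblank` is a made-up name, and the blanking lines themselves are ignored, so `total_lines` here is just the visible vertical resolution.

```c
/* Approximate time left until the beam reaches the bottom of the
 * visible area, given the scanline currently being drawn.
 * Assumption: scan-out time is spread evenly over total_lines and the
 * vertical blanking interval is negligible. */
double ms_until_vblank(int scanline, int total_lines, double refresh_hz)
{
    double period_ms = 1000.0 / refresh_hz;
    return period_ms * (double)(total_lines - scanline) / (double)total_lines;
}
```

At 100 Hz with 600 visible lines, a beam at line 300 leaves about 5 ms of processing time before the next vertical blank.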
|||||||||||
|
|
||||||||||||
|
Nathaniel J Fries
|
True. Even better would be to use sched_yield on POSIX or SwitchToThread on Windows, since both of them essentially establish "I have done my job for this time slice, let other threads do theirs". I understand that the semantics are different (sched_yield guarantees that the thread is moved to the back of the scheduling queue, SwitchToThread only guarantees that one other thread will be given an opportunity to run)
Polling events does wait on a mutex, but it does not explicitly yield execution. Pumping events will not yield execution except as performed internally by the underlying windowing system (I would imagine xlib does quite a bit of this, and that Windows does very little). |
|||||||||||||||
|
|
||||||||||||||||
|
Sik
|
I meant that polling events does help the scheduler know when it can be safe to switch threads without hurting performance. Ultimately you just want to help the scheduler do its job, not force it to do what you want. Maybe it thinks that giving your thread more time is the best thing after all!
|||||||||||||||||
|
|
||||||||||||||||||
|
Frederik vom Hofe
|
Didn't know about SDL_Delay(0). But this would not eliminate 100% CPU usage. "Polling" with SDL_PollEvent is non-blocking and would also not cause the thread to idle. SDL_WaitEvent looks like it would. But then you don't know at what intervals events come in, and therefore how long the thread would be asleep after calling SDL_WaitEvent.

@jeroen clarysse: One simple question! Do you really need minimum time from event to screen output with vsync? Otherwise you are just overcomplicating things. |
|||||||||||||
|
|
||||||||||||||
|
jeroen clarysse
|
Yeah... I'm not using SDL as a game engine, but for psychology experiments. Timing is imperative for approx 20% of our experiments. Anything that is "priming" related or involves subliminal perception requires very accurate measurements. I know about the whole issue with monitor latencies etc., and we have ways to deal with that (we use very expensive 200Hz monitors and high-end graphics cards for subliminal experiments or eyetracking stuff).

Basically we need to know the response time between a presented stimulus and input that is read from a device (usually via the parallel port or data acquisition cards). Also, eyetrackers or EMG measurements need to be synced with the presentation time of the visual stimulus.

If it's just one stimulus, I can sort of work around the VRS issues listed previously in this discussion, but sometimes we have a sequence of stimuli... Also, like I said: I'm building an application that other researchers will use to build their own experiments. As such, I need to be sure that my timing is accurate in all possible circumstances, since you never know what these researchers will come up with :-)

The previous version of the software was DirectX only, and I could use the GetScanLine() code: I would prepare all stimuli in the first half of the screen refresh, and make sure to be ready before the vertical blank was reached. With the SDL approach, I can't do that, since the Swap(sync=true) routine will block me from doing any work between swaps. Multithreading looks like a solution, but since I'm always one swap too late, I can't get the accuracy I need either.

But it is an interesting discussion nonetheless, and I'm learning a lot! |
|||||||||||||
|
|
||||||||||||||
|
Frederik vom Hofe
|
Now it makes more sense to me, you just want to cover all cases. I have a new idea. Even single threaded! :D The most important part is a high resolution timer! But sdl2 got that covered:
PS: high-end graphics cards may not be your first choice if you are just looking for low latency. |
|||||||||||||||||
|
|
||||||||||||||||||
|
Frederik vom Hofe
|
Just correcting a brain fart:
has to be
|
|||||||||||||||
|
|
||||||||||||||||
|
Jared Maddox
Guest
|
Hmmm... Have you tried using software renderers with multi-threading? I don't know how much work your rendering is doing (and I haven't tried using the software renderer in a multi-threaded manner) but if it's light enough then it should be practical to do it entirely in software, and just upload it when appropriate. If you need something more heavy-duty than SDL itself then you can glue one of the software OpenGL implementations (e.g. TinyGL) to a SDL_Surface. Beyond that, have you looked to see if anyone has tried this with platform-specific OpenGL? If they have, then their experiences with it could be much more informative than anything we can tell you, since SDL would mostly be useful for making your code more portable, rather than being useful for e.g. rendering geometry-based scenes. |
|||||||||||||||
|
|
||||||||||||||||
|
Sam Lantinga
|
If you're running on a dedicated system, you can run a calibration loop using SDL's performance timer and the present synced to vblank to estimate its timing, then you can figure out how long a present takes and roughly when it happens.
Then, in a single thread, you can loop with no delays and work until the next scheduled vblank, and then do a present. If you don't have any other load on the system it should be pretty accurate, and you can adjust the calibration on the fly and detect missed vblanks. I don't have time to show it in code, but hopefully that helps. See ya!
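The scheduling half of that idea (work until shortly before the next predicted vblank, then present) might look roughly like the sketch below. It is an interpretation, not code from the post: `next_vblank_us`, `should_present`, and the calibrated values `base_us`, `period_us`, and `present_cost_us` are all hypothetical names, and `now_us` is assumed to be at or after `base_us`.

```c
#include <stdint.h>

/* Predict the first vblank strictly after now_us, given one observed
 * vblank time (base_us) and the calibrated refresh period. */
int64_t next_vblank_us(int64_t now_us, int64_t base_us, int64_t period_us)
{
    int64_t n = (now_us - base_us) / period_us + 1;
    return base_us + n * period_us;
}

/* Decide whether to stop polling/working and issue the present now.
 * present_cost_us is the calibrated safety margin: how long before the
 * vblank the present call must be issued to make it in time. */
int should_present(int64_t now_us, int64_t base_us, int64_t period_us,
                   int64_t present_cost_us)
{
    return next_vblank_us(now_us, base_us, period_us) - now_us
           <= present_cost_us;
}
```

A single-threaded loop would keep polling devices while `should_present` returns 0, and call the vsync'ed present as soon as it returns 1; re-reading the clock after the present returns gives a fresh `base_us` for on-the-fly recalibration.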
|
|||||||||||||||||||||
|
|
||||||||||||||||||||||
|
Nathaniel J Fries
|
Unless your thread is literally the only thread in the system, then it should drop significantly. I've dropped from >90% to as low as 2% using SDL_Delay(0). |
|||||||||||||||
|
|
||||||||||||||||
|
Jonny D
|
I recall something about SDL_Delay(0) being a no-op on some platforms. Is that true? I always use SDL_Delay(1) just in case.
Jonny D
|
|||||||||||||
|
|
||||||||||||||
|
Nathaniel J Fries
|
I have also heard this, which is why I think a yield function might be more appropriate for this purpose. |
|||||||||||||
|
|
||||||||||||||
|
Sik
|
Decided to look into this to see how SDL_Delay behaves.
- Windows: always calls the Sleep function. This behaves exactly like I said.
- Unix: it tries to use nanosleep if available, and a busy loop otherwise. Note that in the case of the latter it literally never tells the OS that it's waiting, making SDL_Delay unsuitable for giving up CPU time.
- BeOS: Haiku? Anyway, it uses snooze there, so I guess the yielding semantics still apply.
- PSP: what the heck is this doing here?! (whatever, calls sceKernelDelayThreadCB, no idea how that works)

I assume OSX, Linux and Android all go with the Unix code. No idea what iOS would use (I'd guess Unix, but yeah, not sure).

Anyway, basically it seems that yes, it calls the sleep function of the OS pretty much always. There's only one exception, which is Unix without nanosleep, but in this case it enters a busy loop, so no matter how long you make the delay it won't give up CPU time.
|||||||||||||||
|
|
||||||||||||||||
|
Nathaniel J Fries
|
False. If nanosleep is not available, it uses select(). Passing a valid timeval but no fd_sets to a POSIX-compliant implementation of select will delay the thread for the specified time, or until a signal occurs. In fact, its behavior in this situation is nearly the same as nanosleep's, except that select does not provide its own method of determining how much longer it needs to sleep (thus the seeming busy-loop). POSIX does not specify whether this actually yields the thread. The internal implementation of select might well not yield execution. |
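For readers unfamiliar with the select() trick, here is a minimal POSIX sketch of sleeping with a timeout and no file descriptor sets. It is not SDL's implementation: `sleep_ms_select` is an illustrative name, and unlike a robust delay function it does not loop to finish the wait if a signal interrupts it early.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Sleep for roughly `ms` milliseconds using select() with no fd sets.
 * Returns 0 when the timeout expired, or -1 if select() failed
 * (for example when interrupted by a signal, errno == EINTR). */
int sleep_ms_select(long ms)
{
    struct timeval tv;
    tv.tv_sec  = ms / 1000;
    tv.tv_usec = (ms % 1000) * 1000;
    return select(0, NULL, NULL, NULL, &tv);
}
```

Because select() does not portably report how much time remained when it returns early, a caller that needs the full delay has to re-check the clock and call it again, which is what makes such code look like a busy loop at first glance.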
|||||||||||||
|
|
||||||||||||||
|
Sik
|
Touché. Forgot about that trick.
And technically, the point about not explicitly saying whether it yields is true even for the delay functions of operating systems; yielding just happens to be the most common behavior (originally because if the process sleeps it truly has nothing else to do, and then because sleeping for 0 ms became an idiom for instant yielding in programs, so operating systems behave accordingly). I don't think it's possible to guarantee a yield in a modern system no matter what, and it probably isn't a good idea to force yields since that can mess with the scheduler's plans. We can behave in specific ways to provide hints to the scheduler and that's it.
|||||||||||||||
|
|
||||||||||||||||
|
jeroen clarysse
|
All right, here's a wrap-up of my findings so far:
- SDL does not have a way to detect the vertical blank, so we cannot use that to coordinate things
- multithreading is a solution, but it is impossible to accurately coordinate everything inside my own framework in such a way that it always works

I will resort to the following solution:
* two threads: main thread and event thread
* the main thread does only one thing: a sync'ed SWAP() call which waits for the beam sync, then swaps. After the swap, this thread will increment a VRS counter and verify that we did not lose too much time (time since the last swap should be <= one refresh period)
* the event thread will inspect all external devices and fixed-time events
* every time the main thread updates the screen, a flag will be raised, e.g. a flag named "screen_was_updated"
* a researcher can lower this flag in his experiment when he displays critical visual stimuli
* a researcher can make an event "when flag is raised". He can attach affect5 commands (my framework code) to start timers
* this timer can be used to calculate the response time of the subject in the experiment
* needed commands in my framework are:
  + turn on/off VRS sync --> simply calls SDL_GL_SetSwapInterval(1) or SDL_GL_SetSwapInterval(0)
  + set VRS flag --> tells the framework which flag to raise at each screen update
  + set VRS counter --> tells the framework which counter to increment at each screen update

That should do the trick! Fingers crossed that I did not forget any crucial aspects :-) Thanks to everyone who helped me brainstorm on this. |
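The flag-and-counter bookkeeping in that plan can be sketched as a small state struct. Everything here is hypothetical (the `VrsState` type and all function names are invented, not part of SDL or of the framework mentioned above), and the sketch is deliberately single-threaded: in the two-thread design, each access would need to be protected by a mutex.

```c
#include <stdint.h>

/* Shared state between the (hypothetical) main and event threads.
 * In a real two-thread program every access must hold a mutex. */
typedef struct {
    uint64_t vrs_counter;        /* number of completed swaps            */
    int      screen_was_updated; /* raised by the main thread per swap   */
    int64_t  last_swap_us;       /* timestamp taken right after the swap */
    int      timer_running;
    int64_t  timer_start_us;
} VrsState;

/* Main thread: call right after the vsync'ed swap returns. */
void on_swap_returned(VrsState *s, int64_t now_us)
{
    s->vrs_counter++;
    s->last_swap_us = now_us;
    s->screen_was_updated = 1;
}

/* Experiment code: lower the flag just before queuing a critical stimulus. */
void arm_for_next_swap(VrsState *s)
{
    s->screen_was_updated = 0;
    s->timer_running = 0;
}

/* Event thread: the "when flag is raised" event. Starts the timer at the
 * recorded swap time. Returns 1 if the timer was started by this call. */
int start_timer_if_flag_raised(VrsState *s)
{
    if (!s->screen_was_updated || s->timer_running)
        return 0;
    s->timer_start_us = s->last_swap_us;
    s->timer_running = 1;
    return 1;
}

/* Response latency of an input relative to the stimulus onset. */
int64_t response_latency_us(const VrsState *s, int64_t input_us)
{
    return input_us - s->timer_start_us;
}
```

Starting the timer from the timestamp recorded by the main thread, rather than from the moment the event thread notices the flag, keeps the event thread's polling delay out of the measured latency.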
|||||||||||
|
|
||||||||||||
|
jeroen clarysse
|
I have one last question perhaps for the experts here : the wiki states :
this would mean that I can only inspect mouse and keyboard inside the MAIN THREAD, not the event thread. Is that correct ? Because it would mean that mouse & keyboard can only be checked once per refresh rate. For the keyboard that would be not that much of an issue since it is a slow device anyway. But a mouse button can be polled faster than that on most machines... It would be a bit of a shame to lose that ! any ideas/suggestions ? |
|||||||||||||
|
|
||||||||||||||
|
Nathaniel J Fries
|
You need to call SDL_PumpEvents from the main thread. You can use SDL_PeepEvents from another thread, as long as the main thread is filling SDL's thread-safe event queue via SDL_PumpEvents.

But there's a flaw with this plan, too. First, you'd have to wait for a vsync to fill the event queue. Second, if you miss a vsync during rendering, you'd have to wait for the NEXT vsync, meaning that your app will be less responsive.

But there's more: unless this has been fixed in SDL2, SDL's thread-safe event queue only holds 256 events. It's unlikely, but possible, to have more than 256 events per vsync. What happens to these events? Are they lost forever, or are they just held over? At least in the case of some older versions of SDL, they are lost forever. I'm not sure that this was ever fixed. And if they weren't lost forever, they'd still be processed very late.

My recommendation? Don't use vsync. If you only want to draw at the display's refresh rate, figure out the refresh rate of the display and use SDL_GetTicks to determine when next to draw. And if you're using a backbuffer, I don't see why you'd use vsync at all. |
|||||||||||||||
|
|
||||||||||||||||
|
jeroen clarysse
|
@Nathaniel : thanks for the feedback !
I need to use the VSYNC unfortunately : the whole application is centered around accurate displaying of images to subjects in psychology experiments (we use 100Hz or 200Hz CRT monitors for optimal results). It is rather important for us to ensure that images are drawn in "one sweep", so syncing is mandatory.

I can however live with the disadvantage of missing one frame occasionally. First of all, we only do a few draws per "trial" (a trial is the smallest instance of execution that is presented to the subject of the experiment. Typically, a trial is one image display, plus measurement of a response. A typical experiment is 50-100 trials per subject).

Your solution of measuring the refresh rate once, and then using that time, is something I'm not really feeling confident with : I just wrote a very small piece of code :
and it turns out there is a bit of variation between refreshes. This would make it very difficult to ensure proper syncing. I think that using SDL_GL_SetSwapInterval(1) is my only way to ensure TRUE syncing |
|||||||||||||
|
|
||||||||||||||
| syncing with the monitor retrace |
|
Rainer Deyke
Guest
|
On 2013-05-06 14:27, jeroen clarysse wrote:
Multithreading is not a solution. Seriously, if you want accurate timings, the last thing you want is threads. Random context switches kill accurate timings. Don't think that a multi-core CPU will protect you; the threads still need to use mutexes to synchronize. Ideally, you wouldn't even use a multitasking operating system.

You can get reasonably accurate vsync timings like this:

    SDL_RenderPresent(...);
    base_time = SDL_GetTicks();
    SDL_RenderPresent(...);
    interval = SDL_GetTicks() - base_time;

Future vsyncs will happen predictably at base_time + interval * N. For added accuracy, make several measurements of the interval, discarding any very high values or very low values, and rebase base_time after each SDL_RenderPresent.

--
Rainer Deyke
_______________________________________________
SDL mailing list
http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org |
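The "several measurements, discard the extremes" step could look something like the sketch below. It is one possible reading of the suggestion (a trimmed mean), not code from the post; the function names and the trim parameter are assumptions, and the samples are plain millisecond values measured however you like.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* qsort comparator for doubles, ascending. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Estimate the refresh interval from measured swap-to-swap intervals:
 * sort them, drop the `trim` lowest and `trim` highest, average the rest.
 * The samples array is sorted in place. */
double estimate_interval(double *samples, size_t count, size_t trim)
{
    assert(count > 2 * trim);
    qsort(samples, count, sizeof samples[0], cmp_double);

    double sum = 0.0;
    for (size_t i = trim; i < count - trim; i++)
        sum += samples[i];
    return sum / (double)(count - 2 * trim);
}

/* Predicted time of the N-th vsync after base_time. */
double predict_vsync(double base_time, double interval, unsigned long n)
{
    return base_time + interval * (double)n;
}
```

For example, the samples {10, 10, 25, 10, 2} with trim = 1 give an interval of 10, so the missed frame (25) and the bogus short reading (2) no longer pull the estimate around.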
|||||||||||||
|
|
||||||||||||||
|
jeroen clarysse
|
@Nathaniel :
as a side note, i've peeked inside the SDL sources, and found
so your 256 was even optimistic :-) and yes : it is a "running" array with head/tail/next pointers that walk through the array, overwriting anything that is older than 128 entries ! But I'm not so worried about this : on a 100Hz monitor, I would be calling PumpEvents every 10msec... if so many things happen that this 128-entry queue fills up in 10msec, any timing accuracy will be worthless anyway ! Most likely, refreshes will then be missed, which the main thread detects by checking that the current clock is never more than (previous_clock + refresh_time) |
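To make the overwrite behaviour described above concrete, here is a small ring buffer that drops the oldest entry once it is full. This is not SDL's code: the capacity of 128 is simply the number mentioned in the post, events are plain ints, and every name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 128   /* figure quoted in the post, not read from SDL here */

typedef struct {
    int items[QUEUE_CAPACITY];
    size_t head;             /* index of the oldest stored event */
    size_t count;            /* number of stored events */
    unsigned long dropped;   /* events overwritten before anyone read them */
} EventRing;

void ring_init(EventRing *q)
{
    q->head = 0;
    q->count = 0;
    q->dropped = 0;
}

/* Append an event; when the ring is full the oldest event is overwritten. */
void ring_push(EventRing *q, int ev)
{
    size_t tail = (q->head + q->count) % QUEUE_CAPACITY;
    q->items[tail] = ev;
    if (q->count == QUEUE_CAPACITY) {
        q->head = (q->head + 1) % QUEUE_CAPACITY;  /* oldest entry is gone */
        q->dropped++;
    } else {
        q->count++;
    }
}

/* Remove the oldest event. Returns 1 on success, 0 if the ring is empty. */
int ring_pop(EventRing *q, int *out)
{
    assert(out != NULL);
    if (q->count == 0)
        return 0;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return 1;
}
```

Pushing 130 events without reading any leaves 128 in the ring, with the first two lost; a dropped counter like this one is a cheap way to find out whether that ever happens during a trial.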
|||||||||||||
|
|
||||||||||||||
|
jeroen clarysse
|
@Rainer : thx for the feedback
yeah... sigh... I know... mutexes make things a lot more complicated... I really have to decide what to do. I could force my users to use a multi-core CPU, and thread scheduling would allow me to ensure that both threads are NOT running on the same core. So context switches here will be minimal... I think (correct me if I'm wrong !!)

It is really a choice between two evils :

- use threading
  PRO : we can use SetSwapInterval(1) so we KNOW for SURE that images are displayed in sync. By using the "flag" system I described earlier, I can start my timer IMMEDIATELY after the RenderPresent has completed, so any user-related device input is synced to the end of the swap.
  CON : risk of complications due to threads
  CON : mutex code needed, which might slow down things a lot
  CON : sdl uses its own thread also to handle events (again : correct me if I'm wrong). It will take a lot of fine tuning to make sure these three threads don't interfere

- use your own loop
  PRO : threading risks avoided, mutex bottleneck solved
  CON : sdl is threaded anyway, so we are STILL based on threads !
  CON : refresh rate is not a simple constant. You can't just calculate it from a few swap() calls, I have noticed. It seems to vary a bit : not much, but if you have a 100Hz display, that implies 12000 refreshes in 2 minutes (a reasonable trial length in our experiments). If you are off by 0.05msec per refresh, you can end up with a 5 msec deviation after only 100 frames, and 50 msec after 1000 frames. So we'd have to periodically recalibrate this... but that would imply switching from SwapInterval(0) to SwapInterval(1) periodically... making things quite complicated, since I have to guarantee that this will NOT happen at a critical time in the experiment !

ideas ? I have to either sacrifice sync-accuracy, or simplicity

the only really proper solution would be to have a GetScanLine() routine like DirectDraw has... |
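The drift worry can be put into numbers with a tiny helper. This is a worst-case sketch that assumes the per-refresh error always has the same sign, so it simply adds up; real jitter would partly cancel. Both function names are invented for the example.

```c
#include <assert.h>

/* Number of refreshes that fit in a given duration. */
unsigned long refresh_count(double refresh_hz, double seconds)
{
    assert(refresh_hz > 0.0 && seconds >= 0.0);
    return (unsigned long)(refresh_hz * seconds);
}

/* Worst-case accumulated timing error if every refresh is off by the
 * same amount in the same direction. */
double accumulated_drift_ms(double error_per_refresh_ms, unsigned long refreshes)
{
    return error_per_refresh_ms * (double)refreshes;
}
```

At 100Hz, refresh_count(100.0, 120.0) is 12000 refreshes in 2 minutes. An error of 0.05 msec per refresh adds up to about 5 msec after 100 refreshes and about 50 msec after 1000, which is several whole frames on a 100Hz display, so periodic re-basing against a real swap is hard to avoid.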
|||||||||||
|
|
||||||||||||
| syncing with the monitor retrace |
|
Sik
|
Late to the discussion, but I really doubt SDL_GetTicks is going to be even remotely useful for something like this, simply because of its accuracy. In fact I think the minimum guaranteed accuracy is just 10ms, that's 1/100th of a second, which goes to show how inaccurate it is.
2013/5/6, jeroen clarysse:
|
|||||||||||||
|
|
||||||||||||||
|
jeroen clarysse
|
are you sure about that ? According to the documentation, SDL_GetTicks is in msec.. Looking in the SDL sources, it is a wrapper around gettimeofday(), which is msec also... but if you are right, there is always SDL_GetPerformanceCounter(), which should be more accurate, right ? interesting nonetheless ! |
|||||||||||||
|
|
||||||||||||||
| syncing with the monitor retrace |
|
Sik
|
SDL_GetTicks returns 1ms values, but the minimum you can expect from the OS is 10ms. Note that this depends on the OS, on some systems you will indeed get 1ms accuracy, you just can't rely on it. And yes, SDL_GetPerformance*() is definitely much more accurate, it's designed to use the high precision timers (SDL_GetTicks() uses the low precision ones).
2013/5/6, jeroen clarysse:
|
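For reference, turning two high-resolution counter readings into milliseconds is a one-liner. The sketch below assumes the readings and the frequency are 64-bit tick counts, as returned by SDL_GetPerformanceCounter() and SDL_GetPerformanceFrequency(); the helper name itself is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Elapsed time in milliseconds between two counter readings.
 * frequency is the number of counter ticks per second. */
double counter_to_ms(uint64_t start, uint64_t end, uint64_t frequency)
{
    assert(frequency > 0);
    return (double)(end - start) * 1000.0 / (double)frequency;
}
```

With a 1 MHz counter, a difference of 10000 ticks comes out as 10 msec, i.e. one refresh at 100Hz. How fine the steps really are still depends on what the platform backend uses underneath.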
|||||||||||||||
|
|
||||||||||||||||
|
jeroen clarysse
|
I double-checked the SDL2 sources, and found this for GetPerformanceCounter() :
and this for GetTicks() :
so basically, if you DON'T have have_clock_gettime enabled, GetTicks and PerfCounter derive from the same function : gettimeofday(), which according to the BSD man pages is microsecond accurate... |
|||||||||||||||
|
|
||||||||||||||||
| syncing with the monitor retrace |
|
Sik
|
You're only taking into account *nix systems...
2013/5/6, jeroen clarysse:
|
|||||||||||||
|
|
||||||||||||||
|
jeroen clarysse
|
sorry, you're right about that indeed... I assume that on Windows, SDL will use QueryPerformanceCounter(), which is also very accurate. On other systems, I have no idea really... But mac/unix & windows is probably 99% of the SDL target ? |
|||||||||||||
|
|
||||||||||||||
| syncing with the monitor retrace |
|
Sik
|
OSX and Android too.
Just looked up Windows. It *does* have the option of using QueryPerformanceCounter, but also it can be made to use GetTickCount, which is *extremely* inaccurate (in fact MSDN says it may be even more inaccurate than 10ms), so you can't rely on it unless you control the build process. Of course at this point I'd have to wonder what Windows system doesn't support QueryPerformanceCounter, given it was around back in the Windows 9x era already... Maybe removing GetTickCount support should be considered in the future? In fact, SDL_GetTicks() in itself probably could be built entirely on top of SDL functions only (no system-specific code). I know it predates the high precision timers though so that may be why it's done that way still. 2013/5/6, jeroen clarysse:
|
|||||||||||||||
|
|
||||||||||||||||
| Re: syncing with the monitor retrace |
|
jeroen clarysse
|
I don't think that SDL2 is still dependent on GetTickCount !! I just opened the project folder and searched all files for "GetTickCount". The only remaining reference is in SDL_systimer.c, inside src/timer/windows, and that one is wrapped in the preprocessor directive #ifdef USE_GETTICKCOUNT, which is not defined in any makefile. So I'm fairly sure that all timing code on all major platforms is now microsecond accurate, or at least millisecond accurate... (notwithstanding scheduler interrupts of course !) |
|||||||||||||
|
|
||||||||||||||
| syncing with the monitor retrace |
|
Sik
|
The fact that the USE_GETTICKCOUNT code is still there does imply, though, that it may still get used under some conditions, and programs using SDL shouldn't rely on SDL being built in a particular way (only that the same source code is used). Again, it's debatable why that code is still present. As I said, SDL_GetTicks doesn't even need system-specific functions, it could be done with other SDL calls only (at this point becoming just a convenience function).
2013/5/6, jeroen clarysse:
|
|||||||||||||||
|
|
||||||||||||||||
|
jeroen clarysse
|
true ! |
|||||||||||||
|
|
||||||||||||||
|
Nathaniel J Fries
|
In both solutions (threaded and threadless), event processing is restricted by calls to SDL_PumpEvents, which is limited by vsync. Unless you're drawing fairly complex scenes, you certainly don't need 10ms to render. You should have plenty of time to also properly handle events in the main thread for each frame.

And SDL event thread:
Windows - no
Linux/X11 - no
Mac OS X - no
In fact, I think BeOS might be the only system to use it (but don't quote me on that).
Probably correct. It was just an idea, which it seems he has proven wrong.
gettimeofday is actually microsecond resolution. And on Unix, SDL_GetPerformanceCounter also uses gettimeofday or clock_gettime, same as SDL_GetTicks, making accuracy comparable. On PSP and BeOS, it is literally just a proxy to SDL_GetTicks. In fact, SDL_GetPerformanceCounter is only useful on Windows, which is the only supported system that provides actual performance counters. anyway, I think the 10ms guarantee is for portability reasons. Some platforms might not actually have any timing mechanism with greater precision than 10ms. |
|||||||||||||||||||
|
|
||||||||||||||||||||

