Using SDL_atomic

Sik
To be fair, an atomic here may be enough if you just want to pass around a flag. Basically:

1) Make the flag of type SDL_atomic_t
2) Do SDL_AtomicSet(&flag, 0) and then spawn the secondary thread
3a) In the main thread, do SDL_AtomicGet(&flag) until it doesn't return 0
3b) In the secondary thread, do SDL_AtomicSet(&flag, 1) when it's done saving
4) Let the secondary thread die :P

(swap 0 and 1 if you think that makes more sense, the point stands)

Atomics are enough when you're doing stuff like signaling the status of something through a flag. If it's data that's constantly being shared among threads then a mutex is better. (A good example of this is the audio callback in SDL: changing anything that the callback uses requires using a mutex.)
_______________________________________________
SDL mailing list
http://lists.libsdl.org/listinfo.cgi/sdl-libsdl.org

Neil White
http://wiki.libsdl.org/SDL_AtomicSet needs code as well
On Sat, Feb 28, 2015 at 5:10 AM, Sik the hedgehog wrote:

eric.w
On Tue, Feb 24, 2015 at 2:46 PM, mbabuskov wrote:
Hi, I think your proposal will have a race condition like this:

    // main thread
    if (!saving) // atomic read of saving here
    {
        // Problem: saving thread might set 'saving' to true right now
        // main thread modifies world while saving thread is reading it => corrupted save
    }

To make it safe, you should use a lock/mutex. Just from skimming the SDL_Lock documentation, and assuming you want the main thread to skip the world update during saving, I'd do it like this:

    // main thread is going to update world
    if (SDL_AtomicTryLock(lock)) // if the saving thread is currently holding the lock, don't wait for it, just skip updating the world
    {
        // update world here
        SDL_AtomicUnlock(lock);
    }

    // saving thread
    SDL_AtomicLock(lock); // If the main thread is currently updating world, this will (correctly) wait until the lock is released.
    // read world state and save it
    SDL_AtomicUnlock(lock);

The thing to watch out for here is you have to be 100% sure the SDL_AtomicUnlock calls are made, i.e. watch out for exceptions / gotos / break statements / etc. If you miss unlocking for some reason, you'll get a deadlock (or else the world will stop updating).

Hope this helps.
Eric

Neil White
I'm going to attempt to update the whole wiki when someone gives me a login and a laptop (laptop tomorrow). A lot of what is missing from the pages is in the code tests and just needs Ctrl-C and Ctrl-V, unless someone has written an elaborate script.
On Sat, Feb 28, 2015 at 1:52 PM, Eric Wasylishen wrote:

john skaller
On 01/03/2015, at 12:10 AM, Sik the hedgehog wrote:
Not really. Atomic operations ensure that reads and writes of the data return a coherent value. That's all. So if you increment a variable atomically in thread 1, thread 2 will either read the old value, or it will read the new value. It won't read a messed up value even if the variable is multibyte.

A mutex is just a user defined atomic operation. So in principle it also cannot be used to "pass" a value.

The best way to synchronise threads is to use channels. [Libraries like 0MQ provide this, as does my Felix system.] Unfortunately these primitives are not available on most processors and not available in most programming languages either. So you will have to use the next best thing, which is a Posix semaphore.

Semaphores synchronise threads by allowing one thread to set some data then signal another thread it has done so. The other thread can wait for the signal.

General semaphores are of course not useful for passing data though. Posix therefore had to enhance the concept to include fencing, that is, to ensure data to be communicated at the synchronisation time are shared. Unfortunately, there's no way to know what is to be shared so EVERYTHING has to be synchronised. This is true of a Posix mutex too.

Anyhow, semaphores are the way to go with Posix. Windows doesn't have them, but there are emulations available. No idea about iOS or Android.

--
john skaller
http://felix-lang.org
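The "set some data, then signal" use of a Posix semaphore described above could be sketched like this. It assumes a platform with unnamed Posix semaphores (`sem_init`), such as Linux; `data_ready`, `shared_value`, `producer` and `wait_for_value` are hypothetical names.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t data_ready;     /* counts "data is ready" signals */
static int shared_value;     /* hypothetical data handed across threads */

/* Producer thread: set the data first, then signal the waiter. */
static void *producer(void *unused)
{
    (void)unused;
    shared_value = 42;
    sem_post(&data_ready);
    return NULL;
}

/* Waits for the producer's signal and returns the value it published,
   or -1 if any of the setup or wait calls fail. */
int wait_for_value(void)
{
    pthread_t tid;
    int result;

    if (sem_init(&data_ready, 0, 0) != 0) {
        return -1;
    }
    if (pthread_create(&tid, NULL, producer, NULL) != 0) {
        sem_destroy(&data_ready);
        return -1;
    }
    result = -1;
    for (;;) {
        if (sem_wait(&data_ready) == 0) {
            result = shared_value;  /* read only after the signal arrived */
            break;
        }
        if (errno != EINTR) {
            break;                  /* real error: give up */
        }
        /* EINTR: interrupted by a signal handler, just retry */
    }
    pthread_join(tid, NULL);
    sem_destroy(&data_ready);
    return result;
}
```

SDL2 has its own portable wrapper for this (SDL_CreateSemaphore, SDL_SemWait, SDL_SemPost), which avoids the question of which platforms ship Posix semaphores.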

Sik
2015-02-28 20:02 GMT-03:00, john skaller:
Flags are booleans though (;゚ω゚)

john skaller
On 01/03/2015, at 1:35 PM, Sik the hedgehog wrote:
That's true but the encoding could be 64 bit 0 or (int64_t)(-1) = 0xFFFFFFFFFFFFFFFF. Without an atomic operation one might set the second value but read 0xFFFFFFFF00000000 and who knows what the effect would be on the application? [Assuming the data bus is only 32 bit :]

Sik
2015-03-01 0:41 GMT-03:00, john skaller:
Yeah, but I was talking about doing it with SDL_AtomicSet/Get which *are* atomic (also IIRC the underlying type is 32-bit signed integer anyway). For a case like this a full-blown mutex is overkill, mutexes are designed for sharing around larger amounts of data rather than just a flag.

mbabuskov
I fail to see how this can create problems. The saving thread sets "saving" to true just before quitting. It won't try to read world anymore. The next time save is done, a new saving thread will be started.

But, I see other potential problems:

1. It's possible that the changed value is not propagated to the main thread quickly (or ever). As far as I researched this on the Internet, it seems that this should never happen on common modern platforms (Intel CPUs where this game will run), but it could in theory. AFAICT, using SDL_AtomicSet would fix this.

2. The saving thread code is:

    void save()
    {
        ... save stuff
        saveFlag = false;
    }

As I understand, some compiler/CPU might "optimize" the code and since saveFlag is not checked or used in the whole save() function it could reorder the code to make that statement run *before* the "save stuff" or somewhere in the middle of it. I changed the code to use SDL_AtomicSet.

    void save()
    {
        ... save stuff
        SDL_AtomicSet(&saveFlag, 0);
    }

Would that make sure that code is not reordered? Is there some chance that a compiler or CPU could reorder my SDL_AtomicSet call to run before "save stuff"?

mbabuskov
Actually, to clarify. My code looks like this now:

MAIN THREAD:

    while (!gameOver)
    {
        ...
        int isSaving = SDL_AtomicGet(&saveFlag);
        if (needsSaving && !isSaving)
        {
            needsSaving = false;
            SDL_AtomicSet(&saveFlag, 1);
            start detached save thread
        }
        if (!isSaving)
            updateDangerousStuff();
        updateOtherStuff();
    }

SAVE THREAD:

    void save()
    {
        ...save stuff to file
        SDL_AtomicSet(&saveFlag, 0);
        ...thread gets destroyed
    }

The saving thread never sets the flag to true, only to false.

john skaller
On 01/03/2015, at 9:16 PM, mbabuskov wrote:
How? Atomic write and read ensures that the individual writes of the write will not interleave with individual reads of the read. Nothing more. It makes no assurances about the ordering of any OTHER reads and writes by any threads. You need a fence for that.

So consider:

    var saving = false;
    proc game() {
      while run do
        if needs_saving do
          saving = true;
          spawn_pthread saving_thread;
        done
        if not saving do
          dangerous_stuff;
        done
        safe_stuff;
      done
    }
    proc saving_thread () {
      save_data;
      saving = false;
    }

Assuming the assignments are atomic, is this code safe? [BTW: this is real Felix code]

The answer, unfortunately, is NO. The reason is, some writes *scheduled* in dangerous_stuff might not have occurred when saving thread is launched. In fact they might propagate in pieces during the save operation.

To force the writes to occur, you need a fence before saving is set to true. This ensures the writes are completed before saving commences. Of course they will appear completed to the main thread without a fence.

Generally synchronisation primitives include a fence, or the primitives would be useless. However if you check the Posix pthread_mutex_lock specification, for example, there's no mention of it. Nor in any of the SDL_Atomic functions.

mbabuskov
Whoops, you are right. Actually, I missed one line from my code. Here's what it really looks like:

MAIN THREAD:

    while (!gameOver)
    {
        ...
        int isSaving = SDL_AtomicGet(&saveFlag);
        if (needsSaving && !isSaving)
        {
            needsSaving = false;
            isSaving = 1; // the line missing from the last message
            SDL_AtomicSet(&saveFlag, 1);
            start detached save thread
        }
        if (!isSaving)
            updateDangerousStuff();
        updateOtherStuff();
    }

SAVE THREAD:

    void save()
    {
        ...save stuff to file
        SDL_AtomicSet(&saveFlag, 0);
        ...thread gets destroyed
    }

Jared Maddox
<snip>
<snip>
John, look at this documentation page: https://wiki.libsdl.org/CategoryAtomic

Right above the Atomic Locks header it has this line: "All of the atomic operations that modify memory are full memory barriers."

This probably isn't the highest-performance way to do things (there might be something that does fences according to cache block: you'd get better performance while inside a protected region if you used a fence as rarely as possible while inside that block), but it's the best that can be done both outside of the compiler, AND without the programmer digging into the hairy details of their target processor.

john skaller
On 02/03/2015, at 6:57 AM, Jared Maddox wrote:
Ah, thanks. This comment should be repeated in the documentation for each individual function. I will fix it!

john skaller
On 02/03/2015, at 10:46 AM, john skaller wrote:
[I have started doing this but my bandwidth is so bad I can't load web pages in a reasonable time at the moment .. will have to finish it later when bandwidth improves. Sorry!]

Eirik Byrkjeflot Anonsen
john skaller writes:
To quote the documentation: "All of the atomic operations that modify memory are full memory barriers."

Though, now that I look at it, I realize that SDL_AtomicGet() does not, in fact, "modify" memory. So there might be a problem there.

And for pthread_mutex_lock(): http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11 "functions that synchronize thread execution and also synchronize memory with respect to other threads" ... includes the pthread_mutex functions.

Though of course, we have the little issue that "threads cannot be implemented as a library": http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf

In practice, this doesn't matter so much, since all current compilers will do the right thing. For obvious reasons :)

eirik

Sik
2015-03-02 14:14 GMT-03:00, Eirik Byrkjeflot Anonsen:
But SDL_AtomicSet still would, so the worst that would happen is that SDL_AtomicGet would just see the update later than expected if I understand correctly (but will eventually see it, since it's actively asking for the data in RAM). When SDL_AtomicGet sees the change it's safe to assume that whatever happened before SDL_AtomicSet made its way into RAM already.

The problem would be if you keep making changes after SDL_AtomicSet, but in that case you should ask yourself how you expect the code to work in the first place.

Eirik Byrkjeflot Anonsen
Sik the hedgehog writes:
You would think so, however:

    void protected_dangerous()
    {
        if (SDL_AtomicGet(atomic) == 0) {
            dangerous(shared_data);
        }
    }

Looks safe, right? However, if SDL_AtomicGet() is not a proper memory barrier, the compiler is allowed to rewrite this as:

    void protected_dangerous()
    {
        suitable_type tmp_shared_data = shared_data;
        if (SDL_AtomicGet(atomic) == 0) {
            dangerous(tmp_shared_data);
        }
    }

Note that in this case 'shared_data' is read before 'atomic' is tested. Thus it might end up sending a stale value to dangerous().

When the SDL documentation says "Seriously, here be dragons!", it really means it :)

eirik

john skaller
On 03/03/2015, at 7:13 AM, Eirik Byrkjeflot Anonsen wrote:
SDL_AtomicGet takes a pointer to a struct SDL_atomic_t which contains an integer field named "value". Here is the current definition:

    typedef struct { int value; } SDL_atomic_t;

I think this should read:

    typedef struct { int volatile value; } SDL_atomic_t;

That's line 189 of https://hg.libsdl.org/SDL/file/bc00287b414f/include/SDL_atomic.h

Then the "volatile" should, normally, prevent any compiler optimisations. [It's not assured by the C or C++ Standards but a compiler breaking normally expected volatile semantics would be a poor implementation .. not that that's saying much, given how recent gcc's break a lot of previously working code in the name of strict adherence to a Standard semantics for a language which is sloppy and weak .. bad idea GNU!]

I have to say, however, that these notes in the same file would be much simpler to use with explicit fences instead of crudding about with undocumented and uncertain features:

    /**
     * Memory barriers are designed to prevent reads and writes from being
     * reordered by the compiler and being seen out of order on multi-core CPUs.
     *
     * A typical pattern would be for thread A to write some data and a flag,
     * and for thread B to read the flag and get the data. In this case you
     * would insert a release barrier between writing the data and the flag,
     * guaranteeing that the data write completes no later than the flag is
     * written, and you would insert an acquire barrier between reading the
     * flag and reading the data, to ensure that all the reads associated
     * with the flag have completed.
     *
     * In this pattern you should always see a release barrier paired with
     * an acquire barrier and you should gate the data reads/writes with a
     * single flag variable.
     *
     * For more information on these semantics, take a look at the blog post:
     * http://preshing.com/20120913/acquire-and-release-semantics
     */

Sik
2015-03-02 17:13 GMT-03:00, Eirik Byrkjeflot Anonsen:
Wouldn't this be an issue with mutexes as well, actually? I mean, if the compiler can reorder around SDL_AtomicGet like that, it surely can reorder around the function that calls the mutex lock as well for the same reason. It's not like the compiler knows that the shared data is not safe to cache :P

2015-03-02 18:16 GMT-03:00, john skaller:
Except it won't. The only thing it does is guarantee that it'll write to memory and that consecutive volatile accesses will be done in said order (and now not even that thanks to processor-level reordering, it needs to be cache-through as well for that to work).

There's a very good reason why volatile doesn't work at all for multithreading. The only purpose of volatile is to access hardware ports. Anything else won't work as expected.

Bob
Just a couple things, since I pushed the atomics through acceptance and wrote the first several versions before they were completely rewritten... :-)

Just because you start another thread is no reason to believe that the thread is running. In fact, since the total number of threads running on a system is pretty much always larger than the number of cores, you can be sure that sometimes one thread will be running and one thread will not be running. You have to use locks to make sure that a thread can actually block and force another thread to run. The code as presented may never let the save thread run at all because until it runs the flag will not be set and unless you force the other thread to block once in a while it may never stop running and let the flag be set.

Probably the best test of when to use an atomic versus a mutex is how long the flag will stay in the locked state. If you are going to keep the flag set for more than a few hundred cycles then use a mutex. Yes, I said CYCLES. The time it takes to run at most a few hundred instructions in a single core.

Oh, yeah, you should never use simple assignment to communicate between threads. Like it was pointed out above, between the machine scheduling instructions out of order and the compiler moving code all over the place you can never be sure when, or if, an assignment will actually take place. In fact if you set a value like flag = true; and then later say flag = false, and do not check the value in between the two statements, the compiler may just decide to eliminate flag = true because it has no effect. If the flag is initialized to false then both statements can be removed completely, unless you tell the compiler not to do that using the volatile type modifier. A good dead code eliminator pass in the compiler can do amazing things if you let it and are not aware that it exists.

Bob Pendleton
On Mon, Mar 2, 2015 at 4:15 PM, Sik the hedgehog wrote:
--
Bob Pendleton: writer and programmer
blog: www.TheGrumpyProgrammer.com

john skaller
On 03/03/2015, at 9:15 AM, Sik the hedgehog wrote:
It's not really compiler reordering that's the problem, rather it's the CPU and cache management.

In theory, mutexes just can't work. I mean, they are specified to ensure mutual exclusion, and they will do that, but in theory that is of no value whatsoever, since it doesn't lead to any ability to share.

john skaller
On 03/03/2015, at 10:01 AM, Bob Pendleton wrote:
Thanks!

Which is of course no use for the problem you mentioned above :-) If you want to force a thread to wait for another thread you have to use a condition variable or semaphore .. that won't force the other thread to run but it will force the current one to wait UNTIL it runs (up to a particular point).

In fact for SDL the correct control structure to provide is probably a thing called a monitor. [Monitors are provided in Felix represented as pchannels]

The only other way to "force" threads to run is to use a RTOS (Real time operating system).

Sik
2015-03-02 22:33 GMT-03:00, john skaller:
Or to force a yield, which tells the OS that it's a good place to make the thread start waiting (i.e. tell the scheduler that it's OK to switch threads ahead of time). I know SDL_Sleep(0) on Windows manages to do this, but for some reason I can't get this to work on Linux (my build is configured to use the wrong API, maybe?).

I know that using SDL_Sleep may be seen as a problem by some people due to the unpredictability of sleeping (it waits at least the amount specified but can wait more, even *seconds* if it wishes), but I was messing with it and in practice it doesn't really cause problems, at least on modern Windows.

john skaller
On 03/03/2015, at 6:07 PM, Sik the hedgehog wrote:
That's useful but it still doesn't force "the other" thread to run.

john skaller
On 03/03/2015, at 6:07 PM, Sik the hedgehog wrote:
Try setting it to 1 instead of 0 (you mean SDL_Delay I assume) [The argument should be floating point but that's another issue .. :] If you really want another thread to run, you need to give it some time.

Sik
2015-03-03 9:33 GMT-03:00, john skaller:
Well, a RTOS wouldn't help here either, you'd need a single tasking system and have full control over each core (you simply have no way to tell the scheduler to move onto the other thread, it can decide to move to a different thread, or even to a different process).

To put it bluntly, you should always assume the other thread may respond much later in the future if performance is a serious issue. Incidentally this is also why the concept of critical sections exists, they tell the scheduler that it's the worst moment to switch away since other threads are waiting for a resource to be unlocked.

2015-03-03 9:41 GMT-03:00, john skaller:
Er yeah. But if I recall correctly 0 on Windows does a yield anyway (it moves onto other threads and returns as soon as the scheduler says so) and SDL doesn't seem to be filtering out the value. On Linux it's a whole different issue since first of all it depends on the underlying API (i.e. one of the two APIs it supports just does a busy loop, so even a huge delay will result in CPU hogging by the thread if SDL is using that).

Eirik Byrkjeflot Anonsen
john skaller writes:
Correct, however, see below...

Similar problems, but compiler optimizations are more likely to cause problems because they can do so much more. CPU and cache management will not eliminate "unnecessary" code or execute code speculatively. A good optimizing compiler can and will do both. And more.

A correct implementation of posix mutexes does work, since they are specified to do all the right things (mutual exclusion and full memory barriers). However, as I referred to earlier in this thread: "Threads Cannot be Implemented as a Library" (http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf). In practice, that just means that compilers have to recognize the threading code and disable any optimizations around that code that would break it. And I expect that's exactly what they do. But it does make it more likely for bugs to show up in this area, I'm sure.

This is unlike the situation with "volatile", which has been documented for a long time as not suitable for multi-threading. So I would not trust any serious optimizing compiler to avoid dangerous optimizations around those.

Of course, the situation is much better with C++11, which does provide the necessary primitives (and memory model) to write multi-threaded code.

eirik

john skaller
On 03/03/2015, at 11:57 PM, Sik the hedgehog wrote:
Sure it would. I've written one. RTOS can make hard guarantees that threads run, and run in a particular time as well. Next time you fly over a nuclear power plant .. you'd better hope both the plant and plane are controlled with critical components running on a RTOS.

john skaller
On 04/03/2015, at 2:44 AM, Eirik Byrkjeflot Anonsen wrote:
Oh, but they do (execute code speculatively). In fact all modern Intel CPUs do this. It's unlikely a compiler will do it because compilers can really only schedule a single thread of control. All they do is try to help the CPU do it. "Modern" compilers are still quite stupid, at least in part because they're compiling a language which appears to have been designed to defeat optimisation (namely, C). [People doing high performance numerical work still use Fortran ..]

I actually checked the specs and saw no mention of a memory barrier, do you happen to have a link?

Certainly not for languages like C. They don't "recognise" anything. C compilers are extremely dumb. They can barely optimise basic primitives like memset .. and when they do they break all sorts of security code (clearing passwords out of memory ...)

The situation is better with C++ because it provides much higher level constructs and a stronger type system, as well as specifically supporting threads. In addition the design is deliberate and modern (although it still has to work in a framework which is poorly structured).

Bob
Look, sharing works, threads work, because the hardware is designed to make them work. Yes, you have to use operations that are recognized by the hardware as memory barriers and in some cases the compiler also has to recognize them so it does the right thing with instruction scheduling. But, the machines do it right and the compilers do it right and if you do all the right things it works and it works very well. I first did multi-threaded code on a Univac 1108 in the early 70s (in fortran and cobol) and the basic rules have not changed. (I've also implemented threads on the 8080 and later machines :-)

And, yes, you can implement a thread package as a library, but it needs to have support from hardware and the OS.

But, learning to write multithreaded code is not easy. The natural assumptions built into the human mind about how multiple threads "should" act are completely different from the reality of how they DO act.

BTW, the main reason I wanted atomics in SDL was to implement atomic reference counting. Reference counting has its problems, but it has very nice properties for use in interactive programs. But, threads and reference counts do not mix well unless you have atomic increment and decrement.

Oh, yeah, I should not have said anything about "forcing" another thread to run. You can't do that. You can only stop your thread from running until the other thread has run.

Bob Pendleton
On Tue, Mar 3, 2015 at 3:23 PM, john skaller wrote:

Eirik Byrkjeflot Anonsen
john skaller writes:
Yes, unfortunate choice of words there :) The guarantees provided by Intel CPUs when they reorder your code are far stronger than the guarantees of the C/C++ standards, though.

It is not only likely, it is absolutely guaranteed that modern compilers will both eliminate unnecessary code and execute code speculatively. That is, a compiler will calculate values just in case they may be needed. And that means they will reorder the code in such a way that code that should not be reachable will still be executed. So code that is written like:

    if (a == 0)
        call_a_function(b);

can well be rewritten as:

    b_type tmp = b;
    if (a == 0)
        call_a_function(tmp);

if the compiler's analysis shows that this is likely to typically be faster. (And it doesn't break the guarantees of the language, of course.)

C does have features that make certain classes of optimization harder (pointers, in particular). Some of those problems can be mitigated (e.g. by using "restrict" in the places where the compiler needs that guarantee.)

True, though maybe as much from tradition as from actual advantages. But yes, classic fortran has some restrictions that are useful for these cases.

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11 "synchronize thread execution and also synchronize memory with respect to other threads."

Yes, even not-very-modern C compilers recognize common patterns and optimize them specifically. Because that is really effective. For important constructs that are known to be broken unless the compiler helps out, you can be sure the compiler authors will detect those cases and protect them where necessary.

memset is also a library function and not a basic primitive. And as you say, compilers do recognize them and optimize them. Or how about this little surprise: What do you think this code compiles to (using current gcc):

    printf("Hello world\n");

Turns out the final binary doesn't call printf at all. It is instead turned into:

    puts("Hello world");

I discovered this when my breakpoint on printf never triggered :)

Though, on second thoughts, I expect what actually happens with the threading functions is that compiler-specific barriers have been added to the code to ensure the tested compilers are unable to optimize the code to breaking.

C++11 in particular. Older versions do not have language support for threads, and so need to deal with the problems of libraries supporting threads.

eirik

Eirik Byrkjeflot Anonsen
Bob Pendleton writes:
Yes, and this is the problem. The C89 and C++03 specifications have very limited provisions for memory barriers. This is intentional, because it allows some seriously effective optimizations. Thus there is no way for a library to implement proper threading primitives while only referring to the language specifications. That's essentially what Hans Boehm says in the article.

Of course, if you are making a library for a particular version of a particular compiler, you can usually figure out ways to force it not to break your code :)

Also true. In practice, compiler vendors will work with threading library vendors to ensure that those threading libraries will fulfill their promises. Because anything else would be truly stupid.

However, if you write your own threading primitives, you run into the problem that you need those compiler-level memory barriers. Some compilers actually provide that, but pure C89 and C++03 do not.

eirik
Using SDL_atomic |
Bob
|
Ah, yes, the standards do not provide the mechanism. That is very true. The standards cannot provide the mechanism because the mechanism is always machine specific and sometimes OS specific. The standards define the basic language that you are supposed to be able to count on from system to system and machine to machine. The standards cannot define things that must be done differently for each CPU architecture or OS.

The standard does not even specify how many bits are in an int; it only specifies the minimum number of bits in an int. I've worked on machines with 18 and 36 bit ints. I saw C on a Lisp machine with arbitrary length (as large as will fit in virtual memory) ints. Thousand plus digit ints are cool. Just as SDL must live with the lowest common denominator, so must standards live with defining what can be defined.
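A small sketch of what the standard does and does not promise about int, using only <limits.h> and <assert.h>. The helper name int_storage_bits is invented for this example. The C standard guarantees that INT_MAX is at least 32767; anything beyond that is up to the implementation.

```c
#include <assert.h>
#include <limits.h>

/* The standard only sets a floor: int must be able to hold at least
 * -32767..32767. These checks pass on every conforming compiler. */
static_assert(INT_MAX >= 32767, "int must hold at least 32767");
static_assert(INT_MIN <= -32767, "int must hold at least -32767");

/* Hypothetical helper: number of bits of storage an int occupies on this
 * implementation, padding bits included. 32 is typical today, but 16, 18
 * or 36 are all permitted. */
int int_storage_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}
```

Code that needs an exact width would normally reach for the fixed-width types in <stdint.h> instead of relying on int.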
But C compilers provide extensions that make it possible to implement all sorts of things, such as thread libraries, even though the standards do not and cannot contain those features. Glad we got that straightened out. There is a huge difference between the language specified in the standard and the language that actually gets implemented.

An aside: I seriously dislike most every threading package I have ever encountered because of just one thing. They do not implement threads that work as people expect them to work. They implement them the way the OS scheduler works. Not at all the same thing. This confuses people fiercely.

Bob Pendleton

On Wed, Mar 4, 2015 at 9:36 AM, Eirik Byrkjeflot Anonsen wrote:
-- +----------------------------------------------------------- + Bob Pendleton: writer and programmer + email: + blog: www.TheGrumpyProgrammer.com |
Using SDL_atomic |
john skaller
Guest
|
On 05/03/2015, at 2:36 AM, Eirik Byrkjeflot Anonsen wrote:
The truly stupid is typically exceedingly common. -- john skaller http://felix-lang.org
Using SDL_atomic |
Eirik Byrkjeflot Anonsen
Guest
|
Bob Pendleton writes:
True when these differences are visible to the code. However, for memory coherence, all the code cares about is that it gets guarantees about some specific level of memory coherence at some specific points of source-level execution. So the compiler can hide away the details of exactly how that is accomplished. And in fact, some standards do provide such mechanisms, C++11 being the most relevant example I know of (I don't know whether C99 or C11 does). I think the reason C89 and C++03 did not provide such mechanisms was that they weren't considered important at the time. And they thought that it could be done as libraries :)
C++03 actually specifies a char to be 1 byte. Not that it helps any, as it goes on to say that it is unspecified how many bits are in a byte :) [...]
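To make the byte/bit distinction concrete, here is a small sketch in C, where the same rules apply. sizeof(char) is 1 by definition, while the number of bits in that byte is CHAR_BIT from <limits.h>, which the standard only requires to be at least 8. The helper name bits_in is made up for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Both checks hold on every conforming implementation. */
static_assert(sizeof(char) == 1, "a char is one byte by definition");
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits, possibly more");

/* Hypothetical helper: convert a size in bytes to a size in bits. */
size_t bits_in(size_t bytes)
{
    return bytes * (size_t)CHAR_BIT;
}
```

On mainstream desktop and mobile platforms CHAR_BIT is 8, but some DSPs use 16 or 32, which is why portable code asks CHAR_BIT instead of assuming.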
Important distinction :) And in the end, the main point I take away from Hans Boehm's article is that C compilers will do extremely weird things to your code. That works as expected in single-threaded code, because any temporary weird state will be cleaned up before it is observed. But in multi-threaded code, you really need explicit compiler-level memory barriers around all access to shared-memory data.
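For what it's worth, C11 did add such a mechanism in <stdatomic.h>. Below is a minimal sketch of a flag handoff with explicit release/acquire ordering, assuming a C11 compiler and POSIX threads. The names producer and run_handoff are invented here, and SDL's own atomics are deliberately not used so that the example stands alone.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int shared_data;       /* ordinary, non-atomic payload */
static atomic_int data_ready; /* flag that publishes the payload */

/* Hypothetical worker: fill in the payload, then raise the flag. */
static void *producer(void *arg)
{
    (void)arg;
    shared_data = 42;
    /* Release: the store to shared_data above cannot be moved below this. */
    atomic_store_explicit(&data_ready, 1, memory_order_release);
    return NULL;
}

/* Hypothetical driver: returns the payload as seen by the waiting thread. */
int run_handoff(void)
{
    pthread_t thread;
    int rc, seen;

    shared_data = 0;
    atomic_store(&data_ready, 0);

    rc = pthread_create(&thread, NULL, producer, NULL);
    assert(rc == 0); /* sketch only: treat failure to start as fatal */
    (void)rc;

    /* Acquire: once the flag reads 1, the payload store is visible too. */
    while (atomic_load_explicit(&data_ready, memory_order_acquire) == 0) {
        /* busy-wait; real code would sleep or do other work */
    }
    seen = shared_data;

    pthread_join(thread, NULL);
    return seen;
}
```

The release/acquire pair constrains both the compiler and the CPU, so the plain write to shared_data is guaranteed to be visible once the flag is seen. Without the atomic flag, the compiler would be free to hoist the load out of the loop.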
I'm interested. In which ways do you think these thread libraries work contrary to people's expectations? eirik
Using SDL_atomic |
Bob
|
I'm interested. In which ways do you think these thread libraries work contrary to people's expectations?

OK, that is really the subject for a long blog post. I'll try to be fairly quick here. People expect threads to run in parallel. They expect that if they have 20 threads, all 20 threads will be running at once. People think of threads as being like workers in a factory, each doing their job in parallel with all the other workers in the factory.

In reality you have N cores. That means you have at most N workers running around doing the jobs of all the workers. These workers do not just automatically stop one job and switch to another job. They only switch when they cannot keep doing the job they are doing. The key thing is that no matter how many threads you have, only a few of them will be running at one time. But people expect all threads to be running all the time. Even people who know better tend to expect that if they have N cores they should have N active threads in their code, when in fact they may well have zero cores active, or N - (any number <= N).

When you get to situations with multiple machines connected together, it gets even harder for people to understand. I once spent hours... really days, trying to explain to an EE why, when our systems were connected by a high speed parallel bus, the complete system ran slower than when connected by a low speed serial line. The difference was that the bus was polled and the serial line was interrupt driven. He never did understand why fast was slow and slow was fast... but he finally gave me an interrupt on the parallel bus. The wrong interrupt, but an interrupt which let it run almost as fast as the serial line. He never did understand that the interrupt let me queue data so that both machines ran at nearly full speed all the time, while polling forced the machines to run in lockstep.
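The factory picture can be sketched in C: 20 jobs, but only as many worker threads as there are cores, each worker picking up the next job when it finishes the last one. This assumes POSIX threads, C11 atomics and sysconf(_SC_NPROCESSORS_ONLN), which is widely available on Linux, macOS and the BSDs but is not required by POSIX. All function and macro names are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <unistd.h>

#define JOB_COUNT 20   /* the "20 threads" the programmer has in mind */
#define MAX_WORKERS 64 /* arbitrary cap for this sketch */

static atomic_int next_job;  /* index of the next unclaimed job */
static atomic_int jobs_done; /* number of jobs completed so far */

/* Hypothetical worker: keep claiming jobs until none are left. */
static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        int job = atomic_fetch_add(&next_job, 1);
        if (job >= JOB_COUNT)
            break;
        /* the real work for this job would happen here */
        atomic_fetch_add(&jobs_done, 1);
    }
    return NULL;
}

/* Hypothetical driver: one worker per online core, JOB_COUNT jobs in total.
 * Returns the number of jobs that were completed. */
int run_jobs_on_cores(void)
{
    pthread_t threads[MAX_WORKERS];
    long cores = sysconf(_SC_NPROCESSORS_ONLN); /* assumption: supported here */
    long started = 0;
    long i;

    if (cores < 1)
        cores = 1;
    if (cores > MAX_WORKERS)
        cores = MAX_WORKERS;

    atomic_store(&next_job, 0);
    atomic_store(&jobs_done, 0);

    for (i = 0; i < cores; i++) {
        if (pthread_create(&threads[started], NULL, worker, NULL) == 0)
            started++;
    }
    assert(started > 0); /* sketch only: at least one worker must start */

    for (i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return atomic_load(&jobs_done);
}
```

All 20 jobs get done either way. The difference is that the number of things running at once is stated explicitly instead of being left to the scheduler.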
Oh well, documentation and education do not make enough of a distinction between software threads, the things that thread packages deal with, and hardware threads, the real things that do the work. The lack of a one-to-one correspondence between them is very surprising to people.

The most intuitive thread package I ever used was one I wrote under DOS on a 286 lo these many years ago. It switched threads whenever a thread blocked, but it also used a timer interrupt to force switching after about a thousand instructions had been run. That kind of fine-grained scheduling "wastes" a lot of CPU time, but it made it look like every thread was always running. I did have to lock out task switching around all I/O calls though...

I base my observations on my own learning curve, my experience trying to teach the subject, and on decades of helping people on mailing lists.

Bob Pendleton

On Thu, Mar 5, 2015 at 9:55 AM, Eirik Byrkjeflot Anonsen wrote:
Using SDL_atomic |
Sik
|
2015-03-05 12:55 GMT-03:00, Eirik Byrkjeflot Anonsen:
I believe C99 explicitly states char to be exactly 8 bits (the rest of the sizes is still up to the implementation though, minimum size aside).