by Arturo Castro

corrections by Brannon Dorsey

translation by ryanaltair
In an application we sometimes need to run tasks that take a noticeable amount of time, for example reading something from the hard drive. The CPU reads from memory much faster than from disk, so loading a high resolution image stored on disk is a long task compared with the rest of the application's work.

In openFrameworks we are always working with openGL, and our application runs in a loop that repeatedly calls update/draw. If vertical sync is enabled, the screen refreshes at 60Hz, so each iteration of that loop gets only about 16ms (1s / (60frames/s) * 1000(ms/s)). Loading an image from disk usually takes longer than 16ms, so if we load the image inside update we will notice the application freeze.
To solve this we use threads. A thread is a task that runs outside the main flow of the program; it lets us execute several tasks at once without freezing the application. We can also split a big task into several smaller ones to speed it up. You can think of a thread as a subprogram of your program.

Every application has at least one thread. In openFrameworks that is the thread where setup/update/draw run; we call it the main thread (or the openGL thread). We can create more threads, and each of them runs independently.
So when we want to load an image in the middle of the app, instead of loading it in update we can create a secondary thread that does the loading. The problem is that once we create that thread, the main thread doesn't know when it will finish, so we need a way for the two threads to communicate. There's also the problem that different threads can't safely access the same memory area at the same time, so we need some mechanism to synchronize the access to that memory between threads.
First, let's see how to create a thread in openFrameworks. Every application has at least one thread, the main thread (also called the GL thread), which is the one that makes the openGL calls. Long-running tasks should go in an auxiliary thread instead. In openFrameworks we can use extra threads through the ofThread class. ofThread is not meant to be used directly; we inherit from it, implement threadedFunction with the work we need done, and then start the thread when we need it:
```cpp
class ImageLoader: public ofThread{
public:
    // inherit from ofThread
    void setup(string imagePath){
        this->path = imagePath;
    }

    void threadedFunction(){
        // the thread's actual work
        ofLoadImage(image, path);
    }

    ofPixels image;
    string path;
};

//ofApp.h
ImageLoader imgLoader;

// ofApp.cpp
void ofApp::keyPressed(int key){
    imgLoader.setup("someimage.png");
    imgLoader.startThread(); // start the thread
}
```
Once we call startThread(), ofThread creates a new thread and returns immediately. That thread runs threadedFunction until it finishes. This way the image loads while update keeps running, without interrupting it.
Now, how do we know when the image has finished loading? The thread runs independently of the main thread. As shown in the figure above, the auxiliary thread doesn't actively report its loading progress to the main thread, but once it has finished we can infer that by checking whether the thread is still running, using isThreadRunning():
```cpp
class ImageLoader: public ofThread{
public:
    void setup(string imagePath){
        this->path = imagePath;
    }

    void threadedFunction(){
        ofLoadImage(image, path);
    }

    ofPixels image;
    string path;
};

//ofApp.h
bool loading;
ImageLoader imgLoader;
ofImage img;

// ofApp.cpp
void ofApp::setup(){
    loading = false;
}

void ofApp::update(){
    if(loading==true && !imgLoader.isThreadRunning()){ // has the thread finished?
        img.getPixelsRef() = imgLoader.image;
        img.update();
        loading = false;
    }
}

void ofApp::draw(){
    if (img.isAllocated()) {
        img.draw(0, 0);
    }
}

void ofApp::keyPressed(int key){
    if(!loading){
        imgLoader.setup("someimage.png");
        loading = true;
        imgLoader.startThread();
    }
}
```
Now we know how to load one image. But what if we want to load several images at once? One possible approach is to create several threads:
```cpp
class ImageLoader: public ofThread{
public:
    ImageLoader(){
        loading = false;
        loaded = false;
    }

    void load(string imagePath){
        this->path = imagePath;
        loading = true;
        startThread();
    }

    void threadedFunction(){
        ofLoadImage(image, path);
        loaded = true;
    }

    ofPixels image;
    string path;
    bool loading;
    bool loaded;
};

//ofApp.h
vector<unique_ptr<ImageLoader>> imgLoaders; // one thread per image
vector<ofImage> imgs;                       // one ofImage per loader

// ofApp.cpp
void ofApp::update(){
    for(size_t i=0;i<imgLoaders.size();i++){ // check each loader in turn
        if(imgLoaders[i]->loaded){
            if(imgs.size()<=i) imgs.resize(i+1);
            imgs[i].getPixelsRef() = imgLoaders[i]->image;
            imgs[i].update();
            imgLoaders[i]->loaded = false;
        }
    }
}

void ofApp::draw(){
    for(size_t i=0;i<imgs.size();i++){
        imgs[i].draw(0, i*20);
    }
}

void ofApp::keyPressed(int key){
    imgLoaders.push_back(unique_ptr<ImageLoader>(new ImageLoader));
    imgLoaders.back()->load("someimage.png");
}
```
Another approach uses only one thread with a queue inside it, so the thread loads the images one by one. From the main thread we add image paths to the queue, and the threadedFunction() of the auxiliary thread checks whether there are new images to load; if there are, it takes the next path off the queue and loads that image.
The problem with this is that we would be accessing the queue from 2 different threads, and as we've mentioned in the memory chapter, when we add or remove elements in a memory structure there's the possibility that the whole structure gets moved somewhere else in memory. If that happens while the other thread is trying to access it, we can easily end up with a dangling pointer that will crash the application. Imagine the following sequence of instructions in the 2 different threads:
```
loader thread: finished loading an image
loader thread: pos = get memory address of next element to load
main thread:   add new element in the queue
main thread:   the queue moves in memory to an area with enough space to allocate it
loader thread: try to read element at pos <- crash: pos is no longer a valid memory address
```
Here the loader thread and the main thread access the same memory address, pos, but with no coordination of the order of those accesses the system kills the application with a segmentation fault: the main thread adds new data to the queue, the queue runs out of space, so the system allocates a new memory area for it and frees the old one; the loader thread, still holding the address of the old, now freed, memory area, accesses it and the application crashes.
To make access to the same memory from different threads orderly, we need a lock: while one thread holds the lock and operates on the data, no other thread can touch it. This kind of lock is what C++ calls a mutex, wrapped in openFrameworks as ofMutex.
Before looking at mutexes, though, we need to know a bit about threads and openGL. You may have noticed in the earlier examples:
```cpp
class ImageLoader: public ofThread{
public:
    ImageLoader(){
        loaded = false;
    }

    void setup(string imagePath){
        this->path = imagePath;
    }

    void threadedFunction(){
        ofLoadImage(image, path);
        loaded = true;
    }

    ofPixels image;
    string path;
    bool loaded;
};

//ofApp.h
ImageLoader imgLoader;
ofImage img;

// ofApp.cpp
void ofApp::update(){
    if(imgLoader.loaded){
        img.getPixelsRef() = imgLoader.image;
        img.update();
        imgLoader.loaded = false;
    }
}

void ofApp::draw(){
    if (img.isAllocated()) {
        img.draw(0, 0);
    }
}

void ofApp::keyPressed(int key){
    if(!imgLoader.isThreadRunning()){
        imgLoader.setup("someimage.png");
        imgLoader.startThread();
    }
}
```
In the auxiliary thread we load the image into an ofPixels rather than an ofImage, and only back in the main thread do we copy those pixels into an ofImage. That's because openGL can, in principle, only work with one thread, which is why we call the main thread the GL thread. As mentioned in the advanced graphics chapter and elsewhere in this book, openGL works asynchronously in a client/server model: our application is the client that sends data and drawing instructions to the openGL server, which sends them to the graphics card in its own time. Because of this, openGL works well with only one thread, the main thread. Trying to make openGL calls from a different thread will usually crash the application, or at the very least not show what we expect.
When we call img.loadImage(path) on an ofImage, it actually makes openGL calls to create a texture for the image. If we do that from a non-GL thread, the application will crash, or the texture won't show correctly. However, we can tell an ofImage (and most other objects in openFrameworks that contain both pixels and a texture) not to use a texture, so that it makes no openGL calls and can be used from a non-GL thread, like this:
```cpp
class ImageLoader: public ofThread{
public:
    ImageLoader(){
        loaded = false;
    }

    void setup(string imagePath){
        image.setUseTexture(false); // don't create a texture
        this->path = imagePath;
    }

    void threadedFunction(){
        image.loadImage(path);
        loaded = true;
    }

    ofImage image;
    string path;
    bool loaded;
};

//ofApp.h
ImageLoader imgLoader;

// ofApp.cpp
void ofApp::update(){
    if(imgLoader.loaded){
        imgLoader.image.setUseTexture(true);
        imgLoader.image.update();
        imgLoader.loaded = false;
    }
}

void ofApp::draw(){
    if (imgLoader.image.isAllocated()){
        imgLoader.image.draw(0,0);
    }
}

void ofApp::keyPressed(int key){
    if(!imgLoader.isThreadRunning()){
        imgLoader.setup("someimage.png");
        imgLoader.startThread();
    }
}
```
There are ways to safely use openGL from different threads, for example creating a shared context to upload textures from a second thread, or using PBOs to map a memory area, but those are outside the scope of this chapter. In general, remember that accessing openGL outside of the GL thread is not safe. In openFrameworks you should only do operations that involve openGL calls from the main thread, that is, from the calls that happen in the setup/update/draw loop, the key and mouse events, and the related ofEvents. If you start a thread and call a function or notify an ofEvent from it, that call will also happen in the auxiliary thread, so be careful not to make any GL calls from there.
Something else to be aware of is sound: ofSoundStream creates its own threads to keep sound output accurate, so the same precautions apply when working with it.
Before we started the openGL and threads section we were talking about how accessing the same memory area from 2 different threads can cause problems. This mostly occurs if we write from one of the threads, causing the data structure to move in memory or making a memory location invalid. To avoid that we need something that allows only one thread at a time to access that data. For that we use something called a mutex. When one thread wants to access the shared data, it locks the mutex, and while a mutex is locked any other thread trying to lock it will block there until the mutex is unlocked again. You can think of this as a kind of token that each thread needs to hold in order to access the shared memory.
Imagine you are with a group of people building a tower of cards. If more than one person tries to put cards on it at the same time, it's very likely to collapse, so to avoid that, anyone who wants to put a card on the tower needs to hold a small stone. The stone gives them permission to add cards, and there's only one stone, so whoever wants to add cards needs to get it first; if someone else has the stone, they have to wait until it's freed. If several people want to add cards while the stone is taken, they queue up, and the first one in the queue gets the stone when it's finally freed.
A mutex is something like that: to get the stone you call lock on the mutex, and once you are done, you call unlock. If another thread calls lock while the mutex is held, it is put into a queue; the first thread that called lock will get the mutex when it's finally unlocked:
```
thread 1: lock mutex
thread 1: pos = access memory to get position to write
thread 2: lock mutex <- thread 2 now stops its execution until thread 1 unlocks, so better be quick
thread 1: write to pos
thread 1: unlock mutex
thread 2: read memory
thread 2: unlock mutex
```
When we lock a mutex from one thread and another thread tries to lock it, the second thread's execution stops there. For this reason we should do only fast operations while the mutex is locked, so that we don't stop the execution of the main thread for too long. In openFrameworks, the ofMutex class provides this kind of lock. The syntax for the previous sequence would be something like:
```
thread 1: mutex.lock();
thread 1: vec.push_back(something);
thread 2: mutex.lock(); // thread 2 now stops its execution until thread 1 unlocks, so better be quick
thread 1: // end of push_back()
thread 1: mutex.unlock();
thread 2: somevariable = vec[i];
thread 2: mutex.unlock();
```
We just need to call lock() and unlock() on our ofMutex from the different threads, that is, from threadedFunction and from the update/draw loop, whenever we want to access a piece of shared memory. ofThread actually contains an ofMutex that can be locked using lock()/unlock(); we can use it like:
```cpp
class NumberGenerator: public ofThread{
public:
    void threadedFunction(){
        while (isThreadRunning()){
            lock();
            numbers.push_back(ofRandom(0,1000));
            unlock();
            ofSleepMillis(1000);
        }
    }

    deque<int> numbers; // a deque, so we can pop from the front
};

// ofApp.h
NumberGenerator numberGenerator;

// ofApp.cpp
void ofApp::setup(){
    numberGenerator.startThread();
}

void ofApp::update(){
    numberGenerator.lock();
    while(!numberGenerator.numbers.empty()){
        cout << numberGenerator.numbers.front() << endl;
        numberGenerator.numbers.pop_front();
    }
    numberGenerator.unlock();
}
```
As we've said before, when we lock a mutex we stop other threads from accessing the shared data. It is important to keep the locked time as short as possible, or else we'll end up stopping the main thread anyway, making the use of threads pointless.
Sometimes we don't have a thread that we've created ourselves; instead we are using a library that creates its own thread and calls back into our application through a callback. Let's see an example with an imaginary video library that calls some function whenever there's a new frame from the camera. That kind of function is called a callback because the library calls us back when something happens; the key and mouse event functions in OF are examples of callbacks.
```cpp
class VideoRenderer{
public:
    void setup(){
        pixels.allocate(640,480,3);
        texture.allocate(640,480,GL_RGB);
        videoLibrary::setCallback(this, &VideoRenderer::frameCB);
        videoLibrary::startCapture(640,480,"RGB");
    }

    void update(){
        if(newFrame){
            texture.loadData(pixels);
            newFrame = false;
        }
    }

    void draw(float x, float y){
        texture.draw(x,y);
    }

    void frameCB(unsigned char * frame, int w, int h){
        pixels.setFromPixels(frame,w,h,3);
        newFrame = true;
    }

    ofPixels pixels;
    bool newFrame;
    ofTexture texture;
};
```
Here, even though we don't use a mutex, our application won't crash. That is because the memory in pixels is preallocated in setup and its size never changes, so the memory never moves from its original location. The problem is that the update and frameCB functions might run at the same time, so we will probably end up seeing tearing. Tearing is the same kind of artifact we see when we draw to the screen with vertical sync disabled. To avoid tearing we might want to use a mutex:
```cpp
class VideoRenderer{
public:
    void setup(){
        pixels.allocate(640,480,3);
        texture.allocate(640,480,GL_RGB);
        videoLibrary::setCallback(this, &VideoRenderer::frameCB);
        videoLibrary::startCapture(640,480,"RGB");
    }

    void update(){
        mutex.lock();
        if(newFrame){
            texture.loadData(pixels);
            newFrame = false;
        }
        mutex.unlock();
    }

    void draw(float x, float y){
        texture.draw(x,y);
    }

    void frameCB(unsigned char * frame, int w, int h){
        mutex.lock();
        pixels.setFromPixels(frame,w,h,3);
        newFrame = true;
        mutex.unlock();
    }

    ofPixels pixels;
    bool newFrame;
    ofTexture texture;
    ofMutex mutex;
};
```
That will solve the tearing, but we are stopping the main thread while frameCB is updating the pixels, and stopping the camera thread while the main one is uploading the texture. For small images this is usually ok, but for bigger images we could lose some frames. A possible solution is to use a technique called double or even triple buffering:
```cpp
class VideoRenderer{
public:
    void setup(){
        pixelsBack.allocate(640,480,3);
        pixelsFront.allocate(640,480,3);
        texture.allocate(640,480,GL_RGB);
        videoLibrary::setCallback(this, &VideoRenderer::frameCB);
        videoLibrary::startCapture(640,480,"RGB");
    }

    void update(){
        bool wasNewFrame = false;
        mutex.lock();
        if(newFrame){
            swap(pixelsFront,pixelsBack);
            newFrame = false;
            wasNewFrame = true;
        }
        mutex.unlock();

        if(wasNewFrame) texture.loadData(pixelsFront);
    }

    void draw(float x, float y){
        texture.draw(x,y);
    }

    void frameCB(unsigned char * frame, int w, int h){
        pixelsBack.setFromPixels(frame,w,h,3);
        mutex.lock();
        newFrame = true;
        mutex.unlock();
    }

    ofPixels pixelsFront, pixelsBack;
    bool newFrame;
    ofTexture texture;
    ofMutex mutex;
};
```
With this we lock the mutex for a very short time in the frame callback, just to set newFrame = true, and in the main thread just to check whether there's a new frame and swap the front and back buffers. swap is a C++ standard library function that swaps the contents of two variables: if we swap two ints a and b, a ends up with the value of b and vice versa. Usually that happens by copying the variables, but swap is overloaded for ofPixels so that it exchanges the internal memory pointers inside pixelsFront and pixelsBack to point to one another. After calling swap, pixelsFront points to what pixelsBack pointed to before, and vice versa. This operation only involves copying a couple of memory addresses plus the size and number of channels, so it's much faster than copying the whole image or uploading it to a texture.
Triple buffering is a similar technique that uses 3 buffers instead of 2 and is useful in some cases. We won't cover it in this chapter.
Sometimes we need to keep a lock held until a function returns, or for the duration of a full block. That is exactly what a scoped lock does. If you've read the memory chapter, you probably remember what we initially called stack semantics, or RAII (Resource Acquisition Is Initialization). A scoped lock uses that technique to lock a mutex for the whole duration of a block, including any copy that might happen in a return call at its end. For example, the previous example could be turned into:
```cpp
class VideoRenderer{
public:
    void setup(){
        pixelsBack.allocate(640,480,3);
        pixelsFront.allocate(640,480,3);
        texture.allocate(640,480,GL_RGB);
        videoLibrary::setCallback(this, &VideoRenderer::frameCB);
        videoLibrary::startCapture(640,480,"RGB");
    }

    void update(){
        bool wasNewFrame = false;

        {
            ofScopedLock lock(mutex);
            if(newFrame){
                swap(pixelsFront,pixelsBack);
                newFrame = false;
                wasNewFrame = true;
            }
        } // the lock is released here

        if(wasNewFrame) texture.loadData(pixelsFront);
    }

    void draw(float x, float y){
        texture.draw(x,y);
    }

    void frameCB(unsigned char * frame, int w, int h){
        pixelsBack.setFromPixels(frame,w,h,3);
        ofScopedLock lock(mutex);
        newFrame = true;
    }

    ofPixels pixelsFront, pixelsBack;
    bool newFrame;
    ofTexture texture;
    ofMutex mutex;
};
```
A scoped lock is a good way of avoiding problems caused by forgetting to unlock a mutex, and it lets us use a {} block to define the duration of the lock, which is more natural in C++.
There's one particular case where the only way to lock properly is with a scoped lock: when we want to return a value and keep the function locked until after the value has been returned. In that case we can't use a normal lock:
```cpp
ofPixels accessSomeSharedData(){
    ofScopedLock lock(mutex);
    return modifiedPixels(pixels);
}
```
We could make a copy internally and return that later, but with this pattern we avoid a copy and the syntax is shorter.
A condition, in thread terminology, is an object that allows us to synchronize 2 threads. The pattern is something like this: one thread waits for something to happen before starting its processing; when it finishes, instead of ending the thread, it waits on the condition until there's new data to process.
An example of this could be the image loader class we were working with earlier. Instead of starting one thread for every image, we might keep a queue of images to load. The main thread adds image paths to that queue, and the auxiliary thread loads images from it until it is empty; the auxiliary thread then waits on a condition until there are more images to load. Such an example would be too long to write in this section, but if you are interested in how something like that might work, take a look at ofxThreadedImageLoader, which does just that. Instead, let's see a simpler example: imagine a class where we can push urls that get pinged from a different thread. Something like:
```cpp
class ThreadedHTTPPing: public ofThread{
public:
    void pingServer(string url){
        mutex.lock();
        queueUrls.push(url);
        mutex.unlock();
    }

    void threadedFunction(){
        while(isThreadRunning()){
            mutex.lock();
            string url;
            if(!queueUrls.empty()){
                url = queueUrls.front();
                queueUrls.pop();
            }
            mutex.unlock();
            if(url != ""){
                ofHttpUrlLoad(url);
            }
        }
    }

private:
    queue<string> queueUrls;
};
```
The problem with that example is that the auxiliary thread keeps spinning in a loop as fast as possible, consuming a whole CPU core, which is not a very good idea. A typical solution is to sleep for a while at the end of each cycle, like:
```cpp
class ThreadedHTTPPing: public ofThread{
public:
    void pingServer(string url){
        mutex.lock();
        queueUrls.push(url);
        mutex.unlock();
    }

    void threadedFunction(){
        while(isThreadRunning()){
            mutex.lock();
            string url;
            if(!queueUrls.empty()){
                url = queueUrls.front();
                queueUrls.pop();
            }
            mutex.unlock();
            if(url != ""){
                ofHttpUrlLoad(url);
            }
            ofSleepMillis(100);
        }
    }

private:
    queue<string> queueUrls;
};
```
That alleviates the problem slightly, but not completely. The thread won't consume as much CPU now, but it sleeps for an unnecessary while when there are still urls to load, and it keeps waking up in the background even when there are no more urls to ping. Especially on small battery-powered devices like a phone, this pattern can drain the battery in a few hours. The best solution to this problem is to use a condition:
```cpp
class ThreadedHTTPPing: public ofThread{
public:
    void pingServer(string url){
        mutex.lock();
        queueUrls.push(url);
        condition.signal();
        mutex.unlock();
    }

    void threadedFunction(){
        while(isThreadRunning()){
            mutex.lock();
            if (queueUrls.empty()){
                condition.wait(mutex);
            }
            string url = queueUrls.front();
            queueUrls.pop();
            mutex.unlock();

            ofHttpUrlLoad(url);
        }
    }

private:
    Poco::Condition condition;
    queue<string> queueUrls;
};
```
Before we call condition.wait(mutex), the mutex needs to be locked. The condition then unlocks the mutex and blocks the execution of that thread until condition.signal() is called. When the condition wakes the thread because it's been signaled, it locks the mutex again and continues execution. We can then read the queue without problems, because we know the other thread can't access it. We copy the next url to ping and unlock the mutex to keep the locked time to a minimum; then, outside the lock, we ping the server and start the process again. Whenever the queue is emptied, the condition blocks the thread's execution instead of letting it run in the background.
As we've seen, threads are a powerful tool that lets several tasks happen simultaneously in the same application. They are also hard to use; the main difficulty is usually accessing shared resources, most often shared memory. We've only seen one specific case: using threads to run background tasks that would otherwise pause the execution of the main task. There are other cases where we can parallelize a single task by dividing it into small subtasks, for example doing some image operation by dividing the image into subregions and assigning a thread to each. For those cases there are special libraries that make the syntax easier: OpenCv, for example, can run some operations on more than one core through TBB, and libraries like TBB itself or OpenMP allow us to specify that a loop should be divided and run simultaneously on more than one core.