Archive

Posts Tagged ‘posix’

Concurrency, when several threads fight for the access to a resource [ example in C ]

January 30, 2014 No comments

If we’re creating a multi-thread application and we’re also sharing information between the main thread and the secondary thread, or between threads, you must have in mind the type of access to that information.
For example, if we will only allow one thread to write on a variable and the other will just read we won’t have any problem in most cases, but if any thread can write a value at any time, we must be careful. If some threads are willing to write a variable at almost the same time, only the last value written will remain.

Another example, we have a film collection software, and at the moment we have 50 films stored. Another thread is going to synchronize an Internet server, but while the synchronization is running, we add 3 films more. The thread synchronizing may see 50 films, but the films sent can be a mix of the old and new films, so the server will think we’ve removed some films and we will have a problem.
In this case, we must protect the access to the critic section (our film list), so when we are adding data, the other thread can’t sync, and when the other thread is syncing, we must wait before adding anything. We will use for that mutual exclusion or mutex.

To try to make a visible example, we’re incrementing numbers, but we will insert a CPU eater task between the read of the value and the writing of the new value. The CPU-eater task can finish in a variable time interval, so one threads will finish this task before others. The desired result is the number incrementing to 10, but the real one differs a bit:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>

struct thread_vars_t
{
  int number;
};

int numberExists(int arr[], int current)
{
  int i;

  for (i=0; i<current; ++i)
    {
      if (arr[current]==arr[i])
    return 1;
    }

  return 0;
}

void numberSearch()
{
  /* A time taking task */
  int numbers[100];
  int i;

  for (i=0; i<100; ++i)
    {
      numbers[i] = rand()%101;
      while (numberExists(numbers, i))
    numbers[i] = rand()%101;
    }
}


void *newtask(void *_number)
{
  struct thread_vars_t *vars = _number;

  int number = vars->number;
  numberSearch();
  vars->number = number+1;

  printf ("THREAD: number = %d\n", vars->number);

  pthread_exit(NULL);
}

int main (int argc, char *argv[])
{
   pthread_t thread;
   int rc;
   int i;
   struct thread_vars_t *vars = malloc (sizeof(struct thread_vars_t));

   vars->number = 0;

   printf ("Main process just started.\n");
   for (i=0; i<10; ++i)
     {
       rc = pthread_create(&thread, NULL, newtask, vars);
       if (rc)
     {
       printf("ERROR in pthread_create(): %d\n", rc);
       exit(-1);
     }
     }

   printf ("Main process about to finish.\n");
   /* Last thing that main() should do */
   pthread_exit(NULL);
}

Your result can be more or less like that:

$ ./sharedvar
Main process just started.
THREAD : number = 1
THREAD : number = 1
THREAD : number = 2
THREAD : number = 2
THREAD : number = 3
Main process about to finish.
THREAD : number = 4
THREAD : number = 4
THREAD : number = 3
THREAD : number = 4
THREAD : number = 4

What has happened? Some threads read the variable when it was 0 (two occasions), so they both incremented to 1, others read the value when it was 1 (another two ones), and incremented to 2, in other cases, the variable was 2 and was incremented to 3…
So, several threads read the same value and when writing the new value, we didn’t have in mind the value could have changed by another thread while we were working. That is the race condition.

How can we fix that? The solution is coding structures that block access to the resource when it’s being used. For example, it some other thread has read the value of the variable, no other can, until a new value is written.
Do we lose performance? Yes, a bit, because we are waiting for other tasks instead of working together. But we avoid undesirable situations like the example before. But the threads may do also some other things outside the critical section, and this work can be done simultaneously. We will only block the critical section (when working on a number), when we will block other threads with a mutex.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>

struct thread_vars_t
{
  int number;
  pthread_mutex_t mutex;
};

int numberExists(int arr[], int current)
{
  int i;

  for (i=0; i<current; ++i)
    {
      if (arr[current]==arr[i])
    return 1;
    }

  return 0;
}

void numberSearch()
{
  /* A time taking task */
  int numbers[100];
  int i;

  for (i=0; i<100; ++i)
    {
      numbers[i] = rand()%101;
      while (numberExists(numbers, i))
    numbers[i] = rand()%101;
    }
}


void *newtask(void *_number)
{
  struct thread_vars_t *vars = _number;

  /* BLOCK */
  pthread_mutex_lock(&vars->mutex);
  /* BLOCK */

  int number = vars->number;
  numberSearch();
  vars->number = number+1;

  /* UNBLOCK */
  pthread_mutex_unlock(&vars->mutex);
  /* UNBLOCK */

  printf ("THREAD: number = %d\n", vars->number);

  pthread_exit(NULL);
}

int main (int argc, char *argv[])
{
   pthread_t thread;
   int rc;
   int i;
   struct thread_vars_t *vars = malloc (sizeof(struct thread_vars_t));

   pthread_mutex_init(&vars->mutex, NULL);
   vars->number = 0;

   printf ("Main process just started.\n");
   for (i=0; i<10; ++i)
     {
       rc = pthread_create(&thread, NULL, newtask, vars);
       if (rc)
     {
       printf("ERROR in pthread_create(): %d\n", rc);
       exit(-1);
     }
     }

   printf ("Main process about to finish.\n");
   /* Last thing that main() should do */
   pthread_exit(NULL);
}

And the result will be like this:

$ ./simplemutex
Main process just started.
THREAD: number = 1
Main process about to finish.
THREAD: number = 2
THREAD: number = 3
THREAD: number = 4
THREAD: number = 5
THREAD: number = 6
THREAD: number = 7
THREAD: number = 8
THREAD: number = 9
THREAD: number = 10

Photo: Daryl L. Hunter (Flickr) CC-by

Concurrency, POSIX threads and shared variables in C

January 14, 2014 No comments

A few days ago, we talked about building our first multi-thread programs using POSIX threads. But soon a need arises: share data between the main thread and the recently launched thread, at least to indicate what we want it to do.
So we can use the last argument of pthread_create(), this is a void pointer we can use for whatever we want and pass the variable type we want. For example an integer:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *newtask(void *_number)
{
  int number = *(int*)_number;
  printf("The number I was asked for: %d\n", number);
  pthread_exit(NULL);
}

int main (int argc, char *argv[])
{
   pthread_t thread;
   int rc;
   int i;
   int number = 99;

   printf ("Main process just started.\n");
   rc = pthread_create(&thread, NULL, newtask, &number);
   if (rc)
     {
       printf("ERROR in pthread_create(): %d\n", rc);
       exit(-1);
     }

   printf ("Main process about to finish.\n");
   /* Last thing that main() should do */
   pthread_exit(NULL);
}

Remember to compile using pthread:

$ gcc -o simplepass simplepass.c -lpthread

This way, the thread can read the value we’ve passed him, as we can see if we run this little program. Just after starting newtask() we extract the number from the void pointer to an integer variable. But working a bit on this code we can realize this variable can be bi-directional, in other words, we can write a value from the secondary thread, and the main thread can read it. (Instead of doing this, we can use global variables, but I don’t like it much):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *newtask(void *_number)
{
  int *number = _number;
  printf("The number I was asked for: %d\n", *number);
  *number = 10;
  pthread_exit(NULL);
}

int main (int argc, char *argv[])
{
   pthread_t thread;
   int rc;
   int i;
   int number = 99;

   printf ("Main process just started.\n");
   rc = pthread_create(&thread, NULL, newtask, &number);
   if (rc)
     {
       printf("ERROR in pthread_create(): %d\n", rc);
       exit(-1);
     }

   usleep(100000);

   printf (" Before exiting. Number = %d\n", number);

   printf ("Main process about to finish.\n");
   /* Last thing that main() should do */
   pthread_exit(NULL);
}

Now, if we want to pass multiple arguments to the secondary thread, we can create a struct which stores all variables we want to use there:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>

struct thread_vars_t
{
  int number;
  char string[50];
};

void *newtask(void *_number)
{
  struct thread_vars_t *vars = _number;

  printf("The number I was asked for: %d\n", vars->number);
  vars->number = 10;
  strcpy(vars->string, "Hello world from thread");
  pthread_exit(NULL);
}

int main (int argc, char *argv[])
{
   pthread_t thread;
   int rc;
   int i;
   struct thread_vars_t *vars = malloc (sizeof(struct thread_vars_t));


   vars->number = 99;

   printf ("Main process just started.\n");
   rc = pthread_create(&thread, NULL, newtask, vars);
   if (rc)
     {
       printf("ERROR in pthread_create(): %d\n", rc);
       exit(-1);
     }

   usleep(100000);

   printf (" Before exiting. Number = %d\n", vars->number);
   printf (" Before exiting. String = %s\n", vars->string);

   printf ("Main process about to finish.\n");
   /* Last thing that main() should do */
   pthread_exit(NULL);
}

In older and slower computers, we may have to cary the value passed to usleep(), I’m just simulating a wait, it could be the time taken by additional computation. But using a sleep could be a problem, depending on the computer capabilities this value must change, or we could give a high value to make it compatible with more computers, but it could be a waste of time for modern computers. We can solve it many ways, but this first one will be a demo of what you shouldn’t do making an active wait:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>

struct thread_vars_t
{
  int ready;
  int number;
  char string[50];
};

void *newtask(void *_number)
{
  struct thread_vars_t *vars = _number;

  printf("The number I was asked for: %d\n", vars->number);
  vars->number = 10;
  strcpy(vars->string, "Hello world from thread");
  vars->ready = 1;
  pthread_exit(NULL);
}

int main (int argc, char *argv[])
{
   pthread_t thread;
   int rc;
   int i;
   struct thread_vars_t *vars = malloc (sizeof(struct thread_vars_t));

   vars->ready = 0;
   vars->number = 99;

   printf ("Main process just started.\n");
   rc = pthread_create(&thread, NULL, newtask, vars);
   if (rc)
     {
       printf("ERROR in pthread_create(): %d\n", rc);
       exit(-1);
     }

   //   usleep(100000);
   while (!vars->ready);
   printf (" Before exiting. Number = %d\n", vars->number);
   printf (" Before exiting. String = %s\n", vars->string);

   printf ("Main process about to finish.\n");
   /* Last thing that main() should do */
   pthread_exit(NULL);
}

In this case, we have a new variable (vars->ready), when it’s 1 it means the values are set by the secondary thread, so the main one can read them. The main thread, to wait for vars->ready is asking all the time for this value while it’s 0:

1
while (!vars->ready);

But it has a great disadvantage: the process is eating CPU while the condition is not met. We can see it clearer writing sleep(20) just before the secondary thread set ready to 1, and use a task manager to see how CPU use is going, our process will use a big percentage of CPU for nothing, just waiting. If we think about it and the secondary thread is doing any type of complex calculation, the main thread will be stealing CPU time obstructing the secondary one, making it slower, so we have to use another type of solution, like semaphores, signals, mutex, etc. But I’ll talk about them in another post.
Photo: OakleyOriginals (Flickr) CC-by

Starting with concurrency and a “Hello world” using pthreads

December 31, 2013 No comments


Nowadays is very common to see dual-core or quad-core CPUs, but there are still systems with one single core.
In this post, I want to about the very basis of concurrency, a brief introduction to this world, so some descriptions will be so so simple, but real world is a little bit more complex.

In multi-core or multi-CPU systems, the operating system (OS) distributes tasks within our CPUs, and we can run several tasks at the same time, but we have been enjoying multitasking for decades, long before multi-core appeared, so, how is it possible? Let’s think about single-core systems, they can only perform one task, but the operating system distributes the CPU control along all tasks very fast so it makes us think all of them are running at the same time. So single-core CPUs usually have one execution thread, dual-core system, will have two execution threads, so they can perform two tasks at the same time. But don’t worry about how it is distributed, because the operating system does it for you.

We’ve seen the OS is smart enough to take advantage of our CPUs or CPU cores, so does it have any sense to make our application use several cores, or make our application run two things at the same time?
Of course it does, let’s see some examples:

  • Imagine a GUI program performing a heavy task. While this task is running, we may want to give the user some little control to cancel the process, for example, or to carry on using our software. We can run the heavy task in one thread and the GUI in another one.
  • A server which attends several clients simultaneously.
  • A program which does intense calculations, if we have dual-core, it can be twice faster

In the last case, we must think about something a little bit. Distributing tasks in time takes some time for the OS, so we must make sure the time we save (separating the process in several threads) is bigger than the time the OS wastes managing the new tasks and sometimes we must put all the results of all individual threads together (and it takes time too).
We may think it will be valid only for multi-core processors but the time taken by our tasks may vary depending on what we are doing, for example, calculations take all CPU Time, but if we access external devices (keyboard, screen, hard disk, USB, etc), even access a file, it makes our process wait for the data and while we are waiting, the CPU is wasting time, so we can take this time to perform calculations for some other process, and it is valid even on single-core CPUs. Sometimes running a tasks in two threads can squeeze our CPU better than one thread using CPU while the other thread is just waiting for data an vice-versa.

Ok, before playing a little bit with it we must know there are some ways to do that. We can create several processes, it’s like running some programs at the same time, and the other way is creating threads. The difference is that processes must store in RAM memory, data and stack among other things, and some threads can share code and data, so only stack and a little information about execution status belongs to each thread exclusively. We can say process initiates also a single execution thread.

Let’s code a bit, we will use pthreads (POSIX threads):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *newtask(void *null)
{
   printf("Hello world! I'm a new thread\n");
   sleep(1);
   printf("Goodbye world! I'm a thread about to die\n");
   pthread_exit(NULL);
}

int main (int argc, char *argv[])
{
   pthread_t thread;
   int rc;

   printf ("Main process just started.\n");
   rc = pthread_create(&thread, NULL, newtask, NULL);
   if (rc)
     {
       printf("ERROR in pthread_create(): %d\n", rc);
       exit(-1);
     }

   printf ("Main process about to finish.\n");
   /* Last thing that main() should do */
   pthread_exit(NULL);
}

This example can be compiled including pthread library, this way:

$ gcc -o onethread onethread.c -lpthread

We are just executing a program that creates another thread that performs a task, this task is writing a couple of messages on screen and wait a bit between them. We can see a result like this:

Main process just started.
Main process about to finish.
Hello world! I’m a new thread
[ wait for a second ]
Goodbye world! I’m a thread about to die

The first two messages belongs to the main thread, one when it stats and the other one when it ends, but before ending we are invoking the other thread that writes the other two messages. As we can see, each thread is executed independently, but to see it a bit more clearer, let’s see the next code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *newtask(void *null)
{
   printf("Hello world! I'm a new thread\n");
   int i;

   for (i=0; i<10;++i)
     {
       printf (" THREAD: %d\n", i);
       usleep(60);
     }
   printf("Goodbye world! I'm a thread about to die\n");
   pthread_exit(NULL);
}

int main (int argc, char *argv[])
{
   pthread_t thread;
   int rc;
   int i;

   printf ("Main process just started.\n");
   rc = pthread_create(&thread, NULL, newtask, NULL);
   if (rc)
     {
       printf("ERROR in pthread_create(): %d\n", rc);
       exit(-1);
     }

   for (i=0; i<10;++i)
     {
       usleep(100);
       printf (" MAIN: %d\n", i);
     }
   printf ("Main process about to finish.\n");
   /* Last thing that main() should do */
   pthread_exit(NULL);
}

What we’ve done is that the main thread (MAIN) and secondary thread (THREAD) will write on screen several messages inside a for loop. They will be counting from 0 to 9, but waiting a little bit between each writing (instead of doing heavier time eater tasks we simulate them with sleeps).

We can see a execution result like this:

Main process just started.
Hello world! I’m a new thread
THREAD: 0
THREAD: 1
MAIN: 0
THREAD: 2
MAIN: 1
THREAD: 3
THREAD: 4
MAIN: 2
THREAD: 5
MAIN: 3
THREAD: 6
MAIN: 4
THREAD: 7
THREAD: 8
MAIN: 5
THREAD: 9
MAIN: 6
Goodbye world! I’m a thread about to die
MAIN: 7
MAIN: 8
MAIN: 9
Main process about to finish.

Messages are written alternatively both MAIN and THREAD, like interlaced, now we can see they are both running concurrently, at the same time. We can start making multi-thread software in C.

Photo: Toastyken (Flickr) CC-by

Top