The qthreads API is designed to make using large numbers of threads convenient, and to provide portable access to the threading constructs used in massively parallel shared-memory environments. The API maps well to both MTA-style and PIM-style threading, and remains useful in a standard SMP context. It provides access to full/empty-bit (FEB) semantics, in which every word of memory can be marked either full or empty, and a thread can wait for any word to attain either state.
On an SMP (i.e., in the POSIX implementation), the qthreads library is essentially a library for spawning and controlling coroutines: threads with small (4 KB) stacks. The threads run entirely in user space, and their blocked/unblocked status is used in scheduling them. The library's metaphor is that there are many qthreads and several "shepherds". A shepherd can be thought of as a thread mobility domain; shepherds map to specific processors or memory regions. Qthreads are assigned to specific shepherds and do not migrate unless directed to do so.
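The full/empty-bit idea can be illustrated with a minimal sketch. It does not use the qthreads library or its API; it emulates a single full/empty word with a POSIX mutex and condition variable. The type `feb_word` and the functions `feb_init`, `feb_write_ef`, `feb_read_fe`, and `feb_demo_sum` are hypothetical names invented for this illustration, and the choice to start every word empty is an assumption of the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical type for illustration: one data word plus its full/empty bit. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;  /* signalled whenever the full/empty bit flips */
    int             full;     /* 1 = full, 0 = empty */
    uint64_t        value;
} feb_word;

void feb_init(feb_word *w) {
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->changed, NULL);
    w->full = 0;              /* assumption of this sketch: words start empty */
    w->value = 0;
}

/* Wait until the word is empty, store v, then mark the word full. */
void feb_write_ef(feb_word *w, uint64_t v) {
    pthread_mutex_lock(&w->lock);
    while (w->full)
        pthread_cond_wait(&w->changed, &w->lock);
    w->value = v;
    w->full = 1;
    pthread_cond_broadcast(&w->changed);
    pthread_mutex_unlock(&w->lock);
}

/* Wait until the word is full, read it, then mark the word empty. */
uint64_t feb_read_fe(feb_word *w) {
    pthread_mutex_lock(&w->lock);
    while (!w->full)
        pthread_cond_wait(&w->changed, &w->lock);
    uint64_t v = w->value;
    w->full = 0;
    pthread_cond_broadcast(&w->changed);
    pthread_mutex_unlock(&w->lock);
    return v;
}

typedef struct { feb_word *w; uint64_t count; } producer_args;

/* Producer thread: hands the values 1..count to the consumer one at a time. */
void *feb_producer(void *p) {
    producer_args *a = p;
    for (uint64_t i = 1; i <= a->count; i++)
        feb_write_ef(a->w, i);
    return NULL;
}

/* Starts one producer, consumes `count` values, and returns their sum. */
uint64_t feb_demo_sum(uint64_t count) {
    feb_word w;
    feb_init(&w);
    producer_args a = { &w, count };
    pthread_t t;
    int rc = pthread_create(&t, NULL, feb_producer, &a);
    assert(rc == 0);
    (void)rc;
    uint64_t sum = 0;
    for (uint64_t i = 0; i < count; i++)
        sum += feb_read_fe(&w);
    pthread_join(t, NULL);
    pthread_cond_destroy(&w.changed);
    pthread_mutex_destroy(&w.lock);
    return sum;
}
```

In this sketch the word itself carries the synchronization: the producer blocks while the word is full and the consumer blocks while it is empty, so each value is handed over exactly once. Calling `feb_demo_sum(10)` returns 55.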
The API includes utility functions for making threaded loops, sorting, and similar operations convenient.
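As a rough picture of what a threaded-loop utility does, the sketch below splits an array into contiguous chunks, sums each chunk in its own thread, and combines the partial results. It is written against plain POSIX threads, not the qthreads API; `sum_chunk`, `sum_worker`, `sketch_parallel_sum`, and the fixed worker count `SKETCH_WORKERS` are hypothetical names and values chosen for illustration only.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SKETCH_WORKERS 4      /* assumed worker count, for illustration only */

/* One worker's share of the loop: the half-open index range [begin, end). */
typedef struct {
    const double *data;
    size_t        begin;
    size_t        end;
    double        partial;    /* written by the worker, read after join */
} sum_chunk;

/* Thread body: sums the elements in this worker's range. */
void *sum_worker(void *p) {
    sum_chunk *c = p;
    double s = 0.0;
    for (size_t i = c->begin; i < c->end; i++)
        s += c->data[i];
    c->partial = s;
    return NULL;
}

/* Splits [0, n) into SKETCH_WORKERS contiguous chunks and sums them in parallel. */
double sketch_parallel_sum(const double *data, size_t n) {
    pthread_t threads[SKETCH_WORKERS];
    sum_chunk chunks[SKETCH_WORKERS];

    for (size_t w = 0; w < SKETCH_WORKERS; w++) {
        chunks[w].data    = data;
        chunks[w].begin   = n * w / SKETCH_WORKERS;
        chunks[w].end     = n * (w + 1) / SKETCH_WORKERS;
        chunks[w].partial = 0.0;
        int rc = pthread_create(&threads[w], NULL, sum_worker, &chunks[w]);
        assert(rc == 0);
        (void)rc;
    }

    /* Combine the partial sums in worker order once each thread finishes. */
    double total = 0.0;
    for (size_t w = 0; w < SKETCH_WORKERS; w++) {
        pthread_join(threads[w], NULL);
        total += chunks[w].partial;
    }
    return total;
}
```

Each worker writes only its own `partial` field, so no locking is needed; the join is the only synchronization point. A lightweight-thread library can afford far more and far smaller chunks than the four OS threads used here.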
The Qthreads library was developed to explore innovations in highly concurrent systems where the target system either does not yet exist or is hard enough to obtain that developing software for it becomes difficult.
Development is currently hosted on GitHub: https://github.com/Qthreads/qthreads.
Platforms & Requirements
Architectures
POSIX Qthreads supports most POSIX-style machines, including Linux, Solaris, and MacOS X, running on a variety of architectures. It has been tested on:
| Architecture  | Linux | Solaris | MacOS X | SST |
|---------------|-------|---------|---------|-----|
| PPC32         | ✓     |         | ✓       | ✓   |
| PPC64         | ✓     |         | ✓       |     |
| IA32          | ✓     |         | ✓       |     |
| IA64          | ✓     |         |         |     |
| AMD64/x86_64  | ✓     |         | ✓       |     |
| SparcV9+      |       | ✓       |         |     |
| Tilera (MIPS) | ✓     |         |         |     |
Compilers
Qthreads has been tested with:
| Compiler              | Status                                            |
|-----------------------|---------------------------------------------------|
| GCC 3.x               | Works                                             |
| GCC 4.x               | Works                                             |
| PGI 11.6              | Works                                             |
| Intel ICC 11.1.x      | Works; does not support inline assembly on IA64   |
| Intel ICC 12.x        | Works                                             |
| TileraMDE 2.0.0.77314 | Works; requires -O0                               |
| SunStudio 12          | Causes internal compiler errors ("Wasted space")  |
Build Requirements
To compile and run POSIX Qthreads, you will need:
- A UNIX-like shell (Qthreads uses the GNU Autotools)
- GNU Make
- A C compiler (versions prior to 1.5 also require either a C++ compiler or the cprops library)
To compile and run SST Qthreads, you will also need:
- A PPC C compiler
- A full complement of static libraries
Installation
Detailed installation directions are included in the INSTALL file in the distribution.
Papers & Publications
To cite qthreads, please use:
Additional related publications:
- Implementing a Portable Multi-threaded Graph Library: the MTGL on Qthreads
  Brian Barrett, Jonathan Berry, Richard Murphy, Kyle Wheeler
  In the Proceedings of the 23rd IEEE International Parallel & Distributed Processing Symposium (IPDPS '09, in the MTAAP '09 workshop), IEEE Press, 2009.
- Portable Performance from Workstation to Supercomputer: Distributing Data Structures with Qthreads
  Kyle Wheeler, Douglas Thain, Richard Murphy
  In the Proceedings of the First Workshop on Programming Models for Emerging Architectures (PMEA), IEEE Press, 2009.
- Scheduling Task Parallelism on Multi-Socket Multicore Systems
  Stephen Olivier, Allan Porterfield, Kyle Wheeler, Jan Prins
  In the Proceedings of the 25th International Conference on Supercomputing (ICS '11, in the ROSS '11 workshop), ACM Press, 2011.
API 1.8 Documentation
- Basic Threading
- Futures
- Blocking & Atomic Operations
- Non-blocking Atomics
- FEB
- Mutex
- Reductions (Sincs)
- Threaded Loops
- Futurelib
- Qloop
- Qutil
- qutil_double_max, qutil_uint_max, qutil_int_max
- qutil_double_min, qutil_uint_min, qutil_int_min
- qutil_double_mult, qutil_uint_mult, qutil_int_mult
- qutil_double_sum, qutil_uint_sum, qutil_int_sum
- qutil_mergesort, qutil_qsort
- qt_double_max, qt_uint_max, qt_int_max
- qt_double_min, qt_uint_min, qt_int_min
- qt_double_prod, qt_uint_prod, qt_int_prod
- qt_double_sum, qt_uint_sum, qt_int_sum
- Timers
- Distributed Data Structures
- Qpool (distributed memory pool)
- Qarray (distributed array)
- Qlfqueue (lock-free ordered queue)
- Qdqueue (distributed end-to-end ordered queue)
- Qt_dictionary (concurrent lock-free hash table)
- Externally Blocking Operations
- Runtime Functions
- Computational Templates
- Memory Allocation (deprecated)