Learning threads and distributed computing

Discussion in 'Mac Programming' started by MrFusion, Mar 14, 2009.

  1. macrumors 6502a

    Hello everyone,

    I want to tackle threads and distributed computing as the next step in learning how to program.
    I am looking for a topic or problem that could benefit from threading, but which is not so complicated that a hobbyist programmer can't do it. It's a learning experience, so the advantages and disadvantages of threading should be obvious.

    Many people here have or will have a CS degree. Maybe you want to share some idea's or assignments you had in college?

    Please don't hold back on the basis of math. I am certain I can handle most mathematical problems. A physics problem would be even better. But in the end it's about learning how to program.

  2. macrumors 6502a

    You could try your hand at Genetic Algorithms. Have a look at Mona Lisa in 50 triangles. You could certainly use threads and multiple cores in a project similar to this.

    b e n
  3. macrumors 68030


  4. macrumors 68040


    The last solution listed on that page is co-credited to Jayadev Misra. I'm sure it's of little interest to most, but he taught a class of mine (CS 337) at UT Austin, and it was probably one of the best courses of my college career. Those currently learning to program or even those with a great deal of experience may wish to read the following, which is a transcription of a speech Dr. Misra gave on our last day of class:


    Edit: As I feared, I am repeating myself. This was originally posted in this thread:
    I will use any excuse to bring it up, because i think it's a worthwhile read. Also, re-reading that thread I am embarrassed to see i misspelled "collector" at one point. I feel it's not worthwhile to correct now. =)
  5. macrumors 603


    Interesting read. Thanks Lee.
  6. macrumors 68040


    this may not be the best place to start, but i figured since i posted largely off-topic once, i might try an on-topic post.

    Have you written anything particularly long running so far? If so, is there any means by which you might break the tasks down such that they could be done in parallel ? For example, searching a file system for a file. You could make each subdirectory a new node, and process each node in a separate thread. You would need to have a synchronized list of pending nodes, and a main thread that is dispatching work up to N concurrent threads, where N is some maximum. It's normally best to pick an N that's one less than available cores, but for something that is for practice, this may not be necessary.

    Otherwise... you could have a system monitor that gathers a number of pieces of information and reports them to a remote system. In serial, this might take 5 minutes to gather all information and transmit it, but if these can all be run in parallel, you might be able to do this in a fraction of the time, etc.

    Other common tasks are things like image-recognition/processing. For example, one could break down a 256x256 image into 16 16x16 images, and process each independently. The units would have to communicate partial matches so something that crosses the boundaries isn't missed, etc. but it is parallelizable problem.

    Choosing a standard, well-understood problem as others have suggested might be better, though, since you will have a lot of documentation to look at if you run into trouble.

  7. macrumors 6502a

    Yes. I have a program that is slow and that (I think) could benefit from threading. Before I start rewriting that, I want to play around with threading. Best way to do so is to first do a problem which can be truly parallelized and which is well documented. With a new problem comes new ideas as well.

    I can probably start messing around, but then most likely I will be flooding this forum even more with questions about how to do something. I am also interested in learning it the right way (design wise). Any solution I might accidently find is not necessarily the best method.

    Thanks for the suggestions everyone. If you have more, I am more than happy to hear them.
  8. macrumors 68040


    You should read before you start coding. A lot. Some things you should look into that haven't been mentioned already:
    Readers-writers problem
    Semaphores (What do P and V mean... not the dutch words, but conceptually)
    Mutexes (a special case of a semaphore)
    Implementing Semaphores and Mutexes in a particular language you wish to work in
    Why you'd use a mutex (mutual exclusion to a resource) vs. a semaphore (atomically incrementing decrementing a value for tracking multiple resource access, etc.)

    It may be worthwhile (even if not for this specific purpose) to read some of the writings of Dr. Dijkstra:
    Beware that a number are in Dutch, so unless you are a dutch speaker you may have to seek out translations.

    I have gathered up some material that I've attached here from my OS class. Hopefully Dr. Emmett Witchel will not mind my posting of these materials, as it is not-for-profit. His page re: the current incarnation of the course these are from can be found here:
    All but the solutions to lab2 and lab3 are to his credit (or perhaps Dr. Kathryn McKinley, who was teaching the course at the same time).

    There are lecture notes available. I am sure there are many other similar courses that may have other materials as well, but i refer to these as they are where i learned these things. The projects included are my own. They were written about 5 years ago, and i don't think i scored particularly well, but hopefully they will be of use.

    The materials are not well organized, but i did make a point to find the original assignments for lab2 and lab3 from the wayback machine so you'd have some idea of what that code was meant to do, and so that you could attempt to implement your own solution if desired. The code is all java.


    Attached Files:

  9. macrumors 68040


    sorry to double post, but i don't want to edit and re-edit as people are reading the thread... so:

    You might want to do something like finding if (and where) in a 2048x2048 grid of black/white pixels (this could just be a 2d array of booleans, in practice) a certain pattern (say a 2x2 block of black pixels) lays. You can then break it down into smaller blocks, and have certain types of tasks that can be distributed to threads. One type of task is:
    check block (16x16), which determines if all or part of the pattern falls in a 16x16 grid.
    check block (4x4), this could be a secondary task that would check to see if a 4x4 block is the pattern or not. if the previous task had a partial match, this could be used to see if the full pattern existed at that spot
    distribute blocks, the "master thread" could be in charge of giving out tasks, and generating new tasks as threads return partial matches.

    You could start assuming there would be at most one of the pattern, then generalize from there, etc.


Share This Page