UncleAl Posted March 29, 2008
Uncle Al has a calculation that consumes time as (radius)^2 of a crystal lattice. At 107 trillion atoms contained it now requires 1.7 hours/point in an AMD Athlon FX-55:
http://www.mazepath.com/uncleal/bzhdense.png (needs work)
http://www.mazepath.com/uncleal/qzdense.png (typical finished graph)
That will not add another 10,000 points. Who has a bored cluster that needs thousands of CPU-hrs of love? You may compile the serial or (preferred) parallel C++ source code (no surprises). It runs 100% in CPU and RAM; maybe 1 MB total output. No Microsoft compiler or OS is satisfactory. Post here or organiker'coupling symbol'lycos'the usual'com
C1ay Posted March 29, 2008
Have you thought about looking into using distributed computing for your needs? It seems to work well for projects like SETI and the Great Internet Mersenne Prime Search. You could also start your collection of PlayStation 3s. They make good clustering hardware for physics.
alexander Posted March 31, 2008
i have a friend with a beowulf in his basement.... if you want him to run something, i am sure he won't mind... just post mpi code, i will send it his way :)
UncleAl Posted March 31, 2008
Quoting C1ay: "Have you thought about looking into using distributed computing for your needs... You could also start your collection of Playstation 3s."
It is a single remaining problem into which we prefer not to put more development. The code is debugged, optimized, compiled, and works to spec. It would do well running in slack time in capable hardware. An AMD FX-55 calculates 2.1 million decimal places of pi in 5.95 seconds versus hours/point in this problem. 60% the throughput in Pentium vs. Athlon, 60% the throughput in WinXP vs. Linux (Knoppix). Segmenting the calculation as a Wintel screen saver is an unpleasant thought. "8^>)
UncleAl Posted March 31, 2008
Quoting alexander: "i have a friend with a beowulf in his basement.... just post mpi code, i will send it his way :)"
Kindly ask him first. I suspect he'd like to see the C++ code and perhaps compile it himself. The serial version is a 560K Debian Linux static file (~66K executable plus math libraries). C++ source with documentation is 15K. We'd start with a timing run each in his slowest and fastest CPU. Total run time to any radius is then known within 5%. No hard drive space is needed. It runs wholly in CPU and RAM from a thumb drive if you like.

Parallel version C++ source code is 35K, uses MPI, and is best locally compiled for the cluster. A 100% CPU-bound app benefits from cache optimization on-site for maximum execution speed (Valgrind).

The problem can run as a stack of serial processes with a different radius interval in each CPU. It can run as a parallel process using all CPUs at once for each radius interval. A power failure during serial execution is a disaster. Crashing parallel execution loses a few dozen points in process and RAM storage. Load it then ignore it for a month or two as it runs. Does unused hardware get lonely?
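Editorial sketch of how one timing run might be extrapolated to a whole radius range. It assumes, going only by the opening post's "time as (radius)^2", that hours per point grow with the square of the radius. The function names, the reference-run parameters, and the inclusive end point are hypothetical and are not taken from the actual program.

```cpp
#include <cassert>
#include <cmath>

// ASSUMPTION: time per point scales as radius^2 (read from the opening post).
// ref_radius and ref_hours would come from a single timing run on one CPU.
double hours_per_point(double radius, double ref_radius, double ref_hours) {
    assert(ref_radius > 0.0 && ref_hours >= 0.0);
    const double ratio = radius / ref_radius;
    return ref_hours * ratio * ratio;
}

// Sum the estimate over the points start, start+step, ... up to and
// including end (inclusive end is an assumption of this sketch).
double total_hours(double start, double step, double end,
                   double ref_radius, double ref_hours) {
    assert(step > 0.0);
    double total = 0.0;
    for (long i = 0; start + i * step <= end + 1e-9; ++i) {
        total += hours_per_point(start + i * step, ref_radius, ref_hours);
    }
    return total;
}
```

Dividing the total by the number of worker CPUs would give a rough wall-clock figure for a parallel run, ignoring the lightweight master task and any communication overhead.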
alexander Posted March 31, 2008
so did you write it using the MPI libraries and threading? if so, as i said, i have a friend (who i will get to post his cluster specs) who would likely be willing to run the code for you (he can use it as a bench for his cluster). but it's beowulf and *nix, so the best combination would be C++, threading, mpi...
UncleAl Posted April 1, 2008
C++ source for parallel calculation is compiled (long_double_precision, NO Microsoft compilers!) with define MPI_ROUTINES to include MPI functionality with a main server CPU (small fraction of run time) and clients coding (99.9% continuous). If he has 1023 CPUs or fewer it should be a happy camper. The command line for each radius interval includes MPI control and number of processors used. Severe questions can be routed to the UK programmer who parallelized the serial code. Stuff like:

To compile CHIpir.CPP use LAM-MPI. They have comprehensive instructions on installation (LAM/MPI Parallel Computing). Compile:

    mpiCC -DMPI_ROUTINES -O3 -o chipir chipir.cpp

The -DMPI_ROUTINES (as the name suggests) compiles in the MPI routines. Without it you get the serial single CPU version. To run it use a series of shell script command lines like

    time mpirun -np 3 ./chipir 100 10 500 1 2 >> output.txt

    -np 3 = how many CPUs to use. Use 1 more than the number of CPUs since one task is the "master" which is very lightweight and uses very little CPU at all. For a four-processor box first try -np 5
    ./chipir = executable process
    100 = starting radius
    10 = radius increment
    500 = ending radius
    1 = flag. Set to "1" for output to file.
    2 = work units/CPU. From 1-3 shows speedup. There is no advantage above 5.

Multiple scripts may be loaded to dynamically fill each work unit queue as processes complete. At the end the process will not terminate until all CPUs finish. The last command lines should all load "1." Uncle Al will write the command lines based upon the timing run and real time cluster access available. If your sysop is clever with a cluster I'm all ears for hearing better strategies.

Data look like (radius, angstroms; atoms; CHI)

    42340.800 29620928954114 0.999999997537607374
    42352.000 29644441206354 0.999999998910011431
    42363.200 29667965859458 0.999999997922274660
    42374.400 29691502978924 0.999999997623357328
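Editorial sketch of how the "series of shell script command lines" described above might be generated rather than typed by hand. Only the argument order and the "-np = CPUs + 1" rule come from the post. The function names, the integer radii, the chunking scheme, and the assumption that the ending radius is inclusive are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build one mpirun line in the form shown in the post:
//   time mpirun -np <cpus+1> ./chipir <start> <step> <end> 1 <work_units>
// The extra process is the lightweight "master" task.
std::string chipir_command(int cpus, int start, int step, int end,
                           int work_units) {
    assert(cpus >= 1 && step > 0 && start <= end);
    return "time mpirun -np " + std::to_string(cpus + 1) +
           " ./chipir " + std::to_string(start) + " " +
           std::to_string(step) + " " + std::to_string(end) +
           " 1 " + std::to_string(work_units) + " >> output.txt";
}

// ASSUMPTION: split [start, end] into consecutive, non-overlapping chunks of
// at most points_per_line radii each, treating the ending radius as inclusive.
std::vector<std::string> chipir_commands(int cpus, int start, int step,
                                         int end, int points_per_line,
                                         int work_units) {
    assert(points_per_line >= 1);
    std::vector<std::string> lines;
    for (int first = start; first <= end; first += step * points_per_line) {
        int last = first + step * (points_per_line - 1);
        if (last > end) last = end;  // clip the final chunk
        lines.push_back(chipir_command(cpus, first, step, last, work_units));
    }
    return lines;
}
```

For example, `chipir_command(2, 100, 10, 500, 2)` reproduces the sample line from the post. Shorter chunks would limit how many points are lost if a run is interrupted, at the cost of more command lines to queue.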
Morally.Corrupted Posted April 1, 2008
Well since Alexander was kind enough to mention my cluster, here are some simple specs on it as it is now...

Titan: Master Node [wolf00]
    Homebrew Box
    Soyo Dragon Board
    AMD x64 Core Duo 6000+
    3 Gigs Ram
    750 gig HDD
    CrossFire ATI cards

Trident1: Slave Node 1 [wolf01]
    Precision 530
    1.7GHZ Xeon
    1.5 Gig Ram

Trident2: Slave Node 2 [wolf02]
    Precision 530
    1.7GHZ Xeon
    1.5 Gig Ram

Startup......
ssi:boot:base:linear: booting n0-n2 [wolf00-wolf02]

I just got another P4 HL PC so I'll be adding that in to the mix as well.....
UncleAl Posted April 1, 2008
Quoting Morally.Corrupted: "Well since Alexander was kind enough to mention my cluster, here are some simple specs on it as it is now... AMD x64 Core Duo 6000+, 3 Gigs Ram; Precision 530, 1.7GHZ Xeon, 1.5 Gig Ram; Precision 530, 1.7GHZ Xeon, 1.5 Gig Ram. I just got another P4 HL PC so I'll be adding that in to the mix as well....."
Ride 'em cowboy! If it is mostly not doing anything for a month or three... let's do a serial timing run in each CPU (about an hour each, less for the AMD). Linux static executable BigCHIBz is 561K as is and 265K ZIPped. Send me a private message with contact data and I'll e-mail it to you with instructions, or give you a URL for download, your choice.

The benzil molecule is a sock - it has no handedness. In the crystal the flat molecules slightly twist and stack into helices, all left- or right-handed in a given crystal, opposite shoes. General Relativity postulates the vacuum has no handedness. Teleparallel gravitation wholly includes GR as a restricted case and further allows the vacuum to be a left foot. One of them is detectably wrong by the energy of the different crystals versus their identical melts fitting into the vacuum:
http://www.mazepath.com/uncleal/benzil.png (stereogram of structure)
Calorimetric Equivalence Principle Test (one way to do the experiment)
http://www.mazepath.com/uncleal/qz4.pdf (technical readout, not for the faint of heart)

We need to calculate the handedness of the benzil crystal mass distribution (atom positions) to academic standards. Ivory Towers are reluctant to accept outside wash (Not Invented Here). One begins by getting their attention. Donkeys and 2x4s upside the head are a natural pairing. Professors and calculated numbers achieve the same final state.
alexander Posted April 1, 2008
Oh hey, i don't know if there is a distro out there, i could probably let you use my SPARC blade in your beowulf, also you will need that P4; I also have a PPC machine sitting around, doing nothing, so if you can find live distros for the platforms, you will have an 800MHz PPC machine added as a node, and a 440MHz SPARC box (don't let the speed fool you, it's 64 bit SPARC... comparable to like a 1.2GHz intel.... though it's like comparing a watermelon to a capacitor...)
UncleAl Posted April 1, 2008
Quoting alexander: "don't know if there is a distro out there"
As needed amend your BIOS to priority boot from the DVD or CD drive, then Knoppix LIVE 5.3.1 DVD (awesome) or 5.1.1 CD (entirely adequate):
KNOPPIX 5.3 - Live Linux Filesystem On DVD (long download; burn your own DVD)
Welcome to Linux Cd.org - Knoppix (buy the DVD, $(US)5.95 plus shipping, then burn more)
You need a Knoppix disk if you run Windows. When the OS crashes your HD contents are held hostage to Redmond. Boot Knoppix, mount the drive, and take off what you need.

We've compiled and run 64-bit (AMD, of course). It runs slightly faster 32-bit. Long_double_precision is tough on pathways. The FX-55 is a very respectable single core, though obsolete. If you are feeling virtuous you could run the program through Valgrind and see if my code poet volunteers missed anything clever. Whatever is convenient and interesting is OK with me. When it stops being fun, quit - no harm no foul.
Morally.Corrupted Posted April 2, 2008
Quoting alexander: "Oh hey, i don't know if there is a distro out there, i could probably let you use my SPARC blade in your beowulf, also you will need that P4; I also have a PPC machine sitting around, doing nothing, so if you can find live distros for the platforms, you will have an 800MHz PPC machine added as a node, and a 440MHz SPARC box (don't let the speed fool you, it's 64 bit SPARC... comparable to like a 1.2GHz intel.... though it's like comparing a watermelon to a capacitor...)"
Well, if I switched the distros over to BCCD then, in their 2.2.2 BETA, they have support for PPCs, so I definitely could add in the PPC and just boot all machines LIVE with BCCD. It already has all the MPI tools and whatnot built into the distro so that wouldn't be an issue as far as running and compiling the code...