alexander Posted February 6, 2009

So I am dedicating this sticky to that. I am currently putting together an HPC cluster to run at home, for other things too, but mainly for some project computation needs here at hypo (please check out the Strange Numbers thread).

I figured out what I want to do: I am going to make custom installs of Gentoo on a few machines to start with. I will start with a 2-node cluster on machines I have at home, I hope to go to 4 or 5 nodes next week, and hopefully to 8 nodes in the coming months (actually, if I get a decent amount of money back from taxes, this may turn into a large number of nodes before the summer arrives).

Current hardware:
master: 3.4 GHz P4 HT, 800 MHz FSB, 2.5 GB DDR400
node 1: 3.0 GHz P4 HT, 800 MHz FSB, 1.5 GB DDR400
network: a dedicated 8-port gigabit switch, Intel gigabit cards
storage: not decided yet; the master node may get a storage facelift. I figure 1-2 TB should suffice for now, and this will eliminate using the network to store the data.

As I physically get more nodes, I will post them here.

First reference: Gentoo Linux Documentation -- High Performance Computing on Gentoo Linux

Now I just need to make it happen software-wise. I decided to go with Gentoo only because it's a totally stripped-down distro that I can build up... I didn't want a live distro: it uses too much RAM, and usually they are inefficient too. I might eventually make a net-boot version, but for now I just want to get it going and start writing MPI code for it.

I'll post as I go along on what happens, how it gets resolved, etc. :shrug: Once I have the master, new nodes will be allowed to use distcc to build their packages, so each consecutive node will have more processing power to compile with for bootstrapping etc.
Turtle Posted February 7, 2009

alexander wrote: "So I am dedicating this sticky to that, i am currently putting together an HPC cluster to run at home for use, including other things, but mainly for some project computation needs here at hypo (please check out the Strange Numbers thread)..."

:) :eek: So dude, I prolly wouldn't know a cluster if it bit me in the backside, but I just happened to watch an interview today that Charlie Rose did for 40 minutes with Jen-Hsun Huang, CEO and co-founder of Nvidia. What does this have to do with fast processing, you ask? Well, it seems Mr. Huang and his gang are applying programmable GPUs to amp up the raw math processing usually left to the CPU, and getting performance as much as 100 times faster. I can't explain it so well, maybe, but here is the interview. This is as cutting edge as it gets, as far as I can gather from it. Enjoy. :shrug:

Charlie Rose - A conversation with Jen-Hsun Huang, CEO Nvidia
dberkholz Posted February 7, 2009

alexander wrote: "first reference: Gentoo Linux Documentation -- High Performance Computing on Gentoo Linux"

Hi there,

Good to see you're using Gentoo! Take that guide with a grain of salt; it's a bit outdated. You probably want Torque/Maui and OpenMPI instead of OpenPBS and MPICH.
alexander Posted February 8, 2009

I know, I do my research well, and well, I've been using Gentoo since 2005. You probably have only a vague idea of just how many installs of it I have done. Let's put it this way: we had weekends at college where my friends and I installed Gentoo for any advanced population that came around... that was neat...

Yeah, I am going to use openmpi and mpich, distcc... I've also been thinking about the DrQueue batch job manager, in case someone wants to do some rendering on it (speaking of the multitude of friends I have who do a lot of random things and might want to use the cluster).
alexander Posted February 9, 2009

db, if you are interested I will post details on the installs. They are fairly similar, juuust a little different hardware. They are both compiling kernels right now, but that's beside the point... They are stage 3's that will eventually be recompiled... I'm getting them as down to the core as I can, actually disabling almost everything to conserve power....

I will probably be ordering a couple of Gigabyte boards with Phenom II 920's and 8 GB of RAM each pretty soon. I have to come up with some sort of enclosure for them, but more than that, with MOHNEH (yes, that doesn't magically appear for anyone), which is currently a bit tight. For now, I am recycling my friend's no-longer-used equipment: as long as it's a P4 or a 3000+ or newer series AMD (preferably HT, because I set MAKEOPTS to -j3, and with symmetric multiprocessing those procs give interesting stats on compiles), I will use it :)

The base system is almost installed; they are finishing some essentials before the first reboot.
alexander Posted February 9, 2009

I was going to work on it from work today, but even though I did not forget to enable ssh on my Mac (I was going to use it to ssh into the boxes that are being worked on), I did forget to disable Little Snitch... which you can NOT disable from the command line... That pisses me off, but that's beside the point: you can't unload the kext, you can't shut down the service... useless... So I will have to wait till later to finish the installs. I have a couple of things left, the boot loader and some packages, but I mostly recompiled all the system libraries and binaries overnight, optimized for the processors they are running on:

CFLAGS="-march=pentium4 -O2 -pipe -fomit-frame-pointer"

I also disabled all kinds of USE flags so as not to compile sound support, or any X, KDE, GNOME or GTK libs, and a load of other crap that gets loaded on the system.

Oh, also a note to anyone doing Gentoo installs from stage 2-3 (I use stage3 for the speed and then optimize it): if you get an error just after unpacking udev, do NOT panic. The problem is caused by the difference between the march flags used to compile the binaries you are loading and the new march flags you set in make.conf.

first, re-emerge gcc
second, run gcc-config and set the profile to the i686 one
third, source /etc/profile
fourth, re-emerge libtool

Also, if ss and com_err are blocking e2fsprogs and e2fsprogs-libs:

first, fetch everything you might need
# emerge -vuDaf world
second, back up ss and com_err
# quickpkg ss com_err
remove ss, com_err and e2fsprogs (if they are blocking themselves)
# emerge -C ss com_err e2fsprogs
emerge the new ones
# emerge -va --oneshot e2fsprogs-libs e2fsprogs
fix breakage
# revdep-rebuild
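For anyone who wants the two recovery sequences from the post above in one place, here is a rough shell sketch. The `run` helper, the `DRY_RUN` switch and the `GCC_PROFILE` variable are not from the post; they are assumptions added so the commands can be printed and read before anything is executed. The gcc profile is left as a placeholder because the post only says "the i686" one, and `--oneshot` on the gcc and libtool rebuilds is my choice to keep them out of the world file.

```shell
#!/bin/sh
# Sketch only: prints each command; set DRY_RUN=0 (as root) to execute them.
# run(), DRY_RUN and GCC_PROFILE are illustrative names, not part of Portage.
DRY_RUN="${DRY_RUN:-1}"

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Error right after unpacking udev: rebuild the toolchain pieces.
fix_udev_error() {
    run emerge --oneshot gcc
    # Pick the i686 entry shown by `gcc-config -l`; the default here is a placeholder.
    run gcc-config "${GCC_PROFILE:-<i686-profile>}"
    run . /etc/profile
    run emerge --oneshot libtool
}

# ss and com_err blocking e2fsprogs and e2fsprogs-libs.
fix_e2fsprogs_blockers() {
    run emerge -vuDaf world                             # fetch everything first
    run quickpkg ss com_err                             # back up the old packages
    run emerge -C ss com_err e2fsprogs                  # remove the blockers
    run emerge -va --oneshot e2fsprogs-libs e2fsprogs   # install the replacements
    run revdep-rebuild                                  # fix broken linkage
}
```

Calling `fix_udev_error` or `fix_e2fsprogs_blockers` with the default settings only lists the steps in order, which is a cheap way to double-check them against the post before running anything as root.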
alexander Posted February 10, 2009

Update: the first 2 machines are finally self-booting. They are almost where I want them to be; I just need a couple more things before they are HPC-ready :) (boot times are under 35 seconds, btw)
alexander Posted February 10, 2009

The master and node1 are now almost there. I have set up distcc to compile using them, which means I can now set up new systems a few times faster than normal. I have also gotten 2 more P4 machines, one HT, one not (I think); they will be joining the cluster probably closer to the end of the week (I have to make some space for them near some power sockets).

The plan for tonight is to get openmpi working, as well as torque and maui (Gentoo restricts the download of the code, so I have to download it myself, but wget does not download it either).

I was having a ton of fun compiling the kernel yesterday: I saw about a 60% increase in speed with just one more machine (granted, it was a 3.4 vs a 3.0 with nearly twice the memory).

The plan is for a first cluster test next week (probably rendering something in Blender using MPI) (do I hear anyone say MIT box?)
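The post does not show the distcc setup itself, so the following is only a configuration sketch of how it is commonly done on Gentoo, not the author's actual config. The hostnames `master` and `node1`, the `192.168.0.0/24` subnet and the `-j5` job count are all assumptions made up for illustration; as far as I know, the machines sharing jobs also need matching gcc versions.

```shell
# Run as root on every machine that should share compile jobs.
# Hostnames "master" and "node1" are placeholders for this sketch.

emerge --ask distcc                                  # install the distcc client and daemon
distcc-config --set-hosts "localhost master node1"   # hosts this box may send jobs to

# In /etc/conf.d/distccd, let the daemon accept jobs from the cluster LAN
# (192.168.0.0/24 is an assumed subnet):
#   DISTCCD_OPTS="${DISTCCD_OPTS} --allow 192.168.0.0/24"
rc-update add distccd default
/etc/init.d/distccd start

# In /etc/make.conf, have Portage use distcc and raise the job count
# (-j5 is an assumed value for two hyper-threaded P4s):
#   FEATURES="distcc"
#   MAKEOPTS="-j5"
```

With something like this in place, an ordinary `emerge` on a new node farms compile jobs out to the listed hosts, which is where the speedup on kernel and package builds would come from.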
alexander Posted February 11, 2009

Another update: I finally registered, downloaded and installed maui, figured out my dyndns update issue on the router, and forwarded port 22 to the master node (the router runs an IPS and a block-all-by-default policy; the box runs iptables that only allows port 22 and ICMP through from the outside)... I think I covered security fairly well...

Now it's only a matter of figuring out the configuration of the batch job manager, openmpi, etc., working out what needs to be installed on the nodes, testing it with the one node, and finally throwing more nodes on :) It should be done in the next couple of days...
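The iptables policy described above (only port 22 and ICMP allowed in from the outside) could look roughly like the sketch below, written in the format that `iptables-restore` reads. The function name `print_master_rules` is made up, and the loopback and established-connection rules are my assumptions: the post does not mention them, but without them the box could not see replies to its own outbound traffic. FORWARD and OUTPUT are left at ACCEPT because the post says nothing about them.

```shell
#!/bin/sh
# Sketch of a default-deny inbound policy for the master node.
# print_master_rules is an illustrative name; the loopback and
# established-connection rules are assumptions the post does not mention.
print_master_rules() {
    cat <<'EOF'
*filter
:INPUT DROP [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
-A INPUT -i lo -j ACCEPT
-A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -p tcp --dport 22 -j ACCEPT
COMMIT
EOF
}

# To load the rules (as root):
#   print_master_rules | iptables-restore
```

Printing the rules first and piping them into `iptables-restore` only when they look right avoids locking yourself out of a box you reach over ssh.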
alexander Posted February 11, 2009

OK, more documentation links:
Torque: torque:torque_wiki [Documentation Wiki]
Maui: Cluster resources :: Docs & Training - Documentation (scroll down to the maui section)
Theory5 Posted February 11, 2009

Wow, you are building a cluster using Linux? I tried to build a cluster using a couple of really old computers at school, and I kept running into so many problems. I was using Windows Active Directory on the Windows 2000 OS. How many years of experience have you had with Linux? I've played around with it, but I never once thought about switching because I could never figure out the entire OS (ugh, commands!). Keep us updated :-)
alexander Posted February 11, 2009

I've been using Linux on and off since 2005, actually, but I had good teachers. Well, they helped me set up my first Gentoo box, then I started reading and working with it; in 5 months I was managing a Linux lab, and I've stuck with Unix-like OSes since then (in techno speak *nix, implying BSD, Linux, Minix, OS X). I can't say that I have a main OS; I run a lot of different distros and OSes. I run various flavors of Linux, generally Gentoo, Ubuntu or BackTrack, but I have done others: SuSE, CentOS, Slackware, Red Hat, Fedora, Yellow Dog, some small Linux distros, embedded Linux, and others. I use OS X as my main platform for graphics and audio work, and I am once again rebuilding my OpenBSD router (it's running on a dual UltraSPARC III machine with 768 megs of RAM and a quad-interface NIC on top of that, isa naas 1u).

I can't see why anyone would want to build a cluster on Windows, and it's not that I am against Windows, it's a bang-for-the-buck thing: performance, cost and stability, 3 things that Linux excels in. I will just say this: most clusters and supercomputers today run Linux.

It's not as hard as it may seem. If you want some advice for starting out, take a system that you are no longer using and throw Ubuntu on it. It's easier to install than Windows, and there is no need for the command line to manage it. Also, Ubuntu has a great community, and a lot of "howto"s and beginner guides are available. Just one thing: you have to do your own research to find out how to do things. It's an integral part of working with Linux; the solution is out there, you just need to find it.... Other than that, ask; there are a couple of people here who I know know Linux fairly well, and we will help. Use Ubuntu for a bit; once you feel acquainted with the interface, try getting to know it better, learn some command line. Eventually, anyone can be a pro :)
Theory5 Posted February 11, 2009

Well, my main computer is the only computer that can handle anything. I have three others: two are 12-year-old Dells, and the third is a Dell OptiPlex 270 with so little memory (256 MB) that it can't handle installing Sun's operating system, Solaris 10 (it does run XP, but poorly). If I get more memory for it and a couple more of them, then I might consider building a cluster. I would like a Linux box to mess around with and study. Know of any place I can get a Linux box or a no-OS box for $50-$100?

How are you building that router, and is it working out? I tried to use Windows because that was the one I am most familiar with, and the only Linux they had was a really old version of Ubuntu and Puppy Linux, as well as one that supposedly acts like Windows. Windows will always be my main operating system, for better or for worse, until WINE can run games :-)
alexander Posted February 11, 2009

You won't get any cheaper and cooler looking than this: ZaReason, Inc. :: Desktops :: Shuttle KPC
alexander Posted February 11, 2009

btw, those old boxes will run Linux no probs....

By the way, a cluster is not what you think: it does not magically start using other computers and make them seem like one big computer... The parallel cluster I am building will run programs that are written specifically for parallel processing, with a specific set of libraries to manage the nodes, etc. It's not like you will get a giant supercomputer by doing this that you can play Crysis at 1000 fps on. It has a very specific use; it's not something you can build without knowing a thing or two about systems, and not something you can build without planning a purpose for it...
Theory5 Posted February 12, 2009

alexander wrote: "btw those old boxes will run linux no probs...."

Yeah, but what can I do with them? If they don't have adequate speed and processing power, what good are they?

alexander wrote: "it's not like you will get a giant supercomputer by doing this that you can play crysis at 1000fps on, very specific use, not something you can build without knowing a thing or two about systems, not something you can build without planning a purpose for it..."

Lol, I am well aware of that :-) I just wanted to see what building a cluster was like. How will you be able to get a command to all the computers? Does the Gentoo OS allow you to control them using its GUI, or is it all command based? I read that there was some software for Windows computers that linked them as a cluster, but it was command based.
alexander Posted February 12, 2009

To answer your question, please read: OpenMPI
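As a concrete illustration of that answer: with Open MPI, work is sent to the nodes from the command line with `mpirun`, which reads a hostfile listing the machines. The sketch below is a minimal guess at what that could look like for the two boxes in this thread. The function name `write_hostfile`, the hostnames and the slot counts (2 per hyper-threaded P4) are assumptions, and by default Open MPI reaches the other machines over ssh, so passwordless ssh between them would need to be set up first.

```shell
#!/bin/sh
# Sketch: write an Open MPI hostfile for the two machines in this thread.
# write_hostfile, the hostnames and the slot counts are assumptions.
write_hostfile() {
    cat > "$1" <<'EOF'
master slots=2
node1 slots=2
EOF
}

# Usage, run on the master as a normal user:
#   write_hostfile ./cluster.hosts
#   mpirun --hostfile ./cluster.hosts -np 4 hostname
# mpirun starts 4 copies of `hostname`, spread across the listed machines,
# and shows their output on the terminal it was launched from.
```

Swapping `hostname` for a program built against the MPI libraries is what turns this from a connectivity check into an actual parallel job.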