Caelinux cluster
- Filippo Monari
- Topic Author
- Offline
- Junior Member
		- Posts: 28
- Thank you received: 0
			
	
						15 years 1 month ago				#4667
		by Filippo Monari
	
	
		
			
	
															
	
				Caelinux cluster was created by Filippo Monari			
			
				Hi, I'm going to replace my PC (Pentium quad-core), and I have learned that it is possible to build a computational cluster by connecting two or more PCs over a network. Can anyone tell me whether CAELinux supports this kind of setup, and point me to a guide on how to build one?
Thank you in advance.
- CAVT
- Offline
- Senior Member
		- Posts: 59
- Thank you received: 1
			
	
						15 years 1 month ago				#4679
		by CAVT
	
	
		
			
	
															
	
				Replied by CAVT on topic Re:Caelinux cluster			
			
				I don't think the question is whether CAELinux can do it or not (it should be able to, since a cluster is more than anything a network), but rather whether you install and set up the libraries required for parallel computing.
If you check, for example, the official C_S site, you will see many libraries that they recommend (or even require) for clustering, one of them being MPI (OpenMPI is one implementation that is usually available in the repositories).
I'm afraid I cannot be of great help in setting up this sort of configuration, but instead of going for a two-PC cluster, have you considered a single PC with two CPUs, or maybe an 8-core PC? It might be cheaper and less complicated to set up.
- JMB
- Offline
- Elite Member
		- Posts: 166
- Thank you received: 0
			
	
						15 years 1 month ago				#4680
		by JMB
	
	
		
			
	
															
	
				Replied by JMB on topic Re:Caelinux cluster			
			
				CAVT wrote:
"I don't think the question is whether CAELinux can do it or not (it should be able to, since a cluster is more than anything a network), but rather whether you install and set up the libraries required for parallel computing.
If you check, for example, the official C_S (C_A?) site, you will see many libraries that they recommend (or even require) for clustering, one of them being MPI (OpenMPI is one implementation that is usually available in the repositories)."
I believe OpenMPI is already installed on CAELinux2010 (try 'locate openmpi'), and its version of CA is compiled to run on multiple cores as well. I am reasonably certain that it is only a matter of getting the configuration set up correctly to make a multi-PC cluster functional using CAELinux as the starting point. Perhaps CA will have to be recompiled...
Although I have yet to get it working, I am trying to get there, so I cannot offer any more specifics yet.
Regards,
JMB
- Joël Cugnoni
- Offline
- Moderator
			
	
						15 years 1 month ago				#4683
		by Joël Cugnoni
	
	
		
			
					
Joël Cugnoni - a.k.a admin
www.caelinux.com
					
	
															
	
				Replied by Joël Cugnoni on topic Re:Caelinux cluster			
			
				Hi,
Actually, CAELinux already contains several codes compiled with OpenMPI, such as Code Aster 10.1, Code Saturne 2rc1, OpenFOAM 1.7, Elmer FEM 5.5, the Gerris flow solver and Impact.
These codes should all be able to run on multiple machines, but each has a different procedure for submitting a job, so you will need to read each manual carefully.
CAELinux itself is ready for building a cluster; you just need to configure it properly. By default, ssh is already configured for passwordless local connections. To build your cluster, you mostly need to extend this configuration so that passwordless ssh connections work between all nodes.
So, to build a basic cluster with CAELinux, you mostly need a few PCs, a gigabit Ethernet switch and the following procedure:
1. Install CAELinux on each node, giving each a different hostname (node0, node1, ...) but the same login/password!
2. Set up static IP addresses for your nodes and list the IPs/hostnames in /etc/hosts so that each node can resolve the hostnames of the other nodes.
3. Set up passwordless ssh connections between all nodes:
a. Generate the list of known host keys on the 1st node (repeat for every node name):
[code:1]
# DSA keys are disabled or removed in recent OpenSSH releases; there, use e.g. -t rsa,ed25519
ssh-keyscan -t dsa,rsa node1 >> ~/.ssh/known_hosts
ssh-keyscan -t dsa,rsa node2 >> ~/.ssh/known_hosts[/code:1]
b. Copy the folder /home/youruser/.ssh of the 1st node to all other nodes (run on node0):
[code:1]scp -r ~/.ssh node1:~
scp -r ~/.ssh node2:~[/code:1]
c. Test: ssh node1 should now connect without asking for a password or confirmation.
4. You may want to mount a shared folder within your cluster. A simple approach is to mount on each node a common directory stored on node0, using sshfs. On node0:
[code:1]mkdir -p ~/shared
# the sshfs remote path is relative to the home directory on node0
ssh node1 "mkdir -p ~/shared && sshfs node0:shared ~/shared"
ssh node2 "mkdir -p ~/shared && sshfs node0:shared ~/shared"
... [/code:1]
(The sshfs mount is not restored after a reboot, so you will need to repeat this step or add an entry to /etc/fstab.)
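As a rough illustration of step 2, here is a small shell sketch that prints /etc/hosts lines for the nodes. The 192.168.1.x addresses, the starting host number and the function name hosts_entries are assumptions made up for this example, not something CAELinux prescribes, so adapt them to your own network.

```shell
#!/bin/sh
# Print /etc/hosts lines for a small cluster named node0, node1, ...
# Arguments (all hypothetical example values):
#   $1 network prefix, $2 first host number, $3 number of nodes
hosts_entries() {
    prefix=$1
    first=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # One "IP hostname" pair per line, as /etc/hosts expects.
        printf '%s.%d node%d\n' "$prefix" "$((first + i))" "$i"
        i=$((i + 1))
    done
}

# Example: three nodes on an assumed private 192.168.1.0/24 network.
hosts_entries 192.168.1 10 3
```

The printed lines would have to be appended to /etc/hosts on every node (as root). Once the keys from step 3 are in place, something like `ssh -o BatchMode=yes node1 true` is one way to confirm that the login really works without a password prompt, since BatchMode makes ssh fail instead of asking.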
Then read the manual of each code you want to run on the cluster to see which config files must be updated and how to launch MPI jobs. For Aster, I know that you need to edit /opt/aster101/etc/codeaster/aster-mpihosts and maybe /opt/aster101/etc/codeaster/asrun.
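For codes that are launched directly through Open MPI, the list of machines is usually given as a hostfile. Below is a minimal sketch, assuming nodes named node0, node1, ... as in step 1 and 4 cores per node (both are assumptions, and the function name mpi_hostfile is made up here). The format that aster-mpihosts expects may differ, so check the Aster documentation before copying this there.

```shell
#!/bin/sh
# Print an Open MPI style hostfile: one line per node with its slot (core) count.
# Arguments: $1 number of nodes, $2 slots per node (example values, adjust to your hardware).
mpi_hostfile() {
    nodes=$1
    slots=$2
    i=0
    while [ "$i" -lt "$nodes" ]; do
        printf 'node%d slots=%d\n' "$i" "$slots"
        i=$((i + 1))
    done
}

# Example: two nodes with four cores each.
mpi_hostfile 2 4
```

Saved to a file, for instance cluster.hosts, this can be tried with `mpirun --hostfile cluster.hosts -np 8 hostname`, which should report the names of both nodes if ssh and MPI are set up correctly.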
I have not tested all these possibilities yet, but I have built several small clusters like this.
Let us know how it goes, as I am pretty sure a lot of people are interested.
- JMB
- Offline
- Elite Member
		- Posts: 166
- Thank you received: 0
			
	
						15 years 1 month ago				#4684
		by JMB
	
	
		
			
	
															
	
				Replied by JMB on topic Re:Caelinux cluster			
			
				Administrator wrote:
"4. You may want to mount a shared folder within your cluster. A simple approach is to mount on each node a common directory stored on node0, using sshfs. On node0:
(The sshfs mount is not restored after a reboot, so you will need to repeat this step or add an entry to /etc/fstab.)"
Hello Admin,
Thanks for the details! I am 3/4 of the way there. One question:
1. Why do you recommend sshfs? Would NFS not suffice as long as everything is within a local subnet? I am currently using NFS mounts within my network and wondered whether I need to set up a separate sub-directory under sshfs.
Regards,
JMB
Post edited by: JMB, at: 2010/09/07 14:49
- Joël Cugnoni
- Offline
- Moderator
			
	
						15 years 1 month ago				#4685
		by Joël Cugnoni
	
	
		
			
					
Joël Cugnoni - a.k.a admin
www.caelinux.com
					
	
															
	
				Replied by Joël Cugnoni on topic Re:Caelinux cluster			
			
				Hi JMB
Actually, NFS is the better choice for system-wide network sharing.
SSHFS is different in that it can be mounted by ordinary users, so it depends on the use case.
If you need permanent shared folders, NFS is the best choice. If you just want a "quick and dirty" network share for a single parallel run, sshfs is more flexible and does not require admin rights.
The performance of NFS is probably also much higher than that of sshfs.
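For a permanent NFS share, the setup boils down to one line in /etc/exports on the server and one line in /etc/fstab on each client. Here is a minimal sketch that only prints those lines. The path /home/youruser/shared, the 192.168.1.0/24 subnet and the function names are assumptions for illustration, and it presumes the NFS server and client packages are already installed.

```shell
#!/bin/sh
# Line for /etc/exports on the server (node0): share directory $1 read-write with subnet $2.
nfs_export_line() {
    printf '%s %s(rw,sync,no_subtree_check)\n' "$1" "$2"
}

# Line for /etc/fstab on each client: mount $1:$2 at the same path $2.
nfs_fstab_line() {
    printf '%s:%s %s nfs defaults 0 0\n' "$1" "$2" "$2"
}

# Example values (assumed): shared folder in the common user's home, private /24 subnet.
nfs_export_line /home/youruser/shared 192.168.1.0/24
nfs_fstab_line node0 /home/youruser/shared
```

After editing /etc/exports, running `exportfs -ra` as root on node0 reloads the export table, and `mount -a` on the clients mounts the new fstab entry.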
Keep us posted about your progress.
