//Perceus Debian Package 1.6 Tutorial/Report
//This report mainly covers the features of Perceus
//and explain why it should be incorporated into the
//BDUC cluster. This report will also provide tutorials
//and examples to set up Perceus.
//Run and create html :
// export fileroot="/home/hjm/nacs/Perceus-Report"; asciidoc -a toc -a toclevels=4 -a numbered ${fileroot}.txt; scp ${fileroot}.[ht]* moo:~/public_html;
// export fileroot="/home/anthony/Dropbox/Work/Perceus-Report"; asciidoc -a toc -a toclevels=4 -a numbered ${fileroot}.txt;
// scp ${fileroot}.txt hmangala@claw1:~/bduc/trunk/sge; ssh hmangala@bduc-login 'cd ~/bduc/trunk/sge; svn update; svn commit -m "new mods to Perceus HOWTO"'
The Perceus Provisioning System
===============================
by Anthony Vuong <[email protected]> & Harry Mangalam <[email protected]>
v1.14, Jan 21, 2012
//(thanks to Kaz Okayasu for the loan of the Gb switch)
//This section will give a detailed description as to what Perceus has to offer.
//There will be detailed information about the exclusive features of Perceus.
//Also show how reliable, flexible, and scalable it is.
What's Perceus?
---------------
http://perceus.org/[Perceus] (Provision Enterprise Resources & Clusters Enabling Uniform Systems)
is an Open Source provisioning system for Linux clusters developed by the creators of http://en.wikipedia.org/wiki/Warewulf[Warewulf], of which Perceus is the successor.
Perceus typically runs as a server process on an administrative node of a cluster and provides the Operating System to requesting nodes via the network. It is optimized for stateless systems - those in which the OS is not resident on-disk, but freshly net-booted at each startup - but can also provision 'stateful' systems in which the OS is written to the disk on the nodes. It can provision nodes to be completely homogeneous (as would be required for a compute cluster) or fairly heterogeneous, as for sets of application servers where each OS image is tuned to a particular service, such as compute servers, storage nodes, or interactive nodes. Perceus also provides utilities to modify the client OS images and push out changes, either immediately via rsync or saved to the client image to be picked up at the next reboot.
As befits a tool for handling thousands of nodes, Perceus handles most things automatically. It will detect unidentified MAC addresses in the private network, add them into the default Perceus group, and provision a default image.
Perceus can also set specific configurations for certain nodes based on MAC address.
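For example, using the 'perceus' command-line utilities demonstrated later in this document (the group and node names here are hypothetical):
------------------------------------------------------
# list the nodes Perceus has auto-detected, with their MAC-based NodeIDs
perceus node summary
# pin specific nodes (identified by MAC) to their own group, and give
# that group its own VNFS
perceus node set group bigmem n105 n106
perceus group set vnfs bigmem.vnfs bigmem
------------------------------------------------------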
It should be noted that Perceus is well supported by http://www.infiscale.com/[Infiscale], the company formed to commercialize it. There is a http://altruistic.infiscale.org/docs/perceus-userguide1.6.pdf[User Guide for Perceus 1.6] that should be used as the definitive introduction and guide to Perceus. This document varies from that one in that it is more closely focused on installing Perceus on Debian-based systems and then integrating the Perceus server and the provisioned cluster with an already existing NIS/NFS/Kerberos campus system. This probably represents a large proportion of how Perceus will be installed.
//Perceus, Rocks, Univaud, LSTP.org
Operating System Support
------------------------
Currently, Perceus supports the following operating systems for both server and clients.
- http://www.debian.org/[Debian] and Debian-derived distributions such as http://www.ubuntu.com/[Ubuntu]. This report will describe the configuration of a Ubuntu 10.04(LTS)-based cluster. mailto:[email protected][Tim Copeland] <[email protected]> has recently provided a http://perceus.criteriondigital.com/[Debian-derivation combination genchroot script] that combines handling a number of Debian, Ubuntu, and Mint distros into one script. It generates a chroot package that can be imported into Perceus as described below.
- http://www.redhat.com/rhel/[Red Hat Enterprise Linux 5] and newer, and other RPM systems such as http://fedoraproject.org/[Fedora] and http://www.centos.org/[CentOS].
Perceus.org provides http://www.perceus.org/site/html/documentation.html[Quickstart Guides] for some of these OSs.
Perceus supplies the client nodes with their OSs in 'modules': a stripped-down version of the OS (no Desktop GUI, minimal libraries & utilities, no applications) that provides the base functionality.
The OS modules included with Perceus (from Infiscale) are:
- GravityOS - a Debian-based Linux distribution.
- Caos NSA - an RPM-based Linux distribution that focuses on high performance computing.
- some others have been made available http://caos.osuosl.org/Perceus/vnfs/premade_vnfs/[from the OSUOSL labs]
- still others can be generated by the http://caos.osuosl.org/Perceus/vnfs/creation_scripts/[alternative genchroot scripts] that have been made available by Infiscale and others.
- as noted, the http://perceus.criteriondigital.com/genchroot.html[master Debian genchroot script from CriterionDigital] will generate multiple versions derived from Ubuntu, Debian, and Mint.
A Perceus Glossary
------------------
Since the Perceus approach is somewhat different from the typical static-OS-on-disk approach, we'll dedicate some space to defining some Perceus terms:
Stateful Provisioning
~~~~~~~~~~~~~~~~~~~~~
In most computers, the OS is installed on the local disk. Stateful provisioning is the process of obtaining an OS image from a server and installing that OS to the local disk of the client node. This approach is useful when bandwidth is extremely limited or changes in OS image are expected to be infrequent. This is provided in Perceus v1.6 and above.
Stateless Provisioning
~~~~~~~~~~~~~~~~~~~~~~
This is the opposite of stateful provisioning. Instead of installing the OS on the local disk, the provided image is
installed into RAM, which allows the hard disk space to be used for something else. The advantages of this approach are that the OS image is refreshed at each reboot and that any changes to the server-hosted image are propagated to each node. It also saves some disk space on each node (the nodes could be completely diskless, as long as the work loads did not require fast swap).
Virtual Node File System (VNFS)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The VNFS is essentially just a disk image
that is provisioned to the nodes. The VNFS is split into 4 parts: VNFS
capsule, VNFS rootfs, VNFS config files, and VNFS image.
[[vnfscapsule]]
VNFS Capsule
^^^^^^^^^^^^
A VNFS capsule is a compressed base package of the OS. While you can make your own capsules, Perceus supplies sample capsules of GravityOS and CAOS which are generally sufficient for real world use.
VNFS rootfs
^^^^^^^^^^^
Once you've 'imported', uncompressed, and mounted a VNFS capsule on the server using the 'perceus' utility commands, you can access the files of the image and make changes to the image. This appears as a complete root filesystem to a user on the server and can be cd'ed into, edited, upgraded as a chrooted filesystem, etc.
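A typical edit cycle uses the 'perceus vnfs' subcommands covered later in this document (the VNFS name here is an example):
------------------------------------------------------
sudo perceus vnfs mount gravityos-base    # exposes the rootfs under /mnt/gravityos-base
sudo chroot /mnt/gravityos-base           # work inside the image as if it were live
# ... edit configs, apt-get install packages, etc ...
exit                                      # leave the chroot
sudo perceus vnfs umount gravityos-base   # writes out & re-compresses the modified image
------------------------------------------------------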
VNFS Config files
^^^^^^^^^^^^^^^^^
These files:
------------------------------------------------------------
close* configure* livesync* master-includes nodescripts/ umount* vnfs.img
config hybridize livesync.skip mount* rootfs/ vmlinuz
------------------------------------------------------------
are located at the top of the VNFS file tree, typically at '/etc/perceus/vnfs/<VNFSNAME>', and describe various options, conditions, filesystem mounts, etc. for each VNFS. Because of this, each VNFS can be configured quite differently from any other.
VNFS Image
^^^^^^^^^^
The VNFS image is the actual image that is provisioned to the
nodes. Once you mount and configure the VNFS rootfs, you have to unmount it to
update the VNFS image.
Modules
~~~~~~~
Perceus provides utilities that can import and load modules onto
its nodes.
Import Nodes From Other Sources
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Perceus can also import node
definitions that are used with other provisioning systems such as Rocks, Warewulf,
and OSCAR (done with a simple 'import' command).
The Provisioning Process
------------------------
Step by step
~~~~~~~~~~~~
As noted above, Perceus is a client/server system; in stateless mode, the clients request their entire OS from the Perceus server.
We assume the server is up and running with a Perceus daemon running and listening to the private interface.
. The client node is booted and requests an OS via PXE-boot.
. The Perceus daemon responds with the first stage boot image, which loads the very lightweight Perceus client daemon.
. The client daemon configures the node by obtaining a DHCP-provided IP address and completing the PXE-boot process.
. The client requests a VNFS capsule and preps the in-RAM filesystem to load the OS.
. Once the new kernel boots, the Perceus client daemon is purged and the RAM returned to the system.
. The system runs as normal, starting whatever services and mounting whatever filesystems the VNFS is configured to do.
Some Issues
~~~~~~~~~~~
Since in stateless mode the OS is net-booted each time, the node can run 'entirely diskless'. However, this requires that it have sufficient RAM to hold not only the OS, all associated filesystems, and all of the application and user code in memory, but also that it never hits swap (since it doesn't have any), and the only local working space ('/scratch') is RAM-based.
Our nodes do have disks, but we partition them into a swap partition (since some of our codes balloon to fairly large sizes) and a '/scratch' partition so that user data can be pre-staged to avoid consuming lots of network bandwidth during a run. We provide a Perl init script for the VNFS, http://moo.nac.uci.edu/~hjm/format_mount.pl[format_mount.pl], that takes care of detecting, partitioning, and mkfs'ing the disk prior to the node being made available to users.
//Bit Rot, Fresh OS Image
Why Perceus for the BDUC cluster
--------------------------------
The BDUC nodes currently use stateful provisioning through TFTP. The nodes get an entire OS initially (including many of the libs, utils, & apps) and until the OS is manually refreshed, it does not undergo any updates. Currently, these nodes are suffering from 'bit rot': the gradual drift from a standard installation to one that is at variance with the expected image, for many reasons. Chief among them is a node going down and missing an update or cluster-wide installation. This 'bit rot' is typically handled on a case-by-case basis, which can entail significant admin time.
Perceus addresses the 'bit rot' problem with a 'livesync' or (worst case) a simple reboot. All installation and configuration tasks are handled by the Perceus server. If we need to make a quick change within a node OS, we can just modify the VNFS rootfs and push the image changes to the cluster instead of going into every node and making the change.
Apart from these Perceus features, we can more efficiently use the nodes' local disks as scratch space since the OS will not have to reside on-disk, making BDUC more efficient at no additional cost and with less manual intervention.
mailto:[email protected][Prakashan Korambath] and mailto:[email protected][Kejian Jin] provided similar arguments for http://moo.nac.uci.edu/~hjm/ucla_perceus_test.pdf[using Perceus on UCLA's Hoffman2 cluster].
Getting Started
---------------
We will describe the Perceus installation and configuration for both a 'minimal setup' similar to the one described in the http://altruistic.infiscale.org/docs/perceus-userguide1.6.pdf[Perceus User Guide] and a 'Production Setup' we will use to 'append' a Perceus cluster to our current BDUC production cluster. The main difference between the two is that the basic setup will allow only 'root' login to the nodes unless another user is added in the VNFS. The production version allows BDUC users to login to the Perceus-provisioned nodes and transparently access their files on the BDUC cluster via integration with the BDUC http://en.wikipedia.org/wiki/Network_Information_Service[NIS]/http://en.wikipedia.org/wiki/Network_File_System_(protocol)[NFS] and http://en.wikipedia.org/wiki/Kerberos_(protocol)[Kerberos] system.
[[perceuscomponents]]
Perceus Components
~~~~~~~~~~~~~~~~~~
Hardware
^^^^^^^^
Perceus requires minimal hardware to test. It requires only:
- A 'private network'. This can be as few as 1 node connected to a small switch or hub. The faster the network hardware the better, but it can be as slow as 10Mb. We used a 24-port Netgear Gigabit Switch.
- At least 1 'Perceus Master Server' with 2 interfaces, one for the external Internet and one facing a private network that services the cluster. For testing purposes, the Perceus server can be a small, slow machine; the most important parts of it are the speed of the network adapters, although the CPU speed is relevant when compressing a modified VNFS. We used an AMD dual-core Opteron @ 1.4GHz, 4GB RAM, 60GB IDE HD, Ubuntu 10.04 (AMD64) Desktop OS, 2x Broadcom 1Gb interfaces.
- At least 1 node whose BIOS has been configured to 'PXE-boot'. It can also be a small, slow node, but it has to have enough RAM to hold the OS; we'd recommend no less than 1GB. We used 2 nodes, each having 2 AMD Opterons @ 2.4GHz, 8GB RAM, 320GB SATA HD, 2x Broadcom 1Gb interfaces (only 1 used).
- A link:#vnfscapsule[VNFS capsule] containing the node OS to be provisioned from the server to the node. We used the Debian-derived gravityos module.
This is the simplest Perceus configuration; you can also use multiple Perceus servers with local or nonlocal shared filesystems. For example, in a production cluster, a single hefty server could be used as the login/head node, the Perceus server, and the storage server, although this puts a lot of eggs in a single basket. An alternative is to keep the head/login node separate and put the Perceus and storage server on the same node. In the following schema, we will use a single server for everything; the rationale is that if one of the parts goes down, most of the cluster functionality is lost anyway.
Network Configuration
^^^^^^^^^^^^^^^^^^^^^
Install a Debian-based Linux OS if you haven't done so. As stated above, we installed Ubuntu 10.04(LTS) Desktop (AMD64) on our Perceus server to take advantage of the GUI tools. Obviously the Desktop version isn't necessary (and there are good reasons not to use it).
Since you'll need an operating network to update the OS and obtain the optional packages, let's address the network configuration first. We've never had a good experience with any default Network Manager. The alternative is to edit the '/etc/network/interfaces' file by hand.
Our '/etc/network/interfaces' file:
-----------------------------------------------------------------------------
auto lo
iface lo inet loopback
# The primary network interface
auto eth0
iface eth0 inet static
address 128.200.34.147
netmask 255.255.255.0
network 128.200.34.0
broadcast 128.200.34.255
gateway 128.200.34.1
# dns-* options are implemented by the resolvconf package, if installed
dns-nameservers 128.200.1.201
auto eth1
iface eth1 inet static
address 192.168.1.1
netmask 255.255.255.0
network 192.168.1.0
broadcast 192.168.1.255
# if you want the cluster nodes to be able to see the
# public internet, include the following 2 lines
gateway 128.200.34.1
dns-nameservers 128.200.1.201
-----------------------------------------------------------------------------
We also need to use IP Masquerade to enable the private *192.168.1.0* network to communicate with the public *128.200.34.0* network and gain access to the outside world. You can directly manipulate iptables to make this configuration (a sketch is shown below), but we chose to use 'guidedog' (part of the KDE Desktop), which accomplished this transparently.
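If you'd rather not pull in KDE just for 'guidedog', the following iptables commands should accomplish the same masquerading (a minimal sketch, assuming eth0 is the public and eth1 the private interface, as above):
------------------------------------------------------
# enable IP forwarding between the interfaces
echo 1 > /proc/sys/net/ipv4/ip_forward
# masquerade traffic from the private net out the public interface
iptables -t nat -A POSTROUTING -s 192.168.1.0/24 -o eth0 -j MASQUERADE
# allow forwarding from private to public, and established replies back
iptables -A FORWARD -i eth1 -o eth0 -j ACCEPT
iptables -A FORWARD -i eth0 -o eth1 -m state --state RELATED,ESTABLISHED -j ACCEPT
------------------------------------------------------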
Restart the network to activate the new configurations and check that the OS thinks everything is fine.
-----------------------------------------------------------------------------
/etc/init.d/networking restart
ifconfig
# should dump a configuration that shows that eth0 is assigned 128.200.34.147
# and eth1 is assigned 192.168.1.1; you should now be able to ping
# remote hosts
ping www.google.com
PING www.l.google.com (66.102.7.99) 56(84) bytes of data.
64 bytes from lax04s01-in-f99.1e100.net (66.102.7.99): icmp_seq=1 ttl=53 time=2.94 ms
64 bytes from lax04s01-in-f99.1e100.net (66.102.7.99): icmp_seq=2 ttl=53 time=4.46 ms
... etc ...
-----------------------------------------------------------------------------
Software
^^^^^^^^
The Perceus server will need the following packages and files to run Perceus. Since we'll be running Perceus on a Ubuntu server, the packages are referenced using the Ubuntu deb names. The client nodes need nothing, of course, since they will be fully provisioned by Perceus. The nodes do need to be of recent enough vintage that they can be configured to PXE-boot, which is set using the BIOS configuration (this unfortunately requires you to boot each node into the BIOS configuration screens one time to set it via the 'Boot' or 'Startup' screens).
Here are the packages and files needed to run the basic Perceus on the Ubuntu 10.04(LTS) server.
Files (not part of a Ubuntu distribution).
- http://altruistic.infiscale.org/deb/perceus16.deb[Perceus Version 1.6 Debian Package]
- http://altruistic.infiscale.org/~ian/gravityos-base.vnfs[gravityos (base VNFS Image)]
- http://moo.nac.uci.edu/~hjm/format_mount.pl[format_mount.pl] - locally written disk-formatting utility.
Deb packages (and dependencies, if not noted explicitly).
----------------------------------------------------------------
libnet-daemon-perl nfs-kernel-server
libnet-pcap-perl nasm
libplrpc-perl perl
libunix-syslog-perl libdbi-perl
libyaml-perl libio-interface-perl
libyaml-syck-perl libnet-arp-perl
openssh-server
guidedog # depends on KDE; could also manipulate iptables directly
----------------------------------------------------------------
Install them all with:
----------------------------------------------------------------
sudo apt-get install libnet-daemon-perl nfs-kernel-server \
libnet-pcap-perl nasm libplrpc-perl perl libunix-syslog-perl \
libdbi-perl libyaml-perl libio-interface-perl libyaml-syck-perl \
libnet-arp-perl openssh-server guidedog
----------------------------------------------------------------
Installing and Configuring Perceus
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Getting the necessary Perceus-specific packages and files to install Perceus on the main server:
-----------------------------------------------------------------------------
cd ~
mkdir perceus-dist
cd perceus-dist
# now get the required debs from Infiscale
wget http://altruistic.infiscale.org/deb/perceus16.deb
wget http://altruistic.infiscale.org/~ian/gravityos-base.vnfs
# install Perceus
sudo dpkg -i perceus16.deb
# and start it.
sudo perceus start
-----------------------------------------------------------------------------
When 'perceus start' executes for the first time, it will ask some questions about how you want to configure the cluster. The questions are quite straightforward and usually the default answer is acceptable. The following demonstrates the questions, with comments prefixed by '##'. Accepting the default is designated by '<Enter>'.
-----------------------------------------------------------------------------
Do you wish to have Perceus do a complete system initialization (yes/no)? yes
What IP address should the node boot address range start at?
(192.168.1.192)> 192.168.1.11
## the private net is going to be used ONLY for the cluster, so we only
## reserve the 1st 10 addresses for special-purpose servers.
What IP address should the node boot address range end at?
(192.168.1.254)> <Enter>
What domain name should be appended to the DNS records for each entry in
DNS? This won't require you to specify the domain for DNS lookups, but it
prevents conflicts from other non-local hostnames.
(nac.uci.edu)> <Enter>
## Perceus determines what local net you're on
What device should the booting node direct its console output to? Typically
this would be set to 'tty0' unless you are monitoring your nodes over the
serial port. A typical serial port option might be 'ttyS0,115200'.
note: This is a global option which will affect all booting nodes.
(tty0)> <Enter>
Creating Perceus ssh keys
Generating public/private dsa key pair.
Your identification has been saved in /root/.ssh/perceus.
Your public key has been saved in /root/.ssh/perceus.pub.
The key fingerprint is:
cb:4e:bb:ee:6c:95:65:f9:a4:89:23:a7:f6:de:23:63 root@flip
The key's randomart image is:
+--[ DSA 1024]----+
| .. |
| . . o. |
| . + o . . |
| +. + |
| o S . + |
| . . o o o |
| . + .. |
| .E+.o. |
| ..oooo |
+-----------------+
Created Perceus ssh host keys
Created Perceus ssh rsa host keys
Created Perceus ssh dsa host keys
Perceus is now ready to begin provisioning your cluster!
## pretty easy, no?
-----------------------------------------------------------------------------
Importing a VNFS Capsule to Perceus
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
At this point, we'll be importing a VNFS capsule created by the developers of
Perceus. The VNFS capsule includes a Debian-based OS image of 'gravityos'.
Locate the 'gravityos-base.vnfs' OS capsule that you just downloaded and 'import' it using the following shell command.
-----------------------------------------------------------------------------
sudo perceus vnfs import /path/to/gravityos-base.vnfs
-----------------------------------------------------------------------------
After importing the capsule, there will be a prompt asking you to create a root
password for the VNFS image, gravityos-base in this case. This will be
'your only login' for the basic node setup unless other users are added later on.
There will also be a series of configuration questions (mostly network) regarding the VNFS image. These questions are straightforward; we will add more details in later versions, if necessary.
Your modified VNFS files are located in '/etc/perceus/vnfs'; the rest of the Perceus configuration files are in '/etc/perceus'.
The file '/etc/perceus/dnsmasq.conf' is automatically configured based on answers provided during the installation process. If you misconfigured any network settings and need to fix them, this is the file to check. You'll also find the dhcp boot range (IP addresses provisioned to nodes) for the nodes here.
The file '/etc/perceus/defaults.conf' holds the default set of configurations used to provision nodes that were not explicitly identified in the Perceus cluster, including giving a default image to an unidentified node. The settings found in this configuration file include:
- the default image
- the starting IP address of the client nodes
- the default group
In the same '/etc/perceus/defaults.conf', set "Vnfs Name = NAMEOFOS-base". In our test cluster, we set it to "Vnfs Name = gravityos-base".
The file '/etc/perceus/perceus.conf' is also automatically configured by Perceus during the installation process. Make sure the master network device is the ethernet port for the private network ('eth1' in our case) and the VNFS transfer method is 'nfs'.
Now power-on the nodes and the Perceus server should provide default
settings and add new nodes to its database. This ends the basic setup of
Perceus.
When the provisioning is complete you should have a set of nodes that starts from the starting IP# and increases up to the maximum number you set. You should be able to login to the nodes at the console as 'root' (and only as 'root'). You should also be able to ssh to the nodes as 'root' from the Perceus master and poke around to verify that the node is a true compute node. Adding other user names is covered below in the 'Production Setup'.
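A quick verification pass from the Perceus master might look like this (a sketch; the node name depends on the naming scheme set in 'defaults.conf'):
------------------------------------------------------
# list the nodes Perceus has detected, with their groups and VNFS assignments
sudo perceus node summary
# log into a node as root, using the Perceus-generated ssh keys
ssh root@n0000 uname -a
------------------------------------------------------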
Reconfiguration of the VNFS
---------------------------
Once the Perceus clients are up and running, you will soon discover that you need other services and filesystem mounts available. While you can make these changes on a live node to verify that they work correctly, in order to make these changes permanent, you'll have to make the changes on the Perceus-imported VNFS image on the Perceus server and then mirror the changes to the image. In most cases, it's sufficient to make the changes in the image and then 'livesync' the changes to the cluster, but you should designate 1 node as a test target and test the new image against that target before any changes are launched cluster-wide.
This is where having a fast Perceus server WILL make a difference, since the ~200MB image has to be processed and compressed each time it's written out. While it's a bit of a pain, the best approach is as described above - test the changes on a live node and then immediately reiterate the change on the mounted image, then test the change in the image by rsyncing it to a designated 'stunt node'.
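For example, a targeted test push to a single 'stunt node' might look like this (a sketch, assuming the VNFS is still mounted at '/mnt/gravityos-base' and 'n101' is the designated test node):
------------------------------------------------------
# push just the modified /etc tree from the mounted image to the test node
sudo rsync -av /mnt/gravityos-base/etc/ root@n101:/etc/
# once the test node behaves, livesync the whole image cluster-wide
sudo perceus vnfs livesync gravityos-base
------------------------------------------------------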
[[productionsetup]]
Perceus Production Setup with BDUC
----------------------------------
The additional features for the 'Production Setup' are:
- Kerberos (network authentication) to allow transparent centralized authorization to the cluster.
- NIS (Network Information Service) to allow transparent user login to any node after Kerberos authorization
- NFS (Network File System), which in conjunction with NIS allows users to access their files from any node in the cluster.
- autofs / automount - allows the remote filesystems to be mounted on demand and unmounted when idle to prevent stale/locked NFS mounts.
- 'format_mount.pl' - detects, partitions, mkswap's, and mkfs's the node disk to allow 'swap' and '/scratch' to be made & used.
For our cluster, we are using the campus 'Kerberos' server for authorization - ie, the Perceus server is neither the Kerberos server nor the NIS/NFS server, so we can make use of those external services without configuring the Perceus server to supply these additional services; it just has to be configured to consume these services.
To do this, you'll need these additional packages (and dependencies).
----------------------------------------------------------------
krb5-clients autofs5
libpam-krb5 parted
nis krb5-kdc
binutils
#install them all with ..
sudo apt-get install krb5-clients libpam-krb5 nis autofs5 krb5-kdc binutils parted
# NB:
# - the krb5 realm is 'UCI.EDU'
# - the kerberos kdc server is 'kerberos.service.uci.edu'
# - the kerberos admin_server is 'kerberos.service.uci.edu'
# - the NIS domain we want to join is 'YP.bduc.uci.edu' (see bduc-login:/etc/yp.conf)
----------------------------------------------------------------
And you'll need these configuration files from your cluster (BDUC in our case):
// if there's a choice, use the ones from ubuntu 10.04
Files from a NIS/NFS/Kerberos client in your cluster
---------------------------------------------------------
/etc/yp.conf (from a NIS client)
/etc/ypserv.conf
/etc/nsswitch.conf
/etc/krb5.conf
/etc/autofs_ldap_auth.conf
/etc/auto.master
/etc/auto.misc # not used by BDUC
/etc/auto.net # not used by BDUC
/etc/auto.smb # not used by BDUC
---------------------------------------------------------
These simply need to be copied to the same position on the Perceus server (after carefully making backups of the originals). This will allow us to access our campus LDAP server for login information and automount BDUC's user and application filesystems.
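A minimal sketch of that copy, assuming 'bduc-login' is an existing NIS/NFS/Kerberos client you can scp from:
------------------------------------------------------
# as root on the Perceus server
cd /etc
for f in yp.conf ypserv.conf nsswitch.conf krb5.conf autofs_ldap_auth.conf auto.master; do
  [ -f $f ] && cp $f $f.orig        # back up any existing original
  scp bduc-login:/etc/$f /etc/$f    # pull the working copy from the cluster
done
------------------------------------------------------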
Once the files are backed up and copied to the Perceus Server, the services have to be started.
-----------------------------------------------------------------------------
# start the NIS services
sudo /etc/init.d/nis start
# the following line initializes the local Kerberos database.
# REMEMBER the password you set!! (should only need to be done the 1st time)
sudo kdb5_util create -s
# and then start the krb5 services.
sudo /etc/init.d/krb5-kdc start
-----------------------------------------------------------------------------
Sun Grid Engine requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Perceus Master
^^^^^^^^^^^^^^
For the Perceus master to be included usefully in the SGE domain, it must:
- automount the exported SGEROOT
- have the binutils installed (see above)
- be added as an execution host to SGE
- be added as a submission and/or admin host to SGE
- for SGE jobs to be sent to the node, it has to be added to a host group which is servicing jobs (ie: @long).
Perceus Clients
^^^^^^^^^^^^^^^
For the Perceus client nodes to be included in the SGE domain, the vnfs module has to include the same configurations.
The same remote NFS mounts have to be automounted:
- '/home' (mounted over the existing /home, if need be)
- '/sge52'
- '/apps' (automounted on request)
NFS access to cluster files
~~~~~~~~~~~~~~~~~~~~~~~~~~~
To access your cluster files from the Perceus server, you'll need the help of a BDUC admin to modify the '/etc/exports' file on all NFS servers that supply files to BDUC (bduc-login, bduc-sched). The file needs to be edited to allow the Perceus server to mount the exported files. Don't forget to re-export the configuration with 'exportfs -ra' on the NFS servers.
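A hypothetical '/etc/exports' entry for this (the path and address are examples; use the address the NFS server actually sees):
------------------------------------------------------
# on each NFS server (bduc-login, bduc-sched):
/home   128.200.34.147(rw,sync,no_subtree_check)
# then re-export without a restart
exportfs -ra
------------------------------------------------------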
Finally, test whether our Perceus server is connected to our NIS master on BDUC (bduc-sched) by executing the following on the Perceus server:
-----------------------------------------------------------------------------
ypcat passwd
<dumps yp passwd info>
-----------------------------------------------------------------------------
This command listed login information and directories of all users in the BDUC cluster, so it was a success!
Once you restart the networking on the Perceus server, you should be able to ssh to the Perceus main server with your UCINetID username and password and be able to read/write your BDUC files as if you were logged into BDUC.
We're now done with the Perceus server; now onto the client nodes.
Configuring the Perceus Clients
-------------------------------
In order for the Perceus clients to gain these same abilities, the above configuration files have to be copied to the chrooted, live-mounted VNFS image in the same location (/etc/..), and the same debs have to be chroot-installed into that image as described on pages 22-23 in the http://altruistic.infiscale.org/docs/perceus-userguide1.6.pdf[Perceus User Guide].
Mount the VNFS
~~~~~~~~~~~~~~
On the Perceus server, we have to mount the VNFS image using
-----------------------------------------------------------------------------
sudo perceus vnfs mount gravityos-base # or the name of the VNFS pkg used
-----------------------------------------------------------------------------
The mount directory is '/mnt/gravityos-base'.
We'll have to 'chroot' into the directory so we can install packages in the
now-live image.
-----------------------------------------------------------------------------
sudo chroot /mnt/gravityos-base
-----------------------------------------------------------------------------
Install & Configure the Debs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
As with the server, here are the packages we'll need to install for the VNFS image:
----------------------------------------------------------------
krb5-kdc nis
krb5-clients autofs5
libpam-krb5 parted
#install them all (still in the chroot, where you're already root) with ..
apt-get install krb5-kdc krb5-clients libpam-krb5 nis \
autofs5 parted
# and exit the chroot
exit
----------------------------------------------------------------
Repeat the NIS and Kerberos configuration file copying as described in the production setup above. Essentially, you have to copy those files from the Perceus server to the VNFS image.
----------------------------------------------------------------
# as root on the Perceus master
cd /etc
cp yp.conf ypserv.conf nsswitch.conf krb5.conf autofs_ldap_auth.conf auto.master /mnt/gravityos-base/etc
# have to check that the autofs file is chmod'ed correctly so that
# the owner (root) can rw
chmod u+rw /mnt/gravityos-base/etc/autofs_ldap_auth.conf
# and group and other can't rwx
chmod og-rwx /mnt/gravityos-base/etc/autofs_ldap_auth.conf
----------------------------------------------------------------
Once those files are copied and you verify that the init scripts are in place in the VNFS, you have to push those changes to the nodes. This can be done via the 'livesync' option or by the entire export/reboot process. The 'livesync' is much faster and involves using ssh and rsync to push all changes to the nodes while they're still live.
----------------------------------------------------------------
sudo perceus vnfs livesync gravityos-base
----------------------------------------------------------------
Eventually, you'll have to unmount the VNFS (which results in a significant delay as the image has to be compressed) and then reboot your test client to verify that it works from the ground state.
The umount is done via a specific perceus command:
----------------------------------------------------------------
sudo perceus vnfs umount gravityos-base
----------------------------------------------------------------
and you can 'export' it to save it as a backup or to make it available to others:
----------------------------------------------------------------
sudo perceus vnfs export gravityos-base /path/to/gravityos-base_<mod_date>.vnfs
----------------------------------------------------------------
As above, check that after a network restart, the client node can automatically communicate to the campus Kerberos server and the BDUC NIS/NFS servers.
Automatic disk processing
~~~~~~~~~~~~~~~~~~~~~~~~~
The next step is to convince the 'format_mount.pl' script to execute during boot
time by incorporating it into the init script sequence.
Get the file http://moo.nac.uci.edu/~hjm/format_mount.pl[format_mount.pl] and chmod it so it becomes executable.
-----------------------------------------------------------------------------
chmod +x format_mount.pl
-----------------------------------------------------------------------------
Copy it to the /bin directory of the VNFS image. (You'll have to re-mount the image if you've unmounted it.)
-----------------------------------------------------------------------------
sudo perceus vnfs mount gravityos-base
sudo cp /path/to/format_mount.pl /mnt/gravityos-base/bin
-----------------------------------------------------------------------------
Now we need to edit '/etc/rc.local' to pass the appropriate arguments to the script so it
runs at boot time. It runs quite late in the boot sequence, but since only users will be
accessing the swap and scratch partitions, it won't affect other processes.
Add this line to the VNFS's '/etc/rc.local' file ('/mnt/gravityos-base/etc/rc.local').
Make sure it's above the line that executes 'exit 0'.
-----------------------------------------------------------------------------
...
# we determined that the VNFS detected the disk as '/dev/sda' by examining
# dmesg output on 1st boot.
format_mount.pl sda 8000 xfs NODEBUG
exit 0
-----------------------------------------------------------------------------
Now save and compress the modified image by unmounting the VNFS image:
-----------------------------------------------------------------------------
sudo perceus vnfs umount gravityos-base
-----------------------------------------------------------------------------
Once that's finished, reboot the stunt node to test, and if it appears with new 'swap' and a '/scratch' dir, the production setup is complete.
//Anthony edit begin
//hjm edit
Integrating Perceus with the existing BDUC Environment
-------------------------------------------------------
We have a production cluster (BDUC) and can't bring it down for several days to re-do it as a Perceus cluster. We are therefore integrating a small (25 node) Perceus cluster with the existing CentOS cluster, and when we've debugged it to the point where it behaves, we'll flip the entire cluster over a weekend day. As noted, we already have a small Ubuntu-based sub-cluster integrated with the main cluster, so using a Debian-based distro won't be a completely new experience.
Our new Perceus master server, _claw1_, has the _Ubuntu_ 10.04 distribution installed. We've already installed *Perceus 1.6* using the Debian package downloaded from the http://www.perceus.org/site/html/download.html[main website].
Integrating Perceus into our production environment is not hard. Using the tutorial/process above, we discuss only the differences below.
New Applications required
~~~~~~~~~~~~~~~~~~~~~~~~~
We needed to install TCL to support http://modules.sf.net[Modules]:
------------------------------------------------------
# as root
perceus vnfs mount percdebian
chroot /mnt/percdebian
apt-get install tcl8.3
------------------------------------------------------
Hardware Changes
~~~~~~~~~~~~~~~~
There are very few hardware changes needed to support Perceus. The only notable changes are to set the BIOS to request a netboot on the correct interface (the source of a number of initial errors) and to change the BIOS Chipset section so that the EDAC system will record ECC errors. This is BIOS-specific, so we'll not go into it in depth.
Perceus Configuration file changes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/etc/perceus/perceus.conf
^^^^^^^^^^^^^^^^^^^^^^^^^
Originally, the controlport and VNFS transfer master were set to localhost.
Since claw1 has both a public and a private IP address, we don't want Perceus to use the wrong ethernet port, so we explicitly replaced localhost with the private network IP address.
------------------------------------------------------
master network device = eth0
vnfs transfer method = nfs
vnfs transfer master = 10.255.78.5
vnfs transfer prefix =
database type = btree
database server = localhost
database name = perceus
database user = db user
database pass = db pass
node timeout = 600
controlport bind address = 10.255.78.5
controlport allow address = 10.255.78.5
------------------------------------------------------
/etc/perceus/defaults.conf
^^^^^^^^^^^^^^^^^^^^^^^^^^
------------------------------------------------------
Node Name = n###
Group Name =
Vnfs Name = percdebian
Enabled = 1
First Node = 101
------------------------------------------------------
/etc/perceus/dnsmasq.conf
^^^^^^^^^^^^^^^^^^^^^^^^^
This file is generated after running *sudo perceus init*. You shouldn't have to modify anything here besides the dhcp-range, if needed.
------------------------------------------------------
interface=eth0
enable-tftp
tftp-root=/usr/var/lib/perceus/tftp
dhcp-option=vendor:Etherboot,60,"Etherboot"
dhcp-boot=pxelinux.0
local=/
domain=bduc
expand-hosts
dhcp-range=10.255.78.100,10.255.78.254
dhcp-lease-max=21600
read-ethers
------------------------------------------------------
Note that more DNS changes are link:#DNS[noted below].
/etc/fstab
^^^^^^^^^^
We have to configure the VNFS capsule to contact 10.255.78.5, the private IP address of 'claw1'. Following the tutorial above, mount the VNFS capsule, chroot into the directory, and edit the '/etc/fstab' file. We need 2 NFS mounts from the Perceus master: the shared Perceus lib and the master's '/usr' tree, which provides apps and libs without increasing the size of the VNFS.
The modifications should be similar to this:
------------------------------------------------------
# the perceus shared dir
10.255.78.5:/usr/var/lib/perceus /usr/var/lib/perceus nfs ro,soft,bg 0 0
# the claw1 /usr tree to share apps, libs with nodes (see text)
10.255.78.5:/usr /u nfs ro,soft,bg 0 0
------------------------------------------------------
These are 'permanent mounts' with the accompanying pros (easy) and cons (will fail if claw1 NFS server locks up). We may switch to the 'automount' process that we use for most other NFS mounts on the cluster if we have trouble with the permanent mounts.
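For reference, the automount alternative would look something like this (a sketch; the map file name 'auto.direct' is our choice, not a requirement):
------------------------------------------------------
# /etc/auto.master: add a direct map
/-    /etc/auto.direct
# /etc/auto.direct: mount claw1's /usr on /u on demand, read-only
/u    -ro,soft,intr    10.255.78.5:/usr
------------------------------------------------------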
/etc/profile
^^^^^^^^^^^^
We have to add some paths pointing into the above '/u' to allow the Perceus nodes to find the apps and libs it provides.
------------------------------------------------------
# near top of file
if [ "`id -u`" -eq 0 ]; then
    PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/u/bin:/u/local/bin:/u/sbin"
else
    PATH="/usr/local/bin:/usr/bin:/bin:/u/bin:/u/local/bin"
fi
export PATH
# ... set other non-Perceus-related things
# then set the LD_LIBRARY_PATH to direct to NFS mounts
export LD_LIBRARY_PATH=/lib:/usr/lib:/usr/local/lib:/u/lib:/u/local/lib:/opt/lib:$LD_LIBRARY_PATH
# and do the same thing for a variety of other ENV-related variables (non-exhaustive)
export PERL5LIB=/u/share/perl5
export PYTHONPATH=/u/lib/python2.6
------------------------------------------------------
*SGE* wasn't automounting at all. We discovered the problem to be that the nodes couldn't contact _bduc-sched_, which controls the automount feature for SGE. It seems that the Perceus client nodes were trying to contact bduc-sched through its public IP address,
which they cannot "see". To remedy this, we had to modify the /etc/hosts file and define bduc-sched to be its private IP address.
/etc/hosts
^^^^^^^^^^
To allow 'SGE' to automount correctly from 'bduc-sched', we had to add the private IP number to the nodes' '/etc/hosts' file, along with 'claw1'.
------------------------------------------------------
127.0.0.1 localhost.localdomain localhost
10.255.78.5 bduc-claw1.nacs.uci.edu claw1
10.255.78.3 bduc-sched.nacs.uci.edu bduc-sched sched
------------------------------------------------------
VNFS Changes
~~~~~~~~~~~~
To use our preconfigured VNFS capsule from the remote Perceus install, we had to move the VNFS to the new Perceus master, 'claw1'. There are two ways to do this. The first and *recommended* approach is to log onto the old Perceus master and export the VNFS capsule using *sudo perceus vnfs export*. Then copy the file onto the new Perceus master server and import it using *sudo perceus vnfs import*. The other approach is to simply tar up the '/etc/perceus/vnfs/VNFSNAME' directory, copy it onto the new Perceus master server, and extract it (as root) in the '/usr/var/lib/perceus/vnfs' (symlinked to '/etc/perceus/vnfs')
directory.
For our setup, we went with the tarball and it seems to have worked correctly.
Environment and configuration changes:
- modify the vnfs 'ld.so.conf.d' to provide info about the new libs (see the sketch after this list).
- modify the vnfs PATH to include '/u/bin', '/u/local/bin'
- modify vnfs ENV variables for 'locale', etc.
- change the '/etc/apt/sources.list' to use the same Ubuntu sources as the master.
- add the master node's root public key to the VNFS's '/root/.ssh/authorized_keys' so that you'll be able to ssh in without a password.
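The 'ld.so.conf.d' change might look like this (a sketch; the file name 'u-libs.conf' is arbitrary):
------------------------------------------------------
# inside the chrooted VNFS, tell the loader about the NFS-provided lib dirs
cat > /etc/ld.so.conf.d/u-libs.conf <<EOF
/u/lib
/u/local/lib
EOF
# rebuild the loader cache
ldconfig
------------------------------------------------------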
Symbolic Link Changes
~~~~~~~~~~~~~~~~~~~~~
We need to modify any claw1 symlinks that use full paths to avoid redirecting to the node '/usr' tree.
------------------------------------------------------
# ie: we have to change absolute links like this:
#   /usr/lib/libblas.so.3 -> /usr/lib/libblas.so.3.0
# to relative links like this:
#   /usr/lib/libblas.so.3 -> libblas.so.3.0
cd /usr/lib
ln -sf libblas.so.3.0 libblas.so.3
------------------------------------------------------
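A quick way to list the offending links (a sketch using GNU find):
------------------------------------------------------
# show symlinks under /usr/lib whose targets are absolute paths
find /usr/lib -type l -lname '/*' -printf '%p -> %l\n'
------------------------------------------------------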
Testing all Module applications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We now have about 140 'Module'-based apps and libs. Each one has to be run to verify that it works correctly on the new nodes. We suspect that only those that have a specific 'libc' or kernel requirement will fail, but this has to be tested. Those that don't run and can't be addressed with symlinks to existing libs will have to be recompiled.
[[DNS]]
Named / DNS Changes
~~~~~~~~~~~~~~~~~~~
Perceus was designed to run as a 'homogeneous' cluster. Since we're running it as a 'heterogeneous' cluster, that presents us with some DNS problems. While the Perceus nodes know about each other and can resolve external-to-Perceus hosts, the other hosts in the cluster don't know about the Perceus nodes. In order to allow this to happen, the authoritative DNS server for the cluster (on the 'login' node, not 'claw1') has to be explicitly updated with the Perceus node information.
'/etc/resolv.conf' on 'claw1' points to 'bduc-login', which is the authoritative nameserver for the cluster. Because of that designation, we have to provide 'bduc-login' with the correct IP & name mappings so that the other cluster nodes can resolve the Perceus nodes. This is especially important for the SGE scheduler.
To this end, we have written a Python script which watches the 'dhcp-leases' file on the Perceus master and re-writes the 'named' database files on 'bduc-login' if there are any changes.
Named Modification Script
^^^^^^^^^^^^^^^^^^^^^^^^^
This Python script will:
- monitor the Perceus 'dhcpd.leases' file and on a change, will
- re-write the 'named' database files '10.255.78.db' and 'bduc.db' on 'claw1' (in '/var/named/chroot/var/named')
- make backups of those files on 'bduc-login'
- copy the new files into place on 'bduc-login', and
- cause 'named' to re-read the configuration files.
http://moo.nac.uci.edu/~hjm/PerceusNotifier.py[The script is here.]
//Anthony edit end
Adding Perceus nodes to SGE
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Many of the problems we hit with SGE stem from the 'named' problems noted immediately above. The few remaining To Dos are: +
.VNFS Chroot
.. Add the 'sgeexecd' script to the '/etc/init.d/' directory on the VNFS
.. Configure '/etc/init.d/sgeexecd', appending '/u/bin' to its *PATH* variable
.. Install 'sgeexecd' with *update-rc.d sgeexecd defaults*
IMPORTANT: You 'SHOULD NOT' have to make any modifications to the Perceus nodes' '/etc/hosts' file for SGE.
.SGE Side
.. Add the nodes as 'Execution Hosts' with *qconf -ae <template>*. Or use the 'qmon' GUI. It's probably a wash as to speed.
IMPORTANT: Also, add the nodes as 'Submit Hosts' with *qconf -as <node.name>*. This is probably faster with the command line. If they are not added as 'Submit Hosts', they will not be able to run the necessary SGE utilities to support 'sgeexecd'. If this is not done, you'll get this kind of error:
--------------------------------------------------------------------
$ qhost
error: commlib error: access denied (client IP resolved to host name "". This is not identical to clients host name "")
error: unable to contact qmaster using port 536 on host "bduc-sched.nacs.uci.edu"
--------------------------------------------------------------------
which is difficult to debug since it doesn't say anything about 'Submit Hosts'.
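The command-line versions of those two additions look like this (a sketch; the node name is an example):
------------------------------------------------------
# add the node as an execution host; with an argument, an existing exec
# host can be named as the template
qconf -ae
# add the node as a submit host, so sgeexecd's SGE utilities can reach qmaster
qconf -as n101.nacs.uci.edu
------------------------------------------------------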
//For the Debian7 system, we have to add the following debs:
One genchroot to rule them all
------------------------------
Or, how to use Tim Copeland's http://perceus.criteriondigital.com/genchroot.html[mint-buntu-deb_genchroot.sh] script to make new vnfs packages based on Mint, Ubuntu, or straight Debian.
Tim Copeland wrote and is supporting a single genchroot script that downloads and creates vnfs's for the Debian-based distros.
It is very easy to use. Just download and execute it with the correct options. To generate a Debian 7 (Wheezy)-based vnfs, just direct the master script to do so:
------------------------------------------------------
# as root
./mint-buntu-deb_genchroot.sh -D x86_64 -c wheezy -r 7.0 -m
...
# the script auto-names the above definition & puts the capsule in
# '/var/tmp/vnfs/debian-7.0-1.amd64' by default
# then make it stateless or stateful (haven't tried the stateful yet)
./chroot2stateless.sh /var/tmp/vnfs/debian-7.0-1.amd64 \
/usr/var/lib/perceus/opt_vnfs/deb7.stls.vnfs
# new vnfs has to be imported into perceus & modified for local config
perceus vnfs import /usr/var/lib/perceus/opt_vnfs/deb7.stls.vnfs
# .. (answer a few simple questions.)
# ...
# VNFS Configuration Complete
#
# Un-mounting VNFS 'deb7.stls'...
# This will take some time as the image is updated and compressed...
# VNFS 'deb7.stls' has been successfully imported.
# config ends with the vnfs unmounted so it has to be mounted to chroot for
# further mods
# still as root
perceus vnfs mount deb7.stls
chroot /mnt/deb7.stls
# install the infiniband modules, tcl, some utils
apt-get install bzip2 dapl2-utils dialog diffstat ibsim-utils ibutils \
ibverbs-utils infiniband-diags joe less libclass-isa-perl libdapl2 \
libfribidi0 libgdbm3 libibcm1 libibcommon1 libibdm1 libibmad1 \
libibumad1 libibverbs1 libipathverbs1 libmlx4-1 libmthca1 libnewt0.52 \
libopensm2 librdmacm1 libsdp1 libswitch-perl libumad2sim0 \
module-assistant netbase opensm perftest perl-modules \
rdmacm-utils rds-tools sdpnetstat srptools tcl8.4 tclreadline \
whiptail libpam-krb5 nis autofs5 parted sudo
# don't need krb5-kdc or krb5-clients here - the kdc is the server side;
# the nodes only need the client support (libpam-krb5, installed above).
# NB:
# - the krb5 realm is 'UCI.EDU'
# - the kerberos kdc server is 'kerberos.service.uci.edu'
# - the kerberos admin_server is 'kerberos.service.uci.edu'
# - the NIS domain we want to join is 'YP.bduc.uci.edu'(see
# 'bduc-login:/etc/yp.conf')
------------------------------------------------------
And for the Modules system, we need to set the VNFS mount point:
------------------------------------------------------
# as root on the perceus master; NOT chrooted yet
VNFS=/your/VNFS/mount/point
# ie VNFS=/mnt/deb7.stls
------------------------------------------------------
and then paste the rest into a root shell:
------------------------------------------------------
cp /usr/bin/modulecmd ${VNFS}/usr/bin
# local disk util to set up swap and /scratch on an unpartitioned disk
cp /usr/var/lib/perceus/format_mount.pl ${VNFS}/bin
# local rc.local to set up various module, format_mount.pl
cp /usr/var/lib/perceus/rc.local ${VNFS}/etc
# our local fstab
cp /usr/var/lib/perceus/fstab ${VNFS}/etc
# our local hosts file
cp /usr/var/lib/perceus/hosts ${VNFS}/etc
cd /usr/lib/
cp libX11.so.6 ${VNFS}/usr/lib/libX11.so.6
cp libxcb.so.1.1.0 ${VNFS}/usr/lib/libxcb.so.1
cp libXdmcp.so.6.0.0 ${VNFS}/usr/lib/libXdmcp.so.6
cp libXau.so.6.0.0 ${VNFS}/usr/lib/libXau.so.6
# cp /etc config files across to the VNFS
cd /etc
cp yp.conf ypserv.conf nsswitch.conf krb5.conf autofs_ldap_auth.conf auto.master ${VNFS}/etc
# and set the permissions
chmod u+rw ${VNFS}/etc/autofs_ldap_auth.conf
chmod og-rwx ${VNFS}/etc/autofs_ldap_auth.conf
# prep the SGE init script
cp /etc/init.d/sgeexecd ${VNFS}/etc/init.d
# now chroot
chroot ${VNFS}
# and update the scripts
update-rc.d sgeexecd defaults
# need to add a symlink for the logger in the chroot (for
# the way we've set up syslogging, on a CentOS server that
# provides the modules - it's complicated)
ln -s /usr/bin/logger /bin/logger
ln -s /apps/Modules /usr/share/Modules
# then exit the chroot
exit
# unmount the vnfs to compact and prep it for distribution
perceus vnfs umount deb7.stls
------------------------------------------------------
So now the new, customized VNFS is ready to distribute to nodes. In order to distribute to a few nodes as a test, you have to define them as a group and then define that group to get the new VNFS.
First, let's take a look at the current disposition of the nodes:
------------------------------------------------------
$ perceus node summary
HostName GroupName Enabled Vnfs
-------------------------------------------------------------------------------
n101 (undefined) yes debuntu
n102 (undefined) yes debuntu
n103 (undefined) yes debuntu
...
n137 (undefined) yes debuntu
n115 debian6 yes deb6.stls
n138 debian7 yes deb7.stls
n139 debian7 yes deb7.stls
------------------------------------------------------
You can see that most of the nodes have NOT been assigned to a group and get the default 'debuntu' VNFS. Of the ones that have, 'n115' is part of the 'debian6' group, which gets the 'deb6.stls' VNFS.
So let's define the nodes that we want to be in the test group.
------------------------------------------------------
# we'll define a test group 'debian7' to include nodes n115 & n138
$ perceus node set group debian7 n115 n138
## Output is:
# Hostname Group NodeID
# --------------------------------------------------------------
# n115 debian6 00:D0:68:12:09:D1
# n138 (undefined) 00:25:90:58:57:7A
### note that node n115 was previously set to debian6 and will now be shifted
### to debian7
# Are you sure you wish to set 'group=debian7' on 2 nodes?
# Please Confirm [yes/no]> yes
# '2' nodes set group='debian7'
------------------------------------------------------
Then set the new group to get the new VNFS
------------------------------------------------------
perceus group set vnfs deb7.stls debian7
## Output is:
# Hostname Group NodeID
# ---------------------------------------------------------
# n138 debian7 00:25:90:58:57:7A
# n115 debian7 00:D0:68:12:09:D1
#
# Are you sure you wish to set 'deb7.stls' on 2 nodes?
# Please Confirm [yes/no]> yes
# '2' nodes set vnfs='deb7.stls'
------------------------------------------------------
So now the new group 'debian7', composed of 'n138 & n115', will get the newly created VNFS 'deb7.stls' on reboot.
Verify that you've done this correctly with:
------------------------------------------------------
$ perceus node summary
HostName GroupName Enabled Vnfs
-------------------------------------------------------------------------------
n101 (undefined) yes debuntu
n102 (undefined) yes debuntu
n103 (undefined) yes debuntu
...
n137 (undefined) yes debuntu
n115 debian7 yes deb7.stls <---
n138 debian7 yes deb7.stls <---
n139 debian7 yes deb7.stls <---
------------------------------------------------------
And then reboot node 'n115' and see what happens:
------------------------------------------------------
ssh n115 reboot
------------------------------------------------------
After the node reboots, 'perceus node summary' should again show it as above, now running the new VNFS.