clement-jim edited this page Nov 8, 2012 · 1 revision

JimC

1         Chef

Title

Comments

Work Around

Ref#

Cannot use knife SSH on nodes that are waiting to be allocated.

Allocate the node so that it is fully installed into the system before running knife commands on it.

802

Node failed to find private key after reset.

A node enters a problem state and the following message is displayed when running chef-client on that node.

FATAL: Chef::Exceptions::PrivateKeyMissing: I cannot read /etc/chef/validation.pem, which you told me to use to sign requests!

Reboot the node or run /etc/init.d/crowbar_join.sh on the affected node.

828

 

2         Crowbar UI

Title

Comments

Work Around

Ref#

IE7: Attribute edit textbox translates apostrophe into web safe '

Only on some browsers

Upgrade to IE8 (or turn off IE7 compatibility mode), or change the apostrophe to \" to correct the value.

143

Connecting with HTTPS produces bad SSL message error

HTTPS was not a target for the 1.0 release

Reference the Crowbar UI by http:// not https://

67, 162

Adding/Removing a role in Chef does not update the proposal.

A change that Crowbar did not make itself cannot be reflected in the proposal.

Make proposal changes in Crowbar.

87

There is no way to completely delete a node

Crowbar will remove many aspects of a node from the system, but does not delete all traces (such as IP allocation). This can result in lost IP allocations.

No work around; however, it will be possible to identify lost IPs and recover.

166

Drag and drop is not enabled in IE8

Use “Raw” mode or see the User’s Guide for a list of supported browsers

617

Missing BMC Buttons after shutdown of node using "Power Off Button"

After clicking the Power Off button on a node, the other buttons under Node – Edit are missing.

Run the following command:  ipmitool -H BMCIP -P crowbar -U crowbar power on

196

No logout from crowbar

Close open browser windows to remove temporary cookies.

242

Nodes in “Ready Retrying” that are powered off stay flashing red in UI

Two nodes in the “Ready Retrying” state were shut down. They kept flashing “Ready Retrying” instead of turning grey (“Unavailable”).

This indicates a failure to properly initialize the node. Troubleshoot the underlying problem by inspecting logs and then reinstall the node.

If the node failure is permanent, delete the node from crowbar.

407

RAID setting on Node Details screen needs to reflect the actual configuration.

The RAID value for "Disk Drives" on the Node Details screen does not reflect the actual RAID config.

This value can be changed by editing the node RAID setting and clicking Save.

735

Incorrect privileges when using the crowbar user to issue IPMI commands.

The crowbar user does not have the correct privileges to make changes to the iDRAC controller.

Use the root user when issuing IPMI commands to the iDRAC.

764

Switch View shows the virtual switches

Ethernet ports that are on the machine may create views of non-existent switches.

Ignore the non-existent switches in the switch view.

791

Team_mode is required for all conduits

For all bonded conduits, team_mode is required in the Network.json file.

Within the conduit, team_mode is a required field.

815
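The team_mode requirement can be checked before applying a proposal. The following is a minimal sketch, assuming the conduits have already been loaded from the network JSON into a dictionary that maps each conduit name to its settings. The conduit names, the `if_list` key, and the interface values are hypothetical and may not match the real file layout; only the `team_mode` field name comes from the issue above.

```python
def conduits_missing_team_mode(conduits):
    """Return the sorted names of conduits that have no team_mode field."""
    return sorted(
        name for name, settings in conduits.items()
        if "team_mode" not in settings
    )


# Hypothetical conduit definitions, for illustration only.
example = {
    "intf0": {"if_list": ["1g1", "1g2"], "team_mode": 6},
    "intf1": {"if_list": ["1g3", "1g4"]},
}

print(conduits_missing_team_mode(example))  # ['intf1']
```

Any name the function returns identifies a conduit that needs a team_mode value added.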

Delete node does not delete Nagios and Ganglia instances.

Nagios and Ganglia continue to monitor the deleted systems.

Manually delete the nodes from Nagios and Ganglia using the vendor's UI.

817

IPMI proposal editor doesn't have field for deployment of IPMI-discover role

IPMI has 2 roles: IPMI-discover and IPMI-configure. The proposal editor only has IPMI-configure in the deployment section.

Use raw mode to edit.

819

Trying to re-apply IPMI proposal once nodes transitioned from discovered state fails:  No Hxxx node found.

The IPMI-discover role is used during discovery, meaning the hxxx nodes are added to the IPMI proposal.

Later on, nodes get renamed to dxxx, but the IPMI proposal is not modified, and retains the hxxx nodes.

Edit the IPMI proposal and remove any hXXX nodes from the deployment section (possibly switching to raw mode if the IPMI-discover role is not visible), or make your edits to the IPMI proposal before any nodes have been allocated.

820

Log export attempts to include logs from nodes in Discovered state.

Ignore zero length zips that are included in the main log zip.

826
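The zero-length entries mentioned above can be listed programmatically rather than inspected by hand. This is a minimal sketch using Python's standard `zipfile` module; the entry names in the demonstration archive are hypothetical and are not taken from a real Crowbar log export.

```python
import io
import zipfile


def empty_members(archive):
    """Return names of zero-length file entries in a zip (path or file object)."""
    with zipfile.ZipFile(archive) as zf:
        return [
            info.filename
            for info in zf.infolist()
            if not info.is_dir() and info.file_size == 0
        ]


# Build a small in-memory archive with one real and one empty entry.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("d00-allocated-node.zip", b"log data")  # hypothetical name
    zf.writestr("h00-discovered-node.zip", b"")         # hypothetical name
buf.seek(0)

print(empty_members(buf))  # ['h00-discovered-node.zip']
```

Entries reported by `empty_members` are the ones that can be safely ignored.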

Download of log export sometimes downloads incomplete archive in IE 8.

Retry the download, or use a different browser such as IE9 or Chrome to download the log exports.

830

Ganglia not displaying graphs

This appears to happen when the IP address is in the URL.

If the URL uses the MAC address, you do see the graphs.

865

Intermittent - # disks show as -1 on admin node on VMware

On the Node Inventory report, the number of disks for a VMware virtual admin node shows up as -1.  The virtual admin node actually has 2 disks.

The admin node disks show up correctly in chef as /block_device/sda and sdb.

897

Reset Nodes after rediscover still has links for Nagios/Ganglia on Dashboard

These links will be removed in the next release.

Do not access links after reset until reallocated.

899

3         Glance

Title

Comments

Work Around

Ref#

Not logging the Glance commit to Admin node error.log or syslog

This will be addressed in the Diablo release. Please review the Glance barclamp documentation for more information.

Until then, look for glance-registry.log and glance-api.log in /var/log/glance on the glance server.

104, 85

4         Horizon

Title

Comments

Work Around

Ref#

Horizon cannot be deployed without Nova - Internal Server Error - Dashboard with Swift only Deployment

Failure to deploy Nova returns an Internal Server Error on the Dashboard.

You must deploy Nova when using Horizon.

879

5         Install

Title

Comments

Work Around

Ref#

Admin node BIOS, RAID, and customization requires Keyboard/Monitor to be available

Work to apply Crowbar to Crowbar is planned, but did not make it into the initial release.

Get a KVM for admin node setup. Refer to the "Getting Started" guide for specific settings.

102

Booted machine on network with >1 installed Crowbar creates conflicts between servers.

Crowbar is a DHCP server; you cannot have two of them on the same network.

When installing a new Crowbar server, replace the old one instead of bringing it up on a different server.

151

Crowbar only supports 1 top level domain

While it is possible to change the TLD, Crowbar UI assumes that there is only 1 TLD in numerous places.

Only use 1 top level domain

-

Changing the bonding mode requires a reboot

After changing the bonding mode, the nodes had to be rebooted for the new bond type to take effect.

If changing the bonding mode, reboot your nodes.

474

The network barclamp doesn't clean up the VLAN interface on a node when switching from single to dual

Deploy a swift storage node with network barclamp configured in single mode.

Edit network barclamp proposal to switch to dual mode, and save & apply proposal.

On a compute node, inspect ifconfig and notice that there are both: eth0.200 and eth1.200

Once the proposal has fully deployed, reboot the node to make the current active network configuration match the correctly created configuration files.

476

BIOS not updating from some versions.

Certain versions cannot be updated to our current rev. 1.30.

Please see the BIOS Firmware release notes to determine if you need to disable the BIOS Barclamp prior to updating any machines.

561

Modifying APT sources.list can break admin node

Customers might modify the system configuration to add additional packages into crowbar. This could cause chef-client execution (which currently occurs on a regular basis) to fail on the admin node, and render the admin node non-functional.

It is highly recommended not to modify resources managed by crowbar.  If new packages are required, build a custom crowbar ISO and include the required packages in the build.

463

Nodes that are not managed by the admin are permitted to register with Chef

I booted up the VMs and did not reset the boot sequence for network.  The nodes registered with Chef and admin.  Now I get an Error 500 page and see the nodes logging their Swift replication to the admin node.

Inspect the chef UI to identify the offending nodes, and bring them onto crowbar (PXE boot to install standard image).  If the nodes are not to be managed by crowbar, either stop the chef-client daemon, or point it to the appropriate chef-server.

527

Swift fails if proxy installed on storage node

It should be possible to install the swift-proxy on a swift storage node. Currently it fails.

Do not install proxy on storage node.

846

Get "volume group name already in use" going from JBOD to RAID 10 (Admin node also)

On the Admin node, a fast initialization of the existing RAID set causes this. It seems to happen when manually deleting the RAID config and then setting RAID to No Configuration in the UI. The error occurs on install.

Reboot node in question

855, 884

The Same IP on two adaptors after install is done with Team JSON

After finishing a node installation ifconfig reveals that the primary IP is on both bond0 and eth2

Workaround: Rebooting node seems to clear this up.

877

Selecting Raw when drives sda/sdb/sdc are all mounted in use causes Nova to fail to deploy.

Putting nova-controller on a swift-storage node and setting the volume mode to Raw causes a stack trace as it tries to manipulate the mounted file systems; this causes the barclamp to fail to deploy.

Workaround: Edit the barclamp and set the Nova volume type to Local, then reapply.

900

OpenStack bonding network json doesn't support multi-speed

Crowbar allows down/up-grading interface speeds relative to what is specified in the network json. The non-bonded network json does "the right thing".  The bonded one doesn't.

Edit the Network JSON to use the proper regular expression for the Ethernet interfaces.

901

Admin node, pre-install, has eth0 defined twice.

Verify issue:

cat /etc/network/interfaces

auto eth0

iface eth0 inet dhcp

auto eth0

iface eth0 inet  static

Although the interface does not come up, the admin node is still accessible from the console and installation is not prevented.

903
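The duplicate definition shown under "Verify issue" can also be detected with a short script. This is a minimal sketch that assumes the contents of /etc/network/interfaces have already been read into a string; it reports any interface that has more than one `iface` stanza for the same address family.

```python
from collections import Counter


def duplicate_ifaces(text):
    """Return sorted interface names defined more than once per address family."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        # An iface stanza looks like: iface <name> <family> <method>
        if len(fields) >= 3 and fields[0] == "iface":
            counts[(fields[1], fields[2])] += 1
    return sorted({name for (name, _family), n in counts.items() if n > 1})


sample = """\
auto eth0
iface eth0 inet dhcp
auto eth0
iface eth0 inet  static
"""

print(duplicate_ifaces(sample))  # ['eth0']
```

An empty result means each interface is defined only once per address family.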

MultiJson encode is deprecated - This error is on install of Admin node

Errors in the log file state that MultiJson.encode is deprecated and will be removed in the next major release.

n/a

904

Should set more BIOS params for C2100

The following parameters are available on the C2100, but are not being set: "Intel VT-d", and "SR-IOV Supported"

Manually Configure the Parameters in the BIOS

913

Red Hat Network Warning during Install - CAP_SYS_MODULE

Error in log: “Loading kernel module for a network device with CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias netdev-bond0 instead”

Install continues on normally, no workaround needed.

915

raid-configure recipe rebuilds raid10 every time (it seems)

An R720 or R720xd will continuously loop on the install when there are only 3 or fewer drives.

Follow the RA requirement of a minimum of 6 drives.

781

Can't install Mesa 1.3.1 on R720xd

Two out of 3 R720xds are failing to install.   They just cycle from yellow to red “Problem”  after failing to install then continue to cycle again and again.

Workaround is to manually delete all RAID sets.

952

BIOS setting on R720 causes node to fail to install

Unchecking one of the boot devices in the Boot Order screen sends the node into a hardware problem loop.

Workaround: Enter the BIOS, open the Boot Order screen, and make sure all devices are checked as bootable.

977

6220 IPMI  set password fails. root password remains root

The new password is cr0wBar! for the IPMI root account

Workaround is to edit the IPMI proposal before deploying any nodes, and ensure that a "complex enough" password is used.

Use symbols, digits, and both upper- and lowercase letters.

964
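The "complex enough" guidance above (symbols, digits, and upper/lower case letters) can be expressed as a simple check. This is a minimal sketch of that guidance only. The exact password rules the BMC enforces are not stated here, so a password that passes this check is not guaranteed to be accepted.

```python
import string


def is_complex_enough(password):
    """Check for at least one lowercase, uppercase, digit, and symbol character."""
    has_lower = any(c.islower() for c in password)
    has_upper = any(c.isupper() for c in password)
    has_digit = any(c.isdigit() for c in password)
    has_symbol = any(c in string.punctuation for c in password)
    return has_lower and has_upper and has_digit and has_symbol


print(is_complex_enough("cr0wBar!"))  # True
print(is_complex_enough("crowbar"))   # False
```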

Installed an admin node on an R720 and discovered 3 more R720s and 3 R720xds.  The DRAC password was correctly set to cr0wBar! on the client nodes, but was incorrectly set to crowbar on the admin node.

Workaround: After installation, set the password through ipmitool from the command line of the admin node.

999

chef-client not running on admin node

bluepill status showed that it was configured in bluepill.

Workaround:  Reload it in to bluepill:

ssh into admin node

sudo -i

cd /etc/bluepill

bluepill chef-client-stop

bluepill load chef-client.pill

1003

Resetting and then allocating goes through two OS installs.

Afterwards, it comes up fine, just takes a bit longer now.

n/a

842

Client nodes lose contact with admin following allocation when both 1G and 10G ports are connected.

In the install, the client nodes would go from green to gray following allocation.

Investigation showed that the nodes were switching from the 1G port to the 10G port during a chef-client run.

Workaround:   Disconnect the 1G ports on the client nodes.

1004

720s in Hardware Installing reboot loop

Had 3 720s with 10G Intel NICs.  All of these systems went into a reboot loop in Hardware Installing on manual allocation.

Investigation showed that the NIC firmware failed to install and failed to verify.  The firmware installation would then be attempted again in a loop.

The 10G devices are not updated in the current release.

The system excludes these updates for Intel-based NICs only.

1005

6         Keystone

Title

Comments

Work Around

Ref#

Modifying account info on an active proposal and reapplying does not work.

User must make Keystone account changes before applying proposal.

581

Installing MySQL + Keystone returns access denied when creating the Keystone database while proposals are queued.

Workaround: On the node that has the MySQL server installed, execute:

rm /etc/mysql/applied_grants

(this will reapply the required permissions to the mysql server)

861

7         Network Barclamp

Title

Comments

Work Around

Ref#

Network barclamp libraries attempt to save nodes that aren't the running node.

An error may show up in your logs for the Admin node.  “INFO: HTTP Request Returned 403 Forbidden: You are not the correct node (auth_user name: admin.phone.com,”  This is benign and can be ignored.

Ignore the log message

837

Deployment guide: JSON path is incorrect

Page 9 lists the JSON at - /opt/dell/chef/data_bags/crowbar/bc-template-network.json

Should be: /opt/dell/barclamps/network/chef...

929

8         Nova

Title

Comments

Work Around

Ref#

Nova failed to install with specific config

Network proposal -> Nova fixed = 192.168.123.0 as a subnet and 255.255.255.0 as a mask

Nova proposal -> num_networks =2 and network_size = 256

The size of nova_fixed in the network proposal must be the same size as the nova proposal (num * size).

482
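The size rule above is simple arithmetic: the number of addresses in the nova_fixed subnet must equal num_networks multiplied by network_size. This is a minimal sketch using Python's standard `ipaddress` module; the function name is hypothetical. In the failing configuration, a 255.255.255.0 mask provides 256 addresses while 2 × 256 = 512 are requested.

```python
import ipaddress


def nova_fixed_matches(subnet, netmask, num_networks, network_size):
    """Return True if the subnet size equals num_networks * network_size."""
    available = ipaddress.ip_network(f"{subnet}/{netmask}").num_addresses
    return available == num_networks * network_size


# The failing configuration from the issue: 256 available vs 512 requested.
print(nova_fixed_matches("192.168.123.0", "255.255.255.0", 2, 256))  # False

# Two example settings that satisfy the rule for the same subnet.
print(nova_fixed_matches("192.168.123.0", "255.255.255.0", 1, 256))  # True
print(nova_fixed_matches("192.168.123.0", "255.255.255.0", 2, 128))  # True
```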

euca2ools not installed by default on fresh nova deploy

I had to manually install euca2ools to try to query and manipulate images and VMs via the command line.

Use alternative methods to achieve the same capabilities:

Dashboard

Nova-manage commands

518

Nova barclamp uses 1st Glance server found

The Nova barclamp allows the user to select a Glance proposal to use.  But code uses the first one found.   This server may or may not be in the selected Glance proposal.  This can cause issues if there are multiple Glance proposals created.

The Glance server used should be from the Glance proposal selected in the Nova create/edit form.

868

Sometimes get MySQL connect error on dashboard

(OperationalError) (2006, 'MySQL server has gone away')

It will clear if you refresh the page a couple of times.

487

Libvirt error when trying to mount ISCSI volumes

See: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/996840

Currently a manual patch by Dell

875

Nova-Volume Raw Select first disk does not use the physical first disk

Nova-Volume will use the first volume and not the first physical drive

936

9         OpenStack

Title

Comments

Work Around

Ref#

OpenStack servers listen on 0.0.0.0

Most servers listen on all local IP addresses.

n/a

859

10        RAID

Title

Comments

Work Around

Ref#

WSMAN/RAID - Clearing config requires cold boot for JBOD to work.

After clearing the controller configuration at start, you need to cold boot the box in order for JBOD to install successfully.

Cold boot the box (in iDRAC, click the "Power Cycle System (cold boot)" link).

844

11        Swift

Title

Comments

Work Around

Ref#

Swift with multiple proxies breaks

Currently not supported by Swift recipes in Crowbar

Use a single proxy.

158

Swift does not always start after initial proposal commit. Memcached is not starting up

Occurs occasionally after the system is first installed. This only occurs during initial install.

Reboot the Swift Proxy node. The installation will continue successfully

24

Not all swift ring parameters updated

Parameters that are not updated include the number of replicas and the number of zones. While some should never be updated (e.g. cluster hash), the other ring parameters should allow dynamic modification.

Note that manual changes to the ring do take effect.

622

Creating Swift proposal before Keystone causes Swift install to hang

Creating Swift before Keystone causes Swift to refer to blank Keystone instance.

Delete Swift proposal and create a new one AFTER creating keystone proposal

1027

12    Errata

12.1    Nova Volume Setup

·         If you are planning on running the multi-volume role on the controller, your option for type of volume will default to Local, as the controller must be configured for RAID10.

 

·         If you are planning on running the multi-volume role on one or more nodes other than the controller, and you want your volume type to use “Raw” disk selection, you must manually allocate the node BEFORE applying the Nova proposal. (If the node is in the unallocated state when the proposal is applied, the Nova proposal automatically configures the multi-volume node as RAID10, which negates any Raw/disk selection settings.)

 

Steps to take if planning to run multi-volume on node other than the controller:

o   On the system where you want multi-volume to run, edit the node, set RAID to JBOD, and set BIOS to Virtualization

o   Allocate the node and wait for it to go into the Ready state

o   Create a Nova proposal

o   Edit the proposal

o   Drag the node that you manually allocated to multi-volume role

o   Under the Volume Options section, select the type of volume (Raw or Local); a list of disks you can select for nova-volume storage is displayed.

 

Be aware of this before applying Nova: once the proposal is applied, the volume type cannot be changed unless you rebuild.

 
