solaris... wtf?!?

Meltdown Spectre SPARC Solaris?!?

In the last couple of weeks the whole IT world was shocked by news of security issues rooted in basic CPU architecture design and covering “all” processor vendors… The Spectre (CVE-2017-5753 and CVE-2017-5715) and Meltdown (CVE-2017-5754) vulnerabilities affect Intel Xeon, AMD, POWER and ARM chips, and SPARC uses similar features to all the others. Spectre and Meltdown are different variants of the same fundamental underlying vulnerability; if exploited, they allow attackers to get access to data previously considered completely protected. That affects chips manufactured in the last 20 years. In the SPARC T-series, speculative and out-of-order execution arrived with the S3 core in the SPARC T4 (2011); the preceding T3 chips (e.g. T3-1; GA November 2010, LOD September 2012) were still in-order designs.

@Meltdown;

Solaris never had kernel pages mapped in user context on SPARC. That’s the reason why I do not think Meltdown affects SPARC at all. Oracle also stated: “Oracle believes that Oracle Solaris versions running on SPARCv9 hardware are not impacted by the Meltdown”. As far as I understand, though, it could affect Solaris on x86.

@Spectre;

Well, starting with the T4 (S3 core), Oracle introduced speculative and out-of-order execution in the pipeline. Prediction algorithms and the depth of the prediction buffers differ not only between vendors but also across CPU generations.

[UPDATE 14.04.2018]
Oracle has published patches to fix the issue. The MOS document for Solaris / SPARC:

Oracle Support Document 2349278.1 (Oracle Solaris on SPARC and Spectre (CVE-2017-5753 and CVE-2017-5715) and Meltdown (CVE-2017-5754)) can be found at: https://support.oracle.com/epmos/faces/DocumentDisplay?id=2349278.1

Information on all other Oracle products can be found in:

Oracle Support Document 2347948.1 (Addendum to the January 2018 CPU Advisory for Spectre and Meltdown) can be found at: https://support.oracle.com/epmos/faces/DocumentDisplay?id=2347948.1

I still cannot say anything about the possible performance impact…

[Update]
Oracle published a new MOS article about the impact:
Oracle Support Document 2386271.1 (Performance impact of technical mitigation measure against vulnerability CVE-2017-5715 (Spectre v2) on SPARC Servers)

As on other architectures, the impact is 2-10%. I have heard some very bad news from customers using older Intel boxes, with up to 70% I/O loss… real-world examples will be interesting…

Console In/Output to ILOM

After a default Linux installation (in my case Oracle Linux 7.4) you won’t see anything on the ILOM console (“start /HOST/console”) once GRUB has finished.

Trick:

Worked on Oracle Linux 7.4 running on an Oracle Server X7-2.

Spawn a getty on your serial console port as a systemd service:

# systemctl --now enable getty@ttyS0
Created symlink from /etc/systemd/system/getty.target.wants/getty@ttyS0.service to /usr/lib/systemd/system/getty@.service.

Another way is to add the serial console to the kernel arguments of the GRUB2 entries; then you also see the boot and shutdown messages:

# grubby --remove-args="rhgb quiet" --args=console=ttyS0,9600 --update-kernel=ALL
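After the next reboot it is easy to check whether the kernel really got the new argument. The following is only a minimal sketch: the helper name `has_console` is made up for this example, and it simply looks for a `console=<device>` word (with or without options such as `,9600`) in a kernel command line.

```shell
#!/bin/sh
# has_console CMDLINE DEVICE: succeed if CMDLINE contains console=DEVICE,
# optionally followed by ",options" (e.g. console=ttyS0,9600).
# Hypothetical helper, not part of grubby or systemd.
has_console() {
    for arg in $1; do
        case "$arg" in
            console="$2"|console="$2",*) return 0 ;;
        esac
    done
    return 1
}

# On a live Linux system, check the command line of the running kernel.
if [ -r /proc/cmdline ]; then
    if has_console "$(cat /proc/cmdline)" ttyS0; then
        echo "console=ttyS0 is set on the running kernel"
    else
        echo "console=ttyS0 is NOT set on the running kernel"
    fi
fi
```

If the argument is missing after a reboot, `grubby --info=ALL` shows what is actually stored for each boot entry.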

Solaris Cluster Update

Had the fun today of upgrading a bunch of Solaris clusters… this time I wrote it down 🙂

Cluster Update

Let's update our cluster running a "flying" (failover) zone:

On both nodes:

 # scinstall -u update -b 11.3.26.0.5.0 -L accept

[...]
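Before rebooting, it can be worth confirming that the update produced a new boot environment that is marked active on reboot. A minimal sketch, assuming that `beadm list -H` prints semicolon-separated fields with the BE name first and the active flags third (N = active now, R = active on reboot); the helper name `be_on_reboot` is made up for this example:

```shell
#!/bin/sh
# be_on_reboot: read `beadm list -H` output on stdin and print the name of
# the boot environment flagged "R" (active on next reboot).
# Assumed field layout: name;uuid;active;mountpoint;space;policy;created
be_on_reboot() {
    awk -F';' '$3 ~ /R/ { print $1 }'
}

# Only call beadm where it exists, i.e. on the Solaris node itself.
if command -v beadm >/dev/null 2>&1; then
    echo "BE active on reboot: $(beadm list -H | be_on_reboot)"
fi
```

If the printed name is still the old BE, the update did not activate the new one and a reboot would bring the node back on the old software level.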

Disable the zone resource:

 # clrs disable name-zone-res

Reboot the second cluster node (the one where the RG is not running) into the new BE. After it comes back, switch the RG to the updated node:

 # clrg switch -n <node> name-zone-rg

Now copy the BE UUID from the first node to the second. (Do NOT do this on Solaris 11.4 / Cluster 4.4: there it destroyed the GRUB entry and the system was not bootable; the update worked without setting it.)

 first# /opt/SUNWsczone/sczbt/util/ha-solaris-zone-boot-env-id get
 second# /opt/SUNWsczone/sczbt/util/ha-solaris-zone-boot-env-id set <uuid>
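Since a wrong value here can hurt, a small guard around the `set` call does no harm. This is only a sketch under assumptions: it assumes the `get` call prints a bare UUID in the usual 8-4-4-4-12 hexadecimal form, and both `is_uuid` and `set_be_uuid` are made-up helper names, not part of the sczbt utilities.

```shell
#!/bin/sh
# is_uuid STRING: succeed if STRING looks like a canonical UUID
# (8-4-4-4-12 hexadecimal digits).
is_uuid() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'
}

# set_be_uuid UUID: hypothetical wrapper that refuses to call the sczbt
# utility with anything that is not a well-formed UUID.
set_be_uuid() {
    if ! is_uuid "$1"; then
        echo "refusing to set malformed UUID: '$1'" >&2
        return 1
    fi
    /opt/SUNWsczone/sczbt/util/ha-solaris-zone-boot-env-id set "$1"
}
```

On the second node you would then call `set_be_uuid <uuid>` with the value read on the first node; a copy-and-paste accident such as a truncated string is rejected before anything is changed.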

With the storage resource/RG on the second node and the right UUID set, you can attach the zone:

 # zoneadm -z <name> attach -U -x destroy-orphan-zbes

Check that the zone comes up in a normal state (boot it and run svcs -xv). If everything is OK, shut the zone down again.
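The manual check can be scripted roughly like this. It is a sketch, not a tested procedure: `zone_state` and `check_zone` are made-up helper names, and the parsing relies on the colon-separated output of `zoneadm list -cp` (zoneid:zonename:state:zonepath:...), so a zonepath containing an escaped colon would confuse it.

```shell
#!/bin/sh
# zone_state NAME: read `zoneadm list -cp` output on stdin and print the
# state of zone NAME (third colon-separated field).
zone_state() {
    awk -F: '$2 == z { print $3 }' z="$1"
}

# check_zone NAME: hypothetical helper; succeeds only if the zone is
# running and `svcs -xv` inside it prints nothing, which is what svcs -x
# does when no enabled service has a problem.
check_zone() {
    [ "$(zoneadm list -cp | zone_state "$1")" = "running" ] || return 1
    [ -z "$(zlogin "$1" svcs -xv)" ]
}
```

Give the zone a moment to finish booting before calling `check_zone <name>`, because services that are still starting also show up in `svcs -xv`.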

Now we can give control back to the cluster:

 # clrs enable name-zone-res
 # clrg resume name-zone-rg

Now you can try to switch back...

You can also upgrade your resource type versions:


# clrt list
SUNW.LogicalHostname:5
SUNW.SharedAddress:3
SUNW.HAStoragePlus:11
ORCL.ha-zone_sczbt:2
# clrt register ORCL.ha-zone_sczbt
# clrt register SUNW.HAStoragePlus
# clrt list
SUNW.LogicalHostname:5
SUNW.SharedAddress:3
SUNW.HAStoragePlus:11
ORCL.ha-zone_sczbt:2
ORCL.ha-zone_sczbt:4
SUNW.HAStoragePlus:12
# clrs list
pressy-zone-rs
pressy-zp-rs
# clrs show -p Type_version pressy-zone-rs
=== Resources ===
Resource:                                       pressy-zone-rs
  Type_version:                                    2
  --- Standard and extension properties ---
# /opt/SUNWsczone/sczbt/util/rt_upgrade +
Migration of resource:pressy-zone-rs to latest resource type version succeeded.
# clrs show -p Type_version pressy-zone-rs
=== Resources ===
Resource:                                       pressy-zone-rs
  Type_version:                                    4
  --- Standard and extension properties ---
#
# clrs show -p Type_version pressy-zp-rs
=== Resources ===
Resource:                                       pressy-zp-rs
  Type_version:                                    11
  --- Standard and extension properties ---
# clrs set -p Type_version=12 pressy-zp-rs
# clrt unregister SUNW.HAStoragePlus:11
# clrt unregister ORCL.ha-zone_sczbt:2
#
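With more than a handful of resource types it gets tedious to spot the outdated versions by eye. A small sketch that reads `clrt list` output in the name:version form shown above and prints every entry for which a newer version of the same type is registered; `old_rt_versions` is a made-up name for this example:

```shell
#!/bin/sh
# old_rt_versions: read `clrt list` output (one "name:version" per line) on
# stdin and print every entry for which a higher version of the same
# resource type is also registered, in input order.
old_rt_versions() {
    awk -F: '
        NF == 2 {
            n++
            name[n] = $1; ver[n] = $2 + 0; line[n] = $0
            # remember the highest version seen for each type
            if (max[$1] == "" || ver[n] > max[$1]) max[$1] = ver[n]
        }
        END {
            for (i = 1; i <= n; i++)
                if (ver[i] < max[name[i]]) print line[i]
        }'
}
```

Used as `clrt list | old_rt_versions`, it prints SUNW.HAStoragePlus:11 and ORCL.ha-zone_sczbt:2 for the second listing above. These are only candidates for `clrt unregister`: as in the transcript, migrate the resources to the new Type_version first.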