Posts Tagged z/VM

Programming decisions

My Linux-based Large-Scale Cloning Grid experiment, which I’ve implemented four times now (don’t ask, unless you are ready for a few looooooong stories), has three main components.  The first, which provides the magic by which the experiment can even exist, is the z Systems hypervisor z/VM.  The second is the grist for the mill, both the evidence that the thing is working and the workload that drives it: the cluster monitoring tool Ganglia.  The third, the component that keeps it alive, starts it, stops it, checks on it, and more, is a Perl script of my design and construction.  This script is truly multifunctional, and at just under 1000 lines could be the second-most complex piece of code [1] I’ve written since my university days [2].

I think I might have written about this before — it’s the Perl script that provides an IRC bot to manage the operation of the grid.  It schedules the starting and stopping of guests in the grid, and provides stats on resource usage.  The bot sends grid status updates to Twitter (follow @BODSzBot!), and recently I added generation of files used by some HTML code to render a web page of gauges summarising the status of the grid.  I wrote the first version of the script in Perl simply because the first programming example of an IRC bot I found was a Perl script; I had no special affinity for Perl as a language and, despite the many useful modules that have helped me develop the code, Perl doesn’t really lend itself to the task especially well.  So it probably comes as little surprise to some readers that I’m having some trouble with my choice.

Initially the script had no real concern for the status of the guests.  It would fire off commands to start guests or shut them down, or check the z/VM paging rate.  The most intensive thing the script did was to get the list of all users logged on to z/VM so it could determine how many clone guests were actually logged on, and even that takes just a fraction of a second.  It was when I decided that the bot should handle this a bit more effectively, keeping track of the status of each guest and taking more ownership of keeping guests in the requested up or down state, that things started to come unstuck.
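(For the curious, that check is essentially a one-liner.  A minimal sketch, assuming the clone guests follow a CLONEnnnn naming convention and the vmcp command from s390-tools is available — neither of which is lifted from my actual script:)

#!/usr/bin/perl
# Count logged-on clone guests by parsing CP QUERY NAMES output.
# (For thousands of guests, the response buffer may need enlarging
# with vmcp's --buffer option.)
my $names  = `vmcp q names`;
my @clones = $names =~ /\bCLONE\d{4}\b/g;    # hypothetical naming scheme
printf "%d clone guests logged on\n", scalar @clones;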

In the various iterations of this IRC bot, I have used Perl job queueing modules to keep track of the commands the bot issues.  I deal with guests in sets of 16 or 240 at a time, and I don’t want to issue 2000 commands in IRC to start 2000 guests.  That’s what the bot is for: I tell it “start 240 guests” and it says “okay, boss” and keeps track of the 240 individual commands that achieve that.  This time around I’m using POE::Component::JobQueue, mainly because the IRC module I had started to use was the PoCo (short for POE::Component) one.  It made sense to use a job queueing mechanism built on the same multitasking infrastructure my IRC component was using.  (I used a very key word in that last sentence; guess which one it was, and see if you’re right by the end.)

With PoCo::JobQueue, you define queues that process work depending on the type of queue and how it’s defined (number of workers, and so on).  In my case the queues are passive queues, which wait for work to be enqueued to them (the alternative is active queues, which poll for work).  The how-to examples for PoCo::JobQueue show that the usual method of use is a function that is called when work is enqueued; that function then starts a PoCo session (PoCo terminology for a separate task) to actually handle the processing.  For the separate tasks that have to be done, I have five queues which each, at various times, create sessions that run for the duration of an item of work (expected to be quite short), plus one session running the IRC interaction and one long-running “main thread” session.
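To give a flavour of the pattern, here’s a minimal sketch (queue name, worker limit, and job parameters are all hypothetical, not lifted from my script; it assumes the usual POE boilerplate and $poe_kernel->run() elsewhere):

use POE;
use POE::Component::JobQueue;

# A passive queue: it sits idle until some other session enqueues work.
POE::Component::JobQueue->spawn(
    Alias       => 'cmdqueue',
    WorkerLimit => 4,              # at most four jobs in flight at once
    Worker      => \&spawn_worker,
    Passive     => { },            # no prioritiser; first in, first out
);

# Called for each dequeued job; spawns a short-lived session to do it.
sub spawn_worker {
    my ( $postback, $guest, $action ) = @_;
    POE::Session->create(
        inline_states => {
            _start => sub {
                # ... issue the CP command to start/stop $guest here ...
                $postback->( "$action $guest done" );   # notify enqueuer
            },
        },
    );
}

# Elsewhere, from the main session:
#   $kernel->post( cmdqueue => enqueue => 'cmd_done', $guest, 'start' );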

The problems I have experienced recently include commands being queued but never actioned, and the statistics files (for the HTML gauges) no longer being updated.  There also seems to be a problem with the code that updates the IRC topic (which happens every minute if changes to the grid have occurred, such as guests being started or stopped, or every hour if the grid is in steady state): the topic gets updated even though no guest changes have occurred.  When this starts to happen, the bot eventually drops off IRC because it fails to respond to a server PING.  While it seems like the PoCo engine stops, some functions keep operating — before the IRC timeout happens, the bot will still respond to commands.  So it’s more of a “graceful degradation” than a total stop.

I noticed these problems because I fired a set of commands at the bot that should have resulted in 64 commands being enqueued to the command processing queue.  I have a “governor” that delays the actual processing of commands depending on the load on the system, so the queue would have grown to 64 items and over time the items would be processed.  What I found was that only 33 of the commands that should have been queued actually ran, and soon after that the updating of stats started to go wrong.  As I started to look into how to debug this, I was reminded of a very important thing about Perl and threads.

Basically, there aren’t any.

Well, that’s not entirely true — there is an implementation of threads for Perl, but it is known to cause issues with many modules.  Threading and Perl have a very chequered history, and even though the original 5.005-era threads implementation has been replaced by ithreads, there are many lingering issues.  On Gentoo, the ithreads USE flag is disabled by default and carries the warning “has some compatibility problems” (an understatement, as I understand it).

With my code, I was expecting that PoCo sessions were separate threads, or processes, or something, that would isolate potentially long-running tasks like the process I had implemented to update the hash that maintains guest status.  I thought I was getting full multitasking (there’s the word, did you get it?), the pre-emptive kind, but it turns out that PoCo implements “only” cooperative multitasking.  Whatever PoCo uses to relinquish control between sessions (the “cooperative” part) probably never gets a chance to run during my long-running task, and that starves the other sessions and queues.
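If I stay with PoCo, the textbook fix in a cooperative system is to break the long-running work into small chunks and hand control back to the kernel between them, so the IRC session and the queues get a turn.  A sketch, with the batch size and helper name purely hypothetical:

sub refresh_status {
    my ( $kernel, $heap ) = @_[ KERNEL, HEAP ];
    # Work on a small batch only, then yield; POE dispatches any other
    # pending events (IRC traffic, queue workers) before we continue.
    my @batch = splice @{ $heap->{guests_to_check} }, 0, 50;
    update_guest_status($_) for @batch;          # hypothetical helper
    $kernel->yield('refresh_status') if @{ $heap->{guests_to_check} };
}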

So I find myself having to look at redesigning it.  Maybe cramming all this function into something that also has to manage real-time interaction with an IRC server is too much, and I need to have a separate program handling the status management and some kind of IPC between them [3].  Maybe the logic is sound, and I need to switch from Perl and PoCo to a different language and runtime that supports pre-emptive multitasking (or, maybe I recompile my Perl with ithreads enabled and try my luck).  I may even find that this diagnosis is wrong, and that some other issue is actually the cause!

I continue to be amazed that this experiment, now in its fourth edition, has been more of an education in the ancillary components than in the core z/VM technology that makes it possible.  I’ve probably spent only 5% of the total effort on this project actually doing things in z/VM — the rest has been Perl programming for the bot, and getting Ganglia to work at scale (the most time-consuming part!).  If you extend the z/VM time to include directory management issues, 5% becomes perhaps 10% — still almost a rounding error compared to the effort spent on the other parts.  z Systems and z/VM are incredible — every day I value being able to work with systems and software that Just Work.  I wonder where the next twist in this journey will take me…  Wish me luck.

----

[1] When I was working at Queensland Rail I worked with a guy who, when telling a story, would always refer to the subject as “the second-biggest”, or “the second-tallest”, or “the second-whatever”.  It seemed he wanted to make his story that little bit unique, rather than just echoing the usual stories about “biggest”, “tallest”, or “whatever”.  One day, when he said I was wearing the second-ugliest tie he’d ever seen, I quizzed him: what was the ugliest?  Turned out he had never anticipated anyone actually asking, and he had no answer.  Shoutout to Peter (his Internet pseudonym) if you’re watching 😉

[2] Despite referencing [1], and despite being funny, this is actually an honest and factual statement.  I have written one more-complex piece of code since Uni: a system to automatically generate and activate VTAM DRDS decks (also while I was at QR) for dynamically updating NCP definitions.  Between the ISPF panels (written in DTL) and the actual program logic (written in REXX) it was well over 1000 lines.  The other major coding efforts I’ve done for z/VM cloning have probably been similarly complex to this project but had fewer actual lines of code.  Thinking about it, those other efforts drew on comparatively few external modules, while this script uses a ton of Perl modules; if you added the cumulative LOC of the modules to my code, the effective size of this script would be even larger.

[3] The previous incarnation of this rig had the main IRC bot plus another bot, running REXX in a CMS machine, for doing DirMaint work on z/VM.  The Perl bot and the REXX bot sent messages to each other over IRC as their IPC mechanism.  It was weird, but cute at the same time!  This time around I’m using SMAPI for the DirMaint interface, so there is no need for the REXX bot.


Confidence

Among the coffee mugs in my cupboard at home is one I’ve had for over 20 years.  It was a gift; if I remember right, a semi-joke gift in an office “Secret Santa”.

"Works and plays well with others"

“Works and plays well with others”. O RLY?

The slogan on it reads “Works and plays well with others”, and it’s a reference to one of the standard phrases seen on children’s school report cards.  It’s one of the standard mugs in my hot beverage rotation, and every time I use it I can’t help but think back to when it was new, and of how much has changed since those days.

It’s easy to treat a silly slogan on a coffee mug as little more than just a few words designed to evoke a wry grin from a slightly antisocial co-worker.  Sometimes it can take on a deeper meaning, if you let it.

For the last six months or more I’ve been working on transferring the function of our former demonstration facility in Brisbane to a location in Melbourne.  This has been fraught with problems and delays, not the least of which was an intermittent fault in the network our systems are connected to.  In steady state things would be fine; I could have an IRC client connected to a server in our subnet for days at a time.  But when I actually tried to do anything else (SSH, HTTP, etc.), within about five minutes all traffic to the subnet would stop for a few minutes.  When traffic passed again, it would stay up for five or so minutes and then fail again.  Wash, rinse, repeat.

It looked like the problem you get when Path MTU Discovery (PMTUD) doesn’t work and you have an MTU mismatch [1].  I realised that we had a 1000BaseT network connected to a 100BaseT switch port, so I went around all my systems and backed out the jumbo frame settings wherever I had tried to use them, but that made no difference to the network dropouts.  I found Cisco references to problems with ARP caches filling, but I couldn’t imagine that the network was so big that MAC address learning would be a problem (and if general MAC learning was constrained, why was no-one else having a problem?).

Everything I could think of drew a blank.  I approached the folks who run the network we uplink through, and all they said was “our network is fine”.  I was putting up with the problem, thinking that it was just something I was doing, and that in time we would change over to a different uplink and wouldn’t have to worry any more.  My frustration at having to move everything out of the wonderful environment we had in Brisbane down to Melbourne, with its non-functional network, multiplied every time an SSH connection failed.  I actually started to rationalise that it was pointless to continue setting up the facility in Melbourne: I’d never be able to re-create what I’d built in Brisbane, it would never be as accessible and useful, and besides, no-one other than me had ever made good use of the z Systems gear in the Brisbane lab anyway.  Basically, I had lost confidence in myself and my ability to get the network fixed and the Melbourne lab set up.

Confidence is a mental strength, like the muscles that provide our physical strength.  Just like muscle, confidence grows with active use and wastes away if underused.  Chemicals can boost it, and trauma can damage it.  Importantly, though, confidence can be a huge barrier to a person’s ability to “work and play well with others” — with too little confidence one lacks conviction and decisiveness; with too much, one appears overbearing and dictatorial.

Last week I was in Singapore for the z/VM, Linux on z, and KVM “T3” event.  Whenever I go to something like this I get fired up about all the things I’d like to work on and have running to demo.  The motivation to get new things going in the lab overcame my pessimism about the network connection (and my lack of confidence), and I got in touch with the intern in charge of the network we connect through.  All I needed, I said, was to see the configuration of the port we connect into.  We agreed to get together when I returned from Singapore and try to work out the problem.

We got into the meeting, and I went over the problem in terms of how we experience it — a steady state that could last for days, then activity leading to three-minute lockouts.  I asked if I could see the configuration of the port we attached to… after a little bit of discussion about which switch and port we might be on, a few lines of Cisco CatOS configuration statements appeared in our chat session.  Straight away I saw:

switchport port-security

W. T. F.

Within a few minutes I had Googled what this meant.  Sure enough, it tells the switch to monitor the active MAC addresses on the port and disable the port if “unknown” MACs appear.  No MACs were explicitly configured, so the switch simply remembered the first one it saw.  It explained why I could have a session running to one system (the IRC server) for ages, but as soon as I connected to something else everything stopped — the default violation mode is “shutdown”.  It explained why the traffic would stay down for three minutes and then begin again — elsewhere in the switch configuration was this:

errdisable recovery cause psecure-violation 180

If the switch disabled a port due to a port-security violation, the port would be automatically recovered after 180 seconds.

The guys didn’t really understand what this all meant, but it made sense to me.  Encouraged by my confidence that this was indeed the problem, they gave me the passwords to log on to the switch and do what I thought was needed to remove the setting.  A couple of “no” commands later and it was gone… and our network link has functioned perfectly ever since.
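For anyone facing the same thing, the removal amounts to a single line in interface configuration mode (the interface name here is hypothetical):

interface GigabitEthernet0/12
 no switchport port-security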

The real mystery for the other network guys was: why had this suddenly become a problem?  None of them had changed the network port definition, so as far as anyone knew the port had always been configured with Port Security.  The answer to this question is, in fact, on our side.  To z/VM and Linux on z Systems people, networks come in two modes: “Layer 3” or “IP” mode, where the system deals only with IP addresses, and “Layer 2” or “Ethernet” mode, where the system works with MAC addresses.  In Layer 3 mode, all the separate IP addresses that exist within the Linux and z/VM systems sit behind the single MAC address of the mainframe OSA card.  In Layer 2 mode, however, each individual Linux guest or z/VM stack gets its own MAC address.  When we first set up this network link and the z/VM and Linux systems behind it, the default operating mode was Layer 3, so the network switch only ever saw one or two MAC addresses.  Nowadays the default mode is Layer 2, and when I built new systems for the move down from Brisbane, I built them in Layer 2 mode.  Suddenly the network switch port was seeing dozens of different MAC addresses where it used to see one or two, and Port Security was being triggered constantly.
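For illustration, on the Linux side the mode is a per-device attribute of the qeth driver, set while the device is offline (the device address 0.0.0600 is just an example):

echo 0 > /sys/bus/ccwgroup/drivers/qeth/0.0.0600/online
echo 1 > /sys/bus/ccwgroup/drivers/qeth/0.0.0600/layer2    # 0 = Layer 3
echo 1 > /sys/bus/ccwgroup/drivers/qeth/0.0.0600/online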

This has been a learning experience for me.  Usually I don’t have any trouble pointing out where I think a problem exists and how it needs to be fixed.  Deep down I knew the issue was outside our immediate network, and yet this time, for some reason, I lacked the ability, motivation, nerve, or whatever, to chase the folks responsible and get it fixed.  The prospect of trying to work with a group of guys who, based on their previous comments, strongly believed their gear was not the problem was so daunting that it became easier to think of reasons not to bother.  Maybe it’s because I didn’t know for certain that it wasn’t something on our side — there is another problem in our network that definitely is in our gear — so I kept looking for a problem on our side that wasn’t there.

For the want of a 15-minute phone meeting, we had endured months of a flaky network connection.

On this occasion it took me too long to become sufficiently confident to turn my thoughts into actions.  Once I got into action, though, it was the confidence I displayed to the network team that got the problem fixed.  For me, lesson learned: sometimes I need a prod, but I am one who “works and plays well with others”.


[1] I get that all the time when using the Linux OpenVPN client to connect to this lab, and I’ve got into the habit of changing the MTU manually.  Tunnelblick on the Mac doesn’t suffer the same problem, because it has a clever MTU monitoring feature that keeps things going.


Oracle Database 11gR2 on Linux on System z

Earlier this year (30 March, to be precise) Oracle announced that Oracle Database 11gR2 was available as a fully supported product for Linux on IBM System z.  A while before that, they had announced E-Business Suite as available for Linux on System z, but at the time the database behind it had to be 10g.  Shortly after 30 March, they followed up the 11gR2 announcement with a statement of support for the Oracle 11gR2 database on Linux on System z as a backend for E-Business Suite — the complete, up-to-date Oracle stack was now available on Linux on System z!

In April this year I attended the zSeries Special Interest Group miniconf [1], part of the greater Independent Oracle Users Group (IOUG) event COLLABORATE 11.  I was amazed to discover that there are actually Oracle employees whose job it is to work on IBM technologies — just as there are IBM employees dedicated to selling and supporting the Oracle stack.  Never have I seen (close-up) a better example of the term “coopetition”.

Since my return from the zSeries SIG and IOUG, I’ve become the local Oracle expert.  However, I’ve had no more training than the two days of workshops run at the conference!  The workshops were excellent (held at the Epcot Center at Walt Disney World, no less!) but they could not an expert make.  So I’ve been trying to build some systems and teach myself more about running Oracle.  I thought I’d got off to a good start, too — I’d installed a standalone system, then gone on to build a two-node RAC.  I communicated my success to one of my sales colleagues:

“I’ve got a two-node RAC setup running on the z9 in Brisbane!”

“Great!  Good work,” he said.  “So the two nodes are running in different LPARs, so we can demonstrate high-availability?”

” . . . ”

In my haste I’d built both virtual machines in the same LPAR.  Whoops.  (I’ve fixed that now, by the way.  The two RAC nodes are in different LPARs and seem to be performing better for it.)

Over the coming weeks, I’ll write up some of the things that have caught me out.  I still don’t really know how all this stuff works, but I’m getting better!

Links:

IBM System z: www.ibm.com/systems/z or www.ibm.com/systems/au/z

Linux on System z: www.ibm.com/systems/z/os/linux/index.html

Oracle zSeries SIG: www.zseriesoraclesig.org

Oracle Database: www.oracle.com/us/products/database/index.html

[1] Miniconf is a term I picked up from linux.conf.au — the zSeries SIG didn’t advertise its event as a miniconf, but as a convenient name for a “conference-in-a-conference” I’m using the term here.



What a difference a working resolver makes

The next phase in tidying up my user authentication environment in the lab was to enable SSL/TLS on the z/VM LDAP server I use for my Linux authentication (I’ll discuss the process on the DeveloperWorks blog, and put a link here).  Apart from being the right way to do things, LDAP authentication appears to require SSL or TLS in Fedora 15.

After I got the Fedora system working, I thought it would be a good idea to have other systems in the complex using SSL/TLS also.  The process was moderately painless on a SLES 10 system, but on the first SLES 11 system I tried, YaST froze while saving the changes.  I (foolishly) rebooted the image, and it hung during boot.  Not fun.

After a couple of attempts to fix up what I thought were the obvious problems (each attempt involving logging off the guest, connecting its disk to another guest, mounting the filesystem, making a change, unmounting and disconnecting, and re-IPLing) with no success, I went into /etc/nsswitch.conf and turned off LDAP for everything I could find.  This finally allowed the guest to complete its boot — but now I had no LDAP.  A test with ldapsearch reported that it couldn’t reach the LDAP server.  I tried to ping the LDAP server by address, which worked.  I tried to look up the hostname of the LDAP server, and name resolution failed with the traditional “no servers could be reached” message.  This was odd, as I knew I’d changed it since it was pointing at the wrong DNS server before…  I could ping the DNS server by address, and another system resolved names fine.
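(The nsswitch.conf change, for reference, was simply trimming the ldap entries; the exact lines vary by release, but mine looked something like this:)

passwd: files          # was: passwd: files ldap
group:  files          # was: group:  files ldap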

I thought it might have been a configuration problem — I had earlier had trouble with systems not being able to do recursive DNS lookups through my DNS server.  I went to YaST to configure the DNS Server, and it told me that I had to install the package “bind”.  WHAT?!?!?  How did the BIND package get uninstalled from the system…

Unless…  It’s the wrong system…

I checked /etc/resolv.conf on a working system, and sure enough I had the IP address wrong.  I was pointing at a server that was NOT my DNS server.  Presumably the inability to resolve the name of the LDAP server is what made the first attempt to enable TLS for LDAP fail in YaST, and whatever preload magic SLES uses to enable LDAP authentication was broken by the failure.  Setting the right DNS server and re-running the LDAP Client module in YaST not only got LDAP authentication working but got me a bootable system again.

A simple fix in the end, but I’d forgotten the power of the resolver to cause untold and unpredictable havoc.  Now, pardon me while I lie in wait for the YaST-haters who will no doubt come out and sledge me…  🙂


RACF Native Authentication with z/VM

In 2009 I was part of the team that produced the Redbook “Security for Linux on System z” (find it at http://www.redbooks.ibm.com/abstracts/sg247728.html).  Part of my contribution was a discussion of using the z/VM LDAP Server to provide Linux guests with a secure password authentication capability.  I probably went a little overboard with the screenshots of phpLDAPadmin, but overall I think it was useful.

I’ve come back to implement some of what I put together then, and unfortunately found…  not errors as such, but things I could perhaps have discussed in a little more detail.  I’ve been using the z/VM LDAP Server on a couple of systems in my lab, but had not enabled RACF.  I realised I needed to “eat my own cooking”, though, so I decided to implement RACF and enable the SDBM backend, as well as switch to using Native Authentication in the LDBM backend.

Native Authentication provides a way for security administrators to present a standard RFC 2307 (or equivalent) directory structure to clients while taking advantage of RACF as a password or pass phrase store.  Have a look in our Redbook for more detail, but basically the usual schema is loaded into LDAP and records are created using the usual object classes like inetOrgPerson, except that the records do not contain the userPassword attribute.  Instead of comparing a presented password against a field in LDAP, the z/VM LDAP Server (when Native Authentication is enabled) issues a RACROUTE call to have RACF check the password.
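To illustrate, a native-auth user entry looks something like this (the DN and attribute values are hypothetical); note the absence of a userPassword attribute, and the optional ibm-nativeId attribute naming the RACF userid:

dn: uid=someuser,ou=people,o=example
objectClass: inetOrgPerson
objectClass: posixAccount
objectClass: ibm-NativeAuthentication
uid: someuser
cn: Some User
sn: User
uidNumber: 1001
gidNumber: 100
homeDirectory: /home/someuser
ibm-nativeId: SOMEUSER
# no userPassword attribute -- RACF checks the password instead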

In my existing LDAP database, I had user records that were working quite successfully to authenticate logons to Linux.  My plan was simply to enable RACF, creating users in RACF with the same userid as the uid field in LDAP (I have access to a userid convention that fits RACF’s 8-character restriction, so no need to change it).  After going through the steps in the RACF program directory, and various follow-up tasks to make sure that various service machines would work correctly, I did the LDAP reconfiguration to get Native Authentication.

At this point I probably need to clarify my userid plan.  The documentation for Native Authentication in the TCP/IP Planning and Administration manual says that the LDAP server needs to be able to work out which RACF userid corresponds to the user record in LDAP in order to validate the password.  It does this either by having the RACF userid explicitly specified in the ibm-nativeId attribute (the object class ibm-NativeAuthentication has to be added to the user object), or by matching the existing uid attribute against RACF.  The latter is what I hoped to use: with the same ID in RACF as I was already using in LDAP, I wouldn’t need the extra object class and attribute.  In the Redbook, because my RACF ID was different from the LDAP one, I went straight to using the ibm-nativeId attribute and never went back to test the uid method.

So, I gave it a try.  I had to disable SSH public-key authentication so that my password would actually be used, and once I did that I found that I couldn’t log on.  It didn’t matter whether I tried my password or pass phrase; neither was successful.  I read and re-read all the LDAP setup tasks and checked the configuration, but it all looked fine.  In one of those “let’s just see” moments, I decided to see if it worked with the ibm-nativeId attribute specified in uppercase…  and it did!

Okay, so it appeared that the matching of uid against a RACF ID is case-sensitive.  I decided to create a different ID in LDAP, with an uppercase uid, to double-check.  Since phpLDAPadmin wouldn’t let me create an uppercase version of my own userid (that would be non-unique), I created a different LDAP ID to test:

[viccross@laptop ~]$ ssh MAINT@zlinux1
Password:
Could not chdir to home directory /home/MAINT: No such file or directory
/usr/X11R6/bin/xauth:  error in locking authority file /home/MAINT/.Xauthority
MAINT@zlinux1:/>

My MAINT user in LDAP has no ibm-nativeId attribute, so the only operational difference is the uppercase uid.  (The error messages are caused by the LDAP userid not having a home directory; I use an NFS-shared home directory and hadn’t bothered setting up a homedir for a test userid.)

The final test was to change the contents of the ibm-nativeId attribute in my own LDAP user record to lowercase — and that broke my login.  So it would seem that the userid check against RACF is case-sensitive wherever LDAP gets the userid from.  I’m going to have a look through the documentation to see if there’s something I need to change, but this looks like something to be aware of when using Native Authentication.

I also noticed that I didn’t describe the LDAP Server SSL/TLS support in the Redbook, but that’s a post for another day…


OpenSSL speed revisited

I realised I never came back and reported the results of my OpenSSL “speed” testing after our 2096 was upgraded.  For reference, here is the original chart, from when the system was sub-capacity:

[Chart: OpenSSL speed results, software versus CPACF across block sizes, on the sub-capacity (F01) system]
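(The numbers come from OpenSSL’s built-in benchmark.  The comparison runs were along these lines, assuming the ibmca engine provides the CPACF path, and with the cipher chosen purely as an example:)

openssl speed aes-128-cbc                        # built-in software code
openssl speed -evp aes-128-cbc -engine ibmca     # EVP path via ibmca (CPACF)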

… and the question was: does the CPACF run at the speed of the CP (i.e. does it run sub-capacity if the CP is sub-capacity), or does it run at full speed like an IFL, zIIP or zAAP?  If the latter, the results after the upgrade should be the same as before — that would indicate that the speed of crypto operations does not change with CP capacity, and that CPACF is always full speed.  If the former, we should see an improvement between pre- and post-upgrade, indicating that the speed of CPACF follows the speed of the CP.

Place your bets…  Okay, no more bets…  Here’s the chart:

[Chart: OpenSSL speed results before and after the upgrade]
The graph compares the results from the first chart in blue (when the machine was at capacity setting F01) with the full-speed (capacity setting Z01) results in red.

Okay, so did you get it right?  If you know your z/Architecture, you would have!  As the name suggests, the Central Processor Assist for Cryptographic Function (CPACF) is pretty much an adjunct to each CP, just like any standard execution unit (the floating-point unit, say).  It is not like the Crypto Express cards, which are an I/O device totally separate from the CP.  Because the CPACF is directly associated with each CP, on a sub-capacity CP it is bound to the speed of that CP.

If you look closer, further evidence that CPACF performance scales with the capacity setting can be seen in the respective growth rates of each set of data points.  To see this a little more clearly (I don’t know the right mathematical terms to describe the shape of the curves, so I’ll just show you) I drew a couple more graphs:

[Charts: left, the same before/after results drawn as line graphs; right, the CPACF-versus-software speed multiplier at each block size]

Looking at the left graph (the same data as the bar graph above, just drawn as lines), you can see that in both the software and the CPACF case the lines for before and after the upgrade follow the same trend with respect to block size.  If these lines followed different trends — for example, if the Z01 CPACF line was flat across the block-size range instead of a gently falling slope like the F01 line — I’d suspect something else was affecting the result.  Looked at a different way, the right-hand graph shows the “times-X” improvement of CPACF over software.  You can see that the performance multiplier (i.e. the relative performance improvement between software and hardware; CPACF is 16x software at 8192-byte blocks) was the same before and after the upgrade at each block size.

Now, just to confuse things…  Although I’ve used OpenSSL on Linux as the testing platform for this experiment, most Linux customers will never see the effects I’ve demonstrated here.  Why?  Because Linux usually runs on IFLs, and the IFL always runs at full speed!  Even if there are sub-capacity CPs installed in a machine with IFLs, the IFLs run at full speed, and so too does the CPACF associated with the IFLs.  I’ll say it again: CPACF follows the speed of the associated processor, so if you’re running Linux on IFLs the CPACF on those IFLs will be full capacity, just like the IFLs themselves.  If you have sub-capacity CPs for z/OS workload in the same machine as IFLs, the CPACF on the CPs will appear slower than the CPACF on the IFLs.

As far as the actual peak number is concerned, it looks big!  If I understand it right, 250MB/sec would be more than enough for a server doing SSL/TLS traffic to drive Gigabit Ethernet at line speed (traffic over established sessions, NOT the certificate exchange at connection setup; the public-key crypto for certificate verification takes more hardware than just CPACF, at least on the z9).  And that’s just one CP!  Enabling more CPs (or IFLs, of course) gives you that much more CPACF capacity again.  Keep in mind that these results come from hardware that is two generations old — I would expect z10 and z196 hardware to produce higher results on any of these tests.  Regardless, these are not formal, official measurements and should not be treated as such — do NOT use any of these figures as input to system sizing estimates or other important business measurements!  Always engage IBM to work with you on sizing or performance evaluations.


Sharing an OSA port in Layer 2 mode

I posted on my developerWorks blog about an experience I had sharing an OSA port in Layer 2 mode.  Thrilling stuff.  What’s more thrilling is the context of my OSA-port-sharing experience: my large-scale Linux on System z cloning experiment.  One of these days I’ll get around to writing that up.


Short trip to Singapore

This week I’m in Singapore, running a training course on z/VM and Linux on System z. I really enjoy coming here! This is the first time I’ve done any kind of work here, and I’m enjoying fitting into the daily commute in another city!

The weather here is, obviously, hot and humid. It’s been far from unbearable though; in fact, I’d almost say it’s comfortable (which is quite something coming from someone who usually can’t stand hot weather). I’ve rediscovered the transport system, the excellent MRT train system with its regular services and cheap fares, and I’m using it to get to and from work.

I’ll make some further notes as the week goes on. Wish me luck with the training!


OpenSolaris on System z

It’s all the rage on YouTube, apparently…  posting video of a z/VM system booting something.  Only kidding — this is a good piece of tech.  If you search YouTube for “OpenSolaris System z” you’ll find a set of five videos showing an interview (recorded at the recent Gartner datacentre conference) with David Boyes of Sine Nomine Associates, demonstrating OpenSolaris running on an IBM System z mainframe.  It’s a great achievement, and a fine piece of work — but there’s a catch.

I can’t stress enough what a great job David, Neale (Aussie, Aussie, Aussie!), Adam and everyone at SNA have done.  Networking is not there yet, but I trust it’s not far (need a hand fellas? (: ).  It must have been a hard slog, and for some (particularly Neale) perhaps brought some unpleasant memories (anyone remember Bigfoot?).  Congratulations are deserved.  I can see the lolcat now: I IS SUN. IM IN UR MANEFRAYM, KIKIN OUT YR PENGUINZ.  YA RLY!  Only joking!

The catch is, ironically, the aspect of the port that makes it most useful in the “real” world: the guys have made the port dependent on z/VM.  Don’t get me wrong, it’s the right thing to do — without z/VM, you can’t play to the strengths of the System z platform and its capabilities for massive resource sharing in a virtualisation environment.  Many believe that Linux on System z should have been taken in the same direction, as other platforms (like System p) do big-single-Linux-footprint better than System z does.

The twist is that by tying the OpenSolaris port to z/VM, they’ve excluded a set of would-be hackers from contributing to the effort: those with motivation, time, skill, and a big Intel box that can squeeze a couple of hundred MIPS out of Hercules.

There are, rightly or wrongly, a lot of people who think that Solaris is a good platform.  These are the kind of people I’m thinking of — maybe folks who have always derided the mainframe, but perhaps are now thinking “gee, well if it runs Solaris now, it can’t be all bad.  Maybe I’ll check it out”.

Obviously I can’t speak for Sun (nor for IBM or SNA), but I’m sure I read that one of the objectives of OpenSolaris was to get Solaris into more hands and to benefit from the “millions of pairs of eyes” effect that Linux enjoys.  It seems ironic, then, that the first non-Sun platform to which OpenSolaris has been ported is one that doesn’t contribute to that goal.

Not to worry.  David at SNA has stated that they are committed to releasing their work to the community.  That will be the point at which an interested party could look at the code and potentially rip out or rewrite the z/VM-specific bits.  It wouldn’t be impossible — even CMS was able to IPL standalone once upon a time — but it would be a huge piece of work (no doubt part of SNA’s reasoning was to let z/VM do a lot of the heavy lifting for I/O and similar tasks; all of that would have to be written for OpenSolaris).  Bags I not-it.  Likewise, our potential interested party would be very likely to turn away to Linux… or even away from System z entirely.

Meh, enough doom-talk.  I’ve downloaded three different flavours of OpenSolaris for x86 (NexentaOS, which I had a brief look at previously; Solaris Express Developer Edition; and something calling itself the “Indiana Preview”) and I’m running them in VMware to have a poke around (not all at the same time, though — they each need a heap of memory).

I’ll be following this as closely as I can (or as closely as I’m allowed).  I think it will be really interesting to see how this progresses.  Good luck to all involved (and if you need a hand guys… 😉)


Rebooting my belief system

I’ve been away from SHARE for far too long.  It’s really great to hear positive things about Linux on zSeries again, rather than the crap I have to put up with at home.

In Australia, there is no evangelism of zSeries.  There’s an attitude bordering on arrogance that seems to say “we’re not going to explain zSeries to you; if you don’t know you want it already then you’re not worth it”.  At least that’s what it looks like to me.

I’m surrounded by people who think that all problems can be solved by installing an xSeries or pSeries machine.  Maybe some can be, but IMHO they’ll be replacing one set of problems with another (possibly greater) set.

Anyway, it’s nice to hear different stories — like a company whose IT costs went from 1.7% to 0.9% of sales by migrating their ENTIRE server farm (including about a dozen p690s) to a z990 running Linux.  Like a company that has placed 250 Linux server guests onto z/VM inside a year, freezing acquisition of new discrete servers.
