There are dozens upon dozens of improvements that IBM has made to its new mainframe architecture – the z13. But the biggest, stand-out improvement is that the mainframe is now well suited to integrate real-time analytics into the transactional stream.
I have been hard-pressed for years to argue that mainframe architecture is superior to other architectures (including RISC, x86 and EPIC architectures) when it comes to data-intensive analytics workload processing. My primary issue started at the processor level, because the z Systems processor has long been a single-threaded, stacked-work design that excels at transaction processing. But I also had other concerns about the amount of available memory and cache; data compression; and system-to/from-storage latency (I/O throughput). If System z were to become a true analytics server, improvements in each of these areas were needed.
The good news is that IBM has made substantial improvements in each of these areas (discussed in the next subsection). As a result, the z13 is extremely well suited to process transactions AND perform analytics simultaneously. This is important because it means that transactional data presently held on a z Systems server does not need to be moved to different types of servers for analytics. By keeping z data where it is, enterprises can save a considerable amount of money by not having to invest in new hardware (more systems, storage and networks) – and even more by not having to manage their Extract, Transform and Load (ETL) processes (so human management costs are lowered, potential errors are eliminated, and security is improved).
My recommendation, based upon the improvements that have just been made to z Systems, is this: if an enterprise captures transactional data on a mainframe and wants to analyze that data, the analysis should take place on a z13 (because moving that data away from the mainframe is both costly and fraught with potential errors and associated risks).
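To make the "analyze the data where it lives" idea concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database purely as a stand-in for a transactional system of record; it is not DB2 on z/OS, and the table name, column names and helper functions are all hypothetical. The point is only that the analytic query reads the same live table the transactions are written to, with no extract, transform or load step in between.

```python
import sqlite3

def build_transaction_store():
    """Create an in-memory table standing in for a transactional system of record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE txn (id INTEGER PRIMARY KEY, account TEXT, amount REAL)"
    )
    return conn

def record_transaction(conn, account, amount):
    """Transactional path: insert one row and commit it."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO txn (account, amount) VALUES (?, ?)", (account, amount)
        )

def spend_by_account(conn):
    """Analytic path: aggregate directly over the live table, with no copy step."""
    rows = conn.execute(
        "SELECT account, COUNT(*), SUM(amount) FROM txn "
        "GROUP BY account ORDER BY account"
    )
    return {account: (count, total) for account, count, total in rows}

conn = build_transaction_store()
for account, amount in [("A", 10.0), ("B", 5.0), ("A", 2.5)]:
    record_transaction(conn, account, amount)

summary = spend_by_account(conn)
print(summary)
```

Because the aggregate runs against the table the transactions land in, a newly committed transaction is visible to the very next analytic query. In an ETL arrangement, that same row would not appear in the analytics copy until the next extract had run.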
A Closer Look at z Systems Analytics Improvements
IBM has been improving z Systems data-intensive processing for the past twenty years. At the end of the 1990s, IBM delivered IEEE binary floating point facilities. The early 2000s brought 64-bit computing and superscalar parallelism to the mainframe (superscalar architecture implements a form of parallelism called “instruction level parallelism” – allowing a single processor to execute more than one instruction per clock cycle). Also, processor clock speeds continually increased.
Further, IBM added “out-of-order execution,” which substantially improved floating point performance (enabling the mainframe to rival RISC and EPIC processors such as IBM’s own POWER, Sun [now Oracle] SPARC, and Intel’s Itanium). And, when IBM introduced the EC12, it added significantly more memory and significantly more on-chip cache. All of these improvements – plus better compute-intensive [numerically-intensive] SIMD (single instruction, multiple data) vector processing that has been added in the z13 – have contributed to making the mainframe environment a much more competitive analytics server.
But still, the mainframe needed further improvements to make it more competitive with other architectures that were designed to handle the type of parallel processing activities typical in analytics systems. The processor needed to handle more threads per clock cycle; it needed to have faster access to larger amounts of data held in memory and cache; data compression at the processor level needed to be increased; and I/O speed needed to be improved so I/O bottlenecks could be eliminated. Finally, the software ecosystem needed to be expanded such that more sophisticated analytics software could be deployed. IBM has addressed all of these issues with the new IBM z13.
The Analytics Improvements
IBM’s new z13 boasts dozens of improvements over its EC12 predecessor – including more capacity (a 40% improvement); 141 configurable cores (including combinations of central processors, Integrated Facility for Linux, z Integrated Information Processors, and others); 320 separate channels of dedicated input/output; the planned arrival of KVM (open source virtualization); and the arrival of the IBM General Parallel File System (GPFS, a large, parallel file system) for Linux on System z. (The new z13 also boasts up to 10 TB of memory at lower costs; IBM Collocated Application Pricing (ICAP); new multiplex pricing; and a z/TPF transformation engine.)
A closer look at improvements that make the z13 a better analytics engine, however, shows the following:
- A new processor design geared to process dual threads simultaneously (simultaneous multi-threading delivers more throughput… up to 38% for zIIP and up to 32% for IFL processors);
- A new 8-core processor design based on 22nm silicon technology that features a wider instruction pipeline;
- Better data compression with improved on-chip hardware compression;
- New SIMD instructions (faster mathematical processing);
- A 2X increase in channel speed using 16 Gbps FICON (which also improves DB2 write operations with IBM zHyperWrite by 43%);
- An improved I/O backbone (50-80%+) to drive increased transaction throughput per I/O domain;
- New virtualization capabilities for RoCE Express to share the feature across LPARs to reduce latency and the need for CPU cycles in communications processes;
- A 3X increase in available main memory (now 10 TB – enabling faster data processing); and,
- A 2X increase in L2 cache (again, placing more data closer to the processor for faster processing).
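The percentages and multipliers in the list above are easier to weigh against a concrete baseline. The short Python sketch below does that arithmetic. The baseline of 1,000 units of work per second is invented for illustration, and the function names are hypothetical; only the 38%, 32%, 3X and 10 TB figures come from the list, and the SMT figures are "up to" values, so the results are best-case ceilings rather than expected outcomes.

```python
# Figures quoted in the list above; all are "up to" or rounded values.
SMT_GAIN_PERCENT = {"zIIP": 38, "IFL": 32}
MEMORY_MULTIPLIER = 3      # "3X increase in available main memory"
Z13_MAX_MEMORY_TB = 10     # "now 10 TB"

def smt_upper_bound(baseline_units, engine):
    """Best-case throughput after enabling SMT, for a hypothetical baseline."""
    return baseline_units * (100 + SMT_GAIN_PERCENT[engine]) / 100

def implied_prior_memory_tb():
    """Memory ceiling of the predecessor if '3X' is taken literally."""
    return Z13_MAX_MEMORY_TB / MEMORY_MULTIPLIER

# Hypothetical baseline: 1,000 units of work per second on one engine.
ifl_best_case = smt_upper_bound(1000, "IFL")
ziip_best_case = smt_upper_bound(1000, "zIIP")
prior_memory = implied_prior_memory_tb()
print(ifl_best_case, ziip_best_case, round(prior_memory, 1))
```

Read this way, an engine doing 1,000 units of work per second could reach at most 1,320 (IFL) or 1,380 (zIIP) with simultaneous multi-threading, and a literal reading of "3X" puts the earlier memory ceiling at a little over 3 TB.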
In addition, IBM improved its IBM DB2 Analytics Accelerator for z/OS offering – a tightly coupled “sidecar” appliance that can significantly speed the processing of complex analytics workloads. (This appliance uses field programmable gate arrays and Intel processors to do so.) Recent improvements to the IBM DB2 Analytics Accelerator include the ability to accelerate a broader spectrum of queries, including support for static SQL, multiple-row FETCH and multiple encodings on the same accelerator; improved workload balancing, including incremental update performance and improved monitoring; and improved storage performance, including built-in restore, better access control of archived partitions, and protection for moved partitions.
Further, IBM has improved the analytics software ecosystem for z Systems. About five years ago, the company’s management mandated that its software products would be developed to work across all IBM servers (this included mainframe, System x, and Power Systems). Previously, IBM’s Cognos, SPSS and other analytics products had favored distributed systems – but the company’s cross-platform mandate changed that. So the same reporting tools, utilities and applications that were perfected and honed to perform analytics on distributed systems are now available and have been optimized for the mainframe.
Finally, IBM has built IT analytics software specifically for the mainframe. For example, consider the company’s zAware environment (now enabled for Linux on z Systems as well as z/OS environments), an application that uses analytics to create a model of mainframe behavior, and then identifies changes that have negatively affected the system. IBM also recently launched its Big Data solution on z with IBM InfoSphere BigInsights and IBM InfoSphere System z Connector for Hadoop.
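The general idea of modeling normal behavior and then flagging departures from it can be sketched in a few lines of Python. This is not how zAware is implemented; it is a deliberately simple stand-in that summarizes a training window by its mean and standard deviation and flags later readings that fall too far from that baseline. The function names, the sample values and the three-standard-deviation threshold are all assumptions made for illustration.

```python
from statistics import mean, stdev

def build_baseline(samples):
    """Summarize 'normal' behavior as (mean, sample standard deviation)."""
    return mean(samples), stdev(samples)

def find_anomalies(baseline, observations, threshold=3.0):
    """Return (index, value) pairs further than `threshold` std devs from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat baseline: any different value counts as a change.
        return [(i, x) for i, x in enumerate(observations) if x != mu]
    return [
        (i, x)
        for i, x in enumerate(observations)
        if abs(x - mu) / sigma > threshold
    ]

# Hypothetical metric, e.g. messages logged per minute during normal operation.
training = [100, 102, 98, 101, 99, 100, 103, 97]
baseline = build_baseline(training)

# New readings: the third one departs sharply from the learned baseline.
anomalies = find_anomalies(baseline, [101, 99, 140, 100])
print(anomalies)
```

With these sample values the baseline is a mean of 100 and a standard deviation of 2, so only the reading of 140 is flagged. A production tool would need a far richer model, but the two-step shape (learn the baseline, then compare new behavior against it) is the same.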
In short, in the new z13, IBM has addressed all of my previous concerns regarding the suitability of System z for processing analytics workloads.
With the new z13, IBM has addressed specific processor and system architectural limits that kept the mainframe from becoming a premier analytics processing engine. With major improvements in the amount of available memory and cache; with faster I/O; with better compression – and especially, with multi-threading – the mainframe now rivals traditional RISC and x86 competitors when it comes to data-intensive analytics processing.
With the new z13, IBM is delivering a strong benefit to mainframe customers: the ability to process transactional data and perform analytics on the same system in real time. What this ultimately means is that mainframe customers no longer need to move their data to other systems for analytics processing (and this, in turn, reduces hardware/software/management costs). But it also means that mainframe users can get more reliable results more quickly (because latency is eliminated by not having to move data – and the data is “known-good” because it hasn’t been duplicated and manipulated during the move (ETL) process).
The big question in my mind now is how the market will react to these changes. The mainframe is now capable of delivering analytics results on transactional data in real time. And this means that data no longer needs to be ETLed to distributed processors. But the ETL process has been entrenched for a couple of decades – so people are used to moving their data to other processors for analysis. Plus, those same people are usually “distributed systems” customers who will not be exactly thrilled to see work moving out of the distributed computing silo to mainframe environments. Enterprise executive managers will need to become involved in guiding their enterprises toward a more efficient, less costly approach to analyzing mainframe data. My biggest question, however, is whether top-level executive management will step in and referee this situation – or continue to stick with less efficient, riskier ETL processes. With the z13, IBM has supplied a very positive answer on the technology side. Let’s see if businesses are listening.