From:	CSBVAX::MRGATE!@KL.SRI.Com,@cis.upenn.edu:CLAYTON@xrt.upenn.edu@SMTP  3-NOV-1987 07:07
To:	EVERHART
Subj:	My Comparison Of DEC and SI HSC/UDA Based Disk Systems...

Received: from linc.cis.upenn.edu by KL.SRI.COM with TCP; Sun 1 Nov 87 18:32:36-PST
Received: by linc.cis.upenn.edu id AA06928; Sun, 1 Nov 87 21:26:52 EST
Posted-Date: Sun, 1 Nov 87 21:04 EDT
Message-Id: <8711020226.AA06928@linc.cis.upenn.edu>
Received: from xrt.upenn.edu by cis.upenn.edu; Sun, 1 Nov 87 21:24 EDT
Date: Sun, 1 Nov 87 21:04 EDT
From: "Clayton, Paul D." <@cis.upenn.edu:CLAYTON@xrt.upenn.edu>
Subject: My Comparison Of DEC and SI HSC/UDA Based Disk Systems...
To: ECKERT%CAR54.DEC@decwrl.dec.com, INFO-VAX@kl.sri.com
X-Vms-To: @INFOALL,CLAYTON

Information From TSO Financial - The Saga Continues...
Chapter 32 - October 25, 1987

Apologies for the lengthy delay in getting this article out to the network and others. This is the second time it's been sent; the first apparently went to bit-land. I hope you will find it worth the wait. :-)

This article is to inform interested parties about the installation, use and problems that I have had with System Industries (SI) SI83C and SI93C drives. These drives are currently being sold by SI for connection to HSC/UDA disk controllers for use in VAXClusters. This article is not commissioned by System Industries and all comments here are my own.

The SI93C disks I received were packaged with six (6) drives per cabinet, while the SI83C drives are packed eight (8) per cabinet. The original SI83C drives I received had 1,021,140 blocks of formatted space available to VMS. These drives have since been reformatted to allow more room for diagnostic work areas; the block count now available is 1,016,199 and the drive type has also changed from an RA81 to an RA82.

The SI83C drives are dual ported between 2 HSC controllers, one HSC50 and one HSC70, and driven by a six node VAXCluster of two 8700's, one 8530 and three 11/785's.
The last two drives are ported between a UDA50 inside an 8200 and the HSC70 listed above. We are running VMS 4.5 and, from time to time, Volume Shadowing software.

The SI93C drives are single ported to an HSC70 and driven by an 8350 VAXCluster cornerstone, running VMS 4.5 and Volume Shadowing all the time.

The pertinent comparison information is presented in Table A, showing various aspects of the different drive types. The SI93C disk drives also display as an RA82 device to the HSC.

Drive    Blocks Per   # per     512-Byte Blocks   Cost per   Cost per
Type     Disk         Cabinet   per Cabinet       Cabinet    MB
-----    ----------   -------   ---------------   --------   --------
RA81       891,072       4         3,564,288        65,000     35.62
RA82     1,214,843       4         4,859,372        76,440     30.72
SI83C    1,016,199       8         8,129,592       115,000     27.62
SI93C    1,644,300       6         9,865,800       130,000     25.74

           Table A   Comparison Of Pertinent Information

The SI83C drives came on a wood pallet, since they were shipped air freight, and the pallet had a unique feature. Instead of supplying ramps to roll the cabinet down, like DEC does, one of the pallet's side supports unbolts from the pallet and the whole works then tilts forward. Neat little trick, and it works just like having ramps. All the SDI cables came in two boxes and were accompanied by documentation for bad blocks and a 'Users Guide'. Three packages total.

The SI93C drives were delivered on their own wheels, since these were shipped over land by truck, and were wrapped in bubble plastic and cardboard. It was a simple case of rolling them into place and unwrapping them, no fuss.

After rolling the drives into place the first of several surprises came to light. SI has wired the electrical such that four (4) drives are connected to a power supply that is located VERTICALLY on the left side of the cabinet. The breaker switch is on the front of the power supply and as such is recessed about 1.5 inches from the front of the door jamb. I have told SI about the close proximity of the breaker switch to the front edge and how easy it would be to ACCIDENTALLY knock the switch and kill four drives in one shot.
When the breaker is 'ON', it is in the 'UP' position. SI has said that cover plates, or something similar, will be provided on future releases and available for installed customers if desired. The other note here is that there are two power supplies and associated power plugs (L5-30R) that have to be wired for. Due to having the power supplies on the side of the cabinet, the overall size IS wider than a quad-pack of RA type drives by 9.5 inches, while the height is the same. This is subject to change as I hear that the cabinet is being redesigned.

The next item to notice is the cabinet itself. It is a solid looking box with fans under the top to aid the air flow. The front door is a metal panel with air slots cut out for cooling. There is no way to see the state of the controller switches and unit numbers without OPENING the door. That is a pain to me, but then again I do not have to get FCC approval for RFI containment. The door handle is something that, as I have mentioned to SI, cries out for modification. An Allen wrench is REQUIRED to open the door. To me, it looks lousy seeing a nice looking cabinet marred by the CONSTANT presence of an Allen wrench. The back of the cabinet is also opened by the Allen wrench.

I have always heard from my salesman that the SI83C drives are targeted to be shipped with up to twelve (12) drives per cabinet. From what I see in my cabinet, they would fit, vertically. I have to question the ability of the cabinet to hold all the cables in the back going from the SI controller to the SDI junction panels and from the drives to the SI controller. The weight and cooling of the cabinet also appear to play a part in squeezing twelve in a cabinet.

The 83/93 drives themselves are mounted flat and are placed two across the 19 inch rack space. A panel on the SI83C drives is in front to dress them up a bit, but when you order eight in a cabinet, SI spaces the eight drives and two controllers over the entire height of the cabinet.
The result is that you can see the FUJI drives and SI controllers top and bottom, and a consistent 'dress' appearance is lacking. Elimination of cooling problems has been suggested as the reason for spreading them over the height of the cabinet. The problem with this is that CAPACITY/FOOTPRINT is a major selling point for these drives. At some future point I may want to put the last four 83C drives in the cabinet I currently own. The ONLY way to do that is to essentially remove ALL the drives from the cabinet and move them and the cables up/down in the cabinet. This to me can be the cause of problems to come; why mess with something that already works? The 93C drives are shipped tightly packed with NO gaps between the disk drives and the SI interfaces.

I have to admit that seeing TWO 510MB FUJI, or TWO 858MB NEC drives side by side in roughly the same space as ONE 420MB RA-81 disk usually gets a chuckle out of me on the bleakest of days.

The SI controller that provides the DSA compatibility and allows the FUJI's/NEC's to work on HSC and UDA type controllers is approximately five (5) inches high and 19 inches wide. Each controller provides the interface for up to four (4) drives. The model of the controller that I received provided the next area for serious conversations with SI.

First, as far as I am concerned, is that the logical unit number that is 'keyed in' for each drive, such as 0 or 1, had only two selection wheels and the number was based in HEX, not DECIMAL. This runs counter to the rest of the world and I told SI that it was the dumbest mistake in designing the package. I shudder at thinking about making sure my DECIMAL to HEX conversion is correct and the consequence of doing it wrong. Hence my decision to number the SI drives way out in left field so that conflicting numbers would not occur by accident.
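
To make the conversion risk concrete, here is a minimal Python sketch of the DECIMAL to HEX arithmetic involved. It assumes that each of the two selection wheels dials one hexadecimal digit, giving unit numbers 0 through 255; that wheel layout and the function names are illustrative assumptions, not taken from SI documentation.

```python
def unit_to_wheels(unit: int) -> str:
    """Return the two HEX wheel digits for a decimal unit number.

    Assumes two wheels, one hexadecimal digit each (00 through FF).
    """
    if not 0 <= unit <= 0xFF:
        raise ValueError("two HEX wheels can only express units 0-255")
    return format(unit, "02X")


def wheels_to_unit(wheels: str) -> int:
    """Return the decimal unit number that a two-digit HEX setting selects."""
    if len(wheels) != 2:
        raise ValueError("expected exactly two wheel digits")
    return int(wheels, 16)


if __name__ == "__main__":
    # Unit 20 must be dialed as 14; dialing "20" by mistake selects unit 32.
    print(unit_to_wheels(20))
    print(wheels_to_unit("20"))
```

Under these assumptions, any unit number above 9 is dialed differently in HEX than it is written in DECIMAL, which is exactly where a slip would put two drives on the same unit number.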

This HEX number problem has since been corrected and I received a replacement set of panels which provide for the unit number in DECIMAL, with a range of 0 to 999. The new panels also provide a hardware write protect switch which was lacking in the original design. There has always been a write protect switch on the drives themselves; this other switch is in addition to the drive switch. The remaining buttons on each panel are for enabling the 'A' and 'B' ports to the outside world. The nice feature that I like here is that if the port is enabled, currently selected, and in use, the LED remains in a constant 'ON' state. If the port is enabled but NOT selected, the condition when HSC failover is being provided for, the LED is in a 'BLINKING' state. This is very helpful in making sure that both ports are enabled. On the RA drives, the only light lit is the one currently selected. I am always going around and pushing the other port button to determine what 'state' it is in.

Another interesting item on this controller is that each interface adapter between the FUJI/NEC drive and the HSC, called a 'C-Mod Card', is approximately six to seven inches square and contains its own Motorola 68000 microprocessor chip and supporting circuitry. These are stacked two high with metal standoffs between them. There is plenty of space left inside the box, which leads one to wonder if SI has any thoughts of additional options inside the box.

The next area of consequence during the installation process was in connecting the SDI cables from the HSC to the cabinet. The cabinet I have is designed such that there are two panels, each about 3 by 18 inches and held in place by 4 screws, that are used to connect the SDI cables. Each panel has provisions for connecting 8 cables to it. These panels also provided material for further serious conversations with SI.
The problem here is that there is about 4 inches of space from the bottom of the cabinet to the top of the computer floor. In order for the installation of the cables to occur, the field service representative removes the screws holding the panels in place so that a screwdriver can be used to lock the SDI cables to the panel. Within that 4 inch space, up to sixteen (16) SDI cables, which are thick and bulky, have to get at least one and maybe two 90 degree bends in them to feed down into a hole in the computer room floor. This is not an easy task and is the reason for my NOT having a floor tile behind the SI cabinet. I consider it very risky to subject those cables to that kind of bending.

I have received the new version of the SDI connection panel on the SI93C drives and it now provides for VERY easy cabling and plenty of space to position the cables for getting through the computer floor. The question of retrofitting to the existing user base has not been answered yet to my knowledge. Having a floor tile missing is a hazard and unsightly.

In talking with my local field service office, it also came to light that a somewhat questionable batch of SDI cables had been received by SI when the 83C drives were first shipping. The result of this is excessive errors being reported by the HSC's. There is a test that the field service representative can perform to verify the integrity of the cables should they get by an internal check before shipping.

The last item of the installation was to become knowledgeable of the terms of the field service contract that is bundled into the purchase. The call window is 9:00 AM to 5:00 PM, Monday to Friday. The window can be expanded at additional cost to whatever is required. The time lag from placing a call for service to having someone here working is to be less than four (4) hours. I am told that these terms can be different according to your location and the distance to the nearest SI field service office.

It needs to be noted here that I will not have DEC or other vendors perform the maintenance on these drives for a long time. My reason is that the controller is brand new to the market and still receiving significant fixes and upgrades. I feel the ONLY way to get these fixes in a timely manner is to have the manufacturer perform the maintenance. It also helps when you concern yourself with the sparing issue at the local office.

With that done, everyone cleared out and I put the drives to immediate use by running a program I wrote some time ago to heavily load a disk with I/O. The program is designed such that a 400,000 block file is created and random length records, 1 to 32,000 bytes, are written to and then read from the file on a purely random basis. After each write is complete, a read of the same information at the same location on disk is performed and the information is compared with what was written to ensure that there were no errors in writing to the disk. This process continues until the program is aborted or more than 5,000,000 write operations to the disk are completed. Status report lines are generated every so often to indicate how the test is progressing. Should an error occur, all pertinent information is printed out and the operation is retried up to ten (10) times before moving on. The result is a disk that physically feels like it's attempting to rip its heart out.

Two editions, for a total of 800,000 blocks of usage, were run on two different unloaded CPUs, one an 8700, the other an 8500. The SPM package from DEC accumulated performance statistics during this time period and the following results give the comparison between various types of disks. The 'response time' stat has been defined to me as the time it takes an I/O request to travel from the VAX bulkhead CI connection, through the HSC, out to the drive, have it performed, and back to the bulkhead. This therefore includes any time needed by the HSC to set up the request for the disk.
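
The write, read back and compare loop described above can be sketched in a few lines of Python. This is not the original program, whose source and language are not given here; it is a minimal illustration with hypothetical names, run against an ordinary file, and with sizes scaled far down from the 400,000 block file and 5,000,000 writes. Note also that on a modern operating system the read back may be satisfied from cache rather than from the drive itself.

```python
import os
import random
import tempfile

BLOCK_SIZE = 512    # bytes per disk block
MAX_RETRIES = 10    # a failed operation is retried up to ten times


def exercise(path, file_blocks, max_writes, max_record=32_000,
             report_every=1_000, seed=None):
    """Write random records at random offsets, read each back, and compare.

    Returns the number of operations that still failed after MAX_RETRIES.
    """
    rng = random.Random(seed)
    file_bytes = file_blocks * BLOCK_SIZE
    failures = 0

    with open(path, "w+b") as f:
        f.truncate(file_bytes)                  # create the test file at full size
        for n in range(1, max_writes + 1):
            length = rng.randint(1, min(max_record, file_bytes))
            offset = rng.randrange(0, file_bytes - length + 1)
            data = os.urandom(length)           # random record contents

            for attempt in range(1, MAX_RETRIES + 1):
                f.seek(offset)
                f.write(data)
                f.flush()
                os.fsync(f.fileno())            # push the write toward the device
                f.seek(offset)
                if f.read(length) == data:
                    break
                print(f"mismatch: write {n}, offset {offset}, "
                      f"length {length}, attempt {attempt}")
            else:
                failures += 1                   # gave up after MAX_RETRIES

            if n % report_every == 0:
                print(f"{n} writes done, {failures} unrecovered errors")
    return failures


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Scaled-down run; raise file_blocks and max_writes for a real soak test.
        exercise(os.path.join(d, "exercise.dat"),
                 file_blocks=4_000, max_writes=200, report_every=50)
```

Because every write is followed immediately by a read of the same location, the load is at least half writes, which matters for the shadow set results.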

The tests on the SI93C drives were done on a VAX 8350, with the attached processor enabled.

+---------- Shadow Disk Statistics ---------+
!                     Serv    Resp          !
!             Rate    Time    Time   Queue  !
!             (/s)    (ms)    (ms)  Length  !
!           ------  ------  ------  ------  !
! RA$81       14.9      67     131     2.0  !
! SI$83C      16.6      60     117     1.9  !
! SI$93C      17.5      56     101     1.8  !
+-------------------------------------------+

+-------- Separate Disk Statistics ---------+
!                     Serv    Resp          !
!             Rate    Time    Time   Queue  !
!             (/s)    (ms)    (ms)  Length  !
!           ------  ------  ------  ------  !
! RA$81       16.8      60     117     2.0  !
! SI$83C      19.2      52     100     1.9  !
! SI$93C      20.9      48      88     1.8  !
+-------------------------------------------+

Table B   Comparison Of Disk Performance Using Test Program

For both the tables shown above, it needs to be noted that there was always an I/O request that was waiting to execute on the drive and therefore no 'idle' time as far as the drive was concerned. This is shown in the 'Queue Length' value for each test.

The other item to note here is that the 'Response Time' values for the shadow disks are larger than the values for the separate disks. The basis for this, I feel, is that while the shadow disks were on different HSC requestor cards, the test program performs a read after EVERY write. The result is that both members of the shadow set have to complete the previous write operation and then the HSC does a comparison between them to see which could provide the information faster on the following read. The increase that is shown, therefore, is the time to completely update both shadow members and do the comparison. I also feel that these numbers represent the worst case scenario that you should encounter, except if both shadow members are on the same requestor. The point to remember with shadow sets is that the 'win' situation is with I/O that is largely read requests instead of write requests, and the test program is exactly opposite this.
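
As a cross-check on Table B, the figures can be tested against two standard queueing identities: utilization (rate times service time) and Little's law (queue length equals rate times response time). The short Python sketch below applies both to the numbers copied from Table B. The names are illustrative and the check is not part of the SPM output.

```python
# (rate per second, service time ms, response time ms, reported queue length)
TABLE_B = {
    "shadow": {
        "RA81":  (14.9, 67, 131, 2.0),
        "SI83C": (16.6, 60, 117, 1.9),
        "SI93C": (17.5, 56, 101, 1.8),
    },
    "separate": {
        "RA81":  (16.8, 60, 117, 2.0),
        "SI83C": (19.2, 52, 100, 1.9),
        "SI93C": (20.9, 48, 88, 1.8),
    },
}


def check(rate, serv_ms, resp_ms):
    """Return (utilization, queue length predicted by Little's law)."""
    utilization = rate * serv_ms / 1000.0   # fraction of time the drive is busy
    queue = rate * resp_ms / 1000.0         # average requests in the system
    return utilization, queue


if __name__ == "__main__":
    for mode, drives in TABLE_B.items():
        for drive, (rate, serv, resp, reported) in drives.items():
            util, queue = check(rate, serv, resp)
            print(f"{mode:8} {drive:6} utilization={util:.2f} "
                  f"queue={queue:.2f} (reported {reported})")
```

Every row comes out at a utilization of about 1.0 and a predicted queue length within 0.05 of the reported value, which agrees with the observation that the drives never went idle during the test.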

The problems, other than the ones listed above, that I have had to date are as follows, with any updates that I know of at the time of this writing.

Error messages on the wrong HSC channel. This problem existed if you had the drives dual ported between two HSC's and the drive was 'selected' by one of them. If the drive reported any errors, both HSC's received the message packet. The problem existed for the HSC which did not have the drive selected, and for which the drive was therefore not in its list of 'known drives'. If the errors happened frequently enough, the opposing HSC invoked ILEXER to find the problem, it could not find the drive, and this continued until the HSC would declare the drive inoperable or crash. This has since been corrected and new firmware is being distributed to existing sites and installed on new systems. I have not had the firmware in long enough to determine for myself if it is truly fixed. Concurrent with the firmware update, there may be a need to have the disk(s) reformatted to provide more space for diagnostic testing on the disk. SI has just announced and provided to the field offices a box that will enable them to format and exercise a FUJI/NEC drive fully, without any HSC or host support needed.

The third screw back on the rack slides which hold the SI controller in the cabinet can push the metal standoff inwards and the result is a C-Mod card that is bent. This was the case on the SI83C drives and was remedied by removing the screw completely. The problem is a screw that is about 1/16th of an inch too long and the metal standoff between C-Mod cards being perfectly located to coincide with this screw.

The SI controller box has only one power switch in its current form. The implication here is that should one C-Mod card fail in the controller cabinet, the entire cabinet has to be powered down to fix the problem, which would also cause up to four drives to be unavailable for use during the outage.
The space inside the cabinet is laid out in such a way that any work done to one C-Mod card would require that all drives with controllers in the cabinet be turned over to field service.

The attachment of the SDI and drive cables to the back of the controller cabinet is done in a very compact way, which can cause headaches for the field service representative. Similar cables for two drives are directly over one another and the cable hold downs have screws on the top. The result is cramped space between the top and bottom cables and the back of the cabinet when it is pulled out on the slide racks.

The cable from the SI83C drives to the SI controller cabinet is approximately three (3) inches wide and is installed in such a manner that it covers over an air vent in the power supply that is four (4) inches wide. The result is an air vent with only one (1) inch of effective air flow on a power supply.

The failure rate to date has been one C-Mod card, one power supply, one FUJI drive and two NEC drives. The C-Mod card and power supply failures I would attribute to the power failures we have had recently (four in five months), and they are minor compared to the five HSC requestor cards, two HSC CPU cards, three HSC CI Link cards and various DEC HDA and VAX 8XXX problems I have had in the same period. The FUJI HDA failure happened immediately after delivery and the initial power up, and I consider it the result of shipping damage. The two NEC HDA replacements I attribute to power failures also. These are located in another computer room I have in Wilmington, Delaware. On a GOOD week we ONLY have one power failure, enough said.

There appears to be a 'latching' problem between the 'C-MOD' card in the SI controller and the FUJI/NEC drives. The problem comes up when you power down a drive and power it up, or power a controller cabinet down then up.
The write protect signal and drive ready signals can be latched in the wrong state and require several more power down/up cycles to clear them.

Overall I have to say that I am very pleased with the disk drives, their performance and the support of my local field service office. SI appears to have a product on the market that provides an alternative to the equipment that is offered by DEC, and the pricing is very attractive.

Of all the issues that I have talked about here, the only one that could change due to your getting the SI disk drives is the support of the local field service office. I view this as an important issue and one that needs constant monitoring and changes. I feel it is the user's responsibility to watch the equipment on a day to day basis and to notify the local office of any problems. It is also the user's responsibility to press any issues that arise over questionable or incomplete support that you may be receiving. If the field service office is not supporting you to the extent that you feel is needed, take it up with your salesman, and let them work the issue for you. If they cannot resolve the issue, call the west coast, but only after all other avenues have been tested.

NOTE*** All comments, statements and facts here are my own, and not that of my employers, National Teachers Life Insurance, Teachers Service Organization (TSO) or any of their subsidiaries. All rights to this article are reserved. This article is not meant to be a 'Sales' pitch of the product. I have no connections with SI, short of HEAVILY using their equipment. Any electronic reprint of this article MUST completely contain this NOTE. NO PERMISSION IS GIVEN TO REPRINTING THIS ARTICLE OR ANY PARTS OF IT ON PAPER, OR SIMILAR SUBSTANCES.

Paul D. Clayton
Manager Of Systems
TSO Financial Corp.
Horsham, Pa. USA 19044
Address - CLAYTON%XRT@CIS.UPENN.EDU