
Performance Testing: A Reader Sounds Off

by Administrator on January 13, 2006

My recent article on performance testing in Byte & Switch prompted an email exchange with a very knowledgeable reader. I will play it back here in sequence.

Jon,

Having been involved with storage since before SCSI was called SCSI, and having worked directly for two of the four major FC ASIC vendors and indirectly with a third, I can relate to your concerns.

The bottom line is that there is no apples-to-apples comparison, because of the ASICs used in the HBA and on the RAID controller. They all interface with the OS differently and treat target devices differently.

The only way to get a fair benchmark is to introduce a single variable, whether that is an HBA, a storage device, or an OS driver stack. Even then it is not fair unless an extreme effort is made to fully understand the pros and cons of each subsystem under test.

The net result is a lot of work that will only provide meaningful data to a narrow audience. And the more widely the test results are distributed, the more reviewers will pick at them and find holes in the testing.

Even SPC and SPEC results can be doctored to give better numbers than the customer will ever see. For example, when I worked for HP, there was an effort under way to benchmark a new multiprocessor platform. The benchmarking expert couldn’t get the numbers he needed and blamed the disk subsystem. I got called in to help out. After listening to the ranting and raving, I was able to settle him down and explain how disk drives worked and how to take advantage of ZBR (zone bit recording), which was new to drives at the time.

By forcing the OS to install on an area of the disk toward the inner cylinders, we left the fastest areas of the disk available for user data, i.e., the benchmark programs. Ultimately this provided the additional bandwidth the ‘expert’ needed, and he published his numbers, record-setting to boot!
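The ZBR effect is easy to make visible with a quick sequential-read sweep across a drive: throughput at the outer zones (the low LBAs on most drives) is noticeably higher than at the inner zones. The sketch below is my own illustration, not part of the original exchange; the device path is hypothetical, it needs read access to the raw device, and the page cache should be dropped (or direct I/O used) before trusting the numbers.

    # Illustrative sketch: sample sequential-read throughput at several offsets
    # across a disk to expose the ZBR (zone bit recording) effect.
    # Assumes a hypothetical raw block device (/dev/sdb) readable by the caller.
    import os
    import time

    DEVICE = "/dev/sdb"      # hypothetical device under test
    CHUNK = 1024 * 1024      # read 1 MiB at a time
    READS = 256              # sample 256 MiB per zone

    def throughput_at(fraction):
        """Read sequentially starting at `fraction` of the device; return MB/s."""
        fd = os.open(DEVICE, os.O_RDONLY)
        try:
            size = os.lseek(fd, 0, os.SEEK_END)
            os.lseek(fd, (int(size * fraction) // CHUNK) * CHUNK, os.SEEK_SET)
            moved = 0
            start = time.time()
            for _ in range(READS):
                data = os.read(fd, CHUNK)
                if not data:
                    break
                moved += len(data)
            return (moved / 1e6) / (time.time() - start)
        finally:
            os.close(fd)

    for frac in (0.0, 0.5, 0.95):   # outer, middle, inner zones on most drives
        print(f"offset {frac:.0%}: {throughput_at(frac):7.1f} MB/s")

On a typical drive of that era the outer-zone number can be half again as large as the inner-zone number, which is exactly the headroom the ‘expert’ reclaimed by pushing the OS inward.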

The bottom line was that no typical user would have had the knowledge to do what he did to get that performance, so the numbers could never be easily duplicated by anyone other than a highly knowledgeable person familiar with the technology and intimately familiar with the OS.

Believe it or not, the same scenario exists to this day. It amazes me how little the people who claim to know really do know about the products they are evangelizing.

Cheers,

[Name Withheld Pending Writer Approval]

I responded, as I always do, to the reader’s comments:

Jim,

I have been giving a great deal of thought to your response, but I am still a bit confused. If my workload is standardized, the network and servers I am using do not change, and I am only changing an HBA (for a competitive analysis of HBAs), a switch (for a competitive analysis of switches), or a storage array (for a competitive analysis of array performance), aren’t I “black boxing” the changed component, and shouldn’t I be able to obtain a fair apples-to-apples comparison? Let’s say I am only interested in how much work gets done at the end of the day, and I follow the standard implementation steps from vendor documentation in setting up and configuring my hardware (or I let the vendors implement it). Why not document the work accomplished and call that a measurement?

Too many shortcuts?

Jon

To which he responded…

Jon,

I wrote a long response, which I am including below, but I wanted to address your questions up front rather than obliquely.

Standardized workload: there is no such thing. The term implies that the workload will not change regardless of any external modifications, but it is very likely that it will change, negatively or positively; knowing why is the hard part. It is absolutely acceptable to take a given test scenario, baseline it, swap out a component, retest, compare the results, and publish them. I will not argue that point. My argument is about knowing why the change occurred and which feature of the new device contributed to the change.

When a report provides that type of information then there is value.
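To make that procedure concrete, here is a minimal sketch of the baseline/swap/retest bookkeeping. It is my own illustration, not something from Jim’s email: run_workload() is a hypothetical stand-in (it just writes and syncs a scratch file) for whatever standardized exerciser would actually be used, and the vendor labels are placeholders.

    # Minimal sketch of the baseline/swap/retest comparison described above.
    # run_workload() is a stand-in for the real standardized workload
    # (Iometer or similar); everything else is bookkeeping around it.
    import os
    import statistics
    import time

    def run_workload(path="scratch.bin", megabytes=256):
        """Stand-in workload: write and fsync a scratch file, return MB moved."""
        buf = b"\0" * (1024 * 1024)
        with open(path, "wb") as f:
            for _ in range(megabytes):
                f.write(buf)
            f.flush()
            os.fsync(f.fileno())
        return megabytes

    def measure(label, runs=5):
        """Run the workload several times; report and return mean MB/s."""
        rates = []
        for _ in range(runs):
            start = time.time()
            moved = run_workload()
            rates.append(moved / (time.time() - start))
        mean = statistics.mean(rates)
        print(f"{label}: {mean:.1f} MB/s (stdev {statistics.pstdev(rates):.1f})")
        return mean

    baseline = measure("baseline (HBA vendor A)")
    # ...swap the single component under test here, changing nothing else...
    candidate = measure("candidate (HBA vendor B)")
    print(f"delta: {100 * (candidate - baseline) / baseline:+.1f}%")

The delta is easy to report; as Jim says, the hard and valuable part is explaining which feature of the swapped component produced it.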

*** Long response ***

The assumption is that the two black boxes you are comparing are equal in all things. This is not always the case.

For example, buffer-to-buffer credits are the flow-control mechanism that FC implements to keep frames moving and to avoid congestion and dips in performance. I think that is a pretty basic statement, and just about everyone will agree on it.

The downside is assuming that BB credits are the same across all switches and all HBAs. In reality there is a huge difference from vendor to vendor and product to product. Some HBA vendors support 100+ 2KB BB credits on some products but only 16 on others. In contrast, other HBA vendors support fewer than 8 on the high end and only 2 on the low end.

The same goes for switches: depending on the product, the BB credits could range from fewer than 10 to 64 or even higher.

To take this one step further, if an HBA supports only 2 credits and, during any given exchange, the HBA and switch consume those 2 credits, no further transmission can take place until the credits are freed. Freeing them requires a DMA into memory or some other acknowledgement of the receipt of the data.
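To put rough numbers on that, the sketch below models the ceiling that credit-based flow control imposes: a transmitter can have at most N frames outstanding, so it can never move more than N frames per round trip, regardless of wire speed. This is my own back-of-the-envelope illustration, not Jim’s; the link rate, frame size, and per-kilometer delay are assumed round figures, and the model ignores the receiver’s processing time (the DMA and acknowledgement just mentioned), which only lowers the real ceiling further.

    # Back-of-the-envelope sketch of the credit-limited throughput ceiling.
    # Assumed round figures: ~2 Gb/s FC payload rate, ~2 KB frames, ~10 us of
    # round-trip time per km of fiber. Receiver processing delay is ignored.
    LINK_MB_S = 200.0       # raw payload rate of the link, MB/s
    FRAME_BYTES = 2048      # data field per frame
    RTT_US_PER_KM = 10.0    # round-trip light time per km of fiber

    def max_throughput_mb_s(credits, km):
        """Ceiling in MB/s: the lesser of wire speed and credits per round trip."""
        rtt_s = km * RTT_US_PER_KM * 1e-6
        credit_limit = credits * FRAME_BYTES / rtt_s / 1e6
        return min(LINK_MB_S, credit_limit)

    for credits in (2, 8, 16, 64):
        for km in (1, 10, 50):
            print(f"{credits:2d} credits over {km:2d} km: "
                  f"{max_throughput_mb_s(credits, km):6.1f} MB/s")

Under these assumptions, a 2-credit device is already starved at 10 km of fiber, while a 64-credit device stays wire-limited out to roughly 50 km, which is why the per-product credit counts matter so much.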

So unless you are intimately familiar with the capabilities of all the components, it is extremely difficult to benchmark anything.

The rule of thumb I use is that it is better for the HBA to have more BB credits than the switch, and the same goes for the target. You want the two endpoints to be capable of more bandwidth than the infrastructure components. This allows for multi-threaded capabilities, leaving credits available for other exchanges while the endpoint in question processes the data it just sent or received. Does this make sense?

So for benchmarking purposes, depending on the device being benchmarked, you do not want an under-subscribed topology.

If I am testing switch performance, I want the endpoints waiting on the switch. If I am testing target performance, I want the initiators waiting on the targets. If I am testing initiator performance, I want the targets waiting on the initiators. And so on.

Knowing intimately how each component operates allows you to create a topology that ensures the device of interest is pushed to its limits, and not to the limits of the other devices in the configuration.

To throw even more variability into the equation, HBA vendor X’s solution is architected to operate best with multiple target devices and only a few LUNs on each device. HBA vendor Y’s solution is just the opposite: a minimum of targets and many LUNs.

Even device drivers add variability. Sure, they may adhere to Microsoft’s minimum miniport spec, or the StorPort spec for that matter, but what has not been implemented, and what has been added outside the spec?

I really do not have a great answer for you. I guess all I am trying to do is illustrate why it is so hard to get a true black-box comparison.

I’d love to work with you to put something together if you are interested.

To which I responded…

Jim,

Somebody said once that it was a singular characteristic of the modern era that we are less interested in why things work the way they do than we are in getting enough information to make them work the way we want them to. We are less interested in the nature of electrons than we are in forcing them to move through a copper wire to light a light bulb.

While I admire and appreciate your objections, I still find myself questioning their efficacy. At the end of the day, my clients just want to know which solution gives them the most bang for their buck. They create an RFI, invite vendors to recommend and even install gear in their environment, then check to see which solution processes the workload the fastest, or gets the most work done in the same timeframe.

These are not scientific tests to be sure, but they are what we have.

While every objection you raise makes enormous technical sense, and while I find it especially applicable in large corporate environments with complex infrastructure (where there are significant practical barriers to deploying multiple vendor solutions for testing at all), there has got to be some way to do a much simpler and more straightforward apples-to-apples comparison that at least yields rule-of-thumb guidance for consumers.

Let me know your thoughts.

Jon

To which he responded…

Jon,

I have to agree with you: clients are not interested in the minutiae; all they want is a solution to their problem. That is why we get paid the big bucks, to protect our clients from having to deal with it.

I am thinking that we should write a ‘Dummies Guide to SANs’ and see if it goes anywhere.

My arguments up to this point have been focused on why it is hard to generate an objective benchmark. What I have failed to do is point out that it does not have to be difficult at all.

When I start analyzing a problem, I use a systematic approach to determine what is possible based on the ‘system’ as deployed. Then I look at what is probable on that same ‘system’. Lastly, I make recommendations, based on what I know about the ‘system’, to get the most from it.

These same steps, applied at the front end of the process, during the building of the ‘system’, yield the same results earlier in the process.

Ultimately, any problems with the ‘system’ at that point are tough, non-configuration-related issues that are difficult to identify and resolve. Again, that is why we get paid the big bucks.

I’ll put my thoughts down for you in a future email. I’ve enjoyed this discussion, as it has shed a certain light on the way that I, perhaps, communicate with prospective clients. In the future I will raise my talking points to a higher level and digress into the bits and bytes only if requested.

Cheers,

Jim

Prompting my response…

Jim,

I appreciate the response and I think that everything you have said is extraordinarily valuable and on point.

From a practical perspective, none of this may mean very much. In the Fortune 500, big storage deals tend to be made in the front office, not the IT department. No one seems to be interested in testing, and in many companies test labs have been eliminated altogether.

I want the old IT sensibility back. I want us to test before we buy and to purpose-build infrastructure based on what apps require, not based on what vendors want to sell us.

I also want to be a general pain in the ass to the vendor community, holding up their performance claims to scrutiny at every turn.

Marketecture has been preferred to architecture for much too long now. That must change if IT is ever going to matter again.

Jon

To which he responded…

Jon,

I love it. Marketecture! I was looking for a term to slander the marketing hype. I like it and plan to use it, with your permission of course.

I blame the economic downturn of the 80s for this trend. I remember when we had ‘experts’ in technology areas whom we could go to for advice and assistance. Those ‘experts’ got right-sized right out of their companies because their supervisors could not justify their positions.

When that happened, we moved into the ‘do more with less’ mantra in parallel with the ‘we’ll push all that onto the vendor’ mantra.

Having been employed on both sides of that argument, I can tell you that everyone lost. There was a massive knowledge drain, and frankly I do not think it has ever recovered. Instead of stepping up, companies are stepping back.

Because of my background I am a pain in the ass to my vendors. They may not appreciate it but my clients do and they pay the bills.

I have to get some work done. I will give you my thoughts later on; who knows, perhaps we will even get to work on something together sometime.

Cheers,

Jim

Jim, if you are reading this, you have my full permission to use the expression “marketecture,” without any registered trademark or other legal citation. (I reserve that only for “Information Feng Shui Management.”) And I would be delighted to work with you anytime, anywhere.
