Web services performance testing: a pilgrim’s progress, part 2

Here is where I vent some frustration. I’m self-taught on all the tools I use and I am the only person in my shop who knows how to use them in this manner. What’s more, I am the only Java/Groovy person right now at my shop. Being an autodidact can make one feel nice and self-sufficient, but the (large) holes in my knowledge are starting to cause me trouble. Why does the tool I use appear to do something very different from what I expect after reading the documentation? How do I instrument the tool to put load on a Web site through the UI, or should I be using a different tool?

Fortunately there’s been a shift in leadership at my workplace so that if I need to make the case for training, I’m more likely to be heard. So I’ll probably be doing that shortly. That being said, if the training doesn’t exist, I’m not sure what I will do. Is there training for JMeter or do developers just pick it up on their own? And if you don’t work in a Java shop, what do you do? Should I start looking into LoadRunner? I suppose I could put this question out to sqa.stackexchange or Software Testing Club and see what I come back with.

The testing community needs good TESTING tools – which are not necessarily automation tools – and reliable documentation, training, and support on those tools. There is DEFINITELY a market for this.

Also: I need the support of prod support/sysadmin types who, frankly, have better things to do than monitor a service in test for hours a day.  After several stress/performance tests have yielded conflicting results, my best bet would probably be to learn how to use the monitors they are using and go conduct some tests on my own time.

Finally: it’s an education to watch how quickly other people and I can fall into the confirmation bias trap. Example: I ran a low volume performance test against the production region on Friday, one I’ve run several times before apparently without incident. Sure enough, the system that is the most likely to have trouble with the load I’m creating started going haywire during the test. It continued to do so after I shut down the script, which makes me start wondering if my tool’s UI is telling me what I need to know. (A tool that monitors incoming requests against the system under load would be helpful here.)

Right away, people (including me) wanted to ascribe the problems with the system in production to the load against the system under test.  However, there’s really no proof to support that unless it happens repeatedly. And repeated production failures are not something I would wish on anyone. Sadly, our test system is not an exact replica of our production system – shared resources need to be allocated first to prod – so I’m not sure what performance testing on the test system will tell me.

So I have some new goals today:

  • Get a monitor that gives a different view of the system under test (incoming requests) and learn how to use that monitor.
  • Seek out some REAL training on the tools I use, or different ones. First I have to find the right place to ask those questions, though.
  • See if performance testing in the test region will tell us anything useful.

Ten books that have stayed with me

Apparently this meme is making the rounds again. I’m happy to bite.

  • Susan Cain, “Quiet: The Power of Introverts”
  • Charlotte Bronte, “Jane Eyre”/“Villette” (I’m cheating a little)
  • James Baldwin, “Go Tell It On The Mountain”
  • Isaac Bashevis Singer, “Enemies: A Love Story”
  • Cem Kaner, James Marcus Bach, and Bret Pettichord, “Lessons Learned in Software Testing”
  • Roxane Gay, “An Untamed State”
  • Gail Tsukiyama, “The Samurai’s Garden”
  • Mikhail Bulgakov, “The Master and Margarita”
  • Lois P. Frankel, “Nice Girls Don’t Get The Corner Office”
  • Katherine Dunn, “Geek Love”

A fine balance: introversion, leadership, and being a good team member

I recently finished Susan Cain’s excellent book Quiet: The Power of Introverts. For me, an introvert’s introvert, the book confirmed some things I already knew and gave me some new things to think about. It’s dispiriting that American business culture still gives so much credence to the fast talker: in fact, according to studies Cain writes about, big talkers are commonly perceived as being smarter than those who have less to say. (This appears to be a cultural limitation: Cain writes about how some Asian cultures still prize the quiet, studious, family-oriented person.)

I often sit back and observe software team meetings: on my current team, the lead developers appear to be more introverted than the business SMEs and analysts, and unless they are given an opportunity to speak they typically won’t do so. I am the lead tester on the team, and a technically-inclined woman as well, and I generally act contrary to my introverted nature in meetings. I’ll speak up unsolicited and offer my opinion if I think it’s warranted. I’ve learned the hard way that unless you speak up in such a meeting, you’re likely not to have an opportunity to be heard.

It’s even more interesting when you often have something to say because you see something that is off or wrong, or that could simply be improved. I found out recently that the itch for change is part of my nature too. My company recently offered an “influencing skills” workshop, in preparation for which we each took a DiSC personality assessment. I tend to be more than a little skeptical of personality assessment tests (confirmation bias ahoy). But it was fascinating to watch as we broke down into our DiSC groups and each group tended to behave in accordance with DiSC expectations: the S’s took twice as long to arrive at a decision as anyone else, while the D’s (that was my group) had no trouble making quick decisions. According to the DiSC approach, D’s tend to want to change things and do so quickly. (As I said to one of my fellow D’s, “D is for diva.”) I identified right down the line with the approach’s description of the D (again, note to myself, confirmation bias ahoy). So I appear to be a Dominant and also an introvert who is all too aware of the cultural bias against introverts. Try sitting still with that at a meeting.

Cain writes about Free Trait Theory as a possible explanation of seeming contradictions in our natures. Simply put, we contain multitudes as did Walt Whitman. You have introverts who are splendid and beloved lecturers, organizational leaders, and actors. But they can’t play those roles for too long: after a few hours in the public eye, you’ll see them running off to their private office whose door they close, and they stay there for quite a while, perhaps with a pet and a radio as their only company. 

What does all of this mean for being a technical test lead for a project? It means that I have to reconcile my innate tendency NOT to speak up against my contrary tendency to want to announce Things That Are Broken That Need Fixing. For reasons I won’t go into here, I believe I went a little too far down the Jeremiah lane this week; however, I did act ethically on the best information available to me at the time. And my actions led to a group initiative to do further exploration of the issue: getting more information on a subject can never hurt. But there’s a time and place for everything. It’s trial and error for anyone who is in a leadership role to know when to sit back, shut up, and listen, and when to speak up. (Mentoring helps. A lot.)

Web services performance testing: a pilgrim’s progress. Part 1.

When I embarked on a high-level test strategy for my latest project, I knew that I wanted to learn how to conduct meaningful performance tests. I had never done any kind of performance testing before, but I knew it was time (probably past time) to cross the Rubicon.

I am lead on the project and my own testing effort involves Web services, so I had to figure out:

  • Which aspects of performance I wanted to look at – meaning which questions I wanted to ask about performance
  • How to use the tools at my disposal – or learn how to use FOSS tools – to answer those questions
  • How to report results so that my project team could use the information I found

Now, if I wanted to learn how to write a Python program, I would have numerous online and print resources at my disposal. If I wanted to learn how to test Web service performance, I would have to look elsewhere. I became aware that I didn’t even know the correct questions to ask. 

I knew that some of my company’s online applications had had performance issues in the past, so I consulted with the people who had looked at those issues in depth.

I also looked at the testing tools that we had in-house that could measure performance: their documentation yielded some information on possible questions I could ask.

It seemed that one critical (and obvious) question would be: what is the response time for a request? Even that question poses several new ones, though. Among those subquestions are:

  • How many requests should I submit to get a decent sample size?
  • Should I space those requests out evenly over time? Or should I vary their frequency?
  • How do I make sure I get a realistic sample of the request data that could be submitted to the service?
  • Does response time vary predictably with any other parameter? How about the size in bytes of the response? What about other load on the system that the Web services under test share with other applications?
  • Should I use our production region to do performance testing? Or can I get away with using the test region? (It turned out that for various reasons our test region would not give us a realistic idea of performance. My tests, run in parallel in both regions, established this beyond a shadow of a doubt. So I had to bargain with our prod support folks to continue to run my tests in the prod region.)
  • Should I look at average or median response time? (I had to refresh my decades-old introductory statistics course knowledge with online resources to answer that question.)
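The average-versus-median question can be made concrete with a small sketch. The response times below are made-up numbers, not measurements from my tests; the point is only to show how one slow outlier drags the mean upward while barely touching the median.

```python
import statistics

# Hypothetical response times in milliseconds (illustrative only,
# not data from the tests described in this post).
response_times_ms = [120, 125, 130, 128, 122, 127, 2400]

# The mean is sensitive to the single 2400 ms outlier.
mean_ms = statistics.mean(response_times_ms)

# The median is the middle value of the sorted samples and ignores
# how extreme the outlier is.
median_ms = statistics.median(response_times_ms)

print(f"mean:   {mean_ms:.1f} ms")
print(f"median: {median_ms:.1f} ms")
```

With these sample values the mean comes out around 450 ms while the median is 127 ms. Which number is more useful probably depends on the question being asked: the median describes the typical request, while the mean (or the raw outliers) shows that something occasionally goes badly wrong.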

I also had to look at the performance requirement that my project team had stipulated. I learned early on from an old hand that without a specific performance requirement, your performance data will not be terribly useful to the team. Note that your information might well establish a realistic performance requirement where there is none.
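As a rough illustration of why a specific requirement matters, here is a minimal sketch that checks a set of response times against a requirement of the form “N percent of requests complete within X milliseconds.” The requirement format, the 500 ms limit, the sample data, and the helper names `percentile` and `meets_requirement` are all assumptions made for the example, not my team’s actual requirement or tooling.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def meets_requirement(samples, pct, limit_ms):
    """True if the pct-th percentile response time is within limit_ms."""
    return percentile(samples, pct) <= limit_ms

# Hypothetical response times in milliseconds (illustrative only).
samples_ms = [180, 210, 195, 250, 230, 205, 900, 220, 240, 215]

# Hypothetical requirements: 90% or 95% of requests within 500 ms.
print(meets_requirement(samples_ms, 90, 500))  # True: 90th percentile is 250 ms
print(meets_requirement(samples_ms, 95, 500))  # False: 95th percentile is 900 ms
```

The same data passes one requirement and fails the other, which is the old hand’s point: without an agreed target, the numbers alone don’t tell the team whether there is a problem.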

More detail to come.

Weekend Testing Europe session: some thoughts on test reporting and tool evaluation

Nice group with a lot to share at today’s Weekend Testing Europe session, which focused nominally on test reporting but for me was an even better exercise in context evaluation and articulating my thought process. I’m an introvert who proceeds quite a bit by gut feel and I probably fall prey to all the common cognitive biases and maybe some uncommon ones. So it’s important for me to Show My (cognitive) Work just like I used to do in physics class.

I will soon be reporting out on a large group testing effort and I want to do this in a way that does not require me to maintain a tight link between “testing progress” and test case execution. Michael Bolton has written quite eloquently on this issue recently and I can’t say it any better. (One of the participants in today’s session suggested XMind for reporting and I will be looking at this tool this week.)

So that was the context I had in mind for today’s WTE session. I’ve done some tool evaluations for my job and I have come to trust my ability to get a sense, pretty quickly, of a tool’s fitness for a given purpose:

  • Will the tool do the basics of what I have in mind out of the box?
  • If not, could the tool be reconfigured to do what I need it to do? Would I need a developer to help me with that? (The words “will take development time” tend to go over like a lead balloon in my experience. And I can code a bit, but I’m a tester first.)
  • If the tool looks like a contender, have I found any serious bugs in the first half hour of use? If so, that doesn’t bode well for my adopting it. 

If all of the above criteria pass, I would probably spend more time evaluating a trial version of the tool before I made a final recommendation.

The beauty of the WTEU session is that I had to explain all of that first to myself and then to the group. So THAT’S how I do that? Really? 

One last note: If I do not accept the tool for further study but, during my evaluation, I find a serious issue in the course of black box testing that I believe could cause harm of some sort to a user, I should report it to the company that produced the tool. Note that my main mission during the initial review, which should take 30-60 minutes at most, is NOT to find all the bugs I can in the application. It is to evaluate the tool’s fitness for my intended purpose. 

New on the block.

CAST 2014 has inspired me to write about my software testing life. I’ve been testing for almost nine years, which may mean I’m a lifer.

I hope I can focus on some areas for which online or printed guidance is scant: performance testing comes to mind. I use FOSS tools as much as possible and I’ll write about how I used them, but I want to avoid having this become yet another “testing tool talk” blog.

I think that’s enough of a manifesto for one night, especially on an Android keyboard. More later.