September 25th, 2022
If you enjoy this article, see the other most popular articles
How to handle an engineer who lies to you about bugs
(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: firstname.lastname@example.org, or follow me on Twitter.
(This is an excerpt from my book.)
In 2011 I was working at a travel site that gathered up travel deals from the major air, cruise and hotel companies and then promoted them on the site. The whole tech team was just six engineers, and Sonia was our project manager and also our entire QA team.
One week, after we pushed out some new code, Sonia tested the web site – informally clicking around, looking for any mistakes.
Sonia: I don’t think I’m seeing all of the travel deals that I should be seeing.
Me: What is missing?
Sonia: I need to research this.
Everything looked okay to me, but I didn’t know all of the conditions that might impinge on our broadcast of travel deals. Sometimes we’d get new deals from Delta Airlines or Viking Cruise Ships, but sometimes we were supposed to hold those deals, and only unleash them on a particular date. Should they be visible on the dates in the future that Sonia was testing in the search tool? She wasn’t sure so she went over to the Travel Deals Team and consulted with them.
A while later she came back over to where the tech team sat, and now she was convinced there was a problem with our search engine.
Sonia: See, if I search for Bermuda I see certain deals, and if I search for Saint Vincent I see certain deals, and if I add in a specific date range I can still see all of those deals, but if I instead search for “Caribbean” then three deals disappear.
Me: We include Bermuda in the category of Caribbean?
Sonia: Yes, we do.
Me: Okay, well, we should create a ticket and someone should figure out what’s wrong.
Sonia: I’ll create the ticket. How many hours do you estimate this will take?
Me: This has to be an exploration ticket, because the problem could be in many different areas of the code. It could be a problem in the database, or in our search tool Solr, or it could be in the actual frontend website code. I mean, who knows? Someone needs to explore.
Sonia: Okay, how much time will this exploration take?
Me: Let’s say four hours.
For a problem like this, where the bug could be almost anywhere in our system, we would create exploration tickets, where the goal was simply to do research and discover what the real problem was. Then, once we knew what the problem was, we could create another ticket, to cover the work needed to fix the real problem.
Sonia created the ticket, put down an estimate of four hours, and then the following week, when the next work-sprint started (our work was organized into two-week sprints), the ticket was picked up by my co-worker Jerry. About 90% of all the search code had been written by Jerry, so if anyone was going to find out the cause of the problem, it would probably be him. Certainly, he knew the search code better than anyone else.
The next day I knew this ticket would be Jerry’s first priority, and very quickly I saw that he’d marked the ticket as “done” with a note that said “There was no actual bug.” Jerry lived in another city, and was working from home, so I reached out to him and we set up a video conversation.
Me: So, you had a chance to look at the problem that Sonia found? The travel deals that aren’t being listed when people do certain searches?
Jerry: Yeah, no problem, she was just searching for the wrong things.
Me: What does that mean?
Jerry: She was combining too many search terms. She confused herself.
Me: Well, is our search engine supposed to support all of the search terms that she was using?
Jerry: Sure, but if you combine all of it, then certain deals won’t show up.
Me: Are you sure? Sonia is very careful. She doesn’t usually say some software quirk is a bug unless it really is a bug.
Jerry: The code is very complex and has to take into account dozens of special circumstances. Sonia just didn’t get it. The deals that didn’t show up were not supposed to show up.
Me: Sonia herself has, over the years, defined most of the special circumstances that you reference. She defined them and then you turned them into code. It doesn’t seem likely that she would not be aware of the special circumstances.
Jerry: Sure, but no one can keep track of all the different ways those special circumstances can combine. Fifty special circumstances can have tens of thousands of combinations. In this case the combinations were just too complicated for her. I checked the Solr configuration and everything is correct.
Me: Okay, thanks for doing that. I appreciate it.
I told Sonia the good news, that there was no actual bug, but she was suspicious of that answer. She spent the next hour running various searches, and taking copious notes, and then consulting with the Travel Deals Team to be sure that some deals really were not showing up. Then she asked me to review what she’d found, which I did, and I acknowledged there did seem to be a real problem.
After all that, she was ready to phrase things in a new way, which might overcome Jerry’s objections: even if this problem wasn’t exactly a software bug, strictly defined, there were deals that the Travel Deals Team wanted to have appear on our website, in conjunction with certain search terms, and yet these deals were not appearing. If this was because of an unusual combination of special circumstances, then some of those special circumstances were in need of modification.
At this point three things happened:
- Sonia created a new ticket.
- Jerry almost instantly marked it as “done,” again with the comment that there was no actual problem.
- I got angry.
The situation had morphed from a technical problem to an attitude problem. I didn’t like Jerry’s tone, so I asked for another one-on-one chat with him over video.
Me: Sonia has devoted considerable time to documenting a problem with the travel search.
Jerry: Again, I’ve explained this to you, there is no actual problem. It’s the way she’s combining special circumstances, in ways that limit which deals can appear on the site.
Me: That doesn’t matter. There are deals that the Travel Deals Team wants to have on the site, and those deals are not appearing. So whatever code you wrote to handle whatever previous special circumstances might have been important in the past, that code is now going to change.
Jerry: But I already told you…
Me: Doesn’t matter! We are trying to sell some deals! The Travel Deals Team wants these deals to appear on the site! We are going to change the code to be sure these deals show up!
Jerry (with a heavy sigh): Okay, okay, okay, whatever, okay? Whatever. This is stupid, but I’ll look into it.
Me: No, don’t bother. I’m going to handle this ticket myself.
By this point I’d lost faith in Jerry. I wasn’t sure he was working in good faith, and worse, I couldn’t trust what he was saying to me. Maybe he was feeling lazy or maybe he was trying to defend the code he’d written in the past. Either way, something kept him from fixing this problem the first time he looked at it, and now I wanted a better understanding of what was going wrong.
As it turned out, the problem was subtle and very deep. The next day, I spent several hours tracking it down. In the end, the problem was in our configuration for Solr, our search tool. The configuration was wrong. Working closely with Sonia, I fixed the problem, and then we ran some tests in a test sandbox environment, to convince ourselves that my update to the code would really fix the problem. The next day we pushed out the new code and thankfully the problem was fixed.
That ended the software problem, but it didn’t end the attitude issue. To have healthy team dynamics we had to be able to trust each other, and I wanted to make that clear to Jerry. We had another one-on-one video chat (I would have preferred to have this meeting in person, but Jerry lived in another city).
Me: I fixed this issue with the search tool and those travel deals that were not appearing on the site.
Jerry: What was the issue?
Me: There was a mistake in the Solr configuration that affected how data was copied over from our main database.
Jerry: Well, okay, good job. I’m glad you figured that out.
Me: I want to ask you something. When you originally took the ticket, I noticed that you were fairly quick to mark it “done.” Do you remember how long you spent on it?
Jerry: I looked at the Solr configuration, but everything looked correct to me.
Me: Right, sure, but how much time did that take you?
Jerry: About 15 minutes.
Me: The ticket was estimated to take four hours.
Jerry: Why would I waste four hours on a ticket if I could do it in 15 minutes?
Me: But you didn’t really do it, did you?
Jerry: I mean, come on, I did what I could to see if there was a problem.
Me: Did you read any of Sonia’s documentation? In 15 minutes?
Me: Here is the thing. Sonia is the best project manager that I’ve ever worked with. She is very careful. She documents everything. If she says there is a problem, then there is almost certainly a problem.
Jerry: Yeah, yeah, yeah, I see where you are going with this, but no one is perfect. I’m sure she makes mistakes sometimes.
Me: She didn’t make a mistake this time.
Jerry: Okay, this time I made a mistake, what do you want?
Me: I want you to take bug reports seriously, when they come from a trusted source.
Jerry: Okay, whatever, from now on I’ll take all of Sonia’s bug reports seriously.
Me: That means reading all of her documentation.
Jerry: Right, right, sure.
Me: I feel like you’re not really listening to me.
Jerry: I am listening to you! For god’s sake, what do you want? I’m not doing anything else right now. I’m sitting here listening to you.
Me: I need you to understand this. You will not ever do anything like this again.
Jerry: Yeah, yeah, I already said that, I’ll check out Sonia’s bug reports, yeah.
Me: I mean about me. Don’t do it again.
Jerry: Do what? What are you talking about?
Me: Don’t give me your assurance that you’ve examined a piece of code and you are certain there is no bug, when in fact you did not examine the code, you did not examine the bug report, you didn’t even understand what the bug report actually said. Don’t mark a ticket as “done” when you haven’t even started on it.
Jerry: I didn’t want to waste time on a non-issue.
Me: You ended up wasting Sonia’s time and also my time. And you kept travel deals off the site for several days when we could have been making money off them.
Jerry (heavy sigh): Yes, okay, that was bad on my part. I get that. I apologize. Okay? I apologize.
Me: I appreciate that, but the most important thing is that, in the future, I can trust you when you say “There is no bug.”
Jerry: Yes, okay, I get that. I need to be more careful.
Me: If you pull a ticket that authorizes you to do four hours of exploration, then use some of that time and do an honest exploration.
Jerry: Yes, okay, I promise. I made a mistake, I will handle it better next time.
Me: Okay, great. Thank you for talking to me about this.
And thankfully, he really did get it, and we never needed to have a conversation like that again.
It is possible to talk to people directly, honestly, firmly, and respectfully, to communicate how you expect them to work with you. Please note what I didn’t do: I didn’t use swear words or raise my voice, nor did I say anything personal about Jerry’s character or work ethic, other than how it applied in this specific case. You might think I’ve cleaned up this dialogue for the book, but no. I do not ever use curse words in a business context, and I recommend this, since any use of curse words carries the risk that people will feel you are being disrespectful.
Some managers are afraid of this kind of direct, honest conversation. They fall back to a style of communication that is much more passive aggressive. I’ve seen cases where, after an incident like this, the manager will send an email to the whole team, and without any reference to the original incident that is motivating the email, they will write “If you get a bug report, please investigate it thoroughly.” Most of the time, such communication is a mistake. Every worker is different, and they will make different kinds of mistakes, and your feedback to them needs to be specific to them. Maybe you have one worker who dismisses bug reports, and another worker who comes in late to work, and another worker who leaves food on their desk overnight, and another worker who never tests their own code. Does that mean you should send four different emails, advising the whole team to take bug reports seriously, come in on time, not leave food out overnight, and test their own work? No, the number of possible mistakes is infinite, so you’d have to send an infinite number of emails. Worse, people tend to ignore such emails – if the workers know that you’re afraid to confront them, then they know they can continue in their bad habits with few consequences. Instead, you need to have one-on-one conversations with each worker, and you need to give them feedback that is specific to them.
Some managers feel that direct, honest communication can sometimes feel a bit aggressive, but that shouldn’t matter. As a manager you have an obligation to protect the team, and that means you need to get each person on the team to do their work and to report on that work honestly. You can engage in direct, honest communication while also being respectful. You don’t need to swear. You don’t need to raise your voice. You might be angry but you don’t need to express that. Keep the focus on what matters: the long-term health of the team depends on everyone doing their job, and being honest about what they’ve done.
Context matters. My conversation with Jerry was respectful because it was a private, one-on-one conversation. If I had addressed him like that in a group setting, then he’d have more reason to feel attacked, and therefore he’d have more reason to get defensive. In that case there would be less chance of him actually listening to me. All such feedback should be given privately, in one-on-one conversations. If you want someone to change their behavior, this is always the best form of conversation.
(This essay is an excerpt from my book One-On-One Meetings are Underrated; Group Meetings Waste Time.)