Code Confidence with Laurie Barth

Transcript

Scott McAllister: Welcome to Page It To The Limit, a podcast where we explore what it takes to run software in production successfully. We cover leading practices used in the software industry to improve both system reliability and the lives of the people supporting those systems. I’m your host, Scott McAllister, @STMcAllister on Twitter. Today, we’re going to talk about testing. Every time you deploy new code out into the world, there’s usually a sense of anxiety. We’re anxious about how users will receive the feature we’re deploying. And we’re always worried about that new feature and how it might break something and cause a disruption in the service. Having confidence in your code before going live to production is key to any deployment process. You can gain that confidence through testing. Today, we’re joined by Laurie Barth, staff software engineer at Gatsby. Laurie, welcome to the show.

Laurie Barth: Thanks. So nice to be here. And today I’ve just learned that you have a very good podcasting voice.

Scott McAllister: Well, I thank you very much. I appreciate that. I’ve been working on it. So, I listened to a lot of talk radio as a kid. So-

Laurie Barth: It shows.

Scott McAllister: Yeah.

Laurie Barth: Because the listeners can’t see your facial expressions as you do it.

Scott McAllister: Yeah, it’s probably better. So, to get us started, testing is a broad topic. Would you just level set for everybody? Describe testing to someone who may have never done it, or at least the aspect of testing we might be talking about today.

Laurie Barth: So, testing is literally anything you’re doing to validate the way your code runs and the feature or functionality that you are trying to implement. So, that could be anything from what people call monkey testing, which is, you spin up the UI and you type things into the form and you see if it worked. But it can also be automated testing. And I think that’s a lot of what we’re going to talk about today.

Scott McAllister: So, we have a tradition on this show, Laurie, where we ask each of our guests to first debunk a myth. So, what are some myths or common misconceptions about testing that you want to debunk?

Laurie Barth: So, the main one is that it takes longer. There’s no return on investment, that it just takes a bunch of time upfront and you never benefit from it. One of the main reasons that’s a myth is because it assumes that no one else other than you will ever touch your code. And so, you know exactly what it does and you can confirm that it works, but the minute someone else tries to fix a bug in it and they don’t really understand the entire code, bad things happen.

Another myth about testing is that unit testing is the only type of testing. That’s not true. There’s integration testing, there’s end-to-end testing. There’s actually a lot of different terms that we can use to talk about this that sort of mean the same thing. And there aren’t really clear delineations between the different types of testing and the terminology that people have beyond… There’s a few things that people agree on, but as a whole, it’s sort of like, an integration is an end-to-end test is a thing… Like anything that’s sort of beyond a unit test, there’s some disagreement about. But unit testing is not the only thing. And then, I think the last thing is, this is probably funny, that you can trust your tests. Just because you have tests doesn’t mean that passing tests mean your code works. That depends entirely on what the tests are. And it’s not just about, is there code coverage and does it check every line of code? It’s, is it checking it correctly for the right things?
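
To make the coverage point concrete, here is a minimal sketch in Python (chosen only for brevity; the idea is language-agnostic). Every name in it, including `apply_discount`, `buggy_discount`, `weak_test`, and `strong_test`, is hypothetical and not from the conversation. It shows a test that executes every line of the code under test yet still passes against a broken implementation, next to a test that checks the actual values.

```python
from typing import Callable

# Hypothetical signature: (price, percent) -> discounted price.
Discount = Callable[[float, float], float]


def apply_discount(price: float, percent: float) -> float:
    """Correct implementation: reduce the price by the given percentage."""
    return price * (1 - percent / 100)


def buggy_discount(price: float, percent: float) -> float:
    """Broken implementation: returns the discount amount, not the new price."""
    return price * percent / 100


def weak_test(fn: Discount) -> bool:
    """Runs every line of fn (full coverage) but checks almost nothing."""
    result = fn(200.0, 25.0)
    return result is not None


def strong_test(fn: Discount) -> bool:
    """Checks the returned values, including a zero-percent edge case."""
    return fn(200.0, 25.0) == 150.0 and fn(80.0, 0.0) == 80.0


if __name__ == "__main__":
    for name, fn in [("correct", apply_discount), ("buggy", buggy_discount)]:
        print(name, "weak:", weak_test(fn), "strong:", strong_test(fn))
```

Both implementations get the same coverage number from the weak test; only the stronger assertions tell them apart.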

Scott McAllister: Right, right. I’m currently involved with our Terraform provider here at PagerDuty and I’m one of the key maintainers on the project. And so, when I add new features I’ll feel so good because there’s a test in it, but then I realize after putting it out in the wild, “Oh wait, that’s not even testing the right thing.” So, that was basically… I need to at least increase what that test is doing or think through what a user will actually try to be doing.

Laurie Barth: I will absolutely admit to the fact that there have been times in my past, don’t tell any of my former employers, that I have waited for tests in CI to pass knowing that unless it was catching my feature as an edge case, it wasn’t going to change anything. But I was like, “As long as I didn’t break anything existing, it’s fine.”

Scott McAllister: Right.

Laurie Barth: It seems reasonable.

Scott McAllister: Sure. Yeah. So, tell me about the first time you were introduced to testing because I know from my background as a software engineer, there’s definitely been teams where at least automated testing was not involved. That is not the case anymore. I mean, the projects I work on now definitely have tests that run before you can merge or at least push to the remote. So, tell me about the first time you were introduced to testing.

Laurie Barth: So, this is sort of a complicated question and answer. So, the actual very first time I was ever introduced to testing was a side project I was working on at a company and they just wanted me to write a bunch of JUnit tests. But for various reasons, they didn’t have the auto-completion library installed and they couldn’t get it installed. So, I was manually writing all of the unit tests for each of the fields of the class, which is just terrible. But the first full-time dev job I had, where all of my time was devoted to dev, the philosophy was sort of anti-testing. If I’m perfectly honest, the senior developer at the time believed in the myth that testing takes longer than it’s actually worth. So, we didn’t have unit tests, we didn’t have end-to-end tests, and we didn’t have integration tests.

What we did have was two manual QA developers. And so, every time we put new code in the code base, they would be monkey testing. It’s the old joke: a developer walks into the bar, asks for one beer, asks for 9,999 beers, asks where the bathroom is. That sort of thing, like that’s what they were doing. But we ourselves did not write any tests at all. And I always thought… that’s where that last myth comes from. Like it’s not only unit testing. I was like, “Okay, I guess I kind of understand that because we have a CRUD API and sort of like if it creates, updates, and deletes, it works.” But the idea of end-to-end testing of any of the UI elements, that was all done manually and there was no other type of testing. I mean, granted at the time this was Angular 1. So, the world was a little bit different, but still not great. That was my introduction to lack of testing. My true introduction to testing comes honestly many, many years later. We’re not going to say how many.

Scott McAllister: So, I mean, when you got introduced to it, how did you realize that you needed it? Because it sounds like you first had the unit testing, which I’ve totally been there, right? Where you’re writing these unit tests on every field of a class, just to kind of see that it’s there kind of thing. But really, as you mentioned before with the different types of testing, there is a lot of value. There is value in those unit tests, right? Making sure that the different objects are there, that the things do behave properly, but then you have those additional integration tests that make sure that all the bits work together. So, when was that aha moment that testing really actually saved time rather than taking too much time?

Laurie Barth: Yeah. I will say from a perspective of the unit test for the individual fields, like in JavaScript land, at least on the front end side where you don’t really have classes as much anymore and you can have prop types or some people are using TypeScript, that sort of thing, I do think the need for those individual field tests is probably outdated. It’s probably not necessary in the same way it’s necessary for functions and for other things. But I honestly did not appreciate the need for testing until I was on a giant open source framework project. I mean, I knew that it was necessary and I had used it in certain aspects, but I did not understand how critical it was until Gatsby. Because when you have community PRs going in and you have a giant monorepo and all these different teams working on all these different features and all these different plugins and these integrations and just so many moving pieces.

And, you know for a fact that no one person can ever grok the entire code base and have intimate familiarity with every single one of those pieces, the likelihood of introducing kind of collateral damage without realizing it and consequences and side effects is so, so high. And so, you need that confidence because the minute you merge something and something gets published and something gets built, you’ve now broken a version of a package. And that’s really impactful because it’s not some sort of live streaming system where there’s a bug fix and everyone’s just getting it. They have to upgrade. And so, if you broke something for them, there’s a lot of things that could happen and it can stay broken for a long time. And then, everyone’s like, “Well, this isn’t stable and we don’t want to use it.” And there’s a lot of bad consequences to that.

So, I think I’ve appreciated it for a while, but truly understanding the consequences of it. And obviously, these are low stakes consequences comparatively to, I don’t know, nuclear warheads or something. But that sort of thing. I think that’s when I really started to understand and appreciate just how critical it is to have confidence in a large code base with a lot of moving pieces and a lot of really great people working on it. Like it has nothing to do with the skillset of the team and everything to do with the breadth of knowledge and awareness that no one person can really have.

Scott McAllister: Right. And especially with Gatsby and with it being an open source project. You have a large community of contributors, thousands of contributors who have made changes to that code. Me personally, I’ve made changes to that code. And then, they’re in there today. Well, at least they were in there last time I looked.

Laurie Barth: What’d you change? Because now I want to know.

Scott McAllister: Oh, so there was something to do with Google Analytics that it wasn’t quite doing right. At least, the way that we had it at a previous employer. We needed to be able to use Google Tags, right? Do you use a tag to do analytics? And so, it didn’t handle that Google tagging right. And so, I was able to change that little bit. So it was just a little bit, but-

Laurie Barth: Nice. Congratulations-

Scott McAllister: … Hey, I got my Gatsby t-shirt. I was very proud of that shirt.

Laurie Barth: Thank you very much.

Scott McAllister: So, you mentioned before the different types of testing. Let’s take a step back and kind of talk about that just for people who may not understand all of the different ones. We have the unit testing, and you said integration testing and the end-to-end testing. Do you want to compare and contrast those?

Laurie Barth: Yeah. So, Integration Testing is, it’s your integration points, those interfaces that work with each other. So, it’s this system hitting this system, that sort of thing. For us, that means a lot of CLI things because we have these packages, but when you have a CLI interface on it, you want to make sure that it’s actually triggering the right code. So, that’s one point of integration. Integration can come in a lot of different ways. It can be someone’s hitting your external API. It can be something like a CLI. It can be you’re interacting with some backend server, like third-party thing, there’s basically all of these touch points where you’re switching over from one type of code base to another type of code base. That’s where you want to get those integration points.

And then, end-to-end testing is sort of like the end user experience. You click into the UI and you start doing things and it does what you expect it to do all the way down. And so, for us with a plugin ecosystem, that’s actual sites that are using the plugins. Is it producing the functionality you expect it to? And then, there’s sort of, I don’t know if you would call it a sibling or some would call it another type of testing, but there’s a lot of visual testing that can go along with that. So, for example, I’ve been working on an image plugin and my colleague implemented visual regression testing where it’s literally comparing images next to each other bit by bit, color by color, size by size and saying, yes, these are the same thing. And what’s ironic… Not ironic, but impressive is we had two images that to both of us, we were looking, we’re like, “This looks the same.” And it was cropped just a sliver differently. And the testing caught it and we couldn’t see it until we really, really looked.
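
The image comparison described here can be sketched roughly as follows. This is not the tooling Laurie’s colleague used, which the conversation does not name; it is a minimal, dependency-free illustration in Python, assuming an image is just a grid of RGB tuples. The names `compare_images`, `image_size`, `baseline`, `cropped`, and `recolored` are all hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue)
Image = List[List[Pixel]]      # rows of pixels, top to bottom


def image_size(img: Image) -> Tuple[int, int]:
    """Return (width, height) of a row-major pixel grid."""
    height = len(img)
    width = len(img[0]) if height else 0
    return width, height


def compare_images(baseline: Image, candidate: Image) -> str:
    """Compare size first, then every pixel; describe the first difference."""
    if image_size(baseline) != image_size(candidate):
        return f"size mismatch: {image_size(baseline)} vs {image_size(candidate)}"
    changed = sum(
        1
        for row_a, row_b in zip(baseline, candidate)
        for px_a, px_b in zip(row_a, row_b)
        if px_a != px_b
    )
    return "match" if changed == 0 else f"{changed} pixel(s) differ"


WHITE: Pixel = (255, 255, 255)
baseline: Image = [[WHITE] * 100 for _ in range(50)]   # 100 x 50 image
cropped: Image = [row[:99] for row in baseline]        # one column narrower
recolored: Image = [list(row) for row in baseline]     # same size, one pixel off
recolored[10][20] = (254, 255, 255)

if __name__ == "__main__":
    print(compare_images(baseline, baseline))
    print(compare_images(baseline, cropped))
    print(compare_images(baseline, recolored))
```

A one-column crop or a single off-by-one color value is invisible to a person glancing at two images, but an exact comparison like this flags both. Real visual regression tools usually add a tolerance threshold so that harmless rendering noise does not fail the build.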

Scott McAllister: Nice. Nice. Yeah. It’s nice. Actually, I hate it yet I love it when a test catches something, right? Because it’s like, “Okay, this still doesn’t work.” And then, I’ll be honest when I’ve been working on some projects recently, a lot of times when a test catches something, I’m like, “Is it the test or is it my code?” Because honestly, my code’s got to be good.

Laurie Barth: Yeah. Like, can we talk about Flaky Tests? Because I think it’s really important.

Scott McAllister: Please.

Laurie Barth: Yeah. So, flaky tests are super dangerous and in my opinion, a lot more dangerous than not having tests at all. Which is a very bold statement. But I’m going to explain what I mean by that. So, flaky tests are tests that have very common false positives or false negatives. So, they do the opposite of what you expect them to do. And if they just sort of flip flop back and forth, that’s bad, right? And you run it again and you see if it passes and you wait and you run it again and you see if it passes and you just sort of wait. That’s annoying from a time standpoint, but it’s not as bad as flaky tests that consistently give a false negative. So, it’s just some connection or timeout where it’s like nine times out of 10, it just doesn’t run correctly. And it fails.

Because those tests exist inside larger suites of tests. And I cannot count the number of times that I have failed a test in that same suite, but I’m used to that suite failing. So, I don’t even look at it because I’m like, “Oh, it’s that test.” But it’s not. It’s a different test and my code actually did cause it, but I’ve lulled myself into the sense of expected failure and it’s bad. It’s not good. And flaky tests are really dangerous. You should remove them or fix them literally as soon as possible because the longer they stay around, the more they get cemented in the developers’ brains of like, “Oh, this is what I should expect to happen here. And I shouldn’t trust it.” And if you start to have a default of not trusting your tests, only bad things can happen.
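
One way to surface the kind of test described here is to rerun it several times and see whether the outcome is stable. The sketch below is a minimal Python illustration under stated assumptions: `classify`, `make_intermittent_test`, `always_passes`, and `always_fails` are hypothetical names, and the intermittent failure is simulated deterministically (it fails on every odd-numbered call) rather than by a real network timeout.

```python
from typing import Callable, Set


def classify(test: Callable[[], None], runs: int = 6) -> str:
    """Run a test several times and label it stable-pass, stable-fail, or flaky."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    outcomes: Set[str] = set()
    for _ in range(runs):
        try:
            test()
            outcomes.add("pass")
        except AssertionError:
            outcomes.add("fail")
    if len(outcomes) == 2:
        return "flaky"          # same code, different results
    return "stable-" + outcomes.pop()


def make_intermittent_test() -> Callable[[], None]:
    """Build a test that fails on odd-numbered calls, simulating a timeout."""
    calls = {"n": 0}

    def test() -> None:
        calls["n"] += 1
        assert calls["n"] % 2 == 0, "simulated connection timeout"

    return test


def always_passes() -> None:
    assert 1 + 1 == 2


def always_fails() -> None:
    assert 1 + 1 == 3


if __name__ == "__main__":
    print(classify(make_intermittent_test()))
    print(classify(always_passes))
    print(classify(always_fails))
```

A report like this only identifies candidates. The advice in the conversation still applies: once a test is known to be flaky, fix it or remove it quickly, because a suite that is expected to fail hides the failures that matter.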

Scott McAllister: I have so been there. As you’ve been describing this, I can see it in my mind, the tests on certain projects where I’ve been like, “Yeah, but that one isn’t always accurate anyway.” So man, yeah. So, what are some of the other struggles that engineers have with testing? Because, honestly, having those flaky tests, as you mentioned, can tell me as an engineer, “Well, we have tests, but they’re not great. So what’s the point?” What are some other struggles engineers have?

Laurie Barth: Long running tests and being impatient. And I get called out on this all the time to be perfectly honest. We have a very large monorepo. We have a lot of tests to run for every PR. And my colleague was like, “You keep merging things before they finish tests.” I’m like, “Yeah, because they ran fine on the last run. And I just changed two words in the README. And so, the same one should run.” And he’s like, “Yeah, don’t do that.” And it doesn’t actually matter. It’s not that he doesn’t trust me. And it’s not that they wouldn’t have passed. They would’ve passed. But now we don’t have a record of the fact that they would have passed because GitHub shows us the little red X and it’s like, “Nope, these didn’t finish.” So they didn’t pass because you aborted them, versus they would have shown the green dot, which is what you want. So, long running tests and impatient people, not a great combination.

I think another thing with tests is using tests to confirm that the code that you wrote just worked instead of testing the things that you actually mean to test. It’s very easy, for example, when you change a feature to go through the tests and say, “Okay, delete this to make this pass, delete this to make this pass, delete this to make this pass.” And that’s probably accurate, right? You have new functionality and there’s certain things that it was testing for that aren’t in the DOM structure anymore or aren’t in the return value, whatever that looks like. But far too often, because the test exists, you remove the things that are no longer valid and you don’t add additional tests to cover the new functionality. So, it’s like, okay, I have conformed the tests to what I just wrote, but I didn’t write any new tests for it. And I haven’t confirmed that those work.

So, people talk about test driven development. And I think there are good things and bad things about that. But there is a point at which, you know the spec of what you’re writing and you can start sort of sketching out what those tests can look like. And I do a different version. I don’t do test driven development. I do documentation driven development, where I have to write the doc for the API that we’re writing. And if I can’t explain what it is, then I need to go back to the drawing board before I go and implement it. And they iterate. It’s not saying like, “This is the doc that’s going to be released. These are the tests that are going to be released.” But the point is to think about the thing you’re doing before you do it, which you have to do because you had to write a spec or you had to write a ticket or whatever you have all of this written down.

And it’s far too easy to finally get the functionality, monkey test all of it, say, “Okay, yeah, this is working.” Check. And then say, “Oh no, I’ve got to write tests and docs. Okay, let’s just do that really quick,” and not really put in the time and effort that should be there in order to confirm and have confidence that the developers that come along after you are going to keep that functionality working. Because that’s really what you’re doing it for. You’re doing it for the next people to touch the code. Even if that next person is yourself in two weeks, forgetting everything that you’ve just written.

Scott McAllister: It’s key, right? Especially the point that you’ve mentioned a couple of times now, that it’s not necessarily about you, it’s about the people who come in later. I have definitely been on the other side where I’ve adopted a project or inherited a project. You don’t really understand how all the bits work yet, but you just need to add this feature. So, you add the feature, but you run a test and realize it breaks this other thing over here and you didn’t realize there were dependencies there. And so it gives you a better understanding. At least for me, that’s honestly how I’ve learned a whole system a lot of times: “Oh, I ran tests and saw how things were dependent on each other.” So, it’s a great point.

Laurie Barth: I’m a docs dork. People know this about me because I worked in docs for a while. And testing to me is very similar and falls under a very similar category of onboarding. If you have a large code base, you cannot expect someone to be fluent in that code base in the first week, in the first month, maybe in the first six months, depending on what that code base looks like. And are you not going to let them contribute, or not going to have confidence in their contributions and have a bunch of people who have to look over it and confirm and help them through it for six months? Is that a good value add and a good cost for your company? No, it isn’t.

No one is going to argue that. That’s a bad idea. It’s not useful. You’re paying this person. They’re spending a lot of time learning a lot of things. They know plenty. They just don’t know enough to realize what the consequences of what they’re coding might be in areas they haven’t yet learned. And testing is there to tell them that. And without it, they are sort of stumbling around in the dark.

Scott McAllister: I’ve definitely been on the stumbling in the dark side of things-

Laurie Barth: Same. We all have.

Scott McAllister: Right. And so, it’s super vital. What are some other keys that you’ve seen in your experience of effective testing?

Laurie Barth: Testing that you can run as confidently locally as you can in CI. I think one of the problems that I’ve seen and experienced a lot is, it’s very challenging to consistently run something locally and so you have to wait for CI to run and you can’t control the order of CI. And so, if it’s the last test in a large suite of things and you just have to push up your changes every time you’re trying to see what’s going on, on some branch, you’re wasting a lot of time. That’s like a lot of churn time. So, an effective and clear way to be able to run subsets of tests for local development is really key and important. That visual testing piece of things, I am not the expert here. Actually, Angie Jones gives this great talk where she talks about Visual Testing that’s still wrong.

Like an inverted image doesn’t get caught, those sorts of things. So, visual testing is really important and really helpful for people who are doing UI-centric things and things with complicated layouts and mobile and all of that. And I feel like in general, a lot of places don’t do enough of that. Those sorts of libraries and tooling have been getting better and better in the past few years. And understanding consistency in your testing patterns is also super useful. It’s the onboarding piece, it’s the learning to code. When people are learning to code in your system, you want to make sure they’re adding tests for stuff that they add. And you want to make sure that they have patterns to use and copy. I have this joke, “work smarter, steal code,” but that’s what you want people to do. You want consistency.

Scott McAllister: Absolutely. Well, Laurie, we have a couple of questions that we like to ask every guest that comes on for the show. And so, for starters, what’s one thing you wish you’d known sooner about testing?

Laurie Barth: That it’s not its own language. Depends on the actual programming language, right? But a lot of the libraries have gotten closer and closer to how you would code, for example, like React Testing Library to how you would code a component to how you would put logic in your regular coding files. And so, it’s not as much of a leap to sort of understand this new paradigm. It’s just focusing on the APIs that are available and the functions and the helpers that are available. You don’t have to learn something brand new.

Scott McAllister: I can relate to that, actually. One of my things, when I learned about testing, is I was hesitant because I was like, “Okay, this is something new. I don’t know if I can do this. This is going to take so much longer to ramp up. I just need to get my feature out the door.” But now that I’m involved in projects that have this established testing base in them, they’ve saved me so many times. And it does give you a confidence that when I’m pushing for this Terraform provider that I work on, when I push out a new version, I have that confidence like, “Okay, I know at least it’s not going to completely FUBAR everybody. It’s at least going to run. And it’s at least going to do mainly what I want.” I might still need to tweak it, but what I test will work.

Laurie Barth: Yeah, absolutely.

Scott McAllister: So, is there anything you’re glad that we didn’t ask you about in regards to testing?

Laurie Barth: No, because then you’re going to ask me about it.

Scott McAllister: You are the first person to say that.

Laurie Barth: I can’t be. It’s a trick question. I mean, rocket science and C++ and probably Assembly. Probably, that I’m glad you didn’t ask me about like Front End testing versus Backend Testing versus Systems Testing versus all of that, because I’ve seen all of those different pieces of the puzzle, but I can’t say that I have confidence in writing all of those different pieces of the puzzle. Systems testing is hard. Like real hard. That is the like third rail.

Scott McAllister: Absolutely. I think Systems everything is hard. So yeah. Laurie, thank you so much for being on the show today and for sharing your insights, your knowledge of testing, about how we can have more confidence in our code before it goes live. For everyone who wants to follow Laurie, she’s on Twitter at… Tell us your Twitter handle again.

Laurie Barth: Laurieontech.

Scott McAllister: Yeah. So, thanks again. This is Scott McAllister and I’m wishing you an uneventful day. That does it for another installment of Page It To The Limit. We’d like to thank our sponsor PagerDuty for making this podcast possible. Remember to subscribe to this podcast if you liked what you’ve heard. You can find our show notes on pageittothelimit.com and you can reach us on Twitter @pageit2thelimit, using the number two. Let us know what you think of the show. Thank you so much for joining us. And remember, uneventful days are beautiful days.

Guests

Laurie Barth

Hosts

Scott McAllister