Data Engineering Theatre
Thursday, 25th Sep
14:00 - 14:30
This talk, presented by Dan Keeley, Principal Data Engineer, and Jonathan Conn, Digital Technology Director from England Rugby, delves into the real-world challenges and triumphs of a complex cloud-to-cloud migration.
Dan Keeley
Principal Data Engineer, Rebura
Dan Keeley has been working in data engineering for over 15 years and has presented at BigDataLDN four times. He was named one of the UK's top 50 data influencers in 2016. He is now working as a Principal Data Engineer building out a data team at Rebura, one of the foremost AWS partners in the UK.
Dan Butler
Technical Director, Rebura
Dan Butler is the Technical Director at Rebura and a cloud expert with over 25 years of industry experience. He leads a high-performing team of 30 technical professionals, driving technical innovation and delivering scalable, cloud-first solutions that help businesses grow and succeed in the cloud.
Jonathan Conn
Digital Technology Director, England Rugby
England Rugby is the Governing Body for Rugby in England. It operates across both Professional and Community rugby and is the owner operator of Allianz Stadium, Twickenham. Jon has been leading the Technology provision for the last 9 years. During this time there has been a major focus on Digital Transformation with the aim of improving the experience of players, volunteers and fans.
Welcome, everybody. Hopefully everyone's still got a bit of energy left. My name is Dan Butler, the Technical Director of Rebura. I'm joined by my colleague Dan Keeley, who is the Principal Data Engineer at Rebura, and from England Rugby, Jon Conn, who is the Digital Technology Director. We're here to discuss the incredible journey that England Rugby have been on over the last 20 months or so, where they've performed a large-scale migration from a sponsored environment into a more cost-conscious and modernized environment built on AWS. Jon, I think a good place to start is on that sponsored environment. Can you tell us a little bit about what that looked like? — Yeah, certainly. Good afternoon, everyone. It's probably worth first putting on record that we are a very cost-conscious organization, so when we talk about moving from a time of not being cost-conscious, that needs a bit of explanation. It's pretty typical in sport that you have commercial partnerships, and getting alignment with a tech partner is a sort of utopia, if you can get that commercial arrangement aligned with things that you actually need to do. Previously we were in a relationship with IBM, and as part of that arrangement we had value in kind, which supported our original move into IBM Cloud and the consumption that came with that. In order to recognize the full value of it, you need to spend all the money. So, in a slightly strange way, we were incentivized to spend as much money as possible on IBM Cloud, and therefore we focused more on our original move to the cloud and on enabling that capability than on trying to run as efficiently and effectively as possible. Now, of course, when you come out of that commercial partnership and that sponsorship element is no longer there, you are left with a true cost to the business, and that's a very different situation, which drove a very different approach from us.
— Excellent. And when you made the decision that you were going to move, how did the organization choose AWS? What sort of processes and internal validation did your team have to go through? — Yeah, so this coincided with a few different things. Alongside the sponsorship ending, we had the emergence of a strategic digital transformation program, so we were forecasting significant growth in this space and knew we had to really plan for the long term. Pretty typically, we went to market and ran a tender exercise. You might not be surprised at most of the names that participated in that. Ultimately we were left with a pretty difficult situation, because as a buyer of these services, certainly with the big names in this space, it's pretty difficult to differentiate from a capability point of view. But for us there were a couple of key drivers that supported the decision: effectively, the AWS solution gave us the lowest cost, and they had helped interpret how we would leverage some of the AWS serverless capability. As an organization we're very peaky in nature. At certain times of the year we have a lot of eyes on us and a lot of volume through our platforms, and at other parts of the year there's not a lot happening. So we needed to move away from a position where we had to run pretty large-scale sizes to deal with our peaks without being able to scale down, and ultimately the AWS solution has led us into a very different space with that. Then, as part of the digital transformation, we had selected a number of other platforms. We were replacing our marketing platforms and our website, and we introduced identity management and single sign-on, and all the products that we chose in that space were also underpinned by AWS.
So when you started to put those things together and looked at the long term, we hoped there'd be a lot more opportunity to leverage some of the wider AWS relationship and also some of the marketplace side. — Absolutely. Cheers, Jon. Moving on to the migration, and this is probably a question for both of you: how did you approach the migration, in particular the mission-critical workloads? — So the solution there is we double-ran everything, for quite a while actually. That gave us two advantages. First of all, obviously, we could compare data side by side, so we could actually test with real-world production data, which is the best test environment really. But it also allowed us to tease out dependencies that hadn't been identified. There were odd workloads that hadn't been transferred, just small things, not anything significant, but it allowed us to get those things over in good time without it being chaotic. — I think what Dan's trying to say is it turned out to be a bit more complicated than we maybe originally thought, and took a bit longer. Yeah, the two main areas for us were our website infrastructure and, again back to our peaky nature, things like our fixtures and results as we come into the start of the rugby season. When we made the decision to go, we had some pretty key milestones ahead of us, because you can't move the start of the rugby season. September came upon us pretty quickly, and you're always facing a decision as to when you are confident enough to turn something off and have something new running. I think that was probably one of the biggest things. We had time pressure to achieve the cost savings, but we couldn't take the risk of service disruption from moving too quickly. It became a real balancing act, which ultimately just took longer. — Great.
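A minimal sketch of the side-by-side comparison that double-running makes possible, assuming both environments can export the same records keyed by a shared identifier. The `compare_outputs` helper and the `fixture_id` field are hypothetical names for illustration, not taken from the England Rugby systems.

```python
from typing import Dict, Hashable, Iterable, List

Row = Dict[str, object]


def compare_outputs(
    old_rows: Iterable[Row], new_rows: Iterable[Row], key: str
) -> Dict[str, List[Hashable]]:
    """Compare records from the old and new pipelines, matched on `key`."""
    old_by_key = {row[key]: row for row in old_rows}
    new_by_key = {row[key]: row for row in new_rows}
    shared = old_by_key.keys() & new_by_key.keys()
    return {
        # Present in the old environment only: a workload that was not migrated.
        "missing_in_new": sorted(old_by_key.keys() - new_by_key.keys()),
        # Present in the new environment only: an unexpected extra record.
        "extra_in_new": sorted(new_by_key.keys() - old_by_key.keys()),
        # Present in both, but the contents differ.
        "mismatched": sorted(k for k in shared if old_by_key[k] != new_by_key[k]),
    }


if __name__ == "__main__":
    old = [{"fixture_id": 1, "score": "24-17"}, {"fixture_id": 2, "score": "10-10"}]
    new = [{"fixture_id": 1, "score": "24-17"}, {"fixture_id": 2, "score": "10-12"}]
    print(compare_outputs(old, new, key="fixture_id"))
```

Run daily against production data, a report like this would surface both untransferred workloads and silent differences before the old environment is switched off.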
And Dan, I know you were close to the migration. Were there any other challenges or surprises that came along? — There were some technical troubles. We moved to MSK, which is managed Kafka from AWS. Before, in IBM, they were on full-fat Kafka. We also moved everything into one Kafka cluster rather than several. And the product team will tell you that MSK is fully backwards compatible with Kafka. Well, it's not. The authentication is tricky. We had network issues because it was hosted in a different account. And then it turned out the library we were using couldn't even talk to MSK. Well, it couldn't authenticate. There are two libraries for Kafka, for some reason, so we had to move to the other library, which is a more complicated library, but we got there in the end. I don't think you could ever have identified that up front, really, but we got there in the end and now it all works. — Good stuff. And obviously we're here talking to you, Jon, so you had some help from partners along the way. What would you say about the role that partners play in a transition like this? — Yeah, so our service delivery is pretty reliant on third parties, and in our previous world that was split across a number of different third parties. We had different partners for supporting the platform and for developing the platform, and we brought in a different partner to help us with the actual migration. Slalom supported us with that, and then handed over to Rebura to help us with the ongoing support. In some respects there are real benefits in having that consolidated into one partner that we now work with for both. And probably, looking back, you think that handover point is when most of the work is meant to have been done. — Yeah.
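The MSK authentication trouble Dan describes earlier in this exchange can be illustrated with a small pre-flight check. This is a sketch under stated assumptions: the talk does not say which authentication mechanism or client library was used, so IAM access control and a librdkafka-style client are assumed here, and the broker hostnames are placeholders. The configuration keys are standard librdkafka properties, and 9098 is the port MSK documents for IAM access from inside the VPC.

```python
from typing import Dict, List

# Assumption: IAM access control, which MSK serves on port 9098 inside the VPC.
MSK_IAM_PORT = "9098"


def msk_iam_client_config(brokers: List[str]) -> Dict[str, str]:
    """Build librdkafka-style settings for an IAM-authenticated MSK client."""
    return {
        "bootstrap.servers": ",".join(brokers),
        "security.protocol": "SASL_SSL",   # TLS plus SASL, required for IAM auth
        "sasl.mechanism": "OAUTHBEARER",   # IAM tokens are presented as bearer tokens
    }


def config_problems(config: Dict[str, str]) -> List[str]:
    """Return human-readable problems that would stop the client authenticating."""
    problems = []
    if config.get("security.protocol") != "SASL_SSL":
        problems.append("security.protocol should be SASL_SSL")
    if config.get("sasl.mechanism") != "OAUTHBEARER":
        problems.append("sasl.mechanism should be OAUTHBEARER")
    for broker in config.get("bootstrap.servers", "").split(","):
        _host, _sep, port = broker.rpartition(":")
        if port != MSK_IAM_PORT:
            problems.append(f"{broker!r} is not using IAM port {MSK_IAM_PORT}")
    return problems
```

A real client would also need a token provider (AWS publishes an MSK IAM SASL signer library for this) and a library that supports the chosen mechanism, which is the kind of gap that forces a library switch like the one described.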
— But certainly over the last year that's continued to be a work in progress, as we've continued to have to drive some of the efficiencies and enable the capabilities. So it is a difficult balance, trying to get the right combination of partners to help you and to get the timing right as to when you transition. — Excellent. Thanks, Jon. I just want to touch on cost control. Anyone that's done a migration will know that moving workloads is just the start, and in order to unlock the true benefit of cloud, modernization is key. How did you address this during and after the migration? — I'll take that to start with. To be honest, the case for moving was so compelling. When we went through the tender, even looking over a three-year period, with the savings we predicted we would make, and even taking into consideration the one-off cost of the migration, it was a no-brainer. And that was before factoring in any of the benefits of taking reservations on any of the services, so that's all come latterly. That's where we benefit from the FinOps service from Rebura. We run a small team, so we don't have a lot of the expertise to be proactively managing all the workloads that are running all the time, and we do benefit from that service, where Rebura will periodically analyze our consumption and make recommendations. So, to be honest, we're now into the territory of making even further savings beyond where we originally thought we'd get to. That's after a period of just seeing how things are running, understanding the behavior, and understanding where we can commit to 12 months or 3 years, and I certainly have confidence, from my perspective, that we're running as efficiently as we can going forward. — Right. And Dan, anything on the modernization side from you?
I think the modernization was interesting, because it was a deliberate decision not to change everything during the migration. We could have upgraded all the libraries. We could have moved to the latest version of the database. But we actually kept everything the same in order to remove variables from the testing. What we did as we went along was keep track of all this, so we basically had a backlog of things that we considered tech debt: anti-patterns in the existing system, out-of-date or deprecated libraries, and all of this. We ended up building up a rather large to-do list, but we've actually worked through it. Some of it has just been naturally fixed when we've been in the code anyway, and for other items we've done a dedicated little project just to work on that particular issue. So we lifted and shifted, we just went as-is, and then we modernized. — Great. And along the journey, were there any unexpected costs that came up, and if so, how did you resolve them? — Well, I think that probably just ties into the fact that we had our digital transformation program running at the same time. It would be all right if we were talking about the cloud migration in isolation, but you have to remember we were trying to implement Salesforce Marketing Cloud and Data Cloud at the same time. We ripped out our websites and implemented Acquia. We also implemented Okta, and if you know your rugby, you know you have to register to play. When you put all of that together, it's a real challenge when you've got the AWS platform as a sort of foundation block but a lot of things changing in and around it. So I look back over the two years, and I don't necessarily look back on it fondly.
That's because, for a relatively small organization, that was a lot of change for us, and I think what the team have had to deal with is a lot of the knock-on impact of us changing other things, rather than what we necessarily changed in AWS. So I think that's a big thing to balance, and then you lose a little bit of clarity. — What's a cloud migration cost? What's a cloud modernization cost? What's a marketing platform change cost? So it does get pretty complex. — Yeah, absolutely. And I think cost awareness in general is key to running an efficient cloud infrastructure. How did you embed that behavior in the operations? — Well, like I said, I think for everyone it was a bit of a mindset shift as we came out of the sponsorship and this became quite a focus. Good governance is a key part of that: it being a regular discussion point, and the fact that the service is proactive. To be honest, it was very well received by our business. If you want to please your boss, who happens to be the CFO, then the ability to deliver quite significant savings helps. But the reality of it is there's a future-proofing piece here as well. Almost every stand in this building has AI written on it, and most of that is linked to doing something with data and more compute, and I suppose that's something that we're
trying to prepare ourselves for. How do we make sure that, if we are doing more in that space, we're running as effectively as possible? — Because we aren't an organization who can afford to be putting more money than we need to into things without very clear business value associated with it. So I think the combination of the platform, the setup and the right partner at least gives you confidence that you've got the right environment, and leaves you to focus on what it is we actually need to do now to go after some of that value. — Absolutely. And apologies, we did try to be the only talk not to mention AI, but Jon's just ruined it. — Yeah. In terms of cost as well, the way that we work with England Rugby is that there's day-to-day BAU: there's adding new feeds, there's fixing broken feeds when upstream systems randomly change columns in the database. That's all day-to-day business, and that's fine. But we also do bigger pieces of work, such as moving to a data lake structure, which has always been on the target architecture for England Rugby. When we spec up that project, go through all the requirements and come up with a design, we also deliver a cost estimation on that as well. — It's just a standard AWS cost calculator, so that there is an indication of what that's going to look like moving forward. In fact, I think we might even save a little bit more money, because we're going to be turning off some databases and moving that into Athena and things like that, so it'll be cheaper to run. — Great. Thanks. And putting a retrospective hat on, and this is for both of you, what lessons have you learned from this whole experience? — Well, for me the hardest thing we had was just understanding what needed to move. No one could tell us: do that, and that.
So I think, in retrospect, we should have just used some data to drive that decision: look at what feeds were running and when data last flowed through each feed, because we did migrate some feeds that hadn't had data for years. I don't know why. So, knowing that, and then also I think naming became really important. People would call a feed one thing, but it was actually called something else in the actual implementation. They'd say, "Oh, can you look at this feed?" and we'd be like, "We can't even find that feed." So yeah, naming and all that sort of thing: super important. — And Jon, is there anything you would have done differently? And you're not allowed to say different partners. — I think, looking back, it's probably just being realistic about the pace of change. No matter what anyone says, it's a complex thing to go through. It's the classic situation that you're under pressure at the outset to hit certain timelines, even if you know that's pretty punchy and pretty complicated. So the more you can put it into bite-sized chunks and give yourself more time, the better. Because all the typical things do end up coming up, and particularly if you're split across a number of third parties and you've got a small team yourself, there are just lots of ingredients for that transition to not necessarily go as smoothly as possible. But I would say it's worth going through, because it might have been a pretty challenging period, but we're very confident we're in a good place now, at a much more enterprise level, with everything managed under one central account. And like I said, we're now set up to scale where we need to as the business demand starts to grow. — Absolutely. — Good stuff.
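Dan's two lessons, using data to decide which feeds to migrate and keeping feed names consistent, can be sketched as follows. This assumes a last-data timestamp is available per feed and that a mapping from everyday names to implementation names exists. The feed names, the one-year threshold, and both helper functions are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def stale_feeds(
    last_data_at: Dict[str, Optional[datetime]],
    now: datetime,
    max_idle_days: int = 365,
) -> List[str]:
    """Return feeds with no data inside the idle window (or no data ever)."""
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(
        name
        for name, seen in last_data_at.items()
        if seen is None or seen < cutoff
    )


def resolve_feed(name: str, aliases: Dict[str, str]) -> str:
    """Map the name people use for a feed to its name in the implementation.

    `aliases` is keyed by lower-cased everyday names; unknown names are
    returned unchanged so the caller can still search for them.
    """
    return aliases.get(name.strip().lower(), name)
```

Feeds returned by `stale_feeds` are candidates to question before migrating rather than to drop automatically, since a seasonal feed can legitimately sit idle for months.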
So, I think we're now a little bit ahead of schedule, so we'd like to open it up to the audience for some questions. We need to get a microphone over here. — Where are they? — Microphones are over there, I think. — Yeah. Sorry. Yeah, this gentleman here. — Oh, no. — Technical trouble. — Hello. Hi, guys. Just two questions. Jonathan, are you using a mixture of multicloud, or is it PaaS and SaaS, or just solely AWS? And the second question: in preparation for this weekend's World Cup final, what have you been doing on your cloud to be ready for the surge? — Yes. So I think the only thing we have that is not in AWS is some of our corporate tech, which is still in Azure. But pretty much everything else is migrated to AWS, and we're using that as a strategic platform across our performance, our marketing and anything that we're doing in data. And yeah, there's the small matter of our Rugby World Cup on Saturday, with the Red Roses playing in front of hopefully the biggest ever audience for a women's rugby game, which is fantastic. We've been doing a lot on our digital channels, and our AWS environment underpins everything that we do on our website, the data, and the targeting we do from marketing as we start to do the segmentation. So in our world, it's all our source systems feeding into AWS, then feeding into our marketing platforms to really try and grow the game: the communication, and all the data we use as a business to inform the decisions we make about growth. Are we seeing an uptick in young girls and boys, and older girls and boys, taking up rugby across the country? So it's multifaceted. It sits at the middle of all the data we use as an organization, as part of an ecosystem that's all built around trying to drive greater engagement, greater interest and greater growth of the game of rugby.
— One final question: does your IT budget go up if England win the World Cup? — Ah, probably not. No. — Yeah. — Thank you. Just to add to the point as well, I think you asked how we prepare for the surge and what we do in terms of the systems. Pretty usual stuff, really. It's nothing that we need to actively scale up, so we're not doing that sort of thing. But the support team are making sure they're available; they're always going to be contactable anyway, but we've just got extra eyes on it. Right. Of course. And no changes. We keep changes, especially to the website, away until after the surge. So, pretty usual stuff. — And Jon has Dan's mobile number on speed dial in case it all goes wrong. — September is our busiest month of the year anyway, because we support the community game as well as the professional game. So we're in lockdown in the build-up and through September anyway. So yeah, fingers crossed all goes well on Saturday. — This gentleman. — Yeah, hello. Thank you for the keynote. My question is for Dan Keeley. You say that you've done the shift and lift, or lift and shift. — Lift and shift. Yeah. — Yeah. So how can you estimate the cost that's required to get your solution or your application to the level of a cloud-native application, and how can you estimate that effort? What do you really need to get to that level? — So there's an AWS cost calculator, which takes some very high-level numbers. Obviously it depends upon the service as to what numbers you need in order to estimate it. If you're just talking about compute, a certain number of instances, it's quite easy, because you just say how many instances, what size, how much storage, and away you go.
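The compute estimate Dan outlines (how many instances, what size, how much storage) reduces to simple arithmetic, sketched below. The rates are illustrative placeholders rather than AWS prices, and `ComputeEstimate` is a hypothetical helper, not part of any AWS tool. A real estimate would take its rates from the AWS pricing calculator for the chosen region and instance type.

```python
from dataclasses import dataclass

# Common approximation of hours in a month (8,760 hours per year / 12).
HOURS_PER_MONTH = 730


@dataclass
class ComputeEstimate:
    instances: int        # how many instances
    hourly_rate: float    # price per instance-hour for the chosen size (placeholder)
    storage_gb: float     # how much storage, in GB
    storage_rate: float   # price per GB-month (placeholder)

    def monthly_cost(self) -> float:
        """Instance-hours plus storage for one month of always-on compute."""
        compute = self.instances * self.hourly_rate * HOURS_PER_MONTH
        storage = self.storage_gb * self.storage_rate
        return compute + storage


if __name__ == "__main__":
    baseline = ComputeEstimate(instances=4, hourly_rate=0.10, storage_gb=500, storage_rate=0.08)
    print(f"Estimated monthly cost: {baseline.monthly_cost():.2f}")
```

Because the result scales linearly with each input, re-running it with revised instance counts or storage shows directly how a change in usage or design moves the figure.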
Because we've been working with England Rugby for a long time, we are very aware of the numbers, the amount of data that's flowing, the typical things, so it's quite easy for us to go in and do that. If we're working with a client that we don't know so well, then we work with them to tease those numbers out. But it is only an estimate, of course, and it will change: if usage goes up, costs will go up, and if the design changes, then sure, it's going to change. But it's an estimate, and it's better than nothing, I think. — Yeah. If you're doing a compute-to-compute migration, AWS does provide some good tooling that can gather actual usage statistics and then match them to an instance in the cloud, so that gives you a bit more granularity. There are lots of tools out there that do that. — It's probably just worth mentioning, for the overall cloud migration, that it wasn't the case that we were just looking at some AWS tools. We ran a very formal process. We tried to paint a picture of what we were trying to do over a number of years. We went to each of the platforms, and they then picked a supplier of their choice to come and work with us, understand, and put together those initial estimates. It really took a number of months to build that picture, and it was only at that point that we made the big decision in terms of which platform we were going to go to and how to move. — Now that you have experience with a migration project, would you do it again, to any cloud provider, if it were part of a new sponsorship deal? — Ideally not. It is a real challenge in sport, because, like I said, one of our top three revenue streams is commercial partners. You'd have to have a really good reason why you would move, to genuinely enable something different. I think moving just to have a different badge doing the same thing is just not worth the effort.
This really has consumed 18 months of our lives, not the initial migration alone but then trying to stabilize, and of course that was compounded by the fact that we were changing a lot of things around it, but no organization stays still. So yeah, there would have to be a very compelling reason to do so, and certainly with my digital technology director hat on I would be skeptical as to whether it was really worthwhile. Although I'm pretty sure that with our commercial director and finance director, if there were other reasons that we maybe should be looking at it, we'd be having some interesting conversations, I think. But yeah, very glad to be on this side of the effort and not facing it being ahead of us. — Thank you for the talk. I was wondering if you could go into more depth around decisions like moving from multiple clusters to one cluster for MSK, or moving databases across services. How do the cost-to-value conversations go in your organization? — Sorry, what? — So when you make decisions like moving from multiple clusters for MSK to one, or moving databases across services in AWS, what's the thought process behind the cost to value across services? — So, I guess, in terms of why did we choose MSK as opposed to raw Kafka? — Yeah. — Okay. I think some of it is prescriptive guidance. They moved from MongoDB for the website, and obviously in AWS that becomes DocumentDB, so some of it is just natural pairings. What was interesting about moving from IBM Cloud was that their Cloud Object Storage, their equivalent of S3, has exactly the same API as S3, so I was like, well, this is easy then. I think one thing I did find was that in IBM everything is a bit rawer. It is full-fat Kafka. It is just Kubernetes. You haven't got 12 different ways of running containers.
But that's fine. You just pick the right solution to move to. — Yeah, I suppose related to that, we did have some conversations about whether, architecturally, we had more complexity than we needed, but then again some of that was done with the future in mind. There is always a balance: you've got time pressure, you're trying to be strategic, but the reality is you don't have a lot of time and you've got to make decisions, so a lot of that plays into it. I don't think there is an easy answer necessarily, but we did have the benefit that overall the cost saving was going to be significant, and therefore at that stage we didn't really have to get into every detail. We're probably more at that point now, when we look at how we drive greater efficiencies and be confident that we can support what we're going to do in the future, and we don't have the pressure of just getting off the platform right now. That probably goes back to the timing piece: be realistic in your timings, in terms of your priorities, what needs to get done and what can just wait till later. — Right. I think that's actually all we've got time for now, so hopefully you've all enjoyed the story. Can I just push you really quickly for one takeaway each? — I'd say it's worth it. If you're feeling that you haven't necessarily got what you need, and you can justify the commercial benefits, it's not a straightforward thing to do, but ultimately I think we can now safely say, having gone through that process with all the ups and downs and round and rounds, it is absolutely worth it, and we feel in a much better place for the future. But just, yeah, go in with your eyes open to that.
Whatever any salesperson tells you, it is not straightforward. — And Dan, really quickly? — I think really it's documentation. A really boring subject, but we do just enough that it becomes useful, and it becomes a critical part of the design process as well. So yeah. — Perfect. Well, thank you all for coming, and thank you, guys, for joining me. — Yeah. — Thank you.