Just like with golf, technology is as much about ensuring that your bad hits are recoverable as it is ensuring that you make great ones. We’re all going to have failures in our careers but avoiding the really big pitfalls will help you keep your company on the right growth path. Here are 10 common mistakes we at AKF Consulting see made during platform development — and the ones we believe are the most important to avoid.
1) Failing to design for rollback: If you’re developing a SaaS platform and you can only make one tweak to your current process, make it so that you can always roll back any code changes. We know that it takes additional engineering work and testing but in our experience, such effort yields the greatest ROI.
2) Confusing product release with product success: Do you have “release” parties? Don’t — you are sending your team the wrong message. A release has little to do with creating shareholder value. Align your celebrations with achieving specific business objectives, such as increasing sign-ups by 10 percent.
3) Assuming a new Product Development Lifecycle (PDLC) will fix issues with missing delivery dates: Too often CTOs see repeated problems in their development life cycles, such as missing release dates, and wrongly blame the development methodology. Make sure you’re fixing the right thing: lack of ownership of, lack of involvement in, or incomplete understanding of the current PDLC are among the most common root causes of late dates.
4) Allowing history to repeat itself: Organizations don’t spend enough time looking at past failures. The best and easiest way to improve your future performance is to track your past failures, group them by causation and treat the root cause rather than the symptoms. Keep incident logs and review them monthly to identify recurring problems.
5) Scaling through third parties: If you’re a hyper-growth SaaS site, you don’t want to be locked into a vendor for your future business viability; rather you want to make sure that the scalability of your site is a core competency and that it’s built into your architecture. Define how your platform scales through your efforts, not through the systems that a third-party vendor provides.
6) Relying on QA to find your mistakes: You cannot test quality into a system and it’s mathematically impossible to test all possibilities within complex systems to guarantee the correctness of a platform or feature. QA is a risk mitigation function and it should be treated as such. Defects are an engineering problem, and that’s where the problem should be treated.
7) Relying on “revolutionary” or “big bang” fixes: The degree of success of complete rewrites or re-architecture efforts typically ranges somewhere between not returning the expected ROI and complete failure. The best projects — and the ones with the greatest returns — are not revolutionary but evolutionary. Go ahead and paint that vivid description of the ideal future, but approach it as a series of small steps.
8) Not taking into account the multiplicative effect of failure: Every time you have one service call another service in a synchronous fashion, you are lowering your theoretical availability. If each of your services is designed to be 99.999 percent available, then the product of the availabilities of all of the services in the call chain is your theoretical availability. Five chained calls yield (0.99999)^5, or about 99.995 percent availability. Eliminate synchronous calls wherever possible and create fault-isolative architectures to help you identify problems quickly.
9) Failing to create and incent a culture of excellence: Bring in the right people and hold them to high standards. You will never know what your team can do unless you find out how far they can go. Set aggressive yet achievable goals and motivate them with your vision. Be a leader.
10) Not having a business continuity/disaster recovery plan: No one expects a disaster, but disasters happen, and if you can’t maintain normal business operations you will lose both revenue and customers. A solid business continuity plan explains to everyone how to operate in the event of an emergency. Even worse is not having a disaster recovery plan, which outlines how you will restore your site in the event a disaster shuts down a critical piece of your infrastructure, such as your colocation facility or connectivity provider. Our preference is to provide your own disaster recovery through multiple colocation facilities.
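To make the arithmetic in mistake 8 concrete, here is a minimal sketch of how theoretical availability decays as synchronous calls are chained. It assumes every service has the same availability and that failures are independent, which real systems only approximate. The function name `chained_availability` and the call counts are illustrative choices, not part of any particular platform.

```python
def chained_availability(per_service: float, calls: int) -> float:
    """Theoretical availability of `calls` synchronous calls in series.

    Assumes each call has the same availability and that failures are
    independent, so the individual availabilities simply multiply.
    """
    return per_service ** calls


if __name__ == "__main__":
    minutes_per_year = 365 * 24 * 60
    # Illustrative chain lengths; 0.99999 is the "five nines" figure from the text.
    for calls in (1, 5, 10, 20):
        availability = chained_availability(0.99999, calls)
        downtime = (1 - availability) * minutes_per_year
        print(f"{calls:>2} calls: {availability * 100:.4f}% available, "
              f"roughly {downtime:.1f} minutes of downtime per year")
```

Running the script shows the expected downtime growing roughly in proportion to the number of chained calls, which is the reason to remove synchronous dependencies where you can.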
Marty Abbott and Michael Fisher are partners with AKF Consulting.