How We Ship: CKSource
How CKSource builds and ships open source software to 18 million users
Our ultimate goal is to understand and share with you, at a nuts and bolts level, how great companies ship software to customers. To do that we're doing email interviews with engineering, product, and project managers at great companies to understand how they ship. If you'd like to share how your team ships software, get in touch!
Codetree: Because shipping software is context dependent, can you let us know your team size and any other context you think might be important for readers?
I’m a lead developer of CKEditor 5, an open source browser-based rich-text editor. I work for CKSource, a company founded by Frederico Caldeira Knabben, the author of FCKeditor (later renamed to CKEditor). The project was started nearly 14 years ago (at least a couple of eras ago by JS standards) and has since been one of the most popular (if not the most popular) editors “for the web”. CKEditor is CKSource’s main product, and around it we offer additional services (such as commercial licenses, technical support, SLAs, etc.) and related apps, such as CKFinder (a file manager) and Accessibility Checker.
The important fact is that CKEditor is a product for developers, so we don’t deliver software directly to the end users. It is a component to be customised, extended and integrated with other software, which has many important implications for how we design and deliver it. We need to consider aspects such as a broad API surface, backwards compatibility, documentation, a diverse community, internationalisation, indirect access to the end users, and so on.
Besides that, content editing in a browser is a complex and often misunderstood topic itself. It seems to be all about inserting and deleting letters, a task which all of us do every day, so how hard can it be? This belief is especially popular in the Western world with our relatively simple language notations, and it causes us a headache from time to time. I can’t go into details in this interview, so excuse me for linking to my own blog post (which only scratches the surface anyway).
Finally, to give you some numbers: I have worked on CKEditor 4, and now CKEditor 5, for more than 5 years already. Due to business requirements (such as support for collaborative editing and a new data model) CKEditor 5 is a complete rewrite of CKEditor 4, and after nearly 3 years of work we are close to releasing the alpha version (yep, 3 years to MVP). There are around 10 developers involved and CKSource is reaching 40 people right now. At the moment, CKEditor 5 is 34k SLOC, 38k lines of comments and 72k lines of tests.
CT: How do you decide what to build next? What does that process look like?
We never lack ideas for new features or other kinds of improvements. We have CKEditor 4’s and CKEditor 5’s open issue trackers where thousands of developers leave their feedback. We are constantly contacted by new customers who ask about certain features or improvements. Finally, we have our own goals, especially regarding the technical debt and maintenance. These 3 channels differ a bit in the type of work they bring, so we need to prioritise and reconcile business goals (which mean inflow) with community requirements (which mean good reception and happy developers) and with our goals (which mean our satisfaction and well-being).
Surprisingly, I don’t recall a lot of friction here. Our CEO, Frederico Knabben, has years of developer experience, our CTO, Wiktor Walc, has just celebrated his 10-year anniversary at CKSource, and the teams and their leaders have a good understanding of the business goals and situation, so we make decisions together.
The process depends on the type of change. There are issues which are clearly bugs, but there's also a whole range of enhancement requests between "this is a bug" and "this is a feature". I actually observe a lot of problems that we have in describing items for our changelogs – every developer categorises issues a bit differently. This often comes from the fact that what we do is working against browsers. Fixing a browser bug may require developing a whole new feature. For the users who experienced the initial problem this will be a bug fix, but for many other developers this will be a new feature of the framework.
I can't, therefore, say that feature owners pick bugs to work on and Frederico (CEO) or Wiktor (CTO) make decisions regarding features. Many ideas come from feature owners and project leaders and they make most of the decisions. But we're also open to suggestions from the rest of the team. Finally, many long-term, strategic decisions come from Frederico and Wiktor, but are discussed with the teams so we can pick the right moments to bootstrap such topics.
Of course, it’s not all roses. There’s pressure from the open source community to work on the features that they are interested in, customers usually ask for completely different features and, as developers, we have our own goals. Naturally, we need to postpone and reject a lot of ideas. This turns out to be a delicate topic in the open source world because there’s always someone who believes that if we deliver anything for free then we should also provide all kinds of support for free. In fact, helping the community is the fourth channel of work to do. It doesn’t bring any direct revenue, but we love to do that and try to devote as much time to it as possible.
CT: Could you sketch out what shipping a new feature looks like for you end-to-end?
Since CKEditor is an open source project most of our work happens on GitHub.
- First, we pick a feature owner. We have a flat structure so this can be a couple of guys.
- Then, usually, we bootstrap the topic by some face to face discussions and we open a ticket describing the initial proposal, our doubts and other considerations.
- This is the moment when more developers get involved, including the community. Those discussions can be terrifying at times, when you see how a seemingly simple ticket becomes a monstrous one, with walls of text and dozens of references to other tickets. Such discussions take time and energy, but most of the time they really help. As I mentioned, as an open source library we need to take into consideration a very wide spectrum of use cases and we need to avoid mistakes because of backwards compatibility. A collaborative approach lets us validate some concepts early and I value these discussions a lot.
- Once the problem is clarified well enough, we start the development. This is a moment when new problems appear and we encourage developers to continue posting their observations under the ticket. It’s easier to discuss things on Slack or face to face but it’s invaluable to have all the decisions and their reasons written down on GitHub because there’s a huge chance that you’ll need to refresh your memory one day. After all, if something wasn’t clear initially it will not be clear in 3 years’ time, and I’ve been browsing 8-year-old tickets myself. Also, thanks to that we keep the interested developers informed, so they can step in once something becomes unclear or incorrect.
- Once the feature is more or less ready, the developer makes a pull request and we begin the code review. We avoid involving too many developers at this time, so there’s usually just one reviewer, up to two in case of more tricky issues. Everyone has their way of reviewing PRs and different PRs require different approaches. Sometimes we focus more on the functional part and manual tests. Sometimes more on the architecture and API documentation. Sometimes we review the tests. Personally, I really like reading API docs and the textual part of the code (method names, etc.). Based on them I can understand whether, at least conceptually and architecturally, the code is ok (which is a big part of the success).
- Steps 4 and 5 (development and review) are repeated a couple of times because, on average, a PR gets “review-” 2-3 times. This is also a moment when we decide how perfect a PR needs to be to get merged by assessing things like proximity to the release and the risks. We report necessary followups for things which we decided to postpone.
- Finally, we merge the PR. Now, the code needs to be maintained and (as a feature) it will stay with us for years (usually, there’s no step back). Depending on whether the change was breaking or not, we release it in a minor or major release (so we need to maintain two code branches).
- The last step of the process is the testing phase which we perform just before the release. I’ll talk about this later.
Of course, the process will differ depending on the size and complexity of the task. There’s no reason to involve everyone if a change is rather straightforward and has little impact on others. But there are also architectural changes which none of us feels safe to do alone. Therefore, we’ve never created a “software architect” position as none of us would have the level of expertise and genius to fill it.
Perhaps I should also mention that a year ago we adopted Holacracy. It’s a framework for organising a self-managing company. Our approach, as usual, was to do this in a non-extreme way so we picked what worked for us and forgot the rest. One thing which stayed are “roles”. We don’t define positions and structure, but instead assign roles to people (a single person usually has multiple roles) and organise sets of roles in circles (let’s say “teams”). So, in fact there are “architect” roles in project circles, but no one became “the architect”. I think that this really well reflects how you organise software companies nowadays.
CT: Do you estimate how long features will take to build? If so what does that process look like? Do you track if features are behind or ahead of schedule? Do you track how long they took compared to how long you thought they would take?
This is another question for which I don’t have a short answer :D. Yes and no. We estimate features for our customers. But the risk and level of uncertainty is usually huge. Content editing is a tough topic itself and the browser environment in which we work is hostile (the technology we use is not standardised yet and there are major differences between browsers and many things are not possible to achieve). Fortunately, many customers allow us to open source the work which we did for them (although, we need to cover the cost of “generalising” the feature to be adequate for the community) so even if the estimation was too low the community profits.
The bugs are usually the trickiest to estimate. It’s not uncommon that even after investigating the issue we can only say that it will take anywhere from one day to a month or more. And you really don’t know unless you fix it and test it in all browsers. The same goes for proofs of concept. I remember a feature for which we created a really good PoC in 3 days and then spent more than half a year before we were able to ship it. That was 4 years ago but it’s the same today as there’s very little repetitive work that we do. During the past 2 years we’ve been implementing and polishing our Operational Transformation mechanisms. To our knowledge, no one did this before on a similar data model and we didn’t dare to estimate this work.
CT: Do you have a particular process for paying down technical debt and/or making significant architectural changes?
We keep the debt acceptably low and make changes when we need to do them or feel that it’s time. There’s a lot of gut feeling involved but it’s another thing where our estimations may fail. One day you say “we can live with this broken module for a couple of months” only to completely rewrite it the next day because it completely blocks some ticket.
Besides the typical dilemmas, we also need to consider the fact that we ship a framework. It means that if we start hacking our code too much we’ll force others to do so too. Once we fix the issue for real, other developers will need to revert their hacks and they will hate our “improvements”. Some developers don’t understand that and criticise us for not making some changes as quickly as they’d make them (using hacks). But it’s all about the additional cost of some actions which we need to consider in the long run.
I think that the most important thing that we achieved here is a high level of trust in our decisions from Frederico and Wiktor and we’re really grateful for that. They understand that we need to keep the technical debt low. At the same time, we avoid refactoring things for which there’s no real reason. It’s sometimes tough because we’re really pedantic about the code but we remember and accept the fact that the goal is to finish the milestone as soon as possible.
Also, writing everything on GitHub really helps. Some changes may sound extremely critical in your head but if you can’t clearly justify them with words there’s a chance that you’re wrong.
CT: How do you deal with delivery dates and schedules?
It ships when all that we really need to ship is ready. So, no fixed deadlines and no fixed scope. With fixed scope we would never release anything because there are always things which would block the milestone forever. And with fixed deadlines we’d produce very bad software.
It’s a little bit different with minor releases if we don’t have some critical issues that we want to squeeze in them. Then we try to deliver them on time and trim down the scope.
CT: How many people are in your product + dev team? What are their roles?
We don’t have fixed teams, so it’s a bit hard to answer this question. We should rather count the number of people who are (at some point) involved. For instance, a front-end developer will get involved once there’s some UI to work on, but is not necessary when we’re working on the architecture. Also, our projects range from relatively small ones which require one core developer to those which would accommodate a couple of dozen (at which point they’d split into subprojects).
So, there are 1-7 core developers, one of whom is the circle’s “lead link” (another Holacratic term – slightly different in meaning from a typical “team leader” because it focuses more on the “link” part). Most of the circles have a separate role for an architect which is filled by 1-3 people (usually, core developers). Testing is part of the developer role but we also involve an additional person who can jump in when needed. The same goes for a front-end developer and a designer. In CKEditor 5’s team, Olek Nowodziński, who’s worked with us for 5 years already as a developer, is at the same time a great UX/UI designer, which is invaluable. Finally, there’s documentation, which is really important for a product like CKEditor (people are often surprised how serious we are about it :)), so there’s an additional role for a person who oversees the documentation. So that’s up to 10-12 people who have at least one role in the circle.
Besides that, we have the marketing, sales and business development circles which are involved in many strategic decisions. There’s also the people care circle which handles all kinds of inquiries and makes sure that our customers and community are satisfied. This gives up to 20 people in total.
CT: Do you organize people by feature (e.g. cross functional feature teams made up of devs, PMs, designers etc.), or by functional team (e.g. front-end team, backend team, etc), or some other way?
I think my previous answer covered this. There’s usually a small set of core developers who are assigned to one project for a long time due to their expertise, but even they are often involved in the other circles (as “links”).
CT: How do you know if a feature you’ve built was valuable to customers? What metrics do you use if any?
I can answer this question based on CKEditor 4 which has been on the market for nearly 5 years already.
First of all, we are asked about many of the new features we then work on. So based on the frequency of the feedback we can choose things which we know will be valuable to the community and our customers. There are also innovative features which the community hasn’t yet thought of, and in this case we can tell how well they were adopted by the number of 3rd party plugins which are created based on them and the number of questions on StackOverflow and bug reports. It may be challenging, though, not to get depressed if you deal with too much negative feedback (and as developers we can always find something to complain about), but all constructive feedback is good and there are also heartwarming moments :).
So, we don’t use any particular metric and we miss some feedback (especially directly from the end users, but sometimes also from the developers, who don’t always share their thoughts) but there’s also a lot of information that we can analyse.
CT: Do you have a dedicated testing role? How important is automated testing in your process? What tools do you use for testing?
When it comes to testing, we talk about manual testing (aka monkey clicking) and automated tests. We spend a good time doing both.
We generally write a lot of tests. CKEditor 5 has complete 100% code coverage and our other products are close to that. Every PR needs to be covered by tests which verify the actual change. So, we’re not targeting an elusive 100% – we focus on covering all important cases, some edge cases, some potential false positives and some potential future problems. 100% code coverage either comes automatically or we see that we wrote a bit too much code (i.e. dead code) or missed some cases in our test suite.
From the technical side, we try to have as few unit tests and mock as few things as possible. Higher level tests or simply integration tests are easier to maintain in the long run. There’s a far lower chance that we’ll need to completely rewrite them during refactoring and that we’ll miss some issues in how components work together (and this is where we identify most of the bugs).
When it comes to manual tests, we do have a dedicated person who fills the tester role in various circles. But until very recently we didn’t have anyone. It doesn’t mean, though, that we didn’t manually test our products. It’s actually the opposite because we always paid a lot of attention to QA, and testing always was and still is one of the responsibilities of the developer role. We expect PRs to be tested by their authors, we often test them during reviews, and we perform a testing phase just before a release. This is a moment when all developers spend a while checking if the new or modified features work properly. It’s a good moment for the team to actually use the product and see it in a bit wider picture because during development we tend to focus on the details.
This crowd-based approach to testing has one additional advantage – unlike in many apps, there’s a practically infinite number of scenarios. Text editing features will never be fully orthogonal – they all need to work together in one place on content formatted in completely unpredictable ways. You can spend a month testing nested lists and e.g. headings and undo and you’ll still be discovering new (and still quite realistic) cases. This is also one of the reasons why we never introduced automated manual testing using Selenium or other technology.
CT: Does your team create functional specs or technical specs for new features? What do they look like? Who reviews them? Who has final say on the spec?
We don’t differentiate these types of specs. It also really depends on a ticket because different things need to be defined for how the image feature needs to work and how the command API needs to look.
Assuming that we’re talking about a feature which has UI, we may start with some mockups and flows (like here). The UX discussions smoothly lead to some discussions about the behaviour of the new feature. Those, in turn, lead to investigating and defining the technical aspects.
We try not to overspec things. We stop as soon as we believe that the developer who will work on the task knows everything he/she needs. We clarify things on demand during the implementation phase and also during review.
For the specs that we create we don’t use any specific tooling. For our beautiful UI/UX designs we use Sketch and Keynote with some fancy template. And when describing technical and less visual parts of the features we simply produce walls of text. I think that during the whole development of CKEditor 5 we created only one architecture diagram. Fortunately, this diagram is nearly up to date today, despite all the refactoring that we’ve done so far. But if we tried to go even slightly deeper it would all be completely outdated by now.
In fact, we learned the typical lesson. 3 years ago we started developing CKEditor 5 by discussing some basic aspects like the conventions and some really basic architecture. Then, we wrote short wiki pages for most of those things. Really high level stuff (at least most of it). By today 90% of it is completely outdated despite being rewritten at least once. Some really basic architecture has changed 3-5 times already. I don’t know what we were thinking back then since we already knew that this approach never works :).
CT: What tools are important to you in your engineering process?
Lots of them, as usual. Starting from Slack and GitHub where all the work happens, through Codetree on which we manage all our repos (we have lots of them – more about this later) to Travis and again GitHub where we host some websites. We also use a lot of home grown tools (mostly to streamline the development) – e.g. mgit2 to work with multirepo projects and the whole set of developer tools for automation of releasing all the packages, tests running, building documentation and so on.
CT: Do you triage issues? If so what does that process look like? How often do you do it?
Yes, we need to. Even now, before we released the first alpha version of CKEditor 5, we have more than 500 open tickets.
The first thing we do is assign labels such as “type:bug” or “type:feature”. We treat bugs with higher priority, so they end up in the next milestone faster. Then, we have labels such as “candidate:1.0.0” which make up a big list (~250) of issues that we’d like to close before 1.0.0.
We did have labels for defining the priority of a ticket more directly (1-3 stars) but they never worked. What we worked on next rarely took them into consideration because at different stages different aspects were more important when picking the tickets for the next milestone. The only difference is with severe bugs but those are added directly to the current milestone to be fixed ASAP. The feedback that we get from the community also helps because we can identify common problems. Basically, notifications that we get from GitHub work as a kind of reminder “please, check if I shouldn’t be fixed ASAP”.
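To make the label-based triage above a bit more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the labels “type:bug”, “type:feature” and “candidate:1.0.0” come from the answer above, while the `severity:critical` label, the issue titles and the exact ranking order are hypothetical and are not how CKSource necessarily does it.

```python
from __future__ import annotations

# Hypothetical label: the interview mentions "severe bugs" but names no label for them.
SEVERE = "severity:critical"


def triage_rank(labels: set[str]) -> int:
    """Return a rank for a ticket; a lower rank means "look at it sooner".

    The ordering is an assumption based on the description above.
    """
    if "type:bug" in labels and SEVERE in labels:
        return 0  # severe bugs are added directly to the current milestone
    if "type:bug" in labels:
        return 1  # bugs get into a milestone faster than other tickets
    if "candidate:1.0.0" in labels:
        return 2  # on the list of issues to close before 1.0.0
    return 3  # everything else waits


# Hypothetical tickets, keyed by title, with their GitHub labels.
issues = {
    "Toolbar glitch": {"type:bug"},
    "New table feature": {"type:feature", "candidate:1.0.0"},
    "Typing loses content": {"type:bug", SEVERE},
    "Nice-to-have idea": {"type:feature"},
}

ordered = sorted(issues, key=lambda title: triage_rank(issues[title]))
for title in ordered:
    print(triage_rank(issues[title]), title)
```

A fixed ranking like this is exactly the kind of formalism that, according to the answer above, tends not to survive contact with real milestones, so it is best read as a starting filter rather than a rule.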
CT: What part of your current process gives you the most trouble? What is working the best?
I think that the process, as long as it’s acceptably good, plays a secondary role. Our process is the result of a couple of years of polishing and works really well for us. What takes most of the time is solving problems and the process shouldn’t get in the way of you trying to do your job.
We were often tempted by some additional formalism. E.g. recently we removed a few GitHub labels defining the stage at which a ticket is because there was little profit from them and we tended to forget to use them.
On the other hand, the more developers are in the team and the bigger and more active community you have, the better organised you need to be. I had a hard time explaining why we need all the “labelology” on GitHub to my teammates who hadn’t experienced dealing with dozens of tickets reported every day by the community (out of which 1/4 is invalid, 1/4 duplicated and 1/4 needs some additional information).
There’s also the aspect of automation. We’d love to automate nearly all of our work and that’s great. I love the CI, nightly builds, notifications on Slack, managing 30 GitHub projects through Codetree or working with those 30 git repositories using mgit2, etc. But there are still things where the cost of automation is higher than the cost of doing things manually, so we keep repeating ourselves.
CT: What is something you do that you feel is different from what others are doing?
It’s hard for me to answer this question without checking some statistics. We’re definitely not alone in doing things the way we do – there will be always some other team which came to the same conclusions.
However, based on talks with other developers, especially those who joined us, I can tell that:
From the technical perspective:
- Our attention to tests and documentation is rather uncommon.
- Using the multirepo approach.
From the organisational perspective:
- Our openness is something rather rare (but that’s not a surprise given how few companies do open source).
- Holacracy is new and there aren’t many companies which try to introduce self-management.
CT: If you had to give your shipping process a name what would it be?
“Non-Dogmatic Agile” sounds like what we do.
CT: When you are building new features how do you handle bugs that get found before the feature ships? After it ships?
The most important bugs are usually caught either during the review or during the testing phase, just before the release. If a bug is really visible, then even if it’s caught late in the process, we do try to fix it. We definitely prioritise anything which could lead to data loss (crashes, incorrect output, etc.). Many bugs, however, can occur pretty frequently but have little impact on the UX. If one of the user actions had an incorrect result the user can (usually) undo that action. Such issues are often postponed to the next milestone.
Once we fix some late spotted bug we repeat the testing phase of that fragment of the functionality. All kinds of quick bugfixes have a high risk of introducing other regressions and regressions are what we dislike the most.
CT: What does your branching and merging process look like?
In CKEditor we have two code lines (minor releases and less frequent major releases) so we have two main branches – master and major. When working on a ticket we create a branch t/
Finally, one very important aspect is that we use a multirepo architecture for CKEditor 5 – a package per repository. Right now we have nearly 30 packages. This adds some overhead to our work, but it means that a single PR is oriented around one feature and forces us to better define the changes. E.g. if we need to add some behaviour to the image feature and this requires some changes in the engine, that change in the engine has its own ticket and its own entry in the engine’s changelog. Thanks to that, a developer who uses the engine but not the image package gets proper, contextual information about what has been changed. If we mark the change in the image package as a breaking change, the engine’s new version may still be marked as backwards compatible.
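The per-package bookkeeping described above can be sketched in a few lines of Python. This is a hypothetical illustration, not CKSource's release tooling: the `Change` record, the ticket references and the function name are all invented for the example, and the rule "a breaking change means a major release, anything else a minor one" follows the description given earlier in the interview.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    package: str    # package (and repository) the change belongs to
    ticket: str     # hypothetical ticket reference in that package's repository
    summary: str    # text of the changelog entry
    breaking: bool  # whether the change breaks backwards compatibility


def release_type_per_package(changes: list[Change]) -> dict[str, str]:
    """Decide "major" or "minor" independently for each package.

    One breaking change is enough to make that package's release "major";
    other packages touched by the same piece of work are unaffected.
    """
    result: dict[str, str] = {}
    for change in changes:
        if change.breaking:
            result[change.package] = "major"
        else:
            result.setdefault(change.package, "minor")
    return result


# One piece of work split into two tickets, one per package (all values hypothetical).
changes = [
    Change("engine", "engine#1", "Added a hook needed by the image feature.", breaking=False),
    Change("image", "image#1", "Changed how the image feature behaves.", breaking=True),
]

print(release_type_per_package(changes))
```

Because each change carries its own package name, a developer who depends only on the engine sees a single backwards compatible entry, while the breaking change stays confined to the image package.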
CT: How is new code deployed?
Not applicable to our case. But yes, we do use CI (Travis). We ship code manually after performing the testing phase and checking everything.
CT: How often do you ship new code to customers?
We try to have a minor release (mostly bug fixes and other backward compatible improvements) every month and a major release (new features and bigger changes) twice a year. It doesn’t make sense for us to ship more often because releasing is not that cheap in our case. It also happens that we release new versions less frequently than this if we don’t have something important completed, because it doesn’t make sense to release empty milestones either.
CT: Do you use feature flagging? Staging servers?
CT: Do you use Sprints? How long are they? How is that working out?
Anyway, we work iteratively. One iteration equals one minor release (so about a month). Sometimes we define the scope of the next iteration in advance, but often we just add tickets to the milestone as time goes on. It’s a bit different for CKEditor 4, which has been used in production for years now (so we need to be very careful what we ship), than for CKEditor 5, where we can release the next developer previews without caring that much whether they are really complete and stable. So the iterations have more or less fixed time and rather fluid scope.
We hold weekly meetings where we talk about the progress within the circle. And we also have daily standups where we do a very quick sync between all projects.
We don’t do sprint reviews but, to a certain degree, they happen somehow automatically because all iterations end with a release and some summary.
I think that this process works for us very well. We polished it for the last five years, never blindly sticking to some dogma. Although, we didn’t have real retrospectives yet so maybe something’s wrong and I don’t even know ;).
Interested in learning more about CKEditor 5? They've just launched a Gitter channel where you can chat with team members.