The future of Audacity, interview with the team
It seems that these days every other major free/libre media production tool is undergoing dramatic changes that promise a richer feature set, better usability, and, generally, more power to users. Audacity is one of them.
Originally developed by Dominic Mazzoni and Roger Dannenberg, Audacity has been with us for the past 16 years. By now, there's probably a whole generation of people who work with sound and use Audacity as their go-to application for simple recording, editing, and mixing of audio, as well as for quite unusual projects such as making 3D jewelry out of waveforms.
However, like other high-profile free software, the project appears to be pulled in many directions by an enormous number of feature requests. Some of them have already been addressed in the two latest releases, which introduced real-time FX preview for LADSPA/VST/AU plugins, support for LV2 plugins, and basic spectral editing.
Modern Spacer Black VST plugin running in Audacity with real time preview
But there are far more requests: a contemporary user interface, non-destructive effects and automation, better support for various plugin APIs, a complete MIDI workflow, etc. So LGW sat down with the team to talk about development priorities and the outlook for the future of the project.
Q: You recently released v2.1.0 with major changes such as real-time preview for effects and spectral selection/editing. Congratulations! Now that it's out, what's the next thing to occupy your time?
James Crook: I expect us to be putting more developer time into quality, but in a smarter way:
- Tests with each of our recently automated build-on-commits that go beyond pass/fail and monitor performance and our memory/CPU headroom.
- Low overhead 'countdown' logging so we can log anything we think might help. I intend this to help us track down some glitches that should not happen.
- Enhancements to scripting to automatically collect/update all the screenshots for the manual.
The screenshot script is for documentation, but of course will be giving Audacity quite a good workout too.
Q: There's still a major gap in cross-platform free/libre software when it comes to an easy-to-use digital audio workstation like Apple's GarageBand. Various existing projects are either inactive (Jokosher), Linux and JACK-only (Qtractor, MusE, Rosegarden, NON-*), EDM-oriented (LMMS), or just commonly considered too complicated for beginners (Ardour). Do you see Audacity filling that void for "bedroom musicians"?
James Crook: Audacity has, in my view, become too hard to use. We need a much simpler mode for it that at the same time does not 'sandbox' you away from the more advanced features. That's a big GUI design challenge rather than just a programming challenge.
I think Julian Dorn and Leon Schlechtriem have some very good thoughts on that with their dedicated recording mode.
Q: MIDI features in Audacity are still basic, and proposed musical time in the timeline hasn't been implemented yet either. Is it about project vision not involving MIDI much, some sort of technical limitations, or the lack of contributors?
James Crook: What gets developed depends on people's interests and time, and MIDI is unfinished indeed. Yes, we are all pulling in slightly different directions. As a group, improving real-time is much higher priority for us than MIDI. But we do want MIDI, for reasons beyond using it for composing.
Both MIDI and RT will benefit from pluggable track types, and that is where there is more activity.
Q: About that activity. What are the most exciting features in the works lately?
James Crook: Last year we held the Audacity Unconference in Preston, organized by Martyn Shaw, where we demoed a radically transformed user interface, converting hand claps to notes (MIDI and wave), a minimally editable score track (musical notation), the RT preview that is now in 2.1.0, an RT effects dock, and automation curves. Not all the demos we make will make it into production, but there is exciting stuff in the works.
Q: Adding a real-time effects dock and automation would involve a major rewrite of the audio engine (not to mention redesigning the UI), something like what Joshua Haberman started years ago with the Mezzo project, right?
James Crook: I can only partly agree. The FX dock demo was based on what is now 2.1.0 code, so the 2.1.0 audio path supports it. Leland has put down a lot of the foundations for full real-time by springboarding from cross-platform work by GStreamer.
The automation curves were demoed on new audio code with micro-fades that rejoins Audacity at PortAudio. We are making changes in mainstream Audacity audio path based on experience with it. One of those changes will be in 2.1.1.
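To give a rough idea of what a micro-fade does, here is a minimal sketch in Python. It is not taken from Audacity's code; the function name `apply_micro_fades`, the linear ramp, and the representation of audio as a list of floats are all assumptions made for illustration. The general idea is that a very short fade-in and fade-out at the edges of a block of audio avoids the audible click a sudden jump in level would cause.

```python
def apply_micro_fades(samples, fade_len):
    """Return a copy of `samples` with a short linear fade-in and fade-out.

    Hypothetical illustration only: `samples` is a list of floats and
    `fade_len` is the fade length in samples at each end.
    """
    n = len(samples)
    # Never let the two fades overlap on very short blocks.
    fade_len = min(fade_len, n // 2)
    out = list(samples)
    for i in range(fade_len):
        gain = i / fade_len          # ramps from 0.0 up towards 1.0
        out[i] *= gain               # fade in at the start
        out[n - 1 - i] *= gain       # mirrored fade out at the end
    return out


if __name__ == "__main__":
    block = [1.0] * 8
    print(apply_micro_fades(block, 2))
```

At a typical sample rate a fade of a few milliseconds is only a few hundred samples, which is short enough to be inaudible as a fade yet long enough to smooth the discontinuity.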
For both of these demos, the GUI is currently the real barrier to the feature being ready. There will also be work to make the built-in effects real-time, as each one will need to be visited.
Joshua's Mezzo initiative was very focused on the audio engine. We do need a much cleaner API between the audio engine and the GUI — and that is where Mezzo was heading. We also need other structural changes even more. If we don't think these things through carefully and prototype, then we are writing 'the same code' over and over in the GUI in slightly different disguises.
Much of the Audacity-specific code that we still have to write for these features is GUI code. The demo code helps us work out what structural changes to make in both the GUI and the audio API.
Q: But you don't talk about these work-in-progress projects much, do you?
James Crook: It would be very irresponsible to get end-users' hopes up based on these early demos. There is, though, more happening, more new activity, than you see in the main git repo.
Like the current MIDI code, and like Mezzo, there is no guarantee work in progress will ever make it into released code, or that if it does that it will be any time soon.
Q: Speaking of the user interface, Audacity is both praised and criticised for its UI, its branding etc. The team used to be somewhat wary of radical UI changes. Later you added and then, apparently, removed the ability to make skins (or, rather, color themes) for Audacity. Finally, since last year or so, you've been posting UI and logo proposals from users on your Google+ page and collecting input. Has there been a change of heart? Are we going to see a redesigned user interface and updated branding?
Steve Daulton: I'm very keen to promote engagement and contributions for Audacity beyond coding. Developing a major project such as Audacity requires many types of skills and contributions, and is not limited to computer programmers (though as a software project, high quality code is obviously important). Writers, graphic artists, musicians, translators, VO artists, accessibility specialists... All may make valuable contributions.
UI proposal by Lucas Romero Di Benedetto
Vaughan Johnson: Additionally, in 2014, we worked with Intel on prototyping a touch version of Audacity. I'm trying to get back to that project, now that we released Audacity 2.1.
Audacity with touch interface, picture courtesy of Intel
Q: Since its inception, Audacity has been developed in a somewhat generic fashion, which is why it got adopted by a great variety of users. It got Nyquist scripting early on to simplify writing new features, and there have been at least half a dozen friendly forks (mostly by team members like Vaughan) to customize it for various purposes. Would you say that Audacity today is truly modular and extensible, or do you see ways to improve the state of affairs? How?
James Crook: No. Audacity modularity is minimal as yet. We only have the basics. We are making slow progress though. As mentioned before, we are working on pluggable track types so that we have more modularity in the GUI.
I view Nyquist in Audacity as 'a secret weapon' that few people really know about, analogous to having Elisp in Emacs. My impression is that no one is using it to its potential in Audacity. The more involved work using Nyquist seems to be in the standalone version of it. New features like SAL land there first.
Nyquist isn't as integral and central to operation of Audacity as Elisp is to Emacs. As yet, Nyquist in Audacity has knowledge only of the audio and not of the GUI. To extend Nyquist properly we need to tell it about the GUI and to be able to plug new GUI elements in.
Q: One of the annoyances users have with Audacity is its overly long Effect menu, whenever too many plugins are installed and discovered. Years ago, an effects taxonomy was introduced to make it possible to choose FX based on the category they belong to (reverbs, compressors etc.). It was later removed for technical reasons. Today, Audacity still separates internal effects from pluggable ones and breaks external ones into numbered submenus. Do you envision a way forward with this?
James Crook: Yes, we have already done some preliminary design work on that.
Q: Stats at OpenHub give an (admittedly, questionable) impression that the team is getting smaller in terms of code contributions, and there's a huge difference in activity even between the top 5 committers. Would you say you are growing or shrinking as a developer team?
Steve Daulton: Take a large bucket of salt. The stats on OpenHub were frozen for nearly 4 months and the last time I looked the stats were over a month out of date. I don't mean to criticise OpenHub, I think they do a great job overall, I'm just pointing out that such stats are not at all reliable for fine-grained analysis.
Vaughan Johnson: Yes, OpenHub looks only at code contributions. E.g. Leland always does a lot of commits, sort of "agile"-style, so he gets a very high commit count. I'm okay with that measure, but I think it's not always representative of actual overall contribution. Line count has also been shown to be a very questionable measure, for many years.
The Audacity team is actually growing, e.g., we just added Paul Licameli and encouraged him to add code by giving him commit privileges. James committed Paul's contributions before we gave Paul commit privileges, so it looks like James is contributing those, but they're actually Paul's. James has made his own contributions, too, recently; I'm just saying it's a misperception that the Audacity team is shrinking.
Besides, I and others have been putting in a lot of work that doesn't register on OpenHub — website files/updates, builds, releases etc. — things that OpenHub ignores.
Q: Is there a particular line of work on Audacity that you need help with the most? Something that, once completed, would move the project light years ahead?
James Crook: People should do what they personally care about. That's where they will make the most difference. I love the ways that Audacity is already being used in education. Vi Hart did a lovely video explaining overtones using Audacity.
The maths in audio programming ranges from the straightforward (amplify is just multiplication) to the diabolically subtle. The hard maths is the biggest, most difficult barrier to more developers writing audio code. It's worth tackling head on.
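The "amplify is just multiplication" point can be illustrated with a few lines of Python. This is a minimal sketch and not Audacity's implementation: the function names are hypothetical, audio is assumed to be a list of floats in the range -1.0 to 1.0, and the clipping step is an added assumption. The one piece of standard maths is the conversion from a gain in decibels to a linear amplitude factor, which is 10 raised to the power of dB/20.

```python
def db_to_linear(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def amplify(samples, gain_db):
    """Multiply every sample by the gain factor.

    Hypothetical illustration: results are clipped to [-1.0, 1.0],
    assuming floating-point samples normalised to that range.
    """
    factor = db_to_linear(gain_db)
    return [max(-1.0, min(1.0, s * factor)) for s in samples]


if __name__ == "__main__":
    quiet = [0.05, -0.1, 0.2]
    # +6 dB is roughly a doubling of amplitude.
    print(amplify(quiet, 6.0))
```

The multiplication itself is a single line; the subtler questions (what to do about clipping, how to choose the gain) are where the maths starts to get more involved.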
This is the right time to build the FLOSS audio developer community and bring more people in. Done right the hard maths can be understandable and satisfying. Likewise the programming that follows from it.
So I am repurposing convoluted content from Wikipedia and mining existing code, working with others to comb out the tangled explanations, trying to make a new, really beautiful and wide on-ramp for audio programming from the very earliest stages on.
I'd love more help. There are challenges of all kinds in it. It's not just to put Audacity light years ahead.
Q: What do you see as the most challenging tasks for the project in the foreseeable future — feature-wise, organization-wise etc.?
Steve Daulton: Difficult to put a finger on any one thing as there is so much going on, and different areas require different priorities.
For the documentation crew the major challenge is to continue to provide high quality documentation for a project that is progressing at a rate of knots.
For the user support team it is to provide high quality support for an ever increasing user base. It is the continuing "challenge" that drew me to Audacity (and no doubt the same for other contributors): we don't choose to do things because they are easy, but because they pose a challenge and bring personal satisfaction when we are able to rise to those challenges.
James Crook: I think keeping the project fun is the number one challenge for us. We are all volunteers. As the code gets bigger, it is harder for an individual to have a big visible impact. That could tend to make it less fun.
A bigger, mature project can make development, particularly the "fixing other people's bugs" part, more like work than a hobby. We are doing pretty well at fun and impact. AU14 was fun. Both Leland's and Paul's changes in 2.1.0 have big visible impact.
We're working on ways to make the code smaller, less work to bug fix, and related things to keep the project fun.