v002 @ Eyebeam

I am fortunate enough to have received a residency at Eyebeam for Winter / Fall 2012. All Eyebeam resident applicants apply with a project in mind. What is my project? v002.app.

What is v002.app?

An experimental live (video) performance application, with a large focus on improvisational user interface and interaction design.

Why? Because I want a tool that is meant to be played. A tool that doesn't feel like work. A tool that is designed to adapt effortlessly to changing circumstances.

I feel a deep dissatisfaction with the current crop of video performance tools. While most are capable of performing their task (and I use them to the best of my ability), none attempt to approach the core problem of software as instrument. An instrument you can pick up, and instantly join an improvisational moment. Think Jazz.

Too often, time is spent creating presets, adjusting windows, importing media, setting effects chains and mapping MIDI triggers. Once your presets are set, you can move about creatively, but only in the box you’ve made for yourself (if you even have that option) – adjusting the assumptions of your presets takes time in all of these applications, shaving critical moments off the gap between when you want to react to a change in your environment and when you can finally get that decision out to the world.

Moments are fleeting, and anything that gets in your way only compounds the issue. And that’s without discussing the subtler issues of visual cognition, user interaction design, competing visual cues and information overload, hunt-and-peck flight-deck interfaces, and UIs modular to the point of making F-22 fighter pilots blush.

Do this experiment. Open your performance environment of choice, and make a new empty project. Load your music library and put it on random. Hit play.

Now perform visuals. Quick, get something on screen that fits the mood.

How long does it take you to create a composition? Once you’ve settled on the look / feeling / mood of the first track, how much flexibility do you really have to react to whatever happens to come on next? Do you feel behind?

Performing live visuals isn’t easy when you know whom you are performing for, when you have media and presets pre-made, have rehearsed, prepared, and have a known track list. Add live musical improvisation, add uncertainty, add unknown variables, and on top of all of that, try to push yourself to really perform, adapt and play. Can you do more than trigger a movie, scratch a clip or throw seemingly random effects on top of a clip while audio-analysis makes decisions for you?

Is there a language you are trying to speak in? Is your tool fluent?

How do you bend your clip library to match a mood you did not plan for? Make a flashy electro-house motion-graphics cliché dance to a different beat and find a home in some country western? Quick, change that preset. Add a layer. Add a mixer. Drop in a mask. Now remove all of that – it’s a new tune and it demands a new feeling.

Perhaps I am being unfair. Most of this comes down to you, the live visualist. Your taste, your visual aesthetic, particular synaesthetic approaches and how well you know your instrument. Your clip library, the footage you use and how you use it.

Just like in any discipline, “garbage in” generally nets you “garbage out”, but the tool, and the process the tool enforces on you, has huge ramifications even when you are at the top of your game.

How quickly can you react? How dynamic is the tool? How many decisions per frame can you really make? Does it get out of your way?

I only have a few answers, and many more questions – this is an inexorably complex problem with no correct answer. These issues are qualitative, driven by metaphor, ideas, approaches, limitations, even aesthetic assumptions. That said, I have an approach I think is worth sharing:

  • An application to create an aesthetic experience must itself be an aesthetic experience. Function follows form – but the function is form – dynamic and changing. This is the most important guiding principle.
  • Shorten the time it takes to react to changes. The interface must be non-intrusive, non-distracting and customizable, and interactions must be contextually relevant to the task being attempted. This is the main pragmatic focus.
  • Reduce complexity, repeat the same metaphors for systematic approaches to similar problems.
  • Treat the application as an instrument to be played.

And these are some common pitfalls I find myself revisiting when using practically all live performance software:

  • Performance interfaces must be “low latency” – this means:
      • Consistent – don’t have to switch mental models, second guess, interpret, etc.
      • Non-distracting – don’t have to ignore overwhelming UI cues.
      • Low complexity – don’t have to deconstruct abstractions.
      • Context-aware – show only what you need, in a contextually relevant fashion.
      • Customizable but consistent – adapt to my needs, not the other way around.
      • Provide an at-a-glance understanding of the state of your composition.

Additionally, I find that:

  • Layers are constructs and are not necessary; they add interface overhead and are only an abstraction. Anyone familiar with node-based tools knows this (see the sketch after this list).
  • Preset philosophies are the antithesis of improvisation.
  • Choosing pre-selected media is (generally) the priority – not setting a mood.
  • Limited context-aware actions and multiple interface paradigms all encourage “high latency” UX design. I have to stop and think to use the tool: find the pull-down menu, the tab, the disclosure triangle, etc.
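
To make the layers point concrete, here is a minimal sketch (plain C, hypothetical types and names – not v002 code) of the idea that a layer stack is just the degenerate, linear case of a node graph: each “layer” is a node whose single input is the node beneath it, so a general graph subsumes the metaphor without the interface ever needing to expose it.

    #include <stdio.h>

    /* A hypothetical compositing node: zero or more inputs, one output. */
    typedef struct Node {
        const char   *name;
        struct Node **inputs;    /* upstream nodes feeding this one */
        int           n_inputs;
    } Node;

    /* Walk a linear chain of single-input nodes – i.e. a "layer stack". */
    static void print_chain(const Node *top) {
        for (const Node *n = top; n != NULL; n = n->n_inputs ? n->inputs[0] : NULL)
            printf("%s%s", n->name, n->n_inputs ? " -> " : "\n");
    }

    int main(void) {
        /* "clip" composited through "blur" into "mixer" – a layer stack,
           expressed as nothing more than a path through a node graph. */
        Node clip   = { "clip",  NULL, 0 };
        Node *blur_in[]  = { &clip };
        Node blur   = { "blur",  blur_in, 1 };
        Node *mixer_in[] = { &blur };
        Node mixer  = { "mixer", mixer_in, 1 };

        print_chain(&mixer);   /* prints: mixer -> blur -> clip */
        return 0;
    }

Whether the interface ever shows that chain as “layers” then becomes a presentation decision, not a requirement of the data model.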

Where to go from here? I have some ideas I’ve been working on…

13 Responses to “v002 @ Eyebeam”

  1. toby*spark Says:

    one person’s input isn’t enough to constantly animate a mix – there’s too much going on, and much of it is a bad fit for human input. take the simplest case: i’m going to jump cut on the beat. do you really want to manually trigger that cut every second, accurately on the beat?

    with node based compositing you’ve got a great route for dealing with the assemblage of “layers”, effects and so on. let’s say that’s half the problem. i’d say the other is in extending your reach as one human’s input in time, so i’d urge you to think about the marshalling of a small army to do the animation of the mix for you. fwiw, audio analysis is a big part of that for me, don’t dismiss it.

    steering not controlling.

  2. protofALk Says:

    Sounds like a very nice journey in front of you. You are the right person for such an endeavour. Two things you shouldn’t lose focus on: speed – nothing sucks more than a non-fluid image or a half-frozen interface – and modularity – make it modular as hell if you want others to use it too, as there is not a single performer out there who works the same way as the next.

    Node-based rules. It will be interesting to see that in a live environment – together with context awareness it might even work. But even for preconstructing contexts it would be a godsend.

    As for presets – well, I think of presets more as contexts. Yes, they might be bad for improvisation in themselves, but they do create a good-looking starting point that fits a mood, and you can go from there. Pure improvisation might get you into a dead end, for without a preset, or even worse a reset, you are not getting out, and it just goes from bad to worse.
    Please have a look at how improvising humans function – improv theater, jazz bands, improv dance – before you go all out on the no-preset idea.

    But yeah gogo :) all the best for the project!

  3. laserpilot Says:

    Really awesome that you’re taking this on, I can’t wait to see what comes of it. Just prompted me to dump a few thoughts about this I’ve been sitting on for a while.

    I’ve spent 4 or 5 years tinkering with my max/jitter performance interface; it gets hard to toss that all out and start fresh because your performance language almost becomes that tool. I used the same basic interface for wildly free improv, straight-ahead pop-electro and dance performances. As you say above, it really comes down to personal aesthetic and how you treat that synesthetic relationship. I tried to really stick with that instrument for so long and really learn it, to spend as little time on screen as possible – just me, my instrument and the music. I kept a lot of the same effects as what I saw as a visual performance vocabulary and kept building a library of clips with my camera…what kinds of things could I create with a sort of base visual texture or motion while applying the same effects vocabulary? Not sure if this is really analogous to taking a music sample and applying the full range of possible audio effects to it, but maybe?

    I think we’re in much the same boat, in that we see the human as the most important translation catalyst. Audio analysis has this sticky spot of making the video about the audio, and there is no room for a back and forth. Where is audio’s video cut detector? Where is the automatic audio effect that bends to my colors? These aren’t simple translations that necessarily need to be linked 100% of the time.

    I do see the value of Toby’s point of “steering not driving” – that using some kind of oscillator can help with animating certain parameters that you wouldn’t want to continuously control via your body – but I don’t think audio analysis is rich enough yet to give it that much mindless control. Volume dynamics and beats are nice, but they hardly speak at all to texture, energy, or what came before or what’s coming.

    I feel like one helpful thing could be an interface that learns you like you learn it. If it can tell how feverishly you’re moving controls, it could move into its more dynamic modes. One of the biggest issues I had when doing improv was trying to program that really good dynamic range of moving quickly or moving reaaaaaally slowly with a limited set of controls. A lot of emphasis on speed made me forget that going smoothly slow with 30fps source footage was a huge challenge.

    Again..really excited to see what you come up with…wish I had the skills to help!

  4. Randall Says:

    Love Eyebeam. Very excited to see what comes of this. Good luck!

  5. ilan Says:

    1) If the material is not footage of a human being or some other photographic element, then it should never have to be pre-rendered

    2) If it is not pre-rendered, then every parameter of that content should have the ability to be modified on the spot via any type of input one chooses to use, be it the temperature of the room or the gas from their ass

    The above two points should be the cornerstone in the design process of such a tool. The third one below is an extra bonus that I think is worth mentioning given the fact that I am not a programmer:

    3) For the ‘creation’ part of this tool, there should be several levels of usability that will allow at least 3 different types of people to interact with it: a) Novice, b) Intermediate, c) Programmer. They should interlock and overlap each other, but NEVER interfere with or intimidate each other in the user experience.

    As always… simple is best.

  6. Andrew Benson Says:

    Here are some thoughts that you can take or leave. I really respect this journey you are on, and think you will find something in all of it, even if it isn’t the perfect app. I hope we can discuss as you go.

    Most tools/environments fail in one of two areas: ease of adding and generating new content or ease of adding and generating new behavior/functionality without rebuilding everything. Either of these can kill the type of adaptability you seem to be after. If it’s too hard to alter content (mood, color, different relationships) you’ll get tired of it. The concept of Layers helps with this, as you have a functional container for whatever content you want to add. But I agree that Layers aren’t the ultimate solution, and look forward to seeing what other ideas you have in this area.

    If it’s too difficult to develop new features or add new behaviors to an existing structure, that will also stop your development. High level tools often excel at the content part of the equation but lock you into functional rooms. Low-level tools often struggle on content (I’d like to add this image, but then I have to program a special node for display of that image), while offering endless possibilities to indulge feature fantasy. Unless you are into the live-coding thing, this doesn’t really help in the “quick make something bright” department. This is tough to get around really.

    Then there are the situations where the ‘content’ is a technical feature. I’m talking about particle effects, openGL junk, and other types of visual things that require a technical infrastructure to add. Being able to create these elements fluidly without having to rewrite everything is difficult without a modular infrastructure in place. I would also include camera-driven effects in this category.

    Which brings me to modularity. In my experience, it’s hard to create modules that you will actually use in a modular way. Most processes I’ve come across imply a whole set of processes that they are a part of. How do you define the boundaries of a module? How do you work modularly without devoting 50% of dev time to applying modular glue to the exterior of all your ideas?

    Finally, one of the big problems I have with most video performance systems is the layer of metaphors that obscure the technical reality. What if a system was built with full acknowledgement that it was an openGL context with a bunch of FBOs and shaders and texture maps and junk, instead of “layers” and “effects” and “generators”?

  7. toby*spark Says:

    on a tangent, and not as flamebait, but on andrew’s last point: one thing that really got me about the early days of quartz composer is that it was so obviously a direct mapping of the capabilities of your graphics card / openGL. it lost that along the way, and the current “optimisations” have confused any mental mapping i had about what the fastest path would be.

  8. vade Says:

    I haven’t had a chance to respond to these comments, and now there is quite a bit to respond to.

    In no order:

    Toby: QC never really mapped that. For example, Image data types could be processed on the CPU, then the GPU, then back on the CPU, due to how a particular processor patch worked behind the scenes. There was and is no way to determine if any image processor is GPU accelerated. This is one area where Jitter’s extended data types for CPU-side matrices and GPU-side textures make some sense. It is a shame that these nuances need to be dealt with at all, in my opinion.

    Andrew: I don’t think it will be the “perfect app”; in fact, I expect (and sort of hope) that whatever the final form of v002 is, it will leave some people with a bad taste in their mouth. I’d rather serve a few use cases *really well* than serve many poorly. As for features and development, part of the solution is to have fully open tools. A second is to have well-designed APIs that make it possible to add functionality that works, and feels like it belongs in the host. That’s one area I think QC did exceptionally well: plugin development and the API are sensible and easy to follow. Jitter’s API is just numbing to me. As for exposing technical underpinnings – I don’t know; I have to disagree. I think an elegant UI does not need to deal with the metal – the issues at hand are in the domain of the cinematographer, the photographer, the painter, the sculptor, etc., not that of a programmer. In other words, when you perform, are you actively thinking about FBOs, texture IDs and shaders, or are you thinking about “motion”, “chaos”, “counterpoint”, “color”, “mood”?

    Ilan, your 3rd point is very interesting to me. I’m not sure how I feel about it. The optimist and eternal hopeful in me wants to say one well-designed interface should accommodate all users. The cynic and pragmatist in me says otherwise.

    protofALk: Speed is of the essence – both technically “optimal” and in a performative sense. Again: UI latency, time to make a decision, time to change a mood, etc.

    Toby, I think laserpilot echoes where I was coming from. Automation is one thing, automatic is another (if that makes any sense). I think of beat matching, internal clocks and LFOs as automation with a human hand. I see audio-reactivity as “automatic” (think iTunes visualizers, etc.). I prefer the former to the latter – and I also don’t think you meant it quite that dramatically. I think there is a lot of room for automation, and as you said, one performer may not necessarily be able to bake the entire cake. Automation helps a lot in that situation.
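
    A minimal sketch of that distinction (plain C, hypothetical names – nothing from the actual app): “automation” is a tempo-synced LFO whose rate, depth and base value stay under the performer’s hand, while “automatic” is an analysed audio level driving the parameter directly, with nothing left to steer.

        #include <math.h>
        #include <stdio.h>

        /* Automation: a tempo-synced LFO modulates a parameter, while the
           performer keeps steering the musically meaningful knobs. */
        static float lfo_automation(float t, float bpm, float base, float depth) {
            const float two_pi = 6.28318530f;
            float beat_hz = bpm / 60.0f;
            return base + depth * sinf(two_pi * beat_hz * t);
        }

        /* "Automatic": an analysed audio level drives the parameter directly
           (the iTunes-visualizer case) – the performer has nothing to steer. */
        static float audio_reactive(float audio_level, float scale) {
            return audio_level * scale;
        }

        int main(void) {
            for (float t = 0.0f; t < 1.0f; t += 0.25f) {
                float fake_level = 0.5f + 0.5f * sinf(10.0f * t); /* stand-in for analysis */
                printf("t=%.2f  automation=%.2f  automatic=%.2f\n",
                       t,
                       lfo_automation(t, 120.0f, 0.5f, 0.25f),
                       audio_reactive(fake_level, 1.0f));
            }
            return 0;
        }

    Swap the LFO for a tap-tempo clock or a drawn curve and it is still automation; take the hands off the knobs entirely and you are back to “automatic”.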

  9. Your Vision is Quartz « Secret Killer Of Names Says:

    [...] useful QC plugins I returned to the wonderful world of Anton Marini AKA Vade to discover news of his residency at Eyebeam.  It appears he is working on an app for live visual performance that he describes as [...]

  10. Joel Dittrich Says:

    Good read! Looking forward to trying out v002.app one day.

  11. Vibeke (Udart) Says:

    These thoughts are a lot in line with things I have been pondering in my time as a live visuals performer. I applaud you Vade for taking the time to consider these issues and trying to create an app that takes a new approach to these things.
    That being said, I think it’s utopian to tell a VJ to accommodate any musical mood on the fly. And I’m not saying this is Vade’s point – it’s just something I also have been thinking about. Telling a VJ to create a visual equivalent to a musical mood without any preparation is like telling a house pianist to create a soundtrack to any movie on the fly, whether it’s a thriller, sci-fi or love story. The pianist may be able to come up with something, but it won’t be a full-fledged, appropriate score for sure. Or if it is, he will have used presets from a large library of genres…

    Adding to this point, I also think it makes sense for VJs to try and perform together – like a visual band. Most musicians perform together, with more than one person on stage, so if visual performers are ever to match the details and complexity of music, it makes sense to create tools for live collaboration. I have done that on occasion, and it’s a lot of fun too.

  12. tomplot Says:

    Amazing!!!! Greetings from Argentina!

  13. Bill Etra Says:

    I like !
    The more people trying to work on VJ action, the more likely I’ll be able to see the realization of a dream I had almost half a century ago.
    The mantle is yours, Anton – go for it!
    bill etra
