I once was afraid of voice chat
Posted on: April 2, 2008
The following is another story from my Second Life…
I once was afraid of voice chat.
And this was long before I even knew it was coming.
I knew some of my friends in Second Life used 3rd-party apps like Ventrilo and Skype to hold conversations concurrent with their inworld experiences; I didn't participate. Ever-so-now-and-then, I'd come by a gathering of avatars in a circle, and one of them would nicely motion to me and explain they were all on voice.
I didn't feel left out, but I did feel limited.
Part of this had to do with my predominant identity as a female avatar, aka Torley Jr. (I've included pictures of her evolution in a new gallery.) It sounds, feels strange to have my deep male basso coming out of a slender, pale (now tan) chick. Not that I would've been altogether opposed, but the cognitive dissonance is a difficult one to overcome.
Over time, more changed. For those who were there at the start of my Second Life (and alas, you are so few now), you know I began with a male avatar, aka Torley Sr. I still preferred not to use my voice, and there was also an added complication: hyperacusis. My ears hurt from hearing loud sound for an extended duration of time. It made me feel reclusive. I'm still afflicted, but I've grown more resilient.
I didn't plan on a specific pathway to the present, but several things transpired: one was, a large part of my job at Linden Lab involved public communication (still does!) and I repeatedly became frustrated at explaining certain things in text that simply could be… shown. It was a timewaster for me, my company, and the Residents I wanted to help so much.
I fell into a sort of creative hole at that point, and through some elucidation, the solution became clear: I would show Second Life as it is, as I experienced it. But what to do first?, I pondered. Turns out one of my growing responsibilities became the Official Second Life Blog, and as I've become increasingly known for, I launched a type of "planned spontaneity*".
* This, before we go further, has roots in my musical background. I'm an extremely strong improviser, able to take established chord sequences I keep in mind beforehand, then rapidly weave a sonic tapestry around them, bringing to life melody on the spot — mostly underpinned by the fundamentals. In much the same way with other mediums, be it raw text (as you read now) or the power of video as a communication tool, I improvise using a few "pillars". It provides me with the flexibility I crave to explore, and the "safety net" (altho I'd often not term it as such, because if I fall… I can fly!) of principles I already know.
Without thinking too much about it, I recorded what would become my very first "video tutorial" in a manner of speaking. I forget about it time to time because it's not even on YouTube or part of my considered official canonicity, but nevertheless, it is real. It exists. And you can watch my "How to Use This Blog" Video Tutorial for a trip back into 2006:
You'll quickly notice several things: (1) HOLY CRAP! The blog looks so different! Yeah, that was before we had a nice webdev facelift. And I forgot what #2 would've been, but you can tell me.
What especially charmed me following this: I received a few warm and kindly replies. Not the sheer bulk of incoming mail I get today which pleasantly threatens to bust my inbox looser than a — well, let's not go there — but it was encouraging nonetheless. Some regulars from the unofficial #secondlife IRC channel passed on nice words. And there were a few comments to the effect of:
"You sound exactly like I expected you to."
It's really smile-inducing today, because I hear that all the time.
That gave me courage.
I ramped up making video tutorials… initially, one at a time, which eventually turned to me raising them like a sort of farm with animals — meaning, having numerous vidtuts in varying states of completion before the Golden Harvest.
By the time voice chat got here (I had received internal word before the public ever knew, but even so, I accepted the news casually), I felt relaxed and eager to try it out. I had forgotten about my fears, and grown into sharing my First Life's one-and-only voice (although I can do a great amount of comedy impressions!). But when I think about it, I can clearly see why I was afraid, and why some others still are.
I read a thoughtful essay by Gwyneth Llewelyn recently, "Immersionism and Augmentation Revisited"; as much as I'll repeat this, I have many thanks to give to her for being so kind to me in the early stretch of my Second Life. She was "born" about a couple months before me, and we had some fascinating late-night (3 AM my time) discussions about… what does it all mean?
Gwyn (to me, always "Gwyngwyngwyn") writes about immersing into Second Life and adopting an anonymous persona vs. using SL to openly augment your First Life. I generalize here, and Gwyn is careful to make the point of a gray area. She uses me as an example, and sometimes I wonder if it's more like a rainbow (I see the world colorfully), or that we perceive visible light but there's much else out there on the spectrum that we do not see, but we do experience: the interpersonal analogue of gamma rays, say.
Admittedly, I haven't thought to label myself as an "immersionist" vs. an "augmentist" (I could see some post-cyberpunk novel calling them "mersies" and "augies"): I have little use for restricting labels, and, as I've learned from my electronic music years (where genre labels are a dime a couple dozen), if I'm to use labels, then it's for accessibility and convenience — NOT artificially stunting my potential. Which is what I'm concerned some people do, lump themselves into a camp and refuse to cross the line to see what awaits them in this big world. (There's so much to learn, how could anyone be so scared that their hunger for being alive is trumped by rigid self-definition? Alas, there are people like that.)
So I've mused, and moved in many small pieces. Change didn't come all at once, but day by day, it arrived. Like a gradient sweep of one color into the next, it wasn't shocking.
I respect the right of a person to choose, and that includes making an informed decision which they consider the best for them. There'll always be people who don't want to use voice chat, much as there are people who are vehemently opposed to graphics in computer games and prefer old text adventures. And to wit, there are those who'd rather just read a book. I don't class them Luddites as long as they've been willing to explore the new frontiers and decided it was not to their liking; after all, they can always come back (or is it forward?) should they choose. Our humanity knows not a shortage of choices; rather, in this age of technological onslaught, we're battered by surplus decisions. And as much of a meaningless abstraction as "informed decisions" may be to some, I don't write it without having arrived here after much experience, with much more to go.
I encourage little experiments. The so-called "baby steps". If someone feels rewarded, they tend to keep continuing in a given direction. My friendliness in the online world of Second Life was met with much friendliness in return, and thus, I became bolder. If I had come inworld to suffer a hostile 1st-time experience, I doubt I would've stayed. I was blessed to have friends around when I was griefed for the first time, and ended up laughing my ass off instead of being up-in-arms-outraged. And while I've had my fair share of bumps along the way (including on the job), from all these shattered shards, I've looked for cohesive threads to stitch into a quilt of sorts.
And the panels of this quilt have become the backbone of my blog; observing, participating, sharing.
Like I wrote earlier, I find it's incredibly ripe to see Second Lifers use their Resident name in predominantly First Life-oriented places (Flickr, Facebook, etc.). To me, it's a sign of investment and pride in your virtual identity. Whether you choose to keep that "separate" from your offline existence to others or if it's more of a blend, as mine has been, you're the same whole person.
Perhaps you're like an inverse onion: inner layers are peeled off first, the kinds of things that you may think silly to share with in-the-flesh chums, but you've been able to find camaraderie with folks who appreciate X obscure interest or Y healthy fetish in Second Life. Maybe there's darkness within your soul you've been able to deal with thanks to Second Life: instead of imposing drama on others, dragging in your baggage and beating the crap out of it, as I like to say.
This is what happened to me, I happened to Second Life, and I even made a set of video tutorials to mark the passage of my fear.
I'm no longer afraid of voice chat**.
** I don't use it as much as I could, however, primarily because the audio sounds so harsh to my ears: it'd be nice to have a virtual analogue tube saturator to round out the sharp digital clipping and auto-level the volumes (I write this as a sudden blare almost startles me from my wife's computer as she treks in Second Life).



April 2nd, 2008 at 10:02 PM PDT
Thank you for this, it was very interesting. To me, you're the epitome of what Second Life should be.
April 2nd, 2008 at 11:18 PM PDT
While I can see the uses for voice chat, some of us are just used to parsing 10 conversations in text (the joys of ADD, heh heh heh) and find it EXTREMELY hard to follow the same number of conversations by voice.
(And that's of course if it runs without eating older computers' faces…)
Just a view from the other side.
April 3rd, 2008 at 1:56 AM PDT
So much deep thought and artistic articulation, Torley, all presented in an easy fun manner. I second Sougent: You are the epitome of what Second Life should be.
April 3rd, 2008 at 5:15 AM PDT
I agree with Alexandra…although I've used voice chat, I prefer the old-fashioned typing. I can't follow all the conversations, and I can't scroll up to see what I missed when my 5-year-old asks me a question… Not to mention everyone asking "who has the kid?".
April 3rd, 2008 at 6:55 AM PDT
I too participate in multiple discussions in different locations at the same time via IM and chat. And sometimes I get up and leave the room briefly. And sometimes my 20-year-old (who is just as disruptive as Krissy's 5-year-old!) comes home from work and is very loud, and engages me in conversation. Without the ability to scroll up through history to see what was said, I would be lost! Also, my husband and I use SL at the same time, in the same room. In a small room with no door, next to our bedroom. It just wouldn't work logistically for us to use voice. And I like the quiet. I don't watch television, and I generally don't have the radio on either. I can relate to Torley's hyperacusis.
Additionally, I like the illusion. It is primarily a visual world, for me. I create a voice for each person in my head, based on what they look like, how they act, even what and how they "speak" in text. Much like reading a book. If a big bad drooling orc tried to kidnap my little faerie, it would be very strange for the orc to have a wispy young girlish voice in reality. And hysterical if the little faerie cried out for help in a booming masculine voice! Now, if we could pick and alter our voices as easily as we could our avatar appearance, I might reconsider voice!
Princess Ivory
April 3rd, 2008 at 6:51 PM PDT
I have always been one to be hesitant to use voice because I am always nervous talking to someone rather than typing out my feelings. I am not afraid to express these feelings (obvious reasons both on & off SL). I started out with those I have known for quite some time now & am willing to talk to those I feel have something positive to share via a voice chat (compared to the many I run into who would rather use it for reasons I don't feel like participating in.) I am always willing to voice chat with the positive flow of SL.
Torley, you are a complete inspiration to all of us.
April 4th, 2008 at 7:52 AM PDT
That's a really great post Torley.
I've been thinking a lot about the principles of immersion recently. For me, voice has been part of an evolution of my expression of my personality and thus my immersive technique.
I personally think the "mersies / augies" comparison is an oversimplification of the many different ways people project persona, in both the physical and virtual realm.
Recently I've been considering the gestalt effects of immersion a lot whilst composing a huge soundtrack and designing the environmental sound for the Ruta Maya exhibit in Second Life. The use of rich audio in virtual environments poses many questions about the type of immersion people may or may not want to experience. Just as some people may prefer a purely imagined text-only virtual realm ( Richard Bartle would be a good example here perhaps ), others may prefer a visual representation with a text communications system, others still embrace their voice entering the equation.
I believe that extensive bespoke musical and environmental sound design vastly increases the "immersive bandwidth" of virtual place. However, in Ruta Maya we have been very careful ( just as LL have with the implementation of voice ) to make sure that those extra audible channels of immersion are always optional and controllable ( thanks to the great new sound mixer interface design this is now much more accessible ).
However, as you say, it would be fantastic to have increased control over the quality and dynamics of the voice channel in Second Life. I would assume with all the contracts Vivox are getting lately, they will have enough resources to develop a nice warm analogue sounding compressor plug in for their voice systems. An equaliser system would be great too. I had an interesting chat with some Vivox staff once about the concept of an in world booth which you could go into, talk and hear your own voice then adjust its volume and EQ. I think these systems, along with one of your excellent tutorials, would really help the expansion of the use of voice.
I also did some research recently into the possible reception in the community of voicefonts and the possibilities of voice actually becoming a creative tool for expression through the design of custom voicefonts ( which could possibly be sold in world like windlight settings? - whole new economy there! ) It was interesting to see the range of responses. Some residents loved the concept ( and indeed are already doing it with third party software ) and others had the opposite response - saying that even if their voice was completely disguised, they would never use it. What was most interesting perhaps, was that that group of residents seemed to like the concept of designing custom text to speech voices very much. Here is a link to the article I wrote for SLNN on it : http://www.slnn.com/index.php?SCREEN=article&about=thinking-about-voice
This reveals what could be a key point for me. That one critical issue with voice ( and possibly many other aspects of immersion ) may be directly related to any form of "physical" connection between the organic body and the avatar. This has also been something we have explored extensively with the PARSEC project ( which enables the movement of prims in Second Life with your voice ). http://parsec.wordpress.com/
I hope that Ruta Maya is something people will really enjoy. It was created with real enthusiasm and passion for Second Life and the joy of really feeling a virtual environment. I also hope that it encourages the increased development of the auditory aspects of virtual immersion in other areas of the grid.
Dizzy
April 4th, 2008 at 9:11 AM PDT
I especially enjoyed reading about your "rite of passage", Torley, for two reasons obviously. One is purely philosophical, which actually comes from the ever-present discussion which obviously will never be settled by either side (we will still have people preferring to read books rather than watch TV in 2100!). The second reason, more dear to me, is that since we first met, you have certainly improved dramatically in your health and dealing with your hyperacusis.
So there is a "story within the story". One is the exoteric one, the one visible to everybody: from someone who was shy and in a sense "isolated" in an environment full of sound that you could not listen to (and I can't possibly imagine how that feels to somebody like you, who made music your primary career and hobby), to someone who gradually, in little over three years, became fully able not only to deal with the hyperacusis, but also to be the most social person in a huge environment (that is, ultimately, mostly about people, no matter how fun and entertaining the technology is!) and make it your secondary career (as the Most Optimistic And Positive Linden Employee Ever™), while, at the same time, not neglecting your first passion (interpreting and composing music again, a distant dream which you referred to in the first talks we had, soooo long ago) and acquiring a few new ones (becoming a successful teacher through your video production). I have no idea if any doctor is writing a doctoral thesis about your incredible recovery (I know it's not complete!… but… what a difference!), but they should. A "miracle" happened along the way: obviously, by tapping your inner strength to allow the miracle to happen, but still, from the viewpoint of an outsider watching you over the years, what a change!
This is something so encouraging to hear! Fortunately, you're not the only one, but very likely, the one that has been most open to discussing it freely with your vast audience.
This is, obviously, just marginally related to the discussion.
There is also an "inner" voyage, an even deeper one, with an esoteric meaning. Somewhere in this ongoing process, "coming out" of your fears of using voice to become an enthusiastic voice adept, something was experienced that changed and transformed you. Voice chat was part of it.
I really don't know what to say about myself. I never liked voice chat on the Internet, or even on the phone (I might just be simply strange). I like presence (and SL is about presence) or, by contrast, imagination. Books and text have an appeal that voice doesn't. I never manage to follow podcasts, and hate to watch videocasts, when the focus is on the content and not the visuals. Obviously I like a good movie or a good TV show (even if I don't own a TV), but there has to be a perfect sync between the dialogue and the visuals to make it stimulating for me. It doesn't work one without the other. On the other hand, "dialogue without visuals" is more than enough if it's written, since I can have my imagination fill in the gaps. Podcasts, voice conversations, simply don't work for me. Neither do audiobooks.
There is a third issue which most people don't realise. Video and audio production, to be successful, are made by professionals. If I listen to a radio show while driving, they know how to make it compelling, so that just the "right amount" of conversation goes through. Podcasts are too amateurish: people simply don't have the required skills to make voice compelling (unless they happen to be professional actors). They're too distracting with their "uh, huh, errr" and the confusing argumentation that circles around a subject you never manage to identify. That's why you have super communicators doing keynote speeches, but a poor teacher with little communication skills has to use a whiteboard or a slideshow presenter to get a message across to their bored students.
Text, however, allows your mind to drift and allows you to scroll back to see if things make sense. Text-based classes don't require 100% focus on what the teacher is conveying: if you get lost, you go back through the history and read it again. Granted, doing audiovisuals in a SL class also requires attention; you can't simply expect that what you're reading from half an hour ago still has any bearing on what you're watching on the screen (or being displayed with prims).
The experience I had with voice chat in SL is neither better nor worse than what I had with all sorts of teleconferencing tools. Half the time is always wasted with people tweaking settings. When time counts, and you're not in SL "just for fun" with an infinite amount of time to tweak things, voice is too disturbing. But even in the "just for fun and leisure" moments, voice is hard if you don't live isolated in a sound-proof cave. Try to have a romantic moment in SL with the trash truck passing on the street, the vacuum cleaner being turned on, children yelling, dogs barking, the neighbours screaming… it completely turns anyone off. Or try it during the night, when you have to whisper so as not to wake up your neighbours, if you're not lucky enough to live in your own villa somewhere remote, but are in a small flat late at night where your neighbours expect you to be mostly silent. Even romantic moments on the plain old telephone should be savoured when you're alone, relaxed, in an environment where nobody's interrupting.
Or maybe I'm just being overly demanding: just because I live in a noisy environment by day, and a silent environment by night; just because I have used unsuccessfully three or four sets of headphones and microphones which always had problems; just because my computer seems always to be underpowered in the voice settings, always conflicting with other software, making my voice either sound tinny like Minnie Mouse or full of echoes and random noise to the point that nobody understands me through my thick accent. I don't know. I might just be extremely unlucky. But it always pains and bothers me to waste so much time with all that worthless technology when trying to communicate with other people, either professionally or in my scattered leisure time, and worry much more about why the technology is always failing on me, instead of focusing on the issues.
Ironically, some good friends of mine, with whom I managed to have fascinating and interesting text chat conversations, but who also have talked to me in chat, are absolutely boring that way, which was a quite unpleasant surprise.
Being surrounded by bad experiences everywhere, I guess that I've been anything but convinced about the "usefulness" (or even "fun") of voice chat. I think that I remember just one good example (and we talk more over the phone than in SL): a specialist audio producer with access to US$5k worth of technology, to have the audio sound perfect. Well, I guess that if you're a sound engineer (or are friends with one, or have an audio studio at home, or at least access to one), voice chat can be a fulfilling, immersive experience, as deep as simple text is. For the unlucky types, I guess we're always going to sigh deeply and roll our eyes when the next person pops in and "insists" on using audio for whatever purpose they wish.
As an exercise, I once did a text transcript of a 40-minute audio conversation with a dozen participants (the conversation was not very intense). It took me two whole days (about 20 hours total), and I missed several words. Then, once having the whole transcript (originally in English), I translated it into Portuguese, for the benefit of some friends that have some difficulty in listening to English. Retyping the whole transcript took about an hour. I think it was then when I realised how different audio is from text, and I have my own theories (vaguely supported by neurological science) on how our brains work so differently when processing audio, compared to text. It's not surprising now to understand that we took 100,000 years to a million years to develop audio communication to an advanced stage, but just a few millennia to come up with written communication (in fact it appeared very shortly after we learned to live in societies depending on agriculture and started building the first cities). It really says something about our abilities and how different they are. It also explains why, although we have the technology for that, portable dictation tools to write letters automatically never really caught on (and the tools get better and better every day!), except in very specialised cases (people with some sort of disabilities).
But I digress! Essentially, I'm very happy to know that at least for you, Torley, "voice chat" was an "audio therapy" that enabled you to feel much better, deal even better with your health-related issues, and break free from a grey past and walk in the rainbow light of a bright new future where you can be much more yourself — happy, excited, optimistic, and enthusiastic about what will come next.
Just for that, I'm very thankful that voice chat was introduced in Second Life.
April 4th, 2008 at 9:37 AM PDT
Dizzy's comments made me add another comment on my own. Yes, it also "confuses" me when the sound is not "in sync" with the image or the imagination. I keep unfocussing and going from one to the other. Watching a huge dragon talk in a small, squeaky, piping voice simply "confuses" me too much — unless I'm watching a Disney cartoon! — and I tend simply to turn the SL camera away. Good audio/video integration is hard to do well, and requires professional experience. It will sound (pun intended!) a bit like propaganda, but Dizzy Banjo — a professional and talented composer — managed to do a perfect integration of a "soundtrack composed for Second Life" on the Ruta Maya project (Mexico's presence in SL). I was very pleasantly surprised, but perhaps I should not be — Dizzy is, after all, a trained, skilled professional.
But he also has a point with the text-to-speech technology. No matter how good my "home studio" might be, with a sound-proof room and a lot of equipment, it will be very hard to get my avatar to sound like a 30-year-old Welsh woman with a distinctive pronunciation, which is what people picture "Gwyn" to be in-world.
The technology is there — God knows that if you can get JayLo to sing, you can do anything with professional audio and sound engineers — but it's simply not accessible to the common user.
Text-to-speech, however, comes pretty close to that, even with the rudimentary tools we have today. By carefully selecting "voice sets" from companies like Acapela Group or Cepstral you're starting to get pretty convincing results. Definitely not perfect, and still slightly "computer-sounding", but definitely becoming better and better. I admit that I have a license for one of those voices and use it routinely when building virtual world presences for a customer that "suddenly" needs a sound bite done in a hurry for some interactive device that we've just created, and we have no time to get a professional actor to record it for us. These text-to-speech engines are very useful for that!
April 6th, 2008 at 12:31 PM PDT
I think Gwyneth and Dizzy hold the record for the longest, most detailed comments on here!
Some of my thoughts — and thank *you* for yours:
@Sougent: Thx! I often find myself sharing experiences in hopes they'll be useful to others.
@Alexandra: Very, very true. I know some of our devs and ops @ Linden Lab *strongly* prefer text because there's a written record of conversation to parse for post-mortems later. Otherwise it's a big waste of time to retrace steps. Recording voice chat isn't as convenient and it still needs to be transcribed. I, too, enjoy replying rapid-fire to multiple text questions asked in succession; in voice, even two voices at the same time are an audible jumble!
Someday I hope we can get highly accurate auto-transcription. Until then, there's http://castingwords.com and the like.
@xanna: Thank you so much! I strive to be a good explainer.
@Krissy: That reminds me of someone who has a very vocal cat who often makes an appearance on voice chat.
@Dizzy: I appreciate your grasp/interest on audio issues inworld. It's funny too, the moment I wrote about puretext, I was thinking about Bartle as well. Today when I was briefly inworld, I heard some trippy goa trance and I wondered where it was coming from: it wasn't the parcel music, but someone on voice chat whose avatar was dancing and they'd routed in those beats (presumably via an Internet radio station).
I find it so ugly and jarring to hear distortion — frequently — on voice chat. Most people are familiar with how smooth broadcast-quality audio is, but don't know how to achieve it themselves. The sheer (yet subtle at times) difference in quality is why with few exceptions, I recommend putting a limiter at the end of the audio signal chain for even casual audiovisual projects.
I've also heard a term, "voice fonts" used before. *reads your article* As with many technologies, a lot of currently limited views will appear strangely antiquated in the future.
What additionally throws me off with voice is lack of acoustic environmental effects: if you go into a cave, your voice doesn't sound heavily reverberant, and that relates to what you said about "physical" connection in a sense.
Hee, the PARSEC logo reminds me of part of Justice's "DVNO" video. I like that "Muxtape" image! Very cool to see the recognition it's getting. It should!
@Princess: Ah yeah, with voice, it's difficult to be in "different locations at the same time" or to even excuse yourself from a convo.
My general feelings are that aural technological developments lag behind their visual counterparts, and the development of the Internet (and what's within it) is reflective of this.
@Cerulean: Thanks for sharing!
@Gwyneth: Your comments here could've constituted another of your insightful blog posts. You're so encouraging, thank you. My hearing itself basically hasn't improved but I've become more resilient; I wasn't going to shrink away and fade, so I had to put my fullest into my lives (First + Second!).
I find it strangely disconcerting when dialog is unsynced from visuals, like on badly-dubbed kung fu movies. (Sometimes, it's done to humorous effect, but too much of it loses my interest.) I've grown to like YouTube rants like Ben Croshaw's game reviews, where his speedy voice is punctuated by minimalist corresponding imagery in-sync with his distinctive inflections. Have you seen those? They are a lot of laughs!
Part of what you write may reveal why I don't care for company meetings in voice chat (generally speaking): it's all too easy to get lost if not focused enough, and I dislike when meetings drag on… I find myself doing what I consider more valuable stuff in the background.
Hopefully, thanks to pioneers like Apple, getting great results easily will become more common. I know some of the "professionals" are scared (and snobbish) about everyday people attaining wonderful quality with their productions (which doesn't mean that the actual content will be any more substantial, but still…), but I embrace it.
Are you familiar with Yamaha's Vocaloid? Prolly among the better "computer singers" and it doesn't do just text, per se, but it's not bad. You can hear it in this rather bizarre "cheap" version of Blade Runner, where some people actually mistook it for an eerie acapella: http://www.youtube.com/watch?v=9n5WmS7eWp0
We live in an era where most people tolerate poorly-encoded 128-kbps MP3s, and I have to wonder if often hearing obviously computer-generated voices will make more people used to them…
I remain curious about what your voice sounds like. =)