Review: SmartMouth for Flash Makes Lip-Syncing Simple

I’m neither an artist nor an animator, and yet I was able to lip-sync an audio track in next to no time, all thanks to SmartMouth by Justin Putney. This Flash Professional extension really impressed me with how quickly it was able to automate an otherwise extremely tedious task. Read on to see how it can make animating your cutscenes so much easier.


First Impressions

SmartMouth comes in a standard MXP package, like most Flash Professional extensions, so it was a snap to install: I just double-clicked the MXP and followed the on-screen instructions. (It’d be the same for any version of Flash from CS3 upwards, though for CS3 itself you must have installed the Extension Manager.)

Once I installed it and restarted Flash, I could access the main panel via Commands | Lip Sync with SmartMouth:

SmartMouth main panel

The Help document can be brought up by clicking the question mark button; this does a great job of explaining the separate elements of the panel, but — call me biased — I felt it could also have used a brief tutorial walking me through how to use the tool. Still, there’s a detailed tutorial over on the Adobe Developer Connection, and the process is pretty simple anyway:

First, I imported a sound track (I picked this public domain reading of one of Aesop’s Fables, from LibriVox), put it on its own layer, and set its Sync to Stream.

Next, I created a new layer for the mouth to go on. Like I said, I’m not an artist, but fortunately we have a free Lip Sync Assets pack in the Activetuts+ archives, so I downloaded and imported that. I dragged and dropped each symbol onto a frame in my MouthShapes layer.

Then I re-opened the SmartMouth panel; it had taken a guess at the layers I wanted to use for audio and animation, so all I had to do was choose the shapes that corresponded to each phoneme:

SmartMouth main panel -- populated

As you can see, it picked a Start Frame and an End Frame for me, so all I had to do was click the “Tell me, SmartMouth” button. This kicks off the audio analyzer, which plays the whole audio track through (visualizing it as it goes):

SmartMouth Audio Analyzer

After that, there’s a brief wait while it adds the keyframes for each mouth sound. I picked a 45-second sound file, which took up about a thousand frames (at 24fps), and SmartMouth figured out which mouth sounds went where, and actually placed the keyframes, within twenty seconds:

SmartMouth modifies the timeline
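
For anyone who wants to check the "about a thousand frames" figure, here is a minimal Python sketch of the arithmetic. The function name `frames_for_audio` is my own invention for illustration; it is not part of SmartMouth or Flash.

```python
import math


def frames_for_audio(duration_seconds: float, fps: int = 24) -> int:
    """Number of timeline frames needed to cover an audio clip."""
    # Round up so that a partial final frame still gets a frame of its own.
    return math.ceil(duration_seconds * fps)


# A 45-second clip at 24fps needs 45 * 24 = 1080 frames.
print(frames_for_audio(45))
```

So a 45-second clip at 24fps works out to 1,080 frames, which matches what I saw in the timeline.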

All I had to do then was remove the original mouth shapes from the timeline, and add a “grin” to the end (okay, technically that last one was optional). Check out the results for yourself:



Click to start the audio and animation.

I’m impressed!


Room for Improvement

I did come across a couple of bugs while using SmartMouth. When I entered my registration key, the “Success” dialog got stuck in a loop, and kept reappearing no matter how many times I hit OK. Then, later, I tried deleting all the mouth frames that SmartMouth had placed and running it again; this made it run a lot slower, and in fact it took longer than the 60-second time limit Flash imposes, making it crash without finishing its job.

Still, neither bug was a big problem, since SmartMouth has a kind of “emergency exit”: right-click the main panel and click EXIT, and it’ll shut down, putting you back in control. Plus, if the audio is too long, you can work in chunks of a few hundred frames at a time by changing the Start and End Frame options.
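
If you'd rather work out the chunks up front, something like the following Python sketch would do it. It assumes the Start Frame and End Frame values are both inclusive (I haven't confirmed how SmartMouth treats the End Frame), and `chunk_frame_ranges` is a hypothetical helper name, not anything the extension provides.

```python
def chunk_frame_ranges(first_frame, last_frame, chunk_size=300):
    """Split [first_frame, last_frame] into inclusive (start, end) pairs."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    ranges = []
    start = first_frame
    while start <= last_frame:
        # The final chunk may be shorter than chunk_size.
        end = min(start + chunk_size - 1, last_frame)
        ranges.append((start, end))
        start = end + 1
    return ranges


# Values to type into the Start Frame and End Frame boxes, one run per pair.
for start, end in chunk_frame_ranges(1, 1080, chunk_size=300):
    print(f"Start Frame: {start}, End Frame: {end}")
```

For a 1,080-frame track this gives four runs, the last one covering frames 901 to 1080.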

As I mentioned, the Help docs are well written, but I would have liked to see tool tips on the various buttons within the panel. It’s not immediately obvious what the buttons next to End Frame are for, nor what Mode or Limit To actually do, without reading up on them. Even “Tell me, SmartMouth” doesn’t suggest a command that will automatically place symbols in the timeline. But these are just nitpicks; once you’ve used the options, you’ll know what they do.

My one major gripe was that, even though I placed the mouth shape symbols in different places around the stage, SmartMouth aligned them all when syncing to the audio (I think the mouth shape for the letter O is out of place in the SWF demo above). However, this proved to be my mistake: if I’d created a new symbol on the MouthShapes layer, and placed the individual mouth symbols inside that symbol, SmartMouth would have preserved my positioning.


My Verdict

After Ian finished Animating the Envato Community Podcast, he told me that a tool like SmartMouth would have saved him a lot of time and tedium. (Actually, he used rather more excited terms than that.) I can see why.

In that video, there were several different people talking in turn, so there were different mouths that needed to be animated. SmartMouth doesn’t have an interface for doing this specifically, but it would be pretty simple to use it for that. Either:

  • separate the speakers’ voices into separate tracks on separate layers and run SmartMouth once per track,
  • use the Start and End Frame boxes to isolate the section of the track corresponding to one character at a time, or
  • run it once for each character and simply delete the frames that don’t match the character who’s talking.
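
For the second option, you'd need to turn each speaker's timings into Start and End Frame values. Here is a rough Python sketch of that conversion. The speaker names and timings are made up, the helper names are hypothetical, and it assumes timeline frames are numbered from 1 with an inclusive End Frame.

```python
import math
from collections import defaultdict


def segment_to_frames(start_seconds, end_seconds, fps=24):
    """Convert a time span into 1-based, inclusive timeline frame numbers."""
    start_frame = math.floor(start_seconds * fps) + 1
    # Guard against zero-length segments ending before they start.
    end_frame = max(start_frame, math.ceil(end_seconds * fps))
    return start_frame, end_frame


def ranges_by_speaker(segments, fps=24):
    """Group (speaker, start_seconds, end_seconds) tuples into frame ranges."""
    result = defaultdict(list)
    for speaker, start, end in segments:
        result[speaker].append(segment_to_frames(start, end, fps))
    return dict(result)


# Hypothetical timings, noted by ear while listening to the track.
segments = [
    ("fox", 0.0, 12.5),
    ("narrator", 12.5, 30.0),
    ("fox", 30.0, 45.0),
]

for speaker, ranges in ranges_by_speaker(segments).items():
    print(speaker, ranges)
```

Each pair is one SmartMouth run with that character's mouth shapes selected.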

Although SmartMouth’s most instantly impressive feature is its ability to put the mouth symbols on the stage in sync with the vocal track, this isn’t strictly necessary. If you prefer, you can tell SmartMouth just to create a new layer with labels corresponding to each phoneme in the vocals, so you can put the graphics in manually without having to keep scrubbing the timeline to see what sound you’re supposed to be imitating. This would be useful for frame-by-frame animation, or a scene with a lot of motion.

It’s also possible to make SmartMouth export the phoneme data to an XML file; this could then be used in another platform, like Unity, or even loaded into a SWF with AS3 so that you could animate a custom avatar’s mouth dynamically. (From what I hear, Justin is working on a version of the tool specifically for that purpose.)
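
To give a feel for how exported phoneme data might be consumed outside Flash, here is a minimal Python sketch. I haven't shown SmartMouth's actual XML format in this review, so the `lipsync` and `phoneme` elements and the `frame` and `shape` attributes below are purely hypothetical; you would need to adapt the names to whatever the real export contains.

```python
import bisect
import xml.etree.ElementTree as ET

# Hypothetical export format: the element and attribute names are made up
# for illustration and are NOT SmartMouth's real schema.
SAMPLE_XML = """
<lipsync fps="24">
  <phoneme frame="1" shape="rest"/>
  <phoneme frame="4" shape="M"/>
  <phoneme frame="9" shape="O"/>
  <phoneme frame="15" shape="S"/>
</lipsync>
"""


def load_keyframes(xml_text):
    """Return a list of (frame, shape) pairs sorted by frame number."""
    root = ET.fromstring(xml_text.strip())
    keyframes = [
        (int(element.get("frame")), element.get("shape"))
        for element in root.iter("phoneme")
    ]
    keyframes.sort()
    return keyframes


def shape_at(keyframes, frame):
    """Mouth shape showing at a frame: the latest keyframe at or before it."""
    frames = [keyframe_frame for keyframe_frame, _ in keyframes]
    index = bisect.bisect_right(frames, frame) - 1
    return keyframes[index][1] if index >= 0 else None


keyframes = load_keyframes(SAMPLE_XML)
print(shape_at(keyframes, 5))
```

The lookup mirrors how a timeline works: a mouth shape stays on screen until the next keyframe replaces it, so a game engine or avatar system only has to ask which shape is current for the frame being played.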

Overall, I highly recommend SmartMouth if you need to do any lip-syncing in Flash. The basic functionality is amazing by itself, and the extra features push it over the edge.

SmartMouth is available for purchase at the Ajar Productions website; prices start at $49.99 for a single seat, with discounts if multiple seats are bought at once.

  • Jaron

    This is cool!

  • Digby

    Am I looking at the wrong example? That lip syncing is absolutely terrible. You paid 50 bucks for that??

    • http://michaeljameswilliams.com/ Michael James Williams
      Author

      I’ll agree it’s not perfect. I wanted to use it as a true example of what SmartMouth could do within a couple of minutes. I’m not sure whether the fault lies with the graphical assets, with SmartMouth itself, or with me — like I say, I’m not an animator.

      Take a look at this other SmartMouth demo, which uses different graphics: http://www.adobe.com/content/dam/Adobe/en/devnet/flash/articles/lip-sync-smartmouth/fig_13.swf

      • David

        I think the asset for “O” is making the whole effect look a little ugly.

  • http://blog.nicholasczuma.com/ Nicholas Czuma

    It would be helpful for certain kinds of projects, but most of the time I animate mouths that fluidly transform from one state into another; this extension is more for people who go with static mouth shapes (probably the majority of Flash animators).

    • http://michaeljameswilliams.com/ Michael James Williams
      Author

      I’m curious: would SmartMouth’s ability to add labels, corresponding to each phoneme, to a layer help you out? Or is it just a totally different kind of work?

  • Andrew

    The “O” isn’t quite right :)

    • http://michaeljameswilliams.com/ Michael James Williams
      Author

      That’s what I thought :( I can’t figure out how to fix it, though!

  • Greg W

    Having done frame-by-frame phoneme syncing manually before, I will say that this could probably save a HUGE amount of time.

    That said… With all shortcuts, you sacrifice quality. It’s pretty clear that the syncing here needs some major improvements. Not faulting you, maybe it’s even the audio sample that you used as it seems very low quality, but the only things that look synced are the “ess” sounds, like when the narrator says “grapes.”

  • Merrick

    Well, it’s not great (no disrespect, the demo on the Adobe site isn’t that good either); there seem to be far too many “ooo” shapes thrown in.

    But for $50, I’m sure it’s going to save a lot of time for some people involved in projects at the lower end of budgets. After all, you can’t expect to make eye candy special effects (Avatar, hated the film, even though it looked pretty) by using MacPaint.

    However, it’s a good starting point, and best of luck to the developers, Ajar Productions. By version 3 or 4, I’m sure that this is going to become a pretty useful and well-rounded piece of software.

  • http://ajarproductions.com Justin Putney

    Hi Everyone, thanks for your comments!

    This is Justin, co-founder of Ajar Productions, and developer of SmartMouth. I agree with the comments regarding the “o” sound in this particular audio sample. This is something that I’m working on for the next version. I think of this software as a living, evolving product. We will continue to improve it as we have more chances to test new samples. And updates are free!

    We don’t anticipate that this extension will ever replace the animator, but hopefully it will make his/her life easier, as well as save time and money. There will always be clean-up and customization to do after running an extension like this, but we will work to reduce the amount of clean-up required.

    We’re open to your suggestions, especially if you have specific audio and mouth shapes that you feel are not syncing. You can reach us at support@ajarproductions.com.

  • The Man

    I was the creator of those graphics. Well, my graphics are not the cause of the horrible preview; they were made to be positioned manually, because the “O” doesn’t have to be so big!

    • http://michaeljameswilliams.com/ Michael James Williams
      Author

      Ah — so I should have resized the O? Did any other mouth sounds need resizing?

  • Rafael John

    That’s a cool and neat approach, but as I learned in my previous 2D animation course, when you lip-sync you have to make, or even redraw, an in-between for the main mouth shape in every transition. I can also see that the mouth outputs the word syllable by syllable, but sometimes some syllables are omitted for better movement.
    But this is really cool.

    -
    well here’s what I’ve tried
    http://ryujin2490.deviantart.com/art/any-body-there-254141182